Database abstractions for Golang

John Wang
9 min readJun 30, 2023

--

At Assembled, we’ve been using Golang as our exclusive backend language since our founding in 2018. We run a pretty standard web application, but we found that accessing the database comes with its own particular set of challenges that haven’t been fully addressed by the Go standard library or community packages.

In this article, we’ll talk about 3 abstractions we’ve built at Assembled that make database access in Golang easier:

  • An interface to share code between single- and multi-row getters
  • A helper method to ensure you’re always handling errors and closing rows when scanning from the database
  • An interface to share code between transactions and non-transactions

Challenge 1: Writing performant, reusable SQL queries

The problem: Sharing code between single and multi-row getters

When we first started writing SQL queries, we dutifully wrote raw SQL like many Golang tutorials told us. But we soon ran into problems with this approach. Let’s say you’re writing an e-commerce application, then you might have the following method to get the information for a particular order:

type Order struct {
ID string
ItemID string
Price int
}

func GetOrder(id string) (*Order, error) {
var order Order

row := db.QueryRow("SELECT id, item_id, price FROM orders WHERE id = $1;", id)
err := row.Scan(&order.ID, &order.ItemID, &order.Price)
if err != nil {
return nil, err
}
return &order, nil
}

This is great if you only need to get one order, but what if you want to implement a page where a customer can see all their orders and now you need to add a method to fetch multiple orders? The easiest way to reuse your old code is by getting all the order ids that match and then reusing that original method that you wrote for GetOrder().

func GetAllOrders() ([]Order, error) {
// Pull out the ids of all of the orders
rows, err := db.QueryRows("SELECT id FROM orders;")
if err != nil {
return nil, err
}
defer rows.Close()

var ids []string
for rows.Next() {
var id string
if err := rows.Scan(&id); err != nil {
return nil, err
}
ids = append(ids, id)
}

// For each id, reuse the GetOrder code. Notice that we must run a
// separate database query for every order
var orders []Order
for _, id := range ids {
order, err := GetOrder(id)
if err != nil {
return nil, err
}
orders = append(orders, order)
}
return orders, nil
}

The problem with the above is that you’re now making O(# of orders) queries. This is expensive and non-performant because:

  • Postgres has to parse and generate a query plan for every query
  • You’ll add the packet roundtrip time from your webserver to the database to every request, which can blow up very quickly if you have lots of requests [0].

The solution: Create an abstraction for scanning a database row

To solve this problem at Assembled, we introduced an abstraction for scans. The important insight here is to realize that whether you’re scanning a single database row or multiple database rows, you should be performing the same operations. You always want to populate the same fields on an Order every time you pull one out of the database (whether you’re fetching one order or multiple). So we created a Scannable interface that hides the way in which you’re fetching a database row.

type Scannable interface {
Scan(dest ...interface{}) error
}

Now, you can pass in either sql.Row or sql.Rows into a single method and perform the same operation. Here’s an example of how you might use the Scannable interface to reuse code:

var orderAttributes = []string{
"id",
"item_id",
"price",
}

func ScanOrder(row Scannable) (*Order, error) {
var order Order

err := row.Scan(&order.ID, &order.ItemID, &order.Price)
if err != nil {
return nil, err
}
return &order, nil
}

func GetOrder(id string) (*Order, error) {
query := fmt.Sprintf("SELECT %s FROM orders WHERE id = $1;",
strings.Join(orderAttributes, ","))

row := db.QueryRow(query, id)
return ScanOrder(row)
}

func GetOrders(ids []string) ([]Order, error) {
query := fmt.Sprintf("SELECT %s FROM orders WHERE id = ANY($1);",
strings.Join(orderAttributes, ","))

rows, err := db.Query(query, pq.Array(ids))
if err != nil {
return nil, err
}
defer rows.Close()

var orders []Order
for rows.Next() {
order, err := ScanOrder(rows)
if err != nil {
return nil, err
}
orders = append(orders, *order)
}
return orders, nil
}

Now the total time to run GetOrders is just a single roundtrip time to your database plus the time it takes to select your matching orders and return them from Postgres. In addition to the query speed improvements, you’ve reduced the number of database queries to a constant number for each GetOrders call and significantly decreased database load. Finally, you’ve also made the code easier to reason about and refactor because there is only a single point of entry when you update an attribute on the Order struct.

Challenge 2: Remembering to close a set of rows

The problem: At some point, you’re going to forget to close your rows

Nothing is certain, except death, taxes, and forgetting to close your rows.
— Benjamin Franklin (probably)

One of the nasty things about Golang’s SQL driver is the mandatory call torows.Close() after completion which releases your connection back into your pool. Failure to call this method results in increased latency, escalating connection pool sizes, and in the worst-case scenario, outages during holidays when no one is deploying.

Unfortunately, this is one of the hardest problems to debug if you don’t know what you’re looking for. You have to step through a giant codebase, looking for those places where someone forgot to call rows.Close(). Let me tell you — it’s not easy to find these instances.

The solution: A helper method where you can’t forget

How did we fix this problem at Assembled? We had weekly trainings to remind everyone to never ever ever forget to call defer rows.Close() and publicly shamed engineers who still forgot.

Just kidding — we created a better abstraction via the ScanRows helper method:

type Rows interface {
Close() error
Err() error
Next() bool
Scan(dest ...interface{}) error
}

func ScanRows(r Rows, scanFunc func(row Scannable) error) error {
var closeErr error
defer func() {
if err := r.Close(); err != nil {
closeErr = err
}
}()

var scanErr error
for r.Next() {
err := scanFunc(r)
if err != nil {
scanErr = err
break
}
}
if r.Err() != nil {
return r.Err()
}
if scanErr != nil {
return scanErr
}

return closeErr
}

Notice that ScanRows will always close the rows after it’s finished with them. The function has an added convenience benefit too: it contains error handling that previously was copy pasted over and over again by every engineer.

Here’s how it would work in our GetOrders function:

func GetOrders(ids []string) ([]Order, error) {
query := fmt.Sprintf("SELECT %s FROM orders WHERE id = ANY($1);",
strings.Join(orderAttributes, ","))

rows, err := db.Query(query, pq.Array(ids))
if err != nil {
return nil, err
}

var orders []Order
err := models.ScanRows(rows, func(row Scannable) error) error {
order, err := ScanOrder(rows)
if err != nil {
return err
}
orders = append(orders, *order)
return nil
})
if err != nil {
return nil, err
}

return orders, nil
}

Challenge 3: Reusing queries inside of transactions

The problem: Sharing SQL between transactions and non-transactions

Let’s say you just wrote a method to store an Order into your database:

func StoreOrder(db *sql.DB, order Order) error {
_, err := db.Exec("INSERT INTO orders (item_id, price) VALUES ($1, $2)",
order.ItemID,
order.Price,
)
if err != nil {
return err
}

return nil
}

There are a couple of ways you might want to call this method:

  1. Use StoreOrder directly. For example, if you’re syncing orders from Stripe
  2. Use StoreOrder in conjunction with other database methods. For example, if someone makes a purchase on your site, you want to store both payment information and order information at the same time

In case 1, you don’t want to store orders in a transaction — long running transactions can be bad for database performance, so you can simply use your StoreOrder method that you’ve already written. But in case 2, you do want to store your order in a transaction, so you have to add some additional code. Here’s what it ends up looking like:

func StoreOrder(db *sql.DB, order Order) error {
_, err := db.Exec("INSERT INTO orders (item_id, price) VALUES ($1, $2)",
order.ItemID,
order.Price,
)
if err != nil {
return err
}

return nil
}

func StoreOrderTx(tx sql.Tx, order Order) (*Order, error) {
_, err := tx.Exec("INSERT INTO orders (item_id, price) VALUES ($1, $2)",
order.ItemID,
order.Price,
)
if err != nil {
return err
}

return nil
}

// SyncOrderFromStripe takes an order from Stripe and syncs it into your database without a transaction
func SyncOrderFromStripe(db *sql.DB, stripeID string) (*Order, error) {
stripeOrder, err := stripeClient.Get(stripeID)
if err != nil {
return err
}
order := Order{ItemID: stripeOrder.Items[0].ID, Price: stripeOrder.Amount}
return StoreOrder(db, order)
}

// StoreOrderAndPayment stores the order and payment information in a single transaction
func StoreOrderAndPayment(db *sql.DB, order Order, payment Payment) (*Order, *Payment, error) {
tx, err := db.Begin()
if err != nil {
return nil, nil, err
}

storedOrder, err := StoreOrderTx(tx, order)
if err != nil {
return nil, nil, err
}
storedPayment, err := StorePaymentTx(tx, payment)
if err != nil {
return nil, nil, err
}

err = tx.Commit()
if err != nil {
return nil, nil, err
}
return storedOrder, storedPayment, nil
}

Notice that you have to basically copy everything inside of StoreOrder into StoreOrderTx with the only difference being that in the former you run the method on sql.DB whereas in the latter you run it on sql.Tx.

This is a lot of unfortunate code copying, and if you change any attribute in Order, you have to remember to update both StoreOrder and StoreOrderTx. And let’s face it, at some point someone is going to forget and cause a bug.

The solution: Interface for database-like objects and a helper for transactions

Instead of copying code, notice that the StoreOrder method doesn’t really care whether it’s operating on sql.DB or sql.Tx, it just cares that it can write to the database. This is a perfect time to bring in the Database abstraction to hide this away:

type Database interface {
Query(query string, args ...interface{}) (*sql.Rows, error)
QueryRow(query string, args ...interface{}) *sql.Row
Exec(query string, args ...interface{}) (sql.Result, error)
}

Now you can delete your StoreOrderTx method because both sql.DB and sql.Tx will implement the Database interface, which can greatly simplify your code:

func StoreOrder(db Database, order Order) error {
_, err := db.Exec("INSERT INTO orders (item_id, price) VALUES ($1, $2)",
order.ItemID,
order.Price,
)
if err != nil {
return err
}

return nil
}

// SyncOrderFromStripe takes an order from Stripe and syncs it into your database without a transaction
func SyncOrderFromStripe(db *sql.DB, stripeID string) (*Order, error) {
stripeOrder, err := stripeClient.Get(stripeID)
if err != nil {
return err
}
order := Order{ItemID: stripeOrder.Items[0].ID, Price: stripeOrder.Amount}
return StoreOrder(db, order)
}

// StoreOrderAndPayment stores the order and payment information in a single transaction
func StoreOrderAndPayment(db *sql.DB, order Order, payment Payment) (*Order, *Payment, error) {
tx, err := db.Begin()
if err != nil {
return nil, nil, err
}

storedOrder, err := StoreOrder(tx, order)
if err != nil {
return nil, nil, err
}
storedPayment, err := StorePayment(tx, payment)
if err != nil {
return nil, nil, err
}

err = tx.Commit()
if err != nil {
return nil, nil, err
}
return storedOrder, storedPayment, nil
}

The Database abstraction allows you to create methods for storing and getting from the database that don’t care whether they’re used in a transaction or not.

Conclusion

We came up with a set of abstractions that solve some common problems we’ve run into while running Golang and PostgreSQL in production. The abstractions are pretty simple, but they’ve saved us a ton of headaches in production and made it much easier to reason about the code we write.

If you’re interested in building on top of these abstractions (or creating more of them), reach out to me at john@assembled.com or check out our careers page: https://www.assembled.com/careers-at-assembled.

[0]: It’s especially expensive if your web server and your database aren’t co-located — this was the case for Assembled in some cases as we started to build out a more global infrastructure. If the query to select your matching order ids takes 100ms and if it takes 30ms to round trip from your webserver to your database, and let’s say it takes 10ms to fetch a single order from Postgres, then a measly 30 orders will take (30ms + 100ms) + (30ms + 10ms)*30 = 1.3s. That’s unacceptable performance for most self-respecting e-commerce applications. Of course, you can always try to colocate your webserver and your database in the same datacenter, but that has its own set of problems. In addition, you’d still be making O(# of orders)separate database connections which can very quickly cause degraded database performance if you’re not careful.

Many thanks to Anthony Duong and Ryan Wang for reading drafts of this.

--

--

Responses (2)