
Gocqlx: A Productivity Toolkit for ScyllaDB in Go

Gocqlx is an extension to Gocql, the Go driver for ScyllaDB and Apache Cassandra. It aims to boost developer productivity without sacrificing query performance. It’s inspired by Sqlx, a tool for working with SQL databases, but it goes beyond what Sqlx provides.

For this blog post, we will pretend we’re a microblogging service and use the following schema:

Gocql is a very popular Cassandra driver for the Go programming language. Working with it usually looks more or less like this (source: Gocql README):

At first glance it looks OK, but there are some problems:

  • Gocql does not provide named query parameters. This means that you have to watch the parameter order while binding. This is not very flexible and can easily lead to errors that are detected only at runtime.
  • Scanning a row into struct fields involves passing the proper pointers to the Scan method. As before, order matters: you have to provide the exact list of columns in the query and keep it in sync with the scan parameters. These lists can quickly become quite long in a production system.
  • Loading data into memory involves writing scan loops that look almost the same except that the target type is different.

Gocqlx builds on top of Gocql to eliminate those issues and provides:

  • Builders for creating queries
  • Support for named parameters in queries
  • Support for binding parameters from struct fields, maps, or both
  • Scanning query results into structs based on field names
  • Convenience functions for common tasks like loading a single row into a struct or all rows into a slice (list) of structs

Building queries

Gocqlx provides query builders for SELECT, INSERT, UPDATE, DELETE, and BATCH statements. The builders create a CQL query and a list of the named parameters used in it. Let’s take a look at the sample code for a builder:

The builders promote the use of named parameters. The produced CQL does not contain constants or values, which allows us to leverage the Gocql prepared statement cache. Gocqlx implements the full spec of SELECT, INSERT, UPDATE, DELETE, and BATCH, and can build advanced queries. For the sake of this blog post, we will keep things simple.
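To make the idea concrete, here is a minimal, standard-library-only sketch of what such a builder does. It is not Gocqlx’s actual API: the `selectBuilder` type, its methods, and the `tweet` table with its columns are hypothetical names chosen for illustration. The point is only that the builder returns a statement with placeholders plus the parameter names in placeholder order, and never embeds values in the CQL text.

```go
package main

import (
	"fmt"
	"strings"
)

// selectBuilder is a hypothetical, stripped-down SELECT builder.
// It is an illustration only, not Gocqlx's API.
type selectBuilder struct {
	table   string
	columns []string
	where   []string // columns compared with "=" against a named parameter
}

func newSelect(table string) *selectBuilder { return &selectBuilder{table: table} }

// Columns adds columns to the select list.
func (b *selectBuilder) Columns(cols ...string) *selectBuilder {
	b.columns = append(b.columns, cols...)
	return b
}

// WhereEq adds a "col=?" condition whose parameter is named after the column.
func (b *selectBuilder) WhereEq(col string) *selectBuilder {
	b.where = append(b.where, col)
	return b
}

// ToCql returns the statement with "?" placeholders and the parameter
// names in the same order as the placeholders.
func (b *selectBuilder) ToCql() (stmt string, names []string) {
	cols := "*"
	if len(b.columns) > 0 {
		cols = strings.Join(b.columns, ",")
	}
	var sb strings.Builder
	sb.WriteString("SELECT " + cols + " FROM " + b.table)
	for i, col := range b.where {
		if i == 0 {
			sb.WriteString(" WHERE ")
		} else {
			sb.WriteString(" AND ")
		}
		sb.WriteString(col + "=?")
		names = append(names, col)
	}
	return sb.String(), names
}

func main() {
	stmt, names := newSelect("tweet").Columns("id", "text").WhereEq("timeline").ToCql()
	fmt.Println(stmt)
	fmt.Println(names)
}
```

Because the statement text is the same for every value of `timeline`, a driver-side prepared statement cache can key on it and reuse the prepared statement.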

Gocqlx also supports queries written with named parameters (identifiers prefixed with ‘:’). Such queries are automatically rewritten to standard Gocql queries, and the parameter names are extracted. Using the builders, however, should be preferred to compiling named queries, as it’s more flexible and faster.
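The rewriting step can be pictured with a small sketch. This is not Gocqlx’s implementation, just a simplified illustration under stated assumptions: the function name `compileNamed` is hypothetical, and the sketch ignores string literals and escaped colons, which a real compiler would have to handle.

```go
package main

import "fmt"

// isIdent reports whether c may appear in a parameter name.
func isIdent(c byte) bool {
	return c == '_' || (c >= '0' && c <= '9') ||
		(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// compileNamed rewrites ":name" placeholders to "?" and returns the
// extracted names in placeholder order.
// Simplification: string literals and escaped colons are not handled.
func compileNamed(query string) (stmt string, names []string) {
	out := make([]byte, 0, len(query))
	for i := 0; i < len(query); {
		if query[i] == ':' && i+1 < len(query) && isIdent(query[i+1]) {
			j := i + 1
			for j < len(query) && isIdent(query[j]) {
				j++
			}
			names = append(names, query[i+1:j])
			out = append(out, '?')
			i = j
			continue
		}
		out = append(out, query[i])
		i++
	}
	return string(out), names
}

func main() {
	stmt, names := compileNamed("INSERT INTO tweet (timeline,id,text) VALUES (:timeline,:id,:text)")
	fmt.Println(stmt)
	fmt.Println(names)
}
```

The extra pass over the query text is work a builder never has to do, since it already knows the parameter names while it assembles the statement.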

Binding query parameters

Once we have a CQL query and a list of parameter names, Gocqlx can help with binding the query parameters. Gocqlx can bind parameters from struct fields, maps, or both (bind from struct and fallback to map). Thanks to that, many operations become much simpler to implement. For example, updates:
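Conceptually, binding means turning the list of parameter names into a list of values in the right order. The sketch below shows the “struct first, then map” lookup using only the standard library. It is an illustration under stated assumptions rather than Gocqlx’s code: `bindArgs` and the `tweet` type are hypothetical, and the sketch matches names through explicit `db` tags to stay short.

```go
package main

import (
	"fmt"
	"reflect"
)

// tweet is a hypothetical row type used only for this illustration.
type tweet struct {
	Timeline string `db:"timeline"`
	Text     string `db:"text"`
}

// bindArgs resolves every parameter name to a value, looking first at the
// exported struct fields tagged `db:"<name>"` and then at the fallback map.
// The returned slice is ordered like names, i.e. like the "?" placeholders.
func bindArgs(names []string, src interface{}, fallback map[string]interface{}) ([]interface{}, error) {
	v := reflect.Indirect(reflect.ValueOf(src))
	if v.Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected a struct or pointer to struct, got %T", src)
	}
	t := v.Type()
	fieldByName := make(map[string]int, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		if tag := t.Field(i).Tag.Get("db"); tag != "" {
			fieldByName[tag] = i
		}
	}
	args := make([]interface{}, 0, len(names))
	for _, name := range names {
		if i, ok := fieldByName[name]; ok {
			args = append(args, v.Field(i).Interface())
		} else if val, ok := fallback[name]; ok {
			args = append(args, val)
		} else {
			return nil, fmt.Errorf("no value for parameter %q", name)
		}
	}
	return args, nil
}

func main() {
	names := []string{"text", "timeline", "id"}
	t := tweet{Timeline: "me", Text: "hello"}
	args, err := bindArgs(names, &t, map[string]interface{}{"id": 42})
	fmt.Println(args, err)
}
```

Since values are looked up by name, reordering the columns in the query or the fields in the struct cannot silently bind a value to the wrong placeholder.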

It’s worth noting that, by default, Gocqlx maps camel case Go struct field names to snake case database columns (e.g. “InReplyToScreenName” to “in_reply_to_screen_name”), so there is no need for manual tagging.
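A name mapping of this kind can be sketched in a few lines. The function below is a hypothetical simplification, not the mapper Gocqlx ships with: it starts a new word at an upper-case letter that follows a non-upper-case one, so a run of capitals such as “ID” stays together.

```go
package main

import (
	"fmt"
	"unicode"
)

// camelToSnake converts a camel case identifier to snake case.
// Simplification: an underscore is inserted only before an upper-case
// letter that follows a non-upper-case one, so "UserID" becomes "user_id".
func camelToSnake(s string) string {
	runes := []rune(s)
	out := make([]rune, 0, len(runes)+4)
	for i, r := range runes {
		if unicode.IsUpper(r) {
			if i > 0 && !unicode.IsUpper(runes[i-1]) {
				out = append(out, '_')
			}
			r = unicode.ToLower(r)
		}
		out = append(out, r)
	}
	return string(out)
}

func main() {
	fmt.Println(camelToSnake("InReplyToScreenName"))
	fmt.Println(camelToSnake("UserID"))
}
```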

Scanning rows

Gocqlx provides two convenience functions: Get and Select. The former scans the first query result into a struct. The latter scans all the query results into a slice. This greatly simplifies reading data into memory.
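The difference between the two can be illustrated without a database. In the sketch below a result row is modeled as a map from column name to value, which is an assumption made purely for illustration. The `row` type and the `scanStruct`, `get`, and `selectAll` functions are hypothetical stand-ins, not Gocqlx’s Get and Select, and columns are matched to fields through explicit `db` tags to keep the code short.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// row is a hypothetical stand-in for one result row: column name -> value.
type row map[string]interface{}

// tweet is a hypothetical destination type.
type tweet struct {
	Timeline string `db:"timeline"`
	Text     string `db:"text"`
}

// scanStruct copies column values into the fields of the settable struct
// value dest, matching columns to exported fields by their `db` tag.
func scanStruct(r row, dest reflect.Value) error {
	t := dest.Type()
	for i := 0; i < t.NumField(); i++ {
		col := t.Field(i).Tag.Get("db")
		val, ok := r[col]
		if !ok {
			continue // column not selected; keep the zero value
		}
		rv := reflect.ValueOf(val)
		if !rv.IsValid() || !rv.Type().AssignableTo(t.Field(i).Type) {
			return fmt.Errorf("column %q: cannot assign %T to field %s", col, val, t.Field(i).Name)
		}
		dest.Field(i).Set(rv)
	}
	return nil
}

// get fills dest (a pointer to a struct) from the first row only.
func get(rows []row, dest interface{}) error {
	if len(rows) == 0 {
		return errors.New("not found")
	}
	return scanStruct(rows[0], reflect.ValueOf(dest).Elem())
}

// selectAll appends one struct per row to the slice that dest points to.
func selectAll(rows []row, dest interface{}) error {
	slice := reflect.ValueOf(dest).Elem()
	elemType := slice.Type().Elem()
	for _, r := range rows {
		item := reflect.New(elemType).Elem()
		if err := scanStruct(r, item); err != nil {
			return err
		}
		slice.Set(reflect.Append(slice, item))
	}
	return nil
}

func main() {
	rows := []row{
		{"timeline": "me", "text": "first"},
		{"timeline": "me", "text": "second"},
	}
	var one tweet
	fmt.Println(get(rows, &one), one)

	var all []tweet
	fmt.Println(selectAll(rows, &all), len(all))
}
```

The same loop serves every destination type, which is exactly the repetition that hand-written scan loops force you to duplicate.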

If scanning all of the rows is not desired, one can use struct scanning on a query iterator.

Performance

Unlike many ORMs, Gocqlx is fast. It uses the excellent Reflectx package (part of Sqlx), which caches reflection results.

For iterative rebinding of query parameters, e.g. inserting multiple rows into a table, Gocqlx proved to be about twice as fast as raw Gocql in the benchmark below (roughly 120 µs vs. 258 µs per insert). That’s because Gocqlx, compared to the traditional use of Gocql, reuses memory for the values being bound.

Below are the results of a Go benchmark comparing INSERT, Get, and Select with Gocqlx to plain Gocql.

BenchmarkE2EGocqlInsert-4    500000   258434 ns/op   2627 B/op   59 allocs/op
BenchmarkE2EGocqlxInsert-4  1000000   120257 ns/op   1555 B/op   34 allocs/op
BenchmarkE2EGocqlGet-4      1000000   131424 ns/op   1970 B/op   55 allocs/op
BenchmarkE2EGocqlxGet-4     1000000   131981 ns/op   2322 B/op   58 allocs/op
BenchmarkE2EGocqlSelect-4     30000  2588562 ns/op  34605 B/op  946 allocs/op
BenchmarkE2EGocqlxSelect-4    30000  2637187 ns/op  27718 B/op  951 allocs/op
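
To read these numbers as relative differences, the figures can be divided pairwise. The short program below does only that arithmetic on the ns/op and B/op values copied from the output above; the `bench` type and `ratio` helper are names made up for this sketch.

```go
package main

import "fmt"

// bench holds one Gocql/Gocqlx pair copied from the benchmark output.
type bench struct {
	name           string
	gocqlNs, xNs   float64 // ns/op for plain Gocql and for Gocqlx
	gocqlB, xBytes float64 // B/op for plain Gocql and for Gocqlx
}

// ratio divides the plain Gocql figure by the Gocqlx one;
// values above 1 favor Gocqlx, values below 1 favor plain Gocql.
func ratio(gocql, gocqlx float64) float64 { return gocql / gocqlx }

func main() {
	results := []bench{
		{"Insert", 258434, 120257, 2627, 1555},
		{"Get", 131424, 131981, 1970, 2322},
		{"Select", 2588562, 2637187, 34605, 27718},
	}
	for _, b := range results {
		fmt.Printf("%-6s time x%.2f  memory x%.2f\n",
			b.name, ratio(b.gocqlNs, b.xNs), ratio(b.gocqlB, b.xBytes))
	}
}
```

Insert comes out a little over 2x faster with Gocqlx, while Get and Select stay within about 2% of plain Gocql in time per operation.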

Conclusions

Gocqlx gives you a lot of flexibility and significantly simplifies working with a ScyllaDB / Apache Cassandra database. The code is faster to write and easier to maintain, because repetitive code is replaced with more idiomatic constructs. The modular and simple design enables Gocqlx to live alongside Gocql and leverage it where Gocql shines. Gocqlx is also fast: in the benchmark above it stays within about 2% of plain Gocql on reads and is about twice as fast on inserts.

Gocqlx is licensed under Apache License 2.0.

Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

 

About Michal Matczuk

Michał is a Software Team Leader working on ScyllaDB Manager, Drivers and ScyllaDB Cloud. He's the author of GocqlX, an ORM framework for ScyllaDB, and contributor to many open-source projects. He holds an MS in CS and BS in Math from the University of Warsaw (MIM UW).