Case Study: How Hyver uses ScyllaDB for API Key Management

This case study first appeared on the Compose website as part of their Write Stuff series.

Building a platform means building an API to go with it and building an API means managing the keys to that API efficiently. Kirk Morales, CEO at Hyver, has used ScyllaDB to solve this tricky problem and in this Write Stuff article, he shows how you can too.

When building a platform, clients will need to interface with it in various ways. At first, basic user authentication may be all that’s necessary, but as you start allowing for integrations and open your platform up to developers, you’ll need to develop an API key management solution that scales.

At Hyver, we consider what we’ve built a platform more than just a product. As such, we realize more and more types of clients and integration points for pushing and pulling data to/from us. We made the mistake early on of building highly-specific authentication mechanisms for our immediate clients: a web user and Zapier. Instead of a generic key management system, we authenticated Zapier requests uniquely and even had a custom table and columns for keys in our database. As our need for additional integration points surfaced, we realized we had to re-build how we authenticate requests and the underlying architecture for doing so.

ScyllaDB is great for storing, managing, and performing quick lookups of API keys when needed for authentication. Due to its high throughput and fast reads, request latency isn’t increased drastically as it could be with other databases. Also, the schema and read/write operations are incredibly simple to maintain.

Defining the schema

At a high level, we need to consider the following entities:

Permissions
A set of permissions for an API key, defining what its capabilities are. Each API key should have a unique set of permissions so as to control how it can be used. If extra or fewer permissions are needed by another client, a brand new API key should be generated for that specific client.

For the sake of simplicity, we’ll define permissions by using just a string, consisting of one of the following values:

management – Access to management functionality, minus billing, e.g. updating account settings or getting a list of objects within an account.

billing – Ability to manage an account’s billing settings, including the subscribed plan and card on file.

full – Full access to the account.

Accounts
This is our parent object, to which API keys will belong and the object being authenticated. Whatever a key’s permissions will be allowed only on the account it’s associated with.

At a minimum, we’ll need a table to store accounts with a map containing its API keys.

CREATE TABLE accounts (  
  account_id uuid PRIMARY KEY,
  api_keys map<uuid, text>
);

We use a map type here for a few reasons. First, storing the keys directly in the accounts table prevents us from having to make subsequent SELECT statements to read all keys for an account. Additionally, we can update multiple keys at once with a single UPDATE command. Lastly, since we’re storing permissions as a basic string, retrieving API keys as a map allows us to do a quick permissions lookup in code.

API Key
A key that is provided in the request, belonging to a single account. An account may have multiple API keys for different purposes. When creating the table to store keys, we want to make sure each key is linked to the account it belongs to as well as the appropriate permissions for a quick look up.

CREATE TABLE api_keys (  
  api_key uuid PRIMARY KEY,
  account_id uuid,
  permissions text
);

Since a client will only ever be sending an API key with a request, we need API keys to be unique to a single account (therefore, unique universally), so we place the PRIMARY KEY only on api_key. If we made it across (api_key, account_id), we could have duplicate API keys in the wild, each associated with a different account…very bad.

Now, you may be wondering why we split these out. Having redundant data in both the accounts and api_keys tables seems like a waste of space. Extra storage? Yes. A waste? Absolutely not.

We could put an index on account.api_keys to use when authenticating requests, but due to how ScyllaDB handles indexes, this is wildly inefficient and much slower. It is recommended not to use indexes on high-cardinality values and instead store them in a separate table. Read this Datastax article for more insight as to why.

So, by creating another table, yes, we’re increasing our storage footprint, but we’re ensuring a fast and efficient lookup.

Creating API Keys

Let’s start to fill up our database. First, we’ll create an account.

INSERT INTO accounts (account_id) VALUES (uuid());  
SELECT account_id FROM accounts;

From this, we’ll get a new Account ID, let’s say 4e45f0e8-ade5-48fd-862d-d657f299828b.

Next, we’ll define three API keys, one for each permission type of “full”, “management”, and “billing”.

UPDATE accounts  
SET api_keys = {uuid(): 'full', uuid(): 'management', uuid(): 'billing'}  
WHERE account_id = 4e45f0e8-ade5-48fd-862d-d657f299828b;

In this example, we’re letting CQL create new UUIDs for us, however, if we already have IDs we want to use (such as ones generated in code), we can set them in a similar fashion:

UPDATE accounts  
SET api_keys = {  
  4f6f1cf5-ba62-4b9c-be39-ee73a02e6dab: 'full', 
  462d3e55-4903-405f-aeec-cb46b327b025: 'management', 
  282a3b8b-f20c-4b5c-b38b-ec2e7d943c32: 'billing'
}
WHERE account_id = 4e45f0e8-ade5-48fd-862d-d657f299828b;

We now have an account with three active API keys. Last thing to do is add them to our api_keys table for use in our API when authenticating requests. We’ll add each key as a separate record, associating them with the account and the appropriate permissions:

INSERT INTO api_keys (api_key, account_id, permissions) VALUES (4f6f1cf5-ba62-4b9c-be39-ee73a02e6dab, 4e45f0e8-ade5-48fd-862d-d657f299828b, 'full');  
INSERT INTO api_keys (api_key, account_id, permissions) VALUES (462d3e55-4903-405f-aeec-cb46b327b025, 4e45f0e8-ade5-48fd-862d-d657f299828b, 'management');  
INSERT INTO api_keys (api_key, account_id, permissions) VALUES (282a3b8b-f20c-4b5c-b38b-ec2e7d943c32, 4e45f0e8-ade5-48fd-862d-d657f299828b, 'billing');

Authenticating Requests

Now for the easy part — when a client makes a request, to check the validity of the supplied API key and get permissions, we make one request:

SELECT * from api_keys WHERE api_key = SOME_KEY;

We’ll get the Account ID (which we’ll likely need to execute the requested operation) and will get the text-based permissions for the key, telling us if we can authorize the request. From here, there are a few possibilities:

The API key doesn’t exist – we’ll throw a 401 Unauthorized
The API key exists, but its permissions don’t match the request. We can throw a 401 Unauthorized or 403 Forbidden.

Unless we need to see an existing account value, we can process the requested action without having to do another SELECT since we already have the account ID.

Managing Existing API Keys

The trade-off for efficiency is seen when we have to update existing keys by making multiple writes. Not only do we need to make multiple writes, but we’ll want to batch the requests to ensure they all happen at once. We’re okay with this, though, because we can afford the extra write cost now more than a larger read cost when authorizing a request.

Update permissions

Updating the permissions for an existing key requires us to update the text value for the map in the accounts table as well as the permissions column in the api_keys table.

Let’s update the permissions for the API key 4f6f1cf5-ba62-4b9c-be39-ee73a02e6dab to be “management”.

BEGIN BATCH  
  UPDATE accounts SET api_keys[4f6f1cf5-ba62-4b9c-be39-ee73a02e6dab] = 'management' WHERE account_id = 4e45f0e8-ade5-48fd-862d-d657f299828b;
  UPDATE api_keys SET permissions = 'management' WHERE api_key = 4f6f1cf5-ba62-4b9c-be39-ee73a02e6dab;
APPLY BATCH;

Add a key

To add a new key, we simply add it to the map in accounts and create a new row in api_keys. Let’s issue a new “full” API key 1939b3d5-34ae-436f-a7f0-02ac1368194b:

BEGIN BATCH  
  UPDATE accounts SET api_keys = api_keys + {1939b3d5-34ae-436f-a7f0-02ac1368194b: 'full'} WHERE account_id = 4e45f0e8-ade5-48fd-862d-d657f299828b;
  INSERT INTO api_keys (api_key, account_id, permissions) VALUES (1939b3d5-34ae-436f-a7f0-02ac1368194b, 4e45f0e8-ade5-48fd-862d-d657f299828b, 'full');
APPLY BATCH;

Remove a key

If you need to deactivate or remove a key, we simply remove all references to it. Let’s delete the key we just created:

BEGIN BATCH  
  DELETE api_keys[1939b3d5-34ae-436f-a7f0-02ac1368194b] FROM accounts WHERE account_id = 4e45f0e8-ade5-48fd-862d-d657f299828b;
  DELETE FROM api_keys WHERE api_key = 1939b3d5-34ae-436f-a7f0-02ac1368194b;
APPLY BATCH;

Wrap up

Although it may seem trivial, proper and efficient management of API keys on your platform is crucial to the security of your data and speed of fulfilling requests.

It’s tempting to simplify your schema into a single table — sure, you only have to make a single request to update keys and permissions and don’t have to worry about data consistency across tables. Those benefits, however, are highly outweighed by the inefficiencies and subsequent speed loss due to performing an indexed lookup rather than a PRIMARY KEY lookup on a separate table.

However you define your API key and permissions scheme, be sure to keep a few points in mind:

Determine how your data will be looked up most frequently and determine if it makes sense to abstract it out to its own table rather than relying on an INDEX.
Your permissions scheme may be more complex than the text types we used in this example. In that case, consider linking each API key to a uuid that represents another object defining the permissions OR simply JSON-encode the permission data into the same text value.
Monitor common requests and their subsequent performance against your database. Look for long-running requests and other areas you can improve your schema to limit the number of reads from your database.

Let’s do this

Getting started takes only a few minutes. ScyllaDB has an installer for every major platform. If you get stuck, we’re here to help.

Get started

Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. Amazon DynamoDB® and Dynamo Accelerator® are trademarks of Amazon.com, Inc. No endorsements by The Apache Software Foundation or Amazon.com, Inc. are implied by the use of these marks.

Why ScyllaDB?

Is ScyllaDB right for me?

ScyllaDB University

ScyllaDB Blog