
Role-based Access Control in ScyllaDB


The next open-source release (version 2.2) of ScyllaDB will include support for role-based access control. This feature was introduced in version 2.2 of Apache Cassandra. This post starts with an overview of the access control system in ScyllaDB and some of the motivation for augmenting it with roles. We’ll explain what roles are and show an example of their use. Finally, we’ll cover how ScyllaDB transitions existing access-control data to the new roles-based system when you upgrade a cluster.

Access Control in ScyllaDB

There are two aspects of access control in ScyllaDB: controlling client connections to a ScyllaDB node (authentication), and controlling which operations a client can execute (authorization).

By default, no access control is enabled on ScyllaDB clusters. This means that a client can connect to any node unrestricted, and that the client can execute any operation supported by the database.

When we enable access control (as described in ScyllaDB’s documentation), ScyllaDB’s behavior changes in two important ways:

  • A client cannot connect to a node unless it provides valid credentials for an identity known to the system (a username and password)
  • A CQL query can only be executed if the authenticated identity has been granted the applicable permissions on the database objects involved in the query

For example, a logged-in user jsmith will only be permitted to execute

SELECT * FROM events.ingress;

if jsmith has been granted (directly or indirectly) the SELECT permission on the events.ingress table.

One way to give jsmith the permissions they need is to grant SELECT on the entire events keyspace, which covers every table in that keyspace.

GRANT SELECT ON KEYSPACE events TO jsmith;

We can verify the permissions granted to jsmith:

LIST ALL PERMISSIONS OF jsmith;

 role   | username | resource          | permission
--------+----------+-------------------+------------
 jsmith | jsmith   | <keyspace events> | SELECT

Limitations of User-based Access Control

Access control based only on users can quickly become unwieldy. To see why, consider a large set of resources on which all analysts at an organization need similar permissions.

GRANT SELECT ON events.ingress TO jsmith;
GRANT MODIFY ON events.ingress TO jsmith;
GRANT SELECT ON events.egress TO jsmith;
GRANT MODIFY ON events.egress TO jsmith;
GRANT SELECT ON KEYSPACE endpoints TO jsmith;

The same permissions have been granted to users aburns, tpetty, and many others. If an analyst joins the company, an administrator needs to carefully grant them each of the applicable permissions. If the set of resources changes, every analyst needs to be updated with the new permissions.

To avoid this problem, a clever administrator might decide to create an “umbrella” user, like analyst, and have all analysts log in with that username and password whenever they interact with the system. That way, we only have to deal with a permission set for a single user. Unfortunately, by doing this, we lose an important security property: non-repudiation. This roughly means that the origin of data can be traced to a particular identity. We may want to know who modified data or accessed a particular table (i.e., we want access auditing), and sharing a single user makes this impossible.

Introducing Roles

One solution to the complexity described above is the use of roles. A role is an identity with a permission set, just like a user. Roles generalize users, though, because a role can also be granted to other roles.

In our example, we could create an analyst role and grant it all of the permissions that analysts need to do their job. The analyst role has no credentials associated with it and cannot log in to the system. We grant analyst to aburns to give aburns all the permissions of analyst. If the permission set for analysts needs to change, we only need to change the analyst role.
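Because a role can be granted to another role, permission sets can also be layered. The following is only a sketch: it assumes the analyst role and the aburns user already exist, and the senior_analyst role is a hypothetical name that does not appear elsewhere in this post.

```cql
-- Hypothetical role for illustration; it has no credentials and cannot log in.
CREATE ROLE senior_analyst;

-- senior_analyst inherits every permission held by analyst...
GRANT analyst TO senior_analyst;

-- ...and adds one of its own.
GRANT ALTER ON KEYSPACE events TO senior_analyst;

-- aburns now holds the permissions of both senior_analyst and analyst.
GRANT senior_analyst TO aburns;
```

With this layering, a change to analyst is picked up by senior_analyst and by every user who has been granted either role.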

A Concrete Example

We’ll briefly go through the example above to demonstrate the CQL syntax of the roles-based system. This particular example is from the master branch of ScyllaDB (specifically at commit 4419e602074c8d647f492612979cd98c677d89d9), as we are preparing for the next release.

First, we create the analyst role and grant it the necessary permissions.

CREATE ROLE analyst;

GRANT SELECT ON events.ingress TO analyst;
GRANT MODIFY ON events.ingress TO analyst;
GRANT SELECT ON events.egress TO analyst;
GRANT MODIFY ON events.egress TO analyst;
GRANT SELECT ON KEYSPACE endpoints TO analyst;

Then we create a user that can log in for each of the analysts in our system.

CREATE ROLE jsmith WITH LOGIN = true AND PASSWORD = 'jsmith';
CREATE ROLE aburns WITH LOGIN = true AND PASSWORD = 'aburns';
CREATE ROLE tpetty WITH LOGIN = true AND PASSWORD = 'tpetty';

We grant analyst to each.

GRANT analyst TO jsmith;
GRANT analyst TO aburns;
GRANT analyst TO tpetty;

We can inspect the permissions of a user and see that they inherit those of analyst:

LIST ALL PERMISSIONS OF jsmith;

 role    | username | resource               | permission
---------+----------+------------------------+------------
 analyst | analyst  | <table events.egress>  | MODIFY
 analyst | analyst  | <table events.egress>  | SELECT
 analyst | analyst  | <table events.ingress> | MODIFY
 analyst | analyst  | <table events.ingress> | SELECT
 analyst | analyst  | <keyspace endpoints>   | SELECT
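To illustrate the earlier point that a change to the analyst role reaches every analyst at once, here is a minimal sketch that continues the example. The decision to make events.egress read-only and to remove tpetty from the group is invented for illustration.

```cql
-- Make events.egress read-only for every analyst with a single statement.
REVOKE MODIFY ON events.egress FROM analyst;

-- Inspect jsmith again; the listing is computed from the roles granted to jsmith.
LIST ALL PERMISSIONS OF jsmith;

-- Removing one person from the group is a single statement as well.
REVOKE analyst FROM tpetty;
```

No statement here touches jsmith or aburns directly, which is the administrative saving that roles are meant to provide.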

The Old USER CQL Statements

Astute readers may be wondering about the old user-based CQL statements: CREATE USER, ALTER USER, DROP USER, and LIST USERS. These still exist and with the same syntax as they had before.

What is important to understand is that roles generalize users. Any role can be granted permissions, can be granted to other roles, can have authentication credentials, and can be allowed to log in to the system. By convention, when a role is allowed to log in to the system, we call it a user. Therefore, all users are roles but not all roles are users.

CREATE USER is just like CREATE ROLE (with different syntax), except CREATE USER implicitly sets LOGIN = true.
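As a rough side-by-side of the two forms, the statements below are intended to create the same kind of login-capable identity. The user name mjones is hypothetical, and the two statements are alternatives: running both would fail on the second because the role already exists.

```cql
-- Old user-based form: LOGIN = true is implied.
CREATE USER mjones WITH PASSWORD 'mjones';

-- Role-based form (use instead of the statement above): login is enabled explicitly.
CREATE ROLE mjones WITH PASSWORD = 'mjones' AND LOGIN = true;
```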

Executing LIST ROLES will display all the roles in the system, but LIST USERS will only display roles with LOGIN = true.
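Assuming the roles from the example above have been created, the difference between the two listings can be checked directly. Output is omitted here because it depends on what else exists in the cluster.

```cql
-- Shows every role, including analyst, which cannot log in.
LIST ROLES;

-- Shows only roles with LOGIN = true, so analyst should be absent.
LIST USERS;

-- Shows the roles that have been granted to a particular role.
LIST ROLES OF jsmith;
```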

Migrating Old ScyllaDB Clusters

With the switch to role-based access control, ScyllaDB internally uses a new schema for storing metadata. ScyllaDB will automatically convert the old user-based metadata into the new format during a cluster upgrade.

When the first node in the cluster is restarted with the new ScyllaDB version, the metadata is converted and the node logs messages like the following:

INFO 2018-04-05 09:53:53,061 [shard 0] password_authenticator - Starting migration of legacy authentication metadata.
INFO 2018-04-05 09:53:53,065 [shard 0] password_authenticator - Finished migrating legacy authentication metadata.
INFO 2018-04-05 09:53:54,005 [shard 0] standard_role_manager - Starting migration of legacy user metadata.
INFO 2018-04-05 09:53:54,015 [shard 0] standard_role_manager - Finished migrating legacy user metadata.
INFO 2018-04-05 09:53:54,681 [shard 0] default_authorizer - Starting migration of legacy permissions metadata.
INFO 2018-04-05 09:53:54,690 [shard 0] default_authorizer - Finished migrating legacy permissions metadata.

Importantly, we do not support modifying access-control data during a cluster upgrade.

If a client is connected to an already-upgraded node in the midst of an upgrade, all modification statements will fail with an error message about incomplete cluster upgrades.

If a client is connected to an un-upgraded node, then the modification statements will succeed but not be reflected in the upgraded cluster.

The following table describes the old and new metadata tables, with the correspondence between the two if it exists.

 Old table               | New table
-------------------------+------------------------------
 system_auth.users       | system_auth.roles
                         | system_auth.role_members
 system_auth.credentials |
 system_auth.permissions | system_auth.role_permissions

Once the cluster has been fully upgraded and you have verified that all access-control information is correct, you can drop the legacy metadata tables:

DROP TABLE system_auth.users;
DROP TABLE system_auth.credentials;
DROP TABLE system_auth.permissions;
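Before running the DROP statements above, one possible way to check the migrated data is to compare what the role-based statements report against what you expect from the old cluster. This is only a sketch of a manual check, not an official verification procedure, and it must be run as a role that is allowed to read the system_auth tables.

```cql
-- Every migrated user should appear as a role that can log in.
LIST ROLES;
LIST USERS;

-- Every previously granted permission should still be listed.
LIST ALL PERMISSIONS;

-- The new metadata tables can also be read directly.
SELECT * FROM system_auth.roles;
SELECT * FROM system_auth.role_permissions;
```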

Conclusion and Acknowledgments

Roles can make it easier to achieve good security properties in your ScyllaDB cluster and can simplify a lot of common operations.

Please give this new feature a try and send us feedback, either as a GitHub issue (in the case of bugs), on the mailing list, or on our Slack channel.

Adding roles support to ScyllaDB also required restructuring the existing support for access control and many other parts of the system. Thanks to everyone involved for their careful review and input during this process.
