
The next open-source release (version 2.2) of ScyllaDB will include support for role-based access control. This feature was introduced in version 2.2 of Apache Cassandra. This post starts with an overview of the access control system in ScyllaDB and some of the motivation for augmenting it with roles. We’ll explain what roles are and show an example of their use. Finally, we’ll cover how ScyllaDB transitions existing access-control data to the new roles-based system when you upgrade a cluster.
Access Control in ScyllaDB
There are two aspects of access control in ScyllaDB: controlling client connections to a ScyllaDB node (authentication), and controlling which operations a client can execute (authorization).
By default, no access control is enabled on ScyllaDB clusters. This means that a client can connect to any node unrestricted, and that the client can execute any operation supported by the database.
When we enable access control (as described in ScyllaDB’s documentation), there are two important changes to ScyllaDB’s behavior:
- A client cannot connect to a node unless it provides valid credentials for an identity known to the system (a username and password)
- A CQL query can only be executed if the authenticated identity has been granted the applicable permissions on the database objects involved in the query
For example, a logged-in user jsmith will only be permitted to execute
SELECT * FROM events.ingress;
if jsmith has been granted (directly or indirectly) the SELECT permission on the events.ingress table.
One way to grant jsmith the permissions they need is to grant SELECT on the entire events keyspace. The grant covers every table in the keyspace as well.
GRANT SELECT ON KEYSPACE events TO jsmith;
We can verify the permissions granted to jsmith:
LIST ALL PERMISSIONS OF jsmith;
role | username | resource | permission |
jsmith | jsmith | <keyspace events> | SELECT |
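The authorization rule described above (a permission granted on a keyspace also applies to the tables inside it) can be sketched as a small model. This is only an illustration under simplifying assumptions, not ScyllaDB's implementation: the tuple encoding of resources and the helper names `parents` and `is_authorized` are hypothetical, and the hierarchy is reduced to two levels (keyspace and table).

```python
# Illustrative model only; not ScyllaDB's actual authorizer.

def parents(resource):
    """Yield the resource itself, then any enclosing resource."""
    # A resource is ("table", keyspace, table) or ("keyspace", keyspace).
    yield resource
    if resource[0] == "table":
        yield ("keyspace", resource[1])


def is_authorized(grants, identity, permission, resource):
    """True if `identity` holds `permission` on `resource` or on a resource
    that encloses it."""
    held = grants.get(identity, set())
    return any((permission, r) in held for r in parents(resource))


# Mirrors: GRANT SELECT ON KEYSPACE events TO jsmith;
grants = {"jsmith": {("SELECT", ("keyspace", "events"))}}

ingress = ("table", "events", "ingress")
print(is_authorized(grants, "jsmith", "SELECT", ingress))  # True
print(is_authorized(grants, "jsmith", "MODIFY", ingress))  # False
```

In this model the SELECT query on events.ingress is allowed because the check walks up from the table to its keyspace, where the grant lives.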
Limitations of User-based Access Control
Access control based only on users can quickly become unwieldy. To see why, consider a large set of resources on which all analysts at an organization need similar permissions.
GRANT SELECT ON events.ingress TO jsmith;
GRANT MODIFY ON events.ingress TO jsmith;
GRANT SELECT ON events.egress TO jsmith;
GRANT MODIFY ON events.egress TO jsmith;
GRANT SELECT ON KEYSPACE endpoints TO jsmith;
The same permissions have been granted to users aburns, tpetty, and many others. If an analyst joins the company, an administrator needs to carefully grant them the applicable permissions. If the set of resources changes, every analyst's grants must be updated to match.
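To make the bookkeeping concrete, here is a rough sketch that generates the GRANT statements an administrator would have to issue under the user-based scheme. The helper `grant_statements` is hypothetical and exists only for this illustration; the permission list is the one shown above.

```python
def grant_statements(identities, grants):
    """Build one GRANT statement per (identity, permission, resource) triple."""
    return [
        f"GRANT {permission} ON {resource} TO {who};"
        for who in identities
        for permission, resource in grants
    ]


# The permission set every analyst needs, taken from the example above.
ANALYST_GRANTS = [
    ("SELECT", "events.ingress"),
    ("MODIFY", "events.ingress"),
    ("SELECT", "events.egress"),
    ("MODIFY", "events.egress"),
    ("SELECT", "KEYSPACE endpoints"),
]

analysts = ["jsmith", "aburns", "tpetty"]
statements = grant_statements(analysts, ANALYST_GRANTS)

print(len(statements))  # 15
print(statements[0])    # GRANT SELECT ON events.ingress TO jsmith;
```

The number of statements is the number of users multiplied by the number of grants, so every new analyst or new resource adds another batch to issue and keep consistent.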
To avoid this problem, an administrator might decide to create an “umbrella” user, like analyst, and have all analysts log in with that username and password whenever they interact with the system. That way, we only have to deal with a permission set for a single user. Unfortunately, by doing this, we lose an important security property: non-repudiation. This roughly means that the origin of data can be traced to a particular identity. We may want to know who modified data or accessed a particular table (i.e., we want access auditing), and a single shared user makes this impossible.
Introducing Roles
One solution to the complexity described above is the use of roles. A role is an identity with a permission set, just like a user. Roles generalize users, though, because a role can also be granted to other roles.
In our example, we could create an analyst role and grant it all of the permissions that analysts need to do their job. The analyst role has no credentials associated with it and cannot log in to the system. We grant analyst to aburns to give aburns all the permissions of analyst. If the permission set for analysts needs to change, we only need to change the analyst role.
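The inheritance described above can be sketched as a walk over the "granted to" relationships between roles. This is a minimal model, not ScyllaDB's implementation: the dictionaries `role_grants` and `permissions` and both function names are assumptions made for the illustration.

```python
def effective_roles(role_grants, identity):
    """Return `identity` plus every role granted to it, directly or
    transitively."""
    seen = set()
    stack = [identity]
    while stack:
        role = stack.pop()
        if role in seen:
            continue
        seen.add(role)
        stack.extend(role_grants.get(role, ()))
    return seen


def effective_permissions(role_grants, permissions, identity):
    """Union of the permissions of every role reachable from `identity`."""
    result = set()
    for role in effective_roles(role_grants, identity):
        result |= permissions.get(role, set())
    return result


# Mirrors: GRANT analyst TO aburns;
role_grants = {"aburns": ["analyst"]}
permissions = {
    "analyst": {
        ("SELECT", "events.ingress"),
        ("MODIFY", "events.ingress"),
    },
}

print(sorted(effective_permissions(role_grants, permissions, "aburns")))
# [('MODIFY', 'events.ingress'), ('SELECT', 'events.ingress')]
```

Because aburns holds no permissions of its own in this model, changing the entry for analyst changes what aburns (and every other holder of the role) can do, with no per-user updates.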
A Concrete Example
We’ll briefly go through the example above to demonstrate the CQL syntax of the roles-based system. This particular example is from the master branch of ScyllaDB (specifically at commit 4419e602074c8d647f492612979cd98c677d89d9), as we are preparing for the next release.
First, we create the analyst role and grant it the necessary permissions.
CREATE ROLE analyst;
GRANT SELECT ON events.ingress TO analyst;
GRANT MODIFY ON events.ingress TO analyst;
GRANT SELECT ON events.egress TO analyst;
GRANT MODIFY ON events.egress TO analyst;
GRANT SELECT ON KEYSPACE endpoints TO analyst;
Then, for each analyst in our system, we create a user that can log in.
CREATE ROLE jsmith WITH LOGIN = true AND PASSWORD = 'jsmith';
CREATE ROLE aburns WITH LOGIN = true AND PASSWORD = 'aburns';
CREATE ROLE tpetty WITH LOGIN = true AND PASSWORD = 'tpetty';
We grant analyst to each.
GRANT analyst TO jsmith;
GRANT analyst TO aburns;
GRANT analyst TO tpetty;
We can inspect the permissions of a user and see that they inherit those of analyst:
LIST ALL PERMISSIONS OF jsmith;
role | username | resource | permission |
analyst | analyst | <table events.egress> | MODIFY |
analyst | analyst | <table events.egress> | SELECT |
analyst | analyst | <table events.ingress> | MODIFY |
analyst | analyst | <table events.ingress> | SELECT |
analyst | analyst | <keyspace endpoints> | SELECT |
The Old USER CQL Statements
Astute readers may be wondering about the old user-based CQL statements: CREATE USER, ALTER USER, DROP USER, and LIST USERS. These still exist, with the same syntax as before.
What is important to understand is that roles generalize users. Any role can be granted permissions, be granted to other roles, have authentication credentials, and be allowed to log in to the system. By convention, when a role is allowed to log in to the system, we call it a user. Therefore, all users are roles, but not all roles are users.
CREATE USER is just like CREATE ROLE (with different syntax), except that CREATE USER implicitly sets LOGIN = true.
Executing LIST ROLES will display all the roles in the system, but LIST USERS will only display roles with LOGIN = true.
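The relationship between the two families of statements can be modeled in a few lines. This is a hedged sketch of the convention described above, not ScyllaDB's code: the `Role` class, its `can_login` field, and the four helper functions are hypothetical names chosen to echo the CQL statements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    can_login: bool = False


def create_role(name, login=False):
    """Model of CREATE ROLE: login is disabled unless requested."""
    return Role(name, can_login=login)


def create_user(name):
    """Model of CREATE USER: same as a role, with LOGIN = true implied."""
    return Role(name, can_login=True)


def list_roles(roles):
    """Model of LIST ROLES: every role, whether or not it can log in."""
    return sorted(r.name for r in roles)


def list_users(roles):
    """Model of LIST USERS: only the roles that can log in."""
    return sorted(r.name for r in roles if r.can_login)


roles = [
    create_role("analyst"),
    create_user("jsmith"),
    create_role("aburns", login=True),
]

print(list_roles(roles))  # ['aburns', 'analyst', 'jsmith']
print(list_users(roles))  # ['aburns', 'jsmith']
```

Note that aburns appears in both listings even though it was created as a role: in this model, as in the convention above, being a user is just a matter of the login flag.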
Migrating Old ScyllaDB Clusters
With the switch to role-based access control, ScyllaDB internally uses a new schema for storing metadata. ScyllaDB will automatically convert the old user-based metadata into the new format during a cluster upgrade.
When the first node in the cluster is restarted with the new ScyllaDB version, the metadata will be converted, with log messages like the following:
INFO 2018-04-05 09:53:53,061 [shard 0] password_authenticator - Starting migration of legacy authentication metadata.
INFO 2018-04-05 09:53:53,065 [shard 0] password_authenticator - Finished migrating legacy authentication metadata.
INFO 2018-04-05 09:53:54,005 [shard 0] standard_role_manager - Starting migration of legacy user metadata.
INFO 2018-04-05 09:53:54,015 [shard 0] standard_role_manager - Finished migrating legacy user metadata.
INFO 2018-04-05 09:53:54,681 [shard 0] default_authorizer - Starting migration of legacy permissions metadata.
INFO 2018-04-05 09:53:54,690 [shard 0] default_authorizer - Finished migrating legacy permissions metadata.
Importantly, we do not support modifying access-control data during a cluster upgrade.
If a client is connected to an already-upgraded node in the midst of an upgrade, all modification statements will fail with an error message about incomplete cluster upgrades.
If a client is connected to an un-upgraded node, then the modification statements will succeed but not be reflected in the upgraded cluster.
The following table describes the old and new metadata tables, with the correspondence between the two if it exists.
Old table | New table |
system_auth.users | system_auth.roles |
(none) | system_auth.role_members |
system_auth.credentials | (none) |
system_auth.permissions | system_auth.role_permissions |
Once the cluster has been fully upgraded and you have verified that all access-control information is correct, you can drop the legacy metadata tables:
DROP TABLE system_auth.users;
DROP TABLE system_auth.credentials;
DROP TABLE system_auth.permissions;
Conclusion and Acknowledgments
Roles can make it easier to achieve good security properties in your ScyllaDB cluster and can simplify a lot of common operations.
Please give this new feature a try and provide feedback either as a GitHub issue (in the case of bugs), on the mailing list, or on our Slack Channel.
Adding roles support to ScyllaDB also required restructuring existing support for access-control and many other parts of the system. Thanks to everyone involved for their careful review and input during this process.