We had a lot of fun and learned a lot at Scylla’s internal developer conference and hackathon this year. For our hackathon project, we decided to create an API to manage a Scylla cluster using native Cassandra Query Language (CQL) commands.
Background and Motivation
Scylla (and Cassandra, for that matter) uses nodetool as its primary command-line management tool. Nodetool is a Java-based application that connects to the server over Java’s JMX API; in Scylla’s case, it goes through a JMX proxy that forwards the calls to Scylla’s RESTful API.
What we were after was to have similar capabilities using Scylla’s native CQL interface. This approach has its advantages:
- CQL is a native API in Scylla that, by definition, Scylla users already use
- Makes the management API a first-class citizen
- CQL is easily understood by humans and computers alike
- CQL can be secured, subject to access control
- You can use extra facilities in CQL, such as filtering with SELECT
- Does not require external tools and drops Scylla’s last dependency on the JVM
Our team combined Scylla server-side development expertise with the operations expertise of our management and monitoring leaders.
The CQL Management API team, clockwise from upper left, included Kostja Osipov (best known for his work on LWT), Amnon Heiman from the Scylla Monitoring Stack team, Michal Matczuk from the Scylla Manager team, and Tomasz Grabiec, one of the few black belts of Scylla core.
We started by looking at common nodetool commands to see how they could be mapped to CQL. What we found is that it depends on the command. Some commands, like nodetool status, are tabular in nature and are best served by selecting from a virtual table.
Other commands, like nodetool repair, are best described as SQL-style procedure calls. Still others, like nodetool version, return a single value and are best implemented as functions.
So we did it all.
We use virtual tables for nodetool commands that are tabular in nature and that can benefit from filtering capabilities.
A virtual table is a facade that acts like a regular CQL table that you can select from, but the data behind it is generated programmatically. For example, consider getting your system status as a CQL table:
As you can see, it acts just like a regular table. Cool, right?
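To make the facade idea concrete, here is a minimal Python sketch of a table whose rows are produced by a function on every read. It is an illustration only, not Scylla's implementation: the `VirtualTable` class, the `fake_node_status` producer, and the column names (`address`, `status`, `load_mb`) are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Row = Dict[str, object]


class VirtualTable:
    """A read-only table whose rows are produced by a function on every read."""

    def __init__(self, columns: List[str], producer: Callable[[], Iterable[Row]]):
        self.columns = columns
        self._producer = producer  # called on each select; nothing is stored

    def select(self, where: Optional[Callable[[Row], bool]] = None) -> Iterator[Row]:
        for row in self._producer():
            if where is None or where(row):
                # Project onto the declared columns, like SELECT col1, col2 ...
                yield {col: row[col] for col in self.columns}


# Hypothetical stand-in for live cluster state; a real server would read
# its in-memory view of the cluster here instead of hard-coded rows.
def fake_node_status() -> Iterable[Row]:
    yield {"address": "10.0.0.1", "status": "UP", "load_mb": 512}
    yield {"address": "10.0.0.2", "status": "DOWN", "load_mb": 0}
    yield {"address": "10.0.0.3", "status": "UP", "load_mb": 498}


status_table = VirtualTable(["address", "status", "load_mb"], fake_node_status)

# Roughly what a filtered SELECT over the virtual table would do.
down_nodes = list(status_table.select(where=lambda r: r["status"] == "DOWN"))
print(down_nodes)
```

Because the producer runs on each read, the caller always sees current state, and filtering comes for free from the query layer rather than from the management command itself.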
Other supported virtual tables are:
Both act like the equivalent nodetool commands (compare here and here).
Now to management commands, such as repair and taking a snapshot.
We used a SQL-like CALL syntax with parameters:
cqlsh> CALL system.take_snapshot(ks=>'keyspace1', tag=>'mysnp');
A procedure can return a result set if needed.
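For tooling that needs to issue such calls, a small client-side helper could assemble the statement text. The sketch below assumes the prototype's `name=>value` syntax shown above; the `build_call` and `cql_literal` helpers are hypothetical, and only strings, booleans, and numbers are handled.

```python
def cql_literal(value: object) -> str:
    """Render a Python value as a CQL literal (strings, booleans, numbers only)."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # CQL escapes a single quote inside a string by doubling it.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")


def build_call(procedure: str, **params: object) -> str:
    """Build a CALL statement with named (name=>value) parameters."""
    args = ", ".join(f"{name}=>{cql_literal(value)}" for name, value in params.items())
    return f"CALL {procedure}({args});"


print(build_call("system.take_snapshot", ks="keyspace1", tag="mysnp"))
# prints: CALL system.take_snapshot(ks=>'keyspace1', tag=>'mysnp');
```

Building statement text by hand is only a stopgap; bound parameters would be preferable once the procedure call syntax supports parameter markers.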
Finally, there are nodetool commands that return a single value; nodetool version is a good example of this.
We implemented those as functions and, to make them easy to use, added the DUAL table, a dummy table that you can query anything from.
SELECT scylla_version() FROM dual;

 scylla_version
---------------------------------------
 666.development-0.20201026.e3d1d458c2
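Why a dummy table? A CQL SELECT needs a FROM clause, so a table with exactly one row gives a scalar function somewhere to be evaluated exactly once. The Python sketch below models that idea; the `FUNCTIONS` registry and the `select_from_dual` helper are hypothetical names, and the version string is simply the one from the output above.

```python
from typing import Callable, Dict, List

# Hypothetical registry of scalar management functions.
FUNCTIONS: Dict[str, Callable[[], str]] = {
    "scylla_version": lambda: "666.development-0.20201026.e3d1d458c2",
}

# DUAL holds exactly one row and carries no meaningful data of its own.
DUAL: List[Dict[str, object]] = [{}]


def select_from_dual(function_name: str) -> List[Dict[str, str]]:
    """Evaluate a registered function once per row of DUAL (that is, once)."""
    fn = FUNCTIONS[function_name]
    return [{function_name: fn()} for _row in DUAL]


result = select_from_dual("scylla_version")
print(result)
```

The result set has one row and one column named after the function, which matches the shape of the cqlsh output.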
What We Achieved
We set out to create basic, working nodetool-like functionality based on CQL, and we were able to accomplish this in the course of the hackathon. However, integrating our work into the Scylla code base, optimizing it, and fleshing it out for production readiness will take more time and testing. Look forward to this being introduced in Scylla Open Source in the future.
Also, don’t worry! We are committed to supporting standard nodetool because of its ubiquity and familiarity to users. However, for those who want to do in-band management through CQL, you will soon have a unique option with Scylla.
Open Issues and Future Steps
During the hackathon we accomplished a great deal, but we have more ideas for how to make this capability work better and more smoothly. Here’s our list of possible improvements:
- Compile-time schema definition:
  - reduce mutation model boilerplate
  - make the code compile-time safe
- Make the CQL DISTINCT keyword work (queue_reader::next_partition())
- Proper memory accounting for reader admission control
- Use temporary tables to avoid OOM
- Allow running from a single shard (fix sharders)
- Unit tests
- Ordinal routine parameters
- Parameter markers, UDTs, compounds
- SYSTEM.FUNCTIONS should be SYSTEM.ROUTINES
- ANSI User-Defined Procedures
- ANSI INFORMATION_SCHEMA (must-have for JDBC)
- Oracle-style PERFORMANCE_SCHEMA
See You at Scylla Summit!
Scylla Summit is right around the corner, January 12-14, 2021. You’ll learn more about our future roadmap, and hear how your industry peers are managing Scylla in their production networks. We also have a day of online classes for both application developers and administrators. Best of all, it’s free, virtual and online this year. Sign up today!