The next open-source release (version 2.2) of Scylla will include support for role-based access control. This feature was introduced in version 2.2 of Apache Cassandra. This post starts with an overview of the access control system in Scylla and some of the motivation for augmenting it with roles. We’ll explain what roles are and show an example of their use. Finally, we’ll cover how Scylla transitions existing access-control data to the new roles-based system when you upgrade a cluster.

Access Control in Scylla

There are two aspects of access control in Scylla: controlling client connections to a Scylla node (authentication), […]
What is a User Defined Type (UDT)? User Defined Types (UDTs) let you define a struct-like type made up of multiple typed, named fields, which can themselves be other UDTs. Once a UDT is defined, it can be used as a column type in a table definition. In Scylla, you can define a column as a frozen<UDT>.
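
As a rough illustration of the excerpt above, the following sketch holds the CQL statements that define a UDT and use it as a frozen column. The keyspace `demo`, the type `address`, the table `users`, and all field names are hypothetical and chosen only for this example. The statements are kept as plain Python strings so the block runs without a database.

```python
# Hypothetical names throughout: keyspace "demo", type "address", table "users".
# The strings are ordinary CQL; nothing here talks to a cluster.

# A UDT groups several typed, named fields under one type name.
CREATE_TYPE = """
CREATE TYPE demo.address (
    street text,
    city text,
    zip int
)
"""

# Once defined, the UDT can be used as a column type, here as frozen<address>.
CREATE_TABLE = """
CREATE TABLE demo.users (
    id uuid PRIMARY KEY,
    name text,
    home frozen<address>
)
"""

# A frozen UDT value is written as a whole, using a literal with named fields.
INSERT_ROW = """
INSERT INTO demo.users (id, name, home)
VALUES (uuid(), 'Ada', {street: '1 Main St', city: 'Springfield', zip: 12345})
"""

# Statements in the order they would need to be executed.
STATEMENTS = [s.strip() for s in (CREATE_TYPE, CREATE_TABLE, INSERT_ROW)]
```

The statements could be pasted into cqlsh or passed one at a time to a CQL driver session, assuming the `demo` keyspace already exists.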
At ScyllaDB, our development team is all about performance with improved latency and throughput. Our speakers at our recent Scylla Summit provided many tips and tricks to make Scylla’s superior latency and performance even better. ScyllaDB’s VP of R&D, Shlomi Livne, added to the growing repertoire of these tips with his talk "Planning your queries for maximum performance". In it, he outlined some of the how and why of Scylla performance, and concluded with seven rules to optimize your queries.
The data model in Scylla and Apache Cassandra partitions data between cluster nodes using a partition key, which is defined by the database schema. Looking up rows by partition key is efficient because you can find the node that owns a row by hashing its partition key. Unfortunately, this also means that finding a row by a non-partition-key column requires a full table scan, which is inefficient. Secondary Indexes are a mechanism in Apache Cassandra that allows efficient searches on non-partition-key columns by creating an index.
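
The difference between the two lookup paths can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Scylla or Cassandra is implemented: the three node names, the `Cluster` class, the MD5-modulo placement, and the dictionary-based index are all hypothetical simplifications.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical three-node cluster


def owner(partition_key):
    """Map a partition key to a node by hashing it (simplified placement)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class Cluster:
    def __init__(self):
        self.storage = {node: {} for node in NODES}
        self.scanned = 0  # rows examined by non-key searches

    def insert(self, key, row):
        self.storage[owner(key)][key] = row

    def get_by_key(self, key):
        # Hashing the key names the owning node, so only that node is consulted.
        return self.storage[owner(key)].get(key)

    def find_by_column(self, column, value):
        # No partition key to hash: every row on every node must be examined.
        matches = []
        for rows in self.storage.values():
            for key, row in rows.items():
                self.scanned += 1
                if row.get(column) == value:
                    matches.append(key)
        return sorted(matches)


def build_index(cluster, column):
    """Toy index: maps each column value to the partition keys that hold it."""
    index = {}
    for rows in cluster.storage.values():
        for key, row in rows.items():
            index.setdefault(row[column], []).append(key)
    return index
```

With an index built on a column, a search by that column becomes one index lookup followed by ordinary partition-key reads, instead of a scan whose cost grows with the size of the table.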
When most server application developers think of I/O, they consider network I/O since most resources these days are accessed over the network: databases, object storage, and other microservices. The developer of a database, however, also has to consider file I/O. This article describes the available choices and their tradeoffs and why Scylla chose asynchronous direct I/O (AIO/DIO) as its access method.
By default, Scylla SSTables will be compressed when they are written to disk. As mandated by the file format, data is compressed in chunks of a certain size – 4kB if not explicitly set. The size of the chunk is one of the parameters for the compression property to be set at table creation. Chunk-based compression presents trade-offs that users may not be aware of. In this post, I will try to explore what those trade-offs are and how to set them correctly for maximum benefit. As trade-offs imply different results for different loads, we will focus on single-partition read […]
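
To make the chunk-size trade-off concrete, here is a minimal sketch that compresses a byte string in fixed-size chunks and then reads a small range back. It uses Python's `zlib` purely for illustration and does not reproduce the SSTable format; the helper names `compress_chunks` and `read_range` are hypothetical.

```python
import zlib


def compress_chunks(data, chunk_size):
    """Compress data independently in fixed-size chunks."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_range(chunks, chunk_size, offset, length):
    """Return (requested bytes, number of bytes that had to be decompressed).

    Assumes length >= 1 and that the range lies within the original data.
    """
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    # Every chunk that overlaps the range must be decompressed in full.
    raw = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return raw[start:start + length], len(raw)
```

Reading 100 bytes from data stored in 4096-byte chunks decompresses one 4096-byte chunk, while the same read against a single 16384-byte chunk decompresses all 16384 bytes. Larger chunks tend to compress better, so the setting trades disk space against the work done per small read.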
What is Workload Conditioning? What is the best request rate I should throw at my cluster? What disk bandwidth should I make available for compactions? How many reader or writer threads should I have? What is the best size for my memtables?