ScyllaDB Open Source has a rich set of new production-ready features, including Lightweight Transactions (LWT), DynamoDB API compatibility, Change Data Capture, offline installers and more.
Already the best high-performance NoSQL database for big data workloads, ScyllaDB builds upon the best attributes of Apache Cassandra and Amazon DynamoDB by improving performance, scalability, and cost-efficiency.
ScyllaDB has been compiled and tested to run on Arm-based architectures to support a range of new instance types on AWS. For example, the Im4gn and Is4gen storage-optimized instances use Graviton2 processors and are now ready for production workloads. Meanwhile, the low-cost T4g burstable instance is appropriate for development of cloud-based applications. Since ScyllaDB is now compiled to run on any AArch64 architecture, you can even run it on an Arm-based M1-powered Macintosh using Docker for local development.
Repair-Based Node Operations (RBNO) use the same underlying implementation for repair and for node operations such as bootstrap, decommission, removenode, and replace. Row-based repair is oriented towards small amounts of data, not an entire node’s worth. This results in smaller, more atomic updates that allow users to begin, pause, and resume operations. RBNO also uses off-strategy compaction to compact SSTables efficiently without impacting the main workload.
Service Levels allow the user to attach attributes to Roles and Users. These attributes apply to each session the user opens, enabling granular control of session properties such as timeouts and load shedding (overload handling). ScyllaDB Open Source supports two main properties: per-service-level timeouts and workload types.
A reverse query is a SELECT query that returns rows in the reverse of the order defined in the table schema. If no order was defined, the default order is ascending (ASC). Improving the performance of reverse queries is an ongoing process, with the following updates in ScyllaDB 4.6:
ScyllaDB is a very powerful tool, with many features and options. Some of these options, such as experimental or performance-impacting features, or certain combinations of them, are not recommended for production use. Guardrails are a collection of restrictions that make it harder for the user to use non-recommended options in production. A few examples:
ScyllaDB administrators can use our default settings or customize guardrails for their own environment and needs.
The Amazon DynamoDB-compatible interface has been updated to include a number of new features:
There is now new syntax for setting timeouts for individual queries with “USING TIMEOUT”. This per-operation timeout lets you define timeouts in a more granular way. For example, some queries have tight latency requirements, in which case it makes sense to set their timeout to a small value. Such queries time out faster, which means that they won’t needlessly hold the server’s resources. You can use the new TIMEOUT parameter for both queries (SELECT) and updates (INSERT, UPDATE, DELETE).
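As a rough illustration of where the clause goes, the sketch below builds CQL strings with a per-operation timeout. The helper names, table name, and column names are hypothetical; the clause placement (at the end of a SELECT, before SET in an UPDATE) follows my reading of the ScyllaDB CQL documentation and should be checked against your version.

```python
# Minimal sketch: helpers that build CQL text with a per-operation timeout.
# Helper, table, and column names are illustrative assumptions.

def select_with_timeout(table: str, where: str, timeout: str) -> str:
    # In a SELECT, USING TIMEOUT is the last clause of the statement.
    return f"SELECT * FROM {table} WHERE {where} USING TIMEOUT {timeout}"


def update_with_timeout(table: str, assignments: str, where: str, timeout: str) -> str:
    # In an UPDATE, the USING clause comes before SET.
    return f"UPDATE {table} USING TIMEOUT {timeout} SET {assignments} WHERE {where}"


# A latency-sensitive read gets a tight timeout; a bulk update gets a looser one.
fast_read = select_with_timeout("ks.sessions", "id = 42", "50ms")
slow_write = update_with_timeout("ks.sessions", "state = 'closed'", "id = 42", "5s")
print(fast_read)
print(slow_write)
```

The resulting strings can be passed to any CQL driver or to cqlsh unchanged.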
The Seastar I/O scheduler is used to maximize the request throughput from all shards to the storage. Until now, the scheduler ran in a per-shard scope: each shard ran its own scheduler, balancing its own I/O tasks such as reads, updates, and compactions. This works well when the workload is approximately balanced between shards; but when, as often happened, one shard was more loaded, it could not take more I/O, even if other shards were not using their share. I/O scheduler 2.0, included in ScyllaDB 4.4, fixes this. Because storage bandwidth and IOPS are now shared across shards, each shard can use the whole disk if required.
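The difference between the two scopes can be seen in a toy model. This is not Seastar's actual algorithm; it is a simplified sketch that assumes a fixed disk capacity, a static equal split for the per-shard case, and max-min fair sharing for the shared case. All function names and numbers are made up for illustration.

```python
# Toy model only: compares a static per-shard split with a shared pool.
# It does not reproduce the Seastar scheduler's real mechanics.

def per_shard_allocation(demands, disk_capacity):
    # Each shard is capped at an equal slice, even if other shards are idle.
    share = disk_capacity / len(demands)
    return [min(d, share) for d in demands]


def shared_allocation(demands, disk_capacity):
    # Max-min fairness: capacity unused by light shards flows to busy ones.
    alloc = [0.0] * len(demands)
    remaining = float(disk_capacity)
    active = [i for i, d in enumerate(demands) if d > 0]
    while active and remaining > 1e-9:
        share = remaining / len(active)
        still_hungry = []
        for i in active:
            grant = min(share, demands[i] - alloc[i])
            alloc[i] += grant
            remaining -= grant
            if demands[i] - alloc[i] > 1e-9:
                still_hungry.append(i)
        if len(still_hungry) == len(active):
            break  # every shard took a full share, so the disk is saturated
        active = still_hungry
    return alloc


demands = [900, 50, 50, 0]  # one hot shard, two light shards, one idle shard
print(per_shard_allocation(demands, 1000))
print(shared_allocation(demands, 1000))
```

In this model the hot shard is capped at a quarter of the disk under the per-shard split, while the shared pool lets it use everything the other shards leave idle.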
Change Data Capture (CDC) allows you to track changes made to a base table in your database for visibility or auditing. Changes made to the base table are stored in a separate table that can be queried with standard CQL. Our CDC implementation uses a configurable Time To Live (TTL) to ensure it does not occupy an inordinate amount of your disk.
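A minimal sketch of what this looks like in CQL, built here as strings. The keyspace and table names are hypothetical; the `cdc` table option and the `_scylla_cdc_log` suffix reflect my understanding of the ScyllaDB CDC documentation and should be verified for your release.

```python
# Minimal sketch: build the CQL used to enable CDC and read the change log.
# Keyspace/table names are illustrative assumptions.

def enable_cdc_statement(keyspace: str, table: str, ttl_seconds: int) -> str:
    # The TTL bounds how long change records are kept in the log table.
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH cdc = {{'enabled': true, 'ttl': {ttl_seconds}}}"
    )


def cdc_log_table(keyspace: str, table: str) -> str:
    # Changes are written to a separate log table named after the base table.
    return f"{keyspace}.{table}_scylla_cdc_log"


def read_log_statement(keyspace: str, table: str, limit: int) -> str:
    # The log is an ordinary table, so plain CQL SELECT works against it.
    return f"SELECT * FROM {cdc_log_table(keyspace, table)} LIMIT {limit}"


print(enable_cdc_statement("ks", "orders", 3600))
print(read_log_statement("ks", "orders", 10))
```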
Prior to ScyllaDB Open Source 4.2, lookups in the promoted index were done by scanning the index linearly, so that the lookup took O(n) time. This is inefficient for large partitions, consuming a great deal of CPU and I/O. Now the reader scans the SSTable promoted index with a binary search, reducing search time to O(log n). In our testing, searches were conducted 12x faster, CPU utilization was only 1/10th (10%), and disk I/O was reduced to 1/20th (5%) the prior rate.
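The change from O(n) to O(log n) can be sketched with a simplified index. The model below assumes the promoted index is a sorted list holding the first clustering key of each block, which is a simplification of the real SSTable format; the function names are hypothetical.

```python
# Simplified model: the index is a sorted list of each block's first key.
# A lookup returns the position of the block that may contain `key`.
from bisect import bisect_right


def linear_lookup(first_keys, key):
    # O(n): walk the entries until one starts past the key.
    found = -1
    for i, first in enumerate(first_keys):
        if first > key:
            break
        found = i
    return found


def binary_lookup(first_keys, key):
    # O(log n): the last entry whose first key is <= key.
    return bisect_right(first_keys, key) - 1


index = list(range(0, 1_000_000, 100))  # 10,000 blocks of 100 keys each
assert linear_lookup(index, 123_456) == binary_lookup(index, 123_456)
print(binary_lookup(index, 123_456))
```

Both functions return the same block; the binary version inspects about 14 entries for this index instead of over a thousand.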
Our Amazon DynamoDB API implementation, known as Project Alternator, enables you to connect applications/services built for the DynamoDB API to ScyllaDB without changing your client code. This gives your team multi-cloud, multi-vendor flexibility to improve system resilience and include a disaster recovery plan in your playbook.
ScyllaDB LWTs allow stronger consistency guarantees using the Paxos consensus algorithm. They ensure requests on a distributed database are processed in a strict, linearized (serial) manner, in a process known as ‘Compare and Set.’ They are also called ‘Conditional Updates,’ because they can test the database’s existing values before submitting the update. This provides atomic consistency for single keys, allowing updates to be performed in order on a global basis. They can also be used for batches, to ensure all conditions are met before submitting a batch update.
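The compare-and-set semantics can be illustrated with a single-process toy. This sketch has no Paxos, no replicas, and no network; it only mimics the outcome of a CQL conditional statement such as `INSERT ... IF NOT EXISTS` or `UPDATE ... IF col = value`, where the result reports whether the change was applied. The class and method names are hypothetical.

```python
# Toy model of conditional updates: a dict stands in for one partition key space.
# It shows the compare-then-write outcome only, not the consensus protocol.

class ToyTable:
    def __init__(self):
        self._rows = {}

    def insert_if_not_exists(self, key, value):
        # Mirrors INSERT ... IF NOT EXISTS: returns (applied, existing value).
        if key in self._rows:
            return False, self._rows[key]
        self._rows[key] = value
        return True, None

    def update_if(self, key, expected, new_value):
        # Mirrors UPDATE ... IF col = expected: write only when the test passes.
        current = self._rows.get(key)
        if current != expected:
            return False, current
        self._rows[key] = new_value
        return True, None


table = ToyTable()
print(table.insert_if_not_exists("seat-12A", "alice"))    # first claim is applied
print(table.insert_if_not_exists("seat-12A", "bob"))      # second claim is rejected
print(table.update_if("seat-12A", "alice", "carol"))      # condition holds, applied
```

In a real cluster the same check-then-write is agreed on across replicas, which is what makes it safe under concurrent clients.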
Kubernetes has become the go-to technology for the cloud DevOps community. It allows for the automated provisioning of cloud-native containerized applications, supporting the entire lifecycle of deployment, scaling, and management. ScyllaDB Operator is our Kubernetes extension for managing ScyllaDB clusters. It currently supports deploying multi-zone clusters, scaling up or adding new racks, scaling down, and monitoring ScyllaDB clusters with Prometheus and Grafana.
User Defined Functions (UDFs) in Lua add the ability for teams to build server-side scripts that can run complex transforms such as aggregations, sums, averages, minimums, and maximums. This allows developers to simplify their database queries and reduce the use of multiple large queries with large payloads.
How to use lightweight transactions in ScyllaDB; learn the similarities and differences between ScyllaDB’s Paxos implementation and Cassandra’s.
CDC enables the user to track updates to tables in real time. Learn how it works under the hood, and why it is a vast improvement over Cassandra.