
Note: ScyllaDB Summit has concluded! Watch top tech talks on demand.
Curious about what’s next for low-latency data-intensive applications? Attend ScyllaDB Summit (free + virtual) on February 15 and 16.
In addition to hearing how teams at Discord, Epic Games (developer of Unreal Engine), Strava, ShareChat & other gamechangers are achieving impressive engineering feats, you’ll also discover the latest trends across:
- The wide wide world of NoSQL and SQL
- Innovative approaches to event streaming
- Using Rust & event streaming in distributed data systems
- Architecting applications for gaming, social media, AdTech & IoT
- Distributed data system research
- The intersection of WebAssembly and data
- Tackling latency across, and beyond, the data pipeline
If you’ve never attended a ScyllaDB virtual conference (e.g., P99 CONF or ScyllaDB Summit), be prepared – this isn’t your typical virtual conference that inflicts death by PowerPoint. It’s highly interactive, including the ability to engage with speakers during their sessions and continue the conversation afterward.
ScyllaDB Summit 2023 features 30+ speakers and 25+ sessions over two half days. There are already thousands of engineers registered, and we expect a lively global crowd to be joining us live. Here’s a look at some of the most highly anticipated “industry trends” sessions on the agenda so far.
Note: The conference also includes a core subset of sessions more distinctly focused on ScyllaDB, the monstrously fast and scalable NoSQL database. These sessions cover topics like simplifying DynamoDB migration, consistency algorithms, compaction strategies, goodput vs throughput, and database observability – all with an eye toward database performance at scale. For a look at those sessions, see the blog ScyllaDB Summit: For the ScyllaDB Curious + Veteran Sea Monsters.
Database Research and Trends 
The Consistency vs Throughput Tradeoff in Distributed Databases
Daniel Abadi, University of Maryland, College Park
It is well known that there are availability and latency tradeoffs that are required in order to achieve strong consistency in distributed systems. This talk by PACELC inventor Dr. Daniel Abadi will discuss whether or not there is a consistency vs throughput tradeoff in distributed database systems that guarantee ACID transactions.
The Database Trends that are Transforming Your Database Infrastructure Forever
Peter Zaitsev, Percona
Open source software is the de facto standard for many new applications, especially in the database industry. Currently, MySQL, PostgreSQL, MariaDB, MongoDB, Elastic, and others have shown up in every industry and organization in the world in some form or another. People are no longer choosing a single database for the company; they are letting developers and architects choose the best database for the job.
This has led to an increased number of technologies that operations teams have to support. Couple that increase with a growing microservice (or cloud-native) development paradigm, where every service has its own database and all the data is valuable.
Companies are now faced with dozens of technologies, hundreds or even thousands of individual database instances, and petabytes of data. Managing the complexity of such an environment is changing the way we look at systems and operations.
Solving the Issue of Mysterious Database Benchmarking Results
Daniel Seybold
Benchmarking is important but also known to be biased, especially in the cloud. In this talk, we demonstrate how to solve this issue based on a comparative ScyllaDB multi-cloud performance study.
Key-Key-Value Store: Generic NoSQL Datastore with Tombstone Reduction & Automatic Partition Splitting
Stephen Ma, Discord
Discover Discord’s approach to more quickly and simply onboarding new data storage use cases with their key-value store service that hides many ScyllaDB-specific complexities – like schema design and performance impacts from tombstones and large partitions – from developers.
WebAssembly + Databases
Everything in its Place: Putting Code and Data Where They Belong
Brian Sletten, Boatsu Consulting
There is an old saying: A place for everything and everything in its place. It brings to mind a natural order to things to facilitate a smooth traversal of our days. If we don’t have to work hard to locate our tools and things that we need to accomplish life’s tasks, everything is just easier. The idea that code runs on computers and data is stored in databases is increasingly only part of the story. Brian will highlight the changing trends to this notion and the issues with getting it wrong.
libSQL
Piotr Sarna, Chiselstrike
SQLite is a widely used embedded database engine, known for its simplicity and lightweight design. However, the original SQLite project does not accept contributions from third parties and does not use third-party code, which can limit its potential for innovation. This talk is an overview of SQLite’s architecture and an introduction to libSQL: an open source fork of SQLite that accepts contributions and uses Rust for new features. Piotr Sarna will show how this fork can be used in distributed settings, with automatic backups and the ability to replicate data across multiple nodes. Chiselstrike’s modifications also include integration with WebAssembly.
Rust + Databases
Building a 100% ScyllaDB Shard-Aware Application using Rust
Alexys Jacob, Yassir Barchi, Joseph Perez – Numberly
At Numberly, we designed an entire data processing application on ScyllaDB’s low-level internal sharding using Rust. Starting from what seemed like a crazy idea, our application design actually delivers amazing strengths, like idempotence and distributed, predictable data processing with infinite scalability, thanks to ScyllaDB.
Having ScyllaDB as our only backend, we managed to reduce operational costs while benefiting from core architectural paradigms like:
- Predictable data distribution and processing capacity
- Idempotence by leveraging deterministic data sharding
- Optimized data manipulation using consistent shard-aware partition keys
- Virtually infinite scaling alongside ScyllaDB
This talk will walk you through this amazing experience. We will share our thought process, the roadblocks we overcame and the numerous contributions we made to ScyllaDB to reach our goal in production. Guaranteed 100% made with love in Paris using ScyllaDB and Rust!
Building Next Generation Drivers: Optimizing Performance in Go and Rust
Piotr Grabowski, ScyllaDB
Optimizing shard-aware drivers for ScyllaDB has taken multiple initiatives, often requiring a complete rewrite from scratch. Learn about the work undertaken to improve the performance of ScyllaDB drivers for both Go and Rust, plus how the Rust code base will be used as a core for drivers with other language bindings going forward. The session highlights performance increases obtained using techniques available in the respective programming languages, including squeezing more performance out of Google’s B-tree implementation with Go generics, and using the asynchronous Tokio framework as the basis of a new Rust driver.
How Discord Stores Trillions of Messages on ScyllaDB
Bo Ingram, Discord
Learn why and how Discord’s persistence team recently completed their most ambitious migration yet: moving their massive set of trillions of messages from Cassandra to ScyllaDB. Bo Ingram, Senior Software Engineer at Discord, provides a technical look, including:
- Their reasons for moving from Apache Cassandra to ScyllaDB
- Their strategy for migrating trillions of messages
- How they designed a new storage topology – using a hybrid-RAID1 architecture – for extremely low latency on GCP
- The role of their existing Rust messages service, new Rust data service library, and new Rust data migrator in this project
- What they’ve achieved so far, lessons learned, and what they’re tackling next
Note: Piotr Sarna’s libSQL talk (above, in the WebAssembly section) also discusses Rust.
Event Streaming
How Proxima Beta Implemented CQRS and Event Sourcing on Top of Apache Pulsar and ScyllaDB
Lei Shi, Zhiwei Peng, Zhihao Chen – Proxima Beta, Tencent IEG Global
How Level Infinite uses ScyllaDB as the state store of the Proxima Beta gaming platform’s service architecture, including strategies for globally replicating data to simplify configuration management and using time window compaction strategy to power a distributed queue-like event store.
Aggregations at Scale for ShareChat Using Kafka Streams and ScyllaDB
Charan Movva, ShareChat
How ShareChat handles the aggregations of a post’s engagement metrics/counters at scale with sub-millisecond P99 latencies for reads and writes.
Sink Your Teeth into Streaming at Any Scale
Timothy Spann & David Kjerrumgaard, StreamNative
How to build a low-latency scalable platform for today’s massively data-intensive real-time streaming applications using ScyllaDB, Pulsar, and Flink.
Strategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Geetha Anne, Confluent
A simple way to deploy an end-to-end streaming data pipeline that facilitates real-time data transfer from an on-premises relational datastore to a document-oriented NoSQL database with low latency – all cloud-deployed with Kubernetes.
Build Low Latency, Windowless Event Processing Pipelines with Quine and ScyllaDB
Matthew Cullum, thatDot
How to build an event processing pipeline that scales to millions of events per second with sub-millisecond latencies while ingesting multiple streams and demonstrates resiliency in the face of host failures.