Inside the Database Internals Talks at P99 CONF 2025

By Cynthia Dunlop

September 29, 2025

“Never write a database. Even if you want to, even if you think you should. Resist. Never write a database. Unless you have to write a database. But you don’t.” – Charity Majors

But someone has to write the databases that others rely on. Hearing about the engineering challenges they’re tackling is both fascinating and Schadenfreude-invoking – so perfect tech conference material. 😉

Since database performance is so near and dear to ScyllaDB, we reached out to our friends and colleagues across the community to ensure that a nice range of distributed data systems, approaches, and challenges would be represented at P99 CONF 2025. As you can see from our agenda, the response was overwhelming.

A quick PSA for the uninitiated: P99 CONF is a free 2-day community event that’s intentionally virtual, highly interactive, and purely technical. It’s an immersion into all things performance. Distributed systems, database internals, Rust, C++, Java, Go, Wasm, Zig, Linux kernel, tracing, AI/ML & more – it’s all on the agenda.

This year, you can look forward to first-hand engineering experiences from the likes of Pinterest, ClickHouse, Gemini, Arm, Rivian and VW Group Technology, Meta, Wayfair, Disney, NVIDIA, Turso, Neon, TigerBeetle, ScyllaDB, and too many others to list here.

Here’s a sneak peek of the database internals talks you can look forward to at P99 CONF 2025…

Join us at P99 CONF (free + virtual)

ClickHouse’s C++ and Rust Journey

Alexey Milovidov, Co-founder and CTO at ClickHouse

Full rewrite from C++ to Rust or gradual integration with Rust libraries? For a large C++ codebase, only the latter works, but even then, there are many complications and rough edges. In my presentation, I will describe our experience integrating Rust and C++ code and some weird and unusual problems we had to overcome.

Rethinking Durable Workflows and Queues: A Library-based Approach

Qian Li, Co-founder at DBOS, Inc

Durable workflow engines checkpoint program state to persistent storage (like a database) so that execution can always recover from where it left off. Most systems today rely on external orchestration: a centralized orchestrator and distributed workers communicating via message-passing. While this model is well-established, it’s often heavyweight, introducing substantial overhead, write amplification, and operational complexity.

In this talk, we explore an alternative: a lightweight library-based durable workflow engine that embeds into application code and checkpoints state directly to the database. It handles queues and flow control through the database itself. This approach eliminates the need for a separate orchestrator, reduces network traffic, and improves performance by avoiding unnecessary writes.

We’ll share our experience building DBOS, a library-based engine designed for simplicity and efficiency. We’ll discuss the architectural trade-offs, challenges in failure recovery, and key optimizations for scalability and maintainability.
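To make the library-based idea a bit more concrete, here is a minimal sketch of step checkpointing. It is not the DBOS API: the `DurableWorkflow` class, the `checkpoints` table, and the `reserve`/`charge` steps are hypothetical names made up for illustration, and SQLite stands in for whatever database the application already uses.

```python
import json
import sqlite3


class DurableWorkflow:
    """Hypothetical helper (not the DBOS API): checkpoints step results in a table."""

    def __init__(self, conn, workflow_id):
        self.conn = conn
        self.workflow_id = workflow_id
        conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            " workflow_id TEXT, step_name TEXT, result TEXT,"
            " PRIMARY KEY (workflow_id, step_name))"
        )
        conn.commit()

    def step(self, name, fn, *args):
        # If this step already completed in an earlier run, reuse its result.
        row = self.conn.execute(
            "SELECT result FROM checkpoints"
            " WHERE workflow_id = ? AND step_name = ?",
            (self.workflow_id, name),
        ).fetchone()
        if row is not None:
            return json.loads(row[0])
        # Otherwise run the step and persist its result before moving on.
        result = fn(*args)
        self.conn.execute(
            "INSERT INTO checkpoints VALUES (?, ?, ?)",
            (self.workflow_id, name, json.dumps(result)),
        )
        self.conn.commit()
        return result


calls = []  # records which step functions actually executed


def reserve(order):
    calls.append("reserve")
    return {"order": order, "reserved": True}


def charge(order, fail):
    calls.append("charge")
    if fail:
        raise RuntimeError("simulated crash")
    return {"order": order, "charged": True}


def checkout(conn, order, fail=False):
    wf = DurableWorkflow(conn, f"checkout-{order}")
    wf.step("reserve", reserve, order)
    return wf.step("charge", charge, order, fail)


conn = sqlite3.connect(":memory:")
try:
    checkout(conn, 42, fail=True)   # first attempt dies in the second step
except RuntimeError:
    pass
result = checkout(conn, 42)         # the retry resumes after "reserve"
print(calls, result)
```

On the retry, `reserve` is served from its checkpoint rather than executed again. Note that in this sketch a crash between running a step and committing its checkpoint would cause that step to run twice, so steps need to be idempotent; a real engine has to deal with that window more carefully.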

The Gory Details of a Full-Featured Userspace CPU Scheduler

Avi Kivity, Co-founder and CTO at ScyllaDB

Userspace CPU schedulers, which often accompany asynchronous I/O engines like io_uring and Linux AIO, are usually simplistic run-to-completion FIFO loops. This suffices for I/O bound applications, but for use cases that can be both CPU bound and I/O bound, this is not enough.

Avi Kivity, CTO of ScyllaDB and co-maintainer of Seastar, will cover the design and implementation of the Seastar userspace CPU scheduler, which caters to more complex applications that require preemption and prioritization.
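For a feel of what "prioritization" adds over a FIFO loop, here is a toy sketch of a share-based cooperative scheduler. It is not Seastar's implementation: the `SchedulingGroup` class, the `worker` tasks, and the abstract "cost" units are assumptions made for illustration. Each task is a Python generator that yields at its preemption points, and the loop always resumes the group that has consumed the least cost relative to its shares.

```python
from collections import deque
from fractions import Fraction


class SchedulingGroup:
    """Hypothetical group of tasks that is entitled to a share of the CPU."""

    def __init__(self, name, shares):
        self.name = name
        self.shares = shares
        self.vruntime = Fraction(0)  # cost consumed so far, scaled by shares
        self.tasks = deque()

    def spawn(self, task):
        self.tasks.append(task)


def worker(slices, cost=1):
    """A task that does `slices` units of work, yielding after each one."""
    for _ in range(slices):
        yield cost  # preemption point: report the cost of the slice just run


def run(groups):
    """Run every task to completion and return the order groups were scheduled in."""
    trace = []
    while True:
        runnable = [g for g in groups if g.tasks]
        if not runnable:
            return trace
        # Pick the group that is furthest behind its fair share.
        group = min(runnable, key=lambda g: g.vruntime)
        task = group.tasks.popleft()
        try:
            cost = next(task)  # run the task until its next preemption point
        except StopIteration:
            continue           # task finished; drop it
        group.vruntime += Fraction(cost, group.shares)
        group.tasks.append(task)
        trace.append(group.name)


query = SchedulingGroup("query", shares=3)
compaction = SchedulingGroup("compaction", shares=1)
query.spawn(worker(30))
compaction.spawn(worker(30))
trace = run([query, compaction])
print(trace[:8])
```

While both groups have work, the group with 3 shares gets three slices for every one the other group gets. A plain FIFO loop would simply alternate between the two, regardless of which work matters more.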

The Tale of Taming TigerBeetle’s Tail Latency

Tobias Ziegler, Software Engineer at TigerBeetle

In this talk, we dive into how we reduced TigerBeetle’s tail latency through algorithm engineering. Algorithm engineering goes beyond studying theoretical complexity and considers how algorithms are executed efficiently on modern super-scalar CPUs. Specifically, we will look at Radix Sort and a k-way merge and explore how to implement them efficiently. We then demonstrate how we apply these algorithms incrementally to avoid latency spikes in practice.
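As a rough illustration of "k-way merge, applied incrementally", here is a generic sketch. It is not TigerBeetle's code; the function names and the fixed `batch_size` are assumptions. The merge is a lazy generator over already-sorted runs, and the caller pulls a bounded batch at a time so that no single step has to pay for the whole merge.

```python
import heapq
from itertools import islice

_DONE = object()  # sentinel marking an exhausted run


def k_way_merge(runs):
    """Lazily merge already-sorted runs, yielding values in ascending order."""
    iters = [iter(run) for run in runs]
    heap = []
    for idx, it in enumerate(iters):
        first = next(it, _DONE)
        if first is not _DONE:
            heap.append((first, idx))
    heapq.heapify(heap)
    while heap:
        value, idx = heapq.heappop(heap)
        yield value
        # Refill the heap from the run the popped value came from.
        nxt = next(iters[idx], _DONE)
        if nxt is not _DONE:
            heapq.heappush(heap, (nxt, idx))


def merge_in_batches(runs, batch_size):
    """Do at most `batch_size` elements of merge work per step."""
    merged = k_way_merge(runs)
    while True:
        batch = list(islice(merged, batch_size))
        if not batch:
            return
        yield batch


runs = [[1, 4, 9], [2, 3, 10], [5, 6, 7, 8]]
batches = list(merge_in_batches(runs, batch_size=4))
print(batches)
```

Each call into the generator does a small, bounded amount of work, so the merge can be interleaved with request processing instead of running as one long pause.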

Why We’re Rewriting SQLite in Rust

Glauber Costa, Co-founder and CEO at Turso

Over two years ago, we forked SQLite. We were huge fans of the embedded nature of SQLite, but wanted a more open model of development…and libSQL was born as an Open Contribution project. Last year, as we were adding Vector Search to SQLite, we had a crazy idea. What could we achieve if we were to completely rewrite SQLite in Rust? This talk explains what drove us down this path, how we’re using deterministic simulation testing to ensure the reliability of the Rust rewrite, and the lessons learned (so far). I will show how a reimagining of this iconic database can lead to performance improvements of over 500x in some cases by looking at what powers it under the hood.

Shared Nothing Databases at Scale

Nick Van Wiggeren, CTO at PlanetScale

This talk will discuss how PlanetScale scales databases in the cloud, focusing on a shared-nothing architecture that is built around expecting failure. Nick will go into how they built low-latency high-throughput systems that span multiple nodes, availability zones, and regions, while maintaining sub-millisecond response times. This starts at the storage layer and builds all the way up to micro-optimizing the load balancer, with a lot of learning at every step of the way.

Reworking the Neon IO stack: Rust+tokio+io_uring+O_DIRECT

Christian Schwarz, Member of Technical Staff at Databricks

Neon is a serverless Postgres platform. Recently acquired by Databricks, the same technology now also powers Databricks Lakebase. In this talk, we will dive into Pageserver, the multi-tenant storage service at the heart of the architecture. We share techniques and lessons learned from reworking its IO stack to a fully asynchronous model, with direct IO against local NVMe drives, all during a period of rapid growth. Pageserver is implemented in Rust; we use the tokio async runtime for networking and integrate it with io_uring for filesystem access.

A Deep Dive into the Seastar Event Loop

Pavel Emelyanov, Principal Software Engineer at ScyllaDB

The core and the basis of ScyllaDB’s outstanding performance is the Seastar framework, and the core and the basis of Seastar is its event loop. In this presentation, we’ll see what the loop does in great detail, analyze the limitations it operates under, and examine the consequences that follow from those limitations. We’ll also learn how users can observe the loop and the various means of understanding its behavior.

Cost Effective, Low Latency Vector Search In Databases: A Case Study with Azure Cosmos DB

Magdalen Manohar, Senior Researcher at Microsoft

We’ve integrated DiskANN, a state-of-the-art vector indexing algorithm, into Azure Cosmos DB NoSQL, a state-of-the-art cloud-native operational database. Learn how we overcame the systems and algorithmic challenges of this integration to achieve <20ms query latency at the 10 million scale, while supporting scale-out to billions of vectors via automatic partitioning.

Measuring Query Latency the Hard Way: An Adventure in Impractical Postgres Monitoring

Simon Notley, Observability and Optimization at EnterpriseDB

Sampling the session state (as exposed by pg_stat_activity) is a surprisingly powerful way to understand how your Postgres instance spends its time. It is something I can wholeheartedly recommend to any Postgres DBA that needs a lightweight way to monitor query performance in production. However, it’s a terrible way to measure query latency, fraught with complexity and weird statistical biases that could be avoided by simply using an extension built for the job, or even log analysis. But pursuing terrible ideas can be fun, so in this talk, I dive into my adventures in measuring query latency from session sampling, generate some extremely funky charts, and end up unexpectedly performing a vector similarity search.

In this talk I’ll show how, instead of attempting to correct the biases that plague estimates of query latency based on time-domain sampling, we can pre-calculate the distribution of (biased) estimates for a range of true distributions and use vector search to compare our observed distribution to these pre-calculated ones, thereby inferring the true query latency. This ‘eccentric’ method is actually surprisingly effective, and surprisingly fun.
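To see why time-domain sampling gives biased latency estimates in the first place, consider a small simulation. This is only an illustration of the bias, not the method from the talk, and it makes simplifying assumptions: a single session runs queries back to back, query durations alternate between 1 ms and 9 ms, and each sample records the full duration of whichever query is running at that instant.

```python
def build_timeline(durations):
    """Lay queries out back to back and return their (start, end) intervals."""
    intervals, t = [], 0
    for d in durations:
        intervals.append((t, t + d))
        t += d
    return intervals


def sample_durations(intervals, period, offset=0):
    """At each sampling instant, record the duration of the running query."""
    observed = []
    end_time = intervals[-1][1]
    i, t = 0, offset
    while t < end_time:
        while intervals[i][1] <= t:  # advance to the query active at time t
            i += 1
        start, end = intervals[i]
        observed.append(end - start)
        t += period
    return observed


durations = [1, 9] * 1000                      # milliseconds
true_mean = sum(durations) / len(durations)    # what we want to know
observed = sample_durations(build_timeline(durations), period=7)
sampled_mean = sum(observed) / len(observed)   # what sampling reports
print(true_mean, round(sampled_mean, 2))
```

Half of the queries take 1 ms, but they occupy only a tenth of the timeline, so a sampler almost always lands on a 9 ms query. The naive estimate comes out near 8.2 ms against a true mean of 5 ms. Long queries are over-represented in proportion to their length, which is the kind of bias the talk sets out to work around.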

Fast and Deterministic Full Table Scans at Scale

Felipe Cardeneti Mendes, Technical Director at ScyllaDB

ScyllaDB’s new tablet replication algorithm replaces static vNodes with dynamic, elastic data distribution that adapts to shifting workloads. This talk discusses how tablets enable fast, predictable full table scans by keeping operations shard-local, balancing load automatically, and scaling linearly through a simple layer of indirection.

Optimizing Tiered Storage for Low-Latency Real-Time Analytics

Neha Pawar, Founding Engineer and Head of Data at StarTree

Real-time OLAP databases usually trade performance for cost when moving from local storage to cloud object storage. This talk shows how we extended Apache Pinot to use cloud storage while still achieving sub-second P99 latencies. We’ll cover the abstraction that makes Pinot location-agnostic, strategies like pipelining, prefetching, and selective block fetches, and how to balance local and cloud storage for both cost efficiency and speed.

As Fast as Possible, But Not Faster: ScyllaDB Flow Control

Nadav Har’El, Distinguished Engineer at ScyllaDB

Pushing requests faster than a system can handle results in rapidly growing queues. Left unchecked, this risks exhausting memory and destabilizing the system. This talk discusses how we engineered ScyllaDB’s flow control for high-volume ingestion, allowing it to throttle over-eager clients to exactly the right pace – not so fast that we run out of memory, but also not so slow that we let available resources go to waste.
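The trade-off described here can be shown with a toy discrete-time model. This is a sketch under stated assumptions, not ScyllaDB's actual mechanism: the `simulate` function, the per-tick rates, and the `max_backlog` limit are all made up for illustration. A client offers 150 requests per tick to a server that can complete 100 per tick.

```python
def simulate(ticks, offered_per_tick, served_per_tick, max_backlog=None):
    """Return (requests served, peak backlog) for a simple queueing model.

    With `max_backlog` set, the server only admits requests while there is
    room in the backlog, which pushes back on the client.
    """
    backlog = served = peak = 0
    for _ in range(ticks):
        admitted = offered_per_tick
        if max_backlog is not None:
            admitted = min(offered_per_tick, max_backlog - backlog)
        backlog += admitted
        peak = max(peak, backlog)
        done = min(backlog, served_per_tick)
        backlog -= done
        served += done
    return served, peak


unthrottled = simulate(100, 150, 100)
well_tuned = simulate(100, 150, 100, max_backlog=200)
too_strict = simulate(100, 150, 100, max_backlog=50)
print(unthrottled, well_tuned, too_strict)
```

Without a limit, the backlog grows by 50 requests every tick and never stops. With a limit of 200, the server stays fully busy while the backlog stays bounded. With a limit of 50, memory is safe but half the server's capacity sits idle. Finding the pace that avoids both failure modes is the problem the talk addresses.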

Push the Database Beyond the Edge

Nikita Sivukhin, Software Engineer at Turso

Almost any application can benefit from having data available locally – enabling blazing-fast access and optimized write patterns. This talk will walk you through one approach to designing a full-featured sync engine, applicable across a wide range of domains, including front-end, back-end, and machine learning training.

Engineering a Low-Latency Vector Search Engine for ScyllaDB

Pawel Pery, Senior Software Engineer at ScyllaDB

Implementing Vector Search in ScyllaDB brings challenges ranging from low latency to predictable performance at scale. Rather than embedding HNSW indexing directly into the core database, we decoupled vector indexing and similarity search into a dedicated Rust engine. Learn about the architectural design decisions that enabled us to combine ScyllaDB’s shard-per-core architecture for real-time operations with high-performance ANN processing via USearch.

We Told B+ Trees to Do Sorted Sets—They Nailed It (Joe Zhou, Dragonfly)

Joe Zhou, Developer Advocate at DragonflyDB

Sorted sets are a critical Redis data type used for leaderboards, time-series data, and priority queues. However, Redis’s skiplist-based implementation introduces significant memory overhead—averaging 37 bytes per entry on top of the essential 16 bytes for the (member, score) pair. For large sorted sets, this inefficiency can become a major bottleneck.

In this talk, we’ll explore how Dragonfly reimplemented sorted sets using a B+ tree, reducing memory overhead to just 2-3 bytes per entry while improving performance. We’ll cover:

Why skiplists are inefficient for large sorted sets.
How B+ trees with bucketing drastically cut memory usage while maintaining O(log N) operations.
Benchmark results showing 40% lower memory and better throughput vs. Redis.

This optimization, now stable in Dragonfly, demonstrates how rethinking core data structures can unlock major efficiency gains. Attendees will leave with insights into:

Trade-offs between skiplists and B+ trees.
Real-world impact on memory and latency (P99 improvements).
Lessons from implementing a custom ranking API for B+ trees.
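For a sense of scale, here is the per-entry arithmetic implied by the figures quoted above (16 bytes for the pair, 37 bytes of average skiplist overhead, and 2-3 bytes of B+ tree overhead). The entry count of 10 million is an arbitrary assumption, and the calculation covers only the sorted set's own entries, so it need not match end-to-end benchmark numbers.

```python
PAIR_BYTES = 16          # (member, score) pair, as quoted in the abstract
SKIPLIST_OVERHEAD = 37   # average per-entry overhead quoted for the skiplist
BPTREE_OVERHEAD = 3      # upper end of the quoted 2-3 bytes per entry


def sorted_set_bytes(entries, overhead_per_entry):
    """Approximate memory for the entries of one sorted set."""
    return entries * (PAIR_BYTES + overhead_per_entry)


entries = 10_000_000  # hypothetical sorted set size
skiplist_bytes = sorted_set_bytes(entries, SKIPLIST_OVERHEAD)
bptree_bytes = sorted_set_bytes(entries, BPTREE_OVERHEAD)
saving = 1 - bptree_bytes / skiplist_bytes
print(skiplist_bytes, bptree_bytes, round(saving, 3))
```

Under these assumptions the entries take roughly 530 MB with the skiplist and 190 MB with the B+ tree. Memory used by the rest of the process is not included, which is one reason whole-system measurements can show a smaller percentage.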

Keynote: Andy Pavlo

You can also look forward to a keynote by Andy Pavlo. We’re not revealing the topic yet, but if you know Andy, you know you won’t want to miss it.
