
Learn how to build a Rust application that tracks Bluesky user experiences and events.
Let’s build a high-performance, scalable, and reliable application that can:
- Fetch and process public events from the Bluesky platform.
- Track user events and experiences.
- Implement a leveling system with experience points (XP).
- Display user levels and progress based on XP via a REST API.
1. Background
Bluesky, which uses a mix of SQLite and ScyllaDB to store data, has a really cool feature called Firehose. Firehose is an aggregated stream of all the public data updates in the network.
You can get a feel for it by visiting FireSky.tv, an app that consumes this stream and serves it directly in the browser. Implementing it from scratch requires deep knowledge of the AT Protocol. But a Bluesky engineer built Jetstream: a Firehose aggregator. With Jetstream, you can just listen on a WebSocket and get a JSON stream of selected events.
Here’s a sample of an event payload from Jetstream:
Just listening to one of these streams without any issues is amazing. And it turns out that you can even select which type of event you want to listen to, like:
- app.bsky.graph.follow;
- app.bsky.feed.post;
- app.bsky.feed.like;
- app.bsky.feed.repost;
- and many more!
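As a small sketch of how that selection can look on the client side: Jetstream takes the collections you care about as repeated `wantedCollections` query parameters on the subscribe URL. The helper name `subscribe_url` is hypothetical, and the host used in the usage note is an assumption (one of the public instances), so swap in whichever instance you actually connect to.

```rust
/// Builds a Jetstream subscribe URL filtered by collection (NSID).
/// `base` is the WebSocket endpoint of a Jetstream instance (assumed by the caller).
/// NSIDs only contain letters, digits, dots and hyphens, so no URL-encoding is done here.
fn subscribe_url(base: &str, collections: &[&str]) -> String {
    if collections.is_empty() {
        // No filter: the stream carries every collection.
        return base.to_string();
    }
    let query = collections
        .iter()
        .map(|c| format!("wantedCollections={c}"))
        .collect::<Vec<_>>()
        .join("&");
    format!("{base}?{query}")
}
```

For example, `subscribe_url("wss://jetstream2.us-east.bsky.network/subscribe", &["app.bsky.feed.post", "app.bsky.feed.like"])` yields a URL with two `wantedCollections` parameters, one per event type.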
But how can we turn it into an application? Well, it depends on your needs. The data is there; just consume it and do your magic! In my case, I like to transform data into games.
2. Gamifying Jetstream
I’m not a game developer, but games follow an Event-Driven Development approach, right? Every time you earn enough points at something, you level up or learn a new skill.
But to earn experience points, users need to take actions. And that’s what you do inside a Social Network: actions!
Imagine that every time you:
- Post:
  - Just text? Earns 50 experience
  - Has media? Earns 60 experience
  - Has media with alt text? Earns 70 experience
- Like: Earns 10 experience
- Repost:
  - Just text? Earns 50 experience
  - Has media? Earns 60 experience
  - Has media with alt text? Earns 70 experience!
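The rules above fit naturally into a pair of enums and a single `match`. This is only a sketch: the type names (`Media`, `Action`) and the function `experience_for` are hypothetical and not taken from the project; only the XP values come from the list.

```rust
/// What kind of media a post or repost carries.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Media {
    None,
    WithoutAltText,
    WithAltText,
}

/// The user actions that earn experience.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Post(Media),
    Like,
    Repost(Media),
}

/// Maps an action to the experience it earns, following the list above.
fn experience_for(action: Action) -> u32 {
    match action {
        Action::Like => 10,
        // Posts and reposts share the same scale.
        Action::Post(media) | Action::Repost(media) => match media {
            Media::None => 50,
            Media::WithoutAltText => 60,
            Media::WithAltText => 70,
        },
    }
}
```

Keeping the values in one function means a new rule (say, a follow) is one more enum variant and one more match arm.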
There are plenty of other abstractions that can be done, but that’s the idea.
The experience will be calculated using arithmetic progression, and should follow this simple rule:
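As a minimal sketch of how an arithmetic progression can drive levels: assume that going from level 1 to level 2 costs a base amount, and each following level costs a fixed step more than the previous one. `BASE_XP` and `STEP_XP` below are hypothetical values picked for illustration, not the project’s actual numbers.

```rust
const BASE_XP: u64 = 100; // hypothetical: XP needed to go from level 1 to level 2
const STEP_XP: u64 = 50; // hypothetical: extra XP each further level costs

/// Total XP needed to reach `level` (level 1 needs 0 XP).
/// Sum of the first n = level - 1 terms of the progression:
/// n * (2 * BASE_XP + (n - 1) * STEP_XP) / 2
fn total_xp_for_level(level: u64) -> u64 {
    let n = level.saturating_sub(1);
    n * (2 * BASE_XP + n.saturating_sub(1) * STEP_XP) / 2
}

/// Highest level whose total XP requirement is covered by `xp`.
fn level_for_xp(xp: u64) -> u64 {
    let mut level = 1;
    while total_xp_for_level(level + 1) <= xp {
        level += 1;
    }
    level
}
```

With these values, the steps cost 100, 150, 200, ... XP, so the thresholds for levels 2, 3 and 4 are 100, 250 and 450 XP in total.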
With that, we can now talk about the technologies used in this project.
3. Meet the Stack
Bluesky uses ScyllaDB to serve its AppView layer, with high availability and throughput in mind, so we’re going to do the same!
Also, I’ve been using Rust extensively (and always learning more!), so I decided to implement this project with Rust. Here’s the tech stack in a nutshell:
- Language: Rust
- Database: ScyllaDB
- Packages:
  - HTTP Server: actix-web
  - ORM: charybdis
  - Jetstream Client: jetstream-oxide
  - Bluesky Client: atrium-api
My goal is to build something that, besides creating cool charts on Grafana, can also display something via REST API. First, let’s explore our data modeling strategy.
4. What about the Data Modeling?
Initially, the idea was to just store the events and test how stressed the app/database would become. But, at this point, we can go a little bit further.
ScyllaDB follows a Query Driven Development approach because it’s a Wide-Column NoSQL Database. Let’s think about that. First, it’s an RPG focused on a timeline profile, so it will have heavy read operations on top of the “characters”:
Since we only have one item in the WHERE clause, our query is a key-value lookup.
But wait… we also need to store the current experience of this user. For that, I would use the Counter type to atomically store it using key-value pairs:
It’s supposed to be simple, just like this! But it also has to be fast enough to serve 1M requests/s with ease.
WARNING: Counter columns can’t be part of the primary key, neither as partition keys nor as clustering keys. Also, if you use them in a table, every column outside the primary key must be a Counter!
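To make the counter idea concrete, here is a hedged sketch of the CQL involved, kept as Rust constants the way an application might hold its statements. The table name `characters_experience` and the column names are assumptions for illustration; the point is that a counter is changed with a relative `UPDATE` (`c = c + ?`) rather than being inserted or overwritten.

```rust
/// Counter table sketch: a single partition key plus one counter column.
/// Table and column names are illustrative assumptions.
const CREATE_EXPERIENCE_TABLE: &str = "
CREATE TABLE IF NOT EXISTS characters_experience (
    user_handle text PRIMARY KEY,
    current_experience counter
)";

/// Counters are incremented in place; the database applies the delta.
const INCREMENT_EXPERIENCE: &str =
    "UPDATE characters_experience SET current_experience = current_experience + ? WHERE user_handle = ?";

/// Reading the total back is a plain key-value lookup.
const READ_EXPERIENCE: &str =
    "SELECT current_experience FROM characters_experience WHERE user_handle = ?";
```

The first bind marker of the increment takes the XP earned by the event, and the second one takes the user.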
I also want to track all possible events happening in a user’s account and list them in our extension to show how that person can be a better Bluesky user.
So, the queries would be centered on users, and the events must be clustered in descending order:
Alright, that should be enough for an MVP. Now let’s model each part showing some Rust and Charybdis ORM!
4.1 Modeling: Leveling State UDT
Since we’re using ScyllaDB, we can use UDTs (User Defined Types). Keeping track of operations can be a pain. However, if you’re making this a pattern across all tables, UDTs can be useful when you don’t want to recreate the same fields every time.
Now we can just use it around the other tables, whether it’s related to events or characters.
4.2 Modeling: Characters Table
This will be the most accessed table in our project via the REST API. And the modeling (at this moment) is simple, since we only want the user_handle and the leveling state (UDT).
Check it out:
With the UDT, we can serve exactly the latest leveling state to build a UI later on. We can also add new fields since none of them will be part of the Partition Key.
4.3 Modeling: Characters Experience Table
As mentioned earlier, we should store the experience in a way that avoids race conditions. Why? ScyllaDB is a highly available database that replicates your data across multiple nodes, so concurrent writers can update the same row at the same time.
To avoid race conditions, we use the Counter type, whose increments are applied atomically by the database rather than through a read-modify-write cycle in the application. That way, concurrent updates to the same user’s experience don’t overwrite each other. Yes, it impacts performance. However, Counters are designed and optimized for this type of operation.
The modeling would be:
Now the last one, the events table!
4.4 Modeling: Events Table and MV
This is the most “complicated” part, but it’s not that hard. As mentioned before, there are plenty of events around ATProto and Bluesky, and I want to record every possible event for each user.
Displaying data in descending order is a must. ScyllaDB can provide this functionality if you include a Clustering Key in your table.
Check it out:
With CLUSTERING ORDER BY (event_at DESC), I’m basically telling ScyllaDB that every time I fetch a chunk of data from this table, the most recent events always come first.
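If the effect of a descending clustering key feels abstract, this in-memory analogy may help. It is not ScyllaDB and not the project’s code: a `HashMap` plays the role of the partitions (one per user) and a `BTreeMap` keyed by `Reverse(event_at)` plays the role of the clustering order. The type `EventsTable` and its methods are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::{BTreeMap, HashMap};

/// In-memory stand-in for a table partitioned by user and clustered by
/// `event_at` in descending order.
#[derive(Default)]
struct EventsTable {
    // partition key -> rows sorted by clustering key (newest first)
    partitions: HashMap<String, BTreeMap<Reverse<u64>, String>>,
}

impl EventsTable {
    fn insert(&mut self, user_handle: &str, event_at: u64, event_type: &str) {
        self.partitions
            .entry(user_handle.to_string())
            .or_default()
            .insert(Reverse(event_at), event_type.to_string());
    }

    /// Reads the first `limit` rows of one partition, in stored order,
    /// which means newest first.
    fn latest(&self, user_handle: &str, limit: usize) -> Vec<(u64, &str)> {
        self.partitions
            .get(user_handle)
            .map(|rows| {
                rows.iter()
                    .take(limit)
                    .map(|(Reverse(at), kind)| (*at, kind.as_str()))
                    .collect()
            })
            .unwrap_or_default()
    }
}
```

Rows can be inserted in any order; reading a partition always starts from the newest `event_at`, with no sorting at query time.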
However, now we have a problem. Imagine that we want to list all events from a specific type. With this table, we’re not able to do that.
Why? Because a WHERE clause can only use columns that are part of the partition key or the clustering key. However, we can get around this by creating a Materialized View!
Materialized Views are tables derived from a base table. Every time the base table receives a write, the view receives it too. You can then choose a different partitioning and clustering for the view.
Check it out:
Now, we have different partitions for the same user, storing different types of events that we’re able to query directly.
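Here is a hedged sketch of what such a base table and view can look like in CQL, again held as Rust constants. The names (`events`, `events_by_type`) and the reduced column set are assumptions. Note that a view’s primary key has to include every primary key column of the base table, and each of its key columns needs an `IS NOT NULL` restriction.

```rust
/// Base table: one partition per user, newest events first.
/// Names and columns are illustrative assumptions.
const CREATE_EVENTS: &str = "
CREATE TABLE IF NOT EXISTS events (
    user_handle text,
    event_at timestamp,
    event_type text,
    PRIMARY KEY (user_handle, event_at)
) WITH CLUSTERING ORDER BY (event_at DESC)";

/// View: same rows, but partitioned by (user, event type).
const CREATE_EVENTS_BY_TYPE: &str = "
CREATE MATERIALIZED VIEW IF NOT EXISTS events_by_type AS
    SELECT * FROM events
    WHERE user_handle IS NOT NULL
      AND event_type IS NOT NULL
      AND event_at IS NOT NULL
    PRIMARY KEY ((user_handle, event_type), event_at)
    WITH CLUSTERING ORDER BY (event_at DESC)";

/// Query that the base table alone could not serve by key.
const LATEST_BY_TYPE: &str =
    "SELECT * FROM events_by_type WHERE user_handle = ? AND event_type = ? LIMIT ?";

/// Statements in the order they must be applied (table before view).
fn schema_statements() -> [&'static str; 2] {
    [CREATE_EVENTS, CREATE_EVENTS_BY_TYPE]
}
```

Writes still go only to `events`; the view is kept in sync by the database and is read-only from the application’s point of view.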
With that, our data modeling is finally DONE! Let’s jump into some business rules implementation.
5. Hands-on: Application Flow
With the basics taken care of, let’s explain how everything works under the hood.
5.1 App: Jetstream Oxide
At the Websocket layer, we’re using the Jetstream Oxide package to receive all the events in an elegantly structured way. The boilerplate can be like:
Each type of event awards a specific amount of experience and is handled asynchronously. With that, the goal was to follow the Open-Closed Principle (OCP), so that supporting a new event only requires adding a new handler, without changing the existing ones:
That takes us to the last step, which sets up the event default behavior at the Trait. We have three types of event actions: Create, Update, and Delete.
The Handler will take care of the whole Action/Communication with ScyllaDB through Charybdis ORM. In this example, you can check how the CreateEventHandler works:
We can support other types of events just by implementing the trait for a new struct, and everything else keeps working.
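The trait-with-default-behavior idea can be sketched without any of the real crates. Everything below is hypothetical and synchronous for brevity: the trait `EventHandler`, the structs, and especially the rule that an update earns nothing and a delete takes the experience back are assumptions made for this example, not the project’s actual behavior.

```rust
use std::collections::HashMap;

/// The three kinds of event actions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EventAction {
    Create,
    Update,
    Delete,
}

/// Hypothetical trait: each event type only states its XP value.
trait EventHandler {
    fn experience(&self) -> i64;

    /// Default behavior shared by all events (an assumption for this sketch):
    /// creating earns XP, updating earns nothing, deleting takes the XP back.
    fn handle(&self, action: EventAction) -> i64 {
        match action {
            EventAction::Create => self.experience(),
            EventAction::Update => 0,
            EventAction::Delete => -self.experience(),
        }
    }
}

struct LikeEvent;
impl EventHandler for LikeEvent {
    fn experience(&self) -> i64 {
        10
    }
}

struct PostEvent;
impl EventHandler for PostEvent {
    fn experience(&self) -> i64 {
        50
    }
}

/// Maps a collection name to its handler. Adding an event type means adding
/// one struct and one line here, without touching existing handlers.
fn registry() -> HashMap<&'static str, Box<dyn EventHandler>> {
    let mut handlers: HashMap<&'static str, Box<dyn EventHandler>> = HashMap::new();
    handlers.insert("app.bsky.feed.like", Box::new(LikeEvent));
    handlers.insert("app.bsky.feed.post", Box::new(PostEvent));
    handlers
}

/// Returns the XP delta for an event, or `None` for unknown collections.
fn dispatch(
    handlers: &HashMap<&'static str, Box<dyn EventHandler>>,
    collection: &str,
    action: EventAction,
) -> Option<i64> {
    handlers.get(collection).map(|handler| handler.handle(action))
}
```

In a real handler, the returned delta is what would be written to the database; here it is simply returned so the dispatch logic can be checked in isolation.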
5.2 App: Actix Web
For serving this data, there’s a simple implementation of an endpoint using Actix. Since the long-term goal is to build a browser extension, we need to serve an endpoint with the character/user information:
6. Conclusion
This exploration of Bluesky Jetstream and its potential for gamification showcases the power of leveraging cutting-edge technologies like ScyllaDB and Rust to build scalable, high-performance applications.
By focusing on event-driven development, we successfully demonstrated how to create an interactive system that transforms social media activities into measurable, gamified metrics.
You can check out the project here.