Lubos Kosco and Andrej Chu of Rocket Fuel are coming to Scylla Summit 2017 to present how they use Scylla for Page Context Categorization. If you work at the infrastructure level, this talk is a great opportunity to learn more about processing large amounts of data with Scylla. Let’s learn more about the speakers and their presentation.
Please tell us about yourselves and what you do at Rocket Fuel.
I am a Senior Rocket Scientist, responsible for programmatic advertising bidders written in Java.
I lead the Serving Infrastructure team in Prague, which is responsible for bidding and ad serving.
How did you both get started in the advertising space?
We got an opportunity to work for a Silicon Valley company with a cool culture and cool new technologies, so we decided to go for it.
What will you be talking about at Scylla Summit 2017?
We will be talking about using Scylla for Page Context Categorization at Rocket Fuel.
What is Page Context Categorization?
Page Context Categorization is the process of gathering as much information as possible about the context in which our ads will be displayed. This includes determining which categories the page we are about to display an ad on fits into. Clients can then target certain categories (“I want to display my ads on pages related to the car industry”) or exclude them (“I don’t want to display my ads on pages that fall into the gambling category”).
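To make the include/exclude idea concrete, here is a minimal sketch in Python. It is not Rocket Fuel's implementation: the `Targeting` class, the `page_matches` function, and the category names are all hypothetical, and the rule that exclusions take priority over inclusions is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Targeting:
    """Hypothetical per-campaign targeting rules (names are illustrative)."""
    include: frozenset = frozenset()  # categories the client wants to target
    exclude: frozenset = frozenset()  # categories the client wants to avoid


def page_matches(page_categories, targeting):
    """Return True if an ad with these rules may be shown on the page.

    Assumption: an excluded category always wins, and an empty
    include set means "no restriction".
    """
    categories = set(page_categories)
    if categories & targeting.exclude:
        return False
    if targeting.include and not (categories & targeting.include):
        return False
    return True


# A client who wants car-related pages but never gambling pages.
rules = Targeting(include=frozenset({"cars"}), exclude=frozenset({"gambling"}))
print(page_matches({"cars", "news"}, rules))
print(page_matches({"cars", "gambling"}, rules))
```

In this sketch the page's categories would be what the cache returns for a given page; the check itself is just two set intersections, which keeps it cheap enough to run on every bid.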
What type of audience will be interested in your talk?
Our talk is mainly aimed at developers, system administrators, and DevOps engineers.
Can you please tell me more about your talk?
At Rocket Fuel, the first use case for Scylla is fast caching of contextual information about the pages on which we bid to show ads. We have 2,000 clients in seven data centers connecting to multiple Scylla clusters, typically set up as one three-node cluster per data center.
In the second scenario, we use Scylla as a cache for IP address blacklists. Here, the freshness of the information available to the clients is crucial. We need to load lists of IP addresses (~60M records) and then refresh them at 15-minute intervals, which typically involves 1M+ upserts and deletes in a single run. Even with this level of data churn, we still need to serve the clients described in the first scenario.
In total, the cache needs to handle more than 150k requests per second (from approximately 1,400 connections) in a single data center.
High availability is critical and there are tight SLAs around the whole process, requiring low latency with average response times well under one millisecond.
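One way to picture the periodic blacklist refresh is as a diff between the currently cached list and a freshly loaded one. The sketch below is an assumption about how such a refresh could be structured, not a description of Rocket Fuel's pipeline: `diff_blacklists` is a hypothetical helper, the IP addresses are made up, and writing the resulting changes to the database is left out.

```python
def diff_blacklists(current, fresh):
    """Compute the changes needed to turn `current` into `fresh`.

    Both arguments are iterables of IP address strings.
    Returns (to_upsert, to_delete) as sets.
    """
    current, fresh = set(current), set(fresh)
    to_upsert = fresh - current   # addresses newly blacklisted
    to_delete = current - fresh   # addresses no longer blacklisted
    return to_upsert, to_delete


# Illustrative data only.
cached = {"10.0.0.1", "10.0.0.2"}
loaded = {"10.0.0.2", "10.0.0.3"}
to_upsert, to_delete = diff_blacklists(cached, loaded)
print(sorted(to_upsert), sorted(to_delete))
```

Applying only the difference keeps each run proportional to the amount of churn rather than to the full list, which matters when the list is far larger than the number of changes per refresh.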
Why did Rocket Fuel choose Scylla?
We were looking for a solution that would offer big-table capabilities, high availability, and (especially) low latency, which is critical for our systems, as we have very stringent SLAs to meet.
How can people get in touch with you?
Thank you very much, Lubos and Andrej. We cannot wait to see your talk in person and learn more. If you want to attend Scylla Summit 2017 and enjoy more talks like this one, please register here.
Scylla Summit is taking place in San Francisco, CA on October 24-25. Check out the current agenda on our website to learn about the rest of the talks—including technical talks from the Scylla team, the Scylla roadmap, and a hands-on workshop where you’ll learn how to get the most out of your Scylla cluster.