Case Study: TellusLabs Taps ScyllaDB to Power Predictive Maps of Global Agriculture

By Ophir Horn, Head of Engineering, TellusLabs

About TellusLabs

TellusLabs builds a living map of the world’s food supply by combining satellite feeds with meteorological data to track crop health, predict yields, and translate raw data into vital signs for global agriculture.

TellusLabs’ overarching goal is to provide a Google Earth-like experience that delivers insights at both the national and global levels, while also enabling users to zoom in all the way to individual fields. On top of this, TellusLabs provides predictive capabilities based on machine learning and artificial intelligence to estimate crop yields at different levels.

Customers range from chicken farmers in Asia who are sensitive to the price of corn in the US, to ag companies comparing the effectiveness of new seeds, to traders seeking to profit from fluctuations in commodity futures, to insurance companies looking to better assess the actual condition of the fields and crops under coverage.

The Challenge

Agriculture is a fluid enterprise. Fields change continuously as farmers switch crops in response to weather and market demand. TellusLabs tracks these changes in real time using data-intensive image processing. Satellite images are converted at the pixel level into time series data structures across green, red, blue, and other wavelengths. Pixel data is timestamped, rolled up to fields and counties, and then to states and countries. It is also enriched with global meteorological information and other features, such as moisture, that are critical to understanding and estimating crop growth.
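
The roll-up described above can be sketched in a few lines of Python. This is only an illustration, not TellusLabs' pipeline: the `PixelReading` type, the identifiers, the sample values, and the use of a simple mean as the aggregation function are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class PixelReading:
    """One timestamped pixel value for one wavelength band (hypothetical shape)."""
    day: date
    county: str
    field_id: str
    band: str      # e.g. "red", "green", "blue"
    value: float


def roll_up(readings, key):
    """Group readings by key(reading) and average the values in each group.

    Averaging is an assumption; the source does not say how pixels are aggregated.
    """
    groups = defaultdict(list)
    for reading in readings:
        groups[key(reading)].append(reading.value)
    return {k: mean(values) for k, values in groups.items()}


# Made-up sample data: three pixels, two fields, one county, one day.
readings = [
    PixelReading(date(2020, 6, 1), "county-A", "field-1", "red", 0.25),
    PixelReading(date(2020, 6, 1), "county-A", "field-1", "red", 0.75),
    PixelReading(date(2020, 6, 1), "county-A", "field-2", "red", 0.80),
]

# Pixel -> field, then pixel -> county; states and countries would follow the same pattern.
by_field = roll_up(readings, lambda r: (r.day, r.field_id, r.band))
by_county = roll_up(readings, lambda r: (r.day, r.county, r.band))
```

Each aggregation level is keyed by day and band, so the result at every level remains a time series.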

With 15 years of data, 365 days a year, 250 metrics per field and a goal to support 500 million fields globally, TellusLabs aims to manage more than 600 trillion records. Even five years ago, the sheer compute and storage requirements meant only large corporations and governments could leverage datasets of this magnitude. TellusLabs needed a database capable of handling a big data use case in real time.
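
As a quick sanity check on that figure, the numbers quoted above can be multiplied out. The sketch assumes one record per metric, per field, per day, which the source implies but does not state outright; the constant and function names are illustrative.

```python
# Figures quoted in the text.
YEARS = 15
DAYS_PER_YEAR = 365
METRICS_PER_FIELD = 250
FIELDS = 500_000_000


def total_records(years, days_per_year, metrics_per_field, fields):
    """Record count, assuming one record per metric per field per day."""
    return years * days_per_year * metrics_per_field * fields


records = total_records(YEARS, DAYS_PER_YEAR, METRICS_PER_FIELD, FIELDS)
print(f"{records:,} records (~{records / 1e12:.0f} trillion)")
```

Under that assumption the total comes to roughly 684 trillion, consistent with the "more than 600 trillion" target.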

“Our primary requirement was for a super space efficient database that can store this amount of data at a reasonable price, while still serving this data with low latency when slicing a relevant subset from those 600 trillion records,” explained Ophir Horn, Head of Engineering at TellusLabs. “Being a very small team, keeping the operational overhead as well as the number of instances as low as possible was also critically important.

“None of the existing hosted services (such as Amazon DynamoDB) were even close to meeting our price/storage requirements, rendering a cluster at that size simply too expensive. It seemed that a columnar database with linear scalability such as Cassandra or ScyllaDB was going to be the best fit from both space and performance perspectives.”

The Solution

An evaluation of Apache Cassandra quickly made it apparent that, while the technology could store the planned amount of data efficiently, the number of instances required to do so while still achieving low latency would impose significant operational overhead on TellusLabs’ small team.

The promise of high performance, autonomous tuning, and optimal utilization led TellusLabs to evaluate ScyllaDB. Initial testing showed that ScyllaDB delivered on expectations, allowing TellusLabs to run a cluster one-third the size of the one Apache Cassandra required. In tests, ScyllaDB performed 100% better than Cassandra while making more efficient use of the available hardware. These results led TellusLabs to replace Cassandra and deploy ScyllaDB in production.

“Having worked with Cassandra, I was very pleased to see that ScyllaDB does what I expect a database to do,” said Horn. “ScyllaDB hasn’t failed or had any hiccups. To be honest, I’ve spent zero time trying to figure out what’s going on. ScyllaDB just works.”

Migration from Cassandra to ScyllaDB was quick and painless. With a primary copy of the data from a previous migration at hand, Horn installed ScyllaDB, ran a stress test on the new installation, switched over, and decommissioned Cassandra. Because only a small subset of the data was stored in the database at that time, migrating the entire dataset took just a few hours.

ScyllaDB also provided data security. Some of TellusLabs’ customers incorporate proprietary, yield-sensitive data into their data models. TellusLabs uploads that data into the system, where it must remain isolated from other customers. ScyllaDB’s partition scheme enables TellusLabs to protect proprietary data in a multi-tenant scenario.
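
The source does not describe the actual schema. One common pattern in Cassandra-style data models, assumed here purely for illustration, is to make a tenant identifier part of the partition key, so each customer's rows live in separate partitions and every read must name the tenant. The toy in-memory class below imitates that idea; it is not a ScyllaDB API, and all names in it are hypothetical.

```python
from collections import defaultdict


class TenantPartitionedStore:
    """Toy stand-in for a table whose partition key is (tenant_id, field_id)."""

    def __init__(self):
        # Maps each partition key to the list of rows stored in that partition.
        self._partitions = defaultdict(list)

    def write(self, tenant_id, field_id, row):
        """Append a row to the partition owned by this tenant and field."""
        self._partitions[(tenant_id, field_id)].append(row)

    def read(self, tenant_id, field_id):
        """Return a copy of the rows in exactly one tenant's partition."""
        return list(self._partitions.get((tenant_id, field_id), []))


store = TenantPartitionedStore()
# Two hypothetical customers upload data for a field with the same identifier.
store.write("customer-a", "field-1", {"metric": "yield", "value": 180.0})
store.write("customer-b", "field-1", {"metric": "yield", "value": 165.0})

rows_a = store.read("customer-a", "field-1")
rows_b = store.read("customer-b", "field-1")
```

Because the tenant is part of the key, a query scoped to one customer cannot return another customer's rows, even when field identifiers collide.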

Horn sees a bright future for ScyllaDB at TellusLabs. “There are tons of image sensors around the world, tracking everything. With ScyllaDB, we can achieve our goal of building a platform that incorporates more data sources into our models, increasing flexibility and profitability for our customers.”