Case Study: To Prevent Accidents Before They Happen, Meshify Calls on ScyllaDB to Scale IoT Sensor Data

By Sam Kenkel, DevOps Lead, Meshify

About Meshify

Meshify provides an IoT platform with a focus on the insurance industry. The company’s battery-powered wireless sensors transmit environmental data that is used to predict and prevent accidents, thereby averting insurance claims. Based in Austin, Texas, Meshify is a wholly owned subsidiary of HSB Group in Hartford, CT, part of Munich Re, one of the world’s largest reinsurers.

Meshify provides a range of sensors for various environments. Those sensors collect data for real-time alerting as well as historical data used to build predictive models. For example, pipe sensors compare ambient temperature with fluid temperature, enabling the system to alert a customer before a pipe freezes and potentially bursts. Other examples of Meshify sensors include water sensors that detect leaks, and refrigerator sensors that detect thawing.

Meshify sells sensors primarily to insurance companies, who in turn provide the sensors directly to the insured. Unsurprisingly, the cost of sensors proves to be much lower than payouts on accident claims.

The Challenge

Industry analysts estimate that sensor data is growing at fifty times the rate of traditional business data. Like the rest of the IoT industry, Meshify needed to scale rapidly, and traditional relational databases weren’t fitting the bill.

The Meshify platform runs against three primary time-series databases: one stores the history of every sensor’s data, one stores a history of alarm states, and one stores the history of alerts sent to customers. Originally, Meshify built out a version of the service on a relational database, specifically MySQL. Like many other companies, Meshify soon found that MySQL, or any other relational solution for that matter, is a fundamentally poor fit for time-series data.

“We knew that MySQL wasn’t going to scale in the abstract, but the half-life of migrating to larger and more expensive EC2 instances kept shrinking,” said Sam Kenkel, DevOps and Database Reliability Engineer at Meshify. “When you’re working with IoT time series data that you’re storing forever, while also continually adding more nodes into your platform, you discover that MySQL is aggressively not designed for that type of scalability.”
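The case study does not describe Meshify's actual data model, so the following is only a minimal sketch of one common way to lay out sensor history in a partitioned store: group readings by sensor and by day, so that an ever-growing history is split into many bounded partitions. The `partition_key` function, the daily bucket size, and the sensor name `pipe-17` are all hypothetical, and a plain dictionary stands in for the database.

```python
from collections import defaultdict
from datetime import datetime, timezone

def partition_key(sensor_id: str, ts: datetime) -> tuple[str, str]:
    """Hypothetical key: one partition per sensor per UTC day."""
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (sensor_id, day)

# In-memory stand-in for a partitioned table (not a real database client):
# partition key -> rows of (timestamp, value), kept sorted by timestamp.
table: dict[tuple[str, str], list[tuple[datetime, float]]] = defaultdict(list)

def insert_reading(sensor_id: str, ts: datetime, value: float) -> None:
    rows = table[partition_key(sensor_id, ts)]
    rows.append((ts, value))
    rows.sort(key=lambda row: row[0])  # keep time order within the partition

utc = timezone.utc
insert_reading("pipe-17", datetime(2020, 1, 1, 23, 59, tzinfo=utc), 3.5)
insert_reading("pipe-17", datetime(2020, 1, 2, 0, 1, tzinfo=utc), 3.1)
insert_reading("pipe-17", datetime(2020, 1, 1, 12, 0, tzinfo=utc), 4.0)

for key, rows in sorted(table.items()):
    print(key, [value for _, value in rows])
```

Because each partition covers a bounded time window, storing data forever adds more partitions rather than making any single one larger, and new partitions can be spread across nodes as the cluster grows.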

Recognizing the need for a non-relational database, Meshify evaluated Apache Cassandra, ScyllaDB, and Amazon DynamoDB, a managed cloud database.

Meshify quickly disqualified the managed cloud option because of vendor lock-in: the team did not want a core component of its platform tied to a single cloud provider. Preserving cloud provider independence was important to Meshify.

Another strike against a managed cloud database was data residency. Since Meshify’s parent company is based in Germany, where there are strict regulations around data residency and where GDPR laws apply, the team needed to have absolute confidence about where data is stored. On top of that, long-term TCO projections revealed that a managed cloud solution would have been prohibitively expensive for Meshify’s use case.

The ScyllaDB Solution

In a side-by-side comparison, the team found that ScyllaDB beat Cassandra in performance and ease of use. A key factor was ScyllaDB’s ability to make optimal use of modern hardware and scale up, enabling Meshify to run a smaller cluster of larger nodes. “The fewer nodes, the easier it is to manage,” noted Kenkel. “Every node is another hostname, another key address, another object to keep track of in terms of the alerting and reporting stack.”

“I love the fact that when we needed more ScyllaDB resources, we just did a round-robin and added a large node, retired a small node, rinse and repeat. Suddenly, we’re at three large nodes with better performance. For our visualizations and internal monitoring infrastructure, it’s much easier to run and keep track of three nodes instead of nine.”

Running fewer nodes also helps Meshify save on infrastructure monitoring costs. Meshify monitors ScyllaDB using Datadog, which charges per monitored server, so three nodes cost less to monitor than nine.

Kenkel appreciates the speed with which he can now scale out. According to Kenkel, ScyllaDB’s prebuilt, pre-tuned AMIs enable Meshify to spin up a ScyllaDB node on an i3 instance in less than five minutes, without worrying about keyspaces or sharding. “I don’t have to worry about tuning the JVM or even wait for ScyllaDB to benchmark. That’s a tangible benefit when I need to spin up a new node for testing or for disaster recovery after losing a node.”

Meshify also benefits from ScyllaDB’s self-optimizing operations, which enable background operations to run safely alongside operational workloads. “The ability to handle batched workloads without interrupting our sensor ingest is a key component of our disaster recovery planning. Doing disaster recovery planning for worst case scenarios, spinning up a new cluster and restoring the schema enables our product to work for our customers while we simultaneously batch in the historical data.”

By the Numbers

  • Data points ingested: 23,323,005
  • Number of nodes: 3
  • Writes per minute: 5,000
  • SLA for each insert operation:
  • 99th percentile read latencies: Averaging 0.9ms on one node, 0.7ms on the other two
  • 99th percentile write latencies: 1.05ms on one node, averaging < 0.7ms on the other two
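
To put the write rate in per-second terms, the figures above can be converted with some simple arithmetic. This is a rough sketch only: the replication factor is not stated in the case study, so the value of 3 used here is an assumption, as is the idea that writes are spread evenly across nodes.

```python
WRITES_PER_MINUTE = 5_000        # from the list above
NODES = 3                        # from the list above
ASSUMED_REPLICATION_FACTOR = 3   # assumption: not stated in the case study

def writes_per_second(writes_per_minute: float) -> float:
    """Convert a per-minute write rate to a per-second rate."""
    return writes_per_minute / 60

def replica_writes_per_node(writes_per_minute: float, nodes: int, rf: int) -> float:
    """Replica writes each node handles per second, assuming an even spread."""
    return writes_per_second(writes_per_minute) * rf / nodes

cluster_wps = writes_per_second(WRITES_PER_MINUTE)
per_node_wps = replica_writes_per_node(
    WRITES_PER_MINUTE, NODES, ASSUMED_REPLICATION_FACTOR
)

print(f"Cluster-wide: {cluster_wps:.1f} writes/s")
print(f"Per node (assumed RF={ASSUMED_REPLICATION_FACTOR}): {per_node_wps:.1f} replica writes/s")
```

With three nodes and an assumed replication factor of 3, every node stores a copy of every write, so the per-node rate equals the cluster-wide rate of roughly 83 writes per second.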