ScyllaDB and Apache Spark

ScyllaDB is a highly scalable, high-performance NoSQL database that can keep up with the streaming analytics demands of Apache Spark.

ScyllaDB is a fast, scalable NoSQL database, and Apache Spark is a fast, scalable data analytics framework. Many users who deploy one also deploy the other because the two technologies are complementary.

Hooking Up Spark and ScyllaDB

A four-part blog series provides a primer on using Spark and ScyllaDB together. All of the open source code is available on GitHub for you to try yourself.

Part One: Overview

Part Two: Data Transformations

Part Three: DataFrames

Part Four: Structured Streaming

Mastering the ScyllaDB Spark Migrator

The ScyllaDB Spark Migrator is our workhorse migration engine, built on Apache Spark. It can take data from multiple sources, including databases such as Apache Cassandra or DynamoDB and big data file formats such as Apache Parquet, and migrate it into ScyllaDB or ScyllaDB Cloud.
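To make the read-from-a-source, write-to-ScyllaDB flow described above more concrete, here is a minimal sketch of copying one table between two CQL clusters from PySpark. It is not the Migrator's own code. It assumes the open source Spark Cassandra connector is on the Spark classpath and that connection settings can be passed as per-DataFrame options; the function names, hosts, keyspace, and table are hypothetical.

```python
# Data source name registered by the Spark Cassandra connector.
CASSANDRA_FORMAT = "org.apache.spark.sql.cassandra"


def connector_options(keyspace, table):
    """Build the keyspace/table options the connector's DataFrame source expects."""
    return {"keyspace": keyspace, "table": table}


def copy_table(spark, source_host, target_host, keyspace, table):
    """Read a table from the source cluster and append its rows to the target.

    `spark` is an existing SparkSession. Hosts, keyspace, and table are
    placeholders; the target table is assumed to already exist with a
    matching schema.
    """
    # Read the whole source table as a DataFrame.
    source_df = (
        spark.read.format(CASSANDRA_FORMAT)
        .option("spark.cassandra.connection.host", source_host)
        .options(**connector_options(keyspace, table))
        .load()
    )
    # Append the rows to the same keyspace/table on the target cluster.
    (
        source_df.write.format(CASSANDRA_FORMAT)
        .option("spark.cassandra.connection.host", target_host)
        .options(**connector_options(keyspace, table))
        .mode("append")
        .save()
    )
```

A call might look like `copy_table(spark, "cassandra.example.internal", "scylla.example.internal", "shop", "orders")`. A real migration also has to deal with concerns this sketch ignores, such as preserving write timestamps and TTLs or resuming after a failure.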

ScyllaDB and Spark

Moving from Cassandra to ScyllaDB via Apache Spark: The ScyllaDB Migrator

Migrating from DynamoDB to ScyllaDB’s DynamoDB-compatible API

Migrate Parquet Files with the ScyllaDB Migrator

Migrating to ScyllaDB Cloud

Migration Methods

Spark, File Transfer, and More: Strategies for Migrating Data to and from a Cassandra or ScyllaDB Cluster

Deep Dive into the ScyllaDB Spark Migrator

Case Studies

GE Predix: Industrial-Strength IoT at Scale

Ola Cabs: Two Years of Using ScyllaDB in Production

Tubi: Scaling Up Machine Experimentation with ScyllaDB and Scala

Natura Achieves Beautiful Results with ScyllaDB (also available in Portuguese)

Augury Foresees a Bright Future with ScyllaDB

From SAP to ScyllaDB: Tracking the Fleet at GPS Insight

Additional Resources

Apache Spark at ScyllaDB Summit, Part 1: Best Practices

Apache Spark at ScyllaDB Summit, Part 2: Tips for Building Resilient Pipelines

Spark Powered by ScyllaDB: Your Questions Answered

Databricks Discusses Stateful Streaming Applications with Apache Spark

ScyllaDB University

Get started on the path to ScyllaDB expertise

ScyllaDB Cloud

It’s easy to get started with our NoSQL DBaaS