Disney+ Hotstar originally launched in 2015 as Hotstar, the streaming service of Star India, which was later acquired by The Walt Disney Company. In March 2020 the service was rebranded as Disney+ Hotstar and subscriptions soared. By November 2020 it had grown to 18.5 million paid subscriptions, making it the fastest growing segment of Disney+ global subscribers. The service also expanded into Indonesia in September 2020.
Disney+ Hotstar also offers an ad-supported content tier, which is growing even faster than the paid tier. Paid and free users together total more than 300 million. Disney+ Hotstar's coverage of last year's Indian Premier League cricket competition (IPL20) set a record for concurrent streaming viewership of any sporting event, with an audience of more than 25 million.
We were thrilled to have the engineering team of Disney+ Hotstar join us to present at our online Scylla Summit 2021 this January. The speakers included Vamsi Subash Achanta, the Architect behind their Scylla deployment, and Balakrishnan Kaliyamoorthy, their Senior Data Engineer.
Use Case: Continue Watching
Their “Continue Watching” feature uses Scylla to track every show for every user, remembering the timestamp where the user stopped watching. If the user has finished an episode in a series, it prompts them to watch the next one. It can even alert users when a new episode of a favorite show becomes available. The feature works cross-platform, so a user who began watching on a computer can continue on a mobile or tablet device.
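The resume-or-next-episode decision described above can be illustrated with a minimal sketch. This is not Disney+ Hotstar's implementation: the `WatchProgress` fields, the `next_action` function, and the 95% "finished" threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class WatchProgress:
    """One user's progress on one piece of content (hypothetical fields)."""
    user_id: str
    content_id: str
    position_seconds: int  # where playback last stopped
    duration_seconds: int  # total length of the episode


def next_action(
    progress: WatchProgress,
    next_episode_id: Optional[str],
    finished_ratio: float = 0.95,  # assumed cutoff for "done with the episode"
) -> Optional[Tuple[str, str, int]]:
    """Return (action, content_id, start_position), or None if nothing to suggest."""
    if progress.duration_seconds <= 0:
        # Unknown length: fall back to resuming at the stored position.
        return ("resume", progress.content_id, progress.position_seconds)
    watched = progress.position_seconds / progress.duration_seconds
    if watched < finished_ratio:
        return ("resume", progress.content_id, progress.position_seconds)
    if next_episode_id is not None:
        return ("play_next", next_episode_id, 0)
    return None
```

Because the stored progress is keyed by user rather than by device, the same lookup would serve any platform the user signs in from.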
Moving to Scylla
Prior to their adoption of Scylla, their infrastructure was built on a combination of Redis and Elasticsearch, connected to an event processor for Kafka streaming data. Their Redis cluster held 500 GB of data, and the Elasticsearch cluster held 20 TB. Their key-value data ranged from 5 KB to 10 KB per event.
This presented the team with two main problems. First, multiple data stores meant maintaining multiple data models. Second, the data was scaling rapidly, so costs were rising dramatically.
The first redesign decision was to adopt a new data model. For the user content table, the user ID acted as the partition key and the content ID as the clustering key (together forming the primary key), plus a timestamp and additional fields.
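As a rough illustration of that data model, the sketch below holds a possible CQL table definition as a Python string, alongside a tiny in-memory stand-in that mimics how a partition key and a clustering key shape reads and writes. The table name, column names, and types are assumptions; the talk did not give the actual schema.

```python
from collections import defaultdict

# Hypothetical schema: names and types are illustrative, not from the talk.
# A keyspace must be selected (USE <keyspace>) before running this in cqlsh.
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS user_content (
    user_id text,
    content_id text,
    updated_at timestamp,
    position_seconds int,
    PRIMARY KEY ((user_id), content_id)
);
"""


class UserContentTable:
    """In-memory stand-in: one partition per user, rows ordered by content ID."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def upsert(self, user_id, content_id, updated_at, position_seconds):
        # Writing the same (user_id, content_id) again overwrites the row,
        # which matches how a CQL INSERT behaves for an existing primary key.
        self._partitions[user_id][content_id] = (updated_at, position_seconds)

    def rows_for_user(self, user_id):
        # All rows for a user live in one partition, sorted by clustering key.
        partition = self._partitions.get(user_id, {})
        return [(cid, *partition[cid]) for cid in sorted(partition)]
```

With this layout, fetching everything a user is currently watching touches a single partition, which is the access pattern a "Continue Watching" row needs.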
The team considered a number of alternatives, from Apache Cassandra and Apache HBase to Amazon DynamoDB to Scylla. Why did they choose Scylla? Two important reasons: first and foremost, consistently low latencies for both reads and writes, which would ensure a snappy user experience. Secondly, Scylla Cloud, our fully managed database as a service (NoSQL DBaaS), offered a much lower cost than the other options they considered.
Performance monitoring of Scylla showed sub-millisecond p99 latencies, with average read and write latencies in the range of 150 to 200 microseconds (µs).
Balakrishnan (“Bala”) gave an overview of their migration process. They began by saving a Redis snapshot as an RDB file, which was then converted to comma-separated values (CSV) and loaded into Scylla using cqlsh. Bala cautioned teams to find the maximum useful concurrency for their writes, so that the load does not end in write timeouts.
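The conversion step might look something like the sketch below. It assumes the key-value pairs have already been extracted from the RDB file by some other tool, that keys have the form `user_id:content_id`, and that values are JSON documents with `updated_at` and `position_seconds` fields. All of these are hypothetical; the talk did not describe the actual key or value layout.

```python
import csv
import io
import json

# Hypothetical column order; it must match the column list given to cqlsh COPY.
CSV_COLUMNS = ["user_id", "content_id", "updated_at", "position_seconds"]


def records_to_csv(records, out):
    """Write (key, json_value) pairs as CSV rows and return the row count.

    Assumes keys look like "user_id:content_id" (an illustrative format).
    """
    writer = csv.writer(out)
    writer.writerow(CSV_COLUMNS)
    count = 0
    for key, raw_value in records:
        user_id, content_id = key.split(":", 1)
        value = json.loads(raw_value)
        # The timestamp is passed through unchanged; it would need to be in
        # a format that cqlsh accepts for a timestamp column.
        writer.writerow(
            [user_id, content_id, value["updated_at"], value["position_seconds"]]
        )
        count += 1
    return count


# Example with an in-memory buffer standing in for a file on disk.
buffer = io.StringIO()
sample = [("u1:c1", '{"updated_at": "2021-01-20 10:00:00", "position_seconds": 42}')]
rows_written = records_to_csv(sample, buffer)
```

A file produced this way could then be loaded with a cqlsh `COPY ... FROM ... WITH HEADER = true` statement. Any throttling of that load would need tuning against the target cluster, in line with Bala's warning about write timeouts.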
A similar process was applied to the Elasticsearch migration.
Once Scylla Cloud had been loaded with the historical data from both Redis and Elasticsearch, it was kept in sync through two changes: the processor application was modified so that all new writes also went to Scylla, and the API server was upgraded so that all reads could be served from Scylla as well.
At that point, reads and writes to the legacy Redis and Elasticsearch systems could be switched off, leaving Scylla to handle all ongoing traffic. The migration required no downtime.
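The dual-write and cutover sequence can be sketched as follows. Plain dictionaries stand in for the legacy stores and for Scylla, and the class and method names are hypothetical; this shows the general pattern rather than the team's actual processor or API server code.

```python
class ProgressService:
    """Toy model of the dual-write migration pattern using in-memory dicts."""

    def __init__(self):
        self.legacy = {}   # stands in for Redis / Elasticsearch
        self.scylla = {}   # stands in for Scylla Cloud
        self.write_legacy = True
        self.read_from_scylla = False

    def handle_event(self, user_id, content_id, position_seconds):
        # During migration every new write goes to both systems.
        key = (user_id, content_id)
        if self.write_legacy:
            self.legacy[key] = position_seconds
        self.scylla[key] = position_seconds

    def read(self, user_id, content_id):
        store = self.scylla if self.read_from_scylla else self.legacy
        return store.get((user_id, content_id))

    def switch_reads_to_scylla(self):
        # Step 1 of cutover: serve reads from the new store.
        self.read_from_scylla = True

    def stop_legacy_writes(self):
        # Step 2 of cutover: the legacy stores no longer receive traffic.
        self.write_legacy = False
```

Because both stores receive every write while reads are still served from the legacy side, the read path can be switched at any moment without a gap in service.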
The Disney+ Hotstar team had also done some work with Scylla Open Source, and needed to move that data into their managed Scylla Cloud environment as well. There were two different processes they could use: SSTableloader or the Scylla Spark Migrator.
SSTableloader takes a nodetool snapshot of each server in a cluster and then uploads the snapshots to the new database in Scylla Cloud. This can be run in batches or all at once. Bala noted that this migration process slowed down noticeably when the table had a secondary (composite) key.
To avoid this slowdown the team implemented the Scylla Spark Migrator instead.
In this process, the data was first backed up to S3 storage and then restored onto a single-node Scylla Open Source instance, a process known as unirestore. From there it was pumped into Scylla Cloud using the Scylla Spark Migrator. Bala found our blog post, Deep Dive into the Scylla Spark Migrator, particularly helpful in setting up his migration.
Advantages of Scylla Cloud
Disney+ Hotstar is now running on Scylla Cloud. Beyond the improved performance, predictably low latencies, and better TCO, the team is also relieved of administrative tasks like backups, upgrades and repairs, and can focus on scaling the business.
If you’d like to learn more about the advantages of Scylla Cloud for your own organization, please feel free to contact us.