
ScyllaDB Summit Preview: ScyllaDB: Protecting Children Through Technology

In the run-up to ScyllaDB Summit 2018, we’ll be featuring our speakers and providing sneak peeks at their presentations. This interview in our ongoing series is with Jose Garcia-Fernandez, Senior Vice President of Technology at Child Rescue Coalition, a 501(c)(3) non-profit organization. His session is entitled ScyllaDB: Protecting Children through Technology.

Jose, it is our privilege to have you at ScyllaDB Summit this year. Before we get into the work of the Child Rescue Coalition, I’d like to know how you got on this path. What is your background, and how did you end up working on this project?

I have been developing software solutions for many years. I hold a Master’s in Computer Science, and I work on software solutions in the areas of Big Data, Computer Networks and Cyber Security. I am responsible for the development, operation, and enhancement of the tools that make up CRC’s “Child Protection System” (CPS). CPS is the main tool that thousands of investigators, in all U.S. states and in more than 90 countries, use on a daily basis to track, catch, and prosecute online pedophiles who use the Internet to harm children.

I started on this path while developing TLOXp, an investigative tool that fuses billions of records about people and companies. About 10 years ago, a group of investigators showed us how pedophiles were using the Internet to harm children. I was shocked to learn that the same power we use on a daily basis to share information, connect with other people on social media, and enjoy all the other benefits of the Internet was also being used by pedophiles to communicate, share illegal material, mentor each other, and, worst of all, contact new victims in a way that was not possible before. We worked together, and as a result of that work we developed a set of tools for law enforcement, which investigators have been using successfully over the last several years to catch online child predators. More than a thousand kids have been rescued, and 10,000 pedophiles have been prosecuted, as a direct consequence of the use of the tools we created and maintain at CRC, along with the extraordinary work of committed law enforcement investigators. In 2014, Child Rescue Coalition was created from that platform and the people involved in the project, in order to further grow the platform and expand its reach to other countries.

For those not familiar with your work, can you describe the challenge and the goals of the Child Rescue Coalition, and the technology you are using to address the problem?

Child Rescue Coalition’s mission is to protect children by developing state-of-the-art online technology. We deal with more than 17 billion records. We combine them into target reports, rank them using several algorithms developed with law enforcement organizations, and provide the ranked targets, free of charge and through a web-based application, to law enforcement agents in their respective jurisdictions. Our technology comes as different tools for different purposes; we work with many open-source products as well as proprietary technology to handle big distributed systems.

In human terms, what is the scale of this issue?

People may not know how bad the problem is. We have programs called “bots” or “crawlers” on the Internet. These programs identify illegal activity and send events reporting it to our servers. We deal with more than 50 million leads per day. Every year, we identify more than 5 million computers generating those leads. In other words, there are millions of pedophiles looking for ways to victimize children.

Let me quote you this statement: “This tool stores several billions of records in ScyllaDB, and it is expected to grow in the tens of billions of records in the near future.” That’s a shocking thing to imagine; just the raw quantity of data. What are the main considerations you face in managing it?

Our main consideration, and the reason why we selected ScyllaDB, was its efficient design, optimized for modern hardware. This means we can implement our solution with only 5 servers, whereas it would have required at least 20 servers using other, JVM-based technologies. Having fewer servers means lower hardware costs but, more importantly, less time spent dealing with server failures and more time for developing new projects. It also means we can scale horizontally as needed, with almost no impact.
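As a rough back-of-envelope illustration of the numbers quoted in this interview, the sketch below turns 50 million leads per day into an average event rate and divides it across a 5-node and a 20-node cluster. It assumes, purely for illustration, that every lead becomes one write, that traffic is spread evenly over the day and over the nodes, and that replication is ignored; none of these assumptions comes from CRC, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(events_per_day: int) -> float:
    """Average events per second, assuming uniform arrival over the day."""
    return events_per_day / SECONDS_PER_DAY


def per_node_rate(events_per_day: int, nodes: int) -> float:
    """Average events per second per node, assuming an even spread
    across nodes and ignoring replication (illustrative assumption)."""
    return average_rate_per_second(events_per_day) / nodes


if __name__ == "__main__":
    leads_per_day = 50_000_000  # "more than 50 million leads per day"
    for nodes in (5, 20):       # server counts mentioned in the interview
        rate = per_node_rate(leads_per_day, nodes)
        print(f"{nodes} nodes: about {rate:.1f} events/s per node")
```

Under these simplified assumptions, each node in the smaller cluster carries four times the average load of a node in the larger one, which is the kind of per-server efficiency the answer above describes.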

Besides ScyllaDB, what other technologies are critical to your mission’s success?

In order to grow, we have been working towards making our platform more flexible, efficient and scalable. Recently, we have implemented Kubernetes to containerize our tools and expand into the cloud. We have implemented Kafka and Apache NiFi to expand our data flow to new sources and processing with minimum impact, and we are standardizing on ScyllaDB as the NoSQL storage for all the new tools.

Child Rescue Coalition is constantly sharing information about potential and actual crimes involving minors. Data privacy, retention and governance must be paramount. What special considerations do you have in that regard?

Our systems have been challenged in court over the years, and time after time we have proven, even with independent third-party validators, that our tools do not invade privacy or use any nefarious or invasive ways to obtain the information we gather, and that the processes law enforcement used with our tools were appropriate.

How can readers help if they want to support the Child Rescue Coalition?

We are a 501(c)(3) non-profit organization, meaning that donations or funding we receive may be tax-deductible. At the individual level, the easiest way to help is to become a coalition club member or to contribute through this page: https://childrescuecoalition.org/donate/

Corporations can also become corporate sponsors, and/or fund specific projects or activities, or donate hardware, software, or services.

Funds are used primarily to raise awareness and to train and certify new investigators in the use of our tools, at no cost, in underserved communities. Funds are also used for the development and maintenance of new software as new online threats emerge. You can also follow us or find out more about our organization at these links:

Thank you for all you do. Your session at the ScyllaDB Summit will certainly be riveting.

About Peter Corless

Peter Corless is the Director of Technical Advocacy at ScyllaDB. He listens to users’ stories and discovers wisdom in each to share with other practitioners in the industry at large, whether through blogs or in technical presentations. He occasionally hosts live industry events from webinars to online conferences.