Tom Creighton

Chief Technology Officer, FamilySearch

Supernova Award Category

Next Generation Customer Experience

The Problem

The Family Tree application is extremely popular, generating significant demand from more than 500,000 customers. Over the last year, our prior database technology strained to meet their expectations. As the application grew in popularity, we realized we had vertically scaled that database as far as was cost-effective and could not get beyond 60 million transactions per hour for the Family Tree application, which created a technological headwind for future growth. Customer response had been tremendous, but to deliver a great customer experience we needed a more scalable database that could prevent downtime and allow us to deliver even more features for our users.

The Solution

We anticipated 10-100X more usage of the site over the next three years and wanted to position ourselves to handle this rapid growth. We conducted in-depth, head-to-head comparisons between several relational and NoSQL databases, including open source Cassandra and DataStax Enterprise. To support the demands of our growing customer base, we selected DataStax Enterprise for its scalability and performance. In addition, the masterless architecture of DataStax Enterprise provides 100 percent availability, with no downtime during traffic surges or cluster maintenance. DataStax Enterprise is the distributed, responsive and intelligent database foundation used to build and run the Family Tree application.

The Results

FamilySearch experiences its highest traffic every Sunday, and prior to our database migration we approached our capacity limits every week. Fortunately, we made the switch just in time. The transition to production required less than two hours offline. We were actually ready within minutes, but we ran extensive tests before opening up to the public. Within two weeks of going live, traffic reached a level that would have hit the capacity limit of our previous system, yet we were able to seamlessly continue delivering the customer experience our users demand. We now routinely serve 125 million transactions per hour during peak usage with plenty of room for future growth. Customers are experiencing faster response times, high availability and no database downtime. In addition, we were able to bring new capabilities to market. New applications like Record Hints, which helps users make new research discoveries, were not possible with our previous infrastructure.

Metrics

60 million transactions per hour used to be the absolute limit of the system. Now, we have scaled to 125 million transactions per hour during peak usage, with plenty of room for future growth.

We sometimes had outages before. Now, we have zero downtime and customers experience faster response times and high availability.

We can introduce features more rapidly and implement marketing campaigns that might have otherwise stressed the system.

Because we were able to implement certain features more efficiently in DataStax Cassandra, we have a much better user experience. For example, we can give a much more detailed history of changes to ancestor records in the database than before. This is crucial to maintaining quality data. In addition, we are able to do this orders of magnitude faster than in the past. Several other features of our application have been similarly improved because of the flexibility and speed of the database on Cassandra.

The Technology

DataStax Enterprise, the always-on, distributed cloud database built on Apache Cassandra™.

Disruptive Factor

We knew that, at some point, we’d have billions of records on FamilySearch – and we were right. Today, the primary canonical dataset that we work with is around 1.2 billion records. We also expected a very high read transaction rate and a reasonably high write transaction rate, and we needed a way to cope with both.

This read/write ratio was important to establish because, while many people visit FamilySearch simply to search for records, others build their own family trees on the site and contribute notes about their own findings. In addition, a large crowdsourcing effort has volunteers correct and clarify some of the records on offer there. This work is all carried out through the Family Tree application.

Resilience and scalability have also opened the doors to new features and functions – like Record Hints, which helps users make new research discoveries. This would not have been possible with the previous infrastructure.

In addition to the great improvements in our Family Tree application, we are seeing similar gains in our other, related applications. For example, management of several billion source records (such as census records) will soon also be handled in DataStax, using its Search functionality. These changes will enable much better metadata searches on record sets than we have had in the past.

Shining Moment

I discovered that, way back in my family history, certain ancestors were cattle thieves on the English/Scottish border! We have also had many reports from our patrons about the deep emotional connection they experience as they discover their roots and their family history. Of course, as in the case of the cattle thieves, one must be careful when shaking the family tree: one never knows what nuts may fall out.
