Supernova Award Category
The Problem
Advanced, predictive analysis has been MarketShare’s differentiator since its founding in 2005. But what started as a Small Data analysis challenge in the company’s early days evolved into a Big Data challenge by 2012. That’s when the firm shifted from using gigabyte- scale aggregated data to terabyte-scale raw data. The goal of using more detailed data was to measure campaign effectiveness down to individual impressions by channel.
With the move to Big Data, the company switched from using MySQL to Amazon’s Hadoop-based Elastic MapReduce (EMR) service. The company faced big problems using Hadoop as a platform.
- Hadoop-based data-processing workloads could not be completed within 72 hours due to high job-failure rates.
- Three to four employees were preoccupied with administering Hadoop services and troubleshooting failed jobs.
- The company lacked self-service reporting capabilities.
- Advanced analytic modeling processes were limited to slow, single-server processing.
The Solution
MarketShare started exploring alternative Hadoop-as-a-service options and was drawn to Altiscale’s fully managed service because the vendor handles day-to-day platform operations (whereas others leave administration to the customer). In 2013 MarketShare moved all Big Data analysis and modeling work to Altiscale’s Hadoop-as-a-service platform.
To strengthen the Altiscale deployment, MarketShare added the following partner services:
- Alation uses machine learning to automatically capture information about data, its origin, who’s using it and how they are using it. Describing Alation as “a Google search for data,” Ramachandran says data modelers use it to find relevant data and salespeople use it to explain data to customers without assistance from IT.
- Arcadia Data provides self-service reporting.
- H2O distributed analytic services supports automated predictive modeling.
- Altiscale Apache Spark services support near-real-time services and programmatic buying.
The results
The switch to Altiscale not only enabled MarketShare to complete Hadoop data processing in 1/4 - 1/5 the time previously required, the company estimates that the savings in processing time reduced Hadoop platform costs by 30%-35%.
The implementation helped MarketShare deliver faster, better results with improved predictive accuracy that’s driving improved performance and higher customer satisfaction. With the addition of third-party tools MarketShare gained the following benefits:
● Process Hadoop jobs faster. Hadoop data-processing jobs are now completed within 48 hours, helping to reduce total time to new models to one work week or less.
● Speed up analysis. Alation enables data scientists can quickly find relevant data on Hadoop for modeling work and salespeople can explain data-analysis results to customers without assistance from data scientists.
● Democratize data. Arcadia Data enables non-data-scientists to develop reports against data in Hadoop without assistance from IT.
● Develop models faster. H2O’s distributed predictive analytics platform has helped MarketShare reduce model-development time from 1-2 weeks to 1-2 days, reducing total time to new models to one work week or less.
● Next-generation capabilities. Alation’s Apache Spark service has MarketShare poised to deliver next-generation capabilities to meet rising customer expectations for low-latency insight & support analytics-guided programmatic buying.
Metrics
- Completing Hadoop jobs in 1/4-1/5 the time and at 65%-70% of the cost of the old platform. Hadoop data-processing jobs are now completed within 48 hours, helping to reduce total time to new models to one work week or less.
- Ad-hoc data exploration and self-service reporting are now supported.
- More accurate predictive models are delivered within 1-2 instead of the 1-2 weeks as previously required.
- Before the implementation three to four MarketShare IT employees were preoccupied administering Hadoop services and tracking down failed jobs. With Altiscale managing configuration, administration and monitoring, those three to four employees focus on the modeling work and handle a much larger volume of clients.
The Technology
- Altiscale (Hadoop and Spark services)
- Alation (search & discovery)
- Arcadia Data (BI for Hadoop)
- H2O (distributed analytics)
Disruptive Factor
Democratization of big data at MarketShare.
Recognizing the implementation grows stronger when more people are able to access the data, Arcadia Data enables non-data-scientists to develop reports against data in Hadoop without assistance from IT, improving responsiveness, depth, and breadth of insight and customer satisfaction.
On our implementation, Data scientists can quickly find relevant data on Hadoop for modeling work while salespeople can explain data-analysis results to MarketShare customers without assistance from data scientists.
Shining Moment
Prioritizing customers throughout the implementation led to a successful deployment.
The purpose of the Altiscale implementation was to deliver better predictive capabilities more quickly and conveniently to MarketShare customers. MarketShare never wavered from this focus on helping customers to improve marketing click-through and conversions rates. This focus helped keep MarketShare's technical priorities aligned with customer benefits throughout the implementation.
