We attended the Teradata Influence Summit last week in Del Mar, north of San Diego (Progress Report here), and hot on the heels of this event, Teradata announced support for Presto.
 

Let’s dissect the press release (find it here) in our customary style:

SAN DIEGO – June 8, 2015 – To make it easier for more users to extract insights from data lakes, Teradata (NYSE: TDC), the big data analytics and marketing applications company, today announced a multi-year commitment to contribute to Presto’s open source development and provide the industry’s first commercial support. Based on a three-phase roadmap, Teradata’s contributions will be 100 percent open source under the Apache® license and will advance Presto’s modern code base, proven scalability, iterative querying, and the ability to query multiple data repositories.

MyPOV – An interesting and good move by Teradata, partnering with a promising open source initiative with good DNA (originally from Facebook) and proven practical usage (e.g. Airbnb, Facebook and more). As part of the Teradata UDA, the vendor used to execute its federated queries on the Hadoop side on Hive; with Presto it gets more generic and more powerful support for these queries. It is also good to see that Teradata will be a ‘good citizen’ for open source, becoming a (substantial) contributor to Presto. And it is good to see that this is a multi-year commitment, with a roadmap (we saw it under NDA) that is rich but realistic.

Developed and used by Facebook, Presto is a powerful, next-generation, open source SQL query engine which supports big data analytics. There is a growing interest in Presto, as these corporations have adopted it: Airbnb, DropBox, Gree, Groupon, and Netflix.

MyPOV – And here is the value: Presto is SQL based, and SQL is a commonly known language for millions of business users and analysts. Bringing the ‘SQL back to no-SQL’ is a giant quest, and Presto is one of the more successful initiatives on that strategy path. It also has great user DNA.

Presto complements the Teradata® QueryGrid™ and fits within the Teradata® Unified Data Architecture™ vision. Presto integrates with the Teradata® Unified Data Architecture™ by providing users the ability to originate queries directly from their Hadoop platform, while Teradata QueryGrid allows queries to be initiated from the Teradata Database and the Teradata Aster Database all through a common SQL protocol.

MyPOV – Teradata lays out the strategy here, which is good for transparency. QueryGrid will be used on the Teradata and Aster side, handing off queries to Presto as needed, which is no surprise. But Teradata will also make its Presto offering open for direct Hadoop queries, a good move.

Presto is agnostic and runs on multiple Hadoop distributions. In addition, Presto can reach out from a Hadoop platform to query Cassandra, relational databases, or proprietary data stores. This flexibility allows Presto to combine data from multiple sources, allowing for analytics across the entire organization through a single query. This cross-platform analytic capability allows Presto users to extract the maximum business value from data lakes of any size, from gigabytes to petabytes.

MyPOV – The paragraph describes the value that Presto brings pretty well: from Presto, a user can query pretty much anything. So Presto gives Teradata flexible data access again, not from the Teradata level, but from the Presto / open source level. This is a very new approach for Teradata, and a good sign, as it shows that Teradata is moving with the times, which clearly point toward the rise of open source.
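
To make the ‘single query’ idea concrete, here is a minimal sketch in Python. Presto addresses tables as catalog.schema.table, where each catalog maps to a configured connector, so one SQL statement can join data from two different stores. The catalog, schema, table and column names below are made up for illustration, and the helper only builds the SQL text; it does not connect to a Presto cluster.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Table:
    """A table addressed the way Presto does: catalog.schema.table."""
    catalog: str  # name of a configured connector instance (hypothetical here)
    schema: str
    name: str

    def qualified(self) -> str:
        # Fully qualified name that Presto resolves to a specific data source.
        return f"{self.catalog}.{self.schema}.{self.name}"


def federated_join(left: Table, right: Table, key: str, columns: Sequence[str]) -> str:
    """Build one SQL statement joining tables that live in two catalogs."""
    select_list = ", ".join(columns)
    return (
        f"SELECT {select_list}\n"
        f"FROM {left.qualified()} AS l\n"
        f"JOIN {right.qualified()} AS r ON l.{key} = r.{key}"
    )


# Hypothetical catalogs, schemas and tables, for illustration only.
clicks = Table("hive", "web", "clicks")
users = Table("cassandra", "crm", "users")
query = federated_join(clicks, users, "user_id", ["r.country", "l.url"])
print(query)
```

The point of the sketch is that the join spans a Hadoop-side table and a Cassandra-side table without any data being copied first; whether a given source is reachable depends on which connectors are actually configured.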

Teradata’s three-phase contribution to 100-percent open source code will advance Presto’s enterprise capabilities, which benefit customers.

Phase 1 - Enhance essential features that simplify the adoption of Presto, including installation, support documentation, and basic monitoring. The Phase 1 capabilities are available today for download at Teradata.com/Presto or on Github


MyPOV – Kudos for laying out a roadmap, always something appreciated by the ecosystem. Apparently the installation of Presto was not trivial, so Teradata logically focused on that as Phase 1.

Phase 2 - Integrate Presto with other key parts of the big data ecosystem, such as standard Hadoop distribution management tools, interoperability with YARN, and connectors that extend Presto’s capabilities beyond the Hadoop distributed file system (HDFS). These features will be available at the end of 2015.

MyPOV – This will be the key release for the Teradata / Presto offering.

Phase 3 - Enable ODBC (Open Database Connectivity) and JDBC (Java Database Connectivity API) to expand adoption within organizations and enhance integration with business intelligence tools. Enhance security by providing access based on job roles. These enhancements will be completed and available in 2016.

MyPOV – And this will make the Teradata / Presto release very, very attractive to SQL-savvy business users (and developers).

In addition to its open source contributions, Teradata commercial support is now available from Think Big consulting. Think Big will offer its proven expertise in three areas to enable users to feel confident about putting Presto into production with assistance:

Presto Jumpstart – In the cloud or onsite, Think Big will assist with piloting new functionality
Presto Development – In the cloud or onsite, Think Big consultants will help customers design, build, and deploy a Presto solution
Think Big Academy - Two-day workshops will help customers understand the best uses and criteria for architectural decisions.


MyPOV – No surprise – Teradata will offer services here, and Think Big is the place where Teradata offers these. A good move.

Overall MyPOV

Open source is on the rise. In the last 12 months we have seen more and more open source uptake from Oracle, IBM and even outspoken past open source skeptics like Microsoft and SAP. Major ‘gifts’ have been made to open source – think of Pivotal’s recent move (see here). This all means that even skeptical enterprises have no choice but to implement, run and operate open source. The good news is that vendors see their opportunities on the services side, which will have to be paid for, but overall it looks (for now) as if open source is a significant relief to IT budgets. The less shared secret is that it saves time and reduces R&D budgets and efforts at ISVs, too.

Closer to Teradata – a very smart move. Take a promising open source offering, free from competitor influence, and own the place. Contribute generously and lavishly to the roadmap, be a good open source citizen, and own it even more – all good moves.
 
On the strategic side Presto is a huge hedge for Teradata – in the worst case scenario (Teradata’s ‘classic’ business slowly winding down), this is the first step of re-inventing Teradata on Hadoop. We will see if it comes to that, but a hedge is a hedge, even if not needed. The cross-database capabilities of Presto are very attractive.
 
Teradata has made a good first move here; it will be interesting to see how the competition responds (find other open source initiatives – or even join Presto?). We will be watching.