About this Constellation ShortList™

On-premises and cloud-based data lakes are a destination for data in all its forms, including structured transactional records, semistructured/variably structured data types (e.g., sensor data, log files, clickstreams, email, images, social streams, text documents) and unstructured content (e.g., images, audio and video files). Hybrid-cloud and multicloud data lake management capabilities enable organizations to bring order and accessibility to distributed data lake environments, filling gaps in platform- and cloud-specific metadata management, cataloging, and governance capabilities.

This category covers data management, data lineage, metadata management and governance. It is important for any organization that wants to take advantage of high-scale data. Management prowess is particularly important as data sources and data-driven applications multiply, as data scale and diversity grow, and as the number and diversity of data lake deployment choices multiply. All of these factors add complexity and raise the risk of security and access-control lapses, failure to meet compliance requirements, and lost or unusable data.

With this update, the name of this ShortList was expanded from “Data Lake Management” to “Hybrid-Cloud and Multicloud Data Lake Management” to reflect the growing distribution of high-scale data across cloud and on-premises deployments. Okera was added to this ShortList on the strength of its progress in addressing hybrid-cloud and multicloud data lake deployments. Unifi Software, which was acquired by Boomi in Q1 2020, has been removed from this list. Unifi had shifted its focus to data cataloging and data prep even before the acquisition, and Boomi is following the same direction as it integrates and invests in Unifi’s technology.

Threshold Criteria

Constellation considers the following criteria for these solutions: 

  • Native ability to connect to myriad data sources and ingest diverse data types, including structured, semistructured and unstructured data sources, and file formats native to cloud object stores, relational platforms, NoSQL databases, Spark and Hadoop
  • Works with cloud-based object stores and data lake services as well as Apache Spark and Hadoop distributions and cloud services; includes compatibility with platform-native security and access controls and governance modules, and supports management of data lakes deployed on-premises or on multiple public clouds
  • Metadata management, including the ability to capture, catalog and apply metadata classifications by source, asset type and business language, giving organizations better insight into data lake content
  • Governance capabilities ensuring that organizations can meet strict policies and compliance requirements, supplementing platform- and cloud-service-native security and access controls and data-lineage tracking

The Constellation ShortList™

Constellation evaluates more than a dozen solutions in this market category. The Constellation ShortList is determined by client inquiries, partner conversations, customer references, vendor selection projects, market share and internal research.

  • IBM
  • Informatica
  • Okera
  • Qlik
  • Zaloni

Frequency of Evaluation

Each Constellation ShortList will be updated at least once per year. There could be an update after six months, should the analyst deem it necessary.

Evaluation Services

Constellation clients can work with the analyst and research team to conduct a more thorough discussion of this ShortList. Constellation can also provide guidance in vendor selection and contract negotiation.
