Aggregating data from multiple OLTP systems, an operational data lake enables companies to:
- offload real-time analytics from OLTP databases and data warehouses
- replace obsolete Operational Data Stores
- streamline ETL pipelines on Apache Hadoop®
Learn about the advantages of a Hadoop Operational Data Lake
An operational data lake (ODL) is a modern replacement for older Operational Data Stores (ODSs). An ODL provides two key services:
An operational data lake also has the following benefits over an ODS:
Many companies find that replacing an ODS with an operational data lake represents an excellent first project in Big Data.
Many companies have turned to Hadoop to reduce the costs of their ETL processing pipeline. However, ETL processing often encounters data quality issues that require changing or updating data. Because most SQL-on-Hadoop systems, such as Apache Hive™ or Impala, do not support full Create-Read-Update-Delete (CRUD) operations, any change requires reloading all of the data, which can take hours and extend past the ETL update window.
With the ability to perform transactional updates, Splice Machine can resolve ETL data quality issues in seconds to minutes instead of hours. This minimizes downtime due to ETL errors, allowing users and applications to access data quickly.
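To illustrate the difference between an in-place transactional fix and a full reload, here is a minimal sketch. It uses Python's built-in `sqlite3` module purely as a stand-in so the example runs anywhere; it is not Splice Machine's API. The `orders` table, its columns, and the `fix_country_codes` helper are hypothetical names chosen for illustration.

```python
import sqlite3

# Stand-in database: sqlite3 is used only so this sketch is runnable.
# The table and its contents are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "US", 10.0), (2, "usa", 25.5), (3, "DE", 7.25), (4, "U.S.", 3.0)],
)
conn.commit()


def fix_country_codes(conn):
    """Correct inconsistent country values in place, as one transaction."""
    # The connection context manager commits on success and rolls back
    # if an exception is raised, so readers never see a partial fix.
    with conn:
        cur = conn.execute(
            "UPDATE orders SET country = 'US' WHERE country IN ('usa', 'U.S.')"
        )
        return cur.rowcount  # number of rows corrected


fixed = fix_country_codes(conn)
countries = [row[0] for row in conn.execute("SELECT country FROM orders ORDER BY id")]
print(fixed, countries)
```

Only the rows with bad values are touched, and the rest of the table stays available throughout. On a system without update support, the same correction would mean rewriting or reloading the whole data set.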
For an existing Hadoop-based data lake, Splice Machine becomes a powerful and flexible repository for structured data:
Read more about the Operational Data Lake with this white paper.