Operational Data Lake

Offload Real-Time Analytics from OLTP and Data Warehouse DBs

Aggregating data from multiple OLTP systems, an operational data lake enables companies to:
- offload real-time analytics from OLTP and data warehouse DBs
- replace obsolete Operational Data Stores
- streamline ETL pipelines on Apache Hadoop®

Replace Obsolete Operational Data Stores (ODSs)

An operational data lake (ODL) is a modern replacement for older Operational Data Stores (ODSs). An ODL provides two key services:

  • Offloading real-time reporting and analytics from more expensive OLTP and data warehouse systems
  • Performing aggregation and transformation for ETL processes

An operational data lake also has the following benefits over an ODS:

  • Based on modern scale-out technology. Compared to older RDBMSs like Oracle, operational data lakes can be 5-10x faster at 75% lower cost.
  • Handles semi-structured and unstructured data. When the operational data lake is part of a larger Hadoop-based data lake, you can analyze structured, semi-structured, and unstructured data together.

Many companies find that replacing an ODS with an operational data lake represents an excellent first project in Big Data.

Streamline & Harden the ETL Processing Pipeline on Hadoop

Many companies have turned to Hadoop to reduce the costs of their ETL processing pipeline. However, ETL processing often encounters data quality issues that require changing or updating data. Because most SQL-on-Hadoop systems, such as Apache Hive™ or Impala, do not support full Create-Read-Update-Delete (CRUD) operations, any change requires reloading all data, which can take hours and extend past the ETL update window.

With the ability to perform transactional updates, Splice Machine can resolve ETL data quality issues in seconds to minutes instead of hours. This minimizes downtime due to ETL errors, allowing users and applications to access data quickly.
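The difference between a full reload and a transactional update can be illustrated with a minimal sketch. It uses Python's built-in SQLite module purely as a stand-in for any SQL database that supports transactional updates; it is not Splice Machine's API, and the `orders` table, its columns, and the `fix_bad_rows` helper are hypothetical names chosen for illustration. The point is that a data quality fix touches only the affected rows, and either commits as a whole or rolls back.

```python
import sqlite3


def fix_bad_rows(conn, bad_value, good_value):
    """Correct a data quality issue in place, inside one transaction.

    Hypothetical helper: only rows matching bad_value are rewritten,
    so no reload of the full table is needed.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE orders SET country = ? WHERE country = ?",
            (good_value, bad_value),
        )
    return cur.rowcount


# Stand-in database and sample data (assumptions for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, amount REAL)"
)
with conn:
    conn.executemany(
        "INSERT INTO orders (id, country, amount) VALUES (?, ?, ?)",
        [(1, "US", 10.0), (2, "U.S.", 25.5), (3, "DE", 7.25), (4, "U.S.", 3.0)],
    )

# Normalize the inconsistent country code without reloading the table.
fixed = fix_bad_rows(conn, "U.S.", "US")
print(f"rows corrected: {fixed}")
```

In a system without update support, the same correction would mean regenerating and reloading the whole data set, which is the step that can push an ETL job past its update window.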

Complementing an Existing Hadoop-Based Data Lake

For an existing Hadoop-based data lake, Splice Machine becomes a powerful and flexible repository for structured data:

  • Store structured data directly in the same relational framework as its native representation, with no unnecessary transformations or extractions to flat files.
  • Enable ad-hoc analytics on both structured and unstructured data through Hadoop tools such as MapReduce or Hive.

Read more about the Operational Data Lake with this white paper.

 
