You Need a Better Lambda Architecture
A Lambda Architecture is a hybrid, scale-out data platform that can process operational (OLTP) and analytical (OLAP) workloads concurrently. It is usually assembled from components such as HDFS, Hive, Spark, Kafka, HBase, Cassandra, Impala, and/or Druid. Though effective, these systems are complex to integrate and maintain, and they leave much of the data synchronization and consistency work to the application programmer.
As we like to say, “It takes a lot of duct tape to keep a Lambda Architecture going.”
Why Choose Splice Machine for Your Lambda Architecture
Splice Machine offers a simpler alternative to the complexity of Lambda Architectures. We call it “Lambda-in-a-Box.” With a scale-out RDBMS, you get all the benefits of Lambda with a much simpler architecture, plus full SQL support for both transactional (OLTP) and analytical (OLAP) applications.
For example, here is how a machine learning application can use Lambda-in-a-Box:
Batch File Ingestion – Raw data files are imported in parallel directly into sharded tables, and indexes are updated atomically with the data for fast access
Real-time Stream Ingestion – Stored procedures continuously ingest streams using standard SQL, with automatic sharding
Data Cleansing – Use standard SQL, with constraints and triggers, to clean up small subsets of data as well as entire data sets efficiently, without big batch runs or file explosions
Feature Engineering and Extensive ETL – Execute complex aggregations, joins, sorts, and groupings with efficient SQL that is automatically parallelized and optimized without writing code at the application level
Model Training – Stored procedures execute analytics directly on the data, for example by using built-in functions such as ResultSetToRDD, which treat SQL results as Spark RDDs, or by running R and Python libraries directly on database result sets
Application Logic – ACID semantics enable the architecture to power concurrent CRUD applications without additional moving parts
Model Execution – Stored procedures and user-defined functions wrap models
Reporting and Data Visualization – Use Tableau, Domo, MicroStrategy, and other ODBC/JDBC-compliant tools out of the box
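To make the steps above concrete, here is a minimal sketch of the idea that one transactional SQL store can handle ingestion, cleansing, feature engineering, and model execution. It uses Python's built-in sqlite3 module purely as a stand-in: sqlite3 is a single-node embedded database with no sharding or Spark integration, and nothing here is Splice Machine's API. The table readings, its columns, and the function overheat_score are hypothetical names invented for illustration.

```python
import sqlite3

# Stand-in only: sqlite3 is a single-node embedded database, not Splice Machine.
conn = sqlite3.connect(":memory:")

# Batch ingestion target: a constraint rejects physically impossible values.
conn.execute("""
    CREATE TABLE readings (
        sensor_id INTEGER NOT NULL,
        temp_c    REAL CHECK (temp_c IS NULL OR temp_c > -273.15)
    )
""")
conn.execute("CREATE INDEX idx_sensor ON readings (sensor_id)")

# Hypothetical raw data; one row has a missing temperature.
rows = [(1, 20.0), (1, 22.0), (2, 30.0), (2, None), (2, 34.0)]
with conn:  # one transaction: table and index are updated together
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

# Data cleansing: fix up a small subset of rows with plain SQL.
with conn:
    conn.execute("DELETE FROM readings WHERE temp_c IS NULL")

# Model execution: wrap a "model" as a user-defined function.
def overheat_score(avg_temp):
    # Toy linear rule standing in for a trained model (assumption).
    return max(0.0, (avg_temp - 25.0) / 10.0)

conn.create_function("overheat_score", 1, overheat_score)

# Feature engineering (aggregation) and scoring in a single SQL query.
scores = conn.execute("""
    SELECT sensor_id,
           AVG(temp_c)                 AS avg_temp,
           overheat_score(AVG(temp_c)) AS score
    FROM readings
    GROUP BY sensor_id
    ORDER BY sensor_id
""").fetchall()

for sensor_id, avg_temp, score in scores:
    print(sensor_id, avg_temp, score)
```

The point of the sketch is the shape of the workflow rather than the engine: every step is a SQL statement or a function registered with the database, so the application does not have to synchronize data across separate systems.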