Massive Scale-Out Power, Incredibly Easy to Operate
With Splice Machine Cloud Manager, configuring a new cluster is as easy as using a few sliders to set compute units for OLTP and OLAP processing, allocate storage, and schedule backup frequency and retention. Splice Machine does the rest.
You can seamlessly scale out from gigabytes to petabytes of data when needs or data volumes change, and the same configurator adds or subtracts resources dynamically. You pay only for what you use.
ANSI SQL
Applications interact with Splice Machine over JDBC or ODBC connections using a full ANSI SQL implementation. The list below shows a sampling:
Data Types – e.g., INTEGER, REAL, CHARACTER, DATE, BOOLEAN, BIGINT
DDL – e.g., CREATE TABLE, CREATE SCHEMA, ALTER TABLE, DELETE, UPDATE TABLE
Predicates – e.g., IN, BETWEEN, LIKE, EXISTS
DML – e.g., INSERT, DELETE, UPDATE, SELECT
Joins – e.g., INNER JOIN, LEFT OUTER JOIN
Query Specification – e.g., GROUP BY, HAVING
SET Functions – e.g., UNION, ABS, MOD, ALL, INTERSECT, EXCEPT
Aggregation Functions – e.g., AVG, MAX, COUNT
Constraints – e.g., PRIMARY KEY, CHECK, FOREIGN KEY, UNIQUE, NOT NULL
Conditional Functions – e.g., CASE, searched CASE
String Functions – e.g., SUBSTRING, concatenation, UPPER, LOWER, TRIM, LENGTH
Privileges – e.g., privileges for SELECT, DELETE, INSERT, EXECUTE
Transactions – e.g., COMMIT, ROLLBACK, Snapshot Isolation
Sub-queries
Stored Procedures
Triggers
User-defined functions (UDFs)
Views – including grouped views
Window Functions – e.g., FIRST_VALUE, LAST_VALUE, LEAD, LAG
The full ANSI SQL capabilities make Splice Machine a strong platform for migrating applications from traditional relational databases to a more scalable and cost-effective data platform. Splice Machine even offers native PL/SQL support to ease migration of applications that use this procedural language extension.
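To make the list above concrete, here is a minimal sketch of a few of the listed constructs (constraints, INSERT, LEFT OUTER JOIN, GROUP BY, aggregation, searched CASE). It runs against Python's built-in SQLite engine purely as a stand-in so the example is self-contained; it is not Splice Machine, and the `customers` and `orders` tables are hypothetical. An application would send comparable statements to Splice Machine over JDBC or ODBC.

```python
import sqlite3

# Stand-in engine: in-memory SQLite. This is NOT Splice Machine; it only
# shows that the statements below rely on portable SQL constructs.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL with constraints: PRIMARY KEY, NOT NULL, FOREIGN KEY, CHECK.
# Table and column names are hypothetical.
cur.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name VARCHAR(50) NOT NULL
    )
""")
cur.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL CHECK (amount >= 0)
    )
""")

# DML: INSERT.
cur.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
)
cur.executemany(
    "INSERT INTO orders (id, customer_id, amount) VALUES (?, ?, ?)",
    [(10, 1, 120.0), (11, 1, 30.0), (12, 2, 45.0)],
)
conn.commit()

# Query: LEFT OUTER JOIN, COUNT/SUM aggregation, GROUP BY, searched CASE.
cur.execute("""
    SELECT c.name,
           COUNT(o.id) AS order_count,
           CASE WHEN COUNT(o.id) = 0     THEN 'none'
                WHEN SUM(o.amount) >= 100 THEN 'high'
                ELSE 'low'
           END AS tier
    FROM customers c
    LEFT OUTER JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""")
rows = cur.fetchall()
conn.close()
```

Customers without orders still appear in the result because of the outer join, and the searched CASE classifies each customer from the aggregated values.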
Hybrid Transactional and Analytical Processing (a.k.a. “HTAP”)
Splice Machine has a unique “Dual Engine” architecture that it uses to provide outstanding performance for concurrent transactional (OLTP) and analytical (OLAP) workloads. The SQL parser and cost-based optimizer analyze an incoming query and then determine the best execution plan based on query type, data sizes, available indexes and more. Splice Machine then runs OLTP-type lookups, inserts and short range scans on HBase, and it uses Spark for lightning-fast in-memory processing of analytical workloads.
The Dual Engine architecture gives you the best of multiple worlds in a hybrid database: the performance, scale-out, and resilience of HBase, the in-memory analytics performance of Spark, and the performance of a cost-based optimizer.
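The routing decision described above can be pictured with a toy cost comparison. This is only a sketch under stated assumptions: the text does not describe the actual cost model, so the constants, the single `estimated_rows` input, and the `choose_engine` function are all hypothetical. The idea it illustrates is that an engine with low startup overhead wins for small requests, while an engine with cheaper per-row processing wins once enough rows are involved.

```python
# Hypothetical cost constants in arbitrary units; the real optimizer's
# cost model is not described in the text.
OLTP_COST_PER_ROW = 1.0      # assumed: low latency, higher per-row cost
OLAP_STARTUP_COST = 10_000.0  # assumed: fixed overhead to launch a job
OLAP_COST_PER_ROW = 0.1      # assumed: cheap per-row cost once running


def choose_engine(estimated_rows: int) -> str:
    """Pick the engine with the lower estimated cost for a query.

    `estimated_rows` stands in for the statistics-driven estimate an
    optimizer would produce; here it is simply passed in by the caller.
    """
    oltp_cost = OLTP_COST_PER_ROW * estimated_rows
    olap_cost = OLAP_STARTUP_COST + OLAP_COST_PER_ROW * estimated_rows
    return "hbase" if oltp_cost <= olap_cost else "spark"


# A single-row lookup versus a scan over ten million rows.
point_lookup_engine = choose_engine(1)
full_scan_engine = choose_engine(10_000_000)
```

With these assumed constants the break-even point sits a little above 11,000 rows; a real optimizer would weigh many more factors, such as available indexes and join strategies.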
Resource Isolation
Splice Machine isolates the resources allocated to HBase and Spark from each other, so each can progress independently of the other's workload. Combined with multi-version concurrency control (MVCC), this ensures that the performance level of transactional workloads can remain high, even if large reports or analytic processes are running.
Cost-based Optimizer with Advanced Statistics
A cost-based optimizer can only be as good as the statistics it has access to. Most big-data statistics are poor, and improving them would impose a significant performance cost on the system. Splice Machine implements “sketches” to compute cardinalities, which can produce results orders of magnitude faster than exact computation and with mathematically proven error bounds.
The combination of the cost-based optimizer and the right statistics can reduce big data query response times from days and hours to minutes and seconds.
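To give a feel for how a cardinality sketch works, here is a minimal K-Minimum-Values (KMV) estimator. It is one well-known sketch technique chosen for illustration; the text does not say which sketch algorithm Splice Machine uses, and the function names and the default `k` are assumptions. The sketch keeps only the `k` smallest hash values it has seen, so memory stays bounded no matter how large the column is, and the relative error shrinks roughly in proportion to 1/sqrt(k).

```python
import hashlib
import heapq


def _unit_hash(value: str) -> float:
    """Map a value to a pseudo-uniform number in [0, 1) using SHA-256."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_distinct(values, k: int = 256) -> float:
    """Estimate the number of distinct values with a KMV sketch.

    Only the k smallest distinct hash values are retained. If the k-th
    smallest hash is h_k, about k - 1 hashes fell below h_k, so the
    estimate of the distinct count is (k - 1) / h_k.
    """
    heap = []        # max-heap via negation: holds the k smallest hashes
    members = set()  # hashes currently in the heap, to skip duplicates
    for value in values:
        h = _unit_hash(str(value))
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return float(len(heap))
    kth_smallest = -heap[0]
    return (k - 1) / kth_smallest


# 100,000 values containing 50,000 distinct keys.
sample = [i % 50_000 for i in range(100_000)]
approx = estimate_distinct(sample, k=1024)
```

The estimator touches each value once and stores at most `k` hashes, which is why sketches of this kind are cheap to maintain compared with an exact distinct count.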
Hybrid Storage
Row-based storage is write-optimized and very adept at single-record lookups and short scans. Columnar storage is optimized for large table scans, large joins, aggregations and groupings. Splice Machine supports both. In addition to its internal row-based representation, it supports external ORC and Parquet columnar tables. The Splice query optimizer leverages this hybrid architecture to optimize performance across internal and external storage.
The hybrid storage capability lets companies use lower-cost, higher-latency storage options such as AWS S3, which can yield significant cost savings. It also speeds up large OLAP use cases that benefit from the columnar layout.
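The difference between the two layouts can be shown with a toy in-memory example. This is only an illustration of the general row-versus-column trade-off; the data, the `row_store` and `column_store` structures, and the helper functions are hypothetical and say nothing about Splice Machine's actual on-disk formats.

```python
# Row layout: every field of a record is stored together, keyed by its id.
# Fetching or writing one whole record touches a single entry.
row_store = {
    1: {"id": 1, "region": "east", "amount": 120.0},
    2: {"id": 2, "region": "west", "amount": 30.0},
    3: {"id": 3, "region": "east", "amount": 45.0},
}

# Columnar layout: every value of a column is stored together.
# An aggregate reads only the columns it needs and skips the rest.
column_store = {
    "id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [120.0, 30.0, 45.0],
}


def lookup_record(record_id: int) -> dict:
    """Single-record lookup: one access in the row layout."""
    return row_store[record_id]


def total_by_region() -> dict:
    """Grouped aggregation: scans two columns and never reads 'id'."""
    totals = {}
    for region, amount in zip(column_store["region"], column_store["amount"]):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


record = lookup_record(2)
totals = total_by_region()
```

On real datasets the columnar scan benefits further from compression and from reading less data overall, which matters most on higher-latency storage.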