ApacheCalciteRunTime - eellpp/pubScratchpad GitHub Wiki
Apache Calcite has two main execution runtimes:
- Enumerable-based Runtime (for in-memory execution)
- JDBC-based Runtime (for external database execution)
These two runtimes determine how and where Calcite executes queries.
- Fully in-memory execution using Java iterators (
Enumerable<Object[]>
). - Ideal for structured data sources like CSV, JSON, or in-memory collections.
- No external database required.
- Executes using functional-style transformations (like Java Streams).
- SQL queries are converted into Java execution plans (RelNode → EnumerableRel → Java Code).
https://github.com/eellpp/pubScratchpad/wiki/ApacheCalciteEnumerable
- Uses a traditional database (PostgreSQL, MySQL, etc.) to execute queries.
- Calcite translates SQL into a query plan and delegates execution to the database.
- Supports SQL optimization but relies on the database for execution.
Connection connection = DriverManager.getConnection("jdbc:calcite:", properties);
PreparedStatement stmt = connection.prepareStatement("SELECT * FROM employees WHERE salary > ?");
stmt.setInt(1, 50000);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
System.out.println(rs.getString("name"));
}
- Querying external relational databases.
- Connecting to JDBC sources like PostgreSQL, MySQL, Oracle.
- Leveraging database indexing and optimizations.
Depends on an external database (cannot execute in-memory).
Feature | Enumerable-Based Runtime | JDBC-Based Runtime |
---|---|---|
Execution | In-memory, Java-based | Delegated to an external database |
SQL Parsing & Optimization | Done in Calcite | Done in Calcite |
Query Execution | Uses Java Enumerable operations |
Uses external JDBC database engine |
Data Storage | Does not store data (only processes it) | Uses database storage |
Performance | Fast for small datasets (avoids network overhead) | Scales better with large datasets (uses indexing, disk IO) |
Use Case | In-memory analytics, embedded SQL execution | Traditional database querying |
- You are working with in-memory data structures (Lists, Maps, CSV).
- You need embedded SQL execution inside a Java application.
- You don’t want to set up an external database.
- You need to query large datasets stored in databases.
- You want to leverage database indexes and optimizations.
- You already have a JDBC data source.
Apache Calcite provides an embedded JDBC driver that works entirely in-memory. Instead of connecting to an external database, it:
- Maps SQL queries to an in-memory schema (tables, views, or structured data like CSV/JSON).
- Uses its query optimizer and execution engine to process the SQL.
- Returns results like a normal JDBC driver.
This allows you to use standard JDBC APIs (Connection
, PreparedStatement
, ResultSet
), just like you would with a traditional database.
Here’s how Calcite lets you run SQL queries using JDBC APIs without a database:
SchemaPlus rootSchema = Frameworks.createRootSchema(true);
MemorySchema employeesSchema = new MemorySchema();
rootSchema.add("EMPLOYEES", employeesSchema);
Here, MemorySchema
is a custom in-memory schema (you can use JSON, CSV, or programmatically added tables).
Properties info = new Properties();
info.setProperty("model", "path/to/model.json"); // Schema definition file
Connection connection = DriverManager.getConnection("jdbc:calcite:", info);
The "jdbc:calcite:"
driver does not need a database—it simply maps SQL queries to Calcite's internal execution engine.
PreparedStatement stmt = connection.prepareStatement("SELECT * FROM EMPLOYEES WHERE ID = ?");
stmt.setInt(1, 101);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
System.out.println(rs.getInt("ID") + " - " + rs.getString("NAME"));
}
Even though there’s no real database, Calcite’s built-in SQL parser, optimizer, and executor run the query on the in-memory schema.
-
Calcite Parses SQL:
- It translates the SQL into an internal relational algebra (
RelNode
).
- It translates the SQL into an internal relational algebra (
-
Optimizes Execution Plan:
- Uses rule-based transformations for efficient query execution.
-
Executes on the In-Memory Schema:
- Uses Enumerable execution instead of calling a database engine.
-
Returns JDBC-Style
ResultSet
:- Makes the execution feel like a real database query.
Yes, Apache Calcite includes a built-in JDBC driver.
It allows you to run SQL queries without an external database.
Calcite translates SQL into relational algebra and executes it in-memory.
You can use standard PreparedStatement
, ResultSet
, and other JDBC APIs.
Great for embedding SQL-like querying in applications without a real database.