ApacheCalciteRunTime - eellpp/pubScratchpad GitHub Wiki

Apache Calcite has two main execution runtimes:

  1. Enumerable-based Runtime (for in-memory execution)
  2. JDBC-based Runtime (for external database execution)

These two runtimes determine how and where Calcite executes queries.


Enumerable-Based Runtime

  • Fully in-memory execution using Java iterators (Enumerable<Object[]>).
  • Ideal for structured data sources like CSV, JSON, or in-memory collections.
  • No external database required.
  • Executes using functional-style transformations (like Java Streams).
  • SQL queries are converted into Java execution plans (RelNode → EnumerableRel → Java Code).

https://github.com/eellpp/pubScratchpad/wiki/ApacheCalciteEnumerable


JDBC-Based Runtime

  • Uses a traditional database (PostgreSQL, MySQL, etc.) to execute queries.
  • Calcite translates SQL into a query plan and delegates execution to the database.
  • Supports SQL optimization but relies on the database for execution.

Example: Running a Query in JDBC Runtime

Connection connection = DriverManager.getConnection("jdbc:calcite:", properties);
PreparedStatement stmt = connection.prepareStatement("SELECT * FROM employees WHERE salary > ?");
stmt.setInt(1, 50000);

ResultSet rs = stmt.executeQuery();
while (rs.next()) {
    System.out.println(rs.getString("name"));
}

Use Cases:

  • Querying external relational databases.
  • Connecting to JDBC sources like PostgreSQL, MySQL, Oracle.
  • Leveraging database indexing and optimizations.

Key Limitation:

Depends on an external database (cannot execute in-memory).


** Comparing Calcite’s Two Runtimes**

Feature Enumerable-Based Runtime JDBC-Based Runtime
Execution In-memory, Java-based Delegated to an external database
SQL Parsing & Optimization Done in Calcite Done in Calcite
Query Execution Uses Java Enumerable operations Uses external JDBC database engine
Data Storage Does not store data (only processes it) Uses database storage
Performance Fast for small datasets (avoids network overhead) Scales better with large datasets (uses indexing, disk IO)
Use Case In-memory analytics, embedded SQL execution Traditional database querying

🚀 Which Runtime Should You Use?

Use Enumerable-Based Runtime if:

  • You are working with in-memory data structures (Lists, Maps, CSV).
  • You need embedded SQL execution inside a Java application.
  • You don’t want to set up an external database.

Use JDBC-Based Runtime if:

  • You need to query large datasets stored in databases.
  • You want to leverage database indexes and optimizations.
  • You already have a JDBC data source.

Calcite JDBC Driver

How Does Calcite’s JDBC Driver Work?

Apache Calcite provides an embedded JDBC driver that works entirely in-memory. Instead of connecting to an external database, it:

  1. Maps SQL queries to an in-memory schema (tables, views, or structured data like CSV/JSON).
  2. Uses its query optimizer and execution engine to process the SQL.
  3. Returns results like a normal JDBC driver.

This allows you to use standard JDBC APIs (Connection, PreparedStatement, ResultSet), just like you would with a traditional database.


Example: Running a PreparedStatement on an In-Memory Table

Here’s how Calcite lets you run SQL queries using JDBC APIs without a database:

** Define an In-Memory Schema**

SchemaPlus rootSchema = Frameworks.createRootSchema(true);
MemorySchema employeesSchema = new MemorySchema();
rootSchema.add("EMPLOYEES", employeesSchema);

Here, MemorySchema is a custom in-memory schema (you can use JSON, CSV, or programmatically added tables).

** Register Calcite’s JDBC Driver**

Properties info = new Properties();
info.setProperty("model", "path/to/model.json");  // Schema definition file
Connection connection = DriverManager.getConnection("jdbc:calcite:", info);

The "jdbc:calcite:" driver does not need a database—it simply maps SQL queries to Calcite's internal execution engine.

** Use PreparedStatement Like a Regular Database**

PreparedStatement stmt = connection.prepareStatement("SELECT * FROM EMPLOYEES WHERE ID = ?");
stmt.setInt(1, 101);

ResultSet rs = stmt.executeQuery();
while (rs.next()) {
    System.out.println(rs.getInt("ID") + " - " + rs.getString("NAME"));
}

Even though there’s no real database, Calcite’s built-in SQL parser, optimizer, and executor run the query on the in-memory schema.


🔍 How Is This Possible?

  1. Calcite Parses SQL:

    • It translates the SQL into an internal relational algebra (RelNode).
  2. Optimizes Execution Plan:

    • Uses rule-based transformations for efficient query execution.
  3. Executes on the In-Memory Schema:

    • Uses Enumerable execution instead of calling a database engine.
  4. Returns JDBC-Style ResultSet:

    • Makes the execution feel like a real database query.

📌 Key Takeaways

Yes, Apache Calcite includes a built-in JDBC driver.
It allows you to run SQL queries without an external database.
Calcite translates SQL into relational algebra and executes it in-memory.
You can use standard PreparedStatement, ResultSet, and other JDBC APIs.
Great for embedding SQL-like querying in applications without a real database.

⚠️ **GitHub.com Fallback** ⚠️