Enumerable Adapter

The Enumerable Adapter in Apache Calcite serves a specific and important purpose: it provides a default implementation for executing queries in Calcite when no other adapter or data source is explicitly configured. It is a fallback mechanism that allows Calcite to execute queries in-memory using the Linq4j library, which is Calcite's implementation of LINQ (Language Integrated Query) for Java.

Purpose of the Enumerable Adapter

Fallback Execution Engine:
- When Calcite cannot push down a query to an external data source (e.g., a database or a custom adapter), it falls back to the Enumerable Adapter.
- The Enumerable Adapter executes the query in-memory using the Linq4j library.
In-Memory Query Execution:
- The Enumerable Adapter allows Calcite to execute queries on data that is already loaded into memory (e.g., Java collections or in-memory tables).
- This is useful for testing, prototyping, or when working with small datasets.
Reference Implementation:
- The Enumerable Adapter serves as a reference implementation for how to implement query execution in Calcite.
- It demonstrates how to translate Calcite's relational algebra into executable code using Linq4j.
Support for Advanced Features:
- The Enumerable Adapter supports advanced SQL features like joins, aggregations, and sorting, even when the underlying data source does not.

How the Enumerable Adapter Works

The Enumerable Adapter works by translating Calcite's relational algebra into Linq4j expressions, which are then executed in-memory. Here’s a high-level overview of the process:

Query Parsing and Optimization:
- Calcite parses the SQL query and optimizes it into a relational algebra plan.
Enumerable Convention:
- If the query cannot be pushed down to an external data source, Calcite converts the relational algebra plan into the Enumerable convention.
- The Enumerable convention represents the plan as a series of Linq4j expressions.
Code Generation:
- Calcite generates Java code (using Linq4j) to execute the query in-memory.
- This code is typically a series of nested loops and function calls that implement the relational operators (e.g., filter, project, join).
Execution:
- The generated code is executed on the data, producing the query results.

Use Cases for the Enumerable Adapter

Testing and Prototyping:
- The Enumerable Adapter is useful for testing and prototyping SQL queries without needing to connect to an external data source.
Small Datasets:
- For small datasets that can fit into memory, the Enumerable Adapter provides a simple and efficient way to execute queries.
Custom Adapters:
- When building custom adapters, the Enumerable Adapter can be used as a fallback for operations that the adapter does not support natively.
Standalone Query Engine:
- The Enumerable Adapter allows Calcite to function as a standalone in-memory query engine, independent of external data sources.

Example: Enumerable Adapter in Action

Here’s an example of how the Enumerable Adapter works in practice:

1. Define an In-Memory Table

import org.apache.calcite.DataContext;
import org.apache.calcite.linq4j.Enumerable;
import org.apache.calcite.linq4j.Linq4j;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeFactory;
import org.apache.calcite.schema.ScannableTable;
import org.apache.calcite.schema.impl.AbstractTable;

import java.util.Arrays;
import java.util.List;

public class InMemoryTable extends AbstractTable implements ScannableTable {
    private final List<Object[]> data;

    public InMemoryTable(List<Object[]> data) {
        this.data = data;
    }

    @Override
    public RelDataType getRowType(RelDataTypeFactory typeFactory) {
        return typeFactory.builder()
                .add("id", typeFactory.createSqlType(SqlTypeName.INTEGER))
                .add("name", typeFactory.createSqlType(SqlTypeName.VARCHAR))
                .add("age", typeFactory.createSqlType(SqlTypeName.INTEGER))
                .build();
    }

    @Override
    public Enumerable<Object[]> scan(DataContext root) {
        return Linq4j.asEnumerable(data);
    }
}

2. Register the Table and Execute a Query

import org.apache.calcite.jdbc.CalciteConnection;
import org.apache.calcite.schema.SchemaPlus;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

public class EnumerableAdapterExample {
    public static void main(String[] args) throws Exception {
        // Create an in-memory table
        List<Object[]> data = Arrays.asList(
                new Object[]{1, "Alice", 30},
                new Object[]{2, "Bob", 25},
                new Object[]{3, "Charlie", 35}
        );
        InMemoryTable table = new InMemoryTable(data);

        // Register the table with Calcite
        Connection connection = DriverManager.getConnection("jdbc:calcite:");
        CalciteConnection calciteConnection = connection.unwrap(CalciteConnection.class);
        SchemaPlus rootSchema = calciteConnection.getRootSchema();
        rootSchema.add("MEMORY_TABLE", table);

        // Execute a query
        Statement statement = calciteConnection.createStatement();
        ResultSet resultSet = statement.executeQuery("SELECT id, name, age FROM MEMORY_TABLE WHERE age > 30");
        while (resultSet.next()) {
            System.out.println(
                    resultSet.getInt("id") + ", " +
                            resultSet.getString("name") + ", " +
                            resultSet.getInt("age"));
        }

        // Clean up
        resultSet.close();
        statement.close();
        connection.close();
    }
}

Output

3, Charlie, 35

Advantages of the Enumerable Adapter

Simplicity:
- Provides a straightforward way to execute queries in-memory.
Flexibility:
- Supports advanced SQL features without requiring external data sources.
Fallback Mechanism:
- Ensures that queries can always be executed, even if no adapter is available.

Disadvantages of the Enumerable Adapter

Performance:
- In-memory execution can be slow for large datasets.
Memory Usage:
- Requires all data to fit into memory, which may not be feasible for large datasets.
Limited Push-Down:
- Does not leverage the capabilities of external data sources (e.g., indexes, distributed processing).

Conclusion

The Enumerable Adapter is a key component of Calcite that provides a fallback mechanism for query execution. It is particularly useful for testing, prototyping, and working with small datasets. However, for production use cases with large datasets, it is recommended to use adapters that can push down queries to external data sources for better performance and scalability.

Enumerable Interface

Exposes the enumerator, which supports a simple iteration over a collection

public interface Enumerable extends RawEnumerable, Iterable, ExtendedEnumerable

Enumerable<T> is Calcite's equivalent of a relational iterator, inspired by .NET's LINQ Enumerable.
It allows row-by-row or batch execution of relational queries.
It is fully in-memory, meaning it operates on Java collections instead of a database.

Example: Query Execution in `Enumerable`

Enumerable<Object[]> result = EnumerableInterpretable.toEnumerable(optimizedPlan);

for (Object[] row : result) {
    System.out.println(Arrays.toString(row));
}

Here, result is an Enumerable that executes the query in memory, without a database.
Each row is represented as an Object[], just like a SQL row.

How Does the `Enumerable` Runtime Work?

Calcite translates SQL queries into Java-based Enumerable expressions, which are then executed dynamically.

Execution Flow

1️⃣ SQL Parsing & Logical Plan Generation

SQL query is converted into a RelNode (Relational Algebra Expression).
No actual data is accessed yet.

2️⃣ Optimization

Logical optimizations (e.g., predicate pushdown, projections).
Conversion to a physical plan using Enumerable operators.

3️⃣ Execution Using EnumerableRel (Calcite’s Physical Execution Plan)

The final execution plan consists of Enumerable-based operations (e.g., filtering, sorting, aggregations).
These operations are executed in-memory using Java.

What Does This Mean in Practice?

1. Queries Run in Java, Without a Database

Unlike traditional databases that execute queries using a disk-based engine, Calcite:

Executes directly in memory using Java’s Enumerable API.
Processes data using function-style operations (like Java Streams).

2. Query Execution is Lazy (Row-by-Row or Batch)

Enumerable<Object[]> fetches rows lazily.
This prevents loading the entire dataset into memory at once.

3. Supports Efficient Query Execution Without JDBC

You don’t need a database engine—Calcite executes queries in-memory.
Works well for structured data in JSON, CSV, or programmatically defined tables.

Example: SQL Query Execution Using `EnumerableRel`

Consider this SQL query:

SELECT name, salary FROM employees WHERE salary > 50000;

Step 1: Convert SQL to `RelNode`

Calcite first parses and optimizes it into a relational algebra expression (RelNode):

LogicalProject(name, salary)
  LogicalFilter(condition=[salary > 50000])
    LogicalTableScan(table=[employees])

No data is accessed yet—Calcite is still building the execution plan.

Step 2: Convert `RelNode` to `EnumerableRel`

Calcite converts it into an EnumerableRel physical plan, which uses Java iterators:

EnumerableProject(name, salary)
  EnumerableFilter(condition=[salary > 50000])
    EnumerableTableScan(table=[employees])

Here, EnumerableFilter and EnumerableTableScan process rows in-memory.

Step 3: Execute the Query Using `Enumerable`

Enumerable<Object[]> result = EnumerableInterpretable.toEnumerable(optimizedPlan);

for (Object[] row : result) {
    System.out.println(Arrays.toString(row));
}

This code lazily fetches rows and applies the WHERE and SELECT filters dynamically.

How is This Different from a Traditional Database Runtime?

Feature	Calcite Enumerable Runtime	Traditional Database (PostgreSQL, MySQL)
Execution Model	Java-based in-memory execution	Disk-based execution engine
Optimization	Rule-based optimization (RBO)	Cost-based optimization (CBO)
Data Access	Operates on Java collections, in-memory tables	Reads from disk-based tables
Laziness	Fetches rows lazily (`Enumerable.next()`)	Uses disk buffers, indexes, or materialized execution
Use Case	Query execution on CSV, JSON, in-memory collections	Traditional relational databases

When Should You Use Calcite’s Enumerable Runtime?

Best for:

In-memory analytics on structured data (CSV, JSON, Parquet, Java collections).
Embedding SQL-like query capabilities inside Java applications.
Querying without a database (Calcite acts as a SQL engine).

Not ideal for:

Very large datasets that don’t fit in memory (because Enumerable is in-memory).
Transactional workloads (Calcite is not an OLTP database).

Summary

✅ Calcite's runtime is based on Enumerable, meaning it executes queries in-memory using Java.
✅ It allows SQL-like operations without a database, using function-style execution.
✅ This makes it ideal for embedded analytics, querying in-memory structures, and working with non-database sources.
✅ It does not store data—it only processes structured data when needed.

ApacheCalciteEnumerable - eellpp/pubScratchpad GitHub Wiki

Enumerable Adapter

Purpose of the Enumerable Adapter

How the Enumerable Adapter Works

Use Cases for the Enumerable Adapter

Example: Enumerable Adapter in Action

1. Define an In-Memory Table

2. Register the Table and Execute a Query

Output

Advantages of the Enumerable Adapter

Disadvantages of the Enumerable Adapter

Conclusion

Enumerable Interface

Example: Query Execution in `Enumerable`

How Does the `Enumerable` Runtime Work?

Execution Flow

What Does This Mean in Practice?

1. Queries Run in Java, Without a Database

2. Query Execution is Lazy (Row-by-Row or Batch)

3. Supports Efficient Query Execution Without JDBC

Example: SQL Query Execution Using `EnumerableRel`

Step 1: Convert SQL to `RelNode`

Step 2: Convert `RelNode` to `EnumerableRel`

Step 3: Execute the Query Using `Enumerable`

How is This Different from a Traditional Database Runtime?

When Should You Use Calcite’s Enumerable Runtime?

Summary

⚠️ GitHub.com Fallback ⚠️

ApacheCalciteEnumerable - eellpp/pubScratchpad GitHub Wiki

Enumerable Adapter

Purpose of the Enumerable Adapter

How the Enumerable Adapter Works

Use Cases for the Enumerable Adapter

Example: Enumerable Adapter in Action

1. Define an In-Memory Table

2. Register the Table and Execute a Query

Output

Advantages of the Enumerable Adapter

Disadvantages of the Enumerable Adapter

Conclusion

Enumerable Interface

Example: Query Execution in Enumerable

How Does the Enumerable Runtime Work?

Execution Flow

What Does This Mean in Practice?

1. Queries Run in Java, Without a Database

2. Query Execution is Lazy (Row-by-Row or Batch)

3. Supports Efficient Query Execution Without JDBC

** Example: SQL Query Execution Using EnumerableRel**

Step 1: Convert SQL to RelNode

Step 2: Convert RelNode to EnumerableRel

Step 3: Execute the Query Using Enumerable

How is This Different from a Traditional Database Runtime?

When Should You Use Calcite’s Enumerable Runtime?

Summary

⚠️ **GitHub.com Fallback** ⚠️

Example: Query Execution in `Enumerable`

How Does the `Enumerable` Runtime Work?

Example: SQL Query Execution Using `EnumerableRel`

Step 1: Convert SQL to `RelNode`

Step 2: Convert `RelNode` to `EnumerableRel`

Step 3: Execute the Query Using `Enumerable`

⚠️ GitHub.com Fallback ⚠️