Linq4J - eellpp/pubScratchpad GitHub Wiki
Linq4j is a library, developed as part of Apache Calcite, that brings LINQ-style querying to Java. LINQ (Language-Integrated Query) was introduced in Microsoft's .NET framework and lets developers write queries directly in their programming language (e.g., C#) against data sources such as collections, databases, and XML. Linq4j offers similar capabilities in Java.
- Query Expressions:
  - Linq4j allows you to write declarative queries in Java, similar to SQL or LINQ in C#.
  - Example:

        Enumerable<Employee> employees = Linq4j.asEnumerable(employeeList);
        Enumerable<EmployeeDto> result = employees
            .where(e -> e.getSalary() > 50000)
            .orderBy(e -> e.getName())
            .select(e -> new EmployeeDto(e.getName(), e.getSalary()));
- Lazy Evaluation:
  - Queries are evaluated lazily, meaning the data is processed only when the result is actually needed (e.g., when iterating over the result).
- Interoperability with Collections:
  - Linq4j works seamlessly with Java collections (`List`, `Set`, `Map`, etc.) and allows you to perform complex operations like filtering, sorting, grouping, and joining.
- Integration with Apache Calcite:
  - Linq4j is tightly integrated with Apache Calcite, enabling querying over relational data sources and in-memory collections using the same API.
- Functional Programming Support:
  - Linq4j leverages Java's functional programming features (e.g., lambda expressions) to provide a concise and expressive way to write queries.
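The lazy-evaluation point above can be made concrete with a small, library-free sketch. It is not Linq4j's implementation: `LazyWhereSketch` and `lazyWhere` are hypothetical names, and the code only assumes that a lazy `where` can be modelled as an `Iterable` whose iterator applies the predicate on demand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class LazyWhereSketch {

    /** Returns a view of source; the predicate runs only while the view is iterated. */
    static <T> Iterable<T> lazyWhere(Iterable<T> source, Predicate<T> predicate) {
        return () -> new Iterator<T>() {
            private final Iterator<T> inner = source.iterator();
            private T pending;
            private boolean ready;

            @Override public boolean hasNext() {
                // Pull from the source until an element passes the predicate.
                while (!ready && inner.hasNext()) {
                    T candidate = inner.next();
                    if (predicate.test(candidate)) {
                        pending = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                return pending;
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        List<Integer> salaries = Arrays.asList(60000, 45000, 70000);

        // Building the query does not touch the data.
        Iterable<Integer> high = lazyWhere(salaries, s -> {
            calls.incrementAndGet();
            return s > 50000;
        });
        System.out.println("calls after building the query: " + calls.get()); // 0

        // Iterating is what triggers the predicate.
        List<Integer> kept = new ArrayList<>();
        for (Integer s : high) {
            kept.add(s);
        }
        System.out.println("calls after iterating: " + calls.get()); // 3
        System.out.println(kept); // [60000, 70000]
    }
}
```

Because the view holds no results of its own, iterating it a second time would run the predicate again over the source.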
- `Enumerable<T>`:
  - The primary interface in Linq4j, representing a sequence of elements that can be queried.
  - Provides methods like `where`, `select`, `orderBy`, `groupBy`, `join`, etc.
- `EnumerableDefaults`:
  - A utility class that provides default implementations for the methods in the `Enumerable` interface.
- `Linq4j`:
  - A utility class with static methods to create `Enumerable` instances from collections, arrays, or other data sources.
Query Operators:
- Linq4j supports a wide range of query operators, including:
-
Filtering:
where
,ofType
-
Projection:
select
,selectMany
-
Sorting:
orderBy
,orderByDescending
-
Grouping:
groupBy
-
Joining:
join
,groupJoin
-
Aggregation:
count
,sum
,average
,min
,max
-
Filtering:
- Linq4j supports a wide range of query operators, including:
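How the three pieces described above fit together (a fluent interface, a class of static operator implementations, and a factory class) can be sketched without the library. Every name here (`MiniEnumerable`, `MiniDefaults`, `Mini`) is a hypothetical stand-in, and the wiring through interface default methods is an assumption made for brevity rather than a description of how Linq4j is built internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class ComponentsSketch {
    public static void main(String[] args) {
        MiniEnumerable<String> names = Mini.asEnumerable(Arrays.asList("Alice", "Bob"));
        System.out.println(names.select(String::length).toList()); // [5, 3]
        System.out.println(names.count()); // 2
    }
}

/** Hypothetical stand-in for Enumerable<T>: the fluent interface callers see. */
interface MiniEnumerable<T> extends Iterable<T> {
    default <R> MiniEnumerable<R> select(Function<T, R> selector) {
        return MiniDefaults.select(this, selector);
    }

    default int count() {
        return MiniDefaults.count(this);
    }

    default List<T> toList() {
        return MiniDefaults.toList(this);
    }
}

/** Hypothetical stand-in for EnumerableDefaults: operator logic as static methods. */
final class MiniDefaults {
    private MiniDefaults() {}

    static <T, R> MiniEnumerable<R> select(MiniEnumerable<T> source, Function<T, R> selector) {
        // The lambda is the iterator() method; mapping happens element by element.
        return () -> {
            Iterator<T> inner = source.iterator();
            return new Iterator<R>() {
                @Override public boolean hasNext() { return inner.hasNext(); }
                @Override public R next() { return selector.apply(inner.next()); }
            };
        };
    }

    static <T> int count(MiniEnumerable<T> source) {
        int n = 0;
        for (T ignored : source) {
            n++;
        }
        return n;
    }

    static <T> List<T> toList(MiniEnumerable<T> source) {
        List<T> out = new ArrayList<>();
        for (T item : source) {
            out.add(item);
        }
        return out;
    }
}

/** Hypothetical stand-in for the Linq4j factory class. */
final class Mini {
    private Mini() {}

    static <T> MiniEnumerable<T> asEnumerable(List<T> list) {
        return list::iterator;
    }
}
```

Keeping the operator logic in one static class means any new sequence type only has to supply `iterator()` to get the whole operator set.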
Here's an example of how to use Linq4j to query a list of employees:

    import org.apache.calcite.linq4j.Enumerable;
    import org.apache.calcite.linq4j.Linq4j;
    import java.util.Arrays;
    import java.util.List;

    public class Linq4jExample {
        public static void main(String[] args) {
            // Sample data
            List<Employee> employees = Arrays.asList(
                new Employee("Alice", 60000),
                new Employee("Bob", 45000),
                new Employee("Charlie", 70000)
            );

            // Create an Enumerable from the list
            Enumerable<Employee> employeeEnumerable = Linq4j.asEnumerable(employees);

            // Query: Filter employees with salary > 50000, sort by name, and project to a DTO
            Enumerable<EmployeeDto> result = employeeEnumerable
                .where(e -> e.getSalary() > 50000)
                .orderBy(e -> e.getName())
                .select(e -> new EmployeeDto(e.getName(), e.getSalary()));

            // Print the result
            result.forEach(System.out::println);
        }
    }

    class Employee {
        private String name;
        private int salary;

        public Employee(String name, int salary) {
            this.name = name;
            this.salary = salary;
        }

        public String getName() { return name; }
        public int getSalary() { return salary; }
    }

    class EmployeeDto {
        private String name;
        private int salary;

        public EmployeeDto(String name, int salary) {
            this.name = name;
            this.salary = salary;
        }

        @Override
        public String toString() {
            return "EmployeeDto{name='" + name + "', salary=" + salary + "}";
        }
    }

Output:

    EmployeeDto{name='Alice', salary=60000}
    EmployeeDto{name='Charlie', salary=70000}

Benefits of Linq4j:
- Declarative Syntax:
  - Write queries in a concise and readable way, similar to SQL or LINQ.
- Type Safety:
  - Queries are type-safe, reducing the risk of runtime errors.
- Integration with Calcite:
  - Use the same API to query both in-memory collections and relational data sources.
- Functional Programming:
  - Leverage Java's lambda expressions for a functional programming style.

When to use Linq4j:
- When you need to perform complex queries on in-memory collections.
- When you want a LINQ-like experience in Java.
- When working with Apache Calcite and need to query relational data sources.

Limitations:
- Linq4j is not as widely used as other Java query options (e.g., the Stream API in Java 8+).
- It is primarily designed for use with Apache Calcite, so it may not be as feature-rich as standalone LINQ implementations in other languages.

Feature | Linq4j | Java Stream API |
---|---|---|
Declarative Syntax | Yes (LINQ-like) | Yes |
Lazy Evaluation | Yes | Yes |
Integration | Tightly integrated with Calcite | Part of the Java standard library |
Functional Style | Yes | Yes |
Type Safety | Yes | Yes |
In summary, Linq4j is a powerful library for querying collections and relational data in Java, providing a LINQ-like experience. It is particularly useful when working with Apache Calcite or when you need a more expressive query syntax than what the Java Stream API offers.
Apache Calcite's LINQ4J (a Java version of .NET's LINQ) provides advanced query capabilities that Java Stream API lacks. LINQ4J is more SQL-like, supporting relational-style operations, whereas Java Streams are focused on functional-style processing of in-memory collections.
Here are some key operations that LINQ4J provides directly and that Java Streams either lack or express less directly:
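🔹 LINQ4J queries are composable and, when written against `Queryable`, translatable
🔹 Java Streams are also lazy until a terminal operation runs, but a stream pipeline is single-use and opaque to a query planner

    // employeeList is an existing List<Employee>; Employee has public fields salary and name
    Enumerable<Employee> employees = Linq4j.asEnumerable(employeeList);
    // No execution yet (deferred)
    Enumerable<Employee> filtered = employees.where(e -> e.salary > 50000);
    Enumerable<Employee> sorted = filtered.orderBy(e -> e.name);
    // Execution happens when iterated
    for (Employee e : sorted) {
        System.out.println(e.name);
    }

Lambdas are accepted by the `Enumerable` operators; the `Queryable` variants of `where` and `orderBy` take expression trees (`FunctionExpression`) instead, which is what makes them inspectable.

✅ LINQ4J allows composing queries before execution, and the composed query can be iterated more than once
❌ Java Streams defer work too, but a stream cannot be reused or inspected, so no query planning is possible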
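For comparison, the deferral behaviour of a plain `java.util.stream` pipeline can be checked with the standard library alone. The sketch below (class and method names are illustrative) counts predicate calls before and after the terminal operation, and shows that a second terminal operation on the same stream is rejected.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamLazinessSketch {

    /** Returns {predicate calls before the terminal operation, calls after it}. */
    static int[] predicateCalls() {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> pipeline = Stream.of(60000, 45000, 70000)
                .filter(s -> {
                    calls.incrementAndGet();
                    return s > 50000;
                });
        int before = calls.get();                    // pipeline built, nothing evaluated
        pipeline.collect(Collectors.toList());       // terminal operation runs the filter
        int after = calls.get();
        return new int[] {before, after};
    }

    /** Returns true if a second terminal operation on the same stream fails. */
    static boolean secondRunFails() {
        Stream<Integer> pipeline = Stream.of(1, 2, 3).filter(n -> n > 1);
        List<Integer> first = pipeline.collect(Collectors.toList());
        try {
            pipeline.collect(Collectors.toList());
            return false;
        } catch (IllegalStateException alreadyConsumed) {
            return !first.isEmpty();
        }
    }

    public static void main(String[] args) {
        int[] calls = predicateCalls();
        System.out.println("before terminal op: " + calls[0]); // 0
        System.out.println("after terminal op: " + calls[1]);  // 3
        System.out.println("second run fails: " + secondRunFails()); // true
    }
}
```

The practical difference from a Linq4j `Enumerable` is therefore reuse and inspectability rather than laziness as such.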
🔹 LINQ4J supports SQL-like `JOIN` operations
🔹 Java Streams do not have a built-in join operator

    Enumerable<Employee> employees = ...;
    Enumerable<Department> departments = ...;
    // Linq4j does not ship a Tuple2 type, so the result selector builds the output row directly
    Enumerable<String> joined = employees
        .join(departments,
            e -> e.departmentId,   // Key from Employee
            d -> d.id,             // Key from Department
            (e, d) -> e.name + " - " + d.departmentName); // Result selector
    // Iterating the join result
    for (String row : joined) {
        System.out.println(row);
    }

✅ LINQ4J allows relational-style joins
❌ Java Streams need a hand-written join, either nested loops (quadratic) or a lookup map built from one side
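To show what the hand-written alternative looks like, here is a minimal sketch of an inner equi-join using only `java.util.stream`. The `Employee` and `Department` classes are hypothetical, shaped after the fields used in the snippet above, and the code assumes department ids are unique (`Collectors.toMap` throws on duplicate keys).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamJoinSketch {

    static class Employee {
        final String name;
        final int departmentId;
        Employee(String name, int departmentId) {
            this.name = name;
            this.departmentId = departmentId;
        }
    }

    static class Department {
        final int id;
        final String departmentName;
        Department(int id, String departmentName) {
            this.id = id;
            this.departmentName = departmentName;
        }
    }

    /** Inner join on department id, using an index of departments instead of nested loops. */
    static List<String> join(List<Employee> employees, List<Department> departments) {
        // Build side: index departments by key once.
        Map<Integer, Department> byId = departments.stream()
                .collect(Collectors.toMap((Department d) -> d.id, Function.identity()));
        // Probe side: look up each employee's department; unmatched employees are dropped.
        return employees.stream()
                .filter(e -> byId.containsKey(e.departmentId))
                .map(e -> e.name + " - " + byId.get(e.departmentId).departmentName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("Alice", 1), new Employee("Bob", 2), new Employee("Carol", 3));
        List<Department> departments = Arrays.asList(
                new Department(1, "Engineering"), new Department(2, "Sales"));
        System.out.println(join(employees, departments));
        // [Alice - Engineering, Bob - Sales]
    }
}
```

This works, but the join strategy is fixed in the code; with a join operator the library is free to choose it.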
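🔹 LINQ4J allows SQL-style `GROUP BY` with aggregation
🔹 Java Streams group through `Collectors.groupingBy()` with downstream collectors, which always runs eagerly as part of a terminal operation

    // DepartmentSalary is a user-defined class with fields departmentId and totalSalary
    employees
        .groupBy(e -> e.departmentId) // Group by department
        .select(g -> new DepartmentSalary(
            g.getKey(),
            // sum is overloaded per numeric type, so the cast selects the int variant
            g.sum((IntegerFunction1<Employee>) e -> e.salary))) // Aggregate sum of salaries
        .forEach(ds -> System.out.println(ds.departmentId + " - " + ds.totalSalary));

✅ LINQ4J natively supports `GROUP BY` with multiple aggregates, and each group is itself an `Enumerable`
❌ Java Streams need collectors (`Collectors.groupingBy`), and further processing of the groups starts a new stream over the resulting map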
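The Streams side of this comparison can be sketched with the standard library. The example below groups a hypothetical `Employee` class by department and uses `Collectors.summarizingInt` as the downstream collector, which yields count, sum, min, and max in one pass; the class and field names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StreamGroupBySketch {

    static class Employee {
        final String name;
        final int departmentId;
        final int salary;
        Employee(String name, int departmentId, int salary) {
            this.name = name;
            this.departmentId = departmentId;
            this.salary = salary;
        }
    }

    /** Groups by department id and aggregates salaries for each group. */
    static Map<Integer, IntSummaryStatistics> salaryStatsByDepartment(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        (Employee e) -> e.departmentId,
                        Collectors.summarizingInt((Employee e) -> e.salary)));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("Alice", 1, 60000),
                new Employee("Bob", 1, 45000),
                new Employee("Carol", 2, 70000));

        Map<Integer, IntSummaryStatistics> stats = salaryStatsByDepartment(employees);
        System.out.println(stats.get(1).getSum()); // 105000
        System.out.println(stats.get(1).getMax()); // 60000
        System.out.println(stats.get(2).getSum()); // 70000
    }
}
```

The result is a fully materialized `Map`, so any further filtering or sorting of the groups means opening a new stream over its entries.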
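🔹 LINQ4J queries written against `Queryable` carry expression trees, which a provider such as Calcite can translate for a backend (for example into SQL)
🔹 Java Streams cannot be translated to SQL

    // Linq4j itself does not emit SQL text; it exposes the query as an expression tree for a provider to translate.
    // getExpression() may return null for a Queryable that merely wraps an in-memory collection.
    Expression tree = myLinqQuery.getExpression();
    System.out.println(tree);

✅ LINQ4J queries can be handed to a provider that executes them on a database
❌ Java Streams only work in-memory (no SQL support)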
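Why an expression tree can be translated while a compiled lambda cannot is easier to see with a toy example. The sketch below is entirely hypothetical: `Expr`, `Column`, `IntConst`, `GreaterThan`, and `selectWhere` are invented for illustration and are not Linq4j or Calcite classes. The point is only that a predicate stored as data can be walked and rendered, here as a SQL string.

```java
class ExpressionTreeSketch {

    /** A node of a tiny, hypothetical expression tree. */
    interface Expr {
        String toSql();
    }

    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
        @Override public String toSql() { return name; }
    }

    static final class IntConst implements Expr {
        final int value;
        IntConst(int value) { this.value = value; }
        @Override public String toSql() { return Integer.toString(value); }
    }

    static final class GreaterThan implements Expr {
        final Expr left;
        final Expr right;
        GreaterThan(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }
        @Override public String toSql() { return left.toSql() + " > " + right.toSql(); }
    }

    /** Renders a filter query by walking the predicate tree. */
    static String selectWhere(String table, Expr predicate) {
        return "SELECT * FROM " + table + " WHERE " + predicate.toSql();
    }

    public static void main(String[] args) {
        // The predicate "salary > 50000" as data, not as compiled code.
        Expr predicate = new GreaterThan(new Column("salary"), new IntConst(50000));
        System.out.println(selectWhere("employees", predicate));
        // SELECT * FROM employees WHERE salary > 50000
    }
}
```

A lambda such as `e -> e.salary > 50000` offers no equivalent of `toSql()`: once compiled, the only thing a caller can do with it is invoke it.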
🔹 LINQ4J supports relational-style set operations
🔹 Java Streams have no dedicated set operators and need them assembled by hand

    Queryable<Employee> set1 = ...;
    Queryable<Employee> set2 = ...;
    Queryable<Employee> unionQuery = set1.union(set2);

✅ LINQ4J supports `UNION`, `INTERSECT`, and `EXCEPT` through `union`, `intersect`, and `except`
❌ Java Streams require manual composition (`concat()` alone is `UNION ALL`; it needs `distinct()` to behave like `UNION DISTINCT`)
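A minimal sketch of how the three set operations could be assembled from the standard library is shown below. The helper names are illustrative, and the code assumes the element type has meaningful `equals` and `hashCode` implementations.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamSetOpsSketch {

    /** UNION DISTINCT: concatenate, then drop duplicates. */
    static <T> List<T> union(List<T> a, List<T> b) {
        return Stream.concat(a.stream(), b.stream())
                .distinct()
                .collect(Collectors.toList());
    }

    /** INTERSECT: keep distinct elements of a that also occur in b. */
    static <T> List<T> intersect(List<T> a, List<T> b) {
        Set<T> inB = new HashSet<>(b);
        return a.stream()
                .filter(inB::contains)
                .distinct()
                .collect(Collectors.toList());
    }

    /** EXCEPT: keep distinct elements of a that do not occur in b. */
    static <T> List<T> except(List<T> a, List<T> b) {
        Set<T> inB = new HashSet<>(b);
        return a.stream()
                .filter(x -> !inB.contains(x))
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 3);
        List<Integer> b = Arrays.asList(3, 4);
        System.out.println(union(a, b));     // [1, 2, 3, 4]
        System.out.println(intersect(a, b)); // [3]
        System.out.println(except(a, b));    // [1, 2]
    }
}
```

Each helper materializes a list, whereas the Linq4j operators return another query that can still be composed.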
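Feature | LINQ4J | Java Stream API |
---|---|---|
Deferred Execution | ✅ Yes | ✅ Yes (lazy until a terminal operation; a stream is single-use) |
Joins (`JOIN` equivalent) | ✅ Yes | ❌ No built-in operator (needs a lookup map or nested loops) |
Grouping with Aggregation (`GROUP BY` equivalent) | ✅ Yes | ❌ Limited (only `Collectors.groupingBy` with downstream collectors, evaluated eagerly) |
SQL Translation | ✅ Possible through `Queryable` and a provider such as Calcite | ❌ No |
Set Operations (`UNION`, `INTERSECT`, `EXCEPT`) | ✅ Yes | ❌ No dedicated operators (`concat` plus `distinct`, or filtering against a `Set`) |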

When to use LINQ4J:
- When working with relational-style data.
- When needing SQL-like joins, grouping, and filtering.
- When integrating with Apache Calcite for query optimization.
- When wanting deferred execution and query composition.