Skip to content

Linq To DB 6

MaceWindu edited this page May 5, 2024 · 73 revisions

About This Release

In Linq To DB 6 we concentrated on refactoring of existing query parser, which architecture had no significant changes for more than 10 years, leading to multiple issues when you need to add new functionality or fix existing one. With this refactoring we plan to reach following goals:

  • make code more clean and easy to understand;
  • get rid of a lot of shortcuts and workarounds;
  • fix performance issues with old parser implementation;
  • fix long standing issues in query parsing and generation;
  • prepare grounds for future improvements.

Doing such refactorings could introduce regressions, so we urge you to try it early and report any issues you discovered.

API Removals/Changes

As part of work on this release we removed or changed some of existing APIs, which could require changes from you to use new version of Linq To DB.

  • [PR #4332] We removed context cloning functionality, which includes IDataContext.Clone(bool) and IAsyncDbConnection.TryClone() interface methods alongside with ICloneable interface implementation on DataConnection. This code was used by previous versions of Linq To DB as quite ugly workaround for complex queries translation. It was never intended for use by users and we don't plan to introduce any replacements.
  • [PR #4002] We replaced some provider configuration extension methods for DataOptions with extensions having better name and/or parameters. For more details see Unification of Database Configuration API below.
  • [PR #4002] We removed obsoleted APIs from <DB>Tools classes and unified GetDataProvider/CreateDataConnection APIs in those. For more details see Changes to DBTools Classes below.
  • Removed obsoleted T4 scaffold template property GenerateLinqToDBConnectionOptionsConstructors. It was replaced with GenerateDataOptionsConstructors property some time ago.

Changes to GroupBy extensions

For Sql.GroupBy.* methods (Rollup, Cube, GroupingSets) we simplified parameter type from delegate to object, so you will need to update your calls and remove () => from them:

var grouped = from q in query
    // old syntax
    //group q by Sql.GroupBy.Rollup(() => new { q.Id1, q.Id2 })
    // new syntax
    group q by Sql.GroupBy.Rollup(new { q.Id1, q.Id2 })
    into g
    select new
    {
        g.Key.Id1,
        Count = g.Count()
    };

Target Frameworks Review

PR #4370

We performed some cleanup to list of supported frameworks to remove some that are EOL and add new ones. New list of supported TFMs:

  • net462: .NET Framework 4.6.2, which is an oldest suppored .NET Framework version. Replaces net45, net46 and net472 TFMs we shipped with previous releases
  • netstandard2.0: to provide support for out-of-date runtimes and TFMs (netstandard2.1 and netcoreapp3.1 were removed)
  • net6.0: as lowest supported .NET version
  • net8.0: as latest supported .NET version

All listed above TFMs added to all binary nugets except linq2db.Remote.Wcf, which currently supports only .NET Framework.

New features

Helper For Table Functions and Expressions Mapping

Previously client-side implementation of table function or expression required user to use reflection to get instance of MethodInfo for mapped method:

// table function example
[Sql.TableFunction(Name="GetParentByID")]
public ITable<Parent> GetParentByID(int? id)
{
    // to be able to call GetParentByID from non-Expression context
    // we need to provide code, that will return ITable instance
    // for mapped function.
    // This API requires MemberInfo of mapped method as parameter
    var methodInfo = typeof(Functions).GetMethod("GetParentByID", new [] {typeof(int?)})!;
    return _ctx.GetTable<Parent>(this, methodInfo, id);
}

static readonly MethodInfo _methodInfo = /* get generic MethodInfo*/;

// Table expression example for generic method and context passed as parameter
// Generic method requires that we instantiate MemberInfo for concrete generic parameter type
[Sql.TableExpression("{0} {1} WITH (TABLOCK)")]
public static ITable<T> WithTabLock<T>(IDataContext ctx)
    where T : class
{
    return ctx.GetTable<T>(null, _methodInfo.MakeGenericMethod(typeof(T)));
}

To make this code easier we added new IDataContext extenstion method TableFromExpression, which could be used instead:

[Sql.TableFunction(Name="GetParentByID")]
public ITable<Parent> GetParentByID(int? id)
{
    return _ctx.TableFromExpression(() => GetParentByID(id));
}

[Sql.TableExpression("{0} {1} WITH (TABLOCK)")]
public static ITable<T> WithTabLock<T>(IDataContext ctx)
    where T : class
{
    return ctx.TableFromExpression(() => ctx.WithTabLock<T>());
}

We also update table functions scaffolding using CLI and T4 templates to use this new extension (not available in Preview 1).

New Query Parser

This is a huge piece of work that took us (mostly @sdanyliv) couple of years to complete. It introduces a lot of improvements to LINQ handling and query generation and below we will try to mention most of the changes. This work contributes to wast majority of issues, fixed with this release and probably more as we need some time to verify and close issues, reported over years.

Projection handling improvements

A lot improvements made to to track columns in intermediate projections. Some examples are:

  • we now can track Tuple<> instances in projections
  • use of client-side value in projection doesn't produce error in cases when it cannot be converted to SQL. E.g. you add some class instance to intermediate query projection just to be able to select it in final projection later
// instance of local class, which clearly not translatable to SQL
var c = new LocalClass();

var query =
    from p in db.Parent
    // tuple construnctor parameters properly mapped to tuple fields later
    select Tuple.Create(Tuple.Create(p.ParentID, p.Value1, c), Tuple.Create(p.Value1, p.ParentID));

var resultQuery = from q in query
    where q.Item2.Item1 != null
    // Item3 references instance of LocalClass we added to projection above
    select new { q.Item1, q.Item3 };

For final query projection we don't try to translate it to SQL anymore and set it's value on client during materialization. This has several benefits:

  • we don't roundtrip column value to server and back
  • we don't need to generate column typing code for databases that have issues with literals and parameters typing when they don't used in context where type could infered without inspecting parameter type
  • we don't risk to fail/loose data for types which doesn't support lossless round-tripping for some or all values (e.g. NaN floating point number)

Aggregation functions improvements

Related issues:

  • #680 : better support for standard aggregation functions with filters

We finally implement proper translation of standard LINQ aggregate methods (Count, Avg, Min, Max, Sum) with filters to SQL. Old implementation converted such aggregates to scalar subqueries with filter moved to sub-query filter. As workaround you could have used aggregate functions, provided by Linq To DB. Note that they are still useful as they provide support for more aggregate functions and support SQL-specific parameters, not available for IEnumerable classes.

from r in db.Address
    group r by r.City
     select
     {
        r.City,
        // this will be converterted to subquery in linq2db 5
        // and to COUNT(BuildingNumber = 1) in this release
        Count1 = r.Count(r => r.BuildingNumber == 1),
        // workaround with linq2db-provided aggregate mapping
        Count2 = r.CountExt(r => r.BuildingNumber == 1)
     }

Improvements to DML queries with OUTPUT/RETURNING clause

Old implementation of this feature had a lot of issues when user tried to return data from complex DML queries resulting in wrong data or even invalid SQL generation. New parser fix them all making this feature first-class citizen.

Note that this doesn't include support for querying to DML outputs, supported by some databases like PostgreSQL. This is a separate feature we didn't implemented yet.

Improvements to MERGE queries

Not many noticable changes here as it already worked quite well. We've made improvements to associations handling to offload them to source query in more cases instead of using sub-queries in operations clause (which was not supported by some databases too).

Improvements to CTE (Common Table Expression) clause

Related issues:

  • #1644: Expression is not a table error in CTE
  • #2264: circular reference errors in CTE

Some improvements are:

  • CTE benefits from projections tracking improvements making it possible to use more complex queries in CTE
  • new you can use scalar-typed CTEs without need to declare wrapping object
  • selection of entities in CTE projection without need to define explicit column for each field is now supported
// no need to define wrapping class to store int-typed column anymore
// you can use int or any other scalar type as CTE record type directly
var cteRecursive = db.GetCte<int>(cte =>
    (
        from c in db.Child.Take(1)
        // select whole entity without explicit decomposition
        select new { c.ChildID, c }
    )
    .Concat
    (
        from c in db.Child
        from ct in cte.InnerJoin(ct => ct == c.ChildID + 1)
        select new { c.ChildID + 1, (Child?)null }
    )
    , "MY_CTE");

var result = cteRecursive.ToArray();

Improvements to SET operators translation

Related issues:

  • #2461: support for associations and eager load over SET queries
  • #2511: support for associations and eager load over SET queries
  • #2948: support complex sub-queries in SET queries
  • #3150: support contant columns in SET sub-queries
  • #3346: support projection of composite properties in SET sub-queries

SET operators (UNION, EXCEPT, INTERSECT) support was a big pain point in previous versions of Linq To DB (you can check issues above for some examples) having very limited functionality and a lot of issues with existing functionality.

This release tries to address all those issues:

  • Linq To DB now is not anymore confused by constants in selection list of SET sub-queries leading to wrong results where constant value from one side of SET could appear as value in record, produced by another side of SET operator
  • you can finally use associations and eager load functionality with records, returned by SET operator
  • you can use entities with inheritance mappings with SET operator
  • composite column mappings also work now.

You can find some nice use-case examples in our tests here.

Better non-SQL GroupBy handling

C# GroupBy method allows you to write LINQ query that cannot be directly translated to SQL, because it allows you return data, which is not a part of grouping key clause or aggregate, required for SQL's GROUP BY.

For such queries Linq To DB used very old and ugly hack where it loaded this additional data using extra queries we created copy of current connection, which was already busy with main query, to load extra data for each returned group:

  • cloned connection wasn't attached to main connection transaction, meaning it could have returned (or not) unexpected data
  • each group record produced separate select query leading to infamous n+1 query problem

As a partial "workaround" we had (and still have) GuardGrouping (=true) setting to produce exception for such queries because it is quite easy to write such query without noticing it will not translate well to SQL.

With current release where we migrated such queries handling from legacy approach to eager load mechanism where Linq To DB tries to load requested data effectively with few if not one query.

There are couple of side-effects:

  • as mentioned above we removed connection cloning API used only by old GroupBy implementation
  • there is a behavior change to GuardGrouping guard. We still have it in place in cases you don't want to have unexpected eager load from your queries, but we discovered that old guard implementation was actually faulty and failed to fire in some cases. This means you could start getting exceptions from queries that should have failed before but "worked" due to bad validation.

Changes to order by entity behavior

Previously for queries with code like .OrderBy(entity => entity) we produced ORDER BY over primary keys. From now on it will generate sort over all entity columns. If you used such ordering code, we would recommend to replace it with explicit ordering over specific fields.

Note that such code is higly discouraged anyways as column order (for entities with more than one column or composite primary key) in ORDER BY expression here is not defined in general and could depend on column order in your source code or even on version of used runtime.

Column nullability tracing

We greatly improved tracing of nullability for translated code which could result in some rare cases in removal of unnecessary IS [NOT] NULL predicates (or addition of missing before).

Imrovements to sub-queries and JOINs

Ineffective ORDER BY removal

ORDER BY clause is not generated on sorted sub-query if it doesn't use it actually:

  • sub-query has no paging options
  • sub-query don't use window functions
Better JOINs optimization

We've made some improvements to JOIN optimization logic which helps to detect and remove more unused JOIN clauses.

Wider LATERAL/APPLY JOINs use

Linq To DB uses LATERAL/APPLY joins and window functions to generate more effective SQL in many places for databases that support such functionality. E.g. of such code could be translation of First/Single|OrDefault subqueries in projections.

Required functionality is enabled for supporting databases where it was missing. For more details check notes on specific database support changes below.

Parameters translation changes

We revisited logic for query parameters generation to prefer literals for constant values and parameters for dynamic values:

  • when value is not constant (comes from variable, parameter, field, property or method result) we generate parameters
  • when value is hardcoded constant we try to generate literal
  • when value is fixed for query we also use literal. Good example of such values are paging options for First/Single|OrDefault queries

INSERT improvements

Improved generated SQL for INSERTs which reference other tables for some databases.

UPDATE improvements

  • don't generate unnecessary FROM clause in some cases, where it was generated before.
  • simplify SET clause for simple INSERT to not mention target table UPDATE table SET table.field = value -> UPDATE table SET field = value

DELETE improvements

Fixed issue when Delete applied to query with SelectMany to select association could delete data from wrong table:

// previously this query would generate DELETE for Parent table instead of GrandChildren table
db.GetTable<Parent>()
    .Where     (x => harnessIds.Contains(x.ParentID))
    .SelectMany(x => x.Children)
    .SelectMany(x => x.GrandChildren)
    .Delete();

Other parser improvements and fixes

Improved parsing LINQ methods allowed to fix several issues with LINQ generated by other compilers (e.g. VB.NET and F#):

  • #417: support F# leftOuterJoin LINQ operator translation
  • #649: support VB.NET Group By LINQ operator translation
  • #3699: NotImplementedException translating expressions with Invoke calls. Such expressions could be generated by F# compiler

Some other fixed issues with parser:

  • #3586: cannot be converted to SQL exception for complex query used as filter with Any LINQ method
  • #4284: InvalidCastException when parse complex query

F# Support Changes

PR #4362

As a part of query parser refactoring we decided to move F# support code to separate library which has following benefits:

  • support is not enabled by default, meaning it doesn't affect performance of non-F# projects
  • new library use F# code instead of old approach with reflection and dynamic code where we need to access F#-specific functionality
  • it will allow us to extend F# support in future more easily.

For this release there is no functional changes to F# support except some general fixes to query parsing. We just moved support code to separate library.

If you use Linq To DB with F# code, you shold add reference to linq2db.FSharp nuget and enable F# support in your database context:

using var db = new DataConnection(
  new DataOptions()
    .UseSqlServer(@"Server=.\;Database=Northwind;Trusted_Connection=True;")
    // enables F# Services for connection
    .UseFSharp());

Unification of Database Configuration API

PR 4002

Changes to DataOptions configuration API

To configure database provider for connection context using DataOption we had a set of methods like Use<DB_NAME>(...) with naming and set of parameters and overloads vary per database without any system. To address this issue we:

  • introduced 4 overloads for databases with versioning support (per-provider or/and per-dialect)
  • 2 overloads for databases without versioning (currently it is only SQL CE)
  • added versioning support for databases that lacked it (and plan to add some more before final release)

New API with 4 overloads:

// DB here is database provider name, e.g. MySql

// overloads without version parameters
UseDB(this DataOptions options, Func<DBOptions, DBOptions> optionSetter);
UseDB(this DataOptions options, string connectionString, Func<DBOptions, DBOptions> optionSetter);

// overloads with versioning parameters (dialect/provider)
// database provider could have only one versioning parameter if it doesn't have versioning for second
UseDB(
    this DataOptions                 options,
         DBVersion                   dialect          = DBVersion.AutoDetect,
         DBProvider                  provider         = DBProvider.AutoDetect,
         Func<DBOptions, DBOptions>? optionSetter     = null);
UseDB(
    this DataOptions                 options,
         string                      connectionString,
         DBVersion                   dialect          = DBVersion.AutoDetect,
         DBProvider                  provider         = DBProvider.AutoDetect,
         Func<DBOptions, DBOptions>? optionSetter     = null);

API with 2 overloads:

UseSqlCe(this DataOptions options, Func<DBOptions, DBOptions>? optionSetter = null);
UseSqlCe(this DataOptions options, string connectionString, Func<DBOptions, DBOptions>? optionSetter = null);

For most of cases migration to new API will not require changes to code as we already had required overloads, but if you used removed method, you will need to update call to use new one.

Changes to database/provider version configuration

We adding provider/dialect versioning support for many providers that lacked it before.

New dialect configuration enums:

  • FirebirdVersion
  • MySqlVersion

New provider configuration enums:

  • AccessProvider
  • InformixProvider
  • MySqlProvider
  • SQLiteProvider
  • SapHanaProvider
  • SybaseProvider

We are open for adding more dialects and providers in future. If you want to propose dialect/provider, please fill-in corresponding feature request to issues.

Changes to DBTools classes

Each database provider has utility class <DB>Tools where DB is provider name.

Obsoleted members removal

We are removing bunch of obsoleted methods and properties in those classes (e.g. global provider configuration options and BulkCopy methods) that duplicate existing functionality.

Unification of *Tools API
  • all Tools classes now expose AutoDetectProvider option to globally enable or disable provider/dialect detection logic. This option is missing for SQL CE provider as it doesn't have versioning and thus detection logic. Note that we always recommend to specify provider and dialect explicitly to avoid incorrect detection if you don't need to work with dynamic environment.
  • *Tools.GetDataProvider API signature changed to accept provider version/dialect enumerations where applicable.
  • *Tools.CreateDataConnection API refactored to be 3 methods with
    • string connectionstring/DbConnection/DbTransaction as first parameter
    • provider version/dialect enumerations as second/third parameter where applicable

Example (for SQL Server):

public static class SqlServerTools
{
    // AutoDetectProvider property
    public static bool AutoDetectProvider { get; set; }

    // GetDataProvider API shape
    public static IDataProvider GetDataProvider(
        SqlServerVersion  version          = SqlServerVersion.AutoDetect,
        SqlServerProvider provider         = SqlServerProvider.AutoDetect,
        string?           connectionString = null);

    // CreateDataConnection API shape
    public static DataConnection CreateDataConnection(
        string            connectionString,
        SqlServerVersion  version  = SqlServerVersion.AutoDetect,
        SqlServerProvider provider = SqlServerProvider.AutoDetect);
    public static DataConnection CreateDataConnection(
        DbConnection      connection,
        SqlServerVersion  version  = SqlServerVersion.AutoDetect,
        SqlServerProvider provider = SqlServerProvider.AutoDetect);
    public static DataConnection CreateDataConnection(
        DbTransaction     transaction,
        SqlServerVersion  version  = SqlServerVersion.AutoDetect,
        SqlServerProvider provider = SqlServerProvider.AutoDetect);
}

Other Changes

  • MappingSchema.AddScalarType and MappingSchema.SetDataType methods for value types (structs) now automatically add mapping for nullable version of type so you don't need to use two calls to map both nullable and non-nullable value type. You still can do it if for some reason you want completely different mapping for nullable and non-nullable types (highly discouraged and not supported), but call for nullable type should go after call for non-nullable type to avoid it being overwritten
  • database dialect detection now works without error with configuration that use external DbConnection or DbTransaction instance without connection string specified in configuration
  • various predicate optimization improvements
  • avoid removal of sub-queries with complex columns if those columns used multiple times by outer query to avoid multiple evaluations
  • generate column aliases in final projection where it could confuse database otherwise for some databases
  • prefer to generate CROSS JOIN instead of non-ANSI join syntax FROM Table1 t1, Table2 t2 for databases that support CROSS JOIN for clarity and to avoid unnecessary sub-queries when database doesn't support mixed JOINs in single FROM clause

Database-specific Changes

Access

Not yet

ClickHouse

PR #4452

  • dropped support for old versions (pre-2.2.10) of Octonica provider for TCP protocol. In any case we recommend to use latest version (currently 3.1.3) as one having less compatibility issues with Linq To DB.
  • integer literal not typed explicitly anymore using toInt32(x)

DB2

  • add/improve parameters wrapping into type casts in places where database could fail to type parameter and report typing error

Firebird

Firebird Dialect selection

For a long time we generated Firebird SQL to be compatible with 2.5 release. To improve this situation we introduce FirebirdVersion dialect configuration enumeration which has 4 dialects currently:

  • Firebird 2.5
  • Firebird 3.0
  • Firebird 4.0
  • Firebird 5.0

Please inform us if you need to use older Firebird version (e.g. 2.1) or Dialect 1 databases, so we can add support for them.

Firebird dialect-specific SQL changes

  • 3.0+ : bool type mapped to BOOLEAN type by default. To use old mapping to 1/0 characters, use 2.5 dialect version or specify DataType.Char data type for boolean in your mapping
  • 3.0+ : instead of TAKE/SKIP paging clause Linq To DB will generate OFFSET/FETCH clause
  • 3.0+ : batch size for BulkCopy rised to 10Mb of generated SQL per batch instead of 64Kb for older versions
  • 4.0+ : enabled support for LATERAL JOINs
  • 5.0+ : rised limit of items in single IN predicate from 1 500 to 65 535 items
  • 5.0+ : added support for WHEN NOT MATCHED BY SOURCE MERGE operations
  • 5.0+ : added support for native QUARTER qualifier in DatePart functions

Other Firebird support changes

  • improve parameters wrapping into type casts in places where database could fail to type parameter and report typing error

Informix

Not yet

MySQL/MariaDB

PR #4002

MySQL Dialect selection

To improve SQL generation for modern MySQL and MariaDB we introduce MySqlVersion dialect configuration enumeration which has 3 dialects currently:

  • MySql 5.7 rolling release
  • MySql 8.0 rolling release
  • MariaDB 10/11 rolling release (we don't currently distinguish 10 vs 11 as MariaDB 11 doesn't have changes to SQL yet)

As side-effect it could result in unsupported SQL generation for older MySql versions (pre-5.7). If you need to work with older MySql servers, please inform us, so we can (re)add support for them.

MySQL dialect-specific SQL changes

  • MySQL8/MariaDB : for those dialects we don't generate FROM DUAL fake source anymore for table-less queries with filter
  • MySQL8/MariaDB : conversion to float/double now generates CAST(? as DOUBLE/FLOAT) instead of CAST(? as DECIMAL(..))
  • MySQL8 : enabled support for LATERAL JOINs
  • MySQL8/MariaDB : enabled generation of EXCEPT/INTERSECT ALL/DISTINCT set operators instead of emulated SQL
  • ALL: generation of non-standard FLOAT(N) type for column table creation removed. If you need such type for some reason you can create feature request and use DbType="FLOAT(5)" column mapping as workaround

Oracle

Not yet

PostgreSQL

  • prefer to use :: type cast instead of CAST(...)

SAP HANA

  • PR #4002

  • #3154 : LATERAL JOIN support

  • enabled support for LATERAL JOINs

  • enabled paging support in sub-queries

SQL CE

Not yet

SQLite

  • updated InsertOrUpdate API to generate single INSERT .. ON CONFLICT UPDATE/IGNORE query instead of two separate queries
  • use proper INTEGER storage type instead of Int in type conversions: CAST(smth AS INTEGER)
  • fix DateTime string literal milliseconds generation to emit 3 digits always, even if they are trailing zeroes
  • enable SQL row support in update queries

SQL Server

  • prefer to use CAST over CONVERT where it is possible

SAP/Sybase ASE

Not yet.

Clone this wiki locally