Precompiled queries #25009

Closed
1 task
Tracked by #24903 ...
roji opened this issue Jun 1, 2021 · 6 comments
Labels: area-aot, area-perf, area-query, closed-fixed, type-enhancement


@roji
Member

roji commented Jun 1, 2021

This tracks the precompiled query feature, where EF generates interceptors at publish time to intercept static LINQ query operators and execute the query directly, without going through compilation. This provides:

  • NativeAOT support: precompilation is (currently) a prerequisite for NativeAOT, since we're not (yet) going to make query compilation NativeAOT-compatible (this may be done in 10 to support dynamic queries).
  • Improved query runtime: no more parameter extraction, cache lookup, etc. This is somewhat similar to compiled queries but goes even further, and doesn't require the user to use any special APIs.
  • Reduced startup time: no more EF query compilation.

Note that although this is a prerequisite to NativeAOT support, precompiled queries can be used in non-NativeAOT applications to get the above performance benefits (faster execution and startup time).
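
To make the static/dynamic distinction concrete, here is a minimal sketch. It uses LINQ over an in-memory IQueryable rather than a real DbContext (an assumption made so that no EF packages are needed), and `Blog` and `QueryShapes` are hypothetical names. The first method is the kind of query whose operator chain is fully visible at one call site, which is what a publish-time tool could intercept; the second assembles its shape at runtime.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity type, for illustration only.
public sealed record Blog(int Id, string Name, int Rating);

public static class QueryShapes
{
    // Static query: the whole operator chain (Where -> OrderBy -> ToList) is
    // written as a single expression, so its shape is known from the source alone.
    public static List<Blog> StaticQuery(IQueryable<Blog> blogs, int minRating) =>
        blogs.Where(b => b.Rating >= minRating)
             .OrderBy(b => b.Name)
             .ToList();

    // Dynamic query: operators are appended conditionally, so the final shape
    // depends on runtime values and cannot be read off a single call site.
    public static List<Blog> DynamicQuery(IQueryable<Blog> blogs, int? minRating, bool sortByName)
    {
        IQueryable<Blog> query = blogs;

        if (minRating is int min)
        {
            query = query.Where(b => b.Rating >= min);
        }

        if (sortByName)
        {
            query = query.OrderBy(b => b.Name);
        }

        return query.ToList();
    }
}
```

Both methods return the same results for equivalent inputs; the difference is only in whether the query shape can be determined ahead of time.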

The main subtasks are tracked on the general NativeAOT epic.

  • Integrate precompilation into dotnet publish, and possibly also into dotnet ef for manual precompilation.
    • Note that we'll need to be able to raise warnings from this step (e.g. some query failed compilation, or a dynamic query was detected), and to cause the publish to fail (warnings as errors).

** PREVIOUS DESIGN THOUGHTS **

General design

When a LINQ query is first encountered, EF "compiles" it, producing a code-generated shaper, SQL (for relational databases), etc. This process is both a bit long (increasing startup times), and incompatible with AOT environments (since code generation is used at runtime). While several approaches have been discussed in the past to improve this (e.g. #16496), with the advent of source generators we have some new possibilities. I've done some work on a proof-of-concept source generator which identifies EF queries and precompiles them; the work is far from complete but indicates that the approach is feasible.

In a nutshell, we would:

  1. Identify a query in user source code
    • A first implementation would identify invocations of EF's compiled query API (EF.CompileQuery); this is a trivial and low-risk way to immediately identify EF queries in the user's code.
    • We could later also attempt to precompile regular queries which don't use EF.CompileQuery. This would be an additional step in which we identify DbSets (as member accesses on a DbContext-typed identifier), and then walk up the syntax tree, progressively including methods as long as they accept IQueryable. Once we reach a method which doesn't accept IQueryable (e.g. ToList), we've reached the end of the query to be compiled.
    • Dynamically-constructed queries wouldn't be supported.
  2. Transform the query to a LINQ expression tree
    • Once we have a Roslyn syntax tree representing a query (either from EF.CompileQuery or from a regular query), it needs to be transformed into a LINQ expression tree, which is what EF's query pipeline requires.
    • Unlike the Roslyn structures, LINQ expression trees refer to actual .NET types, MemberInfos, etc. We would therefore need to load the user's assembly (from the input compilation given to the source generator), and use reflection to load actual types from it (e.g. entity CLR types). See note on AssemblyLoadContext below.
  3. Compile the query with EF Core
    • Once we have a LINQ expression tree, we need to pass it to EF's query compiler. To do this:
      • We instantiate the user's DbContext type, using the parameterless constructor
      • Extract the IQueryCompiler service from it
      • Invoke the compiler, passing it the LINQ expression tree.
    • The output of this compilation is another LINQ expression tree, which instantiates e.g. a SingleQueryingEnumerable given a QueryContext. This output tree must not contain any compiled elements, e.g. the shaper must be present in non-compiled form. This would require some refactoring of the last parts of the query pipeline.
  4. Generate C# out of the compilation output
    • In the normal flow, the output LINQ expression tree is now compiled to produce a lambda (returning e.g. an enumerable given a QueryContext).
    • In the AOT flow, the expression tree would instead be outputted as C# code into a file emitted by the source generator. This generated code would be invoked by EF as part of startup, and would pre-populate its query cache.
    • This would require writing a component to convert a LINQ expression tree to C# code, possibly passing through a Roslyn syntax tree for maximum flexibility.
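
Steps 2 and 4 both hinge on the difference between a LINQ expression tree and executable code. The sketch below uses only System.Linq.Expressions and is not EF's actual pipeline; `Blog` and `TreeDemo` are hypothetical names. It builds the tree for `b => b.Rating >= minRating` by hand from reflection metadata (roughly what step 2 would have to do starting from Roslyn symbols), and then compiles it at runtime (the "normal flow" in step 4, which is the part that relies on runtime code generation).

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical entity type, for illustration only.
public sealed class Blog
{
    public int Rating { get; set; }
}

public static class TreeDemo
{
    // Builds the expression tree for: b => b.Rating >= minRating
    public static Expression<Func<Blog, bool>> BuildPredicate(int minRating)
    {
        // Expression trees need real reflection objects (Type, PropertyInfo),
        // which is why the user's assembly has to be loaded.
        PropertyInfo rating = typeof(Blog).GetProperty(nameof(Blog.Rating))
            ?? throw new InvalidOperationException("Property not found.");

        ParameterExpression b = Expression.Parameter(typeof(Blog), "b");
        BinaryExpression body = Expression.GreaterThanOrEqual(
            Expression.Property(b, rating),
            Expression.Constant(minRating));

        return Expression.Lambda<Func<Blog, bool>>(body, b);
    }

    // Normal flow: turn the tree into a delegate via runtime code generation.
    public static Func<Blog, bool> CompileAtRuntime(Expression<Func<Blog, bool>> tree) =>
        tree.Compile();
}
```

In the AOT flow described in step 4, the call to `Compile()` is what would be replaced: the tree would be rendered as C# source ahead of time, and that source would be compiled together with the rest of the application.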

The final code added by the source generator would look something like the following:

var selectExpression = ...;

var readColumns = ...;

var relationalCommandCache = new RelationalCommandCache(
    memoryCache,
    querySqlGeneratorFactory,
    relationalParameterBasedSqlProcessorFactory,
    selectExpression,
    readColumns,
    useRelationalNulls: false
);

var shaper = ...;

var enumerable = new SingleQueryingEnumerable<Blog>(
    (RelationalQueryContext)QueryCompilationContext.QueryContextParameter,
    relationalCommandCache,
    shaper,
    typeof(Blog),
    standAloneStateManager: false,
    detailedErrorsEnabled: false,
    threadSafetyChecksEnabled: true);

// Pre-populate EF Core's cache with the above enumerable

Additional notes:

  • The above does not cover relational command caching (including SQL), which depends on parameter nullability. This means that some query compilation still remains at runtime (but no code generation).
  • We may be able to reuse previously-precompiled queries if their source file hasn't changed (e.g. store file hashes). This would make this feature suitable also for speeding up the developer inner loop.
  • Query precompilation isn't necessarily dependent on using compiled models (Reduce EF Core application startup time via compiled models #1906), though using that would speed the process up.
  • This could be helpful (thanks @bricelam)
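
The first note above can be illustrated with a deliberately simplified sketch. This is not EF's `RelationalCommandCache`; every name here is hypothetical, and the SQL strings are made up. Because `Name = NULL` never matches in SQL, a comparison against a null parameter has to be rendered as `IS NULL`, so the final SQL text can only be chosen once the parameter values are known, and the cache is keyed on nullability.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-in for a per-query command cache; not thread-safe.
public static class NullabilityKeyedSqlCache
{
    // One cached SQL string per nullability of the single parameter.
    private static readonly Dictionary<bool, string> Cache = new();

    // Counts how often SQL actually had to be generated (cache misses).
    public static int GenerationCount { get; private set; }

    public static string GetSql(string? nameParameter)
    {
        bool isNull = nameParameter is null;

        if (!Cache.TryGetValue(isNull, out string? sql))
        {
            // This is the part that still happens at runtime: it needs the
            // parameter value, but it involves no code generation.
            GenerationCount++;
            sql = isNull
                ? "SELECT Id, Name FROM Blogs WHERE Name IS NULL"
                : "SELECT Id, Name FROM Blogs WHERE Name = @name";
            Cache[isNull] = sql;
        }

        return sql;
    }
}
```

Two calls with different non-null values share one cache entry, while the first call with a null value adds a second entry.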

EDIT: Following internal discussion it has become clear that doing this as a source generator isn't practical (see #25009 (comment) below). Instead, this would be a design-time CLI command or similar.

  • This would most likely be opt-in-only (via a csproj property), and probably makes most sense in Release builds.
  • When loading user assemblies (and their dependencies), we probably want to isolate them in their own AssemblyLoadContext. This isn't trivial: we need to take the Roslyn-provided syntax trees and semantic models (default assembly loader), transform them into an expression tree, and pass that into the query pipeline isolated inside the special AssemblyLoadContext. In my prototype, the default AssemblyLoadContext is used to avoid these issues.
@AndriySvyryd
Member

Alternatively, instead of implementing this as a source generator, it could be a design-time tool that uses Roslyn and avoids the issues related to loading user assemblies and resolving types used in queries.

@roji
Member Author

roji commented Jun 3, 2021

@AndriySvyryd yeah. To sum up an internal conversation with @jaredpar:

  • Running user code from a source generator (i.e. in the compiler process) could cause severe build perf issues if the user code does something bad (e.g. hang VS). This could be considered a bit less risky if the feature is opt-in, but the potential for trouble is still very big.
  • Loading a user assembly from a source generator would probably not work, since the compiler process is frequently still on .NET Framework (i.e. when running in VS), but the user assembly is .NET Core.
  • Ordering issues could make the EF source generator run before another source generator; if that other source generator is necessary in order to produce a working assembly (e.g. produce some required partial method), then the EF source generator would see a non-compiling Compilation, and cannot run any user code in it.

So yeah, we'll probably go with a design-time tool (e.g. CLI command). The general plan outlined above should still apply to that (and the need for isolating the user assemblies is no longer relevant).

@AndriySvyryd
Member

To improve trimming, we can define a feature switch for query compilation to allow all that code to be trimmed if all queries are precompiled.

@roji
Member Author

roji commented Nov 3, 2022

Another relevant doc: https://github.com/dotnet/designs/blob/main/accepted/2020/feature-switch.md

I wonder if we can automatically trim that code when publishing AOT, without requiring another gesture from the user - if possible, we should avoid having too many knobs and switches which users have to set to get to the right place.
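
For context, a feature switch in the sense of the linked design doc is usually a boolean AppContext switch. The sketch below shows only the runtime side, reading the switch with `AppContext.TryGetSwitch`. The switch name `Example.Query.RuntimeCompilationEnabled` and the `QueryCompilationFeature` class are made up for illustration and are not EF's actual switch or API.

```csharp
using System;

public static class QueryCompilationFeature
{
    // Hypothetical switch name, for illustration only.
    public const string SwitchName = "Example.Query.RuntimeCompilationEnabled";

    // Runtime compilation stays enabled unless the switch is explicitly set to false.
    public static bool IsEnabled =>
        !AppContext.TryGetSwitch(SwitchName, out bool enabled) || enabled;

    // Stand-in for the entry point into runtime query compilation.
    public static string CompileOrThrow(string queryId)
    {
        if (!IsEnabled)
        {
            throw new NotSupportedException(
                $"Query '{queryId}' was not precompiled and runtime compilation is disabled.");
        }

        // Placeholder for the real compilation work guarded by the switch.
        return $"compiled:{queryId}";
    }
}
```

For the guarded code to actually be trimmed, the switch would also have to be presented to the trimmer as a constant, as the design doc describes; that part is not shown here.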

@roji roji changed the title from "AOT query mode: precompiled queries" to "AOT query mode: precompiled queries (1st part of the query pipeline)" Dec 4, 2022
@roji
Member Author

roji commented Dec 4, 2022

Note that this includes AOT execution of the first part of the query pipeline; this is the part that generates the materializer (and so uses code generation), and has no access to parameter values. It notably does not include SQL generation (that's #29753).

maumar pushed a commit that referenced this issue Mar 28, 2024
maumar pushed a commit that referenced this issue Mar 29, 2024
Fix materializer code to replace non-primitive constants with liftable constants.
Precompilation is opt-in in QueryCompilationContext, switched on for relational, off for everything else.

Part of #25009
maumar added a commit that referenced this issue Apr 12, 2024
…#33351)

Fix materializer code to replace non-primitive constants with liftable constants.
Precompilation is opt-in in QueryCompilationContext, switched on for relational, off for everything else.

Architecture, design and initial implementation done by @roji, polish done by @maumar

Part of #25009
@roji
Member Author

roji commented Jul 12, 2024

I'm closing this as complete for 9. We certainly have remaining issues and improvements to work on, but we can track those separately.

@roji roji closed this as completed Jul 12, 2024
@ajcvickers ajcvickers added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Aug 21, 2024
@ajcvickers ajcvickers modified the milestones: 9.0.0, 9.0.0-preview7 Aug 21, 2024
@roji roji modified the milestones: 9.0.0-preview7, 9.0.0 Oct 12, 2024