Compilation perf: consider not exhaustively including all query sub-nodes in hash calculation #19859

roji · 2020-02-10T17:39:03Z

Our hash code calculation currently meticulously traverses the entire subtree of a given node. Unlike with equality, this isn't necessary, and may cost us significant compilation time. For example, we could decide to stop hashcode calculation at the first SelectExpression we see.

See also #19860, which is about caching hashcodes and could obviate this (not sure).
See #19737 for a case where exhaustive hash code calculation led to exponential running time.
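A rough illustration of the idea, sketched in Python for brevity (EF Core itself is C#). The hash contract only requires that equal trees produce equal hash codes, so a hash that ignores part of the tree is still valid; it just collides more often. Everything here (`Node`, `full_hash`, `shallow_hash`, the `"Select"` kind) is hypothetical and not EF Core's API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical expression node; the generated __eq__ compares the whole subtree."""
    kind: str
    value: str = ""
    children: Tuple["Node", ...] = ()


def full_hash(node: Node) -> int:
    """Hash that visits every node in the subtree."""
    return hash((node.kind, node.value,
                 tuple(full_hash(c) for c in node.children)))


def shallow_hash(node: Node, stop_at: str = "Select") -> int:
    """Hash that does not descend below the first `stop_at` node it meets."""
    if node.kind == stop_at:
        return hash((node.kind, node.value))
    return hash((node.kind, node.value,
                 tuple(shallow_hash(c, stop_at) for c in node.children)))


def make_tree(column: str) -> Node:
    """Build a small tree whose only variable part sits inside a Select."""
    inner = Node("Select", "Orders", (Node("Column", column),))
    return Node("Binary", "=", (Node("Column", "CustomerId"), inner))


tree_a = make_tree("Id")
tree_a_copy = make_tree("Id")
tree_b = make_tree("Total")  # differs from tree_a only inside the Select
```

Here `tree_a` and `tree_a_copy` are equal and hash identically under both functions. `tree_a` and `tree_b` are not equal, yet `shallow_hash` gives them the same value, so the equality check is what tells them apart. That extra collision is the price paid for the cheaper hash.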

roji · 2020-07-21T12:56:34Z

Related: #21700

sulton-max · 2023-12-23T09:35:13Z

@roji

I was confused about why ExpressionEqualityComparer didn't visit all nodes to compute the hash, which led me to this issue. Can you help with how to implement hashing of a query?

If it's already implemented in the internal infrastructure, could you share references?

I'm struggling to implement my own expression tree visitor. I thought it would be as simple as calling ExpressionEqualityComparer.GetHashCode for each node, but that didn't work, so now I'm trying to handle all LINQ methods one by one.

If there is already a solution for this, it would really help my work. Thank you.

roji · 2023-12-23T10:06:08Z

@sulton-max unless I'm mistaken, ExpressionEqualityComparer does currently visit all nodes to compute the tree hashcode (https://github.com/dotnet/efcore/blob/main/src/EFCore/Query/ExpressionEqualityComparer.cs#L34). Are you seeing a case where it does not visit a node or something?

ranma42 · 2024-08-22T09:42:58Z

> Our hash code calculation currently meticulously traverses the entire subtree of a given node. Unlike with equality, this isn't necessary, and may cost us significant compilation time. For example, we could decide to stop hashcode calculation at the first SelectExpression we see.

This is not 100% clear to me; is the plan something like saying that instead of recursing into a SelectExpression we simply use a constant to represent it?

Assuming that each hash code is computed upon construction, using the sub-expression hash should be about as cheap (no visit, just a memory access vs a constant).
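A minimal sketch of that assumption, in Python for brevity (EF Core itself is C#). The `Expr` class is hypothetical. Each node stores its hash when it is built, so a parent combines its children's stored values instead of visiting their subtrees.

```python
class Expr:
    """Hypothetical tree node, treated as immutable once constructed."""
    __slots__ = ("kind", "children", "_hash")

    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = tuple(children)
        # Hashing the children tuple calls each child's __hash__, which
        # only returns the value that child stored at its own construction.
        self._hash = hash((kind, self.children))

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        # Differing hashes reject quickly; equal hashes still need a full compare.
        return (isinstance(other, Expr)
                and self._hash == other._hash
                and self.kind == other.kind
                and self.children == other.children)


a = Expr("Select", [Expr("Column"), Expr("Table")])
b = Expr("Select", [Expr("Column"), Expr("Table")])
```

With this layout, replacing a sub-expression's hash by a constant would save only a field read, which matches the point that the two options cost about the same.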

roji · 2024-08-22T11:32:33Z

Yeah, on second thought, you're probably right that if we cache the hash code, then there's no real point in doing something here, as it's already super cheap. I'll go ahead and close.

roji · 2024-08-22T11:34:15Z

BTW we could still compute the hash code lazily on first need, rather than upon construction, since it's not certain that we'll actually need the hash code for all expressions. At that point it may be beneficial to not recurse all the way (that was the original thinking here, I think), but it doesn't seem important enough.
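The lazy variant could look roughly like this, again in Python for brevity and with hypothetical names (`LazyExpr`, and the `computed` counter, which exists only to make the behaviour observable). Nothing is hashed at construction; the first request walks down through any children that have not been hashed yet, and later requests reuse the stored value.

```python
class LazyExpr:
    """Hypothetical tree node that computes its hash on first use."""
    computed = 0  # counts real hash computations, for illustration only

    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = tuple(children)
        self._hash = None  # nothing is computed at construction time

    def __hash__(self):
        if self._hash is None:
            LazyExpr.computed += 1
            # Hashing the tuple asks each child for its (possibly cached) hash.
            self._hash = hash((self.kind, self.children))
        return self._hash

    def __eq__(self, other):
        return (isinstance(other, LazyExpr)
                and self.kind == other.kind
                and self.children == other.children)


leaf = LazyExpr("Column")
root = LazyExpr("Select", [leaf])
before = LazyExpr.computed   # 0: building the nodes hashed nothing
hash(root)
hash(root)
after = LazyExpr.computed    # 2: root and leaf, each computed exactly once
```

Because the first request still recurses into unhashed children, this is the case where cutting the traversal short could save work.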

roji added type-enhancement area-perf labels Feb 10, 2020

This was referenced Feb 10, 2020

Compilation perf: consider caching hash codes for immutable query expression node types #19860

Closed

Exponential hash code calculation in ReplacingExpressionVisitor #19737

Closed

ajcvickers added this to the Backlog milestone Feb 10, 2020

ajcvickers added the consider-for-current-release label Feb 10, 2020

ajcvickers assigned roji Feb 10, 2020

ajcvickers added customer-reported and removed customer-reported labels Mar 10, 2020

roji mentioned this issue Jul 21, 2020

Don't ask how! 😢 #21707

Merged

smitpatel removed this from the Backlog milestone Jul 21, 2020

smitpatel removed the consider-for-current-release label Jul 21, 2020

smitpatel unassigned roji Jul 21, 2020

smitpatel added this to the Backlog milestone Jul 21, 2020

roji mentioned this issue Jan 14, 2024

Move all pruning code to the pruner and make it an immutable visitor #32817

Merged

ADNewsom09 mentioned this issue Jun 28, 2024

Improve performance for RelationalCommandCache by adding a hashcode to the Command #34117

Closed

roji mentioned this issue Aug 22, 2024

Optimize SQL expression comparison by caching hash codes (currently probably not needed) #34149

Open

roji closed this as not planned Aug 22, 2024

roji removed this from the Backlog milestone Aug 22, 2024

roji added the closed-no-further-action label Aug 22, 2024
