Improving specification evaluation performance #182

Closed
devbased opened this issue Dec 10, 2021 · 9 comments · Fixed by #185

@devbased
Contributor

devbased commented Dec 10, 2021

I don't like the current spec evaluation because it compiles the expressions on every call to Evaluate.
So I have added a benchmark project to make performance measurable, and replaced IEnumerable<(Expression<Func<T, string>>, string, int)> ISpecification<T>.SearchCriterias with IEnumerable<SearchExpressionBase<T>>. There are also two approaches to compiling the expressions: one is the default .NET Compile, and the second is CompileFast from dadhi/FastExpressionCompiler.
With these changes, user code can cache specifications, which not only reduces the number of allocations but also avoids recompilation.

I'm not sure the benchmark is very precise, since I don't have much experience with this kind of thing and I ran it on my home PC; however, anyone can run it themselves to verify the results.
I need to know: should I continue and do the same for the other expressions or not?

Long story short, here are the code and the benchmark results:

public interface ISpecification<T>
{
    // other members omitted for brevity
    IEnumerable<SearchExpressionBase<T>> SearchCriterias { get; }
}

public abstract class SearchExpressionBase<T>
{
    protected SearchExpressionBase(Expression<Func<T, string>> source, string searchTerm, int searchGroup = 1)
    {
        this.Source = source;
        this.SearchTerm = searchTerm;
        this.SearchGroup = searchGroup;
    }

    public Expression<Func<T, string>> Source { get; }

    public string SearchTerm { get; }

    public int SearchGroup { get; }

    public abstract Func<T, string> SourceFunc { get; }
}

public sealed class SearchExpression<T> : SearchExpressionBase<T>
{
    private readonly Lazy<Func<T, string>> sourceFuncLazy;

    public SearchExpression(Expression<Func<T, string>> source, string searchTerm, int searchGroup = 1) : base(source, searchTerm, searchGroup)
    {
        this.sourceFuncLazy = new Lazy<Func<T, string>>(this.Source.Compile);
    }

    public override Func<T, string> SourceFunc => this.sourceFuncLazy.Value;
}

public sealed class SearchExpressionFast<T> : SearchExpressionBase<T>
{
    private readonly Lazy<Func<T, string>> sourceFuncLazy;

    public SearchExpressionFast(Expression<Func<T, string>> source, string searchTerm, int searchGroup = 1) : base(source, searchTerm, searchGroup)
    {
        this.sourceFuncLazy = new Lazy<Func<T, string>>(() => this.Source.CompileFast());
    }

    public override Func<T, string> SourceFunc => this.sourceFuncLazy.Value;
}

public static ISpecificationBuilder<T> Search<T>(
    this ISpecificationBuilder<T> specificationBuilder,
    Expression<Func<T, string>> selector,
    string searchTerm,
    int searchGroup = 1) where T : class
{
    ((List<SearchExpressionBase<T>>)specificationBuilder.Specification.SearchCriterias)
        .Add(new SearchExpression<T>(selector, searchTerm, searchGroup));

    return specificationBuilder;
}

public static ISpecificationBuilder<T> SearchFast<T>(
    this ISpecificationBuilder<T> specificationBuilder,
    Expression<Func<T, string>> selector,
    string searchTerm,
    int searchGroup = 1) where T : class
{
    ((List<SearchExpressionBase<T>>)specificationBuilder.Specification.SearchCriterias)
        .Add(new SearchExpressionFast<T>(selector, searchTerm, searchGroup));

    return specificationBuilder;
}

public class SearchEvaluator : IInMemoryEvaluator
{
    private SearchEvaluator() { }
    public static SearchEvaluator Instance { get; } = new SearchEvaluator();

    public IEnumerable<T> Evaluate<T>(IEnumerable<T> query, ISpecification<T> specification)
    {
        foreach (var searchGroup in specification.SearchCriterias.GroupBy(x => x.SearchGroup))
        {
            var criterias = searchGroup.Select(x => (x.SourceFunc, x.SearchTerm));

            query = query.Where(x => criterias.Any(c => c.SourceFunc(x).Like(c.SearchTerm)));
        }

        return query;
    }
}

[MemoryDiagnoser, MedianColumn, RankColumn, CsvExporter]
public class InMemorySearchEvaluatorBenchmark
{
    private SearchEvaluator evaluator;
    private Consumer consumer;
    private IEnumerable<string> data;
    private TestSpecification specification;
    private TestSpecificationFast specificationFast;

    [Params(1, 10, 100, 1000)]
    public int RepeatCount;

    [GlobalSetup]
    public void GlobalSetup()
    {
        this.evaluator = SearchEvaluator.Instance;
        this.data = Enumerable.Range(1, 247).Select(x => $"Test {x % 124} data.");
        this.specification = new TestSpecification();
        this.specificationFast = new TestSpecificationFast();
        this.consumer = new Consumer();
    }

    [Benchmark]
    public void InMemorySearchEvaluator_Evaluate()
    {
        for (var i = 0; i < this.RepeatCount; ++i)
        {
            this.evaluator.Evaluate(this.data, this.specification).Consume(this.consumer);
        }
    }

    [Benchmark]
    public void InMemorySearchEvaluator_EvaluateFast()
    {
        for (var i = 0; i < this.RepeatCount; ++i)
        {
            this.evaluator.Evaluate(this.data, this.specificationFast).Consume(this.consumer);
        }
    }

    private sealed class TestSpecification : Specification<string>
    {
        public TestSpecification()
        {
            this.Query.Search(x => x, "%123%");
        }
    }

    private sealed class TestSpecificationFast : Specification<string>
    {
        public TestSpecificationFast()
        {
            this.Query.SearchFast(x => x, "%123%");
        }
    }
}

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19042.1165 (20H2/October2020Update)
Intel Core i5-9600K CPU 3.70GHz (Coffee Lake), 1 CPU, 6 logical and 6 physical cores
.NET SDK=6.0.100
  [Host]     : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT
  DefaultJob : .NET 6.0.0 (6.0.21.52210), X64 RyuJIT

Before

| Method | RepeatCount | Mean | Error | StdDev | Median | Rank | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| InMemorySearchEvaluator_Evaluate | 1 | 7.258 ms | 0.1064 ms | 0.0943 ms | 7.246 ms | 1 | 226.5625 | 109.3750 | 7.8125 | 1 MB |
| InMemorySearchEvaluator_Evaluate | 10 | 72.422 ms | 1.0351 ms | 0.9175 ms | 72.293 ms | 2 | 2285.7143 | 1142.8571 | - | 10 MB |
| InMemorySearchEvaluator_Evaluate | 100 | 738.127 ms | 14.2779 ms | 16.9969 ms | 732.809 ms | 3 | 23000.0000 | 11000.0000 | - | 105 MB |
| InMemorySearchEvaluator_Evaluate | 1000 | 7,304.372 ms | 69.1068 ms | 64.6425 ms | 7,320.962 ms | 4 | 234000.0000 | 117000.0000 | 10000.0000 | 1,049 MB |

After

| Method | RepeatCount | Mean | Error | StdDev | Median | Rank | Gen 0 | Allocated |
|---|---|---|---|---|---|---|---|---|
| InMemorySearchEvaluator_Evaluate | 1 | 104.2 μs | 0.92 μs | 0.77 μs | 103.7 μs | 1 | 14.4043 | 67 KB |
| InMemorySearchEvaluator_EvaluateFast | 1 | 103.8 μs | 0.51 μs | 0.48 μs | 103.7 μs | 1 | 14.4043 | 67 KB |
| InMemorySearchEvaluator_Evaluate | 10 | 1,015.9 μs | 12.42 μs | 11.62 μs | 1,016.4 μs | 2 | 144.5313 | 665 KB |
| InMemorySearchEvaluator_EvaluateFast | 10 | 1,035.8 μs | 16.52 μs | 15.45 μs | 1,037.6 μs | 3 | 144.5313 | 665 KB |
| InMemorySearchEvaluator_Evaluate | 100 | 10,844.6 μs | 215.23 μs | 376.97 μs | 10,984.3 μs | 5 | 1437.5000 | 6,651 KB |
| InMemorySearchEvaluator_EvaluateFast | 100 | 10,265.1 μs | 121.38 μs | 113.54 μs | 10,246.4 μs | 4 | 1437.5000 | 6,651 KB |
| InMemorySearchEvaluator_Evaluate | 1000 | 103,440.0 μs | 1,181.59 μs | 1,105.26 μs | 103,007.3 μs | 6 | 14400.0000 | 66,509 KB |
| InMemorySearchEvaluator_EvaluateFast | 1000 | 110,561.6 μs | 840.64 μs | 786.33 μs | 110,465.3 μs | 7 | 14400.0000 | 66,509 KB |

@fiseni
Collaborator

fiseni commented Dec 10, 2021

Hey @devbased,

First of all, thanks for your interest in doing this. This relates only to the in-memory evaluators (in this case, you demonstrated it for the search evaluator).
For clarity, let's analyze what we have right now. We keep the state in the specification either as simple values or as Expressions. Then, we have the following types of evaluators:

  1. SpecificationEvaluator - It evaluates the specification state for the given persistence plugin package. In the case of Entity Framework, we start with the DbSet and build the IQueryable to its final form. We don't compile the expressions in this case; we pass them in their original form. This is the main use case of the package (probably 95% of the cases). We use specifications to form the queries for data access.
  2. InMemorySpecificationEvaluator - This feature was added as an extra: why not use the same specs to evaluate in-memory collections? But this is somewhat niche usage, and I'm not sure how often users rely on it. In this case, we need to compile the expressions, and this is done during the evaluation. We didn't bother too much with caching them, since specs are rather dynamic constructs. You create the required specs on each request, pass some values/parameters during construction, and get the results. It's rarely the case that you will use the same spec multiple times within the same request, right? If so, caching them brings little benefit. In your benchmarks, you're using the same spec over and over again, and with a predefined search string at that. That's not a very realistic scenario. If you create the spec within your loop and pass different parameters, I assume you'll get roughly the same results as before the change.

Anyhow, this was great work, thank you. I'll analyze it further next week and see how we can best benefit from it. If we don't see any downside to doing this, then why not, we'll implement it.

To be honest, I don't care so much about the in-memory evaluators 😄 We have a bigger performance issue with Include chains. We are forced to keep the state in the form of LambdaExpressions, and then construct the IQueryable through reflection (this is because of the multiple inner return types). I'd be quite happy if anyone has better ideas on that subject. That would be a really great improvement.

@devbased
Contributor Author

Good point. But I think that if the library is well structured and optimized for most use cases, it will encourage people to stick to this approach. Like you said, the user basically has a bunch of specifications and is free to use them with any collection, whether in-memory or persisted. They could be cached across requests; it's up to the user to decide whether to keep a spec instance somewhere or not. But for now, even if they keep it, it won't gain them much.

About the include lambdas, I'm not sure I understand where the heavy part is. typeof() is lightweight and EF Core internally does the same: it creates the query via a method invocation. Yes, we have 2 CreateQuery calls instead of 1 for each include; that can probably be mitigated by constructing a delegate that directly calls EF Core's extension method, and caching it.

@fiseni
Collaborator

fiseni commented Dec 10, 2021

You're right. Perhaps we have to give that flexibility to the users. Anyway, this will affect only the in-memory evaluations.

Regarding the Include infrastructure, here is the extension IncludeExtensions. I changed it a few times and it's not terribly bad now, but still, I'm not happy that we have to do it this way. We're duplicating some work done in EF, and Expression.Call uses reflection in the background (it's a lightweight reflection, but anyhow would have been nice not to have it).

For context, read this PR #83 and this issue

@fiseni
Collaborator

fiseni commented Dec 11, 2021

Hey @devbased

I just took a look at your proposal. Let's implement it, but with the following considerations:

  • We'll use the default .NET Compile. The difference is negligible, and we don't want the base package to take on additional dependencies.
  • We'll introduce new [Name]ExpressionInfo constructs which will hold all related information. We'll need constructs for Search, Where and Order.
  • You can place them in the "Expressions" folder as you did, but since these will be public to the users, let's keep the root namespace Ardalis.Specification.
  • Move the IncludeExpressionInfo into this newly created folder too.

Proposed implementation:

public interface ISpecification<T>
{
    IEnumerable<SearchExpressionInfo<T>> SearchCriterias { get; }
}

public class SearchExpressionInfo<T>
{
    private readonly Lazy<Func<T, string>> selectorFunc;

    public SearchExpressionInfo(Expression<Func<T, string>> selector, string searchTerm, int searchGroup = 1)
    {
        _ = selector ?? throw new ArgumentNullException(nameof(selector));
        if (string.IsNullOrEmpty(searchTerm)) throw new ArgumentNullException(nameof(searchTerm));

        this.Selector = selector;
        this.SearchTerm = searchTerm;
        this.SearchGroup = searchGroup;

        this.selectorFunc = new Lazy<Func<T, string>>(this.Selector.Compile);
    }

    public Expression<Func<T, string>> Selector { get; }

    public string SearchTerm { get; }

    public int SearchGroup { get; }

    public Func<T, string> SelectorFunc => this.selectorFunc.Value;
}
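
A minimal sketch of what the Lazy<Func<T, string>> field buys, and of the earlier point about specs that are recreated on every request. CountingSearchInfo<T> and LazyCompileDemo are hypothetical names that are not part of the library; the class is a simplified stand-in for the proposal above, with a counter added only so the number of Compile() calls can be observed.

```csharp
using System;
using System.Linq.Expressions;
using System.Threading;

// Hypothetical stand-in for the proposed SearchExpressionInfo<T>.
// The counter exists only for this illustration.
public sealed class CountingSearchInfo<T>
{
    private readonly Lazy<Func<T, string>> selectorFunc;
    private int compileCount;

    public CountingSearchInfo(Expression<Func<T, string>> selector)
    {
        _ = selector ?? throw new ArgumentNullException(nameof(selector));

        // The factory runs at most once per instance (default Lazy<T> mode).
        this.selectorFunc = new Lazy<Func<T, string>>(() =>
        {
            Interlocked.Increment(ref this.compileCount);
            return selector.Compile();
        });
    }

    public int CompileCount => this.compileCount;

    public Func<T, string> SelectorFunc => this.selectorFunc.Value;
}

public static class LazyCompileDemo
{
    // One instance evaluated many times: the expression is compiled once.
    public static int CompilationsWhenReused(int evaluations)
    {
        var info = new CountingSearchInfo<string>(x => x);
        for (var i = 0; i < evaluations; i++)
        {
            _ = info.SelectorFunc("Test 123 data.");
        }

        return info.CompileCount;
    }

    // A fresh instance per evaluation: every instance compiles again.
    public static int CompilationsWhenRecreated(int evaluations)
    {
        var total = 0;
        for (var i = 0; i < evaluations; i++)
        {
            var info = new CountingSearchInfo<string>(x => x);
            _ = info.SelectorFunc("Test 123 data.");
            total += info.CompileCount;
        }

        return total;
    }
}
```

Under these assumptions, LazyCompileDemo.CompilationsWhenReused(1000) returns 1 while LazyCompileDemo.CompilationsWhenRecreated(1000) returns 1000: the cached delegate only pays off when the same instance is evaluated more than once.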

@devbased
Contributor Author

@fiseni Seems good. Should I remove the benchmark project?

@fiseni
Collaborator

fiseni commented Dec 12, 2021

Yes, please remove the benchmark project. We might need something more generic.

fiseni added this to the 6.0 milestone Dec 13, 2021

@fiseni
Collaborator

fiseni commented Dec 13, 2021

Note: This is a breaking change for all users who have been accessing the specification state directly. We should document it in the release notes.

@davidhenley
Contributor

davidhenley commented Mar 2, 2022

For those of us who have been accessing specification state, what is the fix for this breaking change?

This stopped working and throws an invalid cast exception on 6.0.0:

#53 (comment)

public enum SortOrder
{
    ASC,
    DESC
}

public static IOrderedSpecificationBuilder<T> OrderBy<T>(
        this ISpecificationBuilder<T> specificationBuilder,
        Expression<Func<T, object>> orderExpression,
        SortOrder orderBy)
    {
        ((List<(Expression<Func<T, object>> OrderExpression, OrderTypeEnum OrderType)>)specificationBuilder
                .Specification.OrderExpressions)
            .Add(orderBy == SortOrder.DESC
                ? (orderExpression, OrderTypeEnum.OrderByDescending)
                : (orderExpression, OrderTypeEnum.OrderBy));

        var orderedSpecificationBuilder = new OrderedSpecificationBuilder<T>(specificationBuilder.Specification);
        return orderedSpecificationBuilder;
    }

Unable to cast object of type
'System.Collections.Generic.List`1[Ardalis.Specification.OrderExpressionInfo`1[Entities.Customer]]' 
to type 'System.Collections.Generic.List`1[System.ValueTuple`2[System.Linq.Expressions.Expression`1[System.Func`2[Entities.Customer,System.Object]],Ardalis.Specification.OrderTypeEnum]]'.

@fiseni
Collaborator

fiseni commented Mar 2, 2022

You can find the answer in this issue.
