-
-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restore ability to use null cache #194
Restore ability to use null cache #194
Conversation
…orker could have processed the request while we were waiting for the lock, so recheck the cache
…s always called from outside the lock.
…ng "var" usages per ImageSharp style guidelines
Codecov Report
@@ Coverage Diff @@
## master #194 +/- ##
==========================================
- Coverage 84.80% 84.60% -0.21%
==========================================
Files 50 55 +5
Lines 1448 1539 +91
Branches 199 228 +29
==========================================
+ Hits 1228 1302 +74
- Misses 165 181 +16
- Partials 55 56 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
@kroymann I'm struggling to find the time to benchmark this so if you have the opportunity please do. |
I had a quick look this morning. The results are both very promising, and suspicously close. It either indicates that the LRU cache is blocking any meaningful test of this piece of functionality (I don't think so, but need to have a longer look at code to say for sure), or both locking arrangements are having no meaningful impact on performance, and we've at the limit of what we can serve (i.e. limited by disk speed, or the framework itself). I'm tending to the second, as the most meaningful change I could make to impact rps was to set the samples project logging to "Warning" instead of "Debug". (basically drop all the expensive logging) Need to find some time to look at the code changes, and do another run with the samples project updated to .NET 6. |
Thanks @deanmarcussen I think it might be an idea to try both this and master with a much reduced timespan for LRU cache. It's sitting at 5 minutes just now which will be interfering with results. It would be amazing if you could document the setup you use to run the benchmark as I had a go at reading the Crank docs again and felt a bit overwhelmed. A .NET 6 run would definitely be a good idea. |
this.RefCount = refCount; | ||
} | ||
|
||
public bool Equals( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am getting error CS8767: Nullability of reference types in type of parameter 'other' of 'bool RefCountedValue.Equals(RefCountedValue other)' doesn't match implicitly implemented member 'bool IEquatable<RefCountedValue>.Equals(RefCountedValue other)' (possibly because of nullability attributes)
when running the benchmarks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird. This compiles cleanly in both netcoreapp2.1 and 3.1 for me, and I'm able to run the benchmarks in both TFMs as well....? @sebastienros Anything special about how you're trying to run the benchmarks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be more precise, using crank, doing a web load benchmark independent from BDN. It works fine on master though. Full stack:
Command:
dotnet publish ImageSharp.Web.Sample.csproj -c Release -o /tmp/benchmarks-agent/benchmarks-server-1/2ns2okc1.ov0/ImageSharp.Web/samples/ImageSharp.Web.Sample/published /p:MicrosoftNETCoreAppPackageVersion=3.1.21 /p:MicrosoftAspNetCoreAppPackageVersion=3.1.21 /p:MicrosoftNETCoreApp31PackageVersion=3.1.21 /p:MicrosoftNETPlatformLibrary=Microsoft.NETCore.App /p:RestoreNoCache=true --framework netcoreapp3.1 --self-contained -r linux-x64
Microsoft (R) Build Engine version 16.7.2+b60ddb6f4 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.
Determining projects to restore...
Restored /tmp/benchmarks-agent/benchmarks-server-1/2ns2okc1.ov0/ImageSharp.Web/src/ImageSharp.Web/ImageSharp.Web.csproj (in 297 ms).
Restored /tmp/benchmarks-agent/benchmarks-server-1/2ns2okc1.ov0/ImageSharp.Web/samples/ImageSharp.Web.Sample/ImageSharp.Web.Sample.csproj (in 297 ms).
Synchronization/RefCountedConcurrentDictionary.cs(229,25): error CS8767: Nullability of reference types in type of parameter 'other' of 'bool RefCountedValue.Equals(RefCountedValue other)' doesn't match implicitly implemented member 'bool IEquatable<RefCountedValue>.Equals(RefCountedValue other)' (possibly because of nullability attributes). [/tmp/benchmarks-agent/benchmarks-server-1/2ns2okc1.ov0/ImageSharp.Web/src/ImageSharp.Web/ImageSharp.Web.csproj]
Exit code: 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just pushed a small change to this code that adjusts the nullability attributes when compiling with net5.0 or higher. This is preemptively getting ahead of any update to this codebase to target net6.0, and maybe it will address whatever issue you hit running benchmarks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With your changes it builds on net6.0. At least it's unblocking me.
/// <summary> | ||
/// Simple immutable tuple that combines a <typeparamref name="TValue"/> instance with a ref count integer. | ||
/// </summary> | ||
private class RefCountedValue : IEquatable<RefCountedValue> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use struct records here instead ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I benchmarked this using both class and struct for this type and determined that using a class executes more quickly and allocates less memory. I believe this happens because ConcurrentDictionary
can use an optimized code path that leverages atomic writes when TValue is a class, but has to fall back on a less efficient path that allocates when TValue is a struct.
| Method | Mean | Error | StdDev | Gen 0 | Allocated |
|-------------------------------- |----------:|---------:|---------:|-------:|----------:|
| Class_GetAndReleaseNewKey | 129.92 ns | 0.554 ns | 0.518 ns | 0.0095 | 80 B |
| Struct_GetAndReleaseNewKey | 147.59 ns | 0.687 ns | 0.573 ns | 0.0067 | 56 B |
| Class_GetAndReleaseExistingKey | 142.32 ns | 1.159 ns | 1.027 ns | 0.0076 | 64 B |
| Struct_GetAndReleaseExistingKey | 177.69 ns | 0.813 ns | 0.721 ns | 0.0134 | 112 B |
| Class_GetExistingKey | 69.75 ns | 0.301 ns | 0.267 ns | 0.0038 | 32 B |
| Struct_GetExistingKey | 89.81 ns | 0.682 ns | 0.638 ns | 0.0067 | 56 B |
I have not yet experimented with using record types (in part because this codebase is still targeting netcoreapp2.1 and 3.1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave changing target frameworks to V2. I don't want your hard work delayed by our upstream work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not a struct record, tuples would still be good while keeping the code simpler. But I don't know about tfm requirements either, so you'll decide.
With 10 concurrent clients, 15s warmup and 15s measurement, on a resized url but without any cache header (etag, ...) so all request return the file.
|
@sebastienros Thanks for the numbers. I can see a bit of give and take but otherwise fairly even. For the additional feature I think the change is worth it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get this merged in. Perf is comparable and we can always iterate further.
Prerequisites
Description
Per the discussion in #193, this PR restores the ability to effectively disable the caching logic through the use of a "NullCache". The most important part of this change is restoring the ability to stream the response directly from the processed image output, which is how it worked before v1.0.3. In v1.0.3, the
WriterWorkers
pattern was introduced, and a side effect of that change was that the response was now always streamed from the cache, which broke the ability to use a NullCache. To address this, I removed theReadWorkers
/WriterWorkers
stuff and replaced it with a more standard reader/writer locking pattern using a synchronization library ported from my company's codebase. This made it so that the processed image stream could remain available beyond the section of code protected inside the writer lock, and thus be available for use in generating the response.A key sticking point here (obviously) is benchmark testing this to see if performance was improved or degraded by this change.