-
-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Abstract cache key creation to ICacheKey #206
Abstract cache key creation to ICacheKey #206
Conversation
Codecov Report
@@ Coverage Diff @@
## master #206 +/- ##
=====================================
Coverage 86% 86%
=====================================
Files 62 68 +6
Lines 1742 1791 +49
Branches 257 262 +5
=====================================
+ Hits 1500 1556 +56
+ Misses 176 167 -9
- Partials 66 68 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
/// <summary> | ||
/// Creates a cache key based on the lowercased request host, path and commands. | ||
/// </summary> | ||
public class CacheKey : ICacheKey |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should only ship with the two implementations as described in #147 (comment) although I like your naming convention better.
UriAbsoluteCacheKey
UriRelativeCacheKey
I can't see a scenario where we would want a case sensitive cache key e.g UriAbsoluteCacheKey
as that opens us up to potential DOS so we should always convert the string to lower invariant.
From a performance perspective we should use string.Create
in conjunction with Span<char>.ToLowerInvarient()
to ensure that string creation is snappy. Urihelper
has optimized code you can use as a reference but we'd have to change a few things to do the required casing work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on the OS (or file system configuration: https://docs.microsoft.com/en-us/windows/wsl/case-sensitivity#modify-case-sensitivity), the path can be case sensitive, which means different casing can resolve to different files and also that wrong casing wouldn't find the original image in the first place... So the cache hash should only be 'normalized' when the file system is case insensitive. I'm not sure if this can be easily determined at runtime, so adding a setting for this would be sensible.
The parsed commands are already case insensitive (or at least the keys), because the default implementation uses an InvariantIgnoreCase
comparer (and gets them from the query string, which isn't case sensitive according to the RFCs). Although it ultimately depends on the processor, command values should also be case insensitive (enum parsing already ignores casing). It would be good to ensure this value is always lowercased, so changes in the command casing won't cause different cache hashes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Case sensitivity of file requests is dealt with via individual provider implementations. They don’t use the key to return the source image, they use the input uri. If nothing is found, nothing is cached.
Rethinking about this then you're correct but the implementations should be non inheriting since we don't want a virtual method as a default implementation on a fast path. We're already paying a virtualization cost via interfaces.
So, as you say, to cover our backs, your implementations would be:
UriAbsoluteCacheKey
UriRelativeCacheKey
UriAbsoluteCaseInsensitiveCacheKey
UriRelativeCaseInsensitiveCacheKey
You're gonna want custom implementations of the UriHelper
methods for handling the concatenation + casing to avoid duplicate string generation for all scenarios since the caching of commands should, as you've rightly identified, be case insensitive. A processor not using the case insensitive CommandCollection
for command key matching would be considered a bug in that implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's exactly my point: the IImageProvider
implementations correctly deal with the case sensitivity, but the cache key doesn't. So if you have multiple files that only differ in casing, it will only cache 1 generated image (with the metadata next to it), depending on which was requested first, the TLRU cache, file write times and other weird factors:
You can test this yourself by running the sample site in WSL and creating the copies using cp imagesharp-logo.png ImageSharp-logo.png
.
Having a setting to disable lowercasing the cache key would at least make it easy to switch (if you're running into this edge case), but also ensure custom implementations of ICacheKey
are exposed to this setting if it's passed into the Create
method. And it requires less classes/implementations 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't add an option for this for two reasons.
- Each implementation would have to respect that boolean option (complexity).
- Each implementation would have to parse that boolean per request (performance).
Making the case sensitivity a detail of the registered implementation provides us with the most flexibility. Registering an ICacheKey
implementation is no more difficult than setting an option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole point for adding the bool caseSensitive
parameter is to make sure the implementations have to respect that setting (or at least are directly exposed to it).
The same can be said for the uint length
parameter currently on ICacheHash.Create
: all implementations need to ensure they generate a hash of the specified length. Not having a global setting might result in one implementation using a setting to be able to configure it and another always returning a fixed length hash (or just as worse: every implementation using their own setting).
Another very important thing about the CacheKeyCaseSensitive
setting is that the need for enabling it depends on the OS/platform you're running you application on. You don't want to edit code/create separate builds to make your application work correctly for as case sensitive and insensitive file system 😖
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the setting is really a no-go, I'd suggest to only provide the case insensitive relative (default) and absolute implementations.
I can imagine that the above conflicting file casing issues probably don't occur often enough (if at all) to justify all the extra complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can smash out the two other implementations for you if you like. They wouldn’t take me long
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd very much like that, as I'm not yet familiar how to best work with string.Create
and Span<char>
😉
I did create some additional tests and benchmarks, so it should be easy for you to validate the implementations 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the previous cache key creation code as baseline benchmark and the UriAbsoluteCacheKey
implementation is still faster and allocates less (even though that also includes the request scheme, thus creating a longer cache key):
ImageSharp.Web/tests/ImageSharp.Web.Benchmarks/Caching/CacheKeyBenchmarks.cs
Lines 35 to 46 in 8c966ad
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000 | |
Intel Core i7-10750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores | |
.NET SDK=6.0.101 | |
[Host] : .NET Core 3.1.22 (CoreCLR 4.700.21.56803, CoreFX 4.700.21.57101), X64 RyuJIT | |
DefaultJob : .NET Core 3.1.22 (CoreCLR 4.700.21.56803, CoreFX 4.700.21.57101), X64 RyuJIT | |
| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Allocated | | |
|-------------------- |---------:|--------:|--------:|------:|--------:|-------:|----------:| | |
| Baseline | 497.4 ns | 8.26 ns | 7.33 ns | 1.00 | 0.00 | 0.1106 | 696 B | | |
| UriRelativeCacheKey | 270.9 ns | 5.47 ns | 7.11 ns | 0.55 | 0.02 | 0.0587 | 368 B | | |
| UriAbsoluteCacheKey | 447.4 ns | 3.38 ns | 3.16 ns | 0.90 | 0.01 | 0.0939 | 592 B | |
I'm sure you're able to make it even more performant by using string.Create
/Span<char>
directly, but the UriHelper
methods are at least already outperforming the previous StringBuilder
implementation 👍🏻
| UriAbsoluteCacheKey | 490.3 ns | 5.14 ns | 4.80 ns | 0.74 | 0.02 | 0.0763 | 320 B | | ||
| UriAbsoluteCaseInsensitiveCacheKey | 475.4 ns | 4.18 ns | 3.71 ns | 0.71 | 0.02 | 0.0763 | 320 B | | ||
*/ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ronaldbarendse I added the new benchmarks here. Allocations are lower in the new version so would be interested to see what the numbers are on your more powerful computer.
For reference, here's your previous numbers.
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|
Baseline | 497.4 ns | 8.26 ns | 7.33 ns | 1.00 | 0.00 | 0.1106 | 696 B |
UriRelativeCacheKey | 270.9 ns | 5.47 ns | 7.11 ns | 0.55 | 0.02 | 0.0587 | 368 B |
UriAbsoluteCacheKey | 447.4 ns | 3.38 ns | 3.16 ns | 0.90 | 0.01 | 0.0939 | 592 B |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the previous implementations always lowercased the key, the speed is comparable for the relative and a bit faster for the absolute URI, but it indeed allocates less (although that's also partially expected, because the new ones don't include the scheme) 🎉
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|
LegacyV1CacheKey | 504.2 ns | 3.22 ns | 3.02 ns | 1.00 | 0.00 | 0.1106 | 696 B |
UriRelativeCacheKey | 291.8 ns | 5.83 ns | 7.17 ns | 0.58 | 0.02 | 0.0467 | 296 B |
UriRelativeCaseInsensitiveCacheKey | 277.5 ns | 1.58 ns | 1.32 ns | 0.55 | 0.00 | 0.0467 | 296 B |
UriAbsoluteCacheKey | 371.6 ns | 4.07 ns | 3.81 ns | 0.74 | 0.01 | 0.0505 | 320 B |
UriAbsoluteCaseInsensitiveCacheKey | 372.7 ns | 2.91 ns | 2.72 ns | 0.74 | 0.01 | 0.0505 | 320 B |
I've written my optimized keygen code and added a legacy version to allow backwards compatibility (turns out the legacy was adding an extra slash) think this is good to go. Thanks for your help here, this is a really great addition to the library! I'll let you have a final read through before merging as I want to make sure I haven't missed anything. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some comments: the main thing to currently change, would be the default implementation (that should be the lowercased cache key).
Otherwise, the only remaining thing might be naming: I think we should rename UriRelativeCaseInsensitiveCacheKey
to UriRelativeLowerInvariantCacheKey
, as that's what it actually does (case insensitive would be more appropriate when comparing, right?).
| UriAbsoluteCacheKey | 490.3 ns | 5.14 ns | 4.80 ns | 0.74 | 0.02 | 0.0763 | 320 B | | ||
| UriAbsoluteCaseInsensitiveCacheKey | 475.4 ns | 4.18 ns | 3.71 ns | 0.71 | 0.02 | 0.0763 | 320 B | | ||
*/ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the previous implementations always lowercased the key, the speed is comparable for the relative and a bit faster for the absolute URI, but it indeed allocates less (although that's also partially expected, because the new ones don't include the scheme) 🎉
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|
LegacyV1CacheKey | 504.2 ns | 3.22 ns | 3.02 ns | 1.00 | 0.00 | 0.1106 | 696 B |
UriRelativeCacheKey | 291.8 ns | 5.83 ns | 7.17 ns | 0.58 | 0.02 | 0.0467 | 296 B |
UriRelativeCaseInsensitiveCacheKey | 277.5 ns | 1.58 ns | 1.32 ns | 0.55 | 0.00 | 0.0467 | 296 B |
UriAbsoluteCacheKey | 371.6 ns | 4.07 ns | 3.81 ns | 0.74 | 0.01 | 0.0505 | 320 B |
UriAbsoluteCaseInsensitiveCacheKey | 372.7 ns | 2.91 ns | 2.72 ns | 0.74 | 0.01 | 0.0505 | 320 B |
src/ImageSharp.Web/Caching/UriAbsoluteCaseInsensitiveCacheKey.cs
Outdated
Show resolved
Hide resolved
src/ImageSharp.Web/DependencyInjection/ServiceCollectionExtensions.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Ronald Barendse <[email protected]>
Co-authored-by: Ronald Barendse <[email protected]>
…ions.cs Co-authored-by: Ronald Barendse <[email protected]>
Naming is hard... I chose CaseInsensitive since it described the operational behavior of the key rather than the means that it uses. I'm still torn between the two but leaning towards your convention. |
I'll fix the build in a mo. |
OK.... This time we're done. |
Prerequisites
Description
This implements the feature discussed in #145 and is related to PR #147 that uses a setting to ignore the hostname.
To summarize: both
IRequestParser
andIImageProvider
can return different source/processed images based on whatever they extract from theHttpContext
, but theImageSharpMiddleware
always caches the image based on the lowercased request host, path and parsed commands.This leads to creating cache keys that are either:
www.
prefix and domain root). This will generate and cache 2 images forexample.com/imagesharp-logo.png
andwww.example.com/imagesharp-logo.png
, which is a waste of processing (CPU/memory) and disk resources (there's also no cache trimming, so these won't get automatically cleaned up!).The first commit in this PR simply moves the existing code to the default
CacheKey
implementation. The second one adds 4 alternative implementations that either use the absolute or relative URI and optionally lowercases that value. Maybe it's better to move these out to a test project or the documentation, but I wanted to share how easy implementing this new interface can be!