-
Notifications
You must be signed in to change notification settings - Fork 791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance regression while requiring lots of files #84
Comments
I've got exactly same problem. |
Here is my vanilla rails app, with 100 |
Exactly the same issue, upgrading from sprockets 2.12 (70ms) to 3.4 (900ms). |
I finally got a spare moment to try and repro this. I used the example app from @mib32. I ran with sprockets 2 and then with sprockets 3. The results will SHOCK you (read in news reporter voice). Sprockets 2
Sprockets 3
So sprockets 3 is about 3x faster than sprockets 2 here with no cache. That's the opposite of what's reported. I'll give the code provided by @redox a shot. |
I'm seeing the same thing with the other code from @redox sprockets 3 is faster than 2 (but not by quite the same difference). How did you instrument your app with ruby-prof? |
@schneems Hello Richard, thank you for taking your time on that thing. I see that you are measuring time that Precompilation and page rendering are not anyhow connected, right? Please correct me if I am getting something wrong. |
We might be seeing this - noticed that on first load changing any file takes forever to load. Haven't had a chance to look into it yet. |
The same machinery that powers I'm less familiar with profiling in a live request and hoping I could repro in a shell script. Tips on debugging would be nice. Like how you got that ruby-prof graph? |
@schneems The graph in the top is not mine, but i had the same results, when i profiled literally this way in
When i went through a profile, i saw that bottleneck is in the part of code that figures out the filename of a n asset and gives it to renderer. But i couldn't investigate further due to lack of ruby knowledge. I would like to point out again that it is only in development. |
@eileencodes Yep. Especially if you use |
Could you perhaps share your Ruby and OS version numbers? I might be possible certain Ruby versions on certain platforms in combination with a large number of gems (i.e. a large load path) result in such a performance regression. (For reference: https://bugs.ruby-lang.org/issues/10015) |
ruby 2.2.1p85 (2015-02-26 revision 49769) [x86_64-darwin14] |
@jorisvh If I understand correctly you should be affected by that Ruby problem, but it still would not account for that drastic difference between 70ms and 900ms. Are you by any chance comparing debug and non-debug mode? Because that way I get similar numbers. This is what I measured at my workstation (Ruby 2.2.3, Windows 10) using https://github.com/mib32/slow-sprockets-example
|
@mib32 Are you getting about 70 ms with sprockets 2 in debug mode, too? That's where I see render times of 500 ms... 🐌 (Might be a Windows issue, though) Just to be sure: I'm talking about asset debug mode, i.e. |
@daniel-rikowski Yeah... You seems to be right, i am unable to see any significant regression no more... |
Here's a flamegraph visualization from my app: (rawgit, gist) The dark blue section taking up 60% of the ~500 ms page load time is Sprockets, despite the fact that multiple requests have already happened, so Sprockets should have all the info it needs to serve the request. This part of the app is inside Active Admin, which has 11 CoffeeScript files and 41 SCSS files. Our customizations on top of that add 6 CoffeeScript files and 1 SCSS file. These are the asset settings I have configured for development: config.assets.debug = true
config.assets.digest = true
config.assets.raise_runtime_errors = true As expected I'm not seeing this slowdown in production, because there the files are already precompiled. |
I can reproduce with the test app that Active Admin generates, as well. I can give steps to set it up if needed. |
@seanlinsley Could you please answer couple of questions:
|
We're seeing this slowdown as well on a large app that was just upgraded from Rails 3.2 to 4.2 with The JS file calls murder the speed though. It seems like each 304 return from the server is taking a really long time which adds up to a long page load. For now we've sped things up a bit by using our own modified version of this: |
@mib32 did you manage to reproduce something? |
I did a big sprockets upgrade and it didn't break anything. I'm very happy about this. 😄 Dev environment details: I'm experiencing a page loading performance drop after upgrading from sprockets 2.8 to 3.6 (after reading the Speeding up Sprockets blog post). There were a few dependencies linked into it so I ended up upgrading much more than just sprockets. Notably where I was specifically looking for improvement is with my
The only other notable changes I did for the assets with the upgrade is change the file extensions from I've noticed my flame graph showed more time being spent in sprockets as well. I realize that this may be because of some side effect from the other gems. I'll do some more testing to determine this. And I'll get back to you. |
Alright I did some digging and here's my derailed stackprof of slow methods before
after
So my app is faster with the older sprockets: (from flame graph info of page load) before
after
It looks like way more sprocket method calls are being done and the app is spending more time in it. |
This is a really long thread. I'm working on performance again right now. Did anyone include any slow example apps on this thread? |
You can use https://github.com/redox/perf_regression_sprockets -> I tried to make it pretty straight forward. Let us know how that goes @schneems 👍 |
I've just run into this issue too on a newly added, asset heavy page on our site. Here's my current reproduction: https://github.com/nfm/sprockets-3.x-performance-regressions The meat of it is this super simple rake task: namespace :sprockets do
task benchmark: :environment do
asset_count = 1500
puts "Creating #{asset_count} empty assets..."
asset_count.times { |i| `touch #{Rails.root}/app/assets/images/#{i}.png` }
puts "Done"
puts "Precompiling assets..."
Rake.application.invoke_task "assets:precompile"
puts "Done"
puts "Running benchmark..."
start = Time.now
count = 1500
count.times { ActionController::Base.helpers.asset_path('1.png') }
puts "#{count} lookups of the same asset took #{(Time.now - start) / 1.second}"
start = Time.now
count = 1500
count.times { |i| ActionController::Base.helpers.asset_path("#{i}.png") }
puts "#{count} lookups of #{count} different assets took #{(Time.now - start) / 1.second}"
end
end Otherwise it's just a newly initialized Rails 4.2.6 app, with two branches: one for Sprockets 2.x and one for Sprockets 3.x. For me, on Sprockets 3.x branch:
And on Sprockets 2.x branch:
It seems to be related to the number of assets that Sprockets is managing - nothing to do with their content, or whether lots of them are looked up or not. I would have expect to see O(1) lookup times for any number of assets in the repo, but for some reason 3.x is two orders of magnitude slower than 2.x. |
@redox Not to be too over zealous but I think I already fixed the performance regression in Sprockets 3. When I use Sprockets 3.2.0 with your app I see a page load take about 400ms. (using rack-mini-profiler) When I use Sprockets 3.6.2 it takes about 100ms. Can you try upgrading and let me know how it looks. @nfm i haven't looked at your app yet, it's already using Sprockets 3.6.2. I'll need to dig in more. |
We noticed a large performance improvement upgrading to |
The method `match_path_extname` is expensive. We know that if the name of the file on disk does not start with the same basename that we're looking for then it won't matter what extensions it's got. This check is pretty fast. On the example app https://github.com/nfm/sprockets-3.x-performance-regressions It reduced asset lookup from 15 seconds to 2.6 seconds. It's still not perfect but it's a 400% + speed increase in the case of looking for an asset in a huge directory. Related comment #84 (comment)
The method `match_path_extname` is expensive. We know that if the name of the file on disk does not start with the same basename that we're looking for then it won't matter what extensions it's got. This check is pretty fast. On the example app https://github.com/nfm/sprockets-3.x-performance-regressions It reduced asset lookup from 15 seconds to 2.6 seconds. It's still not perfect but it's a 400% + speed increase in the case of looking for an asset in a huge directory. Related comment #84 (comment)
@nfm I found one place where it was spending a bunch of time, thanks for the example app. It's not back to 2.x speeds but I was able to get a 400% perf improvement with #331 There is some logic that was added to 3.x that will allow for multiple assets to be matched. It's expensive in large directories. I don't know why it was added and I don't feel comfortable tearing it out without more context. So after that patch gets green and is merged to master I can backport to 3.x and cut a release. It is still slower than 2.x but at least it's less slow. |
Turns out it's perhaps faster, but not by 400% made a mistake while benchmarking. |
The method `match_path_extname` is expensive. We know that if the name of the file on disk does not start with the same basename that we're looking for then it won't matter what extensions it's got. This check is pretty fast. On the example app https://github.com/nfm/sprockets-3.x-performance-regressions It reduced asset lookup from 15 seconds to 2.6 seconds. It's still not perfect but it's a 400% + speed increase in the case of looking for an asset in a huge directory. Related comment #84 (comment)
@schneems thanks for taking a look :) Interesting thought about things being slow if there are many assets in one directory. I've updated my repo to add another benchmark where each asset is in its own directory. Probably the most realistic use-case is having a few directories with a reasonable number of files (rather than one directory with many files, or many directories with one file), but I think these two benchmarks illustrate the problem. Here's my latest benchmarks: Sprockets 2.x: # ./bin/rake sprockets:benchmark
1500 lookups of the same asset took 0.086173673
1500 lookups of 1500 different assets took 7.336354235
# ./bin/rake sprockets:benchmark_dirs
1500 lookups of the same asset took 0.043988341
1500 lookups of 1500 different assets took 1.415721443
# ./bin/rake sprockets:benchmark PRECOMPILE=1
1500 lookups of the same asset took 0.109657337
1500 lookups of 1500 different assets took 0.10572148
# ./bin/rake sprockets:benchmark_dirs PRECOMPILE=1
1500 lookups of the same asset took 0.094545266
1500 lookups of 1500 different assets took 0.087344999 Sprockets 3.x: # ./bin/rake sprockets:benchmark
1500 lookups of the same asset took 7.827039138
1500 lookups of 1500 different assets took 6.186335159
# ./bin/rake sprockets:benchmark_dirs
1500 lookups of the same asset took 1.993022409
1500 lookups of 1500 different assets took 0.587485463
# ./bin/rake sprockets:benchmark PRECOMPILE=1
1500 lookups of the same asset took 6.232709039
1500 lookups of 1500 different assets took 6.381696028
# ./bin/rake sprockets:benchmark_dirs PRECOMPILE=1
1500 lookups of the same asset took 0.729864224
1500 lookups of 1500 different assets took 0.631234396 So it does look like the directory structure makes a significant difference without precompiling, which intuitively feels right. But the interesting discrepancy is the difference the directory structure makes after precompiling in sprockets-2.x vs sprockets-3.x. The performance should be the same for both directory structures, and it is in sprockets-2.x, but in sprockets-3.x many files in one directory is much slower. Something weird is going on! |
Ok, so it looks like most of the weirdness is me misunderstanding Sprockets! Precompiling the assets is not enough to have the benchmark behave like Sprockets will in production. The benchmark was running in the Our issue in production turned out to be that Our manifest was being used to look up assets, but some asset paths were being provided without file extensions (eg. If you're having performance issues, my take-away is to check
|
@MFM yes, keeping On the script: i noticed the manifest behavior of your script with sprockets 2.x and removed it in my local tests (basically if you have a manifest file sprockets 2.x will use that skipping the asset lookup logic which is why it was so fast).Your script did help me find a performance regression and ultimate fix I have a PR #336 that will drastically improve performance. That being said there is one core difference in behavior between 3.x and 2.x. Sprockets 2.x caches individual asset lookups so when you do the second lookup you hit a cache. Sprockets 3.x does not do this but rather it caches directory entries instead. I don't know why that decision was made. It is true that even with the #336 patch applied that it is still not as fast as 2.x in repeatedly looking up the same asset, however in the second "random" asset lookup test it is actually faster than 2.x. I want to play with the format that we use to store the entries, as a hash lookup could prove to be much faster than looping through all the entries. Until I get the time to do that you should see improved speed in development with #336. When tests are green I plan on merging and cutting a release. |
I cut a release 3.6.3 please try it out. Going to close this thread for now. Thank you all for providing feedback and example scripts. |
Thanks @schneems, that all makes sense. Nice work with the performance fix! Glad to be able to help out :) |
I just reran sprockets 2.12.41500 lookups of the same asset took 0.238217 1500 lookups of the same asset took 0.254395 1500 lookups of the same asset took 0.25907 sprockets 3.7.01500 lookups of the same asset took 9.655606 1500 lookups of the same asset took 5.900586 1500 lookups of the same asset took 6.441615 I hope "lookups of the same asset" can take less time :( |
So here's the deal. Sprockets 3 is MUCH faster at looking up an individual non-cached asset because of the way caching changed between 2 and 3. In Sprockets 2 the individual assets are cached, which is not all that useful unless you're using the same asset a ton. In that case it is much much faster in sprockets 3. However if you need different assets, it is really slow. In Sprockets 3 we cache entries in directories instead of the individual assets. This is faster for multi asset lookups. Basically a non-cached asset lookup on Sprockets 3 is faster than on Sprockets 2. If we wanted that speed back for the same asset we would have to add another layer of caching back to Sprockets 3 which would actually decrease the performance of looking up a non-duplicate asset call as we're having to add a cache check that will miss frequently. Invalidating the individual asset is also a concern. I'm not opposed to making the same asset lookup time better, but we can't do it at the expense of a non-duplicate asset lookup. |
We have about 300 CSS & JS files in our application. Our html head is contained in a partial. In development mode this partial renders in 60ms with sprockets 2. With sprockets 3 it takes 100ms with debug: false and 700ms with debug: true. With debug I'm referring to asset debugging. Since you can't use sprockets 2 with Rails 5 it seems our current options are either to use debug: false or move over to the webpacker gem. |
Hi guys,
I'm following rails/sprockets-rails#225 and I would like to share with you a performance regression I'm experiencing after moving from sprocket-rails
2.1.3
to3.0
ormaster
.It turns out that the issue is not in
sprocket-rails
but rather in the underlyingsprockets
since most of the time is spent resolving asset paths (cc @eileencodes).Here is what I'm seeing on my personal app (ruby-prof output):
And here is a small project reproducing the issue: redox/perf_regression_sprockets. In that application a page that took 20ms to be rendered using sprockets-rails
2.1.3
now takes 60ms with the latestmaster
branch.Any chance it could be related to the "compat" resolving layer? Let me know if I can help!
The text was updated successfully, but these errors were encountered: