-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Very first request sometimes causes Circular dependency error on ConversionHost #544
Comments
Hm, after thinking this through, there are actually 2 requests fired from the UI almost simultaneously, so this might explain my theory with the threads. |
yeah, 2 threads are trying to autoload at the same time. This sounds very much like the issue we "worked around" using the permit concurrent loads. Is this with the fix in #416? |
@jrafanie yes, this is happening with the latest master and the latest upstream build as well. |
Also seeing this.. https://gist.github.com/himdel/36eb11d45a33f076c91047098648ddbf Happened to me during testing ManageIQ/manageiq-ui-classic#5164, right after login (so, yes, first API request, but not first http request) Status 500 Internal Server Error Content-Type application/json; charset=utf-8 Data {"error":{"kind":"internal_server_error","message":"Circular dependency detected while autoloading constant ContainerProject","klass":"RuntimeError"}} |
I couldn't recreate it this @skateman @himdel @Hyperkid123. Can you try running with the rack lock debugging middleware for a few days: Add rails lock debugging middleware to config/application.rb: If it gets the 500 error, please provide the following information: looking at both of your gists, it looks like there are two requests locking each other, an /api/auth and /api/notifications, so I tried this on my machine and appliance: 2.times do
Thread.new do
`curl -k -L http://admin:smartvm@localhost:3000/api/auth?requester_type=ws`
end
Thread.new do
`curl -k -L http://admin:smartvm@localhost:3000/api/notifications?expand=resources&attributes=details&sort_by=id&sort_order=desc&limit=100`
end
end |
@jrafanie this might be a heisenbug 😕 I was able to reproduce this only once by accident since I inserted the middleware. However, the |
Can confirm that after the request returns a 500, But that makes sense really. If there is a locking problem, it gets resolved when that exception throws, right? Managed to catch the locks a bit before, but not sure if this is already processing both api requests... https://gist.github.com/himdel/f515442221c387746ffbe3adbcd65ee0 All I had to do to reproduce was:
|
But I can't reproduce the second time .. maybe something to do with the files being cached? (But Changing expanded.sub!(/\.rb\z/, ''.freeze)
if loading.include?(expanded)
+ `curl -L http://admin:smartvm@localhost:3000/rails/locks > /tmp/locks.$$`
raise "Circular dependency detected while autoloading constant #{qualified_name}"
else
require_or_load(expanded, qualified_name) And hoping this will happen again. |
This is what my |
Got a fresh one: locks: https://gist.github.com/himdel/f839ec7ae39198a69185c0f5c28b7267 |
From MartinH's findings in ManageIQ#544, we had one thread at the `cspec[:klass].constantize` line while another thread was trying to run `klass.columns_hash`, causing a `Circular dependency detected while autoloading...` error. I then tried a bunch of things until I found a way to reliably recreate this error: ```ruby require_relative 'config/environment' threads = [] 4.times do threads << Thread.new { Api::Environment.time_attributes } end threads.collect(&:join) ``` I could then run the above script several times in my shell and get the `circular dependency` error most of the time. ``` for x in `seq 1 10`; do; bundle exec ruby test.rb; done ``` With this test in place, I then tried a few solutions: 1) Move the `klass.columns_hash` block into the permit_concurrent_loads block: ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f4b6554a 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -22,9 +22,10 @@ module Api # Temporary measure to avoid thread race condition which could lead to a deadlock ActiveSupport::Dependencies.interlock.permit_concurrent_loads do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` This did not fix the `circular dependency` error. Perhaps `permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads cross threads? 2) I tried the same solution as what's in this commit (Mutex#synchronize) but with a Sync.new object and either a `synchronize(:SH)` or EX block and instead received a thread killed error. 3) Finally, tried Mutex#synchronize and this worked. I then verified that this last Mutex#synchronize change also fixed the original problem and it did.
@himdel @Hyperkid123 @skateman I opened PR #550 with the first thing I got to work. Details are there. |
From MartinH's findings in ManageIQ#544, we had one thread at the `cspec[:klass].constantize` line while another thread was trying to run `klass.columns_hash`, causing a `Circular dependency detected while autoloading...` error. I then tried a bunch of things until I found a way to reliably recreate this error: ```ruby require_relative 'config/environment' threads = [] 4.times do threads << Thread.new { Api::Environment.time_attributes } end threads.collect(&:join) ``` I could then run the above script several times in my shell and get the `circular dependency` error most of the time. ``` for x in `seq 1 10`; do; bundle exec ruby test.rb; done ``` With this test in place, I then tried a few solutions: 1) Move the `klass.columns_hash` block into the permit_concurrent_loads block: ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f4b6554a 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -22,9 +22,10 @@ module Api # Temporary measure to avoid thread race condition which could lead to a deadlock ActiveSupport::Dependencies.interlock.permit_concurrent_loads do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` This did not fix the `circular dependency` error. Perhaps `permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads cross threads? 2) I tried Mutex#synchronize and this worked, but I'd rather we work with the interlock provided by rails. ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f51de73f 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -1,5 +1,6 @@ module Api class Environment + ONE_AUTOLOADER_LOCK = Mutex.new def self.url_attributes @url_attributes ||= Set.new(%w(href)) end @@ -19,12 +20,13 @@ module Api next if cspec[:klass].blank? klass = nil - # Temporary measure to avoid thread race condition which could lead to a deadlock - ActiveSupport::Dependencies.interlock.permit_concurrent_loads do + # Ensure we're the only thread trying to autoload classes and their columns + ONE_AUTOLOADER_LOCK.synchronize do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` 3) I tried Sync.new with SH and EX locks instead of Mutex and this failed with Thread killed errors. 4) I changed the `permit_concurrent_loads` on the interlock to `loading` and this worked and I believe is the best solution. I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])` and that method has this in the source code comment: https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168 ``` Temporarily give up all held Share locks while executing the supplied block, allowing any +compatible+ exclusive lock request to proceed. ``` Perhaps, since we're loading code, `permit_concurrent_loads` is too permissive of other exclusive lock requests and we really need to ensure nothing else is trying to load.
https://bugzilla.redhat.com/show_bug.cgi?id=1671458 From MartinH's findings in ManageIQ#544, we had one thread at the `cspec[:klass].constantize` line while another thread was trying to run `klass.columns_hash`, causing a `Circular dependency detected while autoloading...` error. I then tried a bunch of things until I found a way to reliably recreate this error: ```ruby require_relative 'config/environment' threads = [] 4.times do threads << Thread.new { Api::Environment.time_attributes } end threads.collect(&:join) ``` I could then run the above script several times in my shell and get the `circular dependency` error most of the time. ``` for x in `seq 1 10`; do; bundle exec ruby test.rb; done ``` With this test in place, I then tried a few solutions: 1) Move the `klass.columns_hash` block into the permit_concurrent_loads block: ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f4b6554a 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -22,9 +22,10 @@ module Api # Temporary measure to avoid thread race condition which could lead to a deadlock ActiveSupport::Dependencies.interlock.permit_concurrent_loads do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` This did not fix the `circular dependency` error. Perhaps `permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads cross threads? 2) I tried Mutex#synchronize and this worked, but I'd rather we work with the interlock provided by rails. ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f51de73f 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -1,5 +1,6 @@ module Api class Environment + ONE_AUTOLOADER_LOCK = Mutex.new def self.url_attributes @url_attributes ||= Set.new(%w(href)) end @@ -19,12 +20,13 @@ module Api next if cspec[:klass].blank? klass = nil - # Temporary measure to avoid thread race condition which could lead to a deadlock - ActiveSupport::Dependencies.interlock.permit_concurrent_loads do + # Ensure we're the only thread trying to autoload classes and their columns + ONE_AUTOLOADER_LOCK.synchronize do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` 3) I tried Sync.new with SH and EX locks instead of Mutex and this failed with Thread killed errors. 4) I changed the `permit_concurrent_loads` on the interlock to `loading` and this worked and I believe is the best solution. I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])` and that method has this in the source code comment: https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168 ``` Temporarily give up all held Share locks while executing the supplied block, allowing any +compatible+ exclusive lock request to proceed. ``` Perhaps, since we're loading code, `permit_concurrent_loads` is too permissive of other exclusive lock requests and we really need to ensure nothing else is trying to load.
https://bugzilla.redhat.com/show_bug.cgi?id=1671458 Fixes ManageIQ#544 From MartinH's findings in ManageIQ#544, we had one thread at the `cspec[:klass].constantize` line while another thread was trying to run `klass.columns_hash`, causing a `Circular dependency detected while autoloading...` error. I then tried a bunch of things until I found a way to reliably recreate this error: ```ruby require_relative 'config/environment' threads = [] 4.times do threads << Thread.new { Api::Environment.time_attributes } end threads.collect(&:join) ``` I could then run the above script several times in my shell and get the `circular dependency` error most of the time. ``` for x in `seq 1 10`; do; bundle exec ruby test.rb; done ``` With this test in place, I then tried a few solutions: 1) Move the `klass.columns_hash` block into the permit_concurrent_loads block: ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f4b6554a 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -22,9 +22,10 @@ module Api # Temporary measure to avoid thread race condition which could lead to a deadlock ActiveSupport::Dependencies.interlock.permit_concurrent_loads do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` This did not fix the `circular dependency` error. Perhaps `permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads cross threads? 2) I tried Mutex#synchronize and this worked, but I'd rather we work with the interlock provided by rails. ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f51de73f 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -1,5 +1,6 @@ module Api class Environment + ONE_AUTOLOADER_LOCK = Mutex.new def self.url_attributes @url_attributes ||= Set.new(%w(href)) end @@ -19,12 +20,13 @@ module Api next if cspec[:klass].blank? klass = nil - # Temporary measure to avoid thread race condition which could lead to a deadlock - ActiveSupport::Dependencies.interlock.permit_concurrent_loads do + # Ensure we're the only thread trying to autoload classes and their columns + ONE_AUTOLOADER_LOCK.synchronize do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` 3) I tried Sync.new with SH and EX locks instead of Mutex and this failed with Thread killed errors. 4) I changed the `permit_concurrent_loads` on the interlock to `loading` and this worked, but Yuri found this caused a deadlock. 5) Use `loading` from 4) and move the `columns_hash` call into the `loading` block. This solves the circular dependency and avoids the deadlock encountered in 4). I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])` and that method has this in the source code comment: https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168 ``` Temporarily give up all held Share locks while executing the supplied block, allowing any +compatible+ exclusive lock request to proceed. ``` Perhaps, since we're loading code, `permit_concurrent_loads` is too permissive of other exclusive lock requests and we really need to ensure nothing else is trying to load.
https://bugzilla.redhat.com/show_bug.cgi?id=1671458 Fixes ManageIQ#544 From MartinH's findings in ManageIQ#544, we had one thread at the `cspec[:klass].constantize` line while another thread was trying to run `klass.columns_hash`, causing a `Circular dependency detected while autoloading...` error. I then tried a bunch of things until I found a way to reliably recreate this error: ```ruby require_relative 'config/environment' threads = [] 4.times do threads << Thread.new { Api::Environment.time_attributes } end threads.collect(&:join) ``` I could then run the above script several times in my shell and get the `circular dependency` error most of the time. ``` for x in `seq 1 10`; do; bundle exec ruby test.rb; done ``` With this test in place, I then tried a few solutions: 1) Move the `klass.columns_hash` block into the permit_concurrent_loads block: ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f4b6554a 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -22,9 +22,10 @@ module Api # Temporary measure to avoid thread race condition which could lead to a deadlock ActiveSupport::Dependencies.interlock.permit_concurrent_loads do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` This did not fix the `circular dependency` error. Perhaps `permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads cross threads? 2) I tried Mutex#synchronize and this worked, but I'd rather we work with the interlock provided by rails. ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f51de73f 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -1,5 +1,6 @@ module Api class Environment + ONE_AUTOLOADER_LOCK = Mutex.new def self.url_attributes @url_attributes ||= Set.new(%w(href)) end @@ -19,12 +20,13 @@ module Api next if cspec[:klass].blank? klass = nil - # Temporary measure to avoid thread race condition which could lead to a deadlock - ActiveSupport::Dependencies.interlock.permit_concurrent_loads do + # Ensure we're the only thread trying to autoload classes and their columns + ONE_AUTOLOADER_LOCK.synchronize do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` 3) I tried Sync.new with SH and EX locks instead of Mutex and this failed with Thread killed errors. 4) I changed the `permit_concurrent_loads` on the interlock to `loading` and this worked, but Yuri found this caused a deadlock. 5) Use `AS::Dep.interlock.loading` from 4) and move the `columns_hash` call into the `loading` block. This solves the circular dependency and avoids the deadlock encountered in 4). 6) Finally, change to use `Rails.application.executor.wrap`, which behaves similar to interlock.loading but is meants for informing rails that a plugin needs to load application code. I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])` and that method has this in the source code comment: https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168 ``` Temporarily give up all held Share locks while executing the supplied block, allowing any +compatible+ exclusive lock request to proceed. ``` Perhaps, since we're loading code, `permit_concurrent_loads` is too permissive of other exclusive lock requests and we really need to ensure nothing else is trying to load.
https://bugzilla.redhat.com/show_bug.cgi?id=1671458 Fixes ManageIQ#544 From MartinH's findings in ManageIQ#544, we had one thread at the `cspec[:klass].constantize` line while another thread was trying to run `klass.columns_hash`, causing a `Circular dependency detected while autoloading...` error. I then tried a bunch of things until I found a way to reliably recreate this error: ```ruby require_relative 'config/environment' threads = [] 4.times do threads << Thread.new { Api::Environment.time_attributes } end threads.collect(&:join) ``` I could then run the above script several times in my shell and get the `circular dependency` error most of the time. ``` for x in `seq 1 10`; do; bundle exec ruby test.rb; done ``` With this test in place, I then tried a few solutions: 1) Move the `klass.columns_hash` block into the permit_concurrent_loads block: ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f4b6554a 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -22,9 +22,10 @@ module Api # Temporary measure to avoid thread race condition which could lead to a deadlock ActiveSupport::Dependencies.interlock.permit_concurrent_loads do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` This did not fix the `circular dependency` error. Perhaps `permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads cross threads? 2) I tried Mutex#synchronize and this worked, but I'd rather we work with the interlock provided by rails. ``` diff --git a/lib/api/environment.rb b/lib/api/environment.rb index 87b34f99..f51de73f 100644 --- a/lib/api/environment.rb +++ b/lib/api/environment.rb @@ -1,5 +1,6 @@ module Api class Environment + ONE_AUTOLOADER_LOCK = Mutex.new def self.url_attributes @url_attributes ||= Set.new(%w(href)) end @@ -19,12 +20,13 @@ module Api next if cspec[:klass].blank? klass = nil - # Temporary measure to avoid thread race condition which could lead to a deadlock - ActiveSupport::Dependencies.interlock.permit_concurrent_loads do + # Ensure we're the only thread trying to autoload classes and their columns + ONE_AUTOLOADER_LOCK.synchronize do klass = cspec[:klass].constantize - end - klass.columns_hash.each do |name, typeobj| - result << name if %w(date datetime).include?(typeobj.type.to_s) + + klass.columns_hash.each do |name, typeobj| + result << name if %w(date datetime).include?(typeobj.type.to_s) + end end end end ``` 3) I tried Sync.new with SH and EX locks instead of Mutex and this failed with Thread killed errors. 4) I changed the `permit_concurrent_loads` on the interlock to `loading` and this worked, but Yuri found this caused a deadlock. 5) Use `AS::Dep.interlock.loading` from 4) and move the `columns_hash` call into the `loading` block. This solves the circular dependency and avoids the deadlock encountered in 4). I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])` and that method has this in the source code comment: https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168 ``` Temporarily give up all held Share locks while executing the supplied block, allowing any +compatible+ exclusive lock request to proceed. ``` Perhaps, since we're loading code, `permit_concurrent_loads` is too permissive of other exclusive lock requests and we really need to ensure nothing else is trying to load.
@Hyperkid123 told me about his problem that sometimes his very first API request fails with a HTTP 500 and there's no deterministic way to reproduce it. When I was playing with the API, I also experienced this issue in production and found out that there's a traceback in the api.log. Then I looked into my local api.log and found out that I had those tracebacks all the time 😢 😅 🙏
So as far I could understand, this has to do something with this fix created by @jrafanie. I tried looking at the affected
ConversionHost
from any possible angle, but I don't know how this could be a circular dependency. My theory is that there are two models trying to loadApplicationRecord
at the same time.@miq-bot add_label bug
The text was updated successfully, but these errors were encountered: