Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very first request sometimes causes Circular dependency error on ConversionHost #544

Closed
skateman opened this issue Jan 18, 2019 · 11 comments · Fixed by #550
Closed

Very first request sometimes causes Circular dependency error on ConversionHost #544

skateman opened this issue Jan 18, 2019 · 11 comments · Fixed by #550
Labels

Comments

@skateman
Copy link
Member

@Hyperkid123 told me about his problem that sometimes his very first API request fails with a HTTP 500 and there's no deterministic way to reproduce it. When I was playing with the API, I also experienced this issue in production and found out that there's a traceback in the api.log. Then I looked into my local api.log and found out that I had those tracebacks all the time 😢 😅 🙏

So as far I could understand, this has to do something with this fix created by @jrafanie. I tried looking at the affected ConversionHost from any possible angle, but I don't know how this could be a circular dependency. My theory is that there are two models trying to load ApplicationRecord at the same time.

@miq-bot add_label bug

@miq-bot miq-bot added the bug label Jan 18, 2019
@skateman
Copy link
Member Author

Hm, after thinking this through, there are actually 2 requests fired from the UI almost simultaneously, so this might explain my theory with the threads.

@jrafanie
Copy link
Member

yeah, 2 threads are trying to autoload at the same time. This sounds very much like the issue we "worked around" using the permit concurrent loads. Is this with the fix in #416?

@skateman
Copy link
Member Author

@jrafanie yes, this is happening with the latest master and the latest upstream build as well.

@himdel
Copy link
Contributor

himdel commented Jan 21, 2019

Also seeing this.. https://gist.github.com/himdel/36eb11d45a33f076c91047098648ddbf

Happened to me during testing ManageIQ/manageiq-ui-classic#5164, right after login (so, yes, first API request, but not first http request)

URL http://localhost:3000/api/notifications?expand=resources&attributes=details&sort_by=id&sort_order=desc&limit=100

Status 500 Internal Server Error

Content-Type application/json; charset=utf-8

Data {"error":{"kind":"internal_server_error","message":"Circular dependency detected while autoloading constant ContainerProject","klass":"RuntimeError"}}

@jrafanie
Copy link
Member

I couldn't recreate it this @skateman @himdel @Hyperkid123. Can you try running with the rack lock debugging middleware for a few days:

Add rails lock debugging middleware to config/application.rb:
config.middleware.insert_before Rack::Sendfile, ActionDispatch::DebugLocks

If it gets the 500 error, please provide the following information:
curl -L http://admin:smartvm@localhost:4000/rails/locks or whatever port your web service /api worker is listening on.

looking at both of your gists, it looks like there are two requests locking each other, an /api/auth and /api/notifications, so I tried this on my machine and appliance:

2.times do
  Thread.new do
    `curl -k -L http://admin:smartvm@localhost:3000/api/auth?requester_type=ws`
  end

  Thread.new do
    `curl -k -L http://admin:smartvm@localhost:3000/api/notifications?expand=resources&attributes=details&sort_by=id&sort_order=desc&limit=100`
  end

end

@skateman
Copy link
Member Author

@jrafanie this might be a heisenbug 😕 I was able to reproduce this only once by accident since I inserted the middleware. However, the /rails/locks was is getting back with an empty response.

@himdel
Copy link
Contributor

himdel commented Jan 23, 2019

Can confirm that after the request returns a 500, /rails/locks is empty.

But that makes sense really. If there is a locking problem, it gets resolved when that exception throws, right?

Managed to catch the locks a bit before, but not sure if this is already processing both api requests...

https://gist.github.com/himdel/f515442221c387746ffbe3adbcd65ee0

All I had to do to reproduce was:

  • start rails s
  • open localhost:3000
  • log in

@himdel
Copy link
Contributor

himdel commented Jan 23, 2019

But I can't reproduce the second time .. maybe something to do with the files being cached? (But sudo sh -c ' echo 3 > /proc/sys/vm/drop_caches ' doesn't help.)

Changing activesupport-5.0.7.1/lib/active_support/dependencies.rb:510:

         expanded.sub!(/\.rb\z/, ''.freeze)
 
         if loading.include?(expanded)
+          `curl -L http://admin:smartvm@localhost:3000/rails/locks > /tmp/locks.$$`
           raise "Circular dependency detected while autoloading constant #{qualified_name}"
         else
           require_or_load(expanded, qualified_name)

And hoping this will happen again.

@himdel
Copy link
Contributor

himdel commented Jan 24, 2019

This is what my /api/notifications endpoint returns: https://gist.github.com/himdel/b762953ab803d9c60e3ea81b76974b71

@himdel
Copy link
Contributor

himdel commented Jan 28, 2019

Got a fresh one: Circular dependency detected while autoloading constant MiqAlertStatus

locks: https://gist.github.com/himdel/f839ec7ae39198a69185c0f5c28b7267

jrafanie added a commit to jrafanie/manageiq-api that referenced this issue Jan 28, 2019
From MartinH's findings in ManageIQ#544, we had one thread at the
`cspec[:klass].constantize` line while another thread was trying to run
`klass.columns_hash`, causing a
`Circular dependency detected while autoloading...` error.

I then tried a bunch of things until I found a way to reliably recreate
this error:

```ruby
require_relative 'config/environment'

threads = []
4.times do
  threads << Thread.new { Api::Environment.time_attributes }
end
threads.collect(&:join)
```

I could then run the above script several times in my shell and get the
`circular dependency` error most of the time.

```
for x in `seq 1 10`; do; bundle exec ruby test.rb; done
```

With this test in place, I then tried a few solutions:

1) Move the `klass.columns_hash` block into the permit_concurrent_loads
block:

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f4b6554a 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -22,9 +22,10 @@ module Api
         # Temporary measure to avoid thread race condition which could lead to a deadlock
         ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end
```

This did not fix the `circular dependency` error. Perhaps
`permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads
cross threads?

2) I tried the same solution as what's in this commit (Mutex#synchronize)
but with a Sync.new object and either a `synchronize(:SH)` or EX block and instead
received a thread killed error.

3) Finally, tried Mutex#synchronize and this worked.

I then verified that this last Mutex#synchronize change also fixed the
original problem and it did.
@jrafanie
Copy link
Member

@himdel @Hyperkid123 @skateman I opened PR #550 with the first thing I got to work. Details are there.

jrafanie added a commit to jrafanie/manageiq-api that referenced this issue Jan 30, 2019
From MartinH's findings in ManageIQ#544, we had one thread at the
`cspec[:klass].constantize` line while another thread was trying to run
`klass.columns_hash`, causing a
`Circular dependency detected while autoloading...` error.

I then tried a bunch of things until I found a way to reliably recreate
this error:

```ruby
require_relative 'config/environment'

threads = []
4.times do
  threads << Thread.new { Api::Environment.time_attributes }
end
threads.collect(&:join)
```

I could then run the above script several times in my shell and get the
`circular dependency` error most of the time.

```
for x in `seq 1 10`; do; bundle exec ruby test.rb; done
```

With this test in place, I then tried a few solutions:

1) Move the `klass.columns_hash` block into the permit_concurrent_loads
block:

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f4b6554a 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -22,9 +22,10 @@ module Api
         # Temporary measure to avoid thread race condition which could lead to a deadlock
         ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end
```

This did not fix the `circular dependency` error. Perhaps
`permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads
cross threads?

2) I tried Mutex#synchronize and this worked, but I'd rather we work
with the interlock provided by rails.

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f51de73f 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -1,5 +1,6 @@
 module Api
   class Environment
+    ONE_AUTOLOADER_LOCK = Mutex.new
     def self.url_attributes
       @url_attributes ||= Set.new(%w(href))
     end
@@ -19,12 +20,13 @@ module Api
         next if cspec[:klass].blank?
         klass = nil

-        # Temporary measure to avoid thread race condition which could lead to a deadlock
-        ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
+        # Ensure we're the only thread trying to autoload classes and their columns
+        ONE_AUTOLOADER_LOCK.synchronize do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end

```

3) I tried Sync.new with SH and EX locks instead of Mutex and this
failed with Thread killed errors.

4) I changed the `permit_concurrent_loads` on the interlock to `loading`
and this worked and I believe is the best solution.

I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])`
and that method has this in the source code comment:

https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168

```
  Temporarily give up all held Share locks while executing the
  supplied block, allowing any +compatible+ exclusive lock request
  to proceed.
```

Perhaps, since we're loading code, `permit_concurrent_loads` is too
permissive of other exclusive lock requests and we really need to ensure
nothing else is trying to load.
jrafanie added a commit to jrafanie/manageiq-api that referenced this issue Jan 31, 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1671458

From MartinH's findings in ManageIQ#544, we had one thread at the
`cspec[:klass].constantize` line while another thread was trying to run
`klass.columns_hash`, causing a
`Circular dependency detected while autoloading...` error.

I then tried a bunch of things until I found a way to reliably recreate
this error:

```ruby
require_relative 'config/environment'

threads = []
4.times do
  threads << Thread.new { Api::Environment.time_attributes }
end
threads.collect(&:join)
```

I could then run the above script several times in my shell and get the
`circular dependency` error most of the time.

```
for x in `seq 1 10`; do; bundle exec ruby test.rb; done
```

With this test in place, I then tried a few solutions:

1) Move the `klass.columns_hash` block into the permit_concurrent_loads
block:

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f4b6554a 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -22,9 +22,10 @@ module Api
         # Temporary measure to avoid thread race condition which could lead to a deadlock
         ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end
```

This did not fix the `circular dependency` error. Perhaps
`permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads
cross threads?

2) I tried Mutex#synchronize and this worked, but I'd rather we work
with the interlock provided by rails.

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f51de73f 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -1,5 +1,6 @@
 module Api
   class Environment
+    ONE_AUTOLOADER_LOCK = Mutex.new
     def self.url_attributes
       @url_attributes ||= Set.new(%w(href))
     end
@@ -19,12 +20,13 @@ module Api
         next if cspec[:klass].blank?
         klass = nil

-        # Temporary measure to avoid thread race condition which could lead to a deadlock
-        ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
+        # Ensure we're the only thread trying to autoload classes and their columns
+        ONE_AUTOLOADER_LOCK.synchronize do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end

```

3) I tried Sync.new with SH and EX locks instead of Mutex and this
failed with Thread killed errors.

4) I changed the `permit_concurrent_loads` on the interlock to `loading`
and this worked and I believe is the best solution.

I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])`
and that method has this in the source code comment:

https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168

```
  Temporarily give up all held Share locks while executing the
  supplied block, allowing any +compatible+ exclusive lock request
  to proceed.
```

Perhaps, since we're loading code, `permit_concurrent_loads` is too
permissive of other exclusive lock requests and we really need to ensure
nothing else is trying to load.
jrafanie added a commit to jrafanie/manageiq-api that referenced this issue Jan 31, 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1671458
Fixes ManageIQ#544

From MartinH's findings in ManageIQ#544, we had one thread at the
`cspec[:klass].constantize` line while another thread was trying to run
`klass.columns_hash`, causing a
`Circular dependency detected while autoloading...` error.

I then tried a bunch of things until I found a way to reliably recreate
this error:

```ruby
require_relative 'config/environment'

threads = []
4.times do
  threads << Thread.new { Api::Environment.time_attributes }
end
threads.collect(&:join)
```

I could then run the above script several times in my shell and get the
`circular dependency` error most of the time.

```
for x in `seq 1 10`; do; bundle exec ruby test.rb; done
```

With this test in place, I then tried a few solutions:

1) Move the `klass.columns_hash` block into the permit_concurrent_loads
block:

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f4b6554a 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -22,9 +22,10 @@ module Api
         # Temporary measure to avoid thread race condition which could lead to a deadlock
         ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end
```

This did not fix the `circular dependency` error. Perhaps
`permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads
cross threads?

2) I tried Mutex#synchronize and this worked, but I'd rather we work
with the interlock provided by rails.

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f51de73f 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -1,5 +1,6 @@
 module Api
   class Environment
+    ONE_AUTOLOADER_LOCK = Mutex.new
     def self.url_attributes
       @url_attributes ||= Set.new(%w(href))
     end
@@ -19,12 +20,13 @@ module Api
         next if cspec[:klass].blank?
         klass = nil

-        # Temporary measure to avoid thread race condition which could lead to a deadlock
-        ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
+        # Ensure we're the only thread trying to autoload classes and their columns
+        ONE_AUTOLOADER_LOCK.synchronize do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end

```

3) I tried Sync.new with SH and EX locks instead of Mutex and this
failed with Thread killed errors.

4) I changed the `permit_concurrent_loads` on the interlock to `loading`
and this worked, but Yuri found this caused a deadlock.

5) Use `loading` from 4) and move the `columns_hash` call into
the `loading` block.  This solves the circular dependency and avoids the
deadlock encountered in 4).

I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])`
and that method has this in the source code comment:

https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168

```
  Temporarily give up all held Share locks while executing the
  supplied block, allowing any +compatible+ exclusive lock request
  to proceed.
```

Perhaps, since we're loading code, `permit_concurrent_loads` is too
permissive of other exclusive lock requests and we really need to ensure
nothing else is trying to load.
jrafanie added a commit to jrafanie/manageiq-api that referenced this issue Feb 5, 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1671458
Fixes ManageIQ#544

From MartinH's findings in ManageIQ#544, we had one thread at the
`cspec[:klass].constantize` line while another thread was trying to run
`klass.columns_hash`, causing a
`Circular dependency detected while autoloading...` error.

I then tried a bunch of things until I found a way to reliably recreate
this error:

```ruby
require_relative 'config/environment'

threads = []
4.times do
  threads << Thread.new { Api::Environment.time_attributes }
end
threads.collect(&:join)
```

I could then run the above script several times in my shell and get the
`circular dependency` error most of the time.

```
for x in `seq 1 10`; do; bundle exec ruby test.rb; done
```

With this test in place, I then tried a few solutions:

1) Move the `klass.columns_hash` block into the permit_concurrent_loads
block:

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f4b6554a 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -22,9 +22,10 @@ module Api
         # Temporary measure to avoid thread race condition which could lead to a deadlock
         ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end
```

This did not fix the `circular dependency` error. Perhaps
`permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads
cross threads?

2) I tried Mutex#synchronize and this worked, but I'd rather we work
with the interlock provided by rails.

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f51de73f 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -1,5 +1,6 @@
 module Api
   class Environment
+    ONE_AUTOLOADER_LOCK = Mutex.new
     def self.url_attributes
       @url_attributes ||= Set.new(%w(href))
     end
@@ -19,12 +20,13 @@ module Api
         next if cspec[:klass].blank?
         klass = nil

-        # Temporary measure to avoid thread race condition which could lead to a deadlock
-        ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
+        # Ensure we're the only thread trying to autoload classes and their columns
+        ONE_AUTOLOADER_LOCK.synchronize do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end

```

3) I tried Sync.new with SH and EX locks instead of Mutex and this
failed with Thread killed errors.

4) I changed the `permit_concurrent_loads` on the interlock to `loading`
and this worked, but Yuri found this caused a deadlock.

5) Use `AS::Dep.interlock.loading` from 4) and move the `columns_hash` call into
the `loading` block.  This solves the circular dependency and avoids the
deadlock encountered in 4).

6) Finally, change to use `Rails.application.executor.wrap`, which
behaves similar to interlock.loading but is meants for informing rails
that a plugin needs to load application code.

I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])`
and that method has this in the source code comment:

https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168

```
  Temporarily give up all held Share locks while executing the
  supplied block, allowing any +compatible+ exclusive lock request
  to proceed.
```

Perhaps, since we're loading code, `permit_concurrent_loads` is too
permissive of other exclusive lock requests and we really need to ensure
nothing else is trying to load.
jrafanie added a commit to jrafanie/manageiq-api that referenced this issue Feb 5, 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1671458
Fixes ManageIQ#544

From MartinH's findings in ManageIQ#544, we had one thread at the
`cspec[:klass].constantize` line while another thread was trying to run
`klass.columns_hash`, causing a
`Circular dependency detected while autoloading...` error.

I then tried a bunch of things until I found a way to reliably recreate
this error:

```ruby
require_relative 'config/environment'

threads = []
4.times do
  threads << Thread.new { Api::Environment.time_attributes }
end
threads.collect(&:join)
```

I could then run the above script several times in my shell and get the
`circular dependency` error most of the time.

```
for x in `seq 1 10`; do; bundle exec ruby test.rb; done
```

With this test in place, I then tried a few solutions:

1) Move the `klass.columns_hash` block into the permit_concurrent_loads
block:

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f4b6554a 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -22,9 +22,10 @@ module Api
         # Temporary measure to avoid thread race condition which could lead to a deadlock
         ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end
```

This did not fix the `circular dependency` error. Perhaps
`permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads
cross threads?

2) I tried Mutex#synchronize and this worked, but I'd rather we work
with the interlock provided by rails.

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f51de73f 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -1,5 +1,6 @@
 module Api
   class Environment
+    ONE_AUTOLOADER_LOCK = Mutex.new
     def self.url_attributes
       @url_attributes ||= Set.new(%w(href))
     end
@@ -19,12 +20,13 @@ module Api
         next if cspec[:klass].blank?
         klass = nil

-        # Temporary measure to avoid thread race condition which could lead to a deadlock
-        ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
+        # Ensure we're the only thread trying to autoload classes and their columns
+        ONE_AUTOLOADER_LOCK.synchronize do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end

```

3) I tried Sync.new with SH and EX locks instead of Mutex and this
failed with Thread killed errors.

4) I changed the `permit_concurrent_loads` on the interlock to `loading`
and this worked, but Yuri found this caused a deadlock.

5) Use `AS::Dep.interlock.loading` from 4) and move the `columns_hash` call into
the `loading` block.  This solves the circular dependency and avoids the
deadlock encountered in 4).

I noticed, `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load])`
and that method has this in the source code comment:

https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168

```
  Temporarily give up all held Share locks while executing the
  supplied block, allowing any +compatible+ exclusive lock request
  to proceed.
```

Perhaps, since we're loading code, `permit_concurrent_loads` is too
permissive of other exclusive lock requests and we really need to ensure
nothing else is trying to load.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants