Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Any code that uses threads and the AR connection pool could mix data #186

Open
foeken opened this issue Dec 4, 2014 · 24 comments
Open

Any code that uses threads and the AR connection pool could mix data #186

foeken opened this issue Dec 4, 2014 · 24 comments

Comments

@foeken
Copy link

foeken commented Dec 4, 2014

We had a very nasty issue in production yesterday. When using threaded code it's usually a good idea to use ActiveRecord's with_connection block:

ActiveRecord::Base.connection_pool.with_connection do
  # Do stuff
end

When you run this code in a Thread that has no active connection a new connection will be yielded from the pool. If you have started your request in a different thread and switched tenants there, the new connection object will not use the previously selected tenant.

Even worse, since this is a reused connection it could still be set to a different tenant all together causing your customers to see other people's data...

# Switch to tenant A
Thread.new do
  ActiveRecord::Base.connection_pool.with_connection do
    # Best case: global tenant
    # Worst case: other random tenant that was previous user of the connection.
  end
end

We encountered this issue using the Rails live streaming feature for our API. This introduces threads in your controllers, forcing you to use the AR with connection block (or suffer problems with connection reuse).

I was thinking of simply overriding the with_connection block to switch to the parent's thread tenant. But I wanted to get this out here first.

@foeken
Copy link
Author

foeken commented Dec 4, 2014

For now I am adding this piece of code:

module ActiveRecord
  module ConnectionAdapters
    class Mysql2Adapter < AbstractMysqlAdapter

      set_callback :checkout, :after, :switch_to_global_database

      def switch_to_global_database
        execute("USE `#{Rails.application.config.database_configuration[Rails.env]['database']}`")
      end

    end
  end
end

Which will at the very least prevent tenants from accessing other tenants' data...

@bradrobertson
Copy link
Contributor

Thx, will be looking at this today. Do you think you can reproduce it in a test at all?

@foeken
Copy link
Author

foeken commented Dec 9, 2014

If you simply drop the above code in a test you will see the failure right away. Just do something with a AR object inside a thread. It won't use the correct database...

@bradrobertson
Copy link
Contributor

Just need some clarification here:

Right now, Apartment will instantiate a new adapter per thread. So if for instance you do the following:

Apartment::Tenant.switch('some-tenant')
# Current adapter (and connection) now points at 'some-tenant'
Thread.new do
  # A NEW adapter is now instantiated, and thus pointing to the default tenant
  # Expected behaviour is that we'd still be in the previous tenant (ie 'some-tenant') ??
end

Can you just verify that's what you're looking for? That spinning up a new thread would use the tenant previously selected in the main thread from which is was spun up?

@foeken
Copy link
Author

foeken commented Dec 9, 2014

The current behaviour I am seeing is that when creating a new thread and accessing an AR object it checks a new connection from the pool. This connection is not reset to use the default tenant, it is still using the previous owner's last selected tenant.

So in a clean state, this will work as intended. Since the new database connection had never been used and will default to the global tenant. When you have a pool of 2 and both connections have been used and switched to a tenant once. Then the new connection uses the previously selected tenant. It is not reset.

@foeken
Copy link
Author

foeken commented Dec 9, 2014

Note that this is using the Mysql2 adapter, this might be relevant since some adapters seem to swap the entire connection pool (which I believe may cause other interesting challenges)

@bradrobertson
Copy link
Contributor

Ok, I understand the issue now. Basically when you grab a new connection from the pool, it's possible it was left connected to an old tenant. So ideally, when a connection is checked back in, you want to call Apartment.reset! as you're essentially doing with your monkeypatch.

So I'll look at the check-in/reset idea. It should be noted though that no matter what, if you're checking out a new connection, you have to set the current_tenant. I can't imagine how useful it is in any way to grab a new connection and NOT switch to a particular tenant.

This isn't really a threading issue per-se though it's exposed by threads, it's more of a connection checkout / checkin.

My original question, somewhat relates to this as well in that, if you spin up a new thread, you'll get a new connection. So given the example below, what do you think would be ideal?

Apartment::Tenant.switch('some-tenant')
# Current adapter (and connection) now points at 'some-tenant'
Thread.new do
  # A NEW adapter is now instantiated, a new connection is checked out
  # === WHAT tenant should I be pointed at here? ===
end

@foeken
Copy link
Author

foeken commented Dec 9, 2014

Ideal would be to switch to the tenant, but it's hard to get to that info from the new thread....

@bradrobertson
Copy link
Contributor

Ya I don't know exactly how we'd achieve that. For now though let's assume we always reset on checkin so you'd at least be in the same state each time, that is, pointing to the default (public) tenant.

@foeken
Copy link
Author

foeken commented Dec 9, 2014

I think it's more durable to reset on checkout. If anything goes wrong we are SURE it has been reset then :)

@bradrobertson
Copy link
Contributor

I've added a new failing test

Hoping I can hook into checkout in a similar manner to your impl without monkey patching. This doesn't really have anything to do with threading btw. It's possibly exacerbated by multiple threads, but the real issue is just checkout/checkin of connections.

@ArthurN
Copy link

ArthurN commented Mar 19, 2015

Any update on this?

This could potentially happen on Postgres using schemas, too, right?
(I'm asking because I've noticed the same issue using apartment-sidekiq as reported in #180, except with schemas)

This is a seriously dangerous issue – we've had mixed up customer data ("crossed wires", if you will) in production, and our only workaround is to catch it and terminate the job. We can't parallelize our jobs effectively, either, because of it (we'd be worried too many jobs would have mixed up tenants and would have to get killed).

@bradrobertson
Copy link
Contributor

The amount of work required to achieve this [as far as I can tell] is going to be massive. I really can't see how this is an issue if you're explicit about always selecting the tenant you want. We've NEVER once run into an issue of "crossed wires". Every job always chooses the tenant it wants to operate on.

I definitely don't have any time to solve this, if you guys want to take a stab, you're welcome to submit a PR

@ArthurN
Copy link

ArthurN commented Mar 20, 2015

@bradrobertson Curious, are you explicitly doing Apartment::Tenant.switch! inside of the Job? I've been relying on apartment-sidekiq to tell the job which tenant to use.... perhaps it's an issue with that?

@bradrobertson
Copy link
Contributor

We're not explicitly doing a switch no. apartment-sidekiq DOES do this for every job, so I'm not quite sure how you could be seeing leaks through tenants

@ArthurN
Copy link

ArthurN commented Mar 20, 2015

@bradrobertson What's your DB environment?

@mikecmpbll
Copy link
Collaborator

The problem with Active Record connection handling is that the connection configuration is stored with the model class or ActiveRecord::Base (determined hierarchically), so you simply cannot query different databases concurrently (on the same model, certainly) using Active Record because the connection information isn't local to the each thread. (as demonstrated at #180)

I don't know how it's not a bigger problem with sidekiq users, to be honest. I managed to hack around it for https://github.com/meritec/querrel by creating temporary, anonymous subclasses for each thread and modifying the connection info on the anonymous class. (then shoving the results back into original classes)

Aaron Patterson wrote about the problem with AR connection handling in a blog post ages ago. I don't know if there's any plans afoot to sort it out for Rails 5, I haven't been able to find anything.

@bradrobertson
Copy link
Contributor

@mikecmpbll the connection information (ie where to connect to) isn't stored locally in each thread, but that's not a problem as it stands because we're always connecting to the same db. The reason we don't have issues for sidekiq users (of which we're one) is that the actual connections themselves are all keyed off a thread local object id. So each thread gets its own connection from the pool thus you can query different tenants (not databases) from the same class. The way Apartment switches a tenant (in postgres) is by setting the schema_search_path. So there's no problem with sidekiq.

This issue is referring to connections being left with their schema_search_path set to the previous tenant when a connection is checked back in, thus if a user weren't careful, they could technically query the previous tenant's data. We leave it up to the user to ensure they switch! to the appropriate tenant each time so you should never really encounter issues.

What Aaron is talking about the fixes that would be nice would actually allow Apartment to shard data across multiple physical machines, which would be super useful and a feature many people are after, but his changes won't make things any more threadsafe than they are, because things as they stand, certainly are thread safe.

@mikecmpbll
Copy link
Collaborator

Ah yes, in hindsight my comment is more relevant to #180 . You are right that this is not a problem for postgres schemas, but it remains a problem for mysql users or multi-database users of postgres, regardless of whether the databases span multiple physical machines, as I understand it.

I'd hazard that there are a lot of non-schema users of Apartment, and IIRC there was a time Apartment only did DB switching and didn't support schemas (could be wrong there..), so I'm surprised it hasn't come up more especially with the proliferation of Sidekiq users recently. Anyways, it's not really something Apartment can solve without some pretty hefty changes in Active Record. I just wanted to resurrect the discussion—I plan to take a look to see if there's anything I can do on the AR side of things, but I suspect it might be a little too much to take on.

@bradrobertson
Copy link
Contributor

Sure any insight you provide would be great! For the record mysql operates the same way postgres does by calling use database x per connection so it's neither a problem for postgres or mysql assuming you're going the NON connection based approach. We're considering stripping out the connection based adapters into their own gems and potentially deprecating them as there's NO benefit to using them until we come up with some manageable way of tenanting on physical machines. Just haven't done it yet.

I'd be super interested to know the #'s actually on who is using what type of tenanting approach (mysql vs postgres, connection vs schema) as that'd help a lot in determining the direction of development but I'm not exactly sure how to achieve those stats.

@mikecmpbll
Copy link
Collaborator

Arrghh, I hadn't even noticed the use .. implementation, I thought mysql was always using establish connection. I'd traced the implementation wrongly and thought that it was using the abstract adapter for that part, d'oh. Sorry for all the nonsense!

I'll have a look into a solution for the multiple database servers scenario, never-the-less.

Thanks for your patience! ;)

@bradrobertson
Copy link
Contributor

All good! Happy to hear your feedback/insights from a multi-db perspective.

@waleedasif322
Copy link

Maybe you can put a link up on the readme for http://www.poll-maker.com/ and tell people to choose which type of tenanting they use, schemas or dbs. I use postgres schemas. Thanks for the gem man

@bradrobertson
Copy link
Contributor

Great idea. Just added ! http://www.poll-maker.com/poll391552x4Bfb41a9-15

rpbaltazar added a commit to rails-on-services/apartment that referenced this issue Apr 15, 2020
…nant

- Monkeypatch `connected_to` rails active record method to try to ensure that whenever we switch connection we connect to the proper tenant

- Removed `allow_prepend_tenant_name` configuration because we ran into concurrency issues. When the connection goes back to the pool its attributes are set with the old values and we were getting queries being made to the wrong tenant. I still think that using this prepend_tenant seems like a good idea but I'll need further testing and investigation on how to make it safe. There is a bug reported in the original apartment gem that seems to report similar issues (influitive/apartment#186)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants