Dataloader in multi-threaded environments #71
My idea would be to port https://github.com/graphql/dataloader to Python, similarly to how GraphQL.js has been ported - it seems Syrus already started to work on this (https://github.com/syrusakbary/aiodataloader). Dataloader has seen a lot of new development recently, so this could be worthwhile.
Nice! I didn't realize Syrus had made another version of Dataloader specifically for asyncio. Re the recent developments of Dataloader in JS, the task scheduling in Dataloader v2 uses the same underlying mechanism as in v1. Since Syrus' repo is a straight port from Dataloader JS v1, let's use it as a basis for the purpose of this conversation. I see two issues with using it in multi-threaded mode:

Unless we wrap each [...]
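The list of issues mentioned above is missing from this copy of the thread, so the following is only an illustration of one plausible concern, not a restatement of the original points: an aiodataloader `DataLoader` does its batching through the event loop it is used with, so a single instance should not be shared between threads that each run their own loop. A minimal per-thread-loader sketch (the `KeyLoader` class and its placeholder values are made up for illustration):

```python
import asyncio
import threading

from aiodataloader import DataLoader


class KeyLoader(DataLoader):
    # Hypothetical loader that just returns a placeholder value per key.
    async def batch_load_fn(self, keys):
        return [f"value-{key}" for key in keys]


_local = threading.local()


def get_loader() -> KeyLoader:
    # One loader per thread, so batching state never crosses event loops.
    if not hasattr(_local, "loader"):
        _local.loader = KeyLoader()
    return _local.loader


def worker(start):
    # Each thread drives its own event loop.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        async def run():
            loader = get_loader()
            # Two loads issued in the same tick are batched into one call.
            return await asyncio.gather(loader.load(start), loader.load(start + 1))

        print(threading.current_thread().name, loop.run_until_complete(run()))
    finally:
        loop.close()


if __name__ == "__main__":
    threads = [threading.Thread(target=worker, args=(i,)) for i in (1, 3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```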
Currently I don't have enough time to look deeper into this and make a qualified assessment. Maybe you should examine what they are doing in other languages, e.g. graphql/dataloader#30.
Sounds good. I'll look into it when I start working on upgrading to v3 (probably not before Q2/Q3 2020). In the meantime, let's keep this issue open in case someone needs this earlier than I do, so we can discuss it here. If someone is interested in working on this, refer to syrusakbary/promise#81 to learn more about Dataloader and thread-safety in Python.
Looks like it might be possible to use asgiref's sync_to_async:

```python
import asyncio
import os
import time

# Configure the asgiref thread pool size.
os.environ['ASGI_THREADS'] = '16'

from aiodataloader import DataLoader
from asgiref.sync import sync_to_async


class TestLoader(DataLoader):
    @sync_to_async
    def batch_load_fn(self, keys):
        t = time.time()
        return [f'{key}-{t}' for key in keys]


async def main():
    loader = TestLoader()
    one1 = loader.load('one')
    one2 = loader.load('one')
    two = loader.load('two')
    print(await one1, await one2)
    print(await two)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
```
@silas I'm not sure why you need [...]. If you want to make this work in a multi-threaded environment, I recommend that you make [...]
@jhgg You can't call [...]. That said, I never call [...]. Probably not a general solution, but it seems like it's possible to make it work now.
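Since the specifics of the suggestion above were lost in this copy, here is a hedged alternative sketch (my own assumption, not necessarily what was recommended): instead of asgiref's sync_to_async, a blocking batch function can be handed to the event loop's default thread-pool executor via run_in_executor:

```python
import asyncio
import functools
import time

from aiodataloader import DataLoader


def blocking_batch_load(keys):
    # Stand-in for a blocking call (database query, HTTP request, ...).
    time.sleep(0.1)
    return [f"value-for-{key}" for key in keys]


class ExecutorLoader(DataLoader):
    async def batch_load_fn(self, keys):
        # Hand the blocking work to the default ThreadPoolExecutor so the
        # event loop is not blocked while the batch is being loaded.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None, functools.partial(blocking_batch_load, keys)
        )


async def main():
    loader = ExecutorLoader()
    # Both loads issued in the same tick end up in a single batch call.
    print(await asyncio.gather(loader.load("a"), loader.load("b")))


if __name__ == "__main__":
    asyncio.run(main())
```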
Why isn't the promise library supported anymore - or an equivalent? I want to use the dataloader pattern in a sync application. It's not even multi-threaded. All I want is to return an object from my resolver which will be resolved in a second tick, after the initial pass through [...]
@miracle2k The async/await feature is the modern equivalent of promises. It's the official solution to this problem and is supported by both the Python syntax and the standard library. Promises are not part of the standard library.
@Cito Ok, but I don't want to use async in this code base, while I do want to use the data loader. I think this was a reasonable thing to have been supported previously. I am aware that I can do the following, and indeed this is what I am doing now and it works:
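The snippet referenced here did not survive in this copy. The following is only a sketch of what such a setup could look like, with a made-up schema, field, and loader, using asyncio solely to drive graphql-core's async execution and aiodataloader from otherwise synchronous code:

```python
import asyncio

from aiodataloader import DataLoader
from graphql import (
    GraphQLField, GraphQLObjectType, GraphQLSchema, GraphQLString, graphql
)


class NameLoader(DataLoader):
    # Hypothetical loader: one batched call instead of one call per key.
    async def batch_load_fn(self, keys):
        return [f"name-{key}" for key in keys]


def make_schema(loader):
    return GraphQLSchema(
        query=GraphQLObjectType(
            "Query",
            {
                "name": GraphQLField(
                    GraphQLString,
                    # The resolver returns the loader's Future; the async
                    # executor awaits it on the next loop tick.
                    resolve=lambda obj, info: loader.load("42"),
                )
            },
        )
    )


def execute_sync(query: str):
    # Synchronous entry point: spin up an event loop just for this call.
    async def run():
        loader = NameLoader()
        schema = make_schema(loader)
        return await graphql(schema, query)

    return asyncio.run(run())


if __name__ == "__main__":
    print(execute_sync("{ name }"))
```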
Now we are using asyncio as the "promise execution engine" for dataloader only, and otherwise the code is synchronous. I don't know how high the performance penalty is, but would guess it's minimal / similar to what it was with the promise library. Note for anyone attempting this: In my particular case this required changes to [...]
Hi,

I've been thinking a bit about how we could implement the Dataloader pattern in v3 while still running in multi-threaded mode. Since v3 does not support Syrus's Promise library, we need to come up with a story for batching in async mode, as well as in multi-threaded environments. There are many libraries that do not support `asyncio`, and there are many cases where it does not make sense to go fully async.

As far as I understand, the only way to batch resolver calls from a single frame of execution would be to use `loop.call_soon`. But since `asyncio` is not threadsafe, that means we would need to run a separate event loop in each worker thread. We would need to wrap the `graphql` call with something like this: [...] (see the sketch below).

Is that completely crazy? If yes, do you see a less hacky way? I'm not very familiar with `asyncio`, so I would love to get feedback.

Cheers
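The code that originally followed "something like this:" above is missing from this copy. Below is a rough, hedged guess at the kind of wrapper being described (not the author's actual code): every worker thread creates its own event loop around the `graphql` call:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

from graphql import graphql


def graphql_in_thread(schema, source, **kwargs):
    # Each worker thread gets its own event loop, since an asyncio
    # loop must not be shared between threads.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        return loop.run_until_complete(graphql(schema, source, **kwargs))
    finally:
        loop.close()


def run_many(schema, queries):
    # Example: execute several queries concurrently from a thread pool,
    # e.g. the worker threads of a WSGI server.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda q: graphql_in_thread(schema, q), queries))
```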