Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make drivers thread-safe #246

Open
dionhaefner opened this issue Jan 5, 2022 · 7 comments
Open

Make drivers thread-safe #246

dionhaefner opened this issue Jan 5, 2022 · 7 comments
Labels
enhancement New feature or request

Comments

@dionhaefner
Copy link
Collaborator

Ultimately, we would like to return to multi-threaded access to rasters and metadata. For this we would need to use SQLalchemy's threading mechanism.

@dionhaefner dionhaefner added the enhancement New feature or request label Jan 5, 2022
@dionhaefner dionhaefner mentioned this issue Jan 5, 2022
@mrpgraae
Copy link
Collaborator

Since all of the potential gains from multi-threaded access lie in I/O operations, I'm wondering if it would not make more sense to move towards asyncio and async / await, instead of multi-threading?

I have some experience with SQLAlchemy's new async interface together with asyncpg, so I could be of help here.

Even async with MySQL is slowly maturing with https://github.com/aio-libs/aiomysql though it's not officially supported by SQLAlchemy yet.

@dionhaefner
Copy link
Collaborator Author

The main motivation for this is to have multi-threaded raster access; database queries are never the bottleneck in Terracotta, so no need to optimize here. So unless rasterio / GDAL goes through some serious paradigm shift (i.e., starts supporting non-blocking IO) I don't see much value in this over threading.

@mrpgraae
Copy link
Collaborator

Right, GDAL would be the blocker here. Seems very unlikely that they would decide to make an async interface.

@mrpgraae
Copy link
Collaborator

mrpgraae commented Jan 19, 2022

I could still see some combination of asyncio and threading that would be beneficial.

For example, using an ASGI framework for handling requests, then doing raster access in an external thread which is then awaited. This would mean that Terracotta could serve multiple requests asynchronously.

This would probably not do anything for single-request latency, but it should, in theory, mean that Terracotta would be able to handle many more requests per second. Might also improve single-request latency when under heavy load.

It's a big refactor and I'm not 100% sure if it can be done in a backwards compatible way, though at first glance it should be possible.

@dionhaefner
Copy link
Collaborator Author

I'm not against async in principle, but in this case the work should start with async raster retrieval, not database accesses. Also we'd need to dump flask? Seems like a huge undertaking to me, and given that GDAL still seems to have issues with multithreading this is hypothetical anyway.

@mrpgraae
Copy link
Collaborator

I'm not against async in principle, but in this case the work should start with async raster retrieval, not database accesses.

Yes that's also what I'm suggesting, but to wrap the multithreading inside an async interface, so that it would work within an ASGI web framework. I'm just saying that for async raster retrieval to really pay off, requests should be handled async as well.

Also we'd need to dump flask? Seems like a huge undertaking to me, and given that GDAL still seems to have issues with multithreading this is hypothetical anyway.

Yes, it would be a lot of work, I'm not saying this is something we should definitely do. I'm just thinking out loud and airing ideas 🙂

And yes, it would hinge on GDAL fixing the issues with multithreading. As I understand the issue, GDAL is supposed to handle multi-threaded reading, but there is a race-condition bug somewhere which means that this doesn't work in practice?

@dionhaefner
Copy link
Collaborator Author

As I understand the issue, GDAL is supposed to handle multi-threaded reading, but there is a race-condition bug somewhere which means that this doesn't work in practice?

Yes, or we don't know. Since the bug is so elusive you only notice it in heavy use (i.e. a production setting), so it may actually be fixed already...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants