-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Owned layers from a dataset that are Send
#238
Conversation
Send
I support this; I've wanted something like this for the Your approach above makes it a bit more ergonomic, and may be more efficient if the object is being passed around many threads. It also makes it much more ergonomic to use in async contexts (most spawners require |
Can someone tell me what's required to finish here in order to get this PR approved? |
i had no time to look at this yet. i will try to do it this week |
I am not sure if it is enough to add the LayerAccess. I guess we need to do the same for Datasets. Some datasets and layers in GDAL are the same object with a pointer into the data. So we need a DatasetAccess trait that wraps the dataset pointer. We must ensure that the dataset is not opened in shared mode i guess. Then we can have a Dataset<A: DatasetAccess> . An owned Dataset is Send since we impl Send for OwnedDatasetAccess but not for SharedDatasetAccess. Then we need a method in Dataset to create an owned Layer. This method must consume the dataset and take the pointer. It might have a method to get the Dataset back when the OwnedLayer is closed. |
Okay, but how is this different from my implementation? How would we open something in shared mode in the current implementation? Datasets are already |
I think your Implementation is fine but there is a hole. GDALOpenEx allows to specify a GDAL_OF_SHARED option. This does the same as GDALOpenShared. It works like this: "Starting with GDAL 1.6.0, if GDALOpenShared() is called on the same pszFilename from two different threads, a different GDALDataset object will be returned as it is not safe to use the same dataset from different threads, unless the user does explicitly use mutexes in its code." So you can open a dataset (pointer) twice and send it to two different threads. Then you can create two OwnedLayers with the same dataset (pointer) |
Okay, but this means the current implementation has the problem that We could just prevent opening a I could create both versions now, so, idk. |
@jdroenner I believe we discussed this while supporting flags via |
This would mean, I don't implement a |
Yeah, if there is a safe way to pass that option, then that needs to throw an error. |
Luckily, this is already implemented to ensure that a
So I have nothing to do here. In addition to that, I've added an |
@ChristianBeilschmidt could you add a test for the OwnedFeatures iterator? |
You can find one in b1afd7d. It is called |
Thanks @ChristianBeilschmidt . PR looks good to me. We could give it a couple of days to see if anyone else wants to take a look. |
Shall we merge this? Could someone with bors rights accept this? |
@rmanoka can you try to r+ it? |
bors r+ |
🔒 Permission denied Existing reviewers: click here to make rmanoka a reviewer |
@rmanoka try again now |
bors r+ please |
238: Owned layers from a dataset that are `Send` r=rmanoka a=ChristianBeilschmidt - [X] I agree to follow the project's [code of conduct](https://github.com/georust/gdal/blob/master/CODE_OF_CONDUCT.md). - [ ] I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users. --- We ran into the problem of having non-`Send` `Layer`s. Without `Send` it is barely usable in an asynchronous or threading context. You cannot store it somewhere and create tasks that do something with them since you would need to send a `Layer` to another thread. From previous discussions about thread-safety in GDAL we found out that we must not access `Dataset`s or `Layer`s at the same time from different threads. But for `Dataset`s, we found that it is safe to `Send` them to another thread since GDAL 2.4 (or something). For `Layer`s, it is actually okay to assume that the are `Send` as long as we have a one-to-one relationship between `Dataset`s or `Layer`s, or, stated differently, we must ensure that there are no two `Layer`s that point to one `Dataset`. If we would have to `Layer`s pointing to the same `Dataset` and they are `Send`, then we could have two different threads accessing the same `Dataset`. We must not allow this. However, from discussions with `@jdroenner` , we thought that if we would introduce a `Layer` that owns its `Dataset`, we could make it `Send` since we can ensure that the `Dataset` isn't accessed another time, for instance, from another thread. So in this draft PR I've created an `OwnedLayer` that does exactly this, it encapsulates a `Layer` and its `Dataset`. In `Dataset`, we can then choose if we want to have a reference to `Layer` or we want to convert our `Dataset` into an `OwnedLayer`. I've added a test that this behavior works. Since the operations on `Layer` and `OwnedLayer` are more or less the same, I've transferred them into a `LayerAccess` trait. This unfortunately would mean a breaking change. On the other hand, it means not much more code for preserving both types. Having said all this, what are your opinions on `OwnedLayer`s? Do you like or criticize the concept? What are your thoughts on naming the two types? Co-authored-by: Christian Beilschmidt <[email protected]>
bors r- |
Canceled. |
I understand the reasoning and don't have any constructive suggestions, but TBH I really don't like these changes 😃. |
I can include a changelog. If you don't like this PR @lnicola, we should think about how we can do this differently. In my opinion, we cannot not tackle this problem. |
No, we should merge this. If someone comes up with a better approach in the future, we can break the API again. |
bors merge |
Build succeeded: |
CHANGES.md
if knowledge of this change could be valuable to users.We ran into the problem of having non-
Send
Layer
s. WithoutSend
it is barely usable in an asynchronous or threading context. You cannot store it somewhere and create tasks that do something with them since you would need to send aLayer
to another thread.From previous discussions about thread-safety in GDAL we found out that we must not access
Dataset
s orLayer
s at the same time from different threads. But forDataset
s, we found that it is safe toSend
them to another thread since GDAL 2.4 (or something). ForLayer
s, it is actually okay to assume that the areSend
as long as we have a one-to-one relationship betweenDataset
s orLayer
s, or, stated differently, we must ensure that there are no twoLayer
s that point to oneDataset
. If we would have toLayer
s pointing to the sameDataset
and they areSend
, then we could have two different threads accessing the sameDataset
. We must not allow this.However, from discussions with @jdroenner , we thought that if we would introduce a
Layer
that owns itsDataset
, we could make itSend
since we can ensure that theDataset
isn't accessed another time, for instance, from another thread.So in this draft PR I've created an
OwnedLayer
that does exactly this, it encapsulates aLayer
and itsDataset
. InDataset
, we can then choose if we want to have a reference toLayer
or we want to convert ourDataset
into anOwnedLayer
. I've added a test that this behavior works.Since the operations on
Layer
andOwnedLayer
are more or less the same, I've transferred them into aLayerAccess
trait. This unfortunately would mean a breaking change. On the other hand, it means not much more code for preserving both types.Having said all this, what are your opinions on
OwnedLayer
s?Do you like or criticize the concept?
What are your thoughts on naming the two types?