Hi, I hope you are doing well. I enjoyed reading the paper.

I wanted to test how your approach works on other datasets, for example the OGB benchmark datasets. I realized that your BIO and CHEM datasets are preprocessed differently, and I could not find the code you used to prepare them; I only see the already-prepared datasets stored in ZIP files. I tried to read your data (the BIO one in particular) with your provided data loaders so that I could infer the structure; once I understand the structure, I can write my own code to preprocess other datasets the same way. Unfortunately, I ran into the issue below: I cannot even look at a single batch after loading your data, because it was built/processed with older versions of PyTorch and PyTorch Geometric. I tried installing the old versions but could not find a compatible combination that works, and pinning old versions is painful anyway, since it requires different GPU drivers and changes to every other dependency. For instance, if I read your BIO data and try to fetch a batch, I get the error below.
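For reference, this is roughly what I am running (a sketch from memory; the exact constructor arguments may not match your code, and the imports assume I am running from inside your bio/ directory):

```python
# Rough repro sketch (paths and arguments are my best guesses, not exact).
from loader import BioDataset                       # your bio/loader.py
from dataloader import DataLoaderSubstructContext   # your bio/dataloader.py

root = 'dataset/unsupervised'   # where I unzipped your BIO data
dataset = BioDataset(root, data_type='unsupervised')
loader = DataLoaderSubstructContext(dataset, batch_size=32,
                                    shuffle=True, num_workers=2)

for step, batch in enumerate(loader):
    print('batch: ', batch)
    break
```

The first iteration over the loader then fails with: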
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-24-ef07b0600217> in <module>
----> 1 for step, batch in enumerate(loader):
      2     print('batch: ', batch)
      3     break

/usr/local/lib/python3.7/dist-packages/torch/_utils.py in reraise(self)
    459             # instantiate since we don't know how to
    460             raise RuntimeError(msg) from None
--> 461         raise exception
    462
    463

RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch_geometric/data/dataset.py", line 197, in __getitem__
    data = self.get(self.indices()[idx])
  File "/usr/local/lib/python3.7/dist-packages/torch_geometric/data/in_memory_dataset.py", line 89, in get
    decrement=False,
  File "/usr/local/lib/python3.7/dist-packages/torch_geometric/data/separate.py", line 23, in separate
    for batch_store, data_store in zip(batch.stores, data.stores):
  File "/usr/local/lib/python3.7/dist-packages/torch_geometric/data/data.py", line 486, in stores
    return [self._store]
  File "/usr/local/lib/python3.7/dist-packages/torch_geometric/data/data.py", line 424, in __getattr__
    "The 'data' object was created by an older version of PyG. "
RuntimeError: The 'data' object was created by an older version of PyG. If this error occurred while loading an already existing dataset, remove the 'processed/' directory in the dataset's root folder and try again.
The message says to remove the 'processed/' directory, which does not really apply here, and even when I do remove it, it still does not work.
Issue 2: if I use an OGB benchmark dataset, for example 'ogbl-collab', it is still not possible to use it with your BioDataset or DataLoaderSubstructContext because of structural/pre-processing differences. This also throws an error, even though the raw and processed directories are present in the path.
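Roughly what I tried (again a sketch; the path and argument names are my guesses):

```python
# Pointing your BioDataset at a directory containing the downloaded
# ogbl-collab files (hypothetical path; run from inside your bio/ directory).
from loader import BioDataset

dataset = BioDataset('dataset/ogbl_collab', data_type='unsupervised')
```

which fails with: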
NotImplementedError: Must indicate valid location of raw data. No download allowed
I compared the directory layout with your BIO directory and realized that the OGB dataset comes as CSV files compressed as .gz.
My question is simple: how do I reproduce your work on other datasets? Is there a link where you already explain this? Even more simply, it would be great if you could explain what modifications I should make to the OGB benchmark datasets so that they work with your data loading code.
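To make the question concrete, here is my own rough guess at what such an adaptation might look like (everything here is hypothetical: I do not know which attributes your loaders expect on each graph, and I am guessing the processed file name):

```python
# Hypothetical adaptation sketch: re-save an OGB dataset in a plain
# InMemoryDataset layout, which is my guess at what your loaders read
# from processed/. Attribute names and the file name are assumptions.
import torch
from torch_geometric.data import Data, InMemoryDataset
from ogb.linkproppred import PygLinkPropPredDataset


class OGBAsBioStyleDataset(InMemoryDataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super().__init__(root, transform, pre_transform)
        self.data, self.slices = torch.load(self.processed_paths[0])

    @property
    def raw_file_names(self):
        return []  # nothing to fetch here; OGB downloads its own files in process()

    @property
    def processed_file_names(self):
        return ['geometric_data_processed.pt']  # guessing the name your loader expects

    def download(self):
        pass

    def process(self):
        ogb = PygLinkPropPredDataset(name='ogbl-collab', root='dataset/ogb')
        graph = ogb[0]  # ogbl-collab is a single large graph
        # Only the basic attributes are carried over; I do not know which
        # additional attributes (features, labels, indices) your pipeline needs.
        data_list = [Data(x=graph.x, edge_index=graph.edge_index)]
        torch.save(self.collate(data_list), self.processed_paths[0])
```

Is something along these lines the right direction, or does the conversion need to happen at a different level?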
Thanks for the work, and I look forward to your suggestions.