
All-classes gen4 dataset #4

Closed
Hatins opened this issue May 9, 2023 · 17 comments

Comments
@Hatins

Hatins commented May 9, 2023

Hi @magehrig,

I would like to express my gratitude for your outstanding work and your generosity in sharing your knowledge. I was wondering if you could provide an all-classes gen4 dataset for more comprehensive testing and for applications in other domains. I have attempted to generate the dataset myself, but the process is quite time-consuming.

@magehrig
Contributor

Hi @Hatins

Thank you for your kind words. The most straightforward way to generate the 1 Mpx dataset with all classes is to remove the two lines here from the preprocessing script and then follow the usual preprocessing steps as described in the readme.
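For illustration only (the exact lines and class ids in the script may differ), the kind of filter meant here is a restriction of the label classes, roughly of the following form; removing it keeps all classes:

import numpy as np

# Hypothetical sketch of the class filter to remove so that all label classes are
# kept; the variable names and class ids are assumptions, not the script's actual code.
keep = np.isin(labels['class_id'], np.asarray([0, 1, 2]))  # e.g. pedestrian, two-wheeler, car
labels = labels[keep]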

Have you tried that already?

@Hatins
Author

Hatins commented May 11, 2023

@magehrig

Thank you for taking the time out of your busy schedule to reply to me. Your response has been very helpful, and I now know how to generate a dataset with labels for all categories. I am currently downloading the raw gen4 data, but I am concerned about how long it will take to generate the dataset (roughly one week, or one month?). I am using a 4×RTX3090 server. Also, what should the parameter below be set to?

parser.add_argument('-np', '--num_processes', type=int, default=1, help="Num processes to run in parallel")

@magehrig
Contributor

For the preprocessing you don't need GPUs, only a CPU. The runtime mostly depends on your hardware, that is, how many parallel threads your CPU allows and how fast your memory is.

If I remember correctly, it took less than 1 hour on my machine using -np 20. Set the number of processes to the number of threads on your machine and not higher.
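For reference, a quick way to check how many threads your machine has (a generic Python snippet, not part of the preprocessing script):

import os

# Number of hardware threads visible to Python; a sensible upper bound for -np.
print(os.cpu_count())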

@Hatins
Author

Hatins commented May 11, 2023

Thank you again for your help. It's truly appreciated, and I'm envious of your machine! If my server can complete the data generation in two or three days, that would be great news for me.

@magehrig
Contributor

Ok, that would be quite slow ;). I suggest you convert a small subset and check whether the dataset matches your expectations before pre-processing the full dataset.

If you realize it is too slow, let me know and we may find another solution. In the meantime, I am closing this issue.

@Hatins
Author

Hatins commented May 11, 2023

Okay, thank you again for your enthusiastic help. I wish you all the best in your research!

@magehrig
Contributor

thanks :)

@Hatins
Author

Hatins commented May 14, 2023

Hi @magehrig,

I'm back again. I followed your advice to test on a subset of data, but the speed is still very slow.

train
>--moorea_2019-01-30_000_td_122500000_182500000_bbox.npy
>--moorea_2019-01-30_000_td_122500000_182500000_td.h5
test
>--moorea_2019-01-30_000_td_671500000_731500000_bbox.npy
>--moorea_2019-01-30_000_td_671500000_731500000_td.h5
val
>--moorea_2019-01-30_000_td_549500000_609500000_bbox.npy
>--moorea_2019-01-30_000_td_549500000_609500000_td.h5

As you can see, for the above subset of gen4, it took around five hours to generate the frames with -np 10.

I thought at first that the slow speed was due to my use of a mechanical hard drive, but after migrating the data to an SSD (Samsung 870), the speed did not improve.

As for the CPU, the machine I am using is fairly new, so I don't think the CPU is the problem:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
Stepping: 7
CPU MHz: 800.050
CPU max MHz: 3200.0000
CPU min MHz: 800.0000
BogoMIPS: 4200.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 22528K
NUMA node0 CPU(s): 0-15,32-47
NUMA node1 CPU(s): 16-31,48-63

I suspect the slow speed is because I initially converted the data from dat format to h5 format, following the method from issue #2:

import os
import sys
sys.path.append(os.getcwd())
import h5py
import scripts.genx.tools.psee_loader as psee_loader
from pathlib import Path

if __name__ == '__main__':
    path = '/data/gen4_test'
    input_dir = Path(path)
    train_path = input_dir / 'train'
    val_path = input_dir / 'val'
    test_path = input_dir / 'test'
    for split in [train_path, val_path, test_path]:
        for npy_file in split.iterdir():
            if npy_file.suffix != '.npy':
                continue
            # Derive the corresponding *_td.dat file from the bbox .npy file name.
            dat_path = npy_file.parent / (npy_file.stem.split('bbox')[0] + 'td.dat')
            dat = psee_loader.PSEELoader(str(dat_path))
            # Load all events of the recording into memory.
            eve = dat.load_n_events(dat._ev_count)
            h5_file_path = str(split / dat_path.stem.split('.dat')[0]) + '.h5'
            # Write all events into a single dataset and remove the .dat file.
            with h5py.File(h5_file_path, 'w') as h5:
                h5.create_dataset('events', data=eve)
            dat_path.unlink()
            print('finished creating ' + h5_file_path)

Do you happen to know what might be the reason for this? :)

@magehrig
Contributor

Yes, the problem is that you are creating this h5 dataset without chunking. This means the whole data is written as a single block. As a result, h5 reads out the whole dataset whenever a slice is accessed, which happens very often in a for loop here:

ev_window = h5_reader.get_event_slice(idx_start=idx_start, idx_end=idx_end)

To fix this, you need to write the h5 dataset in chunks. A simple way would be to write all the data at once with "create_dataset" in your code but specify that you want to use chunking.
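For example (a sketch only; the chunk length is an arbitrary choice, and _blosc_opts is the compression helper already used in the preprocessing code):

chunk_len = min(2**16, len(eve))  # chunk length must not exceed the dataset length
h5.create_dataset('events', data=eve, chunks=(chunk_len,), **_blosc_opts(complevel=1, shuffle='byte'))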

Another (typically faster) approach is to write the data incrementally with chunking. If you want to do this, you can take some inspiration from my code here, which does it for the final dataset:

class H5Writer:
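A minimal sketch of that incremental pattern (this is not the repository's H5Writer; the dtype handling and batching are assumptions):

import h5py

# Create a resizable, chunked 1-D dataset and append event batches one at a time
# instead of writing everything as a single block.
def write_events_incrementally(h5_path, event_batches, event_dtype, chunk_len=2**16):
    with h5py.File(h5_path, 'w') as h5f:
        ds = h5f.create_dataset(
            'events',
            shape=(0,),
            maxshape=(None,),     # resizable along the event axis
            chunks=(chunk_len,),  # fixed-size chunks keep slice reads fast
            dtype=event_dtype,
        )
        for batch in event_batches:
            old_size = ds.shape[0]
            ds.resize((old_size + batch.shape[0],))
            ds[old_size:] = batch  # append the batch at the end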


@Hatins
Author

Hatins commented May 15, 2023

Ah, I see. Thank you for your clear guidance. I will try the two methods you suggested and report back. Thanks again!

@Hatins
Author

Hatins commented May 16, 2023

Hi @magehrig,

I'm facing difficulties in solving this problem, and I've made some changes to the code:

    for split in [train_path, val_path, test_path]:
        for npy_file in split.iterdir():
            if npy_file.suffix != '.npy':
                continue
            dat_path = npy_file.parent / (npy_file.stem.split('bbox')[0] + 'td.dat')
            dat = psee_loader.PSEELoader(str(dat_path))
            eve = dat.load_n_events(dat._ev_count)
            h5_file_path = str(split / dat_path.stem.split('.dat')[0]) + '.h5'

            # Same as before, but with chunking and Blosc compression enabled
            # (_blosc_opts is the compression helper from the preprocessing code).
            with h5py.File(h5_file_path, 'w') as h5:
                h5.create_dataset('events', data=eve, chunks=True, **_blosc_opts(complevel=1, shuffle='byte'))
            dat_path.unlink()
            print('finished creating ' + h5_file_path)

However, these modifications don't seem to be effective (it still takes a long time to generate the frames).

When I change chunks=True to chunks=(1,), it significantly increases the time required to convert dat to h5 format.

I carefully reviewed your code for H5Writer and noticed that the chunk shape was set to [1, 20, 360, 720]. However, I'm unsure how to determine the appropriate chunk parameters when converting dat to an h5 file.

Currently, I'm completely stuck and would greatly appreciate your guidance in resolving this problem. Could you please help me with the necessary modifications or provide a set of basic conversion code?

@magehrig
Contributor

Ok, first we need to figure out if you can use the h5 data I provide to preprocess the dataset reasonably fast. If that works, it means there is room for optimization in how you create your own h5 dataset. I suggest downloading the 1 Mpx validation or test set h5 files and running the preprocessing scripts to check how fast this runs.

Assuming that indeed this runs reasonably fast, you can go on and optimize your code:

First, I am not sure if your code is compatible with the preprocessing script, because the preprocessing script accesses x, y, t, and p individually:

x_array = np.asarray(ev_data['x'][idx_start:idx_end], dtype='int64')

To fix this, you can do the following:

# shape is a placeholder here; in practice set it to (num_events,) and keep a fixed chunk length.
shape = (2**16,)
h5f.create_dataset('events/x', shape=shape, dtype='u2', chunks=shape, **_blosc_opts(complevel=1, shuffle='byte'))
h5f.create_dataset('events/y', shape=shape, dtype='u2', chunks=shape, **_blosc_opts(complevel=1, shuffle='byte'))
h5f.create_dataset('events/p', shape=shape, dtype='u1', chunks=shape, **_blosc_opts(complevel=1, shuffle='byte'))
h5f.create_dataset('events/t', shape=shape, dtype='i8', chunks=shape, **_blosc_opts(complevel=1, shuffle='byte'))
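You can then fill these datasets from the structured event array, e.g. (a sketch; it assumes the datasets were created with shape equal to the number of events, and that eve is the array returned by PSEELoader with fields 'x', 'y', 'p', 't'):

# Hypothetical continuation: copy each field of the structured array into its dataset.
h5f['events/x'][:] = eve['x']
h5f['events/y'][:] = eve['y']
h5f['events/p'][:] = eve['p']
h5f['events/t'][:] = eve['t']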

Setting chunks=True as you already did should, however, also work reasonably well.

If I understood you correctly, the time to generate the h5 files is not the issue, but running the pre-processing script on your generated h5 files is slow. Can you confirm this?

@Hatins
Author

Hatins commented May 16, 2023

Sorry, I didn't notice before that you had already provided the original h5 event files. In that case, converting dat to h5 is not so important to me; I will directly download the h5 files you provided. I hope they can be used to build frames at a normal speed. If so, my problem will be completely solved!

@magehrig
Contributor

Sure, the h5 files that are provided contain all the events and labels of the original dataset, but in a more convenient format.

I am a bit confused, since I thought you wanted to convert the h5 files yourself according to the thread in the other issue #2.

I suppose then the easiest way forward is just to use the existing h5 files.

@Hatins
Author

Hatins commented May 17, 2023

I didn't expect that you would provide the whole set of raw data, which is a great help to us. Next, I will directly download the h5 data you provided and then build frames to test whether the speed is normal!

@Hatins
Author

Hatins commented May 18, 2023

Oh my gosh, this is so fast, far faster than my previous frame building! This saved me a lot of time. My problem is now completely solved. Thank you again for your help; I hope the noisy and tedious questions did not bother you too much. You are right, there is really no need for me to convert dat to h5 myself! I have deleted all the dat files and replaced them with the h5 files you provided :)
