num_worker training dependency #196

hnbabaei · 2022-09-15T03:43:05Z

Hi mufeili,
I have a couple of questions that I would appreciate your help with.
-Changing the number of workers changes the number of epochs required to converge, which is not expected. Increasing the number of CPUs also increases the training time. Any advice on why this happens?

-Could we use a previously generated graph.bin file to start training without loading the graph from a .csv file?

Thanks.

mufeili · 2022-09-16T08:15:45Z

Hi, which example are you talking about?

hnbabaei · 2022-09-19T18:22:11Z

Hi, the property_prediction with csv_data_configuration. I used the regression_train.py code.

mufeili · 2022-09-29T05:40:02Z

Sorry for the late reply.

> Changing the number of workers changes the number of epochs required to converge, which is not expected. Increasing the number of CPUs also increases the training time. Any advice on why this happens?

Have you eliminated all sources of randomness? By default, regression_train.py does not do so; for example, it does not fix the random seed. Without eliminating randomness, we cannot make a fair comparison between runs.
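A minimal sketch of fixing the seeds before building the dataset and model. The helper name `set_random_seed` is hypothetical and not part of regression_train.py; NumPy, PyTorch and DGL are only seeded if they are installed.

```python
import random


def set_random_seed(seed=0):
    """Seed every random number generator available in this environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
        # Prefer reproducible cuDNN kernels over the fastest ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
    try:
        import dgl
        dgl.seed(seed)
    except ImportError:
        pass


set_random_seed(0)
```

Note that with num_workers > 0 each DataLoader worker process has its own random state, so for a strict comparison it may also be necessary to pass a seeded `generator` and a `worker_init_fn` to the PyTorch DataLoader.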

> Could we use a previously generated graph.bin file to start training without loading the graph from a .csv file?

Yes, you can set load=True here.
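The general pattern behind such a flag is "reuse the cache file if it exists, otherwise build and save it". The sketch below illustrates only that pattern: `load_or_build` and `build_fn` are hypothetical names, and pickle is a stand-in for the DGL binary graph format, so this is not the dgllife implementation.

```python
import os
import pickle


def load_or_build(cache_path, build_fn, load=True):
    """Return cached data if allowed and present, else build it and cache it."""
    if load and os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Cache miss (or load=False): run the expensive construction step once.
    data = build_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

With load=False the data is always rebuilt and the cache file overwritten, which is the safe choice after the .csv file or the featurization has changed.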

hnbabaei · 2022-09-30T18:25:13Z

Thanks very much for your response.

I am trying to use my own train/val/test split, which is based on splitting the 0 and 1 labels separately. I have a column that holds the split assignment for each row. Is there an easy way to do this?
Currently, I have added the following lines to classification_train.py and regression_train.py.

I added a -ttvc (--train-test-val-col) argument that names the column holding the train/test/val split labels:

parser.add_argument('-ttvc', '--train-test-val-col', default=None, type=str,
                    help='column for train-test-val split labels. If None, we will use '
                         'the default method in dgllife for splitting. '
                         '(default: None)')

And here is the change I made where the data gets read and splitting done:

if args['train_test_val_col'] is not None:
    train_set = load_dataset(args, df[df[args['train_test_val_col']]=='train'])
    test_set = load_dataset(args, df[df[args['train_test_val_col']]=='test'])
    val_set = load_dataset(args, df[df[args['train_test_val_col']]=='valid'])
else:
    train_set, val_set, test_set = split_dataset(args, dataset)

Thanks
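One thing to watch with the column-based filter above: rows whose split label is misspelled or missing are silently dropped from all three sets. A minimal sketch of a check, assuming the labels are exactly 'train', 'valid' and 'test' as in the snippet; the helper name `indices_by_split` is hypothetical.

```python
from collections import defaultdict


def indices_by_split(split_labels, expected=("train", "valid", "test")):
    """Group row indices by split label, rejecting any unexpected label."""
    groups = defaultdict(list)
    for i, label in enumerate(split_labels):
        if label not in expected:
            raise ValueError(f"row {i}: unexpected split label {label!r}")
        groups[label].append(i)
    # Always return all expected keys, even if a split is empty.
    return {name: groups[name] for name in expected}
```

It could be called on the values of the split column before the three subsets are built, so that a bad label fails loudly instead of shrinking the dataset.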

hnbabaei · 2022-09-30T20:14:43Z

I actually found the SingleTaskStratifiedSplitter class, which I think will do what I need, but I did not see it among the options for the splitting method. I will try to use it. Please let me know if you think this is the correct way to do it.
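For reference, splitting the 0 and 1 labels separately can be sketched in plain Python as below. This is not the SingleTaskStratifiedSplitter implementation, whose exact behavior should be checked in the dgllife documentation; the function name and the default fractions are assumptions.

```python
import random


def stratified_split(labels, frac_train=0.8, frac_val=0.1, seed=0):
    """Split indices so each class is divided with the same fractions."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, val, test = [], [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        n_train = int(frac_train * len(idx))
        n_val = int(frac_val * len(idx))
        train += idx[:n_train]
        val += idx[n_train:n_train + n_val]
        # Whatever remains after train and val goes to test.
        test += idx[n_train + n_val:]
    return train, val, test
```

Because each class is shuffled and cut independently, a rare positive class keeps roughly the same proportion in all three sets.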

mufeili · 2022-10-02T07:12:33Z

That should work. Feel free to reach out if you encounter any further issues.

hnbabaei · 2022-10-03T18:24:49Z

Thanks Mufei. Just wondering if the code has ever been used for large-scale datasets (e.g., 100 million molecules). If so, what would you suggest using or changing within the code to make it scalable and memory efficient? Thanks.

mufeili · 2022-10-04T04:17:22Z

> Thanks Mufei. Just wondering if the code has ever been used for large-scale datasets (e.g., 100 million molecules). If so, what would you suggest using or changing within the code to make it scalable and memory efficient? Thanks.

I have not tested the code for that scale. Likely you will need to check if you have enough memory to load the data at once or alternatively load the data in batches. You will also need more computational resources, e.g., multi-GPU training. The example here might help.
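A minimal sketch of loading the data in batches, using only the standard library. The helper name `iter_csv_chunks` and the default chunk size are assumptions; graph construction and training for each chunk are left out.

```python
import csv
from itertools import islice


def iter_csv_chunks(path, chunk_size=100_000):
    """Yield lists of row dicts so the whole file is never held in memory."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk
```

Each chunk could then be featurized and trained on before the next one is read. With pandas, `read_csv(..., chunksize=...)` offers a similar iterator over DataFrames.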
