Hello, thanks for your great work. I want to train a model on the PKUMMD dataset with, for example, cos_tr.py, so I only adjusted GraphDataset.py. The code is as follows:
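My actual snippet did not come through above, so purely as an illustration, a minimal sketch of this kind of PKU-MMD dataset adaptation might look like the code below. The pickle layout (a data array of shape (N, C, T, V, M) and a (sample_names, labels) pair) and the class name are assumptions on my side, not the repository's real GraphDataset API:

```python
import pickle

import numpy as np
import torch
from torch.utils.data import Dataset


class PKUMMDGraphDataset(Dataset):
    """Hypothetical sketch only, not my exact edit of GraphDataset.py.

    Assumes the data pickle holds a float32 array of shape
    (N, C, T, V, M) = (N, 3, 300, 25, 2) and the label pickle holds a
    (sample_names, labels) pair, similar to NTU-style skeleton loaders.
    """

    def __init__(self, data_path: str, label_path: str):
        # Load the full skeleton tensor and labels into memory.
        with open(data_path, "rb") as f:
            self.data = pickle.load(f)  # (N, C, T, V, M)
        with open(label_path, "rb") as f:
            self.sample_names, self.labels = pickle.load(f)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        x = np.asarray(self.data[index], dtype=np.float32)  # (C, T, V, M)
        y = int(self.labels[index])
        return torch.from_numpy(x), y
```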
But when I run the code from the command line as:

```bash
python models/cos_tr/cos_tr.py --train --max_epochs 30 --id benchmark_costr_pkummdv1 --gpus "0,1,2,3" --profile_model --profile_model_num_runs 10 --forward_mode clip --batch_size 128 --num_workers 8 --dataset_name pkummd --dataset_classes ./datasets/pkummd/classes.yaml --dataset_train_data /data/pkummdv1_float32/train_subject_data_v1.pkl --dataset_val_data /data/pkummdv1_float32/test_subject_data_v1.pkl --dataset_train_labels /data/pkummdv1_float32/train_subject_label_thoum_v1.pkl --dataset_val_labels /data/pkummdv1_float32/test_subject_label_thoum_v1.pkl
```
the used memory quickly rises to more than 100 GB, even though my whole dataset is only about 5 GB. The output log is:
```
lightning: Global seed set to 123
ride: Running on host gpu-task-nod5
ride: ⭐️ View project repository at git@github.com:LukasHedegaard/continual-skeletons/tree/a8fe2937a33f24cce65c1f8c2fc41081bceda721
ride: Run data is saved locally at logs/run_logs/benchmark_costr_pkummdv1/version_6
ride: Logging using Tensorboard
ride: 💾 Saving logs/run_logs/benchmark_costr_pkummdv1/version_6/hparams.yaml
ride: 🚀 Running training
continual: Temporal stride of 2 will result in skipped outputs every 1 / 2 steps
continual: Temporal stride of 2 will result in skipped outputs every 1 / 2 steps
continual: Temporal stride of 2 will result in skipped outputs every 1 / 2 steps
continual: Temporal stride of 2 will result in skipped outputs every 1 / 2 steps
models: Input shape (C, T, V, S) = (3, 300, 25, 2)
models: Receptive field 449
models: Init frames 144
models: Pool size 75
models: Stride 4
models: Padding 152
models: Using Continual CallMode.FORWARD
ride: ✅ Checkpointing on val/loss with optimisation direction min
/home/yaoning.li/Anaconda/yes/envs/mmlab/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:110: LightningDeprecationWarning: `Trainer(distributed_backend=ddp)` has been deprecated and will be removed in v1.5. Use `Trainer(accelerator=ddp)` instead.
  rank_zero_deprecation(
lightning: GPU available: True, used: True
lightning: TPU available: False, using: 0 TPU cores
lightning: IPU available: False, using: 0 IPUs
lightning: Global seed set to 123
lightning: initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/4
```
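For scale on the 5 GB vs. 100+ GB gap, here is a rough back-of-envelope calculation. It assumes that every DDP process and every DataLoader worker ends up holding its own in-memory copy of the unpickled training data, which I have not verified:

```python
# Back-of-envelope only; the assumption that each process and each worker
# holds a full copy of the unpickled data is unverified, not measured.
num_ddp_processes = 4      # --gpus "0,1,2,3" with the ddp backend
workers_per_process = 8    # --num_workers 8
dataset_size_gb = 5        # approximate size of the pickled skeleton data

copies = num_ddp_processes * (1 + workers_per_process)  # main process + its workers
print(copies, "copies ->", copies * dataset_size_gb, "GB")  # 36 copies -> 180 GB
```

If something like this happens, the footprint would pass 100 GB even with only ~5 GB of raw data.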
I don't know why this happens. Can you help me? Thank you!