Paper Link
Link to trained weights
Important
For the multi-agent models, the trained weights were created from agents with a random policy. If you want environments that are more accurate across a wider range of situations, include data from agents with various skill levels, such as random, beginner, intermediate, and expert.
Note
Please also refer to the `scripts` directory for the 1st and 2nd training phases and for encoding images after the 1st training phase. The hyperparameters we applied for each dataset are also listed there.
Tip
We used a single GPU with 48 GB of memory for the 2nd training phase. If your machine has less GPU memory, reduce the batch size.
docker build -t [Docker image name] .
docker run --rm -it --ipc=host --gpus '"device=[device id(s)]"' -v $(pwd):/work [Docker image name]:latest
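For example, a minimal sketch assuming a hypothetical image name `game-sim` and a single GPU with device id `0`:
docker build -t game-sim .
docker run --rm -it --ipc=host --gpus '"device=0"' -v $(pwd):/work game-sim:latest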
Choose a dataset name among `boxing`, `pong`, `carla`, or `gtav`. For the image size, specify `64x64` for `boxing`, `pong`, or `carla`, and `48x80` for `gtav`.
python data/multi_thread_processing.py --dataset [dataset name] --num_eps 1500 --data_dir datasets --num_threads [number of threads] --num_agents [number of agents]
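For instance, to collect 1,500 `boxing` episodes, assuming hypothetical values of 8 threads and 2 agents:
python data/multi_thread_processing.py --dataset boxing --num_eps 1500 --data_dir datasets --num_threads 8 --num_agents 2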
Then create a file to split the dataset into train, validation, and test datasets.
python data/data_split.py --datapath [dataset path]
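For example, assuming the collected data landed under the hypothetical path `datasets/boxing`:
python data/data_split.py --datapath datasets/boxing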
python enc_dec_training.py --log_dir [log path] --use_perceptual_loss --batch [batch size] --data_path [dataset path] --dataset [dataset name]
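As a concrete sketch, assuming a hypothetical batch size of 16 (reduce it if your GPU memory is small, per the tip above) and hypothetical log and data paths:
python enc_dec_training.py --log_dir logs/boxing_encdec --use_perceptual_loss --batch 16 --data_path datasets/boxing --dataset boxing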
python latent_encoder.py --ckpt [checkpoint path] --results_path [output path for encoded dataset] --data_path [dataset path] --dataset [dataset name] --img_size [image size (heightxwidth)]
Use `--visualize 1` and specify `--vis_bs [batch size for visualization]` to check the images after encoding and decoding are applied, for debugging purposes.
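For example, encoding the `boxing` dataset at `64x64` with visualization enabled (the checkpoint and output paths below are hypothetical):
python latent_encoder.py --ckpt logs/boxing_encdec/checkpoint.pt --results_path datasets/boxing_encoded --data_path datasets/boxing --dataset boxing --img_size 64x64 --visualize 1 --vis_bs 4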
For the `action_space` argument, specify `4` for `pong`, `6` for `boxing`, `2` for `carla`, and `3` for `gtav`.
There are two ways to train the Transition Learner.
[1] Train auto-regressively.
python trans_learner_training_ar.py --batch_size [batch size] --data_dir [dataset directory path] --num_workers [number of data processing workers] --max_seq_len [maximum sequence length for the visual frames and actions]
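A sketch with hypothetical values (a batch size of 32, 4 workers, and a maximum sequence length of 32; tune these for your data and GPU memory):
python trans_learner_training_ar.py --batch_size 32 --data_dir datasets/boxing_encoded --num_workers 4 --max_seq_len 32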
[2] Train with a GAN.
python trans_learner_training_gan.py --batch_size [batch size] --data_dir [dataset directory path] --num_workers [number of data processing workers] --max_seq_len [maximum sequence length for the embeddings of visual frames and actions] --num_transenclayer [number of transformer layers] --attn_mask_type [attention mask type] --dataset [dataset name] --action_space [size of the action space]
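For example, for `boxing` (so `--action_space 6`, per the mapping above); the batch size, sequence length, worker count, and transformer layer count below are hypothetical, and the attention mask type is left as a placeholder:
python trans_learner_training_gan.py --batch_size 32 --data_dir datasets/boxing_encoded --num_workers 4 --max_seq_len 32 --num_transenclayer 6 --attn_mask_type [attention mask type] --dataset boxing --action_space 6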
`sudo` is required to run the `keyboard` module. Use an image located under the `init_imgs` directory as the initial image required to run the simulator.
The following are the key bindings for all the environments supported by this repo.
- [GTA V]
  Left: a, Right: d
- [Pong (2 agents)]
  1st agent: Fire: w, Left: a, Right: d
  2nd agent: Fire: i, Left: j, Right: l
- [Pong (4 agents)]
  1st agent: Fire: w, Left: a, Right: d
  2nd agent: Fire: t, Left: h, Right: f
  3rd agent: Fire: i, Left: j, Right: l
  4th agent: Fire: s, Left: z, Right: c
- [Boxing]
  1st agent: Fire: e, Left: a, Right: d, Up: w, Down: s
  2nd agent: Fire: u, Left: j, Right: l, Up: i, Down: k
Press `q` or Ctrl + C in the terminal to quit the environment you are playing.
Important
For the multi-agent models, the trained weights were created from agents with a random policy. Thus, the generated results may be inconsistent if the inputs for the agents are not random.
Tip
In the `pong` environment, the transitions are rather slow. If you find this to be the case, use `--fps 60` for a more challenging transition speed.
sudo python3 simulator.py --encdec_ckpt [encoder decoder checkpoint path] --trans_ckpt [transition learner checkpoint path] --init_img_path [initial image path]
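For example, for the `boxing` environment (all paths below are hypothetical):
sudo python3 simulator.py --encdec_ckpt logs/boxing_encdec/checkpoint.pt --trans_ckpt logs/boxing_trans/checkpoint.pt --init_img_path init_imgs/boxing.png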
- This codebase is based on the nv-tlabs/DriveGAN_code repository, which is released under the Nvidia Source Code License.
- Code for LPIPS is imported from https://github.com/richzhang/PerceptualSimilarity (License).