Question regarding speed with multiple instances of AI2THOR #711

Open
soyeonm opened this issue Apr 20, 2021 · 9 comments

@soyeonm

soyeonm commented Apr 20, 2021

Hello,

I have a question regarding the number of threads (the number of instances of AI2THOR's controller).
I am using (and I have to use) AI2THOR version 2.1.0.

I observed that on my server, running more than 3 threads (i.e., more than 3 simultaneous instances of AI2THOR) makes the entire computer very, very slow.

Do you have any insight into how to overcome this issue?

Thank you, as always.

@Lucaweihs
Collaborator

Hi @soyeonm,

I have managed to run 30+ AI2-THOR processes even with older versions of AI2-THOR so I suspect something strange is going on. Some questions:

  1. Can you give us some information about your server (OS, nvidia drivers version, # process, # GPUs, RAM, etc)?
  2. What FPS do you get running glxgears?
  3. Can you give a minimum reproducing example script? It would be useful to see how exactly you're initializing everything.

@soyeonm
Author

soyeonm commented Apr 20, 2021

Hello @Lucaweihs,

Thanks very much for your reply and for helping me.

OS version:
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"

Nvidia Drivers version
Driver Version: 450.80.02 CUDA Version: 11.0

# GPUS: 7

RAM: 465982316 KB available

  2. Running glxgears, I get 367.729 FPS.

  3. Running this code requires some setup, but if you just want to see how things are set up:
    I am setting up the controller as in here
    The instance of "ThorEnv" (which inherits from the controller in the above file) is initialized here

Please let me know if it will be helpful for you to run the code. I will prepare a version with minimum setup.

Again, thank you very much!

@soyeonm
Author

soyeonm commented Apr 20, 2021

Also, I am running multiple Xorg servers at the same time (e.g., one Xorg for GPU 1, another for GPU 2, another for GPU 3, etc.), and running one AI2THOR instance per Xorg server.
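
The one-instance-per-X-server layout described above could be written down explicitly, which makes it easier to see how many simulator instances end up sharing each display. This is only a minimal sketch: `assign_displays` is a hypothetical helper, the display names `:1`, `:2`, `:3` are assumed for illustration, and nothing here calls AI2THOR itself.

```python
from collections import Counter
from typing import Dict, List


def assign_displays(num_workers: int, displays: List[str]) -> Dict[int, str]:
    """Map each worker index to an X display, cycling through the list."""
    if not displays:
        raise ValueError("at least one display is required")
    return {
        worker: displays[worker % len(displays)]
        for worker in range(num_workers)
    }


# Hypothetical layout: three X servers, five simulator instances.
assignment = assign_displays(5, [":1", ":2", ":3"])

# Count how many instances would share each X server.
load_per_display = Counter(assignment.values())

print(assignment)
print(load_per_display)
```

Each worker would then pass its assigned display to whatever mechanism it uses to start the simulator on that X server.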

@Lucaweihs
Collaborator

Ah, that 367.729 FPS number suggests to me that something non AI2-THOR-specific is going on. For context, here's what I see on my (not so modern) workstation:

$ glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
229521 frames in 5.0 seconds = 45904.074 FPS

so basically 100x higher FPS than what you're seeing.
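
To make the comparison concrete, the FPS figures can be pulled out of the glxgears output and divided. The following is a rough sketch, assuming the output format shown above; `parse_glxgears_fps` is a hypothetical helper and not part of glxgears or AI2-THOR.

```python
import re
from typing import List

# Matches lines such as "229521 frames in 5.0 seconds = 45904.074 FPS".
_FPS_LINE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_glxgears_fps(output: str) -> List[float]:
    """Return every FPS value reported in a glxgears output string."""
    return [float(match.group(3)) for match in _FPS_LINE.finditer(output)]


sample_output = """Running synchronized to the vertical refresh.
229521 frames in 5.0 seconds = 45904.074 FPS
"""

workstation_fps = parse_glxgears_fps(sample_output)[0]
server_fps = 367.729  # value reported earlier in this thread

ratio = workstation_fps / server_fps
print(f"workstation is about {ratio:.0f}x faster")
```

With the two numbers from this thread the ratio comes out somewhat above 100x, which is consistent with the rough estimate given here.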

To get a bit more information, can you show me the full output from nvidia-smi? Are you running things using the docker setup from ask for alfred?

@soyeonm
Author

soyeonm commented Apr 20, 2021

Hello @Lucaweihs ,

Thank you for your reply.
[Image: Screen Shot 2021-04-20 at 6 33 03 PM]

This is my full output from nvidia-smi.
Apart from the Xorg processes, all of the other processes belong to another user.

Are you running things using the docker setup from ask for alfred?
--> No, I am only using their "startx.py" to setup AI2THOR 2.1.0 with Xserver.

@Lucaweihs
Collaborator

Thanks for the info! A few more thoughts:

  1. When you ran glxgears (or AI2-THOR) which display did you target? If you, e.g., run DISPLAY=:0.7 glxgears do you get higher FPS rates?
  2. AI2-THOR can be quite CPU intensive, is it possible that many of the CPU resources are already being consumed by the other user? If you run mpstat 5 for a bit what %idle do you see?

@soyeonm
Author

soyeonm commented Apr 20, 2021

Hello,

Thank you very much, again.

  1. I cannot run DISPLAY=:0.7 glxgears
    (I obtain Error: couldn't open display :0.7 , probably because I started the xserver on GPU 7).

However, I ran glxgears once again with DISPLAY=:7, and I get the following:
[Image: Screen Shot 2021-04-20 at 6 48 23 PM]

I changed nothing, so it is puzzling that the FPS is so low.

  2. When I run mpstat 5, I get the numbers above.

I do not have any AI2THOR instance running at the moment (nor did I have one running when I replied earlier in this thread).

@Lucaweihs
Collaborator

Lucaweihs commented Apr 20, 2021

@ekolve any ideas?

Given those 1.000 FPS numbers I think there might be something strange going on with the x-server. Can you try (1) killing it, (2) rerunning the startx.py script, and (3) rerunning DISPLAY=:7 glxgears?

While I can't really see it being the issue, it's probably also worth ruling out that the problem is coming from the older AI2-THOR version. Can you try installing the latest AI2-THOR and then running:

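from ai2thor.controller import Controller
import time

NUM_STEPS = 200

c = Controller(x_display="7")  # X display used earlier in this thread
c.reset("FloorPlan1")
start = time.time()
for _ in range(NUM_STEPS):
    c.step("RotateRight")
elapsed = time.time() - start  # stop the clock before shutting down
c.stop()
# Steps per second: divide by the number of steps actually taken.
print(f"{NUM_STEPS / elapsed} FPS")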

@ekolve

ekolve commented Apr 21, 2021

I have seen issues like this before, where glxgears is only able to run at a few hundred FPS, and the only solution I have found is to upgrade the NVIDIA driver and reboot the machine. Rebooting may not be necessary: it may be possible to reload each of the kernel modules referenced by the nvidia module, but I found it easier to just reboot.
