tutorials vrx_docker_debug_info
crvogt edited this page Aug 23, 2023
If your competitor image is not working as expected, it's helpful to know the following:
- Whether your image is running at all.
- Whether it exits early or stays running the entire trial.
- Whether it is running the correct entrypoint.
In this tutorial we will walk through how to get this information using Docker commands.
Before you begin, open two terminals:
- In one terminal, we will run the trial you want to debug, using the testing instructions.
- In the other terminal, we will run some Docker commands to get information about your container.
- Begin running your trial as described in the testing tutorial, using the command:
./run_trial.bash -n $TEAM $TASK $TRIAL
- While you are running the trial in one terminal, execute the following in the other:
docker container ls
This will list all currently running containers.
- While the simulation is still starting up, the output of the above command will be empty (other than the list headers):
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
- If you repeatedly run the `docker container ls` command, you should eventually see the VRX server listed, then both the server and your competitor image listed as running at the same time. For example:
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS                  PORTS       NAMES
efb815475139   virtualrobotx/vrx_2022_simple:test   "/ros_entrypoint.sh"     1 second ago     Up Less than a second               vrx-competitor-system
61768582e987   vrx-server-noetic-nvidia:latest      "/vrx_entrypoint.sh …"   10 seconds ago   Up 6 seconds            11345/tcp   vrx-server-system
- Note that you can automate the process of repeatedly running the `docker container ls` command using the `watch` command:
watch docker container ls
This will show the output of `docker container ls` and by default will dynamically refresh every 2 seconds.
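If you would rather block until the competitor container appears than watch the list by eye, the polling can be scripted. The following is a minimal sketch, not part of `vrx-docker`: `wait_for_container` is a hypothetical helper name, the one-second polling interval and 60-second default timeout are arbitrary choices, and the container name is taken from the NAMES column in the example output above.

```shell
# Sketch: poll `docker container ls` until a named container is running.
# wait_for_container is a hypothetical helper, not part of vrx-docker.
wait_for_container() {
  local name="$1" timeout="${2:-60}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # --filter narrows the listing by name; --format prints only the names.
    if docker container ls --filter "name=${name}" --format '{{.Names}}' \
        | grep -qx "$name"; then
      echo "${name} is running after ${waited}s"
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "${name} did not appear within ${timeout}s" >&2
  return 1
}
```

For example, `wait_for_container vrx-competitor-system 120` returns as soon as the competitor container is listed, or fails after two minutes if it never shows up.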
If you only see the server image listed, or if your image only appears for a short period of time, then it is most likely exiting early. To find out more about what happened, we can check the exit status.
- By default, the `run_trial.bash` script cleans up its server and competitor containers before exiting.
- When debugging an image it is often useful to disable this behavior.
- To do this, use a text editor to open the `run_trial.bash` script at the root of the `vrx-docker` repository.
- Go to the bottom of the script and comment out the second-to-last command:
# Kill and remove all containers before exit
#${DIR}/utils/kill_vrx_containers.bash
exit 0
- Save the file.
- You will now be able to use `docker` commands to get information about these containers after the run has terminated.
- When you are finished debugging you can reverse this change.
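The same edit can be made without a text editor. The sketch below assumes the cleanup line in your copy of `run_trial.bash` reads exactly as quoted above; `disable_cleanup` is a hypothetical helper name, and it prints the modified script to standard output rather than changing the file in place.

```shell
# Sketch: print run_trial.bash with the cleanup call commented out.
# disable_cleanup is a hypothetical helper; it assumes the cleanup line is
# ${DIR}/utils/kill_vrx_containers.bash, as quoted in the tutorial.
disable_cleanup() {
  # Insert '#' before the call, keeping any leading indentation.
  # Lines that are already commented out do not match and pass through as-is.
  sed 's|^\([[:space:]]*\)\(\${DIR}/utils/kill_vrx_containers\.bash\)|\1#\2|' "$1"
}
```

One way to use it is `disable_cleanup run_trial.bash > run_trial_debug.bash`, which leaves the original script untouched, so reversing the change amounts to deleting the copy. The copy may need to be made executable and kept in the same directory as the original.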
These instructions assume you have modified your `run_trial.bash` script as described in the previous section.
- If your container is exiting early, you can check the exit status:
docker ps -a | grep "CONTAINER\|vrx-competitor-system"
- If this command produces no output except the header row, then it is possible that your image did not run at all.
- In this case, the most likely culprit is a misspelled image URL in your dockerhub_image.txt.
- A second possibility is that your image is stored in a private repository and you do not have access from your terminal.
- This situation should be very noticeable because the entire trial will exit early when your image cannot be downloaded.
- In either case, the best way to troubleshoot is to re-run the validation tutorial and make sure you can pull your image.
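A quick way to test both possibilities outside of a trial is to pull the image named in the file by hand. This is a sketch under assumptions: `check_image_pull` is a hypothetical helper name, and you need to pass the path to your own `dockerhub_image.txt`, since its location depends on your team configuration.

```shell
# Sketch: read the image name from a dockerhub_image.txt file and pull it.
# check_image_pull is a hypothetical helper; pass the path to your own file.
check_image_pull() {
  local file="$1" image
  # Remove spaces, newlines and carriage returns that can sneak into the file.
  image="$(tr -d '[:space:]' < "$file")"
  if [ -z "$image" ]; then
    echo "no image name found in ${file}" >&2
    return 2
  fi
  echo "pulling ${image}"
  # A failed pull suggests a misspelled name or a private repository.
  docker pull "$image"
}
```

If the pull fails for an image you know exists, running `docker login` in the same terminal and retrying helps separate an access problem from a spelling problem.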
- If your container ran, but exited early, the output of the command should look something like:
CONTAINER ID   IMAGE                                COMMAND                  CREATED              STATUS                            PORTS   NAMES
1bd20825263d   virtualrobotx/vrx_2022_simple:test   "/bin/bash -c 'catki…"   About a minute ago   Exited (127) About a minute ago           vrx-competitor-system
- In this example, the output tells us:
- The container did run, but it exited at about the same time it was created.
- The exit code was 127, which means it could not find the command it tried to run, according to Docker's exit code documentation.
- The command it tried to run began with `/bin/bash -c 'catki…`. This command is abbreviated, but it would be possible to see the expanded version using the `--no-trunc` option. For example:
docker ps -a --no-trunc | grep "vrx-competitor-system"
- In our case, however, we can already guess what is wrong. The correct entrypoint for the container should be `/ros_entrypoint.sh`, so the command listed should also be `/ros_entrypoint.sh`.
- Instead, the image has been misconfigured to call a `catkin` tool on startup.
- Since nothing is running `ros_entrypoint.sh`, `setup.bash` has not been sourced, and the container is crashing because it can't find the requested tool in its path.
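The exit status shown in the STATUS column can also be read directly from the stopped container and translated into a short description. The sketch below uses hypothetical helper names (`container_exit_code`, `describe_exit_code`). The meanings of 125, 126 and 127 follow Docker's documented exit codes; treating values above 128 as "128 plus a signal number" is a common shell convention rather than a guarantee.

```shell
# Sketch: read and interpret a stopped container's exit code.
# Both function names are hypothetical helpers, not part of vrx-docker.
container_exit_code() {
  # .State.ExitCode is the exit status recorded for the container.
  docker inspect --format '{{.State.ExitCode}}' "$1"
}

describe_exit_code() {
  case "$1" in
    0)   echo "exited normally" ;;
    125) echo "docker run itself failed" ;;
    126) echo "command found but could not be invoked" ;;
    127) echo "command not found" ;;
    *)
      # By convention, 128 + N means the process was ended by signal N.
      if [ "$1" -gt 128 ] 2>/dev/null; then
        echo "terminated by signal $(( $1 - 128 ))"
      else
        echo "application-specific error"
      fi ;;
  esac
}
```

With cleanup disabled, `describe_exit_code "$(container_exit_code vrx-competitor-system)"` would print `command not found` for the container in the example above.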
- Using the `--no-trunc` option reveals that the intended command was `catkin_make`.
- This demonstrates a second common error: attempting to build software in the container at runtime.
- Since our Docker image represents the WAMV platform, this is analogous to compiling software on the WAMV after it has been placed in the water.
- Although there may be some exceptional cases where building on the fly might make sense, generally the best practice is to build any required system software into the image during the image build process.
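After rebuilding an image, it can be worth checking its startup configuration before running another trial. The following sketch prints the entrypoint and default command stored in a locally available image and tests whether the entrypoint matches an expected script. The helper names `image_startup_config` and `uses_entrypoint` are hypothetical, and the exact-match comparison assumes the entrypoint is a single script with no extra arguments.

```shell
# Sketch: inspect the entrypoint and default command configured in an image.
# image_startup_config and uses_entrypoint are hypothetical helper names.
image_startup_config() {
  # Prints the two settings as JSON, e.g. an array of strings or null.
  docker image inspect \
    --format '{{json .Config.Entrypoint}} {{json .Config.Cmd}}' "$1"
}

uses_entrypoint() {
  # Succeeds only if the entrypoint is exactly the expected script ($2).
  local actual
  actual="$(docker image inspect --format '{{json .Config.Entrypoint}}' "$1")"
  [ "$actual" = "[\"$2\"]" ]
}
```

For instance, `uses_entrypoint virtualrobotx/vrx_2022_simple:test /ros_entrypoint.sh` succeeds only when the image starts through `/ros_entrypoint.sh`, and `image_startup_config` shows whether a build tool such as `catkin_make` is still being invoked at startup.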