Visual Referee Challenge
For the Visual Referee Challenge at RoboCup 2022, we developed a simple yet effective deep learning model for gesture recognition on short sequences of frames. Augmenting the training data made the system more robust.
Members of the team and the university mimicked the different poses to create a preliminary dataset. The poses were recorded in front of a green screen so that they could be transferred to different backgrounds. This increased the dataset size and made the model robust to new backgrounds.
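The background transfer can be illustrated with a simple chroma-key rule. The following is a minimal sketch, not the pipeline actually used for the dataset: the pixel format (nested lists of RGB tuples), the green-dominance test, its thresholds, and the function names `is_green_screen` and `replace_background` are all assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel in 0..255
Image = List[List[Pixel]]      # rows of pixels


def is_green_screen(pixel: Pixel, dominance: float = 1.3, min_green: int = 90) -> bool:
    """Assumed rule: a pixel is background if green is bright and clearly
    stronger than both red and blue."""
    r, g, b = pixel
    return g >= min_green and g > dominance * max(r, b)


def replace_background(frame: Image, background: Image) -> Image:
    """Return a copy of `frame` where green-screen pixels are taken from `background`."""
    same_shape = len(frame) == len(background) and all(
        len(f_row) == len(b_row) for f_row, b_row in zip(frame, background)
    )
    if not same_shape:
        raise ValueError("frame and background must have the same size")
    return [
        [bg if is_green_screen(fg) else fg for fg, bg in zip(f_row, b_row)]
        for f_row, b_row in zip(frame, background)
    ]


def augment_clip(clip: List[Image], background: Image) -> List[Image]:
    """Apply the same new background to every frame of a recorded clip."""
    return [replace_background(frame, background) for frame in clip]
```

Applying `augment_clip` to one recorded clip with several background images yields several training clips, which is how a step of this kind can enlarge a dataset.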
The model combines 3D convolutions with an LSTM. It takes in 15 frames and outputs a probability distribution over the gesture labels. An additional "no pose" dustbin label is included to reduce false positives.
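To show how the dustbin label fits into the output, here is a minimal sketch of turning raw network scores into a probability distribution and reading off the top label. The label names in `LABELS` are hypothetical placeholders (the actual gesture names are not listed here), and `softmax` and `top_label` are illustrative helpers rather than functions from this repository.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical label set: three placeholder gestures plus the dustbin label.
LABELS = ["gesture_a", "gesture_b", "gesture_c", "no pose"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits: Sequence[float], labels: Sequence[str] = LABELS) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

Because "no pose" competes with the gesture labels inside the same distribution, a clip that matches none of the gestures can put its probability mass there instead of being forced onto a real gesture.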
If the confidence score does not exceed a certain threshold, the system takes another set of 15 frames and repeats this until the threshold is met.
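The retry behaviour can be sketched as a loop over 15-frame windows. This is an illustration under assumptions: the threshold value of 0.8, the use of non-overlapping windows, and the names `windows`, `recognise`, and `classify` are not taken from the repository. `classify` stands in for the trained model and is expected to return a `(label, confidence)` pair.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional, Tuple

WINDOW = 15  # frames per prediction, as described above


def windows(frames: Iterable[Any], size: int = WINDOW) -> Iterator[List[Any]]:
    """Yield consecutive, non-overlapping groups of `size` frames (assumed layout)."""
    buffer: List[Any] = []
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == size:
            yield buffer
            buffer = []


def recognise(
    frames: Iterable[Any],
    classify: Callable[[List[Any]], Tuple[str, float]],
    threshold: float = 0.8,  # assumed value; the text only says "a certain threshold"
) -> Optional[Tuple[str, float]]:
    """Classify window after window until the confidence exceeds the threshold.

    Returns None if the frame stream ends before any window is confident enough.
    """
    for clip in windows(frames):
        label, confidence = classify(clip)
        if confidence > threshold:
            return label, confidence
    return None
```

With a live camera, `frames` would be an endless stream, so the loop keeps consuming new sets of 15 frames until one prediction is confident enough.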