GitHub - gsig/charades-algorithms: Activity Recognition Algorithms for the Charades Dataset

Charades Starter Code for Activity Recognition in Torch and PyTorch

Contributor: Gunnar Atli Sigurdsson

New: extension of this framework to the deep CRF model on Charades for Asynchronous Temporal Fields for Action Recognition: https://github.com/gsig/temporal-fields

New: This code implements a Two-Stream network in PyTorch
This code implements a Two-Stream network in Torch
This code implements a Two-Stream+LSTM network in Torch

See pytorch/ and torch/ for the code.

The code replicates the 'Two-Stream Extended' and 'Two-Stream+LSTM' baselines found in:

@inproceedings{sigurdsson2017asynchronous,
author = {Gunnar A. Sigurdsson and Santosh Divvala and Ali Farhadi and Abhinav Gupta},
title = {Asynchronous Temporal Fields for Action Recognition},
booktitle={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2017},
pdf = {http://arxiv.org/pdf/1612.06371.pdf},
code = {https://github.com/gsig/temporal-fields},
}

which is in turn based on "Two-stream convolutional networks for action recognition in videos" by Simonyan and Zisserman, and "Beyond Short Snippets: Deep Networks for Video Classification" by Joe Yue-Hei Ng et al.

Combining the predictions (submission files) of those models using combine_rgb_flow.py yields a final classification accuracy of 18.9% mAP (Two-Stream) and 19.8% mAP (LSTM) on Charades (evaluated with charades_v1_classify.m).
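To illustrate what combining the two streams can look like, here is a minimal late-fusion sketch. It is not the contents of combine_rgb_flow.py: the function names `parse_scores` and `fuse`, the equal weighting, and the assumed submission line format (`<video_id>` followed by whitespace-separated per-class scores) are all assumptions made for illustration.

```python
# Hypothetical late-fusion sketch; NOT the repository's combine_rgb_flow.py.
# Assumed submission line format: "<video_id> <score_0> <score_1> ...".


def parse_scores(lines):
    """Parse submission-style lines into {video_id: [score, ...]}."""
    scores = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        scores[parts[0]] = [float(x) for x in parts[1:]]
    return scores


def fuse(rgb, flow, rgb_weight=0.5):
    """Weighted average of per-class scores for videos present in both streams."""
    fused = {}
    for vid, rgb_scores in rgb.items():
        if vid not in flow:
            continue  # a video missing from one stream cannot be fused
        flow_scores = flow[vid]
        if len(rgb_scores) != len(flow_scores):
            raise ValueError(f"class count mismatch for video {vid}")
        fused[vid] = [
            rgb_weight * r + (1.0 - rgb_weight) * f
            for r, f in zip(rgb_scores, flow_scores)
        ]
    return fused


# Toy example with three classes instead of the full Charades label set.
rgb = parse_scores(["VID01 0.25 0.75 0.0", "VID02 1.0 0.0 0.5"])
flow = parse_scores(["VID01 0.75 0.25 1.0", "VID03 0.0 1.0 0.5"])
fused = fuse(rgb, flow)
```

In this toy run only `VID01` appears in both streams, so it is the only video in `fused`. The weight `rgb_weight` is a free parameter here; the README does not state how the two streams are weighted.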

Technical Overview:

The code trains a two-stream network as two independent networks: one RGB network and one Flow network. It parses the training data into pairs of an image (or flow frame) and a label for a single activity class, which gives a softmax training setup like that of a standard CNN. Each network is a VGG-16. The RGB network is pretrained on ImageNet and the Flow network on UCF101; the pretrained networks can be downloaded with the scripts in this directory. At test time, the network uses a batch size of 25, scores all images, and either pools the outputs to make a classification prediction or uses all 25 outputs for localization.
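The data flow described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the helper names, the evenly spaced choice of the 25 test frames, and the use of mean pooling are assumptions (the README only says the outputs are pooled).

```python
# Hypothetical helpers illustrating the training pairs and test-time pooling.
# Names, even frame spacing, and mean pooling are assumptions, not the repo's API.


def make_training_pairs(frame_labels):
    """Expand {frame_id: [class, ...]} into one (frame_id, class) pair per active class."""
    return [
        (frame, cls)
        for frame, classes in sorted(frame_labels.items())
        for cls in classes
    ]


def sample_frame_indices(num_frames, num_samples=25):
    """Pick num_samples evenly spaced frame indices (repeats if the video is short)."""
    if num_frames <= 0:
        raise ValueError("video has no frames")
    step = num_frames / num_samples
    return [min(num_frames - 1, int(step * i + step / 2)) for i in range(num_samples)]


def pool_scores(frame_scores):
    """Average per-frame class scores into a single video-level score vector."""
    num_frames = len(frame_scores)
    num_classes = len(frame_scores[0])
    return [
        sum(frame[c] for frame in frame_scores) / num_frames
        for c in range(num_classes)
    ]


# A frame with two active classes yields two single-label training pairs.
pairs = make_training_pairs({"frame_b": [5], "frame_a": [3, 7]})

# Test time: 25 frames are scored; pooling gives the classification output,
# while the unpooled per-frame scores can be kept for localization.
indices = sample_frame_indices(100)
video_scores = pool_scores([[0.0, 1.0], [1.0, 0.0]])
```

Treating each (frame, class) pair as its own example is what lets a single-label softmax loss be used even though a frame may carry several activity labels.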

