extract TV-L1 from HMDB-51 #5

Open
mlakhal wants to merge 1 commit into base: master

@mlakhal commented Sep 9, 2017

Adding the scripts to download and extract the TV-L1 optical flow from HMDB-51.

for _ in range(vid_len - 2):
    ret, frame2 = cap.read()
    curr = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    curr = cv2.resize(curr, (_IMAGE_SIZE, _IMAGE_SIZE))

In the original author's implementation, each image is first resized while preserving aspect ratio (so that the smallest dimension is 256 pixels) and then cropped to 224x224.
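
To make the suggestion concrete, here is a minimal sketch of "resize so the short side is 256, then take a 224x224 center crop". The helper names `resized_shape` and `center_crop` are hypothetical and are not part of this PR or of the original implementation; the actual resize step would presumably be a `cv2.resize` call with the computed size, which is left out so the sketch only needs NumPy.

```python
import numpy as np


def resized_shape(height, width, short_side=256):
    """Return (new_height, new_width) with the shorter side scaled to short_side.

    Hypothetical helper: the aspect ratio is preserved up to rounding.
    """
    scale = short_side / min(height, width)
    new_height = max(short_side, round(height * scale))
    new_width = max(short_side, round(width * scale))
    return new_height, new_width


def center_crop(frame, size=224):
    """Return the central size x size window of an H x W (x C) array."""
    height, width = frame.shape[:2]
    if height < size or width < size:
        raise ValueError("frame is smaller than the requested crop")
    top = (height - size) // 2
    left = (width - size) // 2
    return frame[top:top + size, left:left + size]
```

With OpenCV, the resize itself would take the width first, e.g. `cv2.resize(curr, (new_width, new_height))`, followed by `center_crop` on the result.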

Are you sure about this? It says in the README of this repository that:

For RGB, the videos are resized preserving aspect ratio so that the smallest dimension is 256 pixels, with bilinear interpolation. Pixel values are then rescaled between -1 and 1. During training, we randomly select a 224x224 image crop, while during test, we select the center 224x224 image crop from the video. The provided .npy file thus has shape (1, num_frames, 224, 224, 3) for RGB, corresponding to a batch size of 1.

For the Flow stream, after sampling the videos at 25 frames per second, we convert the videos to grayscale. We apply a TV-L1 optical flow algorithm, similar to this code from OpenCV. Pixel values are truncated to the range [-20, 20], then rescaled between -1 and 1. We only use the first two output dimensions, and apply the same cropping as for RGB. The provided .npy file thus has shape (1, num_frames, 224, 224, 2) for Flow, corresponding to a batch size of 1.

I take that to mean that the flow images are not resized at all until they are used as input to the model, which suggests that their resizing happens in-graph.
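
For reference, the value handling described in the README passage quoted above (truncate to [-20, 20], then rescale to [-1, 1]) can be sketched as follows. The function name `rescale_flow` and the `bound` parameter are made up for illustration; this is only one reading of the README text, not the authors' code.

```python
import numpy as np


def rescale_flow(flow, bound=20.0):
    """Clip flow values to [-bound, bound], then map them linearly to [-1, 1].

    Hypothetical helper illustrating the README description.
    """
    flow = np.asarray(flow, dtype=np.float32)
    return np.clip(flow, -bound, bound) / bound
```

Because the divisor is a fixed constant rather than a per-frame statistic, the same displacement always maps to the same output value across frames.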

pfabreu commented Jan 10, 2018

I often get "RuntimeWarning: invalid value encountered in true_divide" for the statement "curr_flow = curr_flow / max_val(curr_flow)" on line 82.
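
One plausible cause, assuming `max_val` returns the largest absolute flow value (its definition is not shown in this thread): when two consecutive frames are identical, the flow is all zeros, the maximum is 0, and NumPy evaluates 0/0 as NaN with exactly this warning. A guarded normalization might look like the sketch below; `safe_normalize` and `eps` are hypothetical names, not part of the PR.

```python
import numpy as np


def safe_normalize(flow, eps=1e-8):
    """Divide flow by its largest absolute value, returning zeros for flat input.

    Assumption: the PR's max_val helper computes the max absolute value.
    """
    flow = np.asarray(flow, dtype=np.float32)
    if flow.size == 0:
        return flow
    peak = np.max(np.abs(flow))
    if peak < eps:
        # No motion between the two frames: skip the division that yields NaN.
        return np.zeros_like(flow)
    return flow / peak
```

An all-zero flow then stays all-zero instead of turning into NaN values that would propagate into the saved arrays.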
