Results on Breakfast Dataset #5
Hi, Regarding questions 1-2: we trained the CNN model for 20 epochs, but the batch size was 64. The default values should work well for the RNN model.
Regarding the results with nn-viterbi, I think you should look at the accuracy of the observed part (only the decoded output). If the accuracy of the observations is higher for nn-viterbi, then I would expect better results for the anticipation. Otherwise, the accuracy should be lower.
Thanks! batch_size=64 works well!
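For readers following along, here is a minimal, hypothetical sketch of the hyperparameters discussed above (20 epochs and batch size 64 for the CNN; repository defaults for the RNN). This is not the repository's actual training script; the model and data below are stand-ins.

```python
# Hedged sketch only: illustrates epochs=20, batch_size=64 with a dummy model/data.
import numpy as np
import tensorflow as tf

x_train = np.random.rand(512, 48).astype("float32")  # dummy features
y_train = np.random.randint(0, 48, size=(512,))       # dummy action labels (Breakfast has 48 classes)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(48, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# The settings under discussion: 20 epochs with a batch size of 64.
model.fit(x_train, y_train, epochs=20, batch_size=64, verbose=0)
```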
Hi Yazan, Sorry for reopening this issue, but I'm having a similar problem reproducing the results for the RNN model. I trained the RNN models using the default values in the source code, as you recommended above, but my performance for the 20% observation is roughly 5% below the reported results, and for the 30% observation it is around 2% below. Is there anything I might be doing wrong or missing while training the RNN model? Kind regards,
Hi Romero, Kindly note that the results in the paper are the average over multiple splits. If you don't have the data for all the splits, you can find it in the following link: I hope this helps. Best,
Hi Yazan, I'm aware that the results are the average over the four splits of the Breakfast dataset, so I trained four models (train on data from splits 02, 03, and 04 and evaluate on split 01, ...) and averaged their results. However, my average results are as I mentioned above, and I've trained all the RNN models with the default parameters. The experiments I'm doing use the ground-truth data as input to the model, both during training and during evaluation. Kind regards,
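As a side note for reproducibility, the averaging protocol described above can be summarized with the small sketch below. The per-split numbers are placeholders (not reported results), and the split naming is only an assumption about the Breakfast setup.

```python
# Hedged sketch: average the accuracy of four models, each trained on three
# splits and evaluated on the held-out one. Values below are placeholders.
from statistics import mean

per_split_accuracy = {
    "split01": 0.0,  # model trained on splits 02, 03, 04
    "split02": 0.0,  # model trained on splits 01, 03, 04
    "split03": 0.0,  # model trained on splits 01, 02, 04
    "split04": 0.0,  # model trained on splits 01, 02, 03
}

paper_style_result = mean(per_split_accuracy.values())
print(f"Average over {len(per_split_accuracy)} splits: {paper_style_result:.4f}")
```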
The default parameters work well in my case. It seems that @giulio93 got comparable results (except for the 20% observation and 10% prediction). I'm not sure what the problem is in your case. Can you maybe check the convergence on the training set and see if you need to train more?
Hi, If you do not achieve similar results, please consider more than one prediction run on the test set, since in the RNN training procedure random cuts are taken between actions in order to create training examples. @RomeroBarata this is not clear to me: I'm collecting experiments in a forked repository of this project, feel free to explore:
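A rough sketch of the "more than one prediction run" suggestion above, in case it helps: run the evaluation several times and report the mean and spread. `run_prediction` is a hypothetical wrapper around whatever evaluation the repository performs, not an actual function in this project.

```python
# Hedged sketch: average multiple evaluation runs to smooth out run-to-run variance.
import random
from statistics import mean, stdev

def run_prediction(seed: int) -> float:
    """Placeholder: one prediction pass on the test set, returning accuracy."""
    random.seed(seed)
    return 0.0  # replace with the repository's actual evaluation call

scores = [run_prediction(seed) for seed in range(5)]
print(f"mean={mean(scores):.4f}  std={stdev(scores):.4f}  over {len(scores)} runs")
```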
Hi guys, No worries, I'll check everything again. Thank you for all the clarifications! @giulio93, when you evaluate any machine learning model you should never train and test on the same data. Anything the model sees during training it learns well (sometimes too well -> overfitting), and if you test on the same data you will get a very optimistic result that does not hold in practice (when you deploy the model on a truly unseen test set). Thus, the correct way of training and testing is to train models on three splits of the data and test on the remaining one. Kind regards,
@RomeroBarata yeah man, I'm supposed to know the difference between training and evaluation... You need to read the paper carefully. Inside the dataset there are four splits, and each split contains a TRAIN SET and a TEST SET. In Breakfast the splits are made as follows: In 50 Salads the splits are made as follows: Hope this helps!
Sorry @giulio93, I didn't mean to lecture about k-fold cross-validation, I was just trying to clarify the previous misunderstanding. As you mentioned, even though the splits provided by the author are named split0X.train and split0X.test, they point to the correct files. Anyway, I'll check everything again and rerun the experiments. Thank you guys for the help!
Hi, thank you so much for sharing this useful project!

I have run the code as provided, and I got these results:
I have a couple of questions:
I also tried to use the weakly supervised decoded output coming from the new Viterbi decoder paper. So I filled the two folders obs0.2 and obs0.3 with the new decoded data for the respective observation percentages.
I got better results on prediction with the decoded output data, even though I did not train the decoder in fully supervised mode but in weakly supervised mode instead. Does that make sense to you?
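For anyone trying to repeat this experiment, a rough sketch of the folder-filling step is below. The directory layout, file extensions, and the decoder output location are all assumptions for illustration, not the repository's actual structure.

```python
# Hedged sketch: copy the decoder's output files into the obs0.2 / obs0.3 folders
# so they are used in place of the ground-truth observations.
import shutil
from pathlib import Path

decoded_dir = Path("viterbi_decoder_output")  # hypothetical decoder output location
for obs in ("obs0.2", "obs0.3"):
    target = Path("data") / obs                # hypothetical dataset layout
    target.mkdir(parents=True, exist_ok=True)
    for f in (decoded_dir / obs).glob("*.txt"):  # assumes per-video label files
        shutil.copy(f, target / f.name)
```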