Hi, I have been trying to understand the provided code and have certain concerns about the decoder.
main.py contains the following lines:
out = prepared_batch[2][:, :]
tar = prepared_batch[2][:, 1:]
I presume these are the output and the target, and that they carry the same information (i.e. the edit actions). out is then fed to edit_net, which does not seem to make sense.
Thereafter, in the decoder, the code takes the same out and manipulates it to create output_t, which is then returned as the result of the EditNet. I am unable to find the part of your code where the actions are predicted.
If the model is already being fed the edit actions, what exactly is it predicting?
Hi, this is the standard way to train a seq2seq model with teacher forcing: you shift the input and the output by one position and maximize the log-likelihood. (Please note that out starts at index 0 and tar starts at index 1; there is a shift here, so the input and the output are not the same!)
out = prepared_batch[2][:, :]
tar = prepared_batch[2][:, 1:]
Here we give the expert edit sequence for training (but at time step t, the model can only see the gold edit label of time t-1). If you look at the actual model (editNTS.py), you will see that the prediction is shifted by one during training.
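To make the shift concrete, here is a minimal sketch using plain Python lists instead of the batched tensors in main.py. The label ids and the meaning attached to them (a start and a stop symbol) are made up for illustration and are not taken from the repository.

```python
# Hypothetical edit-label ids for one training example (illustrative only).
edit_ids = [1, 7, 4, 9, 2]  # assumed: 1 = start symbol, 2 = stop symbol

out = edit_ids[:]   # full gold sequence, mirrors prepared_batch[2][:, :]
tar = edit_ids[1:]  # same sequence shifted left by one, mirrors prepared_batch[2][:, 1:]

# Pair each label the decoder is allowed to see with the label it must predict.
# The visible label is always one position behind the target.
pairs = list(zip(out[:-1], tar))

for step, (seen, target) in enumerate(pairs):
    print(f"step {step}: decoder sees {seen}, is trained to predict {target}")
```

The decoder input and the training target come from the same gold sequence, but at no step does the model see the label it is being asked to predict.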
During inference, however, the model does not have this information and has to predict one edit at a time.
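A rough sketch of what one-edit-at-a-time inference looks like is given below. It is not the decoding code from editNTS.py: greedy_decode, toy_step, and the start/stop ids are hypothetical names and values, and toy_step merely stands in for a trained decoder step.

```python
from typing import Callable, List


def greedy_decode(
    step_fn: Callable[[List[int]], int],
    start_id: int,
    stop_id: int,
    max_len: int = 20,
) -> List[int]:
    """Generate edit labels one at a time, feeding each prediction back in."""
    edits = [start_id]
    for _ in range(max_len):
        next_id = step_fn(edits)  # the model only sees its own past predictions
        edits.append(next_id)
        if next_id == stop_id:
            break
    return edits[1:]  # drop the start symbol


def toy_step(prefix: List[int]) -> int:
    # Hypothetical stand-in for a trained decoder: predicts "previous id + 1".
    return prefix[-1] + 1


predicted = greedy_decode(toy_step, start_id=0, stop_id=3)
print(predicted)
```

Unlike in training, no gold labels appear anywhere in this loop; every decoder input after the start symbol is a previous prediction.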
Many thanks for your question; I hope this clarifies your concern.