
The input to the Model and the target is the same. #9

Open · adityakrgupta25 opened this issue Jun 16, 2020 · 2 comments
adityakrgupta25 commented Jun 16, 2020

Hi, I have been trying to understand the provided code and have some concerns about the decoder.

main.py contains the following lines:

out = prepared_batch[2][:, :]
tar = prepared_batch[2][:, 1:]

I presume these are the decoder input and the target (and that they carry the same information content, i.e., the edit actions). out is then fed to edit_net, which does not seem to make sense:

output = edit_net(org, out, org_ids, org_pos, simp_ids)

Thereafter, in the decoder, the code manipulates this same out to create output_t, which is then returned as the result of EditNTS. I am unable to find the part of the code where the prediction of edit actions actually happens.

If your model is already being fed the edit actions, what exactly is it predicting?

adityakrgupta25 (Author) commented:

Hi @yuedongP, can you provide updates and elaborate on the above?

YueDongCS (Owner) commented Jun 24, 2020

Hi, this is the standard way to train a seq2seq model with teacher forcing: you shift the input and the target by one and maximize the log-likelihood. (Please note that out starts at index 0 and tar starts at index 1; there is a one-step shift, so the input and the target are not the same!)
out = prepared_batch[2][:, :]
tar = prepared_batch[2][:, 1:]
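
For concreteness, here is a minimal, self-contained sketch of this conventional one-step shift. The sizes and module names (embed, decoder, proj) are toy placeholders of my own, not the actual EditNTS modules; in this repo the equivalent shift happens inside editNTS.py, as noted below.

import torch
import torch.nn as nn
import torch.nn.functional as F

n_actions, hidden = 5, 32                           # toy sizes, not EditNTS's
embed = nn.Embedding(n_actions, hidden)             # hypothetical modules
decoder = nn.GRU(hidden, hidden, batch_first=True)
proj = nn.Linear(hidden, n_actions)

# edit_seq: (batch, T) gold edit-action ids, beginning with a start token
edit_seq = torch.randint(0, n_actions, (2, 7))

inp = edit_seq[:, :-1]        # at step t the decoder sees gold action t-1
tar = edit_seq[:, 1:]         # ... and must predict gold action t
out, _ = decoder(embed(inp))  # (batch, T-1, hidden)
logits = proj(out)            # (batch, T-1, n_actions)
loss = F.cross_entropy(logits.reshape(-1, n_actions), tar.reshape(-1))
loss.backward()

So even though the gold actions are fed in, the model still predicts every action; it just conditions on the correct history instead of its own (possibly wrong) earlier predictions.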

Here we give the model the expert edit sequence during training (but at time step t it can only see the gold edit label from time t-1), and if you look at the actual model (editNTS.py) you will see that the prediction is shifted by one during training.

During inference, however, the model does not have this information and has to predict one edit at a time, feeding each prediction back in as the input for the next step.
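
Continuing the toy example above (again only a sketch, not the repo's actual decoding loop, and with hypothetical special-token ids), greedy step-by-step inference would look like this:

sos_id, eos_id, max_len = 0, 1, 20  # hypothetical special-token ids
seq, state = [sos_id], None
with torch.no_grad():
    for _ in range(max_len):
        inp = torch.tensor([[seq[-1]]])          # last *predicted* action
        out, state = decoder(embed(inp), state)  # advance one step
        next_id = proj(out)[:, -1].argmax(-1).item()
        seq.append(next_id)
        if next_id == eos_id:
            break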

Many thanks for your question, and I hope this clarifies your concern.
