About Evaluate LRP for your dataset - Clarification on data selection and preprocessing #10

Open
zhongmz opened this issue Apr 12, 2023 · 0 comments

Comments

zhongmz commented Apr 12, 2023

Hello,

First and foremost, I would like to express my gratitude to you for this outstanding work.

I am interested in evaluating LRP for my dataset, and I have a couple of questions regarding the data selection and preprocessing steps. I would appreciate it if you could provide some clarification on the following points:

  1. When selecting the data for evaluation, is it necessary for all source sentences to have the same length and all target sentences to have the same length? Or can the sentences have varying lengths?

  2. Should the selected data be preprocessed with BPE tokenization, or is it supposed to be the raw test set without any tokenization applied?

Thank you in advance for your assistance! Looking forward to your response.

Best regards,
