Bug Lab dataset #1
Comments
Hi, we did not use the BugLab data. Instead, we ran BugLab with our dataset converted to the BugLab format. Best,
Why not use our dataset?
Hi Jingxuan, thank you for your prompt reply. I am currently planning to use your dataset, but one point confuses me. The dataset folder contains three splits: "synthetic", "contrastive" and "real". I believe the contrastive data was used in the first iteration with the contrastive loss, and the real data was used in the second iteration. What is the synthetic split?
Your understanding of "contrastive" and "real" is correct. "synthetic" is for a version of the first iteration without the contrastive loss.
I received a notification about a comment asking where the tokenizer is, but the comment no longer appears. The tokenizer vocabulary is included in the pretrained model (README updated for this).
I saw that there were experiments where you compared against BugLab. I am having difficulty generating data using the Docker method proposed by BugLab. It would be really helpful if you could share the BugLab data that you used for your experiments.