Bug Lab dataset #1

Open
sumit-agrwl opened this issue Nov 28, 2022 · 5 comments

Comments

@sumit-agrwl

I saw that some of your experiments compare against BugLab. I am having difficulty generating data with the Docker method that BugLab proposes. It would be really helpful if you could share the BugLab data you used for your experiments.

@LostBenjamin
Collaborator

Hi,

We did not use the BugLab data. Instead, we ran BugLab with our dataset converted to the BugLab format.

Best,
Jingxuan

@LostBenjamin
Collaborator

Why not use our dataset?

@sumit-agrwl
Author

Hi Jingxuan,

Thank you for your prompt reply. I am now planning to use your dataset, but I have one point of confusion. The dataset folder contains three splits: "synthetic", "contrastive", and "real". I believe the contrastive data was used in the first iteration with the contrastive loss, and the real data was used in the second iteration. What is the synthetic split?

@LostBenjamin
Collaborator

Your understanding about "contrastive" and "real" is correct. "synthetic" is for a version of the first iteration without the contrastive loss.
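
For readers skimming the thread, the mapping described above can be written down as a small lookup table. This is only a sketch of what the comments say: `SPLIT_USAGE` and `splits_for` are hypothetical names, not part of the repository, and the folder names on disk should be checked against the actual dataset.

```python
# Hypothetical summary of this thread; not code from the repository.
# Each split name maps to the training iteration it is used in and
# whether that iteration is trained with the contrastive loss.
SPLIT_USAGE = {
    "synthetic": {"iteration": 1, "contrastive_loss": False},
    "contrastive": {"iteration": 1, "contrastive_loss": True},
    # The thread does not say which loss the second iteration uses.
    "real": {"iteration": 2, "contrastive_loss": None},
}


def splits_for(iteration):
    """Return the split names used in the given training iteration, sorted."""
    return sorted(
        name for name, info in SPLIT_USAGE.items()
        if info["iteration"] == iteration
    )
```

Calling `splits_for(1)` returns both first-iteration variants, which makes it explicit that "synthetic" and "contrastive" are alternatives for the same stage rather than consecutive steps.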

@LostBenjamin
Collaborator

I received a notification about a comment asking where the tokenizer is, but for some reason the comment no longer appears.

The tokenizer vocabulary is included in the pretrained model (README updated for this).
