After re-reading the GOT paper, I’d like more insight into how noise and document quality were handled during training. For example, was the proportion of distorted or noisy PDF documents from Common Crawl tracked or controlled?
In my experiments, adding noise to 40% of the documents during fine-tuning still results in hallucinations at inference time. Should I increase that proportion, or would training from scratch be a better approach?
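For concreteness, here is a minimal sketch of what "adding noise to 40% of the documents" could look like. It assumes per-pixel Gaussian noise on grayscale page images stored as nested lists; the function names, the `sigma` value, and the noise type are illustrative assumptions, not the setup used in the experiments above or in the GOT paper.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Return a noisy copy of a grayscale image (rows of 0-255 ints)."""
    return [
        [min(255, max(0, round(px + rng.gauss(0.0, sigma)))) for px in row]
        for row in image
    ]


def augment_dataset(images, noise_fraction=0.4, sigma=25.0, seed=0):
    """Add noise to round(noise_fraction * len(images)) randomly chosen images.

    Returns the augmented list and the set of indices that were noised.
    All parameter defaults are hypothetical placeholders.
    """
    rng = random.Random(seed)
    n_noisy = round(noise_fraction * len(images))
    noisy_ids = set(rng.sample(range(len(images)), n_noisy))
    augmented = [
        add_gaussian_noise(img, sigma, rng) if i in noisy_ids else img
        for i, img in enumerate(images)
    ]
    return augmented, noisy_ids


# Usage: ten blank 4x4 "pages"; four of them receive noise.
pages = [[[128] * 4 for _ in range(4)] for _ in range(10)]
augmented, noisy_ids = augment_dataset(pages, noise_fraction=0.4)
```

Fixing the noised subset by index, rather than flipping a coin per sample, keeps the fraction exact and makes it easy to compare hallucination rates on noised versus clean documents.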