You can confirm that 8.4k tokens were scraped from GetTogether.community by CommonCrawl and are included in Google's C4 dataset. It's likely that other LLM developers have scraped, and will continue to scrape, user-generated content from GetTogether.community to train their proprietary large language models.
This can be discouraged for CommonCrawl and ChatGPT with the proper robots.txt inclusion:
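The original snippet is not preserved here, so the following is only a sketch of what such rules might look like and how they could be checked. It assumes the crawler user-agent tokens `CCBot` (Common Crawl) and `GPTBot` (OpenAI), and uses an illustrative URL path; the `is_allowed` helper is hypothetical. robots.txt is advisory, so this discourages compliant crawlers rather than blocking anyone.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt rules: the user-agent tokens CCBot and GPTBot are
# assumptions about which crawlers the issue means, not the original snippet.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URL; the path is hypothetical.
    url = "https://gettogether.community/events/"
    for agent in ("CCBot", "GPTBot", "Mozilla/5.0"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

Checking the rules with the standard library's parser before deploying them helps catch typos in the user-agent names, since a misspelled token silently leaves the crawler allowed.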