TRL SFT data knowledge cutoff #2844
Do you have any dataset example that would contain such data? Your concern is that it could corrupt the training in some sense, right?
Exactly! It can cause issues in the training phase. Technically, any RLHF scheme with an SFT step is affected if the SFT data you want to fine-tune on contains knowledge-cutoff phrases. So far I ran a very simple check with the patterns

```python
r"as of my last update",
r"as of my last knowledge update",
r"as of \d{4}",  # Matches "As of 2024", "As of 2023", etc.
r"i do not have access to real-time information",
```

against the trl-lib/tldr dataset and could not find anything. I did find ~1000 examples containing terms like "As of the year" or "as it dates back", but on closer inspection these referred to some date or event in the context (this dataset comes from Reddit, mostly relationship subreddits), so none of them matched my concern of a model generating a knowledge-cutoff completion. Note that this specific dataset is by nature not a great match for the purpose I mentioned earlier, but I believe this can happen in other SFT data. Speaking of this, I also saw a similar concern raised in allenai/open-instruct. So I thought it might be nice if we add such support. WDYT?
Imagine I'm a user who only wants to use a publicly available base model, but for the SFT training I want to use my own local examples with TRL's SFTTrainer, or the datasets that TRL provides under trl-lib. Is my question valid?
This information is typically included in the system prompt. So, even if the model has been trained on this "corrupting" data, it shouldn't pose an issue during generation—but that's just my intuition. In any case, this sounds more like a data preparation concern. Unless it's a severe and recurring issue (i.e., well-documented and frequently reported), I'd consider it slightly beyond the scope of TRL. That said, now that this issue has been raised, if you have code that can detect or filter such data in a dataset, this would be the right place to share it.
This was the quick code I used for the trl-lib/tldr dataset:

```python
import re

from datasets import load_dataset

dataset = load_dataset("trl-lib/tldr", split="train")

# Knowledge cutoff-related phrases
cutoff_patterns = [
    r"as of my last update",
    r"as of my last knowledge update",
    r"as of \d{4}",  # Matches "As of 2024", etc.
    r"i do not have access to real-time information",
    r"i was last updated in \d{4}",
]

def check_knowledge_cutoff(text):
    text = text.lower()  # Normalize to lowercase
    return any(re.search(pattern, text) for pattern in cutoff_patterns)

cutoff_mentions = [
    (row["prompt"], row["completion"])
    for row in dataset
    if check_knowledge_cutoff(row["prompt"]) or check_knowledge_cutoff(row["completion"])
]

# Optionally, print a few examples
print("Sample Matches:")
for i, (prompt, completion) in enumerate(cutoff_mentions):
    print(f"{i+1}. Prompt: {prompt}\n   Completion: {completion}\n")
```

But I also found this PR on allenai/open-instruct relevant to the topic!
@qgallouedec
Just curious: in general, for any RLHF or RL-for-CoT scheme built on TRL that has an SFT step, is there a script that can find outdated examples (knowledge-cutoff artifacts) in the dataset the SFT model is being fine-tuned on?
To clarify: we could have a script that uses a simple regex to recognize whether the SFT data contains so-called knowledge-cutoff patterns, for instance "as of my last update" or "as my last update is in December 2024".
I could not find any script taking care of this; if that's the case, I can help?
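For reference, the exact patterns listed earlier in the thread would miss date-bearing variants like "as my last update is in December 2024" (no "of", trailing date). A single broader pattern can cover those too — this pattern is my own sketch, not an existing TRL utility:

```python
import re

# Hypothetical broader cutoff pattern: optional "of", optional "knowledge",
# optional trailing "is in <Month> <Year>"
CUTOFF = re.compile(
    r"as (of )?my last (knowledge )?update( is)?( in \w+ \d{4})?",
    re.IGNORECASE,
)

for text in [
    "As of my last update, the library supports X.",  # matches
    "as my last update is in December 2024",          # matches
    "We updated the docs last week.",                 # no match
]:
    print(bool(CUTOFF.search(text)))
```

A real utility would want a reviewed pattern set (false positives like ordinary "as of 2019" references to context dates are the main risk, as noted above for tldr), but the regex machinery itself is this simple.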