Add Watermarking LogitsProcessor and WatermarkDetector #29676
Conversation
@@ -1474,6 +1495,8 @@ def generate(
    encoder_input_ids=inputs_tensor,
    prefix_allowed_tokens_fn=prefix_allowed_tokens_fn,
    logits_processor=logits_processor,
    tokenizer=tokenizer,
    device=inputs_tensor.device,
I hope passing a device into the processor directly works for multi-GPU generate.
Actually, this is quite handy: being able to init tensors on their devices while initializing the processor, especially as we make processors compile-compatible, where we have already moved to initializing some arguments in tensor format.
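A tiny, purely hypothetical sketch of why passing `device` at init time is handy: helper tensors can be created on the right device once, instead of being moved inside every call. The class and names below are illustrative only, not part of this PR.

```python
import torch

class ToyBiasProcessor:
    """Hypothetical processor, used only to illustrate device handling at init time."""

    def __init__(self, vocab_size: int, bias: float, device: str = "cpu"):
        # the helper tensor is created directly on the target device,
        # so __call__ never needs an extra .to(scores.device)
        self.bias = torch.full((vocab_size,), bias, device=device)

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        return scores + self.bias
```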
cc @JonasGeiping @jwkirchenbauer I would love to have your feedback on the default values we chose to use 🤗
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi, great work! Supporting a default context length of 4 would be something I would really advocate for, based on results such as Fig. 2 in https://arxiv.org/abs/2306.04634. A separate concern is execution speed. We hooked the watermark into
This PR won't be complete without the Detector -- without it, our users will struggle to use the watermarking techniques and we will struggle to test their correctness.
I'd suggest adding the Detector in a new file in the `generation` folder (`watermarking.py`). In their docstrings we could add usage examples :)
@JonasGeiping Thanks for the feedback! Yes, indeed a higher context width has better performance. I was reluctant to add more complexity to the code when opening the PR, but now that we are okay with adding the whole watermarking functionality, I will add the possibility for users to set their own context width. Yes, a different implementation of the RNG which works in batched form would be very nice to have. Right now I am not planning to work on it, and I prefer to leave it for future plans if we see active usage of the watermarking feature 😄
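As a rough sketch, a user-settable context width could look like the following; this assumes the `WatermarkingConfig` object and argument names discussed later in this thread, so treat the exact names and defaults as illustrative rather than final.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, WatermarkingConfig

model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
tok = AutoTokenizer.from_pretrained("openai-community/gpt2")

# context_width=4 seeds the green/red split from the 4 previous tokens
# instead of the single previous token used by default
watermarking_config = WatermarkingConfig(bias=2.5, context_width=4, seeding_scheme="lefthash")

inputs = tok("Alice and Bob are", return_tensors="pt")
out = model.generate(**inputs, watermarking_config=watermarking_config, do_sample=True, max_new_tokens=20)
print(tok.batch_decode(out, skip_special_tokens=True))
```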
Complexity could be reduced a bit by removing self-hashing for now. This setting has several implementation complexities, and without an efficient RNG it is quite slow to use during text generation for any purpose other than testing watermark quality.
@gante, where can we add a doc for the detector? The tests are failing otherwise.
@zucchini-nlp here -- https://github.com/huggingface/transformers/blob/main/docs/source/en/internal/generation_utils.md (ping me again when it's ready for a review :) )
@gante sorry, forgot to tag. Yes, ready to review. Added a config for watermark args, changed cache size and rewrote some docs.
The missing bits are nits, so I'm approving the PR. Well done, this is a cool feature 🔥
Missing: somewhere in the docs (in addition to the API docs) showcasing this feature. Perhaps a section below this one?
@@ -17,7 +17,7 @@

import numpy as np

from ... import is_vision_available, requires_backends
I suspect these changes come from another PR :p
They do, I need to rebase `main` after it's merged.
"""

# Let's assume that if one batch starts with `bos`, all batches also do
Question: what happens with left-padding? Does left-padding have an impact on the detector?
Hmm, I did not think someone would feed left-padded text. What I usually do is feed only the generated text part, excluding the prompt.
The prompt itself has some effect on the z-score, but in my toy examples the final prediction did not change. I believe a long prompt with a small amount of generated text might be predicted as "human" though 🤔
Anyway, I will explain this better in the docs, adding more information next to the "Generation strategies" section.
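For illustration, a minimal sketch of "feeding only the generated text part" with left-padded inputs; it reuses the objects from the docstring example in this PR (`model`, `inputs`, `watermarking_config`, `detector`), so the names are assumptions rather than a fixed API.

```python
# with left padding, every row of `inputs["input_ids"]` has the same length
prompt_len = inputs["input_ids"].shape[-1]

out = model.generate(**inputs, watermarking_config=watermarking_config, do_sample=True, max_new_tokens=20)

# keep only the generated continuation; the (possibly padded) prompt is dropped
generated_only = out[:, prompt_len:]
detection_preds = detector(generated_only)
```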
@zucchini-nlp I'd also edit the PR header, it is outdated :) (for instance, it says users should use the detector from the original repo)
I'll review next week! 🤗
Hi @zucchini-nlp, running

gives

Could you look into this? Full error log
@ydshieh my bad, did not fix the tests after the latest changes
No worries. (But it is always nice to check the tests once we think a PR is ready at some point 😄)
We need to check the docstrings for
@ArthurZucker ping
Thanks for the ping, reviewing now!
Oops, on it today, sorry @zucchini-nlp
LGTM! Thanks for working on this interesting feature!
I think the doc can be improved a bit, but otherwise very clean
>>> model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
>>> tok = AutoTokenizer.from_pretrained("openai-community/gpt2")
>>> tok.pad_token_id = tok.eos_token_id
>>> tok.padding_side = "left"

>>> inputs = tok(["This is the beginning of a long story", "Alice and Bob are"], padding=True, return_tensors="pt")
>>> input_len = inputs["input_ids"].shape[-1]
From this snippet I have no idea what the green and red are, and no idea what the prediction says (True, True?).
Is the detector detecting watermarking? What is it detecting, etc.? I think this needs to be improved!
I will add a bit more info, but for full understanding it is better to read the paper. I will give a very brief overview of the general idea behind the technique :)
Thanks! Summing it up so that neither the user nor I have to dig into the paper is nice.
```
>>> # to detect watermarked text use the WatermarkDetector class
>>> from transformers import WatermarkDetector
>>> detector = WatermarkDetector(model_config=model.config, device="cpu", watermarking_config=watermarking_config)
>>> detection_preds = detector(out)
>>> detection_preds
array([ True])
"""
docstring seems a bit wrong
(the code block stops while it should not)
This example is a bit better! But it would be nice to explain in the md that a bias is added to the logits of both, which makes the next token generated be `still` instead of `in` (I am saying this as an example, but what actually happened?). Which words are green in this example?
Also could we manually set green words?
No, it is a randomized internal process of "green" token selection, which is set by indicating a hashing function and a hash key. These two can later be used to reverse the process and check how many green tokens the generated text contains, and whether it is statistically likely for human-written text to have this proportion of green tokens.
Not sure if we have to give the whole explanation of the algorithm or just refer to the paper though 🤔
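For intuition only, a rough sketch of the idea rather than the exact implementation in this PR: the previous token and a private hash key seed an RNG, and a fixed fraction of the vocabulary drawn from that RNG becomes the "green" list for the next position.

```python
import torch

def greenlist_ids(prev_token_id: int, hash_key: int, vocab_size: int, greenlist_ratio: float = 0.25):
    # seed the RNG deterministically from the hash key and the previous token,
    # so generation and detection can reproduce exactly the same green list
    rng = torch.Generator()
    rng.manual_seed(hash_key * prev_token_id)
    greenlist_size = int(vocab_size * greenlist_ratio)
    # take the first `greenlist_size` ids of a seeded random permutation of the vocabulary
    return torch.randperm(vocab_size, generator=rng)[:greenlist_size]

# during generation a positive bias is added to the logits of these ids;
# during detection the same function recomputes them and counts how many
# generated tokens landed in the green list
green = greenlist_ids(prev_token_id=42, hash_key=15485863, vocab_size=50257)
```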
Let's give a brief explanation; it's better to sum up and also link to the paper. Neither me (the reviewer) nor any curious user wants to go and read everything!
)
        return num_tokens_scored_batch, green_token_count_batch

    def _compute_z_score(self, green_token_count: np.array, total_num_tokens: np.array) -> np.array:
What is a z-score? Where does it come from?
>>> detection_out_watermarked.prediction
array([ True, True])

>>> detection_out.prediction
array([False, False])
It's always going to be watermarked in batches for generate, right?
It can be unbatched also; it depends on what is passed as the input, so that it just works as a simple add-on to generate. The `logits_process.py` docstring has a one-element watermarking example.
z_score (np.array of shape (batch_size)):
    Array containing the z-score for each batch.
Is there a more representative name for that?
It is a term from statistics, so I guess we cannot just call it whatever we want. I added a small explanation of what a z-score is.
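For reference, a short sketch of the z-score computation as described in the paper: a one-proportion z-test on the green-token count, where `gamma` is the expected fraction of green tokens (e.g. 0.25). This is a paraphrase of the statistic, not a copy of the PR's code.

```python
import numpy as np

def compute_z_score(green_token_count: np.ndarray, total_num_tokens: np.ndarray, gamma: float = 0.25) -> np.ndarray:
    # under the null hypothesis ("human text"), each scored token is green with probability gamma,
    # so the green count is Binomial(total, gamma); the z-score is how many standard deviations
    # the observed count lies above that expectation
    expected_count = gamma * total_num_tokens
    std = np.sqrt(total_num_tokens * gamma * (1 - gamma))
    return (green_token_count - expected_count) / std

# a z-score above a chosen threshold (e.g. 3.0) is flagged as "watermarked"
z = compute_z_score(np.array([70]), np.array([200]))
```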
The PR seems ready to me: all comments are addressed and the tests are passing. I see that I got two approvals, but I will leave it here until next week (May 13) in case anyone wants to add something 😄
What does this PR do?
Adds a watermarking technique proposed in this paper to the `transformers` logits processors. I added only the simple method (algorithm 2 from the paper) and the robust one (algorithm 3), both with a context length of 1 token only. I am not sure if we should support a higher context width, defined by the user in the generation config.

In contrast to the original repo, masking is now done in a batched manner. Yet, I could not make `_get_greenlist_ids` batched, so we are still left with a loop over the batch, one element at a time...

Note, this is only a processor for generation with watermarking. Anyone who uses it and wants to later detect the watermarked text has to use the `Detector` from the original repo, using their own private hashing keys.
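As context for readers, a minimal end-to-end sketch of how the feature is exposed after the review (the paragraph above is outdated, as noted earlier: a detector was added to `transformers` in this PR). Argument names follow the snippets quoted in this thread and may differ from the final API.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, WatermarkDetector, WatermarkingConfig

model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
tok = AutoTokenizer.from_pretrained("openai-community/gpt2")

# the bias pushes sampling towards the seeded "green" vocabulary split at every step
watermarking_config = WatermarkingConfig(bias=2.5, seeding_scheme="lefthash")

inputs = tok(["Alice and Bob are"], return_tensors="pt")
out = model.generate(**inputs, watermarking_config=watermarking_config, do_sample=True, max_new_tokens=30)

# the detector replays the green-list selection and runs the z-test on the generated tokens
detector = WatermarkDetector(model_config=model.config, device="cpu", watermarking_config=watermarking_config)
print(detector(out))  # e.g. array([ True])
```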
Before submitting

- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@gante