[transformers] Prompt masking #2192
base: main
Conversation
Calculation of the mask looks good, but I'm pretty sure it's being applied backwards. Let's add a test case covering the whole pipeline (i.e. text -> tokenization -> text) so we catch this error, confirming that the labels are masked out only on the prompts (sketch below).
Also, can you update the internal_docs repo with instructions on how to use this new feature on a custom dataset: https://github.com/neuralmagic/internal-docs/blob/main/teams/ml-engineering/sparseml/text_generation/custom_datasets.md
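Here's a minimal, self-contained sketch of that round-trip test. `tokenize_and_mask` is just a stand-in for the PR's actual helper (I don't know its real name or signature), and -100 is assumed as the label ignore index:

```python
from transformers import AutoTokenizer

IGNORE_INDEX = -100  # the usual PyTorch cross-entropy ignore index


def tokenize_and_mask(tokenizer, prompt: str, response: str) -> dict:
    # Stand-in for the PR's helper: tokenize prompt + response, then mask
    # the labels for every token that falls inside the prompt span.
    # Assumes the prompt's tokens are a prefix of the full sequence's
    # tokens (true for GPT-2 BPE on this text; worth asserting for real).
    input_ids = tokenizer(prompt + response)["input_ids"]
    prompt_len = len(tokenizer(prompt)["input_ids"])
    labels = [IGNORE_INDEX] * prompt_len + input_ids[prompt_len:]
    return {"input_ids": input_ids, "labels": labels}


def test_prompt_labels_are_masked():
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    prompt = "[foo]some text here\n"
    response = "[bar]response here"
    enc = tokenize_and_mask(tokenizer, prompt, response)

    prompt_len = len(tokenizer(prompt)["input_ids"])
    # Labels inside the prompt span must be ignored by the loss...
    assert all(l == IGNORE_INDEX for l in enc["labels"][:prompt_len])
    # ...while every response label must survive.
    assert all(l != IGNORE_INDEX for l in enc["labels"][prompt_len:])

    # Round trip: decoding the surviving labels should recover the response.
    kept = [l for l in enc["labels"] if l != IGNORE_INDEX]
    assert tokenizer.decode(kept) == response
```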
Overall LGTM, just want to see a few more test cases (see comments)
Applying character masks to prompts in the format `[foo]some text here\n[bar]response here`, to mask characters owned by `[bar]`.
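As a rough illustration of that format, a hypothetical character mask might look like the following; whether the `[bar]` tag itself counts as "owned" is an assumption here, not something the PR specifies:

```python
def character_mask(text: str, response_tag: str = "[bar]") -> list:
    # True for characters owned by the response segment, False for the prompt.
    start = text.find(response_tag)
    if start == -1:
        # No response tag: nothing belongs to [bar].
        return [False] * len(text)
    # Treat the tag and everything after it as response-owned (an assumption).
    return [i >= start for i in range(len(text))]


text = "[foo]some text here\n[bar]response here"
mask = character_mask(text)
assert "".join(ch for ch, keep in zip(text, mask) if keep) == "[bar]response here"
```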