BanSubstrings doesn't redact all matching words when case_sensitive=False #210

Open
aalbersk opened this issue Jan 17, 2025 · 1 comment

Describe the bug
When BanSubstrings is run with case_sensitive=False and redact=True, scanning a prompt redacts only the occurrences whose casing exactly matches the banned substring. Occurrences written in a different casing are left in place.

To Reproduce

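# Import path assumes the llm-guard input scanners package
from llm_guard.input_scanners import BanSubstrings

prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
# case_sensitive=False is passed explicitly to match the description above
ban_substrings = BanSubstrings(substrings=["virus", "bug"], case_sensitive=False, redact=True)
sanitized_prompt, results_valid, results_score = ban_substrings.scan(prompt)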

Expected behavior
Actual: The user can perform arbitrary [REDACTED] code execution by Virus injecting malicious code.
Expected: The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.

Possible solution
Because str.replace is case-sensitive, the issue might be solved by using a regular expression, e.g.:

import re

def _redact_text(text: str, substrings: list[str]) -> str:
    redacted_text = text
    for s in substrings:
        # re.escape treats the substring literally; re.IGNORECASE matches any casing
        regex_redacted = re.compile(re.escape(s), re.IGNORECASE)
        redacted_text = regex_redacted.sub("[REDACTED]", redacted_text)
    return redacted_text
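
One caveat with the loop above: once matching ignores case, a banned substring such as "red" would also match inside a "[REDACTED]" placeholder inserted by an earlier iteration. A possible variant, sketched below, builds a single alternation pattern and substitutes in one pass, so replacement text is never rescanned. The helper name `redact_case_insensitive` and its `placeholder` parameter are hypothetical and not part of the library; this is only an illustration of the idea, not the maintainers' implementation.

```python
import re


def redact_case_insensitive(
    text: str, substrings: list[str], placeholder: str = "[REDACTED]"
) -> str:
    """Hypothetical helper: redact all banned substrings, ignoring case, in one pass."""
    # Drop empty strings, which would otherwise match at every position.
    candidates = [s for s in substrings if s]
    if not candidates:
        return text
    # Longest first, so overlapping alternatives prefer the longer match.
    candidates.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in candidates), re.IGNORECASE)
    # A single sub() call never rescans text it has already replaced.
    return pattern.sub(placeholder, text)


prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
print(redact_case_insensitive(prompt, ["virus", "bug"]))
# The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.
```

With the prompt from the reproduction steps, both "virus" and "Virus" are replaced, which matches the expected output stated above.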