BanSubstrings doesn't redact all matching words when case_sensitive=False #210

aalbersk · 2025-01-17T18:19:46Z

Describe the bug
When BanSubstrings is run with case_sensitive=False and redact=True, scanning a prompt redacts only the occurrences whose casing exactly matches the banned substring. Occurrences in a different case are left in place.

To Reproduce

prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
ban_substrings = BanSubstrings(substrings=["virus", "bug"], case_sensitive=False, redact=True)
sanitized_prompt, results_valid, results_score = ban_substrings.scan(prompt)

Expected behavior
Actual: The user can perform arbitrary [REDACTED] code execution by Virus injecting malicious code.
Expected: The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.

Possible solution
Since str.replace is case-sensitive, the issue might be solved by using a regex, for example:

import re

def _redact_text(text: str, substrings: list[str]) -> str:
    redacted_text = text
    for s in substrings:
        # Escape the substring so it is matched literally, ignoring case.
        regex_redacted = re.compile(re.escape(s), re.IGNORECASE)
        redacted_text = regex_redacted.sub("[REDACTED]", redacted_text)
    return redacted_text
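
A variant that might be worth considering: the loop above always ignores case and runs one substitution per substring, so a later banned substring such as "red" could match inside an already inserted "[REDACTED]". The sketch below is only an illustration, not the library's actual code. The function name `redact_substrings` and its `case_sensitive` and `placeholder` parameters are hypothetical. It builds one combined pattern, substitutes in a single pass, and only ignores case when asked to.

```python
import re


def redact_substrings(
    text: str,
    substrings: list[str],
    case_sensitive: bool = False,
    placeholder: str = "[REDACTED]",
) -> str:
    """Hypothetical helper: replace every banned substring with a placeholder."""
    # Drop empty strings, which would otherwise match at every position.
    candidates = [s for s in substrings if s]
    if not candidates:
        return text

    # Longest first, so a longer banned substring wins over a shorter
    # one that starts at the same position.
    candidates.sort(key=len, reverse=True)

    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile("|".join(re.escape(s) for s in candidates), flags)

    # One pass over the text: inserted placeholders are never re-scanned.
    return pattern.sub(placeholder, text)


prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
print(redact_substrings(prompt, ["virus", "bug"]))
# The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.
```

With case_sensitive=True the same helper leaves "Virus" untouched, which matches the behavior currently seen in the report.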
