Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Remove any NULL characters from remote displaynames before updating user directory #12743

Closed

Conversation

Fizzadar
Copy link
Contributor

@Fizzadar Fizzadar commented May 16, 2022

An event that looks like this:

{
    "content":
    {
        "displayname": "\u0001VERSION\u0001\u0000",
        "membership": "join"
    }
}

Causes an exception updating the user directory table A string literal cannot contain NUL (0x00) characters which never passes and just keeps retrying consuming a lot of CPU on main.

Is this the right fix?

Signed off by Nick ([email protected])

Pull Request Checklist

@Fizzadar Fizzadar requested a review from a team as a code owner May 16, 2022 13:29
@DMRobertson
Copy link
Contributor

I thought we tried to fix this fairly recently. #11230 I think?

Related: #10820

@DMRobertson
Copy link
Contributor

Ahh, I see: this is specific to the user directory.

@DMRobertson DMRobertson changed the title Remove any NULL characters from displaynames to avoid breaking database ten Remove any NULL characters from remote displaynames before updating user directory May 16, 2022
Comment on lines 468 to 469
# Replace any NULL characters in the name as these cannot be stored in the database
new_name = new_name.replace("\x00", "\uFFFD")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness, sqlite sort-of-supports null-codepointsin strings, with scary caveats: https://sqlite.org/nulinstr.html
Postgres definitely doesn't and won't any time soon. See e.g. this HN post.

@@ -464,6 +464,10 @@ async def _handle_possible_remote_profile_change(

prev_name = prev_event.content.get("displayname")
new_name = event.content.get("displayname")

# Replace any NULL characters in the name as these cannot be stored in the database
new_name = new_name.replace("\x00", "\uFFFD")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used a space as the replacement character in #10820. I think this might have been chosen to make sure that the text-search stuff (as in https://www.postgresql.org/docs/current/datatype-textsearch.html and https://www.postgresql.org/docs/current/textsearch-controls.html) work more nicely. Let me see if I can experiment to see how postgres handles a replacement character in that context...

Comment on lines 468 to 469
# Replace any NULL characters in the name as these cannot be stored in the database
new_name = new_name.replace("\x00", "\uFFFD")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want to move this after the isinstance(new_name, str) validation below.

... actually do we want to do it after the has-anything-changed condition?:
prev_name != new_name or prev_avatar != new_avatar

@DMRobertson
Copy link
Contributor

I think I'd put the check in update_profile_in_user_dir, which has five call sites (including this one).

Fizzadar added 2 commits May 17, 2022 14:55
Extract the shared function into storage utils and reuse in both
locations for consistent results.
@Fizzadar
Copy link
Contributor Author

So I pulled out the function added in #11230 to ensure the handling is identical for displayname/avatar_url in both places (29aff26).

@Fizzadar Fizzadar requested a review from DMRobertson May 17, 2022 18:59
@DMRobertson
Copy link
Contributor

So I pulled out the function added in #11230 to ensure the handling is identical for displayname/avatar_url in both places (29aff26).

Ahh, sorry---we might have been duplicating effort. This sounds like what I put together on #12762

@Fizzadar Fizzadar deleted the fix-null-chars-userdir-error branch May 18, 2022 11:02
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

User directory background process gets stuck when someone sets a name with a null codepoint
4 participants