Add documentation about the user directory search algorithm #16320
Thank you for this (and for fixing up my prose from a couple of years ago).
I think this looks good. I've left some thoughts and a few suggestions, but don't consider them blockers.
Results are sorted by a rank derived by:

* 4x if a user ID exists.
Err, when would this not be the case?
I had the same question. I think it is not-nullable.
    => \d user_directory_search
            Table "user_directory_search"
     Column  |   Type   | Collation | Nullable | Default
    ---------+----------+-----------+----------+---------
     user_id | text     |           | not null |
     vector  | tsvector |           |          |
    Indexes:
        "user_directory_search_fts_idx" gin (vector)
        "user_directory_search_user_idx" UNIQUE, btree (user_id)
I think this actually means the initial weight is 4, not 1? 🤷
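To make that reading concrete: because `user_id` is `not null`, a multiplier that applies "if a user ID exists" applies to every row, so it is indistinguishable from starting at a weight of 4. A minimal sketch, assuming a hypothetical `rank` helper and a made-up `other_factor` standing in for any further multipliers (neither is taken from Synapse's code):

```python
from typing import Optional


def rank(user_id: Optional[str], other_factor: float = 1.0) -> float:
    """Hypothetical multiplicative rank; not Synapse's actual query."""
    weight = 1.0
    # The "4x if a user ID exists" rule from the documentation under review.
    if user_id is not None:
        weight *= 4.0
    # Stand-in for whatever further multipliers are applied.
    return weight * other_factor


def rank_with_base_weight(other_factor: float = 1.0) -> float:
    """The same ranking, written with an initial weight of 4."""
    return 4.0 * other_factor


# user_id is NOT NULL in the table, so every row takes the 4x branch.
rows = [("@alice:example.org", 1.0), ("@bob:example.org", 1.5)]
same = all(rank(uid, f) == rank_with_base_weight(f) for uid, f in rows)
```

The two formulations only differ for a row without a user ID, which the schema rules out.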
@DMRobertson I tried to take your feedback into account. I suspect it is worth a re-read and not looking at the diff?
Comments left to your discretion. 🚢 🇮🇹 !
  * `user_directory_search`. To be joined to `user_directory`. It contains an extra
-   column that enables full text search based on user ids and display names.
-   Different schemas for SQLite and Postgres with different code paths to match.
+   column that enables full text search based on user IDs and display names.
+   Different schemas for SQLite and Postgres are used.
    - Indexed on the full text search data. Indexed on users.
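The join described in this hunk can be sketched for the SQLite case with Python's built-in `sqlite3` module. The column layout below (an FTS4 virtual table with `user_id` and `value` columns, and a cut-down `user_directory` table) is an assumption made for illustration, as are the sample users and the `search` helper; the Postgres path uses a `tsvector` column instead and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_directory (
        user_id TEXT NOT NULL PRIMARY KEY,
        display_name TEXT
    );
    -- Assumed SQLite layout: an FTS4 virtual table holding the searchable text.
    CREATE VIRTUAL TABLE user_directory_search USING fts4 (user_id, value);
    """
)

users = [
    ("@alice:example.org", "Alice Liddell"),
    ("@bob:example.org", "Bob"),
]
for user_id, display_name in users:
    conn.execute("INSERT INTO user_directory VALUES (?, ?)", (user_id, display_name))
    # Index both the user ID and the display name in the search column.
    conn.execute(
        "INSERT INTO user_directory_search (user_id, value) VALUES (?, ?)",
        (user_id, f"{user_id} {display_name}"),
    )


def search(term: str) -> list:
    """Full text match on the search table, joined back to user_directory."""
    sql = """
        SELECT d.user_id, d.display_name
        FROM user_directory_search AS s
        JOIN user_directory AS d ON d.user_id = s.user_id
        WHERE s.value MATCH ?
        ORDER BY d.user_id
    """
    return conn.execute(sql, (term,)).fetchall()


results = search("alice")
```

Here `search("alice")` matches only Alice, while `search("example")` matches both users because the server name is part of the indexed user ID.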
Aside: I always found it strange that this is a separate table. I guess it's to limit the sqlite vs postgres differences to a small table rather than a larger one?
I think it is because you enable "full text search" on a table, but the database engine then backs that with multiple tables that are hidden from you.
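For SQLite at least, the hidden backing tables are easy to observe. A small sketch using Python's `sqlite3` module, assuming the SQLite build has FTS4 enabled and using an illustrative two-column layout rather than Synapse's exact schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Creating one full text search table...
conn.execute("CREATE VIRTUAL TABLE user_directory_search USING fts4 (user_id, value)")

# ...results in several tables being registered in the schema.
names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
# Everything except the virtual table itself is a shadow table managed by FTS4.
shadow_tables = [n for n in names if n != "user_directory_search"]
```

`shadow_tables` contains names such as `user_directory_search_content`, which SQLite maintains on behalf of the virtual table. This only demonstrates the SQLite side; on Postgres the search data is a `tsvector` column with a GIN index, as in the `\d` output earlier in the thread.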
docs/user_directory.md
Outdated
* `user_directory_stream_pos`. When the initial background update to populate
  the directory is complete, we record a stream position here. This indicates
  that synapse should now listen for room changes and incrementally update
  the directory where necessary.
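The behaviour described for `user_directory_stream_pos` can be modelled with a toy class. This is only a sketch of the idea: `DirectoryUpdater`, its methods and the sample changes are hypothetical names chosen for illustration and do not correspond to Synapse's implementation.

```python
from typing import List, Optional, Tuple


class DirectoryUpdater:
    """Toy model of the stream position handling; all names are hypothetical."""

    def __init__(self) -> None:
        # None means the initial background update has not finished yet.
        self.stream_pos: Optional[int] = None
        self.handled: List[str] = []

    def finish_initial_population(self, current_pos: int) -> None:
        # Record where the stream was when the directory was fully populated.
        self.stream_pos = current_pos

    def on_room_changes(self, changes: List[Tuple[int, str]]) -> None:
        if self.stream_pos is None:
            # Still populating: incremental updates are not processed yet.
            return
        for pos, change in sorted(changes):
            # Only changes after the recorded position need handling.
            if pos > self.stream_pos:
                self.handled.append(change)
                self.stream_pos = pos


updater = DirectoryUpdater()
updater.on_room_changes([(5, "ignored: directory not populated yet")])
updater.finish_initial_population(current_pos=10)
updater.on_room_changes([(9, "already covered"), (11, "join"), (12, "leave")])
```

After this run only the changes at positions 11 and 12 have been handled, and the recorded position has advanced to 12.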
Could cross-reference to https://matrix-org.github.io/synapse/latest/development/synapse_architecture/streams.html here maybe
Co-authored-by: David Robertson <[email protected]>
No significant changes since 1.94.0rc1.

- Render plain, CSS, CSV, JSON and common image formats in the browser (inline) when requested through the /download endpoint. (matrix-org#15988)
- Add experimental support for MSC4028 (matrix-org/matrix-spec-proposals#4028) to push all encrypted events to clients. (matrix-org#16361)
- Minor performance improvement when sending presence to federated servers. (matrix-org#16385)
- Minor performance improvement by caching server ACL checking. (matrix-org#16360)
- Add developer documentation concerning gradual schema migrations with column alterations. (matrix-org#15691)
- Improve documentation of the user directory search algorithm. (matrix-org#16320)
- Fix rendering of user admin API documentation around deactivation. This was broken in Synapse 1.91.0. (matrix-org#16355)
- Update documentation around message retention policies. (matrix-org#16382)
- Add note to `federation_domain_whitelist` config option to clarify its usage. (matrix-org#16416)
- Improve legacy release notes. (matrix-org#16418)
- Remove Python version from `/_synapse/admin/v1/server_version`. (matrix-org#16380)
- Avoid running CI steps when the files they check have not been changed. (matrix-org#14745, matrix-org#16387)
- Improve type hints. (matrix-org#14911, matrix-org#16350, matrix-org#16356, matrix-org#16395)
- Added support for pydantic v2 in addition to pydantic v1. Contributed by Maxwell G (@gotmax23). (matrix-org#16332)
- Get CI to check PRs have been signed-off. (matrix-org#16348)
- Add missing licence header. (matrix-org#16359)
- Improve type hints, and bump types-psycopg2 from 2.9.21.11 to 2.9.21.14. (matrix-org#16381)
- Improve comments in `StateGroupBackgroundUpdateStore`. (matrix-org#16383)
- Update maturin configuration. (matrix-org#16394)
- Downgrade replication stream time out error log lines to warning. (matrix-org#16401)

* Bump actions/checkout from 3 to 4. (matrix-org#16250)
* Bump cryptography from 41.0.3 to 41.0.4. (matrix-org#16362)
* Bump dawidd6/action-download-artifact from 2.27.0 to 2.28.0. (matrix-org#16374)
* Bump docker/setup-buildx-action from 2 to 3. (matrix-org#16375)
* Bump gitpython from 3.1.35 to 3.1.37. (matrix-org#16376)
* Bump msgpack from 1.0.5 to 1.0.6. (matrix-org#16377)
* Bump msgpack from 1.0.6 to 1.0.7. (matrix-org#16412)
* Bump phonenumbers from 8.13.19 to 8.13.22. (matrix-org#16413)
* Bump psycopg2 from 2.9.7 to 2.9.8. (matrix-org#16409)
* Bump pydantic from 2.3.0 to 2.4.2. (matrix-org#16410)
* Bump regex from 1.9.5 to 1.9.6. (matrix-org#16408)
* Bump sentry-sdk from 1.30.0 to 1.31.0. (matrix-org#16378)
* Bump types-netaddr from 0.8.0.9 to 0.9.0.1. (matrix-org#16411)
* Bump types-psycopg2 from 2.9.21.11 to 2.9.21.14. (matrix-org#16381)
* Bump urllib3 from 1.26.15 to 1.26.17. (matrix-org#16422)

# gpg: Signature made Tue Oct 10 11:01:00 2023 BST
# gpg: using RSA key F124520CEEE062448FE1C8442D2EFA2F32FBE047
# gpg: Can't check signature: No public key

# Conflicts:
# .github/workflows/docker.yml
# .github/workflows/docs-pr-netlify.yaml
# .github/workflows/docs-pr.yaml
# .github/workflows/docs.yaml
# .github/workflows/latest_deps.yml
# .github/workflows/poetry_lockfile.yaml
# .github/workflows/push_complement_image.yml
# .github/workflows/release-artifacts.yml
# .github/workflows/tests.yml
# .github/workflows/twisted_trunk.yml
# poetry.lock
# rust/src/push/base_rules.rs
Cleans up the current user directory documentation and expands it a bit. It documents the algorithm (which is kind of readable from the code, but kind of not).