From a60bccf9d17b9815513af078c46229ac3395c5e7 Mon Sep 17 00:00:00 2001 From: DMRobertson Date: Tue, 3 Oct 2023 10:53:20 +0000 Subject: deploy: 8b50a9d01da2c84bb9838287519fa3e0a4e955ce --- v1.94/user_directory.html | 319 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 319 insertions(+) create mode 100644 v1.94/user_directory.html (limited to 'v1.94/user_directory.html') diff --git a/v1.94/user_directory.html b/v1.94/user_directory.html new file mode 100644 index 0000000000..df4d710ab0 --- /dev/null +++ b/v1.94/user_directory.html @@ -0,0 +1,319 @@ + + + + + + User Directory - Synapse + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + + + + +
+
+ +
+ +
+ +

User Directory API Implementation

+

The user directory is maintained based on users that are 'visible' to the homeserver - +i.e. ones which are local to the server and ones which any local user shares a +room with.

+

The directory info is stored in various tables, which can sometimes get out of +sync (although this is considered a bug). If this happens, for now the +solution to fix it is to use the admin API +and execute the job regenerate_directory. This should then start a background task to +flush the current tables and regenerate the directory. Depending on the size +of your homeserver (number of users and rooms) this can take a while.

+

Data model

+

There are five relevant tables that collectively form the "user directory". +Three of them track a list of all known users. The last two (collectively called +the "search tables") track which users are visible to each other.

+

From all of these tables we exclude three types of local user:

+
    +
  • support users
  • +
  • appservice users
  • +
  • deactivated users
  • +
+

A description of each table follows:

+
    +
  • +

    user_directory. This contains the user ID, display name and avatar of each user.

    +
      +
    • Because there is only one directory entry per user, it is important that it +only contain publicly visible information. Otherwise, this will leak the +nickname or avatar used in a private room.
    • +
    • Indexed on rooms. Indexed on users.
    • +
    +
  • +
  • +

    user_directory_search. To be joined to user_directory. It contains an extra +column that enables full text search based on user IDs and display names. +Different schemas for SQLite and Postgres are used.

    +
      +
    • Indexed on the full text search data. Indexed on users.
    • +
    +
  • +
  • +

    user_directory_stream_pos. When the initial background update to populate +the directory is complete, we record a stream position here. This indicates +that synapse should now listen for room changes and incrementally update +the directory where necessary. (See stream positions.)

    +
  • +
  • +

    users_in_public_rooms. Contains associations between users and the public +rooms they're in. Used to determine which users are in public rooms and should +be publicly visible in the directory. Both local and remote users are tracked.

    +
  • +
  • +

    users_who_share_private_rooms. Rows are triples (L, M, room id) where L +is a local user and M is a local or remote user. L and M should be +different, but this isn't enforced by a constraint.

    +

    Note that if two local users share a room then there will be two entries: +(user1, user2, !room_id) and (user2, user1, !room_id).

    +
  • +
+

Configuration options

+

The exact way user search works can be tweaked via some server-level +configuration options.

+

The information is not repeated here, but the options are mentioned below.

+

Search algorithm

+

If search_all_users is false, then results are limited to users who:

+
    +
  1. Are found in the users_in_public_rooms table, or
  2. +
  3. Are found in the users_who_share_private_rooms where L is the requesting +user and M is the search result.
  4. +
+

Otherwise, if search_all_users is true, no such limits are placed and all +users known to the server (matching the search query) will be returned.

+

By default, locked users are not returned. If show_locked_users is true then +no filtering on the locked status of a user is done.

+

The user provided search term is lowercased and normalized using NFKC, +this treats the string as case-insensitive, canonicalizes different forms of the +same text, and maps some "roughly equivalent" characters together.

+

The search term is then split into words:

+
    +
  • If ICU is +available, then the system's default locale +will be used to break the search term into words. (See the +installation instructions for how to install ICU.)
  • +
  • If unavailable, then runs of ASCII characters, numbers, underscores, and hyphens +are considered words.
  • +
+

The queries for PostgreSQL and SQLite are detailed below, by their overall goal +is to find matching users, preferring users who are "real" (e.g. not bots, +not deactivated). It is assumed that real users will have an display name and +avatar set.

+

PostgreSQL

+

The above words are then transformed into two queries:

+
    +
  1. "exact" which matches the parsed words exactly (using to_tsquery);
  2. +
  3. "prefix" which matches the parsed words as prefixes (using to_tsquery).
  4. +
+

Results are composed of all rows in the user_directory_search table whose information +matches one (or both) of these queries. Results are ordered by calculating a weighted +score for each result, higher scores are returned first:

+
    +
  • 4x if a user ID exists.
  • +
  • 1.2x if the user has a display name set.
  • +
  • 1.2x if the user has an avatar set.
  • +
  • 0x-3x by the full text search results using the ts_rank_cd function +against the "exact" search query; this has four variables with the following weightings: +
      +
    • D: 0.1 for the user ID's domain
    • +
    • C: 0.1 for unused
    • +
    • B: 0.9 for the user's display name (or an empty string if it is not set)
    • +
    • A: 0.1 for the user ID's localpart
    • +
    +
  • +
  • 0x-1x by the full text search results using the ts_rank_cd function against the +"prefix" search query. (Using the same weightings as above.)
  • +
  • If prefer_local_users is true, then 2x if the user is local to the homeserver.
  • +
+

Note that ts_rank_cd returns a weight between 0 and 1. The initial weighting of +all results is 1.

+

SQLite

+

Results are composed of all rows in the user_directory_search whose information +matches the query. Results are ordered by the following information, with each +subsequent column used as a tiebreaker, for each result:

+
    +
  1. By the rank +of the full text search results using the matchinfo function. Higher +ranks are returned first.
  2. +
  3. If prefer_local_users is true, then users local to the homeserver are +returned first.
  4. +
  5. Users with a display name set are returned first.
  6. +
  7. Users with an avatar set are returned first.
  8. +
+ +
+ + +
+
+ + + +
+ + + + + + + + + + + + + \ No newline at end of file -- cgit 1.5.1