From ca25be76d1e40ca03f81a561b4c25fd2a43ce23a Mon Sep 17 00:00:00 2001 From: reivilibre Date: Wed, 24 Apr 2024 13:43:33 +0000 Subject: deploy: 4cd6b75d0a95aa373068fae8b3a431fd453c9728 --- .../admin_api/background_updates.html | 274 +++++++++++++ .../usage/administration/admin_api/federation.html | 376 +++++++++++++++++ v1.106/usage/administration/admin_api/index.html | 236 +++++++++++ .../admin_api/registration_tokens.html | 443 +++++++++++++++++++++ v1.106/usage/administration/admin_faq.html | 416 +++++++++++++++++++ .../administration/database_maintenance_tools.html | 216 ++++++++++ v1.106/usage/administration/index.html | 211 ++++++++++ .../reporting_homeserver_usage_statistics.html | 274 +++++++++++++ .../usage/administration/monthly_active_users.html | 268 +++++++++++++ v1.106/usage/administration/request_log.html | 239 +++++++++++ v1.106/usage/administration/state_groups.html | 217 ++++++++++ ...derstanding_synapse_through_grafana_graphs.html | 254 ++++++++++++ .../administration/useful_sql_for_admins.html | 380 ++++++++++++++++++ 13 files changed, 3804 insertions(+) create mode 100644 v1.106/usage/administration/admin_api/background_updates.html create mode 100644 v1.106/usage/administration/admin_api/federation.html create mode 100644 v1.106/usage/administration/admin_api/index.html create mode 100644 v1.106/usage/administration/admin_api/registration_tokens.html create mode 100644 v1.106/usage/administration/admin_faq.html create mode 100644 v1.106/usage/administration/database_maintenance_tools.html create mode 100644 v1.106/usage/administration/index.html create mode 100644 v1.106/usage/administration/monitoring/reporting_homeserver_usage_statistics.html create mode 100644 v1.106/usage/administration/monthly_active_users.html create mode 100644 v1.106/usage/administration/request_log.html create mode 100644 v1.106/usage/administration/state_groups.html create mode 100644 v1.106/usage/administration/understanding_synapse_through_grafana_graphs.html create 
mode 100644 v1.106/usage/administration/useful_sql_for_admins.html (limited to 'v1.106/usage/administration') diff --git a/v1.106/usage/administration/admin_api/background_updates.html b/v1.106/usage/administration/admin_api/background_updates.html new file mode 100644 index 0000000000..f81a7c0564 --- /dev/null +++ b/v1.106/usage/administration/admin_api/background_updates.html @@ -0,0 +1,274 @@ + + + + + + Background Updates - Synapse + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + + + + +
+
+ +
+ +
+ +

Background Updates API

+

This API allows a server administrator to manage the background updates being +run against the database.

+

Status

+

This API gets the current status of the background updates.

+

The API is:

+
GET /_synapse/admin/v1/background_updates/status
+
+

Returning:

+
{
+    "enabled": true,
+    "current_updates": {
+        "<db_name>": {
+            "name": "<background_update_name>",
+            "total_item_count": 50,
+            "total_duration_ms": 10000.0,
+            "average_items_per_ms": 2.2
+        }
+    }
+}
+
+

enabled: whether the background updates are enabled or disabled.

+

db_name: the database name (usually Synapse is configured with a single database named 'master').

+

For each update:

+

name: the name of the update.
+total_item_count: total number of "items" processed (the meaning of 'items' depends on the update in question).
+total_duration_ms: how long the background process has been running, not including time spent sleeping.
+average_items_per_ms: how many items are processed per millisecond based on an exponential average.
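
As a rough illustration of reading this response, the sketch below turns a decoded status body into one summary line per database. The function name `summarize_status` and the update name `example_update` are made up for the example; only the field names come from the response shown above.

```python
def summarize_status(status: dict) -> list[str]:
    """Return one human-readable line per database with a running update."""
    lines = []
    for db_name, update in status.get("current_updates", {}).items():
        seconds = update["total_duration_ms"] / 1000
        lines.append(
            f"{db_name}: {update['name']} processed "
            f"{update['total_item_count']} items in {seconds:.1f}s"
        )
    if not status.get("enabled", True):
        lines.append("background updates are currently paused")
    return lines


example = {
    "enabled": True,
    "current_updates": {
        "master": {
            "name": "example_update",  # hypothetical update name
            "total_item_count": 50,
            "total_duration_ms": 10000.0,
            "average_items_per_ms": 2.2,
        }
    },
}

for line in summarize_status(example):
    print(line)
```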

+

Enabled

+

This API allows pausing background updates.

+

Background updates should not be paused for significant periods of time, as +this can affect the performance of Synapse.

+

Note: This won't persist over restarts.

+

Note: This won't cancel any update query that is currently running. This is +usually fine since most queries are short-lived, except for CREATE INDEX +background updates which won't be cancelled once started.

+

The API is:

+
POST /_synapse/admin/v1/background_updates/enabled
+
+

with the following body:

+
{
+    "enabled": false
+}
+
+

enabled sets whether the background updates are enabled or disabled.

+

The API returns the enabled param.

+
{
+    "enabled": false
+}
+
+

There is also a GET version which returns the enabled state.
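
A minimal sketch of building the pause request with Python's standard library, assuming Synapse is reachable at `http://127.0.0.1:8008` and that `<access_token>` is replaced with a server admin's token. The helper name `build_enabled_request` is hypothetical; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_enabled_request(
    base_url: str, access_token: str, enabled: bool
) -> urllib.request.Request:
    """Build (but do not send) the POST request that pauses or resumes updates."""
    body = json.dumps({"enabled": enabled}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_synapse/admin/v1/background_updates/enabled",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Base URL and token are placeholders for your own deployment.
request = build_enabled_request("http://127.0.0.1:8008", "<access_token>", enabled=False)
# To send it: urllib.request.urlopen(request)
```

Remember to send the same request with `"enabled": true` afterwards, since background updates should not stay paused for long.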

+

Run

+

This API schedules a specific background update to run. The job starts immediately after calling the API.

+

The API is:

+
POST /_synapse/admin/v1/background_updates/start_job
+
+

with the following body:

+
{
+    "job_name": "populate_stats_process_rooms"
+}
+
+

The following JSON body parameters are available:

+
    +
  • job_name - A string naming the job to run. Valid values are: +
      +
    • populate_stats_process_rooms - Recalculate the stats for all rooms.
    • +
    • regenerate_directory - Recalculate the user directory if it is stale or out of sync.
    • +
    +
  • +
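
Since only the two job names above are documented, a client may want to reject anything else before calling the API. The following is a sketch of that check; `KNOWN_JOBS` and `start_job_body` are names invented for this example.

```python
import json

# The two job names listed in the documentation above.
KNOWN_JOBS = {"populate_stats_process_rooms", "regenerate_directory"}


def start_job_body(job_name: str) -> str:
    """Return the JSON request body for start_job, rejecting unknown job names."""
    if job_name not in KNOWN_JOBS:
        raise ValueError(f"unknown job_name: {job_name!r}")
    return json.dumps({"job_name": job_name})


print(start_job_body("regenerate_directory"))
```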
+ +
+ + +
+
+ + + +
+ + + + + + + + + + + + + + + diff --git a/v1.106/usage/administration/admin_api/federation.html b/v1.106/usage/administration/admin_api/federation.html new file mode 100644 index 0000000000..23d2517cf1 --- /dev/null +++ b/v1.106/usage/administration/admin_api/federation.html @@ -0,0 +1,376 @@ + + + + + + Federation - Synapse + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + + + + +
+
+ +
+ +
+ +

Federation API

+

This API allows a server administrator to manage Synapse's federation with other homeservers.

+

Note: This API is new, experimental and "subject to change".

+

List of destinations

+

This API gets the current destination retry timing info for all remote servers.

+

The list contains all the servers with which the server federates, +regardless of whether an error occurred or not. +If an error occurs, it may take up to 20 minutes for the error to be displayed here, +as a complete retry must have failed.

+

The API is:

+

A standard request with no filtering:

+
GET /_synapse/admin/v1/federation/destinations
+
+

A response body like the following is returned:

+
{
+   "destinations":[
+      {
+         "destination": "matrix.org",
+         "retry_last_ts": 1557332397936,
+         "retry_interval": 3000000,
+         "failure_ts": 1557329397936,
+         "last_successful_stream_ordering": null
+      }
+   ],
+   "total": 1
+}
+
+

To paginate, check for next_token and if present, call the endpoint again +with from set to the value of next_token. This will return a new page.

+

If the endpoint does not return a next_token then there are no more destinations +to paginate through.

+

Parameters

+

The following query parameters are available:

+
    +
  • from - Offset in the returned list. Defaults to 0.
  • +
  • limit - Maximum number of destinations to return. Defaults to 100.
  • +
  • order_by - The method in which to sort the returned list of destinations. +Valid values are: +
      +
    • destination - Destinations are ordered alphabetically by remote server name. +This is the default.
    • +
    • retry_last_ts - Destinations are ordered by time of last retry attempt in ms.
    • +
    • retry_interval - Destinations are ordered by how long until next retry in ms.
    • +
    • failure_ts - Destinations are ordered by when the server started failing in ms.
    • +
    • last_successful_stream_ordering - Destinations are ordered by the stream ordering +of the most recent successfully-sent PDU.
    • +
    +
  • +
  • dir - Direction of the sort order. Either f for forwards or b for backwards. Setting +this value to b will reverse the above sort order. Defaults to f.
  • +
+

Caution: The database only has an index on the column destination. +This means that if a different sort order is used, +this can cause a large load on the database, especially for large environments.
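
The query parameters can be assembled as in the sketch below. The helper name `destinations_path` is hypothetical; the parameter names, valid values and defaults are the ones listed above. Keeping the default `order_by` of `destination` uses the only indexed column.

```python
from urllib.parse import urlencode

VALID_ORDER_BY = (
    "destination",
    "retry_last_ts",
    "retry_interval",
    "failure_ts",
    "last_successful_stream_ordering",
)


def destinations_path(
    from_: int = 0,
    limit: int = 100,
    order_by: str = "destination",
    direction: str = "f",
) -> str:
    """Build the request path and query string for the destinations list."""
    if order_by not in VALID_ORDER_BY:
        raise ValueError(f"invalid order_by: {order_by!r}")
    if direction not in ("f", "b"):
        raise ValueError("dir must be 'f' or 'b'")
    query = urlencode(
        {"from": from_, "limit": limit, "order_by": order_by, "dir": direction}
    )
    return f"/_synapse/admin/v1/federation/destinations?{query}"


print(destinations_path(limit=10))
```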

+

Response

+

The following fields are returned in the JSON response body:

+
    +
  • destinations - An array of objects, each containing information about a destination. +Destination objects contain the following fields: +
      +
    • destination - string - Name of the remote server to federate with.
    • +
    • retry_last_ts - integer - The last time Synapse tried and failed to reach the +remote server, in ms. This is 0 if the last attempt to communicate with the +remote server was successful.
    • +
    • retry_interval - integer - How long since the last time Synapse tried to reach +the remote server before trying again, in ms. This is 0 if no further retrying is occurring.
    • +
    • failure_ts - nullable integer - The first time Synapse tried and failed to reach the +remote server, in ms. This is null if communication with the remote server has never failed.
    • +
    • last_successful_stream_ordering - nullable integer - The stream ordering of the most +recent successfully-sent PDU +to this destination, or null if this information has not been tracked yet.
    • +
    +
  • +
  • next_token: string representing a positive integer - Indication for pagination. See above.
  • +
  • total - integer - Total number of destinations.
  • +
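
The pagination rule described above (keep passing `next_token` as `from` until no `next_token` is returned) could be sketched as follows. `fetch_page` stands in for whatever performs the HTTP request, and `fake_fetch` is a stub with invented server names so the loop can run without a homeserver.

```python
from typing import Callable, Iterator, Optional


def iter_destinations(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every destination, following next_token until it is absent."""
    next_token: Optional[str] = None
    while True:
        page = fetch_page(next_token)  # next_token is sent as the `from` parameter
        yield from page["destinations"]
        next_token = page.get("next_token")
        if next_token is None:
            break


def fake_fetch(from_token: Optional[str]) -> dict:
    """Stub returning two canned pages instead of calling the Admin API."""
    pages = {
        None: {"destinations": [{"destination": "a.example"}], "next_token": "1", "total": 2},
        "1": {"destinations": [{"destination": "b.example"}], "total": 2},
    }
    return pages[from_token]


names = [d["destination"] for d in iter_destinations(fake_fetch)]
print(names)
```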
+

Destination Details API

+

This API gets the retry timing info for a specific remote server.

+

The API is:

+
GET /_synapse/admin/v1/federation/destinations/<destination>
+
+

A response body like the following is returned:

+
{
+   "destination": "matrix.org",
+   "retry_last_ts": 1557332397936,
+   "retry_interval": 3000000,
+   "failure_ts": 1557329397936,
+   "last_successful_stream_ordering": null
+}
+
+

Parameters

+

The following parameters should be set in the URL:

+
    +
  • destination - Name of the remote server.
  • +
+

Response

+

The response fields are the same as those of the objects in the destinations array in the +List of destinations response.
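
To make the timing fields easier to read, the sketch below estimates when the next attempt is due. It assumes the next retry happens `retry_interval` milliseconds after `retry_last_ts`, which is one reading of the field descriptions rather than a documented guarantee. The function name `next_retry_time` is made up for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def next_retry_time(details: dict) -> Optional[datetime]:
    """Estimate the next retry time, or None if the destination is not backing off."""
    if details["retry_last_ts"] == 0:
        # 0 means the last attempt succeeded, so no retry is pending.
        return None
    due_ms = details["retry_last_ts"] + details["retry_interval"]
    return EPOCH + timedelta(milliseconds=due_ms)


details = {
    "destination": "matrix.org",
    "retry_last_ts": 1557332397936,
    "retry_interval": 3000000,
    "failure_ts": 1557329397936,
    "last_successful_stream_ordering": None,
}
print(next_retry_time(details))
```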

+

Destination rooms

+

This API gets the rooms that federate with a specific remote server.

+

The API is:

+
GET /_synapse/admin/v1/federation/destinations/<destination>/rooms
+
+

A response body like the following is returned:

+
{
+   "rooms":[
+      {
+         "room_id": "!OGEhHVWSdvArJzumhm:matrix.org",
+         "stream_ordering": 8326
+      },
+      {
+         "room_id": "!xYvNcQPhnkrdUmYczI:matrix.org",
+         "stream_ordering": 93534
+      }
+   ],
+   "total": 2
+}
+
+

To paginate, check for next_token and if present, call the endpoint again +with from set to the value of next_token. This will return a new page.

+

If the endpoint does not return a next_token then there are no more rooms +to paginate through.

+

Parameters

+

The following parameters should be set in the URL:

+
    +
  • destination - Name of the remote server.
  • +
+

The following query parameters are available:

+
    +
  • from - Offset in the returned list. Defaults to 0.
  • +
  • limit - Maximum number of rooms to return. Defaults to 100.
  • +
  • dir - Direction of room order by room_id. Either f for forwards or b for +backwards. Defaults to f.
  • +
+

Response

+

The following fields are returned in the JSON response body:

+
    +
  • rooms - An array of objects, each containing information about a room. +Room objects contain the following fields: +
      +
    • room_id - string - The ID of the room.
    • +
    • stream_ordering - integer - The stream ordering of the most recent +successfully-sent PDU +to this destination in this room.
    • +
    +
  • +
  • next_token: string representing a positive integer - Indication for pagination. See above.
  • +
  • total - integer - Total number of rooms.
  • +
+

Reset connection timeout

+

Synapse makes federation requests to other homeservers. If a federation request fails, +Synapse will mark the destination homeserver as offline, preventing any future requests +to that server for a "cooldown" period. This period grows over time if the server +continues to fail its responses +(exponential backoff).

+

Admins can cancel the cooldown period with this API.

+

This API resets the retry timing for a specific remote server and tries to connect to +the remote server again. It does not wait for the next retry_interval. +The connection must have previously run into an error and retry_last_ts +(Destination Details API) must not be equal to 0.

+

The connection attempt is carried out in the background and can take a while +even if the API already returns the HTTP status 200.

+

The API is:

+
POST /_synapse/admin/v1/federation/destinations/<destination>/reset_connection
+
+{}
+
+

Parameters

+

The following parameters should be set in the URL:

+
    +
  • destination - Name of the remote server.
  • +
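
A sketch of the reset call, including the precondition that `retry_last_ts` must not be 0. The helper names `can_reset` and `build_reset_request` are hypothetical, and the base URL and token are placeholders. The request is only built, not sent.

```python
import urllib.request
from urllib.parse import quote


def can_reset(details: dict) -> bool:
    """A reset only applies if a previous attempt failed (retry_last_ts != 0)."""
    return details["retry_last_ts"] != 0


def build_reset_request(
    base_url: str, access_token: str, destination: str
) -> urllib.request.Request:
    """Build (but do not send) the reset_connection request for one destination."""
    name = quote(destination, safe="")
    return urllib.request.Request(
        url=f"{base_url}/_synapse/admin/v1/federation/destinations/{name}/reset_connection",
        data=b"{}",  # the API expects an empty JSON object as the body
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


request = build_reset_request("http://127.0.0.1:8008", "<access_token>", "matrix.org")
```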
+ +
+ + +
+
+ + + +
+ + + + + + + + + + + + + + + diff --git a/v1.106/usage/administration/admin_api/index.html b/v1.106/usage/administration/admin_api/index.html new file mode 100644 index 0000000000..f49740dff6 --- /dev/null +++ b/v1.106/usage/administration/admin_api/index.html @@ -0,0 +1,236 @@ + + + + + + Admin API - Synapse + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + + + + +
+
+ +
+ +
+ +

The Admin API

+

Authenticate as a server admin

+

Many of the API calls in the Admin API require an access_token for a +server admin. (Note that a server admin is distinct from a room admin.)

+

An existing user can be marked as a server admin by updating the database directly.

+

Check your database settings in the configuration file, connect to the correct database using either psql [database name] (if using PostgreSQL) or sqlite3 path/to/your/database.db (if using SQLite) and elevate the user @foo:bar.com to administrator.

+
UPDATE users SET admin = 1 WHERE name = '@foo:bar.com';
+
+

A new server admin user can also be created using the register_new_matrix_user +command. This is a script that is distributed as part of Synapse. It may +already be on your $PATH depending on how Synapse was installed.

+

Finding your user's access_token is client-dependent, but will usually be shown in the client's settings.

+

Making an Admin API request

+

For security reasons, we recommend +that the Admin API (/_synapse/admin/...) should be hidden from public view using a +reverse proxy. This means you should typically query the Admin API from a terminal on +the machine which runs Synapse.

+

Once you have your access_token, you will need to authenticate each request to an Admin API endpoint by +providing the token as either a query parameter or a request header. To add it as a request header in cURL:

+
curl --header "Authorization: Bearer <access_token>" <the_rest_of_your_API_request>
+
+

For example, suppose we want to +query the account of the user +@foo:bar.com. We need an admin access token (e.g. +syt_AjfVef2_L33JNpafeif_0feKJfeaf0CQpoZk), and we need to know which port +Synapse's client listener is listening +on (e.g. 8008). Then we can use the following command to request the account +information from the Admin API.

+
curl --header "Authorization: Bearer syt_AjfVef2_L33JNpafeif_0feKJfeaf0CQpoZk" -X GET http://127.0.0.1:8008/_synapse/admin/v2/users/@foo:bar.com
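
The same request can be built from Python. This is a sketch using only the standard library; the helper name `admin_get` is hypothetical, and the token is the example value from above. The user ID is percent-encoded here, which is an extra precaution that the cURL command above does not take.

```python
import urllib.request
from urllib.parse import quote


def admin_get(base_url: str, access_token: str, path: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request to the Admin API."""
    return urllib.request.Request(
        url=f"{base_url}{path}",
        method="GET",
        headers={"Authorization": f"Bearer {access_token}"},
    )


user_id = quote("@foo:bar.com", safe="")
request = admin_get(
    "http://127.0.0.1:8008",
    "syt_AjfVef2_L33JNpafeif_0feKJfeaf0CQpoZk",
    f"/_synapse/admin/v2/users/{user_id}",
)
# To send it: urllib.request.urlopen(request)
```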
+
+

For more details on access tokens in Matrix, please refer to the complete +matrix spec documentation.

+ +
+ + +
+
+ + + +
+ + + + + + + + + + + + + + + diff --git a/v1.106/usage/administration/admin_api/registration_tokens.html b/v1.106/usage/administration/admin_api/registration_tokens.html new file mode 100644 index 0000000000..98c0d1807c --- /dev/null +++ b/v1.106/usage/administration/admin_api/registration_tokens.html @@ -0,0 +1,443 @@ + + + + + + Registration Tokens - Synapse + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + + + + +
+
+ +
+ +
+ +

Registration Tokens

+

Note: This API is disabled when MSC3861 is enabled. See #15582

+

This API allows you to manage tokens which can be used to authenticate +registration requests, as proposed in +MSC3231 +and stabilised in version 1.2 of the Matrix specification. +To use it, you will need to enable the registration_requires_token config +option, and authenticate by providing an access_token for a server admin: +see Admin API.

+

Registration token objects

+

Most endpoints make use of JSON objects that contain details about tokens. +These objects have the following fields:

+
    +
  • token: The token which can be used to authenticate registration.
  • +
  • uses_allowed: The number of times the token can be used to complete a +registration before it becomes invalid.
  • +
  • pending: The number of pending uses the token has. When someone uses +the token to authenticate themselves, the pending counter is incremented +so that the token is not used more than the permitted number of times. +When the person completes registration the pending counter is decremented, +and the completed counter is incremented.
  • +
  • completed: The number of times the token has been used to successfully +complete a registration.
  • +
  • expiry_time: The latest time the token is valid. Given as the number of +milliseconds since 1970-01-01 00:00:00 UTC (the start of the Unix epoch). +To convert this into a human-readable form you can remove the milliseconds +and use the date command. For example, date -d '@1625394937'.
  • +
+

List all tokens

+

Lists all tokens and details about them. If the request is successful, the top +level JSON object will have a registration_tokens key which is an array of +registration token objects.

+
GET /_synapse/admin/v1/registration_tokens
+
+

Optional query parameters:

+
    +
  • valid: true or false. If true, only valid tokens are returned. +If false, only tokens that have expired or have had all uses exhausted are +returned. If omitted, all tokens are returned regardless of validity.
  • +
+

Example:

+
GET /_synapse/admin/v1/registration_tokens
+
+
200 OK
+
+{
+    "registration_tokens": [
+        {
+            "token": "abcd",
+            "uses_allowed": 3,
+            "pending": 0,
+            "completed": 1,
+            "expiry_time": null
+        },
+        {
+            "token": "pqrs",
+            "uses_allowed": 2,
+            "pending": 1,
+            "completed": 1,
+            "expiry_time": null
+        },
+        {
+            "token": "wxyz",
+            "uses_allowed": null,
+            "pending": 0,
+            "completed": 9,
+            "expiry_time": 1625394937000    // 2021-07-04 10:35:37 UTC
+        }
+    ]
+}
+
+

Example using the valid query parameter:

+
GET /_synapse/admin/v1/registration_tokens?valid=false
+
+
200 OK
+
+{
+    "registration_tokens": [
+        {
+            "token": "pqrs",
+            "uses_allowed": 2,
+            "pending": 1,
+            "completed": 1,
+            "expiry_time": null
+        },
+        {
+            "token": "wxyz",
+            "uses_allowed": null,
+            "pending": 0,
+            "completed": 9,
+            "expiry_time": 1625394937000    // 2021-07-04 10:35:37 UTC
+        }
+    ]
+}
+
+
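
The two examples above suggest how validity is decided: a token stops being valid once it has expired or once `pending` plus `completed` reaches `uses_allowed`. The sketch below encodes that reading. It is inferred from the examples, and the treatment of the exact expiry instant is an assumption.

```python
def is_token_valid(token: dict, now_ms: int) -> bool:
    """Return True if the token has neither expired nor used up its allowance."""
    expiry = token["expiry_time"]
    if expiry is not None and expiry <= now_ms:  # boundary handling is assumed
        return False
    uses_allowed = token["uses_allowed"]
    if uses_allowed is not None and token["pending"] + token["completed"] >= uses_allowed:
        return False
    return True


tokens = [
    {"token": "abcd", "uses_allowed": 3, "pending": 0, "completed": 1, "expiry_time": None},
    {"token": "pqrs", "uses_allowed": 2, "pending": 1, "completed": 1, "expiry_time": None},
    {"token": "wxyz", "uses_allowed": None, "pending": 0, "completed": 9, "expiry_time": 1625394937000},
]
now_ms = 1625394937001  # just after the expiry of "wxyz"
invalid = [t["token"] for t in tokens if not is_token_valid(t, now_ms)]
print(invalid)
```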

Get one token

+

Get details about a single token. If the request is successful, the response +body will be a registration token object.

+
GET /_synapse/admin/v1/registration_tokens/<token>
+
+

Path parameters:

+
    +
  • token: The registration token to return details of.
  • +
+

Example:

+
GET /_synapse/admin/v1/registration_tokens/abcd
+
+
200 OK
+
+{
+    "token": "abcd",
+    "uses_allowed": 3,
+    "pending": 0,
+    "completed": 1,
+    "expiry_time": null
+}
+
+

Create token

+

Create a new registration token. If the request is successful, the newly created +token will be returned as a registration token object in the response body.

+
POST /_synapse/admin/v1/registration_tokens/new
+
+

The request body must be a JSON object and can contain the following fields:

+
    +
  • token: The registration token. A string of no more than 64 characters that +consists only of characters matched by the regex [A-Za-z0-9._~-]. +Default: randomly generated.
  • +
  • uses_allowed: The integer number of times the token can be used to complete +a registration before it becomes invalid. +Default: null (unlimited uses).
  • +
  • expiry_time: The latest time the token is valid. Given as the number of +milliseconds since 1970-01-01 00:00:00 UTC (the start of the Unix epoch). +You could use, for example, date '+%s000' -d 'tomorrow'. +Default: null (token does not expire).
  • +
  • length: The length of the token randomly generated if token is not +specified. Must be between 1 and 64 inclusive. Default: 16.
  • +
+

If a field is omitted the default is used.
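
A client might check these constraints before sending the request. The following sketch does so; `create_token_body` and `expiry_in` are hypothetical helper names, and rejecting an empty `token` or a negative `uses_allowed` is an assumption rather than something stated above. `expiry_in(86400)` plays the same role as `date '+%s000' -d 'tomorrow'`.

```python
import re
import time
from typing import Optional

TOKEN_RE = re.compile(r"[A-Za-z0-9._~-]{1,64}")


def create_token_body(
    token: Optional[str] = None,
    uses_allowed: Optional[int] = None,
    expiry_time: Optional[int] = None,
    length: Optional[int] = None,
) -> dict:
    """Build the request body, leaving out fields that should use the default."""
    body: dict = {}
    if token is not None:
        if not TOKEN_RE.fullmatch(token):
            raise ValueError("token must be 1-64 characters from [A-Za-z0-9._~-]")
        body["token"] = token
    if uses_allowed is not None:
        if uses_allowed < 0:
            raise ValueError("uses_allowed must not be negative")
        body["uses_allowed"] = uses_allowed
    if expiry_time is not None:
        body["expiry_time"] = expiry_time
    if length is not None:
        if not 1 <= length <= 64:
            raise ValueError("length must be between 1 and 64 inclusive")
        body["length"] = length
    return body


def expiry_in(seconds: int) -> int:
    """Milliseconds since the Unix epoch for a point `seconds` from now."""
    return (int(time.time()) + seconds) * 1000


print(create_token_body(token="defg", uses_allowed=1))
```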

+

Example using defaults:

+
POST /_synapse/admin/v1/registration_tokens/new
+
+{}
+
+
200 OK
+
+{
+    "token": "0M-9jbkf2t_Tgiw1",
+    "uses_allowed": null,
+    "pending": 0,
+    "completed": 0,
+    "expiry_time": null
+}
+
+

Example specifying some fields:

+
POST /_synapse/admin/v1/registration_tokens/new
+
+{
+    "token": "defg",
+    "uses_allowed": 1
+}
+
+
200 OK
+
+{
+    "token": "defg",
+    "uses_allowed": 1,
+    "pending": 0,
+    "completed": 0,
+    "expiry_time": null
+}
+
+

Update token

+

Update the number of allowed uses or expiry time of a token. If the request is +successful, the updated token will be returned as a registration token object +in the response body.

+
PUT /_synapse/admin/v1/registration_tokens/<token>
+
+

Path parameters:

+
    +
  • token: The registration token to update.
  • +
+

The request body must be a JSON object and can contain the following fields:

+
    +
  • uses_allowed: The integer number of times the token can be used to complete +a registration before it becomes invalid. By setting uses_allowed to 0 +the token can be easily made invalid without deleting it. +If null the token will have an unlimited number of uses.
  • +
  • expiry_time: The latest time the token is valid. Given as the number of +milliseconds since 1970-01-01 00:00:00 UTC (the start of the Unix epoch). +If null the token will not expire.
  • +
+

If a field is omitted its value is not modified.
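
Because `null` and "omitted" mean different things here, a client needs a way to tell them apart. One way to sketch this in Python is a sentinel default; `_UNSET` and `update_token_body` are names invented for the example.

```python
import json

_UNSET = object()  # marks "leave this field unchanged"


def update_token_body(uses_allowed=_UNSET, expiry_time=_UNSET) -> dict:
    """Build the update body; None becomes JSON null, omitted fields are left out."""
    body = {}
    if uses_allowed is not _UNSET:
        body["uses_allowed"] = uses_allowed
    if expiry_time is not _UNSET:
        body["expiry_time"] = expiry_time
    return body


# Change only the expiry time; uses_allowed keeps its current value.
print(json.dumps(update_token_body(expiry_time=4781243146000)))
# Remove the usage limit; expiry_time keeps its current value.
print(json.dumps(update_token_body(uses_allowed=None)))
```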

+

Example:

+
PUT /_synapse/admin/v1/registration_tokens/defg
+
+{
+    "expiry_time": 4781243146000    // 2121-07-06 11:05:46 UTC
+}
+
+
200 OK
+
+{
+    "token": "defg",
+    "uses_allowed": 1,
+    "pending": 0,
+    "completed": 0,
+    "expiry_time": 4781243146000
+}
+
+

Delete token

+

Delete a registration token. If the request is successful, the response body +will be an empty JSON object.

+
DELETE /_synapse/admin/v1/registration_tokens/<token>
+
+

Path parameters:

+
    +
  • token: The registration token to delete.
  • +
+

Example:

+
DELETE /_synapse/admin/v1/registration_tokens/wxyz
+
+
200 OK
+
+{}
+
+

Errors

+

If a request fails a "standard error response" will be returned as defined in +the Matrix Client-Server API specification.

+

For example, if the token specified in a path parameter does not exist a +404 Not Found error will be returned.

+
GET /_synapse/admin/v1/registration_tokens/1234
+
+
404 Not Found
+
+{
+    "errcode": "M_NOT_FOUND",
+    "error": "No such registration token: 1234"
+}
+
+ +
+ + +
+
+ + + +
+ + + + + + + + + + + + + + + diff --git a/v1.106/usage/administration/admin_faq.html b/v1.106/usage/administration/admin_faq.html new file mode 100644 index 0000000000..c560d2efdf --- /dev/null +++ b/v1.106/usage/administration/admin_faq.html @@ -0,0 +1,416 @@ + + + + + + Admin FAQ - Synapse + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + + + + +
+
+ +
+ +
+ +

Admin FAQ

+

How do I become a server admin?

+

If your server already has an admin account you should use the +User Admin API +to promote other accounts to become admins.

+

If you don't have any admin accounts yet you won't be able to use the admin API, +so you'll have to edit the database manually. Manually editing the database is +generally not recommended, so once you have an admin account, use the admin APIs +to make further changes.

+
UPDATE users SET admin = 1 WHERE name = '@foo:bar.com';
+
+

What servers are my server talking to?

+

Run this SQL query on your database:

+
SELECT * FROM destinations;
+
+

What servers are currently participating in this room?

+

Run this SQL query on your database:

+
SELECT DISTINCT split_part(state_key, ':', 2)
+FROM current_state_events
+WHERE room_id = '!cURbafjkfsMDVwdRDQ:matrix.org' AND membership = 'join';
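
If you already have the joined user IDs (the `state_key` values) outside the database, the same extraction can be sketched in Python. The function names and the user IDs are made up for the example. One difference to be aware of: splitting only on the first colon keeps an explicit port in the server name, whereas `split_part(state_key, ':', 2)` returns just the second colon-separated field.

```python
def server_name(user_id: str) -> str:
    """Return everything after the first colon of a Matrix user ID."""
    return user_id.split(":", 1)[1]


def participating_servers(joined_user_ids: list[str]) -> list[str]:
    """Return the distinct server names of the given users, sorted."""
    return sorted({server_name(user_id) for user_id in joined_user_ids})


# Hypothetical user IDs for illustration.
joined = ["@alice:matrix.org", "@bob:matrix.org", "@carol:example.com:8448"]
print(participating_servers(joined))
```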
+
+

What users are registered on my server?

+
SELECT name FROM users;
+
+

How can I export user data?

+

Synapse includes a Python command to export data for a specific user. It takes the homeserver +configuration file and the full Matrix ID of the user to export:

+
python -m synapse.app.admin_cmd -c <config_file> export-data <user_id> --output-directory <directory_path>
+
+

If you use Poetry +to run Synapse:

+
poetry run python -m synapse.app.admin_cmd -c <config_file> export-data <user_id> --output-directory <directory_path>
+
+

The directory to store the export data in can be customised with the +--output-directory parameter; ensure that the provided directory is +empty. If this parameter is not provided, Synapse defaults to creating +a temporary directory (which starts with "synapse-exfiltrate") in /tmp, +/var/tmp, or /usr/tmp, in that order.

+

The exported data has the following layout:

+
output-directory
+├───rooms
+│   └───<room_id>
+│       ├───events
+│       ├───state
+│       ├───invite_state
+│       └───knock_state
+├───user_data
+│   ├───account_data
+│   │   ├───global
+│   │   └───<room_id>
+│   ├───connections
+│   ├───devices
+│   └───profile
+└───media_ids
+    └───<media_id>
+
+

The media_ids folder contains only the metadata of the media uploaded by the user. +It does not contain the media itself. +Furthermore, only the media_ids that Synapse manages itself are exported. +If another media repository (e.g. matrix-media-repo) +is used, the data must be exported separately.

+

With the media_ids the media files can be downloaded. +Media that have been sent in encrypted rooms are only retrieved in encrypted form. +The following script can help with downloading the media files:

+
#!/usr/bin/env bash
+
+# Parameters
+#
+#   source_directory: Directory which contains the export with the media_ids.
+#   target_directory: Directory into which all files are to be downloaded.
+#   repository_url: Address of the media repository or media worker.
+#   serverName: Name of the server (`server_name` from homeserver.yaml).
+#
+#   Example:
+#       ./download_media.sh /tmp/export_data/media_ids/ /tmp/export_data/media_files/ http://localhost:8008 matrix.example.com
+
+source_directory=$1
+target_directory=$2
+repository_url=$3
+serverName=$4
+
+mkdir -p "$target_directory"
+
+# Quote all expansions so paths containing spaces are handled correctly.
+for file in "$source_directory"/*; do
+    filename=$(basename "$file")
+    url="$repository_url/_matrix/media/v3/download/$serverName/$filename"
+    echo "Downloading $filename - $url"
+    if ! wget -o /dev/null -P "$target_directory" "$url"; then
+        echo "Could not download $filename"
+    fi
+done
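
The URL that the script builds for each media ID can also be expressed in Python, for instance to feed a different download tool. This is a sketch mirroring the script above; the function name `media_download_url` and the media ID `abcDEF123` are hypothetical.

```python
def media_download_url(repository_url: str, server_name: str, media_id: str) -> str:
    """Build the download URL for one exported media ID, as the script above does."""
    return f"{repository_url}/_matrix/media/v3/download/{server_name}/{media_id}"


url = media_download_url("http://localhost:8008", "matrix.example.com", "abcDEF123")
print(url)
```

To cover a whole export, call it once per entry name found in the `media_ids` directory.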
+
+

How do I upgrade from a very old version of Synapse to the latest?

+

See this section in the +upgrade docs.

+

Manually resetting passwords

+

Users can reset their password through their client. Alternatively, a server admin +can reset a user's password using the admin API.

+

I have a problem with my server. Can I just delete my database and start again?

+

Deleting your database is unlikely to make anything better.

+

It's easy to make the mistake of thinking that you can start again from a clean +slate by dropping your database, but things don't work like that in a federated +network: lots of other servers have information about your server.

+

For example: other servers might think that you are in a room, your server will +think that you are not, and you'll probably be unable to interact with that room +in a sensible way ever again.

+

In general, there are better solutions to any problem than dropping the database. +Come and seek help in https://matrix.to/#/#synapse:matrix.org.

+

There are two exceptions when it might be sensible to delete your database and start again:

+
    +
  • You have never joined any rooms which are federated with other servers. For +instance, a local deployment which the outside world can't talk to.
  • +
  • You are changing the server_name in the homeserver configuration. In effect +this makes your server a completely new one from the point of view of the network, +so in this case it makes sense to start with a clean database. +(In both cases you probably also want to clear out the media_store.)
  • +
+

I've stuffed up access to my room, how can I delete it to free up the alias?

+

Using the following curl command:

+
curl -H 'Authorization: Bearer <access-token>' -X DELETE https://matrix.org/_matrix/client/r0/directory/room/<room-alias>
+
+

<access-token> - can be obtained in Riot by looking in the Riot settings; at the bottom is: +Access Token:<click to reveal>

+

<room-alias> - the room alias, e.g. #my_room:matrix.org. This may also need to be URL-encoded, for example %23my_room%3Amatrix.org
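
If you are unsure how to URL-encode the alias by hand, Python's standard library can do it. Passing `safe=""` makes `quote` encode every reserved character, including `#` and `:`.

```python
from urllib.parse import quote

# Percent-encode the room alias for use in the request path.
encoded_alias = quote("#my_room:matrix.org", safe="")
print(encoded_alias)
```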

+

How can I find the lines corresponding to a given HTTP request in my homeserver log?

+

Synapse tags each log line according to the HTTP request it is processing. When +it finishes processing each request, it logs a line containing the words +Processed request: . For example:

+
2019-02-14 22:35:08,196 - synapse.access.http.8008 - 302 - INFO - GET-37 - ::1 - 8008 - {@richvdh:localhost} Processed request: 0.173sec/0.001sec (0.002sec, 0.000sec) (0.027sec/0.026sec/2) 687B 200 "GET /_matrix/client/r0/sync HTTP/1.1" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36" [0 dbevts]"
+
+

Here we can see that the request has been tagged with GET-37. (The tag depends +on the method of the HTTP request, so might start with GET-, PUT-, POST-, +OPTIONS- or DELETE-.) So to find all lines corresponding to this request, we can do:

+
grep 'GET-37' homeserver.log
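
A plain `grep 'GET-37'` would also match longer tags such as `GET-371`. The sketch below filters on the tag surrounded by the ` - ` separators seen in the example line above; it assumes your log format matches that example. The function name and the shortened sample lines are made up for illustration.

```python
def lines_for_request(log_lines: list[str], tag: str) -> list[str]:
    """Return only the lines whose request tag is exactly `tag`."""
    needle = f" - {tag} - "
    return [line for line in log_lines if needle in line]


# Shortened, made-up log lines in the same layout as the example above.
sample = [
    "2019-02-14 22:35:08,196 - synapse.access.http.8008 - 302 - INFO - GET-37 - ::1 - 8008 - Processed request",
    "2019-02-14 22:35:09,001 - synapse.access.http.8008 - 302 - INFO - GET-371 - ::1 - 8008 - Processed request",
    "2019-02-14 22:35:09,500 - synapse.storage - 100 - INFO - PUT-12 - something else",
]
matches = lines_for_request(sample, "GET-37")
print(len(matches))
```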
+
+

If you want to paste that output into a GitHub issue or Matrix room, please +remember to surround it with triple-backticks (```) to make it legible +(see quoting code).

+

What do all those fields in the 'Processed' line mean?

+

See Request log format.

+

What are the biggest rooms on my server?

+
SELECT s.canonical_alias, g.room_id, count(*) AS num_rows
+FROM
+  state_groups_state AS g,
+  room_stats_state AS s
+WHERE g.room_id = s.room_id
+GROUP BY s.canonical_alias, g.room_id
+ORDER BY num_rows desc
+LIMIT 10;
+
+

You can also use the List Room API +and order_by state_events.

+

People can't accept room invitations from me

+

The typical failure mode here is that you send an invitation to someone +to join a room or direct chat, but when they go to accept it, they get an +error (typically along the lines of "Invalid signature"). They might see +something like the following in their logs:

+
2019-09-11 19:32:04,271 - synapse.federation.transport.server - 288 - WARNING - GET-11752 - authenticate_request failed: 401: Invalid signature for server <server> with key ed25519:a_EqML: Unable to verify signature for <server>
+
+

This is normally caused by a misconfiguration in your reverse-proxy. See the reverse proxy docs and double-check that your settings are correct.

+

Help!! Synapse is slow and eats all my RAM/CPU!

+

First, ensure you are running the latest version of Synapse, using Python 3 +with a PostgreSQL database.

+

Synapse's architecture is quite RAM hungry currently - we deliberately +cache a lot of recent room data and metadata in RAM in order to speed up +common requests. We'll improve this in the future, but for now the easiest +way to reduce the RAM usage (at the risk of slowing things down) +is to set the almost-undocumented SYNAPSE_CACHE_FACTOR environment +variable. The default is 0.5, which can be decreased to reduce RAM usage +in memory constrained environments, or increased if performance starts to +degrade.

+

However, degraded performance due to a low cache factor, common on machines with slow disks, often leads to explosions in memory use due to backlogged requests. In this case, reducing the cache factor will make things worse. Instead, try increasing it drastically. 2.0 is a good starting value.

+

Using libjemalloc can also yield a significant +improvement in overall memory use, and especially in terms of giving back +RAM to the OS. To use it, the library must simply be put in the +LD_PRELOAD environment variable when launching Synapse. On Debian, this +can be done by installing the libjemalloc1 package and adding this +line to /etc/default/matrix-synapse:

+
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1
+
+

This made a significant difference on Python 2.7 - it's unclear how +much of an improvement it provides on Python 3.x.

+

If you're encountering high CPU use by the Synapse process itself, you +may be affected by a bug with presence tracking that leads to a +massive excess of outgoing federation requests (see discussion). If metrics +indicate that your server is also issuing far more outgoing federation +requests than can be accounted for by your users' activity, this is a +likely cause. The misbehavior can be worked around by disabling presence +in the Synapse config file: see here.

+

Running out of File Handles

+

If Synapse runs out of file handles, it typically fails badly - live-locking +at 100% CPU, and/or failing to accept new TCP connections (blocking the +connecting client). Matrix currently can legitimately use a lot of file handles, +thanks to busy rooms like #matrix:matrix.org containing hundreds of participating +servers. The first time a server talks in a room it will try to connect +simultaneously to all participating servers, which could exhaust the available +file descriptors between DNS queries & HTTPS sockets, especially if DNS is slow +to respond. (We need to improve the routing algorithm used to be better than +full mesh, but as of March 2019 this hasn't happened yet).

+

If you hit this failure mode, we recommend increasing the maximum number of open file handles to at least 4096 (assuming a default of 1024 or 256). This is typically done by editing /etc/security/limits.conf.
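
To see which limit is actually in effect, a minimal sketch using Python's standard `resource` module (Unix only) is shown below. It reports the limit of the process that runs it, so it only reflects Synapse's limit if started in the same environment as Synapse; the helper names are illustrative.

```python
import resource


def open_file_limits() -> tuple[int, int]:
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def soft_limit_is_sufficient(minimum: int = 4096) -> bool:
    """Check the soft limit against the recommended minimum of 4096."""
    soft, _hard = open_file_limits()
    # RLIM_INFINITY means that no limit is enforced at all.
    return soft == resource.RLIM_INFINITY or soft >= minimum


soft, hard = open_file_limits()
print(f"soft={soft} hard={hard} sufficient={soft_limit_is_sufficient()}")
```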

+

Separately, Synapse may leak file handles if inbound HTTP requests get stuck during processing, e.g. blocked behind a lock or talking to a remote server. This is best diagnosed by matching up the 'Received request' and 'Processed request' log lines and looking for any 'Processed request' lines which take more than a few seconds to execute. If you see this failure mode, please let us know at #synapse:matrix.org so we can help debug it.
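
Matching up the two kinds of log line can be sketched as follows. This assumes that each relevant line carries a request identifier such as GET-37 and contains either 'Received request' or 'Processed request'; the sample lines are abbreviated and invented for illustration, and `find_unfinished_requests` is a hypothetical helper.

```python
import re

# Request identifiers look like GET-37 or PUT-1000 (method, dash, counter).
REQUEST_ID_RE = re.compile(r"\b([A-Z]+-\d+)\b")


def find_unfinished_requests(log_lines):
    """Return IDs of requests that were received but never processed."""
    pending = {}
    for line in log_lines:
        match = REQUEST_ID_RE.search(line)
        if match is None:
            continue
        request_id = match.group(1)
        if "Received request" in line:
            pending[request_id] = line
        elif "Processed request" in line:
            pending.pop(request_id, None)
    return sorted(pending)


# Invented, abbreviated sample lines (assumption, not real Synapse output).
sample = [
    "2019-09-11 19:32:04,271 - synapse.access.http.8008 - INFO - GET-37 - Received request: GET /a",
    "2019-09-11 19:32:04,300 - synapse.access.http.8008 - INFO - PUT-38 - Received request: PUT /b",
    "2019-09-11 19:32:04,400 - synapse.access.http.8008 - INFO - GET-37 - Processed request: 0.129sec",
]
print(find_unfinished_requests(sample))
```

Any identifier left over after reading a whole log file is a candidate for a stuck request.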

diff --git a/v1.106/usage/administration/database_maintenance_tools.html b/v1.106/usage/administration/database_maintenance_tools.html
new file mode 100644
index 0000000000..dd9eb94d71
--- /dev/null
+++ b/v1.106/usage/administration/database_maintenance_tools.html
@@ -0,0 +1,216 @@
+Database Maintenance Tools - Synapse

This blog post by Jackson Chen (Dec 2022) explains how to use many of the tools listed on this page. There is also an earlier blog by Victor Berger (June 2020), though this may be outdated in places.

+

A list of useful tools and scripts for maintaining a Synapse database:

+

Purge Remote Media API

+

The purge remote media API allows server admins to purge old cached remote media.

+

Purge Local Media API

+

This API deletes the local media from the disk of your own server.

+

Purge History API

+

The purge history API allows server admins to purge historic events from their database, reclaiming disk space.

+

synapse-compress-state

+

A tool for compressing (deduplicating) the state_groups_state table.

+

SQL for analyzing Synapse PostgreSQL database stats

+

Some easy SQL that reports useful stats about your Synapse database.

diff --git a/v1.106/usage/administration/index.html b/v1.106/usage/administration/index.html
new file mode 100644
index 0000000000..1420506134
--- /dev/null
+++ b/v1.106/usage/administration/index.html
@@ -0,0 +1,211 @@
+Administration - Synapse

Administration

+

This section contains information on managing your Synapse homeserver. This includes:

+
    +
  • Managing users, rooms and media via the Admin API.
  • +
  • Setting up metrics and monitoring to give you insight into your homeserver's health.
  • +
  • Configuring structured logging.
  • +
diff --git a/v1.106/usage/administration/monitoring/reporting_homeserver_usage_statistics.html b/v1.106/usage/administration/monitoring/reporting_homeserver_usage_statistics.html
new file mode 100644
index 0000000000..462af504c7
--- /dev/null
+++ b/v1.106/usage/administration/monitoring/reporting_homeserver_usage_statistics.html
@@ -0,0 +1,274 @@
+Reporting Homeserver Usage Statistics - Synapse

Reporting Homeserver Usage Statistics

+

When generating your Synapse configuration file, you are asked whether you +would like to report usage statistics to Matrix.org. These statistics +provide the foundation a glimpse into the number of Synapse homeservers +participating in the network, as well as statistics such as the number of +rooms being created and messages being sent. This feature is sometimes +affectionately called "phone home" stats. Reporting +is optional +and the reporting endpoint +can be configured, +in case you would like to instead report statistics from a set of homeservers +to your own infrastructure.

+

This documentation aims to define the statistics available and the homeserver configuration options that exist to tweak them.

+

Available Statistics

+

The following statistics are sent to the configured reporting endpoint:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Statistic Name | Type | Description
homeserver | string | The homeserver's server name.
memory_rss | int | The memory usage of the process (in kilobytes on Unix-based systems, bytes on macOS).
cpu_average | int | CPU time in % of a single core (not % of all cores).
server_context | string | An arbitrary string used to group statistics from a set of homeservers.
timestamp | int | The current time, represented as the number of seconds since the epoch.
uptime_seconds | int | The number of seconds since the homeserver was last started.
python_version | string | The Python version number in use (e.g. "3.7.1"). Taken from sys.version_info.
total_users | int | The number of registered users on the homeserver.
total_nonbridged_users | int | The number of users, excluding those created by an Application Service.
daily_user_type_native | int | The number of native users created in the last 24 hours.
daily_user_type_guest | int | The number of guest users created in the last 24 hours.
daily_user_type_bridged | int | The number of users created by Application Services in the last 24 hours.
total_room_count | int | The total number of rooms present on the homeserver.
daily_active_users | int | The number of unique users [1] that have used the homeserver in the last 24 hours.
monthly_active_users | int | The number of unique users [1] that have used the homeserver in the last 30 days.
daily_active_rooms | int | The number of rooms that have had a (state) event with the type m.room.message sent in them in the last 24 hours.
daily_active_e2ee_rooms | int | The number of rooms that have had a (state) event with the type m.room.encrypted sent in them in the last 24 hours.
daily_messages | int | The number of (state) events with the type m.room.message seen in the last 24 hours.
daily_e2ee_messages | int | The number of (state) events with the type m.room.encrypted seen in the last 24 hours.
daily_sent_messages | int | The number of (state) events sent by a local user with the type m.room.message seen in the last 24 hours.
daily_sent_e2ee_messages | int | The number of (state) events sent by a local user with the type m.room.encrypted seen in the last 24 hours.
r30v2_users_all | int | The number of 30 day retained users, with a revised algorithm. Defined as users that appear more than once in the past 60 days, and have more than 30 days between the most and least recent appearances in the past 60 days. Includes clients that do not fit into the below r30 client types.
r30v2_users_android | int | The number of 30 day retained users, as defined above. Filtered only to clients with ("riot" or "element") and "android" (case-insensitive) in the user agent string.
r30v2_users_ios | int | The number of 30 day retained users, as defined above. Filtered only to clients with ("riot" or "element") and "ios" (case-insensitive) in the user agent string.
r30v2_users_electron | int | The number of 30 day retained users, as defined above. Filtered only to clients with ("riot" or "element") and "electron" (case-insensitive) in the user agent string.
r30v2_users_web | int | The number of 30 day retained users, as defined above. Filtered only to clients with "mozilla" or "gecko" (case-insensitive) in the user agent string.
cache_factor | int | The configured global factor value for caching.
event_cache_size | int | The configured event_cache_size value for caching.
database_engine | string | The database engine that is in use. Either "psycopg2" meaning PostgreSQL is in use, or "sqlite3" for SQLite3.
database_server_version | string | The version of the database server. Examples being "10.10" for PostgreSQL server version 10.10, and "3.38.5" for SQLite 3.38.5 installed on the system.
log_level | string | The log level in use. Examples are "INFO", "WARNING", "ERROR", "DEBUG", etc.
+
1 +

Native Matrix users and guests are always counted. If the track_puppeted_user_ips option is set to true, "puppeted" users (users that an Application Service has performed an action on behalf of) will also be counted. Note that an Application Service can "puppet" any user in its user namespace, not only users that the Application Service has created. If this happens, the Application Service will additionally be counted as a user (irrespective of track_puppeted_user_ips).
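
The r30v2 retention rule from the statistics table (more than one appearance in the past 60 days, with more than 30 days between the earliest and latest of them) can be sketched as below. This is an illustration of the stated definition only, not Synapse's actual query; `is_r30v2_retained` and its inputs are hypothetical.

```python
from datetime import datetime, timedelta


def is_r30v2_retained(appearances, now):
    """Apply the r30v2 definition to one user's appearance timestamps."""
    window_start = now - timedelta(days=60)
    # Only appearances within the past 60 days are considered.
    recent = [ts for ts in appearances if window_start <= ts <= now]
    if len(recent) < 2:
        return False
    # The earliest and latest appearance must be more than 30 days apart.
    return max(recent) - min(recent) > timedelta(days=30)


now = datetime(2024, 4, 24)
print(is_r30v2_retained([now - timedelta(days=45), now - timedelta(days=5)], now))
```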

+
+

Using a Custom Statistics Collection Server

+

If statistics reporting is enabled, the endpoint that Synapse sends metrics to is configured by the +report_stats_endpoint config +option. By default, statistics are sent to Matrix.org.

+

If you would like to set up your own statistics collection server and send metrics there, you may +consider using one of the following known implementations:

diff --git a/v1.106/usage/administration/monthly_active_users.html b/v1.106/usage/administration/monthly_active_users.html
new file mode 100644
index 0000000000..54f65a232a
--- /dev/null
+++ b/v1.106/usage/administration/monthly_active_users.html
@@ -0,0 +1,268 @@
+Monthly Active Users - Synapse

Monthly Active Users

+

Synapse can be configured to record the number of monthly active users (also referred to as MAU) on a given homeserver. +For clarity's sake, MAU only tracks local users.

+

Please note that the metrics recorded by the Homeserver Usage Stats +are calculated differently. The monthly_active_users from the usage stats does not take into account any +of the rules below, and counts any users who have made a request to the homeserver in the last 30 days.

+

See the configuration manual for details on how to configure MAU.

+

Calculating active users

+

Individual user activity is measured in active days. If a user performs an action, the exact time of that action is then recorded. When +calculating the MAU figure, any users with a recorded action in the last 30 days are considered part of the cohort. Days are measured +as a rolling window from the current system time to 30 days ago.

+

So for example, if Synapse were to calculate the active users on the 15th July at 13:25, it would include any activity from 15th June 13:25 onwards.
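
The rolling window in this example can be checked with a few lines of Python. The names `MAU_WINDOW` and `in_mau_window` are illustrative, and treating the window start as inclusive is an assumption based on the word "onwards".

```python
from datetime import datetime, timedelta

MAU_WINDOW = timedelta(days=30)


def in_mau_window(last_active, now):
    """True if the activity timestamp falls inside the rolling 30-day window."""
    return now - MAU_WINDOW <= last_active <= now


now = datetime(2024, 7, 15, 13, 25)
print(now - MAU_WINDOW)  # start of the window for a calculation on 15th July at 13:25
```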

+

A user is never considered active if they are either:

+
    +
  • Part of the trial day cohort (described below)
  • +
  • Owned by an application service. +
      +
    • Note: This only covers users that are part of an application service namespaces.users registration. The namespace +must also be marked as exclusive.
    • +
    +
  • +
+

Otherwise, any request to Synapse will mark the user as active. Please note that registration will not mark a user as active unless +they register with a 3pid that is included in the config field mau_limits_reserved_threepids.

+

The Prometheus metric for MAU is refreshed every 5 minutes.

+

Once an hour, Synapse checks to see if any users are inactive (with only activity timestamps older than 30 days). These users are removed from the active users cohort. If they then become active, they are immediately restored to the cohort.

+

It is important to note that deactivated users are not immediately removed from the pool of active users, but as these users won't +perform actions they will eventually be removed from the cohort.

+

Trial days

+

If the config option mau_trial_days is set, a user must have been active this many days after registration to be counted as active. A user is in the trial period if their registration timestamp (also known as the creation_ts) is less than mau_trial_days old.

+

As an example, if mau_trial_days is set to 3 and a user is active after 3 days (72 hours from registration time) then they will be counted as active.
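
A minimal sketch of the trial-period rule, assuming timestamps in seconds and a strict "less than" comparison as worded above (the exact boundary behaviour in Synapse is not confirmed here); `in_trial_period` is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def in_trial_period(creation_ts: int, now_ts: int, mau_trial_days: int) -> bool:
    """True while the account is less than mau_trial_days old."""
    age_seconds = now_ts - creation_ts
    return age_seconds < mau_trial_days * SECONDS_PER_DAY


# With mau_trial_days = 3, the trial ends 72 hours after registration.
print(in_trial_period(0, 72 * 3600 - 1, 3), in_trial_period(0, 72 * 3600, 3))
```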

+

The mau_appservice_trial_days config further extends this rule by applying different durations depending on the appservice_id of the user. +Users registered by an application service will be recorded with an appservice_id matching the id key in the registration file for that service.

+

Limiting usage of the homeserver when the maximum MAU is reached

+

If both config options limit_usage_by_mau and max_mau_value are set, and the current MAU value exceeds the maximum value, the homeserver will begin to block some actions.

+

Individual users matching any of the below criteria never have their actions blocked:

+
    +
  • Considered part of the cohort of MAU users.
  • +
  • Considered part of the trial period.
  • +
  • Registered as a support user.
  • +
  • Application service users if track_appservice_user_ips is NOT set.
  • +
+

Please note that server admins are not exempt from blocking.

+

The following actions are blocked when the MAU limit is exceeded:

+
    +
  • Logging in
  • +
  • Sending events
  • +
  • Creating rooms
  • +
  • Syncing
  • +
+

Registration is also blocked for all new signups unless the user is registering with a threepid included in the mau_limits_reserved_threepids +config value.

+

When a request is blocked, the response will have the errcode M_RESOURCE_LIMIT_EXCEEDED.
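
A client or script can recognise this case by inspecting the error body. The sketch below assumes the usual Matrix JSON error shape with an `errcode` field; the sample body and the helper name `is_mau_limit_error` are illustrative.

```python
import json


def is_mau_limit_error(response_body: str) -> bool:
    """True if the response body carries the M_RESOURCE_LIMIT_EXCEEDED errcode."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("errcode") == "M_RESOURCE_LIMIT_EXCEEDED"


# Invented sample body for illustration.
body = '{"errcode": "M_RESOURCE_LIMIT_EXCEEDED", "error": "limit reached"}'
print(is_mau_limit_error(body))
```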

+

Metrics

+

Synapse records several different Prometheus metrics for MAU.

+

synapse_admin_mau_current records the current MAU figure for native (non-application-service) users.

+

synapse_admin_mau_max records the maximum MAU as dictated by the max_mau_value config value.

+

synapse_admin_mau_current_mau_by_service records the current MAU including application service users. The label app_service can be used to filter by a specific service ID. This also includes non-application-service users under app_service=native.

+

synapse_admin_mau_registered_reserved_users records the number of users specified in mau_limits_reserved_threepids which have +registered accounts on the homeserver.
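
As a sketch of how these gauges might be read, the snippet below pulls two values out of Prometheus text exposition output and computes the remaining headroom. The sample text is invented, and it assumes the metrics appear without labels; real output from a homeserver may carry labels and would need a proper parser.

```python
def read_metric(exposition_text: str, name: str):
    """Return the value of an unlabelled metric, or None if it is absent."""
    for line in exposition_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


# Invented sample of the text exposition format.
sample_metrics = """\
# TYPE synapse_admin_mau_current gauge
synapse_admin_mau_current 42.0
# TYPE synapse_admin_mau_max gauge
synapse_admin_mau_max 50.0
"""

current = read_metric(sample_metrics, "synapse_admin_mau_current")
maximum = read_metric(sample_metrics, "synapse_admin_mau_max")
print(f"headroom: {maximum - current}")
```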

diff --git a/v1.106/usage/administration/request_log.html b/v1.106/usage/administration/request_log.html
new file mode 100644
index 0000000000..ca966d58ae
--- /dev/null
+++ b/v1.106/usage/administration/request_log.html
@@ -0,0 +1,239 @@
+Request log format - Synapse

Request log format

+

HTTP request logs are written by Synapse (see synapse/http/site.py for details).

+

See the following for how to decode the dense data available from the default logging configuration.

+
2020-10-01 12:00:00,000 - synapse.access.http.8008 - 311 - INFO - PUT-1000- 192.168.0.1 - 8008 - {another-matrix-server.com} Processed request: 0.100sec/-0.000sec (0.000sec, 0.000sec) (0.001sec/0.090sec/3) 11B !200 "PUT /_matrix/federation/v1/send/1600000000000 HTTP/1.1" "Synapse/1.20.1" [0 dbevts]
+-AAAAAAAAAAAAAAAAAAAAA-   -BBBBBBBBBBBBBBBBBBBBBB-   -C-   -DD-   -EEEEEE-  -FFFFFFFFF-   -GG-    -HHHHHHHHHHHHHHHHHHHHHHH-                     -IIIIII- -JJJJJJJ-  -KKKKKK-, -LLLLLL-  -MMMMMMM- -NNNNNN- O  -P- -QQ-  -RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR-   -SSSSSSSSSSSS-   -TTTTTT-
+
+ + + + + + + + + + + + + + + + + + + + + +
Part | Explanation
AAAA | Timestamp request was logged (not received)
BBBB | Logger name (synapse.access.(http|https).<tag>, where 'tag' is defined in the listeners config section, normally the port)
CCCC | Line number in code
DDDD | Log Level
EEEE | Request Identifier (This identifier is shared by related log lines)
FFFF | Source IP (Or X-Forwarded-For if enabled)
GGGG | Server Port
HHHH | Federated Server or Local User making request (blank if unauthenticated or not supplied).
If this is of the form `@aaa:example.com
IIII | Total Time to process the request
JJJJ | Time to send response over network once generated (this may be negative if the socket is closed before the response is generated)
KKKK | Userland CPU time
LLLL | System CPU time
MMMM | Total time waiting for a free DB connection from the pool across all parallel DB work from this request
NNNN | Total time waiting for response to DB queries across all parallel DB work from this request
OOOO | Count of DB transactions performed
PPPP | Response body size
QQQQ | Response status code. Suffixed with ! if the socket was closed before the response was generated. A 499! status code indicates that Synapse also cancelled request processing after the socket was closed.
RRRR | Request
SSSS | User-agent
TTTT | Events fetched from DB to service this request (note that this does not include events fetched from the cache)
+

MMMM / NNNN can be greater than IIII if there are multiple slow database queries +running in parallel.

+

Some actions can result in multiple identical HTTP requests, which will return the same data, but only the first request will report time/transactions in KKKK/LLLL/MMMM/NNNN/OOOO. The others will be awaiting the first query to return a response and will simultaneously return with the first request, but with very small processing times.
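
The fields described above can be pulled out of a 'Processed request' line with a regular expression. The pattern below is a sketch written against the sample line on this page rather than taken from Synapse itself, and the group names are made up for illustration. It accepts a "!" on either side of the status code, because the sample shows it before the code while the table describes it as a suffix.

```python
import re

NUM = r"-?[\d.]+"
PROCESSED_RE = re.compile(
    r"Processed request: "
    rf"(?P<total>{NUM})sec/(?P<send>{NUM})sec "              # IIII / JJJJ
    rf"\((?P<ru_utime>{NUM})sec, (?P<ru_stime>{NUM})sec\) "  # KKKK, LLLL
    rf"\((?P<db_sched>{NUM})sec/(?P<db_txn>{NUM})sec/(?P<db_count>\d+)\) "  # MMMM/NNNN/O
    r"(?P<size>\d+)B (?P<status>!?\d{3}!?) "                 # PPPP, QQQQ
    r'"(?P<request>[^"]*)" "(?P<user_agent>[^"]*)" '         # RRRR, SSSS
    r"\[(?P<dbevts>\d+) dbevts\]"                            # TTTT
)


def parse_processed_line(line):
    """Return the timing fields of a 'Processed request' line, or None."""
    match = PROCESSED_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("total", "send", "ru_utime", "ru_stime", "db_sched", "db_txn"):
        fields[key] = float(fields[key])
    for key in ("db_count", "size", "dbevts"):
        fields[key] = int(fields[key])
    return fields


SAMPLE = (
    '2020-10-01 12:00:00,000 - synapse.access.http.8008 - 311 - INFO - PUT-1000- '
    '192.168.0.1 - 8008 - {another-matrix-server.com} Processed request: '
    '0.100sec/-0.000sec (0.000sec, 0.000sec) (0.001sec/0.090sec/3) 11B !200 '
    '"PUT /_matrix/federation/v1/send/1600000000000 HTTP/1.1" "Synapse/1.20.1" [0 dbevts]'
)
fields = parse_processed_line(SAMPLE)
print(fields["total"], fields["db_txn"], fields["db_count"])
```

Filtering parsed lines on `total` is one way to find requests that took more than a few seconds.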

diff --git a/v1.106/usage/administration/state_groups.html b/v1.106/usage/administration/state_groups.html
new file mode 100644
index 0000000000..5f364c7ab2
--- /dev/null
+++ b/v1.106/usage/administration/state_groups.html
@@ -0,0 +1,217 @@
+State Groups - Synapse

How do State Groups work?

+

As a general rule, I encourage people who want to understand the deepest darkest secrets of the database schema to drop by #synapse-dev:matrix.org and ask questions.

+

However, one question that comes up frequently is that of how "state groups" work, and why the state_groups_state table gets so big, so here's an attempt to answer that question.

+

We need to be able to relatively quickly calculate the state of a room at any point in that room's history. In other words, we need to know the state of the room at each event in that room. This is done as follows:

+

A sequence of events where the state is the same is grouped together into a state_group; the mapping is recorded in event_to_state_groups. (Technically speaking, since a state event usually changes the state in the room, we are recording the state of the room after the given event id: which is to say, to a handwavey simplification, the first event in a state group is normally a state event, and others in the same state group are normally non-state-events.)

+

state_groups records, for each state group, the id of the room that we're looking at, and also the id of the first event in that group. (I'm not sure if that event id is used much in practice.)

+

Now, if we stored all the room state for each state_group, that would be a huge amount of data. Instead, for each state group, we normally store the difference between the state in that group and some other state group, and only occasionally (every 100 state changes or so) record the full state.

+

So, most state groups have an entry in state_group_edges (don't ask me why it's not a column in state_groups) which records the previous state group in the room, and state_groups_state records the differences in state since that previous state group.

+

A full state group just records the event id for each piece of state in the room at that point.
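
The delta scheme described above can be illustrated with a toy in-memory model. This is not Synapse's code: the dictionaries `edges` and `deltas` merely stand in for state_group_edges and state_groups_state, and the event ids are invented.

```python
def resolve_state(group_id, edges, deltas):
    """Rebuild the full state of a group by replaying deltas from the last full group."""
    # Walk back through previous state groups until a full group (no edge) is reached.
    chain = []
    current = group_id
    while current is not None:
        chain.append(current)
        current = edges.get(current)
    # Apply the oldest entries first so that newer deltas overwrite older state.
    state = {}
    for gid in reversed(chain):
        state.update(deltas.get(gid, {}))
    return state


# state group -> previous state group (a full group has no entry)
edges = {2: 1, 3: 2}
# state group -> {(event type, state key): event id}
deltas = {
    1: {("m.room.create", ""): "$create", ("m.room.member", "@a:example.com"): "$join_a"},
    2: {("m.room.name", ""): "$name1"},
    3: {("m.room.name", ""): "$name2"},
}
print(resolve_state(3, edges, deltas))
```

Group 1 is a "full" group here, while groups 2 and 3 store only what changed, which is why long delta chains stay small on disk but cost more to resolve.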

+

Known bugs with state groups

+

There are various reasons that we can end up creating many more state groups than we need: see https://github.com/matrix-org/synapse/issues/3364 for more details.

+

Compression tool

+

There is a tool at https://github.com/matrix-org/rust-synapse-compress-state which can compress the state_groups_state on a room-by-room basis (essentially, it reduces the number of "full" state groups). This can result in dramatic reductions of the storage used.

diff --git a/v1.106/usage/administration/understanding_synapse_through_grafana_graphs.html b/v1.106/usage/administration/understanding_synapse_through_grafana_graphs.html
new file mode 100644
index 0000000000..11c85d56b1
--- /dev/null
+++ b/v1.106/usage/administration/understanding_synapse_through_grafana_graphs.html
@@ -0,0 +1,254 @@
+Understanding Synapse Through Grafana Graphs - Synapse

Understanding Synapse through Grafana graphs

+

It is possible to monitor much of the internal state of Synapse using Prometheus +metrics and Grafana. +A guide for configuring Synapse to provide metrics is available here +and information on setting up Grafana is here. +In this setup, Prometheus will periodically scrape the information Synapse provides and +store a record of it over time. Grafana is then used as an interface to query and +present this information through a series of pretty graphs.

+

Once you have Grafana set up, and assuming you're using our Grafana dashboard template, look for the following graphs when debugging a slow/overloaded Synapse:

+

Message Event Send Time

+

image

+

This, along with the CPU and Memory graphs, is a good way to check the general health of your Synapse instance. It represents how long it takes for a user on your homeserver to send a message.

+

Transaction Count and Transaction Duration

+

image

+

image

+

These graphs show the database transactions that are occurring the most frequently, as well as those that are taking the most time to execute.

+

image

+

In the first graph, we can see obvious spikes corresponding to lots of get_user_by_id transactions. This would be useful information to figure out which part of the Synapse codebase is potentially creating a heavy load on the system. However, be sure to cross-reference this with Transaction Duration, which shows that get_user_by_id is actually a very quick database transaction and isn't causing as much load as others, like persist_events:

+

image

+

Still, it's probably worth investigating why we're getting users from the database that often, and whether it's possible to reduce the number of queries we make by adjusting our cache factor(s).

+

The persist_events transaction is responsible for saving new room events to the Synapse database, so can often show a high transaction duration.

+

Federation

+

The charts in the "Federation" section show information about incoming and outgoing federation requests. Federation data can be divided into two basic types:

+
    +
  • PDU (Persistent Data Unit) - room events: messages, state events (join/leave), etc. These are permanently stored in the database.
  • +
  • EDU (Ephemeral Data Unit) - other data, which need not be stored permanently, such as read receipts, typing notifications.
  • +
+

The "Outgoing EDUs by type" chart shows the EDUs within outgoing federation requests by type: m.device_list_update, m.direct_to_device, m.presence, m.receipt, m.typing.

+

If you see a large number of m.presence EDUs and are having trouble with too much CPU load, you can disable presence in the Synapse config. See also #3971.

+

Caches

+

image

+

image

+

This is quite a useful graph. It shows how many times Synapse attempts to retrieve a piece of data from a cache which the cache did not contain, thus resulting in a call to the database. We can see here that the _get_joined_profile_from_event_id cache is being requested a lot, and often the data we're after is not cached.

+

Cross-referencing this with the Eviction Rate graph, which shows that entries are being evicted from _get_joined_profile_from_event_id quite often:

+

image

+

we should probably consider raising the size of that cache by raising its cache factor (a multiplier value for the size of an individual cache). Information on doing so is available here (note that the configuration of individual cache factors through the configuration file is available in Synapse v1.14.0+, whereas doing so through environment variables has been supported for a very long time). Note that this will increase Synapse's overall memory usage.

+

Forward Extremities

+

image

+

Forward extremities are the leaf events at the end of a DAG in a room, aka events that have no children. The more that exist in a room, the more state resolution that Synapse needs to perform (hint: it's an expensive operation). While Synapse has code to prevent too many of these existing at one time in a room, bugs can sometimes make them crop up again.
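
The definition above (events that no other event points back to) can be shown on a toy DAG. The event ids and the `prev_events` mapping are invented for illustration; this is a sketch of the concept, not how Synapse tracks extremities internally.

```python
def forward_extremities(prev_events):
    """Return the events that no other event lists as a predecessor."""
    referenced = {prev for prevs in prev_events.values() for prev in prevs}
    return sorted(event for event in prev_events if event not in referenced)


# event id -> ids of the events it points back to
room_dag = {
    "$A": [],
    "$B": ["$A"],
    "$C": ["$A"],  # a fork: $B and $C both follow $A
    "$D": ["$B"],
}
print(forward_extremities(room_dag))
```

A new event that lists both remaining leaves as its predecessors would merge the fork and leave a single forward extremity.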

+

If a room has >10 forward extremities, it's worth checking which room is the culprit and potentially removing them using the SQL queries mentioned in #1760.

+

Garbage Collection

+

image

+

Large spikes in garbage collection times (bigger than shown here, I'm talking in the multiple-seconds range) can cause lots of problems for Synapse performance. They are more an indicator and a symptom of other problems though, so check other graphs for what might be causing them.

+

Final Thoughts

+

If you're still having performance problems with your Synapse instance and you've tried everything you can, it may just be a lack of system resources. Consider adding more CPU and RAM, and use worker mode to take advantage of multiple CPU cores / multiple machines for your homeserver.

diff --git a/v1.106/usage/administration/useful_sql_for_admins.html b/v1.106/usage/administration/useful_sql_for_admins.html
new file mode 100644
index 0000000000..b501d28f23
--- /dev/null
+++ b/v1.106/usage/administration/useful_sql_for_admins.html
@@ -0,0 +1,380 @@
+Useful SQL for Admins - Synapse

Some useful SQL queries for Synapse Admins

+

Size of full matrix db

+
SELECT pg_size_pretty( pg_database_size( 'matrix' ) );
+
+

Result example:

+
pg_size_pretty 
+----------------
+ 6420 MB
+(1 row)
+
+

Show top 20 largest tables by row count

+
SELECT relname, n_live_tup AS "rows"
+  FROM pg_stat_user_tables
+  ORDER BY n_live_tup DESC
+  LIMIT 20;
+
+

This query is quick, but may be very approximate. For an exact number of rows, use:

+
SELECT COUNT(*) FROM <table_name>;
+
+

Result example:

+
state_groups_state - 161687170
+event_auth - 8584785
+event_edges - 6995633
+event_json - 6585916
+event_reference_hashes - 6580990
+events - 6578879
+received_transactions - 5713989
+event_to_state_groups - 4873377
+stream_ordering_to_exterm - 4136285
+current_state_delta_stream - 3770972
+event_search - 3670521
+state_events - 2845082
+room_memberships - 2785854
+cache_invalidation_stream - 2448218
+state_groups - 1255467
+state_group_edges - 1229849
+current_state_events - 1222905
+users_in_public_rooms - 364059
+device_lists_stream - 326903
+user_directory_search - 316433
+
+

Show top 20 largest tables by storage size

+
SELECT nspname || '.' || relname AS "relation",
+    pg_size_pretty(pg_total_relation_size(c.oid)) AS "total_size"
+  FROM pg_class c
+  LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
+  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
+    AND c.relkind <> 'i'
+    AND nspname !~ '^pg_toast'
+  ORDER BY pg_total_relation_size(c.oid) DESC
+  LIMIT 20;
+
+

Result example:

+
public.state_groups_state - 27 GB
+public.event_json - 9855 MB
+public.events - 3675 MB
+public.event_edges - 3404 MB
+public.received_transactions - 2745 MB
+public.event_reference_hashes - 1864 MB
+public.event_auth - 1775 MB
+public.stream_ordering_to_exterm - 1663 MB
+public.event_search - 1370 MB
+public.room_memberships - 1050 MB
+public.event_to_state_groups - 948 MB
+public.current_state_delta_stream - 711 MB
+public.state_events - 611 MB
+public.presence_stream - 530 MB
+public.current_state_events - 525 MB
+public.cache_invalidation_stream - 466 MB
+public.receipts_linearized - 279 MB
+public.state_groups - 160 MB
+public.device_lists_remote_cache - 124 MB
+public.state_group_edges - 122 MB
+
+

Show top 20 largest rooms by state events count

+

You get the same information when you use the +admin API +and set parameter order_by=state_events.

+
SELECT r.name, s.room_id, s.current_state_events
+  FROM room_stats_current s
+  LEFT JOIN room_stats_state r USING (room_id)
+  ORDER BY current_state_events DESC
+  LIMIT 20;
+
+

and by state_group_events count:

+
SELECT rss.name, s.room_id, COUNT(s.room_id)
+  FROM state_groups_state s
+  LEFT JOIN room_stats_state rss USING (room_id)
+  GROUP BY s.room_id, rss.name
+  ORDER BY COUNT(s.room_id) DESC
+  LIMIT 20;
+
+

and the same query, but with the join removed for performance reasons:

+
SELECT s.room_id, COUNT(s.room_id)
+  FROM state_groups_state s
+  GROUP BY s.room_id 
+  ORDER BY COUNT(s.room_id) DESC
+  LIMIT 20;
+
+

Show top 20 rooms by new events count in last 1 day:

+
SELECT e.room_id, r.name, COUNT(e.event_id) cnt
+  FROM events e
+  LEFT JOIN room_stats_state r USING (room_id)
+  WHERE e.origin_server_ts >= DATE_PART('epoch', NOW() - INTERVAL '1 day') * 1000
+  GROUP BY e.room_id, r.name 
+  ORDER BY cnt DESC
+  LIMIT 20;
+
+
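
The queries in this section compare origin_server_ts and last_seen, which are stored in milliseconds since the epoch, against a cutoff built with DATE_PART('epoch', ...) * 1000. The same arithmetic in Python, with illustrative helper names, looks like this:

```python
from datetime import datetime, timedelta, timezone


def ms_cutoff(now: datetime, age: timedelta) -> int:
    """Milliseconds since the epoch for the moment `age` before `now`."""
    return int((now - age).timestamp() * 1000)


def from_ms(ts_ms: int) -> datetime:
    """Convert a millisecond timestamp back to an aware UTC datetime."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)


now = datetime(2024, 4, 24, 13, 43, 33, tzinfo=timezone.utc)
cutoff = ms_cutoff(now, timedelta(days=1))
print(cutoff, from_ms(cutoff).isoformat())
```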

Show top 20 users on homeserver by sent events (messages) in the last month:

+

Caution: this query does not use any indexes, so it can be slow and create load on the database.

+
SELECT COUNT(*), sender
+  FROM events
+  WHERE (type = 'm.room.encrypted' OR type = 'm.room.message')
+    AND origin_server_ts >= DATE_PART('epoch', NOW() - INTERVAL '1 month') * 1000
+  GROUP BY sender
+  ORDER BY COUNT(*) DESC
+  LIMIT 20;
+
+

Show last 100 messages from a given user, with room names:

+
SELECT e.room_id, r.name, e.event_id, e.type, e.content, j.json
+  FROM events e
+  -- Join on event_id so that each event is paired only with its own JSON
+  LEFT JOIN event_json j ON j.event_id = e.event_id
+  LEFT JOIN room_stats_state r ON r.room_id = e.room_id
+  WHERE e.sender = '@LOGIN:example.com'
+    AND e.type = 'm.room.message'
+  ORDER BY e.stream_ordering DESC
+  LIMIT 100;
+
+

Show rooms with names, sorted by events in these rooms

+

Sort and order with bash

+
echo "SELECT event_json.room_id, room_stats_state.name FROM event_json, room_stats_state \
+WHERE room_stats_state.room_id = event_json.room_id" | psql -d synapse -h localhost -U synapse_user -t \
+| sort | uniq -c | sort -n
+
+

Documentation for psql command line parameters: https://www.postgresql.org/docs/current/app-psql.html
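The `sort | uniq -c | sort -n` part of the pipeline above does the counting. A minimal sketch on made-up rows shows the effect; the room IDs, names and function names below are hypothetical stand-ins for what `psql -t` might print, one row per event.

```shell
# Hypothetical sample: one "room_id | name" row per event, as psql -t might print.
sample_rows() {
  printf '%s\n' \
    ' !a:example.org | Room A' \
    ' !b:example.org | Room B' \
    ' !a:example.org | Room A' \
    ' !a:example.org | Room A' \
    ' !b:example.org | Room B' \
    ' !c:example.org | Room C'
}

# sort puts identical rows next to each other, uniq -c collapses each run
# and prefixes it with its count, and sort -n orders by that count (ascending).
count_rows() {
  sort | uniq -c | sort -n
}

sample_rows | count_rows
```

The room with the most events ends up on the last line, which matches the layout of the result example further down.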

+

Sort and order with SQL

+
SELECT COUNT(*), event_json.room_id, room_stats_state.name
+  FROM event_json, room_stats_state
+  WHERE room_stats_state.room_id = event_json.room_id
+  GROUP BY event_json.room_id, room_stats_state.name
+  ORDER BY COUNT(*) DESC
+  LIMIT 50;
+
+

Result example:

+
   9459  !FPUfgzXYWTKgIrwKxW:matrix.org              | This Week in Matrix
+   9459  !FPUfgzXYWTKgIrwKxW:matrix.org              | This Week in Matrix (TWIM)
+  17799  !iDIOImbmXxwNngznsa:matrix.org              | Linux in Russian
+  18739  !GnEEPYXUhoaHbkFBNX:matrix.org              | Riot Android
+  23373  !QtykxKocfZaZOUrTwp:matrix.org              | Matrix HQ
+  39504  !gTQfWzbYncrtNrvEkB:matrix.org              | ru.[matrix]
+  43601  !iNmaIQExDMeqdITdHH:matrix.org              | Riot
+  43601  !iNmaIQExDMeqdITdHH:matrix.org              | Riot Web/Desktop
+
+

Look up room state info by a list of room IDs

+

You get the same information when you use the admin API.

+
SELECT rss.room_id, rss.name, rss.canonical_alias, rss.topic, rss.encryption,
+    rsc.joined_members, rsc.local_users_in_room, rss.join_rules
+  FROM room_stats_state rss
+  LEFT JOIN room_stats_current rsc USING (room_id)
+  WHERE room_id IN (
+    '!OGEhHVWSdvArJzumhm:matrix.org',
+    '!YTvKGNlinIzlkMTVRl:matrix.org' 
+  );
+
+

Show users and devices that have not been online for a while

+
SELECT user_id, device_id, user_agent, TO_TIMESTAMP(last_seen / 1000) AS "last_seen"
+  FROM devices
+  WHERE last_seen < DATE_PART('epoch', NOW() - INTERVAL '3 month') * 1000;
+
+

Clear the cache of a remote user's device list

+

Forces a resync of a remote user's device list. Use this if you have somehow cached a bad state and the remote server will not send out a device list update.

+
INSERT INTO device_lists_remote_resync
+VALUES ('USER_ID', (EXTRACT(epoch FROM NOW()) * 1000)::BIGINT);
+