# Scaling synapse via workers

For small instances it is recommended to run Synapse in the default monolith mode.
For larger instances where performance is a concern it can be helpful to split
out functionality into multiple separate python processes. These processes are
called 'workers', and are (eventually) intended to scale horizontally
independently.

Synapse's worker support is under active development and subject to change as
we attempt to rapidly scale ever larger Synapse instances. However, we are
documenting it here to help admins needing a highly scalable Synapse instance
similar to the one running `matrix.org`.

All processes continue to share the same database instance, and as such,
workers only work with PostgreSQL-based Synapse deployments. SQLite should only
be used for demo purposes and any admin considering workers should already be
running PostgreSQL.

See also the [Matrix.org blog post](https://matrix.org/blog/2020/11/03/how-we-fixed-synapses-scalability)
for a higher level overview.

## Main process/worker communication

The processes communicate with each other via a Synapse-specific protocol called
'replication' (analogous to MySQL- or Postgres-style database replication) which
feeds streams of newly written data between processes so they can be kept in
sync with the database state.

When configured to do so, Synapse uses a
[Redis pub/sub channel](https://redis.io/docs/manual/pubsub/) to send the replication
stream between all configured Synapse processes. Additionally, processes may
make HTTP requests to each other, primarily for operations which need to wait
for a reply ─ such as sending an event.

All the workers and the main process connect to Redis, which relays replication
commands between processes.

If Redis support is enabled, Synapse will use it as a shared cache, as well as a
pub/sub mechanism.

See the [Architectural diagram](#architectural-diagram) section at the end for
a visualisation of what this looks like.


## Setting up workers

A Redis server is required to manage the communication between the processes.
The Redis server should be installed following the normal procedure for your
distribution (e.g. `apt install redis-server` on Debian). It is safe to use an
existing Redis deployment if you have one.

Once installed, check that Redis is running and accessible from the host running
Synapse, for example by executing `echo PING | nc -q1 localhost 6379` and seeing
a response of `+PONG`.

The appropriate dependencies must also be installed for Synapse. If using a
virtualenv, these can be installed with:

```sh
pip install "matrix-synapse[redis]"
```

Note that these dependencies are included when synapse is installed with `pip
install matrix-synapse[all]`. They are also included in the debian packages from
`matrix.org` and in the docker images at
https://hub.docker.com/r/matrixdotorg/synapse/.

To make effective use of the workers, you will need to configure an HTTP
reverse-proxy such as nginx or haproxy, which will direct incoming requests to
the correct worker, or to the main synapse instance. See
[the reverse proxy documentation](../../setup/reverse_proxy.md) for information
on setting up a reverse proxy.

When using workers, each worker process has its own configuration file which
contains settings specific to that worker, such as the HTTP listener that it
provides (if any), logging configuration, etc.

Normally, the worker processes are configured to read from a shared
configuration file as well as the worker-specific configuration files. This
makes it easier to keep common configuration settings synchronised across all
the processes.

The main process is somewhat special in this respect: it does not normally
need its own configuration file and can take all of its configuration from the
shared configuration file.
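For illustration, a worker is typically started by passing both files on the
command line. A minimal sketch, assuming a generic worker; the file paths are
illustrative rather than defaults:

```sh
# read the shared config first, then the worker's own config
python -m synapse.app.generic_worker \
    --config-path /etc/matrix-synapse/homeserver.yaml \
    --config-path /etc/matrix-synapse/workers/generic_worker1.yaml
```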
### Shared configuration

Normally, only a couple of changes are needed to make an existing configuration
file suitable for use with workers. First, you need to enable an "HTTP replication
listener" for the main process; and secondly, you need to enable redis-based
replication. Optionally, a shared secret can be used to authenticate HTTP
traffic between workers. For example:

```yaml
# extend the existing `listeners` section. This defines the ports that the
# main process will listen on.
listeners:
  # The HTTP replication port
  - port: 9093
    bind_address: '127.0.0.1'
    type: http
    resources:
      - names: [replication]

# Add a random shared secret to authenticate traffic.
worker_replication_secret: ""

redis:
  enabled: true
```

See the [configuration manual](config_documentation.md) for the full
documentation of each option.

Under **no circumstances** should the replication listener be exposed to the
public internet; replication traffic is:

* always unencrypted
* unauthenticated, unless `worker_replication_secret` is configured


### Worker configuration

In the config file for each worker, you must specify:
 * The type of worker (`worker_app`). The currently available worker applications are listed below.
 * A unique name for the worker (`worker_name`).
 * The HTTP replication endpoint that it should talk to on the main synapse process
   (`worker_replication_host` and `worker_replication_http_port`).
 * If handling HTTP requests, a `worker_listeners` option with an `http`
   listener, in the same way as the [`listeners`](config_documentation.md#listeners)
   option in the shared config.
 * If handling the `^/_matrix/client/v3/keys/upload` endpoint, the HTTP URI for
   the main process (`worker_main_http_uri`).

For example:

```yaml
{{#include systemd-with-workers/workers/generic_worker.yaml}}
```

...is a full configuration for a generic worker instance, which will expose a
plain HTTP listener on port 8083 serving various endpoints, e.g. `/sync`, as
listed below.

Obviously you should configure your reverse-proxy to route the relevant
endpoints to the worker (`localhost:8083` in the above example).
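To give a concrete idea of that routing, here is a minimal nginx sketch which
sends sync traffic to the worker above. The endpoint list is deliberately
abbreviated, and the upstream address assumes the example's port 8083; adapt
both to your deployment:

```nginx
# inside the server block for the homeserver
location ~ ^/_matrix/client/(r0|v3)/sync$ {
    proxy_pass http://localhost:8083;
}
```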
### Running Synapse with workers

Finally, you need to start your worker processes. This can be done with either
`synctl` or your distribution's preferred service manager such as `systemd`. We
recommend the use of `systemd` where available: for information on setting up
`systemd` to start synapse workers, see
[Systemd with Workers](systemd-with-workers). To use `synctl`, see
[Using synctl with Workers](synctl_workers.md).


## Available worker applications

### `synapse.app.generic_worker`

This worker can handle API requests matching the following regular expressions.
These endpoints can be routed to any worker. If a worker is set up to handle a
stream then, for maximum efficiency, additional endpoints should be routed to that
worker: refer to the [stream writers](#stream-writers) section below for further
information.

    # Sync requests
    ^/_matrix/client/(r0|v3)/sync$
    ^/_matrix/client/(api/v1|r0|v3)/events$
    ^/_matrix/client/(api/v1|r0|v3)/initialSync$
    ^/_matrix/client/(api/v1|r0|v3)/rooms/[^/]+/initialSync$

    # Federation requests
    ^/_matrix/federation/v1/event/
    ^/_matrix/federation/v1/state/
    ^/_matrix/federation/v1/state_ids/
    ^/_matrix/federation/v1/backfill/
    ^/_matrix/federation/v1/get_missing_events/
    ^/_matrix/federation/v1/publicRooms
    ^/_matrix/federation/v1/query/
    ^/_matrix/federation/v1/make_join/
    ^/_matrix/federation/v1/make_leave/
    ^/_matrix/federation/(v1|v2)/send_join/
    ^/_matrix/federation/(v1|v2)/send_leave/
    ^/_matrix/federation/(v1|v2)/invite/
    ^/_matrix/federation/v1/event_auth/
    ^/_matrix/federation/v1/exchange_third_party_invite/
    ^/_matrix/federation/v1/user/devices/
    ^/_matrix/key/v2/query
    ^/_matrix/federation/v1/hierarchy/

    # Inbound federation transaction request
    ^/_matrix/federation/v1/send/

    # Client API requests
    ^/_matrix/client/(api/v1|r0|v3|unstable)/createRoom$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/publicRooms$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/joined_members$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/context/.*$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/members$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/state$
    ^/_matrix/client/v1/rooms/.*/hierarchy$
    ^/_matrix/client/(v1|unstable)/rooms/.*/relations/
    ^/_matrix/client/v1/rooms/.*/threads$
    ^/_matrix/client/unstable/org.matrix.msc2716/rooms/.*/batch_send$
    ^/_matrix/client/unstable/im.nheko.summary/rooms/.*/summary$
    ^/_matrix/client/(r0|v3|unstable)/account/3pid$
    ^/_matrix/client/(r0|v3|unstable)/account/whoami$
    ^/_matrix/client/(r0|v3|unstable)/devices$
    ^/_matrix/client/versions$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/voip/turnServer$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/event/
    ^/_matrix/client/(api/v1|r0|v3|unstable)/joined_rooms$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/search$

    # Encryption requests
    # Note that ^/_matrix/client/(r0|v3|unstable)/keys/upload/ requires `worker_main_http_uri`
    ^/_matrix/client/(r0|v3|unstable)/keys/query$
    ^/_matrix/client/(r0|v3|unstable)/keys/changes$
    ^/_matrix/client/(r0|v3|unstable)/keys/claim$
    ^/_matrix/client/(r0|v3|unstable)/room_keys/
    ^/_matrix/client/(r0|v3|unstable)/keys/upload/

    # Registration/login requests
    ^/_matrix/client/(api/v1|r0|v3|unstable)/login$
    ^/_matrix/client/(r0|v3|unstable)/register$
    ^/_matrix/client/v1/register/m.login.registration_token/validity$

    # Event sending requests
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/redact
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/send
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/state/
    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/(join|invite|leave|ban|unban|kick)$
    ^/_matrix/client/(api/v1|r0|v3|unstable)/join/
    ^/_matrix/client/(api/v1|r0|v3|unstable)/profile/

    # Account data requests
    ^/_matrix/client/(r0|v3|unstable)/.*/tags
    ^/_matrix/client/(r0|v3|unstable)/.*/account_data

    # Receipts requests
    ^/_matrix/client/(r0|v3|unstable)/rooms/.*/receipt
    ^/_matrix/client/(r0|v3|unstable)/rooms/.*/read_markers

    # Presence requests
    ^/_matrix/client/(api/v1|r0|v3|unstable)/presence/

    # User directory search requests
    ^/_matrix/client/(r0|v3|unstable)/user_directory/search$

Additionally, the following REST endpoints can be handled for GET requests:

    ^/_matrix/client/(api/v1|r0|v3|unstable)/pushrules/
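As flagged in the encryption list above, if
`^/_matrix/client/(r0|v3|unstable)/keys/upload/` is routed to this worker, the
worker's own configuration must also point back at the main process via
`worker_main_http_uri`. A sketch, assuming the main process's client listener
is on the default port 8008 (adjust to your deployment):

```yaml
worker_main_http_uri: http://127.0.0.1:8008
```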
Pagination requests can also be handled, but all requests for a given
room must be routed to the same instance. Additionally, care must be taken to
ensure that the purge history admin API is not used while pagination requests
for the room are in flight:

    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/messages$

Additionally, the following endpoints should be included if Synapse is configured
to use SSO (you only need to include the ones for whichever SSO provider you're
using):

    # for all SSO providers
    ^/_matrix/client/(api/v1|r0|v3|unstable)/login/sso/redirect
    ^/_synapse/client/pick_idp$
    ^/_synapse/client/pick_username
    ^/_synapse/client/new_user_consent$
    ^/_synapse/client/sso_register$

    # OpenID Connect requests.
    ^/_synapse/client/oidc/callback$

    # SAML requests.
    ^/_synapse/client/saml2/authn_response$

    # CAS requests.
    ^/_matrix/client/(api/v1|r0|v3|unstable)/login/cas/ticket$

Ensure that all SSO logins go to a single process. For the problems caused when
multiple workers handle the SSO endpoints, see
[#7530](https://github.com/matrix-org/synapse/issues/7530) and
[#9427](https://github.com/matrix-org/synapse/issues/9427).

Note that an [HTTP listener](config_documentation.md#listeners)
with the `client` and `federation` resources must be configured in the
`worker_listeners` option in the worker config.

#### Load balancing

It is possible to run multiple instances of this worker app, with incoming requests
being load-balanced between them by the reverse-proxy. However, different endpoints
have different characteristics and so admins
may wish to run multiple groups of workers handling different endpoints so that
load balancing can be done in different ways.

For `/sync` and `/initialSync` requests it will be more efficient if all
requests from a particular user are routed to a single instance. Extracting a
user ID from the access token or `Authorization` header is currently left as an
exercise for the reader. Admins may additionally wish to separate out `/sync`
requests that have a `since` query parameter from those that don't (and
`/initialSync`): requests without a `since` parameter are "initial sync"
requests, which happen when a user logs in on a new device and can be *very*
resource intensive, so isolating them will stop them from interfering with
other users' ongoing syncs.

Federation and client requests can be balanced via simple round robin.

The inbound federation transaction request `^/_matrix/federation/v1/send/`
should be balanced by source IP so that transactions from the same remote server
go to the same process.

Registration/login requests can be handled separately purely to help ensure that
unexpected load doesn't affect new logins and sign ups.

Finally, event sending requests can be balanced by the room ID in the URI (or
the full URI, or even just round robin); the room ID is the path component after
`/rooms/`. If there is a large bridge connected that is sending or may send lots
of events, then a dedicated set of workers can be provisioned to limit the
effects of bursts of events from that bridge on events sent by normal users.
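As an example of the source-IP balancing described above, an nginx sketch for
`^/_matrix/federation/v1/send/` might look like the following. The upstream
name, worker ports and two-worker split are all illustrative:

```nginx
# transactions from a given remote server always reach the same worker
# (the `upstream` block lives in the http context, the `location` in
# your server block)
upstream synapse_inbound_federation {
    ip_hash;
    server localhost:8084;
    server localhost:8085;
}

location ~ ^/_matrix/federation/v1/send/ {
    proxy_pass http://synapse_inbound_federation;
}
```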
#### Stream writers

Additionally, the writing of specific streams (such as events) can be moved off
of the main process to a particular worker.

To enable this, the worker must have an
[HTTP `replication` listener](config_documentation.md#listeners) configured,
have a `worker_name` and be listed in the `instance_map` config. The same worker
can handle multiple streams, but unless otherwise documented, each stream can only
have a single writer.

For example, to move event persistence off to a dedicated worker, the shared
configuration would include:

```yaml
instance_map:
  event_persister1:
    host: localhost
    port: 8034

stream_writers:
  events: event_persister1
```

An example for a stream writer instance:

```yaml
{{#include systemd-with-workers/workers/event_persister.yaml}}
```

Some of the streams have associated endpoints which, for maximum efficiency, should
be routed to the workers handling that stream. See below for the currently supported
streams and the endpoints associated with them:

##### The `events` stream

The `events` stream experimentally supports having multiple writers, where work
is sharded between them by room ID. Note that you *must* restart all worker
instances when adding or removing event persisters. An example `stream_writers`
configuration with multiple writers:

```yaml
stream_writers:
  events:
    - event_persister1
    - event_persister2
```

##### The `typing` stream

The following endpoints should be routed directly to the worker configured as
the stream writer for the `typing` stream:

    ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/typing

##### The `to_device` stream

The following endpoints should be routed directly to the worker configured as
the stream writer for the `to_device` stream:

    ^/_matrix/client/(r0|v3|unstable)/sendToDevice/

##### The `account_data` stream

The following endpoints should be routed directly to the worker configured as
the stream writer for the `account_data` stream:

    ^/_matrix/client/(r0|v3|unstable)/.*/tags
    ^/_matrix/client/(r0|v3|unstable)/.*/account_data

##### The `receipts` stream

The following endpoints should be routed directly to the worker configured as
the stream writer for the `receipts` stream:

    ^/_matrix/client/(r0|v3|unstable)/rooms/.*/receipt
    ^/_matrix/client/(r0|v3|unstable)/rooms/.*/read_markers

##### The `presence` stream

The following endpoints should be routed directly to the worker configured as
the stream writer for the `presence` stream:

    ^/_matrix/client/(api/v1|r0|v3|unstable)/presence/
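Putting the above together, a shared-configuration sketch that assigns several
streams to dedicated writers might look like this. The worker names here are
hypothetical, and each must also have an entry in `instance_map`:

```yaml
stream_writers:
  events: event_persister1
  typing: typing_writer
  to_device: to_device_writer
  receipts: receipts_writer
```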
#### Background tasks

There is also support for moving background tasks to a separate
worker. Background tasks are run periodically or started via replication. Exactly
which tasks are configured to run depends on your Synapse configuration (e.g. if
stats is enabled). This worker doesn't handle any REST endpoints itself.

To enable this, the worker must have a `worker_name`; the background tasks are
then assigned to it in the shared configuration. For example, to move background
tasks to a dedicated worker, the shared configuration would include:

```yaml
run_background_tasks_on: background_worker
```

You might also wish to investigate the `update_user_directory_from_worker` and
`media_instance_running_background_jobs` settings.

An example for a dedicated background worker instance:

```yaml
{{#include systemd-with-workers/workers/background_worker.yaml}}
```

#### Updating the User Directory

You can designate one generic worker to update the user directory.

Specify its name in the shared configuration as follows:

```yaml
update_user_directory_from_worker: worker_name
```

This work cannot be load-balanced; please ensure the main process is restarted
after setting this option in the shared configuration!

User directory updates allow REST endpoints matching the following regular
expressions to work:

    ^/_matrix/client/(r0|v3|unstable)/user_directory/search$

The above endpoints can be routed to any worker, though you may choose to route
them to the chosen user directory worker.

This style of configuration supersedes the legacy `synapse.app.user_dir`
worker application type.


#### Notifying Application Services

You can designate one generic worker to send output traffic to Application Services.
This worker doesn't handle any REST endpoints itself, but you should specify its
name in the shared configuration as follows:

```yaml
notify_appservices_from_worker: worker_name
```

This work cannot be load-balanced; please ensure the main process is restarted
after setting this option in the shared configuration!

This style of configuration supersedes the legacy `synapse.app.appservice`
worker application type.


### `synapse.app.pusher`

Handles sending push notifications to sygnal and email. Doesn't handle any
REST endpoints itself, but you should set `start_pushers: False` in the
shared configuration file to stop the main synapse process from sending push
notifications.

To run multiple instances at once the `pusher_instances` option should list all
pusher instances by their worker name, e.g.:

```yaml
pusher_instances:
  - pusher_worker1
  - pusher_worker2
```

An example for a pusher instance:

```yaml
{{#include systemd-with-workers/workers/pusher_worker.yaml}}
```


### `synapse.app.appservice`

**Deprecated as of Synapse v1.59.** [Use `synapse.app.generic_worker` with the
`notify_appservices_from_worker` option instead.](#notifying-application-services)

Handles sending output traffic to Application Services. Doesn't handle any
REST endpoints itself, but you should set `notify_appservices: False` in the
shared configuration file to stop the main synapse process from sending
appservice notifications.

Note this worker cannot be load-balanced: only one instance should be active.


### `synapse.app.federation_sender`

Handles sending federation traffic to other servers. Doesn't handle any
REST endpoints itself, but you should set `send_federation: False` in the
shared configuration file to stop the main synapse process from sending this
traffic.

If running multiple federation senders then you must list each
instance in the `federation_sender_instances` option by their `worker_name`.
All instances must be stopped and started when adding or removing instances.
For example:

```yaml
federation_sender_instances:
  - federation_sender1
  - federation_sender2
```

An example for a federation sender instance:

```yaml
{{#include systemd-with-workers/workers/federation_sender.yaml}}
```

### `synapse.app.media_repository`

Handles the media repository. It can handle all endpoints starting with:

    /_matrix/media/

... and the following regular expressions matching media-specific administration APIs:

    ^/_synapse/admin/v1/purge_media_cache$
    ^/_synapse/admin/v1/room/.*/media.*$
    ^/_synapse/admin/v1/user/.*/media.*$
    ^/_synapse/admin/v1/media/.*$
    ^/_synapse/admin/v1/quarantine_media/.*$
    ^/_synapse/admin/v1/users/.*/media$

You should also set `enable_media_repo: False` in the shared configuration
file to stop the main synapse process from running background jobs related to
managing the media repository. Note that doing so will prevent the main process
from being able to handle the above endpoints.
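In other words, the shared configuration for a media repository deployment
would include:

```yaml
enable_media_repo: false
```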
In the `media_repository` worker configuration file, configure the
[HTTP listener](config_documentation.md#listeners) to
expose the `media` resource. For example:

```yaml
{{#include systemd-with-workers/workers/media_worker.yaml}}
```

Note that if running multiple media repositories they must be on the same server
and you must configure a single instance to run the background tasks, e.g.:

```yaml
media_instance_running_background_jobs: "media-repository-1"
```

Note that if a reverse proxy is used, then `/_matrix/media/` must be routed for
both inbound client and federation requests (if they are handled separately).

### `synapse.app.user_dir`

**Deprecated as of Synapse v1.59.** [Use `synapse.app.generic_worker` with the
`update_user_directory_from_worker` option instead.](#updating-the-user-directory)

Handles searches in the user directory. It can handle REST endpoints matching
the following regular expressions:

    ^/_matrix/client/(r0|v3|unstable)/user_directory/search$

When using this worker you must also set `update_user_directory: false` in the
shared configuration file to stop the main synapse process from running
background jobs related to updating the user directory.

The above endpoint is not *required* to be routed to this worker. By default,
`update_user_directory` is set to `true`, which means the main process
will handle updates. All workers configured with `client` can handle the above
endpoint as long as either this worker or the main process is configured to
handle it, and is online.

If `update_user_directory` is set to `false`, and this worker is not running,
the above endpoint may give outdated results.

### Historical apps

The following used to be separate worker application types, but are now
equivalent to `synapse.app.generic_worker`:

 * `synapse.app.client_reader`
 * `synapse.app.event_creator`
 * `synapse.app.federation_reader`
 * `synapse.app.frontend_proxy`
 * `synapse.app.synchrotron`


## Migration from old config

The main change that has occurred is the merging of worker apps into
`synapse.app.generic_worker`. This change is backwards compatible and so no
changes to the config are required.

To migrate apps to use `synapse.app.generic_worker` simply update the
`worker_app` option in the worker configs, and wherever workers are started
(e.g. in systemd service files; this is not required for `synctl`).


## Architectural diagram

The following shows an example setup using Redis and a reverse proxy:

```
                     Clients & Federation
                              |
                              v
                        +-----------+
                        |           |
                        |  Reverse  |
                        |  Proxy    |
                        |           |
                        +-----------+
                            | | |
                            | | | HTTP requests
        +-------------------+ | +-----------+
        |                 +---+             |
        |                 |                 |
        v                 v                 v
+--------------+  +--------------+  +--------------+  +--------------+
|     Main     |  |   Generic    |  |   Generic    |  |    Event     |
|   Process    |  |   Worker 1   |  |   Worker 2   |  |  Persister   |
+--------------+  +--------------+  +--------------+  +--------------+
      ^    ^          |   ^   |         |   ^   |          ^    ^
      |    |          |   |   |         |   |   |          |    |
      |    |          |   |   |  HTTP   |   |   |          |    |
      |    +----------+<--|---|---------+<--|---|---------+    |
      |                   |   +-------------|-->+--------------+
      |                   |                 |                  |
      |                   |                 |                  |
      v                   v                 v                  v
======================================================================
                        Redis pub/sub channel
```