diff --git a/docs/workers.md b/docs/workers.md
index 4bd60ba0a0..7512eff43a 100644
--- a/docs/workers.md
+++ b/docs/workers.md
@@ -1,23 +1,31 @@
# Scaling synapse via workers
-Synapse has experimental support for splitting out functionality into
-multiple separate python processes, helping greatly with scalability. These
+For small instances it is recommended to run Synapse in monolith mode (the
+default). For larger instances where performance is a concern it can be helpful
+to split out functionality into multiple separate Python processes. These
processes are called 'workers', and are (eventually) intended to scale
horizontally independently.
-All of the below is highly experimental and subject to change as Synapse evolves,
-but documenting it here to help folks needing highly scalable Synapses similar
-to the one running matrix.org!
+Synapse's worker support is under active development and subject to change as
+we attempt to rapidly scale ever larger Synapse instances. However we are
+documenting it here to help admins needing a highly scalable Synapse instance
+similar to the one running `matrix.org`.
-All processes continue to share the same database instance, and as such, workers
-only work with postgres based synapse deployments (sharing a single sqlite
-across multiple processes is a recipe for disaster, plus you should be using
-postgres anyway if you care about scalability).
+All processes continue to share the same database instance, and as such,
+workers only work with PostgreSQL-based Synapse deployments. SQLite should only
+be used for demo purposes and any admin considering workers should already be
+running PostgreSQL.
-The workers communicate with the master synapse process via a synapse-specific
-TCP protocol called 'replication' - analogous to MySQL or Postgres style
-database replication; feeding a stream of relevant data to the workers so they
-can be kept in sync with the main synapse process and database state.
+## Master/worker communication
+
+The workers communicate with the master process via a Synapse-specific protocol
+called 'replication' (analogous to MySQL- or Postgres-style database
+replication) which feeds a stream of relevant data from the master to the
+workers so they can be kept in sync with the master process and database state.
+
+Additionally, workers may make HTTP requests to the master, to send information
+in the other direction. Typically this is used for operations which need to
+wait for a reply - such as sending an event.
## Configuration
@@ -27,72 +35,61 @@ the correct worker, or to the main synapse instance. Note that this includes
requests made to the federation port. See [reverse_proxy.md](reverse_proxy.md)
for information on setting up a reverse proxy.
-To enable workers, you need to add two replication listeners to the master
-synapse, e.g.:
-
- listeners:
- # The TCP replication port
- - port: 9092
- bind_address: '127.0.0.1'
- type: replication
- # The HTTP replication port
- - port: 9093
- bind_address: '127.0.0.1'
- type: http
- resources:
- - names: [replication]
-
-Under **no circumstances** should these replication API listeners be exposed to
-the public internet; it currently implements no authentication whatsoever and is
-unencrypted.
-
-(Roughly, the TCP port is used for streaming data from the master to the
-workers, and the HTTP port for the workers to send data to the main
-synapse process.)
-
-You then create a set of configs for the various worker processes. These
-should be worker configuration files, and should be stored in a dedicated
-subdirectory, to allow synctl to manipulate them. An additional configuration
-for the master synapse process will need to be created because the process will
-not be started automatically. That configuration should look like this:
-
- worker_app: synapse.app.homeserver
- daemonize: true
-
-Each worker configuration file inherits the configuration of the main homeserver
-configuration file. You can then override configuration specific to that worker,
-e.g. the HTTP listener that it provides (if any); logging configuration; etc.
-You should minimise the number of overrides though to maintain a usable config.
+To enable workers, you need to add *two* replication listeners to the
+main Synapse configuration file (`homeserver.yaml`). For example:
-You must specify the type of worker application (`worker_app`). The currently
-available worker applications are listed below. You must also specify the
-replication endpoints that it's talking to on the main synapse process.
-`worker_replication_host` should specify the host of the main synapse,
-`worker_replication_port` should point to the TCP replication listener port and
-`worker_replication_http_port` should point to the HTTP replication port.
+```yaml
+listeners:
+ # The TCP replication port
+ - port: 9092
+ bind_address: '127.0.0.1'
+ type: replication
-Currently, the `event_creator` and `federation_reader` workers require specifying
-`worker_replication_http_port`.
+ # The HTTP replication port
+ - port: 9093
+ bind_address: '127.0.0.1'
+ type: http
+ resources:
+ - names: [replication]
+```
-For instance:
-
- worker_app: synapse.app.synchrotron
-
- # The replication listener on the synapse to talk to.
- worker_replication_host: 127.0.0.1
- worker_replication_port: 9092
- worker_replication_http_port: 9093
-
- worker_listeners:
- - type: http
- port: 8083
- resources:
- - names:
- - client
-
- worker_daemonize: True
- worker_pid_file: /home/matrix/synapse/synchrotron.pid
- worker_log_config: /home/matrix/synapse/config/synchrotron_log_config.yaml
+Under **no circumstances** should these replication API listeners be exposed to
+the public internet; they have no authentication and are unencrypted.
+
+You should then create a set of configs for the various worker processes. Each
+worker configuration file inherits the configuration of the main homeserver
+configuration file. You can then override configuration specific to that
+worker, e.g. the HTTP listener that it provides (if any); logging
+configuration; etc. You should minimise the number of overrides though to
+maintain a usable config.
+
+In the config file for each worker, you must specify the type of worker
+application (`worker_app`). The currently available worker applications are
+listed below. You must also specify the replication endpoints that it should
+talk to on the main synapse process. `worker_replication_host` should specify
+the host of the main synapse, `worker_replication_port` should point to the TCP
+replication listener port and `worker_replication_http_port` should point to
+the HTTP replication port.
+
+For example:
+
+```yaml
+worker_app: synapse.app.synchrotron
+
+# The replication listener on the synapse to talk to.
+worker_replication_host: 127.0.0.1
+worker_replication_port: 9092
+worker_replication_http_port: 9093
+
+worker_listeners:
+ - type: http
+ port: 8083
+ resources:
+ - names:
+ - client
+
+worker_log_config: /home/matrix/synapse/config/synchrotron_log_config.yaml
+```
...is a full configuration for a synchrotron worker instance, which will expose a
plain HTTP `/sync` endpoint on port 8083 separately from the `/sync` endpoint provided
@@ -101,7 +98,75 @@ by the main synapse.
Obviously you should configure your reverse-proxy to route the relevant
endpoints to the worker (`localhost:8083` in the above example).
-Finally, to actually run your worker-based synapse, you must pass synctl the -a
+Finally, you need to start your worker processes. This can be done with either
+`synctl` or your distribution's preferred service manager such as `systemd`. We
+recommend the use of `systemd` where available: for information on setting up
+`systemd` to start synapse workers, see
+[systemd-with-workers](systemd-with-workers). To use `synctl`, see below.
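+
+As a rough illustration, a `systemd` unit for a single worker might look like
+the following. The install paths, user name and config locations here are
+hypothetical and will vary between deployments:
+
+```ini
+# /etc/systemd/system/matrix-synapse-synchrotron.service (hypothetical paths)
+[Unit]
+Description=Synapse synchrotron worker
+After=network.target
+
+[Service]
+Type=simple
+User=matrix-synapse
+# Pass both the shared config and the worker's own config file.
+ExecStart=/opt/synapse/env/bin/python -m synapse.app.synchrotron \
+    --config-path=/etc/synapse/homeserver.yaml \
+    --config-path=/etc/synapse/workers/synchrotron.yaml
+Restart=on-failure
+
+[Install]
+WantedBy=multi-user.target
+```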
+
+### **Experimental** support for replication over Redis
+
+As of Synapse v1.13.0, it is possible to configure Synapse to send replication
+via a [Redis pub/sub channel](https://redis.io/topics/pubsub). This is an
+alternative to direct TCP connections to the master: rather than all the
+workers connecting to the master, all the workers and the master connect to
+Redis, which relays replication commands between processes. This can give a
+significant cpu saving on the master and will be a prerequisite for upcoming
+performance improvements.
+
+Note that this support is currently experimental; you may experience lost
+messages and similar problems! It is strongly recommended that admins setting
+up workers for the first time use direct TCP replication as above.
+
+To configure Synapse to use Redis:
+
+1. Install Redis following the normal procedure for your distribution - for
+ example, on Debian, `apt install redis-server`. (It is safe to use an
+ existing Redis deployment if you have one: we use a pub/sub stream named
+ according to the `server_name` of your synapse server.)
+2. Check Redis is running and accessible: you should be able to `echo PING | nc -q1
+ localhost 6379` and get a response of `+PONG`.
+3. Install the python prerequisites. If you installed synapse into a
+ virtualenv, this can be done with:
+ ```sh
+ pip install matrix-synapse[redis]
+ ```
+ The debian packages from matrix.org already include the required
+ dependencies.
+4. Add config to the shared configuration (`homeserver.yaml`):
+ ```yaml
+ redis:
+ enabled: true
+ ```
+ Optional parameters which can go alongside `enabled` are `host`, `port`,
+ `password`. Normally none of these are required.
+5. Restart master and all workers.
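+
+For example, a shared configuration pointing at a password-protected Redis on
+a non-default port might look like this (the host, port and password are
+illustrative values, not defaults you should copy):
+
+```yaml
+redis:
+  enabled: true
+  # All three of the following are optional; omit them for a local,
+  # unauthenticated Redis on the default port.
+  host: redis.example.com
+  port: 6380
+  password: your-redis-password
+```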
+
+Once Redis replication is in use, `worker_replication_port` is redundant and
+can be removed from the worker configuration files. Similarly, the
+configuration for the `listener` for the TCP replication port can be removed
+from the main configuration file. Note that the HTTP replication port is
+still required.
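+
+For example, once Redis is enabled, a worker configuration needs only the host
+and HTTP replication port of the master:
+
+```yaml
+worker_app: synapse.app.synchrotron
+
+worker_replication_host: 127.0.0.1
+# No worker_replication_port: the TCP replication stream is replaced by Redis.
+worker_replication_http_port: 9093
+```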
+
+### Using synctl
+
+If you want to use `synctl` to manage your synapse processes, you will need to
+create an additional configuration file for the master synapse process. That
+configuration should look like this:
+
+```yaml
+worker_app: synapse.app.homeserver
+```
+
+Additionally, each worker app must be configured with the name of a "pid file",
+to which it will write its process ID when it starts. For example, for a
+synchrotron, you might write:
+
+```yaml
+worker_pid_file: /home/matrix/synapse/synchrotron.pid
+```
+
+Finally, to actually run your worker-based synapse, you must pass synctl the `-a`
commandline option to tell it to operate on all the worker configurations found
in the given directory, e.g.:
@@ -168,20 +233,42 @@ endpoints matching the following regular expressions:
^/_matrix/federation/v1/make_join/
^/_matrix/federation/v1/make_leave/
^/_matrix/federation/v1/send_join/
+ ^/_matrix/federation/v2/send_join/
^/_matrix/federation/v1/send_leave/
+ ^/_matrix/federation/v2/send_leave/
^/_matrix/federation/v1/invite/
+ ^/_matrix/federation/v2/invite/
^/_matrix/federation/v1/query_auth/
^/_matrix/federation/v1/event_auth/
^/_matrix/federation/v1/exchange_third_party_invite/
+ ^/_matrix/federation/v1/user/devices/
^/_matrix/federation/v1/send/
+ ^/_matrix/federation/v1/get_groups_publicised$
^/_matrix/key/v2/query
+Additionally, the following REST endpoints can be handled for GET requests:
+
+ ^/_matrix/federation/v1/groups/
+
The above endpoints should all be routed to the federation_reader worker by the
reverse-proxy configuration.
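+
+As a sketch, the corresponding rules in an nginx reverse proxy might look like
+this, assuming a single `federation_reader` listening on port 8084 (a
+hypothetical choice). A catch-all is used here for brevity; you may prefer to
+enumerate the exact regular expressions above instead:
+
+```
+    location ~ ^/_matrix/federation/ {
+        proxy_pass http://localhost:8084;
+    }
+
+    location ~ ^/_matrix/key/v2/query {
+        proxy_pass http://localhost:8084;
+    }
+```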
The `^/_matrix/federation/v1/send/` endpoint must only be handled by a single
instance.
+Note that `federation` must be added to the listener resources in the worker config:
+
+```yaml
+worker_app: synapse.app.federation_reader
+...
+worker_listeners:
+ - type: http
+ port: <port>
+ resources:
+ - names:
+ - federation
+```
+
### `synapse.app.federation_sender`
Handles sending federation traffic to other servers. Doesn't handle any
@@ -196,16 +283,30 @@ Handles the media repository. It can handle all endpoints starting with:
/_matrix/media/
-And the following regular expressions matching media-specific administration APIs:
+... and the following regular expressions matching media-specific administration APIs:
^/_synapse/admin/v1/purge_media_cache$
- ^/_synapse/admin/v1/room/.*/media$
+ ^/_synapse/admin/v1/room/.*/media.*$
+ ^/_synapse/admin/v1/user/.*/media.*$
+ ^/_synapse/admin/v1/media/.*$
^/_synapse/admin/v1/quarantine_media/.*$
You should also set `enable_media_repo: False` in the shared configuration
file to stop the main synapse running background jobs related to managing the
media repository.
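+
+That is, in the shared `homeserver.yaml`:
+
+```yaml
+# Stop the main process running media-repository background jobs,
+# since the media_repository worker now handles them.
+enable_media_repo: False
+```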
+In the `media_repository` worker configuration file, configure the http listener to
+expose the `media` resource. For example:
+
+```yaml
+ worker_listeners:
+ - type: http
+ port: 8085
+ resources:
+ - names:
+ - media
+```
+
Note this worker cannot be load-balanced: only one instance should be active.
### `synapse.app.client_reader`
@@ -224,15 +325,22 @@ following regular expressions:
^/_matrix/client/(api/v1|r0|unstable)/keys/changes$
^/_matrix/client/versions$
^/_matrix/client/(api/v1|r0|unstable)/voip/turnServer$
+ ^/_matrix/client/(api/v1|r0|unstable)/joined_groups$
+ ^/_matrix/client/(api/v1|r0|unstable)/publicised_groups$
+ ^/_matrix/client/(api/v1|r0|unstable)/publicised_groups/
Additionally, the following REST endpoints can be handled for GET requests:
^/_matrix/client/(api/v1|r0|unstable)/pushrules/.*$
+ ^/_matrix/client/(api/v1|r0|unstable)/groups/.*$
+ ^/_matrix/client/(api/v1|r0|unstable)/user/[^/]*/account_data/
+ ^/_matrix/client/(api/v1|r0|unstable)/user/[^/]*/rooms/[^/]*/account_data/
Additionally, the following REST endpoints can be handled, but all requests must
be routed to the same instance:
^/_matrix/client/(r0|unstable)/register$
+ ^/_matrix/client/(r0|unstable)/auth/.*/fallback/web$
Pagination requests can also be handled, but all requests for the same room
must be routed to the same instance. Additionally, care must be taken to
@@ -248,6 +356,10 @@ the following regular expressions:
^/_matrix/client/(api/v1|r0|unstable)/user_directory/search$
+When using this worker you must also set `update_user_directory: False` in the
+shared configuration file to stop the main synapse running background
+jobs related to updating the user directory.
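+
+That is, in the shared `homeserver.yaml`:
+
+```yaml
+# The user_dir worker handles user-directory updates instead.
+update_user_directory: False
+```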
+
### `synapse.app.frontend_proxy`
Proxies some frequently-requested client endpoints to add caching and remove
@@ -276,6 +388,7 @@ file. For example:
Handles some event creation. It can handle REST endpoints matching:
^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/send
+ ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/state/
^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/(join|invite|leave|ban|unban|kick)$
^/_matrix/client/(api/v1|r0|unstable)/join/
^/_matrix/client/(api/v1|r0|unstable)/profile/
|