Diffstat (limited to 'docs/workers.md')
 -rw-r--r--  docs/workers.md  100
 1 file changed, 68 insertions(+), 32 deletions(-)
diff --git a/docs/workers.md b/docs/workers.md
index c27b3f8bd5..5e1b9ba220 100644
--- a/docs/workers.md
+++ b/docs/workers.md
@@ -88,10 +88,12 @@ shared configuration file.
 ### Shared configuration
 
 Normally, only a couple of changes are needed to make an existing configuration
-file suitable for use with workers. First, you need to enable an "HTTP replication
-listener" for the main process; and secondly, you need to enable redis-based
-replication. Optionally, a shared secret can be used to authenticate HTTP
-traffic between workers. For example:
+file suitable for use with workers. First, you need to enable an
+["HTTP replication listener"](usage/configuration/config_documentation.md#listeners)
+for the main process; second, you need to enable
+[redis-based replication](usage/configuration/config_documentation.md#redis).
+Optionally, a [shared secret](usage/configuration/config_documentation.md#worker_replication_secret)
+can be used to authenticate HTTP traffic between workers. For example:
 
 ```yaml
 # extend the existing `listeners` section. This defines the ports that the
@@ -111,25 +113,28 @@ redis:
     enabled: true
 ```
 
-See the [configuration manual](usage/configuration/config_documentation.html) for the full documentation of each option.
+See the [configuration manual](usage/configuration/config_documentation.md)
+for the full documentation of each option.
 
 Under **no circumstances** should the replication listener be exposed to the
 public internet; replication traffic is:
 
 * always unencrypted
-* unauthenticated, unless `worker_replication_secret` is configured
+* unauthenticated, unless [`worker_replication_secret`](usage/configuration/config_documentation.md#worker_replication_secret)
+  is configured
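+
+For instance, a sketch of such a secret in the shared configuration (the value
+is only a placeholder; use a long random string):
+
+```yaml
+worker_replication_secret: "<a long random string>"
+```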
 
 
 ### Worker configuration
 
 In the config file for each worker, you must specify the following (a complete worker file is sketched after this list):
- * The type of worker (`worker_app`). The currently available worker applications are listed below.
- * A unique name for the worker (`worker_name`).
+ * The type of worker ([`worker_app`](usage/configuration/config_documentation.md#worker_app)).
+   The currently available worker applications are listed [below](#available-worker-applications).
+ * A unique name for the worker ([`worker_name`](usage/configuration/config_documentation.md#worker_name)).
  * The HTTP replication endpoint that it should talk to on the main synapse process
-   (`worker_replication_host` and `worker_replication_http_port`)
- * If handling HTTP requests, a `worker_listeners` option with an `http`
-   listener, in the same way as the [`listeners`](usage/configuration/config_documentation.md#listeners)
-   option in the shared config.
+   ([`worker_replication_host`](usage/configuration/config_documentation.md#worker_replication_host) and
+   [`worker_replication_http_port`](usage/configuration/config_documentation.md#worker_replication_http_port)).
+ * If handling HTTP requests, a [`worker_listeners`](usage/configuration/config_documentation.md#worker_listeners) option
+   with an `http` listener.
  * If handling the `^/_matrix/client/v3/keys/upload` endpoint, the HTTP URI for
    the main process (`worker_main_http_uri`).
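+
+For instance, a complete worker file might look like this sketch (the worker
+name, port, and log path are illustrative placeholders):
+
+```yaml
+worker_app: synapse.app.generic_worker
+worker_name: generic_worker1
+
+# The replication listener on the main synapse process.
+worker_replication_host: 127.0.0.1
+worker_replication_http_port: 9093
+
+worker_listeners:
+    - type: http
+      port: 8083
+      resources:
+          - names: [client, federation]
+
+worker_log_config: /etc/matrix-synapse/generic-worker-log.yaml
+```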
 
@@ -146,7 +151,6 @@ plain HTTP endpoint on port 8083 separately serving various endpoints, e.g.
 Obviously you should configure your reverse proxy to route the relevant
 endpoints to the worker (`localhost:8083` in the above example).
 
-
 ### Running Synapse with workers
 
 Finally, you need to start your worker processes. This can be done with either
@@ -288,7 +292,8 @@ For multiple workers not handling the SSO endpoints properly, see
 [#9427](https://github.com/matrix-org/synapse/issues/9427).
 
 Note that an [HTTP listener](usage/configuration/config_documentation.md#listeners)
-with `client` and `federation` `resources` must be configured in the `worker_listeners`
+with `client` and `federation` `resources` must be configured in the
+[`worker_listeners`](usage/configuration/config_documentation.md#worker_listeners)
 option in the worker config.
 
 #### Load balancing
@@ -331,9 +336,10 @@ of the main process to a particular worker.
 
 To enable this, the worker must have an
 [HTTP `replication` listener](usage/configuration/config_documentation.md#listeners) configured,
-have a `worker_name` and be listed in the `instance_map` config. The same worker
-can handle multiple streams, but unless otherwise documented, each stream can only
-have a single writer.
+have a [`worker_name`](usage/configuration/config_documentation.md#worker_name)
+and be listed in the [`instance_map`](usage/configuration/config_documentation.md#instance_map)
+config. The same worker can handle multiple streams, but unless otherwise documented,
+each stream can only have a single writer.
 
 For example, to move event persistence off to a dedicated worker, the shared
 configuration would include:
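+
+```yaml
+# a sketch, assuming a single event persister named `event_persister1`; the
+# host and port must match that worker's HTTP replication listener
+instance_map:
+    event_persister1:
+        host: localhost
+        port: 8034
+
+stream_writers:
+    events: event_persister1
+```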
@@ -360,9 +366,26 @@ streams and the endpoints associated with them:
 
 ##### The `events` stream
 
-The `events` stream experimentally supports having multiple writers, where work
-is sharded between them by room ID. Note that you *must* restart all worker
-instances when adding or removing event persisters. An example `stream_writers`
+The `events` stream experimentally supports having multiple writer workers, where load
+is sharded between them by room ID. Each writer is called an _event persister_. They are
+responsible for
+* receiving new events,
+* linking them to those already in the room [DAG](development/room-dag-concepts.md),
+* persisting them to the DB, and finally
+* updating the events stream.
+
+Because load is sharded in this way, you *must* restart all worker instances when
+adding or removing event persisters.
+
+An `event_persister` should not be mistaken for an `event_creator`.
+An `event_creator` listens for requests from clients to create new events, and
+creates them. It then passes those events over HTTP replication to any
+configured event persisters (or the main process if none are configured).
+
+Note that `event_creator`s and `event_persister`s are implemented using the same
+[`synapse.app.generic_worker`](#synapseappgeneric_worker).
+
+An example [`stream_writers`](usage/configuration/config_documentation.md#stream_writers)
 configuration with multiple writers:
 
 ```yaml
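+# a sketch: two event persisters sharing the `events` stream, sharded by room ID
+# (the worker names are placeholders and must also appear in `instance_map`)
+stream_writers:
+    events:
+        - event_persister1
+        - event_persister2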
@@ -416,16 +439,18 @@ worker. Background tasks are run periodically or started via replication. Exactl
 which tasks are configured to run depends on your Synapse configuration (e.g. if
 stats is enabled). This worker doesn't handle any REST endpoints itself.
 
-To enable this, the worker must have a `worker_name` and can be configured to run
-background tasks. For example, to move background tasks to a dedicated worker,
-the shared configuration would include:
+To enable this, the worker must have a unique
+[`worker_name`](usage/configuration/config_documentation.md#worker_name);
+it can then be configured to run background tasks. For example, to move background tasks
+to a dedicated worker, the shared configuration would include:
 
 ```yaml
 run_background_tasks_on: background_worker
 ```
 
-You might also wish to investigate the `update_user_directory_from_worker` and
-`media_instance_running_background_jobs` settings.
+You might also wish to investigate the
+[`update_user_directory_from_worker`](#updating-the-user-directory) and
+[`media_instance_running_background_jobs`](#synapseappmedia_repository) settings.
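+
+For instance, a sketch that hands those duties to dedicated workers (the worker
+names are illustrative placeholders):
+
+```yaml
+update_user_directory_from_worker: user_dir_worker
+media_instance_running_background_jobs: media_worker
+```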
 
 An example for a dedicated background worker instance:
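+
+```yaml
+# a sketch; the worker name is a placeholder and the replication details
+# follow the pattern shown under "Worker configuration" above
+worker_app: synapse.app.generic_worker
+worker_name: background_worker
+
+worker_replication_host: 127.0.0.1
+worker_replication_http_port: 9093
+```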
 
@@ -478,13 +503,17 @@ worker application type.
 ### `synapse.app.pusher`
 
 Handles sending push notifications to Sygnal and email. Doesn't handle any
-REST endpoints itself, but you should set `start_pushers: False` in the
+REST endpoints itself, but you should set
+[`start_pushers: false`](usage/configuration/config_documentation.md#start_pushers) in the
 shared configuration file to stop the main synapse sending push notifications.
 
-To run multiple instances at once the `pusher_instances` option should list all
-pusher instances by their worker name, e.g.:
+To run multiple instances at once the
+[`pusher_instances`](usage/configuration/config_documentation.md#pusher_instances)
+option should list all pusher instances by their
+[`worker_name`](usage/configuration/config_documentation.md#worker_name), e.g.:
 
 ```yaml
+start_pushers: false
 pusher_instances:
     - pusher_worker1
     - pusher_worker2
@@ -512,15 +541,20 @@ Note this worker cannot be load-balanced: only one instance should be active.
 ### `synapse.app.federation_sender`
 
 Handles sending federation traffic to other servers. Doesn't handle any
-REST endpoints itself, but you should set `send_federation: False` in the
-shared configuration file to stop the main synapse sending this traffic.
+REST endpoints itself, but you should set
+[`send_federation: false`](usage/configuration/config_documentation.md#send_federation)
+in the shared configuration file to stop the main synapse sending this traffic.
 
 If running multiple federation senders, you must list each
-instance in the `federation_sender_instances` option by their `worker_name`.
+instance in the
+[`federation_sender_instances`](usage/configuration/config_documentation.md#federation_sender_instances)
+option by their
+[`worker_name`](usage/configuration/config_documentation.md#worker_name).
 All instances must be stopped and started when adding or removing instances.
 For example:
 
 ```yaml
+send_federation: false
 federation_sender_instances:
     - federation_sender1
     - federation_sender2
@@ -547,7 +581,9 @@ Handles the media repository. It can handle all endpoints starting with:
     ^/_synapse/admin/v1/quarantine_media/.*$
     ^/_synapse/admin/v1/users/.*/media$
 
-You should also set `enable_media_repo: False` in the shared configuration
+You should also set
+[`enable_media_repo: false`](usage/configuration/config_documentation.md#enable_media_repo)
+in the shared configuration
 file to stop the main synapse running background jobs related to managing the
 media repository. Note that doing so will prevent the main process from being
 able to handle the above endpoints.
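+
+For instance, a sketch of the relevant shared configuration (the worker name is
+an illustrative placeholder):
+
+```yaml
+enable_media_repo: false
+media_instance_running_background_jobs: media_worker
+```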