3 files changed, 289 insertions, 81 deletions
diff --git a/docs/sample_log_config.yaml b/docs/sample_log_config.yaml
index 3065a0e2d9..6339160d00 100644
--- a/docs/sample_log_config.yaml
+++ b/docs/sample_log_config.yaml
@@ -6,7 +6,7 @@
 # Synapse also supports structured logging for machine readable logs which can
 # be ingested by ELK stacks. See [2] for details.
 #
-# [1]: https://docs.python.org/3.7/library/logging.config.html#configuration-dictionary-schema
+# [1]: https://docs.python.org/3/library/logging.config.html#configuration-dictionary-schema
 # [2]: https://matrix-org.github.io/synapse/latest/structured_logging.html
 
 version: 1
diff --git a/docs/usage/configuration/config_documentation.md b/docs/usage/configuration/config_documentation.md
index d81eda52c1..fb5eb42c52 100644
--- a/docs/usage/configuration/config_documentation.md
+++ b/docs/usage/configuration/config_documentation.md
@@ -99,7 +99,7 @@ modules:
     config: {}
 ```
 ---
-## Server ##
+## Server
 
 Define your homeserver name and other base options.
 
@@ -159,7 +159,7 @@ including _matrix/...). This is the same URL a user might enter into the
 'Custom Homeserver URL' field on their client. If you use Synapse with a
 reverse proxy, this should be the URL to reach Synapse via the proxy.
 Otherwise, it should be the URL to reach Synapse's client HTTP listener (see
-'listeners' below).
+['listeners'](#listeners) below).
 
 Defaults to `https://<server_name>/`.
 
@@ -570,7 +570,7 @@ Example configuration:
 delete_stale_devices_after: 1y
 ```
 
-## Homeserver blocking ##
+## Homeserver blocking
 Useful options for Synapse admins.
 
 ---
@@ -922,7 +922,7 @@ retention:
       interval: 1d
 ```
 ---
-## TLS ##
+## TLS
 
 Options related to TLS.
 
@@ -1012,7 +1012,7 @@ federation_custom_ca_list:
   - myCA3.pem
 ```
 ---
-## Federation ##
+## Federation
 
 Options related to federation.
 
@@ -1071,7 +1071,7 @@ Example configuration:
 allow_device_name_lookup_over_federation: true
 ```
 ---
-## Caching ##
+## Caching
 
 Options related to caching.
 
@@ -1185,7 +1185,7 @@ file in Synapse's `contrib` directory, you can send a `SIGHUP` signal by using
 `systemctl reload matrix-synapse`.
 
 ---
-## Database ##
+## Database
 Config options related to database settings.
 
 ---
@@ -1332,20 +1332,21 @@ databases:
       cp_max: 10
 ```
 ---
-## Logging ##
+## Logging
 Config options related to logging.
 
 ---
 ### `log_config`
 
-This option specifies a yaml python logging config file as described [here](https://docs.python.org/3.7/library/logging.config.html#configuration-dictionary-schema).
+This option specifies a yaml python logging config file as described
+[here](https://docs.python.org/3/library/logging.config.html#configuration-dictionary-schema).
 
 Example configuration:
 ```yaml
 log_config: "CONFDIR/SERVERNAME.log.config"
 ```
 ---
-## Ratelimiting ##
+## Ratelimiting
 Options related to ratelimiting in Synapse.
 
 Each ratelimiting configuration is made of two parameters:
@@ -1576,7 +1577,7 @@ Example configuration:
 federation_rr_transactions_per_room_per_second: 40
 ```
 ---
-## Media Store ##
+## Media Store
 Config options related to Synapse's media store.
 
 ---
@@ -1766,7 +1767,7 @@ url_preview_ip_range_blacklist:
   - 'ff00::/8'
   - 'fec0::/10'
 ```
-----
+---
 ### `url_preview_ip_range_whitelist`
 
 This option sets a list of IP address CIDR ranges that the URL preview spider is allowed
@@ -1860,7 +1861,7 @@ Example configuration:
    - 'fr;q=0.8'
    - '*;q=0.7'
 ```
-----
+---
 ### `oembed`
 
 oEmbed allows for easier embedding content from a website. It can be
@@ -1877,7 +1878,7 @@ oembed:
     - oembed/my_providers.json
 ```
 ---
-## Captcha ##
+## Captcha
 
 See [here](../../CAPTCHA_SETUP.md) for full details on setting up captcha.
 
@@ -1926,7 +1927,7 @@ Example configuration:
 recaptcha_siteverify_api: "https://my.recaptcha.site"
 ```
 ---
-## TURN ##
+## TURN
 Options related to adding a TURN server to Synapse.
 
 ---
@@ -1947,7 +1948,7 @@ Example configuration:
 ```yaml
 turn_shared_secret: "YOUR_SHARED_SECRET"
 ```
-----
+---
 ### `turn_username` and `turn_password`
 
 The Username and password if the TURN server needs them and does not use a token.
@@ -2366,7 +2367,7 @@ Example configuration:
 ```yaml
 session_lifetime: 24h
 ```
-----
+---
 ### `refresh_access_token_lifetime`
 
 Time that an access token remains valid for, if the session is using refresh tokens.
@@ -2422,7 +2423,7 @@ nonrefreshable_access_token_lifetime: 24h
 ```
 
 ---
-## Metrics ###
+## Metrics
 Config options related to metrics.
 
 ---
@@ -2519,7 +2520,7 @@ Example configuration:
 report_stats_endpoint: https://example.com/report-usage-stats/push
 ```
 ---
-## API Configuration ##
+## API Configuration
 Config settings related to the client/server API
 
 ---
@@ -2619,7 +2620,7 @@ Example configuration:
 form_secret: <PRIVATE STRING>
 ```
 ---
-## Signing Keys ##
+## Signing Keys
 Config options relating to signing keys
 
 ---
@@ -2728,7 +2729,7 @@ Example configuration:
 key_server_signing_keys_path: "key_server_signing_keys.key"
 ```
 ---
-## Single sign-on integration ##
+## Single sign-on integration
 
 The following settings can be used to make Synapse use a single sign-on
 provider for authentication, instead of its internal password database.
@@ -3348,7 +3349,7 @@ email:
     email_validation: "[%(server_name)s] Validate your email"
 ```
 ---
-## Push ##
+## Push
 Configuration settings related to push notifications
 
 ---
@@ -3381,7 +3382,7 @@ push:
   group_unread_count_by_room: false
 ```
 ---
-## Rooms ##
+## Rooms
 Config options relating to rooms.
 
 ---
@@ -3627,7 +3628,7 @@ default_power_level_content_override:
 ```
 
 ---
-## Opentracing ##
+## Opentracing
 Configuration options related to Opentracing support.
 
 ---
@@ -3670,14 +3671,71 @@ opentracing:
         false
 ```
 ---
-## Workers ##
-Configuration options related to workers.
+## Coordinating workers
+Configuration options related to workers which belong in the main config file
+(usually called `homeserver.yaml`).
+A Synapse deployment can scale horizontally by running multiple Synapse processes
+called _workers_. Incoming requests are distributed between workers to handle higher
+loads. Some workers are privileged and can accept requests from other workers.
+
+As a result, the worker configuration is divided into two parts.
+
+1. The first part (in this section of the manual) defines which shardable tasks
+   are delegated to privileged workers. This allows unprivileged workers to
+   request that a privileged worker act on their behalf.
+1. [The second part](#individual-worker-configuration)
+   controls the behaviour of individual workers in isolation.
+
+For guidance on setting up workers, see the [worker documentation](../../workers.md).
+
+---
+### `worker_replication_secret`
+
+A shared secret used by the replication APIs on the main process to authenticate
+HTTP requests from workers.
+
+By default, this value is omitted (equivalently `null`), which means that
+traffic between the workers and the main process is not authenticated.
+
+Example configuration:
+```yaml
+worker_replication_secret: "secret_secret"
+```
+---
+### `start_pushers`
+
+Controls sending of push notifications on the main process. Set to `false`
+if using a [pusher worker](../../workers.md#synapseapppusher). Defaults to `true`.
+
+Example configuration:
+```yaml
+start_pushers: false
+```
+---
+### `pusher_instances`
+
+It is possible to run multiple [pusher workers](../../workers.md#synapseapppusher),
+in which case the work is balanced across them. Use this setting to list the pushers by
+[`worker_name`](#worker_name). Ensure the main process and all pusher workers are
+restarted after changing this option.
 
+If no or only one pusher worker is configured, this setting is not necessary.
+The main process will send out push notifications by default if you do not disable
+it by setting [`start_pushers: false`](#start_pushers).
+
+Example configuration:
+```yaml
+start_pushers: false
+pusher_instances:
+  - pusher_worker1
+  - pusher_worker2
+```
 ---
 ### `send_federation`
 
 Controls sending of outbound federation transactions on the main process.
-Set to false if using a federation sender worker. Defaults to true.
+Set to `false` if using a [federation sender worker](../../workers.md#synapseappfederation_sender).
+Defaults to `true`.
 
 Example configuration:
 ```yaml
@@ -3686,8 +3744,9 @@ send_federation: false
 ---
 ### `federation_sender_instances`
 
-It is possible to run multiple federation sender workers, in which case the
-work is balanced across them. Use this setting to list the senders.
+It is possible to run multiple
+[federation sender workers](../../workers.md#synapseappfederation_sender), in which
+case the work is balanced across them. Use this setting to list the senders.
 
 This configuration setting must be shared between all federation sender workers, and if
 changed all federation sender workers must be stopped at the same time and then
@@ -3696,14 +3755,19 @@ events may be dropped).
 
 Example configuration:
 ```yaml
+send_federation: false
 federation_sender_instances:
   - federation_sender1
 ```
 ---
 ### `instance_map`
 
-When using workers this should be a map from worker name to the
+When using workers this should be a map from [`worker_name`](#worker_name) to the
 HTTP replication listener of the worker, if configured.
+Each worker declared under [`stream_writers`](../../workers.md#stream-writers) needs
+an HTTP replication listener, and that listener should be included in the `instance_map`.
+(The main process also needs an HTTP replication listener, but it should not be 
+listed in the `instance_map`.)
 
 Example configuration:
 ```yaml
@@ -3716,8 +3780,11 @@ instance_map:
 ### `stream_writers`
 
 Experimental: When using workers you can define which workers should
-handle event persistence and typing notifications. Any worker
-specified here must also be in the `instance_map`.
+handle writing to streams such as event persistence and typing notifications.
+Any worker specified here must also be in the [`instance_map`](#instance_map).
+
+See the list of available streams in the
+[worker documentation](../../workers.md#stream-writers).
 
 Example configuration:
 ```yaml
@@ -3728,29 +3795,18 @@ stream_writers:
 ---
 ### `run_background_tasks_on`
 
-The worker that is used to run background tasks (e.g. cleaning up expired
-data). If not provided this defaults to the main process.
+The [worker](../../workers.md#background-tasks) that is used to run
+background tasks (e.g. cleaning up expired data). If not provided this
+defaults to the main process.
 
 Example configuration:
 ```yaml
 run_background_tasks_on: worker1
 ```
 ---
-### `worker_replication_secret`
-
-A shared secret used by the replication APIs to authenticate HTTP requests
-from workers.
-
-By default this is unused and traffic is not authenticated.
-
-Example configuration:
-```yaml
-worker_replication_secret: "secret_secret"
-```
 ### `redis`
 
-Configuration for Redis when using workers. This *must* be enabled when
-using workers (unless using old style direct TCP configuration).
+Configuration for Redis when using workers. This *must* be enabled when using workers.
 This setting has the following sub-options:
 * `enabled`: whether to use Redis support. Defaults to false.
 * `host` and `port`: Optional host and port to use to connect to redis. Defaults to
@@ -3765,7 +3821,123 @@ redis:
   port: 6379
   password: <secret_password>
 ```
-## Background Updates ##
+---
+## Individual worker configuration
+These options configure an individual worker, in its worker configuration file.
+They should not be provided when configuring the main process.
+
+Note also the configuration above for
+[coordinating a cluster of workers](#coordinating-workers).
+
+For guidance on setting up workers, see the [worker documentation](../../workers.md).
+
+---
+### `worker_app`
+
+The type of worker. The currently available worker applications are listed
+in [worker documentation](../../workers.md#available-worker-applications).
+
+The most common worker is the
+[`synapse.app.generic_worker`](../../workers.md#synapseappgeneric_worker).
+
+Example configuration:
+```yaml
+worker_app: synapse.app.generic_worker
+```
+---
+### `worker_name`
+
+A unique name for the worker. The name is used to refer to the worker in
+other configuration options and to identify it in log files. We strongly
+recommend giving each worker a unique `worker_name`.
+
+Example configuration:
+```yaml
+worker_name: generic_worker1
+```
+---
+### `worker_replication_host`
+
+The HTTP replication endpoint that it should talk to on the main Synapse process.
+The main Synapse process defines this with a `replication` resource in
+the [`listeners` option](#listeners).
+
+Example configuration:
+```yaml
+worker_replication_host: 127.0.0.1
+```
+---
+### `worker_replication_http_port`
+
+The HTTP replication port that it should talk to on the main Synapse process.
+The main Synapse process defines this with a `replication` resource in
+the [`listeners` option](#listeners).
+
+Example configuration:
+```yaml
+worker_replication_http_port: 9093
+```
+---
+### `worker_listeners`
+
+A worker can handle HTTP requests. To do so, a `worker_listeners` option 
+must be declared, in the same way as the [`listeners` option](#listeners) 
+in the shared config.
+
+Workers declared in [`stream_writers`](#stream_writers) will need to include a
+`replication` listener here, in order to accept internal HTTP requests from
+other workers.
+
+Example configuration:
+```yaml
+worker_listeners:
+  - type: http
+    port: 8083
+    resources:
+      - names: [client, federation]
+```
+---
+### `worker_daemonize`
+
+Specifies whether the worker should be started as a daemon process.
+If Synapse is being managed by [systemd](../../systemd-with-workers/README.md), this option 
+must be omitted or set to `false`.
+
+Defaults to `false`.
+
+Example configuration:
+```yaml
+worker_daemonize: true
+```
+---
+### `worker_pid_file`
+
+When running a worker as a daemon, we need a place to store the 
+[PID](https://en.wikipedia.org/wiki/Process_identifier) of the worker.
+This option defines the location of that "pid file".
+
+This option is required if `worker_daemonize` is `true` and ignored 
+otherwise. It has no default.
+
+See also the [`pid_file` option](#pid_file) for the main Synapse process.
+
+Example configuration:
+```yaml
+worker_pid_file: DATADIR/generic_worker1.pid
+```
+---
+### `worker_log_config`
+
+This option specifies a yaml python logging config file as described
+[here](https://docs.python.org/3/library/logging.config.html#configuration-dictionary-schema).
+See also the [`log_config` option](#log_config) for the main Synapse process.
+
+Example configuration:
+```yaml
+worker_log_config: /etc/matrix-synapse/generic-worker-log.yaml
+```
+---
+## Background Updates
 Configuration settings related to background updates.
 
 ---
diff --git a/docs/workers.md b/docs/workers.md
index c27b3f8bd5..5e1b9ba220 100644
--- a/docs/workers.md
+++ b/docs/workers.md
@@ -88,10 +88,12 @@ shared configuration file.
 ### Shared configuration
 
 Normally, only a couple of changes are needed to make an existing configuration
-file suitable for use with workers. First, you need to enable an "HTTP replication
-listener" for the main process; and secondly, you need to enable redis-based
-replication. Optionally, a shared secret can be used to authenticate HTTP
-traffic between workers. For example:
+file suitable for use with workers. First, you need to enable an
+["HTTP replication listener"](usage/configuration/config_documentation.md#listeners)
+for the main process; and secondly, you need to enable
+[redis-based replication](usage/configuration/config_documentation.md#redis).
+Optionally, a [shared secret](usage/configuration/config_documentation.md#worker_replication_secret)
+can be used to authenticate HTTP traffic between workers. For example:
 
 ```yaml
 # extend the existing `listeners` section. This defines the ports that the
@@ -111,25 +113,28 @@ redis:
     enabled: true
 ```
 
-See the [configuration manual](usage/configuration/config_documentation.html) for the full documentation of each option.
+See the [configuration manual](usage/configuration/config_documentation.md)
+for the full documentation of each option.
 
 Under **no circumstances** should the replication listener be exposed to the
 public internet; replication traffic is:
 
 * always unencrypted
-* unauthenticated, unless `worker_replication_secret` is configured
+* unauthenticated, unless [`worker_replication_secret`](usage/configuration/config_documentation.md#worker_replication_secret)
+  is configured
 
 
 ### Worker configuration
 
 In the config file for each worker, you must specify:
- * The type of worker (`worker_app`). The currently available worker applications are listed below.
- * A unique name for the worker (`worker_name`).
+ * The type of worker ([`worker_app`](usage/configuration/config_documentation.md#worker_app)).
+   The currently available worker applications are listed [below](#available-worker-applications).
+ * A unique name for the worker ([`worker_name`](usage/configuration/config_documentation.md#worker_name)).
  * The HTTP replication endpoint that it should talk to on the main synapse process
-   (`worker_replication_host` and `worker_replication_http_port`)
- * If handling HTTP requests, a `worker_listeners` option with an `http`
-   listener, in the same way as the [`listeners`](usage/configuration/config_documentation.md#listeners)
-   option in the shared config.
+   ([`worker_replication_host`](usage/configuration/config_documentation.md#worker_replication_host) and
+   [`worker_replication_http_port`](usage/configuration/config_documentation.md#worker_replication_http_port)).
+ * If handling HTTP requests, a [`worker_listeners`](usage/configuration/config_documentation.md#worker_listeners) option
+   with an `http` listener.
  * If handling the `^/_matrix/client/v3/keys/upload` endpoint, the HTTP URI for
    the main process (`worker_main_http_uri`).
 
@@ -146,7 +151,6 @@ plain HTTP endpoint on port 8083 separately serving various endpoints, e.g.
 Obviously you should configure your reverse-proxy to route the relevant
 endpoints to the worker (`localhost:8083` in the above example).
 
-
 ### Running Synapse with workers
 
 Finally, you need to start your worker processes. This can be done with either
@@ -288,7 +292,8 @@ For multiple workers not handling the SSO endpoints properly, see
 [#9427](https://github.com/matrix-org/synapse/issues/9427).
 
 Note that a [HTTP listener](usage/configuration/config_documentation.md#listeners)
-with `client` and `federation` `resources` must be configured in the `worker_listeners`
+with `client` and `federation` `resources` must be configured in the
+[`worker_listeners`](usage/configuration/config_documentation.md#worker_listeners)
 option in the worker config.
 
 #### Load balancing
@@ -331,9 +336,10 @@ of the main process to a particular worker.
 
 To enable this, the worker must have a
 [HTTP `replication` listener](usage/configuration/config_documentation.md#listeners) configured,
-have a `worker_name` and be listed in the `instance_map` config. The same worker
-can handle multiple streams, but unless otherwise documented, each stream can only
-have a single writer.
+have a [`worker_name`](usage/configuration/config_documentation.md#worker_name)
+and be listed in the [`instance_map`](usage/configuration/config_documentation.md#instance_map)
+config. The same worker can handle multiple streams, but unless otherwise documented,
+each stream can only have a single writer.
 
 For example, to move event persistence off to a dedicated worker, the shared
 configuration would include:
@@ -360,9 +366,26 @@ streams and the endpoints associated with them:
 
 ##### The `events` stream
 
-The `events` stream experimentally supports having multiple writers, where work
-is sharded between them by room ID. Note that you *must* restart all worker
-instances when adding or removing event persisters. An example `stream_writers`
+The `events` stream experimentally supports having multiple writer workers, where load
+is sharded between them by room ID. Each writer is called an _event persister_. They are
+responsible for
+- receiving new events,
+- linking them to those already in the room [DAG](development/room-dag-concepts.md),
+- persisting them to the DB, and finally
+- updating the events stream.
+
+Because load is sharded in this way, you *must* restart all worker instances when 
+adding or removing event persisters.
+
+An `event_persister` should not be mistaken for an `event_creator`.
+An `event_creator` listens for requests from clients to create new events and does
+so. It will then pass those events over HTTP replication to any configured event
+persisters (or the main process if none are configured).
+
+Note that `event_creator`s and `event_persister`s are implemented using the same
+[`synapse.app.generic_worker`](#synapseappgeneric_worker).
+
+An example [`stream_writers`](usage/configuration/config_documentation.md#stream_writers)
 configuration with multiple writers:
 
 ```yaml
@@ -416,16 +439,18 @@ worker. Background tasks are run periodically or started via replication. Exactl
 which tasks are configured to run depends on your Synapse configuration (e.g. if
 stats is enabled). This worker doesn't handle any REST endpoints itself.
 
-To enable this, the worker must have a `worker_name` and can be configured to run
-background tasks. For example, to move background tasks to a dedicated worker,
-the shared configuration would include:
+To enable this, the worker must have a unique
+[`worker_name`](usage/configuration/config_documentation.md#worker_name)
+and can be configured to run background tasks. For example, to move background tasks
+to a dedicated worker, the shared configuration would include:
 
 ```yaml
 run_background_tasks_on: background_worker
 ```
 
-You might also wish to investigate the `update_user_directory_from_worker` and
-`media_instance_running_background_jobs` settings.
+You might also wish to investigate the
+[`update_user_directory_from_worker`](#updating-the-user-directory) and
+[`media_instance_running_background_jobs`](#synapseappmedia_repository) settings.
 
 An example for a dedicated background worker instance:
 
@@ -478,13 +503,17 @@ worker application type.
 ### `synapse.app.pusher`
 
 Handles sending push notifications to sygnal and email. Doesn't handle any
-REST endpoints itself, but you should set `start_pushers: False` in the
+REST endpoints itself, but you should set
+[`start_pushers: false`](usage/configuration/config_documentation.md#start_pushers) in the
 shared configuration file to stop the main synapse sending push notifications.
 
-To run multiple instances at once the `pusher_instances` option should list all
-pusher instances by their worker name, e.g.:
+To run multiple instances at once the
+[`pusher_instances`](usage/configuration/config_documentation.md#pusher_instances)
+option should list all pusher instances by their
+[`worker_name`](usage/configuration/config_documentation.md#worker_name), e.g.:
 
 ```yaml
+start_pushers: false
 pusher_instances:
     - pusher_worker1
     - pusher_worker2
@@ -512,15 +541,20 @@ Note this worker cannot be load-balanced: only one instance should be active.
 ### `synapse.app.federation_sender`
 
 Handles sending federation traffic to other servers. Doesn't handle any
-REST endpoints itself, but you should set `send_federation: False` in the
-shared configuration file to stop the main synapse sending this traffic.
+REST endpoints itself, but you should set
+[`send_federation: false`](usage/configuration/config_documentation.md#send_federation)
+in the shared configuration file to stop the main synapse sending this traffic.
 
 If running multiple federation senders then you must list each
-instance in the `federation_sender_instances` option by their `worker_name`.
+instance in the
+[`federation_sender_instances`](usage/configuration/config_documentation.md#federation_sender_instances)
+option by their
+[`worker_name`](usage/configuration/config_documentation.md#worker_name).
 All instances must be stopped and started when adding or removing instances.
 For example:
 
 ```yaml
+send_federation: false
 federation_sender_instances:
     - federation_sender1
     - federation_sender2
@@ -547,7 +581,9 @@ Handles the media repository. It can handle all endpoints starting with:
     ^/_synapse/admin/v1/quarantine_media/.*$
     ^/_synapse/admin/v1/users/.*/media$
 
-You should also set `enable_media_repo: False` in the shared configuration
+You should also set
+[`enable_media_repo: False`](usage/configuration/config_documentation.md#enable_media_repo)
+in the shared configuration
 file to stop the main synapse running background jobs related to managing the
 media repository. Note that doing so will prevent the main process from being
 able to handle the above endpoints.