9 files changed, 483 insertions, 337 deletions
diff --git a/docs/.sample_config_header.yaml b/docs/.sample_config_header.yaml
index 35a591d042..8c9b31acdb 100644
--- a/docs/.sample_config_header.yaml
+++ b/docs/.sample_config_header.yaml
@@ -10,5 +10,16 @@
 # homeserver.yaml. Instead, if you are starting from scratch, please generate
 # a fresh config using Synapse by following the instructions in INSTALL.md.
 
+# Configuration options that take a time period can be set using a number
+# followed by a letter. Letters have the following meanings:
+# s = second
+# m = minute
+# h = hour
+# d = day
+# w = week
+# y = year
+# For example, setting redaction_retention_period: 5m would remove redacted
+# messages from the database after 5 minutes, rather than 5 months.
+
 ################################################################################
 
diff --git a/docs/ACME.md b/docs/ACME.md
index f4c4740476..a7a498f575 100644
--- a/docs/ACME.md
+++ b/docs/ACME.md
@@ -12,13 +12,14 @@ introduced support for automatically provisioning certificates through
 In [March 2019](https://community.letsencrypt.org/t/end-of-life-plan-for-acmev1/88430),
 Let's Encrypt announced that they were deprecating version 1 of the ACME
 protocol, with the plan to disable the use of it for new accounts in
-November 2019, and for existing accounts in June 2020.
+November 2019, for new domains in June 2020, and for existing accounts and
+domains in June 2021.
 
 Synapse doesn't currently support version 2 of the ACME protocol, which
 means that:
 
 * for existing installs, Synapse's built-in ACME support will continue
-  to work until June 2020.
+  to work until June 2021.
 * for new installs, this feature will not work at all.
 
 Either way, it is recommended to move from Synapse's ACME support
diff --git a/docs/admin_api/rooms.md b/docs/admin_api/rooms.md
index 15b83e9824..0f267d2b7b 100644
--- a/docs/admin_api/rooms.md
+++ b/docs/admin_api/rooms.md
@@ -369,7 +369,9 @@ to the new room will have power level `-10` by default, and thus be unable to sp
 If `block` is `True` it prevents new joins to the old room.
 
 This API will remove all trace of the old room from your database after removing
-all local users.
+all local users. If `purge` is `true` (the default), all traces of the old room will
+be removed from your database after removing all local users. If you do not want
+this to happen, set `purge` to `false`.
 Depending on the amount of history being purged a call to the API may take
 several minutes or longer.
 
@@ -388,7 +390,8 @@ with a body of:
     "new_room_user_id": "@someuser:example.com",
     "room_name": "Content Violation Notification",
     "message": "Bad Room has been shutdown due to content violations on this server. Please review our Terms of Service.",
-    "block": true
+    "block": true,
+    "purge": true
 }
 ```
 
@@ -430,8 +433,10 @@ The following JSON body parameters are available:
               `new_room_user_id` in the new room. Ideally this will clearly convey why the
                original room was shut down. Defaults to `Sharing illegal content on this server
                is not permitted and rooms in violation will be blocked.`
-* `block` - Optional. If set to `true`, this room will be added to a blocking list, preventing future attempts to
-  join the room. Defaults to `false`.
+* `block` - Optional. If set to `true`, this room will be added to a blocking list, preventing
+            future attempts to join the room. Defaults to `false`.
+* `purge` - Optional. If set to `true`, it will remove all traces of the room from your database.
+            Defaults to `true`.
 
 The JSON body must not be empty. The body must be at least `{}`.
 
diff --git a/docs/metrics-howto.md b/docs/metrics-howto.md
index cf69938a2a..b386ec91c1 100644
--- a/docs/metrics-howto.md
+++ b/docs/metrics-howto.md
@@ -27,7 +27,7 @@
     different thread to Synapse. This can make it more resilient to
     heavy load meaning metrics cannot be retrieved, and can be exposed
     to just internal networks easier. The served metrics are available
-    over HTTP only, and will be available at `/`.
+    over HTTP only, and will be available at `/_synapse/metrics`.
 
     Add a new listener to homeserver.yaml:
 
diff --git a/docs/password_auth_providers.md b/docs/password_auth_providers.md
index 5d9ae67041..fef1d47e85 100644
--- a/docs/password_auth_providers.md
+++ b/docs/password_auth_providers.md
@@ -19,102 +19,103 @@ password auth provider module implementations:
 
 Password auth provider classes must provide the following methods:
 
-*class* `SomeProvider.parse_config`(*config*)
+* `parse_config(config)`
+  This method is passed the `config` object for this module from the
+  homeserver configuration file.
 
-> This method is passed the `config` object for this module from the
-> homeserver configuration file.
->
-> It should perform any appropriate sanity checks on the provided
-> configuration, and return an object which is then passed into
-> `__init__`.
+  It should perform any appropriate sanity checks on the provided
+  configuration, and return an object which is then passed into
 
-*class* `SomeProvider`(*config*, *account_handler*)
+  This method should have the `@staticmethod` decoration.
 
-> The constructor is passed the config object returned by
-> `parse_config`, and a `synapse.module_api.ModuleApi` object which
-> allows the password provider to check if accounts exist and/or create
-> new ones.
+* `__init__(self, config, account_handler)`
+
+  The constructor is passed the config object returned by
+  `parse_config`, and a `synapse.module_api.ModuleApi` object which
+  allows the password provider to check if accounts exist and/or create
+  new ones.
 
 ## Optional methods
 
-Password auth provider classes may optionally provide the following
-methods.
-
-*class* `SomeProvider.get_db_schema_files`()
-
-> This method, if implemented, should return an Iterable of
-> `(name, stream)` pairs of database schema files. Each file is applied
-> in turn at initialisation, and a record is then made in the database
-> so that it is not re-applied on the next start.
-
-`someprovider.get_supported_login_types`()
-
-> This method, if implemented, should return a `dict` mapping from a
-> login type identifier (such as `m.login.password`) to an iterable
-> giving the fields which must be provided by the user in the submission
-> to the `/login` api. These fields are passed in the `login_dict`
-> dictionary to `check_auth`.
->
-> For example, if a password auth provider wants to implement a custom
-> login type of `com.example.custom_login`, where the client is expected
-> to pass the fields `secret1` and `secret2`, the provider should
-> implement this method and return the following dict:
->
->     {"com.example.custom_login": ("secret1", "secret2")}
-
-`someprovider.check_auth`(*username*, *login_type*, *login_dict*)
-
-> This method is the one that does the real work. If implemented, it
-> will be called for each login attempt where the login type matches one
-> of the keys returned by `get_supported_login_types`.
->
-> It is passed the (possibly UNqualified) `user` provided by the client,
-> the login type, and a dictionary of login secrets passed by the
-> client.
->
-> The method should return a Twisted `Deferred` object, which resolves
-> to the canonical `@localpart:domain` user id if authentication is
-> successful, and `None` if not.
->
-> Alternatively, the `Deferred` can resolve to a `(str, func)` tuple, in
-> which case the second field is a callback which will be called with
-> the result from the `/login` call (including `access_token`,
-> `device_id`, etc.)
-
-`someprovider.check_3pid_auth`(*medium*, *address*, *password*)
-
-> This method, if implemented, is called when a user attempts to
-> register or log in with a third party identifier, such as email. It is
-> passed the medium (ex. "email"), an address (ex.
-> "<jdoe@example.com>") and the user's password.
->
-> The method should return a Twisted `Deferred` object, which resolves
-> to a `str` containing the user's (canonical) User ID if
-> authentication was successful, and `None` if not.
->
-> As with `check_auth`, the `Deferred` may alternatively resolve to a
-> `(user_id, callback)` tuple.
-
-`someprovider.check_password`(*user_id*, *password*)
-
-> This method provides a simpler interface than
-> `get_supported_login_types` and `check_auth` for password auth
-> providers that just want to provide a mechanism for validating
-> `m.login.password` logins.
->
-> Iif implemented, it will be called to check logins with an
-> `m.login.password` login type. It is passed a qualified
-> `@localpart:domain` user id, and the password provided by the user.
->
-> The method should return a Twisted `Deferred` object, which resolves
-> to `True` if authentication is successful, and `False` if not.
-
-`someprovider.on_logged_out`(*user_id*, *device_id*, *access_token*)
-
-> This method, if implemented, is called when a user logs out. It is
-> passed the qualified user ID, the ID of the deactivated device (if
-> any: access tokens are occasionally created without an associated
-> device ID), and the (now deactivated) access token.
->
-> It may return a Twisted `Deferred` object; the logout request will
-> wait for the deferred to complete but the result is ignored.
+Password auth provider classes may optionally provide the following methods:
+
+* `get_db_schema_files(self)`
+
+  This method, if implemented, should return an Iterable of
+  `(name, stream)` pairs of database schema files. Each file is applied
+  in turn at initialisation, and a record is then made in the database
+  so that it is not re-applied on the next start.
+
+* `get_supported_login_types(self)`
+
+  This method, if implemented, should return a `dict` mapping from a
+  login type identifier (such as `m.login.password`) to an iterable
+  giving the fields which must be provided by the user in the submission
+  to [the `/login` API](https://matrix.org/docs/spec/client_server/latest#post-matrix-client-r0-login).
+  These fields are passed in the `login_dict` dictionary to `check_auth`.
+
+  For example, if a password auth provider wants to implement a custom
+  login type of `com.example.custom_login`, where the client is expected
+  to pass the fields `secret1` and `secret2`, the provider should
+  implement this method and return the following dict:
+
+  ```python
+  {"com.example.custom_login": ("secret1", "secret2")}
+  ```
+
+* `check_auth(self, username, login_type, login_dict)`
+
+  This method does the real work. If implemented, it
+  will be called for each login attempt where the login type matches one
+  of the keys returned by `get_supported_login_types`.
+
+  It is passed the (possibly unqualified) `user` field provided by the client,
+  the login type, and a dictionary of login secrets passed by the
+  client.
+
+  The method should return an `Awaitable` object, which resolves
+  to the canonical `@localpart:domain` user ID if authentication is
+  successful, and `None` if not.
+
+  Alternatively, the `Awaitable` can resolve to a `(str, func)` tuple, in
+  which case the second field is a callback which will be called with
+  the result from the `/login` call (including `access_token`,
+  `device_id`, etc.)
+
+* `check_3pid_auth(self, medium, address, password)`
+
+  This method, if implemented, is called when a user attempts to
+  register or log in with a third party identifier, such as email. It is
+  passed the medium (ex. "email"), an address (ex.
+  "<jdoe@example.com>") and the user's password.
+
+  The method should return an `Awaitable` object, which resolves
+  to a `str` containing the user's (canonical) User id if
+  authentication was successful, and `None` if not.
+
+  As with `check_auth`, the `Awaitable` may alternatively resolve to a
+  `(user_id, callback)` tuple.
+
+* `check_password(self, user_id, password)`
+
+  This method provides a simpler interface than
+  `get_supported_login_types` and `check_auth` for password auth
+  providers that just want to provide a mechanism for validating
+  `m.login.password` logins.
+
+  If implemented, it will be called to check logins with an
+  `m.login.password` login type. It is passed a qualified
+  `@localpart:domain` user id, and the password provided by the user.
+
+  The method should return an `Awaitable` object, which resolves
+  to `True` if authentication is successful, and `False` if not.
+
+* `on_logged_out(self, user_id, device_id, access_token)`
+
+  This method, if implemented, is called when a user logs out. It is
+  passed the qualified user ID, the ID of the deactivated device (if
+  any: access tokens are occasionally created without an associated
+  device ID), and the (now deactivated) access token.
+
+  It may return an `Awaitable` object; the logout request will
+  wait for the `Awaitable` to complete, but the result is ignored.
diff --git a/docs/postgres.md b/docs/postgres.md
index 70fe29cdcc..e71a1975d8 100644
--- a/docs/postgres.md
+++ b/docs/postgres.md
@@ -188,6 +188,9 @@ to do step 2.
 
 It is safe to at any time kill the port script and restart it.
 
+Note that the database may take up significantly more (25% - 100% more)
+space on disk after porting to Postgres.
+
 ### Using the port script
 
 Firstly, shut down the currently running synapse server and copy its
diff --git a/docs/sample_config.yaml b/docs/sample_config.yaml
index 3227294e0b..341bd2f858 100644
--- a/docs/sample_config.yaml
+++ b/docs/sample_config.yaml
@@ -10,6 +10,17 @@
 # homeserver.yaml. Instead, if you are starting from scratch, please generate
 # a fresh config using Synapse by following the instructions in INSTALL.md.
 
+# Configuration options that take a time period can be set using a number
+# followed by a letter. Letters have the following meanings:
+# s = second
+# m = minute
+# h = hour
+# d = day
+# w = week
+# y = year
+# For example, setting redaction_retention_period: 5m would remove redacted
+# messages from the database after 5 minutes, rather than 5 months.
+
 ################################################################################
 
 # Configuration file for Synapse.
@@ -314,6 +325,10 @@ limit_remote_rooms:
   #
   #complexity_error: "This room is too complex."
 
+  # allow server admins to join complex rooms. Default is false.
+  #
+  #admins_can_join: true
+
 # Whether to require a user to be in the room to add an alias to it.
 # Defaults to 'true'.
 #
@@ -1145,24 +1160,6 @@ account_validity:
 #
 #default_identity_server: https://matrix.org
 
-# The list of identity servers trusted to verify third party
-# identifiers by this server.
-#
-# Also defines the ID server which will be called when an account is
-# deactivated (one will be picked arbitrarily).
-#
-# Note: This option is deprecated. Since v0.99.4, Synapse has tracked which identity
-# server a 3PID has been bound to. For 3PIDs bound before then, Synapse runs a
-# background migration script, informing itself that the identity server all of its
-# 3PIDs have been bound to is likely one of the below.
-#
-# As of Synapse v1.4.0, all other functionality of this option has been deprecated, and
-# it is now solely used for the purposes of the background migration script, and can be
-# removed once it has run.
-#trusted_third_party_id_servers:
-#  - matrix.org
-#  - vector.im
-
 # Handle threepid (email/phone etc) registration and password resets through a set of
 # *trusted* identity servers. Note that this allows the configured identity server to
 # reset passwords for accounts!
@@ -2398,3 +2395,57 @@ opentracing:
     #
     #  logging:
     #    false
+
+
+## Workers ##
+
+# Disables sending of outbound federation transactions on the main process.
+# Uncomment if using a federation sender worker.
+#
+#send_federation: false
+
+# It is possible to run multiple federation sender workers, in which case the
+# work is balanced across them.
+#
+# This configuration must be shared between all federation sender workers, and if
+# changed all federation sender workers must be stopped at the same time and then
+# started, to ensure that all instances are running with the same config (otherwise
+# events may be dropped).
+#
+#federation_sender_instances:
+#  - federation_sender1
+
+# When using workers this should be a map from `worker_name` to the
+# HTTP replication listener of the worker, if configured.
+#
+#instance_map:
+#  worker1:
+#    host: localhost
+#    port: 8034
+
+# Experimental: When using workers you can define which workers should
+# handle event persistence and typing notifications. Any worker
+# specified here must also be in the `instance_map`.
+#
+#stream_writers:
+#  events: worker1
+#  typing: worker1
+
+
+# Configuration for Redis when using workers. This *must* be enabled when
+# using workers (unless using old style direct TCP configuration).
+#
+redis:
+  # Uncomment the below to enable Redis support.
+  #
+  #enabled: true
+
+  # Optional host and port to use to connect to redis. Defaults to
+  # localhost and 6379
+  #
+  #host: localhost
+  #port: 6379
+
+  # Optional password if configured on the Redis instance
+  #
+  #password: <secret_password>
diff --git a/docs/synctl_workers.md b/docs/synctl_workers.md
new file mode 100644
index 0000000000..8da4a31852
--- /dev/null
+++ b/docs/synctl_workers.md
@@ -0,0 +1,32 @@
+### Using synctl with workers
+
+If you want to use `synctl` to manage your synapse processes, you will need to
+create an an additional configuration file for the main synapse process. That
+configuration should look like this:
+
+```yaml
+worker_app: synapse.app.homeserver
+```
+
+Additionally, each worker app must be configured with the name of a "pid file",
+to which it will write its process ID when it starts. For example, for a
+synchrotron, you might write:
+
+```yaml
+worker_pid_file: /home/matrix/synapse/worker1.pid
+```
+
+Finally, to actually run your worker-based synapse, you must pass synctl the `-a`
+commandline option to tell it to operate on all the worker configurations found
+in the given directory, e.g.:
+
+    synctl -a $CONFIG/workers start
+
+Currently one should always restart all workers when restarting or upgrading
+synapse, unless you explicitly know it's safe not to.  For instance, restarting
+synapse without restarting all the synchrotrons may result in broken typing
+notifications.
+
+To manipulate a specific worker, you pass the -w option to synctl:
+
+    synctl -w $CONFIG/workers/worker1.yaml restart
diff --git a/docs/workers.md b/docs/workers.md
index f4cbbc0400..80b65a0cec 100644
--- a/docs/workers.md
+++ b/docs/workers.md
@@ -1,10 +1,10 @@
 # Scaling synapse via workers
 
-For small instances it recommended to run Synapse in monolith mode (the
-default). For larger instances where performance is a concern it can be helpful
-to split out functionality into multiple separate python processes. These
-processes are called 'workers', and are (eventually) intended to scale
-horizontally independently.
+For small instances it recommended to run Synapse in the default monolith mode.
+For larger instances where performance is a concern it can be helpful to split
+out functionality into multiple separate python processes. These processes are
+called 'workers', and are (eventually) intended to scale horizontally
+independently.
 
 Synapse's worker support is under active development and subject to change as
 we attempt to rapidly scale ever larger Synapse instances. However we are
@@ -16,69 +16,115 @@ workers only work with PostgreSQL-based Synapse deployments. SQLite should only
 be used for demo purposes and any admin considering workers should already be
 running PostgreSQL.
 
-## Master/worker communication
+## Main process/worker communication
+
+The processes communicate with each other via a Synapse-specific protocol called
+'replication' (analogous to MySQL- or Postgres-style database replication) which
+feeds streams of newly written data between processes so they can be kept in
+sync with the database state.
+
+When configured to do so, Synapse uses a 
+[Redis pub/sub channel](https://redis.io/topics/pubsub) to send the replication
+stream between all configured Synapse processes. Additionally, processes may
+make HTTP requests to each other, primarily for operations which need to wait
+for a reply ─ such as sending an event.
+
+Redis support was added in v1.13.0 with it becoming the recommended method in
+v1.18.0. It replaced the old direct TCP connections (which is deprecated as of
+v1.18.0) to the main process. With Redis, rather than all the workers connecting
+to the main process, all the workers and the main process connect to Redis,
+which relays replication commands between processes. This can give a significant
+cpu saving on the main process and will be a prerequisite for upcoming
+performance improvements.
+
+See the [Architectural diagram](#architectural-diagram) section at the end for
+a visualisation of what this looks like.
+
 
-The workers communicate with the master process via a Synapse-specific protocol
-called 'replication' (analogous to MySQL- or Postgres-style database
-replication) which feeds a stream of relevant data from the master to the
-workers so they can be kept in sync with the master process and database state.
+## Setting up workers
 
-Additionally, workers may make HTTP requests to the master, to send information
-in the other direction. Typically this is used for operations which need to
-wait for a reply - such as sending an event.
+A Redis server is required to manage the communication between the processes.
+The Redis server should be installed following the normal procedure for your
+distribution (e.g. `apt install redis-server` on Debian). It is safe to use an
+existing Redis deployment if you have one.
 
-## Configuration
+Once installed, check that Redis is running and accessible from the host running
+Synapse, for example by executing `echo PING | nc -q1 localhost 6379` and seeing
+a response of `+PONG`.
+
+The appropriate dependencies must also be installed for Synapse. If using a
+virtualenv, these can be installed with:
+
+```sh
+pip install matrix-synapse[redis]
+```
+
+Note that these dependencies are included when synapse is installed with `pip
+install matrix-synapse[all]`. They are also included in the debian packages from
+`matrix.org` and in the docker images at
+https://hub.docker.com/r/matrixdotorg/synapse/.
 
 To make effective use of the workers, you will need to configure an HTTP
 reverse-proxy such as nginx or haproxy, which will direct incoming requests to
-the correct worker, or to the main synapse instance. Note that this includes
-requests made to the federation port. See [reverse_proxy.md](reverse_proxy.md)
-for information on setting up a reverse proxy.
+the correct worker, or to the main synapse instance. See 
+[reverse_proxy.md](reverse_proxy.md) for information on setting up a reverse
+proxy.
+
+To enable workers you should create a configuration file for each worker
+process. Each worker configuration file inherits the configuration of the shared
+homeserver configuration file.  You can then override configuration specific to
+that worker, e.g. the HTTP listener that it provides (if any); logging
+configuration; etc.  You should minimise the number of overrides though to
+maintain a usable config.
+
+
+### Shared Configuration
 
-To enable workers, you need to add *two* replication listeners to the
-main Synapse configuration file (`homeserver.yaml`). For example:
+Next you need to add both a HTTP replication listener, used for HTTP requests
+between processes, and redis config to the shared Synapse configuration file
+(`homeserver.yaml`). For example:
 
 ```yaml
+# extend the existing `listeners` section. This defines the ports that the
+# main process will listen on.
 listeners:
-  # The TCP replication port
-  - port: 9092
-    bind_address: '127.0.0.1'
-    type: replication
-
   # The HTTP replication port
   - port: 9093
     bind_address: '127.0.0.1'
     type: http
     resources:
      - names: [replication]
+
+redis:
+    enabled: true
 ```
 
-Under **no circumstances** should these replication API listeners be exposed to
-the public internet; they have no authentication and are unencrypted.
+See the sample config for the full documentation of each option.
 
-You should then create a set of configs for the various worker processes.  Each
-worker configuration file inherits the configuration of the main homeserver
-configuration file.  You can then override configuration specific to that
-worker, e.g. the HTTP listener that it provides (if any); logging
-configuration; etc.  You should minimise the number of overrides though to
-maintain a usable config.
+Under **no circumstances** should the replication listener be exposed to the
+public internet; it has no authentication and is unencrypted.
+
+
+### Worker Configuration
 
 In the config file for each worker, you must specify the type of worker
-application (`worker_app`). The currently available worker applications are
-listed below. You must also specify the replication endpoints that it should
-talk to on the main synapse process.  `worker_replication_host` should specify
-the host of the main synapse, `worker_replication_port` should point to the TCP
-replication listener port and `worker_replication_http_port` should point to
-the HTTP replication port.
+application (`worker_app`), and you should specify a unqiue name for the worker
+(`worker_name`). The currently available worker applications are listed below.
+You must also specify the HTTP replication endpoint that it should talk to on
+the main synapse process.  `worker_replication_host` should specify the host of
+the main synapse and `worker_replication_http_port` should point to the HTTP
+replication port. If the worker will handle HTTP requests then the
+`worker_listeners` option should be set with a `http` listener, in the same way
+as the `listeners` option in the shared config.
 
 For example:
 
 ```yaml
-worker_app: synapse.app.synchrotron
+worker_app: synapse.app.generic_worker
+worker_name: worker1
 
-# The replication listener on the synapse to talk to.
+# The replication listener on the main synapse process.
 worker_replication_host: 127.0.0.1
-worker_replication_port: 9092
 worker_replication_http_port: 9093
 
 worker_listeners:
@@ -87,13 +133,14 @@ worker_listeners:
    resources:
      - names:
        - client
+       - federation
 
-worker_log_config: /home/matrix/synapse/config/synchrotron_log_config.yaml
+worker_log_config: /home/matrix/synapse/config/worker1_log_config.yaml
 ```
 
-...is a full configuration for a synchrotron worker instance, which will expose a
-plain HTTP `/sync` endpoint on port 8083 separately from the `/sync` endpoint provided
-by the main synapse.
+...is a full configuration for a generic worker instance, which will expose a
+plain HTTP endpoint on port 8083 separately serving various endpoints, e.g.
+`/sync`, which are listed below.
 
 Obviously you should configure your reverse-proxy to route the relevant
 endpoints to the worker (`localhost:8083` in the above example).
@@ -102,127 +149,24 @@ Finally, you need to start your worker processes. This can be done with either
 `synctl` or your distribution's preferred service manager such as `systemd`. We
 recommend the use of `systemd` where available: for information on setting up
 `systemd` to start synapse workers, see
-[systemd-with-workers](systemd-with-workers). To use `synctl`, see below.
+[systemd-with-workers](systemd-with-workers). To use `synctl`, see
+[synctl_workers.md](synctl_workers.md).
 
-### **Experimental** support for replication over redis
-
-As of Synapse v1.13.0, it is possible to configure Synapse to send replication
-via a [Redis pub/sub channel](https://redis.io/topics/pubsub). This is an
-alternative to direct TCP connections to the master: rather than all the
-workers connecting to the master, all the workers and the master connect to
-Redis, which relays replication commands between processes. This can give a
-significant cpu saving on the master and will be a prerequisite for upcoming
-performance improvements.
-
-Note that this support is currently experimental; you may experience lost
-messages and similar problems! It is strongly recommended that admins setting
-up workers for the first time use direct TCP replication as above.
-
-To configure Synapse to use Redis:
-
-1. Install Redis following the normal procedure for your distribution - for
-   example, on Debian, `apt install redis-server`. (It is safe to use an
-   existing Redis deployment if you have one: we use a pub/sub stream named
-   according to the `server_name` of your synapse server.)
-2. Check Redis is running and accessible: you should be able to `echo PING | nc -q1
-   localhost 6379` and get a response of `+PONG`.
-3. Install the python prerequisites. If you installed synapse into a
-   virtualenv, this can be done with:
-   ```sh
-   pip install matrix-synapse[redis]
-   ```
-   The debian packages from matrix.org already include the required
-   dependencies.
-4. Add config to the shared configuration (`homeserver.yaml`):
-    ```yaml
-    redis:
-      enabled: true
-    ```
-    Optional parameters which can go alongside `enabled` are `host`, `port`,
-    `password`. Normally none of these are required.
-5. Restart master and all workers.
-
-Once redis replication is in use, `worker_replication_port` is redundant and
-can be removed from the worker configuration files. Similarly, the
-configuration for the `listener` for the TCP replication port can be removed
-from the main configuration file. Note that the HTTP replication port is
-still required.
-
-### Using synctl
-
-If you want to use `synctl` to manage your synapse processes, you will need to
-create an an additional configuration file for the master synapse process. That
-configuration should look like this:
-
-```yaml
-worker_app: synapse.app.homeserver
-```
-
-Additionally, each worker app must be configured with the name of a "pid file",
-to which it will write its process ID when it starts. For example, for a
-synchrotron, you might write:
-
-```yaml
-worker_pid_file: /home/matrix/synapse/synchrotron.pid
-```
-
-Finally, to actually run your worker-based synapse, you must pass synctl the `-a`
-commandline option to tell it to operate on all the worker configurations found
-in the given directory, e.g.:
-
-    synctl -a $CONFIG/workers start
-
-Currently one should always restart all workers when restarting or upgrading
-synapse, unless you explicitly know it's safe not to.  For instance, restarting
-synapse without restarting all the synchrotrons may result in broken typing
-notifications.
-
-To manipulate a specific worker, you pass the -w option to synctl:
-
-    synctl -w $CONFIG/workers/synchrotron.yaml restart
 
 ## Available worker applications
 
-### `synapse.app.pusher`
-
-Handles sending push notifications to sygnal and email. Doesn't handle any
-REST endpoints itself, but you should set `start_pushers: False` in the
-shared configuration file to stop the main synapse sending these notifications.
-
-Note this worker cannot be load-balanced: only one instance should be active.
-
-### `synapse.app.synchrotron`
+### `synapse.app.generic_worker`
 
-The synchrotron handles `sync` requests from clients. In particular, it can
-handle REST endpoints matching the following regular expressions:
+This worker can handle API requests matching the following regular
+expressions:
 
+    # Sync requests
     ^/_matrix/client/(v2_alpha|r0)/sync$
     ^/_matrix/client/(api/v1|v2_alpha|r0)/events$
     ^/_matrix/client/(api/v1|r0)/initialSync$
     ^/_matrix/client/(api/v1|r0)/rooms/[^/]+/initialSync$
 
-The above endpoints should all be routed to the synchrotron worker by the
-reverse-proxy configuration.
-
-It is possible to run multiple instances of the synchrotron to scale
-horizontally. In this case the reverse-proxy should be configured to
-load-balance across the instances, though it will be more efficient if all
-requests from a particular user are routed to a single instance. Extracting
-a userid from the access token is currently left as an exercise for the reader.
-
-### `synapse.app.appservice`
-
-Handles sending output traffic to Application Services. Doesn't handle any
-REST endpoints itself, but you should set `notify_appservices: False` in the
-shared configuration file to stop the main synapse sending these notifications.
-
-Note this worker cannot be load-balanced: only one instance should be active.
-
-### `synapse.app.federation_reader`
-
-Handles a subset of federation endpoints. In particular, it can handle REST
-endpoints matching the following regular expressions:
-
+    # Federation requests
     ^/_matrix/federation/v1/event/
     ^/_matrix/federation/v1/state/
     ^/_matrix/federation/v1/state_ids/
@@ -242,40 +186,145 @@ endpoints matching the following regular expressions:
     ^/_matrix/federation/v1/event_auth/
     ^/_matrix/federation/v1/exchange_third_party_invite/
     ^/_matrix/federation/v1/user/devices/
-    ^/_matrix/federation/v1/send/
     ^/_matrix/federation/v1/get_groups_publicised$
     ^/_matrix/key/v2/query
 
+    # Inbound federation transaction request
+    ^/_matrix/federation/v1/send/
+
+    # Client API requests
+    ^/_matrix/client/(api/v1|r0|unstable)/publicRooms$
+    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/joined_members$
+    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/context/.*$
+    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/members$
+    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/state$
+    ^/_matrix/client/(api/v1|r0|unstable)/account/3pid$
+    ^/_matrix/client/(api/v1|r0|unstable)/keys/query$
+    ^/_matrix/client/(api/v1|r0|unstable)/keys/changes$
+    ^/_matrix/client/versions$
+    ^/_matrix/client/(api/v1|r0|unstable)/voip/turnServer$
+    ^/_matrix/client/(api/v1|r0|unstable)/joined_groups$
+    ^/_matrix/client/(api/v1|r0|unstable)/publicised_groups$
+    ^/_matrix/client/(api/v1|r0|unstable)/publicised_groups/
+
+    # Registration/login requests
+    ^/_matrix/client/(api/v1|r0|unstable)/login$
+    ^/_matrix/client/(r0|unstable)/register$
+    ^/_matrix/client/(r0|unstable)/auth/.*/fallback/web$
+
+    # Event sending requests
+    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/send
+    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/state/
+    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/(join|invite|leave|ban|unban|kick)$
+    ^/_matrix/client/(api/v1|r0|unstable)/join/
+    ^/_matrix/client/(api/v1|r0|unstable)/profile/
+
+
 Additionally, the following REST endpoints can be handled for GET requests:
 
     ^/_matrix/federation/v1/groups/
 
-The above endpoints should all be routed to the federation_reader worker by the
-reverse-proxy configuration.
+Pagination requests can also be handled, but all requests for a given
+room must be routed to the same instance. Additionally, care must be taken to
+ensure that the purge history admin API is not used while pagination requests
+for the room are in flight:
+
+    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/messages$
+
+Note that a HTTP listener with `client` and `federation` resources must be
+configured in the `worker_listeners` option in the worker config.
+
+
+#### Load balancing
+
+It is possible to run multiple instances of this worker app, with incoming requests
+being load-balanced between them by the reverse-proxy. However, different endpoints
+have different characteristics and so admins
+may wish to run multiple groups of workers handling different endpoints so that
+load balancing can be done in different ways.
+
+For `/sync` and `/initialSync` requests it will be more efficient if all
+requests from a particular user are routed to a single instance. Extracting a
+user ID from the access token or `Authorization` header is currently left as an
+exercise for the reader. Admins may additionally wish to separate out `/sync`
+requests that have a `since` query parameter from those that don't (and
+`/initialSync`), as requests that don't are known as "initial sync" that happens
+when a user logs in on a new device and can be *very* resource intensive, so
+isolating these requests will stop them from interfering with other users ongoing
+syncs.
+
+Federation and client requests can be balanced via simple round robin.
+
+The inbound federation transaction request `^/_matrix/federation/v1/send/`
+should be balanced by source IP so that transactions from the same remote server
+go to the same process.
 
-The `^/_matrix/federation/v1/send/` endpoint must only be handled by a single
-instance.
+Registration/login requests can be handled separately purely to help ensure that
+unexpected load doesn't affect new logins and sign ups.
 
-Note that `federation` must be added to the listener resources in the worker config:
+Finally, event sending requests can be balanced by the room ID in the URI (or
+the full URI, or even just round robin), the room ID is the path component after
+`/rooms/`. If there is a large bridge connected that is sending or may send lots
+of events, then a dedicated set of workers can be provisioned to limit the
+effects of bursts of events from that bridge on events sent by normal users.
+
+#### Stream writers
+
+Additionally, there is *experimental* support for moving writing of specific
+streams (such as events) off of the main process to a particular worker. (This
+is only supported with Redis-based replication.)
+
+Currently support streams are `events` and `typing`.
+
+To enable this, the worker must have a HTTP replication listener configured,
+have a `worker_name` and be listed in the `instance_map` config. For example to
+move event persistence off to a dedicated worker, the shared configuration would
+include:
 
 ```yaml
-worker_app: synapse.app.federation_reader
-...
-worker_listeners:
- - type: http
-   port: <port>
-   resources:
-     - names:
-       - federation
+instance_map:
+    event_persister1:
+        host: localhost
+        port: 8034
+
+stream_writers:
+    events: event_persister1
 ```
 
+
+### `synapse.app.pusher`
+
+Handles sending push notifications to sygnal and email. Doesn't handle any
+REST endpoints itself, but you should set `start_pushers: False` in the
+shared configuration file to stop the main synapse sending push notifications.
+
+Note this worker cannot be load-balanced: only one instance should be active.
+
+### `synapse.app.appservice`
+
+Handles sending output traffic to Application Services. Doesn't handle any
+REST endpoints itself, but you should set `notify_appservices: False` in the
+shared configuration file to stop the main synapse sending appservice notifications.
+
+Note this worker cannot be load-balanced: only one instance should be active.
+
+
 ### `synapse.app.federation_sender`
 
 Handles sending federation traffic to other servers. Doesn't handle any
 REST endpoints itself, but you should set `send_federation: False` in the
 shared configuration file to stop the main synapse sending this traffic.
 
-Note this worker cannot be load-balanced: only one instance should be active.
+If running multiple federation senders then you must list each
+instance in the `federation_sender_instances` option by their `worker_name`.
+All instances must be stopped and started when adding or removing instances.
+For example:
+
+```yaml
+federation_sender_instances:
+    - federation_sender1
+    - federation_sender2
+```
 
 ### `synapse.app.media_repository`
 
@@ -314,46 +363,6 @@ and you must configure a single instance to run the background tasks, e.g.:
     media_instance_running_background_jobs: "media-repository-1"
 ```
 
-### `synapse.app.client_reader`
-
-Handles client API endpoints. It can handle REST endpoints matching the
-following regular expressions:
-
-    ^/_matrix/client/(api/v1|r0|unstable)/publicRooms$
-    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/joined_members$
-    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/context/.*$
-    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/members$
-    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/state$
-    ^/_matrix/client/(api/v1|r0|unstable)/login$
-    ^/_matrix/client/(api/v1|r0|unstable)/account/3pid$
-    ^/_matrix/client/(api/v1|r0|unstable)/keys/query$
-    ^/_matrix/client/(api/v1|r0|unstable)/keys/changes$
-    ^/_matrix/client/versions$
-    ^/_matrix/client/(api/v1|r0|unstable)/voip/turnServer$
-    ^/_matrix/client/(api/v1|r0|unstable)/joined_groups$
-    ^/_matrix/client/(api/v1|r0|unstable)/publicised_groups$
-    ^/_matrix/client/(api/v1|r0|unstable)/publicised_groups/
-
-Additionally, the following REST endpoints can be handled for GET requests:
-
-    ^/_matrix/client/(api/v1|r0|unstable)/pushrules/.*$
-    ^/_matrix/client/(api/v1|r0|unstable)/groups/.*$
-    ^/_matrix/client/(api/v1|r0|unstable)/user/[^/]*/account_data/
-    ^/_matrix/client/(api/v1|r0|unstable)/user/[^/]*/rooms/[^/]*/account_data/
-
-Additionally, the following REST endpoints can be handled, but all requests must
-be routed to the same instance:
-
-    ^/_matrix/client/(r0|unstable)/register$
-    ^/_matrix/client/(r0|unstable)/auth/.*/fallback/web$
-
-Pagination requests can also be handled, but all requests with the same path
-room must be routed to the same instance. Additionally, care must be taken to
-ensure that the purge history admin API is not used while pagination requests
-for the room are in flight:
-
-    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/messages$
-
 ### `synapse.app.user_dir`
 
 Handles searches in the user directory. It can handle REST endpoints matching
@@ -388,15 +397,48 @@ file. For example:
 
     worker_main_http_uri: http://127.0.0.1:8008
 
-### `synapse.app.event_creator`
+### Historical apps
 
-Handles some event creation. It can handle REST endpoints matching:
+*Note:* Historically there used to be more apps, however they have been
+amalgamated into a single `synapse.app.generic_worker` app. The remaining apps
+are ones that do specific processing unrelated to requests, e.g. the `pusher`
+that handles sending out push notifications for new events. The intention is for
+all these to be folded into the `generic_worker` app and to use config to define
+which processes handle the various proccessing such as push notifications.
 
-    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/send
-    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/state/
-    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/(join|invite|leave|ban|unban|kick)$
-    ^/_matrix/client/(api/v1|r0|unstable)/join/
-    ^/_matrix/client/(api/v1|r0|unstable)/profile/
 
-It will create events locally and then send them on to the main synapse
-instance to be persisted and handled.
+## Architectural diagram
+
+The following shows an example setup using Redis and a reverse proxy:
+
+```
+                     Clients & Federation
+                              |
+                              v
+                        +-----------+
+                        |           |
+                        |  Reverse  |
+                        |  Proxy    |
+                        |           |
+                        +-----------+
+                            | | |
+                            | | | HTTP requests
+        +-------------------+ | +-----------+
+        |                 +---+             |
+        |                 |                 |
+        v                 v                 v
++--------------+  +--------------+  +--------------+  +--------------+
+|   Main       |  |   Generic    |  |   Generic    |  |  Event       |
+|   Process    |  |   Worker 1   |  |   Worker 2   |  |  Persister   |
++--------------+  +--------------+  +--------------+  +--------------+
+      ^    ^          |   ^   |         |   ^   |          ^    ^
+      |    |          |   |   |         |   |   |          |    |
+      |    |          |   |   |  HTTP   |   |   |          |    |
+      |    +----------+<--|---|---------+   |   |          |    |
+      |                   |   +-------------|-->+----------+    |
+      |                   |                 |                   |
+      |                   |                 |                   |
+      v                   v                 v                   v
+====================================================================
+                                                         Redis pub/sub channel
+```