summary refs log tree commit diff
path: root/synapse
diff options
context:
space:
mode:
authorRichard van der Hoff <richard@matrix.org>2020-03-23 13:54:17 +0000
committerRichard van der Hoff <richard@matrix.org>2020-03-23 13:54:17 +0000
commit229eb81498b0fe1da81e9b5b333a0285acde9446 (patch)
tree10d1a9b9c0c88e3156215a77cf490fca2aab1432 /synapse
parentUpdate postgres.md (diff)
parentmatrix.org was fine (diff)
downloadsynapse-229eb81498b0fe1da81e9b5b333a0285acde9446.tar.xz
Merge tag 'v1.12.0'
Synapse 1.12.0 (2020-03-23)
===========================

No significant changes since 1.12.0rc1.

Debian packages and Docker images are rebuilt using the latest versions of
dependency libraries, including Twisted 20.3.0. **Please see security advisory
below**.

Security advisory
-----------------

Synapse may be vulnerable to request-smuggling attacks when it is used with a
reverse-proxy. The vulnerabilties are fixed in Twisted 20.3.0, and are
described in
[CVE-2020-10108](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10108)
and
[CVE-2020-10109](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10109).
For a good introduction to this class of request-smuggling attacks, see
https://portswigger.net/research/http-desync-attacks-request-smuggling-reborn.

We are not aware of these vulnerabilities being exploited in the wild, and
do not believe that they are exploitable with current versions of any reverse
proxies. Nevertheless, we recommend that all Synapse administrators ensure that
they have the latest versions of the Twisted library to ensure that their
installation remains secure.

* Administrators using the [`matrix.org` Docker
  image](https://hub.docker.com/r/matrixdotorg/synapse/) or the [Debian/Ubuntu
  packages from
  `matrix.org`](https://github.com/matrix-org/synapse/blob/master/INSTALL.md#matrixorg-packages)
  should ensure that they have version 1.12.0 installed: these images include
  Twisted 20.3.0.
* Administrators who have [installed Synapse from
  source](https://github.com/matrix-org/synapse/blob/master/INSTALL.md#installing-from-source)
  should upgrade Twisted within their virtualenv by running:
  ```sh
  <path_to_virtualenv>/bin/pip install 'Twisted>=20.3.0'
  ```
* Administrators who have installed Synapse from distribution packages should
  consult the information from their distributions.

The `matrix.org` Synapse instance was not vulnerable to these vulnerabilities.

Advance notice of change to the default `git` branch for Synapse
----------------------------------------------------------------

Currently, the default `git` branch for Synapse is `master`, which tracks the
latest release.

After the release of Synapse 1.13.0, we intend to change this default to
`develop`, which is the development tip. This is more consistent with common
practice and modern `git` usage.

Although we try to keep `develop` in a stable state, there may be occasions
where regressions creep in. Developers and distributors who have scripts which
run builds using the default branch of `Synapse` should therefore consider
pinning their scripts to `master`.

Synapse 1.12.0rc1 (2020-03-19)
==============================

Features
--------

- Changes related to room alias management ([MSC2432](https://github.com/matrix-org/matrix-doc/pull/2432)):
  - Publishing/removing a room from the room directory now requires the user to have a power level capable of modifying the canonical alias, instead of the room aliases. ([\#6965](https://github.com/matrix-org/synapse/issues/6965))
  - Validate the `alt_aliases` property of canonical alias events. ([\#6971](https://github.com/matrix-org/synapse/issues/6971))
  - Users with a power level sufficient to modify the canonical alias of a room can now delete room aliases. ([\#6986](https://github.com/matrix-org/synapse/issues/6986))
  - Implement updated authorization rules and redaction rules for aliases events, from [MSC2261](https://github.com/matrix-org/matrix-doc/pull/2261) and [MSC2432](https://github.com/matrix-org/matrix-doc/pull/2432). ([\#7037](https://github.com/matrix-org/synapse/issues/7037))
  - Stop sending m.room.aliases events during room creation and upgrade. ([\#6941](https://github.com/matrix-org/synapse/issues/6941))
  - Synapse no longer uses room alias events to calculate room names for push notifications. ([\#6966](https://github.com/matrix-org/synapse/issues/6966))
  - The room list endpoint no longer returns a list of aliases. ([\#6970](https://github.com/matrix-org/synapse/issues/6970))
  - Remove special handling of aliases events from [MSC2260](https://github.com/matrix-org/matrix-doc/pull/2260) added in v1.10.0rc1. ([\#7034](https://github.com/matrix-org/synapse/issues/7034))
- Expose the `synctl`, `hash_password` and `generate_config` commands in the snapcraft package. Contributed by @devec0. ([\#6315](https://github.com/matrix-org/synapse/issues/6315))
- Check that server_name is correctly set before running database updates. ([\#6982](https://github.com/matrix-org/synapse/issues/6982))
- Break down monthly active users by `appservice_id` and emit via Prometheus. ([\#7030](https://github.com/matrix-org/synapse/issues/7030))
- Render a configurable and comprehensible error page if something goes wrong during the SAML2 authentication process. ([\#7058](https://github.com/matrix-org/synapse/issues/7058), [\#7067](https://github.com/matrix-org/synapse/issues/7067))
- Add an optional parameter to control whether other sessions are logged out when a user's password is modified. ([\#7085](https://github.com/matrix-org/synapse/issues/7085))
- Add prometheus metrics for the number of active pushers. ([\#7103](https://github.com/matrix-org/synapse/issues/7103), [\#7106](https://github.com/matrix-org/synapse/issues/7106))
- Improve performance when making HTTPS requests to sygnal, sydent, etc, by sharing the SSL context object between connections. ([\#7094](https://github.com/matrix-org/synapse/issues/7094))

Bugfixes
--------

- When a user's profile is updated via the admin API, also generate a displayname/avatar update for that user in each room. ([\#6572](https://github.com/matrix-org/synapse/issues/6572))
- Fix a couple of bugs in email configuration handling. ([\#6962](https://github.com/matrix-org/synapse/issues/6962))
- Fix an issue affecting worker-based deployments where replication would stop working, necessitating a full restart, after joining a large room. ([\#6967](https://github.com/matrix-org/synapse/issues/6967))
- Fix `duplicate key` error which was logged when rejoining a room over federation. ([\#6968](https://github.com/matrix-org/synapse/issues/6968))
- Prevent user from setting 'deactivated' to anything other than a bool on the v2 PUT /users Admin API. ([\#6990](https://github.com/matrix-org/synapse/issues/6990))
- Fix py35-old CI by using native tox package. ([\#7018](https://github.com/matrix-org/synapse/issues/7018))
- Fix a bug causing `org.matrix.dummy_event` to be included in responses from `/sync`. ([\#7035](https://github.com/matrix-org/synapse/issues/7035))
- Fix a bug that renders UTF-8 text files incorrectly when loaded from media. Contributed by @TheStranjer. ([\#7044](https://github.com/matrix-org/synapse/issues/7044))
- Fix a bug that would cause Synapse to respond with an error about event visibility if a client tried to request the state of a room at a given token. ([\#7066](https://github.com/matrix-org/synapse/issues/7066))
- Repair a data-corruption issue which was introduced in Synapse 1.10, and fixed in Synapse 1.11, and which could cause `/sync` to return with 404 errors about missing events and unknown rooms. ([\#7070](https://github.com/matrix-org/synapse/issues/7070))
- Fix a bug causing account validity renewal emails to be sent even if the feature is turned off in some cases. ([\#7074](https://github.com/matrix-org/synapse/issues/7074))

Improved Documentation
----------------------

- Updated CentOS8 install instructions. Contributed by Richard Kellner. ([\#6925](https://github.com/matrix-org/synapse/issues/6925))
- Fix `POSTGRES_INITDB_ARGS` in the `contrib/docker/docker-compose.yml` example docker-compose configuration. ([\#6984](https://github.com/matrix-org/synapse/issues/6984))
- Change date in [INSTALL.md](./INSTALL.md#tls-certificates) for last date of getting TLS certificates to November 2019. ([\#7015](https://github.com/matrix-org/synapse/issues/7015))
- Document that the fallback auth endpoints must be routed to the same worker node as the register endpoints. ([\#7048](https://github.com/matrix-org/synapse/issues/7048))

Deprecations and Removals
-------------------------

- Remove the unused query_auth federation endpoint per [MSC2451](https://github.com/matrix-org/matrix-doc/pull/2451). ([\#7026](https://github.com/matrix-org/synapse/issues/7026))

Internal Changes
----------------

- Add type hints to `logging/context.py`. ([\#6309](https://github.com/matrix-org/synapse/issues/6309))
- Add some clarifications to `README.md` in the database schema directory. ([\#6615](https://github.com/matrix-org/synapse/issues/6615))
- Refactoring work in preparation for changing the event redaction algorithm. ([\#6874](https://github.com/matrix-org/synapse/issues/6874), [\#6875](https://github.com/matrix-org/synapse/issues/6875), [\#6983](https://github.com/matrix-org/synapse/issues/6983), [\#7003](https://github.com/matrix-org/synapse/issues/7003))
- Improve performance of v2 state resolution for large rooms. ([\#6952](https://github.com/matrix-org/synapse/issues/6952), [\#7095](https://github.com/matrix-org/synapse/issues/7095))
- Reduce time spent doing GC, by freezing objects on startup. ([\#6953](https://github.com/matrix-org/synapse/issues/6953))
- Minor perfermance fixes to `get_auth_chain_ids`. ([\#6954](https://github.com/matrix-org/synapse/issues/6954))
- Don't record remote cross-signing keys in the `devices` table. ([\#6956](https://github.com/matrix-org/synapse/issues/6956))
- Use flake8-comprehensions to enforce good hygiene of list/set/dict comprehensions. ([\#6957](https://github.com/matrix-org/synapse/issues/6957))
- Merge worker apps together. ([\#6964](https://github.com/matrix-org/synapse/issues/6964), [\#7002](https://github.com/matrix-org/synapse/issues/7002), [\#7055](https://github.com/matrix-org/synapse/issues/7055), [\#7104](https://github.com/matrix-org/synapse/issues/7104))
- Remove redundant `store_room` call from `FederationHandler._process_received_pdu`. ([\#6979](https://github.com/matrix-org/synapse/issues/6979))
- Update warning for incorrect database collation/ctype to include link to documentation. ([\#6985](https://github.com/matrix-org/synapse/issues/6985))
- Add some type annotations to the database storage classes. ([\#6987](https://github.com/matrix-org/synapse/issues/6987))
- Port `synapse.handlers.presence` to async/await. ([\#6991](https://github.com/matrix-org/synapse/issues/6991), [\#7019](https://github.com/matrix-org/synapse/issues/7019))
- Add some type annotations to the federation base & client classes. ([\#6995](https://github.com/matrix-org/synapse/issues/6995))
- Port `synapse.rest.keys` to async/await. ([\#7020](https://github.com/matrix-org/synapse/issues/7020))
- Add a type check to `is_verified` when processing room keys. ([\#7045](https://github.com/matrix-org/synapse/issues/7045))
- Add type annotations and comments to the auth handler. ([\#7063](https://github.com/matrix-org/synapse/issues/7063))
Diffstat (limited to 'synapse')
-rw-r--r--synapse/__init__.py2
-rw-r--r--synapse/api/auth.py19
-rw-r--r--synapse/api/errors.py1
-rw-r--r--synapse/api/room_versions.py9
-rw-r--r--synapse/app/_base.py12
-rw-r--r--synapse/app/appservice.py156
-rw-r--r--synapse/app/client_reader.py190
-rw-r--r--synapse/app/event_creator.py186
-rw-r--r--synapse/app/federation_reader.py172
-rw-r--r--synapse/app/federation_sender.py303
-rw-r--r--synapse/app/frontend_proxy.py236
-rw-r--r--synapse/app/generic_worker.py923
-rw-r--r--synapse/app/homeserver.py14
-rw-r--r--synapse/app/media_repository.py157
-rw-r--r--synapse/app/pusher.py209
-rw-r--r--synapse/app/synchrotron.py449
-rw-r--r--synapse/app/user_dir.py211
-rw-r--r--synapse/config/emailconfig.py66
-rw-r--r--synapse/config/saml2_config.py30
-rw-r--r--synapse/config/server.py4
-rw-r--r--synapse/config/tls.py2
-rw-r--r--synapse/crypto/context_factory.py68
-rw-r--r--synapse/crypto/event_signing.py2
-rw-r--r--synapse/crypto/keyring.py6
-rw-r--r--synapse/event_auth.py8
-rw-r--r--synapse/events/__init__.py53
-rw-r--r--synapse/events/utils.py26
-rw-r--r--synapse/federation/federation_base.py112
-rw-r--r--synapse/federation/federation_client.py90
-rw-r--r--synapse/federation/federation_server.py51
-rw-r--r--synapse/federation/send_queue.py4
-rw-r--r--synapse/federation/transport/server.py12
-rw-r--r--synapse/groups/groups_server.py2
-rw-r--r--synapse/handlers/account_validity.py6
-rw-r--r--synapse/handlers/auth.py193
-rw-r--r--synapse/handlers/device.py2
-rw-r--r--synapse/handlers/directory.py140
-rw-r--r--synapse/handlers/e2e_room_keys.py7
-rw-r--r--synapse/handlers/federation.py75
-rw-r--r--synapse/handlers/message.py54
-rw-r--r--synapse/handlers/presence.py198
-rw-r--r--synapse/handlers/profile.py12
-rw-r--r--synapse/handlers/receipts.py2
-rw-r--r--synapse/handlers/room.py54
-rw-r--r--synapse/handlers/room_list.py9
-rw-r--r--synapse/handlers/saml_handler.py20
-rw-r--r--synapse/handlers/search.py8
-rw-r--r--synapse/handlers/set_password.py41
-rw-r--r--synapse/handlers/sync.py22
-rw-r--r--synapse/handlers/typing.py4
-rw-r--r--synapse/http/client.py3
-rw-r--r--synapse/http/federation/matrix_federation_agent.py2
-rw-r--r--synapse/logging/context.py122
-rw-r--r--synapse/logging/utils.py2
-rw-r--r--synapse/metrics/__init__.py14
-rw-r--r--synapse/metrics/background_process_metrics.py5
-rw-r--r--synapse/push/bulk_push_rule_evaluator.py8
-rw-r--r--synapse/push/emailpusher.py2
-rw-r--r--synapse/push/mailer.py22
-rw-r--r--synapse/push/presentable_names.py36
-rw-r--r--synapse/push/pusherpool.py45
-rw-r--r--synapse/replication/http/_base.py2
-rw-r--r--synapse/replication/http/federation.py49
-rw-r--r--synapse/replication/http/send_event.py14
-rw-r--r--synapse/replication/slave/storage/events.py20
-rw-r--r--synapse/replication/tcp/resource.py6
-rw-r--r--synapse/replication/tcp/streams/_base.py2
-rw-r--r--synapse/res/templates/saml_error.html45
-rw-r--r--synapse/rest/admin/_base.py4
-rw-r--r--synapse/rest/admin/users.py17
-rw-r--r--synapse/rest/client/v1/push_rule.py6
-rw-r--r--synapse/rest/client/v1/pusher.py4
-rw-r--r--synapse/rest/client/v1/room.py12
-rw-r--r--synapse/rest/client/v2_alpha/account.py5
-rw-r--r--synapse/rest/client/v2_alpha/sync.py2
-rw-r--r--synapse/rest/key/v2/remote_key_resource.py13
-rw-r--r--synapse/rest/media/v1/_base.py65
-rw-r--r--synapse/rest/saml2/response_resource.py18
-rw-r--r--synapse/server.py6
-rw-r--r--synapse/server.pyi5
-rw-r--r--synapse/state/__init__.py25
-rw-r--r--synapse/state/v1.py10
-rw-r--r--synapse/state/v2.py36
-rw-r--r--synapse/storage/_base.py2
-rw-r--r--synapse/storage/background_updates.py2
-rw-r--r--synapse/storage/data_stores/main/__init__.py34
-rw-r--r--synapse/storage/data_stores/main/appservice.py14
-rw-r--r--synapse/storage/data_stores/main/client_ips.py4
-rw-r--r--synapse/storage/data_stores/main/devices.py13
-rw-r--r--synapse/storage/data_stores/main/end_to_end_keys.py33
-rw-r--r--synapse/storage/data_stores/main/event_federation.py188
-rw-r--r--synapse/storage/data_stores/main/event_push_actions.py34
-rw-r--r--synapse/storage/data_stores/main/events.py18
-rw-r--r--synapse/storage/data_stores/main/events_bg_updates.py2
-rw-r--r--synapse/storage/data_stores/main/events_worker.py90
-rw-r--r--synapse/storage/data_stores/main/monthly_active_users.py32
-rw-r--r--synapse/storage/data_stores/main/push_rule.py8
-rw-r--r--synapse/storage/data_stores/main/pusher.py156
-rw-r--r--synapse/storage/data_stores/main/receipts.py4
-rw-r--r--synapse/storage/data_stores/main/room.py37
-rw-r--r--synapse/storage/data_stores/main/roommember.py4
-rw-r--r--synapse/storage/data_stores/main/schema/delta/57/rooms_version_column_3.sql.postgres39
-rw-r--r--synapse/storage/data_stores/main/schema/delta/57/rooms_version_column_3.sql.sqlite23
-rw-r--r--synapse/storage/data_stores/main/schema/full_schemas/README.md24
-rw-r--r--synapse/storage/data_stores/main/state.py8
-rw-r--r--synapse/storage/data_stores/main/state_deltas.py4
-rw-r--r--synapse/storage/data_stores/main/stream.py8
-rw-r--r--synapse/storage/data_stores/main/user_erasure_store.py4
-rw-r--r--synapse/storage/data_stores/state/store.py4
-rw-r--r--synapse/storage/database.py159
-rw-r--r--synapse/storage/engines/__init__.py28
-rw-r--r--synapse/storage/engines/_base.py87
-rw-r--r--synapse/storage/engines/postgres.py22
-rw-r--r--synapse/storage/engines/sqlite.py13
-rw-r--r--synapse/storage/persist_events.py8
-rw-r--r--synapse/storage/prepare_database.py21
-rw-r--r--synapse/storage/types.py65
-rw-r--r--synapse/types.py15
-rw-r--r--synapse/util/frozenutils.py2
-rw-r--r--synapse/visibility.py66
120 files changed, 3046 insertions, 3488 deletions
diff --git a/synapse/__init__.py b/synapse/__init__.py
index e56ba89ff4..5b86008945 100644
--- a/synapse/__init__.py
+++ b/synapse/__init__.py
@@ -36,7 +36,7 @@ try:
 except ImportError:
     pass
 
-__version__ = "1.11.1"
+__version__ = "1.12.0"
 
 if bool(os.environ.get("SYNAPSE_TEST_PATCH_LOG_CONTEXTS", False)):
     # We import here so that we don't have to install a bunch of deps when
diff --git a/synapse/api/auth.py b/synapse/api/auth.py
index f576d65388..c1ade1333b 100644
--- a/synapse/api/auth.py
+++ b/synapse/api/auth.py
@@ -538,13 +538,13 @@ class Auth(object):
         return defer.succeed(auth_ids)
 
     @defer.inlineCallbacks
-    def check_can_change_room_list(self, room_id, user):
-        """Check if the user is allowed to edit the room's entry in the
+    def check_can_change_room_list(self, room_id: str, user: UserID):
+        """Determine whether the user is allowed to edit the room's entry in the
         published room list.
 
         Args:
-            room_id (str)
-            user (UserID)
+            room_id
+            user
         """
 
         is_admin = yield self.is_server_admin(user)
@@ -556,7 +556,7 @@ class Auth(object):
 
         # We currently require the user is a "moderator" in the room. We do this
         # by checking if they would (theoretically) be able to change the
-        # m.room.aliases events
+        # m.room.canonical_alias events
         power_level_event = yield self.state.get_current_state(
             room_id, EventTypes.PowerLevels, ""
         )
@@ -566,16 +566,11 @@ class Auth(object):
             auth_events[(EventTypes.PowerLevels, "")] = power_level_event
 
         send_level = event_auth.get_send_level(
-            EventTypes.Aliases, "", power_level_event
+            EventTypes.CanonicalAlias, "", power_level_event
         )
         user_level = event_auth.get_user_power_level(user_id, auth_events)
 
-        if user_level < send_level:
-            raise AuthError(
-                403,
-                "This server requires you to be a moderator in the room to"
-                " edit its room list entry",
-            )
+        return user_level >= send_level
 
     @staticmethod
     def has_access_token(request):
diff --git a/synapse/api/errors.py b/synapse/api/errors.py
index 0c20601600..616942b057 100644
--- a/synapse/api/errors.py
+++ b/synapse/api/errors.py
@@ -66,6 +66,7 @@ class Codes(object):
     EXPIRED_ACCOUNT = "ORG_MATRIX_EXPIRED_ACCOUNT"
     INVALID_SIGNATURE = "M_INVALID_SIGNATURE"
     USER_DEACTIVATED = "M_USER_DEACTIVATED"
+    BAD_ALIAS = "M_BAD_ALIAS"
 
 
 class CodeMessageException(RuntimeError):
diff --git a/synapse/api/room_versions.py b/synapse/api/room_versions.py
index cf7ee60d3a..871179749a 100644
--- a/synapse/api/room_versions.py
+++ b/synapse/api/room_versions.py
@@ -57,7 +57,7 @@ class RoomVersion(object):
     state_res = attr.ib()  # int; one of the StateResolutionVersions
     enforce_key_validity = attr.ib()  # bool
 
-    # bool: before MSC2260, anyone was allowed to send an aliases event
+    # bool: before MSC2261/MSC2432, m.room.aliases had special auth rules and redaction rules
     special_case_aliases_auth = attr.ib(type=bool, default=False)
 
 
@@ -102,12 +102,13 @@ class RoomVersions(object):
         enforce_key_validity=True,
         special_case_aliases_auth=True,
     )
-    MSC2260_DEV = RoomVersion(
-        "org.matrix.msc2260",
+    MSC2432_DEV = RoomVersion(
+        "org.matrix.msc2432",
         RoomDisposition.UNSTABLE,
         EventFormatVersions.V3,
         StateResolutionVersions.V2,
         enforce_key_validity=True,
+        special_case_aliases_auth=False,
     )
 
 
@@ -119,6 +120,6 @@ KNOWN_ROOM_VERSIONS = {
         RoomVersions.V3,
         RoomVersions.V4,
         RoomVersions.V5,
-        RoomVersions.MSC2260_DEV,
+        RoomVersions.MSC2432_DEV,
     )
 }  # type: Dict[str, RoomVersion]
diff --git a/synapse/app/_base.py b/synapse/app/_base.py
index 0e8b467a3e..4d84f4595a 100644
--- a/synapse/app/_base.py
+++ b/synapse/app/_base.py
@@ -141,7 +141,7 @@ def start_reactor(
 
 def quit_with_error(error_string):
     message_lines = error_string.split("\n")
-    line_length = max([len(l) for l in message_lines if len(l) < 80]) + 2
+    line_length = max(len(l) for l in message_lines if len(l) < 80) + 2
     sys.stderr.write("*" * line_length + "\n")
     for line in message_lines:
         sys.stderr.write(" %s\n" % (line.rstrip(),))
@@ -276,9 +276,19 @@ def start(hs, listeners=None):
         # It is now safe to start your Synapse.
         hs.start_listening(listeners)
         hs.get_datastore().db.start_profiling()
+        hs.get_pusherpool().start()
 
         setup_sentry(hs)
         setup_sdnotify(hs)
+
+        # We now freeze all allocated objects in the hopes that (almost)
+        # everything currently allocated are things that will be used for the
+        # rest of time. Doing so means less work each GC (hopefully).
+        #
+        # This only works on Python 3.7
+        if sys.version_info >= (3, 7):
+            gc.collect()
+            gc.freeze()
     except Exception:
         traceback.print_exc(file=sys.stderr)
         reactor = hs.get_reactor()
diff --git a/synapse/app/appservice.py b/synapse/app/appservice.py
index 2217d4a4fb..add43147b3 100644
--- a/synapse/app/appservice.py
+++ b/synapse/app/appservice.py
@@ -13,161 +13,11 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
-import sys
-
-from twisted.internet import defer, reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext, run_in_background
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
-from synapse.replication.slave.storage.directory import DirectoryStore
-from synapse.replication.slave.storage.events import SlavedEventStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.server import HomeServer
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.appservice")
-
-
-class AppserviceSlaveStore(
-    DirectoryStore,
-    SlavedEventStore,
-    SlavedApplicationServiceStore,
-    SlavedRegistrationStore,
-):
-    pass
-
-
-class AppserviceServer(HomeServer):
-    DATASTORE_CLASS = AppserviceSlaveStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-        )
-
-        logger.info("Synapse appservice now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return ASReplicationHandler(self)
 
+import sys
 
-class ASReplicationHandler(ReplicationClientHandler):
-    def __init__(self, hs):
-        super(ASReplicationHandler, self).__init__(hs.get_datastore())
-        self.appservice_handler = hs.get_application_service_handler()
-
-    async def on_rdata(self, stream_name, token, rows):
-        await super(ASReplicationHandler, self).on_rdata(stream_name, token, rows)
-
-        if stream_name == "events":
-            max_stream_id = self.store.get_room_max_stream_ordering()
-            run_in_background(self._notify_app_services, max_stream_id)
-
-    @defer.inlineCallbacks
-    def _notify_app_services(self, room_stream_id):
-        try:
-            yield self.appservice_handler.notify_interested_services(room_stream_id)
-        except Exception:
-            logger.exception("Error notifying application services of event")
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config("Synapse appservice", config_options)
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.appservice"
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    if config.notify_appservices:
-        sys.stderr.write(
-            "\nThe appservices must be disabled in the main synapse process"
-            "\nbefore they can be run in a separate worker."
-            "\nPlease add ``notify_appservices: false`` to the main config"
-            "\n"
-        )
-        sys.exit(1)
-
-    # Force the pushers to start since they will be disabled in the main config
-    config.notify_appservices = True
-
-    ps = AppserviceServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ps, config, use_worker_options=True)
-
-    ps.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ps, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-appservice", config)
-
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/app/client_reader.py b/synapse/app/client_reader.py
index 7fa91a3b11..add43147b3 100644
--- a/synapse/app/client_reader.py
+++ b/synapse/app/client_reader.py
@@ -13,195 +13,11 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
-import sys
-
-from twisted.internet import reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.http.server import JsonResource
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage._base import BaseSlavedStore
-from synapse.replication.slave.storage.account_data import SlavedAccountDataStore
-from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
-from synapse.replication.slave.storage.client_ips import SlavedClientIpStore
-from synapse.replication.slave.storage.deviceinbox import SlavedDeviceInboxStore
-from synapse.replication.slave.storage.devices import SlavedDeviceStore
-from synapse.replication.slave.storage.directory import DirectoryStore
-from synapse.replication.slave.storage.events import SlavedEventStore
-from synapse.replication.slave.storage.groups import SlavedGroupServerStore
-from synapse.replication.slave.storage.keys import SlavedKeyStore
-from synapse.replication.slave.storage.profile import SlavedProfileStore
-from synapse.replication.slave.storage.push_rule import SlavedPushRuleStore
-from synapse.replication.slave.storage.receipts import SlavedReceiptsStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.slave.storage.room import RoomStore
-from synapse.replication.slave.storage.transactions import SlavedTransactionStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.rest.client.v1.login import LoginRestServlet
-from synapse.rest.client.v1.push_rule import PushRuleRestServlet
-from synapse.rest.client.v1.room import (
-    JoinedRoomMemberListRestServlet,
-    PublicRoomListRestServlet,
-    RoomEventContextServlet,
-    RoomMemberListRestServlet,
-    RoomMessageListRestServlet,
-    RoomStateRestServlet,
-)
-from synapse.rest.client.v1.voip import VoipRestServlet
-from synapse.rest.client.v2_alpha import groups
-from synapse.rest.client.v2_alpha.account import ThreepidRestServlet
-from synapse.rest.client.v2_alpha.keys import KeyChangesServlet, KeyQueryServlet
-from synapse.rest.client.v2_alpha.register import RegisterRestServlet
-from synapse.rest.client.versions import VersionsRestServlet
-from synapse.server import HomeServer
-from synapse.storage.data_stores.main.monthly_active_users import (
-    MonthlyActiveUsersWorkerStore,
-)
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.client_reader")
-
-
-class ClientReaderSlavedStore(
-    SlavedDeviceInboxStore,
-    SlavedDeviceStore,
-    SlavedReceiptsStore,
-    SlavedPushRuleStore,
-    SlavedGroupServerStore,
-    SlavedAccountDataStore,
-    SlavedEventStore,
-    SlavedKeyStore,
-    RoomStore,
-    DirectoryStore,
-    SlavedApplicationServiceStore,
-    SlavedRegistrationStore,
-    SlavedTransactionStore,
-    SlavedProfileStore,
-    SlavedClientIpStore,
-    MonthlyActiveUsersWorkerStore,
-    BaseSlavedStore,
-):
-    pass
-
-
-class ClientReaderServer(HomeServer):
-    DATASTORE_CLASS = ClientReaderSlavedStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-                elif name == "client":
-                    resource = JsonResource(self, canonical_json=False)
-
-                    PublicRoomListRestServlet(self).register(resource)
-                    RoomMemberListRestServlet(self).register(resource)
-                    JoinedRoomMemberListRestServlet(self).register(resource)
-                    RoomStateRestServlet(self).register(resource)
-                    RoomEventContextServlet(self).register(resource)
-                    RoomMessageListRestServlet(self).register(resource)
-                    RegisterRestServlet(self).register(resource)
-                    LoginRestServlet(self).register(resource)
-                    ThreepidRestServlet(self).register(resource)
-                    KeyQueryServlet(self).register(resource)
-                    KeyChangesServlet(self).register(resource)
-                    VoipRestServlet(self).register(resource)
-                    PushRuleRestServlet(self).register(resource)
-                    VersionsRestServlet(self).register(resource)
-
-                    groups.register_servlets(self, resource)
-
-                    resources.update({"/_matrix/client": resource})
-
-        root_resource = create_resource_tree(resources, NoResource())
 
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-        )
-
-        logger.info("Synapse client reader now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return ReplicationClientHandler(self.get_datastore())
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config("Synapse client reader", config_options)
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.client_reader"
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    ss = ClientReaderServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ss, config, use_worker_options=True)
-
-    ss.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ss, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-client-reader", config)
+import sys
 
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/app/event_creator.py b/synapse/app/event_creator.py
index 58e5b354f6..e9c098c4e7 100644
--- a/synapse/app/event_creator.py
+++ b/synapse/app/event_creator.py
@@ -13,191 +13,11 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
-import sys
-
-from twisted.internet import reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.http.server import JsonResource
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage._base import BaseSlavedStore
-from synapse.replication.slave.storage.account_data import SlavedAccountDataStore
-from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
-from synapse.replication.slave.storage.client_ips import SlavedClientIpStore
-from synapse.replication.slave.storage.devices import SlavedDeviceStore
-from synapse.replication.slave.storage.directory import DirectoryStore
-from synapse.replication.slave.storage.events import SlavedEventStore
-from synapse.replication.slave.storage.profile import SlavedProfileStore
-from synapse.replication.slave.storage.push_rule import SlavedPushRuleStore
-from synapse.replication.slave.storage.pushers import SlavedPusherStore
-from synapse.replication.slave.storage.receipts import SlavedReceiptsStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.slave.storage.room import RoomStore
-from synapse.replication.slave.storage.transactions import SlavedTransactionStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.rest.client.v1.profile import (
-    ProfileAvatarURLRestServlet,
-    ProfileDisplaynameRestServlet,
-    ProfileRestServlet,
-)
-from synapse.rest.client.v1.room import (
-    JoinRoomAliasServlet,
-    RoomMembershipRestServlet,
-    RoomSendEventRestServlet,
-    RoomStateEventRestServlet,
-)
-from synapse.server import HomeServer
-from synapse.storage.data_stores.main.monthly_active_users import (
-    MonthlyActiveUsersWorkerStore,
-)
-from synapse.storage.data_stores.main.user_directory import UserDirectoryStore
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.event_creator")
-
-
-class EventCreatorSlavedStore(
-    # FIXME(#3714): We need to add UserDirectoryStore as we write directly
-    # rather than going via the correct worker.
-    UserDirectoryStore,
-    DirectoryStore,
-    SlavedTransactionStore,
-    SlavedProfileStore,
-    SlavedAccountDataStore,
-    SlavedPusherStore,
-    SlavedReceiptsStore,
-    SlavedPushRuleStore,
-    SlavedDeviceStore,
-    SlavedClientIpStore,
-    SlavedApplicationServiceStore,
-    SlavedEventStore,
-    SlavedRegistrationStore,
-    RoomStore,
-    MonthlyActiveUsersWorkerStore,
-    BaseSlavedStore,
-):
-    pass
-
-
-class EventCreatorServer(HomeServer):
-    DATASTORE_CLASS = EventCreatorSlavedStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-                elif name == "client":
-                    resource = JsonResource(self, canonical_json=False)
-                    RoomSendEventRestServlet(self).register(resource)
-                    RoomMembershipRestServlet(self).register(resource)
-                    RoomStateEventRestServlet(self).register(resource)
-                    JoinRoomAliasServlet(self).register(resource)
-                    ProfileAvatarURLRestServlet(self).register(resource)
-                    ProfileDisplaynameRestServlet(self).register(resource)
-                    ProfileRestServlet(self).register(resource)
-                    resources.update(
-                        {
-                            "/_matrix/client/r0": resource,
-                            "/_matrix/client/unstable": resource,
-                            "/_matrix/client/v2_alpha": resource,
-                            "/_matrix/client/api/v1": resource,
-                        }
-                    )
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-        )
-
-        logger.info("Synapse event creator now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
 
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return ReplicationClientHandler(self.get_datastore())
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config("Synapse event creator", config_options)
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.event_creator"
-
-    assert config.worker_replication_http_port is not None
-
-    # This should only be done on the user directory worker or the master
-    config.update_user_directory = False
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    ss = EventCreatorServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ss, config, use_worker_options=True)
-
-    ss.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ss, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-event-creator", config)
+import sys
 
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/app/federation_reader.py b/synapse/app/federation_reader.py
index d055d11b23..add43147b3 100644
--- a/synapse/app/federation_reader.py
+++ b/synapse/app/federation_reader.py
@@ -13,177 +13,11 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
-import sys
-
-from twisted.internet import reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.api.urls import FEDERATION_PREFIX, SERVER_KEY_V2_PREFIX
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.federation.transport.server import TransportLayerServer
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage._base import BaseSlavedStore
-from synapse.replication.slave.storage.account_data import SlavedAccountDataStore
-from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
-from synapse.replication.slave.storage.devices import SlavedDeviceStore
-from synapse.replication.slave.storage.directory import DirectoryStore
-from synapse.replication.slave.storage.events import SlavedEventStore
-from synapse.replication.slave.storage.groups import SlavedGroupServerStore
-from synapse.replication.slave.storage.keys import SlavedKeyStore
-from synapse.replication.slave.storage.profile import SlavedProfileStore
-from synapse.replication.slave.storage.push_rule import SlavedPushRuleStore
-from synapse.replication.slave.storage.pushers import SlavedPusherStore
-from synapse.replication.slave.storage.receipts import SlavedReceiptsStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.slave.storage.room import RoomStore
-from synapse.replication.slave.storage.transactions import SlavedTransactionStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.rest.key.v2 import KeyApiV2Resource
-from synapse.server import HomeServer
-from synapse.storage.data_stores.main.monthly_active_users import (
-    MonthlyActiveUsersWorkerStore,
-)
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.federation_reader")
-
-
-class FederationReaderSlavedStore(
-    SlavedAccountDataStore,
-    SlavedProfileStore,
-    SlavedApplicationServiceStore,
-    SlavedPusherStore,
-    SlavedPushRuleStore,
-    SlavedReceiptsStore,
-    SlavedEventStore,
-    SlavedKeyStore,
-    SlavedRegistrationStore,
-    SlavedGroupServerStore,
-    SlavedDeviceStore,
-    RoomStore,
-    DirectoryStore,
-    SlavedTransactionStore,
-    MonthlyActiveUsersWorkerStore,
-    BaseSlavedStore,
-):
-    pass
-
-
-class FederationReaderServer(HomeServer):
-    DATASTORE_CLASS = FederationReaderSlavedStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-                elif name == "federation":
-                    resources.update({FEDERATION_PREFIX: TransportLayerServer(self)})
-                if name == "openid" and "federation" not in res["names"]:
-                    # Only load the openid resource separately if federation resource
-                    # is not specified since federation resource includes openid
-                    # resource.
-                    resources.update(
-                        {
-                            FEDERATION_PREFIX: TransportLayerServer(
-                                self, servlet_groups=["openid"]
-                            )
-                        }
-                    )
-
-                if name in ["keys", "federation"]:
-                    resources[SERVER_KEY_V2_PREFIX] = KeyApiV2Resource(self)
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-            reactor=self.get_reactor(),
-        )
 
-        logger.info("Synapse federation reader now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return ReplicationClientHandler(self.get_datastore())
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config(
-            "Synapse federation reader", config_options
-        )
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.federation_reader"
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    ss = FederationReaderServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ss, config, use_worker_options=True)
-
-    ss.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ss, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-federation-reader", config)
+import sys
 
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/app/federation_sender.py b/synapse/app/federation_sender.py
index 63a91f1177..add43147b3 100644
--- a/synapse/app/federation_sender.py
+++ b/synapse/app/federation_sender.py
@@ -13,308 +13,11 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
-import sys
-
-from twisted.internet import defer, reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.federation import send_queue
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext, run_in_background
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.metrics.background_process_metrics import run_as_background_process
-from synapse.replication.slave.storage.deviceinbox import SlavedDeviceInboxStore
-from synapse.replication.slave.storage.devices import SlavedDeviceStore
-from synapse.replication.slave.storage.events import SlavedEventStore
-from synapse.replication.slave.storage.presence import SlavedPresenceStore
-from synapse.replication.slave.storage.receipts import SlavedReceiptsStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.slave.storage.transactions import SlavedTransactionStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.replication.tcp.streams._base import (
-    DeviceListsStream,
-    ReceiptsStream,
-    ToDeviceStream,
-)
-from synapse.server import HomeServer
-from synapse.storage.database import Database
-from synapse.types import ReadReceipt
-from synapse.util.async_helpers import Linearizer
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.federation_sender")
-
-
-class FederationSenderSlaveStore(
-    SlavedDeviceInboxStore,
-    SlavedTransactionStore,
-    SlavedReceiptsStore,
-    SlavedEventStore,
-    SlavedRegistrationStore,
-    SlavedDeviceStore,
-    SlavedPresenceStore,
-):
-    def __init__(self, database: Database, db_conn, hs):
-        super(FederationSenderSlaveStore, self).__init__(database, db_conn, hs)
-
-        # We pull out the current federation stream position now so that we
-        # always have a known value for the federation position in memory so
-        # that we don't have to bounce via a deferred once when we start the
-        # replication streams.
-        self.federation_out_pos_startup = self._get_federation_out_pos(db_conn)
-
-    def _get_federation_out_pos(self, db_conn):
-        sql = "SELECT stream_id FROM federation_stream_position WHERE type = ?"
-        sql = self.database_engine.convert_param_style(sql)
-
-        txn = db_conn.cursor()
-        txn.execute(sql, ("federation",))
-        rows = txn.fetchall()
-        txn.close()
-
-        return rows[0][0] if rows else -1
-
-
-class FederationSenderServer(HomeServer):
-    DATASTORE_CLASS = FederationSenderSlaveStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-        )
-
-        logger.info("Synapse federation_sender now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return FederationSenderReplicationHandler(self)
-
-
-class FederationSenderReplicationHandler(ReplicationClientHandler):
-    def __init__(self, hs):
-        super(FederationSenderReplicationHandler, self).__init__(hs.get_datastore())
-        self.send_handler = FederationSenderHandler(hs, self)
-
-    async def on_rdata(self, stream_name, token, rows):
-        await super(FederationSenderReplicationHandler, self).on_rdata(
-            stream_name, token, rows
-        )
-        self.send_handler.process_replication_rows(stream_name, token, rows)
-
-    def get_streams_to_replicate(self):
-        args = super(
-            FederationSenderReplicationHandler, self
-        ).get_streams_to_replicate()
-        args.update(self.send_handler.stream_positions())
-        return args
-
-    def on_remote_server_up(self, server: str):
-        """Called when get a new REMOTE_SERVER_UP command."""
-
-        # Let's wake up the transaction queue for the server in case we have
-        # pending stuff to send to it.
-        self.send_handler.wake_destination(server)
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config(
-            "Synapse federation sender", config_options
-        )
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
 
-    assert config.worker_app == "synapse.app.federation_sender"
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    if config.send_federation:
-        sys.stderr.write(
-            "\nThe send_federation must be disabled in the main synapse process"
-            "\nbefore they can be run in a separate worker."
-            "\nPlease add ``send_federation: false`` to the main config"
-            "\n"
-        )
-        sys.exit(1)
-
-    # Force the pushers to start since they will be disabled in the main config
-    config.send_federation = True
-
-    ss = FederationSenderServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ss, config, use_worker_options=True)
-
-    ss.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ss, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-federation-sender", config)
-
-
-class FederationSenderHandler(object):
-    """Processes the replication stream and forwards the appropriate entries
-    to the federation sender.
-    """
-
-    def __init__(self, hs: FederationSenderServer, replication_client):
-        self.store = hs.get_datastore()
-        self._is_mine_id = hs.is_mine_id
-        self.federation_sender = hs.get_federation_sender()
-        self.replication_client = replication_client
-
-        self.federation_position = self.store.federation_out_pos_startup
-        self._fed_position_linearizer = Linearizer(name="_fed_position_linearizer")
-
-        self._last_ack = self.federation_position
-
-        self._room_serials = {}
-        self._room_typing = {}
-
-    def on_start(self):
-        # There may be some events that are persisted but haven't been sent,
-        # so send them now.
-        self.federation_sender.notify_new_events(
-            self.store.get_room_max_stream_ordering()
-        )
-
-    def wake_destination(self, server: str):
-        self.federation_sender.wake_destination(server)
-
-    def stream_positions(self):
-        return {"federation": self.federation_position}
-
-    def process_replication_rows(self, stream_name, token, rows):
-        # The federation stream contains things that we want to send out, e.g.
-        # presence, typing, etc.
-        if stream_name == "federation":
-            send_queue.process_rows_for_federation(self.federation_sender, rows)
-            run_in_background(self.update_token, token)
-
-        # We also need to poke the federation sender when new events happen
-        elif stream_name == "events":
-            self.federation_sender.notify_new_events(token)
-
-        # ... and when new receipts happen
-        elif stream_name == ReceiptsStream.NAME:
-            run_as_background_process(
-                "process_receipts_for_federation", self._on_new_receipts, rows
-            )
-
-        # ... as well as device updates and messages
-        elif stream_name == DeviceListsStream.NAME:
-            hosts = set(row.destination for row in rows)
-            for host in hosts:
-                self.federation_sender.send_device_messages(host)
-
-        elif stream_name == ToDeviceStream.NAME:
-            # The to_device stream includes stuff to be pushed to both local
-            # clients and remote servers, so we ignore entities that start with
-            # '@' (since they'll be local users rather than destinations).
-            hosts = set(row.entity for row in rows if not row.entity.startswith("@"))
-            for host in hosts:
-                self.federation_sender.send_device_messages(host)
-
-    @defer.inlineCallbacks
-    def _on_new_receipts(self, rows):
-        """
-        Args:
-            rows (iterable[synapse.replication.tcp.streams.ReceiptsStreamRow]):
-                new receipts to be processed
-        """
-        for receipt in rows:
-            # we only want to send on receipts for our own users
-            if not self._is_mine_id(receipt.user_id):
-                continue
-            receipt_info = ReadReceipt(
-                receipt.room_id,
-                receipt.receipt_type,
-                receipt.user_id,
-                [receipt.event_id],
-                receipt.data,
-            )
-            yield self.federation_sender.send_read_receipt(receipt_info)
-
-    @defer.inlineCallbacks
-    def update_token(self, token):
-        try:
-            self.federation_position = token
-
-            # We linearize here to ensure we don't have races updating the token
-            with (yield self._fed_position_linearizer.queue(None)):
-                if self._last_ack < self.federation_position:
-                    yield self.store.update_federation_out_pos(
-                        "federation", self.federation_position
-                    )
-
-                    # We ACK this token over replication so that the master can drop
-                    # its in memory queues
-                    self.replication_client.send_federation_ack(
-                        self.federation_position
-                    )
-                    self._last_ack = self.federation_position
-        except Exception:
-            logger.exception("Error updating federation stream position")
+import sys
 
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/app/frontend_proxy.py b/synapse/app/frontend_proxy.py
index 30e435eead..add43147b3 100644
--- a/synapse/app/frontend_proxy.py
+++ b/synapse/app/frontend_proxy.py
@@ -13,241 +13,11 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
-import sys
-
-from twisted.internet import defer, reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.api.errors import HttpResponseException, SynapseError
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.http.server import JsonResource
-from synapse.http.servlet import RestServlet, parse_json_object_from_request
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage._base import BaseSlavedStore
-from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
-from synapse.replication.slave.storage.client_ips import SlavedClientIpStore
-from synapse.replication.slave.storage.devices import SlavedDeviceStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.rest.client.v2_alpha._base import client_patterns
-from synapse.server import HomeServer
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.frontend_proxy")
-
-
-class PresenceStatusStubServlet(RestServlet):
-    PATTERNS = client_patterns("/presence/(?P<user_id>[^/]*)/status")
-
-    def __init__(self, hs):
-        super(PresenceStatusStubServlet, self).__init__()
-        self.http_client = hs.get_simple_http_client()
-        self.auth = hs.get_auth()
-        self.main_uri = hs.config.worker_main_http_uri
-
-    @defer.inlineCallbacks
-    def on_GET(self, request, user_id):
-        # Pass through the auth headers, if any, in case the access token
-        # is there.
-        auth_headers = request.requestHeaders.getRawHeaders("Authorization", [])
-        headers = {"Authorization": auth_headers}
-
-        try:
-            result = yield self.http_client.get_json(
-                self.main_uri + request.uri.decode("ascii"), headers=headers
-            )
-        except HttpResponseException as e:
-            raise e.to_synapse_error()
-
-        return 200, result
-
-    @defer.inlineCallbacks
-    def on_PUT(self, request, user_id):
-        yield self.auth.get_user_by_req(request)
-        return 200, {}
-
-
-class KeyUploadServlet(RestServlet):
-    PATTERNS = client_patterns("/keys/upload(/(?P<device_id>[^/]+))?$")
-
-    def __init__(self, hs):
-        """
-        Args:
-            hs (synapse.server.HomeServer): server
-        """
-        super(KeyUploadServlet, self).__init__()
-        self.auth = hs.get_auth()
-        self.store = hs.get_datastore()
-        self.http_client = hs.get_simple_http_client()
-        self.main_uri = hs.config.worker_main_http_uri
-
-    @defer.inlineCallbacks
-    def on_POST(self, request, device_id):
-        requester = yield self.auth.get_user_by_req(request, allow_guest=True)
-        user_id = requester.user.to_string()
-        body = parse_json_object_from_request(request)
-
-        if device_id is not None:
-            # passing the device_id here is deprecated; however, we allow it
-            # for now for compatibility with older clients.
-            if requester.device_id is not None and device_id != requester.device_id:
-                logger.warning(
-                    "Client uploading keys for a different device "
-                    "(logged in as %s, uploading for %s)",
-                    requester.device_id,
-                    device_id,
-                )
-        else:
-            device_id = requester.device_id
-
-        if device_id is None:
-            raise SynapseError(
-                400, "To upload keys, you must pass device_id when authenticating"
-            )
-
-        if body:
-            # They're actually trying to upload something, proxy to main synapse.
-            # Pass through the auth headers, if any, in case the access token
-            # is there.
-            auth_headers = request.requestHeaders.getRawHeaders(b"Authorization", [])
-            headers = {"Authorization": auth_headers}
-            result = yield self.http_client.post_json_get_json(
-                self.main_uri + request.uri.decode("ascii"), body, headers=headers
-            )
-
-            return 200, result
-        else:
-            # Just interested in counts.
-            result = yield self.store.count_e2e_one_time_keys(user_id, device_id)
-            return 200, {"one_time_key_counts": result}
-
-
-class FrontendProxySlavedStore(
-    SlavedDeviceStore,
-    SlavedClientIpStore,
-    SlavedApplicationServiceStore,
-    SlavedRegistrationStore,
-    BaseSlavedStore,
-):
-    pass
 
+import sys
 
-class FrontendProxyServer(HomeServer):
-    DATASTORE_CLASS = FrontendProxySlavedStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-                elif name == "client":
-                    resource = JsonResource(self, canonical_json=False)
-                    KeyUploadServlet(self).register(resource)
-
-                    # If presence is disabled, use the stub servlet that does
-                    # not allow sending presence
-                    if not self.config.use_presence:
-                        PresenceStatusStubServlet(self).register(resource)
-
-                    resources.update(
-                        {
-                            "/_matrix/client/r0": resource,
-                            "/_matrix/client/unstable": resource,
-                            "/_matrix/client/v2_alpha": resource,
-                            "/_matrix/client/api/v1": resource,
-                        }
-                    )
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-            reactor=self.get_reactor(),
-        )
-
-        logger.info("Synapse client reader now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return ReplicationClientHandler(self.get_datastore())
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config("Synapse frontend proxy", config_options)
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.frontend_proxy"
-
-    assert config.worker_main_http_uri is not None
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    ss = FrontendProxyServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ss, config, use_worker_options=True)
-
-    ss.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ss, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-frontend-proxy", config)
-
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/app/generic_worker.py b/synapse/app/generic_worker.py
new file mode 100644
index 0000000000..b2c764bfe8
--- /dev/null
+++ b/synapse/app/generic_worker.py
@@ -0,0 +1,923 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# Copyright 2016 OpenMarket Ltd
+# Copyright 2020 The Matrix.org Foundation C.I.C.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import contextlib
+import logging
+import sys
+
+from twisted.internet import defer, reactor
+from twisted.web.resource import NoResource
+
+import synapse
+import synapse.events
+from synapse.api.constants import EventTypes
+from synapse.api.errors import HttpResponseException, SynapseError
+from synapse.api.urls import (
+    CLIENT_API_PREFIX,
+    FEDERATION_PREFIX,
+    LEGACY_MEDIA_PREFIX,
+    MEDIA_PREFIX,
+    SERVER_KEY_V2_PREFIX,
+)
+from synapse.app import _base
+from synapse.config._base import ConfigError
+from synapse.config.homeserver import HomeServerConfig
+from synapse.config.logger import setup_logging
+from synapse.federation import send_queue
+from synapse.federation.transport.server import TransportLayerServer
+from synapse.handlers.presence import PresenceHandler, get_interested_parties
+from synapse.http.server import JsonResource
+from synapse.http.servlet import RestServlet, parse_json_object_from_request
+from synapse.http.site import SynapseSite
+from synapse.logging.context import LoggingContext, run_in_background
+from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
+from synapse.metrics.background_process_metrics import run_as_background_process
+from synapse.replication.slave.storage._base import BaseSlavedStore, __func__
+from synapse.replication.slave.storage.account_data import SlavedAccountDataStore
+from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
+from synapse.replication.slave.storage.client_ips import SlavedClientIpStore
+from synapse.replication.slave.storage.deviceinbox import SlavedDeviceInboxStore
+from synapse.replication.slave.storage.devices import SlavedDeviceStore
+from synapse.replication.slave.storage.directory import DirectoryStore
+from synapse.replication.slave.storage.events import SlavedEventStore
+from synapse.replication.slave.storage.filtering import SlavedFilteringStore
+from synapse.replication.slave.storage.groups import SlavedGroupServerStore
+from synapse.replication.slave.storage.keys import SlavedKeyStore
+from synapse.replication.slave.storage.presence import SlavedPresenceStore
+from synapse.replication.slave.storage.profile import SlavedProfileStore
+from synapse.replication.slave.storage.push_rule import SlavedPushRuleStore
+from synapse.replication.slave.storage.pushers import SlavedPusherStore
+from synapse.replication.slave.storage.receipts import SlavedReceiptsStore
+from synapse.replication.slave.storage.registration import SlavedRegistrationStore
+from synapse.replication.slave.storage.room import RoomStore
+from synapse.replication.slave.storage.transactions import SlavedTransactionStore
+from synapse.replication.tcp.client import ReplicationClientHandler
+from synapse.replication.tcp.streams._base import (
+    DeviceListsStream,
+    ReceiptsStream,
+    ToDeviceStream,
+)
+from synapse.replication.tcp.streams.events import EventsStreamEventRow, EventsStreamRow
+from synapse.rest.admin import register_servlets_for_media_repo
+from synapse.rest.client.v1 import events
+from synapse.rest.client.v1.initial_sync import InitialSyncRestServlet
+from synapse.rest.client.v1.login import LoginRestServlet
+from synapse.rest.client.v1.profile import (
+    ProfileAvatarURLRestServlet,
+    ProfileDisplaynameRestServlet,
+    ProfileRestServlet,
+)
+from synapse.rest.client.v1.push_rule import PushRuleRestServlet
+from synapse.rest.client.v1.room import (
+    JoinedRoomMemberListRestServlet,
+    JoinRoomAliasServlet,
+    PublicRoomListRestServlet,
+    RoomEventContextServlet,
+    RoomInitialSyncRestServlet,
+    RoomMemberListRestServlet,
+    RoomMembershipRestServlet,
+    RoomMessageListRestServlet,
+    RoomSendEventRestServlet,
+    RoomStateEventRestServlet,
+    RoomStateRestServlet,
+)
+from synapse.rest.client.v1.voip import VoipRestServlet
+from synapse.rest.client.v2_alpha import groups, sync, user_directory
+from synapse.rest.client.v2_alpha._base import client_patterns
+from synapse.rest.client.v2_alpha.account import ThreepidRestServlet
+from synapse.rest.client.v2_alpha.keys import KeyChangesServlet, KeyQueryServlet
+from synapse.rest.client.v2_alpha.register import RegisterRestServlet
+from synapse.rest.client.versions import VersionsRestServlet
+from synapse.rest.key.v2 import KeyApiV2Resource
+from synapse.server import HomeServer
+from synapse.storage.data_stores.main.media_repository import MediaRepositoryStore
+from synapse.storage.data_stores.main.monthly_active_users import (
+    MonthlyActiveUsersWorkerStore,
+)
+from synapse.storage.data_stores.main.presence import UserPresenceState
+from synapse.storage.data_stores.main.user_directory import UserDirectoryStore
+from synapse.types import ReadReceipt
+from synapse.util.async_helpers import Linearizer
+from synapse.util.httpresourcetree import create_resource_tree
+from synapse.util.manhole import manhole
+from synapse.util.stringutils import random_string
+from synapse.util.versionstring import get_version_string
+
+logger = logging.getLogger("synapse.app.generic_worker")
+
+
+class PresenceStatusStubServlet(RestServlet):
+    """If presence is disabled this servlet can be used to stub out setting
+    presence status, while proxying the getters to the master instance.
+    """
+
+    PATTERNS = client_patterns("/presence/(?P<user_id>[^/]*)/status")
+
+    def __init__(self, hs):
+        super(PresenceStatusStubServlet, self).__init__()
+        self.http_client = hs.get_simple_http_client()
+        self.auth = hs.get_auth()
+        self.main_uri = hs.config.worker_main_http_uri
+
+    async def on_GET(self, request, user_id):
+        # Pass through the auth headers, if any, in case the access token
+        # is there.
+        auth_headers = request.requestHeaders.getRawHeaders("Authorization", [])
+        headers = {"Authorization": auth_headers}
+
+        try:
+            result = await self.http_client.get_json(
+                self.main_uri + request.uri.decode("ascii"), headers=headers
+            )
+        except HttpResponseException as e:
+            raise e.to_synapse_error()
+
+        return 200, result
+
+    async def on_PUT(self, request, user_id):
+        await self.auth.get_user_by_req(request)
+        return 200, {}
+
+
+class KeyUploadServlet(RestServlet):
+    """An implementation of the `KeyUploadServlet` that responds to read only
+    requests, but otherwise proxies through to the master instance.
+    """
+
+    PATTERNS = client_patterns("/keys/upload(/(?P<device_id>[^/]+))?$")
+
+    def __init__(self, hs):
+        """
+        Args:
+            hs (synapse.server.HomeServer): server
+        """
+        super(KeyUploadServlet, self).__init__()
+        self.auth = hs.get_auth()
+        self.store = hs.get_datastore()
+        self.http_client = hs.get_simple_http_client()
+        self.main_uri = hs.config.worker_main_http_uri
+
+    async def on_POST(self, request, device_id):
+        requester = await self.auth.get_user_by_req(request, allow_guest=True)
+        user_id = requester.user.to_string()
+        body = parse_json_object_from_request(request)
+
+        if device_id is not None:
+            # passing the device_id here is deprecated; however, we allow it
+            # for now for compatibility with older clients.
+            if requester.device_id is not None and device_id != requester.device_id:
+                logger.warning(
+                    "Client uploading keys for a different device "
+                    "(logged in as %s, uploading for %s)",
+                    requester.device_id,
+                    device_id,
+                )
+        else:
+            device_id = requester.device_id
+
+        if device_id is None:
+            raise SynapseError(
+                400, "To upload keys, you must pass device_id when authenticating"
+            )
+
+        if body:
+            # They're actually trying to upload something, proxy to main synapse.
+            # Pass through the auth headers, if any, in case the access token
+            # is there.
+            auth_headers = request.requestHeaders.getRawHeaders(b"Authorization", [])
+            headers = {"Authorization": auth_headers}
+            result = await self.http_client.post_json_get_json(
+                self.main_uri + request.uri.decode("ascii"), body, headers=headers
+            )
+
+            return 200, result
+        else:
+            # Just interested in counts.
+            result = await self.store.count_e2e_one_time_keys(user_id, device_id)
+            return 200, {"one_time_key_counts": result}
+
+
+UPDATE_SYNCING_USERS_MS = 10 * 1000
+
+
+class GenericWorkerPresence(object):
+    def __init__(self, hs):
+        self.hs = hs
+        self.is_mine_id = hs.is_mine_id
+        self.http_client = hs.get_simple_http_client()
+        self.store = hs.get_datastore()
+        self.user_to_num_current_syncs = {}
+        self.clock = hs.get_clock()
+        self.notifier = hs.get_notifier()
+
+        active_presence = self.store.take_presence_startup_info()
+        self.user_to_current_state = {state.user_id: state for state in active_presence}
+
+        # user_id -> last_sync_ms. Lists the users that have stopped syncing
+        # but we haven't notified the master of that yet
+        self.users_going_offline = {}
+
+        self._send_stop_syncing_loop = self.clock.looping_call(
+            self.send_stop_syncing, UPDATE_SYNCING_USERS_MS
+        )
+
+        self.process_id = random_string(16)
+        logger.info("Presence process_id is %r", self.process_id)
+
+    def send_user_sync(self, user_id, is_syncing, last_sync_ms):
+        if self.hs.config.use_presence:
+            self.hs.get_tcp_replication().send_user_sync(
+                user_id, is_syncing, last_sync_ms
+            )
+
+    def mark_as_coming_online(self, user_id):
+        """A user has started syncing. Send a UserSync to the master, unless they
+        had recently stopped syncing.
+
+        Args:
+            user_id (str)
+        """
+        going_offline = self.users_going_offline.pop(user_id, None)
+        if not going_offline:
+            # Safe to skip because we haven't yet told the master they were offline
+            self.send_user_sync(user_id, True, self.clock.time_msec())
+
+    def mark_as_going_offline(self, user_id):
+        """A user has stopped syncing. We wait before notifying the master as
+        its likely they'll come back soon. This allows us to avoid sending
+        a stopped syncing immediately followed by a started syncing notification
+        to the master
+
+        Args:
+            user_id (str)
+        """
+        self.users_going_offline[user_id] = self.clock.time_msec()
+
+    def send_stop_syncing(self):
+        """Check if there are any users who have stopped syncing a while ago
+        and haven't come back yet. If there are poke the master about them.
+        """
+        now = self.clock.time_msec()
+        for user_id, last_sync_ms in list(self.users_going_offline.items()):
+            if now - last_sync_ms > UPDATE_SYNCING_USERS_MS:
+                self.users_going_offline.pop(user_id, None)
+                self.send_user_sync(user_id, False, last_sync_ms)
+
+    def set_state(self, user, state, ignore_status_msg=False):
+        # TODO Hows this supposed to work?
+        return defer.succeed(None)
+
+    get_states = __func__(PresenceHandler.get_states)
+    get_state = __func__(PresenceHandler.get_state)
+    current_state_for_users = __func__(PresenceHandler.current_state_for_users)
+
+    def user_syncing(self, user_id, affect_presence):
+        if affect_presence:
+            curr_sync = self.user_to_num_current_syncs.get(user_id, 0)
+            self.user_to_num_current_syncs[user_id] = curr_sync + 1
+
+            # If we went from no in flight sync to some, notify replication
+            if self.user_to_num_current_syncs[user_id] == 1:
+                self.mark_as_coming_online(user_id)
+
+        def _end():
+            # We check that the user_id is in user_to_num_current_syncs because
+            # user_to_num_current_syncs may have been cleared if we are
+            # shutting down.
+            if affect_presence and user_id in self.user_to_num_current_syncs:
+                self.user_to_num_current_syncs[user_id] -= 1
+
+                # If we went from one in flight sync to non, notify replication
+                if self.user_to_num_current_syncs[user_id] == 0:
+                    self.mark_as_going_offline(user_id)
+
+        @contextlib.contextmanager
+        def _user_syncing():
+            try:
+                yield
+            finally:
+                _end()
+
+        return defer.succeed(_user_syncing())
+
+    @defer.inlineCallbacks
+    def notify_from_replication(self, states, stream_id):
+        parties = yield get_interested_parties(self.store, states)
+        room_ids_to_states, users_to_states = parties
+
+        self.notifier.on_new_event(
+            "presence_key",
+            stream_id,
+            rooms=room_ids_to_states.keys(),
+            users=users_to_states.keys(),
+        )
+
+    @defer.inlineCallbacks
+    def process_replication_rows(self, token, rows):
+        states = [
+            UserPresenceState(
+                row.user_id,
+                row.state,
+                row.last_active_ts,
+                row.last_federation_update_ts,
+                row.last_user_sync_ts,
+                row.status_msg,
+                row.currently_active,
+            )
+            for row in rows
+        ]
+
+        for state in states:
+            self.user_to_current_state[state.user_id] = state
+
+        stream_id = token
+        yield self.notify_from_replication(states, stream_id)
+
+    def get_currently_syncing_users(self):
+        if self.hs.config.use_presence:
+            return [
+                user_id
+                for user_id, count in self.user_to_num_current_syncs.items()
+                if count > 0
+            ]
+        else:
+            return set()
+
+
+class GenericWorkerTyping(object):
+    def __init__(self, hs):
+        self._latest_room_serial = 0
+        self._reset()
+
+    def _reset(self):
+        """
+        Reset the typing handler's data caches.
+        """
+        # map room IDs to serial numbers
+        self._room_serials = {}
+        # map room IDs to sets of users currently typing
+        self._room_typing = {}
+
+    def stream_positions(self):
+        # We must update this typing token from the response of the previous
+        # sync. In particular, the stream id may "reset" back to zero/a low
+        # value which we *must* use for the next replication request.
+        return {"typing": self._latest_room_serial}
+
+    def process_replication_rows(self, token, rows):
+        if self._latest_room_serial > token:
+            # The master has gone backwards. To prevent inconsistent data, just
+            # clear everything.
+            self._reset()
+
+        # Set the latest serial token to whatever the server gave us.
+        self._latest_room_serial = token
+
+        for row in rows:
+            self._room_serials[row.room_id] = token
+            self._room_typing[row.room_id] = row.user_ids
+
+
+class GenericWorkerSlavedStore(
+    # FIXME(#3714): We need to add UserDirectoryStore as we write directly
+    # rather than going via the correct worker.
+    UserDirectoryStore,
+    SlavedDeviceInboxStore,
+    SlavedDeviceStore,
+    SlavedReceiptsStore,
+    SlavedPushRuleStore,
+    SlavedGroupServerStore,
+    SlavedAccountDataStore,
+    SlavedPusherStore,
+    SlavedEventStore,
+    SlavedKeyStore,
+    RoomStore,
+    DirectoryStore,
+    SlavedApplicationServiceStore,
+    SlavedRegistrationStore,
+    SlavedTransactionStore,
+    SlavedProfileStore,
+    SlavedClientIpStore,
+    SlavedPresenceStore,
+    SlavedFilteringStore,
+    MonthlyActiveUsersWorkerStore,
+    MediaRepositoryStore,
+    BaseSlavedStore,
+):
+    def __init__(self, database, db_conn, hs):
+        super(GenericWorkerSlavedStore, self).__init__(database, db_conn, hs)
+
+        # We pull out the current federation stream position now so that we
+        # always have a known value for the federation position in memory so
+        # that we don't have to bounce via a deferred once when we start the
+        # replication streams.
+        self.federation_out_pos_startup = self._get_federation_out_pos(db_conn)
+
+    def _get_federation_out_pos(self, db_conn):
+        sql = "SELECT stream_id FROM federation_stream_position WHERE type = ?"
+        sql = self.database_engine.convert_param_style(sql)
+
+        txn = db_conn.cursor()
+        txn.execute(sql, ("federation",))
+        rows = txn.fetchall()
+        txn.close()
+
+        return rows[0][0] if rows else -1
+
+
+class GenericWorkerServer(HomeServer):
+    DATASTORE_CLASS = GenericWorkerSlavedStore
+
+    def _listen_http(self, listener_config):
+        port = listener_config["port"]
+        bind_addresses = listener_config["bind_addresses"]
+        site_tag = listener_config.get("tag", port)
+        resources = {}
+        for res in listener_config["resources"]:
+            for name in res["names"]:
+                if name == "metrics":
+                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
+                elif name == "client":
+                    resource = JsonResource(self, canonical_json=False)
+
+                    PublicRoomListRestServlet(self).register(resource)
+                    RoomMemberListRestServlet(self).register(resource)
+                    JoinedRoomMemberListRestServlet(self).register(resource)
+                    RoomStateRestServlet(self).register(resource)
+                    RoomEventContextServlet(self).register(resource)
+                    RoomMessageListRestServlet(self).register(resource)
+                    RegisterRestServlet(self).register(resource)
+                    LoginRestServlet(self).register(resource)
+                    ThreepidRestServlet(self).register(resource)
+                    KeyQueryServlet(self).register(resource)
+                    KeyChangesServlet(self).register(resource)
+                    VoipRestServlet(self).register(resource)
+                    PushRuleRestServlet(self).register(resource)
+                    VersionsRestServlet(self).register(resource)
+                    RoomSendEventRestServlet(self).register(resource)
+                    RoomMembershipRestServlet(self).register(resource)
+                    RoomStateEventRestServlet(self).register(resource)
+                    JoinRoomAliasServlet(self).register(resource)
+                    ProfileAvatarURLRestServlet(self).register(resource)
+                    ProfileDisplaynameRestServlet(self).register(resource)
+                    ProfileRestServlet(self).register(resource)
+                    KeyUploadServlet(self).register(resource)
+
+                    sync.register_servlets(self, resource)
+                    events.register_servlets(self, resource)
+                    InitialSyncRestServlet(self).register(resource)
+                    RoomInitialSyncRestServlet(self).register(resource)
+
+                    user_directory.register_servlets(self, resource)
+
+                    # If presence is disabled, use the stub servlet that does
+                    # not allow sending presence
+                    if not self.config.use_presence:
+                        PresenceStatusStubServlet(self).register(resource)
+
+                    groups.register_servlets(self, resource)
+
+                    resources.update({CLIENT_API_PREFIX: resource})
+                elif name == "federation":
+                    resources.update({FEDERATION_PREFIX: TransportLayerServer(self)})
+                elif name == "media":
+                    if self.config.can_load_media_repo:
+                        media_repo = self.get_media_repository_resource()
+
+                        # We need to serve the admin servlets for media on the
+                        # worker.
+                        admin_resource = JsonResource(self, canonical_json=False)
+                        register_servlets_for_media_repo(self, admin_resource)
+
+                        resources.update(
+                            {
+                                MEDIA_PREFIX: media_repo,
+                                LEGACY_MEDIA_PREFIX: media_repo,
+                                "/_synapse/admin": admin_resource,
+                            }
+                        )
+                    else:
+                        logger.warning(
+                            "A 'media' listener is configured but the media"
+                            " repository is disabled. Ignoring."
+                        )
+
+                if name == "openid" and "federation" not in res["names"]:
+                    # Only load the openid resource separately if federation resource
+                    # is not specified since federation resource includes openid
+                    # resource.
+                    resources.update(
+                        {
+                            FEDERATION_PREFIX: TransportLayerServer(
+                                self, servlet_groups=["openid"]
+                            )
+                        }
+                    )
+
+                if name in ["keys", "federation"]:
+                    resources[SERVER_KEY_V2_PREFIX] = KeyApiV2Resource(self)
+
+        root_resource = create_resource_tree(resources, NoResource())
+
+        _base.listen_tcp(
+            bind_addresses,
+            port,
+            SynapseSite(
+                "synapse.access.http.%s" % (site_tag,),
+                site_tag,
+                listener_config,
+                root_resource,
+                self.version_string,
+            ),
+            reactor=self.get_reactor(),
+        )
+
+        logger.info("Synapse worker now listening on port %d", port)
+
+    def start_listening(self, listeners):
+        for listener in listeners:
+            if listener["type"] == "http":
+                self._listen_http(listener)
+            elif listener["type"] == "manhole":
+                _base.listen_tcp(
+                    listener["bind_addresses"],
+                    listener["port"],
+                    manhole(
+                        username="matrix", password="rabbithole", globals={"hs": self}
+                    ),
+                )
+            elif listener["type"] == "metrics":
+                if not self.get_config().enable_metrics:
+                    logger.warning(
+                        (
+                            "Metrics listener configured, but "
+                            "enable_metrics is not True!"
+                        )
+                    )
+                else:
+                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
+            else:
+                logger.warning("Unrecognized listener type: %s", listener["type"])
+
+        self.get_tcp_replication().start_replication(self)
+
+    def remove_pusher(self, app_id, push_key, user_id):
+        self.get_tcp_replication().send_remove_pusher(app_id, push_key, user_id)
+
+    def build_tcp_replication(self):
+        return GenericWorkerReplicationHandler(self)
+
+    def build_presence_handler(self):
+        return GenericWorkerPresence(self)
+
+    def build_typing_handler(self):
+        return GenericWorkerTyping(self)
+
+
+class GenericWorkerReplicationHandler(ReplicationClientHandler):
+    def __init__(self, hs):
+        super(GenericWorkerReplicationHandler, self).__init__(hs.get_datastore())
+
+        self.store = hs.get_datastore()
+        self.typing_handler = hs.get_typing_handler()
+        # NB this is a SynchrotronPresence, not a normal PresenceHandler
+        self.presence_handler = hs.get_presence_handler()
+        self.notifier = hs.get_notifier()
+
+        self.notify_pushers = hs.config.start_pushers
+        self.pusher_pool = hs.get_pusherpool()
+
+        if hs.config.send_federation:
+            self.send_handler = FederationSenderHandler(hs, self)
+        else:
+            self.send_handler = None
+
+    async def on_rdata(self, stream_name, token, rows):
+        await super(GenericWorkerReplicationHandler, self).on_rdata(
+            stream_name, token, rows
+        )
+        run_in_background(self.process_and_notify, stream_name, token, rows)
+
+    def get_streams_to_replicate(self):
+        args = super(GenericWorkerReplicationHandler, self).get_streams_to_replicate()
+        args.update(self.typing_handler.stream_positions())
+        if self.send_handler:
+            args.update(self.send_handler.stream_positions())
+        return args
+
+    def get_currently_syncing_users(self):
+        return self.presence_handler.get_currently_syncing_users()
+
+    async def process_and_notify(self, stream_name, token, rows):
+        try:
+            if self.send_handler:
+                self.send_handler.process_replication_rows(stream_name, token, rows)
+
+            if stream_name == "events":
+                # We shouldn't get multiple rows per token for events stream, so
+                # we don't need to optimise this for multiple rows.
+                for row in rows:
+                    if row.type != EventsStreamEventRow.TypeId:
+                        continue
+                    assert isinstance(row, EventsStreamRow)
+
+                    event = await self.store.get_event(
+                        row.data.event_id, allow_rejected=True
+                    )
+                    if event.rejected_reason:
+                        continue
+
+                    extra_users = ()
+                    if event.type == EventTypes.Member:
+                        extra_users = (event.state_key,)
+                    max_token = self.store.get_room_max_stream_ordering()
+                    self.notifier.on_new_room_event(
+                        event, token, max_token, extra_users
+                    )
+
+                await self.pusher_pool.on_new_notifications(token, token)
+            elif stream_name == "push_rules":
+                self.notifier.on_new_event(
+                    "push_rules_key", token, users=[row.user_id for row in rows]
+                )
+            elif stream_name in ("account_data", "tag_account_data"):
+                self.notifier.on_new_event(
+                    "account_data_key", token, users=[row.user_id for row in rows]
+                )
+            elif stream_name == "receipts":
+                self.notifier.on_new_event(
+                    "receipt_key", token, rooms=[row.room_id for row in rows]
+                )
+                await self.pusher_pool.on_new_receipts(
+                    token, token, {row.room_id for row in rows}
+                )
+            elif stream_name == "typing":
+                self.typing_handler.process_replication_rows(token, rows)
+                self.notifier.on_new_event(
+                    "typing_key", token, rooms=[row.room_id for row in rows]
+                )
+            elif stream_name == "to_device":
+                entities = [row.entity for row in rows if row.entity.startswith("@")]
+                if entities:
+                    self.notifier.on_new_event("to_device_key", token, users=entities)
+            elif stream_name == "device_lists":
+                all_room_ids = set()
+                for row in rows:
+                    room_ids = await self.store.get_rooms_for_user(row.user_id)
+                    all_room_ids.update(room_ids)
+                self.notifier.on_new_event("device_list_key", token, rooms=all_room_ids)
+            elif stream_name == "presence":
+                await self.presence_handler.process_replication_rows(token, rows)
+            elif stream_name == "receipts":
+                self.notifier.on_new_event(
+                    "groups_key", token, users=[row.user_id for row in rows]
+                )
+            elif stream_name == "pushers":
+                for row in rows:
+                    if row.deleted:
+                        self.stop_pusher(row.user_id, row.app_id, row.pushkey)
+                    else:
+                        await self.start_pusher(row.user_id, row.app_id, row.pushkey)
+        except Exception:
+            logger.exception("Error processing replication")
+
+    def stop_pusher(self, user_id, app_id, pushkey):
+        if not self.notify_pushers:
+            return
+
+        key = "%s:%s" % (app_id, pushkey)
+        pushers_for_user = self.pusher_pool.pushers.get(user_id, {})
+        pusher = pushers_for_user.pop(key, None)
+        if pusher is None:
+            return
+        logger.info("Stopping pusher %r / %r", user_id, key)
+        pusher.on_stop()
+
+    async def start_pusher(self, user_id, app_id, pushkey):
+        if not self.notify_pushers:
+            return
+
+        key = "%s:%s" % (app_id, pushkey)
+        logger.info("Starting pusher %r / %r", user_id, key)
+        return await self.pusher_pool.start_pusher_by_id(app_id, pushkey, user_id)
+
+    def on_remote_server_up(self, server: str):
+        """Called when get a new REMOTE_SERVER_UP command."""
+
+        # Let's wake up the transaction queue for the server in case we have
+        # pending stuff to send to it.
+        if self.send_handler:
+            self.send_handler.wake_destination(server)
+
+
+class FederationSenderHandler(object):
+    """Processes the replication stream and forwards the appropriate entries
+    to the federation sender.
+    """
+
+    def __init__(self, hs: GenericWorkerServer, replication_client):
+        self.store = hs.get_datastore()
+        self._is_mine_id = hs.is_mine_id
+        self.federation_sender = hs.get_federation_sender()
+        self.replication_client = replication_client
+
+        self.federation_position = self.store.federation_out_pos_startup
+        self._fed_position_linearizer = Linearizer(name="_fed_position_linearizer")
+
+        self._last_ack = self.federation_position
+
+        self._room_serials = {}
+        self._room_typing = {}
+
+    def on_start(self):
+        # There may be some events that are persisted but haven't been sent,
+        # so send them now.
+        self.federation_sender.notify_new_events(
+            self.store.get_room_max_stream_ordering()
+        )
+
+    def wake_destination(self, server: str):
+        self.federation_sender.wake_destination(server)
+
+    def stream_positions(self):
+        return {"federation": self.federation_position}
+
+    def process_replication_rows(self, stream_name, token, rows):
+        # The federation stream contains things that we want to send out, e.g.
+        # presence, typing, etc.
+        if stream_name == "federation":
+            send_queue.process_rows_for_federation(self.federation_sender, rows)
+            run_in_background(self.update_token, token)
+
+        # We also need to poke the federation sender when new events happen
+        elif stream_name == "events":
+            self.federation_sender.notify_new_events(token)
+
+        # ... and when new receipts happen
+        elif stream_name == ReceiptsStream.NAME:
+            run_as_background_process(
+                "process_receipts_for_federation", self._on_new_receipts, rows
+            )
+
+        # ... as well as device updates and messages
+        elif stream_name == DeviceListsStream.NAME:
+            hosts = {row.destination for row in rows}
+            for host in hosts:
+                self.federation_sender.send_device_messages(host)
+
+        elif stream_name == ToDeviceStream.NAME:
+            # The to_device stream includes stuff to be pushed to both local
+            # clients and remote servers, so we ignore entities that start with
+            # '@' (since they'll be local users rather than destinations).
+            hosts = {row.entity for row in rows if not row.entity.startswith("@")}
+            for host in hosts:
+                self.federation_sender.send_device_messages(host)
+
+    async def _on_new_receipts(self, rows):
+        """
+        Args:
+            rows (iterable[synapse.replication.tcp.streams.ReceiptsStreamRow]):
+                new receipts to be processed
+        """
+        for receipt in rows:
+            # we only want to send on receipts for our own users
+            if not self._is_mine_id(receipt.user_id):
+                continue
+            receipt_info = ReadReceipt(
+                receipt.room_id,
+                receipt.receipt_type,
+                receipt.user_id,
+                [receipt.event_id],
+                receipt.data,
+            )
+            await self.federation_sender.send_read_receipt(receipt_info)
+
+    async def update_token(self, token):
+        try:
+            self.federation_position = token
+
+            # We linearize here to ensure we don't have races updating the token
+            with (await self._fed_position_linearizer.queue(None)):
+                if self._last_ack < self.federation_position:
+                    await self.store.update_federation_out_pos(
+                        "federation", self.federation_position
+                    )
+
+                    # We ACK this token over replication so that the master can drop
+                    # its in memory queues
+                    self.replication_client.send_federation_ack(
+                        self.federation_position
+                    )
+                    self._last_ack = self.federation_position
+        except Exception:
+            logger.exception("Error updating federation stream position")
+
+
+def start(config_options):
+    try:
+        config = HomeServerConfig.load_config("Synapse worker", config_options)
+    except ConfigError as e:
+        sys.stderr.write("\n" + str(e) + "\n")
+        sys.exit(1)
+
+    # For backwards compatibility let any of the old app names.
+    assert config.worker_app in (
+        "synapse.app.appservice",
+        "synapse.app.client_reader",
+        "synapse.app.event_creator",
+        "synapse.app.federation_reader",
+        "synapse.app.federation_sender",
+        "synapse.app.frontend_proxy",
+        "synapse.app.generic_worker",
+        "synapse.app.media_repository",
+        "synapse.app.pusher",
+        "synapse.app.synchrotron",
+        "synapse.app.user_dir",
+    )
+
+    if config.worker_app == "synapse.app.appservice":
+        if config.notify_appservices:
+            sys.stderr.write(
+                "\nThe appservices must be disabled in the main synapse process"
+                "\nbefore they can be run in a separate worker."
+                "\nPlease add ``notify_appservices: false`` to the main config"
+                "\n"
+            )
+            sys.exit(1)
+
+        # Force the appservice to start since they will be disabled in the main config
+        config.notify_appservices = True
+
+    if config.worker_app == "synapse.app.pusher":
+        if config.start_pushers:
+            sys.stderr.write(
+                "\nThe pushers must be disabled in the main synapse process"
+                "\nbefore they can be run in a separate worker."
+                "\nPlease add ``start_pushers: false`` to the main config"
+                "\n"
+            )
+            sys.exit(1)
+
+        # Force the pushers to start since they will be disabled in the main config
+        config.start_pushers = True
+
+    if config.worker_app == "synapse.app.user_dir":
+        if config.update_user_directory:
+            sys.stderr.write(
+                "\nThe update_user_directory must be disabled in the main synapse process"
+                "\nbefore they can be run in a separate worker."
+                "\nPlease add ``update_user_directory: false`` to the main config"
+                "\n"
+            )
+            sys.exit(1)
+
+        # Force the pushers to start since they will be disabled in the main config
+        config.update_user_directory = True
+
+    if config.worker_app == "synapse.app.federation_sender":
+        if config.send_federation:
+            sys.stderr.write(
+                "\nThe send_federation must be disabled in the main synapse process"
+                "\nbefore they can be run in a separate worker."
+                "\nPlease add ``send_federation: false`` to the main config"
+                "\n"
+            )
+            sys.exit(1)
+
+        # Force the pushers to start since they will be disabled in the main config
+        config.send_federation = True
+
+    synapse.events.USE_FROZEN_DICTS = config.use_frozen_dicts
+
+    ss = GenericWorkerServer(
+        config.server_name,
+        config=config,
+        version_string="Synapse/" + get_version_string(synapse),
+    )
+
+    setup_logging(ss, config, use_worker_options=True)
+
+    ss.setup()
+    reactor.addSystemEventTrigger(
+        "before", "startup", _base.start, ss, config.worker_listeners
+    )
+
+    _base.start_worker_reactor("synapse-generic-worker", config)
+
+
+if __name__ == "__main__":
+    with LoggingContext("main"):
+        start(sys.argv[1:])
diff --git a/synapse/app/homeserver.py b/synapse/app/homeserver.py
index c2a334a2b0..f2b56a636f 100644
--- a/synapse/app/homeserver.py
+++ b/synapse/app/homeserver.py
@@ -298,6 +298,11 @@ class SynapseHomeServer(HomeServer):
 
 # Gauges to expose monthly active user control metrics
 current_mau_gauge = Gauge("synapse_admin_mau:current", "Current MAU")
+current_mau_by_service_gauge = Gauge(
+    "synapse_admin_mau_current_mau_by_service",
+    "Current MAU by service",
+    ["app_service"],
+)
 max_mau_gauge = Gauge("synapse_admin_mau:max", "MAU Limit")
 registered_reserved_users_mau_gauge = Gauge(
     "synapse_admin_mau:registered_reserved_users",
@@ -403,7 +408,6 @@ def setup(config_options):
 
             _base.start(hs, config.listeners)
 
-            hs.get_pusherpool().start()
             hs.get_datastore().db.updates.start_doing_background_updates()
         except Exception:
             # Print the exception and bail out.
@@ -585,12 +589,20 @@ def run(hs):
     @defer.inlineCallbacks
     def generate_monthly_active_users():
         current_mau_count = 0
+        current_mau_count_by_service = {}
         reserved_users = ()
         store = hs.get_datastore()
         if hs.config.limit_usage_by_mau or hs.config.mau_stats_only:
             current_mau_count = yield store.get_monthly_active_count()
+            current_mau_count_by_service = (
+                yield store.get_monthly_active_count_by_service()
+            )
             reserved_users = yield store.get_registered_reserved_users()
         current_mau_gauge.set(float(current_mau_count))
+
+        for app_service, count in current_mau_count_by_service.items():
+            current_mau_by_service_gauge.labels(app_service).set(float(count))
+
         registered_reserved_users_mau_gauge.set(float(len(reserved_users)))
         max_mau_gauge.set(float(hs.config.max_mau_value))
 
diff --git a/synapse/app/media_repository.py b/synapse/app/media_repository.py
index 5b5832214a..add43147b3 100644
--- a/synapse/app/media_repository.py
+++ b/synapse/app/media_repository.py
@@ -13,162 +13,11 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
-import sys
-
-from twisted.internet import reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.api.urls import LEGACY_MEDIA_PREFIX, MEDIA_PREFIX
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.http.server import JsonResource
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage._base import BaseSlavedStore
-from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
-from synapse.replication.slave.storage.client_ips import SlavedClientIpStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.slave.storage.room import RoomStore
-from synapse.replication.slave.storage.transactions import SlavedTransactionStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.rest.admin import register_servlets_for_media_repo
-from synapse.server import HomeServer
-from synapse.storage.data_stores.main.media_repository import MediaRepositoryStore
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.media_repository")
-
-
-class MediaRepositorySlavedStore(
-    RoomStore,
-    SlavedApplicationServiceStore,
-    SlavedRegistrationStore,
-    SlavedClientIpStore,
-    SlavedTransactionStore,
-    BaseSlavedStore,
-    MediaRepositoryStore,
-):
-    pass
-
-
-class MediaRepositoryServer(HomeServer):
-    DATASTORE_CLASS = MediaRepositorySlavedStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-                elif name == "media":
-                    media_repo = self.get_media_repository_resource()
-
-                    # We need to serve the admin servlets for media on the
-                    # worker.
-                    admin_resource = JsonResource(self, canonical_json=False)
-                    register_servlets_for_media_repo(self, admin_resource)
-
-                    resources.update(
-                        {
-                            MEDIA_PREFIX: media_repo,
-                            LEGACY_MEDIA_PREFIX: media_repo,
-                            "/_synapse/admin": admin_resource,
-                        }
-                    )
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-        )
 
-        logger.info("Synapse media repository now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return ReplicationClientHandler(self.get_datastore())
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config(
-            "Synapse media repository", config_options
-        )
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.media_repository"
-
-    if config.enable_media_repo:
-        _base.quit_with_error(
-            "enable_media_repo must be disabled in the main synapse process\n"
-            "before the media repo can be run in a separate worker.\n"
-            "Please add ``enable_media_repo: false`` to the main config\n"
-        )
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    ss = MediaRepositoryServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ss, config, use_worker_options=True)
-
-    ss.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ss, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-media-repository", config)
+import sys
 
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/app/pusher.py b/synapse/app/pusher.py
index e46b6ac598..add43147b3 100644
--- a/synapse/app/pusher.py
+++ b/synapse/app/pusher.py
@@ -13,213 +13,12 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import logging
-import sys
-
-from twisted.internet import defer, reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext, run_in_background
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage._base import __func__
-from synapse.replication.slave.storage.account_data import SlavedAccountDataStore
-from synapse.replication.slave.storage.events import SlavedEventStore
-from synapse.replication.slave.storage.pushers import SlavedPusherStore
-from synapse.replication.slave.storage.receipts import SlavedReceiptsStore
-from synapse.replication.slave.storage.room import RoomStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.server import HomeServer
-from synapse.storage import DataStore
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.pusher")
-
-
-class PusherSlaveStore(
-    SlavedEventStore,
-    SlavedPusherStore,
-    SlavedReceiptsStore,
-    SlavedAccountDataStore,
-    RoomStore,
-):
-    update_pusher_last_stream_ordering_and_success = __func__(
-        DataStore.update_pusher_last_stream_ordering_and_success
-    )
-
-    update_pusher_failing_since = __func__(DataStore.update_pusher_failing_since)
-
-    update_pusher_last_stream_ordering = __func__(
-        DataStore.update_pusher_last_stream_ordering
-    )
-
-    get_throttle_params_by_room = __func__(DataStore.get_throttle_params_by_room)
-
-    set_throttle_params = __func__(DataStore.set_throttle_params)
-
-    get_time_of_last_push_action_before = __func__(
-        DataStore.get_time_of_last_push_action_before
-    )
-
-    get_profile_displayname = __func__(DataStore.get_profile_displayname)
-
-
-class PusherServer(HomeServer):
-    DATASTORE_CLASS = PusherSlaveStore
-
-    def remove_pusher(self, app_id, push_key, user_id):
-        self.get_tcp_replication().send_remove_pusher(app_id, push_key, user_id)
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-        )
-
-        logger.info("Synapse pusher now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
 
-    def build_tcp_replication(self):
-        return PusherReplicationHandler(self)
-
-
-class PusherReplicationHandler(ReplicationClientHandler):
-    def __init__(self, hs):
-        super(PusherReplicationHandler, self).__init__(hs.get_datastore())
-
-        self.pusher_pool = hs.get_pusherpool()
-
-    async def on_rdata(self, stream_name, token, rows):
-        await super(PusherReplicationHandler, self).on_rdata(stream_name, token, rows)
-        run_in_background(self.poke_pushers, stream_name, token, rows)
-
-    @defer.inlineCallbacks
-    def poke_pushers(self, stream_name, token, rows):
-        try:
-            if stream_name == "pushers":
-                for row in rows:
-                    if row.deleted:
-                        yield self.stop_pusher(row.user_id, row.app_id, row.pushkey)
-                    else:
-                        yield self.start_pusher(row.user_id, row.app_id, row.pushkey)
-            elif stream_name == "events":
-                yield self.pusher_pool.on_new_notifications(token, token)
-            elif stream_name == "receipts":
-                yield self.pusher_pool.on_new_receipts(
-                    token, token, set(row.room_id for row in rows)
-                )
-        except Exception:
-            logger.exception("Error poking pushers")
-
-    def stop_pusher(self, user_id, app_id, pushkey):
-        key = "%s:%s" % (app_id, pushkey)
-        pushers_for_user = self.pusher_pool.pushers.get(user_id, {})
-        pusher = pushers_for_user.pop(key, None)
-        if pusher is None:
-            return
-        logger.info("Stopping pusher %r / %r", user_id, key)
-        pusher.on_stop()
-
-    def start_pusher(self, user_id, app_id, pushkey):
-        key = "%s:%s" % (app_id, pushkey)
-        logger.info("Starting pusher %r / %r", user_id, key)
-        return self.pusher_pool.start_pusher_by_id(app_id, pushkey, user_id)
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config("Synapse pusher", config_options)
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.pusher"
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    if config.start_pushers:
-        sys.stderr.write(
-            "\nThe pushers must be disabled in the main synapse process"
-            "\nbefore they can be run in a separate worker."
-            "\nPlease add ``start_pushers: false`` to the main config"
-            "\n"
-        )
-        sys.exit(1)
-
-    # Force the pushers to start since they will be disabled in the main config
-    config.start_pushers = True
-
-    ps = PusherServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ps, config, use_worker_options=True)
-
-    ps.setup()
-
-    def start():
-        _base.start(ps, config.worker_listeners)
-        ps.get_pusherpool().start()
-
-    reactor.addSystemEventTrigger("before", "startup", start)
-
-    _base.start_worker_reactor("synapse-pusher", config)
+import sys
 
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
-        ps = start(sys.argv[1:])
+        start(sys.argv[1:])
diff --git a/synapse/app/synchrotron.py b/synapse/app/synchrotron.py
index 8982c0676e..add43147b3 100644
--- a/synapse/app/synchrotron.py
+++ b/synapse/app/synchrotron.py
@@ -13,454 +13,11 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import contextlib
-import logging
-import sys
-
-from six import iteritems
-
-from twisted.internet import defer, reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse.api.constants import EventTypes
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.handlers.presence import PresenceHandler, get_interested_parties
-from synapse.http.server import JsonResource
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext, run_in_background
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage._base import BaseSlavedStore, __func__
-from synapse.replication.slave.storage.account_data import SlavedAccountDataStore
-from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
-from synapse.replication.slave.storage.client_ips import SlavedClientIpStore
-from synapse.replication.slave.storage.deviceinbox import SlavedDeviceInboxStore
-from synapse.replication.slave.storage.devices import SlavedDeviceStore
-from synapse.replication.slave.storage.events import SlavedEventStore
-from synapse.replication.slave.storage.filtering import SlavedFilteringStore
-from synapse.replication.slave.storage.groups import SlavedGroupServerStore
-from synapse.replication.slave.storage.presence import SlavedPresenceStore
-from synapse.replication.slave.storage.push_rule import SlavedPushRuleStore
-from synapse.replication.slave.storage.receipts import SlavedReceiptsStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.slave.storage.room import RoomStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.replication.tcp.streams.events import EventsStreamEventRow, EventsStreamRow
-from synapse.rest.client.v1 import events
-from synapse.rest.client.v1.initial_sync import InitialSyncRestServlet
-from synapse.rest.client.v1.room import RoomInitialSyncRestServlet
-from synapse.rest.client.v2_alpha import sync
-from synapse.server import HomeServer
-from synapse.storage.data_stores.main.monthly_active_users import (
-    MonthlyActiveUsersWorkerStore,
-)
-from synapse.storage.data_stores.main.presence import UserPresenceState
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.stringutils import random_string
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.synchrotron")
-
-
-class SynchrotronSlavedStore(
-    SlavedReceiptsStore,
-    SlavedAccountDataStore,
-    SlavedApplicationServiceStore,
-    SlavedRegistrationStore,
-    SlavedFilteringStore,
-    SlavedPresenceStore,
-    SlavedGroupServerStore,
-    SlavedDeviceInboxStore,
-    SlavedDeviceStore,
-    SlavedPushRuleStore,
-    SlavedEventStore,
-    SlavedClientIpStore,
-    RoomStore,
-    MonthlyActiveUsersWorkerStore,
-    BaseSlavedStore,
-):
-    pass
-
-
-UPDATE_SYNCING_USERS_MS = 10 * 1000
-
-
-class SynchrotronPresence(object):
-    def __init__(self, hs):
-        self.hs = hs
-        self.is_mine_id = hs.is_mine_id
-        self.http_client = hs.get_simple_http_client()
-        self.store = hs.get_datastore()
-        self.user_to_num_current_syncs = {}
-        self.clock = hs.get_clock()
-        self.notifier = hs.get_notifier()
-
-        active_presence = self.store.take_presence_startup_info()
-        self.user_to_current_state = {state.user_id: state for state in active_presence}
-
-        # user_id -> last_sync_ms. Lists the users that have stopped syncing
-        # but we haven't notified the master of that yet
-        self.users_going_offline = {}
-
-        self._send_stop_syncing_loop = self.clock.looping_call(
-            self.send_stop_syncing, 10 * 1000
-        )
-
-        self.process_id = random_string(16)
-        logger.info("Presence process_id is %r", self.process_id)
-
-    def send_user_sync(self, user_id, is_syncing, last_sync_ms):
-        if self.hs.config.use_presence:
-            self.hs.get_tcp_replication().send_user_sync(
-                user_id, is_syncing, last_sync_ms
-            )
-
-    def mark_as_coming_online(self, user_id):
-        """A user has started syncing. Send a UserSync to the master, unless they
-        had recently stopped syncing.
-
-        Args:
-            user_id (str)
-        """
-        going_offline = self.users_going_offline.pop(user_id, None)
-        if not going_offline:
-            # Safe to skip because we haven't yet told the master they were offline
-            self.send_user_sync(user_id, True, self.clock.time_msec())
-
-    def mark_as_going_offline(self, user_id):
-        """A user has stopped syncing. We wait before notifying the master as
-        its likely they'll come back soon. This allows us to avoid sending
-        a stopped syncing immediately followed by a started syncing notification
-        to the master
-
-        Args:
-            user_id (str)
-        """
-        self.users_going_offline[user_id] = self.clock.time_msec()
-
-    def send_stop_syncing(self):
-        """Check if there are any users who have stopped syncing a while ago
-        and haven't come back yet. If there are poke the master about them.
-        """
-        now = self.clock.time_msec()
-        for user_id, last_sync_ms in list(self.users_going_offline.items()):
-            if now - last_sync_ms > 10 * 1000:
-                self.users_going_offline.pop(user_id, None)
-                self.send_user_sync(user_id, False, last_sync_ms)
-
-    def set_state(self, user, state, ignore_status_msg=False):
-        # TODO Hows this supposed to work?
-        return defer.succeed(None)
-
-    get_states = __func__(PresenceHandler.get_states)
-    get_state = __func__(PresenceHandler.get_state)
-    current_state_for_users = __func__(PresenceHandler.current_state_for_users)
-
-    def user_syncing(self, user_id, affect_presence):
-        if affect_presence:
-            curr_sync = self.user_to_num_current_syncs.get(user_id, 0)
-            self.user_to_num_current_syncs[user_id] = curr_sync + 1
-
-            # If we went from no in flight sync to some, notify replication
-            if self.user_to_num_current_syncs[user_id] == 1:
-                self.mark_as_coming_online(user_id)
-
-        def _end():
-            # We check that the user_id is in user_to_num_current_syncs because
-            # user_to_num_current_syncs may have been cleared if we are
-            # shutting down.
-            if affect_presence and user_id in self.user_to_num_current_syncs:
-                self.user_to_num_current_syncs[user_id] -= 1
-
-                # If we went from one in flight sync to non, notify replication
-                if self.user_to_num_current_syncs[user_id] == 0:
-                    self.mark_as_going_offline(user_id)
-
-        @contextlib.contextmanager
-        def _user_syncing():
-            try:
-                yield
-            finally:
-                _end()
-
-        return defer.succeed(_user_syncing())
-
-    @defer.inlineCallbacks
-    def notify_from_replication(self, states, stream_id):
-        parties = yield get_interested_parties(self.store, states)
-        room_ids_to_states, users_to_states = parties
-
-        self.notifier.on_new_event(
-            "presence_key",
-            stream_id,
-            rooms=room_ids_to_states.keys(),
-            users=users_to_states.keys(),
-        )
-
-    @defer.inlineCallbacks
-    def process_replication_rows(self, token, rows):
-        states = [
-            UserPresenceState(
-                row.user_id,
-                row.state,
-                row.last_active_ts,
-                row.last_federation_update_ts,
-                row.last_user_sync_ts,
-                row.status_msg,
-                row.currently_active,
-            )
-            for row in rows
-        ]
-
-        for state in states:
-            self.user_to_current_state[state.user_id] = state
-
-        stream_id = token
-        yield self.notify_from_replication(states, stream_id)
-
-    def get_currently_syncing_users(self):
-        if self.hs.config.use_presence:
-            return [
-                user_id
-                for user_id, count in iteritems(self.user_to_num_current_syncs)
-                if count > 0
-            ]
-        else:
-            return set()
-
 
-class SynchrotronTyping(object):
-    def __init__(self, hs):
-        self._latest_room_serial = 0
-        self._reset()
-
-    def _reset(self):
-        """
-        Reset the typing handler's data caches.
-        """
-        # map room IDs to serial numbers
-        self._room_serials = {}
-        # map room IDs to sets of users currently typing
-        self._room_typing = {}
-
-    def stream_positions(self):
-        # We must update this typing token from the response of the previous
-        # sync. In particular, the stream id may "reset" back to zero/a low
-        # value which we *must* use for the next replication request.
-        return {"typing": self._latest_room_serial}
-
-    def process_replication_rows(self, token, rows):
-        if self._latest_room_serial > token:
-            # The master has gone backwards. To prevent inconsistent data, just
-            # clear everything.
-            self._reset()
-
-        # Set the latest serial token to whatever the server gave us.
-        self._latest_room_serial = token
-
-        for row in rows:
-            self._room_serials[row.room_id] = token
-            self._room_typing[row.room_id] = row.user_ids
-
-
-class SynchrotronApplicationService(object):
-    def notify_interested_services(self, event):
-        pass
-
-
-class SynchrotronServer(HomeServer):
-    DATASTORE_CLASS = SynchrotronSlavedStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-                elif name == "client":
-                    resource = JsonResource(self, canonical_json=False)
-                    sync.register_servlets(self, resource)
-                    events.register_servlets(self, resource)
-                    InitialSyncRestServlet(self).register(resource)
-                    RoomInitialSyncRestServlet(self).register(resource)
-                    resources.update(
-                        {
-                            "/_matrix/client/r0": resource,
-                            "/_matrix/client/unstable": resource,
-                            "/_matrix/client/v2_alpha": resource,
-                            "/_matrix/client/api/v1": resource,
-                        }
-                    )
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-        )
-
-        logger.info("Synapse synchrotron now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return SyncReplicationHandler(self)
-
-    def build_presence_handler(self):
-        return SynchrotronPresence(self)
-
-    def build_typing_handler(self):
-        return SynchrotronTyping(self)
-
-
-class SyncReplicationHandler(ReplicationClientHandler):
-    def __init__(self, hs):
-        super(SyncReplicationHandler, self).__init__(hs.get_datastore())
-
-        self.store = hs.get_datastore()
-        self.typing_handler = hs.get_typing_handler()
-        # NB this is a SynchrotronPresence, not a normal PresenceHandler
-        self.presence_handler = hs.get_presence_handler()
-        self.notifier = hs.get_notifier()
-
-    async def on_rdata(self, stream_name, token, rows):
-        await super(SyncReplicationHandler, self).on_rdata(stream_name, token, rows)
-        run_in_background(self.process_and_notify, stream_name, token, rows)
-
-    def get_streams_to_replicate(self):
-        args = super(SyncReplicationHandler, self).get_streams_to_replicate()
-        args.update(self.typing_handler.stream_positions())
-        return args
-
-    def get_currently_syncing_users(self):
-        return self.presence_handler.get_currently_syncing_users()
-
-    async def process_and_notify(self, stream_name, token, rows):
-        try:
-            if stream_name == "events":
-                # We shouldn't get multiple rows per token for events stream, so
-                # we don't need to optimise this for multiple rows.
-                for row in rows:
-                    if row.type != EventsStreamEventRow.TypeId:
-                        continue
-                    assert isinstance(row, EventsStreamRow)
-
-                    event = await self.store.get_event(
-                        row.data.event_id, allow_rejected=True
-                    )
-                    if event.rejected_reason:
-                        continue
-
-                    extra_users = ()
-                    if event.type == EventTypes.Member:
-                        extra_users = (event.state_key,)
-                    max_token = self.store.get_room_max_stream_ordering()
-                    self.notifier.on_new_room_event(
-                        event, token, max_token, extra_users
-                    )
-            elif stream_name == "push_rules":
-                self.notifier.on_new_event(
-                    "push_rules_key", token, users=[row.user_id for row in rows]
-                )
-            elif stream_name in ("account_data", "tag_account_data"):
-                self.notifier.on_new_event(
-                    "account_data_key", token, users=[row.user_id for row in rows]
-                )
-            elif stream_name == "receipts":
-                self.notifier.on_new_event(
-                    "receipt_key", token, rooms=[row.room_id for row in rows]
-                )
-            elif stream_name == "typing":
-                self.typing_handler.process_replication_rows(token, rows)
-                self.notifier.on_new_event(
-                    "typing_key", token, rooms=[row.room_id for row in rows]
-                )
-            elif stream_name == "to_device":
-                entities = [row.entity for row in rows if row.entity.startswith("@")]
-                if entities:
-                    self.notifier.on_new_event("to_device_key", token, users=entities)
-            elif stream_name == "device_lists":
-                all_room_ids = set()
-                for row in rows:
-                    room_ids = await self.store.get_rooms_for_user(row.user_id)
-                    all_room_ids.update(room_ids)
-                self.notifier.on_new_event("device_list_key", token, rooms=all_room_ids)
-            elif stream_name == "presence":
-                await self.presence_handler.process_replication_rows(token, rows)
-            elif stream_name == "receipts":
-                self.notifier.on_new_event(
-                    "groups_key", token, users=[row.user_id for row in rows]
-                )
-        except Exception:
-            logger.exception("Error processing replication")
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config("Synapse synchrotron", config_options)
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.synchrotron"
-
-    synapse.events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    ss = SynchrotronServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-        application_service_handler=SynchrotronApplicationService(),
-    )
-
-    setup_logging(ss, config, use_worker_options=True)
-
-    ss.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ss, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-synchrotron", config)
+import sys
 
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/app/user_dir.py b/synapse/app/user_dir.py
index ba536d6f04..503d44f687 100644
--- a/synapse/app/user_dir.py
+++ b/synapse/app/user_dir.py
@@ -14,217 +14,10 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-import logging
 import sys
 
-from twisted.internet import defer, reactor
-from twisted.web.resource import NoResource
-
-import synapse
-from synapse import events
-from synapse.app import _base
-from synapse.config._base import ConfigError
-from synapse.config.homeserver import HomeServerConfig
-from synapse.config.logger import setup_logging
-from synapse.http.server import JsonResource
-from synapse.http.site import SynapseSite
-from synapse.logging.context import LoggingContext, run_in_background
-from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
-from synapse.replication.slave.storage._base import BaseSlavedStore
-from synapse.replication.slave.storage.appservice import SlavedApplicationServiceStore
-from synapse.replication.slave.storage.client_ips import SlavedClientIpStore
-from synapse.replication.slave.storage.events import SlavedEventStore
-from synapse.replication.slave.storage.registration import SlavedRegistrationStore
-from synapse.replication.tcp.client import ReplicationClientHandler
-from synapse.replication.tcp.streams.events import (
-    EventsStream,
-    EventsStreamCurrentStateRow,
-)
-from synapse.rest.client.v2_alpha import user_directory
-from synapse.server import HomeServer
-from synapse.storage.data_stores.main.user_directory import UserDirectoryStore
-from synapse.storage.database import Database
-from synapse.util.caches.stream_change_cache import StreamChangeCache
-from synapse.util.httpresourcetree import create_resource_tree
-from synapse.util.manhole import manhole
-from synapse.util.versionstring import get_version_string
-
-logger = logging.getLogger("synapse.app.user_dir")
-
-
-class UserDirectorySlaveStore(
-    SlavedEventStore,
-    SlavedApplicationServiceStore,
-    SlavedRegistrationStore,
-    SlavedClientIpStore,
-    UserDirectoryStore,
-    BaseSlavedStore,
-):
-    def __init__(self, database: Database, db_conn, hs):
-        super(UserDirectorySlaveStore, self).__init__(database, db_conn, hs)
-
-        events_max = self._stream_id_gen.get_current_token()
-        curr_state_delta_prefill, min_curr_state_delta_id = self.db.get_cache_dict(
-            db_conn,
-            "current_state_delta_stream",
-            entity_column="room_id",
-            stream_column="stream_id",
-            max_value=events_max,  # As we share the stream id with events token
-            limit=1000,
-        )
-        self._curr_state_delta_stream_cache = StreamChangeCache(
-            "_curr_state_delta_stream_cache",
-            min_curr_state_delta_id,
-            prefilled_cache=curr_state_delta_prefill,
-        )
-
-    def stream_positions(self):
-        result = super(UserDirectorySlaveStore, self).stream_positions()
-        return result
-
-    def process_replication_rows(self, stream_name, token, rows):
-        if stream_name == EventsStream.NAME:
-            self._stream_id_gen.advance(token)
-            for row in rows:
-                if row.type != EventsStreamCurrentStateRow.TypeId:
-                    continue
-                self._curr_state_delta_stream_cache.entity_has_changed(
-                    row.data.room_id, token
-                )
-        return super(UserDirectorySlaveStore, self).process_replication_rows(
-            stream_name, token, rows
-        )
-
-
-class UserDirectoryServer(HomeServer):
-    DATASTORE_CLASS = UserDirectorySlaveStore
-
-    def _listen_http(self, listener_config):
-        port = listener_config["port"]
-        bind_addresses = listener_config["bind_addresses"]
-        site_tag = listener_config.get("tag", port)
-        resources = {}
-        for res in listener_config["resources"]:
-            for name in res["names"]:
-                if name == "metrics":
-                    resources[METRICS_PREFIX] = MetricsResource(RegistryProxy)
-                elif name == "client":
-                    resource = JsonResource(self, canonical_json=False)
-                    user_directory.register_servlets(self, resource)
-                    resources.update(
-                        {
-                            "/_matrix/client/r0": resource,
-                            "/_matrix/client/unstable": resource,
-                            "/_matrix/client/v2_alpha": resource,
-                            "/_matrix/client/api/v1": resource,
-                        }
-                    )
-
-        root_resource = create_resource_tree(resources, NoResource())
-
-        _base.listen_tcp(
-            bind_addresses,
-            port,
-            SynapseSite(
-                "synapse.access.http.%s" % (site_tag,),
-                site_tag,
-                listener_config,
-                root_resource,
-                self.version_string,
-            ),
-        )
-
-        logger.info("Synapse user_dir now listening on port %d", port)
-
-    def start_listening(self, listeners):
-        for listener in listeners:
-            if listener["type"] == "http":
-                self._listen_http(listener)
-            elif listener["type"] == "manhole":
-                _base.listen_tcp(
-                    listener["bind_addresses"],
-                    listener["port"],
-                    manhole(
-                        username="matrix", password="rabbithole", globals={"hs": self}
-                    ),
-                )
-            elif listener["type"] == "metrics":
-                if not self.get_config().enable_metrics:
-                    logger.warning(
-                        (
-                            "Metrics listener configured, but "
-                            "enable_metrics is not True!"
-                        )
-                    )
-                else:
-                    _base.listen_metrics(listener["bind_addresses"], listener["port"])
-            else:
-                logger.warning("Unrecognized listener type: %s", listener["type"])
-
-        self.get_tcp_replication().start_replication(self)
-
-    def build_tcp_replication(self):
-        return UserDirectoryReplicationHandler(self)
-
-
-class UserDirectoryReplicationHandler(ReplicationClientHandler):
-    def __init__(self, hs):
-        super(UserDirectoryReplicationHandler, self).__init__(hs.get_datastore())
-        self.user_directory = hs.get_user_directory_handler()
-
-    async def on_rdata(self, stream_name, token, rows):
-        await super(UserDirectoryReplicationHandler, self).on_rdata(
-            stream_name, token, rows
-        )
-        if stream_name == EventsStream.NAME:
-            run_in_background(self._notify_directory)
-
-    @defer.inlineCallbacks
-    def _notify_directory(self):
-        try:
-            yield self.user_directory.notify_new_event()
-        except Exception:
-            logger.exception("Error notifiying user directory of state update")
-
-
-def start(config_options):
-    try:
-        config = HomeServerConfig.load_config("Synapse user directory", config_options)
-    except ConfigError as e:
-        sys.stderr.write("\n" + str(e) + "\n")
-        sys.exit(1)
-
-    assert config.worker_app == "synapse.app.user_dir"
-
-    events.USE_FROZEN_DICTS = config.use_frozen_dicts
-
-    if config.update_user_directory:
-        sys.stderr.write(
-            "\nThe update_user_directory must be disabled in the main synapse process"
-            "\nbefore they can be run in a separate worker."
-            "\nPlease add ``update_user_directory: false`` to the main config"
-            "\n"
-        )
-        sys.exit(1)
-
-    # Force the pushers to start since they will be disabled in the main config
-    config.update_user_directory = True
-
-    ss = UserDirectoryServer(
-        config.server_name,
-        config=config,
-        version_string="Synapse/" + get_version_string(synapse),
-    )
-
-    setup_logging(ss, config, use_worker_options=True)
-
-    ss.setup()
-    reactor.addSystemEventTrigger(
-        "before", "startup", _base.start, ss, config.worker_listeners
-    )
-
-    _base.start_worker_reactor("synapse-user-dir", config)
-
+from synapse.app.generic_worker import start
+from synapse.util.logcontext import LoggingContext
 
 if __name__ == "__main__":
     with LoggingContext("main"):
diff --git a/synapse/config/emailconfig.py b/synapse/config/emailconfig.py
index 74853f9faa..f31fc85ec8 100644
--- a/synapse/config/emailconfig.py
+++ b/synapse/config/emailconfig.py
@@ -27,6 +27,12 @@ import pkg_resources
 
 from ._base import Config, ConfigError
 
+MISSING_PASSWORD_RESET_CONFIG_ERROR = """\
+Password reset emails are enabled on this homeserver due to a partial
+'email' block. However, the following required keys are missing:
+    %s
+"""
+
 
 class EmailConfig(Config):
     section = "email"
@@ -142,24 +148,18 @@ class EmailConfig(Config):
             bleach
 
         if self.threepid_behaviour_email == ThreepidBehaviour.LOCAL:
-            required = ["smtp_host", "smtp_port", "notif_from"]
-
             missing = []
-            for k in required:
-                if k not in email_config:
-                    missing.append("email." + k)
+            if not self.email_notif_from:
+                missing.append("email.notif_from")
 
             # public_baseurl is required to build password reset and validation links that
             # will be emailed to users
             if config.get("public_baseurl") is None:
                 missing.append("public_baseurl")
 
-            if len(missing) > 0:
-                raise RuntimeError(
-                    "Password resets emails are configured to be sent from "
-                    "this homeserver due to a partial 'email' block. "
-                    "However, the following required keys are missing: %s"
-                    % (", ".join(missing),)
+            if missing:
+                raise ConfigError(
+                    MISSING_PASSWORD_RESET_CONFIG_ERROR % (", ".join(missing),)
                 )
 
             # These email templates have placeholders in them, and thus must be
@@ -245,32 +245,25 @@ class EmailConfig(Config):
             )
 
         if self.email_enable_notifs:
-            required = [
-                "smtp_host",
-                "smtp_port",
-                "notif_from",
-                "notif_template_html",
-                "notif_template_text",
-            ]
-
             missing = []
-            for k in required:
-                if k not in email_config:
-                    missing.append(k)
-
-            if len(missing) > 0:
-                raise RuntimeError(
-                    "email.enable_notifs is True but required keys are missing: %s"
-                    % (", ".join(["email." + k for k in missing]),)
-                )
+            if not self.email_notif_from:
+                missing.append("email.notif_from")
 
             if config.get("public_baseurl") is None:
-                raise RuntimeError(
-                    "email.enable_notifs is True but no public_baseurl is set"
+                missing.append("public_baseurl")
+
+            if missing:
+                raise ConfigError(
+                    "email.enable_notifs is True but required keys are missing: %s"
+                    % (", ".join(missing),)
                 )
 
-            self.email_notif_template_html = email_config["notif_template_html"]
-            self.email_notif_template_text = email_config["notif_template_text"]
+            self.email_notif_template_html = email_config.get(
+                "notif_template_html", "notif_mail.html"
+            )
+            self.email_notif_template_text = email_config.get(
+                "notif_template_text", "notif_mail.txt"
+            )
 
             for f in self.email_notif_template_text, self.email_notif_template_html:
                 p = os.path.join(self.email_template_dir, f)
@@ -323,10 +316,6 @@ class EmailConfig(Config):
           #
           #require_transport_security: true
 
-          # Enable sending emails for messages that the user has missed
-          #
-          #enable_notifs: false
-
           # notif_from defines the "From" address to use when sending emails.
           # It must be set if email sending is enabled.
           #
@@ -344,6 +333,11 @@ class EmailConfig(Config):
           #
           #app_name: my_branded_matrix_server
 
+          # Uncomment the following to enable sending emails for messages that the user
+          # has missed. Disabled by default.
+          #
+          #enable_notifs: true
+
           # Uncomment the following to disable automatic subscription to email
           # notifications for new users. Enabled by default.
           #
diff --git a/synapse/config/saml2_config.py b/synapse/config/saml2_config.py
index 423c158b11..8fe64d90f8 100644
--- a/synapse/config/saml2_config.py
+++ b/synapse/config/saml2_config.py
@@ -15,6 +15,9 @@
 # limitations under the License.
 
 import logging
+import os
+
+import pkg_resources
 
 from synapse.python_dependencies import DependencyException, check_requirements
 from synapse.util.module_loader import load_module, load_python_module
@@ -160,6 +163,14 @@ class SAML2Config(Config):
             saml2_config.get("saml_session_lifetime", "5m")
         )
 
+        template_dir = saml2_config.get("template_dir")
+        if not template_dir:
+            template_dir = pkg_resources.resource_filename("synapse", "res/templates",)
+
+        self.saml2_error_html_content = self.read_file(
+            os.path.join(template_dir, "saml_error.html"), "saml2_config.saml_error",
+        )
+
     def _default_saml_config_dict(
         self, required_attributes: set, optional_attributes: set
     ):
@@ -325,6 +336,25 @@ class SAML2Config(Config):
           # The default is 'uid'.
           #
           #grandfathered_mxid_source_attribute: upn
+
+          # Directory in which Synapse will try to find the template files below.
+          # If not set, default templates from within the Synapse package will be used.
+          #
+          # DO NOT UNCOMMENT THIS SETTING unless you want to customise the templates.
+          # If you *do* uncomment it, you will need to make sure that all the templates
+          # below are in the directory.
+          #
+          # Synapse will look for the following templates in this directory:
+          #
+          # * HTML page to display to users if something goes wrong during the
+          #   authentication process: 'saml_error.html'.
+          #
+          #   This template doesn't currently need any variable to render.
+          #
+          # You can see the default templates at:
+          # https://github.com/matrix-org/synapse/tree/master/synapse/res/templates
+          #
+          #template_dir: "res/templates"
         """ % {
             "config_dir_path": config_dir_path
         }
diff --git a/synapse/config/server.py b/synapse/config/server.py
index 0ec1b0fadd..7525765fee 100644
--- a/synapse/config/server.py
+++ b/synapse/config/server.py
@@ -1066,12 +1066,12 @@ KNOWN_RESOURCES = (
 
 
 def _check_resource_config(listeners):
-    resource_names = set(
+    resource_names = {
         res_name
         for listener in listeners
         for res in listener.get("resources", [])
         for res_name in res.get("names", [])
-    )
+    }
 
     for resource in resource_names:
         if resource not in KNOWN_RESOURCES:
diff --git a/synapse/config/tls.py b/synapse/config/tls.py
index 97a12d51f6..a65538562b 100644
--- a/synapse/config/tls.py
+++ b/synapse/config/tls.py
@@ -260,7 +260,7 @@ class TlsConfig(Config):
                 crypto.FILETYPE_ASN1, self.tls_certificate
             )
             sha256_fingerprint = encode_base64(sha256(x509_certificate_bytes).digest())
-            sha256_fingerprints = set(f["sha256"] for f in self.tls_fingerprints)
+            sha256_fingerprints = {f["sha256"] for f in self.tls_fingerprints}
             if sha256_fingerprint not in sha256_fingerprints:
                 self.tls_fingerprints.append({"sha256": sha256_fingerprint})
 
diff --git a/synapse/crypto/context_factory.py b/synapse/crypto/context_factory.py
index e93f0b3705..a5a2a7815d 100644
--- a/synapse/crypto/context_factory.py
+++ b/synapse/crypto/context_factory.py
@@ -75,7 +75,7 @@ class ServerContextFactory(ContextFactory):
 
 
 @implementer(IPolicyForHTTPS)
-class ClientTLSOptionsFactory(object):
+class FederationPolicyForHTTPS(object):
     """Factory for Twisted SSLClientConnectionCreators that are used to make connections
     to remote servers for federation.
 
@@ -103,15 +103,15 @@ class ClientTLSOptionsFactory(object):
         # let us do).
         minTLS = _TLS_VERSION_MAP[config.federation_client_minimum_tls_version]
 
-        self._verify_ssl = CertificateOptions(
+        _verify_ssl = CertificateOptions(
             trustRoot=trust_root, insecurelyLowerMinimumTo=minTLS
         )
-        self._verify_ssl_context = self._verify_ssl.getContext()
-        self._verify_ssl_context.set_info_callback(self._context_info_cb)
+        self._verify_ssl_context = _verify_ssl.getContext()
+        self._verify_ssl_context.set_info_callback(_context_info_cb)
 
-        self._no_verify_ssl = CertificateOptions(insecurelyLowerMinimumTo=minTLS)
-        self._no_verify_ssl_context = self._no_verify_ssl.getContext()
-        self._no_verify_ssl_context.set_info_callback(self._context_info_cb)
+        _no_verify_ssl = CertificateOptions(insecurelyLowerMinimumTo=minTLS)
+        self._no_verify_ssl_context = _no_verify_ssl.getContext()
+        self._no_verify_ssl_context.set_info_callback(_context_info_cb)
 
     def get_options(self, host: bytes):
 
@@ -136,23 +136,6 @@ class ClientTLSOptionsFactory(object):
 
         return SSLClientConnectionCreator(host, ssl_context, should_verify)
 
-    @staticmethod
-    def _context_info_cb(ssl_connection, where, ret):
-        """The 'information callback' for our openssl context object."""
-        # we assume that the app_data on the connection object has been set to
-        # a TLSMemoryBIOProtocol object. (This is done by SSLClientConnectionCreator)
-        tls_protocol = ssl_connection.get_app_data()
-        try:
-            # ... we further assume that SSLClientConnectionCreator has set the
-            # '_synapse_tls_verifier' attribute to a ConnectionVerifier object.
-            tls_protocol._synapse_tls_verifier.verify_context_info_cb(
-                ssl_connection, where
-            )
-        except:  # noqa: E722, taken from the twisted implementation
-            logger.exception("Error during info_callback")
-            f = Failure()
-            tls_protocol.failVerification(f)
-
     def creatorForNetloc(self, hostname, port):
         """Implements the IPolicyForHTTPS interace so that this can be passed
         directly to agents.
@@ -160,6 +143,43 @@ class ClientTLSOptionsFactory(object):
         return self.get_options(hostname)
 
 
+@implementer(IPolicyForHTTPS)
+class RegularPolicyForHTTPS(object):
+    """Factory for Twisted SSLClientConnectionCreators that are used to make connections
+    to remote servers, for other than federation.
+
+    Always uses the same OpenSSL context object, which uses the default OpenSSL CA
+    trust root.
+    """
+
+    def __init__(self):
+        trust_root = platformTrust()
+        self._ssl_context = CertificateOptions(trustRoot=trust_root).getContext()
+        self._ssl_context.set_info_callback(_context_info_cb)
+
+    def creatorForNetloc(self, hostname, port):
+        return SSLClientConnectionCreator(hostname, self._ssl_context, True)
+
+
+def _context_info_cb(ssl_connection, where, ret):
+    """The 'information callback' for our openssl context objects.
+
+    Note: Once this is set as the info callback on a Context object, the Context should
+    only be used with the SSLClientConnectionCreator.
+    """
+    # we assume that the app_data on the connection object has been set to
+    # a TLSMemoryBIOProtocol object. (This is done by SSLClientConnectionCreator)
+    tls_protocol = ssl_connection.get_app_data()
+    try:
+        # ... we further assume that SSLClientConnectionCreator has set the
+        # '_synapse_tls_verifier' attribute to a ConnectionVerifier object.
+        tls_protocol._synapse_tls_verifier.verify_context_info_cb(ssl_connection, where)
+    except:  # noqa: E722, taken from the twisted implementation
+        logger.exception("Error during info_callback")
+        f = Failure()
+        tls_protocol.failVerification(f)
+
+
 @implementer(IOpenSSLClientConnectionCreator)
 class SSLClientConnectionCreator(object):
     """Creates openssl connection objects for client connections.
diff --git a/synapse/crypto/event_signing.py b/synapse/crypto/event_signing.py
index 5f733c1cf5..0422c43fab 100644
--- a/synapse/crypto/event_signing.py
+++ b/synapse/crypto/event_signing.py
@@ -140,7 +140,7 @@ def compute_event_signature(
     Returns:
         a dictionary in the same format of an event's signatures field.
     """
-    redact_json = prune_event_dict(event_dict)
+    redact_json = prune_event_dict(room_version, event_dict)
     redact_json.pop("age_ts", None)
     redact_json.pop("unsigned", None)
     if logger.isEnabledFor(logging.DEBUG):
diff --git a/synapse/crypto/keyring.py b/synapse/crypto/keyring.py
index 6fe5a6a26a..983f0ead8c 100644
--- a/synapse/crypto/keyring.py
+++ b/synapse/crypto/keyring.py
@@ -326,9 +326,7 @@ class Keyring(object):
             verify_requests (list[VerifyJsonRequest]): list of verify requests
         """
 
-        remaining_requests = set(
-            (rq for rq in verify_requests if not rq.key_ready.called)
-        )
+        remaining_requests = {rq for rq in verify_requests if not rq.key_ready.called}
 
         @defer.inlineCallbacks
         def do_iterations():
@@ -396,7 +394,7 @@ class Keyring(object):
 
         results = yield fetcher.get_keys(missing_keys)
 
-        completed = list()
+        completed = []
         for verify_request in remaining_requests:
             server_name = verify_request.server_name
 
diff --git a/synapse/event_auth.py b/synapse/event_auth.py
index 472f165044..46beb5334f 100644
--- a/synapse/event_auth.py
+++ b/synapse/event_auth.py
@@ -137,7 +137,7 @@ def check(
             raise AuthError(403, "This room has been marked as unfederatable.")
 
     # 4. If type is m.room.aliases
-    if event.type == EventTypes.Aliases:
+    if event.type == EventTypes.Aliases and room_version_obj.special_case_aliases_auth:
         # 4a. If event has no state_key, reject
         if not event.is_state():
             raise AuthError(403, "Alias event must be a state event")
@@ -152,10 +152,8 @@ def check(
             )
 
         # 4c. Otherwise, allow.
-        # This is removed by https://github.com/matrix-org/matrix-doc/pull/2260
-        if room_version_obj.special_case_aliases_auth:
-            logger.debug("Allowing! %s", event)
-            return
+        logger.debug("Allowing! %s", event)
+        return
 
     if logger.isEnabledFor(logging.DEBUG):
         logger.debug("Auth events: %s", [a.event_id for a in auth_events.values()])
diff --git a/synapse/events/__init__.py b/synapse/events/__init__.py
index 7307116556..533ba327f5 100644
--- a/synapse/events/__init__.py
+++ b/synapse/events/__init__.py
@@ -15,9 +15,10 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+import abc
 import os
 from distutils.util import strtobool
-from typing import Optional, Type
+from typing import Dict, Optional, Type
 
 import six
 
@@ -199,15 +200,25 @@ class _EventInternalMetadata(object):
         return self._dict.get("redacted", False)
 
 
-class EventBase(object):
+class EventBase(metaclass=abc.ABCMeta):
+    @property
+    @abc.abstractmethod
+    def format_version(self) -> int:
+        """The EventFormatVersion implemented by this event"""
+        ...
+
     def __init__(
         self,
-        event_dict,
-        signatures={},
-        unsigned={},
-        internal_metadata_dict={},
-        rejected_reason=None,
+        event_dict: JsonDict,
+        room_version: RoomVersion,
+        signatures: Dict[str, Dict[str, str]],
+        unsigned: JsonDict,
+        internal_metadata_dict: JsonDict,
+        rejected_reason: Optional[str],
     ):
+        assert room_version.event_format == self.format_version
+
+        self.room_version = room_version
         self.signatures = signatures
         self.unsigned = unsigned
         self.rejected_reason = rejected_reason
@@ -303,7 +314,13 @@ class EventBase(object):
 class FrozenEvent(EventBase):
     format_version = EventFormatVersions.V1  # All events of this type are V1
 
-    def __init__(self, event_dict, internal_metadata_dict={}, rejected_reason=None):
+    def __init__(
+        self,
+        event_dict: JsonDict,
+        room_version: RoomVersion,
+        internal_metadata_dict: JsonDict = {},
+        rejected_reason: Optional[str] = None,
+    ):
         event_dict = dict(event_dict)
 
         # Signatures is a dict of dicts, and this is faster than doing a
@@ -326,8 +343,9 @@ class FrozenEvent(EventBase):
 
         self._event_id = event_dict["event_id"]
 
-        super(FrozenEvent, self).__init__(
+        super().__init__(
             frozen_dict,
+            room_version=room_version,
             signatures=signatures,
             unsigned=unsigned,
             internal_metadata_dict=internal_metadata_dict,
@@ -352,7 +370,13 @@ class FrozenEvent(EventBase):
 class FrozenEventV2(EventBase):
     format_version = EventFormatVersions.V2  # All events of this type are V2
 
-    def __init__(self, event_dict, internal_metadata_dict={}, rejected_reason=None):
+    def __init__(
+        self,
+        event_dict: JsonDict,
+        room_version: RoomVersion,
+        internal_metadata_dict: JsonDict = {},
+        rejected_reason: Optional[str] = None,
+    ):
         event_dict = dict(event_dict)
 
         # Signatures is a dict of dicts, and this is faster than doing a
@@ -377,8 +401,9 @@ class FrozenEventV2(EventBase):
 
         self._event_id = None
 
-        super(FrozenEventV2, self).__init__(
+        super().__init__(
             frozen_dict,
+            room_version=room_version,
             signatures=signatures,
             unsigned=unsigned,
             internal_metadata_dict=internal_metadata_dict,
@@ -445,7 +470,7 @@ class FrozenEventV3(FrozenEventV2):
         return self._event_id
 
 
-def event_type_from_format_version(format_version: int) -> Type[EventBase]:
+def _event_type_from_format_version(format_version: int) -> Type[EventBase]:
     """Returns the python type to use to construct an Event object for the
     given event format version.
 
@@ -474,5 +499,5 @@ def make_event_from_dict(
     rejected_reason: Optional[str] = None,
 ) -> EventBase:
     """Construct an EventBase from the given event dict"""
-    event_type = event_type_from_format_version(room_version.event_format)
-    return event_type(event_dict, internal_metadata_dict, rejected_reason)
+    event_type = _event_type_from_format_version(room_version.event_format)
+    return event_type(event_dict, room_version, internal_metadata_dict, rejected_reason)
diff --git a/synapse/events/utils.py b/synapse/events/utils.py
index f70f5032fb..b75b097e5e 100644
--- a/synapse/events/utils.py
+++ b/synapse/events/utils.py
@@ -23,6 +23,7 @@ from frozendict import frozendict
 from twisted.internet import defer
 
 from synapse.api.constants import EventTypes, RelationTypes
+from synapse.api.room_versions import RoomVersion
 from synapse.util.async_helpers import yieldable_gather_results
 
 from . import EventBase
@@ -35,26 +36,20 @@ from . import EventBase
 SPLIT_FIELD_REGEX = re.compile(r"(?<!\\)\.")
 
 
-def prune_event(event):
+def prune_event(event: EventBase) -> EventBase:
     """ Returns a pruned version of the given event, which removes all keys we
     don't know about or think could potentially be dodgy.
 
     This is used when we "redact" an event. We want to remove all fields that
     the user has specified, but we do want to keep necessary information like
     type, state_key etc.
-
-    Args:
-        event (FrozenEvent)
-
-    Returns:
-        FrozenEvent
     """
-    pruned_event_dict = prune_event_dict(event.get_dict())
+    pruned_event_dict = prune_event_dict(event.room_version, event.get_dict())
 
-    from . import event_type_from_format_version
+    from . import make_event_from_dict
 
-    pruned_event = event_type_from_format_version(event.format_version)(
-        pruned_event_dict, event.internal_metadata.get_dict()
+    pruned_event = make_event_from_dict(
+        pruned_event_dict, event.room_version, event.internal_metadata.get_dict()
     )
 
     # Mark the event as redacted
@@ -63,15 +58,12 @@ def prune_event(event):
     return pruned_event
 
 
-def prune_event_dict(event_dict):
+def prune_event_dict(room_version: RoomVersion, event_dict: dict) -> dict:
     """Redacts the event_dict in the same way as `prune_event`, except it
     operates on dicts rather than event objects
 
-    Args:
-        event_dict (dict)
-
     Returns:
-        dict: A copy of the pruned event dict
+        A copy of the pruned event dict
     """
 
     allowed_keys = [
@@ -118,7 +110,7 @@ def prune_event_dict(event_dict):
             "kick",
             "redact",
         )
-    elif event_type == EventTypes.Aliases:
+    elif event_type == EventTypes.Aliases and room_version.special_case_aliases_auth:
         add_fields("aliases")
     elif event_type == EventTypes.RoomHistoryVisibility:
         add_fields("history_visibility")
diff --git a/synapse/federation/federation_base.py b/synapse/federation/federation_base.py
index 9fff65716a..5c991e5412 100644
--- a/synapse/federation/federation_base.py
+++ b/synapse/federation/federation_base.py
@@ -15,11 +15,13 @@
 # limitations under the License.
 import logging
 from collections import namedtuple
+from typing import Iterable, List
 
 import six
 
 from twisted.internet import defer
-from twisted.internet.defer import DeferredList
+from twisted.internet.defer import Deferred, DeferredList
+from twisted.python.failure import Failure
 
 from synapse.api.constants import MAX_DEPTH, EventTypes, Membership
 from synapse.api.errors import Codes, SynapseError
@@ -29,6 +31,7 @@ from synapse.api.room_versions import (
     RoomVersion,
 )
 from synapse.crypto.event_signing import check_event_content_hash
+from synapse.crypto.keyring import Keyring
 from synapse.events import EventBase, make_event_from_dict
 from synapse.events.utils import prune_event
 from synapse.http.servlet import assert_params_in_dict
@@ -36,10 +39,8 @@ from synapse.logging.context import (
     LoggingContext,
     PreserveLoggingContext,
     make_deferred_yieldable,
-    preserve_fn,
 )
 from synapse.types import JsonDict, get_domain_from_id
-from synapse.util import unwrapFirstError
 
 logger = logging.getLogger(__name__)
 
@@ -54,94 +55,23 @@ class FederationBase(object):
         self.store = hs.get_datastore()
         self._clock = hs.get_clock()
 
-    @defer.inlineCallbacks
-    def _check_sigs_and_hash_and_fetch(
-        self, origin, pdus, room_version, outlier=False, include_none=False
-    ):
-        """Takes a list of PDUs and checks the signatures and hashs of each
-        one. If a PDU fails its signature check then we check if we have it in
-        the database and if not then request if from the originating server of
-        that PDU.
-
-        If a PDU fails its content hash check then it is redacted.
-
-        The given list of PDUs are not modified, instead the function returns
-        a new list.
-
-        Args:
-            origin (str)
-            pdu (list)
-            room_version (str)
-            outlier (bool): Whether the events are outliers or not
-            include_none (str): Whether to include None in the returned list
-                for events that have failed their checks
-
-        Returns:
-            Deferred : A list of PDUs that have valid signatures and hashes.
-        """
-        deferreds = self._check_sigs_and_hashes(room_version, pdus)
-
-        @defer.inlineCallbacks
-        def handle_check_result(pdu, deferred):
-            try:
-                res = yield make_deferred_yieldable(deferred)
-            except SynapseError:
-                res = None
-
-            if not res:
-                # Check local db.
-                res = yield self.store.get_event(
-                    pdu.event_id, allow_rejected=True, allow_none=True
-                )
-
-            if not res and pdu.origin != origin:
-                try:
-                    res = yield defer.ensureDeferred(
-                        self.get_pdu(
-                            destinations=[pdu.origin],
-                            event_id=pdu.event_id,
-                            room_version=room_version,
-                            outlier=outlier,
-                            timeout=10000,
-                        )
-                    )
-                except SynapseError:
-                    pass
-
-            if not res:
-                logger.warning(
-                    "Failed to find copy of %s with valid signature", pdu.event_id
-                )
-
-            return res
-
-        handle = preserve_fn(handle_check_result)
-        deferreds2 = [handle(pdu, deferred) for pdu, deferred in zip(pdus, deferreds)]
-
-        valid_pdus = yield make_deferred_yieldable(
-            defer.gatherResults(deferreds2, consumeErrors=True)
-        ).addErrback(unwrapFirstError)
-
-        if include_none:
-            return valid_pdus
-        else:
-            return [p for p in valid_pdus if p]
-
-    def _check_sigs_and_hash(self, room_version, pdu):
+    def _check_sigs_and_hash(self, room_version: str, pdu: EventBase) -> Deferred:
         return make_deferred_yieldable(
             self._check_sigs_and_hashes(room_version, [pdu])[0]
         )
 
-    def _check_sigs_and_hashes(self, room_version, pdus):
+    def _check_sigs_and_hashes(
+        self, room_version: str, pdus: List[EventBase]
+    ) -> List[Deferred]:
         """Checks that each of the received events is correctly signed by the
         sending server.
 
         Args:
-            room_version (str): The room version of the PDUs
-            pdus (list[FrozenEvent]): the events to be checked
+            room_version: The room version of the PDUs
+            pdus: the events to be checked
 
         Returns:
-            list[Deferred]: for each input event, a deferred which:
+            For each input event, a deferred which:
               * returns the original event if the checks pass
               * returns a redacted version of the event (if the signature
                 matched but the hash did not)
@@ -152,7 +82,7 @@ class FederationBase(object):
 
         ctx = LoggingContext.current_context()
 
-        def callback(_, pdu):
+        def callback(_, pdu: EventBase):
             with PreserveLoggingContext(ctx):
                 if not check_event_content_hash(pdu):
                     # let's try to distinguish between failures because the event was
@@ -189,7 +119,7 @@ class FederationBase(object):
 
                 return pdu
 
-        def errback(failure, pdu):
+        def errback(failure: Failure, pdu: EventBase):
             failure.trap(SynapseError)
             with PreserveLoggingContext(ctx):
                 logger.warning(
@@ -215,16 +145,18 @@ class PduToCheckSig(
     pass
 
 
-def _check_sigs_on_pdus(keyring, room_version, pdus):
+def _check_sigs_on_pdus(
+    keyring: Keyring, room_version: str, pdus: Iterable[EventBase]
+) -> List[Deferred]:
     """Check that the given events are correctly signed
 
     Args:
-        keyring (synapse.crypto.Keyring): keyring object to do the checks
-        room_version (str): the room version of the PDUs
-        pdus (Collection[EventBase]): the events to be checked
+        keyring: keyring object to do the checks
+        room_version: the room version of the PDUs
+        pdus: the events to be checked
 
     Returns:
-        List[Deferred]: a Deferred for each event in pdus, which will either succeed if
+        A Deferred for each event in pdus, which will either succeed if
            the signatures are valid, or fail (with a SynapseError) if not.
     """
 
@@ -329,7 +261,7 @@ def _check_sigs_on_pdus(keyring, room_version, pdus):
     return [_flatten_deferred_list(p.deferreds) for p in pdus_to_check]
 
 
-def _flatten_deferred_list(deferreds):
+def _flatten_deferred_list(deferreds: List[Deferred]) -> Deferred:
     """Given a list of deferreds, either return the single deferred,
     combine into a DeferredList, or return an already resolved deferred.
     """
@@ -341,7 +273,7 @@ def _flatten_deferred_list(deferreds):
         return defer.succeed(None)
 
 
-def _is_invite_via_3pid(event):
+def _is_invite_via_3pid(event: EventBase) -> bool:
     return (
         event.type == EventTypes.Member
         and event.membership == Membership.INVITE
diff --git a/synapse/federation/federation_client.py b/synapse/federation/federation_client.py
index 4870e39652..8c6b839478 100644
--- a/synapse/federation/federation_client.py
+++ b/synapse/federation/federation_client.py
@@ -33,6 +33,7 @@ from typing import (
 from prometheus_client import Counter
 
 from twisted.internet import defer
+from twisted.internet.defer import Deferred
 
 from synapse.api.constants import EventTypes, Membership
 from synapse.api.errors import (
@@ -51,7 +52,7 @@ from synapse.api.room_versions import (
 )
 from synapse.events import EventBase, builder
 from synapse.federation.federation_base import FederationBase, event_from_pdu_json
-from synapse.logging.context import make_deferred_yieldable
+from synapse.logging.context import make_deferred_yieldable, preserve_fn
 from synapse.logging.utils import log_function
 from synapse.types import JsonDict
 from synapse.util import unwrapFirstError
@@ -187,7 +188,7 @@ class FederationClient(FederationBase):
 
     async def backfill(
         self, dest: str, room_id: str, limit: int, extremities: Iterable[str]
-    ) -> List[EventBase]:
+    ) -> Optional[List[EventBase]]:
         """Requests some more historic PDUs for the given room from the
         given destination server.
 
@@ -199,9 +200,9 @@ class FederationClient(FederationBase):
         """
         logger.debug("backfill extrem=%s", extremities)
 
-        # If there are no extremeties then we've (probably) reached the start.
+        # If there are no extremities then we've (probably) reached the start.
         if not extremities:
-            return
+            return None
 
         transaction_data = await self.transport_layer.backfill(
             dest, room_id, extremities, limit
@@ -284,7 +285,7 @@ class FederationClient(FederationBase):
                 pdu_list = [
                     event_from_pdu_json(p, room_version, outlier=outlier)
                     for p in transaction_data["pdus"]
-                ]
+                ]  # type: List[EventBase]
 
                 if pdu_list and pdu_list[0]:
                     pdu = pdu_list[0]
@@ -345,6 +346,83 @@ class FederationClient(FederationBase):
 
         return state_event_ids, auth_event_ids
 
+    async def _check_sigs_and_hash_and_fetch(
+        self,
+        origin: str,
+        pdus: List[EventBase],
+        room_version: str,
+        outlier: bool = False,
+        include_none: bool = False,
+    ) -> List[EventBase]:
+        """Takes a list of PDUs and checks the signatures and hashs of each
+        one. If a PDU fails its signature check then we check if we have it in
+        the database and if not then request if from the originating server of
+        that PDU.
+
+        If a PDU fails its content hash check then it is redacted.
+
+        The given list of PDUs are not modified, instead the function returns
+        a new list.
+
+        Args:
+            origin
+            pdu
+            room_version
+            outlier: Whether the events are outliers or not
+            include_none: Whether to include None in the returned list
+                for events that have failed their checks
+
+        Returns:
+            Deferred : A list of PDUs that have valid signatures and hashes.
+        """
+        deferreds = self._check_sigs_and_hashes(room_version, pdus)
+
+        @defer.inlineCallbacks
+        def handle_check_result(pdu: EventBase, deferred: Deferred):
+            try:
+                res = yield make_deferred_yieldable(deferred)
+            except SynapseError:
+                res = None
+
+            if not res:
+                # Check local db.
+                res = yield self.store.get_event(
+                    pdu.event_id, allow_rejected=True, allow_none=True
+                )
+
+            if not res and pdu.origin != origin:
+                try:
+                    res = yield defer.ensureDeferred(
+                        self.get_pdu(
+                            destinations=[pdu.origin],
+                            event_id=pdu.event_id,
+                            room_version=room_version,  # type: ignore
+                            outlier=outlier,
+                            timeout=10000,
+                        )
+                    )
+                except SynapseError:
+                    pass
+
+            if not res:
+                logger.warning(
+                    "Failed to find copy of %s with valid signature", pdu.event_id
+                )
+
+            return res
+
+        handle = preserve_fn(handle_check_result)
+        deferreds2 = [handle(pdu, deferred) for pdu, deferred in zip(pdus, deferreds)]
+
+        valid_pdus = await make_deferred_yieldable(
+            defer.gatherResults(deferreds2, consumeErrors=True)
+        ).addErrback(unwrapFirstError)
+
+        if include_none:
+            return valid_pdus
+        else:
+            return [p for p in valid_pdus if p]
+
     async def get_event_auth(self, destination, room_id, event_id):
         res = await self.transport_layer.get_event_auth(destination, room_id, event_id)
 
@@ -615,7 +693,7 @@ class FederationClient(FederationBase):
             ]
             if auth_chain_create_events != [create_event.event_id]:
                 raise InvalidResponseError(
-                    "Unexpected create event(s) in auth chain"
+                    "Unexpected create event(s) in auth chain: %s"
                     % (auth_chain_create_events,)
                 )
 
diff --git a/synapse/federation/federation_server.py b/synapse/federation/federation_server.py
index 7f9da49326..275b9c99d7 100644
--- a/synapse/federation/federation_server.py
+++ b/synapse/federation/federation_server.py
@@ -470,57 +470,6 @@ class FederationServer(FederationBase):
             res = {"auth_chain": [a.get_pdu_json(time_now) for a in auth_pdus]}
         return 200, res
 
-    async def on_query_auth_request(self, origin, content, room_id, event_id):
-        """
-        Content is a dict with keys::
-            auth_chain (list): A list of events that give the auth chain.
-            missing (list): A list of event_ids indicating what the other
-              side (`origin`) think we're missing.
-            rejects (dict): A mapping from event_id to a 2-tuple of reason
-              string and a proof (or None) of why the event was rejected.
-              The keys of this dict give the list of events the `origin` has
-              rejected.
-
-        Args:
-            origin (str)
-            content (dict)
-            event_id (str)
-
-        Returns:
-            Deferred: Results in `dict` with the same format as `content`
-        """
-        with (await self._server_linearizer.queue((origin, room_id))):
-            origin_host, _ = parse_server_name(origin)
-            await self.check_server_matches_acl(origin_host, room_id)
-
-            room_version = await self.store.get_room_version(room_id)
-
-            auth_chain = [
-                event_from_pdu_json(e, room_version) for e in content["auth_chain"]
-            ]
-
-            signed_auth = await self._check_sigs_and_hash_and_fetch(
-                origin, auth_chain, outlier=True, room_version=room_version.identifier
-            )
-
-            ret = await self.handler.on_query_auth(
-                origin,
-                event_id,
-                room_id,
-                signed_auth,
-                content.get("rejects", []),
-                content.get("missing", []),
-            )
-
-            time_now = self._clock.time_msec()
-            send_content = {
-                "auth_chain": [e.get_pdu_json(time_now) for e in ret["auth_chain"]],
-                "rejects": ret.get("rejects", []),
-                "missing": ret.get("missing", []),
-            }
-
-        return 200, send_content
-
     @log_function
     def on_query_client_keys(self, origin, content):
         return self.on_query_request("client_keys", content)
diff --git a/synapse/federation/send_queue.py b/synapse/federation/send_queue.py
index 001bb304ae..876fb0e245 100644
--- a/synapse/federation/send_queue.py
+++ b/synapse/federation/send_queue.py
@@ -129,9 +129,9 @@ class FederationRemoteSendQueue(object):
             for key in keys[:i]:
                 del self.presence_changed[key]
 
-            user_ids = set(
+            user_ids = {
                 user_id for uids in self.presence_changed.values() for user_id in uids
-            )
+            }
 
             keys = self.presence_destinations.keys()
             i = self.presence_destinations.bisect_left(position_to_delete)
diff --git a/synapse/federation/transport/server.py b/synapse/federation/transport/server.py
index 92a9ae2320..af4595498c 100644
--- a/synapse/federation/transport/server.py
+++ b/synapse/federation/transport/server.py
@@ -643,17 +643,6 @@ class FederationClientKeysClaimServlet(BaseFederationServlet):
         return 200, response
 
 
-class FederationQueryAuthServlet(BaseFederationServlet):
-    PATH = "/query_auth/(?P<context>[^/]*)/(?P<event_id>[^/]*)"
-
-    async def on_POST(self, origin, content, query, context, event_id):
-        new_content = await self.handler.on_query_auth_request(
-            origin, content, context, event_id
-        )
-
-        return 200, new_content
-
-
 class FederationGetMissingEventsServlet(BaseFederationServlet):
     # TODO(paul): Why does this path alone end with "/?" optional?
     PATH = "/get_missing_events/(?P<room_id>[^/]*)/?"
@@ -1412,7 +1401,6 @@ FEDERATION_SERVLET_CLASSES = (
     FederationV2SendLeaveServlet,
     FederationV1InviteServlet,
     FederationV2InviteServlet,
-    FederationQueryAuthServlet,
     FederationGetMissingEventsServlet,
     FederationEventAuthServlet,
     FederationClientKeysQueryServlet,
diff --git a/synapse/groups/groups_server.py b/synapse/groups/groups_server.py
index c106abae21..4f0dc0a209 100644
--- a/synapse/groups/groups_server.py
+++ b/synapse/groups/groups_server.py
@@ -608,7 +608,7 @@ class GroupsServerHandler(GroupsServerWorkerHandler):
         user_results = yield self.store.get_users_in_group(
             group_id, include_private=True
         )
-        if user_id in [user_result["user_id"] for user_result in user_results]:
+        if user_id in (user_result["user_id"] for user_result in user_results):
             raise SynapseError(400, "User already in group")
 
         content = {
diff --git a/synapse/handlers/account_validity.py b/synapse/handlers/account_validity.py
index 829f52eca1..590135d19c 100644
--- a/synapse/handlers/account_validity.py
+++ b/synapse/handlers/account_validity.py
@@ -44,7 +44,11 @@ class AccountValidityHandler(object):
 
         self._account_validity = self.hs.config.account_validity
 
-        if self._account_validity.renew_by_email_enabled and load_jinja2_templates:
+        if (
+            self._account_validity.enabled
+            and self._account_validity.renew_by_email_enabled
+            and load_jinja2_templates
+        ):
             # Don't do email-specific configuration if renewal by email is disabled.
             try:
                 app_name = self.hs.config.email_app_name
diff --git a/synapse/handlers/auth.py b/synapse/handlers/auth.py
index 7ca90f91c4..7860f9625e 100644
--- a/synapse/handlers/auth.py
+++ b/synapse/handlers/auth.py
@@ -18,10 +18,10 @@ import logging
 import time
 import unicodedata
 import urllib.parse
-from typing import Any
+from typing import Any, Dict, Iterable, List, Optional
 
 import attr
-import bcrypt
+import bcrypt  # type: ignore[import]
 import pymacaroons
 
 from twisted.internet import defer
@@ -45,7 +45,7 @@ from synapse.http.site import SynapseRequest
 from synapse.logging.context import defer_to_thread
 from synapse.module_api import ModuleApi
 from synapse.push.mailer import load_jinja2_templates
-from synapse.types import UserID
+from synapse.types import Requester, UserID
 from synapse.util.caches.expiringcache import ExpiringCache
 
 from ._base import BaseHandler
@@ -63,11 +63,11 @@ class AuthHandler(BaseHandler):
         """
         super(AuthHandler, self).__init__(hs)
 
-        self.checkers = {}  # type: dict[str, UserInteractiveAuthChecker]
+        self.checkers = {}  # type: Dict[str, UserInteractiveAuthChecker]
         for auth_checker_class in INTERACTIVE_AUTH_CHECKERS:
             inst = auth_checker_class(hs)
             if inst.is_enabled():
-                self.checkers[inst.AUTH_TYPE] = inst
+                self.checkers[inst.AUTH_TYPE] = inst  # type: ignore
 
         self.bcrypt_rounds = hs.config.bcrypt_rounds
 
@@ -124,7 +124,9 @@ class AuthHandler(BaseHandler):
         self._whitelisted_sso_clients = tuple(hs.config.sso_client_whitelist)
 
     @defer.inlineCallbacks
-    def validate_user_via_ui_auth(self, requester, request_body, clientip):
+    def validate_user_via_ui_auth(
+        self, requester: Requester, request_body: Dict[str, Any], clientip: str
+    ):
         """
         Checks that the user is who they claim to be, via a UI auth.
 
@@ -133,11 +135,11 @@ class AuthHandler(BaseHandler):
         that it isn't stolen by re-authenticating them.
 
         Args:
-            requester (Requester): The user, as given by the access token
+            requester: The user, as given by the access token
 
-            request_body (dict): The body of the request sent by the client
+            request_body: The body of the request sent by the client
 
-            clientip (str): The IP address of the client.
+            clientip: The IP address of the client.
 
         Returns:
             defer.Deferred[dict]: the parameters for this request (which may
@@ -208,7 +210,9 @@ class AuthHandler(BaseHandler):
         return self.checkers.keys()
 
     @defer.inlineCallbacks
-    def check_auth(self, flows, clientdict, clientip):
+    def check_auth(
+        self, flows: List[List[str]], clientdict: Dict[str, Any], clientip: str
+    ):
         """
         Takes a dictionary sent by the client in the login / registration
         protocol and handles the User-Interactive Auth flow.
@@ -223,14 +227,14 @@ class AuthHandler(BaseHandler):
         decorator.
 
         Args:
-            flows (list): A list of login flows. Each flow is an ordered list of
-                          strings representing auth-types. At least one full
-                          flow must be completed in order for auth to be successful.
+            flows: A list of login flows. Each flow is an ordered list of
+                   strings representing auth-types. At least one full
+                   flow must be completed in order for auth to be successful.
 
             clientdict: The dictionary from the client root level, not the
                         'auth' key: this method prompts for auth if none is sent.
 
-            clientip (str): The IP address of the client.
+            clientip: The IP address of the client.
 
         Returns:
             defer.Deferred[dict, dict, str]: a deferred tuple of
@@ -250,7 +254,7 @@ class AuthHandler(BaseHandler):
         """
 
         authdict = None
-        sid = None
+        sid = None  # type: Optional[str]
         if clientdict and "auth" in clientdict:
             authdict = clientdict["auth"]
             del clientdict["auth"]
@@ -283,9 +287,9 @@ class AuthHandler(BaseHandler):
         creds = session["creds"]
 
         # check auth type currently being presented
-        errordict = {}
+        errordict = {}  # type: Dict[str, Any]
         if "type" in authdict:
-            login_type = authdict["type"]
+            login_type = authdict["type"]  # type: str
             try:
                 result = yield self._check_auth_dict(authdict, clientip)
                 if result:
@@ -326,7 +330,7 @@ class AuthHandler(BaseHandler):
         raise InteractiveAuthIncompleteError(ret)
 
     @defer.inlineCallbacks
-    def add_oob_auth(self, stagetype, authdict, clientip):
+    def add_oob_auth(self, stagetype: str, authdict: Dict[str, Any], clientip: str):
         """
         Adds the result of out-of-band authentication into an existing auth
         session. Currently used for adding the result of fallback auth.
@@ -348,7 +352,7 @@ class AuthHandler(BaseHandler):
             return True
         return False
 
-    def get_session_id(self, clientdict):
+    def get_session_id(self, clientdict: Dict[str, Any]) -> Optional[str]:
         """
         Gets the session ID for a client given the client dictionary
 
@@ -356,7 +360,7 @@ class AuthHandler(BaseHandler):
             clientdict: The dictionary sent by the client in the request
 
         Returns:
-            str|None: The string session ID the client sent. If the client did
+            The string session ID the client sent. If the client did
                 not send a session ID, returns None.
         """
         sid = None
@@ -366,40 +370,42 @@ class AuthHandler(BaseHandler):
                 sid = authdict["session"]
         return sid
 
-    def set_session_data(self, session_id, key, value):
+    def set_session_data(self, session_id: str, key: str, value: Any) -> None:
         """
         Store a key-value pair into the sessions data associated with this
         request. This data is stored server-side and cannot be modified by
         the client.
 
         Args:
-            session_id (string): The ID of this session as returned from check_auth
-            key (string): The key to store the data under
-            value (any): The data to store
+            session_id: The ID of this session as returned from check_auth
+            key: The key to store the data under
+            value: The data to store
         """
         sess = self._get_session_info(session_id)
         sess.setdefault("serverdict", {})[key] = value
         self._save_session(sess)
 
-    def get_session_data(self, session_id, key, default=None):
+    def get_session_data(
+        self, session_id: str, key: str, default: Optional[Any] = None
+    ) -> Any:
         """
         Retrieve data stored with set_session_data
 
         Args:
-            session_id (string): The ID of this session as returned from check_auth
-            key (string): The key to store the data under
-            default (any): Value to return if the key has not been set
+            session_id: The ID of this session as returned from check_auth
+            key: The key to store the data under
+            default: Value to return if the key has not been set
         """
         sess = self._get_session_info(session_id)
         return sess.setdefault("serverdict", {}).get(key, default)
 
     @defer.inlineCallbacks
-    def _check_auth_dict(self, authdict, clientip):
+    def _check_auth_dict(self, authdict: Dict[str, Any], clientip: str):
         """Attempt to validate the auth dict provided by a client
 
         Args:
-            authdict (object): auth dict provided by the client
-            clientip (str): IP address of the client
+            authdict: auth dict provided by the client
+            clientip: IP address of the client
 
         Returns:
             Deferred: result of the stage verification.
@@ -425,10 +431,10 @@ class AuthHandler(BaseHandler):
         (canonical_id, callback) = yield self.validate_login(user_id, authdict)
         return canonical_id
 
-    def _get_params_recaptcha(self):
+    def _get_params_recaptcha(self) -> dict:
         return {"public_key": self.hs.config.recaptcha_public_key}
 
-    def _get_params_terms(self):
+    def _get_params_terms(self) -> dict:
         return {
             "policies": {
                 "privacy_policy": {
@@ -445,7 +451,9 @@ class AuthHandler(BaseHandler):
             }
         }
 
-    def _auth_dict_for_flows(self, flows, session):
+    def _auth_dict_for_flows(
+        self, flows: List[List[str]], session: Dict[str, Any]
+    ) -> Dict[str, Any]:
         public_flows = []
         for f in flows:
             public_flows.append(f)
@@ -455,7 +463,7 @@ class AuthHandler(BaseHandler):
             LoginType.TERMS: self._get_params_terms,
         }
 
-        params = {}
+        params = {}  # type: Dict[str, Any]
 
         for f in public_flows:
             for stage in f:
@@ -468,7 +476,13 @@ class AuthHandler(BaseHandler):
             "params": params,
         }
 
-    def _get_session_info(self, session_id):
+    def _get_session_info(self, session_id: Optional[str]) -> dict:
+        """
+        Gets or creates a session given a session ID.
+
+        The session can be used to track data across multiple requests, e.g. for
+        interactive authentication.
+        """
         if session_id not in self.sessions:
             session_id = None
 
@@ -481,7 +495,9 @@ class AuthHandler(BaseHandler):
         return self.sessions[session_id]
 
     @defer.inlineCallbacks
-    def get_access_token_for_user_id(self, user_id, device_id, valid_until_ms):
+    def get_access_token_for_user_id(
+        self, user_id: str, device_id: Optional[str], valid_until_ms: Optional[int]
+    ):
         """
         Creates a new access token for the user with the given user ID.
 
@@ -491,11 +507,11 @@ class AuthHandler(BaseHandler):
         The device will be recorded in the table if it is not there already.
 
         Args:
-            user_id (str): canonical User ID
-            device_id (str|None): the device ID to associate with the tokens.
+            user_id: canonical User ID
+            device_id: the device ID to associate with the tokens.
                None to leave the tokens unassociated with a device (deprecated:
                we should always have a device ID)
-            valid_until_ms (int|None): when the token is valid until. None for
+            valid_until_ms: when the token is valid until. None for
                 no expiry.
         Returns:
               The access token for the user's session.
@@ -530,13 +546,13 @@ class AuthHandler(BaseHandler):
         return access_token
 
     @defer.inlineCallbacks
-    def check_user_exists(self, user_id):
+    def check_user_exists(self, user_id: str):
         """
         Checks to see if a user with the given id exists. Will check case
         insensitively, but return None if there are multiple inexact matches.
 
         Args:
-            (unicode|bytes) user_id: complete @user:id
+            user_id: complete @user:id
 
         Returns:
             defer.Deferred: (unicode) canonical_user_id, or None if zero or
@@ -551,7 +567,7 @@ class AuthHandler(BaseHandler):
         return None
 
     @defer.inlineCallbacks
-    def _find_user_id_and_pwd_hash(self, user_id):
+    def _find_user_id_and_pwd_hash(self, user_id: str):
         """Checks to see if a user with the given id exists. Will check case
         insensitively, but will return None if there are multiple inexact
         matches.
@@ -581,7 +597,7 @@ class AuthHandler(BaseHandler):
             )
         return result
 
-    def get_supported_login_types(self):
+    def get_supported_login_types(self) -> Iterable[str]:
         """Get a the login types supported for the /login API
 
         By default this is just 'm.login.password' (unless password_enabled is
@@ -589,20 +605,20 @@ class AuthHandler(BaseHandler):
         other login types.
 
         Returns:
-            Iterable[str]: login types
+            login types
         """
         return self._supported_login_types
 
     @defer.inlineCallbacks
-    def validate_login(self, username, login_submission):
+    def validate_login(self, username: str, login_submission: Dict[str, Any]):
         """Authenticates the user for the /login API
 
         Also used by the user-interactive auth flow to validate
         m.login.password auth types.
 
         Args:
-            username (str): username supplied by the user
-            login_submission (dict): the whole of the login submission
+            username: username supplied by the user
+            login_submission: the whole of the login submission
                 (including 'type' and other relevant fields)
         Returns:
             Deferred[str, func]: canonical user id, and optional callback
@@ -690,13 +706,13 @@ class AuthHandler(BaseHandler):
         raise LoginError(403, "Invalid password", errcode=Codes.FORBIDDEN)
 
     @defer.inlineCallbacks
-    def check_password_provider_3pid(self, medium, address, password):
+    def check_password_provider_3pid(self, medium: str, address: str, password: str):
         """Check if a password provider is able to validate a thirdparty login
 
         Args:
-            medium (str): The medium of the 3pid (ex. email).
-            address (str): The address of the 3pid (ex. jdoe@example.com).
-            password (str): The password of the user.
+            medium: The medium of the 3pid (ex. email).
+            address: The address of the 3pid (ex. jdoe@example.com).
+            password: The password of the user.
 
         Returns:
             Deferred[(str|None, func|None)]: A tuple of `(user_id,
@@ -724,15 +740,15 @@ class AuthHandler(BaseHandler):
         return None, None
 
     @defer.inlineCallbacks
-    def _check_local_password(self, user_id, password):
+    def _check_local_password(self, user_id: str, password: str):
         """Authenticate a user against the local password database.
 
         user_id is checked case insensitively, but will return None if there are
         multiple inexact matches.
 
         Args:
-            user_id (unicode): complete @user:id
-            password (unicode): the provided password
+            user_id: complete @user:id
+            password: the provided password
         Returns:
             Deferred[unicode] the canonical_user_id, or Deferred[None] if
                 unknown user/bad password
@@ -755,7 +771,7 @@ class AuthHandler(BaseHandler):
         return user_id
 
     @defer.inlineCallbacks
-    def validate_short_term_login_token_and_get_user_id(self, login_token):
+    def validate_short_term_login_token_and_get_user_id(self, login_token: str):
         auth_api = self.hs.get_auth()
         user_id = None
         try:
@@ -769,11 +785,11 @@ class AuthHandler(BaseHandler):
         return user_id
 
     @defer.inlineCallbacks
-    def delete_access_token(self, access_token):
+    def delete_access_token(self, access_token: str):
         """Invalidate a single access token
 
         Args:
-            access_token (str): access token to be deleted
+            access_token: access token to be deleted
 
         Returns:
             Deferred
@@ -798,15 +814,17 @@ class AuthHandler(BaseHandler):
 
     @defer.inlineCallbacks
     def delete_access_tokens_for_user(
-        self, user_id, except_token_id=None, device_id=None
+        self,
+        user_id: str,
+        except_token_id: Optional[str] = None,
+        device_id: Optional[str] = None,
     ):
         """Invalidate access tokens belonging to a user
 
         Args:
-            user_id (str):  ID of user the tokens belong to
-            except_token_id (str|None): access_token ID which should *not* be
-                deleted
-            device_id (str|None):  ID of device the tokens are associated with.
+            user_id:  ID of user the tokens belong to
+            except_token_id: access_token ID which should *not* be deleted
+            device_id:  ID of device the tokens are associated with.
                 If None, tokens associated with any device (or no device) will
                 be deleted
         Returns:
@@ -830,7 +848,7 @@ class AuthHandler(BaseHandler):
         )
 
     @defer.inlineCallbacks
-    def add_threepid(self, user_id, medium, address, validated_at):
+    def add_threepid(self, user_id: str, medium: str, address: str, validated_at: int):
         # check if medium has a valid value
         if medium not in ["email", "msisdn"]:
             raise SynapseError(
@@ -856,19 +874,20 @@ class AuthHandler(BaseHandler):
         )
 
     @defer.inlineCallbacks
-    def delete_threepid(self, user_id, medium, address, id_server=None):
+    def delete_threepid(
+        self, user_id: str, medium: str, address: str, id_server: Optional[str] = None
+    ):
         """Attempts to unbind the 3pid on the identity servers and deletes it
         from the local database.
 
         Args:
-            user_id (str)
-            medium (str)
-            address (str)
-            id_server (str|None): Use the given identity server when unbinding
+            user_id: ID of user to remove the 3pid from.
+            medium: The medium of the 3pid being removed: "email" or "msisdn".
+            address: The 3pid address to remove.
+            id_server: Use the given identity server when unbinding
                 any threepids. If None then will attempt to unbind using the
                 identity server specified when binding (if known).
 
-
         Returns:
             Deferred[bool]: Returns True if successfully unbound the 3pid on
             the identity server, False if identity server doesn't support the
@@ -887,17 +906,18 @@ class AuthHandler(BaseHandler):
         yield self.store.user_delete_threepid(user_id, medium, address)
         return result
 
-    def _save_session(self, session):
+    def _save_session(self, session: Dict[str, Any]) -> None:
+        """Update the last used time on the session to now and add it back to the session store."""
         # TODO: Persistent storage
         logger.debug("Saving session %s", session)
         session["last_used"] = self.hs.get_clock().time_msec()
         self.sessions[session["id"]] = session
 
-    def hash(self, password):
+    def hash(self, password: str):
         """Computes a secure hash of password.
 
         Args:
-            password (unicode): Password to hash.
+            password: Password to hash.
 
         Returns:
             Deferred(unicode): Hashed password.
@@ -914,12 +934,12 @@ class AuthHandler(BaseHandler):
 
         return defer_to_thread(self.hs.get_reactor(), _do_hash)
 
-    def validate_hash(self, password, stored_hash):
+    def validate_hash(self, password: str, stored_hash: bytes):
         """Validates that self.hash(password) == stored_hash.
 
         Args:
-            password (unicode): Password to hash.
-            stored_hash (bytes): Expected hash value.
+            password: Password to hash.
+            stored_hash: Expected hash value.
 
         Returns:
             Deferred(bool): Whether self.hash(password) == stored_hash.
@@ -1007,7 +1027,9 @@ class MacaroonGenerator(object):
 
     hs = attr.ib()
 
-    def generate_access_token(self, user_id, extra_caveats=None):
+    def generate_access_token(
+        self, user_id: str, extra_caveats: Optional[List[str]] = None
+    ) -> str:
         extra_caveats = extra_caveats or []
         macaroon = self._generate_base_macaroon(user_id)
         macaroon.add_first_party_caveat("type = access")
@@ -1020,16 +1042,9 @@ class MacaroonGenerator(object):
             macaroon.add_first_party_caveat(caveat)
         return macaroon.serialize()
 
-    def generate_short_term_login_token(self, user_id, duration_in_ms=(2 * 60 * 1000)):
-        """
-
-        Args:
-            user_id (unicode):
-            duration_in_ms (int):
-
-        Returns:
-            unicode
-        """
+    def generate_short_term_login_token(
+        self, user_id: str, duration_in_ms: int = (2 * 60 * 1000)
+    ) -> str:
         macaroon = self._generate_base_macaroon(user_id)
         macaroon.add_first_party_caveat("type = login")
         now = self.hs.get_clock().time_msec()
@@ -1037,12 +1052,12 @@ class MacaroonGenerator(object):
         macaroon.add_first_party_caveat("time < %d" % (expiry,))
         return macaroon.serialize()
 
-    def generate_delete_pusher_token(self, user_id):
+    def generate_delete_pusher_token(self, user_id: str) -> str:
         macaroon = self._generate_base_macaroon(user_id)
         macaroon.add_first_party_caveat("type = delete_pusher")
         return macaroon.serialize()
 
-    def _generate_base_macaroon(self, user_id):
+    def _generate_base_macaroon(self, user_id: str) -> pymacaroons.Macaroon:
         macaroon = pymacaroons.Macaroon(
             location=self.hs.config.server_name,
             identifier="key",
diff --git a/synapse/handlers/device.py b/synapse/handlers/device.py
index 50cea3f378..a514c30714 100644
--- a/synapse/handlers/device.py
+++ b/synapse/handlers/device.py
@@ -742,6 +742,6 @@ class DeviceListUpdater(object):
 
         # We clobber the seen updates since we've re-synced from a given
         # point.
-        self._seen_updates[user_id] = set([stream_id])
+        self._seen_updates[user_id] = {stream_id}
 
         defer.returnValue(result)
diff --git a/synapse/handlers/directory.py b/synapse/handlers/directory.py
index db2104c5f6..1d842c369b 100644
--- a/synapse/handlers/directory.py
+++ b/synapse/handlers/directory.py
@@ -13,10 +13,9 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-
 import logging
 import string
-from typing import List
+from typing import Iterable, List, Optional
 
 from twisted.internet import defer
 
@@ -29,6 +28,7 @@ from synapse.api.errors import (
     StoreError,
     SynapseError,
 )
+from synapse.appservice import ApplicationService
 from synapse.types import Requester, RoomAlias, UserID, get_domain_from_id
 
 from ._base import BaseHandler
@@ -56,7 +56,13 @@ class DirectoryHandler(BaseHandler):
         self.spam_checker = hs.get_spam_checker()
 
     @defer.inlineCallbacks
-    def _create_association(self, room_alias, room_id, servers=None, creator=None):
+    def _create_association(
+        self,
+        room_alias: RoomAlias,
+        room_id: str,
+        servers: Optional[Iterable[str]] = None,
+        creator: Optional[str] = None,
+    ):
         # general association creation for both human users and app services
 
         for wchar in string.whitespace:
@@ -71,7 +77,7 @@ class DirectoryHandler(BaseHandler):
         # TODO(erikj): Check if there is a current association.
         if not servers:
             users = yield self.state.get_current_users_in_room(room_id)
-            servers = set(get_domain_from_id(u) for u in users)
+            servers = {get_domain_from_id(u) for u in users}
 
         if not servers:
             raise SynapseError(400, "Failed to get server list")
@@ -82,17 +88,21 @@ class DirectoryHandler(BaseHandler):
 
     @defer.inlineCallbacks
     def create_association(
-        self, requester, room_alias, room_id, servers=None, check_membership=True,
+        self,
+        requester: Requester,
+        room_alias: RoomAlias,
+        room_id: str,
+        servers: Optional[List[str]] = None,
+        check_membership: bool = True,
     ):
         """Attempt to create a new alias
 
         Args:
-            requester (Requester)
-            room_alias (RoomAlias)
-            room_id (str)
-            servers (list[str]|None): List of servers that others servers
-                should try and join via
-            check_membership (bool): Whether to check if the user is in the room
+            requester
+            room_alias
+            room_id
+            servers: Iterable of servers that others servers should try and join via
+            check_membership: Whether to check if the user is in the room
                 before the alias can be set (if the server's config requires it).
 
         Returns:
@@ -146,15 +156,15 @@ class DirectoryHandler(BaseHandler):
         yield self._create_association(room_alias, room_id, servers, creator=user_id)
 
     @defer.inlineCallbacks
-    def delete_association(self, requester, room_alias):
+    def delete_association(self, requester: Requester, room_alias: RoomAlias):
         """Remove an alias from the directory
 
         (this is only meant for human users; AS users should call
         delete_appservice_association)
 
         Args:
-            requester (Requester):
-            room_alias (RoomAlias):
+            requester
+            room_alias
 
         Returns:
             Deferred[unicode]: room id that the alias used to point to
@@ -190,16 +200,16 @@ class DirectoryHandler(BaseHandler):
         room_id = yield self._delete_association(room_alias)
 
         try:
-            yield self._update_canonical_alias(
-                requester, requester.user.to_string(), room_id, room_alias
-            )
+            yield self._update_canonical_alias(requester, user_id, room_id, room_alias)
         except AuthError as e:
             logger.info("Failed to update alias events: %s", e)
 
         return room_id
 
     @defer.inlineCallbacks
-    def delete_appservice_association(self, service, room_alias):
+    def delete_appservice_association(
+        self, service: ApplicationService, room_alias: RoomAlias
+    ):
         if not service.is_interested_in_alias(room_alias.to_string()):
             raise SynapseError(
                 400,
@@ -209,7 +219,7 @@ class DirectoryHandler(BaseHandler):
         yield self._delete_association(room_alias)
 
     @defer.inlineCallbacks
-    def _delete_association(self, room_alias):
+    def _delete_association(self, room_alias: RoomAlias):
         if not self.hs.is_mine(room_alias):
             raise SynapseError(400, "Room alias must be local")
 
@@ -218,7 +228,7 @@ class DirectoryHandler(BaseHandler):
         return room_id
 
     @defer.inlineCallbacks
-    def get_association(self, room_alias):
+    def get_association(self, room_alias: RoomAlias):
         room_id = None
         if self.hs.is_mine(room_alias):
             result = yield self.get_association_from_room_alias(room_alias)
@@ -254,7 +264,7 @@ class DirectoryHandler(BaseHandler):
             )
 
         users = yield self.state.get_current_users_in_room(room_id)
-        extra_servers = set(get_domain_from_id(u) for u in users)
+        extra_servers = {get_domain_from_id(u) for u in users}
         servers = set(extra_servers) | set(servers)
 
         # If this server is in the list of servers, return it first.
@@ -283,23 +293,9 @@ class DirectoryHandler(BaseHandler):
             )
 
     @defer.inlineCallbacks
-    def send_room_alias_update_event(self, requester, room_id):
-        aliases = yield self.store.get_aliases_for_room(room_id)
-
-        yield self.event_creation_handler.create_and_send_nonmember_event(
-            requester,
-            {
-                "type": EventTypes.Aliases,
-                "state_key": self.hs.hostname,
-                "room_id": room_id,
-                "sender": requester.user.to_string(),
-                "content": {"aliases": aliases},
-            },
-            ratelimit=False,
-        )
-
-    @defer.inlineCallbacks
-    def _update_canonical_alias(self, requester, user_id, room_id, room_alias):
+    def _update_canonical_alias(
+        self, requester: Requester, user_id: str, room_id: str, room_alias: RoomAlias
+    ):
         """
         Send an updated canonical alias event if the removed alias was set as
         the canonical alias or listed in the alt_aliases field.
@@ -322,15 +318,17 @@ class DirectoryHandler(BaseHandler):
             send_update = True
             content.pop("alias", "")
 
-        # Filter alt_aliases for the removed alias.
-        alt_aliases = content.pop("alt_aliases", None)
-        # If the aliases are not a list (or not found) do not attempt to modify
-        # the list.
-        if isinstance(alt_aliases, list):
+        # Filter the alt_aliases property for the removed alias. Note that the
+        # value is not modified if alt_aliases is of an unexpected form.
+        alt_aliases = content.get("alt_aliases")
+        if isinstance(alt_aliases, (list, tuple)) and alias_str in alt_aliases:
             send_update = True
             alt_aliases = [alias for alias in alt_aliases if alias != alias_str]
+
             if alt_aliases:
                 content["alt_aliases"] = alt_aliases
+            else:
+                del content["alt_aliases"]
 
         if send_update:
             yield self.event_creation_handler.create_and_send_nonmember_event(
@@ -346,7 +344,7 @@ class DirectoryHandler(BaseHandler):
             )
 
     @defer.inlineCallbacks
-    def get_association_from_room_alias(self, room_alias):
+    def get_association_from_room_alias(self, room_alias: RoomAlias):
         result = yield self.store.get_association_from_room_alias(room_alias)
         if not result:
             # Query AS to see if it exists
@@ -354,7 +352,7 @@ class DirectoryHandler(BaseHandler):
             result = yield as_handler.query_room_alias_exists(room_alias)
         return result
 
-    def can_modify_alias(self, alias, user_id=None):
+    def can_modify_alias(self, alias: RoomAlias, user_id: Optional[str] = None):
         # Any application service "interested" in an alias they are regexing on
         # can modify the alias.
         # Users can only modify the alias if ALL the interested services have
@@ -375,22 +373,42 @@ class DirectoryHandler(BaseHandler):
         return defer.succeed(True)
 
     @defer.inlineCallbacks
-    def _user_can_delete_alias(self, alias, user_id):
+    def _user_can_delete_alias(self, alias: RoomAlias, user_id: str):
+        """Determine whether a user can delete an alias.
+
+        One of the following must be true:
+
+        1. The user created the alias.
+        2. The user is a server administrator.
+        3. The user has a power-level sufficient to send a canonical alias event
+           for the current room.
+
+        """
         creator = yield self.store.get_room_alias_creator(alias.to_string())
 
         if creator is not None and creator == user_id:
             return True
 
-        is_admin = yield self.auth.is_server_admin(UserID.from_string(user_id))
-        return is_admin
+        # Resolve the alias to the corresponding room.
+        room_mapping = yield self.get_association(alias)
+        room_id = room_mapping["room_id"]
+        if not room_id:
+            return False
+
+        res = yield self.auth.check_can_change_room_list(
+            room_id, UserID.from_string(user_id)
+        )
+        return res
 
     @defer.inlineCallbacks
-    def edit_published_room_list(self, requester, room_id, visibility):
+    def edit_published_room_list(
+        self, requester: Requester, room_id: str, visibility: str
+    ):
         """Edit the entry of the room in the published room list.
 
         requester
-        room_id (str)
-        visibility (str): "public" or "private"
+        room_id
+        visibility: "public" or "private"
         """
         user_id = requester.user.to_string()
 
@@ -415,7 +433,15 @@ class DirectoryHandler(BaseHandler):
         if room is None:
             raise SynapseError(400, "Unknown room")
 
-        yield self.auth.check_can_change_room_list(room_id, requester.user)
+        can_change_room_list = yield self.auth.check_can_change_room_list(
+            room_id, requester.user
+        )
+        if not can_change_room_list:
+            raise AuthError(
+                403,
+                "This server requires you to be a moderator in the room to"
+                " edit its room list entry",
+            )
 
         making_public = visibility == "public"
         if making_public:
@@ -436,16 +462,16 @@ class DirectoryHandler(BaseHandler):
 
     @defer.inlineCallbacks
     def edit_published_appservice_room_list(
-        self, appservice_id, network_id, room_id, visibility
+        self, appservice_id: str, network_id: str, room_id: str, visibility: str
     ):
         """Add or remove a room from the appservice/network specific public
         room list.
 
         Args:
-            appservice_id (str): ID of the appservice that owns the list
-            network_id (str): The ID of the network the list is associated with
-            room_id (str)
-            visibility (str): either "public" or "private"
+            appservice_id: ID of the appservice that owns the list
+            network_id: The ID of the network the list is associated with
+            room_id
+            visibility: either "public" or "private"
         """
         if visibility not in ["public", "private"]:
             raise SynapseError(400, "Invalid visibility setting")
diff --git a/synapse/handlers/e2e_room_keys.py b/synapse/handlers/e2e_room_keys.py
index f1b4424a02..9abaf13b8f 100644
--- a/synapse/handlers/e2e_room_keys.py
+++ b/synapse/handlers/e2e_room_keys.py
@@ -207,6 +207,13 @@ class E2eRoomKeysHandler(object):
             changed = False  # if anything has changed, we need to update the etag
             for room_id, room in iteritems(room_keys["rooms"]):
                 for session_id, room_key in iteritems(room["sessions"]):
+                    if not isinstance(room_key["is_verified"], bool):
+                        msg = (
+                            "is_verified must be a boolean in keys for session %s in"
+                            "room %s" % (session_id, room_id)
+                        )
+                        raise SynapseError(400, msg, Codes.INVALID_PARAM)
+
                     log_kv(
                         {
                             "message": "Trying to upload room key",
diff --git a/synapse/handlers/federation.py b/synapse/handlers/federation.py
index eb20ef4aec..38ab6a8fc3 100644
--- a/synapse/handlers/federation.py
+++ b/synapse/handlers/federation.py
@@ -41,7 +41,6 @@ from synapse.api.errors import (
     FederationDeniedError,
     FederationError,
     RequestSendFailed,
-    StoreError,
     SynapseError,
 )
 from synapse.api.room_versions import KNOWN_ROOM_VERSIONS, RoomVersion, RoomVersions
@@ -61,6 +60,7 @@ from synapse.replication.http.devices import ReplicationUserDevicesResyncRestSer
 from synapse.replication.http.federation import (
     ReplicationCleanRoomRestServlet,
     ReplicationFederationSendEventsRestServlet,
+    ReplicationStoreRoomOnInviteRestServlet,
 )
 from synapse.replication.http.membership import ReplicationUserJoinedLeftRoomRestServlet
 from synapse.state import StateResolutionStore, resolve_events_with_store
@@ -161,8 +161,12 @@ class FederationHandler(BaseHandler):
             self._user_device_resync = ReplicationUserDevicesResyncRestServlet.make_client(
                 hs
             )
+            self._maybe_store_room_on_invite = ReplicationStoreRoomOnInviteRestServlet.make_client(
+                hs
+            )
         else:
             self._device_list_updater = hs.get_device_handler().device_list_updater
+            self._maybe_store_room_on_invite = self.store.maybe_store_room_on_invite
 
         # When joining a room we need to queue any events for that room up
         self.room_queues = {}
@@ -659,11 +663,11 @@ class FederationHandler(BaseHandler):
         # this can happen if a remote server claims that the state or
         # auth_events at an event in room A are actually events in room B
 
-        bad_events = list(
+        bad_events = [
             (event_id, event.room_id)
             for event_id, event in fetched_events.items()
             if event.room_id != room_id
-        )
+        ]
 
         for bad_event_id, bad_room_id in bad_events:
             # This is a bogus situation, but since we may only discover it a long time
@@ -707,28 +711,6 @@ class FederationHandler(BaseHandler):
         except AuthError as e:
             raise FederationError("ERROR", e.code, e.msg, affected=event.event_id)
 
-        room = await self.store.get_room(room_id)
-
-        if not room:
-            try:
-                prev_state_ids = await context.get_prev_state_ids()
-                create_event = await self.store.get_event(
-                    prev_state_ids[(EventTypes.Create, "")]
-                )
-
-                room_version_id = create_event.content.get(
-                    "room_version", RoomVersions.V1.identifier
-                )
-
-                await self.store.store_room(
-                    room_id=room_id,
-                    room_creator_user_id="",
-                    is_public=False,
-                    room_version=KNOWN_ROOM_VERSIONS[room_version_id],
-                )
-            except StoreError:
-                logger.exception("Failed to store room.")
-
         if event.type == EventTypes.Member:
             if event.membership == Membership.JOIN:
                 # Only fire user_joined_room if the user has acutally
@@ -856,7 +838,7 @@ class FederationHandler(BaseHandler):
 
         # Don't bother processing events we already have.
         seen_events = await self.store.have_events_in_timeline(
-            set(e.event_id for e in events)
+            {e.event_id for e in events}
         )
 
         events = [e for e in events if e.event_id not in seen_events]
@@ -866,7 +848,7 @@ class FederationHandler(BaseHandler):
 
         event_map = {e.event_id: e for e in events}
 
-        event_ids = set(e.event_id for e in events)
+        event_ids = {e.event_id for e in events}
 
         # build a list of events whose prev_events weren't in the batch.
         # (XXX: this will include events whose prev_events we already have; that doesn't
@@ -892,13 +874,13 @@ class FederationHandler(BaseHandler):
             state_events.update({s.event_id: s for s in state})
             events_to_state[e_id] = state
 
-        required_auth = set(
+        required_auth = {
             a_id
             for event in events
             + list(state_events.values())
             + list(auth_events.values())
             for a_id in event.auth_event_ids()
-        )
+        }
         auth_events.update(
             {e_id: event_map[e_id] for e_id in required_auth if e_id in event_map}
         )
@@ -1247,7 +1229,7 @@ class FederationHandler(BaseHandler):
     async def on_event_auth(self, event_id: str) -> List[EventBase]:
         event = await self.store.get_event(event_id)
         auth = await self.store.get_auth_chain(
-            [auth_id for auth_id in event.auth_event_ids()], include_given=True
+            list(event.auth_event_ids()), include_given=True
         )
         return list(auth)
 
@@ -1323,16 +1305,18 @@ class FederationHandler(BaseHandler):
 
             logger.debug("do_invite_join event: %s", event)
 
-            try:
-                await self.store.store_room(
-                    room_id=room_id,
-                    room_creator_user_id="",
-                    is_public=False,
-                    room_version=room_version_obj,
-                )
-            except Exception:
-                # FIXME
-                pass
+            # if this is the first time we've joined this room, it's time to add
+            # a row to `rooms` with the correct room version. If there's already a
+            # row there, we should override it, since it may have been populated
+            # based on an invite request which lied about the room version.
+            #
+            # federation_client.send_join has already checked that the room
+            # version in the received create event is the same as room_version_obj,
+            # so we can rely on it now.
+            #
+            await self.store.upsert_room_on_join(
+                room_id=room_id, room_version=room_version_obj,
+            )
 
             await self._persist_auth_tree(
                 origin, auth_chain, state, event, room_version_obj
@@ -1558,6 +1542,13 @@ class FederationHandler(BaseHandler):
         if event.state_key == self._server_notices_mxid:
             raise SynapseError(http_client.FORBIDDEN, "Cannot invite this user")
 
+        # keep a record of the room version, if we don't yet know it.
+        # (this may get overwritten if we later get a different room version in a
+        # join dance).
+        await self._maybe_store_room_on_invite(
+            room_id=event.room_id, room_version=room_version
+        )
+
         event.internal_metadata.outlier = True
         event.internal_metadata.out_of_band_membership = True
 
@@ -2152,7 +2143,7 @@ class FederationHandler(BaseHandler):
 
         # Now get the current auth_chain for the event.
         local_auth_chain = await self.store.get_auth_chain(
-            [auth_id for auth_id in event.auth_event_ids()], include_given=True
+            list(event.auth_event_ids()), include_given=True
         )
 
         # TODO: Check if we would now reject event_id. If so we need to tell
@@ -2654,7 +2645,7 @@ class FederationHandler(BaseHandler):
             member_handler = self.hs.get_room_member_handler()
             yield member_handler.send_membership_event(None, event, context)
         else:
-            destinations = set(x.split(":", 1)[-1] for x in (sender_user_id, room_id))
+            destinations = {x.split(":", 1)[-1] for x in (sender_user_id, room_id)}
             yield self.federation_client.forward_third_party_invite(
                 destinations, room_id, event_dict
             )
diff --git a/synapse/handlers/message.py b/synapse/handlers/message.py
index d6be280952..b743fc2dcc 100644
--- a/synapse/handlers/message.py
+++ b/synapse/handlers/message.py
@@ -160,7 +160,7 @@ class MessageHandler(object):
                 raise NotFoundError("Can't find event for token %s" % (at_token,))
 
             visible_events = yield filter_events_for_client(
-                self.storage, user_id, last_events, apply_retention_policies=False
+                self.storage, user_id, last_events, filter_send_to_client=False
             )
 
             event = last_events[0]
@@ -888,19 +888,60 @@ class EventCreationHandler(object):
         yield self.base_handler.maybe_kick_guest_users(event, context)
 
         if event.type == EventTypes.CanonicalAlias:
-            # Check the alias is acually valid (at this time at least)
+            # Validate a newly added alias or newly added alt_aliases.
+
+            original_alias = None
+            original_alt_aliases = set()
+
+            original_event_id = event.unsigned.get("replaces_state")
+            if original_event_id:
+                original_event = yield self.store.get_event(original_event_id)
+
+                if original_event:
+                    original_alias = original_event.content.get("alias", None)
+                    original_alt_aliases = original_event.content.get("alt_aliases", [])
+
+            # Check the alias is currently valid (if it has changed).
             room_alias_str = event.content.get("alias", None)
-            if room_alias_str:
+            directory_handler = self.hs.get_handlers().directory_handler
+            if room_alias_str and room_alias_str != original_alias:
                 room_alias = RoomAlias.from_string(room_alias_str)
-                directory_handler = self.hs.get_handlers().directory_handler
                 mapping = yield directory_handler.get_association(room_alias)
 
                 if mapping["room_id"] != event.room_id:
                     raise SynapseError(
                         400,
                         "Room alias %s does not point to the room" % (room_alias_str,),
+                        Codes.BAD_ALIAS,
                     )
 
+            # Check that alt_aliases is the proper form.
+            alt_aliases = event.content.get("alt_aliases", [])
+            if not isinstance(alt_aliases, (list, tuple)):
+                raise SynapseError(
+                    400, "The alt_aliases property must be a list.", Codes.INVALID_PARAM
+                )
+
+            # If the old version of alt_aliases is of an unknown form,
+            # completely replace it.
+            if not isinstance(original_alt_aliases, (list, tuple)):
+                original_alt_aliases = []
+
+            # Check that each alias is currently valid.
+            new_alt_aliases = set(alt_aliases) - set(original_alt_aliases)
+            if new_alt_aliases:
+                for alias_str in new_alt_aliases:
+                    room_alias = RoomAlias.from_string(alias_str)
+                    mapping = yield directory_handler.get_association(room_alias)
+
+                    if mapping["room_id"] != event.room_id:
+                        raise SynapseError(
+                            400,
+                            "Room alias %s does not point to the room"
+                            % (room_alias_str,),
+                            Codes.BAD_ALIAS,
+                        )
+
         federation_handler = self.hs.get_handlers().federation_handler
 
         if event.type == EventTypes.Member:
@@ -1016,11 +1057,10 @@ class EventCreationHandler(object):
             # matters as sometimes presence code can take a while.
             run_in_background(self._bump_active_time, requester.user)
 
-    @defer.inlineCallbacks
-    def _bump_active_time(self, user):
+    async def _bump_active_time(self, user):
         try:
             presence = self.hs.get_presence_handler()
-            yield presence.bump_presence_active_time(user)
+            await presence.bump_presence_active_time(user)
         except Exception:
             logger.exception("Error bumping presence active time")
 
diff --git a/synapse/handlers/presence.py b/synapse/handlers/presence.py
index 202aa9294f..5526015ddb 100644
--- a/synapse/handlers/presence.py
+++ b/synapse/handlers/presence.py
@@ -24,11 +24,12 @@ The methods that define policy are:
 
 import logging
 from contextlib import contextmanager
-from typing import Dict, Set
+from typing import Dict, List, Set
 
 from six import iteritems, itervalues
 
 from prometheus_client import Counter
+from typing_extensions import ContextManager
 
 from twisted.internet import defer
 
@@ -42,10 +43,14 @@ from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.storage.presence import UserPresenceState
 from synapse.types import UserID, get_domain_from_id
 from synapse.util.async_helpers import Linearizer
-from synapse.util.caches.descriptors import cachedInlineCallbacks
+from synapse.util.caches.descriptors import cached
 from synapse.util.metrics import Measure
 from synapse.util.wheel_timer import WheelTimer
 
+MYPY = False
+if MYPY:
+    import synapse.server
+
 logger = logging.getLogger(__name__)
 
 
@@ -97,7 +102,6 @@ assert LAST_ACTIVE_GRANULARITY < IDLE_TIMER
 class PresenceHandler(object):
     def __init__(self, hs: "synapse.server.HomeServer"):
         self.hs = hs
-        self.is_mine = hs.is_mine
         self.is_mine_id = hs.is_mine_id
         self.server_name = hs.hostname
         self.clock = hs.get_clock()
@@ -150,7 +154,7 @@ class PresenceHandler(object):
 
         # Set of users who have presence in the `user_to_current_state` that
         # have not yet been persisted
-        self.unpersisted_users_changes = set()
+        self.unpersisted_users_changes = set()  # type: Set[str]
 
         hs.get_reactor().addSystemEventTrigger(
             "before",
@@ -160,12 +164,11 @@ class PresenceHandler(object):
             self._on_shutdown,
         )
 
-        self.serial_to_user = {}
         self._next_serial = 1
 
         # Keeps track of the number of *ongoing* syncs on this process. While
         # this is non zero a user will never go offline.
-        self.user_to_num_current_syncs = {}
+        self.user_to_num_current_syncs = {}  # type: Dict[str, int]
 
         # Keeps track of the number of *ongoing* syncs on other processes.
         # While any sync is ongoing on another process the user will never
@@ -213,8 +216,7 @@ class PresenceHandler(object):
         self._event_pos = self.store.get_current_events_token()
         self._event_processing = False
 
-    @defer.inlineCallbacks
-    def _on_shutdown(self):
+    async def _on_shutdown(self):
         """Gets called when shutting down. This lets us persist any updates that
         we haven't yet persisted, e.g. updates that only changes some internal
         timers. This allows changes to persist across startup without having to
@@ -235,7 +237,7 @@ class PresenceHandler(object):
 
         if self.unpersisted_users_changes:
 
-            yield self.store.update_presence(
+            await self.store.update_presence(
                 [
                     self.user_to_current_state[user_id]
                     for user_id in self.unpersisted_users_changes
@@ -243,8 +245,7 @@ class PresenceHandler(object):
             )
         logger.info("Finished _on_shutdown")
 
-    @defer.inlineCallbacks
-    def _persist_unpersisted_changes(self):
+    async def _persist_unpersisted_changes(self):
         """We periodically persist the unpersisted changes, as otherwise they
         may stack up and slow down shutdown times.
         """
@@ -253,12 +254,11 @@ class PresenceHandler(object):
 
         if unpersisted:
             logger.info("Persisting %d unpersisted presence updates", len(unpersisted))
-            yield self.store.update_presence(
+            await self.store.update_presence(
                 [self.user_to_current_state[user_id] for user_id in unpersisted]
             )
 
-    @defer.inlineCallbacks
-    def _update_states(self, new_states):
+    async def _update_states(self, new_states):
         """Updates presence of users. Sets the appropriate timeouts. Pokes
         the notifier and federation if and only if the changed presence state
         should be sent to clients/servers.
@@ -267,7 +267,7 @@ class PresenceHandler(object):
 
         with Measure(self.clock, "presence_update_states"):
 
-            # NOTE: We purposefully don't yield between now and when we've
+            # NOTE: We purposefully don't await between now and when we've
             # calculated what we want to do with the new states, to avoid races.
 
             to_notify = {}  # Changes we want to notify everyone about
@@ -311,9 +311,9 @@ class PresenceHandler(object):
 
             if to_notify:
                 notified_presence_counter.inc(len(to_notify))
-                yield self._persist_and_notify(list(to_notify.values()))
+                await self._persist_and_notify(list(to_notify.values()))
 
-            self.unpersisted_users_changes |= set(s.user_id for s in new_states)
+            self.unpersisted_users_changes |= {s.user_id for s in new_states}
             self.unpersisted_users_changes -= set(to_notify.keys())
 
             to_federation_ping = {
@@ -326,7 +326,7 @@ class PresenceHandler(object):
 
                 self._push_to_remotes(to_federation_ping.values())
 
-    def _handle_timeouts(self):
+    async def _handle_timeouts(self):
         """Checks the presence of users that have timed out and updates as
         appropriate.
         """
@@ -368,10 +368,9 @@ class PresenceHandler(object):
             now=now,
         )
 
-        return self._update_states(changes)
+        return await self._update_states(changes)
 
-    @defer.inlineCallbacks
-    def bump_presence_active_time(self, user):
+    async def bump_presence_active_time(self, user):
         """We've seen the user do something that indicates they're interacting
         with the app.
         """
@@ -383,16 +382,17 @@ class PresenceHandler(object):
 
         bump_active_time_counter.inc()
 
-        prev_state = yield self.current_state_for_user(user_id)
+        prev_state = await self.current_state_for_user(user_id)
 
         new_fields = {"last_active_ts": self.clock.time_msec()}
         if prev_state.state == PresenceState.UNAVAILABLE:
             new_fields["state"] = PresenceState.ONLINE
 
-        yield self._update_states([prev_state.copy_and_replace(**new_fields)])
+        await self._update_states([prev_state.copy_and_replace(**new_fields)])
 
-    @defer.inlineCallbacks
-    def user_syncing(self, user_id, affect_presence=True):
+    async def user_syncing(
+        self, user_id: str, affect_presence: bool = True
+    ) -> ContextManager[None]:
         """Returns a context manager that should surround any stream requests
         from the user.
 
@@ -415,11 +415,11 @@ class PresenceHandler(object):
             curr_sync = self.user_to_num_current_syncs.get(user_id, 0)
             self.user_to_num_current_syncs[user_id] = curr_sync + 1
 
-            prev_state = yield self.current_state_for_user(user_id)
+            prev_state = await self.current_state_for_user(user_id)
             if prev_state.state == PresenceState.OFFLINE:
                 # If they're currently offline then bring them online, otherwise
                 # just update the last sync times.
-                yield self._update_states(
+                await self._update_states(
                     [
                         prev_state.copy_and_replace(
                             state=PresenceState.ONLINE,
@@ -429,7 +429,7 @@ class PresenceHandler(object):
                     ]
                 )
             else:
-                yield self._update_states(
+                await self._update_states(
                     [
                         prev_state.copy_and_replace(
                             last_user_sync_ts=self.clock.time_msec()
@@ -437,13 +437,12 @@ class PresenceHandler(object):
                     ]
                 )
 
-        @defer.inlineCallbacks
-        def _end():
+        async def _end():
             try:
                 self.user_to_num_current_syncs[user_id] -= 1
 
-                prev_state = yield self.current_state_for_user(user_id)
-                yield self._update_states(
+                prev_state = await self.current_state_for_user(user_id)
+                await self._update_states(
                     [
                         prev_state.copy_and_replace(
                             last_user_sync_ts=self.clock.time_msec()
@@ -480,8 +479,7 @@ class PresenceHandler(object):
         else:
             return set()
 
-    @defer.inlineCallbacks
-    def update_external_syncs_row(
+    async def update_external_syncs_row(
         self, process_id, user_id, is_syncing, sync_time_msec
     ):
         """Update the syncing users for an external process as a delta.
@@ -494,8 +492,8 @@ class PresenceHandler(object):
             is_syncing (bool): Whether or not the user is now syncing
             sync_time_msec(int): Time in ms when the user was last syncing
         """
-        with (yield self.external_sync_linearizer.queue(process_id)):
-            prev_state = yield self.current_state_for_user(user_id)
+        with (await self.external_sync_linearizer.queue(process_id)):
+            prev_state = await self.current_state_for_user(user_id)
 
             process_presence = self.external_process_to_current_syncs.setdefault(
                 process_id, set()
@@ -525,25 +523,24 @@ class PresenceHandler(object):
                 process_presence.discard(user_id)
 
             if updates:
-                yield self._update_states(updates)
+                await self._update_states(updates)
 
             self.external_process_last_updated_ms[process_id] = self.clock.time_msec()
 
-    @defer.inlineCallbacks
-    def update_external_syncs_clear(self, process_id):
+    async def update_external_syncs_clear(self, process_id):
         """Marks all users that had been marked as syncing by a given process
         as offline.
 
         Used when the process has stopped/disappeared.
         """
-        with (yield self.external_sync_linearizer.queue(process_id)):
+        with (await self.external_sync_linearizer.queue(process_id)):
             process_presence = self.external_process_to_current_syncs.pop(
                 process_id, set()
             )
-            prev_states = yield self.current_state_for_users(process_presence)
+            prev_states = await self.current_state_for_users(process_presence)
             time_now_ms = self.clock.time_msec()
 
-            yield self._update_states(
+            await self._update_states(
                 [
                     prev_state.copy_and_replace(last_user_sync_ts=time_now_ms)
                     for prev_state in itervalues(prev_states)
@@ -551,15 +548,13 @@ class PresenceHandler(object):
             )
             self.external_process_last_updated_ms.pop(process_id, None)
 
-    @defer.inlineCallbacks
-    def current_state_for_user(self, user_id):
+    async def current_state_for_user(self, user_id):
         """Get the current presence state for a user.
         """
-        res = yield self.current_state_for_users([user_id])
+        res = await self.current_state_for_users([user_id])
         return res[user_id]
 
-    @defer.inlineCallbacks
-    def current_state_for_users(self, user_ids):
+    async def current_state_for_users(self, user_ids):
         """Get the current presence state for multiple users.
 
         Returns:
@@ -574,7 +569,7 @@ class PresenceHandler(object):
         if missing:
             # There are things not in our in memory cache. Lets pull them out of
             # the database.
-            res = yield self.store.get_presence_for_users(missing)
+            res = await self.store.get_presence_for_users(missing)
             states.update(res)
 
             missing = [user_id for user_id, state in iteritems(states) if not state]
@@ -587,14 +582,13 @@ class PresenceHandler(object):
 
         return states
 
-    @defer.inlineCallbacks
-    def _persist_and_notify(self, states):
+    async def _persist_and_notify(self, states):
         """Persist states in the database, poke the notifier and send to
         interested remote servers
         """
-        stream_id, max_token = yield self.store.update_presence(states)
+        stream_id, max_token = await self.store.update_presence(states)
 
-        parties = yield get_interested_parties(self.store, states)
+        parties = await get_interested_parties(self.store, states)
         room_ids_to_states, users_to_states = parties
 
         self.notifier.on_new_event(
@@ -606,9 +600,8 @@ class PresenceHandler(object):
 
         self._push_to_remotes(states)
 
-    @defer.inlineCallbacks
-    def notify_for_states(self, state, stream_id):
-        parties = yield get_interested_parties(self.store, [state])
+    async def notify_for_states(self, state, stream_id):
+        parties = await get_interested_parties(self.store, [state])
         room_ids_to_states, users_to_states = parties
 
         self.notifier.on_new_event(
@@ -626,8 +619,7 @@ class PresenceHandler(object):
         """
         self.federation.send_presence(states)
 
-    @defer.inlineCallbacks
-    def incoming_presence(self, origin, content):
+    async def incoming_presence(self, origin, content):
         """Called when we receive a `m.presence` EDU from a remote server.
         """
         now = self.clock.time_msec()
@@ -670,21 +662,19 @@ class PresenceHandler(object):
             new_fields["status_msg"] = push.get("status_msg", None)
             new_fields["currently_active"] = push.get("currently_active", False)
 
-            prev_state = yield self.current_state_for_user(user_id)
+            prev_state = await self.current_state_for_user(user_id)
             updates.append(prev_state.copy_and_replace(**new_fields))
 
         if updates:
             federation_presence_counter.inc(len(updates))
-            yield self._update_states(updates)
+            await self._update_states(updates)
 
-    @defer.inlineCallbacks
-    def get_state(self, target_user, as_event=False):
-        results = yield self.get_states([target_user.to_string()], as_event=as_event)
+    async def get_state(self, target_user, as_event=False):
+        results = await self.get_states([target_user.to_string()], as_event=as_event)
 
         return results[0]
 
-    @defer.inlineCallbacks
-    def get_states(self, target_user_ids, as_event=False):
+    async def get_states(self, target_user_ids, as_event=False):
         """Get the presence state for users.
 
         Args:
@@ -695,10 +685,10 @@ class PresenceHandler(object):
             list
         """
 
-        updates = yield self.current_state_for_users(target_user_ids)
+        updates = await self.current_state_for_users(target_user_ids)
         updates = list(updates.values())
 
-        for user_id in set(target_user_ids) - set(u.user_id for u in updates):
+        for user_id in set(target_user_ids) - {u.user_id for u in updates}:
             updates.append(UserPresenceState.default(user_id))
 
         now = self.clock.time_msec()
@@ -713,8 +703,7 @@ class PresenceHandler(object):
         else:
             return updates
 
-    @defer.inlineCallbacks
-    def set_state(self, target_user, state, ignore_status_msg=False):
+    async def set_state(self, target_user, state, ignore_status_msg=False):
         """Set the presence state of the user.
         """
         status_msg = state.get("status_msg", None)
@@ -730,7 +719,7 @@ class PresenceHandler(object):
 
         user_id = target_user.to_string()
 
-        prev_state = yield self.current_state_for_user(user_id)
+        prev_state = await self.current_state_for_user(user_id)
 
         new_fields = {"state": presence}
 
@@ -741,16 +730,15 @@ class PresenceHandler(object):
         if presence == PresenceState.ONLINE:
             new_fields["last_active_ts"] = self.clock.time_msec()
 
-        yield self._update_states([prev_state.copy_and_replace(**new_fields)])
+        await self._update_states([prev_state.copy_and_replace(**new_fields)])
 
-    @defer.inlineCallbacks
-    def is_visible(self, observed_user, observer_user):
+    async def is_visible(self, observed_user, observer_user):
         """Returns whether a user can see another user's presence.
         """
-        observer_room_ids = yield self.store.get_rooms_for_user(
+        observer_room_ids = await self.store.get_rooms_for_user(
             observer_user.to_string()
         )
-        observed_room_ids = yield self.store.get_rooms_for_user(
+        observed_room_ids = await self.store.get_rooms_for_user(
             observed_user.to_string()
         )
 
@@ -759,8 +747,7 @@ class PresenceHandler(object):
 
         return False
 
-    @defer.inlineCallbacks
-    def get_all_presence_updates(self, last_id, current_id):
+    async def get_all_presence_updates(self, last_id, current_id):
         """
         Gets a list of presence update rows from between the given stream ids.
         Each row has:
@@ -775,7 +762,7 @@ class PresenceHandler(object):
         """
         # TODO(markjh): replicate the unpersisted changes.
         # This could use the in-memory stores for recent changes.
-        rows = yield self.store.get_all_presence_updates(last_id, current_id)
+        rows = await self.store.get_all_presence_updates(last_id, current_id)
         return rows
 
     def notify_new_event(self):
@@ -786,20 +773,18 @@ class PresenceHandler(object):
         if self._event_processing:
             return
 
-        @defer.inlineCallbacks
-        def _process_presence():
+        async def _process_presence():
             assert not self._event_processing
 
             self._event_processing = True
             try:
-                yield self._unsafe_process()
+                await self._unsafe_process()
             finally:
                 self._event_processing = False
 
         run_as_background_process("presence.notify_new_event", _process_presence)
 
-    @defer.inlineCallbacks
-    def _unsafe_process(self):
+    async def _unsafe_process(self):
         # Loop round handling deltas until we're up to date
         while True:
             with Measure(self.clock, "presence_delta"):
@@ -812,10 +797,10 @@ class PresenceHandler(object):
                     self._event_pos,
                     room_max_stream_ordering,
                 )
-                max_pos, deltas = yield self.store.get_current_state_deltas(
+                max_pos, deltas = await self.store.get_current_state_deltas(
                     self._event_pos, room_max_stream_ordering
                 )
-                yield self._handle_state_delta(deltas)
+                await self._handle_state_delta(deltas)
 
                 self._event_pos = max_pos
 
@@ -824,8 +809,7 @@ class PresenceHandler(object):
                     max_pos
                 )
 
-    @defer.inlineCallbacks
-    def _handle_state_delta(self, deltas):
+    async def _handle_state_delta(self, deltas):
         """Process current state deltas to find new joins that need to be
         handled.
         """
@@ -846,13 +830,13 @@ class PresenceHandler(object):
                 # joins.
                 continue
 
-            event = yield self.store.get_event(event_id, allow_none=True)
+            event = await self.store.get_event(event_id, allow_none=True)
             if not event or event.content.get("membership") != Membership.JOIN:
                 # We only care about joins
                 continue
 
             if prev_event_id:
-                prev_event = yield self.store.get_event(prev_event_id, allow_none=True)
+                prev_event = await self.store.get_event(prev_event_id, allow_none=True)
                 if (
                     prev_event
                     and prev_event.content.get("membership") == Membership.JOIN
@@ -860,10 +844,9 @@ class PresenceHandler(object):
                     # Ignore changes to join events.
                     continue
 
-            yield self._on_user_joined_room(room_id, state_key)
+            await self._on_user_joined_room(room_id, state_key)
 
-    @defer.inlineCallbacks
-    def _on_user_joined_room(self, room_id, user_id):
+    async def _on_user_joined_room(self, room_id, user_id):
         """Called when we detect a user joining the room via the current state
         delta stream.
 
@@ -882,11 +865,11 @@ class PresenceHandler(object):
             # TODO: We should be able to filter the hosts down to those that
             # haven't previously seen the user
 
-            state = yield self.current_state_for_user(user_id)
-            hosts = yield self.state.get_current_hosts_in_room(room_id)
+            state = await self.current_state_for_user(user_id)
+            hosts = await self.state.get_current_hosts_in_room(room_id)
 
             # Filter out ourselves.
-            hosts = set(host for host in hosts if host != self.server_name)
+            hosts = {host for host in hosts if host != self.server_name}
 
             self.federation.send_presence_to_destinations(
                 states=[state], destinations=hosts
@@ -903,10 +886,10 @@ class PresenceHandler(object):
             # TODO: Check that this is actually a new server joining the
             # room.
 
-            user_ids = yield self.state.get_current_users_in_room(room_id)
+            user_ids = await self.state.get_current_users_in_room(room_id)
             user_ids = list(filter(self.is_mine_id, user_ids))
 
-            states = yield self.current_state_for_users(user_ids)
+            states = await self.current_state_for_users(user_ids)
 
             # Filter out old presence, i.e. offline presence states where
             # the user hasn't been active for a week. We can change this
@@ -996,9 +979,8 @@ class PresenceEventSource(object):
         self.store = hs.get_datastore()
         self.state = hs.get_state_handler()
 
-    @defer.inlineCallbacks
     @log_function
-    def get_new_events(
+    async def get_new_events(
         self,
         user,
         from_key,
@@ -1045,7 +1027,7 @@ class PresenceEventSource(object):
             presence = self.get_presence_handler()
             stream_change_cache = self.store.presence_stream_cache
 
-            users_interested_in = yield self._get_interested_in(user, explicit_room_id)
+            users_interested_in = await self._get_interested_in(user, explicit_room_id)
 
             user_ids_changed = set()
             changed = None
@@ -1071,7 +1053,7 @@ class PresenceEventSource(object):
                 else:
                     user_ids_changed = users_interested_in
 
-            updates = yield presence.current_state_for_users(user_ids_changed)
+            updates = await presence.current_state_for_users(user_ids_changed)
 
         if include_offline:
             return (list(updates.values()), max_token)
@@ -1084,11 +1066,11 @@ class PresenceEventSource(object):
     def get_current_key(self):
         return self.store.get_current_presence_token()
 
-    def get_pagination_rows(self, user, pagination_config, key):
-        return self.get_new_events(user, from_key=None, include_offline=False)
+    async def get_pagination_rows(self, user, pagination_config, key):
+        return await self.get_new_events(user, from_key=None, include_offline=False)
 
-    @cachedInlineCallbacks(num_args=2, cache_context=True)
-    def _get_interested_in(self, user, explicit_room_id, cache_context):
+    @cached(num_args=2, cache_context=True)
+    async def _get_interested_in(self, user, explicit_room_id, cache_context):
         """Returns the set of users that the given user should see presence
         updates for
         """
@@ -1096,13 +1078,13 @@ class PresenceEventSource(object):
         users_interested_in = set()
         users_interested_in.add(user_id)  # So that we receive our own presence
 
-        users_who_share_room = yield self.store.get_users_who_share_room_with_user(
+        users_who_share_room = await self.store.get_users_who_share_room_with_user(
             user_id, on_invalidate=cache_context.invalidate
         )
         users_interested_in.update(users_who_share_room)
 
         if explicit_room_id:
-            user_ids = yield self.store.get_users_in_room(
+            user_ids = await self.store.get_users_in_room(
                 explicit_room_id, on_invalidate=cache_context.invalidate
             )
             users_interested_in.update(user_ids)
@@ -1277,8 +1259,8 @@ def get_interested_parties(store, states):
         2-tuple: `(room_ids_to_states, users_to_states)`,
         with each item being a dict of `entity_name` -> `[UserPresenceState]`
     """
-    room_ids_to_states = {}
-    users_to_states = {}
+    room_ids_to_states = {}  # type: Dict[str, List[UserPresenceState]]
+    users_to_states = {}  # type: Dict[str, List[UserPresenceState]]
     for state in states:
         room_ids = yield store.get_rooms_for_user(state.user_id)
         for room_id in room_ids:
diff --git a/synapse/handlers/profile.py b/synapse/handlers/profile.py
index f9579d69ee..50ce0c585b 100644
--- a/synapse/handlers/profile.py
+++ b/synapse/handlers/profile.py
@@ -28,7 +28,7 @@ from synapse.api.errors import (
     SynapseError,
 )
 from synapse.metrics.background_process_metrics import run_as_background_process
-from synapse.types import UserID, get_domain_from_id
+from synapse.types import UserID, create_requester, get_domain_from_id
 
 from ._base import BaseHandler
 
@@ -165,6 +165,12 @@ class BaseProfileHandler(BaseHandler):
         if new_displayname == "":
             new_displayname = None
 
+        # If the admin changes the display name of a user, the requesting user cannot send
+        # the join event to update the displayname in the rooms.
+        # This must be done by the target user himself.
+        if by_admin:
+            requester = create_requester(target_user)
+
         yield self.store.set_profile_displayname(target_user.localpart, new_displayname)
 
         if self.hs.config.user_directory_search_all_users:
@@ -217,6 +223,10 @@ class BaseProfileHandler(BaseHandler):
                 400, "Avatar URL is too long (max %i)" % (MAX_AVATAR_URL_LEN,)
             )
 
+        # Same like set_displayname
+        if by_admin:
+            requester = create_requester(target_user)
+
         yield self.store.set_profile_avatar_url(target_user.localpart, new_avatar_url)
 
         if self.hs.config.user_directory_search_all_users:
diff --git a/synapse/handlers/receipts.py b/synapse/handlers/receipts.py
index 9283c039e3..8bc100db42 100644
--- a/synapse/handlers/receipts.py
+++ b/synapse/handlers/receipts.py
@@ -94,7 +94,7 @@ class ReceiptsHandler(BaseHandler):
             # no new receipts
             return False
 
-        affected_room_ids = list(set([r.room_id for r in receipts]))
+        affected_room_ids = list({r.room_id for r in receipts})
 
         self.notifier.on_new_event("receipt_key", max_batch_id, rooms=affected_room_ids)
         # Note that the min here shouldn't be relied upon to be accurate.
diff --git a/synapse/handlers/room.py b/synapse/handlers/room.py
index 49ec2f48bc..f580ab2e9f 100644
--- a/synapse/handlers/room.py
+++ b/synapse/handlers/room.py
@@ -149,7 +149,9 @@ class RoomCreationHandler(BaseHandler):
         return ret
 
     @defer.inlineCallbacks
-    def _upgrade_room(self, requester, old_room_id, new_version):
+    def _upgrade_room(
+        self, requester: Requester, old_room_id: str, new_version: RoomVersion
+    ):
         user_id = requester.user.to_string()
 
         # start by allocating a new room id
@@ -290,16 +292,6 @@ class RoomCreationHandler(BaseHandler):
             except AuthError as e:
                 logger.warning("Unable to update PLs in old room: %s", e)
 
-        new_pl_content = copy_power_levels_contents(old_room_pl_state.content)
-
-        # pre-msc2260 rooms may not have the right setting for aliases. If no other
-        # value is set, set it now.
-        events_default = new_pl_content.get("events_default", 0)
-        new_pl_content.setdefault("events", {}).setdefault(
-            EventTypes.Aliases, events_default
-        )
-
-        logger.debug("Setting correct PLs in new room to %s", new_pl_content)
         yield self.event_creation_handler.create_and_send_nonmember_event(
             requester,
             {
@@ -307,7 +299,7 @@ class RoomCreationHandler(BaseHandler):
                 "state_key": "",
                 "room_id": new_room_id,
                 "sender": requester.user.to_string(),
-                "content": new_pl_content,
+                "content": old_room_pl_state.content,
             },
             ratelimit=False,
         )
@@ -353,7 +345,7 @@ class RoomCreationHandler(BaseHandler):
             # If so, mark the new room as non-federatable as well
             creation_content["m.federate"] = False
 
-        initial_state = dict()
+        initial_state = {}
 
         # Replicate relevant room events
         types_to_copy = (
@@ -448,19 +440,21 @@ class RoomCreationHandler(BaseHandler):
 
     @defer.inlineCallbacks
     def _move_aliases_to_new_room(
-        self, requester, old_room_id, new_room_id, old_room_state
+        self,
+        requester: Requester,
+        old_room_id: str,
+        new_room_id: str,
+        old_room_state: StateMap[str],
     ):
         directory_handler = self.hs.get_handlers().directory_handler
 
         aliases = yield self.store.get_aliases_for_room(old_room_id)
 
         # check to see if we have a canonical alias.
-        canonical_alias = None
+        canonical_alias_event = None
         canonical_alias_event_id = old_room_state.get((EventTypes.CanonicalAlias, ""))
         if canonical_alias_event_id:
             canonical_alias_event = yield self.store.get_event(canonical_alias_event_id)
-            if canonical_alias_event:
-                canonical_alias = canonical_alias_event.content.get("alias", "")
 
         # first we try to remove the aliases from the old room (we suppress sending
         # the room_aliases event until the end).
@@ -488,19 +482,6 @@ class RoomCreationHandler(BaseHandler):
         if not removed_aliases:
             return
 
-        try:
-            # this can fail if, for some reason, our user doesn't have perms to send
-            # m.room.aliases events in the old room (note that we've already checked that
-            # they have perms to send a tombstone event, so that's not terribly likely).
-            #
-            # If that happens, it's regrettable, but we should carry on: it's the same
-            # as when you remove an alias from the directory normally - it just means that
-            # the aliases event gets out of sync with the directory
-            # (cf https://github.com/vector-im/riot-web/issues/2369)
-            yield directory_handler.send_room_alias_update_event(requester, old_room_id)
-        except AuthError as e:
-            logger.warning("Failed to send updated alias event on old room: %s", e)
-
         # we can now add any aliases we successfully removed to the new room.
         for alias in removed_aliases:
             try:
@@ -517,8 +498,10 @@ class RoomCreationHandler(BaseHandler):
                 # checking module decides it shouldn't, or similar.
                 logger.error("Error adding alias %s to new room: %s", alias, e)
 
+        # If a canonical alias event existed for the old room, fire a canonical
+        # alias event for the new room with a copy of the information.
         try:
-            if canonical_alias and (canonical_alias in removed_aliases):
+            if canonical_alias_event:
                 yield self.event_creation_handler.create_and_send_nonmember_event(
                     requester,
                     {
@@ -526,12 +509,10 @@ class RoomCreationHandler(BaseHandler):
                         "state_key": "",
                         "room_id": new_room_id,
                         "sender": requester.user.to_string(),
-                        "content": {"alias": canonical_alias},
+                        "content": canonical_alias_event.content,
                     },
                     ratelimit=False,
                 )
-
-            yield directory_handler.send_room_alias_update_event(requester, new_room_id)
         except SynapseError as e:
             # again I'm not really expecting this to fail, but if it does, I'd rather
             # we returned the new room to the client at this point.
@@ -757,7 +738,6 @@ class RoomCreationHandler(BaseHandler):
 
         if room_alias:
             result["room_alias"] = room_alias.to_string()
-            yield directory_handler.send_room_alias_update_event(requester, room_id)
 
         return result
 
@@ -824,10 +804,6 @@ class RoomCreationHandler(BaseHandler):
                     EventTypes.RoomHistoryVisibility: 100,
                     EventTypes.CanonicalAlias: 50,
                     EventTypes.RoomAvatar: 50,
-                    # MSC2260: Allow everybody to send alias events by default
-                    # This will be reudundant on pre-MSC2260 rooms, since the
-                    # aliases event is special-cased.
-                    EventTypes.Aliases: 0,
                     EventTypes.Tombstone: 100,
                     EventTypes.ServerACL: 100,
                 },
diff --git a/synapse/handlers/room_list.py b/synapse/handlers/room_list.py
index c615206df1..0b7d3da680 100644
--- a/synapse/handlers/room_list.py
+++ b/synapse/handlers/room_list.py
@@ -216,15 +216,6 @@ class RoomListHandler(BaseHandler):
                         direction_is_forward=False,
                     ).to_token()
 
-        for room in results:
-            # populate search result entries with additional fields, namely
-            # 'aliases'
-            room_id = room["room_id"]
-
-            aliases = yield self.store.get_aliases_for_room(room_id)
-            if aliases:
-                room["aliases"] = aliases
-
         response["chunk"] = results
 
         response["total_room_count_estimate"] = yield self.store.count_public_rooms(
diff --git a/synapse/handlers/saml_handler.py b/synapse/handlers/saml_handler.py
index 9406753393..72c109981b 100644
--- a/synapse/handlers/saml_handler.py
+++ b/synapse/handlers/saml_handler.py
@@ -23,6 +23,7 @@ from saml2.client import Saml2Client
 
 from synapse.api.errors import SynapseError
 from synapse.config import ConfigError
+from synapse.http.server import finish_request
 from synapse.http.servlet import parse_string
 from synapse.module_api import ModuleApi
 from synapse.types import (
@@ -73,6 +74,8 @@ class SamlHandler:
         # a lock on the mappings
         self._mapping_lock = Linearizer(name="saml_mapping", clock=self._clock)
 
+        self._error_html_content = hs.config.saml2_error_html_content
+
     def handle_redirect_request(self, client_redirect_url):
         """Handle an incoming request to /login/sso/redirect
 
@@ -114,7 +117,22 @@ class SamlHandler:
         # the dict.
         self.expire_sessions()
 
-        user_id = await self._map_saml_response_to_user(resp_bytes, relay_state)
+        try:
+            user_id = await self._map_saml_response_to_user(resp_bytes, relay_state)
+        except Exception as e:
+            # If decoding the response or mapping it to a user failed, then log the
+            # error and tell the user that something went wrong.
+            logger.error(e)
+
+            request.setResponseCode(400)
+            request.setHeader(b"Content-Type", b"text/html; charset=utf-8")
+            request.setHeader(
+                b"Content-Length", b"%d" % (len(self._error_html_content),)
+            )
+            request.write(self._error_html_content.encode("utf8"))
+            finish_request(request)
+            return
+
         self._auth_handler.complete_sso_login(user_id, request, relay_state)
 
     async def _map_saml_response_to_user(self, resp_bytes, client_redirect_url):
diff --git a/synapse/handlers/search.py b/synapse/handlers/search.py
index 110097eab9..ec1542d416 100644
--- a/synapse/handlers/search.py
+++ b/synapse/handlers/search.py
@@ -184,7 +184,7 @@ class SearchHandler(BaseHandler):
             membership_list=[Membership.JOIN],
             # membership_list=[Membership.JOIN, Membership.LEAVE, Membership.Ban],
         )
-        room_ids = set(r.room_id for r in rooms)
+        room_ids = {r.room_id for r in rooms}
 
         # If doing a subset of all rooms seearch, check if any of the rooms
         # are from an upgraded room, and search their contents as well
@@ -374,12 +374,12 @@ class SearchHandler(BaseHandler):
                 ).to_string()
 
                 if include_profile:
-                    senders = set(
+                    senders = {
                         ev.sender
                         for ev in itertools.chain(
                             res["events_before"], [event], res["events_after"]
                         )
-                    )
+                    }
 
                     if res["events_after"]:
                         last_event_id = res["events_after"][-1].event_id
@@ -421,7 +421,7 @@ class SearchHandler(BaseHandler):
 
         state_results = {}
         if include_state:
-            rooms = set(e.room_id for e in allowed_events)
+            rooms = {e.room_id for e in allowed_events}
             for room_id in rooms:
                 state = yield self.state_handler.get_current_state(room_id)
                 state_results[room_id] = list(state.values())
diff --git a/synapse/handlers/set_password.py b/synapse/handlers/set_password.py
index d90c9e0108..12657ca698 100644
--- a/synapse/handlers/set_password.py
+++ b/synapse/handlers/set_password.py
@@ -13,10 +13,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import logging
+from typing import Optional
 
 from twisted.internet import defer
 
 from synapse.api.errors import Codes, StoreError, SynapseError
+from synapse.types import Requester
 
 from ._base import BaseHandler
 
@@ -32,14 +34,17 @@ class SetPasswordHandler(BaseHandler):
         self._device_handler = hs.get_device_handler()
 
     @defer.inlineCallbacks
-    def set_password(self, user_id, newpassword, requester=None):
+    def set_password(
+        self,
+        user_id: str,
+        new_password: str,
+        logout_devices: bool,
+        requester: Optional[Requester] = None,
+    ):
         if not self.hs.config.password_localdb_enabled:
             raise SynapseError(403, "Password change disabled", errcode=Codes.FORBIDDEN)
 
-        password_hash = yield self._auth_handler.hash(newpassword)
-
-        except_device_id = requester.device_id if requester else None
-        except_access_token_id = requester.access_token_id if requester else None
+        password_hash = yield self._auth_handler.hash(new_password)
 
         try:
             yield self.store.user_set_password_hash(user_id, password_hash)
@@ -48,14 +53,18 @@ class SetPasswordHandler(BaseHandler):
                 raise SynapseError(404, "Unknown user", Codes.NOT_FOUND)
             raise e
 
-        # we want to log out all of the user's other sessions. First delete
-        # all his other devices.
-        yield self._device_handler.delete_all_devices_for_user(
-            user_id, except_device_id=except_device_id
-        )
-
-        # and now delete any access tokens which weren't associated with
-        # devices (or were associated with this device).
-        yield self._auth_handler.delete_access_tokens_for_user(
-            user_id, except_token_id=except_access_token_id
-        )
+        # Optionally, log out all of the user's other sessions.
+        if logout_devices:
+            except_device_id = requester.device_id if requester else None
+            except_access_token_id = requester.access_token_id if requester else None
+
+            # First delete all of their other devices.
+            yield self._device_handler.delete_all_devices_for_user(
+                user_id, except_device_id=except_device_id
+            )
+
+            # and now delete any access tokens which weren't associated with
+            # devices (or were associated with this device).
+            yield self._auth_handler.delete_access_tokens_for_user(
+                user_id, except_token_id=except_access_token_id
+            )
diff --git a/synapse/handlers/sync.py b/synapse/handlers/sync.py
index 4324bc702e..669dbc8a48 100644
--- a/synapse/handlers/sync.py
+++ b/synapse/handlers/sync.py
@@ -682,11 +682,9 @@ class SyncHandler(object):
 
         # FIXME: order by stream ordering rather than as returned by SQL
         if joined_user_ids or invited_user_ids:
-            summary["m.heroes"] = sorted(
-                [user_id for user_id in (joined_user_ids + invited_user_ids)]
-            )[0:5]
+            summary["m.heroes"] = sorted(joined_user_ids + invited_user_ids)[0:5]
         else:
-            summary["m.heroes"] = sorted([user_id for user_id in gone_user_ids])[0:5]
+            summary["m.heroes"] = sorted(gone_user_ids)[0:5]
 
         if not sync_config.filter_collection.lazy_load_members():
             return summary
@@ -697,9 +695,9 @@ class SyncHandler(object):
 
         # track which members the client should already know about via LL:
         # Ones which are already in state...
-        existing_members = set(
+        existing_members = {
             user_id for (typ, user_id) in state.keys() if typ == EventTypes.Member
-        )
+        }
 
         # ...or ones which are in the timeline...
         for ev in batch.events:
@@ -773,10 +771,10 @@ class SyncHandler(object):
                 # We only request state for the members needed to display the
                 # timeline:
 
-                members_to_fetch = set(
+                members_to_fetch = {
                     event.sender  # FIXME: we also care about invite targets etc.
                     for event in batch.events
-                )
+                }
 
                 if full_state:
                     # always make sure we LL ourselves so we know we're in the room
@@ -1993,10 +1991,10 @@ def _calculate_state(
         )
     }
 
-    c_ids = set(e for e in itervalues(current))
-    ts_ids = set(e for e in itervalues(timeline_start))
-    p_ids = set(e for e in itervalues(previous))
-    tc_ids = set(e for e in itervalues(timeline_contains))
+    c_ids = set(itervalues(current))
+    ts_ids = set(itervalues(timeline_start))
+    p_ids = set(itervalues(previous))
+    tc_ids = set(itervalues(timeline_contains))
 
     # If we are lazyloading room members, we explicitly add the membership events
     # for the senders in the timeline into the state block returned by /sync,
diff --git a/synapse/handlers/typing.py b/synapse/handlers/typing.py
index 5406618431..391bceb0c4 100644
--- a/synapse/handlers/typing.py
+++ b/synapse/handlers/typing.py
@@ -198,7 +198,7 @@ class TypingHandler(object):
                 now=now, obj=member, then=now + FEDERATION_PING_INTERVAL
             )
 
-            for domain in set(get_domain_from_id(u) for u in users):
+            for domain in {get_domain_from_id(u) for u in users}:
                 if domain != self.server_name:
                     logger.debug("sending typing update to %s", domain)
                     self.federation.build_and_send_edu(
@@ -231,7 +231,7 @@ class TypingHandler(object):
             return
 
         users = yield self.state.get_current_users_in_room(room_id)
-        domains = set(get_domain_from_id(u) for u in users)
+        domains = {get_domain_from_id(u) for u in users}
 
         if self.server_name in domains:
             logger.info("Got typing update from %s: %r", user_id, content)
diff --git a/synapse/http/client.py b/synapse/http/client.py
index d4c285445e..3797545824 100644
--- a/synapse/http/client.py
+++ b/synapse/http/client.py
@@ -244,9 +244,6 @@ class SimpleHttpClient(object):
         pool.maxPersistentPerHost = max((100 * CACHE_SIZE_FACTOR, 5))
         pool.cachedConnectionTimeout = 2 * 60
 
-        # The default context factory in Twisted 14.0.0 (which we require) is
-        # BrowserLikePolicyForHTTPS which will do regular cert validation
-        # 'like a browser'
         self.agent = ProxyAgent(
             self.reactor,
             connectTimeout=15,
diff --git a/synapse/http/federation/matrix_federation_agent.py b/synapse/http/federation/matrix_federation_agent.py
index 647d26dc56..f5f917f5ae 100644
--- a/synapse/http/federation/matrix_federation_agent.py
+++ b/synapse/http/federation/matrix_federation_agent.py
@@ -45,7 +45,7 @@ class MatrixFederationAgent(object):
     Args:
         reactor (IReactor): twisted reactor to use for underlying requests
 
-        tls_client_options_factory (ClientTLSOptionsFactory|None):
+        tls_client_options_factory (FederationPolicyForHTTPS|None):
             factory to use for fetching client tls options, or none to disable TLS.
 
         _srv_resolver (SrvResolver|None):
diff --git a/synapse/logging/context.py b/synapse/logging/context.py
index 1b940842f6..860b99a4c6 100644
--- a/synapse/logging/context.py
+++ b/synapse/logging/context.py
@@ -27,10 +27,15 @@ import inspect
 import logging
 import threading
 import types
-from typing import Any, List
+from typing import TYPE_CHECKING, Optional, Tuple, TypeVar, Union
+
+from typing_extensions import Literal
 
 from twisted.internet import defer, threads
 
+if TYPE_CHECKING:
+    from synapse.logging.scopecontextmanager import _LogContextScope
+
 logger = logging.getLogger(__name__)
 
 try:
@@ -91,7 +96,7 @@ class ContextResourceUsage(object):
         "evt_db_fetch_count",
     ]
 
-    def __init__(self, copy_from=None):
+    def __init__(self, copy_from: "Optional[ContextResourceUsage]" = None) -> None:
         """Create a new ContextResourceUsage
 
         Args:
@@ -101,27 +106,28 @@ class ContextResourceUsage(object):
         if copy_from is None:
             self.reset()
         else:
-            self.ru_utime = copy_from.ru_utime
-            self.ru_stime = copy_from.ru_stime
-            self.db_txn_count = copy_from.db_txn_count
+            # FIXME: mypy can't infer the types set via reset() above, so specify explicitly for now
+            self.ru_utime = copy_from.ru_utime  # type: float
+            self.ru_stime = copy_from.ru_stime  # type: float
+            self.db_txn_count = copy_from.db_txn_count  # type: int
 
-            self.db_txn_duration_sec = copy_from.db_txn_duration_sec
-            self.db_sched_duration_sec = copy_from.db_sched_duration_sec
-            self.evt_db_fetch_count = copy_from.evt_db_fetch_count
+            self.db_txn_duration_sec = copy_from.db_txn_duration_sec  # type: float
+            self.db_sched_duration_sec = copy_from.db_sched_duration_sec  # type: float
+            self.evt_db_fetch_count = copy_from.evt_db_fetch_count  # type: int
 
-    def copy(self):
+    def copy(self) -> "ContextResourceUsage":
         return ContextResourceUsage(copy_from=self)
 
-    def reset(self):
+    def reset(self) -> None:
         self.ru_stime = 0.0
         self.ru_utime = 0.0
         self.db_txn_count = 0
 
-        self.db_txn_duration_sec = 0
-        self.db_sched_duration_sec = 0
+        self.db_txn_duration_sec = 0.0
+        self.db_sched_duration_sec = 0.0
         self.evt_db_fetch_count = 0
 
-    def __repr__(self):
+    def __repr__(self) -> str:
         return (
             "<ContextResourceUsage ru_stime='%r', ru_utime='%r', "
             "db_txn_count='%r', db_txn_duration_sec='%r', "
@@ -135,7 +141,7 @@ class ContextResourceUsage(object):
             self.evt_db_fetch_count,
         )
 
-    def __iadd__(self, other):
+    def __iadd__(self, other: "ContextResourceUsage") -> "ContextResourceUsage":
         """Add another ContextResourceUsage's stats to this one's.
 
         Args:
@@ -149,7 +155,7 @@ class ContextResourceUsage(object):
         self.evt_db_fetch_count += other.evt_db_fetch_count
         return self
 
-    def __isub__(self, other):
+    def __isub__(self, other: "ContextResourceUsage") -> "ContextResourceUsage":
         self.ru_utime -= other.ru_utime
         self.ru_stime -= other.ru_stime
         self.db_txn_count -= other.db_txn_count
@@ -158,17 +164,20 @@ class ContextResourceUsage(object):
         self.evt_db_fetch_count -= other.evt_db_fetch_count
         return self
 
-    def __add__(self, other):
+    def __add__(self, other: "ContextResourceUsage") -> "ContextResourceUsage":
         res = ContextResourceUsage(copy_from=self)
         res += other
         return res
 
-    def __sub__(self, other):
+    def __sub__(self, other: "ContextResourceUsage") -> "ContextResourceUsage":
         res = ContextResourceUsage(copy_from=self)
         res -= other
         return res
 
 
+LoggingContextOrSentinel = Union["LoggingContext", "LoggingContext.Sentinel"]
+
+
 class LoggingContext(object):
     """Additional context for log formatting. Contexts are scoped within a
     "with" block.
@@ -201,7 +210,15 @@ class LoggingContext(object):
     class Sentinel(object):
         """Sentinel to represent the root context"""
 
-        __slots__ = []  # type: List[Any]
+        __slots__ = ["previous_context", "alive", "request", "scope", "tag"]
+
+        def __init__(self) -> None:
+            # Minimal set for compatibility with LoggingContext
+            self.previous_context = None
+            self.alive = None
+            self.request = None
+            self.scope = None
+            self.tag = None
 
         def __str__(self):
             return "sentinel"
@@ -235,7 +252,7 @@ class LoggingContext(object):
 
     sentinel = Sentinel()
 
-    def __init__(self, name=None, parent_context=None, request=None):
+    def __init__(self, name=None, parent_context=None, request=None) -> None:
         self.previous_context = LoggingContext.current_context()
         self.name = name
 
@@ -250,7 +267,7 @@ class LoggingContext(object):
         self.request = None
         self.tag = ""
         self.alive = True
-        self.scope = None
+        self.scope = None  # type: Optional[_LogContextScope]
 
         self.parent_context = parent_context
 
@@ -261,13 +278,13 @@ class LoggingContext(object):
             # the request param overrides the request from the parent context
             self.request = request
 
-    def __str__(self):
+    def __str__(self) -> str:
         if self.request:
             return str(self.request)
         return "%s@%x" % (self.name, id(self))
 
     @classmethod
-    def current_context(cls):
+    def current_context(cls) -> LoggingContextOrSentinel:
         """Get the current logging context from thread local storage
 
         Returns:
@@ -276,7 +293,9 @@ class LoggingContext(object):
         return getattr(cls.thread_local, "current_context", cls.sentinel)
 
     @classmethod
-    def set_current_context(cls, context):
+    def set_current_context(
+        cls, context: LoggingContextOrSentinel
+    ) -> LoggingContextOrSentinel:
         """Set the current logging context in thread local storage
         Args:
             context(LoggingContext): The context to activate.
@@ -291,7 +310,7 @@ class LoggingContext(object):
             context.start()
         return current
 
-    def __enter__(self):
+    def __enter__(self) -> "LoggingContext":
         """Enters this logging context into thread local storage"""
         old_context = self.set_current_context(self)
         if self.previous_context != old_context:
@@ -304,7 +323,7 @@ class LoggingContext(object):
 
         return self
 
-    def __exit__(self, type, value, traceback):
+    def __exit__(self, type, value, traceback) -> None:
         """Restore the logging context in thread local storage to the state it
         was before this context was entered.
         Returns:
@@ -318,7 +337,6 @@ class LoggingContext(object):
                 logger.warning(
                     "Expected logging context %s but found %s", self, current
                 )
-        self.previous_context = None
         self.alive = False
 
         # if we have a parent, pass our CPU usage stats on
@@ -330,7 +348,7 @@ class LoggingContext(object):
             # reset them in case we get entered again
             self._resource_usage.reset()
 
-    def copy_to(self, record):
+    def copy_to(self, record) -> None:
         """Copy logging fields from this context to a log record or
         another LoggingContext
         """
@@ -341,14 +359,14 @@ class LoggingContext(object):
         # we also track the current scope:
         record.scope = self.scope
 
-    def copy_to_twisted_log_entry(self, record):
+    def copy_to_twisted_log_entry(self, record) -> None:
         """
         Copy logging fields from this context to a Twisted log record.
         """
         record["request"] = self.request
         record["scope"] = self.scope
 
-    def start(self):
+    def start(self) -> None:
         if get_thread_id() != self.main_thread:
             logger.warning("Started logcontext %s on different thread", self)
             return
@@ -358,7 +376,7 @@ class LoggingContext(object):
         if not self.usage_start:
             self.usage_start = get_thread_resource_usage()
 
-    def stop(self):
+    def stop(self) -> None:
         if get_thread_id() != self.main_thread:
             logger.warning("Stopped logcontext %s on different thread", self)
             return
@@ -378,7 +396,7 @@ class LoggingContext(object):
 
         self.usage_start = None
 
-    def get_resource_usage(self):
+    def get_resource_usage(self) -> ContextResourceUsage:
         """Get resources used by this logcontext so far.
 
         Returns:
@@ -398,11 +416,13 @@ class LoggingContext(object):
 
         return res
 
-    def _get_cputime(self):
+    def _get_cputime(self) -> Tuple[float, float]:
         """Get the cpu usage time so far
 
         Returns: Tuple[float, float]: seconds in user mode, seconds in system mode
         """
+        assert self.usage_start is not None
+
         current = get_thread_resource_usage()
 
         # Indicate to mypy that we know that self.usage_start is None.
@@ -430,13 +450,13 @@ class LoggingContext(object):
 
         return utime_delta, stime_delta
 
-    def add_database_transaction(self, duration_sec):
+    def add_database_transaction(self, duration_sec: float) -> None:
         if duration_sec < 0:
             raise ValueError("DB txn time can only be non-negative")
         self._resource_usage.db_txn_count += 1
         self._resource_usage.db_txn_duration_sec += duration_sec
 
-    def add_database_scheduled(self, sched_sec):
+    def add_database_scheduled(self, sched_sec: float) -> None:
         """Record a use of the database pool
 
         Args:
@@ -447,7 +467,7 @@ class LoggingContext(object):
             raise ValueError("DB scheduling time can only be non-negative")
         self._resource_usage.db_sched_duration_sec += sched_sec
 
-    def record_event_fetch(self, event_count):
+    def record_event_fetch(self, event_count: int) -> None:
         """Record a number of events being fetched from the db
 
         Args:
@@ -464,10 +484,10 @@ class LoggingContextFilter(logging.Filter):
             missing fields
     """
 
-    def __init__(self, **defaults):
+    def __init__(self, **defaults) -> None:
         self.defaults = defaults
 
-    def filter(self, record):
+    def filter(self, record) -> Literal[True]:
         """Add each fields from the logging contexts to the record.
         Returns:
             True to include the record in the log output.
@@ -492,12 +512,13 @@ class PreserveLoggingContext(object):
 
     __slots__ = ["current_context", "new_context", "has_parent"]
 
-    def __init__(self, new_context=None):
+    def __init__(self, new_context: Optional[LoggingContextOrSentinel] = None) -> None:
         if new_context is None:
-            new_context = LoggingContext.sentinel
-        self.new_context = new_context
+            self.new_context = LoggingContext.sentinel  # type: LoggingContextOrSentinel
+        else:
+            self.new_context = new_context
 
-    def __enter__(self):
+    def __enter__(self) -> None:
         """Captures the current logging context"""
         self.current_context = LoggingContext.set_current_context(self.new_context)
 
@@ -506,7 +527,7 @@ class PreserveLoggingContext(object):
             if not self.current_context.alive:
                 logger.debug("Entering dead context: %s", self.current_context)
 
-    def __exit__(self, type, value, traceback):
+    def __exit__(self, type, value, traceback) -> None:
         """Restores the current logging context"""
         context = LoggingContext.set_current_context(self.current_context)
 
@@ -525,7 +546,9 @@ class PreserveLoggingContext(object):
                 logger.debug("Restoring dead context: %s", self.current_context)
 
 
-def nested_logging_context(suffix, parent_context=None):
+def nested_logging_context(
+    suffix: str, parent_context: Optional[LoggingContext] = None
+) -> LoggingContext:
     """Creates a new logging context as a child of another.
 
     The nested logging context will have a 'request' made up of the parent context's
@@ -546,10 +569,12 @@ def nested_logging_context(suffix, parent_context=None):
     Returns:
         LoggingContext: new logging context.
     """
-    if parent_context is None:
-        parent_context = LoggingContext.current_context()
+    if parent_context is not None:
+        context = parent_context  # type: LoggingContextOrSentinel
+    else:
+        context = LoggingContext.current_context()
     return LoggingContext(
-        parent_context=parent_context, request=parent_context.request + "-" + suffix
+        parent_context=context, request=str(context.request) + "-" + suffix
     )
 
 
@@ -654,7 +679,10 @@ def make_deferred_yieldable(deferred):
     return deferred
 
 
-def _set_context_cb(result, context):
+ResultT = TypeVar("ResultT")
+
+
+def _set_context_cb(result: ResultT, context: LoggingContext) -> ResultT:
     """A callback function which just sets the logging context"""
     LoggingContext.set_current_context(context)
     return result
diff --git a/synapse/logging/utils.py b/synapse/logging/utils.py
index 6073fc2725..0c2527bd86 100644
--- a/synapse/logging/utils.py
+++ b/synapse/logging/utils.py
@@ -148,7 +148,7 @@ def trace_function(f):
             pathname=pathname,
             lineno=lineno,
             msg=msg,
-            args=tuple(),
+            args=(),
             exc_info=None,
         )
 
diff --git a/synapse/metrics/__init__.py b/synapse/metrics/__init__.py
index 0b45e1f52a..d2fd29acb4 100644
--- a/synapse/metrics/__init__.py
+++ b/synapse/metrics/__init__.py
@@ -20,7 +20,7 @@ import os
 import platform
 import threading
 import time
-from typing import Dict, Union
+from typing import Callable, Dict, Iterable, Optional, Tuple, Union
 
 import six
 
@@ -59,10 +59,12 @@ class RegistryProxy(object):
 @attr.s(hash=True)
 class LaterGauge(object):
 
-    name = attr.ib()
-    desc = attr.ib()
-    labels = attr.ib(hash=False)
-    caller = attr.ib()
+    name = attr.ib(type=str)
+    desc = attr.ib(type=str)
+    labels = attr.ib(hash=False, type=Optional[Iterable[str]])
+    # callback: should either return a value (if there are no labels for this metric),
+    # or dict mapping from a label tuple to a value
+    caller = attr.ib(type=Callable[[], Union[Dict[Tuple[str, ...], float], float]])
 
     def collect(self):
 
@@ -240,7 +242,7 @@ class BucketCollector(object):
         res.append(["+Inf", sum(data.values())])
 
         metric = HistogramMetricFamily(
-            self.name, "", buckets=res, sum_value=sum([x * y for x, y in data.items()])
+            self.name, "", buckets=res, sum_value=sum(x * y for x, y in data.items())
         )
         yield metric
 
diff --git a/synapse/metrics/background_process_metrics.py b/synapse/metrics/background_process_metrics.py
index c53d2a0d40..8449ef82f7 100644
--- a/synapse/metrics/background_process_metrics.py
+++ b/synapse/metrics/background_process_metrics.py
@@ -17,6 +17,7 @@ import logging
 import threading
 from asyncio import iscoroutine
 from functools import wraps
+from typing import Dict, Set
 
 import six
 
@@ -80,13 +81,13 @@ _background_process_db_sched_duration = Counter(
 # map from description to a counter, so that we can name our logcontexts
 # incrementally. (It actually duplicates _background_process_start_count, but
 # it's much simpler to do so than to try to combine them.)
-_background_process_counts = dict()  # type: dict[str, int]
+_background_process_counts = {}  # type: Dict[str, int]
 
 # map from description to the currently running background processes.
 #
 # it's kept as a dict of sets rather than a big set so that we can keep track
 # of process descriptions that no longer have any active processes.
-_background_processes = dict()  # type: dict[str, set[_BackgroundProcess]]
+_background_processes = {}  # type: Dict[str, Set[_BackgroundProcess]]
 
 # A lock that covers the above dicts
 _bg_metrics_lock = threading.Lock()
diff --git a/synapse/push/bulk_push_rule_evaluator.py b/synapse/push/bulk_push_rule_evaluator.py
index 7d9f5a38d9..433ca2f416 100644
--- a/synapse/push/bulk_push_rule_evaluator.py
+++ b/synapse/push/bulk_push_rule_evaluator.py
@@ -400,11 +400,11 @@ class RulesForRoom(object):
         if logger.isEnabledFor(logging.DEBUG):
             logger.debug("Found members %r: %r", self.room_id, members.values())
 
-        interested_in_user_ids = set(
+        interested_in_user_ids = {
             user_id
             for user_id, membership in itervalues(members)
             if membership == Membership.JOIN
-        )
+        }
 
         logger.debug("Joined: %r", interested_in_user_ids)
 
@@ -412,9 +412,9 @@ class RulesForRoom(object):
             interested_in_user_ids, on_invalidate=self.invalidate_all_cb
         )
 
-        user_ids = set(
+        user_ids = {
             uid for uid, have_pusher in iteritems(if_users_with_pushers) if have_pusher
-        )
+        }
 
         logger.debug("With pushers: %r", user_ids)
 
diff --git a/synapse/push/emailpusher.py b/synapse/push/emailpusher.py
index 8c818a86bf..ba4551d619 100644
--- a/synapse/push/emailpusher.py
+++ b/synapse/push/emailpusher.py
@@ -204,7 +204,7 @@ class EmailPusher(object):
                 yield self.send_notification(unprocessed, reason)
 
                 yield self.save_last_stream_ordering_and_success(
-                    max([ea["stream_ordering"] for ea in unprocessed])
+                    max(ea["stream_ordering"] for ea in unprocessed)
                 )
 
                 # we update the throttle on all the possible unprocessed push actions
diff --git a/synapse/push/mailer.py b/synapse/push/mailer.py
index b13b646bfd..73580c1c6c 100644
--- a/synapse/push/mailer.py
+++ b/synapse/push/mailer.py
@@ -526,12 +526,10 @@ class Mailer(object):
                     # If the room doesn't have a name, say who the messages
                     # are from explicitly to avoid, "messages in the Bob room"
                     sender_ids = list(
-                        set(
-                            [
-                                notif_events[n["event_id"]].sender
-                                for n in notifs_by_room[room_id]
-                            ]
-                        )
+                        {
+                            notif_events[n["event_id"]].sender
+                            for n in notifs_by_room[room_id]
+                        }
                     )
 
                     member_events = yield self.store.get_events(
@@ -557,13 +555,13 @@ class Mailer(object):
             else:
                 # If the reason room doesn't have a name, say who the messages
                 # are from explicitly to avoid, "messages in the Bob room"
+                room_id = reason["room_id"]
+
                 sender_ids = list(
-                    set(
-                        [
-                            notif_events[n["event_id"]].sender
-                            for n in notifs_by_room[reason["room_id"]]
-                        ]
-                    )
+                    {
+                        notif_events[n["event_id"]].sender
+                        for n in notifs_by_room[room_id]
+                    }
                 )
 
                 member_events = yield self.store.get_events(
diff --git a/synapse/push/presentable_names.py b/synapse/push/presentable_names.py
index 16a7e8e31d..0644a13cfc 100644
--- a/synapse/push/presentable_names.py
+++ b/synapse/push/presentable_names.py
@@ -18,6 +18,8 @@ import re
 
 from twisted.internet import defer
 
+from synapse.api.constants import EventTypes
+
 logger = logging.getLogger(__name__)
 
 # intentionally looser than what aliases we allow to be registered since
@@ -50,17 +52,17 @@ def calculate_room_name(
         (string or None) A human readable name for the room.
     """
     # does it have a name?
-    if ("m.room.name", "") in room_state_ids:
+    if (EventTypes.Name, "") in room_state_ids:
         m_room_name = yield store.get_event(
-            room_state_ids[("m.room.name", "")], allow_none=True
+            room_state_ids[(EventTypes.Name, "")], allow_none=True
         )
         if m_room_name and m_room_name.content and m_room_name.content["name"]:
             return m_room_name.content["name"]
 
     # does it have a canonical alias?
-    if ("m.room.canonical_alias", "") in room_state_ids:
+    if (EventTypes.CanonicalAlias, "") in room_state_ids:
         canon_alias = yield store.get_event(
-            room_state_ids[("m.room.canonical_alias", "")], allow_none=True
+            room_state_ids[(EventTypes.CanonicalAlias, "")], allow_none=True
         )
         if (
             canon_alias
@@ -74,32 +76,22 @@ def calculate_room_name(
     # for an event type, so rearrange the data structure
     room_state_bytype_ids = _state_as_two_level_dict(room_state_ids)
 
-    # right then, any aliases at all?
-    if "m.room.aliases" in room_state_bytype_ids:
-        m_room_aliases = room_state_bytype_ids["m.room.aliases"]
-        for alias_id in m_room_aliases.values():
-            alias_event = yield store.get_event(alias_id, allow_none=True)
-            if alias_event and alias_event.content.get("aliases"):
-                the_aliases = alias_event.content["aliases"]
-                if len(the_aliases) > 0 and _looks_like_an_alias(the_aliases[0]):
-                    return the_aliases[0]
-
     if not fallback_to_members:
         return None
 
     my_member_event = None
-    if ("m.room.member", user_id) in room_state_ids:
+    if (EventTypes.Member, user_id) in room_state_ids:
         my_member_event = yield store.get_event(
-            room_state_ids[("m.room.member", user_id)], allow_none=True
+            room_state_ids[(EventTypes.Member, user_id)], allow_none=True
         )
 
     if (
         my_member_event is not None
         and my_member_event.content["membership"] == "invite"
     ):
-        if ("m.room.member", my_member_event.sender) in room_state_ids:
+        if (EventTypes.Member, my_member_event.sender) in room_state_ids:
             inviter_member_event = yield store.get_event(
-                room_state_ids[("m.room.member", my_member_event.sender)],
+                room_state_ids[(EventTypes.Member, my_member_event.sender)],
                 allow_none=True,
             )
             if inviter_member_event:
@@ -114,9 +106,9 @@ def calculate_room_name(
 
     # we're going to have to generate a name based on who's in the room,
     # so find out who is in the room that isn't the user.
-    if "m.room.member" in room_state_bytype_ids:
+    if EventTypes.Member in room_state_bytype_ids:
         member_events = yield store.get_events(
-            list(room_state_bytype_ids["m.room.member"].values())
+            list(room_state_bytype_ids[EventTypes.Member].values())
         )
         all_members = [
             ev
@@ -138,9 +130,9 @@ def calculate_room_name(
             # self-chat, peeked room with 1 participant,
             # or inbound invite, or outbound 3PID invite.
             if all_members[0].sender == user_id:
-                if "m.room.third_party_invite" in room_state_bytype_ids:
+                if EventTypes.ThirdPartyInvite in room_state_bytype_ids:
                     third_party_invites = room_state_bytype_ids[
-                        "m.room.third_party_invite"
+                        EventTypes.ThirdPartyInvite
                     ].values()
 
                     if len(third_party_invites) > 0:
diff --git a/synapse/push/pusherpool.py b/synapse/push/pusherpool.py
index b9dca5bc63..88d203aa44 100644
--- a/synapse/push/pusherpool.py
+++ b/synapse/push/pusherpool.py
@@ -15,11 +15,17 @@
 # limitations under the License.
 
 import logging
+from collections import defaultdict
+from threading import Lock
+from typing import Dict, Tuple, Union
 
 from twisted.internet import defer
 
+from synapse.metrics import LaterGauge
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.push import PusherConfigException
+from synapse.push.emailpusher import EmailPusher
+from synapse.push.httppusher import HttpPusher
 from synapse.push.pusher import PusherFactory
 from synapse.util.async_helpers import concurrently_execute
 
@@ -47,7 +53,29 @@ class PusherPool:
         self._should_start_pushers = _hs.config.start_pushers
         self.store = self.hs.get_datastore()
         self.clock = self.hs.get_clock()
-        self.pushers = {}
+
+        # map from user id to app_id:pushkey to pusher
+        self.pushers = {}  # type: Dict[str, Dict[str, Union[HttpPusher, EmailPusher]]]
+
+        # a lock for the pushers dict, since `count_pushers` is called from an different
+        # and we otherwise get concurrent modification errors
+        self._pushers_lock = Lock()
+
+        def count_pushers():
+            results = defaultdict(int)  # type: Dict[Tuple[str, str], int]
+            with self._pushers_lock:
+                for pushers in self.pushers.values():
+                    for pusher in pushers.values():
+                        k = (type(pusher).__name__, pusher.app_id)
+                        results[k] += 1
+            return results
+
+        LaterGauge(
+            name="synapse_pushers",
+            desc="the number of active pushers",
+            labels=["kind", "app_id"],
+            caller=count_pushers,
+        )
 
     def start(self):
         """Starts the pushers off in a background process.
@@ -191,7 +219,7 @@ class PusherPool:
                 min_stream_id - 1, max_stream_id
             )
             # This returns a tuple, user_id is at index 3
-            users_affected = set([r[3] for r in updated_receipts])
+            users_affected = {r[3] for r in updated_receipts}
 
             for u in users_affected:
                 if u in self.pushers:
@@ -271,11 +299,12 @@ class PusherPool:
             return
 
         appid_pushkey = "%s:%s" % (pusherdict["app_id"], pusherdict["pushkey"])
-        byuser = self.pushers.setdefault(pusherdict["user_name"], {})
 
-        if appid_pushkey in byuser:
-            byuser[appid_pushkey].on_stop()
-        byuser[appid_pushkey] = p
+        with self._pushers_lock:
+            byuser = self.pushers.setdefault(pusherdict["user_name"], {})
+            if appid_pushkey in byuser:
+                byuser[appid_pushkey].on_stop()
+            byuser[appid_pushkey] = p
 
         # Check if there *may* be push to process. We do this as this check is a
         # lot cheaper to do than actually fetching the exact rows we need to
@@ -304,7 +333,9 @@ class PusherPool:
         if appid_pushkey in byuser:
             logger.info("Stopping pusher %s / %s", user_id, appid_pushkey)
             byuser[appid_pushkey].on_stop()
-            del byuser[appid_pushkey]
+            with self._pushers_lock:
+                del byuser[appid_pushkey]
+
         yield self.store.delete_pusher_by_app_id_pushkey_user_id(
             app_id, pushkey, user_id
         )
diff --git a/synapse/replication/http/_base.py b/synapse/replication/http/_base.py
index 444eb7b7f4..1be1ccbdf3 100644
--- a/synapse/replication/http/_base.py
+++ b/synapse/replication/http/_base.py
@@ -44,7 +44,7 @@ class ReplicationEndpoint(object):
     """Helper base class for defining new replication HTTP endpoints.
 
     This creates an endpoint under `/_synapse/replication/:NAME/:PATH_ARGS..`
-    (with an `/:txn_id` prefix for cached requests.), where NAME is a name,
+    (with a `/:txn_id` suffix for cached requests), where NAME is a name,
     PATH_ARGS are a tuple of parameters to be encoded in the URL.
 
     For example, if `NAME` is "send_event" and `PATH_ARGS` is `("event_id",)`,
diff --git a/synapse/replication/http/federation.py b/synapse/replication/http/federation.py
index 49a3251372..7e23b565b9 100644
--- a/synapse/replication/http/federation.py
+++ b/synapse/replication/http/federation.py
@@ -17,7 +17,8 @@ import logging
 
 from twisted.internet import defer
 
-from synapse.events import event_type_from_format_version
+from synapse.api.room_versions import KNOWN_ROOM_VERSIONS
+from synapse.events import make_event_from_dict
 from synapse.events.snapshot import EventContext
 from synapse.http.servlet import parse_json_object_from_request
 from synapse.replication.http._base import ReplicationEndpoint
@@ -37,6 +38,9 @@ class ReplicationFederationSendEventsRestServlet(ReplicationEndpoint):
         {
             "events": [{
                 "event": { .. serialized event .. },
+                "room_version": .., // "1", "2", "3", etc: the version of the room
+                                    // containing the event
+                "event_format_version": .., // 1,2,3 etc: the event format version
                 "internal_metadata": { .. serialized internal_metadata .. },
                 "rejected_reason": ..,   // The event.rejected_reason field
                 "context": { .. serialized event context .. },
@@ -72,6 +76,7 @@ class ReplicationFederationSendEventsRestServlet(ReplicationEndpoint):
             event_payloads.append(
                 {
                     "event": event.get_pdu_json(),
+                    "room_version": event.room_version.identifier,
                     "event_format_version": event.format_version,
                     "internal_metadata": event.internal_metadata.get_dict(),
                     "rejected_reason": event.rejected_reason,
@@ -94,12 +99,13 @@ class ReplicationFederationSendEventsRestServlet(ReplicationEndpoint):
             event_and_contexts = []
             for event_payload in event_payloads:
                 event_dict = event_payload["event"]
-                format_ver = event_payload["event_format_version"]
+                room_ver = KNOWN_ROOM_VERSIONS[event_payload["room_version"]]
                 internal_metadata = event_payload["internal_metadata"]
                 rejected_reason = event_payload["rejected_reason"]
 
-                EventType = event_type_from_format_version(format_ver)
-                event = EventType(event_dict, internal_metadata, rejected_reason)
+                event = make_event_from_dict(
+                    event_dict, room_ver, internal_metadata, rejected_reason
+                )
 
                 context = EventContext.deserialize(
                     self.storage, event_payload["context"]
@@ -211,7 +217,7 @@ class ReplicationCleanRoomRestServlet(ReplicationEndpoint):
 
     Request format:
 
-        POST /_synapse/replication/fed_query/:fed_cleanup_room/:txn_id
+        POST /_synapse/replication/fed_cleanup_room/:room_id/:txn_id
 
         {}
     """
@@ -238,8 +244,41 @@ class ReplicationCleanRoomRestServlet(ReplicationEndpoint):
         return 200, {}
 
 
+class ReplicationStoreRoomOnInviteRestServlet(ReplicationEndpoint):
+    """Called to clean up any data in DB for a given room, ready for the
+    server to join the room.
+
+    Request format:
+
+        POST /_synapse/replication/store_room_on_invite/:room_id/:txn_id
+
+        {
+            "room_version": "1",
+        }
+    """
+
+    NAME = "store_room_on_invite"
+    PATH_ARGS = ("room_id",)
+
+    def __init__(self, hs):
+        super().__init__(hs)
+
+        self.store = hs.get_datastore()
+
+    @staticmethod
+    def _serialize_payload(room_id, room_version):
+        return {"room_version": room_version.identifier}
+
+    async def _handle_request(self, request, room_id):
+        content = parse_json_object_from_request(request)
+        room_version = KNOWN_ROOM_VERSIONS[content["room_version"]]
+        await self.store.maybe_store_room_on_invite(room_id, room_version)
+        return 200, {}
+
+
 def register_servlets(hs, http_server):
     ReplicationFederationSendEventsRestServlet(hs).register(http_server)
     ReplicationFederationSendEduRestServlet(hs).register(http_server)
     ReplicationGetQueryRestServlet(hs).register(http_server)
     ReplicationCleanRoomRestServlet(hs).register(http_server)
+    ReplicationStoreRoomOnInviteRestServlet(hs).register(http_server)
diff --git a/synapse/replication/http/send_event.py b/synapse/replication/http/send_event.py
index 84b92f16ad..b74b088ff4 100644
--- a/synapse/replication/http/send_event.py
+++ b/synapse/replication/http/send_event.py
@@ -17,7 +17,8 @@ import logging
 
 from twisted.internet import defer
 
-from synapse.events import event_type_from_format_version
+from synapse.api.room_versions import KNOWN_ROOM_VERSIONS
+from synapse.events import make_event_from_dict
 from synapse.events.snapshot import EventContext
 from synapse.http.servlet import parse_json_object_from_request
 from synapse.replication.http._base import ReplicationEndpoint
@@ -37,6 +38,9 @@ class ReplicationSendEventRestServlet(ReplicationEndpoint):
 
         {
             "event": { .. serialized event .. },
+            "room_version": .., // "1", "2", "3", etc: the version of the room
+                                // containing the event
+            "event_format_version": .., // 1,2,3 etc: the event format version
             "internal_metadata": { .. serialized internal_metadata .. },
             "rejected_reason": ..,   // The event.rejected_reason field
             "context": { .. serialized event context .. },
@@ -77,6 +81,7 @@ class ReplicationSendEventRestServlet(ReplicationEndpoint):
 
         payload = {
             "event": event.get_pdu_json(),
+            "room_version": event.room_version.identifier,
             "event_format_version": event.format_version,
             "internal_metadata": event.internal_metadata.get_dict(),
             "rejected_reason": event.rejected_reason,
@@ -93,12 +98,13 @@ class ReplicationSendEventRestServlet(ReplicationEndpoint):
             content = parse_json_object_from_request(request)
 
             event_dict = content["event"]
-            format_ver = content["event_format_version"]
+            room_ver = KNOWN_ROOM_VERSIONS[content["room_version"]]
             internal_metadata = content["internal_metadata"]
             rejected_reason = content["rejected_reason"]
 
-            EventType = event_type_from_format_version(format_ver)
-            event = EventType(event_dict, internal_metadata, rejected_reason)
+            event = make_event_from_dict(
+                event_dict, room_ver, internal_metadata, rejected_reason
+            )
 
             requester = Requester.deserialize(self.store, content["requester"])
             context = EventContext.deserialize(self.storage, content["context"])
diff --git a/synapse/replication/slave/storage/events.py b/synapse/replication/slave/storage/events.py
index 3aa6cb8b96..e73342c657 100644
--- a/synapse/replication/slave/storage/events.py
+++ b/synapse/replication/slave/storage/events.py
@@ -32,6 +32,7 @@ from synapse.storage.data_stores.main.state import StateGroupWorkerStore
 from synapse.storage.data_stores.main.stream import StreamWorkerStore
 from synapse.storage.data_stores.main.user_erasure_store import UserErasureWorkerStore
 from synapse.storage.database import Database
+from synapse.util.caches.stream_change_cache import StreamChangeCache
 
 from ._base import BaseSlavedStore
 from ._slaved_id_tracker import SlavedIdTracker
@@ -68,6 +69,21 @@ class SlavedEventStore(
 
         super(SlavedEventStore, self).__init__(database, db_conn, hs)
 
+        events_max = self._stream_id_gen.get_current_token()
+        curr_state_delta_prefill, min_curr_state_delta_id = self.db.get_cache_dict(
+            db_conn,
+            "current_state_delta_stream",
+            entity_column="room_id",
+            stream_column="stream_id",
+            max_value=events_max,  # As we share the stream id with events token
+            limit=1000,
+        )
+        self._curr_state_delta_stream_cache = StreamChangeCache(
+            "_curr_state_delta_stream_cache",
+            min_curr_state_delta_id,
+            prefilled_cache=curr_state_delta_prefill,
+        )
+
     # Cached functions can't be accessed through a class instance so we need
     # to reach inside the __dict__ to extract them.
 
@@ -120,6 +136,10 @@ class SlavedEventStore(
                 backfilled=False,
             )
         elif row.type == EventsStreamCurrentStateRow.TypeId:
+            self._curr_state_delta_stream_cache.entity_has_changed(
+                row.data.room_id, token
+            )
+
             if data.type == EventTypes.Member:
                 self.get_rooms_for_user_with_stream_ordering.invalidate(
                     (data.state_key,)
diff --git a/synapse/replication/tcp/resource.py b/synapse/replication/tcp/resource.py
index ce60ae2e07..ce9d1fae12 100644
--- a/synapse/replication/tcp/resource.py
+++ b/synapse/replication/tcp/resource.py
@@ -323,7 +323,11 @@ class ReplicationStreamer(object):
 
         # We need to tell the presence handler that the connection has been
         # lost so that it can handle any ongoing syncs on that connection.
-        self.presence_handler.update_external_syncs_clear(connection.conn_id)
+        run_as_background_process(
+            "update_external_syncs_clear",
+            self.presence_handler.update_external_syncs_clear,
+            connection.conn_id,
+        )
 
 
 def _batch_updates(updates):
diff --git a/synapse/replication/tcp/streams/_base.py b/synapse/replication/tcp/streams/_base.py
index a8d568b14a..208e8a667b 100644
--- a/synapse/replication/tcp/streams/_base.py
+++ b/synapse/replication/tcp/streams/_base.py
@@ -24,7 +24,7 @@ import attr
 logger = logging.getLogger(__name__)
 
 
-MAX_EVENTS_BEHIND = 10000
+MAX_EVENTS_BEHIND = 500000
 
 BackfillStreamRow = namedtuple(
     "BackfillStreamRow",
diff --git a/synapse/res/templates/saml_error.html b/synapse/res/templates/saml_error.html
new file mode 100644
index 0000000000..bfd6449c5d
--- /dev/null
+++ b/synapse/res/templates/saml_error.html
@@ -0,0 +1,45 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <title>SSO error</title>
+</head>
+<body>
+    <p>Oops! Something went wrong during authentication<span id="errormsg"></span>.</p>
+    <p>
+        If you are seeing this page after clicking a link sent to you via email, make
+        sure you only click the confirmation link once, and that you open the
+        validation link in the same client you're logging in from.
+    </p>
+    <p>
+        Try logging in again from your Matrix client and if the problem persists
+        please contact the server's administrator.
+    </p>
+
+    <script type="text/javascript">
+        // Error handling to support Auth0 errors that we might get through a GET request
+        // to the validation endpoint. If an error is provided, it's either going to be
+        // located in the query string or in a query string-like URI fragment.
+        // We try to locate the error from any of these two locations, but if we can't
+        // we just don't print anything specific.
+        let searchStr = "";
+        if (window.location.search) {
+            // window.location.searchParams isn't always defined when
+            // window.location.search is, so it's more reliable to parse the latter.
+            searchStr = window.location.search;
+        } else if (window.location.hash) {
+            // Replace the # with a ? so that URLSearchParams does the right thing and
+            // doesn't parse the first parameter incorrectly.
+            searchStr = window.location.hash.replace("#", "?");
+        }
+
+        // We might end up with no error in the URL, so we need to check if we have one
+        // to print one.
+        let errorDesc = new URLSearchParams(searchStr).get("error_description")
+        if (errorDesc) {
+
+            document.getElementById("errormsg").innerText = ` ("${errorDesc}")`;
+        }
+    </script>
+</body>
+</html>
\ No newline at end of file
diff --git a/synapse/rest/admin/_base.py b/synapse/rest/admin/_base.py
index 459482eb6d..a96f75ce26 100644
--- a/synapse/rest/admin/_base.py
+++ b/synapse/rest/admin/_base.py
@@ -29,7 +29,7 @@ def historical_admin_path_patterns(path_regex):
     Note that this should only be used for existing endpoints: new ones should just
     register for the /_synapse/admin path.
     """
-    return list(
+    return [
         re.compile(prefix + path_regex)
         for prefix in (
             "^/_synapse/admin/v1",
@@ -37,7 +37,7 @@ def historical_admin_path_patterns(path_regex):
             "^/_matrix/client/unstable/admin",
             "^/_matrix/client/r0/admin",
         )
-    )
+    ]
 
 
 def admin_patterns(path_regex: str):
diff --git a/synapse/rest/admin/users.py b/synapse/rest/admin/users.py
index 064908fbb0..8551ac19b8 100644
--- a/synapse/rest/admin/users.py
+++ b/synapse/rest/admin/users.py
@@ -221,18 +221,22 @@ class UserRestServletV2(RestServlet):
                     raise SynapseError(400, "Invalid password")
                 else:
                     new_password = body["password"]
+                    logout_devices = True
                     await self.set_password_handler.set_password(
-                        target_user.to_string(), new_password, requester
+                        target_user.to_string(), new_password, logout_devices, requester
                     )
 
             if "deactivated" in body:
-                deactivate = bool(body["deactivated"])
+                deactivate = body["deactivated"]
+                if not isinstance(deactivate, bool):
+                    raise SynapseError(
+                        400, "'deactivated' parameter is not of type boolean"
+                    )
+
                 if deactivate and not user["deactivated"]:
-                    result = await self.deactivate_account_handler.deactivate_account(
+                    await self.deactivate_account_handler.deactivate_account(
                         target_user.to_string(), False
                     )
-                    if not result:
-                        raise SynapseError(500, "Could not deactivate user")
 
             user = await self.admin_handler.get_user(target_user)
             return 200, user
@@ -533,9 +537,10 @@ class ResetPasswordRestServlet(RestServlet):
         params = parse_json_object_from_request(request)
         assert_params_in_dict(params, ["new_password"])
         new_password = params["new_password"]
+        logout_devices = params.get("logout_devices", True)
 
         await self._set_password_handler.set_password(
-            target_user_id, new_password, requester
+            target_user_id, new_password, logout_devices, requester
         )
         return 200, {}
 
diff --git a/synapse/rest/client/v1/push_rule.py b/synapse/rest/client/v1/push_rule.py
index 4f74600239..9fd4908136 100644
--- a/synapse/rest/client/v1/push_rule.py
+++ b/synapse/rest/client/v1/push_rule.py
@@ -49,7 +49,7 @@ class PushRuleRestServlet(RestServlet):
         if self._is_worker:
             raise Exception("Cannot handle PUT /push_rules on worker")
 
-        spec = _rule_spec_from_path([x for x in path.split("/")])
+        spec = _rule_spec_from_path(path.split("/"))
         try:
             priority_class = _priority_class_from_spec(spec)
         except InvalidRuleException as e:
@@ -110,7 +110,7 @@ class PushRuleRestServlet(RestServlet):
         if self._is_worker:
             raise Exception("Cannot handle DELETE /push_rules on worker")
 
-        spec = _rule_spec_from_path([x for x in path.split("/")])
+        spec = _rule_spec_from_path(path.split("/"))
 
         requester = await self.auth.get_user_by_req(request)
         user_id = requester.user.to_string()
@@ -138,7 +138,7 @@ class PushRuleRestServlet(RestServlet):
 
         rules = format_push_rules_for_user(requester.user, rules)
 
-        path = [x for x in path.split("/")][1:]
+        path = path.split("/")[1:]
 
         if path == []:
             # we're a reference impl: pedantry is our job.
diff --git a/synapse/rest/client/v1/pusher.py b/synapse/rest/client/v1/pusher.py
index 6f6b7aed6e..550a2f1b44 100644
--- a/synapse/rest/client/v1/pusher.py
+++ b/synapse/rest/client/v1/pusher.py
@@ -54,9 +54,9 @@ class PushersRestServlet(RestServlet):
 
         pushers = await self.hs.get_datastore().get_pushers_by_user_id(user.to_string())
 
-        filtered_pushers = list(
+        filtered_pushers = [
             {k: v for k, v in p.items() if k in ALLOWED_KEYS} for p in pushers
-        )
+        ]
 
         return 200, {"pushers": filtered_pushers}
 
diff --git a/synapse/rest/client/v1/room.py b/synapse/rest/client/v1/room.py
index 64f51406fb..bffd43de5f 100644
--- a/synapse/rest/client/v1/room.py
+++ b/synapse/rest/client/v1/room.py
@@ -189,12 +189,6 @@ class RoomStateEventRestServlet(TransactionRestServlet):
 
         content = parse_json_object_from_request(request)
 
-        if event_type == EventTypes.Aliases:
-            # MSC2260
-            raise SynapseError(
-                400, "Cannot send m.room.aliases events via /rooms/{room_id}/state"
-            )
-
         event_dict = {
             "type": event_type,
             "content": content,
@@ -242,12 +236,6 @@ class RoomSendEventRestServlet(TransactionRestServlet):
         requester = await self.auth.get_user_by_req(request, allow_guest=True)
         content = parse_json_object_from_request(request)
 
-        if event_type == EventTypes.Aliases:
-            # MSC2260
-            raise SynapseError(
-                400, "Cannot send m.room.aliases events via /rooms/{room_id}/send"
-            )
-
         event_dict = {
             "type": event_type,
             "content": content,
diff --git a/synapse/rest/client/v2_alpha/account.py b/synapse/rest/client/v2_alpha/account.py
index dc837d6c75..631cc74cb4 100644
--- a/synapse/rest/client/v2_alpha/account.py
+++ b/synapse/rest/client/v2_alpha/account.py
@@ -265,8 +265,11 @@ class PasswordRestServlet(RestServlet):
 
         assert_params_in_dict(params, ["new_password"])
         new_password = params["new_password"]
+        logout_devices = params.get("logout_devices", True)
 
-        await self._set_password_handler.set_password(user_id, new_password, requester)
+        await self._set_password_handler.set_password(
+            user_id, new_password, logout_devices, requester
+        )
 
         return 200, {}
 
diff --git a/synapse/rest/client/v2_alpha/sync.py b/synapse/rest/client/v2_alpha/sync.py
index d8292ce29f..8fa68dd37f 100644
--- a/synapse/rest/client/v2_alpha/sync.py
+++ b/synapse/rest/client/v2_alpha/sync.py
@@ -72,7 +72,7 @@ class SyncRestServlet(RestServlet):
     """
 
     PATTERNS = client_patterns("/sync$")
-    ALLOWED_PRESENCE = set(["online", "offline", "unavailable"])
+    ALLOWED_PRESENCE = {"online", "offline", "unavailable"}
 
     def __init__(self, hs):
         super(SyncRestServlet, self).__init__()
diff --git a/synapse/rest/key/v2/remote_key_resource.py b/synapse/rest/key/v2/remote_key_resource.py
index 9d6813a047..ab671f7334 100644
--- a/synapse/rest/key/v2/remote_key_resource.py
+++ b/synapse/rest/key/v2/remote_key_resource.py
@@ -18,8 +18,6 @@ from typing import Dict, Set
 from canonicaljson import encode_canonical_json, json
 from signedjson.sign import sign_json
 
-from twisted.internet import defer
-
 from synapse.api.errors import Codes, SynapseError
 from synapse.crypto.keyring import ServerKeyFetcher
 from synapse.http.server import (
@@ -125,8 +123,7 @@ class RemoteKey(DirectServeResource):
 
         await self.query_keys(request, query, query_remote_on_cache_miss=True)
 
-    @defer.inlineCallbacks
-    def query_keys(self, request, query, query_remote_on_cache_miss=False):
+    async def query_keys(self, request, query, query_remote_on_cache_miss=False):
         logger.info("Handling query for keys %r", query)
 
         store_queries = []
@@ -143,13 +140,13 @@ class RemoteKey(DirectServeResource):
             for key_id in key_ids:
                 store_queries.append((server_name, key_id, None))
 
-        cached = yield self.store.get_server_keys_json(store_queries)
+        cached = await self.store.get_server_keys_json(store_queries)
 
         json_results = set()
 
         time_now_ms = self.clock.time_msec()
 
-        cache_misses = dict()  # type: Dict[str, Set[str]]
+        cache_misses = {}  # type: Dict[str, Set[str]]
         for (server_name, key_id, from_server), results in cached.items():
             results = [(result["ts_added_ms"], result) for result in results]
 
@@ -215,8 +212,8 @@ class RemoteKey(DirectServeResource):
                     json_results.add(bytes(result["key_json"]))
 
         if cache_misses and query_remote_on_cache_miss:
-            yield self.fetcher.get_keys(cache_misses)
-            yield self.query_keys(request, query, query_remote_on_cache_miss=False)
+            await self.fetcher.get_keys(cache_misses)
+            await self.query_keys(request, query, query_remote_on_cache_miss=False)
         else:
             signed_keys = []
             for key_json in json_results:
diff --git a/synapse/rest/media/v1/_base.py b/synapse/rest/media/v1/_base.py
index 65bbf00073..503f2bed98 100644
--- a/synapse/rest/media/v1/_base.py
+++ b/synapse/rest/media/v1/_base.py
@@ -30,6 +30,22 @@ from synapse.util.stringutils import is_ascii
 
 logger = logging.getLogger(__name__)
 
+# list all text content types that will have the charset default to UTF-8 when
+# none is given
+TEXT_CONTENT_TYPES = [
+    "text/css",
+    "text/csv",
+    "text/html",
+    "text/calendar",
+    "text/plain",
+    "text/javascript",
+    "application/json",
+    "application/ld+json",
+    "application/rtf",
+    "image/svg+xml",
+    "text/xml",
+]
+
 
 def parse_media_id(request):
     try:
@@ -96,7 +112,14 @@ def add_file_headers(request, media_type, file_size, upload_name):
     def _quote(x):
         return urllib.parse.quote(x.encode("utf-8"))
 
-    request.setHeader(b"Content-Type", media_type.encode("UTF-8"))
+    # Default to a UTF-8 charset for text content types.
+    # ex, uses UTF-8 for 'text/css' but not 'text/css; charset=UTF-16'
+    if media_type.lower() in TEXT_CONTENT_TYPES:
+        content_type = media_type + "; charset=UTF-8"
+    else:
+        content_type = media_type
+
+    request.setHeader(b"Content-Type", content_type.encode("UTF-8"))
     if upload_name:
         # RFC6266 section 4.1 [1] defines both `filename` and `filename*`.
         #
@@ -135,27 +158,25 @@ def add_file_headers(request, media_type, file_size, upload_name):
 
 # separators as defined in RFC2616. SP and HT are handled separately.
 # see _can_encode_filename_as_token.
-_FILENAME_SEPARATOR_CHARS = set(
-    (
-        "(",
-        ")",
-        "<",
-        ">",
-        "@",
-        ",",
-        ";",
-        ":",
-        "\\",
-        '"',
-        "/",
-        "[",
-        "]",
-        "?",
-        "=",
-        "{",
-        "}",
-    )
-)
+_FILENAME_SEPARATOR_CHARS = {
+    "(",
+    ")",
+    "<",
+    ">",
+    "@",
+    ",",
+    ";",
+    ":",
+    "\\",
+    '"',
+    "/",
+    "[",
+    "]",
+    "?",
+    "=",
+    "{",
+    "}",
+}
 
 
 def _can_encode_filename_as_token(x):
diff --git a/synapse/rest/saml2/response_resource.py b/synapse/rest/saml2/response_resource.py
index 69ecc5e4b4..a545c13db7 100644
--- a/synapse/rest/saml2/response_resource.py
+++ b/synapse/rest/saml2/response_resource.py
@@ -14,7 +14,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from synapse.http.server import DirectServeResource, wrap_html_request_handler
+from synapse.http.server import (
+    DirectServeResource,
+    finish_request,
+    wrap_html_request_handler,
+)
 
 
 class SAML2ResponseResource(DirectServeResource):
@@ -24,8 +28,20 @@ class SAML2ResponseResource(DirectServeResource):
 
     def __init__(self, hs):
         super().__init__()
+        self._error_html_content = hs.config.saml2_error_html_content
         self._saml_handler = hs.get_saml_handler()
 
+    async def _async_render_GET(self, request):
+        # We're not expecting any GET request on that resource if everything goes right,
+        # but some IdPs sometimes end up responding with a 302 redirect on this endpoint.
+        # In this case, just tell the user that something went wrong and they should
+        # try to authenticate again.
+        request.setResponseCode(400)
+        request.setHeader(b"Content-Type", b"text/html; charset=utf-8")
+        request.setHeader(b"Content-Length", b"%d" % (len(self._error_html_content),))
+        request.write(self._error_html_content.encode("utf8"))
+        finish_request(request)
+
     @wrap_html_request_handler
     async def _async_render_POST(self, request):
         return await self._saml_handler.handle_saml_response(request)
diff --git a/synapse/server.py b/synapse/server.py
index fd2f69e928..1b980371de 100644
--- a/synapse/server.py
+++ b/synapse/server.py
@@ -26,7 +26,6 @@ import logging
 import os
 
 from twisted.mail.smtp import sendmail
-from twisted.web.client import BrowserLikePolicyForHTTPS
 
 from synapse.api.auth import Auth
 from synapse.api.filtering import Filtering
@@ -35,6 +34,7 @@ from synapse.appservice.api import ApplicationServiceApi
 from synapse.appservice.scheduler import ApplicationServiceScheduler
 from synapse.config.homeserver import HomeServerConfig
 from synapse.crypto import context_factory
+from synapse.crypto.context_factory import RegularPolicyForHTTPS
 from synapse.crypto.keyring import Keyring
 from synapse.events.builder import EventBuilderFactory
 from synapse.events.spamcheck import SpamChecker
@@ -310,7 +310,7 @@ class HomeServer(object):
         return (
             InsecureInterceptableContextFactory()
             if self.config.use_insecure_ssl_client_just_for_testing_do_not_use
-            else BrowserLikePolicyForHTTPS()
+            else RegularPolicyForHTTPS()
         )
 
     def build_simple_http_client(self):
@@ -420,7 +420,7 @@ class HomeServer(object):
         return PusherPool(self)
 
     def build_http_client(self):
-        tls_client_options_factory = context_factory.ClientTLSOptionsFactory(
+        tls_client_options_factory = context_factory.FederationPolicyForHTTPS(
             self.config
         )
         return MatrixFederationHttpClient(self, tls_client_options_factory)
diff --git a/synapse/server.pyi b/synapse/server.pyi
index 40eabfe5d9..3844f0e12f 100644
--- a/synapse/server.pyi
+++ b/synapse/server.pyi
@@ -3,6 +3,7 @@ import twisted.internet
 import synapse.api.auth
 import synapse.config.homeserver
 import synapse.crypto.keyring
+import synapse.federation.federation_server
 import synapse.federation.sender
 import synapse.federation.transport.client
 import synapse.handlers
@@ -107,5 +108,9 @@ class HomeServer(object):
         self,
     ) -> synapse.replication.tcp.client.ReplicationClientHandler:
         pass
+    def get_federation_registry(
+        self,
+    ) -> synapse.federation.federation_server.FederationHandlerRegistry:
+        pass
     def is_mine_id(self, domain_id: str) -> bool:
         pass
diff --git a/synapse/state/__init__.py b/synapse/state/__init__.py
index fdd6bef6b4..4afefc6b1d 100644
--- a/synapse/state/__init__.py
+++ b/synapse/state/__init__.py
@@ -16,7 +16,7 @@
 
 import logging
 from collections import namedtuple
-from typing import Dict, Iterable, List, Optional
+from typing import Dict, Iterable, List, Optional, Set
 
 from six import iteritems, itervalues
 
@@ -662,23 +662,16 @@ class StateResolutionStore(object):
             allow_rejected=allow_rejected,
         )
 
-    def get_auth_chain(self, event_ids):
-        """Gets the full auth chain for a set of events (including rejected
-        events).
+    def get_auth_chain_difference(self, state_sets: List[Set[str]]):
+        """Given sets of state events figure out the auth chain difference (as
+        per state res v2 algorithm).
 
-        Includes the given event IDs in the result.
-
-        Note that:
-            1. All events must be state events.
-            2. For v1 rooms this may not have the full auth chain in the
-               presence of rejected events
-
-        Args:
-            event_ids (list): The event IDs of the events to fetch the auth
-                chain for. Must be state events.
+        This equivalent to fetching the full auth chain for each set of state
+        and returning the events that don't appear in each and every auth
+        chain.
 
         Returns:
-            Deferred[list[str]]: List of event IDs of the auth chain.
+            Deferred[Set[str]]: Set of event IDs.
         """
 
-        return self.store.get_auth_chain_ids(event_ids, include_given=True)
+        return self.store.get_auth_chain_difference(state_sets)
diff --git a/synapse/state/v1.py b/synapse/state/v1.py
index 24b7c0faef..9bf98d06f2 100644
--- a/synapse/state/v1.py
+++ b/synapse/state/v1.py
@@ -69,9 +69,9 @@ def resolve_events_with_store(
 
     unconflicted_state, conflicted_state = _seperate(state_sets)
 
-    needed_events = set(
+    needed_events = {
         event_id for event_ids in itervalues(conflicted_state) for event_id in event_ids
-    )
+    }
     needed_event_count = len(needed_events)
     if event_map is not None:
         needed_events -= set(iterkeys(event_map))
@@ -261,11 +261,11 @@ def _resolve_state_events(conflicted_state, auth_events):
 
 
 def _resolve_auth_events(events, auth_events):
-    reverse = [i for i in reversed(_ordered_events(events))]
+    reverse = list(reversed(_ordered_events(events)))
 
-    auth_keys = set(
+    auth_keys = {
         key for event in events for key in event_auth.auth_types_for_event(event)
-    )
+    }
 
     new_auth_events = {}
     for key in auth_keys:
diff --git a/synapse/state/v2.py b/synapse/state/v2.py
index 531018c6a5..18484e2fa6 100644
--- a/synapse/state/v2.py
+++ b/synapse/state/v2.py
@@ -105,7 +105,7 @@ def resolve_events_with_store(
                 % (room_id, event.event_id, event.room_id,)
             )
 
-    full_conflicted_set = set(eid for eid in full_conflicted_set if eid in event_map)
+    full_conflicted_set = {eid for eid in full_conflicted_set if eid in event_map}
 
     logger.debug("%d full_conflicted_set entries", len(full_conflicted_set))
 
@@ -227,36 +227,12 @@ def _get_auth_chain_difference(state_sets, event_map, state_res_store):
     Returns:
         Deferred[set[str]]: Set of event IDs
     """
-    common = set(itervalues(state_sets[0])).intersection(
-        *(itervalues(s) for s in state_sets[1:])
-    )
-
-    auth_sets = []
-    for state_set in state_sets:
-        auth_ids = set(
-            eid
-            for key, eid in iteritems(state_set)
-            if (
-                key[0] in (EventTypes.Member, EventTypes.ThirdPartyInvite)
-                or key
-                in (
-                    (EventTypes.PowerLevels, ""),
-                    (EventTypes.Create, ""),
-                    (EventTypes.JoinRules, ""),
-                )
-            )
-            and eid not in common
-        )
-
-        auth_chain = yield state_res_store.get_auth_chain(auth_ids)
-        auth_ids.update(auth_chain)
 
-        auth_sets.append(auth_ids)
-
-    intersection = set(auth_sets[0]).intersection(*auth_sets[1:])
-    union = set().union(*auth_sets)
+    difference = yield state_res_store.get_auth_chain_difference(
+        [set(state_set.values()) for state_set in state_sets]
+    )
 
-    return union - intersection
+    return difference
 
 
 def _seperate(state_sets):
@@ -275,7 +251,7 @@ def _seperate(state_sets):
     conflicted_state = {}
 
     for key in set(itertools.chain.from_iterable(state_sets)):
-        event_ids = set(state_set.get(key) for state_set in state_sets)
+        event_ids = {state_set.get(key) for state_set in state_sets}
         if len(event_ids) == 1:
             unconflicted_state[key] = event_ids.pop()
         else:
diff --git a/synapse/storage/_base.py b/synapse/storage/_base.py
index da3b99f93d..13de5f1f62 100644
--- a/synapse/storage/_base.py
+++ b/synapse/storage/_base.py
@@ -56,7 +56,7 @@ class SQLBaseStore(metaclass=ABCMeta):
             members_changed (iterable[str]): The user_ids of members that have
                 changed
         """
-        for host in set(get_domain_from_id(u) for u in members_changed):
+        for host in {get_domain_from_id(u) for u in members_changed}:
             self._attempt_to_invalidate_cache("is_host_joined", (room_id, host))
             self._attempt_to_invalidate_cache("was_host_joined", (room_id, host))
 
diff --git a/synapse/storage/background_updates.py b/synapse/storage/background_updates.py
index bd547f35cf..eb1a7e5002 100644
--- a/synapse/storage/background_updates.py
+++ b/synapse/storage/background_updates.py
@@ -189,7 +189,7 @@ class BackgroundUpdater(object):
                 keyvalues=None,
                 retcols=("update_name", "depends_on"),
             )
-            in_flight = set(update["update_name"] for update in updates)
+            in_flight = {update["update_name"] for update in updates}
             for update in updates:
                 if update["depends_on"] not in in_flight:
                     self._background_update_queue.append(update["update_name"])
diff --git a/synapse/storage/data_stores/main/__init__.py b/synapse/storage/data_stores/main/__init__.py
index 2700cca822..acca079f23 100644
--- a/synapse/storage/data_stores/main/__init__.py
+++ b/synapse/storage/data_stores/main/__init__.py
@@ -20,6 +20,7 @@ import logging
 import time
 
 from synapse.api.constants import PresenceState
+from synapse.config.homeserver import HomeServerConfig
 from synapse.storage.database import Database
 from synapse.storage.engines import PostgresEngine
 from synapse.storage.util.id_generators import (
@@ -117,16 +118,6 @@ class DataStore(
         self._clock = hs.get_clock()
         self.database_engine = database.engine
 
-        all_users_native = are_all_users_on_domain(
-            db_conn.cursor(), database.engine, hs.hostname
-        )
-        if not all_users_native:
-            raise Exception(
-                "Found users in database not native to %s!\n"
-                "You cannot changed a synapse server_name after it's been configured"
-                % (hs.hostname,)
-            )
-
         self._stream_id_gen = StreamIdGenerator(
             db_conn,
             "events",
@@ -567,13 +558,26 @@ class DataStore(
         )
 
 
-def are_all_users_on_domain(txn, database_engine, domain):
+def check_database_before_upgrade(cur, database_engine, config: HomeServerConfig):
+    """Called before upgrading an existing database to check that it is broadly sane
+    compared with the configuration.
+    """
+    domain = config.server_name
+
     sql = database_engine.convert_param_style(
         "SELECT COUNT(*) FROM users WHERE name NOT LIKE ?"
     )
     pat = "%:" + domain
-    txn.execute(sql, (pat,))
-    num_not_matching = txn.fetchall()[0][0]
+    cur.execute(sql, (pat,))
+    num_not_matching = cur.fetchall()[0][0]
     if num_not_matching == 0:
-        return True
-    return False
+        return
+
+    raise Exception(
+        "Found users in database not native to %s!\n"
+        "You cannot changed a synapse server_name after it's been configured"
+        % (domain,)
+    )
+
+
+__all__ = ["DataStore", "check_database_before_upgrade"]
diff --git a/synapse/storage/data_stores/main/appservice.py b/synapse/storage/data_stores/main/appservice.py
index b2f39649fd..efbc06c796 100644
--- a/synapse/storage/data_stores/main/appservice.py
+++ b/synapse/storage/data_stores/main/appservice.py
@@ -135,7 +135,7 @@ class ApplicationServiceTransactionWorkerStore(
             may be empty.
         """
         results = yield self.db.simple_select_list(
-            "application_services_state", dict(state=state), ["as_id"]
+            "application_services_state", {"state": state}, ["as_id"]
         )
         # NB: This assumes this class is linked with ApplicationServiceStore
         as_list = self.get_app_services()
@@ -158,7 +158,7 @@ class ApplicationServiceTransactionWorkerStore(
         """
         result = yield self.db.simple_select_one(
             "application_services_state",
-            dict(as_id=service.id),
+            {"as_id": service.id},
             ["state"],
             allow_none=True,
             desc="get_appservice_state",
@@ -177,7 +177,7 @@ class ApplicationServiceTransactionWorkerStore(
             A Deferred which resolves when the state was set successfully.
         """
         return self.db.simple_upsert(
-            "application_services_state", dict(as_id=service.id), dict(state=state)
+            "application_services_state", {"as_id": service.id}, {"state": state}
         )
 
     def create_appservice_txn(self, service, events):
@@ -253,13 +253,15 @@ class ApplicationServiceTransactionWorkerStore(
             self.db.simple_upsert_txn(
                 txn,
                 "application_services_state",
-                dict(as_id=service.id),
-                dict(last_txn=txn_id),
+                {"as_id": service.id},
+                {"last_txn": txn_id},
             )
 
             # Delete txn
             self.db.simple_delete_txn(
-                txn, "application_services_txns", dict(txn_id=txn_id, as_id=service.id)
+                txn,
+                "application_services_txns",
+                {"txn_id": txn_id, "as_id": service.id},
             )
 
         return self.db.runInteraction(
diff --git a/synapse/storage/data_stores/main/client_ips.py b/synapse/storage/data_stores/main/client_ips.py
index 13f4c9c72e..e1ccb27142 100644
--- a/synapse/storage/data_stores/main/client_ips.py
+++ b/synapse/storage/data_stores/main/client_ips.py
@@ -530,7 +530,7 @@ class ClientIpStore(ClientIpBackgroundUpdateStore):
             ((row["access_token"], row["ip"]), (row["user_agent"], row["last_seen"]))
             for row in rows
         )
-        return list(
+        return [
             {
                 "access_token": access_token,
                 "ip": ip,
@@ -538,7 +538,7 @@ class ClientIpStore(ClientIpBackgroundUpdateStore):
                 "last_seen": last_seen,
             }
             for (access_token, ip), (user_agent, last_seen) in iteritems(results)
-        )
+        ]
 
     @wrap_as_background_process("prune_old_user_ips")
     async def _prune_old_user_ips(self):
diff --git a/synapse/storage/data_stores/main/devices.py b/synapse/storage/data_stores/main/devices.py
index b7617efb80..d55733a4cd 100644
--- a/synapse/storage/data_stores/main/devices.py
+++ b/synapse/storage/data_stores/main/devices.py
@@ -137,7 +137,7 @@ class DeviceWorkerStore(SQLBaseStore):
 
         # get the cross-signing keys of the users in the list, so that we can
         # determine which of the device changes were cross-signing keys
-        users = set(r[0] for r in updates)
+        users = {r[0] for r in updates}
         master_key_by_user = {}
         self_signing_key_by_user = {}
         for user in users:
@@ -446,7 +446,7 @@ class DeviceWorkerStore(SQLBaseStore):
             a set of user_ids and results_map is a mapping of
             user_id -> device_id -> device_info
         """
-        user_ids = set(user_id for user_id, _ in query_list)
+        user_ids = {user_id for user_id, _ in query_list}
         user_map = yield self.get_device_list_last_stream_id_for_remotes(list(user_ids))
 
         # We go and check if any of the users need to have their device lists
@@ -454,10 +454,9 @@ class DeviceWorkerStore(SQLBaseStore):
         users_needing_resync = yield self.get_user_ids_requiring_device_list_resync(
             user_ids
         )
-        user_ids_in_cache = (
-            set(user_id for user_id, stream_id in user_map.items() if stream_id)
-            - users_needing_resync
-        )
+        user_ids_in_cache = {
+            user_id for user_id, stream_id in user_map.items() if stream_id
+        } - users_needing_resync
         user_ids_not_in_cache = user_ids - user_ids_in_cache
 
         results = {}
@@ -604,7 +603,7 @@ class DeviceWorkerStore(SQLBaseStore):
             rows = yield self.db.execute(
                 "get_users_whose_signatures_changed", None, sql, user_id, from_key
             )
-            return set(user for row in rows for user in json.loads(row[0]))
+            return {user for row in rows for user in json.loads(row[0])}
         else:
             return set()
 
diff --git a/synapse/storage/data_stores/main/end_to_end_keys.py b/synapse/storage/data_stores/main/end_to_end_keys.py
index e551606f9d..001a53f9b4 100644
--- a/synapse/storage/data_stores/main/end_to_end_keys.py
+++ b/synapse/storage/data_stores/main/end_to_end_keys.py
@@ -680,11 +680,6 @@ class EndToEndKeyStore(EndToEndKeyWorkerStore, SQLBaseStore):
                 'user_signing' for a user-signing key
             key (dict): the key data
         """
-        # the cross-signing keys need to occupy the same namespace as devices,
-        # since signatures are identified by device ID.  So add an entry to the
-        # device table to make sure that we don't have a collision with device
-        # IDs
-
         # the 'key' dict will look something like:
         # {
         #   "user_id": "@alice:example.com",
@@ -701,16 +696,24 @@ class EndToEndKeyStore(EndToEndKeyWorkerStore, SQLBaseStore):
         # The "keys" property must only have one entry, which will be the public
         # key, so we just grab the first value in there
         pubkey = next(iter(key["keys"].values()))
-        self.db.simple_insert_txn(
-            txn,
-            "devices",
-            values={
-                "user_id": user_id,
-                "device_id": pubkey,
-                "display_name": key_type + " signing key",
-                "hidden": True,
-            },
-        )
+
+        # The cross-signing keys need to occupy the same namespace as devices,
+        # since signatures are identified by device ID.  So add an entry to the
+        # device table to make sure that we don't have a collision with device
+        # IDs.
+        # We only need to do this for local users, since remote servers should be
+        # responsible for checking this for their own users.
+        if self.hs.is_mine_id(user_id):
+            self.db.simple_insert_txn(
+                txn,
+                "devices",
+                values={
+                    "user_id": user_id,
+                    "device_id": pubkey,
+                    "display_name": key_type + " signing key",
+                    "hidden": True,
+                },
+            )
 
         # and finally, store the key itself
         with self._cross_signing_id_gen.get_next() as stream_id:
diff --git a/synapse/storage/data_stores/main/event_federation.py b/synapse/storage/data_stores/main/event_federation.py
index 60c67457b4..62d4e9f599 100644
--- a/synapse/storage/data_stores/main/event_federation.py
+++ b/synapse/storage/data_stores/main/event_federation.py
@@ -14,8 +14,8 @@
 # limitations under the License.
 import itertools
 import logging
+from typing import Dict, List, Optional, Set, Tuple
 
-from six.moves import range
 from six.moves.queue import Empty, PriorityQueue
 
 from twisted.internet import defer
@@ -27,6 +27,7 @@ from synapse.storage.data_stores.main.events_worker import EventsWorkerStore
 from synapse.storage.data_stores.main.signatures import SignatureWorkerStore
 from synapse.storage.database import Database
 from synapse.util.caches.descriptors import cached
+from synapse.util.iterutils import batch_iter
 
 logger = logging.getLogger(__name__)
 
@@ -46,21 +47,37 @@ class EventFederationWorkerStore(EventsWorkerStore, SignatureWorkerStore, SQLBas
             event_ids, include_given=include_given
         ).addCallback(self.get_events_as_list)
 
-    def get_auth_chain_ids(self, event_ids, include_given=False):
+    def get_auth_chain_ids(
+        self,
+        event_ids: List[str],
+        include_given: bool = False,
+        ignore_events: Optional[Set[str]] = None,
+    ):
         """Get auth events for given event_ids. The events *must* be state events.
 
         Args:
-            event_ids (list): state events
-            include_given (bool): include the given events in result
+            event_ids: state events
+            include_given: include the given events in result
+            ignore_events: Set of events to exclude from the returned auth
+                chain. This is useful if the caller will just discard the
+                given events anyway, and saves us from figuring out their auth
+                chains if not required.
 
         Returns:
             list of event_ids
         """
         return self.db.runInteraction(
-            "get_auth_chain_ids", self._get_auth_chain_ids_txn, event_ids, include_given
+            "get_auth_chain_ids",
+            self._get_auth_chain_ids_txn,
+            event_ids,
+            include_given,
+            ignore_events,
         )
 
-    def _get_auth_chain_ids_txn(self, txn, event_ids, include_given):
+    def _get_auth_chain_ids_txn(self, txn, event_ids, include_given, ignore_events):
+        if ignore_events is None:
+            ignore_events = set()
+
         if include_given:
             results = set(event_ids)
         else:
@@ -71,15 +88,14 @@ class EventFederationWorkerStore(EventsWorkerStore, SignatureWorkerStore, SQLBas
         front = set(event_ids)
         while front:
             new_front = set()
-            front_list = list(front)
-            chunks = [front_list[x : x + 100] for x in range(0, len(front), 100)]
-            for chunk in chunks:
+            for chunk in batch_iter(front, 100):
                 clause, args = make_in_list_sql_clause(
                     txn.database_engine, "event_id", chunk
                 )
-                txn.execute(base_sql + clause, list(args))
-                new_front.update([r[0] for r in txn])
+                txn.execute(base_sql + clause, args)
+                new_front.update(r[0] for r in txn)
 
+            new_front -= ignore_events
             new_front -= results
 
             front = new_front
@@ -87,6 +103,154 @@ class EventFederationWorkerStore(EventsWorkerStore, SignatureWorkerStore, SQLBas
 
         return list(results)
 
+    def get_auth_chain_difference(self, state_sets: List[Set[str]]):
+        """Given sets of state events figure out the auth chain difference (as
+        per state res v2 algorithm).
+
+        This equivalent to fetching the full auth chain for each set of state
+        and returning the events that don't appear in each and every auth
+        chain.
+
+        Returns:
+            Deferred[Set[str]]
+        """
+
+        return self.db.runInteraction(
+            "get_auth_chain_difference",
+            self._get_auth_chain_difference_txn,
+            state_sets,
+        )
+
+    def _get_auth_chain_difference_txn(
+        self, txn, state_sets: List[Set[str]]
+    ) -> Set[str]:
+
+        # Algorithm Description
+        # ~~~~~~~~~~~~~~~~~~~~~
+        #
+        # The idea here is to basically walk the auth graph of each state set in
+        # tandem, keeping track of which auth events are reachable by each state
+        # set. If we reach an auth event we've already visited (via a different
+        # state set) then we mark that auth event and all ancestors as reachable
+        # by the state set. This requires that we keep track of the auth chains
+        # in memory.
+        #
+        # Doing it in a such a way means that we can stop early if all auth
+        # events we're currently walking are reachable by all state sets.
+        #
+        # *Note*: We can't stop walking an event's auth chain if it is reachable
+        # by all state sets. This is because other auth chains we're walking
+        # might be reachable only via the original auth chain. For example,
+        # given the following auth chain:
+        #
+        #       A -> C -> D -> E
+        #           /         /
+        #       B -´---------´
+        #
+        # and state sets {A} and {B} then walking the auth chains of A and B
+        # would immediately show that C is reachable by both. However, if we
+        # stopped at C then we'd only reach E via the auth chain of B and so E
+        # would errornously get included in the returned difference.
+        #
+        # The other thing that we do is limit the number of auth chains we walk
+        # at once, due to practical limits (i.e. we can only query the database
+        # with a limited set of parameters). We pick the auth chains we walk
+        # each iteration based on their depth, in the hope that events with a
+        # lower depth are likely reachable by those with higher depths.
+        #
+        # We could use any ordering that we believe would give a rough
+        # topological ordering, e.g. origin server timestamp. If the ordering
+        # chosen is not topological then the algorithm still produces the right
+        # result, but perhaps a bit more inefficiently. This is why it is safe
+        # to use "depth" here.
+
+        initial_events = set(state_sets[0]).union(*state_sets[1:])
+
+        # Dict from events in auth chains to which sets *cannot* reach them.
+        # I.e. if the set is empty then all sets can reach the event.
+        event_to_missing_sets = {
+            event_id: {i for i, a in enumerate(state_sets) if event_id not in a}
+            for event_id in initial_events
+        }
+
+        # We need to get the depth of the initial events for sorting purposes.
+        sql = """
+            SELECT depth, event_id FROM events
+            WHERE %s
+            ORDER BY depth ASC
+        """
+        clause, args = make_in_list_sql_clause(
+            txn.database_engine, "event_id", initial_events
+        )
+        txn.execute(sql % (clause,), args)
+
+        # The sorted list of events whose auth chains we should walk.
+        search = txn.fetchall()  # type: List[Tuple[int, str]]
+
+        # Map from event to its auth events
+        event_to_auth_events = {}  # type: Dict[str, Set[str]]
+
+        base_sql = """
+            SELECT a.event_id, auth_id, depth
+            FROM event_auth AS a
+            INNER JOIN events AS e ON (e.event_id = a.auth_id)
+            WHERE
+        """
+
+        while search:
+            # Check whether all our current walks are reachable by all state
+            # sets. If so we can bail.
+            if all(not event_to_missing_sets[eid] for _, eid in search):
+                break
+
+            # Fetch the auth events and their depths of the N last events we're
+            # currently walking
+            search, chunk = search[:-100], search[-100:]
+            clause, args = make_in_list_sql_clause(
+                txn.database_engine, "a.event_id", [e_id for _, e_id in chunk]
+            )
+            txn.execute(base_sql + clause, args)
+
+            for event_id, auth_event_id, auth_event_depth in txn:
+                event_to_auth_events.setdefault(event_id, set()).add(auth_event_id)
+
+                sets = event_to_missing_sets.get(auth_event_id)
+                if sets is None:
+                    # First time we're seeing this event, so we add it to the
+                    # queue of things to fetch.
+                    search.append((auth_event_depth, auth_event_id))
+
+                    # Assume that this event is unreachable from any of the
+                    # state sets until proven otherwise
+                    sets = event_to_missing_sets[auth_event_id] = set(
+                        range(len(state_sets))
+                    )
+                else:
+                    # We've previously seen this event, so look up its auth
+                    # events and recursively mark all ancestors as reachable
+                    # by the current event's state set.
+                    a_ids = event_to_auth_events.get(auth_event_id)
+                    while a_ids:
+                        new_aids = set()
+                        for a_id in a_ids:
+                            event_to_missing_sets[a_id].intersection_update(
+                                event_to_missing_sets[event_id]
+                            )
+
+                            b = event_to_auth_events.get(a_id)
+                            if b:
+                                new_aids.update(b)
+
+                        a_ids = new_aids
+
+                # Mark that the auth event is reachable by the approriate sets.
+                sets.intersection_update(event_to_missing_sets[event_id])
+
+            search.sort()
+
+        # Return all events where not all sets can reach them.
+        return {eid for eid, n in event_to_missing_sets.items() if n}
+
     def get_oldest_events_in_room(self, room_id):
         return self.db.runInteraction(
             "get_oldest_events_in_room", self._get_oldest_events_in_room_txn, room_id
@@ -410,7 +574,7 @@ class EventFederationWorkerStore(EventsWorkerStore, SignatureWorkerStore, SQLBas
                     query, (room_id, event_id, False, limit - len(event_results))
                 )
 
-                new_results = set(t[0] for t in txn) - seen_events
+                new_results = {t[0] for t in txn} - seen_events
 
                 new_front |= new_results
                 seen_events |= new_results
diff --git a/synapse/storage/data_stores/main/event_push_actions.py b/synapse/storage/data_stores/main/event_push_actions.py
index 9988a6d3fc..8eed590929 100644
--- a/synapse/storage/data_stores/main/event_push_actions.py
+++ b/synapse/storage/data_stores/main/event_push_actions.py
@@ -608,6 +608,23 @@ class EventPushActionsWorkerStore(SQLBaseStore):
 
         return range_end
 
+    @defer.inlineCallbacks
+    def get_time_of_last_push_action_before(self, stream_ordering):
+        def f(txn):
+            sql = (
+                "SELECT e.received_ts"
+                " FROM event_push_actions AS ep"
+                " JOIN events e ON ep.room_id = e.room_id AND ep.event_id = e.event_id"
+                " WHERE ep.stream_ordering > ?"
+                " ORDER BY ep.stream_ordering ASC"
+                " LIMIT 1"
+            )
+            txn.execute(sql, (stream_ordering,))
+            return txn.fetchone()
+
+        result = yield self.db.runInteraction("get_time_of_last_push_action_before", f)
+        return result[0] if result else None
+
 
 class EventPushActionsStore(EventPushActionsWorkerStore):
     EPA_HIGHLIGHT_INDEX = "epa_highlight_index"
@@ -736,23 +753,6 @@ class EventPushActionsStore(EventPushActionsWorkerStore):
         return push_actions
 
     @defer.inlineCallbacks
-    def get_time_of_last_push_action_before(self, stream_ordering):
-        def f(txn):
-            sql = (
-                "SELECT e.received_ts"
-                " FROM event_push_actions AS ep"
-                " JOIN events e ON ep.room_id = e.room_id AND ep.event_id = e.event_id"
-                " WHERE ep.stream_ordering > ?"
-                " ORDER BY ep.stream_ordering ASC"
-                " LIMIT 1"
-            )
-            txn.execute(sql, (stream_ordering,))
-            return txn.fetchone()
-
-        result = yield self.db.runInteraction("get_time_of_last_push_action_before", f)
-        return result[0] if result else None
-
-    @defer.inlineCallbacks
     def get_latest_push_action_stream_ordering(self):
         def f(txn):
             txn.execute("SELECT MAX(stream_ordering) FROM event_push_actions")
diff --git a/synapse/storage/data_stores/main/events.py b/synapse/storage/data_stores/main/events.py
index c9d0d68c3a..d593ef47b8 100644
--- a/synapse/storage/data_stores/main/events.py
+++ b/synapse/storage/data_stores/main/events.py
@@ -145,7 +145,7 @@ class EventsStore(
             return txn.fetchall()
 
         res = yield self.db.runInteraction("read_forward_extremities", fetch)
-        self._current_forward_extremities_amount = c_counter(list(x[0] for x in res))
+        self._current_forward_extremities_amount = c_counter([x[0] for x in res])
 
     @_retry_on_integrity_error
     @defer.inlineCallbacks
@@ -598,11 +598,11 @@ class EventsStore(
             # We find out which membership events we may have deleted
             # and which we have added, then we invlidate the caches for all
             # those users.
-            members_changed = set(
+            members_changed = {
                 state_key
                 for ev_type, state_key in itertools.chain(to_delete, to_insert)
                 if ev_type == EventTypes.Member
-            )
+            }
 
             for member in members_changed:
                 txn.call_after(
@@ -1168,7 +1168,11 @@ class EventsStore(
                 and original_event.internal_metadata.is_redacted()
             ):
                 # Redaction was allowed
-                pruned_json = encode_json(prune_event_dict(original_event.get_dict()))
+                pruned_json = encode_json(
+                    prune_event_dict(
+                        original_event.room_version, original_event.get_dict()
+                    )
+                )
             else:
                 # Redaction wasn't allowed
                 pruned_json = None
@@ -1615,7 +1619,7 @@ class EventsStore(
         """
         )
 
-        referenced_state_groups = set(sg for sg, in txn)
+        referenced_state_groups = {sg for sg, in txn}
         logger.info(
             "[purge] found %i referenced state groups", len(referenced_state_groups)
         )
@@ -1929,7 +1933,9 @@ class EventsStore(
                 return
 
             # Prune the event's dict then convert it to JSON.
-            pruned_json = encode_json(prune_event_dict(event.get_dict()))
+            pruned_json = encode_json(
+                prune_event_dict(event.room_version, event.get_dict())
+            )
 
             # Update the event_json table to replace the event's JSON with the pruned
             # JSON.
diff --git a/synapse/storage/data_stores/main/events_bg_updates.py b/synapse/storage/data_stores/main/events_bg_updates.py
index 5177b71016..f54c8b1ee0 100644
--- a/synapse/storage/data_stores/main/events_bg_updates.py
+++ b/synapse/storage/data_stores/main/events_bg_updates.py
@@ -402,7 +402,7 @@ class EventsBackgroundUpdatesStore(SQLBaseStore):
                     keyvalues={},
                     retcols=("room_id",),
                 )
-                room_ids = set(row["room_id"] for row in rows)
+                room_ids = {row["room_id"] for row in rows}
                 for room_id in room_ids:
                     txn.call_after(
                         self.get_latest_event_ids_in_room.invalidate, (room_id,)
diff --git a/synapse/storage/data_stores/main/events_worker.py b/synapse/storage/data_stores/main/events_worker.py
index 7251e819f5..ca237c6f12 100644
--- a/synapse/storage/data_stores/main/events_worker.py
+++ b/synapse/storage/data_stores/main/events_worker.py
@@ -28,9 +28,12 @@ from twisted.internet import defer
 
 from synapse.api.constants import EventTypes
 from synapse.api.errors import NotFoundError
-from synapse.api.room_versions import EventFormatVersions
-from synapse.events import FrozenEvent, event_type_from_format_version  # noqa: F401
-from synapse.events.snapshot import EventContext  # noqa: F401
+from synapse.api.room_versions import (
+    KNOWN_ROOM_VERSIONS,
+    EventFormatVersions,
+    RoomVersions,
+)
+from synapse.events import make_event_from_dict
 from synapse.events.utils import prune_event
 from synapse.logging.context import LoggingContext, PreserveLoggingContext
 from synapse.metrics.background_process_metrics import run_as_background_process
@@ -494,9 +497,9 @@ class EventsWorkerStore(SQLBaseStore):
         """
         with Measure(self._clock, "_fetch_event_list"):
             try:
-                events_to_fetch = set(
+                events_to_fetch = {
                     event_id for events, _ in event_list for event_id in events
-                )
+                }
 
                 row_dict = self.db.new_transaction(
                     conn, "do_fetch", [], [], self._fetch_event_rows, events_to_fetch
@@ -580,8 +583,49 @@ class EventsWorkerStore(SQLBaseStore):
                 # of a event format version, so it must be a V1 event.
                 format_version = EventFormatVersions.V1
 
-            original_ev = event_type_from_format_version(format_version)(
+            room_version_id = row["room_version_id"]
+
+            if not room_version_id:
+                # this should only happen for out-of-band membership events
+                if not internal_metadata.get("out_of_band_membership"):
+                    logger.warning(
+                        "Room %s for event %s is unknown", d["room_id"], event_id
+                    )
+                    continue
+
+                # take a wild stab at the room version based on the event format
+                if format_version == EventFormatVersions.V1:
+                    room_version = RoomVersions.V1
+                elif format_version == EventFormatVersions.V2:
+                    room_version = RoomVersions.V3
+                else:
+                    room_version = RoomVersions.V5
+            else:
+                room_version = KNOWN_ROOM_VERSIONS.get(room_version_id)
+                if not room_version:
+                    logger.error(
+                        "Event %s in room %s has unknown room version %s",
+                        event_id,
+                        d["room_id"],
+                        room_version_id,
+                    )
+                    continue
+
+                if room_version.event_format != format_version:
+                    logger.error(
+                        "Event %s in room %s with version %s has wrong format: "
+                        "expected %s, was %s",
+                        event_id,
+                        d["room_id"],
+                        room_version_id,
+                        room_version.event_format,
+                        format_version,
+                    )
+                    continue
+
+            original_ev = make_event_from_dict(
                 event_dict=d,
+                room_version=room_version,
                 internal_metadata_dict=internal_metadata,
                 rejected_reason=rejected_reason,
             )
@@ -661,6 +705,12 @@ class EventsWorkerStore(SQLBaseStore):
            of EventFormatVersions. 'None' means the event predates
            EventFormatVersions (so the event is format V1).
 
+         * room_version_id (str|None): The version of the room which contains the event.
+           Hopefully one of RoomVersions.
+
+           Due to historical reasons, there may be a few events in the database which
+           do not have an associated room; in this case None will be returned here.
+
          * rejected_reason (str|None): if the event was rejected, the reason
            why.
 
@@ -676,17 +726,18 @@ class EventsWorkerStore(SQLBaseStore):
         """
         event_dict = {}
         for evs in batch_iter(event_ids, 200):
-            sql = (
-                "SELECT "
-                " e.event_id, "
-                " e.internal_metadata,"
-                " e.json,"
-                " e.format_version, "
-                " rej.reason "
-                " FROM event_json as e"
-                " LEFT JOIN rejections as rej USING (event_id)"
-                " WHERE "
-            )
+            sql = """\
+                SELECT
+                  e.event_id,
+                  e.internal_metadata,
+                  e.json,
+                  e.format_version,
+                  r.room_version,
+                  rej.reason
+                FROM event_json as e
+                  LEFT JOIN rooms r USING (room_id)
+                  LEFT JOIN rejections as rej USING (event_id)
+                WHERE """
 
             clause, args = make_in_list_sql_clause(
                 txn.database_engine, "e.event_id", evs
@@ -701,7 +752,8 @@ class EventsWorkerStore(SQLBaseStore):
                     "internal_metadata": row[1],
                     "json": row[2],
                     "format_version": row[3],
-                    "rejected_reason": row[4],
+                    "room_version_id": row[4],
+                    "rejected_reason": row[5],
                     "redactions": [],
                 }
 
@@ -804,7 +856,7 @@ class EventsWorkerStore(SQLBaseStore):
             desc="have_events_in_timeline",
         )
 
-        return set(r["event_id"] for r in rows)
+        return {r["event_id"] for r in rows}
 
     @defer.inlineCallbacks
     def have_seen_events(self, event_ids):
diff --git a/synapse/storage/data_stores/main/monthly_active_users.py b/synapse/storage/data_stores/main/monthly_active_users.py
index 1507a14e09..925bc5691b 100644
--- a/synapse/storage/data_stores/main/monthly_active_users.py
+++ b/synapse/storage/data_stores/main/monthly_active_users.py
@@ -43,13 +43,40 @@ class MonthlyActiveUsersWorkerStore(SQLBaseStore):
 
         def _count_users(txn):
             sql = "SELECT COALESCE(count(*), 0) FROM monthly_active_users"
-
             txn.execute(sql)
             (count,) = txn.fetchone()
             return count
 
         return self.db.runInteraction("count_users", _count_users)
 
+    @cached(num_args=0)
+    def get_monthly_active_count_by_service(self):
+        """Generates current count of monthly active users broken down by service.
+        A service is typically an appservice but also includes native matrix users.
+        Since the `monthly_active_users` table is populated from the `user_ips` table
+        `config.track_appservice_user_ips` must be set to `true` for this
+        method to return anything other than native matrix users.
+
+        Returns:
+            Deferred[dict]: dict that includes a mapping between app_service_id
+                and the number of occurrences.
+
+        """
+
+        def _count_users_by_service(txn):
+            sql = """
+                SELECT COALESCE(appservice_id, 'native'), COALESCE(count(*), 0)
+                FROM monthly_active_users
+                LEFT JOIN users ON monthly_active_users.user_id=users.name
+                GROUP BY appservice_id;
+            """
+
+            txn.execute(sql)
+            result = txn.fetchall()
+            return dict(result)
+
+        return self.db.runInteraction("count_users_by_service", _count_users_by_service)
+
     @defer.inlineCallbacks
     def get_registered_reserved_users(self):
         """Of the reserved threepids defined in config, which are associated
@@ -292,6 +319,9 @@ class MonthlyActiveUsersStore(MonthlyActiveUsersWorkerStore):
 
         self._invalidate_cache_and_stream(txn, self.get_monthly_active_count, ())
         self._invalidate_cache_and_stream(
+            txn, self.get_monthly_active_count_by_service, ()
+        )
+        self._invalidate_cache_and_stream(
             txn, self.user_last_seen_monthly_active, (user_id,)
         )
 
diff --git a/synapse/storage/data_stores/main/push_rule.py b/synapse/storage/data_stores/main/push_rule.py
index e2673ae073..62ac88d9f2 100644
--- a/synapse/storage/data_stores/main/push_rule.py
+++ b/synapse/storage/data_stores/main/push_rule.py
@@ -276,21 +276,21 @@ class PushRulesWorkerStore(
         # We ignore app service users for now. This is so that we don't fill
         # up the `get_if_users_have_pushers` cache with AS entries that we
         # know don't have pushers, nor even read receipts.
-        local_users_in_room = set(
+        local_users_in_room = {
             u
             for u in users_in_room
             if self.hs.is_mine_id(u)
             and not self.get_if_app_services_interested_in_user(u)
-        )
+        }
 
         # users in the room who have pushers need to get push rules run because
         # that's how their pushers work
         if_users_with_pushers = yield self.get_if_users_have_pushers(
             local_users_in_room, on_invalidate=cache_context.invalidate
         )
-        user_ids = set(
+        user_ids = {
             uid for uid, have_pusher in if_users_with_pushers.items() if have_pusher
-        )
+        }
 
         users_with_receipts = yield self.get_users_with_read_receipts_in_room(
             room_id, on_invalidate=cache_context.invalidate
diff --git a/synapse/storage/data_stores/main/pusher.py b/synapse/storage/data_stores/main/pusher.py
index 6b03233262..547b9d69cb 100644
--- a/synapse/storage/data_stores/main/pusher.py
+++ b/synapse/storage/data_stores/main/pusher.py
@@ -197,6 +197,84 @@ class PusherWorkerStore(SQLBaseStore):
 
         return result
 
+    @defer.inlineCallbacks
+    def update_pusher_last_stream_ordering(
+        self, app_id, pushkey, user_id, last_stream_ordering
+    ):
+        yield self.db.simple_update_one(
+            "pushers",
+            {"app_id": app_id, "pushkey": pushkey, "user_name": user_id},
+            {"last_stream_ordering": last_stream_ordering},
+            desc="update_pusher_last_stream_ordering",
+        )
+
+    @defer.inlineCallbacks
+    def update_pusher_last_stream_ordering_and_success(
+        self, app_id, pushkey, user_id, last_stream_ordering, last_success
+    ):
+        """Update the last stream ordering position we've processed up to for
+        the given pusher.
+
+        Args:
+            app_id (str)
+            pushkey (str)
+            last_stream_ordering (int)
+            last_success (int)
+
+        Returns:
+            Deferred[bool]: True if the pusher still exists; False if it has been deleted.
+        """
+        updated = yield self.db.simple_update(
+            table="pushers",
+            keyvalues={"app_id": app_id, "pushkey": pushkey, "user_name": user_id},
+            updatevalues={
+                "last_stream_ordering": last_stream_ordering,
+                "last_success": last_success,
+            },
+            desc="update_pusher_last_stream_ordering_and_success",
+        )
+
+        return bool(updated)
+
+    @defer.inlineCallbacks
+    def update_pusher_failing_since(self, app_id, pushkey, user_id, failing_since):
+        yield self.db.simple_update(
+            table="pushers",
+            keyvalues={"app_id": app_id, "pushkey": pushkey, "user_name": user_id},
+            updatevalues={"failing_since": failing_since},
+            desc="update_pusher_failing_since",
+        )
+
+    @defer.inlineCallbacks
+    def get_throttle_params_by_room(self, pusher_id):
+        res = yield self.db.simple_select_list(
+            "pusher_throttle",
+            {"pusher": pusher_id},
+            ["room_id", "last_sent_ts", "throttle_ms"],
+            desc="get_throttle_params_by_room",
+        )
+
+        params_by_room = {}
+        for row in res:
+            params_by_room[row["room_id"]] = {
+                "last_sent_ts": row["last_sent_ts"],
+                "throttle_ms": row["throttle_ms"],
+            }
+
+        return params_by_room
+
+    @defer.inlineCallbacks
+    def set_throttle_params(self, pusher_id, room_id, params):
+        # no need to lock because `pusher_throttle` has a primary key on
+        # (pusher, room_id) so simple_upsert will retry
+        yield self.db.simple_upsert(
+            "pusher_throttle",
+            {"pusher": pusher_id, "room_id": room_id},
+            params,
+            desc="set_throttle_params",
+            lock=False,
+        )
+
 
 class PusherStore(PusherWorkerStore):
     def get_pushers_stream_token(self):
@@ -282,81 +360,3 @@ class PusherStore(PusherWorkerStore):
 
         with self._pushers_id_gen.get_next() as stream_id:
             yield self.db.runInteraction("delete_pusher", delete_pusher_txn, stream_id)
-
-    @defer.inlineCallbacks
-    def update_pusher_last_stream_ordering(
-        self, app_id, pushkey, user_id, last_stream_ordering
-    ):
-        yield self.db.simple_update_one(
-            "pushers",
-            {"app_id": app_id, "pushkey": pushkey, "user_name": user_id},
-            {"last_stream_ordering": last_stream_ordering},
-            desc="update_pusher_last_stream_ordering",
-        )
-
-    @defer.inlineCallbacks
-    def update_pusher_last_stream_ordering_and_success(
-        self, app_id, pushkey, user_id, last_stream_ordering, last_success
-    ):
-        """Update the last stream ordering position we've processed up to for
-        the given pusher.
-
-        Args:
-            app_id (str)
-            pushkey (str)
-            last_stream_ordering (int)
-            last_success (int)
-
-        Returns:
-            Deferred[bool]: True if the pusher still exists; False if it has been deleted.
-        """
-        updated = yield self.db.simple_update(
-            table="pushers",
-            keyvalues={"app_id": app_id, "pushkey": pushkey, "user_name": user_id},
-            updatevalues={
-                "last_stream_ordering": last_stream_ordering,
-                "last_success": last_success,
-            },
-            desc="update_pusher_last_stream_ordering_and_success",
-        )
-
-        return bool(updated)
-
-    @defer.inlineCallbacks
-    def update_pusher_failing_since(self, app_id, pushkey, user_id, failing_since):
-        yield self.db.simple_update(
-            table="pushers",
-            keyvalues={"app_id": app_id, "pushkey": pushkey, "user_name": user_id},
-            updatevalues={"failing_since": failing_since},
-            desc="update_pusher_failing_since",
-        )
-
-    @defer.inlineCallbacks
-    def get_throttle_params_by_room(self, pusher_id):
-        res = yield self.db.simple_select_list(
-            "pusher_throttle",
-            {"pusher": pusher_id},
-            ["room_id", "last_sent_ts", "throttle_ms"],
-            desc="get_throttle_params_by_room",
-        )
-
-        params_by_room = {}
-        for row in res:
-            params_by_room[row["room_id"]] = {
-                "last_sent_ts": row["last_sent_ts"],
-                "throttle_ms": row["throttle_ms"],
-            }
-
-        return params_by_room
-
-    @defer.inlineCallbacks
-    def set_throttle_params(self, pusher_id, room_id, params):
-        # no need to lock because `pusher_throttle` has a primary key on
-        # (pusher, room_id) so simple_upsert will retry
-        yield self.db.simple_upsert(
-            "pusher_throttle",
-            {"pusher": pusher_id, "room_id": room_id},
-            params,
-            desc="set_throttle_params",
-            lock=False,
-        )
diff --git a/synapse/storage/data_stores/main/receipts.py b/synapse/storage/data_stores/main/receipts.py
index 96e54d145e..0d932a0672 100644
--- a/synapse/storage/data_stores/main/receipts.py
+++ b/synapse/storage/data_stores/main/receipts.py
@@ -58,7 +58,7 @@ class ReceiptsWorkerStore(SQLBaseStore):
     @cachedInlineCallbacks()
     def get_users_with_read_receipts_in_room(self, room_id):
         receipts = yield self.get_receipts_for_room(room_id, "m.read")
-        return set(r["user_id"] for r in receipts)
+        return {r["user_id"] for r in receipts}
 
     @cached(num_args=2)
     def get_receipts_for_room(self, room_id, receipt_type):
@@ -283,7 +283,7 @@ class ReceiptsWorkerStore(SQLBaseStore):
                 args.append(limit)
             txn.execute(sql, args)
 
-            return list(r[0:5] + (json.loads(r[5]),) for r in txn)
+            return [r[0:5] + (json.loads(r[5]),) for r in txn]
 
         return self.db.runInteraction(
             "get_all_updated_receipts", get_all_updated_receipts_txn
diff --git a/synapse/storage/data_stores/main/room.py b/synapse/storage/data_stores/main/room.py
index 9a17e336ba..e6c10c6316 100644
--- a/synapse/storage/data_stores/main/room.py
+++ b/synapse/storage/data_stores/main/room.py
@@ -954,6 +954,23 @@ class RoomStore(RoomBackgroundUpdateStore, RoomWorkerStore, SearchStore):
 
         self.config = hs.config
 
+    async def upsert_room_on_join(self, room_id: str, room_version: RoomVersion):
+        """Ensure that the room is stored in the table
+
+        Called when we join a room over federation, and overwrites any room version
+        currently in the table.
+        """
+        await self.db.simple_upsert(
+            desc="upsert_room_on_join",
+            table="rooms",
+            keyvalues={"room_id": room_id},
+            values={"room_version": room_version.identifier},
+            insertion_values={"is_public": False, "creator": ""},
+            # rooms has a unique constraint on room_id, so no need to lock when doing an
+            # emulated upsert.
+            lock=False,
+        )
+
     @defer.inlineCallbacks
     def store_room(
         self,
@@ -1003,6 +1020,26 @@ class RoomStore(RoomBackgroundUpdateStore, RoomWorkerStore, SearchStore):
             logger.error("store_room with room_id=%s failed: %s", room_id, e)
             raise StoreError(500, "Problem creating room.")
 
+    async def maybe_store_room_on_invite(self, room_id: str, room_version: RoomVersion):
+        """
+        When we receive an invite over federation, store the version of the room if we
+        don't already know the room version.
+        """
+        await self.db.simple_upsert(
+            desc="maybe_store_room_on_invite",
+            table="rooms",
+            keyvalues={"room_id": room_id},
+            values={},
+            insertion_values={
+                "room_version": room_version.identifier,
+                "is_public": False,
+                "creator": "",
+            },
+            # rooms has a unique constraint on room_id, so no need to lock when doing an
+            # emulated upsert.
+            lock=False,
+        )
+
     @defer.inlineCallbacks
     def set_room_is_public(self, room_id, is_public):
         def set_room_is_public_txn(txn, next_id):
diff --git a/synapse/storage/data_stores/main/roommember.py b/synapse/storage/data_stores/main/roommember.py
index d5ced05701..d5bd0cb5cf 100644
--- a/synapse/storage/data_stores/main/roommember.py
+++ b/synapse/storage/data_stores/main/roommember.py
@@ -465,7 +465,7 @@ class RoomMemberWorkerStore(EventsWorkerStore):
 
             txn.execute(sql % (clause,), args)
 
-            return set(row[0] for row in txn)
+            return {row[0] for row in txn}
 
         return await self.db.runInteraction(
             "get_users_server_still_shares_room_with",
@@ -826,7 +826,7 @@ class RoomMemberWorkerStore(EventsWorkerStore):
                 GROUP BY room_id, user_id;
             """
             txn.execute(sql, (user_id,))
-            return set(row[0] for row in txn if row[1] == 0)
+            return {row[0] for row in txn if row[1] == 0}
 
         return self.db.runInteraction(
             "get_forgotten_rooms_for_user", _get_forgotten_rooms_for_user_txn
diff --git a/synapse/storage/data_stores/main/schema/delta/57/rooms_version_column_3.sql.postgres b/synapse/storage/data_stores/main/schema/delta/57/rooms_version_column_3.sql.postgres
new file mode 100644
index 0000000000..92aaadde0d
--- /dev/null
+++ b/synapse/storage/data_stores/main/schema/delta/57/rooms_version_column_3.sql.postgres
@@ -0,0 +1,39 @@
+/* Copyright 2020 The Matrix.org Foundation C.I.C.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+-- When we first added the room_version column to the rooms table, it was populated from
+-- the current_state_events table. However, there was an issue causing a background
+-- update to clean up the current_state_events table for rooms where the server is no
+-- longer participating, before that column could be populated. Therefore, some rooms had
+-- a NULL room_version.
+
+-- The rooms_version_column_2.sql.* delta files were introduced to make the populating
+-- synchronous instead of running it in a background update, which fixed this issue.
+-- However, all of the instances of Synapse installed or updated in the meantime got
+-- their rooms table corrupted with NULL room_versions.
+
+-- This query fishes out the room versions from the create event using the state_events
+-- table instead of the current_state_events one, as the former still have all of the
+-- create events.
+
+UPDATE rooms SET room_version=(
+    SELECT COALESCE(json::json->'content'->>'room_version','1')
+    FROM state_events se INNER JOIN event_json ej USING (event_id)
+    WHERE se.room_id=rooms.room_id AND se.type='m.room.create' AND se.state_key=''
+    LIMIT 1
+) WHERE rooms.room_version IS NULL;
+
+-- see also rooms_version_column_3.sql.sqlite which has a copy of the above query, using
+-- sqlite syntax for the json extraction.
diff --git a/synapse/storage/data_stores/main/schema/delta/57/rooms_version_column_3.sql.sqlite b/synapse/storage/data_stores/main/schema/delta/57/rooms_version_column_3.sql.sqlite
new file mode 100644
index 0000000000..e19dab97cb
--- /dev/null
+++ b/synapse/storage/data_stores/main/schema/delta/57/rooms_version_column_3.sql.sqlite
@@ -0,0 +1,23 @@
+/* Copyright 2020 The Matrix.org Foundation C.I.C.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+-- see rooms_version_column_3.sql.postgres for details of what's going on here.
+
+UPDATE rooms SET room_version=(
+    SELECT COALESCE(json_extract(ej.json, '$.content.room_version'), '1')
+    FROM state_events se INNER JOIN event_json ej USING (event_id)
+    WHERE se.room_id=rooms.room_id AND se.type='m.room.create' AND se.state_key=''
+    LIMIT 1
+) WHERE rooms.room_version IS NULL;
diff --git a/synapse/storage/data_stores/main/schema/full_schemas/README.md b/synapse/storage/data_stores/main/schema/full_schemas/README.md
index bbd3f18604..c00f287190 100644
--- a/synapse/storage/data_stores/main/schema/full_schemas/README.md
+++ b/synapse/storage/data_stores/main/schema/full_schemas/README.md
@@ -1,13 +1,21 @@
-# Building full schema dumps
+# Synapse Database Schemas
 
-These schemas need to be made from a database that has had all background updates run.
+These schemas are used as a basis to create brand new Synapse databases, on both
+SQLite3 and Postgres.
 
-To do so, use `scripts-dev/make_full_schema.sh`. This will produce
-`full.sql.postgres ` and `full.sql.sqlite` files.
+## Building full schema dumps
+
+If you want to recreate these schemas, they need to be made from a database that
+has had all background updates run.
+
+To do so, use `scripts-dev/make_full_schema.sh`. This will produce new
+`full.sql.postgres ` and `full.sql.sqlite` files. 
 
 Ensure postgres is installed and your user has the ability to run bash commands
-such as `createdb`.
+such as `createdb`, then call
+
+    ./scripts-dev/make_full_schema.sh -p postgres_username -o output_dir/
 
-```
-./scripts-dev/make_full_schema.sh -p postgres_username -o output_dir/
-```
+There are currently two folders with full-schema snapshots. `16` is a snapshot
+from 2015, for historical reference. The other contains the most recent full
+schema snapshot.
diff --git a/synapse/storage/data_stores/main/state.py b/synapse/storage/data_stores/main/state.py
index 3d34103e67..3a3b9a8e72 100644
--- a/synapse/storage/data_stores/main/state.py
+++ b/synapse/storage/data_stores/main/state.py
@@ -321,7 +321,7 @@ class StateGroupWorkerStore(EventsWorkerStore, SQLBaseStore):
             desc="get_referenced_state_groups",
         )
 
-        return set(row["state_group"] for row in rows)
+        return {row["state_group"] for row in rows}
 
 
 class MainStateBackgroundUpdateStore(RoomMemberWorkerStore):
@@ -367,7 +367,7 @@ class MainStateBackgroundUpdateStore(RoomMemberWorkerStore):
             """
 
             txn.execute(sql, (last_room_id, batch_size))
-            room_ids = list(row[0] for row in txn)
+            room_ids = [row[0] for row in txn]
             if not room_ids:
                 return True, set()
 
@@ -384,7 +384,7 @@ class MainStateBackgroundUpdateStore(RoomMemberWorkerStore):
 
             txn.execute(sql, (last_room_id, room_ids[-1], "%:" + self.server_name))
 
-            joined_room_ids = set(row[0] for row in txn)
+            joined_room_ids = {row[0] for row in txn}
 
             left_rooms = set(room_ids) - joined_room_ids
 
@@ -404,7 +404,7 @@ class MainStateBackgroundUpdateStore(RoomMemberWorkerStore):
                 retcols=("state_key",),
             )
 
-            potentially_left_users = set(row["state_key"] for row in rows)
+            potentially_left_users = {row["state_key"] for row in rows}
 
             # Now lets actually delete the rooms from the DB.
             self.db.simple_delete_many_txn(
diff --git a/synapse/storage/data_stores/main/state_deltas.py b/synapse/storage/data_stores/main/state_deltas.py
index 12c982cb26..725e12507f 100644
--- a/synapse/storage/data_stores/main/state_deltas.py
+++ b/synapse/storage/data_stores/main/state_deltas.py
@@ -15,6 +15,8 @@
 
 import logging
 
+from twisted.internet import defer
+
 from synapse.storage._base import SQLBaseStore
 
 logger = logging.getLogger(__name__)
@@ -56,7 +58,7 @@ class StateDeltasStore(SQLBaseStore):
             # if the CSDs haven't changed between prev_stream_id and now, we
             # know for certain that they haven't changed between prev_stream_id and
             # max_stream_id.
-            return max_stream_id, []
+            return defer.succeed((max_stream_id, []))
 
         def get_current_state_deltas_txn(txn):
             # First we calculate the max stream id that will give us less than
diff --git a/synapse/storage/data_stores/main/stream.py b/synapse/storage/data_stores/main/stream.py
index 056b25b13a..ada5cce6c2 100644
--- a/synapse/storage/data_stores/main/stream.py
+++ b/synapse/storage/data_stores/main/stream.py
@@ -346,11 +346,11 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
             from_key (str): The room_key portion of a StreamToken
         """
         from_key = RoomStreamToken.parse_stream_token(from_key).stream
-        return set(
+        return {
             room_id
             for room_id in room_ids
             if self._events_stream_cache.has_entity_changed(room_id, from_key)
-        )
+        }
 
     @defer.inlineCallbacks
     def get_room_events_stream_for_room(
@@ -679,11 +679,11 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
         )
 
         events_before = yield self.get_events_as_list(
-            [e for e in results["before"]["event_ids"]], get_prev_content=True
+            list(results["before"]["event_ids"]), get_prev_content=True
         )
 
         events_after = yield self.get_events_as_list(
-            [e for e in results["after"]["event_ids"]], get_prev_content=True
+            list(results["after"]["event_ids"]), get_prev_content=True
         )
 
         return {
diff --git a/synapse/storage/data_stores/main/user_erasure_store.py b/synapse/storage/data_stores/main/user_erasure_store.py
index af8025bc17..ec6b8a4ffd 100644
--- a/synapse/storage/data_stores/main/user_erasure_store.py
+++ b/synapse/storage/data_stores/main/user_erasure_store.py
@@ -63,9 +63,9 @@ class UserErasureWorkerStore(SQLBaseStore):
             retcols=("user_id",),
             desc="are_users_erased",
         )
-        erased_users = set(row["user_id"] for row in rows)
+        erased_users = {row["user_id"] for row in rows}
 
-        res = dict((u, u in erased_users) for u in user_ids)
+        res = {u: u in erased_users for u in user_ids}
         return res
 
 
diff --git a/synapse/storage/data_stores/state/store.py b/synapse/storage/data_stores/state/store.py
index c4ee9b7ccb..57a5267663 100644
--- a/synapse/storage/data_stores/state/store.py
+++ b/synapse/storage/data_stores/state/store.py
@@ -520,11 +520,11 @@ class StateGroupDataStore(StateBackgroundUpdateStore, SQLBaseStore):
             retcols=("state_group",),
         )
 
-        remaining_state_groups = set(
+        remaining_state_groups = {
             row["state_group"]
             for row in rows
             if row["state_group"] not in state_groups_to_delete
-        )
+        }
 
         logger.info(
             "[purge] de-delta-ing %i remaining state groups",
diff --git a/synapse/storage/database.py b/synapse/storage/database.py
index 3eeb2f7c04..e61595336c 100644
--- a/synapse/storage/database.py
+++ b/synapse/storage/database.py
@@ -15,9 +15,9 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import logging
-import sys
 import time
-from typing import Iterable, Tuple
+from time import monotonic as monotonic_time
+from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional, Tuple
 
 from six import iteritems, iterkeys, itervalues
 from six.moves import intern, range
@@ -29,27 +29,21 @@ from twisted.internet import defer
 
 from synapse.api.errors import StoreError
 from synapse.config.database import DatabaseConnectionConfig
-from synapse.logging.context import LoggingContext, make_deferred_yieldable
+from synapse.logging.context import (
+    LoggingContext,
+    LoggingContextOrSentinel,
+    make_deferred_yieldable,
+)
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.storage.background_updates import BackgroundUpdater
-from synapse.storage.engines import PostgresEngine, Sqlite3Engine
+from synapse.storage.engines import BaseDatabaseEngine, PostgresEngine, Sqlite3Engine
+from synapse.storage.types import Connection, Cursor
 from synapse.util.stringutils import exception_to_unicode
 
-# import a function which will return a monotonic time, in seconds
-try:
-    # on python 3, use time.monotonic, since time.clock can go backwards
-    from time import monotonic as monotonic_time
-except ImportError:
-    # ... but python 2 doesn't have it
-    from time import clock as monotonic_time
-
 logger = logging.getLogger(__name__)
 
-try:
-    MAX_TXN_ID = sys.maxint - 1
-except AttributeError:
-    # python 3 does not have a maximum int value
-    MAX_TXN_ID = 2 ** 63 - 1
+# python 3 does not have a maximum int value
+MAX_TXN_ID = 2 ** 63 - 1
 
 sql_logger = logging.getLogger("synapse.storage.SQL")
 transaction_logger = logging.getLogger("synapse.storage.txn")
@@ -77,7 +71,7 @@ UNIQUE_INDEX_BACKGROUND_UPDATES = {
 
 
 def make_pool(
-    reactor, db_config: DatabaseConnectionConfig, engine
+    reactor, db_config: DatabaseConnectionConfig, engine: BaseDatabaseEngine
 ) -> adbapi.ConnectionPool:
     """Get the connection pool for the database.
     """
@@ -90,7 +84,9 @@ def make_pool(
     )
 
 
-def make_conn(db_config: DatabaseConnectionConfig, engine):
+def make_conn(
+    db_config: DatabaseConnectionConfig, engine: BaseDatabaseEngine
+) -> Connection:
     """Make a new connection to the database and return it.
 
     Returns:
@@ -107,20 +103,27 @@ def make_conn(db_config: DatabaseConnectionConfig, engine):
     return db_conn
 
 
-class LoggingTransaction(object):
+# The type of entry which goes on our after_callbacks and exception_callbacks lists.
+#
+# Python 3.5.2 doesn't support Callable with an ellipsis, so we wrap it in quotes so
+# that mypy sees the type but the runtime python doesn't.
+_CallbackListEntry = Tuple["Callable[..., None]", Iterable[Any], Dict[str, Any]]
+
+
+class LoggingTransaction:
     """An object that almost-transparently proxies for the 'txn' object
     passed to the constructor. Adds logging and metrics to the .execute()
     method.
 
     Args:
         txn: The database transcation object to wrap.
-        name (str): The name of this transactions for logging.
-        database_engine (Sqlite3Engine|PostgresEngine)
-        after_callbacks(list|None): A list that callbacks will be appended to
+        name: The name of this transactions for logging.
+        database_engine
+        after_callbacks: A list that callbacks will be appended to
             that have been added by `call_after` which should be run on
             successful completion of the transaction. None indicates that no
             callbacks should be allowed to be scheduled to run.
-        exception_callbacks(list|None): A list that callbacks will be appended
+        exception_callbacks: A list that callbacks will be appended
             to that have been added by `call_on_exception` which should be run
             if transaction ends with an error. None indicates that no callbacks
             should be allowed to be scheduled to run.
@@ -135,46 +138,67 @@ class LoggingTransaction(object):
     ]
 
     def __init__(
-        self, txn, name, database_engine, after_callbacks=None, exception_callbacks=None
+        self,
+        txn: Cursor,
+        name: str,
+        database_engine: BaseDatabaseEngine,
+        after_callbacks: Optional[List[_CallbackListEntry]] = None,
+        exception_callbacks: Optional[List[_CallbackListEntry]] = None,
     ):
-        object.__setattr__(self, "txn", txn)
-        object.__setattr__(self, "name", name)
-        object.__setattr__(self, "database_engine", database_engine)
-        object.__setattr__(self, "after_callbacks", after_callbacks)
-        object.__setattr__(self, "exception_callbacks", exception_callbacks)
+        self.txn = txn
+        self.name = name
+        self.database_engine = database_engine
+        self.after_callbacks = after_callbacks
+        self.exception_callbacks = exception_callbacks
 
-    def call_after(self, callback, *args, **kwargs):
+    def call_after(self, callback: "Callable[..., None]", *args, **kwargs):
         """Call the given callback on the main twisted thread after the
         transaction has finished. Used to invalidate the caches on the
         correct thread.
         """
+        # if self.after_callbacks is None, that means that whatever constructed the
+        # LoggingTransaction isn't expecting there to be any callbacks; assert that
+        # is not the case.
+        assert self.after_callbacks is not None
         self.after_callbacks.append((callback, args, kwargs))
 
-    def call_on_exception(self, callback, *args, **kwargs):
+    def call_on_exception(self, callback: "Callable[..., None]", *args, **kwargs):
+        # if self.exception_callbacks is None, that means that whatever constructed the
+        # LoggingTransaction isn't expecting there to be any callbacks; assert that
+        # is not the case.
+        assert self.exception_callbacks is not None
         self.exception_callbacks.append((callback, args, kwargs))
 
-    def __getattr__(self, name):
-        return getattr(self.txn, name)
+    def fetchall(self) -> List[Tuple]:
+        return self.txn.fetchall()
 
-    def __setattr__(self, name, value):
-        setattr(self.txn, name, value)
+    def fetchone(self) -> Tuple:
+        return self.txn.fetchone()
 
-    def __iter__(self):
+    def __iter__(self) -> Iterator[Tuple]:
         return self.txn.__iter__()
 
+    @property
+    def rowcount(self) -> int:
+        return self.txn.rowcount
+
+    @property
+    def description(self) -> Any:
+        return self.txn.description
+
     def execute_batch(self, sql, args):
         if isinstance(self.database_engine, PostgresEngine):
-            from psycopg2.extras import execute_batch
+            from psycopg2.extras import execute_batch  # type: ignore
 
             self._do_execute(lambda *x: execute_batch(self.txn, *x), sql, args)
         else:
             for val in args:
                 self.execute(sql, val)
 
-    def execute(self, sql, *args):
+    def execute(self, sql: str, *args: Any):
         self._do_execute(self.txn.execute, sql, *args)
 
-    def executemany(self, sql, *args):
+    def executemany(self, sql: str, *args: Any):
         self._do_execute(self.txn.executemany, sql, *args)
 
     def _make_sql_one_line(self, sql):
@@ -207,6 +231,9 @@ class LoggingTransaction(object):
             sql_logger.debug("[SQL time] {%s} %f sec", self.name, secs)
             sql_query_timer.labels(sql.split()[0]).observe(secs)
 
+    def close(self):
+        self.txn.close()
+
 
 class PerformanceCounters(object):
     def __init__(self):
@@ -251,7 +278,9 @@ class Database(object):
 
     _TXN_ID = 0
 
-    def __init__(self, hs, database_config: DatabaseConnectionConfig, engine):
+    def __init__(
+        self, hs, database_config: DatabaseConnectionConfig, engine: BaseDatabaseEngine
+    ):
         self.hs = hs
         self._clock = hs.get_clock()
         self._database_config = database_config
@@ -259,9 +288,9 @@ class Database(object):
 
         self.updates = BackgroundUpdater(hs, self)
 
-        self._previous_txn_total_time = 0
-        self._current_txn_total_time = 0
-        self._previous_loop_ts = 0
+        self._previous_txn_total_time = 0.0
+        self._current_txn_total_time = 0.0
+        self._previous_loop_ts = 0.0
 
         # TODO(paul): These can eventually be removed once the metrics code
         #   is running in mainline, and we have some nice monitoring frontends
@@ -463,23 +492,23 @@ class Database(object):
             sql_txn_timer.labels(desc).observe(duration)
 
     @defer.inlineCallbacks
-    def runInteraction(self, desc, func, *args, **kwargs):
+    def runInteraction(self, desc: str, func: Callable, *args: Any, **kwargs: Any):
         """Starts a transaction on the database and runs a given function
 
         Arguments:
-            desc (str): description of the transaction, for logging and metrics
-            func (func): callback function, which will be called with a
+            desc: description of the transaction, for logging and metrics
+            func: callback function, which will be called with a
                 database transaction (twisted.enterprise.adbapi.Transaction) as
                 its first argument, followed by `args` and `kwargs`.
 
-            args (list): positional args to pass to `func`
-            kwargs (dict): named args to pass to `func`
+            args: positional args to pass to `func`
+            kwargs: named args to pass to `func`
 
         Returns:
             Deferred: The result of func
         """
-        after_callbacks = []
-        exception_callbacks = []
+        after_callbacks = []  # type: List[_CallbackListEntry]
+        exception_callbacks = []  # type: List[_CallbackListEntry]
 
         if LoggingContext.current_context() == LoggingContext.sentinel:
             logger.warning("Starting db txn '%s' from sentinel context", desc)
@@ -505,20 +534,22 @@ class Database(object):
         return result
 
     @defer.inlineCallbacks
-    def runWithConnection(self, func, *args, **kwargs):
+    def runWithConnection(self, func: Callable, *args: Any, **kwargs: Any):
         """Wraps the .runWithConnection() method on the underlying db_pool.
 
         Arguments:
-            func (func): callback function, which will be called with a
+            func: callback function, which will be called with a
                 database connection (twisted.enterprise.adbapi.Connection) as
                 its first argument, followed by `args` and `kwargs`.
-            args (list): positional args to pass to `func`
-            kwargs (dict): named args to pass to `func`
+            args: positional args to pass to `func`
+            kwargs: named args to pass to `func`
 
         Returns:
             Deferred: The result of func
         """
-        parent_context = LoggingContext.current_context()
+        parent_context = (
+            LoggingContext.current_context()
+        )  # type: Optional[LoggingContextOrSentinel]
         if parent_context == LoggingContext.sentinel:
             logger.warning(
                 "Starting db connection from sentinel context: metrics will be lost"
@@ -554,8 +585,8 @@ class Database(object):
         Returns:
             A list of dicts where the key is the column header.
         """
-        col_headers = list(intern(str(column[0])) for column in cursor.description)
-        results = list(dict(zip(col_headers, row)) for row in cursor)
+        col_headers = [intern(str(column[0])) for column in cursor.description]
+        results = [dict(zip(col_headers, row)) for row in cursor]
         return results
 
     def execute(self, desc, decoder, query, *args):
@@ -800,7 +831,7 @@ class Database(object):
                 return False
 
         # We didn't find any existing rows, so insert a new one
-        allvalues = {}
+        allvalues = {}  # type: Dict[str, Any]
         allvalues.update(keyvalues)
         allvalues.update(values)
         allvalues.update(insertion_values)
@@ -829,7 +860,7 @@ class Database(object):
         Returns:
             None
         """
-        allvalues = {}
+        allvalues = {}  # type: Dict[str, Any]
         allvalues.update(keyvalues)
         allvalues.update(insertion_values)
 
@@ -916,7 +947,7 @@ class Database(object):
         Returns:
             None
         """
-        allnames = []
+        allnames = []  # type: List[str]
         allnames.extend(key_names)
         allnames.extend(value_names)
 
@@ -1100,7 +1131,7 @@ class Database(object):
             keyvalues : dict of column names and values to select the rows with
             retcols : list of strings giving the names of the columns to return
         """
-        results = []
+        results = []  # type: List[Dict[str, Any]]
 
         if not iterable:
             return results
@@ -1439,7 +1470,7 @@ class Database(object):
             raise ValueError("order_direction must be one of 'ASC' or 'DESC'.")
 
         where_clause = "WHERE " if filters or keyvalues else ""
-        arg_list = []
+        arg_list = []  # type: List[Any]
         if filters:
             where_clause += " AND ".join("%s LIKE ?" % (k,) for k in filters)
             arg_list += list(filters.values())
@@ -1504,7 +1535,7 @@ class Database(object):
 
 def make_in_list_sql_clause(
     database_engine, column: str, iterable: Iterable
-) -> Tuple[str, Iterable]:
+) -> Tuple[str, list]:
     """Returns an SQL clause that checks the given column is in the iterable.
 
     On SQLite this expands to `column IN (?, ?, ...)`, whereas on Postgres
diff --git a/synapse/storage/engines/__init__.py b/synapse/storage/engines/__init__.py
index 9d2d519922..035f9ea6e9 100644
--- a/synapse/storage/engines/__init__.py
+++ b/synapse/storage/engines/__init__.py
@@ -12,29 +12,31 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-
-import importlib
 import platform
 
-from ._base import IncorrectDatabaseSetup
+from ._base import BaseDatabaseEngine, IncorrectDatabaseSetup
 from .postgres import PostgresEngine
 from .sqlite import Sqlite3Engine
 
-SUPPORTED_MODULE = {"sqlite3": Sqlite3Engine, "psycopg2": PostgresEngine}
-
 
-def create_engine(database_config):
+def create_engine(database_config) -> BaseDatabaseEngine:
     name = database_config["name"]
-    engine_class = SUPPORTED_MODULE.get(name, None)
 
-    if engine_class:
+    if name == "sqlite3":
+        import sqlite3
+
+        return Sqlite3Engine(sqlite3, database_config)
+
+    if name == "psycopg2":
         # pypy requires psycopg2cffi rather than psycopg2
-        if name == "psycopg2" and platform.python_implementation() == "PyPy":
-            name = "psycopg2cffi"
-        module = importlib.import_module(name)
-        return engine_class(module, database_config)
+        if platform.python_implementation() == "PyPy":
+            import psycopg2cffi as psycopg2  # type: ignore
+        else:
+            import psycopg2  # type: ignore
+
+        return PostgresEngine(psycopg2, database_config)
 
     raise RuntimeError("Unsupported database engine '%s'" % (name,))
 
 
-__all__ = ["create_engine", "IncorrectDatabaseSetup"]
+__all__ = ["create_engine", "BaseDatabaseEngine", "IncorrectDatabaseSetup"]
diff --git a/synapse/storage/engines/_base.py b/synapse/storage/engines/_base.py
index ec5a4d198b..ab0bbe4bd3 100644
--- a/synapse/storage/engines/_base.py
+++ b/synapse/storage/engines/_base.py
@@ -12,7 +12,94 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+import abc
+from typing import Generic, TypeVar
+
+from synapse.storage.types import Connection
 
 
 class IncorrectDatabaseSetup(RuntimeError):
     pass
+
+
+ConnectionType = TypeVar("ConnectionType", bound=Connection)
+
+
+class BaseDatabaseEngine(Generic[ConnectionType], metaclass=abc.ABCMeta):
+    def __init__(self, module, database_config: dict):
+        self.module = module
+
+    @property
+    @abc.abstractmethod
+    def single_threaded(self) -> bool:
+        ...
+
+    @property
+    @abc.abstractmethod
+    def can_native_upsert(self) -> bool:
+        """
+        Do we support native UPSERTs?
+        """
+        ...
+
+    @property
+    @abc.abstractmethod
+    def supports_tuple_comparison(self) -> bool:
+        """
+        Do we support comparing tuples, i.e. `(a, b) > (c, d)`?
+        """
+        ...
+
+    @property
+    @abc.abstractmethod
+    def supports_using_any_list(self) -> bool:
+        """
+        Do we support using `a = ANY(?)` and passing a list
+        """
+        ...
+
+    @abc.abstractmethod
+    def check_database(
+        self, db_conn: ConnectionType, allow_outdated_version: bool = False
+    ) -> None:
+        ...
+
+    @abc.abstractmethod
+    def check_new_database(self, txn) -> None:
+        """Gets called when setting up a brand new database. This allows us to
+        apply stricter checks on new databases versus existing database.
+        """
+        ...
+
+    @abc.abstractmethod
+    def convert_param_style(self, sql: str) -> str:
+        ...
+
+    @abc.abstractmethod
+    def on_new_connection(self, db_conn: ConnectionType) -> None:
+        ...
+
+    @abc.abstractmethod
+    def is_deadlock(self, error: Exception) -> bool:
+        ...
+
+    @abc.abstractmethod
+    def is_connection_closed(self, conn: ConnectionType) -> bool:
+        ...
+
+    @abc.abstractmethod
+    def lock_table(self, txn, table: str) -> None:
+        ...
+
+    @abc.abstractmethod
+    def get_next_state_group_id(self, txn) -> int:
+        """Returns an int that can be used as a new state_group ID
+        """
+        ...
+
+    @property
+    @abc.abstractmethod
+    def server_version(self) -> str:
+        """Gets a string giving the server version. For example: '3.22.0'
+        """
+        ...
diff --git a/synapse/storage/engines/postgres.py b/synapse/storage/engines/postgres.py
index a077345960..6c7d08a6f2 100644
--- a/synapse/storage/engines/postgres.py
+++ b/synapse/storage/engines/postgres.py
@@ -15,16 +15,14 @@
 
 import logging
 
-from ._base import IncorrectDatabaseSetup
+from ._base import BaseDatabaseEngine, IncorrectDatabaseSetup
 
 logger = logging.getLogger(__name__)
 
 
-class PostgresEngine(object):
-    single_threaded = False
-
+class PostgresEngine(BaseDatabaseEngine):
     def __init__(self, database_module, database_config):
-        self.module = database_module
+        super().__init__(database_module, database_config)
         self.module.extensions.register_type(self.module.extensions.UNICODE)
 
         # Disables passing `bytes` to txn.execute, c.f. #6186. If you do
@@ -36,6 +34,10 @@ class PostgresEngine(object):
         self.synchronous_commit = database_config.get("synchronous_commit", True)
         self._version = None  # unknown as yet
 
+    @property
+    def single_threaded(self) -> bool:
+        return False
+
     def check_database(self, db_conn, allow_outdated_version: bool = False):
         # Get the version of PostgreSQL that we're using. As per the psycopg2
         # docs: The number is formed by converting the major, minor, and
@@ -53,7 +55,7 @@ class PostgresEngine(object):
             if rows and rows[0][0] != "UTF8":
                 raise IncorrectDatabaseSetup(
                     "Database has incorrect encoding: '%s' instead of 'UTF8'\n"
-                    "See docs/postgres.rst for more information." % (rows[0][0],)
+                    "See docs/postgres.md for more information." % (rows[0][0],)
                 )
 
             txn.execute(
@@ -62,12 +64,16 @@ class PostgresEngine(object):
             collation, ctype = txn.fetchone()
             if collation != "C":
                 logger.warning(
-                    "Database has incorrect collation of %r. Should be 'C'", collation
+                    "Database has incorrect collation of %r. Should be 'C'\n"
+                    "See docs/postgres.md for more information.",
+                    collation,
                 )
 
             if ctype != "C":
                 logger.warning(
-                    "Database has incorrect ctype of %r. Should be 'C'", ctype
+                    "Database has incorrect ctype of %r. Should be 'C'\n"
+                    "See docs/postgres.md for more information.",
+                    ctype,
                 )
 
     def check_new_database(self, txn):
diff --git a/synapse/storage/engines/sqlite.py b/synapse/storage/engines/sqlite.py
index 641e490697..2bfeefd54e 100644
--- a/synapse/storage/engines/sqlite.py
+++ b/synapse/storage/engines/sqlite.py
@@ -12,16 +12,16 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-
+import sqlite3
 import struct
 import threading
 
+from synapse.storage.engines import BaseDatabaseEngine
 
-class Sqlite3Engine(object):
-    single_threaded = True
 
+class Sqlite3Engine(BaseDatabaseEngine[sqlite3.Connection]):
     def __init__(self, database_module, database_config):
-        self.module = database_module
+        super().__init__(database_module, database_config)
 
         database = database_config.get("args", {}).get("database")
         self._is_in_memory = database in (None, ":memory:",)
@@ -32,6 +32,10 @@ class Sqlite3Engine(object):
         self._current_state_group_id_lock = threading.Lock()
 
     @property
+    def single_threaded(self) -> bool:
+        return True
+
+    @property
     def can_native_upsert(self):
         """
         Do we support native UPSERTs? This requires SQLite3 3.24+, plus some
@@ -68,7 +72,6 @@ class Sqlite3Engine(object):
         return sql
 
     def on_new_connection(self, db_conn):
-
         # We need to import here to avoid an import loop.
         from synapse.storage.prepare_database import prepare_database
 
diff --git a/synapse/storage/persist_events.py b/synapse/storage/persist_events.py
index b950550f23..0f9ac1cf09 100644
--- a/synapse/storage/persist_events.py
+++ b/synapse/storage/persist_events.py
@@ -602,14 +602,14 @@ class EventsPersistenceStorage(object):
             event_id_to_state_group.update(event_to_groups)
 
         # State groups of old_latest_event_ids
-        old_state_groups = set(
+        old_state_groups = {
             event_id_to_state_group[evid] for evid in old_latest_event_ids
-        )
+        }
 
         # State groups of new_latest_event_ids
-        new_state_groups = set(
+        new_state_groups = {
             event_id_to_state_group[evid] for evid in new_latest_event_ids
-        )
+        }
 
         # If they old and new groups are the same then we don't need to do
         # anything.
diff --git a/synapse/storage/prepare_database.py b/synapse/storage/prepare_database.py
index c285ef52a0..6cb7d4b922 100644
--- a/synapse/storage/prepare_database.py
+++ b/synapse/storage/prepare_database.py
@@ -278,13 +278,17 @@ def _upgrade_existing_database(
             the current_version wasn't generated by applying those delta files.
         database_engine (DatabaseEngine)
         config (synapse.config.homeserver.HomeServerConfig|None):
-            application config, or None if we are connecting to an existing
-            database which we expect to be configured already
+            None if we are initialising a blank database, otherwise the application
+            config
         data_stores (list[str]): The names of the data stores to instantiate
             on the given database.
         is_empty (bool): Is this a blank database? I.e. do we need to run the
             upgrade portions of the delta scripts.
     """
+    if is_empty:
+        assert not applied_delta_files
+    else:
+        assert config
 
     if current_version > SCHEMA_VERSION:
         raise ValueError(
@@ -292,6 +296,13 @@ def _upgrade_existing_database(
             + "new for the server to understand"
         )
 
+    # some of the deltas assume that config.server_name is set correctly, so now
+    # is a good time to run the sanity check.
+    if not is_empty and "main" in data_stores:
+        from synapse.storage.data_stores.main import check_database_before_upgrade
+
+        check_database_before_upgrade(cur, database_engine, config)
+
     start_ver = current_version
     if not upgraded:
         start_ver += 1
@@ -345,9 +356,9 @@ def _upgrade_existing_database(
                     "Could not open delta dir for version %d: %s" % (v, directory)
                 )
 
-        duplicates = set(
+        duplicates = {
             file_name for file_name, count in file_name_counter.items() if count > 1
-        )
+        }
         if duplicates:
             # We don't support using the same file name in the same delta version.
             raise PrepareDatabaseException(
@@ -454,7 +465,7 @@ def _apply_module_schema_files(cur, database_engine, modname, names_and_streams)
         ),
         (modname,),
     )
-    applied_deltas = set(d for d, in cur)
+    applied_deltas = {d for d, in cur}
     for (name, stream) in names_and_streams:
         if name in applied_deltas:
             continue
diff --git a/synapse/storage/types.py b/synapse/storage/types.py
new file mode 100644
index 0000000000..daff81c5ee
--- /dev/null
+++ b/synapse/storage/types.py
@@ -0,0 +1,65 @@
+# -*- coding: utf-8 -*-
+# Copyright 2020 The Matrix.org Foundation C.I.C.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import Any, Iterable, Iterator, List, Tuple
+
+from typing_extensions import Protocol
+
+
+"""
+Some very basic protocol definitions for the DB-API2 classes specified in PEP-249
+"""
+
+
+class Cursor(Protocol):
+    def execute(self, sql: str, parameters: Iterable[Any] = ...) -> Any:
+        ...
+
+    def executemany(self, sql: str, parameters: Iterable[Iterable[Any]]) -> Any:
+        ...
+
+    def fetchall(self) -> List[Tuple]:
+        ...
+
+    def fetchone(self) -> Tuple:
+        ...
+
+    @property
+    def description(self) -> Any:
+        return None
+
+    @property
+    def rowcount(self) -> int:
+        return 0
+
+    def __iter__(self) -> Iterator[Tuple]:
+        ...
+
+    def close(self) -> None:
+        ...
+
+
+class Connection(Protocol):
+    def cursor(self) -> Cursor:
+        ...
+
+    def close(self) -> None:
+        ...
+
+    def commit(self) -> None:
+        ...
+
+    def rollback(self, *args, **kwargs) -> None:
+        ...
diff --git a/synapse/types.py b/synapse/types.py
index f3cd465735..acf60baddc 100644
--- a/synapse/types.py
+++ b/synapse/types.py
@@ -23,7 +23,7 @@ import attr
 from signedjson.key import decode_verify_key_bytes
 from unpaddedbase64 import decode_base64
 
-from synapse.api.errors import SynapseError
+from synapse.api.errors import Codes, SynapseError
 
 # define a version of typing.Collection that works on python 3.5
 if sys.version_info[:3] >= (3, 6, 0):
@@ -166,11 +166,13 @@ class DomainSpecificString(namedtuple("DomainSpecificString", ("localpart", "dom
         return self
 
     @classmethod
-    def from_string(cls, s):
+    def from_string(cls, s: str):
         """Parse the string given by 's' into a structure object."""
         if len(s) < 1 or s[0:1] != cls.SIGIL:
             raise SynapseError(
-                400, "Expected %s string to start with '%s'" % (cls.__name__, cls.SIGIL)
+                400,
+                "Expected %s string to start with '%s'" % (cls.__name__, cls.SIGIL),
+                Codes.INVALID_PARAM,
             )
 
         parts = s[1:].split(":", 1)
@@ -179,6 +181,7 @@ class DomainSpecificString(namedtuple("DomainSpecificString", ("localpart", "dom
                 400,
                 "Expected %s of the form '%slocalname:domain'"
                 % (cls.__name__, cls.SIGIL),
+                Codes.INVALID_PARAM,
             )
 
         domain = parts[1]
@@ -235,11 +238,13 @@ class GroupID(DomainSpecificString):
     def from_string(cls, s):
         group_id = super(GroupID, cls).from_string(s)
         if not group_id.localpart:
-            raise SynapseError(400, "Group ID cannot be empty")
+            raise SynapseError(400, "Group ID cannot be empty", Codes.INVALID_PARAM)
 
         if contains_invalid_mxid_characters(group_id.localpart):
             raise SynapseError(
-                400, "Group ID can only contain characters a-z, 0-9, or '=_-./'"
+                400,
+                "Group ID can only contain characters a-z, 0-9, or '=_-./'",
+                Codes.INVALID_PARAM,
             )
 
         return group_id
diff --git a/synapse/util/frozenutils.py b/synapse/util/frozenutils.py
index 635b897d6c..f2ccd5e7c6 100644
--- a/synapse/util/frozenutils.py
+++ b/synapse/util/frozenutils.py
@@ -30,7 +30,7 @@ def freeze(o):
         return o
 
     try:
-        return tuple([freeze(i) for i in o])
+        return tuple(freeze(i) for i in o)
     except TypeError:
         pass
 
diff --git a/synapse/visibility.py b/synapse/visibility.py
index d0abd8f04f..bab41182b9 100644
--- a/synapse/visibility.py
+++ b/synapse/visibility.py
@@ -49,7 +49,7 @@ def filter_events_for_client(
     events,
     is_peeking=False,
     always_include_ids=frozenset(),
-    apply_retention_policies=True,
+    filter_send_to_client=True,
 ):
     """
     Check which events a user is allowed to see. If the user can see the event but its
@@ -65,17 +65,16 @@ def filter_events_for_client(
             events
         always_include_ids (set(event_id)): set of event ids to specifically
             include (unless sender is ignored)
-        apply_retention_policies (bool): Whether to filter out events that's older than
-            allowed by the room's retention policy. Useful when this function is called
-            to e.g. check whether a user should be allowed to see the state at a given
-            event rather than to know if it should send an event to a user's client(s).
+        filter_send_to_client (bool): Whether we're checking an event that's going to be
+            sent to a client. This might not always be the case since this function can
+            also be called to check whether a user can see the state at a given point.
 
     Returns:
         Deferred[list[synapse.events.EventBase]]
     """
     # Filter out events that have been soft failed so that we don't relay them
     # to clients.
-    events = list(e for e in events if not e.internal_metadata.is_soft_failed())
+    events = [e for e in events if not e.internal_metadata.is_soft_failed()]
 
     types = ((EventTypes.RoomHistoryVisibility, ""), (EventTypes.Member, user_id))
     event_id_to_state = yield storage.state.get_state_for_events(
@@ -96,8 +95,8 @@ def filter_events_for_client(
 
     erased_senders = yield storage.main.are_users_erased((e.sender for e in events))
 
-    if apply_retention_policies:
-        room_ids = set(e.room_id for e in events)
+    if filter_send_to_client:
+        room_ids = {e.room_id for e in events}
         retention_policies = {}
 
         for room_id in room_ids:
@@ -119,27 +118,36 @@ def filter_events_for_client(
 
                the original event if they can see it as normal.
         """
-        if not event.is_state() and event.sender in ignore_list:
-            return None
-
-        # Until MSC2261 has landed we can't redact malicious alias events, so for
-        # now we temporarily filter out m.room.aliases entirely to mitigate
-        # abuse, while we spec a better solution to advertising aliases
-        # on rooms.
-        if event.type == EventTypes.Aliases:
-            return None
-
-        # Don't try to apply the room's retention policy if the event is a state event, as
-        # MSC1763 states that retention is only considered for non-state events.
-        if apply_retention_policies and not event.is_state():
-            retention_policy = retention_policies[event.room_id]
-            max_lifetime = retention_policy.get("max_lifetime")
-
-            if max_lifetime is not None:
-                oldest_allowed_ts = storage.main.clock.time_msec() - max_lifetime
-
-                if event.origin_server_ts < oldest_allowed_ts:
-                    return None
+        # Only run some checks if these events aren't about to be sent to clients. This is
+        # because, if this is not the case, we're probably only checking if the users can
+        # see events in the room at that point in the DAG, and that shouldn't be decided
+        # on those checks.
+        if filter_send_to_client:
+            if event.type == "org.matrix.dummy_event":
+                return None
+
+            if not event.is_state() and event.sender in ignore_list:
+                return None
+
+            # Until MSC2261 has landed we can't redact malicious alias events, so for
+            # now we temporarily filter out m.room.aliases entirely to mitigate
+            # abuse, while we spec a better solution to advertising aliases
+            # on rooms.
+            if event.type == EventTypes.Aliases:
+                return None
+
+            # Don't try to apply the room's retention policy if the event is a state
+            # event, as MSC1763 states that retention is only considered for non-state
+            # events.
+            if not event.is_state():
+                retention_policy = retention_policies[event.room_id]
+                max_lifetime = retention_policy.get("max_lifetime")
+
+                if max_lifetime is not None:
+                    oldest_allowed_ts = storage.main.clock.time_msec() - max_lifetime
+
+                    if event.origin_server_ts < oldest_allowed_ts:
+                        return None
 
         if event.event_id in always_include_ids:
             return event