diff --git a/docs/media_repository.md b/docs/media_repository.md
index 99ee8f1ef7..ba17f8a856 100644
--- a/docs/media_repository.md
+++ b/docs/media_repository.md
@@ -2,29 +2,80 @@
*Synapse implementation-specific details for the media repository*
-The media repository is where attachments and avatar photos are stored.
-It stores attachment content and thumbnails for media uploaded by local users.
-It caches attachment content and thumbnails for media uploaded by remote users.
+The media repository
+ * stores avatars, attachments and their thumbnails for media uploaded by local
+ users.
+ * caches avatars, attachments and their thumbnails for media uploaded by remote
+ users.
+ * caches resources and thumbnails used for
+ [URL previews](development/url_previews.md).
-## Storage
+All media in Matrix can be identified by a unique
+[MXC URI](https://spec.matrix.org/latest/client-server-api/#matrix-content-mxc-uris),
+consisting of a server name and media ID:
+```
+mxc://<server-name>/<media-id>
+```
-Each item of media is assigned a `media_id` when it is uploaded.
-The `media_id` is a randomly chosen, URL safe 24 character string.
+## Local Media
+Synapse generates 24 character media IDs for content uploaded by local users.
+These media IDs consist of upper and lowercase letters and are case-sensitive.
+Other homeserver implementations may generate media IDs differently.
-Metadata such as the MIME type, upload time and length are stored in the
-sqlite3 database indexed by `media_id`.
+Local media is recorded in the `local_media_repository` table, which includes
+metadata such as MIME types, upload times and file sizes.
+Note that this table is shared by the URL cache, which has a different media ID
+scheme.
-Content is stored on the filesystem under a `"local_content"` directory.
+### Paths
+A file with media ID `aabbcccccccccccccccccccc` and its `128x96` `image/jpeg`
+thumbnail, created by scaling, would be stored at:
+```
+local_content/aa/bb/cccccccccccccccccccc
+local_thumbnails/aa/bb/cccccccccccccccccccc/128-96-image-jpeg-scale
+```
-Thumbnails are stored under a `"local_thumbnails"` directory.
+## Remote Media
+When media from a remote homeserver is requested from Synapse, it is assigned
+a local `filesystem_id`, with the same format as locally-generated media IDs,
+as described above.
-The item with `media_id` `"aabbccccccccdddddddddddd"` is stored under
-`"local_content/aa/bb/ccccccccdddddddddddd"`. Its thumbnail with width
-`128` and height `96` and type `"image/jpeg"` is stored under
-`"local_thumbnails/aa/bb/ccccccccdddddddddddd/128-96-image-jpeg"`
+A record of remote media is stored in the `remote_media_cache` table, which
+can be used to map remote MXC URIs (server names and media IDs) to local
+`filesystem_id`s.
-Remote content is cached under `"remote_content"` directory. Each item of
-remote content is assigned a local `"filesystem_id"` to ensure that the
-directory structure `"remote_content/server_name/aa/bb/ccccccccdddddddddddd"`
-is appropriate. Thumbnails for remote content are stored under
-`"remote_thumbnail/server_name/..."`
+### Paths
+A file from `matrix.org` with `filesystem_id` `aabbcccccccccccccccccccc` and its
+`128x96` `image/jpeg` thumbnail, created by scaling, would be stored at:
+```
+remote_content/matrix.org/aa/bb/cccccccccccccccccccc
+remote_thumbnail/matrix.org/aa/bb/cccccccccccccccccccc/128-96-image-jpeg-scale
+```
+Older thumbnails may omit the thumbnailing method:
+```
+remote_thumbnail/matrix.org/aa/bb/cccccccccccccccccccc/128-96-image-jpeg
+```
+
+Note that `remote_thumbnail/` does not have an `s`.
+
+## URL Previews
+See [URL Previews](development/url_previews.md) for documentation on the URL preview
+process.
+
+When generating previews for URLs, Synapse may download and cache various
+resources, including images. These resources are assigned temporary media IDs
+of the form `yyyy-mm-dd_aaaaaaaaaaaaaaaa`, where `yyyy-mm-dd` is the current
+date and `aaaaaaaaaaaaaaaa` is a random sequence of 16 case-sensitive letters.
+
+The metadata for these cached resources is stored in the
+`local_media_repository` and `local_media_repository_url_cache` tables.
+
+Resources for URL previews are deleted after a few days.
+
+### Paths
+The file with media ID `yyyy-mm-dd_aaaaaaaaaaaaaaaa` and its `128x96`
+`image/jpeg` thumbnail, created by scaling, would be stored at:
+```
+url_cache/yyyy-mm-dd/aaaaaaaaaaaaaaaa
+url_cache_thumbnails/yyyy-mm-dd/aaaaaaaaaaaaaaaa/128-96-image-jpeg-scale
+```
diff --git a/docs/modules/background_update_controller_callbacks.md b/docs/modules/background_update_controller_callbacks.md
new file mode 100644
index 0000000000..b3e7c259f4
--- /dev/null
+++ b/docs/modules/background_update_controller_callbacks.md
@@ -0,0 +1,71 @@
+# Background update controller callbacks
+
+Background update controller callbacks allow module developers to control (e.g. rate-limit)
+how database background updates are run. A database background update is an operation
+Synapse runs on its database in the background after it starts. It's usually used to run
+database operations that would take too long if they were run at the same time as schema
+updates (which are run on startup) and delay Synapse's startup too much: populating a
+table with a big amount of data, adding an index on a big table, deleting superfluous data,
+etc.
+
+Background update controller callbacks can be registered using the module API's
+`register_background_update_controller_callbacks` method. Only the first module (in order
+of appearance in Synapse's configuration file) calling this method can register background
+update controller callbacks, subsequent calls are ignored.
+
+The available background update controller callbacks are:
+
+### `on_update`
+
+_First introduced in Synapse v1.49.0_
+
+```python
+def on_update(update_name: str, database_name: str, one_shot: bool) -> AsyncContextManager[int]
+```
+
+Called when about to do an iteration of a background update. The module is given the name
+of the update, the name of the database, and a flag to indicate whether the background
+update will happen in one go and may take a long time (e.g. creating indices). If this last
+argument is set to `False`, the update will be run in batches.
+
+The module must return an async context manager. It will be entered before Synapse runs a
+background update; this should return the desired duration of the iteration, in
+milliseconds.
+
+The context manager will be exited when the iteration completes. Note that the duration
+returned by the context manager is a target, and an iteration may take substantially longer
+or shorter. If the `one_shot` flag is set to `True`, the duration returned is ignored.
+
+__Note__: Unlike most module callbacks in Synapse, this one is _synchronous_. This is
+because asynchronous operations are expected to be run by the async context manager.
+
+This callback is required when registering any other background update controller callback.
+
+### `default_batch_size`
+
+_First introduced in Synapse v1.49.0_
+
+```python
+async def default_batch_size(update_name: str, database_name: str) -> int
+```
+
+Called before the first iteration of a background update, with the name of the update and
+of the database. The module must return the number of elements to process in this first
+iteration.
+
+If this callback is not defined, Synapse will use a default value of 100.
+
+### `min_batch_size`
+
+_First introduced in Synapse v1.49.0_
+
+```python
+async def min_batch_size(update_name: str, database_name: str) -> int
+```
+
+Called before running a new batch for a background update, with the name of the update and
+of the database. The module must return an integer representing the minimum number of
+elements to process in this iteration. This number must be at least 1, and is used to
+ensure that progress is always made.
+
+If this callback is not defined, Synapse will use a default value of 100.
diff --git a/docs/modules/writing_a_module.md b/docs/modules/writing_a_module.md
index 7764e06692..e7c0ffad58 100644
--- a/docs/modules/writing_a_module.md
+++ b/docs/modules/writing_a_module.md
@@ -71,15 +71,15 @@ Modules **must** register their web resources in their `__init__` method.
## Registering a callback
Modules can use Synapse's module API to register callbacks. Callbacks are functions that
-Synapse will call when performing specific actions. Callbacks must be asynchronous, and
-are split in categories. A single module may implement callbacks from multiple categories,
-and is under no obligation to implement all callbacks from the categories it registers
-callbacks for.
+Synapse will call when performing specific actions. Callbacks must be asynchronous (unless
+specified otherwise), and are split in categories. A single module may implement callbacks
+from multiple categories, and is under no obligation to implement all callbacks from the
+categories it registers callbacks for.
Modules can register callbacks using one of the module API's `register_[...]_callbacks`
methods. The callback functions are passed to these methods as keyword arguments, with
-the callback name as the argument name and the function as its value. This is demonstrated
-in the example below. A `register_[...]_callbacks` method exists for each category.
+the callback name as the argument name and the function as its value. A
+`register_[...]_callbacks` method exists for each category.
Callbacks for each category can be found on their respective page of the
[Synapse documentation website](https://matrix-org.github.io/synapse).
\ No newline at end of file
diff --git a/docs/templates.md b/docs/templates.md
index a240f58b54..2b66e9d862 100644
--- a/docs/templates.md
+++ b/docs/templates.md
@@ -71,7 +71,12 @@ Below are the templates Synapse will look for when generating the content of an
* `sender_avatar_url`: the avatar URL (as a `mxc://` URL) for the event's
sender
* `sender_hash`: a hash of the user ID of the sender
+ * `msgtype`: the type of the message
+ * `body_text_html`: html representation of the message
+ * `body_text_plain`: plaintext representation of the message
+ * `image_url`: mxc url of an image, when "msgtype" is "m.image"
* `link`: a `matrix.to` link to the room
+ * `avator_url`: url to the room's avator
* `reason`: information on the event that triggered the email to be sent. It's an
object with the following attributes:
* `room_id`: the ID of the room the event was sent in
diff --git a/docs/workers.md b/docs/workers.md
index 17c8bfeef6..fd83e2ddeb 100644
--- a/docs/workers.md
+++ b/docs/workers.md
@@ -210,7 +210,7 @@ expressions:
^/_matrix/federation/v1/get_groups_publicised$
^/_matrix/key/v2/query
^/_matrix/federation/unstable/org.matrix.msc2946/spaces/
- ^/_matrix/federation/unstable/org.matrix.msc2946/hierarchy/
+ ^/_matrix/federation/(v1|unstable/org.matrix.msc2946)/hierarchy/
# Inbound federation transaction request
^/_matrix/federation/v1/send/
@@ -223,7 +223,7 @@ expressions:
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/members$
^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/state$
^/_matrix/client/unstable/org.matrix.msc2946/rooms/.*/spaces$
- ^/_matrix/client/unstable/org.matrix.msc2946/rooms/.*/hierarchy$
+ ^/_matrix/client/(v1|unstable/org.matrix.msc2946)/rooms/.*/hierarchy$
^/_matrix/client/unstable/im.nheko.summary/rooms/.*/summary$
^/_matrix/client/(api/v1|r0|v3|unstable)/account/3pid$
^/_matrix/client/(api/v1|r0|v3|unstable)/devices$
|