-rw-r--r-- changelog.d/11415.doc 1
-rw-r--r-- docs/media_repository.md 89
2 files changed, 71 insertions, 19 deletions
diff --git a/changelog.d/11415.doc b/changelog.d/11415.doc
new file mode 100644
index 0000000000..e405531867
--- /dev/null
+++ b/changelog.d/11415.doc
@@ -0,0 +1 @@
+Update the media repository documentation.
diff --git a/docs/media_repository.md b/docs/media_repository.md
index 99ee8f1ef7..ba17f8a856 100644
--- a/docs/media_repository.md
+++ b/docs/media_repository.md
@@ -2,29 +2,80 @@
 
 *Synapse implementation-specific details for the media repository*
 
-The media repository is where attachments and avatar photos are stored.
-It stores attachment content and thumbnails for media uploaded by local users.
-It caches attachment content and thumbnails for media uploaded by remote users.
+The media repository
+ * stores avatars, attachments and their thumbnails for media uploaded by local
+   users.
+ * caches avatars, attachments and their thumbnails for media uploaded by remote
+   users.
+ * caches resources and thumbnails used for
+   [URL previews](development/url_previews.md).
 
-## Storage
+All media in Matrix can be identified by a unique
+[MXC URI](https://spec.matrix.org/latest/client-server-api/#matrix-content-mxc-uris),
+consisting of a server name and media ID:
+```
+mxc://<server-name>/<media-id>
+```
 
-Each item of media is assigned a `media_id` when it is uploaded.
-The `media_id` is a randomly chosen, URL safe 24 character string.
+## Local Media
+Synapse generates 24 character media IDs for content uploaded by local users.
+These media IDs consist of upper and lowercase letters and are case-sensitive.
+Other homeserver implementations may generate media IDs differently.
 
-Metadata such as the MIME type, upload time and length are stored in the
-sqlite3 database indexed by `media_id`.
+Local media is recorded in the `local_media_repository` table, which includes
+metadata such as MIME types, upload times and file sizes.
+Note that this table is shared by the URL cache, which has a different media ID
+scheme.
 
-Content is stored on the filesystem under a `"local_content"` directory.
+### Paths
+A file with media ID `aabbcccccccccccccccccccc` and its `128x96` `image/jpeg`
+thumbnail, created by scaling, would be stored at:
+```
+local_content/aa/bb/cccccccccccccccccccc
+local_thumbnails/aa/bb/cccccccccccccccccccc/128-96-image-jpeg-scale
+```
 
-Thumbnails are stored under a `"local_thumbnails"` directory.
+## Remote Media
+When media from a remote homeserver is requested from Synapse, it is assigned
+a local `filesystem_id`, with the same format as locally-generated media IDs,
+as described above.
 
-The item with `media_id` `"aabbccccccccdddddddddddd"` is stored under
-`"local_content/aa/bb/ccccccccdddddddddddd"`. Its thumbnail with width
-`128` and height `96` and type `"image/jpeg"` is stored under
-`"local_thumbnails/aa/bb/ccccccccdddddddddddd/128-96-image-jpeg"`
+A record of remote media is stored in the `remote_media_cache` table, which
+can be used to map remote MXC URIs (server names and media IDs) to local
+`filesystem_id`s.
 
-Remote content is cached under `"remote_content"` directory. Each item of
-remote content is assigned a local `"filesystem_id"` to ensure that the
-directory structure `"remote_content/server_name/aa/bb/ccccccccdddddddddddd"`
-is appropriate. Thumbnails for remote content are stored under
-`"remote_thumbnail/server_name/..."`
+### Paths
+A file from `matrix.org` with `filesystem_id` `aabbcccccccccccccccccccc` and its
+`128x96` `image/jpeg` thumbnail, created by scaling, would be stored at:
+```
+remote_content/matrix.org/aa/bb/cccccccccccccccccccc
+remote_thumbnail/matrix.org/aa/bb/cccccccccccccccccccc/128-96-image-jpeg-scale
+```
+Older thumbnails may omit the thumbnailing method:
+```
+remote_thumbnail/matrix.org/aa/bb/cccccccccccccccccccc/128-96-image-jpeg
+```
+
+Note that `remote_thumbnail/` does not have an `s`.
+
+## URL Previews
+See [URL Previews](development/url_previews.md) for documentation on the URL preview
+process.
+
+When generating previews for URLs, Synapse may download and cache various
+resources, including images. These resources are assigned temporary media IDs
+of the form `yyyy-mm-dd_aaaaaaaaaaaaaaaa`, where `yyyy-mm-dd` is the current
+date and `aaaaaaaaaaaaaaaa` is a random sequence of 16 case-sensitive letters.
+
+The metadata for these cached resources is stored in the
+`local_media_repository` and `local_media_repository_url_cache` tables.
+
+Resources for URL previews are deleted after a few days.
+
+### Paths
+The file with media ID `yyyy-mm-dd_aaaaaaaaaaaaaaaa` and its `128x96`
+`image/jpeg` thumbnail, created by scaling, would be stored at:
+```
+url_cache/yyyy-mm-dd/aaaaaaaaaaaaaaaa
+url_cache_thumbnails/yyyy-mm-dd/aaaaaaaaaaaaaaaa/128-96-image-jpeg-scale
+```
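
The path layouts documented in this change can be illustrated with a short Python sketch. This is not Synapse's own code: the helper names (`local_paths`, `remote_paths`, `url_cache_paths`, `thumbnail_name`) are hypothetical and exist only to mirror the examples in the updated document. It assumes media IDs are split exactly as shown there (two characters, two characters, remainder) and that URL cache IDs are split on the first underscore.

```python
from typing import Optional, Tuple


def thumbnail_name(width: int, height: int, content_type: str,
                   method: Optional[str] = "scale") -> str:
    """Build the thumbnail file name, e.g. 128-96-image-jpeg-scale.

    Passing method=None reproduces the older naming that omits the method.
    """
    top_level, sub_type = content_type.split("/")
    name = f"{width}-{height}-{top_level}-{sub_type}"
    return f"{name}-{method}" if method else name


def _split_id(media_id: str) -> str:
    # aabbcccc... -> aa/bb/cccc...
    return f"{media_id[0:2]}/{media_id[2:4]}/{media_id[4:]}"


def local_paths(media_id: str, thumb: str) -> Tuple[str, str]:
    """Content and thumbnail paths for locally uploaded media."""
    sub = _split_id(media_id)
    return f"local_content/{sub}", f"local_thumbnails/{sub}/{thumb}"


def remote_paths(server_name: str, filesystem_id: str,
                 thumb: str) -> Tuple[str, str]:
    """Content and thumbnail paths for cached remote media.

    Note the thumbnail directory is `remote_thumbnail`, without an `s`.
    """
    sub = _split_id(filesystem_id)
    return (f"remote_content/{server_name}/{sub}",
            f"remote_thumbnail/{server_name}/{sub}/{thumb}")


def url_cache_paths(media_id: str, thumb: str) -> Tuple[str, str]:
    """Content and thumbnail paths for URL preview resources.

    Expects an ID of the form yyyy-mm-dd_aaaaaaaaaaaaaaaa.
    """
    date, _, rest = media_id.partition("_")
    return (f"url_cache/{date}/{rest}",
            f"url_cache_thumbnails/{date}/{rest}/{thumb}")


if __name__ == "__main__":
    example_id = "aabb" + "c" * 20  # 24 characters, as in the document
    thumb = thumbnail_name(128, 96, "image/jpeg")
    print(local_paths(example_id, thumb))
    print(remote_paths("matrix.org", example_id, thumb))
    print(url_cache_paths("2021-11-23_abcdefghijklmnop", thumb))
```

The date and random suffix in the last call are made-up values for illustration. Calling `thumbnail_name(128, 96, "image/jpeg", method=None)` yields the older file name form without the thumbnailing method.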