From 3e9f830a1f3368f5210862317e608a528735a38b Mon Sep 17 00:00:00 2001
From: babolivier
Date: Tue, 12 Oct 2021 09:56:14 +0000
Subject: deploy: 60af28c5dd803ac4ad1aa216574cac33b6daed6a

---
 v1.45/development/url_previews.html | 331 ++++++++++++++++++++++++++++++++++++
 1 file changed, 331 insertions(+)
 create mode 100644 v1.45/development/url_previews.html

(limited to 'v1.45/development/url_previews.html')

diff --git a/v1.45/development/url_previews.html b/v1.45/development/url_previews.html
new file mode 100644
index 0000000000..5ae569794c
--- /dev/null
+++ b/v1.45/development/url_previews.html
@@ -0,0 +1,331 @@

URL Previews - Synapse

URL Previews


The GET /_matrix/media/r0/preview_url endpoint provides a generic preview API for URLs. It returns Open Graph responses (with some Matrix-specific additions).
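
As a rough illustration of how a client might call this endpoint, the sketch below builds (but does not send) the request. The `url` and optional `ts` query parameters and the bearer-token header follow the Matrix client-server media API as generally documented; the function name `build_preview_request` and the homeserver address are hypothetical and chosen only for this example.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_preview_request(
    homeserver: str, access_token: str, target_url: str, ts: Optional[int] = None
) -> Request:
    """Build (but do not send) a GET request for the preview_url endpoint."""
    params = {"url": target_url}
    if ts is not None:
        # Optional: preferred point in time (ms since epoch) for the preview.
        params["ts"] = str(ts)
    return Request(
        f"{homeserver}/_matrix/media/r0/preview_url?{urlencode(params)}",
        # Assumption: the endpoint is authenticated with the user's access token.
        headers={"Authorization": f"Bearer {access_token}"},
    )
```

Passing the result to `urllib.request.urlopen` and decoding the body as JSON would be expected to yield a flat object of Open Graph keys such as `og:title`, possibly alongside Matrix-specific keys such as `matrix:image:size`.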


This does have trade-offs compared to other designs:

  • Pros:
    • Simple and flexible; can be used by any client at any point.
  • Cons:
    • If each homeserver provides one of these independently, all the homeservers in a room may needlessly DoS the target URI.
    • The URL metadata must be stored somewhere, rather than just using Matrix itself to store the media.
    • Matrix cannot be used to distribute the metadata between homeservers.

When Synapse is asked to preview a URL it does the following:

  1. Checks against a URL blacklist (defined as url_preview_url_blacklist in the config).
  2. Checks the in-memory cache by URL and returns the result if it exists. (This is also used to de-duplicate processing of multiple in-flight requests at once.)
  3. Kicks off a background process to generate a preview:
    1. Checks the database cache by URL and timestamp and returns the result if it has not expired and was successful (a 2xx return code).
    2. Checks if the URL matches an oEmbed pattern. If it does, updates the URL to download.
    3. Downloads the URL and stores it into a file via the media storage provider and saves the local media metadata.
    4. If the media is an image:
      1. Generates thumbnails.
      2. Generates an Open Graph response based on image properties.
    5. If the media is HTML:
      1. Decodes the HTML via the stored file.
      2. Generates an Open Graph response from the HTML.
      3. If an image exists in the Open Graph response:
        1. Downloads the URL and stores it into a file via the media storage provider and saves the local media metadata.
        2. Generates thumbnails.
        3. Updates the Open Graph response based on image properties.
    6. If the media is JSON and an oEmbed URL was found:
      1. Converts the oEmbed response to an Open Graph response.
      2. If a thumbnail or image is in the oEmbed response:
        1. Downloads the URL and stores it into a file via the media storage provider and saves the local media metadata.
        2. Generates thumbnails.
        3. Updates the Open Graph response based on image properties.
    7. Stores the result in the database cache.
  4. Returns the result.
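
The outer steps of this flow (blacklist check, in-memory cache with de-duplication of in-flight requests, background generation) can be sketched roughly as follows. This is not Synapse's code: `Previewer`, `is_blacklisted` and the `generate` callback are hypothetical names, and the matching rule (an entry matches when every listed URL component matches, with patterns starting with `^` treated as regular expressions and anything else as a glob) is an assumption about how such a blacklist could be evaluated.

```python
import asyncio
import fnmatch
import re
import time
from typing import Awaitable, Callable, Dict, List, Tuple
from urllib.parse import urlsplit

MEMORY_TTL_SECONDS = 60 * 60  # the in-memory cache expires after 1 hour


def is_blacklisted(url: str, blacklist: List[Dict[str, str]]) -> bool:
    """An entry matches when every URL component it lists matches its pattern."""
    parts = urlsplit(url)
    for entry in blacklist:
        for attrib, pattern in entry.items():
            value = getattr(parts, attrib, None)
            if value is None:
                break  # component absent from this URL: entry cannot match
            # Assumption: "^..." is a regex, anything else is a glob.
            if pattern.startswith("^"):
                if not re.search(pattern, str(value)):
                    break
            elif not fnmatch.fnmatch(str(value), pattern):
                break
        else:
            return True  # no component failed, so this entry matches
    return False


class Previewer:
    def __init__(
        self,
        blacklist: List[Dict[str, str]],
        generate: Callable[[str], Awaitable[dict]],
    ) -> None:
        self._blacklist = blacklist
        self._generate = generate  # hypothetical stand-in for step 3
        self._cache: Dict[str, Tuple[float, "asyncio.Future[dict]"]] = {}

    async def preview(self, url: str) -> dict:
        if is_blacklisted(url, self._blacklist):  # step 1
            raise PermissionError(f"URL is blacklisted: {url}")
        now = time.monotonic()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < MEMORY_TTL_SECONDS:  # step 2
            return await cached[1]
        # Caching the in-flight future lets concurrent callers share one fetch.
        future = asyncio.ensure_future(self._generate(url))  # step 3
        self._cache[url] = (now, future)
        try:
            return await future  # step 4
        except Exception:
            self._cache.pop(url, None)  # do not keep failed attempts cached
            raise
```

Storing the pending future rather than the finished result is what makes a single cache serve both purposes: a second request for the same URL awaits the first request's work instead of starting its own download.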

The in-memory cache expires after 1 hour.


Expired entries in the database cache (and their associated media files) are deleted every 10 seconds. The default expiration time is 1 hour from download.
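
To make the expiry rules concrete, here is a minimal sketch that models the database cache as a plain dictionary. The names `CacheEntry`, `lookup` and `sweep` are hypothetical and the real storage layer is not shown; only the 1 hour lifetime, the 10 second interval and the 2xx success rule come from the text above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

EXPIRY_MS = 60 * 60 * 1000  # default: entries expire 1 hour from download
SWEEP_INTERVAL_S = 10       # expired entries are deleted every 10 seconds


@dataclass
class CacheEntry:
    response_code: int
    download_ts_ms: int
    media_id: str  # hypothetical handle to the stored media file

    @property
    def expires_ms(self) -> int:
        return self.download_ts_ms + EXPIRY_MS


def lookup(cache: Dict[str, CacheEntry], url: str, now_ms: int) -> Optional[CacheEntry]:
    """Return a usable entry: not expired and from a successful (2xx) fetch."""
    entry = cache.get(url)
    if entry is None or entry.expires_ms <= now_ms:
        return None
    if not 200 <= entry.response_code < 300:
        return None
    return entry


def sweep(cache: Dict[str, CacheEntry], now_ms: int) -> List[str]:
    """Delete expired entries; return the media IDs whose files should go too."""
    expired = [url for url, entry in cache.items() if entry.expires_ms <= now_ms]
    return [cache.pop(url).media_id for url in expired]
```

A periodic job would call `sweep` every `SWEEP_INTERVAL_S` seconds and delete the returned media files, so an entry may outlive its expiry by up to one interval while `lookup` already treats it as missing.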

\ No newline at end of file
-- 
cgit 1.5.1