summary refs log tree commit diff
path: root/tests/rest/media (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Separate HTTP preview code and URL previewer. (#15269)Patrick Cloke2023-03-201-18/+16
| | | Separates REST layer code from the actual URL previewing.
* Refactor media modules. (#15146)Patrick Cloke2023-02-277-2145/+3
| | | | | | | * Removes the `v1` directory from `test.rest.media.v1`. * Moves the non-REST code from `synapse.rest.media.v1` to `synapse.media`. * Flatten the `v1` directory from `synapse.rest.media`, but leave compatiblity with 3rd party media repositories and spam checkers.
* Do not fail completely if oEmbed autodiscovery fails. (#15092)Patrick Cloke2023-02-231-3/+41
| | | | | | Previously if an autodiscovered oEmbed request failed (e.g. the oEmbed endpoint is down or does not exist) then the entire URL preview would fail. Instead we now return everything we can, even if this additional request fails.
* Bump black from 22.12.0 to 23.1.0 (#15103)dependabot[bot]2023-02-223-7/+0
|
* Typecheck tests.rest.media.v1.test_media_storage (#15008)David Robertson2023-02-071-18/+31
| | | | | | | | | | | * Fix MediaStorage type hint * Typecheck tests.rest.media.v1.test_media_storage * Changelog * Remove assert and make the comment succinct * Fix syntax for olddeps
* Unescape HTML entities in oEmbed titles. (#14781)Jeyachandran Rathnam2023-01-091-0/+10
| | | | | | | It doesn't seem valid that HTML entities should appear in the title field of oEmbed responses, but a popular WordPress plug-in seems to do it. There should not be harm in unescaping these.
* Be more lenient in the oEmbed response parsing. (#14089)Patrick Cloke2022-10-071-1/+102
| | | | | | Attempt to parse any valid information from an oEmbed response (instead of bailing at the first unexpected data). This should allow for more partial oEmbed data to be returned, resulting in better / more URL previews, even if those URL previews are only partial.
* Add a `MXCUri` class to make working with mxc uri's easier. (#13162)Andrew Morgan2022-09-151-64/+38
|
* Provide more info why we don't have any thumbnails to serve (#13038)Eric Eastwood2022-07-151-8/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix https://github.com/matrix-org/synapse/issues/13016 ## New error code and status ### Before Previously, we returned a `404` for `/thumbnail` which isn't even in the spec. ```json { "errcode": "M_NOT_FOUND", "error": "Not found [b'hs1', b'tefQeZhmVxoiBfuFQUKRzJxc']" } ``` ### After What does the spec say? > 400: The request does not make sense to the server, or the server cannot thumbnail the content. For example, the client requested non-integer dimensions or asked for negatively-sized images. > > *-- https://spec.matrix.org/v1.1/client-server-api/#get_matrixmediav3thumbnailservernamemediaid* Now with this PR, we respond with a `400` when we don't have thumbnails to serve and we explain why we might not have any thumbnails. ```json { "errcode": "M_UNKNOWN", "error": "Cannot find any thumbnails for the requested media ([b'example.com', b'12345']). This might mean the media is not a supported_media_format=(image/jpeg, image/jpg, image/webp, image/gif, image/png) or that thumbnailing failed for some other reason. (Dynamic thumbnails are disabled on this server.)", } ``` > Cannot find any thumbnails for the requested media ([b'example.com', b'12345']). This might mean the media is not a supported_media_format=(image/jpeg, image/jpg, image/webp, image/gif, image/png) or that thumbnailing failed for some other reason. (Dynamic thumbnails are disabled on this server.) --- We still respond with a 404 in many other places. But we can iterate on those later and maybe keep some in some specific places after spec updates/clarification: https://github.com/matrix-org/matrix-spec/issues/1122 We can also iterate on the bugs where Synapse doesn't thumbnail when it should in other issues/PRs.
* Uniformize spam-checker API, part 5: expand other spam-checker callbacks to ↵David Teller2022-07-111-3/+67
| | | | | | return `Tuple[Codes, dict]` (#13044) Signed-off-by: David Teller <davidt@element.io> Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* Merge branch 'master' into developAndrew Morgan2022-06-281-0/+17
|\
| * Merge pull request from GHSA-22p3-qrh9-cx32reivilibre2022-06-281-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Make _iterate_over_text easier to read by using simple data structures * Prefer a set of tags to ignore In my tests, it's 4x faster to check for containment in a set of this size * Add a stack size limit to _iterate_over_text * Continue accepting the case where there is no body element * Use an early return instead for None Co-authored-by: Richard van der Hoff <richard@matrix.org>
* | Add Cross-Origin-Resource-Policy header to thumbnail and download media ↵Robert Long2022-06-271-0/+20
| | | | | | | | endpoints (#12944)
* | Improve URL previews for sites with only Twitter card information. (#13056)Patrick Cloke2022-06-161-0/+41
|/ | | | | | Pull out `twitter:` meta tags when generating a preview and use it to augment any `og:` meta tags. Prefers Open Graph information over Twitter card information.
* Prevent local quarantined media from being claimed by media retention (#12972)Andrew Morgan2022-06-071-13/+96
|
* Do not break URL previews if an image is unreachable. (#12950)Patrick Cloke2022-06-061-0/+35
| | | | Avoid breaking a URL preview completely if the chosen image 404s or is unreachable for some other reason (e.g. DNS).
* Improve URL previews for some pages (#12951)Patrick Cloke2022-06-031-1/+36
| | | | | * Skip `og` and `meta` tags where the value is empty. * Fallback to the favicon if there are no other images. * Ignore tags meant for navigation.
* Add config options for media retention (#12732)Andrew Morgan2022-05-311-0/+238
|
* Ensure the type of URL attributes is always str when matching against ↵Brendan Abolivier2022-03-311-2/+41
| | | | preview blacklist (#12333)
* Clean-up logic for rebasing URLs during URL preview. (#12219)Patrick Cloke2022-03-161-43/+11
| | | | By using urljoin from the standard library and reducing the number of places URLs are rebased.
* Add type hints to `tests/rest`. (#12208)Dirk Klimpel2022-03-112-82/+111
| | | Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* Add type hints to `tests/rest` (#12146)Dirk Klimpel2022-03-034-58/+58
| | | | | | | * Add type hints to `tests/rest` * newsfile * change import from `SigningKey`
* Replace assertEquals and friends with non-deprecated versions. (#12092)Patrick Cloke2022-02-281-1/+1
|
* Remove `HomeServer.get_datastore()` (#12031)Richard van der Hoff2022-02-231-1/+1
| | | | | | | The presence of this method was confusing, and mostly present for backwards compatibility. Let's get rid of it. Part of #11733
* Implement a content type allow list for URL previews (#11936)Denis Kasak2022-02-101-0/+72
| | | | | | | This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately. This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302. Signed-off-by: Denis Kasak dkasak@termina.org.uk
* Support rendering previews with data: URLs in them (#11767)Patrick Cloke2022-01-242-8/+554
| | | | | Images which are data URLs will no longer break URL previews and will properly be "downloaded" and thumbnailed.
* Move HTML parsing to a separate file for URL previews. (#11566)Patrick Cloke2021-12-131-0/+1
| | | | | | | * Splits the logic for parsing HTML from the resource handling code. * Fix a circular import in the oEmbed code (which uses the HTML parsing code). * Renames some of the HTML parsing methods to: * Make it clear which methods are "internal" to the module. * Clarify what the methods do.
* Fix media repository failing when media store path contains symlinks (#11446)Sean Quah2021-12-021-1/+108
|
* Prevent the media store from writing outside of the configured directorySean Quah2021-11-191-0/+250
| | | | | Also tighten validation of server names by forbidding invalid characters in IPv6 addresses and empty domain labels.
* Handle missing Content-Type header when accessing remote media (#11200)Shay2021-11-011-2/+16
| | | | | | | | | | | | | | | | | | | | | * add code to handle missing content-type header and a test to verify that it works * add handling for missing content-type in the /upload endpoint as well * slightly refactor test code to put private method in approriate place * handle possible null value for content-type when pulling from the local db * add changelog * refactor test and add code to handle missing content-type in cached remote media * requested changes * Update changelog.d/11200.bugfix Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com> Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Be more lenient when parsing the version for oEmbed responses. (#11065)Patrick Cloke2021-10-131-0/+51
|
* Add tests for `MediaFilePaths` (#11057)Sean Quah2021-10-121-0/+238
|
* Autodiscover oEmbed endpoint from returned HTML (#10822)Patrick Cloke2021-10-081-1/+99
| | | | | Searches the returned HTML for an oEmbed endpoint using the autodiscovery mechanism (`<link rel=...>`), and will request it to generate the preview.
* Fix empty `url_cache_thumbnails/yyyy-mm-dd/` directories being left behind ↵Sean Quah2021-09-291-0/+31
| | | | (#10924)
* Avoid storing URL cache files in storage providers (#10911)Sean Quah2021-09-271-0/+130
| | | | | URL cache files are short-lived and it does not make sense to offload them (eg. to the cloud) or back them up.
* Use direct references for configuration variables (part 5). (#10897)Patrick Cloke2021-09-241-1/+1
|
* Add reactor to `SynapseRequest` and fix up types. (#10868)Erik Johnston2021-09-241-4/+4
|
* Include more information in oEmbed previews. (#10819)Patrick Cloke2021-09-221-9/+21
| | | | | | | * Improved titles (fall back to the author name if there's not title) and include the site name. * Handle photo/video payloads. * Include the original URL in the Open Graph response. * Fix the expiration time (by properly converting from seconds to milliseconds).
* Refactor oEmbed previews (#10814)Patrick Cloke2021-09-211-13/+13
| | | | | | | | | | | | | The major change is moving the decision of whether to use oEmbed further up the call-stack. This reverts the _download_url method to being a "dumb" functionwhich takes a single URL and downloads it (as it was before #7920). This also makes more minor refactorings: * Renames internal variables for clarity. * Factors out shared code between the HTML and rich oEmbed previews. * Fixes tests to preview an oEmbed image.
* Create a constant for a small png image in tests. (#10834)Patrick Cloke2021-09-161-14/+4
| | | To avoid duplicating it between a few tests.
* Request JSON for oEmbed requests (and ignore XML only providers). (#10759)Patrick Cloke2021-09-081-1/+54
| | | | | | | | This adds the format to the request arguments / URL to ensure that JSON data is returned (which is all that Synapse supports). This also adds additional error checking / filtering to the configuration file to ignore XML-only providers.
* Allow configuration of the oEmbed URLs. (#10714)Patrick Cloke2021-08-311-110/+102
| | | | | This adds configuration options (under an `oembed` section) to configure which URLs are matched to use oEmbed for URL previews.
* Fix error when selecting between thumbnails with the same quality (#10684)Sean2021-08-251-1/+38
| | | Fixes #10318
* Flatten the synapse.rest.client package (#10600)reivilibre2021-08-171-1/+1
|
* [pyupgrade] `tests/` (#10347)Jonathan de Jong2021-07-131-1/+1
|
* Standardise the module interface (#10062)Brendan Abolivier2021-06-181-0/+3
| | | This PR adds a common configuration section for all modules (see docs). These modules are then loaded at startup by the homeserver. Modules register their hooks and web resources using the new `register_[...]_callbacks` and `register_web_resource` methods of the module API.
* Remove redundant "coding: utf-8" lines (#9786)Jonathan de Jong2021-04-145-5/+0
| | | | | | | Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
* Use mock from the stdlib. (#9772)Patrick Cloke2021-04-092-4/+2
|
* Handle image transparency better when thumbnailing. (#9473)Patrick Cloke2021-03-091-8/+21
| | | | Properly uses RGBA mode for 1- and 8-bit images with transparency (instead of RBG mode).
* Add testErik Johnston2021-02-191-3/+66
|
* Update black, and run auto formatting over the codebase (#9381)Eric Eastwood2021-02-161-3/+11
| | | | | | | - Update black version to the latest - Run black auto formatting over the codebase - Run autoformatting according to [`docs/code_style.md `](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md) - Update `code_style.md` docs around installing black to use the correct version
* Add check_media_file_for_spam spam checker hookErik Johnston2021-02-041-0/+94
|
* Return a 404 if no valid thumbnail is found. (#9163)Patrick Cloke2021-01-211-1/+24
| | | | | | If no thumbnail of the requested type exists, return a 404 instead of erroring. This doesn't quite match the spec (which does not define what happens if no thumbnail can be found), but is consistent with what Synapse already does.
* Skip unit tests which require optional dependencies (#9031)Richard van der Hoff2021-01-071-0/+7
| | | If we are lacking an optional dependency, skip the tests that rely on it.
* Remove spurious "SynapseRequest" result from `make_request"Richard van der Hoff2020-12-152-21/+21
| | | | This was never used, so let's get rid of it.
* Add X-Robots-Tag header to stop crawlers from indexing media (#8887)Aaron Raimist2020-12-081-0/+13
| | | | | | | Fixes / related to: https://github.com/matrix-org/synapse/issues/6533 This should do essentially the same thing as a robots.txt file telling robots to not index the media repo. https://developers.google.com/search/reference/robots_meta_tag Signed-off-by: Aaron Raimist <aaron@raim.ist>
* remove unused FakeResponse (#8864)Richard van der Hoff2020-12-021-26/+0
|
* Apply an IP range blacklist to push and key revocation requests. (#8821)Patrick Cloke2020-12-021-1/+1
| | | | | | | | | | | | Replaces the `federation_ip_range_blacklist` configuration setting with an `ip_range_blacklist` setting with wider scope. It now applies to: * Federation * Identity servers * Push notifications * Checking key validitity for third-party invite events The old `federation_ip_range_blacklist` setting is still honored if present, but with reduced scope (it only applies to federation and identity servers).
* Make `make_request` actually render the requestRichard van der Hoff2020-11-162-40/+33
| | | | | | remove the stubbing out of `request.process`, so that `requestReceived` also renders the request via the appropriate resource. Replace render() with a stub for now.
* Fix the URL in the URL preview testsRichard van der Hoff2020-11-161-19/+22
| | | | the preview resource is mointed at preview_url, not url_preview
* use global make_request() directly where we have a custom ResourceRichard van der Hoff2020-11-151-3/+14
| | | | | | Where we want to render a request against a specific Resource, call the global make_request() function rather than the one in HomeserverTestCase, allowing us to pass in an appropriate `Site`.
* Do not error when thumbnailing invalid files (#8236)Patrick Cloke2020-09-091-10/+29
| | | | If a file cannot be thumbnailed for some reason (e.g. the file is empty), then catch the exception and convert it to a reasonable error message for the client.
* Stop sub-classing object (#8249)Patrick Cloke2020-09-041-3/+3
|
* Support oEmbed for media previews. (#7920)Patrick Cloke2020-07-271-8/+134
| | | Fixes previews of Twitter URLs by using their oEmbed endpoint to grab content.
* Convert more of the media code to async/await (#7873)Patrick Cloke2020-07-241-1/+4
|
* isort 5 compatibility (#7786)Will Hunt2020-07-051-3/+1
| | | The CI appears to use the latest version of isort, which is a problem when isort gets a major version bump. Rather than try to pin the version, I've done the necessary to make isort5 happy with synapse.
* Fetch from the r0 media path instead of the unspecced v1. (#7714)Patrick Cloke2020-06-171-1/+1
|
* Replace all remaining six usage with native Python 3 equivalents (#7704)Dagfinn Ilmari Mannsåker2020-06-161-1/+1
|
* Add support for webp thumbnailing (#7586)WGH2020-06-051-36/+99
| | | | | Closes #4382 Signed-off-by: Maxim Plotnikov <wgh@torlan.ru>
* Allow specifying the value of Accept-Language header for URL previews (#7265)Andrew Morgan2020-04-151-0/+55
|
* Lint + changelogBrendan Abolivier2020-01-221-3/+1
|
* Remove unused importBrendan Abolivier2020-01-221-1/+1
|
* Add tests for thumbnailingBrendan Abolivier2020-01-221-3/+45
|
* Apply suggestions from code reviewRichard van der Hoff2019-11-051-0/+1
| | | | Co-Authored-By: Brendan Abolivier <babolivier@matrix.org> Co-Authored-By: Erik Johnston <erik@matrix.org>
* Strip overlong OpenGraph data from url previewRichard van der Hoff2019-11-051-0/+34
| | | | ... to stop people causing DoSes with malicious web pages
* Move logging utilities out of the side drawer of util/ and into logging/ (#5606)Amber Brown2019-07-041-1/+1
|
* Fix media repo breaking (#5593)Amber Brown2019-07-021-0/+12
|
* Make the http server handle coroutine-making REST servlets (#5475)Amber Brown2019-06-291-10/+15
|
* Run Black. (#5482)Amber Brown2019-06-203-38/+38
|
* Migrate all tests to use the dict-based config format instead of hanging ↵Amber Brown2019-05-132-35/+22
| | | | items off HomeserverConfig (#5171)
* URL preview blacklisting fixes (#5155)Andrew Morgan2019-05-101-11/+11
| | | Prevents a SynapseError being raised inside of a IResolutionReceiver and instead opts to just return 0 results. This thus means that we have to lump a failed lookup and a blacklisted lookup together with the same error message, but the substitute should be generic enough to cover both cases.
* Run Black on the tests again (#5170)Amber Brown2019-05-101-10/+4
|
* Fix parsing of Content-Disposition headers (#4763)Richard van der Hoff2019-02-271-0/+45
| | | | | | | | | | | * Fix parsing of Content-Disposition headers TIL: filenames in content-dispostion headers can contain semicolons, and aren't %-encoded. * fix python2 incompatibility * Fix docstrings
* Fix IP URL previews on Python 3 (#4215)Amber Brown2018-12-221-98/+326
|
* Fix more logcontext leaks in tests (#4209)Richard van der Hoff2018-11-271-1/+2
|
* Fix logcontext leak in test_url_previewRichard van der Hoff2018-11-191-1/+2
|
* Fix Content-Disposition in media repository (#4176)Amber Brown2018-11-151-0/+145
|
* Use <meta> tags to discover the per-page encoding of html previews (#4183)Amber Brown2018-11-151-0/+77
|
* Fix URL preview bugs (type error when loading cache from db, content-type ↵Amber Brown2018-11-081-0/+164
| | | | including quotes) (#4157)
* Run black.black2018-08-101-4/+2
|
* run isortAmber Brown2018-07-091-7/+7
|
* Pass around the reactor explicitly (#3385)Amber Brown2018-06-221-2/+3
|
* Fix broken unit test for media storageErik Johnston2018-02-051-1/+6
|
* Add unit testsErik Johnston2018-01-183-0/+109