| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
| |
Separates REST layer code from the actual URL previewing.
|
|
|
|
|
|
|
| |
* Removes the `v1` directory from `test.rest.media.v1`.
* Moves the non-REST code from `synapse.rest.media.v1` to `synapse.media`.
* Flatten the `v1` directory from `synapse.rest.media`, but leave compatiblity
with 3rd party media repositories and spam checkers.
|
|
|
|
|
|
| |
Previously if an autodiscovered oEmbed request failed (e.g. the
oEmbed endpoint is down or does not exist) then the entire URL
preview would fail. Instead we now return everything we can, even
if this additional request fails.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix MediaStorage type hint
* Typecheck tests.rest.media.v1.test_media_storage
* Changelog
* Remove assert and make the comment succinct
* Fix syntax for olddeps
|
|
|
|
|
|
|
| |
It doesn't seem valid that HTML entities should appear in
the title field of oEmbed responses, but a popular WordPress
plug-in seems to do it.
There should not be harm in unescaping these.
|
|
|
|
|
|
| |
Attempt to parse any valid information from an oEmbed response
(instead of bailing at the first unexpected data). This should allow
for more partial oEmbed data to be returned, resulting in better /
more URL previews, even if those URL previews are only partial.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix https://github.com/matrix-org/synapse/issues/13016
## New error code and status
### Before
Previously, we returned a `404` for `/thumbnail` which isn't even in the spec.
```json
{
"errcode": "M_NOT_FOUND",
"error": "Not found [b'hs1', b'tefQeZhmVxoiBfuFQUKRzJxc']"
}
```
### After
What does the spec say?
> 400: The request does not make sense to the server, or the server cannot thumbnail the content. For example, the client requested non-integer dimensions or asked for negatively-sized images.
>
> *-- https://spec.matrix.org/v1.1/client-server-api/#get_matrixmediav3thumbnailservernamemediaid*
Now with this PR, we respond with a `400` when we don't have thumbnails to serve and we explain why we might not have any thumbnails.
```json
{
"errcode": "M_UNKNOWN",
"error": "Cannot find any thumbnails for the requested media ([b'example.com', b'12345']). This might mean the media is not a supported_media_format=(image/jpeg, image/jpg, image/webp, image/gif, image/png) or that thumbnailing failed for some other reason. (Dynamic thumbnails are disabled on this server.)",
}
```
> Cannot find any thumbnails for the requested media ([b'example.com', b'12345']). This might mean the media is not a supported_media_format=(image/jpeg, image/jpg, image/webp, image/gif, image/png) or that thumbnailing failed for some other reason. (Dynamic thumbnails are disabled on this server.)
---
We still respond with a 404 in many other places. But we can iterate on those later and maybe keep some in some specific places after spec updates/clarification: https://github.com/matrix-org/matrix-spec/issues/1122
We can also iterate on the bugs where Synapse doesn't thumbnail when it should in other issues/PRs.
|
|
|
|
|
|
| |
return `Tuple[Codes, dict]` (#13044)
Signed-off-by: David Teller <davidt@element.io>
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* Make _iterate_over_text easier to read by using simple data structures
* Prefer a set of tags to ignore
In my tests, it's 4x faster to check for containment in a set of this size
* Add a stack size limit to _iterate_over_text
* Continue accepting the case where there is no body element
* Use an early return instead for None
Co-authored-by: Richard van der Hoff <richard@matrix.org>
|
| |
| |
| |
| | |
endpoints (#12944)
|
|/
|
|
|
|
| |
Pull out `twitter:` meta tags when generating a preview and
use it to augment any `og:` meta tags.
Prefers Open Graph information over Twitter card information.
|
| |
|
|
|
|
| |
Avoid breaking a URL preview completely if the chosen image 404s
or is unreachable for some other reason (e.g. DNS).
|
|
|
|
|
| |
* Skip `og` and `meta` tags where the value is empty.
* Fallback to the favicon if there are no other images.
* Ignore tags meant for navigation.
|
| |
|
|
|
|
| |
preview blacklist (#12333)
|
|
|
|
| |
By using urljoin from the standard library and reducing the number
of places URLs are rebased.
|
|
|
| |
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
|
|
|
|
|
|
|
| |
* Add type hints to `tests/rest`
* newsfile
* change import from `SigningKey`
|
| |
|
|
|
|
|
|
|
| |
The presence of this method was confusing, and mostly present for backwards
compatibility. Let's get rid of it.
Part of #11733
|
|
|
|
|
|
|
| |
This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately.
This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302.
Signed-off-by: Denis Kasak dkasak@termina.org.uk
|
|
|
|
|
| |
Images which are data URLs will no longer break URL
previews and will properly be "downloaded" and
thumbnailed.
|
|
|
|
|
|
|
| |
* Splits the logic for parsing HTML from the resource handling code.
* Fix a circular import in the oEmbed code (which uses the HTML parsing code).
* Renames some of the HTML parsing methods to:
* Make it clear which methods are "internal" to the module.
* Clarify what the methods do.
|
| |
|
|
|
|
|
| |
Also tighten validation of server names by forbidding invalid characters
in IPv6 addresses and empty domain labels.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* add code to handle missing content-type header and a test to verify that it works
* add handling for missing content-type in the /upload endpoint as well
* slightly refactor test code to put private method in approriate place
* handle possible null value for content-type when pulling from the local db
* add changelog
* refactor test and add code to handle missing content-type in cached remote media
* requested changes
* Update changelog.d/11200.bugfix
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
|
| |
|
| |
|
|
|
|
|
| |
Searches the returned HTML for an oEmbed endpoint using the
autodiscovery mechanism (`<link rel=...>`), and will request it
to generate the preview.
|
|
|
|
| |
(#10924)
|
|
|
|
|
| |
URL cache files are short-lived and it does not make sense to offload
them (eg. to the cloud) or back them up.
|
| |
|
| |
|
|
|
|
|
|
|
| |
* Improved titles (fall back to the author name if there's not title) and include the site name.
* Handle photo/video payloads.
* Include the original URL in the Open Graph response.
* Fix the expiration time (by properly converting from seconds to milliseconds).
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The major change is moving the decision of whether to use oEmbed
further up the call-stack. This reverts the _download_url method to
being a "dumb" functionwhich takes a single URL and downloads it
(as it was before #7920).
This also makes more minor refactorings:
* Renames internal variables for clarity.
* Factors out shared code between the HTML and rich oEmbed
previews.
* Fixes tests to preview an oEmbed image.
|
|
|
| |
To avoid duplicating it between a few tests.
|
|
|
|
|
|
|
|
| |
This adds the format to the request arguments / URL to
ensure that JSON data is returned (which is all that
Synapse supports).
This also adds additional error checking / filtering to the
configuration file to ignore XML-only providers.
|
|
|
|
|
| |
This adds configuration options (under an `oembed` section) to
configure which URLs are matched to use oEmbed for URL
previews.
|
|
|
| |
Fixes #10318
|
| |
|
| |
|
|
|
| |
This PR adds a common configuration section for all modules (see docs). These modules are then loaded at startup by the homeserver. Modules register their hooks and web resources using the new `register_[...]_callbacks` and `register_web_resource` methods of the module API.
|
|
|
|
|
|
|
| |
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
|
| |
|
|
|
|
| |
Properly uses RGBA mode for 1- and 8-bit images with transparency
(instead of RBG mode).
|
| |
|
|
|
|
|
|
|
| |
- Update black version to the latest
- Run black auto formatting over the codebase
- Run autoformatting according to [`docs/code_style.md
`](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md)
- Update `code_style.md` docs around installing black to use the correct version
|
| |
|
|
|
|
|
|
| |
If no thumbnail of the requested type exists, return a 404 instead
of erroring. This doesn't quite match the spec (which does not define
what happens if no thumbnail can be found), but is consistent with
what Synapse already does.
|
|
|
| |
If we are lacking an optional dependency, skip the tests that rely on it.
|
|
|
|
| |
This was never used, so let's get rid of it.
|
|
|
|
|
|
|
| |
Fixes / related to: https://github.com/matrix-org/synapse/issues/6533
This should do essentially the same thing as a robots.txt file telling robots to not index the media repo. https://developers.google.com/search/reference/robots_meta_tag
Signed-off-by: Aaron Raimist <aaron@raim.ist>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replaces the `federation_ip_range_blacklist` configuration setting with an
`ip_range_blacklist` setting with wider scope. It now applies to:
* Federation
* Identity servers
* Push notifications
* Checking key validitity for third-party invite events
The old `federation_ip_range_blacklist` setting is still honored if present, but
with reduced scope (it only applies to federation and identity servers).
|
|
|
|
|
|
| |
remove the stubbing out of `request.process`, so that `requestReceived` also renders the request via the appropriate resource.
Replace render() with a stub for now.
|
|
|
|
| |
the preview resource is mointed at preview_url, not url_preview
|
|
|
|
|
|
| |
Where we want to render a request against a specific Resource, call the global
make_request() function rather than the one in HomeserverTestCase, allowing us
to pass in an appropriate `Site`.
|
|
|
|
| |
If a file cannot be thumbnailed for some reason (e.g. the file is empty), then
catch the exception and convert it to a reasonable error message for the client.
|
| |
|
|
|
| |
Fixes previews of Twitter URLs by using their oEmbed endpoint to grab content.
|
| |
|
|
|
| |
The CI appears to use the latest version of isort, which is a problem when isort gets a major version bump. Rather than try to pin the version, I've done the necessary to make isort5 happy with synapse.
|
| |
|
| |
|
|
|
|
|
| |
Closes #4382
Signed-off-by: Maxim Plotnikov <wgh@torlan.ru>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Co-Authored-By: Brendan Abolivier <babolivier@matrix.org>
Co-Authored-By: Erik Johnston <erik@matrix.org>
|
|
|
|
| |
... to stop people causing DoSes with malicious web pages
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
items off HomeserverConfig (#5171)
|
|
|
| |
Prevents a SynapseError being raised inside of a IResolutionReceiver and instead opts to just return 0 results. This thus means that we have to lump a failed lookup and a blacklisted lookup together with the same error message, but the substitute should be generic enough to cover both cases.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix parsing of Content-Disposition headers
TIL: filenames in content-dispostion headers can contain semicolons, and aren't
%-encoded.
* fix python2 incompatibility
* Fix docstrings
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
including quotes) (#4157)
|
| |
|
| |
|
| |
|
| |
|
|
|