| Commit message | Author | Age | Files | Lines |
|
Images which are data URLs will no longer break URL
previews and will properly be "downloaded" and
thumbnailed.
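
As a rough illustration of what "downloading" a data URL can mean, the sketch below decodes the payload locally instead of making a network request. The function name `parse_data_url` is hypothetical and is not taken from the project; the default media type follows RFC 2397.

```python
import base64
from typing import Tuple
from urllib.parse import unquote_to_bytes


def parse_data_url(url: str) -> Tuple[str, bytes]:
    """Split a data: URL into (media type, decoded payload).

    Hypothetical helper for illustration only.
    """
    if not url.startswith("data:"):
        raise ValueError("not a data URL")
    header, sep, payload = url[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URL is missing the ',' separator")
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]
    # RFC 2397: an empty media type means text/plain;charset=US-ASCII.
    media_type = header or "text/plain;charset=US-ASCII"
    # The payload is percent-encoded; base64 payloads are decoded afterwards.
    raw = unquote_to_bytes(payload)
    data = base64.b64decode(raw) if is_base64 else raw
    return media_type, data
```

The returned bytes could then be passed to the same thumbnailing path as an image fetched over HTTP.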
|
* Splits the logic for parsing HTML from the resource handling code.
* Fixes a circular import in the oEmbed code (which uses the HTML parsing code).
* Renames some of the HTML parsing methods to:
  * Make it clear which methods are "internal" to the module.
  * Clarify what the methods do.
|
There's no point in trying more than once since it is guaranteed to
continually fail.
|
This follows similar logic to BeautifulSoup where we attempt different
character encodings until we find one which works.
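
A minimal sketch of the "try encodings until one works" idea, assuming the caller supplies the candidate list. The name `decode_with_fallbacks` is hypothetical and the candidate order is an assumption, not the project's actual list.

```python
from typing import Iterable, Optional


def decode_with_fallbacks(body: bytes, candidates: Iterable[str]) -> Optional[str]:
    """Return the body decoded with the first candidate encoding that works."""
    for encoding in candidates:
        try:
            return body.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are invalid in this encoding.
            # LookupError: Python does not know an encoding by this name.
            continue
    return None
```

Putting the strictest encodings first matters, because a permissive single-byte encoding such as cp1252 accepts almost any input and would hide later candidates.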
|
Searches the returned HTML for an oEmbed endpoint using the
autodiscovery mechanism (`<link rel=...>`), and will request it
to generate the preview.
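
The autodiscovery step might look roughly like the sketch below, which uses the standard-library HTML parser rather than whatever parser the project uses. The class and function names are hypothetical; the two media types are the ones the oEmbed specification lists for discovery.

```python
from html.parser import HTMLParser
from typing import Optional

# Media types the oEmbed specification uses for discovery links.
OEMBED_TYPES = {"application/json+oembed", "text/xml+oembed"}


class _OEmbedLinkFinder(HTMLParser):
    """Remembers the href of the first oEmbed discovery <link> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        link_type = (attributes.get("type") or "").lower()
        if "alternate" in rel and link_type in OEMBED_TYPES:
            self.href = attributes.get("href")


def find_oembed_endpoint(html: str) -> Optional[str]:
    """Return the discovered oEmbed endpoint URL, or None if there is none."""
    finder = _OEmbedLinkFinder()
    finder.feed(html)
    return finder.href
```

The caller would then request the returned URL and build the preview from the oEmbed response.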
|
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as Python 3 reads source code as UTF-8 by default.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
|
* Handle the case of lxml not finding a document tree.
* Parse the document encoding from the XML tag.
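
Reading the encoding from the XML declaration could be sketched as below. The regular expression and the name `encoding_from_xml_declaration` are illustrative assumptions, not the project's code.

```python
import re
from typing import Optional

# Matches e.g. <?xml version="1.0" encoding="ISO-8859-1"?> at the start of a body.
_XML_ENCODING = re.compile(
    rb"""^\s*<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def encoding_from_xml_declaration(body: bytes) -> Optional[str]:
    """Return the encoding named in a leading XML declaration, if any."""
    # Only the start of the document needs to be inspected.
    match = _XML_ENCODING.search(body[:1024])
    return match.group(1).decode("ascii") if match else None
```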
|
Treat unknown encodings (according to lxml) as UTF-8
when generating a preview for HTML documents. This
isn't fully accurate, but will hopefully give a reasonable
title and summary.
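
One way to express the fallback is sketched below. It checks the name against Python's codec registry as a stand-in for lxml's own notion of a known encoding, which is an assumption; `normalise_encoding` is a hypothetical name.

```python
import codecs
from typing import Optional


def normalise_encoding(name: Optional[str]) -> str:
    """Return the given encoding name, or "utf-8" if it is missing or unknown."""
    if not name:
        return "utf-8"
    try:
        codecs.lookup(name)
    except LookupError:
        # Unknown encoding: fall back to UTF-8 rather than failing the preview.
        return "utf-8"
    return name
```

Decoding with `errors="replace"` after this step would keep a mislabelled document from raising, at the cost of some replacement characters in the title or summary.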
|
If we are lacking an optional dependency, skip the tests that rely on it.
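
A common pattern for this with `unittest` is sketched below. It assumes the optional dependency is lxml, which the commit message does not state; the test case itself is hypothetical.

```python
import importlib.util
import unittest

# True only when the optional dependency can be imported.
HAS_LXML = importlib.util.find_spec("lxml") is not None


class PreviewTestCase(unittest.TestCase):
    @unittest.skipUnless(HAS_LXML, "requires lxml")
    def test_parse_paragraph(self):
        # Imported inside the test so the module still loads without lxml.
        from lxml import etree

        element = etree.fromstring("<p>hi</p>")
        self.assertEqual(element.text, "hi")
```

A skipped test is reported as skipped, not failed, so the suite still passes on installations without the dependency.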
|
Signed-off-by: Marcin Bachry <hegel666@gmail.com>
|
The old test expected an incorrect wrapping because the preview function
did not handle Unicode properly, so it computed the wrong length.
Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
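
The kind of length mismatch described here can be shown with a small sketch: the same text has different lengths when measured in characters and in UTF-8 bytes. The sample string and the `clip` helper are illustrative only and do not come from the project.

```python
def clip(text: str, max_chars: int) -> str:
    """Clip by code points, which is closer to what a reader sees as length."""
    return text[:max_chars]


sample = "na\u00efve caf\u00e9"          # two non-ASCII characters
char_length = len(sample)                 # counted in code points
byte_length = len(sample.encode("utf-8"))  # counted in UTF-8 bytes
```

Any wrapping or truncation limit computed on the byte length would cut such text earlier than intended.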
|
Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
|
This includes:
- Splitting out methods of a class into standalone functions, to make
  them easier to test.
- Adding unit tests for the split-out functions, testing HTML -> preview.
- Handling the fact that elements in lxml may have tail text.
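
The tail-text point can be illustrated with the standard-library `xml.etree.ElementTree`, which uses the same `.text` / `.tail` model as lxml. It is used here only as a stand-in so the sketch has no third-party dependency, and `iter_text` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import Iterator


def iter_text(element: ET.Element) -> Iterator[str]:
    """Yield all text under an element in document order, including tails."""
    if element.text:
        yield element.text
    for child in element:
        yield from iter_text(child)
        # Text that follows the child's closing tag is stored on the child
        # as .tail, not on the parent, so it is easy to miss.
        if child.tail:
            yield child.tail


root = ET.fromstring("<p>Hello <b>bold</b> world</p>")
preview = "".join(iter_text(root))
```

Collecting only `.text` would drop the " world" that follows the `<b>` element here.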
|