| Commit message | Author | Age | Files | Lines |
| |
* Handle the case of lxml not finding a document tree.
* Parse the document encoding from the XML tag.
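A minimal sketch of the second point, assuming the encoding is read from a leading XML declaration in the raw bytes. The helper name `encoding_from_xml_declaration` and the regular expression are illustrative assumptions, not the code from this commit:

```python
import re
from typing import Optional

# Hypothetical pattern: the encoding attribute of a leading XML declaration,
# e.g. <?xml version="1.0" encoding="ISO-8859-1"?>
_XML_ENCODING = re.compile(
    rb"""^\s*<\?xml[^>]*?encoding\s*=\s*["']([\w-]+)["']""",
    re.IGNORECASE,
)


def encoding_from_xml_declaration(body: bytes) -> Optional[str]:
    """Return the declared encoding, or None if there is no declaration."""
    match = _XML_ENCODING.match(body)
    if match is None:
        return None
    return match.group(1).decode("ascii")
```

A caller would fall back to some default when the function returns `None`.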
| |
Treat unknown encodings (according to lxml) as UTF-8
when generating a preview for HTML documents. This
isn't fully accurate, but will hopefully give a reasonable
title and summary.
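The fallback described above might look roughly like this. The sketch uses Python's own codec registry as a stand-in for lxml's notion of a known encoding, which is an assumption; the function names are hypothetical:

```python
import codecs


def choose_encoding(declared):
    """Return the declared encoding if Python knows it, else UTF-8."""
    if not declared:
        return "utf-8"
    try:
        codecs.lookup(declared)
    except LookupError:
        # Unknown encoding: assume UTF-8 rather than failing the preview.
        return "utf-8"
    return declared


def decode_body(body: bytes, declared):
    """Decode with the chosen encoding, replacing undecodable bytes."""
    return body.decode(choose_encoding(declared), errors="replace")
```

Decoding with `errors="replace"` matches the "not fully accurate, but reasonable" intent: a mislabelled page still yields some text for a title and summary.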
| |
If we are lacking an optional dependency, skip the tests that rely on it.
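One common way to do this with the standard `unittest` module is sketched below. The test class and its contents are illustrative assumptions, and the project's own test runner may use a different skip mechanism:

```python
import unittest

try:
    from lxml import etree
except ImportError:
    # Optional dependency is missing; tests that need it will be skipped.
    etree = None


class PreviewTests(unittest.TestCase):
    @unittest.skipIf(etree is None, "requires lxml")
    def test_parses_simple_document(self):
        root = etree.fromstring("<a>x</a>")
        self.assertEqual(root.text, "x")
```

With lxml installed the test runs normally; without it the test is reported as skipped rather than failing.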
| |
Signed-off-by: Marcin Bachry <hegel666@gmail.com>
| |
The old test expected an incorrect wrapping because the preview function
did not handle Unicode properly and so computed the wrong length.
Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
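To illustrate the kind of length mismatch described above: counting encoded bytes instead of characters gives a different result for non-ASCII text. This is a generic sketch with a hypothetical `truncate_chars` helper, not the preview function itself:

```python
def truncate_chars(text: str, max_chars: int) -> str:
    """Truncate by characters (code points), not by encoded bytes."""
    return text[:max_chars]


sample = "é" * 5
char_length = len(sample)                  # length in characters
byte_length = len(sample.encode("utf-8"))  # length in UTF-8 bytes
```

Here `char_length` is 5 while `byte_length` is 10, so any wrapping or truncation limit applied to the byte string would cut the text at the wrong place.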
| |
Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
| |
This includes:
- Splitting out methods of a class into standalone functions, to make
them easier to test.
- Adding unit tests for the split-out functions, testing HTML -> preview.
- Handling the fact that elements in lxml may have tail text.
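The tail-text point can be sketched as follows. In the ElementTree data model, which lxml also implements, text that follows a child element's closing tag is stored on that child's `tail` attribute, not on the parent. The example uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml, and `iter_text` is a hypothetical helper:

```python
import xml.etree.ElementTree as ET


def iter_text(element):
    """Yield all text under an element in document order, including tails."""
    if element.text:
        yield element.text
    for child in element:
        yield from iter_text(child)
        # Text after the child's closing tag lives on child.tail.
        if child.tail:
            yield child.tail


root = ET.fromstring("<p>Hello <b>bold</b> world</p>")
preview_text = "".join(iter_text(root))
```

Here `preview_text` is `"Hello bold world"`. A walk that read only `.text` would drop the trailing `" world"`.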