path: root/synapse/storage/databases/main/search.py
Commit message | Author | Age | Files | Lines
* Switch search SQL to triple-quote strings. (#14311) | Patrick Cloke | 2022-10-28 | 1 | -89/+99
  For ease of reading we switch from concatenated strings to triple quote strings.
* Fix tests for change in PostgreSQL 14 behavior change. (#14310) | Patrick Cloke | 2022-10-27 | 1 | -3/+2
  PostgreSQL 14 changed the behavior of `websearch_to_tsquery` to improve some behaviour. The tests were hitting those edge-cases about handling of hanging double quotes.
  This fixes the tests to take into account the PostgreSQL version.
* Unified search query syntax using the full-text search capabilities of the underlying DB. (#11635) | James Salter | 2022-10-25 | 1 | -35/+162
  Support a unified search query syntax which leverages more of the full-text search of each database supported by Synapse.

  Supports, with the same syntax across Postgresql 11+ and Sqlite:

  - quoted "search terms"
  - `AND`, `OR`, `-` (negation) operators
  - Matching words based on their stem, e.g. searches for "dog" matches documents containing "dogs".

  This is achieved by

  - If on postgresql 11+, pass the user input to `websearch_to_tsquery`
  - If on sqlite, manually parse the query and transform it into the sqlite-specific query syntax.

  Note that postgresql 10, which is close to end-of-life, falls back to using `phraseto_tsquery`, which only supports a subset of the features.

  Multiple terms separated by a space are implicitly ANDed.

  Note that:

  1. There is no escaping of full-text syntax that might be supported by the database; e.g. `NOT`, `NEAR`, `*` in sqlite. This runs the risk that people might discover this as accidental functionality and depend on something we don't guarantee.
  2. English text is assumed for stemming. To support other languages, either the target language needs to be known at the time of indexing the message (via room metadata, or otherwise), or a separate index for each language supported could be created.

  Sqlite docs: https://www.sqlite.org/fts3.html#full_text_index_queries
  Postgres docs: https://www.postgresql.org/docs/11/textsearch-controls.html
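  The sqlite path described in this commit (parse the user query, then emit sqlite-specific syntax) can be illustrated with a minimal sketch. This is not the code from the commit: the helper names `tokenize` and `to_sqlite_match` are hypothetical, and mapping `-term` to `NOT term` is an assumption about how negation might be rendered for SQLite's enhanced FTS query syntax.

```python
import re
from typing import List


def tokenize(query: str) -> List[str]:
    """Split user input into quoted phrases and whitespace-separated words.

    Hypothetical helper: a closed "quoted phrase" is kept as one token,
    everything else is split on whitespace.
    """
    return re.findall(r'"[^"]*"|\S+', query)


def to_sqlite_match(query: str) -> str:
    """Translate the unified syntax into a string for a sqlite MATCH clause.

    Assumption: a leading "-" is rewritten to "NOT". SQLite treats NOT as a
    binary operator, so a query that starts with a negated term would still
    need special handling that this sketch does not attempt.
    """
    parts: List[str] = []
    for token in tokenize(query):
        if token.startswith("-") and len(token) > 1:
            parts.append("NOT " + token[1:])
        else:
            # Words, quoted phrases, and the AND / OR operators pass through;
            # adjacent terms stay implicitly ANDed.
            parts.append(token)
    return " ".join(parts)
```

  For instance, `to_sqlite_match('"big dog" -cat OR mouse')` yields `"big dog" NOT cat OR mouse`. Like the commit, the sketch does not escape other full-text operators such as `NEAR` or `*`.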
* Update mypy and mypy-zope, attempt 3 (#13993) | David Robertson | 2022-09-30 | 1 | -1/+1
  Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* Revert "Update mypy and mypy-zope (#13925)" | David Robertson | 2022-09-30 | 1 | -1/+1
  This reverts commit 6d543d6d9f56e39199b7e460d0081b02d61f12be.
* Update mypy and mypy-zope (#13925) | David Robertson | 2022-09-30 | 1 | -1/+1
  * Update mypy and mypy-zope
  * Unignore assigning to LogRecord attributes
    Presumably https://github.com/python/typeshed/pull/8064 makes this ok
    Cherry-picked from #13521
  * Remove unused ignores due to mypy ParamSpec fixes
    https://github.com/python/mypy/pull/12668
    Cherry-picked from #13521
  * Remove additional unused ignores
  * Fix new mypy complaints related to `assertGreater`
    Presumably due to https://github.com/python/typeshed/pull/8077
  * Changelog
  * Reword changelog
    Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
  Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* Replace noop background updates with DELETE. (#12954) | Patrick Cloke | 2022-06-13 | 1 | -10/+0
  Removes the `register_noop_background_update` and deletes the background updates directly in a delta file.
* Add some type hints to datastore. (#12477) | Dirk Klimpel | 2022-05-10 | 1 | -13/+20
* remove constantly lib use and switch to enums. (#12624) | andrew do | 2022-05-04 | 1 | -4/+4
* Add some type hints to datastore. (#12255) | Dirk Klimpel | 2022-03-28 | 1 | -13/+13
* Fix broken background updates when using sqlite with `enable_search` off (#12215) | Sean Quah | 2022-03-14 | 1 | -6/+7
  Signed-off-by: Sean Quah <seanq@element.io>
* Fix non-strings in the `event_search` table (#12037) | Sean Quah | 2022-02-24 | 1 | -0/+26
  Don't attempt to add non-string `value`s to `event_search` and add a background update to clear out bad rows from `event_search` when using sqlite.
  Signed-off-by: Sean Quah <seanq@element.io>
* Refactor search code to reduce function size. (#11991) | Patrick Cloke | 2022-02-15 | 1 | -7/+10
  Splits the search code into a few logical functions instead of a single unreadable function.
  There are also a few additional changes for readability.
  After refactoring it was clear to see there were some unused and unnecessary variables, which were simplified.
* Convert all namedtuples to attrs. (#11665) | Patrick Cloke | 2021-12-30 | 1 | -5/+11
  To improve type hints throughout the code.
* Type hint the constructors of the data store classes (#11555) | Sean Quah | 2021-12-13 | 1 | -3/+17
* Add type hints for most `HomeServer` parameters (#11095) | Sean Quah | 2021-10-22 | 1 | -3/+6
* Use direct references for configuration variables (part 6). (#10916) | Patrick Cloke | 2021-09-29 | 1 | -2/+2
* Treat "\u0000" as "\u0020" for the purposes of message search (message indexing) (#10820) | Hillery Shay | 2021-09-22 | 1 | -9/+25
  * add test to check if null code points are being inserted
  * add logic to detect and replace null code points before insertion into db
  * lints
  * add license to test
  * change approach to null substitution
  * add type hint for SearchEntry
  * Add changelog entry
    Signed-off-by: H.Shay <shaysquared@gmail.com>
  * updated changelog
  * update changelog message
  * remove duplicate changelog
  * Update synapse/storage/databases/main/events.py
    remove extra space
    Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
  * rename and move test file, update tests, delete old test file
  * fix typo in comments
  * update _find_highlights_in_postgres to replace null byte with space
  * replace null byte in sqlite search insertion
  * beef up and reorganize test for this pr
  * update changelog
  * add type hints and update docstring
  * check db engine directly vs using env variable
  * refactor tests to be less repetitive
  * move replace logic into separate function
  * requested changes
  * Fix typo.
  * Update synapse/storage/databases/main/search.py
    Co-authored-by: reivilibre <olivier@librepush.net>
  * Update changelog.d/10820.misc
    Co-authored-by: Aaron Raimist <aaron@raim.ist>
  Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
  Co-authored-by: reivilibre <olivier@librepush.net>
  Co-authored-by: Aaron Raimist <aaron@raim.ist>
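  The substitution named in this commit's title can be sketched in a few lines. The function name `replace_null_code_points` is hypothetical and is not taken from the commit; only the behaviour (treating "\u0000" as "\u0020" before text reaches the search index) comes from the message above.

```python
def replace_null_code_points(value: str) -> str:
    """Return `value` with every NUL code point swapped for a space.

    Hypothetical helper name; it mirrors the substitution in the commit
    title, which treats U+0000 as U+0020 for indexing purposes.
    """
    return value.replace("\u0000", "\u0020")
```

  Applied to `"hello\u0000world"`, the sketch returns `"hello world"`; strings without a NUL code point come back unchanged.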
* Remove `synapse.types.Collection` (#9856) | Richard van der Hoff | 2021-04-22 | 1 | -2/+1
  This is no longer required, since we have dropped support for Python 3.5.
* Remove redundant "coding: utf-8" lines (#9786) | Jonathan de Jong | 2021-04-14 | 1 | -1/+0
  Part of #9744
  Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.
  `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
* Add type hints to various handlers. (#9223) | Patrick Cloke | 2021-01-26 | 1 | -1/+2
  With this change all handlers except the e2e_* ones have type hints enabled.
* Use execute_batch in more places (#9188) | Erik Johnston | 2021-01-21 | 1 | -2/+2
  * Use execute_batch in more places
  * Newsfile
* Simplify super() calls to Python 3 syntax. (#8344) | Patrick Cloke | 2020-09-18 | 1 | -2/+2
  This converts calls like super(Foo, self) -> super().
  Generated with: sed -i "" -Ee 's/super\([^\(]+\)/super()/g' **/*.py
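  To see what the sed expression in this commit matches, the same substitution can be tried with Python's `re` module. This is only an illustrative sketch: `simplify_super` is a hypothetical name, and Python's regex dialect differs slightly from sed's extended regular expressions (inside a bracket expression, sed treats the backslash as a literal character, while Python treats it as an escape).

```python
import re

# Python counterpart of the sed pattern from the commit message:
# "super(" followed by one or more characters that are not "(", then ")".
SUPER_CALL = re.compile(r"super\([^\(]+\)")


def simplify_super(source: str) -> str:
    """Rewrite Python 2 style super(Foo, self) calls to bare super()."""
    return SUPER_CALL.sub("super()", source)
```

  `simplify_super("super(Foo, self).__init__()")` gives `"super().__init__()"`, and a call that is already in the short form is left alone because the pattern requires at least one character between the parentheses.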
* Convert additional databases to async/await part 3 (#8201) | Patrick Cloke | 2020-09-01 | 1 | -6/+9
* Convert additional database stores to async/await (#8045) | Patrick Cloke | 2020-08-07 | 1 | -34/+35
* Rename database classes to make some sense (#8033) | Erik Johnston | 2020-08-05 | 1 | -0/+710