summary refs log tree commit diff
path: root/synapse/util/caches/stream_change_cache.py (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Speed up SS room sorting (#17468)Erik Johnston2024-07-231-4/+8
| | | | | | | We do this by bulk fetching the latest stream ordering. --------- Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Add optimisation to `StreamChangeCache` (#17130)Erik Johnston2024-05-061-1/+19
| | | | | | | When there have been lots of changes compared with the number of entities, we can do a fast(er) path. Locally I ran some benchmarking, and the comparison seems to give the best determination of which method we use.
* Fix bug where `StreamChangeCache` would not respect cache factors (#17152)Erik Johnston2024-05-031-1/+1
| | | Annoyingly mypy didn't pick up this typo.
* Correctly mention previous copyright (#16820)Erik Johnston2024-01-231-0/+1
| | | | | During the migration the automated script to update the copyright headers accidentally got rid of some of the existing copyright lines. Reinstate them.
* Update license headersPatrick Cloke2023-11-211-10/+16
|
* Check the stream position before checking if the cache is empty. (#14639)Patrick Cloke2022-12-081-4/+5
| | | | | | An empty cache does not mean the entity has no changed, if it is earlier than the earliest known stream position return that the entity *has* changed since the cache cannot accurately answer that query.
* Better return type for `get_all_entities_changed` (#14604)Erik Johnston2022-12-051-15/+37
| | | | Help callers from using the return value incorrectly by ensuring that callers explicitly check if there was a cache hit or not.
* Compare to the earliest known stream pos in the stream change cache. (#14435)Patrick Cloke2022-12-051-26/+116
| | | | | | The internal methods of the StreamChangeCache were inconsistently treating the earliest known stream position as valid. It is now treated as invalid, meaning the cache cannot determine if an entity at the earliest known stream position has changed or not.
* Remove duplicated code to evict entries. (#14410)Patrick Cloke2022-11-101-9/+2
| | | | | | | | This code was factored out to a method, but also left in-place. Calling this twice in a row makes no sense: the first call will reduce the size appropriately, but the loop will immediately exit since the cache size was already reduced.
* More types for synapse.util, part 1 (#10888)David Robertson2021-10-061-3/+3
| | | | | | | | | | | | | | The following modules now pass `disallow_untyped_defs`: * synapse.util.caches.cached_call * synapse.util.caches.lrucache * synapse.util.caches.response_cache * synapse.util.caches.stream_change_cache * synapse.util.caches.ttlcache pass * synapse.util.daemonize * synapse.util.patch_inline_callbacks pass `no-untyped-defs` * synapse.util.versionstring Additional typing in synapse.util.metrics. Didn't get this to pass `no-untyped-defs`, think I'll need to watch #10847
* Add types to synapse.util. (#10601)reivilibre2021-09-101-1/+1
|
* Use inline type hints in `http/federation/`, `storage/` and `util/` (#10381)Jonathan de Jong2021-07-151-3/+3
|
* Remove `synapse.types.Collection` (#9856)Richard van der Hoff2021-04-221-2/+1
| | | This is no longer required, since we have dropped support for Python 3.5.
* Remove redundant "coding: utf-8" lines (#9786)Jonathan de Jong2021-04-141-1/+0
| | | | | | | Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
* Update black, and run auto formatting over the codebase (#9381)Eric Eastwood2021-02-161-4/+2
| | | | | | | - Update black version to the latest - Run black auto formatting over the codebase - Run autoformatting according to [`docs/code_style.md `](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md) - Update `code_style.md` docs around installing black to use the correct version
* Replace all remaining six usage with native Python 3 equivalents (#7704)Dagfinn Ilmari Mannsåker2020-06-161-3/+1
|
* Allow configuration of Synapse's cache without using synctl or environment ↵Amber Brown2020-05-111-2/+31
| | | | variables (#6391)
* Speed up fetching device lists changes in sync.Erik Johnston2020-05-051-4/+15
| | | | | Currently we copy `users_who_share_room` needlessly about three times, which is expensive when the set is large (which it can easily be).
* Extend StreamChangeCache to support multiple entities per stream ID (#7303)Richard van der Hoff2020-04-221-46/+71
| | | | | | | | | | | | | | | | | | | First some background: StreamChangeCache is used to keep track of what "entities" have changed since a given stream ID. So for example, we might use it to keep track of when the last to-device message for a given user was received [1], and hence whether we need to pull any to-device messages from the database on a sync [2]. Now, it turns out that StreamChangeCache didn't support more than one thing being changed at a given stream_id (this was part of the problem with #7206). However, it's entirely valid to send to-device messages to more than one user at a time. As it turns out, this did in fact work, because *some* methods of StreamChangeCache coped ok with having multiple things changing on the same stream ID, and it seems we never actually use the methods which don't work on the stream change caches where we allow multiple changes at the same stream ID. But that feels horribly fragile, hence: let's update StreamChangeCache to properly support this, and add some typing and some more tests while we're at it. [1]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L301 [2]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L47-L51
* On catchup, process each row with its own stream id (#7286)Richard van der Hoff2020-04-201-0/+3
| | | | | | Other parts of the code (such as the StreamChangeCache) assume that there will not be multiple changes with the same stream id. This code was introduced in #7024, and I hope this fixes #7206.
* Run Black. (#5482)Amber Brown2019-06-201-6/+7
|
* Make scripts/ and scripts-dev/ pass pyflakes (and the rest of the codebase ↵Amber Brown2018-10-201-1/+3
| | | | on py3) (#4068)
* Use efficient .intersectionErik Johnston2018-07-171-4/+1
|
* Fix perf regression in PR #3530Erik Johnston2018-07-171-1/+6
| | | | | | | | The get_entities_changed function was changed to return all changed entities since the given stream position, rather than only those changed from a given list of entities. This resulted in the function incorrectly returning large numbers of entities that, for example, caused large increases in database usage.
* Don't return unknown entities in get_entities_changedErik Johnston2018-07-131-8/+1
| | | | | | | | The stream cache keeps track of all entities that have changed since a particular stream position, so get_entities_changed does not need to return unknown entites when given a larger stream position. This makes it consistent with the behaviour of has_entity_changed.
* Reduce set building in get_entities_changedRichard van der Hoff2018-07-121-8/+12
| | | | | | | | | | | This line shows up as about 5% of cpu time on a synchrotron: not_known_entities = set(entities) - set(self._entity_to_key) Presumably the problem here is that _entity_to_key can be largeish, and building a set for its keys every time this function is called is slow. Here we rewrite the logic to avoid building so many sets.
* run isortAmber Brown2018-07-091-3/+2
|
* Revert "Revert "Try to not use as much CPU in the StreamChangeCache"" (#3454)Amber Brown2018-06-281-2/+4
|
* Revert "Try to not use as much CPU in the StreamChangeCache"Matthew Hodgson2018-06-261-4/+2
|
* fixesAmber Brown2018-06-261-2/+2
|
* fixesAmber Brown2018-06-261-2/+2
|
* try and make loading items from the cache fasterAmber Brown2018-06-261-2/+4
|
* Port to sortedcontainers (with tests!) (#3332)Amber Brown2018-06-061-26/+31
|
* replacing portionsAmber Brown2018-05-211-1/+1
|
* Revert "Use sortedcontainers instead of blist"Richard van der Hoff2018-04-131-2/+2
| | | | | | | | | | | This reverts commit 9fbe70a7dc3afabfdac176ba1f4be32dd44602aa. It turns out that sortedcontainers.SortedDict is not an exact match for blist.sorteddict; in particular, `popitem()` removes things from the opposite end of the dict. This is trivial to fix, but I want to add some unit tests, and potentially some more thought about it, before we do so.
* Use sortedcontainers instead of blistVincent Breitmoser2018-04-101-2/+2
| | | | | | | | This commit drop-in replaces blist with SortedContainers. They are written in pure python so work with pypy, but perform as good as native implementations, at least in a couple benchmarks: http://www.grantjenks.com/docs/sortedcontainers/performance.html
* Define CACHE_SIZE_FACTOR onceErik Johnston2017-07-041-5/+1
|
* Rewrite conditionalErik Johnston2017-06-091-1/+1
|
* Fix has_any_entity_changedErik Johnston2017-06-091-4/+4
| | | | | | | | Occaisonally has_any_entity_changed would throw the error: "Set changed size during iteration" when taking the max of the `sorteddict`. While its uncertain how that happens, its quite inefficient to iterate over the entire dict anyway so we change to using the more traditional `bisect_*` functions.
* Add stream change cacheErik Johnston2017-05-311-0/+15
|
* Fix assertion to stop transaction queue getting wedgedRichard van der Hoff2017-03-151-1/+1
| | | | | | | | ... and update some docstrings to correctly reflect the types being used. get_new_device_msgs_for_remote can return a long under some circumstances, which was being stored in last_device_list_stream_id_by_dest, and was then upsetting things on the next loop.
* Change get_pos_of_last_change to return upper boundErik Johnston2016-09-151-3/+4
|
* Use stream_change cache to make get_forward_extremeties_for_room cache more ↵Erik Johnston2016-09-151-0/+5
| | | | effective
* Change CacheMetrics to be quickerErik Johnston2016-06-031-8/+8
| | | | | | We change it so that each cache has an individual CacheMetric, instead of having one global CacheMetric. This means that when a cache tries to increment a counter it does not need to go through so many indirections.
* Return list, not generator.Erik Johnston2016-03-141-3/+1
|
* Limit stream change cache size tooErik Johnston2016-03-011-1/+5
|
* Change the way we figure out presence updates for small deltasErik Johnston2016-02-231-0/+16
|
* If stream pos is greater then earliest known key and entity hasn't changed, ↵Erik Johnston2016-01-291-8/+3
| | | | then entity hasn't changed
* Prefill stream change cachesErik Johnston2016-01-291-1/+4
|
* If the same as the earliest key, assume nothing has changed.Erik Johnston2016-01-281-0/+5
|
* Correctly update _entity_to_keyErik Johnston2016-01-281-4/+5
|
* Fix inequalitiesErik Johnston2016-01-281-2/+2
|
* Include cache hits with has_entity_changedErik Johnston2016-01-281-0/+4
|
* Change name and doc has_entity_changedErik Johnston2016-01-281-1/+3
|
* Cache tags and account dataErik Johnston2016-01-281-0/+95