| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
| |
Currently we only track evictions due to size or time constraints.
|
|
|
| |
Should have been caught in #10826.
|
| |
|
|
|
|
|
|
|
|
|
| |
* update black version
* run updated version of black on code
* newsfragment
* enumerate python versions
|
|
|
| |
This adds some opentracing annotations to ResponseCache, to make it easier to see what's going on; in particular, it adds a link back to the initial trace which is actually doing the work of generating the response.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When all entries in an `LruCache` have a size of 0 according to the
provided `size_callback`, and `drop_from_cache` is called on a cache
node, the node would be unlinked from the LRU linked list but remain in
the cache dictionary. An assertion would be later be tripped due to the
inconsistency.
Avoid unintentionally calling `__len__` and use a strict `is None`
check instead when unwrapping the weak reference.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following modules now pass `disallow_untyped_defs`:
* synapse.util.caches.cached_call
* synapse.util.caches.lrucache
* synapse.util.caches.response_cache
* synapse.util.caches.stream_change_cache
* synapse.util.caches.ttlcache pass
* synapse.util.daemonize
* synapse.util.patch_inline_callbacks pass `no-untyped-defs`
* synapse.util.versionstring
Additional typing in synapse.util.metrics. Didn't get this to pass `no-untyped-defs`, think I'll need to watch #10847
|
|
|
|
| |
* Allow LruCaches to opt out of time-based expiry
* Don't expire `get_users_who_share_room` & friends
|
|
|
| |
So we can see distinguish between "evicting because the cache is too big" and "evicting because the cache entries haven't been recently used".
|
| |
|
| |
|
|
|
| |
Mostly this involves decorating a few Deferred declarations with extra type hints. We wrap the types in quotes to avoid runtime errors when running against older versions of Twisted that don't have generics on Deferred.
|
|
|
|
|
| |
tighten up some of the typing in CachedCall, which is going to be needed when
Twisted 21.7 brings better typing on Deferred.
|
|
|
|
|
|
|
|
|
| |
This PR is tantamount to running
```
pyupgrade --py36-plus --keep-percent-format `find synapse/ -type f -name "*.py"`
```
Part of #9744
|
| |
|
| |
|
|
|
|
|
| |
This is the first of two PRs which seek to address #8518. This first PR lays the groundwork by extending ResponseCache; a second PR (#10158) will update the SyncHandler to actually use it, and fix the bug.
The idea here is that we allow the callback given to ResponseCache.wrap to decide whether its result should be cached or not. We do that by (optionally) passing a ResponseCacheContext into it, which it can modify.
|
|
|
|
|
|
|
|
|
|
| |
* Make `invalidate` and `invalidate_many` do the same thing
... so that we can do either over the invalidation replication stream, and also
because they always confused me a bit.
* Kill off `invalidate_many`
* changelog
|
|
|
|
|
|
|
| |
`keylen` seems to be a thing that is frequently incorrectly set, and we don't really need it.
The only time it was used was to figure out if we had removed a subtree in `del_multi`, which we can do better by changing `TreeCache.pop` to return a different type (`TreeCacheNode`).
Commits should be independently reviewable.
|
|
|
|
|
|
| |
- use a tuple rather than a list for the iterable that is passed into the
wrapped function, for performance
- test that we can pass an iterable and that keys are correctly deduped.
|
|
|
|
|
| |
This will double count slightly in the presence of interned strings. It's off by default as it can consume a lot of resources.
|
| |
|
|
|
| |
I went through and removed a bunch of cruft that was lying around for compatibility with old Python versions. This PR also will now prevent Synapse from starting unless you're running Python 3.6+.
|
|
|
| |
This is no longer required, since we have dropped support for Python 3.5.
|
|
|
|
|
|
|
| |
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
|
|
|
|
|
|
|
| |
Part of #9366
Adds in fixes for B006 and B008, both relating to mutable parameter lint errors.
Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Running `dmypy run` will do a `mypy` check while spinning up a daemon
that makes rerunning `dmypy run` a lot faster.
`dmypy` doesn't support `follow_imports = silent` and has
`local_partial_types` enabled, so this PR enables those options and
fixes the issues that were newly raised. Note that `local_partial_types`
will be enabled by default in upcoming mypy releases.
|
| |
|
| |
|
|
|
|
|
|
|
| |
This reverts commit f5c93fc9931e4029bbd8000f398b6f39d67a8c46.
This is being backed out due to a regression (#9507) and additional
review feedback being provided.
|
|
|
|
|
|
|
| |
This fixes #8518 by adding a conditional check on `SyncResult` in a function when `prev_stream_token == current_stream_token`, as a sanity check. In `CachedResponse.set.<remove>()`, the result is immediately popped from the cache if the conditional function returns "false".
This prevents the caching of a timed-out `SyncResult` (that has `next_key` as the stream key that produced that `SyncResult`). The cache is prevented from returning a `SyncResult` that makes the client request the same stream key over and over again, effectively making it stuck in a loop of requesting and getting a response immediately for as long as the cache keeps those values.
Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
|
|
|
|
|
|
|
| |
- Update black version to the latest
- Run black auto formatting over the codebase
- Run autoformatting according to [`docs/code_style.md
`](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md)
- Update `code_style.md` docs around installing black to use the correct version
|
|
|
|
| |
Ensure that we lock correctly to prevent multiple concurrent metadata load
requests, and generally clean up the way we construct the metadata cache.
|
| |
|
|
|
| |
We don't always need the full power of a DeferredCache.
|
|
|
| |
don't bother constricting a CacheContext unless we need one.
|
| |
|
| |
|
| |
|
|
|
| |
We need to make sure we are readu for the `set_cache_factor` callback.
|
|
|
|
|
|
|
|
|
|
|
| |
* Add `DeferredCache.get_immediate` method
A bunch of things that are currently calling `DeferredCache.get` are only
really interested in the result if it's completed. We can optimise and simplify
this case.
* Remove unused 'default' parameter to DeferredCache.get()
* another get_immediate instance
|
|
|
| |
Most of these uses don't need a full-blown DeferredCache; LruCache is lighter and more appropriate.
|
| |
|
|
|
| |
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
|
| |
|
|
|
|
|
| |
rather than have everything that instantiates an LruCache manage metrics
separately, have LruCache do it itself.
|
|
|
| |
This seemed to entail dragging in a type stub for SortedList.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
slots use less memory (and attribute access is faster) while slightly
limiting the flexibility of the class attributes. This focuses on objects
which are instantiated "often" and for short periods of time.
|
| |
|
|
|
| |
This requires adding a mypy plugin to fiddle with the type signatures a bit.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
This is already correctly done when we instansiate the cache, but wasn't
when it got reloaded (which always happens at least once on startup).
|
| |
|
|
|
|
| |
variables (#6391)
|
|
|
|
|
| |
Currently we copy `users_who_share_room` needlessly about three times,
which is expensive when the set is large (which it can easily be).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
First some background: StreamChangeCache is used to keep track of what "entities" have
changed since a given stream ID. So for example, we might use it to keep track of when the last
to-device message for a given user was received [1], and hence whether we need to pull any to-device messages from the database on a sync [2].
Now, it turns out that StreamChangeCache didn't support more than one thing being changed at
a given stream_id (this was part of the problem with #7206). However, it's entirely valid to send
to-device messages to more than one user at a time.
As it turns out, this did in fact work, because *some* methods of StreamChangeCache coped
ok with having multiple things changing on the same stream ID, and it seems we never actually
use the methods which don't work on the stream change caches where we allow multiple
changes at the same stream ID. But that feels horribly fragile, hence: let's update
StreamChangeCache to properly support this, and add some typing and some more tests while
we're at it.
[1]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L301
[2]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L47-L51
|
|
|
|
|
|
| |
Other parts of the code (such as the StreamChangeCache) assume that there will
not be multiple changes with the same stream id.
This code was introduced in #7024, and I hope this fixes #7206.
|
|
|
|
|
|
|
|
| |
A lot of the things we log at INFO are now a bit superfluous, so lets
make them DEBUG logs to reduce the amount we log by default.
Co-Authored-By: Brendan Abolivier <babolivier@matrix.org>
Co-authored-by: Brendan Abolivier <github@brendanabolivier.com>
|
| |
|
| |
|
| |
|
|
|
| |
Replace every instance of `logger.warn` with `logger.warning` as the former is deprecated.
|
| |
|
| |
|
|
|
|
|
|
| |
* type checking fixes
* changelog
|
|
|
|
|
|
|
|
|
| |
This gives a bit of a grace period where we can attempt to refetch a
remote `well-known`, while still using the cached result if that fails.
Hopefully this will make the well-known resolution a bit more torelant
of failures, rather than it immediately treating failures as "no result"
and caching that for an hour.
|
|
|
|
|
|
|
| |
There was some inconsistent behaviour in the caching layer around how
exceptions were handled - particularly synchronously-thrown ones.
This seems to be most easily handled by pushing the creation of
ObservableDeferreds down from CacheDescriptor to the Cache.
|
|
|
|
|
|
| |
* Add a prometheus metric for active cache lookups.
* changelog
|
| |
|
| |
|
|
|
|
|
|
|
| |
Closes #4583
Does slightly less than #5045, which prevented a room from being upgraded multiple times, one after another. This PR still allows that, but just prevents two from happening at the same time.
Mostly just to mitigate the fact that servers are slow and it can take a moment for the room upgrade to actually complete. We don't want people sending another request to upgrade the room when really they just thought the first didn't go through.
|
| |
|
| |
|
|
|
|
| |
on py3) (#4068)
|
| |
|
| |
|
|\ |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
(This appears to have been a longstanding bug, but one which has been made more
of a problem by #3932 and #3933).
(This was originally done by Erik as part of #3933. I'm cherry-picking it
because really it's a fix in its own right)
|
| |
| |
| |
| |
| |
| | |
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
|
| |
| |
| |
| | |
Hopefully helps with #3931
|
|/ |
|
|
|
|
|
|
|
|
| |
ExpiringCache required that `start()` be called before it would actually
start expiring entries. A number of places didn't do that.
This PR removes `start` from ExpiringCache, and automatically starts
backround reaping process on creation instead.
|
| |
|
| |
|
|
|
|
|
| |
Because it was complicated and annoyed me. I suspect this will be more
efficient too.
|
|
|
|
|
|
|
|
|
| |
It turns out that looping_call does check the deferred returned by its
callback, and (at least in the case of client_ips), we were relying on this,
and I broke it in #3604.
Update run_as_background_process to return the deferred, and make sure we
return it to clock.looping_call.
|
|
|
|
|
|
|
|
| |
This fixes #3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.
(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
|
| |
|
|
|
|
|
|
|
|
| |
The get_entities_changed function was changed to return all changed
entities since the given stream position, rather than only those changed
from a given list of entities. This resulted in the function incorrectly
returning large numbers of entities that, for example, caused large
increases in database usage.
|
|
|
|
|
|
|
|
| |
The stream cache keeps track of all entities that have changed since
a particular stream position, so get_entities_changed does not need to
return unknown entites when given a larger stream position.
This makes it consistent with the behaviour of has_entity_changed.
|
|
|
|
|
|
|
|
|
|
|
| |
This line shows up as about 5% of cpu time on a synchrotron:
not_known_entities = set(entities) - set(self._entity_to_key)
Presumably the problem here is that _entity_to_key can be largeish, and
building a set for its keys every time this function is called is slow.
Here we rewrite the logic to avoid building so many sets.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
When _get_state_for_groups is given a wildcard filter, just do a complete
lookup. Hopefully this will give us the best of both worlds by not filling up
the ram if we only need one or two keys, but also making the cache still work
for the federation reader usecase.
|
| |
|
| |
|
|
|
|
| |
they're not meant to be lazy (#3307)
|
|\
| |
| | |
remaining isintance fixes
|
| |
| |
| |
| | |
Signed-off-by: Adrian Tschira <nota@notafile.com>
|
| | |
|
|\| |
|
| |
| |
| |
| | |
Signed-off-by: Adrian Tschira <nota@notafile.com>
|
| | |
|
| | |
|
|/ |
|
|\
| |
| | |
Refactor ResponseCache usage
|
| |
| |
| |
| |
| | |
Turns out that ObservableDeferred.observe doesn't return a deferred if the
result is already completed. Fix handling and improve documentation.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Adds a `.wrap` method to ResponseCache which wraps up the boilerplate of a
(get, set) pair, and then use it throughout the codebase.
This will be largely non-functional, but does include the following functional
changes:
* federation_server.on_context_state_request: drops use of _server_linearizer
which looked redundant and could cause incorrect cache misses by yielding
between the get and the set.
* RoomListHandler.get_remote_public_room_list(): fixes logcontext leaks
* the wrap function includes some logging. I'm hoping this won't be too noisy
on production.
|
|/
|
|
|
|
|
|
|
|
|
| |
This reverts commit 9fbe70a7dc3afabfdac176ba1f4be32dd44602aa.
It turns out that sortedcontainers.SortedDict is not an exact match for
blist.sorteddict; in particular, `popitem()` removes things from the opposite
end of the dict.
This is trivial to fix, but I want to add some unit tests, and potentially some
more thought about it, before we do so.
|
|\
| |
| | |
Add metrics for ResponseCache
|
| | |
|
|\ \
| | |
| | | |
Document the behaviour of ResponseCache
|
| | |
| | |
| | |
| | |
| | |
| | | |
it looks like everything that uses ResponseCache expects to have to
`make_deferred_yieldable` its results. It's debatable whether that is the best
approach, but let's document it for now to avoid further confusion.
|
| |/
|/|
| |
| |
| |
| |
| |
| | |
This commit drop-in replaces blist with SortedContainers. They are
written in pure python so work with pypy, but perform as good as
native implementations, at least in a couple benchmarks:
http://www.grantjenks.com/docs/sortedcontainers/performance.html
|
|/
|
|
|
| |
Fixes an issue where a cache invalidation would invalidate *all* pending
entries, rather than just the entry that we intended to invalidate.
|
| |
|
|
|
|
|
|
|
|
|
| |
The state cache bases its size on the sum of the size of entries. The
size of the entry is calculated once on insertion, so it is important
that the size of entries does not change.
The DictionaryCache modified the entries size, which caused the state
cache to incorrectly think it was smaller than it actually was.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
This is because pruning them was a significant performance drain on
matrix.org
|
| |
|
|
|
|
|
|
|
|
| |
Occaisonally has_any_entity_changed would throw the error: "Set changed
size during iteration" when taking the max of the `sorteddict`. While
its uncertain how that happens, its quite inefficient to iterate over
the entire dict anyway so we change to using the more traditional
`bisect_*` functions.
|
| |
|
| |
|
|
|
|
|
|
| |
We update the normal cache descriptors to handle caches with a single
argument specially so that the key wasn't a 1-tuple. We need to update
the cache list to be aware of this.
|
|
|
|
|
|
|
|
|
| |
Most of the time was spent copying a dict to filter out sentinel values
that indicated that keys did not exist in the dict. The sentinel values
were added to ensure that we cached the non-existence of keys.
By updating DictionaryCache to keep track of which keys were known to
not exist itself we can remove a dictionary copy.
|
|
|
|
|
| |
Otherwise the hit ration of plain get_events gets completely skewed by
calls to get_joined_users* functions.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the cache descriptors store deferreds rather than raw values,
this is a simple way of triggering only one database hit and sharing the
result if two callers attempt to get the same value.
However, there are a few caches that simply store a mapping from string
to string (or int). These caches can have a large number of entries,
under the assumption that each entry is small. However, the size of a
deferred (specifically the size of ObservableDeferred) is signigicantly
larger than that of the raw value, 2kb vs 32b.
This PR therefore changes the cache descriptors to store the raw values
rather than the deferreds.
As a side effect cached storage function now either return a deferred or
the actual value, as the cached list decriptor already does. This is
fine as we always end up just yield'ing on the returned value
eventually, which handles that case correctly.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
This is because getcallargs recomputes the getargspec, amongst other
things, which we don't need to do as its already been done
|
| |
|
|
|
|
|
|
|
| |
The cache wrappers had a habit of leaking the logcontext into the reactor while
the lookup function was running, and then not restoring it correctly when the
lookup function had completed. It's all the fault of
`preserve_context_over_{fn,deferred}` which are basically a bit broken.
|
|
|
|
|
|
|
|
|
| |
The `@cached` decorator on `KeyStore._get_server_verify_key` was missing
its `num_args` parameter, which meant that it was returning the wrong key for
any server which had more than one recorded key.
By way of a fix, change the default for `num_args` to be *all* arguments. To
implement that, factor out a common base class for `CacheDescriptor` and `CacheListDescriptor`.
|
|
|
|
|
|
|
|
| |
... and update some docstrings to correctly reflect the types being used.
get_new_device_msgs_for_remote can return a long under some circumstances,
which was being stored in last_device_list_stream_id_by_dest, and was then
upsetting things on the next loop.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of calculating the size of the cache repeatedly, which can take
a long time now that it can use a callback, instead cache the size and
update that on insertion and deletion.
This requires changing the cache descriptors to have two caches, one for
pending deferreds and the other for the actual values. There's no reason
to evict from the pending deferreds as they won't take up any more
memory.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
effective
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|\ |
|
| | |
|
| | |
|
| | |
|
|/
|
|
|
|
| |
We change it so that each cache has an individual CacheMetric, instead
of having one global CacheMetric. This means that when a cache tries to
increment a counter it does not need to go through so many indirections.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
then entity hasn't changed
|
| |
|
|\
| |
| | |
Make /sync "better".
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
|/ |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
erikj/dictionary_cache
|
| |
|
|
|