| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
This will double count slightly in the presence of interned strings. It's off by default as it can consume a lot of resources.
|
| |
|
|
|
| |
I went through and removed a bunch of cruft that was lying around for compatibility with old Python versions. This PR also will now prevent Synapse from starting unless you're running Python 3.6+.
|
| |
|
|
|
| |
This is no longer required, since we have dropped support for Python 3.5.
|
|\ |
|
| |
| |
| |
| |
| | |
As far as I can tell our logging contexts are meant to log the request ID, or sometimes the request ID followed by a suffix (this is generally stored in the name field of LoggingContext). There's also code to log the name@memory location, but I'm not sure this is ever used.
This simplifies the code paths to require every logging context to have a name and use that in logging. For sub-contexts (created via nested_logging_contexts, defer_to_threadpool, Measure) we use the current context's str (which becomes their name or the string "sentinel") and then potentially modify that (e.g. add a suffix).
|
| |
| |
| |
| | |
Signed-off-by: Denis Kasak <dkasak@termina.org.uk>
|
|/
|
|
|
|
|
| |
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
|
|
|
|
|
|
|
| |
Part of #9366
Adds in fixes for B006 and B008, both relating to mutable parameter lint errors.
Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Running `dmypy run` will do a `mypy` check while spinning up a daemon
that makes rerunning `dmypy run` a lot faster.
`dmypy` doesn't support `follow_imports = silent` and has
`local_partial_types` enabled, so this PR enables those options and
fixes the issues that were newly raised. Note that `local_partial_types`
will be enabled by default in upcoming mypy releases.
|
| |
|
|
|
|
|
| |
- Merge 'isinstance' calls.
- Remove unnecessary dict call outside of comprehension.
- Use 'sys.exit()' calls.
|
| |
|
| |
|
|
|
| |
This great big stack of commits is a a whole load of hoop-jumping to make it easier to store additional values in login tokens, and then to actually store the SSO Identity Provider in the login token. (Making use of that data will follow in a subsequent PR.)
|
|
|
|
|
|
|
| |
This reverts commit f5c93fc9931e4029bbd8000f398b6f39d67a8c46.
This is being backed out due to a regression (#9507) and additional
review feedback being provided.
|
|
|
|
|
|
|
| |
This fixes #8518 by adding a conditional check on `SyncResult` in a function when `prev_stream_token == current_stream_token`, as a sanity check. In `CachedResponse.set.<remove>()`, the result is immediately popped from the cache if the conditional function returns "false".
This prevents the caching of a timed-out `SyncResult` (that has `next_key` as the stream key that produced that `SyncResult`). The cache is prevented from returning a `SyncResult` that makes the client request the same stream key over and over again, effectively making it stuck in a loop of requesting and getting a response immediately for as long as the cache keeps those values.
Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
|
|
|
|
|
|
|
| |
- Update black version to the latest
- Run black auto formatting over the codebase
- Run autoformatting according to [`docs/code_style.md
`](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md)
- Update `code_style.md` docs around installing black to use the correct version
|
|
|
|
| |
Ensure that we lock correctly to prevent multiple concurrent metadata load
requests, and generally clean up the way we construct the metadata cache.
|
| |
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Synapse 1.27.0rc2 (2021-02-11)
==============================
Features
--------
- Further improvements to the user experience of registration via single sign-on. ([\#9297](https://github.com/matrix-org/synapse/issues/9297))
Bugfixes
--------
- Fix ratelimiting introduced in v1.27.0rc1 for invites to respect the `ratelimit` flag on application services. ([\#9302](https://github.com/matrix-org/synapse/issues/9302))
- Do not automatically calculate `public_baseurl` since it can be wrong in some situations. Reverts behaviour introduced in v1.26.0. ([\#9313](https://github.com/matrix-org/synapse/issues/9313))
Improved Documentation
----------------------
- Clarify the sample configuration for changes made to the template loading code. ([\#9310](https://github.com/matrix-org/synapse/issues/9310))
|
| |
| |
| |
| | |
This breaks some people's configurations (if their Client-Server API
is not accessed via port 443).
|
|/
|
|
|
| |
* Adds type hints to the groups servlet and stringutils code.
* Assert the maximum length of some input values for spec compliance.
|
|\ |
|
| |
| |
| |
| |
| | |
There's some prelimiary work here to pull out the construction of a jinja environment to a separate function.
I wanted to load the template at display time rather than load time, so that it's easy to update on the fly. Honestly, I think we should do this with all our templates: the risk of ending up with malformed templates is far outweighed by the improved turnaround time for an admin trying to update them.
|
|/
|
|
|
|
|
|
|
| |
the homeserver config (#9229)
If a Synapse module's config block were empty in YAML, thus being translated to a `Nonetype` in Python, then some modules could fail as that None ends up getting passed to their `parse_config` method. Modules are expected to accept a `dict` instead.
This PR ensures that if the user does end up specifying an empty config block (such as what [the default oidc config in the sample config](https://github.com/matrix-org/synapse/blob/5310808d3bebd17275355ecd474bc013e8c7462d/docs/sample_config.yaml#L1816-L1845) states) then `None` is not passed to the module. An empty dict is passed instead.
This code assumes that no existing modules are relying on receiving a `None` config block, but I'd really hope that they aren't.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
`distutils` is pretty much deprecated these days, and replaced with
`setuptools`. It's also annoying because it's you can't `pip install` it, and
it's hard to figure out which debian package we should depend on to make sure
it's there.
Since we only use it for a tiny function anyway, let's just vendor said
function into our codebase.
|
|
|
|
| |
We passed in a graph to `sorted_topologically` which didn't have an
entry for each node (as we dropped nodes with no edges).
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Spam checker modules can now provide async methods. This is implemented
in a backwards-compatible manner.
|
|
|
|
|
|
|
|
|
|
| |
The idea is that the parse_config method of extension modules can raise either a ConfigError or a JsonValidationError,
and it will be magically turned into a legible error message. There's a few components to it:
* Separating the "path" and the "message" parts of a ConfigError, so that we can fiddle with the path bit to turn it
into an absolute path.
* Generally improving the way ConfigErrors get printed.
* Passing in the config path to load_module so that it can wrap any exceptions that get caught appropriately.
|
|
|
| |
We don't always need the full power of a DeferredCache.
|
|\
| |
| | |
Fix serialisation errors when using third-party event rules.
|
| |
| |
| |
| |
| |
| | |
Not being able to serialise `frozendicts` is fragile, and it's annoying to have
to think about which serialiser you want. There's no real downside to
supporting frozendicts, so let's just have one json encoder.
|
|/
|
|
|
|
|
|
| |
This allows trailing commas in multi-line arg lists.
Minor, but we might as well keep our formatting current with regard to
our minimum supported Python version.
Signed-off-by: Dan Callahan <danc@element.io>
|
|
|
| |
don't bother constricting a CacheContext unless we need one.
|
| |
|
| |
|
| |
|
|
|
| |
We need to make sure we are readu for the `set_cache_factor` callback.
|
|
|
|
|
|
|
|
|
|
|
| |
* Add `DeferredCache.get_immediate` method
A bunch of things that are currently calling `DeferredCache.get` are only
really interested in the result if it's completed. We can optimise and simplify
this case.
* Remove unused 'default' parameter to DeferredCache.get()
* another get_immediate instance
|
|
|
| |
Most of these uses don't need a full-blown DeferredCache; LruCache is lighter and more appropriate.
|
| |
|
|
|
| |
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
|
| |
|
|
|
|
|
| |
rather than have everything that instantiates an LruCache manage metrics
separately, have LruCache do it itself.
|
|
|
| |
This seemed to entail dragging in a type stub for SortedList.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This turns:
Failed to parse config for 'myplugin': Exception('error message')
into:
Failed to parse config for 'myplugin': error message.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Remove `on_timeout_cancel` from `timeout_deferred`
The `on_timeout_cancel` param to `timeout_deferred` wasn't always called on a
timeout (in particular if the canceller raised an exception), so it was
unreliable. It was also only used in one place, and to be honest it's easier to
do what it does a different way.
* Fix handling of connection timeouts in outgoing http requests
Turns out that if we get a timeout during connection, then a different
exception is raised, which wasn't always handled correctly.
To fix it, catch the exception in SimpleHttpClient and turn it into a
RequestTimedOutError (which is already a documented exception).
Also add a description to RequestTimedOutError so that we can see which stage
it failed at.
* Fix incorrect handling of timeouts reading federation responses
This was trapping the wrong sort of TimeoutError, so was never being hit.
The effect was relatively minor, but we should fix this so that it does the
expected thing.
* Fix inconsistent handling of `timeout` param between methods
`get_json`, `put_json` and `delete_json` were applying a different timeout to
the response body to `post_json`; bring them in line and test.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Erik Johnston <erik@matrix.org>
|
|
|
|
|
|
|
| |
This converts calls like super(Foo, self) -> super().
Generated with:
sed -i "" -Ee 's/super\([^\(]+\)/super()/g' **/*.py
|
| |
|
|
|
|
|
| |
slots use less memory (and attribute access is faster) while slightly
limiting the flexibility of the class attributes. This focuses on objects
which are instantiated "often" and for short periods of time.
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Synapse 1.20.0rc3 (2020-09-11)
==============================
Bugfixes
--------
- Fix a bug introduced in v1.20.0rc1 where the wrong exception was raised when invalid JSON data is encountered. ([\#8291](https://github.com/matrix-org/synapse/issues/8291))
|
| | |
|
| |
| |
| |
| |
| | |
Removes the `user_joined_room` and stops calling it since there are no observers.
Also cleans-up some other unused signals and related code.
|
| | |
|
|/
|
|
|
| |
By importing from canonicaljson the simplejson module was still being used
in some situations. After this change the std lib json is consistenty used
throughout Synapse.
|
| |
|
|
|
| |
This requires adding a mypy plugin to fiddle with the type signatures a bit.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
Closes: https://github.com/matrix-org/synapse/issues/6766
Equivalent Sydent PR: https://github.com/matrix-org/sydent/pull/309
I believe it's now time to remove the extra allowed `:` from `client_secret` parameters.
|
| |
|
| |
|
| |
|
| |
|
|
|
| |
This solves the problem that the first few lines are logged twice on matrix.org. Hopefully the comments explain it.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has long been something I've wanted to do. Basically the `Daemonize` code
is both too flexible and not flexible enough, in that it offers a bunch of
features that we don't use (changing UID, closing FDs in the child, logging to
syslog) and doesn't offer a bunch that we could do with (redirecting stdout/err
to a file instead of /dev/null; having the parent not exit until the child is
running).
As a first step, I've lifted the Daemonize code and removed the bits we don't
use. This should be a non-functional change. Fixing everything else will come
later.
|
| |
|
| |
|
| |
|
|
|
| |
fixes #7016
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
While working on https://github.com/matrix-org/synapse/issues/5665 I found myself digging into the `Ratelimiter` class and seeing that it was both:
* Rather undocumented, and
* causing a *lot* of config checks
This PR attempts to refactor and comment the `Ratelimiter` class, as well as encourage config file accesses to only be done at instantiation.
Best to be reviewed commit-by-commit.
|
|
|
|
|
|
| |
Instead of storing and sending an ACK for every single row we send
synchronously, we instead do it asynchronously while batching up
updates.
|
|
|
|
| |
This is already correctly done when we instansiate the cache, but wasn't
when it got reloaded (which always happens at least once on startup).
|
|
|
| |
`Failure()` is more cunning than `Failure(e)`.
|
| |
|
|
|
|
| |
this is a no-op on python 3.
|
|
|
|
| |
this is a no-op on python 3.
|
| |
|
|
|
|
| |
variables (#6391)
|
|
|
|
|
| |
Currently we copy `users_who_share_room` needlessly about three times,
which is expensive when the set is large (which it can easily be).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
First some background: StreamChangeCache is used to keep track of what "entities" have
changed since a given stream ID. So for example, we might use it to keep track of when the last
to-device message for a given user was received [1], and hence whether we need to pull any to-device messages from the database on a sync [2].
Now, it turns out that StreamChangeCache didn't support more than one thing being changed at
a given stream_id (this was part of the problem with #7206). However, it's entirely valid to send
to-device messages to more than one user at a time.
As it turns out, this did in fact work, because *some* methods of StreamChangeCache coped
ok with having multiple things changing on the same stream ID, and it seems we never actually
use the methods which don't work on the stream change caches where we allow multiple
changes at the same stream ID. But that feels horribly fragile, hence: let's update
StreamChangeCache to properly support this, and add some typing and some more tests while
we're at it.
[1]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L301
[2]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L47-L51
|
|
|
|
|
|
| |
Other parts of the code (such as the StreamChangeCache) assume that there will
not be multiple changes with the same stream id.
This code was introduced in #7024, and I hope this fixes #7206.
|
|
|
|
| |
make sure we clear out all but one update for the user
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Pull Sentinel out of LoggingContext
... and drop a few unnecessary references to it
* Factor out LoggingContext.current_context
move `current_context` and `set_context` out to top-level functions.
Mostly this means that I can more easily trace what's actually referring to
LoggingContext, but I think it's generally neater.
* move copy-to-parent into `stop`
this really just makes `start` and `stop` more symetric. It also means that it
behaves correctly if you manually `set_log_context` rather than using the
context manager.
* Replace `LoggingContext.alive` with `finished`
Turn `alive` into `finished` and make it a bit better defined.
|
|
|
|
| |
Ensure good comprehension hygiene using flake8-comprehensions.
|
|
|
|
|
|
|
|
| |
A lot of the things we log at INFO are now a bit superfluous, so lets
make them DEBUG logs to reduce the amount we log by default.
Co-Authored-By: Brendan Abolivier <babolivier@matrix.org>
Co-authored-by: Brendan Abolivier <github@brendanabolivier.com>
|
| |
|
| |
|
|
|
|
|
|
| |
... since the whole response is huge.
We even need to break up the assertions, since kibana otherwise truncates them.
|
| |
|
|
|
|
|
| |
Some modules don't need any config, so having to define a `config` property
just to keep the loader happy is a bit annoying.
|
|
|
| |
The main point here is to make sure that the state returned by _get_state_in_room has been authed before we try to use it as state in the room.
|
| |
|
| |
|
|
|
|
|
| |
`Measure` incorrectly assumed that it was the only thing being done by the parent `LoggingContext`. For instance, during a "renew group attestations" operation, hundreds of `outbound_request` calls could take place in parallel, all using the same `LoggingContext`. This would mean that any resources used during *any* of those calls would be reported against *all* of them, producing wildly inaccurate results.
Instead, we now give each `Measure` block its own `LoggingContext` (using the parent `LoggingContext` mechanism to ensure that the log lines look correct and that the metrics are ultimately propogated to the top level for reporting against requests/backgrond tasks).
|
| |
|
| |
|
| |
|
|
|
| |
Replace every instance of `logger.warn` with `logger.warning` as the former is deprecated.
|
| |
|
|
|
|
|
|
|
| |
This makes it easier to use in an async/await world.
Also fixes a bug where cache descriptors would occaisonally return a raw
value rather than a deferred.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
| |
Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
* type checking fixes
* changelog
|
|
|
|
|
|
|
|
|
| |
We have set the max retry interval to a value larger than a postgres or
sqlite int can hold, which caused exceptions when updating the
destinations table.
To fix postgres we need to change the column to a bigint, and for sqlite
we lower the max interval to 2**62 (which is still incredibly long).
|
|\ |
|
| |
| |
| |
| | |
Track the time that a server started failing at, for general analysis purposes.
|
| |
| |
| |
| |
| |
| | |
Essentially the intention here is to end up blacklisting servers which never
respond to federation requests.
Fixes https://github.com/matrix-org/synapse/issues/5113.
|
| |
| |
| |
| | |
This was intended to introduce an element of jitter; instead it gave you a
30/60 chance of resetting to zero.
|
| |
| |
| |
| |
| |
| |
| | |
This is a redo of https://github.com/matrix-org/synapse/pull/5897 but with `id_access_token` accepted.
Implements [MSC2134](https://github.com/matrix-org/matrix-doc/pull/2134) plus Identity Service v2 authentication ala [MSC2140](https://github.com/matrix-org/matrix-doc/pull/2140).
Identity lookup-related functions were also moved from `RoomMemberHandler` to `IdentityHandler`.
|
| |
| |
| |
| | |
* remove some unused code
* make things which were constants into constants for efficiency and clarity
|
| |
| |
| |
| |
| | |
This reverts commit 71fc04069a5770a204c3514e0237d7374df257a8.
This broke 3PID invites as #5892 was required for it to work correctly.
|
| |
| |
| |
| |
| |
| |
| | |
Fixes https://github.com/matrix-org/synapse/issues/5861
Adds support for the v2 lookup API as defined in [MSC2134](https://github.com/matrix-org/matrix-doc/pull/2134). Currently this is only used for 3PID invites.
Sytest PR: https://github.com/matrix-org/sytest/pull/679
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This gives a bit of a grace period where we can attempt to refetch a
remote `well-known`, while still using the cached result if that fails.
Hopefully this will make the well-known resolution a bit more torelant
of failures, rather than it immediately treating failures as "no result"
and caching that for an hour.
|
|/
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a bug where the default attribute maps were prioritised over
user-specified ones, resulting in incorrect mappings.
The problem is that if you call SPConfig.load() multiple times, it adds new
attribute mappers to a list. So by calling it with the default config first,
and then the user-specified config, we would always get the default mappers
before the user-specified mappers.
To solve this, let's merge the config dicts first, and then pass them to
SPConfig.
|
| |
|
| |
|
|
|
|
|
|
|
| |
There was some inconsistent behaviour in the caching layer around how
exceptions were handled - particularly synchronously-thrown ones.
This seems to be most easily handled by pushing the creation of
ObservableDeferreds down from CacheDescriptor to the Cache.
|
|
|
|
|
|
| |
* Add a prometheus metric for active cache lookups.
* changelog
|
| |
|
|
|
|
|
|
|
|
|
| |
The version of a module isn't going to change over the lifetime of the
process (assuming no funky hot reloading is going on, which it isn't),
so let's just cache the result to avoid spawning lots of git
subprocesses.
Fixes #5672.
|
|
|
|
|
|
|
| |
- Put the default window_size back to 1000ms (broken by #5181)
- Make the `rc_federation` config actually do something
- fix an off-by-one error in the 'concurrent' limit
- Avoid creating an unused `_PerHostRatelimiter` object for every single
incoming request
|
|
|
|
|
|
|
|
| |
(#5617)
* Improve the backwards compatibility re-exports of synapse.logging.context.
* reexport logformatter too
|
| |
|
|
|
|
|
|
|
|
| |
* Fix 'utime went backwards' errors on daemonization.
Fixes #5608
* remove spurious debug
|
|
|
|
| |
Fixes #5602, #5603
|
| |
|
|
|
|
|
|
|
| |
Closes #4583
Does slightly less than #5045, which prevented a room from being upgraded multiple times, one after another. This PR still allows that, but just prevents two from happening at the same time.
Mostly just to mitigate the fact that servers are slow and it can take a moment for the room upgrade to actually complete. We don't want people sending another request to upgrade the room when really they just thought the first didn't go through.
|
|
|
|
|
| |
Sentry will catch the errors if they happen, so that should be good enough, and
woun't make things explode if we hit the error condition.
|
|\ |
|
| | |
|
|/
|
|
| |
Check that our clocks go forward.
|
|
|
| |
Fixes a regression introduced in #5335.
|
| |
|
| |
|
| |
|
|\
| |
| | |
Allow client event serialization to be async
|
| |
| |
| | |
Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
|
| | |
|
|/ |
|
|\ |
|
| | |
|
| | |
|
|/
|
| |
Avoid sending syntax errors from the manhole to sentry.
|
| |
|
|\
| |
| | |
Implement workaround for login error.
|
| |
| |
| |
| | |
Signed-off-by: Robert Jacob <xperimental@solidproject.de>
|
|/ |
|
| |
|
|
|
|
| |
This is a bit of a half-assed effort at fixing https://github.com/matrix-org/synapse/issues/4252. Fundamentally the right answer is to drop support for Python 2.
|
|\
| |
| |
| | |
erikj/alias_disallow_list
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Wrap calls to deferToThread() in a thing which uses a child logcontext to
attribute CPU usage to the right request.
While we're in the area, remove the logcontext_tracer stuff, which is never
used, and afaik doesn't work.
Fixes #4064
|
| |
| |
| |
| | |
on py3) (#4068)
|
| | |
|
| | |
|
| | |
|
|/ |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
If a looping call function errors, then it kills the loop entirely.
Currently it throws away the exception logs, so we should make it
actually log them.
Fixes #3929
|
| |
|
| |
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
transactions (#3959)
when processing incoming transactions, it can be hard to see what's going on,
because we process a bunch of stuff in parallel, and because we may end up
recursively working our way through a chain of three or four events.
This commit creates a way to use logcontexts to add the relevant event ids to
the log lines.
|
|\| |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
(This appears to have been a longstanding bug, but one which has been made more
of a problem by #3932 and #3933).
(This was originally done by Erik as part of #3933. I'm cherry-picking it
because really it's a fix in its own right)
|
| |
| |
| |
| |
| |
| | |
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
|
| |
| |
| |
| | |
Hopefully helps with #3931
|
|/ |
|
|
|
|
|
|
|
|
| |
ExpiringCache required that `start()` be called before it would actually
start expiring entries. A number of places didn't do that.
This PR removes `start` from ExpiringCache, and automatically starts
backround reaping process on creation instead.
|
|
|
|
|
|
|
|
|
|
| |
Let's try to rationalise the logging that happens when we are processing an
incoming transaction, to make it easier to figure out what is going wrong when
they take ages. In particular:
- make everything start with a [room_id event_id] prefix
- make sure we log a warning when catching exceptions rather than just turning
them into other, more cryptic, exceptions.
|
| |
|
| |
|
|
|
|
|
|
|
| |
The existing deferred timeout helper function (and the one into twisted)
suffer from a bug when a deferred's canceller throws an exception, #3842.
The new helper function doesn't suffer from this problem.
|
|
|
|
|
| |
Turns out deferred.cancel sometimes throws, so we do that last to ensure
that we always do resolve the new deferred.
|
|
|
|
| |
This is an attempt to mitigate #3842 by adding yet-another-timeout
|
| |
|
|
|
|
|
| |
Newer versions of openssh client refuse to connect to the old key due to
its length.
|
|
|
|
|
| |
This fixes bugs introduced in #3700, by making sure that we behave sanely
when an incoming connection is closed before the headers are read.
|
|
|
|
|
| |
Make the logcontext filter not explode if it somehow ends up with a logcontext
of None, since that infinite-loops the whole logging system.
|
| |
|
|\ |
|
| |
| |
| |
| |
| |
| | |
Turns out that cancellation of inlineDeferreds didn't really work properly
until Twisted 18.7. This commit refactors Linearizer.queue to avoid
inlineCallbacks.
|
|/ |
|
| |
|
| |
|
|
|
|
|
| |
Because it was complicated and annoyed me. I suspect this will be more
efficient too.
|
|
|
|
|
|
|
|
|
| |
It turns out that looping_call does check the deferred returned by its
callback, and (at least in the case of client_ips), we were relying on this,
and I broke it in #3604.
Update run_as_background_process to return the deferred, and make sure we
return it to clock.looping_call.
|
| |
|
|
|
|
|
| |
Linearizer was effectively a Limiter with max_count=1, so rather than
maintaining two sets of code, let's combine them.
|
|
|
|
|
| |
* give them names, to improve logging
* use a deque rather than a list for efficiency
|
|
|
|
| |
Fixes #3570
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is more involved than it might otherwise be, because the current
implementation just drops its logcontexts and runs everything in the sentinel
context.
It turns out that we aren't actually using a bunch of the functionality here
(notably suppress_failures and the fact that Distributor.fire returns a
deferred), so the easiest way to fix this is actually by simplifying a bunch of
code.
|
|
|
|
|
|
|
|
| |
This fixes #3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.
(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
|
| |
|
|
|
|
|
|
|
|
| |
The get_entities_changed function was changed to return all changed
entities since the given stream position, rather than only those changed
from a given list of entities. This resulted in the function incorrectly
returning large numbers of entities that, for example, caused large
increases in database usage.
|
|\
| |
| | |
Don't return unknown entities in get_entities_changed
|
| |
| |
| |
| |
| |
| |
| |
| | |
The stream cache keeps track of all entities that have changed since
a particular stream position, so get_entities_changed does not need to
return unknown entites when given a larger stream position.
This makes it consistent with the behaviour of has_entity_changed.
|
|/
|
|
|
|
|
|
| |
popitem removes the *most recent* item by default [1]. We want the oldest.
Fixes #3524
[1]: https://docs.python.org/2/library/collections.html#collections.OrderedDict.popitem
|
|
|
|
|
|
|
|
|
|
|
| |
This line shows up as about 5% of cpu time on a synchrotron:
not_known_entities = set(entities) - set(self._entity_to_key)
Presumably the problem here is that _entity_to_key can be largeish, and
building a set for its keys every time this function is called is slow.
Here we rewrite the logic to avoid building so many sets.
|
|
|
|
|
| |
Let's try to include time spent in the DB threads in the per-request/block cpu
usage metrics.
|
|
|
|
|
| |
Factor out the resource usage tracking out to a separate object, which can be
passed around and copied independently of the logcontext itself.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
When _get_state_for_groups is given a wildcard filter, just do a complete
lookup. Hopefully this will give us the best of both worlds by not filling up
the ram if we only need one or two keys, but also making the cache still work
for the federation reader usecase.
|
|\
| |
| | |
Log number of events fetched from DB
|
| |
| |
| |
| | |
so that we can stub it for the sentinel and not have a billion failing UTs
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When we finish processing a request, log the number of events we fetched from
the database to handle it.
[I'm trying to figure out which requests are responsible for large amounts of
event cache churn. It may turn out to be more helpful to add counts to the
prometheus per-request/block metrics, but that is an extension to this code
anyway.]
|
|/ |
|