| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Debug for #8631.
I'm having a hard time tracking down what's going wrong in that issue.
In the reported example, I could see server A sending federation traffic
to server B and all was well. Yet B reports out-of-sync device updates
from A.
I couldn't see what was _in_ the events being sent from A to B. So I
have added some crude logging to track
- when we have updates to send to a remote HS
- the edus we actually accumulate to send
- when a federation transaction includes a device list update edu
- when such an EDU is received
This is a bit of a sledgehammer.
|
| |
|
|
|
| |
This skips a few methods which are difficult to type.
|
|
|
| |
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
|
| |
|
| |
|
|
|
|
| |
Instead of proxying through the magic getter of the RootConfig
object. This should be more performant (and is more explicit).
|
| |
|
|
|
|
|
| |
Instead of wrapping the JSON into an object, this creates concrete
instances for Transaction and Edu. This allows for improved type
hints and simplified code.
|
|
|
|
|
|
| |
This is to help with performance, where trying to connect to thousands
of hosts at once can consume a lot of CPU (due to TLS etc).
Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
|
| |
|
|
|
|
|
| |
Hopefully this will help us track down where to-device messages are getting
lost/delayed.
|
|
|
|
| |
This reverts commit 05e8c70c059f8ebb066e029bc3aa3e0cefef1019.
|
|
|
| |
This is no longer required, since we have dropped support for Python 3.5.
|
| |
|
| |
|
|
|
|
| |
Every single time I want to access the config object, I have to remember
whether or not we use `get_config`. Let's just get rid of it.
|
|
|
|
|
| |
This basically speeds up federation by "squeezing" each individual dual database call (to destinations and destination_rooms), which previously happened per every event, into one call for an entire batch (100 max).
Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
|
|
|
|
|
|
|
| |
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
|
|
|
|
|
|
|
| |
We pull all destinations requiring catchup from the DB in batches.
However, if all those destinations get filtered out (due to the
federation sender being sharded), then the `last_processed` destination
doesn't get updated, and we keep requesting the same set repeatedly.
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment, if you'd like to share presence between local or remote users, those users must be sharing a room together. This isn't always the most convenient or useful situation though.
This PR adds a module to Synapse that will allow deployments to set up extra logic on where presence updates should be routed. The module must implement two methods, `get_users_for_states` and `get_interested_users`. These methods are given presence updates or user IDs and must return information that Synapse will use to grant passing presence updates around.
A method is additionally added to `ModuleApi` which allows triggering a set of users to receive the current, online presence information for all users they are considered interested in. This is the equivalent of that user receiving presence information during an initial sync.
The goal of this module is to be fairly generic and useful for a variety of applications, with hard requirements being:
* Sending state for a specific set or all known users to a defined set of local and remote users.
* The ability to trigger an initial sync for specific users, so they receive all current state.
|
| |
|
|
|
|
| |
Includes an abstract base class which both the FederationSender
and the FederationRemoteSendQueue must implement.
|
|
|
|
|
| |
Broke in #9640
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently federation catchup will send the last *local* event that we
failed to send to the remote. This can cause issues for large rooms
where lots of servers have sent events while the remote server was down,
as when it comes back up again it'll be flooded with events from various
points in the DAG.
Instead, let's make it so that all the servers send the most recent
events, even if its not theirs. The remote should deduplicate the
events, so there shouldn't be much overhead in doing this.
Alternatively, the servers could only send local events if they were
also extremities and hope that the other server will send the event
over, but that is a bit risky.
|
|
|
|
|
|
|
|
|
|
| |
Federation catch up mode is very inefficient if the number of events
that the remote server has missed is small, since handling gaps can be
very expensive, c.f. #9492.
Instead of going into catch up mode whenever we see an error, we instead
do so only if we've backed off from trying the remote for more than an
hour (the assumption being that in such a case it is more than a
transient failure).
|
|
|
|
|
|
|
|
| |
Following the advice at
https://prometheus.io/docs/practices/instrumentation/#timestamps-not-time-since,
it's preferable to export unix timestamps, not ages.
There doesn't seem to be any particular naming convention for timestamp
metrics.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#9402)
This PR attempts to eliminate unnecessary presence sending work when your local server joins a room, or when a remote server joins a room your server is participating in by processing state deltas in chunks rather than individually.
---
When your server joins a room for the first time, it requests the historical state as well. This chunk of new state is passed to the presence handler which, after filtering that state down to only membership joins, will send presence updates to homeservers for each join processed.
It turns out that we were being a bit naive and processing each event individually, and sending out presence updates for every one of those joins. Even if many different joins were users on the same server (hello IRC bridges), we'd send presence to that same homeserver for every remote user join we saw.
This PR attempts to deduplicate all of that by processing the entire batch of state deltas at once, instead of only doing each join individually. We process the joins and note down which servers need which presence:
* If it was a local user join, send that user's latest presence to all servers in the room
* If it was a remote user join, send the presence for all local users in the room to that homeserver
We deduplicate by inserting all of those pending updates into a dictionary of the form:
```
{
server_name1: {presence_update1, ...},
server_name2: {presence_update1, presence_update2, ...}
}
```
Only after building this dict do we then start sending out presence updates.
|
|
|
|
|
|
|
| |
- Update black version to the latest
- Run black auto formatting over the codebase
- Run autoformatting according to [`docs/code_style.md
`](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md)
- Update `code_style.md` docs around installing black to use the correct version
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#8536)
* Fix outbound federaion with multiple event persisters.
We incorrectly notified federation senders that the minimum persisted
stream position had advanced when we got an `RDATA` from an event
persister.
Notifying of federation senders already correctly happens in the
notifier, so we just delete the offending line.
* Change some interfaces to use RoomStreamToken.
By enforcing use of `RoomStreamTokens` we make it less likely that
people pass in random ints that they got from somewhere random.
|
|
|
|
|
|
|
|
| |
There's no need for it to be in the dict as well as the events table. Instead,
we store it in a separate attribute in the EventInternalMetadata object, and
populate that on load.
This means that we can rely on it being correctly populated for any event which
has been persited to the database.
|
| |
|
|
|
|
|
| |
Add a pair of federation metrics to track the delays in sending PDUs to/from
particular servers.
|
|
|
|
|
|
|
|
|
|
| |
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* Fix _set_destination_retry_timings
This came about because the code assumed that retry_interval
could not be NULL — which has been challenged by catch-up.
|
| |
|
|
|
|
|
| |
ordering after transmission (#8247)
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
|
|
|
| |
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
|
| |
|
|
|
| |
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Empty federation transmission queues when we are backing off.
Fixes #7828.
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
* Address feedback
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
* Reword newsfile
|
| |
|
|\ |
|
| | |
|
|/ |
|
| |
|
| |
|
|
|
| |
This reuses the same scheme as federation sender sharding
|
|
|
|
|
|
|
|
| |
It was correct at the time of our friend Jorik writing it (checking
git blame), but the world has moved now and it is no longer a
generator.
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
|
| |
|
| |
|
|
|
|
| |
Introduced in #7755, not yet released.
|
| |
|
| |
|
| |
|
|
|
|
| |
looks like we managed to break this during the refactorathon.
|
|
|
| |
This changes the replication protocol so that the server does not send down `RDATA` for rows that happened before the client connected. Instead, the server will send a `POSITION` and clients then query the database (or master out of band) to get up to date.
|
| |
|
|
|
|
|
| |
This will be used to retry outbound transactions to a remote server if
we think it might have come back up.
|
| |
|
| |
|
|\ |
|
| |
| |
| | |
* update version of black and also fix the mypy config being overridden
|
| |
| |
| | |
Replace every instance of `logger.warn` with `logger.warning` as the former is deprecated.
|
| | |
|
|\| |
|
| |
| |
| |
| |
| | |
This is in preparation for having multiple data stores that offer
different functionality, e.g. splitting out state or event storage.
|
|/ |
|
| |
|
|
|
| |
Co-Authored-By: Erik Johnston <erik@matrix.org>
|
|
|
|
|
|
| |
The contexts were being filtered too early so the send loop wasn't
being linked to them unless the destination
was whitelisted.
|
|
|
|
|
| |
Propagate opentracing contexts through EDUs
Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
|
| |
|
|
|
|
| |
this hasn't done anything for years
|
| |
|
| |
|
|
|
|
|
|
|
| |
Adds new config option `cleanup_extremities_with_dummy_events` which
periodically sends dummy events to rooms with more than 10 extremities.
THIS IS REALLY EXPERIMENTAL.
|
|
|
|
| |
This code confused the hell out of me today. Split _get_new_device_messages
into its two (unrelated) parts.
|
|
|
| |
fixes #5153
|
|
|
|
| |
... mostly to fix pep8 fails
|
|
|
| |
Fixes #3951.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Primarily this fixes a bug in the handling of remote users joining a
room where the server sent out the presence for all local users in the
room to all servers in the room.
We also change to using the state delta stream, rather than the
distributor, as it will make it easier to split processing out of the
master process (as well as being more flexible).
Finally, when sending presence states to newly joined servers we filter
out old presence states to reduce the number sent. Initially we filter
out states that are offline and have a last active more than a week ago,
though this can be changed down the line.
Fixes #3962
|
|
|
|
| |
Rate-limit outgoing read-receipts as per #4730.
|
|
|