| Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
|
|
|
|
deprecation notices to v1.32.0 upgrade notes (#9849)
Fixes https://github.com/matrix-org/synapse/issues/9846.
Adds important removal information from the top of https://github.com/matrix-org/synapse/releases/tag/v1.32.0rc1 into UPGRADE.rst.
|
|
As far as I can tell our logging contexts are meant to log the request ID, or sometimes the request ID followed by a suffix (this is generally stored in the name field of LoggingContext). There's also code to log the name@memory location, but I'm not sure this is ever used.
This simplifies the code paths to require every logging context to have a name and use that in logging. For sub-contexts (created via nested_logging_contexts, defer_to_threadpool, Measure) we use the current context's str (which becomes their name or the string "sentinel") and then potentially modify that (e.g. add a suffix).
|
|
|
|
|
|
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
|
|
Signed-off-by: Dan Callahan <danc@element.io>
|
|
|
|
* Drop Python 3.5 from Trove classifier metadata.
Signed-off-by: Dan Callahan <danc@element.io>
|
|
Signed-off-by: Dan Callahan <danc@element.io>
|
|
This change ensures that the appservice registration behaviour follows the spec. We decided to do this for Dendrite, so it made sense to also make a PR for synapse to correct the behaviour.
|
|
|
|
There's no point logging this twice.
|
|
By providing the additional build tag for `msc2946`.
|
|
Signed-off-by: Dan Callahan <danc@element.io>
|
|
Related: #8334
Deprecated in: #9429 - Synapse 1.28.0 (2021-02-25)
`GET /_synapse/admin/v1/users/<user_id>` has no
- unit tests
- documentation
API in v2 is available (#5925 - 12/2019, v1.7.0).
API is misleading. It expects `user_id` and returns a list of all users.
Signed-off-by: Dirk Klimpel dirk@klimpel.org
|
|
Part of #9366
Adds in fixes for B006 and B008, both relating to mutable parameter lint errors.
Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
|
|
|
|
We pull all destinations requiring catchup from the DB in batches.
However, if all those destinations get filtered out (due to the
federation sender being sharded), then the `last_processed` destination
doesn't get updated, and we keep requesting the same set repeatedly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Apparently on tox 2.5, `usedevelop` overrides `skip_install`, so we end up
trying to install the full dependencies even for the `-old` environment.
|
|
They don't make any sense on the intermediate builder image. The final
images needs them to be of use for anyone.
Signed-off-by: Johannes Wienke <languitar@semipol.de>
|
|
(#9735)
When joining a room with join rules set to 'restricted', check if the
user is a member of the spaces defined in the 'allow' key of the join rules.
This only applies to an experimental room version, as defined in MSC3083.
|
|
Records additional request information into the structured logs,
e.g. the requester, IP address, etc.
|
|
present (#8926)
This PR modifies `GaugeBucketCollector` to only report data once it has been updated, rather than initially reporting a value of 0. Fixes zero values being reported for some metrics on startup until a background job to update the metric's value runs later.
|
|
At the moment, if you'd like to share presence between local or remote users, those users must be sharing a room together. This isn't always the most convenient or useful situation though.
This PR adds a module to Synapse that will allow deployments to set up extra logic on where presence updates should be routed. The module must implement two methods, `get_users_for_states` and `get_interested_users`. These methods are given presence updates or user IDs and must return information that Synapse will use to grant passing presence updates around.
A method is additionally added to `ModuleApi` which allows triggering a set of users to receive the current, online presence information for all users they are considered interested in. This is the equivalent of that user receiving presence information during an initial sync.
The goal of this module is to be fairly generic and useful for a variety of applications, with hard requirements being:
* Sending state for a specific set or all known users to a defined set of local and remote users.
* The ability to trigger an initial sync for specific users, so they receive all current state.
|
|
|
|
|
|
The `remote_media_cache_thumbnails_media_origin_media_id_thumbna_key`
constraint is superceded by
`remote_media_repository_thumbn_media_origin_id_width_height_met` (which adds
`thumbnail_method` to the unique key).
PR #7124 made an attempt to remove the old constraint, but got the name wrong,
so it didn't work. Here we update the bg update and rerun it.
Fixes #8649.
|
|
|
|
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Per MSC3083.
|
|
|
|
The regex should be terminated so that subdomain matches of another
domain are not accepted. Just ensuring that someone doesn't shoot
themselves in the foot by copying our example.
Signed-off-by: Denis Kasak <dkasak@termina.org.uk>
|
|
Fixes #9642.
Signed-off-by: Cristina Muñoz <hi@xmunoz.com>
|
|
This PR rewrites the original complement.sh script with a number of improvements:
* We can now use a local checkout of Complement (configurable with `COMPLEMENT_DIR`), though the default behaviour still downloads the master branch.
* You can now specify a regex of test names to run, or just run all tests.
* We now use the Synapse test blacklist tag (so all tests will pass).
|
|
|
|
`room_invite_state_types` was inconvenient as a configuration setting, because
anyone that ever set it would not receive any new types that were added to the
defaults. Here, we deprecate the old setting, and replace it with a couple of
new settings under `room_prejoin_state`.
|
|
This should fix a class of bug where we forget to check if e.g. the appservice shouldn't be ratelimited.
We also check the `ratelimit_override` table to check if the user has ratelimiting disabled. That table is really only meant to override the event sender ratelimiting, so we don't use any values from it (as they might not make sense for different rate limits), but we do infer that if ratelimiting is disabled for the user we should disabled all ratelimits.
Fixes #9663
|
|
|
|
|
|
For it's obvious performance benefits. `dmypy` support landed in #9692.
|
|
|
|
non-pip package (#9074)
Signed-off-by: blakehawkins blake.hawkins.11@gmail.com
|
|
Includes an abstract base class which both the FederationSender
and the FederationRemoteSendQueue must implement.
|
|
I've reiterated the advice about using `oidc` to migrate, since I've seen a few
people caught by this.
I've also removed a couple of the examples as they are duplicating the OIDC
documentation, and I think they might be leading people astray.
|
|
|
|
|
|
If you have the wrong version of `cryptography` installed, synapse suggests:
```
To install run:
pip install --upgrade --force 'cryptography>=3.4.7;python_version>='3.6''
```
However, the use of ' inside '...' doesn't work, so when you run this, you get
an error.
|
|
Make pip install faster in Docker build for [Complement](https://github.com/matrix-org/complement) testing.
If files have changed in a `COPY` command, Docker will invalidate all of the layers below. So I changed the order of operations to install all dependencies before we `COPY synapse /synapse/synapse/`. This allows Docker to use our cached layer of dependencies even when we change the source of Synapse and speed up builds dramatically! `53.5s` -> `3.7s` builds 🤘
As an alternative, I did try using BuildKit caches but this still took 30 seconds overall on that step. 15 seconds to gather the dependencies from the cache and another 15 seconds to `Installing collected packages`.
Fix https://github.com/matrix-org/synapse/issues/9364
|
|
This warning is somewhat confusing to users, so let's suppress it
|
|
Running `dmypy run` will do a `mypy` check while spinning up a daemon
that makes rerunning `dmypy run` a lot faster.
`dmypy` doesn't support `follow_imports = silent` and has
`local_partial_types` enabled, so this PR enables those options and
fixes the issues that were newly raised. Note that `local_partial_types`
will be enabled by default in upcoming mypy releases.
|
|
|
|
|
|
|
|
cryptography (#9697)
|
|
Fixes redirect loop
Signed-off-by: Paul Tötterman <paul.totterman@iki.fi>
|
|
using /usr/bin/env (#9689)
On NixOS, `bash` isn't under `/bin/bash` but rather in some directory in `$PATH`. Locally, I've been patching those scripts to make them work.
`/usr/bin/env` seems to be the only [portable way](https://unix.stackexchange.com/questions/29608/why-is-it-better-to-use-usr-bin-env-name-instead-of-path-to-name-as-my) to use binaries from the PATH as interpreters.
Signed-off-by: Quentin Gliech <quentingliech@gmail.com>
|
|
Make it clearer in the source install step that the platform specific
prerequisites must be installed first.
Signed-off-by: Serban Constantin <serban.constantin@gmail.com>
|
|
Split off from https://github.com/matrix-org/synapse/pull/9491
Adds a storage method for getting the current presence of all local users, optionally excluding those that are offline. This will be used by the code in #9491 when a PresenceRouter module informs Synapse that a given user should have `"ALL"` user presence updates routed to them. Specifically, it is used here: https://github.com/matrix-org/synapse/blob/b588f16e391d664b11f43257eabf70663f0c6d59/synapse/handlers/presence.py#L1131-L1133
Note that there is a `get_all_presence_updates` function just above. That function is intended to walk up the table through stream IDs, and is primarily used by the presence replication stream. I could possibly make use of it in the PresenceRouter-related code, but it would be a bit of a bodge.
|
|
Broke in #9640
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
|
|
|
|
|
|
When we hit an unknown room in the space tree, see if there are other servers that we might be able to poll to get the data.
Fixes: #9447
|
|
|
|
This fixes an error ("Cannot determine consistent method resolution order (MRO)")
when running mypy with a cache.
|
|
|
|
|
|
It's legitimate behaviour to try and join a bunch of rooms at once.
|
|
|
|
Builds on the work done in #9643 to add a federation API for space summaries.
There's a bit of refactoring of the existing client-server code first, to avoid too much duplication.
|
|
|
|
Addresses https://github.com/matrix-org/synapse-dinsic/issues/70
This PR causes `ProxyAgent` to attempt to extract credentials from an `HTTPS_PROXY` env var. If credentials are found, a `Proxy-Authorization` header ([details](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Proxy-Authorization)) is sent to the proxy server to authenticate against it. The headers are *not* passed to the remote server.
Also added some type hints.
|
|
Cf. https://github.com/opencontainers/image-spec/blob/master/annotations.md#pre-defined-annotation-keys
Signed-off-by: Johannes Wienke <languitar@semipol.de>
|
|
- Merge 'isinstance' calls.
- Remove unnecessary dict call outside of comprehension.
- Use 'sys.exit()' calls.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This is very bare-bones for now: federation will come soon, while pagination is descoped for now but will come later.
|
|
|
|
|
|
Currently federation catchup will send the last *local* event that we
failed to send to the remote. This can cause issues for large rooms
where lots of servers have sent events while the remote server was down,
as when it comes back up again it'll be flooded with events from various
points in the DAG.
Instead, let's make it so that all the servers send the most recent
events, even if its not theirs. The remote should deduplicate the
events, so there shouldn't be much overhead in doing this.
Alternatively, the servers could only send local events if they were
also extremities and hope that the other server will send the event
over, but that is a bit risky.
|
|
|
|
serialize_event (#9585)
This bug was discovered by DINUM. We were modifying `serialized_event["content"]`, which - if you've got `USE_FROZEN_DICTS` turned on or are [using a third party rules module](https://github.com/matrix-org/synapse/blob/17cd48fe5171d50da4cb59db647b993168e7dfab/synapse/events/third_party_rules.py#L73-L76) - will raise a 500 if you try to a edit a reply to a message.
`serialized_event["content"]` could be set to the edit event's content, instead of a copy of it, which is bad as we attempt to modify it. Instead, we also end up modifying the original event's content. DINUM uses a third party rules module, which meant the event's content got frozen and thus an exception was raised.
To be clear, the problem is not that the event's content was frozen. In fact doing so helped us uncover the fact we weren't copying event content correctly.
|
|
By splitting this to two separate methods the callers know
what methods they can expect on the handler.
|
|
ones (#9634)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
|
|
We had two functions named `get_forward_extremities_for_room` and
`get_forward_extremeties_for_room` that took different paramters. We
rename one of them to avoid confusion.
|
|
* Populate `internal_metadata.outlier` based on `events` table
Rather than relying on `outlier` being in the `internal_metadata` column,
populate it based on the `events.outlier` column.
* Move `outlier` out of InternalMetadata._dict
Ultimately, this will allow us to stop writing it to the database. For now, we
have to grandfather it back in so as to maintain compatibility with older
versions of Synapse.
|
|
|
|
* Adds B00 to ignored checks.
* Fixes remaining issues.
|
|
Allows limiting who can login using OIDC via the claims
made from the IdP.
|
|
Instead of if the user does not have a password hash. This allows a SSO
user to add a password to their account, but only if the local password
database is configured.
|
|
|
|
|
|
|
|
|
|
Fixes https://github.com/matrix-org/synapse/issues/9572
When a SSO user logs in for the first time, we create a local Matrix user for them. This goes through the register_user flow, which ends up triggering the spam checker. Spam checker modules don't currently have any way to differentiate between a user trying to sign up initially, versus an SSO user (whom has presumably already been approved elsewhere) trying to log in for the first time.
This PR passes `auth_provider_id` as an argument to the `check_registration_for_spam` function. This argument will contain an ID of an SSO provider (`"saml"`, `"cas"`, etc.) if one was used, else `None`.
|
|
Co-authored-by: Will Hunt <willh@matrix.org>
Co-authored-by: Erik Johnston <erik@matrix.org>
|
|
* Handle an empty cookie as an invalid macaroon.
* Newsfragment
|
|
The stable format uses different brand identifiers, so we need to support two
identifiers for each IdP.
|
|
... and complain if people try to turn it off.
|
|
There's no need to do aggregation bundling for state events. Doing so can cause performance issues.
|
|
* Fix Internal Server Error on `GET /saml2/authn_response`
Seems to have been introduced in #8765 (Synapse 1.24.0)
* Fix newsfile
|
|
|
|
|
|
Federation catch up mode is very inefficient if the number of events
that the remote server has missed is small, since handling gaps can be
very expensive, c.f. #9492.
Instead of going into catch up mode whenever we see an error, we instead
do so only if we've backed off from trying the remote for more than an
hour (the assumption being that in such a case it is more than a
transient failure).
|
|
Background: When we receive incoming federation traffic, and notice that we are missing prev_events from
the incoming traffic, first we do a `/get_missing_events` request, and then if we still have missing prev_events,
we set up new backwards-extremities. To do that, we need to make a `/state_ids` request to ask the remote
server for the state at those prev_events, and then we may need to then ask the remote server for any events
in that state which we don't already have, as well as the auth events for those missing state events, so that we
can auth them.
This PR attempts to optimise the processing of that state request. The `state_ids` API returns a list of the state
events, as well as a list of all the auth events for *all* of those state events. The optimisation comes from the
observation that we are currently loading all of those auth events into memory at the start of the operation, but
we almost certainly aren't going to need *all* of the auth events. Rather, we can check that we have them, and
leave the actual load into memory for later. (Ideally the federation API would tell us which auth events we're
actually going to need, but it doesn't.)
The effect of this is to reduce the number of events that I need to load for an event in Matrix HQ from about
60000 to about 22000, which means it can stay in my in-memory cache, whereas previously the sheer number
of events meant that all 60K events had to be loaded from db for each request, due to the amount of cache
churn. (NB I've already tripled the size of the cache from its default of 10K).
Unfortunately I've ended up basically C&Ping `_get_state_for_room` and `_get_events_from_store_or_dest` into
a new method, because `_get_state_for_room` is also called during backfill, which expects the auth events to be
returned, so the same tricks don't work. That said, I don't really know why that codepath is completely different
(ultimately we're doing the same thing in setting up a new backwards extremity) so I've left a TODO suggesting
that we clean it up.
|
|
|
|
If more transactions arrive from an origin while we're still processing the
first one, reject them.
Hopefully a quick fix to https://github.com/matrix-org/synapse/issues/9489
|
|
Put the room id in the logcontext, to make it easier to understand what's going on.
|
|
|
|
|
|
Fixes: #8393
|
|
... because namedtuples suck
Fix up a couple of other annotations to keep mypy happy.
|
|
We either need to pass the auth provider over the replication api, or make sure
we report the auth provider on the worker that received the request. I've gone
with the latter.
|
|
Mention that parse_config must exist and note the
check_media_file_for_spam method.
|
|
This uses a simplified version of get_chain_cover_difference to calculate
auth chain of events.
|
|
|
|
|
|
This is a companion change to apply the fix in #9498 /
922788c6043138165c025c78effeda87de842bab to previously
purged rooms.
|
|
Earlier [I was convinced](https://github.com/matrix-org/synapse/issues/9565) that we didn't have an Admin API for listing media uploaded by a user. Foolishly I was looking under the Media Admin API documentation, instead of the User Admin API documentation.
I thought it'd be helpful to link to the latter so others don't hit the same dead end :)
|
|
Apple had to be special. They want a client secret which is generated from an EC key.
Fixes #9220. Also fixes #9212 while I'm here.
|
|
Fixes #8915
|
|
Type hint fixes due to Twisted 21.2.0 adding type hints.
|
|
Properly uses RGBA mode for 1- and 8-bit images with transparency
(instead of RBG mode).
|
|
The hashes are from commits due to auto-formatting, e.g. running black.
git can be configured to use this automatically by running the following:
git config blame.ignoreRevsFile .git-blame-ignore-revs
|
|
After 0764d0c6e575793ca506cf021aff3c4b9e0a5972
|
|
I noticed that I'd occasionally have `scripts-dev/lint.sh` fail when messing about with config options in my PR. The script calls `scripts-dev/config-lint.sh`, which attempts some validation on the sample config.
It does this by using `sed` to edit the sample_config, and then seeing if the file changed using `git diff`.
The problem is: if you changed the sample_config as part of your commit, this script will error regardless.
This PR attempts to change the check so that existing, unstaged changes to the sample_config will not cause the script to report an invalid file.
|
|
|
|
|
|
|
|
token (#9559)
This notice is giving a heads up to the planned spec compliance fix https://github.com/matrix-org/synapse/pull/9548.
|
|
|
|
Unfortunately this doesn't test re-joining the room since
that requires having another homeserver to query over
federation, which isn't easily doable in unit tests.
|
|
|
|
|
|
interfaces. (#9528)
This helps fix some type hints when running with Twisted 21.2.0.
|
|
Update reverse proxy to add OpenBSD relayd example configuration.
Signed-off-by: Leo Bärring <leo.barring@protonmail.com>
|
|
|
|
Following the advice at
https://prometheus.io/docs/practices/instrumentation/#timestamps-not-time-since,
it's preferable to export unix timestamps, not ages.
There doesn't seem to be any particular naming convention for timestamp
metrics.
|
|
Add prom metrics for number of users successfully registering and logging in, by SSO provider.
|
|
This great big stack of commits is a a whole load of hoop-jumping to make it easier to store additional values in login tokens, and then to actually store the SSO Identity Provider in the login token. (Making use of that data will follow in a subsequent PR.)
|
|
|
|
|
|
|
|
|
|
Should fix some remaining warnings
|
|
Turns out matrix.org has an event that has duplicate auth events (which really isn't supposed to happen, but here we are). This caused the background update to fail due to `UniqueViolation`.
|
|
|
|
|
|
|
|
|
|
Turns out nginx overwrites the Host header by default.
|
|
Prevent presence background jobs from running when presence is disabled
Signed-off-by: Aaron Raimist <aaron@raim.ist>
|
|
This reverts commit f5c93fc9931e4029bbd8000f398b6f39d67a8c46.
This is being backed out due to a regression (#9507) and additional
review feedback being provided.
|
|
It landed in schema version 58 after 59 had been created, causing some
servers to not run it. The main effect of was that not all rooms had
their chain cover calculated correctly. After the BG updates complete
the chain covers will get fixed when a new state event in the affected
rooms is received.
|
|
Fixes #9504
|
|
|
|
|
|
By consuming the response if the headers imply that the
content is too large.
|
|
This also pins the Twisted version in the mypy job for CI until
proper type hints are fixed throughout Synapse.
|
|
In #75, bytecode was disabled (from a bit of FUD back in `python<2.4` days, according to dev chat), I think it's safe enough to enable it again.
Added in `__pycache__/` and `.pyc`/`.pyd` to `.gitignore`, to extra-insure compiled files don't get committed.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
|
|
### Changes proposed in this PR
- Add support for the `no_proxy` and `NO_PROXY` environment variables
- Internally rely on urllib's [`proxy_bypass_environment`](https://github.com/python/cpython/blob/bdb941be423bde8b02a5695ccf51c303d6204bed/Lib/urllib/request.py#L2519)
- Extract env variables using urllib's `getproxies`/[`getproxies_environment`](https://github.com/python/cpython/blob/bdb941be423bde8b02a5695ccf51c303d6204bed/Lib/urllib/request.py#L2488) which supports lowercase + uppercase, preferring lowercase, except for `HTTP_PROXY` in a CGI environment
This does contain behaviour changes for consumers so making sure these are called out:
- `no_proxy`/`NO_PROXY` is now respected
- lowercase `https_proxy` is now allowed and taken over `HTTPS_PROXY`
Related to #9306 which also uses `ProxyAgent`
Signed-off-by: Timothy Leung tim95@hotmail.co.uk
|
|
... otherwise, we don't get the cookie back.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
rewrite XForwardedForRequest to set `isSecure()` based on
`X-Forwarded-Proto`. Also implement `getClientAddress()` while we're here.
|
|
|
|
|
|
This fixes #8518 by adding a conditional check on `SyncResult` in a function when `prev_stream_token == current_stream_token`, as a sanity check. In `CachedResponse.set.<remove>()`, the result is immediately popped from the cache if the conditional function returns "false".
This prevents the caching of a timed-out `SyncResult` (that has `next_key` as the stream key that produced that `SyncResult`). The cache is prevented from returning a `SyncResult` that makes the client request the same stream key over and over again, effectively making it stuck in a loop of requesting and getting a response immediately for as long as the cache keeps those values.
Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
|
|
* Split ShardedWorkerHandlingConfig
This is so that we have a type level understanding of when it is safe to
call `get_instance(..)` (as opposed to `should_handle(..)`).
* Remove special cases in ShardedWorkerHandlingConfig.
`ShardedWorkerHandlingConfig` tried to handle the various different ways
it was possible to configure federation senders and pushers. This led to
special cases that weren't hit during testing.
To fix this the handling of the different cases is moved from there and
`generic_worker` into the worker config class. This allows us to have
the logic in one place and allows the rest of the code to ignore the
different cases.
|
|
The idea here is to stop people forgetting to call `check_consistency`. Folks can still just pass in `None` to the new args in `build_sequence_generator`, but hopefully they won't.
|
|
|
|
This confused me for a while.
|
|
And ensure the consistency of `event_auth_chain_id`.
|
|
|
|
`uploads_path` was a thing that was never used; most of it was removed in #6628
but a few vestiges remained.
|
|
|
|
It should be possible to reload `synapse.target` to have the reload propagate
to all the synapse units.
|
|
This PR remove the cache for the `get_shared_rooms_for_users` storage method (the db method driving the experimental "what rooms do I share with this user?" feature: [MSC2666](https://github.com/matrix-org/matrix-doc/pull/2666)). Currently subsequent requests to the endpoint will return the same result, even if your shared rooms with that user have changed.
The cache was added in https://github.com/matrix-org/synapse/pull/7785, but we forgot to ensure it was invalidated appropriately.
Upon attempting to invalidate it, I found that the cache had to be entirely invalidated whenever a user (remote or local) joined or left a room. This didn't make for a very useful cache, especially for a function that may or may not be called very often. Thus, I've opted to remove it instead of invalidating it.
|
|
The user directory sample config section was a little messy, and didn't adhere to our [recommended config format guidelines](https://github.com/matrix-org/synapse/blob/develop/docs/code_style.md#configuration-file-format).
This PR cleans that up a bit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
(#9402)
This PR attempts to eliminate unnecessary presence sending work when your local server joins a room, or when a remote server joins a room your server is participating in by processing state deltas in chunks rather than individually.
---
When your server joins a room for the first time, it requests the historical state as well. This chunk of new state is passed to the presence handler which, after filtering that state down to only membership joins, will send presence updates to homeservers for each join processed.
It turns out that we were being a bit naive and processing each event individually, and sending out presence updates for every one of those joins. Even if many different joins were users on the same server (hello IRC bridges), we'd send presence to that same homeserver for every remote user join we saw.
This PR attempts to deduplicate all of that by processing the entire batch of state deltas at once, instead of only doing each join individually. We process the joins and note down which servers need which presence:
* If it was a local user join, send that user's latest presence to all servers in the room
* If it was a remote user join, send the presence for all local users in the room to that homeserver
We deduplicate by inserting all of those pending updates into a dictionary of the form:
```
{
server_name1: {presence_update1, ...},
ser |