| Commit message | Author | Age | Files | Lines |
| |
Spawning from https://github.com/matrix-org/synapse/pull/15731
|
| |
`current_state_events` (#15731)
This helps with the upstream `is_host_joined()` and `is_host_invited()` functions.
`membership` was added to `current_state_events` in https://github.com/matrix-org/synapse/pull/5706, and its population was forced through in https://github.com/matrix-org/synapse/pull/13745.
|
| |
The cached decorators always return a Deferred, which was not
properly propagated. It was close enough when wrapping coroutines,
but failed if a bare function was wrapped.
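To illustrate, a minimal sketch of the propagation rule, assuming a simplified decorator (the real one lives in `synapse.util.caches.descriptors`; the caching itself is omitted):
```python
import inspect

from twisted.internet import defer


def cached_sketch(func):
    """Toy stand-in for @cached: callers always receive a Deferred."""

    def wrapper(*args):
        result = func(*args)
        if isinstance(result, defer.Deferred):
            return result
        if inspect.iscoroutine(result):
            # Coroutines were "close enough": ensureDeferred wraps them.
            return defer.ensureDeferred(result)
        # The previously broken case: a bare function returns a plain
        # value, which must still be handed back as a fired Deferred.
        return defer.succeed(result)

    return wrapper
```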
|
| |
This is largely based on the stats and user directory updater code.
Signed-off-by: Sean Quah <seanq@matrix.org>
|
| |
It's important that collections returned from `@cached` methods are not
modified, otherwise future retrievals from the cache will return the
modified collection.
This applies to the return values from `@cached` methods and the values
inside the dictionaries returned by `@cachedList` methods. It's not
necessary for the dictionaries returned by `@cachedList` methods
themselves to be read-only.
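As a hedged sketch of the pattern (store and helper names invented for illustration):
```python
from synapse.util.caches.descriptors import cached


class ExampleStore:
    @cached()
    async def get_rooms_for_user(self, user_id: str) -> frozenset:
        rows = await self._fetch_room_ids(user_id)
        # Returning a frozenset makes accidental mutation impossible.
        return frozenset(rows)

    async def _fetch_room_ids(self, user_id: str) -> list:
        ...  # hypothetical DB query


# The bug pattern this guards against:
#     rooms = await store.get_rooms_for_user(user_id)
#     rooms.add("!other:example.org")  # raises on a frozenset; on a plain
#                                      # set it would poison every future
#                                      # cache hit with the extra entry.
```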
Signed-off-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
|
| |
(#14870)
* Allow `AbstractSet` in `StrCollection`
Or else frozensets are excluded. This will be useful in an upcoming
commit where I plan to change a function that accepts `List[str]` to
accept `StrCollection` instead (see the typing sketch at the end of this message).
* `rooms_to_exclude` -> `rooms_to_exclude_globally`
I am about to make use of this exclusion mechanism to exclude rooms for
a specific user and a specific sync. This rename helps to clarify the
distinction between the global config and the rooms to exclude for a
specific sync.
* Better function names for internal sync methods
* Track a list of excluded rooms on SyncResultBuilder
I plan to feed in a list of partially-stated rooms for this sync to ignore.
* Exclude partial state rooms during eager sync
using the mechanism established in the previous commit
* Track un-partial-state stream in sync tokens
So that we can work out which rooms have become fully-stated during a
given sync period.
* Fix mutation of `@cached` return value
This was fouling up a complement test added alongside this PR.
Excluding a room would mean the set of forgotten rooms in the cache
would be extended. This means that room could be erroneously considered
forgotten in the future.
Introduced in #12310, Synapse 1.57.0. I don't think this had any
user-visible side effects (until now).
* SyncResultBuilder: track rooms to force as newly joined
Similar plan as before. We've omitted rooms from certain sync responses;
now we establish the mechanism to reintroduce them into future syncs.
* Read new field, to present rooms as newly joined
* Force un-partial-stated rooms to be newly-joined
for eager incremental syncs only, provided they're still fully stated
* Notify user stream listeners to wake up long polling syncs
* Changelog
* Typo fix
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Unnecessary list cast
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Rephrase comment
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Another comment
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Fixup merge(?)
* Poke notifier when receiving un-partial-stated msg over replication
* Fixup merge whoops
Thanks MV :)
Co-authored-by: Mathieu Velen <mathieuv@matrix.org>
Co-authored-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
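A typing sketch of the first bullet; the alias shape is an assumption modelled on `synapse.types`:
```python
from typing import AbstractSet, List, Tuple, Union

# Without AbstractSet, frozensets would be rejected by the type checker.
StrCollection = Union[Tuple[str, ...], List[str], AbstractSet[str]]


def exclude_rooms(rooms: StrCollection) -> None:
    """Toy consumer, present only to show the alias in use."""


exclude_rooms(["!a:example.org"])             # fine before and after
exclude_rooms(frozenset({"!b:example.org"}))  # accepted once AbstractSet is in
```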
|
| |
* Pull out hero selection logic
* Include heroes in partial join response's state
* Changelog
* Fixup trial test
* Remove TODO
|
| |
(`get_users_in_room` mis-use) (#13958)
|
| |
* Fix presence bug introduced in 1.64 by #13313
Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
* Add changelog
* Add DISTINCT
* Apply suggestions from code review
Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
|
| |
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Erik Johnston <erik@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
|
| |
Fixes #13942. Introduced in #13575.
Basically, let's only get the ordered set of hosts out of the DB if we need an ordered set of hosts. Since we split the function up, the caching won't be as good, but I think it will still be fine, as e.g. multiple backfill requests for the same room will hit the cache.
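A hedged sketch of the resulting split (names modelled on the storage layer; treat as illustrative, not the actual code):
```python
from typing import AbstractSet, Tuple

from synapse.util.caches.descriptors import cached


class RoomHostsStoreSketch:
    @cached()
    async def get_current_hosts_in_room(self, room_id: str) -> AbstractSet[str]:
        ...  # cheap: most callers only need membership, not ordering

    @cached()
    async def get_current_hosts_in_room_ordered(
        self, room_id: str
    ) -> Tuple[str, ...]:
        ...  # expensive: joins against `events` to order hosts by depth;
        # only backfill needs this
```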
|
| |
(#13885)
|
| |
* Remove checks for membership column in current_state_events
* Add schema script to force through the
`current_state_events_membership` background job
Contributed by Nick @ Beeper (@fizzadar).
|
| |
Update the docstrings for `get_users_in_room` and
`get_current_hosts_in_room` to explain the impact of partial state.
Signed-off-by: Sean Quah <seanq@matrix.org>
|
| |
(#13748)
Handle malformed user IDs with no colons in `get_current_hosts_in_room`.
It's not currently possible for a malformed user ID to join a room, so
this error would never be hit.
Signed-off-by: Sean Quah <seanq@matrix.org>
|
| |
The method doesn't actually do any data fetching and the method that
does, `_get_joined_profile_from_event_id`, has its own cache.
Signed off by Nick @ Beeper (@Fizzadar).
|
| |
Optimize how we calculate `likely_domains` during backfill, because I've seen this take 17s in production just to `get_current_state`, which is then fed into `get_domains_from_state` (see case [*2. Loading tons of events* in the `/messages` investigation issue](https://github.com/matrix-org/synapse/issues/13356)).
There are 3 ways we currently calculate hosts that are in the room:
1. `get_current_state` -> `get_domains_from_state`
   - Used in `backfill` to calculate `likely_domains`, and in `/timestamp_to_event` because it was cargo-culted from `backfill`
   - This one is being eliminated in favor of `get_current_hosts_in_room` in this PR 🕳
2. `get_current_hosts_in_room`
   - Used for other federation things like sending read receipts and typing indicators
3. `get_hosts_in_room_at_events`
   - Used when pushing out events over federation to other servers in the `_process_event_queue_loop`
Fix https://github.com/matrix-org/synapse/issues/13626
Part of https://github.com/matrix-org/synapse/issues/13356
Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.2tvwz3yhcafh)
### Query performance
#### Before
The query from `get_current_state` sucks just because we have to get all 80k events. And we see almost the exact same performance locally trying to get all of these events (16s vs 17s):
```
synapse=# SELECT type, state_key, event_id FROM current_state_events WHERE room_id = '!OGEhHVWSdvArJzumhm:matrix.org';
Time: 16035.612 ms (00:16.036)
synapse=# SELECT type, state_key, event_id FROM current_state_events WHERE room_id = '!OGEhHVWSdvArJzumhm:matrix.org';
Time: 4243.237 ms (00:04.243)
```
But what about `get_current_hosts_in_room`: when there are 8M rows in the `current_state_events` table, the previous query in `get_current_hosts_in_room` took 13s from complete freshness (when the events were first added), but takes 930ms after a Postgres restart, or 390ms if run back to back to back.
```sh
$ psql synapse
synapse=# \timing on
synapse=# SELECT COUNT(DISTINCT substring(state_key FROM '@[^:]*:(.*)$'))
FROM current_state_events
WHERE
type = 'm.room.member'
AND membership = 'join'
AND room_id = '!OGEhHVWSdvArJzumhm:matrix.org';
count
-------
4130
(1 row)
Time: 13181.598 ms (00:13.182)
synapse=# SELECT COUNT(*) from current_state_events where room_id = '!OGEhHVWSdvArJzumhm:matrix.org';
count
-------
80814
synapse=# SELECT COUNT(*) from current_state_events;
count
---------
8162847
synapse=# SELECT pg_size_pretty( pg_total_relation_size('current_state_events') );
pg_size_pretty
----------------
4702 MB
```
#### After
I'm not sure how long it takes from complete freshness, as I only really get that opportunity once (maybe by restarting the computer, but that's cumbersome) and it's not really relevant to normal operating times. Maybe you get closer to the fresh times the more access variability there is, so that the Postgres caches aren't as exact. Update: the longest I've seen this run for is 6.4s, and 4.5s after a computer restart.
After a Postgres restart, it takes 330ms and running back to back takes 260ms.
```sh
$ psql synapse
synapse=# \timing on
Timing is on.
synapse=# SELECT
substring(c.state_key FROM '@[^:]*:(.*)$') as host
FROM current_state_events c
/* Get the depth of the event from the events table */
INNER JOIN events AS e USING (event_id)
WHERE
c.type = 'm.room.member'
AND c.membership = 'join'
AND c.room_id = '!OGEhHVWSdvArJzumhm:matrix.org'
GROUP BY host
ORDER BY min(e.depth) ASC;
Time: 333.800 ms
```
#### Going further
To improve things further we could add a `limit` parameter to `get_current_hosts_in_room` (see the sketch below). Realistically, we don't need 4k domains to choose from because there is no way we're going to query that many before we either a) get an answer or b) give up.
Another thing we can do is optimize the query to use an index skip scan:
- https://wiki.postgresql.org/wiki/Loose_indexscan
- Index Skip Scan, https://commitfest.postgresql.org/37/1741/
- https://www.timescale.com/blog/how-we-made-distinct-queries-up-to-8000x-faster-on-postgresql/
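A hedged sketch of that `limit` parameter (the parameter name, default, and `db_pool.execute` plumbing are assumptions, not the actual implementation):
```python
from synapse.util.caches.descriptors import cached


class RoomHostsLimitSketch:
    @cached()
    async def get_current_hosts_in_room(self, room_id: str, limit: int = 100):
        """Return at most `limit` hosts, longest-resident first; backfill
        will never realistically try thousands of candidates."""
        sql = """
            SELECT substring(c.state_key FROM '@[^:]*:(.*)$') AS host
            FROM current_state_events AS c
            INNER JOIN events AS e USING (event_id)
            WHERE c.type = 'm.room.member'
              AND c.membership = 'join'
              AND c.room_id = ?
            GROUP BY host
            ORDER BY min(e.depth) ASC
            LIMIT ?
        """
        rows = await self.db_pool.execute(
            "get_current_hosts_in_room", None, sql, room_id, limit
        )
        return [host for (host,) in rows]
```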
|
| |
first (`get_users_in_room` mis-use) (#13608)
See https://github.com/matrix-org/synapse/pull/13575#discussion_r953023755
|
| |
Broke in #13573.
|
| |
The profile objects are never used and increase cache size significantly.
|
| |
stated. (#13514)
|
| |
This still maintains the local in-memory lookup optimisation, but does any
external lookup as part of the deferred, which prevents duplicate lookups
for the same event at once. This makes the assumption that fetching from an
external cache is a non-zero-load operation.
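The shape of that deduplication, as an illustrative sketch (`_fetch_external` is a hypothetical external-cache fetch):
```python
from typing import Dict

from twisted.internet import defer

from synapse.util.async_helpers import ObservableDeferred


class ExternalEventCacheSketch:
    def __init__(self) -> None:
        self._in_flight: Dict[str, ObservableDeferred] = {}

    def get_event(self, event_id: str) -> "defer.Deferred":
        ongoing = self._in_flight.get(event_id)
        if ongoing is not None:
            # A lookup for this event is already in flight: piggyback on
            # it rather than paying for a second external round trip.
            return ongoing.observe()

        d = defer.ensureDeferred(self._fetch_external(event_id))
        obs = ObservableDeferred(d, consumeErrors=True)
        self._in_flight[event_id] = obs

        def _cleanup(result):
            self._in_flight.pop(event_id, None)
            return result

        out = obs.observe()
        out.addBoth(_cleanup)
        return out

    async def _fetch_external(self, event_id: str):
        ...  # hypothetical fetch from e.g. an external Redis cache
```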
|
| |
See #10826 and #10786 for context as to why we had to disable pruning on
those caches.
Now that `get_users_who_share_room_with_user` is called frequently only
for presence, we just need to make calls to it less frequent, and then we
can remove the various levels of caching going on.
|
| |
`_get_joined_profiles_from_event_ids`. (#13300)
|
| |
Some experimental prep work to enable external event caching, based on #9379 & #12955. This doesn't actually move the cache at all; it just lays the groundwork for async-implemented caches.
Signed off by Nick @ Beeper (@Fizzadar)
|
| |
federation (#12964)
Reducing the amount of state we pull from the DB is useful as fetching state is expensive in terms of DB, CPU and memory.
|
| |
Refactor and convert `Linearizer` to async. This makes a `Linearizer`
cancellation bug easier to fix.
Also refactor to use an async context manager, which eliminates an
unlikely footgun where code that doesn't immediately use the context
manager could forget to release the lock.
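Usage after the refactor looks roughly like this (a sketch assuming `queue` is the async context manager):
```python
from synapse.util.async_helpers import Linearizer

linearizer = Linearizer(name="example")


async def handle(key: str) -> None:
    # Acquired on entry and always released on exit, even on exceptions,
    # so the lock can no longer be leaked by forgetting to release it.
    async with linearizer.queue(key):
        ...  # critical section
```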
Signed-off-by: Sean Quah <seanq@element.io>
|
| |
This should speed up push rule calculations for rooms with large numbers of local users when the main push rule cache fails.
Co-authored-by: reivilibre <oliverw@matrix.org>
|
| |
For users with large accounts it is inefficient to calculate the set of
users they share a room with (and takes a lot of space in the cache).
Instead we can look at users whose devices have changed since the last
sync and check if they share a room with the syncing user.
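A hedged sketch of the cheaper check; `get_users_whose_devices_changed` and `do_users_share_a_room` are assumed store methods:
```python
async def users_with_device_changes_to_report(store, syncing_user: str, from_token):
    # Start from the (usually small) set of users whose devices changed
    # since the last sync, instead of materialising the full set of users
    # who share a room with a large account.
    changed = await store.get_users_whose_devices_changed(from_token)
    return {
        user
        for user in changed
        if await store.do_users_share_a_room(syncing_user, user)
    }
```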
|
| |
We're going to add a `state_key` column to the `events` table, so we need to
add some disambiguation to queries which use it.
|
| |
* Allow LruCaches to opt out of time-based expiry
* Don't expire `get_users_who_share_room` & friends
|
| |
Instead of proxying through the magic getter of the RootConfig
object. This should be more performant (and is more explicit).
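For illustration (attribute paths assumed; `hs` is a HomeServer):
```python
# Before: every access went through RootConfig's magic __getattr__, which
# searched all the config classes.
server_name = hs.config.server_name

# After: reference the owning config class explicitly; cheaper and clearer.
server_name = hs.config.server.server_name
```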
|
| |
No functional changes here. This came out as I was working to tackle #5677.
|
| |
A user will still see this room if it is in a local cache, but it will
not reappear after clearing the cache and reloading.
|
| |
Instead of using namedtuples. This helps with asserting type hints
and code completion.
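An illustrative before/after (field names invented):
```python
from collections import namedtuple

import attr

# Before: namedtuples carry no per-field type hints for mypy to check.
RoomEntry = namedtuple("RoomEntry", ("room_id", "depth"))


# After: a frozen, slotted attrs class; the hints can be asserted and
# IDEs can complete the attribute names.
@attr.s(slots=True, frozen=True, auto_attribs=True)
class RoomEntryTyped:
    room_id: str
    depth: int
```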
|
| |
Ensure we only load an event from the DB once when the same event is requested multiple times at once.
|
| |
This PR is tantamount to running
```
pyupgrade --py36-plus --keep-percent-format `find synapse/ -type f -name "*.py"`
```
Part of #9744
|
| |
Previously only world-readable rooms were shown. Now rooms which are
public, knockable, or invite-only with a pending invitation are also
included in a space summary. The same logic applies to the experimental
room version from MSC3083: if a user has access via the allowed rooms,
then the room is shown in the spaces summary.
This change is made per MSC3173, which allows the stripped state of a room
to be shown to any potential room joiner.
|
| |
Fixes https://github.com/matrix-org/synapse/issues/10030.
We were passing milliseconds where a value in seconds was expected.
The impact of this bug isn't too bad. The code is intended to count the number of remote servers that the homeserver can see and report that as a metric. This metric is supposed to run initially 1 second after server startup, and every 60s thereafter. Instead, it first ran 1,000 seconds after server startup, and every 60s after startup.
This fix allows the correct metrics to be collected immediately, as well as preventing a random collection 1,000s after startup.
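A sketch of the unit mix-up, assuming Synapse's `Clock` (where `call_later` takes seconds but `looping_call` takes milliseconds; `count_known_servers` stands in for the metric callback):
```python
# Buggy: intended as "1 second after startup", but call_later takes
# seconds, so this fires ~17 minutes in.
clock.call_later(1000, count_known_servers)

# Fixed: 1 second after startup...
clock.call_later(1.0, count_known_servers)
# ...and every 60s thereafter (looping_call takes milliseconds, which is
# how the two units got confused in the first place).
clock.looping_call(count_known_servers, 60 * 1000)
```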
|
| |
This is no longer required, since we have dropped support for Python 3.5.
|
| |
Fixes: #9797.
Should help reduce CPU usage on the user directory, especially when memberships change in rooms with lots of state history.
|
| |
Part of #9744
Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as Python 3 automatically reads source code as UTF-8 now.
`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
|
| |
- Update black version to the latest
- Run black auto formatting over the codebase
- Run autoformatting according to [`docs/code_style.md`](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md)
- Update `code_style.md` docs around installing black to use the correct version
|
| |
* Use `execute_batch` in more places (see the sketch below)
* Newsfile
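A minimal sketch of the swap, assuming PostgreSQL via psycopg2 and an open `cursor` (the SQL and rows are illustrative):
```python
from psycopg2.extras import execute_batch

sql = "UPDATE events SET outlier = %s WHERE event_id = %s"
rows = [(False, "$event1"), (False, "$event2")]

# cursor.executemany(sql, rows) issues one statement per row;
# execute_batch packs many statements into each network round trip.
execute_batch(cursor, sql, rows, page_size=100)
```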
|
| |
This is another PR that grew out of #6739.
The existing code for checking whether a user is currently invited to a room when they want to leave the room looks like the following:
https://github.com/matrix-org/synapse/blob/f737368a26bb9eea401fcc3a5bdd7e0b59e91f09/synapse/handlers/room_member.py#L518-L540
It calls `get_invite_for_local_user_in_room`, which will actually query *all* rooms the user has been invited to, before iterating over them and matching via the room ID. It will then return a tuple of a lot of information which we pull the event ID out of.
I need to do a similar check for knocking, but this code wasn't very efficient. I then tried to write a different implementation using `StateHandler.get_current_state` but this actually didn't work as we haven't *joined* the room yet - we've only been invited to it. That means that only certain tables in Synapse have our desired `invite` membership state. One of those tables is `local_current_membership`.
So I wrote a store method that just queries that table instead.
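A hedged sketch of that store method, built on the `simple_select_one` helper (exact names and plumbing may differ from the real change):
```python
async def get_local_current_membership_for_user_in_room(
    self, user_id: str, room_id: str
):
    """Look up the user's membership in a single room directly, instead
    of fetching every room they have ever been invited to."""
    row = await self.db_pool.simple_select_one(
        table="local_current_membership",
        keyvalues={"user_id": user_id, "room_id": room_id},
        retcols=("event_id", "membership"),
        allow_none=True,
        desc="get_local_current_membership_for_user_in_room",
    )
    # None if the user has no current membership in the room.
    return row
```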
|
| |
* Add `DeferredCache.get_immediate` method
A bunch of things that are currently calling `DeferredCache.get` are only
really interested in the result if it's completed. We can optimise and
simplify this case (see the usage sketch below).
* Remove unused 'default' parameter to DeferredCache.get()
* another get_immediate instance
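A usage sketch (signature assumed; `cache`, `key` and `use` are placeholders):
```python
# Fast path: only consider results that have already completed, skipping
# the Deferred plumbing entirely.
value = cache.get_immediate(key, None)
if value is not None:
    use(value)
```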
|
| |
This is so we can tell what is going on when things are taking a while to start up.
The main change here is to ensure that transactions that are created during startup get correctly logged like normal transactions.
|
| |
The idea is to remove some of the places we pass around `int`, where it can represent one of two things:
1. the position of an event in the stream; or
2. a token that partitions the stream, used as part of the stream tokens.
The valid operations are then:
1. did a position happen before or after a token;
2. get all events that happened before or after a token; and
3. get all events between two tokens.
(Note that we don't want to allow other operations as we want to change the tokens to be vector clocks rather than simple ints)
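A toy sketch of the distinction (type and method names invented; the real tokens are richer than a single int):
```python
import attr


@attr.s(slots=True, frozen=True, auto_attribs=True)
class EventPosition:
    """Where a single event sits in the stream."""

    stream: int


@attr.s(slots=True, frozen=True, auto_attribs=True)
class StreamTokenSketch:
    """Partitions the stream: everything at or before `stream`."""

    stream: int

    def includes(self, pos: EventPosition) -> bool:
        # Valid operation 1: did a position happen before or after a token?
        return pos.stream <= self.stream
```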
|
| |
This converts calls like super(Foo, self) -> super().
Generated with:
```
sed -i "" -Ee 's/super\([^\(]+\)/super()/g' **/*.py
```
|
| |
async (#8197)
|