summary refs log tree commit diff
path: root/synapse/handlers/sliding_sync/__init__.py (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Pass leave from remote invite rejection down Sliding Sync (#18375)Devon Hudson2025-05-081-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes #17753 ### Dev notes The `sliding_sync_membership_snapshots` and `sliding_sync_joined_rooms` database tables were added in https://github.com/element-hq/synapse/pull/17512 ### Pull Request Checklist <!-- Please read https://element-hq.github.io/synapse/latest/development/contributing_guide.html before submitting your pull request --> * [X] Pull request is based on the develop branch * [x] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should: - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.". - Use markdown where necessary, mostly for `code blocks`. - End with either a period (.) or an exclamation mark (!). - Start with a capital letter. - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry. * [X] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters)) --------- Co-authored-by: Erik Johnston <erik@matrix.org> Co-authored-by: Olivier 'reivilibre <oliverw@matrix.org> Co-authored-by: Eric Eastwood <erice@element.io>
* Merge branch 'master' into developQuentin Gliech2024-12-031-6/+73
|\
| * Handle null invite and knock room stateErik Johnston2024-12-031-5/+11
| |
| * Sliding Sync: Fix state leaking on incremental syncEric Eastwood2024-12-031-1/+62
| |
* | Sliding Sync: Include invite, ban, kick, targets when `$LAZY`-loading room ↵Eric Eastwood2024-12-021-3/+9
|/ | | | | members (#17947) Part of https://github.com/element-hq/synapse/issues/17929
* Sliding Sync: Lazy-loading room members on incremental sync (remember ↵Eric Eastwood2024-11-041-36/+132
| | | | | | | | memberships) (#17809) Lazy-loading room members on incremental sync and remember which memberships we've sent down the connection before (up-to 100) Fix https://github.com/element-hq/synapse/issues/17804
* Sliding Sync: Slight optimization when fetching state for the room ↵Eric Eastwood2024-10-141-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (`get_events_as_list(...)`) (#17718) Spawning from @kegsay [pointing out](https://matrix.to/#/!cnVVNLKqgUzNTOFQkz:matrix.org/$ExOO7J8uPUQSyH-9Uxc_QCa8jlXX9uK4VRtkSC0EI3o?via=element.io&via=matrix.org&via=jki.re) that the Sliding Sync endpoint doesn't handle a large room with a lot of state well on initial sync (requesting all state via `required_state: [ ["*","*"] ]`) (it just takes forever). After investigating further, the slow part is just `get_events_as_list(...)` fetching all of the current state ID's out for the room (which can be 100k+ events for rooms with a lot of membership). This is just a slow thing in Synapse in general and the same thing happens in Sync v2 or the `/state` endpoint. --- The only idea I had to improve things was to use `batch_iter` to only try fetching a fixed amount at a time instead of working with large maps, lists, and sets. This doesn't seem to have much effect though. There is already a `batch_iter(event_ids, 200)` in `_fetch_event_rows(...)` for when we actually have to touch the database and that's inside a queue to deduplicate work. I did notice one slight optimization to use `get_events_as_list(...)` directly instead of `get_events(...)`. `get_events(...)` just turns the result from `get_events_as_list(...)` into a dict and since we're just iterating over the events, we don't need the dict/map.
* Correctly changes to required state config in sliding sync (#17785)Erik Johnston2024-10-141-10/+224
| | | | | | | | | | | | | | | | | | Fixes https://github.com/element-hq/synapse/issues/17698 This handles `required_state` changes by checking if new state has been added to the config, and if so fetching and returning that from the current state. This also takes care to ensure that given a state entry S that is added, removed and then re-added that we do *not* send S down a second time if there have been no changes to S in the current state. This is fine for Rust SDK (as it just remembers all state), but we might decide not to do this behaviour in the MSC. If we decide to always send down S then its easy enough to rip out all the code. --------- Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
* Sliding sync: omit bump stamp when it is unchanged (#17788)Erik Johnston2024-10-081-10/+44
| | | This saves some DB lookups in rooms
* Sliding sync minor performance speed up using new table (#17787)Erik Johnston2024-10-081-9/+18
| | | | | | | Use the new tables to work out which rooms have changed. --------- Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
* Never return negative bump stamp (#17748)Erik Johnston2024-09-241-0/+16
| | | Fixes #17737
* Sliding Sync: Avoid fetching left rooms and add back `newly_left` rooms (#17725)Eric Eastwood2024-09-191-7/+19
| | | | | | | | | | | | | | | Performance optimization: We can avoid fetching rooms that the user has left themselves (which could be a significant amount), then only add back rooms that the user has `newly_left` (left in the token range of an incremental sync). It's a lot faster to fetch less rooms than fetch them all and throw them away in most cases. Since the user only leaves a room (or is state reset out) once in a blue moon, we can avoid a lot of work. Based on @erikjohnston's branch, erikj/ss_perf --------- Co-authored-by: Erik Johnston <erik@matrix.org>
* Sliding Sync: Short-circuit `have_finished_sliding_sync_background_jobs` ↵Eric Eastwood2024-09-171-2/+2
| | | | | | | | (#17723) We only need to check it if returned bump stamp is `None`, which is rare. Pulling this change out from one of @erikjohnston's branches (https://github.com/element-hq/synapse/compare/develop...erikj/ss_perf)
* Sliding Sync: Increase concurrency of sliding sync a bit (#17696)Erik Johnston2024-09-121-1/+1
| | | | | | For initial requests a typical page size is 20 rooms, so we may as well do the batching as 20. This should speed up bigger syncs a little bit.
* Sliding sync: don't fetch room summary for named rooms. (#17683)Erik Johnston2024-09-111-40/+38
| | | | | | | | | | For rooms with a name we can skip fetching a full room summary, as we don't need to calculate heroes, and instead just fetch the room counts directly. This also changes things to not return counts and heroes for non-joined rooms. For left/banned rooms we were returning zero values anyway, and for invite/knock rooms we don't really want to leak such information (even if some of is included in the stripped state).
* Sliding Sync: Look for `bump _stamp` in the room timeline (#17684)Erik Johnston2024-09-101-53/+95
| | | | | | | This allows us to skip checking the database a lot of the time. --------- Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
* Sliding Sync: Retrieve fewer events from DB in sync (#17688)Erik Johnston2024-09-101-18/+2
| | | | When using timeline limit of 1 we end up fetching 2 events from the DB purely to tell if the response was "limited" or not. Lets not do that.
* Sliding Sync: Get `bump_stamp` from new sliding sync tables because it's ↵Eric Eastwood2024-09-091-18/+56
| | | | | | | | faster (#17658) Get `bump_stamp` from [new sliding sync tables](https://github.com/element-hq/synapse/pull/17512) which should be faster (performance) than flipping through the latest events in the room.
* Revert "Look for bump stamp in the room timeline"Erik Johnston2024-09-091-24/+12
| | | | This reverts commit a3c49565fff95cb332ef5f00b6faaf4803b34153.
* Look for bump stamp in the room timelineErik Johnston2024-09-091-12/+24
| | | | This allows us to skip checking the database a lot of the time.
* Sliding Sync: Speed up incremental sync by avoiding extra work (#17665)Eric Eastwood2024-09-091-37/+112
| | | | | Speed up incremental sync by avoiding extra work. We first look at the state delta changes and only fetch and calculate further derived things if they have changed.
* Fix bump stamp for non-joined rooms (#17674)Erik Johnston2024-09-061-19/+21
| | | | We should only look for bump stamps in joined rooms, otherwise we should just use the membership stream ordering.
* Speed up sliding sync by avoiding copies (#17670)Erik Johnston2024-09-061-7/+15
| | | | | | | | | We ended up spending ~10% CPU creating a new dictionary and `_RoomMembershipForUser`, so let's avoid creating new dicts and copying by returning `newly_joined`, `newly_left` and `is_dm` as sets directly. --------- Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
* Revert "Fix bump stamp for non-joined rooms"Erik Johnston2024-09-051-21/+19
| | | | This reverts commit f73c844403de00630fd773075cefe6f502b54e69.
* Fix bump stamp for non-joined roomsErik Johnston2024-09-051-19/+21
| | | | | We should only look for bump stamps in joined rooms, otherwise we should just use the membership stream ordering.
* Sliding Sync: Prevent duplicate tags being added to traces (#17655)Eric Eastwood2024-09-051-16/+17
| | | | | | | | | Prevent duplicate tags being added to traces. Noticed because we see these warnings in Jaeger: <img width="462" alt="Screenshot 2024-09-03 at 2 34 05 PM" src="https://github.com/user-attachments/assets/6fac12ed-0074-435b-9451-eccde7e7012a">
* Format files with Ruff (#17643)Quentin Gliech2024-09-021-3/+1
| | | | | | I thought ruff check would also format, but it doesn't. This runs ruff format in CI and dev scripts. The first commit is just a run of `ruff format .` in the root directory.
* Sliding sync: Store the per-connection state in the database. (#17599)Erik Johnston2024-08-291-7/+2
| | | | | | | Based on #17600 --------- Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
* Sliding Sync: Make `PerConnectionState` immutable (#17600)Erik Johnston2024-08-291-9/+7
| | | | | | | | | | | | | | | | | This is so that we can cache it. We also move the sliding sync types to `synapse/types/handlers/sliding_sync.py`. This is mainly in-prep for #17599 to avoid circular imports. The only change in behaviour is that `RoomSyncConfig.combine_sync_config(..)` now returns a new room sync config rather than mutating in-place. Reviewable commit-by-commit. --------- Co-authored-by: Eric Eastwood <eric.eastwood@beta.gouv.fr>
* Sliding Sync: Pre-populate room data for quick filtering/sorting (#17512)Eric Eastwood2024-08-291-14/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pre-populate room data for quick filtering/sorting in the Sliding Sync API Spawning from https://github.com/element-hq/synapse/pull/17450#discussion_r1697335578 This PR is acting as the Synapse version `N+1` step in the gradual migration being tracked by https://github.com/element-hq/synapse/issues/17623 Adding two new database tables: - `sliding_sync_joined_rooms`: A table for storing room meta data that the local server is still participating in. The info here can be shared across all `Membership.JOIN`. Keyed on `(room_id)` and updated when the relevant room current state changes or a new event is sent in the room. - `sliding_sync_membership_snapshots`: A table for storing a snapshot of room meta data at the time of the local user's membership. Keyed on `(room_id, user_id)` and only updated when a user's membership in a room changes. Also adds background updates to populate these tables with all of the existing data. We want to have the guarantee that if a row exists in the sliding sync tables, we are able to rely on it (accurate data). And if a row doesn't exist, we use a fallback to get the same info until the background updates fill in the rows or a new event comes in triggering it to be fully inserted. This means we need a couple extra things in place until we bump `SCHEMA_COMPAT_VERSION` and run the foreground update in the `N+2` part of the gradual migration. For context on why we can't rely on the tables without these things see [1]. 1. On start-up, block until we clear out any rows for the rooms that have had events since the max-`stream_ordering` of the `sliding_sync_joined_rooms` table (compare to max-`stream_ordering` of the `events` table). For `sliding_sync_membership_snapshots`, we can compare to the max-`stream_ordering` of `local_current_membership` - This accounts for when someone downgrades their Synapse version and then upgrades it again. This will ensure that we don't have any stale/out-of-date data in the `sliding_sync_joined_rooms`/`sliding_sync_membership_snapshots` tables since any new events sent in rooms would have also needed to be written to the sliding sync tables. For example a new event needs to bump `event_stream_ordering` in `sliding_sync_joined_rooms` table or some state in the room changing (like the room name). Or another example of someone's membership changing in a room affecting `sliding_sync_membership_snapshots`. 1. Add another background update that will catch-up with any rows that were just deleted from the sliding sync tables (based on the activity in the `events`/`local_current_membership`). The rooms that need recalculating are added to the `sliding_sync_joined_rooms_to_recalculate` table. 1. Making sure rows are fully inserted. Instead of partially inserting, we need to check if the row already exists and fully insert all data if not. All of this extra functionality can be removed once the `SCHEMA_COMPAT_VERSION` is bumped with support for the new sliding sync tables so people can no longer downgrade (the `N+2` part of the gradual migration). <details> <summary><sup>[1]</sup></summary> For `sliding_sync_joined_rooms`, since we partially insert rows as state comes in, we can't rely on the existence of the row for a given `room_id`. We can't even rely on looking at whether the background update has finished. There could still be partial rows from when someone reverted their Synapse version after the background update finished, had some state changes (or new rooms), then upgraded again and more state changes happen leaving a partial row. For `sliding_sync_membership_snapshots`, we insert items as a whole except for the `forgotten` column ~~so we can rely on rows existing and just need to always use a fallback for the `forgotten` data. We can't use the `forgotten` column in the table for the same reasons above about `sliding_sync_joined_rooms`.~~ We could have an out-of-date membership from when someone reverted their Synapse version. (same problems as outlined for `sliding_sync_joined_rooms` above) Discussed in an [internal meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7) </details> ### TODO - [x] Update `stream_ordering`/`bump_stamp` - [x] Handle remote invites - [x] Handle state resets - [x] Consider adding `sender` so we can filter `LEAVE` memberships and distinguish from kicks. - [x] We should add it to be able to tell leaves from kicks - [x] Consider adding `tombstone` state to help address https://github.com/element-hq/synapse/issues/17540 - [x] We should add it `tombstone_successor_room_id` - [x] Consider adding `forgotten` status to avoid extra lookup/table-join on `room_memberships` - [x] We should add it - [x] Background update to fill in values for all joined rooms and non-join membership - [x] Clean-up tables when room is deleted - [ ] Make sure tables are useful to our use case - First explored in https://github.com/element-hq/synapse/compare/erikj/ss_use_new_tables - Also explored in https://github.com/element-hq/synapse/commit/76b5a576eb363496315dfd39510cad7d02b0fc73 - [x] Plan for how can we use this with a fallback - See plan discussed above in main area of the issue description - Discussed in an [internal meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7) - [x] Plan for how we can rely on this new table without a fallback - Synapse version `N+1`: (this PR) Bump `SCHEMA_VERSION` to `87`. Add new tables and background update to backfill all rows. Since this is a new table, we don't have to add any `NOT VALID` constraints and validate them when the background update completes. Read from new tables with a fallback in cases where the rows aren't filled in yet. - Synapse version `N+2`: Bump `SCHEMA_VERSION` to `88` and bump `SCHEMA_COMPAT_VERSION` to `87` because we don't want people to downgrade and miss writes while they are on an older version. Add a foreground update to finish off the backfill so we can read from new tables without the fallback. Application code can now rely on the new tables being populated. - Discussed in an [internal meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.hh7shg4cxdhj) ### Dev notes ``` SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase ``` ``` SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_sliding_sync.FilterRoomsTestCase ``` Reference: - [Development docs on background updates and worked examples of gradual migrations ](https://github.com/element-hq/synapse/blob/1dfa59b238cee0dc62163588cc9481896c288979/docs/development/database_schema.md#background-updates) - A real example of a gradual migration: https://github.com/matrix-org/synapse/pull/15649#discussion_r1213779514 - Adding `rooms.creator` field that needed a background update to backfill data, https://github.com/matrix-org/synapse/pull/10697 - Adding `rooms.room_version` that needed a background update to backfill data, https://github.com/matrix-org/synapse/pull/6729 - Adding `room_stats_state.room_type` that needed a background update to backfill data, https://github.com/matrix-org/synapse/pull/13031 - Tables from MSC2716: `insertion_events`, `insertion_event_edges`, `insertion_event_extremities`, `batch_events` - `current_state_events` updated in `synapse/storage/databases/main/events.py` --- ``` persist_event (adds to queue) _persist_event_batch _persist_events_and_state_updates (assigns `stream_ordering` to events) _persist_events_txn _store_event_txn _update_metadata_tables_txn _store_room_members_txn _update_current_state_txn ``` --- > Concatenated Indexes [...] (also known as multi-column, composite or combined index) > > [...] key consists of multiple columns. > > We can take advantage of the fact that the first index column is always usable for searching > > *-- https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys* --- Dealing with `portdb` (`synapse/_scripts/synapse_port_db.py`), https://github.com/element-hq/synapse/pull/17512#discussion_r1725998219 --- <details> <summary>SQL queries:</summary> Both of these are equivalent and work in SQLite and Postgres Options 1: ```sql WITH data_table (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) AS ( VALUES ( ?, ?, ?, (SELECT membership FROM room_memberships WHERE event_id = ?), (SELECT stream_ordering FROM events WHERE event_id = ?), {", ".join("?" for _ in insert_values)} ) ) INSERT INTO sliding_sync_non_join_memberships (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) SELECT * FROM data_table WHERE membership != ? ON CONFLICT (room_id, user_id) DO UPDATE SET membership_event_id = EXCLUDED.membership_event_id, membership = EXCLUDED.membership, event_stream_ordering = EXCLUDED.event_stream_ordering, {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)} ``` Option 2: ```sql INSERT INTO sliding_sync_non_join_memberships (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) SELECT column1 as room_id, column2 as user_id, column3 as membership_event_id, column4 as membership, column5 as event_stream_ordering, {", ".join("column" + str(i) for i in range(6, 6 + len(insert_keys)))} FROM ( VALUES ( ?, ?, ?, (SELECT membership FROM room_memberships WHERE event_id = ?), (SELECT stream_ordering FROM events WHERE event_id = ?), {", ".join("?" for _ in insert_values)} ) ) as v WHERE membership != ? ON CONFLICT (room_id, user_id) DO UPDATE SET membership_event_id = EXCLUDED.membership_event_id, membership = EXCLUDED.membership, event_stream_ordering = EXCLUDED.event_stream_ordering, {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)} ``` If we don't need the `membership` condition, we could use: ```sql INSERT INTO sliding_sync_non_join_memberships (room_id, membership_event_id, user_id, membership, event_stream_ordering, {", ".join(insert_keys)}) VALUES ( ?, ?, ?, (SELECT membership FROM room_memberships WHERE event_id = ?), (SELECT stream_ordering FROM events WHERE event_id = ?), {", ".join("?" for _ in insert_values)} ) ON CONFLICT (room_id, user_id) DO UPDATE SET membership_event_id = EXCLUDED.membership_event_id, membership = EXCLUDED.membership, event_stream_ordering = EXCLUDED.event_stream_ordering, {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)} ``` </details> ### Pull Request Checklist <!-- Please read https://element-hq.github.io/synapse/latest/development/contributing_guide.html before submitting your pull request --> * [x] Pull request is based on the develop branch * [x] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should: - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.". - Use markdown where necessary, mostly for `code blocks`. - End with either a period (.) or an exclamation mark (!). - Start with a capital letter. - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry. * [x] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters)) --------- Co-authored-by: Erik Johnston <erik@matrix.org>
* Sliding sync: factor out room list logic (#17622)Erik Johnston2024-08-281-1265/+22
| | | | | | | | | Move calculating of the room lists out of the core handler. This should make it easier to switch things around to start using the tables in #17512. This is just moving code between files and methods. Reviewable commit-by-commit
* Sliding sync: Split up handler into its own module (#17595)Erik Johnston2024-08-201-0/+2326
That file was getting long. The changes are non functional, and simply split things up into: - the main class - the connection store - the extensions - the types