path: root/synapse
Commit message | Author | Age | Files | Lines
* Only try to backfill event if we haven't tried before recently (#13635)Eric Eastwood2022-09-232-44/+148
| | | | | | | | | | Only try to backfill event if we haven't tried before recently (exponential backoff). No need to keep trying the same backfill point that fails over and over. Fix https://github.com/matrix-org/synapse/issues/13622 Fix https://github.com/matrix-org/synapse/issues/8451 Follow-up to https://github.com/matrix-org/synapse/pull/13589 Part of https://github.com/matrix-org/synapse/issues/13356
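The exponential backoff described in this commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the pull request: the names `backoff_for` and `should_attempt_backfill`, the one-hour base step and the seven-day cap are all hypothetical.

```python
from datetime import timedelta

# Hypothetical tuning values, chosen only for illustration.
BASE_BACKOFF = timedelta(hours=1)
MAX_BACKOFF = timedelta(days=7)


def backoff_for(num_failed_attempts: int) -> timedelta:
    """Return how long to wait after `num_failed_attempts` consecutive failures."""
    if num_failed_attempts <= 0:
        return timedelta(0)
    # Double the wait for every additional failure, up to a fixed cap.
    exponent = min(num_failed_attempts - 1, 30)
    return min(BASE_BACKOFF * (2 ** exponent), MAX_BACKOFF)


def should_attempt_backfill(
    now_ms: int, last_attempt_ms: int, num_failed_attempts: int
) -> bool:
    """True if enough time has passed since the last failed attempt."""
    wait_ms = backoff_for(num_failed_attempts).total_seconds() * 1000
    return now_ms - last_attempt_ms >= wait_ms
```

A backfill point that has failed once would be retried after an hour, one that has failed three times after four hours, and so on until the cap is reached.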
* Faster room joins: Avoid blocking `/keys/changes` (#13888)Sean Quah2022-09-232-3/+11
| | | | | | | | | Part of the work for #12993. Once #12993 is fully resolved, we expect `/keys/changes` to behave sensibly when joined to a room with partial state. Signed-off-by: Sean Quah <seanq@matrix.org>
* Fix access token leak to logs from proxyagent (#13855)Eric Eastwood2022-09-231-1/+6
| | | | | | | | | | | | | | | This can happen specifically with an application service `/transactions/10722?access_token=leaked` request Fix https://github.com/matrix-org/synapse/issues/13010 --- Saw an example leak in https://github.com/matrix-org/synapse/issues/13423#issuecomment-1205348482 ``` 2022-08-04 14:47:57,925 - synapse.http.client - 401 - DEBUG - as-sender-signal-1 - Sending request PUT http://localhost:29328/transactions/10722?access_token=<redacted> 2022-08-04 14:47:57,926 - synapse.http.proxyagent - 223 - DEBUG - as-sender-signal-1 - Requesting b'http://localhost:29328/transactions/10722?access_token=leaked' via <HostnameEndpoint localhost:29328> ```
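One way to keep such a token out of log lines is to redact the query parameter before formatting the message. The helper below is a hypothetical sketch using only the standard library. It is not the fix applied in the pull request, and it handles `str` URIs only, whereas the leaking log line above shows a `bytes` URI.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redact_access_token(uri: str) -> str:
    """Return `uri` with the value of any `access_token` query parameter hidden."""
    parts = urlsplit(uri)
    query = [
        (key, "<redacted>" if key == "access_token" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # Keep "<" and ">" unescaped so the placeholder stays readable in logs.
    return urlunsplit(parts._replace(query=urlencode(query, safe="<>")))
```

Applied to the request from the example, this would log `http://localhost:29328/transactions/10722?access_token=<redacted>`.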
* Accept & store thread IDs for receipts (implement MSC3771). (#13782)Patrick Cloke2022-09-2310-27/+110
| | | | Updates the `/receipts` endpoint and receipt EDU handler to parse a `thread_id` from the body and insert it in the database.
* Send device list updates out to servers in partially joined rooms (#13874)Sean Quah2022-09-233-2/+65
| | | | | | | | | | | Use the provided list of servers in the room from the `/send_join` response, since we will not know which users are in the room. This isn't sufficient to ensure that all remote servers receive the right device list updates, since the `/send_join` response may be inaccurate or we may calculate the membership state of new users in the room incorrectly. Signed-off-by: Sean Quah <seanq@matrix.org>
* Faster Remote Room Joins: tell remote homeservers that we are unable to ↵reivilibre2022-09-239-41/+56
| | | | authorise them if they query a room which has partial state on our server. (#13823)
* Properly paginate forward in the /relations API. (#13840)Patrick Cloke2022-09-222-13/+31
| | | | | This fixes a bug where the `/relations` API with `dir=f` would skip the first item of each page (except the first page), causing incomplete data to be returned to the client.
* Last batch of Pydantic for synapse/rest/client/account.py (#13832)David Robertson2022-09-211-6/+13
| | | | | | | * Validation for `/add_threepid/msisdn/submit_token` * Don't validate deprecated endpoint * Changelog
* Add version flag for MSC3881 (#13860)Brendan Abolivier2022-09-211-0/+2
|
* Track device IDs for pushers (#13831)Brendan Abolivier2022-09-215-5/+103
| | | Second half of the MSC3881 implementation
* Implementation of MSC3882 login token request (#13722)Hugh Nimmo-Smith2022-09-214-0/+105
|
* Support enabling/disabling pushers (from MSC3881) (#13799)Brendan Abolivier2022-09-2110-54/+154
| | | Partial implementation of MSC3881
* Add cache invalidation across workers to module API (#13667)Mathieu Velten2022-09-214-19/+71
| | | Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
* Correct documentation for map_user_attributes of OpenID Mapping Providers ↵Peter Scheu2022-09-211-0/+3
| | | | | (#13836) Co-authored-by: David Robertson <davidr@element.io>
* Remove the `complete_sso_login` method from the Module API which was ↵Quentin Gliech2022-09-202-58/+1
| | | | | deprecated in Synapse 1.13.0. (#13843) Signed-off-by: Quentin Gliech <quenting@element.io>
* Generate separate snapshots for logical databases (#13792)David Robertson2022-09-202-5/+14
| | | | | | | * Generate separate snapshots for sqlite, postgres and common * Cleanup postgres dbs in the TRAP * Say which logical DB we're applying updates to * Run background updates on the state DB * Add new option for accepting a SCHEMA_NUMBER
* Port the push rule classes to Rust. (#13768)Erik Johnston2022-09-205-598/+25
|
* Don't include redundant prev_state in new events (#13791)Denis2022-09-202-4/+0
|
* Add support to purge rows from MSC2716 and other tables when purging a room ↵Eric Eastwood2022-09-163-0/+29
| | | | | | | | | | | (#13825) `event_failed_pull_attempts` added in https://github.com/matrix-org/synapse/pull/13589 MSC2716 related tables added in: - https://github.com/matrix-org/synapse/pull/10245/files#diff-3d42dfb44d02f7de3aada105e0bdc1cc9dd7f953cbf0f36c5d0f50827bf0320aR1 - Renamed in https://github.com/matrix-org/synapse/pull/10838/files#diff-2730bfbe9e688b55e46f9371aefe67dac2bd2b2b7d9d6b92774eea1fcfae156dR1 - https://github.com/matrix-org/synapse/pull/10498/files#diff-c52bbfbb5921a3f6f023b24343668479d966fac164f13b7c39d2197ce3afa7a5R1
* Remove error spam when users query the keys of departed remote users (#13826)Sean Quah2022-09-161-9/+12
| | | | The error message introduced in #13749 has turned out to be very spammy. Remove it for now.
* Add an admin API endpoint to find a user based on its external ID in an auth ↵Quentin Gliech2022-09-162-0/+29
| | | | provider. (#13810)
* Avoid putting rejected events in room state (#13723)Sean Quah2022-09-161-0/+15
| | | Signed-off-by: Sean Quah <seanq@matrix.org>
* Be able to correlate timeouts in reverse-proxy layer in front of Synapse ↵Eric Eastwood2022-09-152-4/+23
| | | | | | | | | | | | | | | | | | (pull request ID from header) (#13801) Fix https://github.com/matrix-org/synapse/issues/13685 New config: ```diff listeners: - port: 8008 tls: false type: http x_forwarded: true + request_id_header: "cf-ray" bind_addresses: ['::1', '127.0.0.1', '0.0.0.0'] ```
* Record any exception when processing a pulled event (#13814)Eric Eastwood2022-09-151-0/+10
| | | | | Part of https://github.com/matrix-org/synapse/issues/13700 and https://github.com/matrix-org/synapse/issues/13356 Follow-up to https://github.com/matrix-org/synapse/pull/13589
* Support providing an index predicate for upserts. (#13822)Patrick Cloke2022-09-152-7/+24
| | | | This is useful to upsert against a table which has a unique partial index while avoiding conflicts.
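As a rough illustration of what an index predicate on an upsert means, the sketch below uses Python's built-in `sqlite3` module (it needs an SQLite new enough to support `ON CONFLICT`, i.e. 3.24 or later). The table and index names (`demo_receipts`, `demo_receipts_unthreaded`) are hypothetical and are not Synapse's schema or its upsert helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_receipts ("
    " room_id TEXT NOT NULL, user_id TEXT NOT NULL,"
    " thread_id TEXT, event_id TEXT NOT NULL)"
)
# Partial unique index: uniqueness is enforced only for rows without a thread.
conn.execute(
    "CREATE UNIQUE INDEX demo_receipts_unthreaded"
    " ON demo_receipts (room_id, user_id) WHERE thread_id IS NULL"
)

# The WHERE clause after the conflict target is the "index predicate":
# it tells the database which partial index the upsert should match.
UPSERT_UNTHREADED = """
    INSERT INTO demo_receipts (room_id, user_id, thread_id, event_id)
    VALUES (?, ?, NULL, ?)
    ON CONFLICT (room_id, user_id) WHERE thread_id IS NULL
    DO UPDATE SET event_id = excluded.event_id
"""

conn.execute(UPSERT_UNTHREADED, ("!room", "@alice:example.org", "$event1"))
conn.execute(UPSERT_UNTHREADED, ("!room", "@alice:example.org", "$event2"))
rows = conn.execute("SELECT thread_id, event_id FROM demo_receipts").fetchall()
```

After the second statement the table still holds a single unthreaded row, now pointing at `$event2`.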
* A third batch of Pydantic validation for rest/client/account.py (#13736)David Robertson2022-09-152-42/+51
|
* Add a `MXCUri` class to make working with mxc uri's easier. (#13162)Andrew Morgan2022-09-152-4/+8
|
* Keep track when we try and fail to process a pulled event (#13589)Eric Eastwood2022-09-145-9/+106
| We can follow-up this PR with:
|
| 1. Only try to backfill from an event if we haven't tried recently -> https://github.com/matrix-org/synapse/issues/13622
| 1. When we decide to backfill that event again, process it in the background so it doesn't block and make `/messages` slow when we know it will probably fail again -> https://github.com/matrix-org/synapse/issues/13623
| 1. Generally track failures everywhere we try and fail to pull an event over federation -> https://github.com/matrix-org/synapse/issues/13700
|
| Fix https://github.com/matrix-org/synapse/issues/13621
| Part of https://github.com/matrix-org/synapse/issues/13356
| Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.qv7cj51sv9i5)
* Update event push action and receipt tables to support threads. (#13753)Patrick Cloke2022-09-149-20/+310
| | | | | | | | | | | | | | | Adds a `thread_id` column to the `event_push_actions`, `event_push_actions_staging`, and `event_push_summary` tables. This will allow notifications to be segmented by the thread in a future pull request. The `thread_id` column stores the root event ID or the special value `"main"`. The `thread_id` column for `event_push_actions` and `event_push_summary` is backfilled with `"main"` for all existing rows. New entries into `event_push_actions` and `event_push_actions_staging` will get the proper thread ID. `receipts_linearized` and `receipts_graph` also gain a `thread_id` column, which is similar, except `NULL` is a special value meaning the receipt is "unthreaded". See MSC3771 and MSC3773 for where this data will be useful.
* Use partial indices on SQLite. (#13802)Patrick Cloke2022-09-143-5/+58
| | | | | | | Partial indices have been supported since SQLite 3.8, but Synapse now requires >= 3.27, so we can enable support for them. This requires rebuilding previous indices which were partial on PostgreSQL, but not on SQLite.
* Deduplicate `is_server_notices_room`. (#13780)reivilibre2022-09-143-18/+19
|
* Fix a memory leak when running the unit tests. (#13798)reivilibre2022-09-142-6/+7
|
* Remove unused method in `synapse.api.auth.Auth`. (#13795)Quentin Gliech2022-09-141-9/+0
| | | | | Clean-up from b19060a29b4f73897847db2aba5d03ec819086e0 (#13094) and 73af10f419346a5f2d70131ac1ed8e69942edca0 (#13093) which removed all callers.
* Remove incorrect migration file from `state` logical DB (#13788)David Robertson2022-09-141-37/+0
| | | | | | | | | | | | | * Remove incorrect migration file from `state` logical DB The table `ex_outlier_stream` is part of the `main` logical DB; it should not have been created in the `state` logical DB. We remove this migration now as a tidy-up. Note: we cannot `DROP TABLE IF EXISTS ex_outlier_stream` in a new migration, because some (most) instances of Synapse host both of these logical DBs on the same DB cluster. * Changelog
* Fix bug in device list caching when remote users leave rooms (#13749)Sean Quah2022-09-143-14/+43
| | | | | | | | | | | | When a remote user leaves the last room shared with the homeserver, we have to mark their device list as unsubscribed, otherwise we would hold on to a stale device list in our cache. Crucially, the device list would remain cached even after the remote user rejoined the room, which could lead to E2EE failures until the next change to the remote user's device list. Fixes #13651. Signed-off-by: Sean Quah <seanq@matrix.org>
* Fix a long-standing spec compliance bug where Synapse would accept a ↵reivilibre2022-09-141-2/+1
| | | | | | | | | trailing slash on the end of `/get_missing_events` federation requests. (#13789) * Don't accept a trailing slash on the end of /get_missing_events * Newsfile Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
* Make sequence `cache_invalidation_stream_seq` begin at `2` (#13766)Mathieu Velten2022-09-132-0/+24
| | | | Signed-off-by: Mathieu Velten <mathieuv@matrix.org> Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Updates to the schema dump script (#13770)David Robertson2022-09-131-0/+4
|
* Add receipts event stream ordering (#13703)Nick Mills-Barrett2022-09-133-1/+94
|
* Remove check current state membership up to date (#13745)Nick Mills-Barrett2022-09-122-155/+99
| | | | | | | * Remove checks for membership column in current_state_events * Add schema script to force through the `current_state_events_membership` background job Contributed by Nick @ Beeper (@fizzadar).
* Check if Rust lib needs rebuilding. (#13759)Erik Johnston2022-09-122-0/+89
| | | This protects against the common mistake of failing to remember to rebuild Rust code after making changes.
* Concurrently collect room unread counts for push badges (#13765)Nick Mills-Barrett2022-09-091-3/+10
| | | | | | | Most of the time this function is heavily cached, but when that isn't the case, fetching the counts room by room slows down push delivery for users with many (thousands of) rooms. Signed off by Nick @ Beeper.
* Tag trace with instance name (#13761)Eric Eastwood2022-09-092-2/+11
| | | | | | | | We tag the Synapse instance name so that it's an easy jumping off point into the logs. Can also be used to filter for an instance that is under load. As suggested by @clokep and @reivilibre in, - https://github.com/matrix-org/synapse/pull/13729#discussion_r964719258 - https://github.com/matrix-org/synapse/pull/13729#discussion_r964733578
* Strip number suffix from instance name to consolidate services that traces ↵Eric Eastwood2022-09-091-1/+12
| | | | | | | | | | | | are spread over (#13729) The problem with many services is that it makes it hard to find which service has the trace you want, see https://github.com/jaegertracing/jaeger-ui/issues/985 Previously, we split traces out into services based on their instance name like `matrix.org client_reader-1`, etc but there are many worker instances of the same `client_reader` so there is a lot to click through. With this PR, all of the traces are just collected under the worker type like `client_reader`, `event_persister` 😇 Note: A Synapse worker instance name is an opaque string with the number convention only being our own thing for the `matrix.org` deployment. But seems pretty sensible to group things this way.
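The grouping described above amounts to dropping a trailing number from the instance name. A minimal sketch, assuming the `name-N` / `name_N` / `nameN` conventions mentioned in the message; the helper name `worker_type` and the exact regular expression are assumptions, not taken from the pull request:

```python
import re

# Assumed convention: an optional "-" or "_" followed by digits at the end.
_NUMBER_SUFFIX = re.compile(r"[_-]?\d+$")


def worker_type(instance_name: str) -> str:
    """Strip a trailing number so that all workers of one type share a name."""
    return _NUMBER_SUFFIX.sub("", instance_name)
```

With this, `client_reader-1` and `client_reader-2` would both be reported under `client_reader`, while a name without a numeric suffix such as `master` is left unchanged.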
* Use an upsert for `receipts_graph`. (#13752)Patrick Cloke2022-09-091-8/+4
| | | | | | Instead of a delete, then insert. This was previously done for `receipts_linearized` in 2dc430d36ef793b38d6d79ec8db4ea60588df2ee (#7607).
* Require SQLite >= 3.27.0 (#13760)David Robertson2022-09-097-207/+105
|
* Re-type hint some collections in `/sync` code as read-only (#13754)Sean Quah2022-09-081-10/+10
| | | | Signed-off-by: Sean Quah <seanq@matrix.org>
* Add timestamp to user's consent (#13741)Dirk Klimpel2022-09-083-1/+22
| | | Co-authored-by: reivilibre <olivier@librepush.net>
* Update docstrings to explain the impact of partial state (#13750)Sean Quah2022-09-081-1/+16
| | | | | | | Update the docstrings for `get_users_in_room` and `get_current_hosts_in_room` to explain the impact of partial state. Signed-off-by: Sean Quah <seanq@matrix.org>
* Avoid raising errors due to malformed IDs in `get_current_hosts_in_room` ↵Sean Quah2022-09-081-1/+4
| | | | | | | | | | (#13748) Handle malformed user IDs with no colons in `get_current_hosts_in_room`. It's not currently possible for a malformed user ID to join a room, so this error would never be hit. Signed-off-by: Sean Quah <seanq@matrix.org>
* Fix error in `is_mine_id` when encountering a malformed ID (#13746)Sean Quah2022-09-081-1/+11
| | | | | | | | | Previously, `is_mine_id` would raise an exception when passed an ID with no colons. Return `False` instead. Fixes #13040. Signed-off-by: Sean Quah <seanq@matrix.org>
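The behaviour change can be pictured with a small standalone function. This is a sketch only: here `server_name` is passed in explicitly as a hypothetical parameter, and the body is an illustration of "return `False` on a malformed ID" rather than the code from the pull request.

```python
def is_mine_id(string: str, server_name: str) -> bool:
    """Return True if `string` is an ID whose domain part is `server_name`.

    IDs without a colon are malformed; report them as "not ours"
    instead of raising an exception.
    """
    localpart_and_domain = string.split(":", 1)
    if len(localpart_and_domain) < 2:
        return False
    return localpart_and_domain[1] == server_name
```

Splitting on the first colon only means a server name that itself contains a port, such as `example.org:8448`, is still compared as a whole.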
* Fix cache metrics not being updated when not using the legacy exposition ↵reivilibre2022-09-083-21/+80
| | | | module. (#13717)
* Fix Prometheus recording rules to not use legacy metric names. (#13718)reivilibre2022-09-083-5/+10
|
* Fix a bug where Synapse fails to start if a signing key file contains an ↵reivilibre2022-09-081-1/+12
| | | | empty line. (#13738)
* Instrument `get_metadata_for_events` for tracing (#13730)Eric Eastwood2022-09-071-0/+2
| | | | When backfilling, `_get_state_ids_after_missing_prev_event` calls [`get_metadata_for_events`](https://github.com/matrix-org/synapse/blob/26bc26586b4b95d63ce7e453e9312469843f796e/synapse/handlers/federation_event.py#L1133). For `#matrix:matrix.org`, it's called with 77k `state_events` which means 77 calls to the database and takes 28 seconds.
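The figure of 77 database calls for 77k events suggests that the IDs are looked up in batches of about 1000. That batch size is an inference from the numbers above, not something the message states. A generic batching helper of this kind could look like the following sketch, where `batch_iter` is an illustrative name:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_iter(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 77,000 event IDs in batches of 1000 give 77 batches, i.e. 77 queries.
num_queries = sum(1 for _ in batch_iter(range(77_000), 1000))
```

Tracing each batched call is what makes the per-query cost visible in a span.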
* A second batch of Pydantic models for rest/client/account.py (#13687)David Robertson2022-09-073-34/+63
|
* Cancel the processing of key query requests when they time out. (#13680)reivilibre2022-09-0715-19/+71
|
* Rename the `EventFormatVersions` enum values so that they line up with room ↵reivilibre2022-09-079-37/+42
| | | | version numbers. (#13706)
* Add Admin API to Fetch Messages Within a Particular Window (#13672)Connor Davis2022-09-073-13/+132
| | | This adds two new admin APIs that allow us to fetch messages from a room within a particular time.
* Remove the unspecced room_id field in the /hierarchy response. (#13506)reivilibre2022-09-061-1/+0
| | | | | | | | | | | This is a re-do of 57d334a13d983406ea452dfa203bbe4837509c4e (#13365), which was backed out in 12abd724974a2311d5311272d26d2f8aa11734a9 (#13501). The `room_id` field represented the parent space for each room and was made redundant by changes in the API shape where the `children_state` is now nested underneath each `room`. The room ID of each child is in the `state_key` field and is still available.
* Actually fix typechecking with latest types-jsonschema (#13724)David Robertson2022-09-061-4/+4
|
* Update Grafana dashboard to not use legacy metric names. (#13714)reivilibre2022-09-062-4/+4
|
* Remove configuration options for direct TCP replication. (#13647)Patrick Cloke2022-09-064-54/+39
| | | Removes the ability to configure legacy direct TCP replication. Workers now require Redis to run.
* Fix typechecking with latest `types-jsonschema` (#13712)David Robertson2022-09-051-4/+4
|
* Share some metrics between the Prometheus exporter and the phone home stats ↵Brendan Abolivier2022-09-054-3/+100
| | | | (#13671)
* Add a schema delta to drop unstable private read receipts. (#13692)Patrick Cloke2022-09-011-0/+19
| | | | Otherwise they'll be leaked due to the filtering code only respecting the stable identifiers for private read receipts.
* Disable calculating unread counts unless the config flag is enabled. (#13694)Patrick Cloke2022-09-012-1/+9
| | | | | | | | This avoids doing work that will never be used (since the resulting unread counts will never be sent in a /sync response). The negative of doing this is that unread counts will be incorrect when the feature is initially enabled.
* Cache `is_partial_state_room` (#13693)Erik Johnston2022-09-011-4/+7
| | | Fixes #13613.
* Add some logging to help track down #13444 (#13679)Erik Johnston2022-09-011-0/+13
|
* Return keys for unwhitelisted servers from `/_matrix/key/v2/query` (#13683)Richard van der Hoff2022-09-011-20/+21
|
* Remove support for unstable private read receipts (#13653)Šimon Brandner2022-09-019-39/+7
| | | Signed-off-by: Šimon Brandner <simon.bra.ag@gmail.com>
* Drop support for calling `/_matrix/client/v3/rooms/{roomId}/invite` without ↵Jacek Kuśnierz2022-08-315-134/+55
| | | | | | | an `id_access_token` (#13241) Fixes #13206 Signed-off-by: Jacek Kusnierz jacek.kusnierz@tum.de
* Remove cached wrap on `_get_joined_users_from_context` method (#13569)Nick Mills-Barrett2022-08-312-85/+39
| | | | | | | The method doesn't actually do any data fetching and the method that does, `_get_joined_profile_from_event_id`, has its own cache. Signed off by Nick @ Beeper (@Fizzadar).
* Generalise the `@cancellable` annotation so it can be used on functions ↵reivilibre2022-08-315-71/+68
| | | | other than just servlet methods. (#13662)
* Fix admin List Room API return type on sqlite (#13509)David Robertson2022-08-311-2/+4
|
* Give the correct next event when the message timestamps are the same - ↵Eric Eastwood2022-08-301-2/+10
| | | | | | | | | MSC3030 (#13658) Discovered while working on https://github.com/matrix-org/synapse/pull/13589 and I had all the messages at the same timestamp in the tests. Part of https://github.com/matrix-org/matrix-spec-proposals/pull/3030 Complement tests: https://github.com/matrix-org/complement/pull/457
* Drop unused column `application_services_state.last_txn` (#13627)Shay2022-08-303-0/+58
|
* Merge branch 'release-v1.66' into developDavid Robertson2022-08-302-32/+127
|\
| * Fix rate limit metrics registering twice and misreporting (#13649)Eric Eastwood2022-08-302-32/+127
| | | | | | | | | | | | | | | | | | | | | | * Fix rate limit metrics registering twice and misreporting Fix https://github.com/matrix-org/synapse/issues/13641 * Fix lints * Add changelog * Document `metrics_name=None`.
* | Fix bug where we wedge media plugins if clients disconnect early (#13660)Erik Johnston2022-08-301-19/+21
| | | | | | | | | | | | | | | | We incorrectly didn't use the returned `Responder` if the client had disconnected, which meant that the resource used by the Responder wasn't correctly released. In particular, this exhausted the thread pools so that *all* requests timed out.
* | Do not wait for background updates to complete to expire URL cache. (#13657)Patrick Cloke2022-08-301-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Media downloaded as part of a URL preview is normally deleted after two days. However, while a background database migration is running, the process is stopped. A long-running database migration can therefore cause the media store to fill up with old preview files. This logic was added in #2697 to make sure that we didn't try to run the expiry without an index on `local_media_repository.created_ts`; the original logic that needs that index was added in #2478 (in `get_url_cache_media_before`, as amended by 93247a424a5068b088567fa98b6990e47608b7cb), and is still present. Given that the background update was added before Synapse v1.0.0, just drop this check and assume the index exists.
* | Speed up inserting `event_push_actions_staging`. (#13634)Patrick Cloke2022-08-301-20/+8
| | | | | | By using `execute_values` instead of `execute_batch`.
* | Fix that user cannot `/forget` rooms after the last member has left (#13546)Dirk Klimpel2022-08-301-2/+5
| |
* | Optimize how we calculate `likely_domains` during backfill (#13575)Eric Eastwood2022-08-304-70/+88
| | Optimize how we calculate `likely_domains` during backfill because I've seen this take 17s in production just to `get_current_state` which is used to `get_domains_from_state` (see case [*2. Loading tons of events* in the `/messages` investigation issue](https://github.com/matrix-org/synapse/issues/13356)).
| |
| | There are 3 ways we currently calculate hosts that are in the room:
| |
| | 1. `get_current_state` -> `get_domains_from_state`
| |    - Used in `backfill` to calculate `likely_domains` and `/timestamp_to_event` because it was cargo-culted from `backfill`
| |    - This one is being eliminated in favor of `get_current_hosts_in_room` in this PR 🕳
| | 1. `get_current_hosts_in_room`
| |    - Used for other federation things like sending read receipts and typing indicators
| | 1. `get_hosts_in_room_at_events`
| |    - Used when pushing out events over federation to other servers in the `_process_event_queue_loop`
| |
| | Fix https://github.com/matrix-org/synapse/issues/13626
| | Part of https://github.com/matrix-org/synapse/issues/13356
| | Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.2tvwz3yhcafh)
| |
| | ### Query performance
| |
| | #### Before
| |
| | The query from `get_current_state` sucks just because we have to get all 80k events. And we see almost the exact same performance locally trying to get all of these events (16s vs 17s):
| |
| | ```
| | synapse=# SELECT type, state_key, event_id FROM current_state_events WHERE room_id = '!OGEhHVWSdvArJzumhm:matrix.org';
| | Time: 16035.612 ms (00:16.036)
| |
| | synapse=# SELECT type, state_key, event_id FROM current_state_events WHERE room_id = '!OGEhHVWSdvArJzumhm:matrix.org';
| | Time: 4243.237 ms (00:04.243)
| | ```
| |
| | But what about `get_current_hosts_in_room`: When there are 8M rows in the `current_state_events` table, the previous query in `get_current_hosts_in_room` took 13s from complete freshness (when the events were first added). But takes 930ms after a Postgres restart or 390ms if running back to back to back.
| |
| | ```sh
| | $ psql synapse
| | synapse=# \timing on
| | synapse=# SELECT COUNT(DISTINCT substring(state_key FROM '@[^:]*:(.*)$'))
| | FROM current_state_events
| | WHERE
| |     type = 'm.room.member'
| |     AND membership = 'join'
| |     AND room_id = '!OGEhHVWSdvArJzumhm:matrix.org';
| |  count
| | -------
| |   4130
| | (1 row)
| |
| | Time: 13181.598 ms (00:13.182)
| |
| | synapse=# SELECT COUNT(*) from current_state_events where room_id = '!OGEhHVWSdvArJzumhm:matrix.org';
| |  count
| | -------
| |  80814
| |
| | synapse=# SELECT COUNT(*) from current_state_events;
| |   count
| | ---------
| |  8162847
| |
| | synapse=# SELECT pg_size_pretty( pg_total_relation_size('current_state_events') );
| |  pg_size_pretty
| | ----------------
| |  4702 MB
| | ```
| |
| | #### After
| |
| | I'm not sure how long it takes from complete freshness as I only really get that opportunity once (maybe restarting computer but that's cumbersome) and it's not really relevant to normal operating times. Maybe you get closer to the fresh times the more access variability there is so that Postgres caches aren't as exact. Update: The longest I've seen this run for is 6.4s and 4.5s after a computer restart.
| |
| | After a Postgres restart, it takes 330ms and running back to back takes 260ms.
| |
| | ```sh
| | $ psql synapse
| | synapse=# \timing on
| | Timing is on.
| | synapse=# SELECT
| |     substring(c.state_key FROM '@[^:]*:(.*)$') as host
| | FROM current_state_events c
| | /* Get the depth of the event from the events table */
| | INNER JOIN events AS e USING (event_id)
| | WHERE
| |     c.type = 'm.room.member'
| |     AND c.membership = 'join'
| |     AND c.room_id = '!OGEhHVWSdvArJzumhm:matrix.org'
| | GROUP BY host
| | ORDER BY min(e.depth) ASC;
| | Time: 333.800 ms
| | ```
| |
| | #### Going further
| |
| | To improve things further we could add a `limit` parameter to `get_current_hosts_in_room`. Realistically, we don't need 4k domains to choose from because there is no way we're going to query that many before we a) probably get an answer or b) we give up.
| |
| | Another thing we can do is optimize the query to use an index skip scan:
| |
| | - https://wiki.postgresql.org/wiki/Loose_indexscan
| | - Index Skip Scan, https://commitfest.postgresql.org/37/1741/
| | - https://www.timescale.com/blog/how-we-made-distinct-queries-up-to-8000x-faster-on-postgresql/
* | Generate missing configuration files at startup (#13615)Richard van der Hoff2022-08-261-11/+48
| | | | | | | | | | | | | | | | If things like the signing key file are missing, let's just try to generate them on startup. Again, this is useful for k8s-like deployments where we just want to generate keys on the first run.
* | Move the execution of the retention purge_jobs to the main worker (#13632)Brad Murray2022-08-261-4/+2
| | | | | | | | | | Fixes #9927 Signed-off-by: Brad Murray brad@beeper.com
* | Support `registration_shared_secret` in a file (#13614)Richard van der Hoff2022-08-252-5/+73
| | | | | | | | A new `registration_shared_secret_path` option. This is kinda handy for k8s deployments and things.
* | register_new_matrix_user: read server url from config (#13616)Richard van der Hoff2022-08-251-6/+51
| | | | | | | | Fixes https://github.com/matrix-org/synapse/issues/3672: `https://localhost:8448` is virtually never right.
* | Comment about a better future where we can get the state diff between two ↵Eric Eastwood2022-08-241-0/+8
| | | | | | | | | | | | | | | | | | | | events (#13586) Split off from https://github.com/matrix-org/synapse/pull/13561 Part of https://github.com/matrix-org/synapse/issues/13356 Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.2tvwz3yhcafh)
* | Rename `event_map` to `unpersisted_events` (#13603)David Robertson2022-08-241-32/+37
| |
* | Update `get_users_in_room` mis-use to get hosts with dedicated ↵Eric Eastwood2022-08-244-12/+18
| | | | | | | | | | `get_current_hosts_in_room` (#13605) See https://github.com/matrix-org/synapse/pull/13575#discussion_r953023755
* | Directly lookup local membership instead of getting all members in a room ↵Eric Eastwood2022-08-246-11/+53
| | | | | | | | | | first (`get_users_in_room` mis-use) (#13608) See https://github.com/matrix-org/synapse/pull/13575#discussion_r953023755
* | When loading current ids, sort by `stream_id` to avoid incorrect overwrite ↵Eric Eastwood2022-08-241-2/+11
| | and avoid errors caused by sorting alphabetical instance name which can be `null` (#13585)
| |
| | When loading current ids, sort by stream ID so that we don't overwrite the `current_position` of an instance to a lower stream ID than we're actually at ([discussion](https://github.com/matrix-org/synapse/pull/13585#discussion_r951795379)). Previously, it sorted alphabetically by instance name which can be `null` and throw errors but more importantly, accomplishes nothing.
| |
| | Fixes the following startup error which is why I started looking into this area:
| |
| | ```
| | $ poetry run synapse_homeserver --config-path homeserver.yaml
| | ****************************************************************
| | Error during initialisation:
| | '<' not supported between instances of 'NoneType' and 'str'
| | There may be more information in the logs.
| | ****************************************************************
| | ```
| |
| | Somehow my database ended up looking like the following, notice the `instance_name` is `null` in the db, and we can't sort `NoneType` things. Another question is why do we see the `instance_name` as `null` sometimes instead of `master` in monolith mode?
| |
| | ```
| | $ psql synapse
| | synapse=# SELECT * FROM stream_positions;
| |    stream_name   | instance_name | stream_id
| | -----------------+---------------+-----------
| |  account_data    | master        |      1242
| |  events          | master        |      1787
| |  to_device       | master        |        58
| |  presence_stream | master        |    485638
| |  receipts        | master        |       341
| |  backfill        | master        |   -139106
| | (6 rows)
| |
| | synapse=# SELECT instance_name, stream_id FROM receipts_linearized;
| |  instance_name | stream_id
| | ---------------+-----------
| |                |       211
| |                |         3
| |                |         4
| |                |       212
| |                |       213
| |                |       224
| |                |       228
| |                |       164
| |                |       313
| |                |       253
| |                |        38
| |                |       321
| |                |       324
| |                |       189
| |                |       192
| |                |       193
| |                |       194
| |                |       195
| |                |       197
| |                |       198
| |                |       275
| |                |        79
| |                |       339
| |                |       340
| |                |        82
| |                |       341
| |                |        84
| |                |        85
| |                |        91
| |                |       119
| | ```
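The failure mode and the fix can be reproduced in a few lines of plain Python. This is a simplified sketch: `load_current_positions` and the sample rows are hypothetical and only mimic the shape of the `(instance_name, stream_id)` rows shown above.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[Optional[str], int]

# Sample rows in the shape (instance_name, stream_id); one name is NULL/None.
rows: List[Row] = [("master", 1787), (None, 341), ("master", 58)]


def load_current_positions(rows: List[Row]) -> Dict[Optional[str], int]:
    """Record the highest stream ID seen for each instance.

    Sorting by stream ID means a later (higher) row always overwrites an
    earlier (lower) one, and None instance names are never compared.
    """
    positions: Dict[Optional[str], int] = {}
    for instance_name, stream_id in sorted(rows, key=lambda row: row[1]):
        positions[instance_name] = stream_id
    return positions


positions = load_current_positions(rows)

# Sorting the raw tuples compares instance names first, which fails
# as soon as a None has to be ordered against a str.
try:
    sorted(rows)
    naive_sort_failed = False
except TypeError:
    naive_sort_failed = True
```

Sorting on the numeric column sidesteps the `NoneType` comparison entirely and leaves each instance at its highest recorded position.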
* | Use dedicated `get_local_users_in_room` to find local users when calculating ↵Eric Eastwood2022-08-241-6/+3
| | | | | | | | | | | | | | `join_authorised_via_users_server` of a `/make_join` request (#13606) Use dedicated `get_local_users_in_room` to find local users when calculating `join_authorised_via_users_server` ("the authorising user for joining a restricted room") of a `/make_join` request. Found while working on https://github.com/matrix-org/synapse/pull/13575#discussion_r953023755 but it's not related.
* | Add experimental configuration option to allow disabling legacy Prometheus ↵reivilibre2022-08-247-21/+113
| | | | | | | | | | metric names. (#13540) Co-authored-by: David Robertson <davidr@element.io>
* | Rewrite get push actions queries (#13597)Nick Mills-Barrett2022-08-241-160/+68
| |
* | Faster Room Joins: fix `/make_knock` blocking indefinitely when the room in ↵reivilibre2022-08-241-0/+11
| | | | | | | | | | question is a partial-stated room. (#13583) Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* | Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child ↵Eric Eastwood2022-08-234-3/+46
| | | | | | | | | | | | | | | | | | concurrent calls (#13588) Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls because I've seen `_check_sigs_and_hash_and_fetch` take [10.41s to process 100 events](https://github.com/matrix-org/synapse/issues/13587) Fix https://github.com/matrix-org/synapse/issues/13587 Part of https://github.com/matrix-org/synapse/issues/13356
* | Speed up `@cachedList` (#13591)Erik Johnston2022-08-233-141/+297
| | | | | | | | | | | | | | | | | | This speeds things up by ~2x. The vast majority of the time is now spent in `LruCache` moving things around the linked lists. We do this via two things: 1. Don't create a deferred per-key during bulk set operations in `DeferredCache`. Instead, only create them if a subsequent caller asks for the key. 2. Add a bulk lookup API to `DeferredCache` rather than use a loop.
* | Fix regression caused by #13573 (#13600)Erik Johnston2022-08-231-4/+6
| | | | | | Broke in #13573.
* | Merge tag 'v1.66.0rc1' into developDavid Robertson2022-08-238-243/+71
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Synapse 1.66.0rc1 (2022-08-23) ============================== This release removes the ability for homeservers to delegate email ownership verification and password reset confirmation to identity servers. This removal was originally planned for Synapse 1.64, but was later deferred until now. See the [upgrade notes](https://matrix-org.github.io/synapse/v1.66/upgrade.html#upgrading-to-v1660) for more details. Features -------- - Improve validation of request bodies for the following client-server API endpoints: [`/account/password`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountpassword), [`/account/password/email/requestToken`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountpasswordemailrequesttoken), [`/account/deactivate`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3accountdeactivate) and [`/account/3pid/email/requestToken`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3account3pidemailrequesttoken). ([\#13188](https://github.com/matrix-org/synapse/issues/13188), [\#13563](https://github.com/matrix-org/synapse/issues/13563)) - Add forgotten status to [Room Details Admin API](https://matrix-org.github.io/synapse/latest/admin_api/rooms.html#room-details-api). ([\#13503](https://github.com/matrix-org/synapse/issues/13503)) - Add an experimental implementation for [MSC3852 (Expose user agents on `Device`)](https://github.com/matrix-org/matrix-spec-proposals/pull/3852). ([\#13549](https://github.com/matrix-org/synapse/issues/13549)) - Add `org.matrix.msc2716v4` experimental room version with updated content fields. 
Part of [MSC2716 (Importing history)](https://github.com/matrix-org/matrix-spec-proposals/pull/2716). ([\#13551](https://github.com/matrix-org/synapse/issues/13551)) - Add support for compression to federation responses. ([\#13537](https://github.com/matrix-org/synapse/issues/13537)) - Improve performance of sending messages in rooms with thousands of local users. ([\#13522](https://github.com/matrix-org/synapse/issues/13522), [\#13547](https://github.com/matrix-org/synapse/issues/13547)) Bugfixes -------- - Faster room joins: make `/joined_members` block whilst the room is partial stated. ([\#13514](https://github.com/matrix-org/synapse/issues/13514)) - Fix a bug introduced in Synapse 1.21.0 where the [`/event_reports` Admin API](https://matrix-org.github.io/synapse/develop/admin_api/event_reports.html) could return a total count which was larger than the number of results you can actually query for. ([\#13525](https://github.com/matrix-org/synapse/issues/13525)) - Fix a bug introduced in Synapse 1.52.0 where sending server notices fails if `max_avatar_size` or `allowed_avatar_mimetypes` is set and not `system_mxid_avatar_url`. ([\#13566](https://github.com/matrix-org/synapse/issues/13566)) - Fix a bug where the `opentracing.force_tracing_for_users` config option would not apply to [`/sendToDevice`](https://spec.matrix.org/v1.3/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid) and [`/keys/upload`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3keysupload) requests. ([\#13574](https://github.com/matrix-org/synapse/issues/13574)) Improved Documentation ---------------------- - Add `openssl` example for generating registration HMAC digest. ([\#13472](https://github.com/matrix-org/synapse/issues/13472)) - Tidy up Synapse's README. ([\#13491](https://github.com/matrix-org/synapse/issues/13491)) - Document that event purging related to the `redaction_retention_period` config option is executed only every 5 minutes. 
([\#13492](https://github.com/matrix-org/synapse/issues/13492)) - Add a warning to retention documentation regarding the possibility of database corruption. ([\#13497](https://github.com/matrix-org/synapse/issues/13497)) - Document that the `DOCKER_BUILDKIT=1` flag is needed to build the docker image. ([\#13515](https://github.com/matrix-org/synapse/issues/13515)) - Add missing links in `user_consent` section of configuration manual. ([\#13536](https://github.com/matrix-org/synapse/issues/13536)) - Fix the doc and some warnings that were referring to the nonexistent `custom_templates_directory` setting (instead of `custom_template_directory`). ([\#13538](https://github.com/matrix-org/synapse/issues/13538)) Deprecations and Removals ------------------------- - Remove the ability for homeservers to delegate email ownership verification and password reset confirmation to identity servers. See [upgrade notes](https://matrix-org.github.io/synapse/v1.66/upgrade.html#upgrading-to-v1660) for more details. Internal Changes ---------------- - Update the rejected state of events during de-partial-stating. ([\#13459](https://github.com/matrix-org/synapse/issues/13459)) - Avoid blocking lazy-loading `/sync`s during partial joins due to remote memberships. Pull remote memberships from auth events instead of the room state. ([\#13477](https://github.com/matrix-org/synapse/issues/13477)) - Refuse to start when faster joins is enabled on a deployment with workers, since worker configurations are not currently supported. ([\#13531](https://github.com/matrix-org/synapse/issues/13531)) - Allow use of both `@trace` and `@tag_args` stacked on the same function. ([\#13453](https://github.com/matrix-org/synapse/issues/13453)) - Instrument the federation/backfill part of `/messages` for understandable traces in Jaeger. ([\#13489](https://github.com/matrix-org/synapse/issues/13489)) - Instrument `FederationStateIdsServlet` (`/state_ids`) for understandable traces in Jaeger. 
([\#13499](https://github.com/matrix-org/synapse/issues/13499), [\#13554](https://github.com/matrix-org/synapse/issues/13554)) - Track HTTP response times over 10 seconds from `/messages` (`synapse_room_message_list_rest_servlet_response_time_seconds`). ([\#13533](https://github.com/matrix-org/synapse/issues/13533)) - Add metrics to track how the rate limiter is affecting requests (sleep/reject). ([\#13534](https://github.com/matrix-org/synapse/issues/13534), [\#13541](https://github.com/matrix-org/synapse/issues/13541)) - Add metrics to time how long it takes us to do backfill processing (`synapse_federation_backfill_processing_before_time_seconds`, `synapse_federation_backfill_processing_after_time_seconds`). ([\#13535](https://github.com/matrix-org/synapse/issues/13535), [\#13584](https://github.com/matrix-org/synapse/issues/13584)) - Add metrics to track rate limiter queue timing (`synapse_rate_limit_queue_wait_time_seconds`). ([\#13544](https://github.com/matrix-org/synapse/issues/13544)) - Update metrics to track `/messages` response time by room size. ([\#13545](https://github.com/matrix-org/synapse/issues/13545)) - Refactor methods in `synapse.api.auth.Auth` to use `Requester` objects everywhere instead of user IDs. ([\#13024](https://github.com/matrix-org/synapse/issues/13024)) - Clean-up tests for notifications. ([\#13471](https://github.com/matrix-org/synapse/issues/13471)) - Add some miscellaneous comments to document sync, especially around `compute_state_delta`. ([\#13474](https://github.com/matrix-org/synapse/issues/13474)) - Use literals in place of `HTTPStatus` constants in tests. ([\#13479](https://github.com/matrix-org/synapse/issues/13479), [\#13488](https://github.com/matrix-org/synapse/issues/13488)) - Add comments about how event push actions are rotated. ([\#13485](https://github.com/matrix-org/synapse/issues/13485)) - Modify HTML template content to better support mobile devices' screen sizes. 
([\#13493](https://github.com/matrix-org/synapse/issues/13493)) - Add a linter script which will reject non-strict types in Pydantic models. ([\#13502](https://github.com/matrix-org/synapse/issues/13502)) - Reduce the number of tests using legacy TCP replication. ([\#13543](https://github.com/matrix-org/synapse/issues/13543)) - Allow specifying additional request fields when using the `HomeServerTestCase.login` helper method. ([\#13549](https://github.com/matrix-org/synapse/issues/13549)) - Make `HomeServerTestCase` load any configured homeserver modules automatically. ([\#13558](https://github.com/matrix-org/synapse/issues/13558))
| * Drop support for delegating email validation, round 2 (#13596)David Robertson2022-08-238-243/+71
| |
* | Speed up fetching large numbers of push rules (#13592)Erik Johnston2022-08-233-9/+1
| |
* | Cache user IDs instead of profile objects (#13573)Nick Mills-Barrett2022-08-234-54/+56
|/ | | The profile objects are never used and increase cache size significantly.
* Fix that sending server notices fail if avatar is `None` (#13566)Dirk Klimpel2022-08-231-1/+1
| | | Introduced in #11846.
* Fix Prometheus metrics being negative (mixed up start/end) (#13584)Eric Eastwood2022-08-233-2/+21
| | | | | | | Fix: - https://github.com/matrix-org/synapse/pull/13535#discussion_r949582508 - https://github.com/matrix-org/synapse/pull/13533#discussion_r949577244
* `synapse.api.auth.Auth` cleanup: make permission-related methods use ↵Quentin Gliech2022-08-2221-199/+185
| | | | | | | | | `Requester` instead of the `UserID` (#13024) Part of #13019 This changes all the permission-related methods to rely on the Requester instead of the UserID. This is a first step towards enabling scoped access tokens at some point, since I expect the Requester to have scope-related information in it. It also changes methods which figure out the user/device/appservice out of the access token to return a Requester instead of something else. This avoids having store-related objects in the method signatures.
* Remove redundant opentracing spans for `/sendToDevice` and `/keys/upload` ↵Andrew Morgan2022-08-222-4/+2
| | | | (#13574)
* MSC2716v4 room version - remove namespace from MSC2716 event content fields ↵Eric Eastwood2022-08-194-24/+24
| | | | | | | | (#13551) Complement PR: https://github.com/matrix-org/complement/pull/450 As suggested in https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941444525
* Implement MSC3852: Expose `last_seen_user_agent` to users for their own ↵Andrew Morgan2022-08-193-1/+38
| | | | devices; also expose to Admin API (#13549)
* Fix validation problem that occurs when a user tries to deactivate their ↵reivilibre2022-08-191-3/+3
| | | | account or change their password. (#13563)
* Add metrics to track `/messages` response time by room size (#13545)Eric Eastwood2022-08-181-2/+53
| | | | | Follow-up to https://github.com/matrix-org/synapse/pull/13533 Part of https://github.com/matrix-org/synapse/issues/13356
* Fix incorrect juggling of logging contexts in `_PerHostRatelimiter` (#13554)Sean Quah2022-08-181-10/+7
| | | | | | Signed-off-by: Sean Quah <seanq@matrix.org> Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Track number of hosts affected by the rate limiter (#13541)Eric Eastwood2022-08-181-4/+39
| | | | | | | Track number of hosts affected by the rate limiter so we can differentiate one really noisy homeserver from a general ratelimit tuning problem across the federation. Follow-up to https://github.com/matrix-org/synapse/pull/13534 Part of https://github.com/matrix-org/synapse/issues/13356
* Add support for compression to federation responses (#13537)Ayush Anand2022-08-181-1/+4
| | | | | | Closes #13415. Signed-off-by: Ayush Anand <iamayushanand@gmail.com>
* Avoid blocking lazy-loading `/sync`s during partial joins (#13477)Sean Quah2022-08-182-34/+243
| | | | | | | | | | | | | | | | | Use a state filter or accept partial state in a few places where we request state, to avoid blocking. To make lazy-loading `/sync`s work, we need to provide the memberships of event senders, which are not guaranteed to be in the room state. Instead we dig through auth events for memberships to present to clients. The auth events of an event are guaranteed to contain a passable membership event, otherwise the event would have been rejected. Note that this only covers the common code paths encountered during testing. There has been no exhaustive checking of all sync code paths. Fixes #13146. Signed-off-by: Sean Quah <seanq@matrix.org>
* Add metrics to track how the rate limiter is affecting requests ↵Eric Eastwood2022-08-171-8/+29
| | | | | | | (sleep/reject) (#13534) Related to https://github.com/matrix-org/synapse/pull/13499 Part of https://github.com/matrix-org/synapse/issues/13356
* Fix a bug in the `/event_reports` Admin API which meant that the total count ↵reivilibre2022-08-171-0/+6
| | | | | could be larger than the number of results you can actually query for. (#13525) Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* Fix breaking event sending due to bad push rule (#13547)Erik Johnston2022-08-171-1/+12
| | | | | | | Broken by #13522. It looks like we have some rules in the DB with a priority class less than 0 that don't override the base rules. Previously these were just dropped, but #13522 made that a hard error.
* Fix a typo in docs and in some warnings (#13538)Antonin Loubiere2022-08-173-3/+3
|
* Add forgotten status to Room Details API (#13503)Dirk Klimpel2022-08-172-0/+25
|
* Add metrics to track rate limiter queue timing (#13544)Eric Eastwood2022-08-171-0/+30
|
* Time how long it takes us to do backfill processing (#13535)Eric Eastwood2022-08-172-16/+89
|
* Add specific metric to time long-running `/messages` requests (#13533)Eric Eastwood2022-08-171-0/+32
|
* Instrument the federation/backfill part of `/messages` (#13489)Eric Eastwood2022-08-1610-33/+219
| | | | | | | | | Instrument the federation/backfill part of `/messages` so it's easier to follow what's going on in Jaeger when viewing a trace. Split out from https://github.com/matrix-org/synapse/pull/13440 Follow-up from https://github.com/matrix-org/synapse/pull/13368 Part of https://github.com/matrix-org/synapse/issues/13356
* Refuse to start when `faster_joins` is enabled on a worker deployment (#13531)Sean Quah2022-08-161-0/+7
| | | | | | Synapse does not currently support faster room joins on deployments with workers. Signed-off-by: Sean Quah <seanq@matrix.org>
* Faster room joins: make `/joined_members` block whilst the room is partial ↵reivilibre2022-08-163-1/+21
| | | | stated. (#13514)
* Make push rules use proper structures. (#13522)Erik Johnston2022-08-166-317/+476
|   This improves load times for push rules:
|
|   | Version              | Time per user | Time for 1k users |
|   | -------------------- | ------------- | ----------------- |
|   | Before               | 138 µs        | 138 ms            |
|   | Now (with custom)    | 2.11 µs       | 2.11 ms           |
|   | Now (without custom) | 49.7 ns       | 0.05 ms           |
|
|   This therefore has a large impact on send times for rooms with large numbers of local users in the room.
* Use Pydantic to systematically validate a first batch of endpoints in ↵David Robertson2022-08-154-85/+180
| | | | `synapse.rest.client.account`. (#13188)
* Instrument `FederationStateIdsServlet` - `/state_ids` (#13499)Eric Eastwood2022-08-154-2/+20
| | | Instrument FederationStateIdsServlet - `/state_ids` so it's easier to follow what's going on in Jaeger when viewing a trace.
* Revert "Update locked versions of mypy and mypy-zope (#13521)"David Robertson2022-08-154-19/+31
| | | | | | | | This reverts commit f383b9b3eceaa082d5ae690550fe41460b711779. Other PRs were seeing mypy failures that looked to be related to mypy-zope. Confusingly, we didn't see this on #13521. Revert this for now and investigate later.
* Clarifications for event push action processing. (#13485)Patrick Cloke2022-08-152-21/+34
| | | | | | | | * Clarifies comments. * Fixes an erroneous comment (about return type) added in #13455 (ec24813220f9d54108924dc04aecd24555277b99). * Clarifies the name of a variable. * Simplifies logic of pulling out the latest join for the requesting user.
* Update locked versions of mypy and mypy-zope (#13521)David Robertson2022-08-154-31/+19
|
* Add viewport directive to HTML templates to optimise for mobile (#13493)Germain2022-08-1130-47/+139
|
* Merge branch 'release-v1.65' into developOlivier Wilkinson (reivilibre)2022-08-111-0/+1
|\
| * Revert 'Remove the unspecced field in the response. (#13365)' to give more ↵reivilibre2022-08-111-0/+1
| | | | | | | | time for clients to update. (#13501)
* | Update the rejected state of events during resync (#13459)Richard van der Hoff2022-08-113-9/+65
| | | | | | | | | | Events can be un-rejected or newly-rejected during resync, so ensure we update the database and caches when that happens.
* | Add some miscellaneous comments around sync (#13474)Sean Quah2022-08-102-40/+80
| | | | | | | | | | | | | | | | Add some miscellaneous comments to document sync, especially around `compute_state_delta`. Signed-off-by: Sean Quah <seanq@matrix.org> Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* | Allow use of both `@trace` and `@tag_args` stacked on the same function (#13453)Eric Eastwood2022-08-091-56/+102
|/ | | | | | | | | | | | | ```py @trace @tag_args async def get_oldest_event_ids_with_depth_in_room(...) ... ``` Before this PR, you would see a warning in the logs and the span was not exported: ``` 2022-08-03 19:11:59,383 - synapse.logging.opentracing - 835 - ERROR - GET-0 - @trace may not have wrapped EventFederationWorkerStore.get_oldest_event_ids_with_depth_in_room correctly! The function is not async but returned a coroutine. ```
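The stacking problem described above can be reproduced with toy decorators. The sketch below assumes nothing about Synapse's real `@trace` and `@tag_args`: `_make_decorator`, `toy_trace`, `toy_tag_args` and `get_room_depth` are hypothetical. It only shows the general technique of giving an async function an async wrapper, so that an outer decorator still sees a coroutine function.

```python
import asyncio
import functools
import inspect
from typing import Any, Callable, List

calls: List[str] = []  # records which toy decorators ran, in order


def _make_decorator(label: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Build a hypothetical decorator that preserves the async-ness of `func`."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
                calls.append(label)
                return await func(*args, **kwargs)

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
            calls.append(label)
            return func(*args, **kwargs)

        return sync_wrapper

    return decorator


toy_trace = _make_decorator("trace")
toy_tag_args = _make_decorator("tag_args")


@toy_trace
@toy_tag_args
async def get_room_depth(room_id: str) -> int:
    return len(room_id)


result = asyncio.run(get_room_depth("!abc:hs1"))
print(result, calls)
```

If the inner decorator returned a plain `def` wrapper around the async function, the outer one would see a non-async function that returns a coroutine, which is the situation the warning in the log describes.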
* Correct a misnamed argument in state res v2 (#13467)David Robertson2022-08-081-6/+6
| | | | | | | | | | | In state res v2, we apply two passes of iterative auth checks. The first pass replays power events and events in their auth chains, but only those belonging to the full conflicted set. The source code as written suggests that we want only those belonging to the auth difference (which is a smaller set of events). At runtime we were doing the correct thing anyway, because the only callsite of `_reverse_topological_power_sort` passes in the `full_conflicted_set`. So this really is just a rename.
* Support stable identifiers for MSC2285: private read receipts. (#13273)Šimon Brandner2022-08-0510-42/+126
| | | | | This adds support for the stable identifiers of MSC2285 while continuing to support the unstable identifiers behind the configuration flag. These will be removed in a future version.
* Update module API "update room membership" method to allow for remote joins ↵Matt C2022-08-051-4/+4
| | | | | | (#13441) Co-authored-by: MattC <buffless-matt@users.noreply.github.com> Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* Add comments about how event push actions are stored. (#13445)Erik Johnston2022-08-041-0/+61
|
* Fix `@tag_args` being off-by-one (ahead) (#13452)Eric Eastwood2022-08-041-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix @tag_args being off-by-one (ahead) Example: ``` argspec.args=[ 'self', 'room_id' ] args=( <synapse.storage.databases.main.DataStore object at 0x10d0b8d00>, '!HBehERstyQBxyJDLfR:my.synapse.server' ) ``` --- The previous logic was also flawed and we can end up in a situation like this: ``` argspec.args=['self', 'dest', 'room_id', 'limit', 'extremities'] args=(<synapse.federation.federation_client.FederationClient object at 0x7f1651c18160>, 'hs1', '!jAEHKIubyIfuLOdfpY:hs1') ``` From this source: ```py async def backfill( self, dest: str, room_id: str, limit: int, extremities: Collection[str] ) -> Optional[List[EventBase]]: ``` And this usage: ```py events = await self._federation_client.backfill( dest, room_id, limit=limit, extremities=extremities ) ``` which would previously cause this error: ``` synapse_main | 2022-08-04 06:13:12,051 - synapse.handlers.federation - 424 - ERROR - GET-5 - Failed to backfill from hs1 because tuple index out of range synapse_main | Traceback (most recent call last): synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/handlers/federation.py", line 392, in try_backfill synapse_main | await self._federation_event_handler.backfill( synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/logging/tracing.py", line 828, in _wrapper synapse_main | return await func(*args, **kwargs) synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/handlers/federation_event.py", line 593, in backfill synapse_main | events = await self._federation_client.backfill( synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/logging/tracing.py", line 828, in _wrapper synapse_main | return await func(*args, **kwargs) synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/logging/tracing.py", line 827, in _wrapper synapse_main | with wrapping_logic(func, *args, **kwargs): 
synapse_main | File "/usr/local/lib/python3.9/contextlib.py", line 119, in __enter__ synapse_main | return next(self.gen) synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/logging/tracing.py", line 922, in _wrapping_logic synapse_main | set_attribute("ARG_" + arg, str(args[i + 1])) # type: ignore[index] synapse_main | IndexError: tuple index out of range ```
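The `IndexError` above comes from indexing `args` by the position of every declared argument, even though some were passed as keywords. Below is a minimal sketch of one way to avoid it; it is not necessarily the fix applied in the PR. `positional_arg_tags` and `FakeClient` are hypothetical, and the first argument is assumed to be `self`.

```python
import inspect
from typing import Any, Callable, Dict, Tuple


def positional_arg_tags(func: Callable[..., Any], args: Tuple[Any, ...]) -> Dict[str, str]:
    """Map positional argument names to stringified values, skipping `self`."""
    names = inspect.getfullargspec(func).args
    tags: Dict[str, str] = {}
    # zip() stops at the shorter sequence, so arguments that were passed as
    # keywords (and are therefore absent from `args`) cannot cause an IndexError.
    for name, value in zip(names[1:], args[1:]):
        tags["ARG_" + name] = str(value)
    return tags


class FakeClient:
    """Hypothetical class mirroring the `backfill` signature quoted above."""

    async def backfill(self, dest, room_id, limit, extremities):
        return None


client = FakeClient()
# `limit` and `extremities` were passed as keywords, so only three values
# (including `self`) arrive positionally.
tags = positional_arg_tags(FakeClient.backfill, (client, "hs1", "!jAEHKIubyIfuLOdfpY:hs1"))
print(tags)
```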
* Improve comments (& avoid a duplicate query) in push actions processing. ↵Patrick Cloke2022-08-041-124/+158
| | | | | | | | | (#13455) * Adds docstrings and inline comments. * Formats SQL queries using triple quoted strings. * Minor formatting changes. * Avoid fetching `event_push_summary_stream_ordering` multiple times in the same transactions.
* Update type of `EventContext.rejected` (#13460)Richard van der Hoff2022-08-042-5/+4
|
* Faster Room Joins: prevent Synapse from answering federated join requests ↵reivilibre2022-08-042-0/+34
| | | | for a room which it has not fully joined yet. (#13416)
* Optimise async get event lookups (#13435)Nick Mills-Barrett2022-08-043-8/+86
| | | | | | Still maintains local in-memory lookup optimisation, but does any external lookup as part of the deferred that prevents duplicate lookups for the same event at once. This makes the assumption that fetching from an external cache is a non-zero load operation.
* Add module API method to create a room (#13429)Matt C2022-08-041-0/+51
| | | | Co-authored-by: MattC <buffless-matt@users.noreply.github.com> Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* Fix rooms not being properly excluded from incremental sync (#13408)Brendan Abolivier2022-08-041-10/+15
|
* Add some tracing spans to give insight into local joins (#13439)Shay2022-08-032-33/+39
|
* Instrument `/messages` for understandable traces in Jaeger (#13368)Eric Eastwood2022-08-0310-1/+31
| | | | | | In Jaeger: - Before: huge list of uncategorized database calls - After: nice and collapsible into units of work
* Return 404 or member list when getting joined_members after leaving (#13374)andrew do2022-08-031-2/+4
| | | | | | Signed-off-by: Andrew Doh <andrewddo@gmail.com> Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com> Co-authored-by: Andrew Morgan <andrewm@element.io> Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* Rename `RateLimitConfig` to `RatelimitSettings` (#13442)Dirk Klimpel2022-08-034-29/+29
|
* Add module API method to resolve a room alias to a room ID (#13428)Matt C2022-08-031-0/+24
| | | | Co-authored-by: MattC <buffless-matt@users.noreply.github.com> Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* Fix error when out of servers to sync partial state with (#13432)Sean Quah2022-08-021-2/+3
| | | | | so that we raise the intended error instead. Signed-off-by: Sean Quah <seanq@matrix.org>
* Faster Room Joins: don't leave a stuck room partial state flag if the join ↵reivilibre2022-08-011-14/+18
| | | | fails. (#13403)
* Fix missing import in `federation_event` handler. (#13431)Patrick Cloke2022-08-011-0/+1
| | | | #13404 removed an import of `Optional` which was still needed because #13413 added more usages.
* Refactor `_resolve_state_at_missing_prevs` to return an `EventContext` (#13404)Sean Quah2022-08-013-82/+56
| | | | | | | | Previously, `_resolve_state_at_missing_prevs` returned the resolved state before an event and a partial state flag. These were unwieldy to carry around and would only ever be used to build an event context. Build the event context directly instead. Signed-off-by: Sean Quah <seanq@matrix.org>
* Faster joins: fix rejected events becoming un-rejected during resync (#13413)Richard van der Hoff2022-08-012-6/+31
| | | | | Make sure that we re-check the auth rules during state resync, otherwise rejected events get un-rejected.
* Merge tag 'v1.64.0rc2' into developRichard van der Hoff2022-07-298-71/+242
|\ | | | | | | | | | | | | Synapse 1.64.0rc2 (2022-07-29) ============================== This RC reintroduces support for `account_threepid_delegates.email`, which was removed in 1.64.0rc1. It remains deprecated and will be removed altogether in a future release. ([\#13406](https://github.com/matrix-org/synapse/issues/13406))
| * Revert "Drop support for delegating email validation (#13192)" (#13406)3nprob2022-07-298-71/+242
| | | | | | | | | | Reverts commit fa71bb18b527d1a3e2629b48640ea67fff2f8c59, and tweaks documentation. Signed-off-by: 3nprob <git@3n.anonaddy.com>
* | Use stable prefixes for MSC3827: filtering of `/publicRooms` by room type ↵Šimon Brandner2022-07-275-8/+5
| | | | | | | | | | | | (#13370) Signed-off-by: Šimon Brandner <simon.bra.ag@gmail.com>
* | Implement MSC3848: Introduce errcodes for specific event sending failures ↵Will Hunt2022-07-279-34/+140
| | | | | | | | | | (#13343) Implements MSC3848
* | Make minor clarifications to the error messages given when we fail to join a ↵reivilibre2022-07-272-2/+12
| | | | | | | | room via any server. (#13160)
* | Fix `get_pdu` asking every remote destination even after it finds an event ↵Eric Eastwood2022-07-271-3/+3
| | | | | | | | (#13346)
* | Copy room serials before handling in `get_new_events_as` (#13392)Nick Mills-Barrett2022-07-261-3/+10
| |
* | Remove the unspecced `room_id` field in the `/hierarchy` response. (#13365)Patrick Cloke2022-07-261-1/+0
| | | | | | | | | | | | | | | | | | The `room_id` field represented the parent space for each room and was made redundant by changes in the API shape where the `children_state` is now nested underneath each `room`. The room ID of each child is in the `state_key` field and is still available.
* | Fix infinite loop in partial-state resync (#13353)Richard van der Hoff2022-07-262-8/+26
| | | | | | | | | | Make sure that we only pull out events from the db once they have no prev-events with partial state.
* | Faster room joins: avoid blocking when pulling events with missing prevs ↵Sean Quah2022-07-264-32/+114
| | | | | | | | | | | | | | | | | | (#13355) Avoid blocking on full state in `_resolve_state_at_missing_prevs` and return a new flag indicating whether the resolved state is partial. Thread that flag around so that it makes it into the event context. Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* | Remove unused argument for get_relations_for_event. (#13383)Patrick Cloke2022-07-262-9/+0
|/
* Disable autocorrect and autocapitalisation when entering username for SSO ↵Doug2022-07-261-1/+1
| | | | | registration. (#13350) When registering a new account via SSO on iOS, the text field becomes pretty annoying as it autocapitalises and autocorrects your input. This PR fixes that (although I have only tested the raw HTML file on the simulator, I'm not sure how to get the complete setup available for testing in the flow).
* Support Implicit TLS for sending emails (#13317)Jan Schär2022-07-252-11/+32
| | | | | | | | | | Previously, TLS could only be used with STARTTLS. Add a new option `force_tls`, where TLS is used from the start. Implicit TLS is recommended over STARTTLS, see https://datatracker.ietf.org/doc/html/rfc8314 Fixes #8046. Signed-off-by: Jan Schär <jan@jschaer.ch>
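The difference between the two modes can be sketched with Python's standard `smtplib`, used here purely as a stand-in (the log does not show how Synapse's mailer implements it). `choose_smtp_transport` and `connect` are hypothetical helpers, and the ports are the conventional ones (465 for implicit TLS, 587 for submission with STARTTLS), not values taken from the commit.

```python
import smtplib
from typing import Tuple, Type


def choose_smtp_transport(force_tls: bool) -> Tuple[Type[smtplib.SMTP], int]:
    """Pick a client class and conventional port for the requested TLS mode."""
    if force_tls:
        # Implicit TLS: the TLS handshake happens before any SMTP traffic.
        return smtplib.SMTP_SSL, 465
    # Otherwise connect in plaintext and upgrade with STARTTLS afterwards.
    return smtplib.SMTP, 587


def connect(host: str, force_tls: bool) -> smtplib.SMTP:
    """Open a connection (not called here, since it needs a real server)."""
    client_cls, port = choose_smtp_transport(force_tls)
    client = client_cls(host, port)
    if not force_tls:
        client.starttls()  # upgrade the plaintext session to TLS
    return client


print(choose_smtp_transport(True)[1], choose_smtp_transport(False)[1])
```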
* Additional fixes for opentracing type hints. (#13362)Patrick Cloke2022-07-251-2/+2
|
* Refactor presence so we can prune user in room caches (#13313)Erik Johnston2022-07-253-91/+108
| | | | | | | | See #10826 and #10786 for context as to why we had to disable pruning on those caches. Now that `get_users_who_share_room_with_user` is called frequently only for presence, we just need to make calls to it less frequent and then we can remove the various levels of caching that is going on.
* Backfill remote event fetched by MSC3030 so we can paginate from it later ↵Eric Eastwood2022-07-222-15/+93
| | | | | | | | | (#13205) Depends on https://github.com/matrix-org/synapse/pull/13320 Complement tests: https://github.com/matrix-org/complement/pull/406 We could use the same method to backfill for `/context` as well in the future, see https://github.com/matrix-org/synapse/issues/3848
* Skip soft fail checks for rooms with partial state (#13354)Sean Quah2022-07-221-0/+10
| | | | | | | | | | | | When a room has the partial state flag, we may not have an accurate `m.room.member` event for event senders in the room's current state, and so cannot perform soft fail checks correctly. Skip the soft fail check entirely in this case. As an alternative, we could block until we have full state, but that would prevent us from receiving incoming events over federation, which is undesirable. Signed-off-by: Sean Quah <seanq@matrix.org>
* Remove old empty/redundant slaved stores. (#13349)Nick Mills-Barrett2022-07-219-166/+36
|
* Make DictionaryCache have better expiry properties (#13292)Erik Johnston2022-07-214-34/+321
|
* Don't hold onto full state in state cache (#13324)Erik Johnston2022-07-211-15/+53
|
* Track DB txn times w/ two counters, not histogram (#13342)David Robertson2022-07-211-3/+5
|
* Add missing types to opentracing. (#13345)Patrick Cloke2022-07-2110-32/+60
| | | After this change `synapse.logging` is fully typed.
* Use cache store remove base slaved (#13329)Nick Mills-Barrett2022-07-2115-114/+38
| | | This comes from two identical definitions in each of the base stores, and means the base slaved store is now empty and can be removed.
* Update `get_pdu` to return the original, pristine `EventBase` (#13320)Eric Eastwood2022-07-203-49/+119
| | | | | | | | | | | | Update `get_pdu` to return the untouched, pristine `EventBase` as it was originally seen over federation (no metadata added). Previously, we returned the same `event` reference that we stored in the cache, which downstream code modified in place (adding metadata such as marking it as an `outlier`), essentially poisoning our cache. Now we always return a copy of the `event` so the original can stay pristine in our cache and be re-used for the next cache call. Split out from https://github.com/matrix-org/synapse/pull/13205 As discussed at: - https://github.com/matrix-org/synapse/pull/13205#discussion_r918365746 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918366125 Related to https://github.com/matrix-org/synapse/issues/12584. This PR doesn't fix that issue because it hits [`get_event`, which returns from the local database before it tries to `get_pdu`](https://github.com/matrix-org/synapse/blob/7864f33e286dec22368dc0b11c06eebb1462a51e/synapse/federation/federation_client.py#L581-L594).
* Validate federation destinations and log an error if server name is invalid. ↵Shay2022-07-201-0/+9
| | | | (#13318)
* Merge remote-tracking branch 'origin/master' into developErik Johnston2022-07-201-0/+7
|\
| * Don't include appservice users when calculating push rules (#13332)Erik Johnston2022-07-201-0/+7
| | | | | | This can cause a lot of extra load on servers with lots of appservice users. Introduced in #13078
* | Fix spurious warning when fetching state after a missing prev event (#13258)Sean Quah2022-07-191-0/+3
| |
* | Add type annotations to `trace` decorator. (#13328)Patrick Cloke2022-07-1911-55/+101
| | | | | | | | Functions that are decorated with `trace` are now properly typed and the type hints for them are fixed.
* | Merge branch 'master' into developBrendan Abolivier2022-07-192-8/+8
|\|
| * Remove 'anonymised' from the phone home stats documentation (#13321)Andrew Morgan2022-07-192-8/+8
| |
* | Reduce memory usage of state group cache (#13323)Erik Johnston2022-07-191-1/+2
| |
* | Rate limit joins per-room (#13276)David Robertson2022-07-198-9/+106
| |
* | Safe async event cache (#13308)Nick Mills-Barrett2022-07-197-21/+101
| | | | | | | | | | | | | | | | Fix race conditions in the async cache invalidation logic, by separating the async & local invalidation calls and ensuring any async call is executed first. Signed off by Nick @ Beeper (@Fizzadar).
* | Increase batch size of `bulk_get_push_rules` and ↵Shay2022-07-182-1/+2
| | | | | | | | `_get_joined_profiles_from_event_ids`. (#13300)
* | Improve performance of query ` _get_subset_users_in_room_with_profiles` (#13299)Shay2022-07-181-1/+1
| |
* | Fix overcounting of pushers when they are replaced (#13296)Sean Quah2022-07-181-11/+16
| | | | | | | | Signed-off-by: Sean Quah <seanq@matrix.org>
* | Revert "Make all `process_replication_rows` methods async (#13304)" (#13312)Erik Johnston2022-07-1813-39/+25
| | | | | | This reverts commit 5d4028f217f178fcd384d5bfddd92225b4e78c51.
* | Don't pull out full state when sending dummy events (#13310)Erik Johnston2022-07-181-7/+1
| |
* | Use READ COMMITTED isolation level when purging rooms (#12942)Nick Mills-Barrett2022-07-181-2/+31
| | | | | | | | | | To close: #10294. Signed off by Nick @ Beeper.
* | Don't pull out the full state when creating an event (#13281)Erik Johnston2022-07-182-2/+9
| |
* | Make all `process_replication_rows` methods async (#13304)Nick Mills-Barrett2022-07-1713-25/+39
| | | | | | | | | | More prep work for asynchronous caching, also makes all process_replication_rows methods consistent (presence handler already is so). Signed off by Nick @ Beeper (@Fizzadar)
* | Provide more info why we don't have any thumbnails to serve (#13038)Eric Eastwood2022-07-152-9/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix https://github.com/matrix-org/synapse/issues/13016 ## New error code and status ### Before Previously, we returned a `404` for `/thumbnail` which isn't even in the spec. ```json { "errcode": "M_NOT_FOUND", "error": "Not found [b'hs1', b'tefQeZhmVxoiBfuFQUKRzJxc']" } ``` ### After What does the spec say? > 400: The request does not make sense to the server, or the server cannot thumbnail the content. For example, the client requested non-integer dimensions or asked for negatively-sized images. > > *-- https://spec.matrix.org/v1.1/client-server-api/#get_matrixmediav3thumbnailservernamemediaid* Now with this PR, we respond with a `400` when we don't have thumbnails to serve and we explain why we might not have any thumbnails. ```json { "errcode": "M_UNKNOWN", "error": "Cannot find any thumbnails for the requested media ([b'example.com', b'12345']). This might mean the media is not a supported_media_format=(image/jpeg, image/jpg, image/webp, image/gif, image/png) or that thumbnailing failed for some other reason. (Dynamic thumbnails are disabled on this server.)", } ``` > Cannot find any thumbnails for the requested media ([b'example.com', b'12345']). This might mean the media is not a supported_media_format=(image/jpeg, image/jpg, image/webp, image/gif, image/png) or that thumbnailing failed for some other reason. (Dynamic thumbnails are disabled on this server.) --- We still respond with a 404 in many other places. But we can iterate on those later and maybe keep some in some specific places after spec updates/clarification: https://github.com/matrix-org/matrix-spec/issues/1122 We can also iterate on the bugs where Synapse doesn't thumbnail when it should in other issues/PRs.
* | Don't pull out the full state when storing state (#13274)Erik Johnston2022-07-153-69/+125
| |
* | Use state before join to determine if we `_should_perform_remote_join` (#13270)David Robertson2022-07-153-24/+34
| | | | | | Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* | Bg update to populate new `events` table columns (#13215)Richard van der Hoff2022-07-152-0/+134
| | | | | | | | | | These columns were added back in Synapse 1.52, and have been populated for new events since then. It's now (beyond) time to back-populate them for existing events.
* | Fix a bug which could lead to incorrect state (#13278)Erik Johnston2022-07-152-6/+16
| | | | | | | | | | There are two fixes here: 1. A long-standing bug where we incorrectly calculated `delta_ids`; and 2. A bug introduced in #13267 where we got current state incorrect.
* | Async get event cache prep (#13242)Nick Mills-Barrett2022-07-157-20/+79
| | | | | | | | | | Some experimental prep work to enable external event caching based on #9379 & #12955. Doesn't actually move the cache at all, just lays the groundwork for async implemented caches. Signed off by Nick @ Beeper (@Fizzadar)
* | Federation Sender & Appservice Pusher Stream Optimisations (#13251)Nick Mills-Barrett2022-07-155-79/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | * Replace `get_new_events_for_appservice` with `get_all_new_events_stream` The functions were near identical and this brings the AS worker closer to the way federation senders work which can allow for multiple workers to handle AS traffic. * Pull received TS alongside events when processing the stream This avoids an extra query -per event- when both federation sender and appservice pusher process events.
* | Rip out auth-event reconciliation code (#12943)Richard van der Hoff2022-07-142-221/+82
| | | | | | | | | | | | | | There is a corner in `_check_event_auth` (long known as "the weird corner") where, if we get an event with auth_events which don't match those we were expecting, we attempt to resolve the difference between our state and the remote's with a state resolution. This isn't specced, and there's general agreement we shouldn't be doing it. However, it turns out that the faster-joins code was relying on it, so we need to introduce something similar (but rather simpler) for that.
* | Don't pull out state in `compute_event_context` for unconflicted state (#13267)Erik Johnston2022-07-145-81/+94
| |
* | Allow rate limiters to passively record actions they cannot limit (#13253)David Robertson2022-07-131-12/+82
| | | | | | Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* | Notifier: accept callbacks to fire on room joins (#13254)David Robertson2022-07-131-0/+18
| |
* | Call the v2 identity service `/3pid/unbind` endpoint, rather than v1. (#13240)Jacek Kuśnierz2022-07-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | * Drop support for v1 unbind Signed-off-by: Jacek Kusnierz <jacek.kusnierz@tum.de> * Add changelog Signed-off-by: Jacek Kusnierz <jacek.kusnierz@tum.de> * Update changelog.d/13240.misc
* | Add support for room version 10 (#13220)Shay2022-07-132-0/+59
| |
* | Optimise room creation event lookups part 2 (#13224)Nick Mills-Barrett2022-07-132-15/+73
| |
* | Reduce duplicate code in receipts servlets. (#13198)Patrick Cloke2022-07-132-44/+32
| |
* | Add prometheus counters for content types other than events (#13175)Brad Murray2022-07-131-0/+14
| |
* | Drop unused tables from groups/communities. (#12967)Patrick Cloke2022-07-133-19/+35
| | | | | | | | These tables have been unused since Synapse v1.61.0, although schema version 72 was added in Synapse v1.62.0.
* | Fix "add user" admin api error when request contains a "msisdn" threepid ↵Thomas Weston2022-07-131-0/+1
| | | | | | | | | | | | (#13263) Co-authored-by: Thomas Weston <thomas.weston@clearspancloud.com> Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
* | Inline URL preview documentation. (#13261)Patrick Cloke2022-07-121-4/+58
| | | | | | Inline URL preview documentation near the implementation.
* | Drop unused table `event_reference_hashes` (#13218)Richard van der Hoff2022-07-121-0/+17
| | | | | | This is unused since Synapse 1.60.0 (#12679). It's time for it to go.
* | Drop support for calling `/_matrix/client/v3/account/3pid/bind` without an ↵Jacek Kuśnierz2022-07-122-26/+10
| | | | | | | | | | | | | | `id_access_token` (#13239) Fixes #13201 Signed-off-by: Jacek Kusnierz jacek.kusnierz@tum.de
* | Drop support for delegating email validation (#13192)Richard van der Hoff2022-07-128-234/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Drop support for delegating email validation Delegating email validation to an IS is insecure (since it allows the owner of the IS to do a password reset on your HS), and has long been deprecated. It will now cause a config error at startup. * Update unit test which checks for email verification Give it an `email` config instead of a threepid delegate * Remove unused method `requestEmailToken` * Simplify config handling for email verification Rather than an enum and a boolean, all we need here is a single bool, which says whether we are or are not doing email verification. * update docs * changelog * upgrade.md: fix typo * update version number this will be in 1.64, not 1.63 * update version number this one too
* | Log the stack when waiting for an entire room to be un-partial stated (#13257)Sean Quah2022-07-121-0/+1
| | | | | | | | The stack is already logged when waiting for an event to be un-partial stated. Log the stack for rooms as well, to aid in debugging.
* | Make the AS login method call `Auth.get_user_by_req` for checking the AS ↵Quentin Gliech2022-07-121-2/+8
| | | | | | | | | | | | | | | | token. (#13094) This gets rid of another usage of get_appservice_by_req, with all the benefits, including correctly tracking the appservice IP and setting the tracing attributes correctly. Signed-off-by: Quentin Gliech <quenting@element.io>
* | expose whether a room is a space in the Admin API (#13208)andrew do2022-07-121-2/+4
|/
* Don't pull out the full state when calculating push actions (#13078)Erik Johnston2022-07-114-341/+160
|
* Reduce event lookups during room creation by passing known event IDs (#13210)Nick Mills-Barrett2022-07-111-2/+16
| | | | | | | | Inspired by the room batch handler, this uses previous event inserts to pre-populate prev events during room creation, reducing the number of queries required to create a room. Signed off by Nick @ Beeper (@Fizzadar)
* Uniformize spam-checker API, part 5: expand other spam-checker callbacks to ↵David Teller2022-07-118-50/+176
| | | | | | return `Tuple[Codes, dict]` (#13044) Signed-off-by: David Teller <davidt@element.io> Co-authored-by: Brendan Abolivier <babolivier@matrix.org>
* Fix to-device messages not being sent to MSC3202-enabled appservices (#13235)Travis Ralston2022-07-111-2/+3
| | | | The field name was simply incorrect, leading to errors.
* Remove delay when rotating event push actions (#13211)Erik Johnston2022-07-111-3/+1
| | | | We want to be as up to date as possible, and sleeping doesn't help here and can mean we fall behind.
* Add a `filter_event_for_clients_with_state` function (#13222)Erik Johnston2022-07-112-138/+399
|
* Fix appservice EDUs failing to send if the EDU doesn't have a room ID (#13236)Travis Ralston2022-07-111-1/+3
| | | | | | | | | | | * Fix appservice EDUs failing to send if the EDU doesn't have a room ID As is in the case of presence. * changelog * linter * fix linter again
* Ensure portdb selects _all_ rows with negative rowids (#13226)David Robertson2022-07-111-1/+4
|
* Fix notification count after a highlighted message (#13223)Erik Johnston2022-07-081-3/+8
| | | | | Fixes #13196. Broken by #13005.
* Fix exception when using MSC3030 to look for remote federated events before ↵Eric Eastwood2022-07-071-1/+5
| | | | | | | | | | | | | | | | room creation (#13197) Complement tests: https://github.com/matrix-org/complement/pull/405 This happens when you have some messages imported before the room is created. Then use MSC3030 to look backwards before the room creation from a remote federated server. The server won't find anything locally, but will ask over federation which will have the remote event. The previous logic would choke on not having the local event assigned. ``` Failed to fetch /timestamp_to_event from hs2 because of exception(UnboundLocalError) local variable 'local_event' referenced before assignment args=("local variable 'local_event' referenced before assignment",) ```
* Faster room joins: fix race in recalculation of current room state (#13151)Sean Quah2022-07-076-55/+211
| | | | | | | | | | | Bounce recalculation of current state to the correct event persister and move recalculation of current state into the event persistence queue, to avoid concurrent updates to a room's current state. Also give recalculation of a room's current state a real stream ordering. Signed-off-by: Sean Quah <seanq@matrix.org>
* Use a single query in `ProfileHandler.get_profile` (#13209)Nick Mills-Barrett2022-07-071-12/+7
|
* Check that `auto_vacuum` is disabled when porting a SQLite database to ↵reivilibre2022-07-071-0/+34
| | | | Postgres, as `VACUUM`s must not be performed between runs of the script. (#13195)
* Make `_get_state_map_for_room` not break when room state events don't ↵David Teller2022-07-071-8/+1
| | | | | contain an event id. (#13174) Method `_get_state_map_for_room` seems to break in presence of some ill-formed events in the database. Reimplementing this method to use `get_current_state`, which is more robust to such events.
* Fix bug where we failed to delete old push actions (#13194)Erik Johnston2022-07-061-2/+4
| | | This happened if we encountered a stream ordering in `event_push_actions` that had more rows than the batch size of the delete: if we don't delete any rows in an iteration, then the next time round we get the exact same stream ordering and get stuck.
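The stuck loop described in the commit above can be modelled with a simplified sketch. This is an assumption-laden illustration, not the Synapse query: `purge_in_batches` is a hypothetical function that works on an in-memory list of stream orderings, and the inclusive bound is one plausible way to guarantee progress.

```python
from typing import List


def purge_in_batches(
    stream_orderings: List[int], batch_size: int, inclusive: bool
) -> int:
    """Delete all rows in batches and return how many batches were needed.

    Hypothetical model: each batch looks up the stream ordering found
    `batch_size` rows in, then deletes everything up to that boundary.
    Raises RuntimeError if a batch deletes nothing (the loop would spin).
    """
    rows = sorted(stream_orderings)
    batches = 0
    while rows:
        batches += 1
        if len(rows) <= batch_size:
            # Final partial batch: everything left fits, so delete it all.
            rows = []
            continue
        boundary = rows[batch_size]
        if inclusive:
            # Inclusive bound: rows AT the boundary are deleted too.
            remaining = [r for r in rows if r > boundary]
        else:
            # Exclusive bound: rows at the boundary survive this batch.
            remaining = [r for r in rows if r >= boundary]
        if len(remaining) == len(rows):
            raise RuntimeError(f"no progress at stream ordering {boundary}")
        rows = remaining
    return batches


# Three rows share stream ordering 5, which exceeds the batch size of 2.
crowded = [5, 5, 5, 6]
inclusive_batches = purge_in_batches(crowded, batch_size=2, inclusive=True)
```

With the exclusive bound and the same input, the boundary equals the smallest stream ordering, nothing is deleted, and the sketch raises instead of looping forever.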
* Handle race between persisting an event and un-partial stating a room (#13100)Sean Quah2022-07-059-74/+233
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever we want to persist an event, we first compute an event context, which includes the state at the event and a flag indicating whether the state is partial. After a lot of processing, we finally try to store the event in the database, which can fail for partial state events when the containing room has been un-partial stated in the meantime. We detect the race as a foreign key constraint failure in the data store layer and turn it into a special `PartialStateConflictError` exception, which makes its way up to the method in which we computed the event context. To make things difficult, the exception needs to cross a replication request: `/fed_send_events` for events coming over federation and `/send_event` for events from clients. We transport the `PartialStateConflictError` as a `409 Conflict` over replication and turn `409`s back into `PartialStateConflictError`s on the worker making the request. All client events go through `EventCreationHandler.handle_new_client_event`, which is called in *a lot* of places. Instead of trying to update all the code which creates client events, we turn the `PartialStateConflictError` into a `429 Too Many Requests` in `EventCreationHandler.handle_new_client_event` and hope that clients take it as a hint to retry their request. On the federation event side, there are 7 places which compute event contexts. 4 of them use outlier event contexts: `FederationEventHandler._auth_and_persist_outliers_inner`, `FederationHandler.do_knock`, `FederationHandler.on_invite_request` and `FederationHandler.do_remotely_reject_invite`. These events won't have the partial state flag, so we do not need to do anything for them. The remaining 3 paths which create events are `FederationEventHandler.process_remote_join`, `FederationEventHandler.on_send_membership_event` and `FederationEventHandler._process_received_pdu`. 
We can't experience the race in `process_remote_join`, unless we're handling an additional join into a partial state room, which currently blocks, so we make no attempt to handle it correctly. `on_send_membership_event` is only called by `FederationServer._on_send_membership_event`, so we catch the `PartialStateConflictError` there and retry just once. `_process_received_pdu` is called by `on_receive_pdu` for incoming events and `_process_pulled_event` for backfill. The latter should never try to persist partial state events, so we ignore it. We catch the `PartialStateConflictError` in `on_receive_pdu` and retry just once. Referring to the graph of code paths in https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648 may make the above make more sense. Signed-off-by: Sean Quah <seanq@matrix.org>
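The round trip described above (exception, to `409 Conflict` over replication, back to an exception, then a single retry) can be sketched as follows. This is a toy model under stated assumptions: the exception class is a local stand-in with the same name as in the commit message, and `serve_replication_request`, `call_replication` and `persist_with_one_retry` are hypothetical helpers, not Synapse functions.

```python
from typing import Callable, Tuple


class PartialStateConflictError(Exception):
    """Local stand-in for the exception named in the commit message."""


def serve_replication_request(persist: Callable[[], str]) -> Tuple[int, str]:
    """Hypothetical replication endpoint: map the exception to a 409."""
    try:
        return 200, persist()
    except PartialStateConflictError:
        return 409, "partial state conflict"


def call_replication(persist: Callable[[], str]) -> str:
    """Hypothetical replication client: map a 409 back to the exception."""
    status, body = serve_replication_request(persist)
    if status == 409:
        raise PartialStateConflictError(body)
    return body


def persist_with_one_retry(persist: Callable[[], str]) -> str:
    """Retry exactly once, as the federation paths above are said to do."""
    try:
        return call_replication(persist)
    except PartialStateConflictError:
        # A real caller would recompute the event context here before retrying.
        return call_replication(persist)


attempts = {"count": 0}


def flaky_persist() -> str:
    # Fails on the first attempt only, imitating the room being
    # un-partial stated between computing the context and persisting.
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise PartialStateConflictError("room was un-partial stated")
    return "persisted"


result = persist_with_one_retry(flaky_persist)
```

A second conflict on the retry would propagate to the caller, which matches the "retry just once" wording above.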
* Type `tests.utils` (#13028)David Robertson2022-07-052-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Cast to postgres types when handling postgres db * Remove unused method * Easy annotations * Annotate create_room * Use `ParamSpec` to annotate looping_call * Annotate `default_config` * Track `now` as a float `time_ms` returns an int like the proper Synapse `Clock` * Introduce a `Timer` dataclass * Introduce a Looper type * Suppress checking of a mock * tests.utils is typed * Changelog * Whoops, import ParamSpec from typing_extensions * ditch the psycopg2 casts
* Use upserts for updating `event_push_summary` (#13153)Erik Johnston2022-07-051-40/+7
|
* Fix application service not being able to join remote federated room without ↵Eric Eastwood2022-07-051-9/+23
| | | | | | | a profile set (#13131) Fix https://github.com/matrix-org/synapse/issues/4778 Complement tests: https://github.com/matrix-org/complement/pull/399
* Merge tag 'v1.62.0rc3' into developAndrew Morgan2022-07-041-2/+7
|\ | | | | | | | | | | | | | | | | | | | | Synapse 1.62.0rc3 (2022-07-04) ============================== Bugfixes -------- - Update the version of the [ldap3 plugin](https://github.com/matrix-org/matrix-synapse-ldap3/) included in the `matrixdotorg/synapse` DockerHub images and the Debian packages hosted on `packages.matrix.org` to 0.2.1. This fixes [a bug](https://github.com/matrix-org/matrix-synapse-ldap3/pull/163) with usernames containing uppercase characters. ([\#13156](https://github.com/matrix-org/synapse/issues/13156)) - Fix a bug introduced in Synapse 1.62.0rc1 affecting unread counts for users on small servers. ([\#13168](https://github.com/matrix-org/synapse/issues/13168))
| * Fix stuck notification counts on small servers (#13168)Erik Johnston2022-07-041-2/+7
| |
* | Extra validation for rest/client/account_data (#13148)David Robertson2022-07-011-2/+17
| | | | | | | | | | | | | | * Extra validation for rest/client/account_data This is a fairly simple endpoint and we did pretty well here. * Changelog
* | `_process_received_pdu`: Improve exception handling (#13145)Richard van der Hoff2022-07-011-7/+6
| | | | | | | | `_check_event_auth` is expected to raise `AuthError`s, so no need to log it again.