path: root/synapse/storage/util (follow)
Commit message / Author / Age / Files / Lines
* Fix bug where a new writer advances their token too quickly (#16473)Erik Johnston2023-10-231-1/+67
| * Fix bug where a new writer advances their token too quickly
|   When starting a new writer (e.g. for persisting events), the `MultiWriterIdGenerator` doesn't have a minimum token for it, as there are no rows matching that new writer in the DB. This results in the first stream ID it acquired being announced as persisted *before* it actually finishes persisting, if another writer gets and persists a subsequent stream ID. This is due to the logic of setting the minimum persisted position to the minimum known position across all writers, and the new writer starts off not being considered.
| * Fix sending out POSITIONs when our token advances without update
|   Broke in #14820
| * For replication HTTP requests, only wait for minimal position
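The failure mode described in this commit can be illustrated with a minimal sketch. It is not Synapse's implementation: `persisted_up_to` and the writer names are hypothetical, and the fix shown (seeding a new writer at the current minimum, as the older #8419 entry in this log describes) is only one plausible way to make the new writer count towards the minimum.

```python
from typing import Dict


def persisted_up_to(positions: Dict[str, int]) -> int:
    """Hypothetical helper: the minimum position over all *known* writers."""
    return min(positions.values())


# Only writer1 has rows in the DB, so it is the only writer we know about.
known = {"writer1": 10}

# Buggy flow: writer2 (new, untracked) allocates stream ID 11 and is still
# persisting it when writer1 allocates and persists stream ID 12.
buggy = dict(known)
buggy["writer1"] = 12
buggy_position = persisted_up_to(buggy)  # 12: wrongly implies 11 is persisted

# Sketch of a fix: track writer2 from the moment it starts, seeded at the
# current minimum, so it holds the minimum back until it reports progress.
fixed = dict(known)
fixed["writer2"] = persisted_up_to(fixed)
fixed["writer1"] = 12
fixed_position = persisted_up_to(fixed)  # 10: 11 is not announced yet
```

Once the hypothetical `writer2` finishes persisting 11 and updates its own entry, the minimum can advance past it.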
* Combine AbstractStreamIdTracker and AbstractStreamIdGenerator. (#15192)Patrick Cloke2023-03-031-12/+5
| | | | | AbstractStreamIdTracker (now) has only a single sub-class: AbstractStreamIdGenerator, combine them to simplify some code and remove any direct references to AbstractStreamIdTracker.
* Add a `get_next_txn` method to `StreamIdGenerator` to match `MultiWriterIdGenerator` (#15191)Andrew Morgan2023-03-022-2/+45
|
* Always notify replication when a stream advances (#14877)Erik Johnston2023-01-201-2/+24
| | | This ensures that all other workers are told about stream updates in a timely manner, without having to remember to manually poke replication.
* Wait for streams to catch up when processing HTTP replication. (#14820)Erik Johnston2023-01-181-15/+19
| | | | This should hopefully mitigate a class of races where data gets out of sync due to an HTTP replication request racing with the replication streams.
* Reintroduce #14376, with bugfix for monoliths (#14468)David Robertson2022-11-161-3/+10
| | | | | | | | | | | | | | | | | | | | | | * Add tests for StreamIdGenerator * Drive-by: annotate all defs * Revert "Revert "Remove slaved id tracker (#14376)" (#14463)" This reverts commit d63814fd736fed5d3d45ff3af5e6d3bfae50c439, which in turn reverted 36097e88c4da51fce6556a58c49bd675f4cf20ab. This restores the latter. * Fix StreamIdGenerator not handling unpersisted IDs Spotted by @erikjohnston. Closes #14456. * Changelog Co-authored-by: Nick Mills-Barrett <nick@fizzadar.com> Co-authored-by: Erik Johnston <erik@matrix.org>
* Revert "Remove slaved id tracker (#14376)" (#14463)Erik Johnston2022-11-161-10/+3
| | | This reverts commit 36097e88c4da51fce6556a58c49bd675f4cf20ab.
* Remove slaved id tracker (#14376)Nick Mills-Barrett2022-11-141-3/+10
| | | | | This matches the multi instance writer ID generator class which can both handle advancing the current token over replication and by calling the database.
* Cancel the processing of key query requests when they time out. (#13680)reivilibre2022-09-071-0/+3
|
* When loading current ids, sort by `stream_id` to avoid incorrect overwrite and avoid errors caused by sorting alphabetical instance name which can be `null` (#13585)Eric Eastwood2022-08-241-2/+11
| When loading current ids, sort by stream ID so that we don't overwrite the `current_position` of an instance to a lower stream ID than we're actually at ([discussion](https://github.com/matrix-org/synapse/pull/13585#discussion_r951795379)). Previously, it sorted alphabetically by instance name, which can be `null` and throw errors but, more importantly, accomplishes nothing.
|
| Fixes the following startup error, which is why I started looking into this area:
|
| ```
| $ poetry run synapse_homeserver --config-path homeserver.yaml
| ****************************************************************
| Error during initialisation:
| '<' not supported between instances of 'NoneType' and 'str'
| There may be more information in the logs.
| ****************************************************************
| ```
|
| Somehow my database ended up looking like the following. Notice that `instance_name` is `null` in the db, and we can't sort `NoneType` things. Another question is why we sometimes see `instance_name` as `null` instead of `master` in monolith mode.
|
| ```
| $ psql synapse
| synapse=# SELECT * FROM stream_positions;
|    stream_name   | instance_name | stream_id
| -----------------+---------------+-----------
|  account_data    | master        |      1242
|  events          | master        |      1787
|  to_device       | master        |        58
|  presence_stream | master        |    485638
|  receipts        | master        |       341
|  backfill        | master        |   -139106
| (6 rows)
|
| synapse=# SELECT instance_name, stream_id FROM receipts_linearized;
|  instance_name | stream_id
| ---------------+-----------
|                |       211
|                |         3
|                |         4
|                |       212
|                |       213
|                |       224
|                |       228
|                |       164
|                |       313
|                |       253
|                |        38
|                |       321
|                |       324
|                |       189
|                |       192
|                |       193
|                |       194
|                |       195
|                |       197
|                |       198
|                |       275
|                |        79
|                |       339
|                |       340
|                |        82
|                |       341
|                |        84
|                |        85
|                |        91
|                |       119
| ```
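Both halves of this commit message can be reproduced in a few lines. The rows below are made-up sample data in the shape `(instance_name, stream_id)`, and `current_positions` is a hypothetical stand-in for the generator's per-instance map, not Synapse's actual loading code.

```python
from typing import Dict, List, Optional, Tuple

# Made-up sample rows of (instance_name, stream_id); names may be None.
rows: List[Tuple[Optional[str], int]] = [
    ("master", 340),
    (None, 211),
    ("master", 341),
    (None, 3),
]

# Sorting the raw tuples compares instance names first, which fails as soon
# as a None is compared with a str.
try:
    sorted(rows)
    sort_by_name_failed = False
except TypeError:
    sort_by_name_failed = True

# Sorting by stream ID never compares the names, and replaying rows in
# ascending order means the highest stream ID per instance is written last.
current_positions: Dict[Optional[str], int] = {}
for instance_name, stream_id in sorted(rows, key=lambda row: row[1]):
    current_positions[instance_name] = stream_id
```

Replaying in ascending `stream_id` order is what prevents a later, lower row from overwriting a higher position.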
* Instrument the federation/backfill part of `/messages` (#13489)Eric Eastwood2022-08-161-0/+3
| | | | | | | | | Instrument the federation/backfill part of `/messages` so it's easier to follow what's going on in Jaeger when viewing a trace. Split out from https://github.com/matrix-org/synapse/pull/13440 Follow-up from https://github.com/matrix-org/synapse/pull/13368 Part of https://github.com/matrix-org/synapse/issues/13356
* Log the stack when waiting for an entire room to be un-partial stated (#13257)Sean Quah2022-07-121-0/+1
| | | | The stack is already logged when waiting for an event to be un-partial stated. Log the stack for rooms as well, to aid in debugging.
* Wait for lazy join to complete when getting current state (#12872)Erik Johnston2022-06-011-0/+60
|
* Await un-partial-stating after a partial-state join (#12399)Richard van der Hoff2022-04-211-0/+120
| | | | | | When we join a room via the faster-joins mechanism, we end up with "partial state" at some points on the event DAG. Many parts of the codebase need to wait for the full state to load. So, we implement a mechanism to keep track of which events have partial state, and wait for them to be fully-populated.
* Use auto_attribs/native type hints for attrs classes. (#11692)Patrick Cloke2022-01-131-4/+4
|
* Improve log messages for stream ids (#11536)Richard van der Hoff2021-12-081-3/+3
| | | | Somehow I'd managed to get my database in a pickle with stream ids. These changes were useful to debug.
* Add type hints to `synapse/storage/databases/main/events_worker.py` (#11411)Sean Quah2021-11-261-54/+62
| | | | Also refactor the stream ID trackers/generators a bit and try to document them better.
* Add type hints to some storage classes (#11307)Patrick Cloke2021-11-111-2/+22
|
* Fix race in `MultiWriterIdGenerator` (#11045)Erik Johnston2021-10-121-15/+67
| | | | | | | | | | The race allowed the current position to advance too far when stream IDs are still being persisted. This happened when it received a new stream ID from a remote write between a new stream ID being allocated and it being added to the set of unpersisted stream IDs. Fixes #9424.
* Annotate synapse.storage.util (#10892)David Robertson2021-10-082-55/+94
| | | | | Also mark `synapse.streams` as having no untyped defs Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Speed up MultiWriterIdGenerator when lots of IDs are in flight. (#10755)Erik Johnston2021-09-031-2/+3
|
* Use inline type hints in `http/federation/`, `storage/` and `util/` (#10381)Jonathan de Jong2021-07-152-9/+9
|
* Fix bug when running presence off master (#10149)Erik Johnston2021-06-111-0/+15
| | | Hopefully fixes #10027.
* Remove redundant "coding: utf-8" lines (#9786)Jonathan de Jong2021-04-143-3/+0
| | | | | | | Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
* Bugbear: Add Mutable Parameter fixes (#9682)Jonathan de Jong2021-04-081-2/+9
| | | | | | | Part of #9366 Adds in fixes for B006 and B008, both relating to mutable parameter lint errors. Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>
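For context, B006 is the flake8-bugbear check for mutable default arguments. The sketch below shows the pitfall and the usual `None`-sentinel fix; both function names are hypothetical and are not taken from the Synapse codebase.

```python
from typing import List, Optional


def append_shared(item: int, bucket: List[int] = []) -> List[int]:  # B006
    # The default list is created once, when the function is defined, so
    # every call that omits `bucket` mutates the same list.
    bucket.append(item)
    return bucket


def append_fresh(item: int, bucket: Optional[List[int]] = None) -> List[int]:
    # Using None as a sentinel gives each call its own new list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_shared(1)` and then `append_shared(2)` returns the same list object both times, whereas `append_fresh` starts from an empty list on every call.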
* Refactor to ensure we call check_consistency (#9470)Erik Johnston2021-02-241-2/+22
| | | The idea here is to stop people forgetting to call `check_consistency`. Folks can still just pass in `None` to the new args in `build_sequence_generator`, but hopefully they won't.
* Update black, and run auto formatting over the codebase (#9381)Eric Eastwood2021-02-162-10/+11
| | | | | | | - Update black version to the latest - Run black auto formatting over the codebase - Run autoformatting according to [`docs/code_style.md `](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md) - Update `code_style.md` docs around installing black to use the correct version
* Fix some typos.Patrick Cloke2021-02-121-3/+3
|
* Update type hints for Cursor to match PEP 249. (#9299)Jonathan de Jong2021-02-051-2/+6
|
* Speed up chain cover calculation (#9176)Erik Johnston2021-01-211-0/+16
|
* Increase perf of handling concurrent use of StreamIDGenerators. (#9190)Erik Johnston2021-01-211-8/+13
| | | | | We have seen a failure mode here where if there are many in flight unfinished IDs then marking an ID as finished takes a lot of CPU (as calling deque.remove iterates over the list)
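The log entry does not say which data structure replaced the deque. As an illustration only, the sketch below uses `collections.OrderedDict`, which keeps insertion order (so the oldest unfinished ID is still cheap to find) while allowing removal by key without scanning. `UnfinishedIds` is a hypothetical class, not part of Synapse.

```python
from collections import OrderedDict
from typing import Optional


class UnfinishedIds:
    """Hypothetical tracker for in-flight stream IDs, in allocation order."""

    def __init__(self) -> None:
        self._ids: "OrderedDict[int, None]" = OrderedDict()

    def add(self, stream_id: int) -> None:
        self._ids[stream_id] = None

    def finish(self, stream_id: int) -> None:
        # Removal by key is a hash lookup; deque.remove() would instead walk
        # the deque until it finds the value.
        self._ids.pop(stream_id, None)

    def oldest(self) -> Optional[int]:
        # The first key is the oldest ID that has not finished yet.
        return next(iter(self._ids), None)
```

With many IDs in flight, `finish` stays cheap regardless of where in the sequence the finished ID sits.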
* Add schema update to fix existing DBs affected by #9193 (#9195)Erik Johnston2021-01-211-1/+1
|
* Fix receipts or account data not being sent down sync (#9193)Erik Johnston2021-01-212-4/+58
| | | | | Introduced in #9104 This wasn't picked up by the tests as this is all fine the first time you run Synapse (after upgrading), but then when you restart the wrong value is pulled from `stream_positions`.
* Allow moving account data and receipts streams off master (#9104)Erik Johnston2021-01-181-36/+48
|
* Fix chain cover background update to work with split out event persisters (#9115)Erik Johnston2021-01-141-4/+6
|
* Convert internal pusher dicts to attrs classes. (#8940)Patrick Cloke2020-12-161-2/+2
| | | This improves type hinting and should use less memory.
* Remove racey assertion in MultiWriterIDGenerator (#8530)Erik Johnston2020-10-141-7/+0
| | | | | | | We asserted that the IDs returned by the postgres sequence were greater than any we had seen; however, this is technically racey, as we may update the current positions out of order. We now assert that the sequences are correct on startup, so the assertion is no longer really required and we remove it.
* Make event persisters periodically announce position over replication. (#8499)Erik Johnston2020-10-122-0/+12
| | | | | Currently, background processes that stream the events stream use the "minimum persisted position" (i.e. `get_current_token()`) rather than the vector clock style tokens. This is broadly fine as it doesn't matter if the background processes lag a small amount. However, in extreme cases (i.e. SyTests) where we only write to one event persister, the background processes will never make progress. This PR changes it so that the `MultiWriterIDGenerator` keeps the current position of a given instance as up to date as possible (i.e. using the latest token it sees if it's not in the process of persisting anything), and then periodically announces that over replication. This then allows the "minimum persisted position" to advance, albeit with a small lag.
* Reduce serialization errors in MultiWriterIdGen (#8456)Erik Johnston2020-10-071-1/+11
| | | | | | We call `_update_stream_positions_table_txn` a lot, which is an UPSERT that can conflict in `REPEATABLE READ` isolation level. Instead of doing a transaction consisting of a single query we may as well run it outside of a transaction.
* Add logging on startup/shutdown (#8448)Erik Johnston2020-10-022-9/+14
| | | | | This is so we can tell what is going on when things are taking a while to start up. The main change here is to ensure that transactions that are created during startup get correctly logged like normal transactions.
* Merge tag 'v1.21.0rc2' into developRichard van der Hoff2020-10-021-1/+1
|\
| | Synapse 1.21.0rc2 (2020-10-02)
| | ==============================
| |
| | Features
| | --------
| |
| | - Convert additional templates from inline HTML to Jinja2 templates. ([\#8444](https://github.com/matrix-org/synapse/issues/8444))
| |
| | Bugfixes
| | --------
| |
| | - Fix a regression in v1.21.0rc1 which broke thumbnails of remote media. ([\#8438](https://github.com/matrix-org/synapse/issues/8438))
| | - Do not expose the experimental `uk.half-shot.msc2778.login.application_service` flow in the login API, which caused a compatibility problem with Element iOS. ([\#8440](https://github.com/matrix-org/synapse/issues/8440))
| | - Fix malformed log line in new federation "catch up" logic. ([\#8442](https://github.com/matrix-org/synapse/issues/8442))
| | - Fix DB query on startup for negative streams which caused long start up times. Introduced in [\#8374](https://github.com/matrix-org/synapse/issues/8374). ([\#8447](https://github.com/matrix-org/synapse/issues/8447))
| * Fix DB query on startup for negative streams. (#8447)Erik Johnston2020-10-021-1/+1
| | | | | | | | | | | | | | | | For negative streams we have to negate the internal stream ID before querying the DB. The effect of this bug was to query far too many rows, slowing start up time, but we would correctly filter the results afterwards so there was no ill effect.
* | Enable mypy checking for unreachable code and fix instances. (#8432)Patrick Cloke2020-10-011-1/+1
|/
* Don't table scan events on worker startup (#8419)Erik Johnston2020-09-291-1/+25
| * Fix table scan of events on worker startup.
|   This happened because we assumed "new" writers had an initial stream position of 0, so the replication code tried to fetch all events written by the instance between 0 and the current position. Instead, set the initial position of new writers to the current persisted up to position, on the assumption that new writers won't have written anything before that point.
| * Consider old writers coming back as "new".
|   Otherwise we'd try and fetch entries between the old stale token and the current position, even though it won't have written any rows.
| Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Add checks for postgres sequence consistency (#8402)Erik Johnston2020-09-282-2/+93
|
* Fix schema delta for servers that have not backfilled (#8396)Erik Johnston2020-09-251-1/+5
| | | | | Fixes #8395.
* Fix MultiWriterIdGenerator's handling of restarts. (#8374)Erik Johnston2020-09-241-21/+127
| On startup `MultiWriterIdGenerator` fetches the maximum stream ID for each instance from the table and uses that as its initial "current position" for each writer. This is problematic as a) it involves either a scan of the events table or an index (neither of which is ideal), and b) if rows are being persisted out of order elsewhere while the process restarts then using the maximum stream ID is not correct. This could theoretically lead to race conditions where e.g. events that are persisted out of order are not sent down sync streams.
| We fix this by creating a new table that tracks the current positions of each writer to the stream, and update it each time we finish persisting a new entry. This is a relatively small overhead when persisting events. However, for the cache invalidation stream this is a much bigger relative overhead, so instead we note that for invalidation we don't actually care about reliability over restarts (as there are no caches to invalidate) and simply don't bother reading and writing to the new table in that particular case.
* Use `async with` for ID gens (#8383)Erik Johnston2020-09-231-54/+76
| | | This will allow us to hit the DB after we've finished using the generated stream ID.
* Add experimental support for sharding event persister. Again. (#8294)Erik Johnston2020-09-141-4/+6
| | | | | | This is *not* ready for production yet. Caveats: 1. We should write some tests... 2. The stream token that we use for events can get stalled at the minimum position of all writers. This means that new events may not be processed and e.g. sent down sync streams if a writer isn't writing or is slow.
* Fix `MultiWriterIdGenerator.current_position`. (#8257)Erik Johnston2020-09-081-6/+37
| | | | | It did not correctly handle IDs finishing being persisted out of order, resulting in the `current_position` lagging until new IDs are persisted.
* Add more logging to debug slow startup (#8264)Richard van der Hoff2020-09-071-0/+5
| | | | I'm hoping this will provide some pointers for debugging https://github.com/matrix-org/synapse/issues/7968.
* Stop sub-classing object (#8249)Patrick Cloke2020-09-041-2/+2
|
* Revert "Add experimental support for sharding event persister. (#8170)" (#8242)Brendan Abolivier2020-09-041-6/+4
| | | | | | | * Revert "Add experimental support for sharding event persister. (#8170)" This reverts commit 82c1ee1c22a87b9e6e3179947014b0f11c0a1ac3. * Changelog
* Add experimental support for sharding event persister. (#8170)Erik Johnston2020-09-021-4/+6
| | | | | | This is *not* ready for production yet. Caveats: 1. We should write some tests... 2. The stream token that we use for events can get stalled at the minimum position of all writers. This means that new events may not be processed and e.g. sent down sync streams if a writer isn't writing or is slow.
* Make MultiWriterIDGenerator work for streams that use negative stream IDs (#8203)Erik Johnston2020-09-011-11/+28
| This is so that we can use it for the backfill events stream.
* Fix missing _add_persisted_position (#8179)Erik Johnston2020-08-271-0/+2
| | | This was forgotten in #8164.
* Add functions to `MultiWriterIdGen` used by events stream (#8164)Erik Johnston2020-08-252-3/+108
|
* Make StreamIdGen `get_next` and `get_next_mult` async (#8161)Erik Johnston2020-08-251-5/+5
| | | | This is mainly so that `StreamIdGenerator` and `MultiWriterIdGenerator` will have the same interface, allowing them to be used interchangeably.
* Remove `ChainedIdGenerator`. (#8123)Erik Johnston2020-08-191-67/+1
| | | | | It's just a thin wrapper around two ID gens to make `get_current_token` and `get_next` return tuples. This can easily be replaced by calling the appropriate methods on the underlying ID gens directly.
* Separate `get_current_token` into two. (#8113)Erik Johnston2020-08-191-9/+27
| | | | | | | | | | | The function is used for two purposes: 1) for subscribers of streams to get a token they can use to get further updates with, and 2) for replication to track position of the writers of the stream. For streams with a single writer the two scenarios produce the same result; however, the situation becomes complicated for streams with multiple writers. The current `MultiWriterIdGenerator` does not correctly handle the first case (which is not an issue as it's only used for the `caches` stream, which nothing subscribes to outside of replication).
* Rename database classes to make some sense (#8033)Erik Johnston2020-08-051-2/+2
|
* Use `PostgresSequenceGenerator` from `MultiWriterIdGenerator`Richard van der Hoff2020-07-161-4/+4
| | | | partly just to show it works, but also to remove a bit of code duplication.
* Add some helper classes for generating ID sequencesRichard van der Hoff2020-07-161-0/+98
|
* Move event stream handling out of slave store. (#7491)Erik Johnston2020-05-151-0/+11
| | | | | This allows us to have the logic on both master and workers, which is necessary to move event persistence off master. We also combine the instantiation of ID generators from DataStore and slave stores to the base worker stores. This allows us to select which process writes events independently of the master/worker splits.
* Add MultiWriterIdGenerator. (#7281)Erik Johnston2020-05-041-2/+167
| | | | | | This will be used to coordinate stream IDs across multiple writers. Functions as the equivalent of both `StreamIdGenerator` and `SlavedIdTracker`.
* Update black to 19.10b0 (#6304)Amber Brown2019-11-011-1/+1
| | | * update version of black and also fix the mypy config being overridden
* Remove unnecessary parentheses around return statements (#5931)Andrew Morgan2019-08-301-2/+2
| | | | | Python will return a tuple whether there are parentheses around the returned values or not. I'm just sick of my editor complaining about this all over the place :)
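The claim about tuples is easy to check directly. The two functions below are hypothetical examples written for this note.

```python
from typing import Tuple


def with_parentheses() -> Tuple[int, str]:
    return (1, "a")


def without_parentheses() -> Tuple[int, str]:
    # The comma builds the tuple; the parentheses are optional here.
    return 1, "a"


same_value = with_parentheses() == without_parentheses()
same_type = type(with_parentheses()) is type(without_parentheses()) is tuple
```

Both `same_value` and `same_type` evaluate to `True`, which is why removing the parentheses is a purely cosmetic change.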
* Run black on the rest of the storage module (#4996)Amber Brown2019-04-031-5/+5
|
* run isortAmber Brown2018-07-091-1/+1
|
* Fix assertion to stop transaction queue getting wedgedRichard van der Hoff2017-03-151-0/+14
| | | | | | | | ... and update some docstrings to correctly reflect the types being used. get_new_device_msgs_for_remote can return a long under some circumstances, which was being stored in last_device_list_stream_id_by_dest, and was then upsetting things on the next loop.
* Add tests for redactionsMark Haines2016-04-071-1/+1
|
* Assert that the step != 0Mark Haines2016-04-011-0/+1
|
* use google style doc stringsMark Haines2016-04-011-11/+12
|
* Rename direction to step, apply checks consistentlyMark Haines2016-04-011-15/+15
|
* Use a stream id generator for backfilled idsMark Haines2016-04-011-20/+41
|
* Add replication stream for pushersMark Haines2016-03-151-1/+6
|
* Ensure integer is an integerErik Johnston2016-03-091-1/+1
|
* Add a stream for push rule updatesMark Haines2016-03-011-26/+58
|
* Load the current id in the IdGenerator constructorMark Haines2016-03-011-47/+22
| | | | | | | | | Rather than loading them lazily. This allows us to remove all the yield statements and spurious arguments for the get_next methods. It also allows us to replace all instances of get_next_txn with get_next since get_next no longer needs to access the db.
* Remove unused param from get_max_tokenErik Johnston2016-02-181-3/+1
|
* Initial cutErik Johnston2016-02-171-1/+3
|
* Add a Homeserver.setup method.Erik Johnston2016-01-261-27/+9
| | | | | This is for setting up dependencies that require work on startup. This is useful for the DataStore that wants to read a bunch from the database before initializing.
* copyrightsMatthew Hodgson2016-01-072-2/+2
|
* Merge pull request #199 from matrix-org/erikj/receiptsErik Johnston2015-07-161-2/+5
|\ | | | | Implement read receipts.
| * Add basic storage functions for handling of receiptsErik Johnston2015-07-011-2/+5
| |
* | Add bulk insert events APIErik Johnston2015-06-251-0/+31
|/
* SYN-377: Make sure that the StreamIdGenerator.get_next.__exit__ is called from the main thread after the transaction completes, not from database thread before the transaction completes.Mark Haines2015-05-121-4/+8
|
* TypoErik Johnston2015-04-291-1/+1
|
* Also remove yield from within lock in the other generatorErik Johnston2015-04-291-8/+6
|
* Fix deadlock in id_generators. No idea why this was an actual deadlock.Erik Johnston2015-04-291-14/+16
|
* Make get_max_token into inlineCallbacks so that the lock works.Erik Johnston2015-04-271-3/+4
|
* Use try..finally in contextlib.contextmanagerErik Johnston2015-04-151-3/+5
|
* Correctly increment the _next_id initiallyErik Johnston2015-04-141-2/+4
|
* Stream ordering and out of order insertions.Erik Johnston2015-04-092-0/+140
Handle the fact that events can be persisted out of order, and so to get the "current max" stream token becomes non trivial - as we need to make sure that *all* stream tokens less than the current max have also successfully been persisted.
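The rule stated in this first commit, that the "current max" may only cover IDs below every unfinished one, can be sketched as follows. `OutOfOrderTracker` and its method names are hypothetical and simplified; this is not the original `StreamIdGenerator` code.

```python
from typing import Set


class OutOfOrderTracker:
    """Hypothetical tracker for stream IDs that may finish out of order."""

    def __init__(self, start: int = 0) -> None:
        self._last_allocated = start
        self._unfinished: Set[int] = set()

    def allocate(self) -> int:
        # Hand out the next ID and remember that it is not persisted yet.
        self._last_allocated += 1
        self._unfinished.add(self._last_allocated)
        return self._last_allocated

    def mark_persisted(self, stream_id: int) -> None:
        self._unfinished.discard(stream_id)

    def current_max(self) -> int:
        # Every ID at or below the returned token must be persisted, so stop
        # just before the oldest ID that is still in flight.
        if self._unfinished:
            return min(self._unfinished) - 1
        return self._last_allocated
```

For example, if IDs 1, 2 and 3 are allocated and only 2 has been persisted, `current_max()` stays at 0 until 1 is persisted, at which point it moves to 2.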