path: root/synapse/replication/tcp/handler.py
Commit message | Author | Age | Files | Lines
* Correctly mention previous copyright (#16820) | Erik Johnston | 2024-01-23 | 1 | -0/+2
| | | | | During the migration the automated script to update the copyright headers accidentally got rid of some of the existing copyright lines. Reinstate them.
* Update license headers | Patrick Cloke | 2023-11-21 | 1 | -11/+16
|
* Fix sending out of order `POSITION` over replication (#16639) | Erik Johnston | 2023-11-16 | 1 | -19/+19
| | | | | If a worker reconnects to Redis we send out the current positions of all our streams. However, if we're also trying to send out a backlog of RDATA at the same time then we can end up sending a `POSITION` with the current token *before* we've sent all the RDATA before the current token. This doesn't cause actual bugs as the receiving servers see the POSITION, fetch the relevant rows from the DB, and then ignore the old RDATA as they come in. However, this is inefficient, so it'd be better if we didn't send out-of-order positions.
* More efficiently handle no-op POSITION (#16640) | Erik Johnston | 2023-11-16 | 1 | -0/+34
| | | | We may receive `POSITION` commands where we already know that worker has advanced past that position, so there is no point in handling it.
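The no-op check described above can be sketched as follows. This is only an illustration under stated assumptions: `PositionTracker` and its methods are hypothetical names, not Synapse's actual API, and tokens are assumed to be plain increasing integers per (stream, instance) pair.

```python
from typing import Dict, Optional, Tuple


class PositionTracker:
    """Hypothetical helper: remembers the highest token seen per writer."""

    def __init__(self) -> None:
        # Maps (stream name, instance name) -> highest token seen so far.
        self._known: Dict[Tuple[str, str], int] = {}

    def advance(self, stream: str, instance: str, token: int) -> None:
        """Record that `instance` has reached `token` on `stream`."""
        key = (stream, instance)
        current = self._known.get(key)
        if current is None or token > current:
            self._known[key] = token

    def should_handle_position(self, stream: str, instance: str, token: int) -> bool:
        """Return False when a POSITION carries no new information."""
        current: Optional[int] = self._known.get((stream, instance))
        if current is None:
            # We know nothing about this writer yet, so we must handle it.
            return True
        # A POSITION at or behind what we already know is a no-op.
        return token > current
```

A receiver would call `should_handle_position` before doing any catch-up work and simply drop the command when it returns `False`.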
* Reduce spurious replication catchup (#16555) | Erik Johnston | 2023-10-27 | 1 | -5/+9
|
* Remove duplicate call to wake a remote destination when using federation sending worker (#16515) | Jason Little | 2023-10-24 | 1 | -2/+0
|
* Some minor performance fixes for task schedular (#16313) | Erik Johnston | 2023-09-14 | 1 | -4/+2
|
* Improve logging of replication (#16309) | Erik Johnston | 2023-09-13 | 1 | -1/+1
|
* Track currently syncing users by device for presence (#16172) | Patrick Cloke | 2023-08-29 | 1 | -5/+14
| | | | | | | Refactoring to use both the user ID & the device ID when tracking the currently syncing users in the presence handler. This is done both locally and over replication. Note that the device ID is discarded but will be used in a future change.
* Task scheduler: add replication notify for new task to launch ASAP (#16184) | Mathieu Velten | 2023-08-28 | 1 | -0/+18
|
* Run pyupgrade for python 3.7 & 3.8. (#16110) | Patrick Cloke | 2023-08-15 | 1 | -1/+1
|
* Add ability to wait for locks and add locks to purge history / room deletion (#15791) | Erik Johnston | 2023-07-31 | 1 | -0/+22
| | | | | c.f. #13476
* Add Unix socket support for Redis connections (#15644) | Jason Little | 2023-05-26 | 1 | -1/+9
| | | | Adds a new configuration setting to connect to Redis via a Unix socket instead of over TCP. Disabled by default.
* Add redis SSL configuration options (#15312) | Roel ter Maat | 2023-05-11 | 1 | -7/+22
| | | | * Add SSL options to redis config
| | | | * fix lint issues
| | | | * Add documentation and changelog file
| | | | * add missing . at the end of the changelog
| | | | * Move client context factory to new file
| | | | * Rename ssl to tls and fix typo
| | | | * fix lint issues
| | | | * Added when redis attributes were added
* Remove no-op send_command for Redis replication. (#15274) | Patrick Cloke | 2023-03-16 | 1 | -25/+1
| | | | | With Redis, commands do not need to be re-issued by the main process (they fan out to all processes at once), and thus it is no longer necessary to worry about them reflecting recursively forever.
* Merge account data streams (#14826) | Erik Johnston | 2023-01-13 | 1 | -2/+1
|
* Remove configuration options for direct TCP replication. (#13647) | Patrick Cloke | 2022-09-06 | 1 | -37/+21
| | | Removes the ability to configure legacy direct TCP replication. Workers now require Redis to run.
* Lay some foundation work to allow workers to only subscribe to some kinds of messages, reducing replication traffic. (#12672) | reivilibre | 2022-05-19 | 1 | -2/+32
|
* Reduce log spam when running multiple event persisters (#12610) | Erik Johnston | 2022-05-05 | 1 | -2/+7
|
* Move `update_client_ip` background job from the main process to the background worker. (#12251) | reivilibre | 2022-04-01 | 1 | -13/+35
|
* Improve code documentation for the typing stream over replication. (#12211) | reivilibre | 2022-03-11 | 1 | -1/+1
|
* Rename get_tcp_replication to get_replication_command_handler. (#12192) | Patrick Cloke | 2022-03-10 | 1 | -3/+1
| | | | | | Since the object it returns is a ReplicationCommandHandler. This is clean-up from adding support to Redis where the command handler was added as an additional layer of abstraction from the TCP protocol.
* Remove `HomeServer.get_datastore()` (#12031) | Richard van der Hoff | 2022-02-23 | 1 | -1/+1
| | | | | | | The presence of this method was confusing, and mostly present for backwards compatibility. Let's get rid of it. Part of #11733
* Add missing type hints to synapse.replication. (#11938) | Patrick Cloke | 2022-02-08 | 1 | -18/+18
|
* Remove unnecessary ignores due to Twisted upgrade. (#11939) | Patrick Cloke | 2022-02-08 | 1 | -2/+2
| | | | Twisted 22.1.0 fixed some internal type hints, allowing Synapse to remove ignore calls for parameters to connectTCP.
* Enable passing typing stream writers as a list. (#11237) | Nick Barrett | 2021-11-03 | 1 | -1/+1
| | | | This makes the typing stream writer config match the other stream writers that only currently support a single worker.
* Add type hints for most `HomeServer` parameters (#11095) | Sean Quah | 2021-10-22 | 1 | -1/+5
|
* Require direct references to configuration variables. (#10985) | Patrick Cloke | 2021-10-06 | 1 | -2/+5
| | | | | | This removes the magic allowing accessing configurable variables directly from the config object. It is now required that a specific configuration class is used (e.g. `config.foo` must be replaced with `config.server.foo`).
* Pass str to twisted's IReactorTCP (#10895) | David Robertson | 2021-09-30 | 1 | -2/+6
| | | | | | | This follows a correction made in twisted/twisted#1664 and should fix our Twisted Trial CI job. Until that change is in a twisted release, we'll have to ignore the type of the `host` argument. I've raised #10899 to remind us to review the issue in a few months' time.
* Use direct references for configuration variables (part 5). (#10897) | Patrick Cloke | 2021-09-24 | 1 | -2/+2
|
* Use direct references for some configuration variables (#10798) | Patrick Cloke | 2021-09-13 | 1 | -2/+2
| | | | Instead of proxying through the magic getter of the RootConfig object. This should be more performant (and is more explicit).
* Use inline type hints in various other places (in `synapse/`) (#10380) | Jonathan de Jong | 2021-07-15 | 1 | -8/+8
|
* update black to 21.6b0 (#10197) | Marcus | 2021-06-17 | 1 | -1/+1
| | | | | Reformat all files with the new version. Signed-off-by: Marcus Hoffmann <bubu@bubu1.eu>
* Split presence out of master (#9820) | Erik Johnston | 2021-04-23 | 1 | -2/+16
|
* Remove redundant "coding: utf-8" lines (#9786) | Jonathan de Jong | 2021-04-14 | 1 | -1/+0
| | | | | | | Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
* Fix remaining mypy issues due to Twisted upgrade. (#9608) | Patrick Cloke | 2021-03-15 | 1 | -2/+2
|
* Fix additional type hints from Twisted 21.2.0. (#9591) | Patrick Cloke | 2021-03-12 | 1 | -20/+24
|
* Fix deleting pushers when using sharded pushers. (#9465) | Erik Johnston | 2021-02-22 | 1 | -23/+0
|
* Update black, and run auto formatting over the codebase (#9381) | Eric Eastwood | 2021-02-16 | 1 | -16/+14
| | | | - Update black version to the latest
| | | | - Run black auto formatting over the codebase
| | | | - Run autoformatting according to [`docs/code_style.md`](https://github.com/matrix-org/synapse/blob/80d6dc9783aa80886a133756028984dbf8920168/docs/code_style.md)
| | | | - Update `code_style.md` docs around installing black to use the correct version
* Precompute joined hosts and store in Redis (#9198) | Erik Johnston | 2021-01-26 | 1 | -14/+1
|
* Periodically send pings to detect dead Redis connections (#9218) | Erik Johnston | 2021-01-26 | 1 | -2/+6
| | | | | | | | This is done by creating a custom `RedisFactory` subclass that periodically pings all connections in its pool. We also ensure that the `replyTimeout` param is non-null, so that we timeout waiting for the reply to those pings (and thus triggering a reconnect).
* Allow moving account data and receipts streams off master (#9104) | Erik Johnston | 2021-01-18 | 1 | -0/+19
|
* Allow running sendToDevice on workers (#9044) | Erik Johnston | 2021-01-07 | 1 | -0/+9
|
* Make event persisters periodically announce position over replication. (#8499) | Erik Johnston | 2020-10-12 | 1 | -10/+14
| | | | | Currently, background processes that stream the events stream use the "minimum persisted position" (i.e. `get_current_token()`) rather than the vector clock style tokens. This is broadly fine as it doesn't matter if the background processes lag a small amount. However, in extreme cases (i.e. SyTests) where we only write to one event persister, the background processes will never make progress. This PR changes it so that the `MultiWriterIDGenerator` keeps the current position of a given instance as up to date as possible (i.e. using the latest token it sees if it's not in the process of persisting anything), and then periodically announces that over replication. This then allows the "minimum persisted position" to advance, albeit with a small lag.
* Add unit test for event persister sharding (#8433) | Erik Johnston | 2020-10-02 | 1 | -3/+3
|
* Add experimental support for sharding event persister. Again. (#8294) | Erik Johnston | 2020-09-14 | 1 | -1/+1
| | | | This is *not* ready for production yet. Caveats:
| | | | 1. We should write some tests...
| | | | 2. The stream token that we use for events can get stalled at the minimum position of all writers. This means that new events may not be processed and e.g. sent down sync streams if a writer isn't writing or is slow.
* Revert "Add experimental support for sharding event persister. (#8170)" (#8242) | Brendan Abolivier | 2020-09-04 | 1 | -1/+1
| | | | * Revert "Add experimental support for sharding event persister. (#8170)"
| | | |   This reverts commit 82c1ee1c22a87b9e6e3179947014b0f11c0a1ac3.
| | | | * Changelog
* Add experimental support for sharding event persister. (#8170) | Erik Johnston | 2020-09-02 | 1 | -1/+1
| | | | This is *not* ready for production yet. Caveats:
| | | | 1. We should write some tests...
| | | | 2. The stream token that we use for events can get stalled at the minimum position of all writers. This means that new events may not be processed and e.g. sent down sync streams if a writer isn't writing or is slow.
* Handle replication commands synchronously where possible (#7876) | Richard van der Hoff | 2020-07-27 | 1 | -49/+66
| | | Most of the stuff we do for replication commands can be done synchronously. There's no point spinning up background processes if we're not going to need them.
* Remove an unused prometheus metric (#7878) | Richard van der Hoff | 2020-07-22 | 1 | -3/+1
|
* Optimise queueing of inbound replication commands (#7861) | Richard van der Hoff | 2020-07-16 | 1 | -116/+215
| | | | | | | | | | | When we get behind on replication, we tend to stack up background processes behind a linearizer. Bg processes are heavy (particularly with respect to prometheus metrics) and linearizers aren't terribly efficient once the queue gets long either. A better approach is to maintain a queue of requests to be processed, and nominate a single process to work its way through the queue. Fixes: #7444
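The "single nominated process" approach described above might look roughly like the sketch below. It is written with `asyncio` for brevity rather than Twisted, and `CommandQueue`, `add`, and `drain_runs` are hypothetical names introduced for illustration; the real handler differs.

```python
import asyncio
from collections import deque
from typing import Awaitable, Callable, Deque, List, Optional, Tuple


class CommandQueue:
    """Hypothetical sketch: one drain task per queue, not one task per command."""

    def __init__(self, handler: Callable[[str], Awaitable[None]]) -> None:
        self._handler = handler
        self._pending: Deque[str] = deque()
        self.task: Optional["asyncio.Task[None]"] = None
        self.drain_runs = 0  # illustration only: number of drain loops started

    def add(self, command: str) -> None:
        """Queue a command; start a drain task only if none is running."""
        self._pending.append(command)
        if self.task is None or self.task.done():
            self.drain_runs += 1
            self.task = asyncio.get_running_loop().create_task(self._drain())

    async def _drain(self) -> None:
        # The nominated task works through the queue in arrival order.
        while self._pending:
            await self._handler(self._pending.popleft())


async def demo() -> Tuple[List[str], int]:
    seen: List[str] = []

    async def handle(command: str) -> None:
        await asyncio.sleep(0)  # stand-in for real asynchronous work
        seen.append(command)

    queue = CommandQueue(handle)
    for command in ("RDATA 1", "RDATA 2", "POSITION 2"):
        queue.add(command)
    assert queue.task is not None
    await queue.task
    return seen, queue.drain_runs
```

Running `asyncio.run(demo())` processes the three commands in order while starting only one drain task, which is the property the commit message is after.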
* Allow moving typing off master (#7869) | Erik Johnston | 2020-07-16 | 1 | -0/+9
|
* Add ability to shard the federation sender (#7798) | Erik Johnston | 2020-07-10 | 1 | -2/+2
|
* isort 5 compatibility (#7786) | Will Hunt | 2020-07-05 | 1 | -2/+2
| | | The CI appears to use the latest version of isort, which is a problem when isort gets a major version bump. Rather than try to pin the version, I've done the necessary to make isort5 happy with synapse.
* Discard RDATA from already seen positions. (#7648) | Patrick Cloke | 2020-06-15 | 1 | -4/+26
|
* Ensure ReplicationStreamer is always started when replication enabled. (#7579) | Erik Johnston | 2020-05-27 | 1 | -0/+3
| | | Fixes #7566.
* Add option to move event persistence off master (#7517) | Erik Johnston | 2020-05-22 | 1 | -0/+10
|
* Have all instances correctly respond to REPLICATE command. (#7475) | Erik Johnston | 2020-05-13 | 1 | -10/+45
| | | | | Before, all streams were only written to from master, so only master needed to respond to `REPLICATE` commands. Before, all instances wrote to the cache invalidation stream, but didn't respond to `REPLICATE`. This was a bug, which could lead to missed rows from the cache invalidation stream if an instance is restarted; however, all the caches would be empty in that case, so it wasn't a problem.
* Fix Redis reconnection logic (#7482) | Erik Johnston | 2020-05-13 | 1 | -1/+8
| | | Proactively send out `POSITION` commands (as if we had just received a `REPLICATE`) when we connect to Redis. This is important as other instances won't notice we've connected to issue a `REPLICATE` command (unlike for direct TCP connections). This is only currently an issue if master process reconnects without restarting (if it restarts then it won't have written anything and so other instances probably won't have missed anything).
* Merge branch 'release-v1.13.0' into develop | Andrew Morgan | 2020-05-11 | 1 | -3/+1
|\
| | * release-v1.13.0:
| |   Don't UPGRADE database rows
| |   RST indenting
| |   Put rollback instructions in upgrade notes
| |   Fix changelog typo
| |   Oh yeah, RST
| |   Absolute URL it is then
| |   Fix upgrade notes link
| |   Provide summary of upgrade issues in changelog.
| |   Fix )
| |   Move next version notes from changelog to upgrade notes
| |   Changelog fixes
| |   1.13.0rc1
| |   Documentation on setting up redis (#7446)
| |   Rework UI Auth session validation for registration (#7455)
| |   Fix errors from malformed log line (#7454)
| |   Drop support for redis.dbid (#7450)
| * Drop support for redis.dbid (#7450) | Richard van der Hoff | 2020-05-07 | 1 | -3/+1
| | | | | | Since we only use pubsub, the dbid is irrelevant.
* | Support any process writing to cache invalidation stream. (#7436) | Erik Johnston | 2020-05-07 | 1 | -35/+7
|/
* Merge branch 'release-v1.13.0' into rav/fix_dropped_messages | Richard van der Hoff | 2020-05-05 | 1 | -1/+1
|\
| * Move logs about discarded RDATA to debug (#7421) | Brendan Abolivier | 2020-05-05 | 1 | -1/+1
| |
* | Merge branch 'release-v1.13.0' into rav/fix_dropped_messages | Richard van der Hoff | 2020-05-05 | 1 | -13/+16
|\|
| * Thread through instance name to replication client. (#7369) | Erik Johnston | 2020-05-01 | 1 | -5/+15
| | | | | | For in memory streams when fetching updates on workers we need to query the source of the stream, which currently is hard coded to be master. This PR threads through the source instance we received via `POSITION` through to the update function in each stream, which can then be passed to the replication client for in memory streams.
| * Use `stream.current_token()` and remove `stream_positions()` (#7172) | Erik Johnston | 2020-05-01 | 1 | -9/+1
| | | | | | | | We move the processing of typing and federation replication traffic into their handlers so that `Stream.current_token()` points to a valid token. This allows us to remove `get_streams_to_replicate()` and `stream_positions()`.
* | Wait for a POSITION on the right connection before accepting RDATA | Richard van der Hoff | 2020-05-05 | 1 | -18/+37
| | | | | | | | ... otherwise we can believe we're up to date when we're not.
* | Wait to subscribe before sending REPLICATE | Richard van der Hoff | 2020-05-05 | 1 | -1/+2
|/
* Add instance name to RDATA/POSITION commands (#7364) | Erik Johnston | 2020-04-29 | 1 | -3/+14
| | | | | This is primarily for allowing us to send those commands from workers, but for now simply allows us to ignore echoed RDATA/POSITION commands that we sent (we get echoes of sent commands when using redis). Currently we log a WARNING on the master process every time we receive an echoed RDATA.
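Ignoring echoed commands by instance name could be sketched like this. The `RdataCommand` dataclass and `is_echo` helper are hypothetical stand-ins for illustration; only the idea of tagging each command with the sending instance's name comes from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RdataCommand:
    """Hypothetical, simplified RDATA command carrying its sender's name."""

    stream_name: str
    instance_name: str
    token: int


def is_echo(command: RdataCommand, local_instance_name: str) -> bool:
    """True if this command is one we sent ourselves and got back via Redis."""
    return command.instance_name == local_instance_name


def handle_rdata(command: RdataCommand, local_instance_name: str) -> str:
    # Drop our own echoes quietly instead of logging a warning for each one.
    if is_echo(command, local_instance_name):
        return "ignored"
    return "processed"
```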
* Don't relay REMOTE_SERVER_UP cmds to same conn. (#7352) | Erik Johnston | 2020-04-29 | 1 | -14/+49
| | | | | | | | | | | | | | For direct TCP connections we need the master to relay REMOTE_SERVER_UP commands to the other connections so that all instances get notified about it. The old implementation just relayed to all connections, assuming that sending back to the original sender of the command was safe. This is not true for redis, where commands sent get echoed back to the sender, which was causing master to effectively infinite loop sending and then re-receiving REMOTE_SERVER_UP commands that it sent. The fix is to ensure that we only relay to *other* connections and not to the connection we received the notification from. Fixes #7334.
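The fix described above, relaying only to *other* connections, can be illustrated with a small sketch. `FakeConnection` and `relay_to_others` are hypothetical names used purely for the example and do not mirror Synapse's classes.

```python
from typing import Iterable, List


class FakeConnection:
    """Hypothetical connection that records what was sent to it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sent: List[str] = []

    def send_command(self, command: str) -> None:
        self.sent.append(command)


def relay_to_others(
    connections: Iterable[FakeConnection],
    source: FakeConnection,
    command: str,
) -> None:
    """Relay `command` to every connection except the one it arrived on."""
    for conn in connections:
        if conn is source:
            # Sending back to the source would be echoed to us again,
            # producing the send/receive loop the commit fixes.
            continue
        conn.send_command(command)
```

With three connections `a`, `b`, and `c`, calling `relay_to_others([a, b, c], a, "REMOTE_SERVER_UP example.com")` delivers the command to `b` and `c` only.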
* Fix limit logic for EventsStream (#7358) | Richard van der Hoff | 2020-04-29 | 1 | -1/+3
| | | | * Factor out functions for injecting events into database
| | | |   I want to add some more flexibility to the tools for injecting events into the database, and I don't want to clutter up HomeserverTestCase with them, so let's factor them out to a new file.
| | | | * Rework TestReplicationDataHandler
| | | |   This wasn't very easy to work with: the mock wrapping was largely superfluous, and it's useful to be able to inspect the received rows, and clear out the received list.
| | | | * Fix AssertionErrors being thrown by EventsStream
| | | |   Part of the problem was that there was an off-by-one error in the assertion, but also the limit logic was too simple. Fix it all up and add some tests.
* Stop the master relaying USER_SYNC for other workers (#7318) | Richard van der Hoff | 2020-04-22 | 1 | -10/+5
| | | | | | | Long story short: if we're handling presence on the current worker, we shouldn't be sending USER_SYNC commands over replication. In an attempt to figure out what is going on here, I ended up refactoring some bits of the presencehandler code, so the first 4 commits here are non-functional refactors to move this code slightly closer to sanity. (There's still plenty to do here :/). Suggest reviewing individual commits. Fixes (I hope) #7257.
* Add ability to run replication protocol over redis. (#7040) | Erik Johnston | 2020-04-22 | 1 | -7/+43
| | | This is configured via the `redis` config options.
* On catchup, process each row with its own stream id (#7286) | Richard van der Hoff | 2020-04-20 | 1 | -5/+68
| | | | | | Other parts of the code (such as the StreamChangeCache) assume that there will not be multiple changes with the same stream id. This code was introduced in #7024, and I hope this fixes #7206.
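One way to hand catch-up rows to downstream code with their own stream ids, rather than lumping them all under the final token, is to batch consecutive rows that share a token. The sketch below assumes updates arrive as `(token, row)` pairs already sorted by token; `batch_by_token` is a hypothetical helper, not the function used in Synapse.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple


def batch_by_token(
    updates: Iterable[Tuple[int, str]],
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (token, rows) batches, one batch per distinct stream id.

    Assumes `updates` is sorted by token, as groupby only merges
    adjacent items with equal keys.
    """
    for token, group in groupby(updates, key=lambda update: update[0]):
        yield token, [row for _, row in group]
```

Each yielded batch can then be processed under its own token, so consumers such as a change cache never see unrelated rows reported at the same stream id.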
* Remove vestigal references to SYNC replication command | Richard van der Hoff | 2020-04-07 | 1 | -4/+0
| | | | We've ripped pretty much all of this out: let's remove the remains.
* Fix race in replication (#7226) | Erik Johnston | 2020-04-07 | 1 | -28/+45
| | | | Fixes a race between handling `POSITION` and `RDATA` commands. We do this by simply linearizing handling of them.
* Move server command handling out of TCP protocol (#7187) | Erik Johnston | 2020-04-07 | 1 | -18/+159
| | | This completes the merging of server and client command processing.
* Move client command handling out of TCP protocol (#7185) | Erik Johnston | 2020-04-06 | 1 | -0/+252
The aim here is to move the command handling out of the TCP protocol classes and to also merge the client and server command handling (so that we can reuse them for the redis protocol). This PR simply moves the client paths to the new `ReplicationCommandHandler`; a future PR will move the server paths too.