path: root/synapse/federation/transaction_queue.py
Commit message | Author | Age | Files | Lines
* Rename and move the classesRichard van der Hoff2019-03-131-801/+0
|
* Factor per-destination stuff out of TransactionQueueRichard van der Hoff2019-03-131-132/+182
|   This is easier than having to have a million fields keyed on destination.
* Move client receipt processing to federation sender worker.Richard van der Hoff2019-03-131-0/+35
|   This is mostly a prerequisite for #4730, but also fits with the general theme of "move everything off the master that we possibly can".
* Avoid rebuilding Edu objects in worker mode (#4770)Richard van der Hoff2019-03-041-7/+24
|   In worker mode, on the federation sender, when we receive an edu for sending over the replication socket, it is parsed into an Edu object. There is no point extracting the contents of it so that we can then immediately build another Edu.
* Add metrics for number of outgoing EDUs, by type (#4695)Richard van der Hoff2019-02-201-4/+18
|
* Use sender and not event ID domain to check if oursErik Johnston2019-01-291-1/+1
|   The transaction queue only sends out events that we generate. This was done by checking the domain of the event ID, but that can no longer be used. Instead, we may as well use the sender field.
* Don't log stack traces for HTTP error responsesErik Johnston2019-01-081-1/+6
|
* Refactor request sending to have better exceptions (#4358)Erik Johnston2019-01-081-5/+14
|   * Correctly retry and back off if we get an HTTP error response
|   * Refactor request sending to have better exceptions
|
|   MatrixFederationHttpClient blindly reraised exceptions to the caller without differentiating "expected" failures (e.g. connection timeouts etc) versus more severe problems (e.g. programming errors).
|
|   This commit adds a RequestSendFailed exception that is raised when "expected" failures happen, allowing the TransactionQueue to log them as warnings while allowing us to log other exceptions as actual exceptions.
* Add helpers for getting prev and auth events (#4139)Erik Johnston2018-11-061-3/+1
|   * Add helpers for getting prev and auth events
|
|   This is in preparation for allowing the event format to change between room versions.
* Various cleanups in the federation client code (#4031)Richard van der Hoff2018-10-161-15/+12
|   - Improve logging: log things in the right order, include destination and txids in all log lines, don't log successful responses twice
|   - Fix the docstring on TransportLayerClient.send_transaction
|   - Don't use treq.request, which is overcomplicated for our purposes: just use a twisted.web.client.Agent.
|   - simplify the logic for setting up the bodyProducer
|   - fix bytes/str confusions
* Fix complete fail to do the right thingRichard van der Hoff2018-09-281-1/+2
|
* remove spurious federation checks on localhostRichard van der Hoff2018-09-261-31/+6
|   There's really no point in checking for destinations called "localhost" because there is nothing stopping people creating other DNS entries which point to 127.0.0.1. The right fix for this is https://github.com/matrix-org/synapse/issues/3953.
|
|   Blocking localhost, on the other hand, means that you get a surprise when trying to connect a test server on localhost to an existing server (with a 'normal' server_name).
* Limit the number of PDUs/EDUs per federation transactionErik Johnston2018-09-061-0/+12
|
* Integrate presence from hotfixes (#3694)Amber Brown2018-08-181-0/+4
|
* more metrics for the federation and appservice sendersRichard van der Hoff2018-08-071-1/+9
|
* Remove pdu_failures from transactionsTravis Ralston2018-07-301-27/+5
|   The field is never read from, and all the opportunities given to populate it are not utilized. It should be very safe to remove this.
* Run things as background processesRichard van der Hoff2018-07-181-9/+6
|   This fixes #3518, and ensures that we get useful logs and metrics for lots of things that happen in the background.
|
|   (There are certainly more things that happen in the background; these are just the common ones I've found running a single-process synapse locally).
* Resource tracking for background processesRichard van der Hoff2018-07-181-5/+7
|   This introduces a mechanism for tracking resource usage by background processes, along with an example of how it will be used.
|
|   This will help address #3518, but more importantly will give us better insights into things which are happening but not being shown up by the request metrics.
|
|   We *could* do this with Measure blocks, but:
|   - I think having them pulled out as a completely separate metric class will make it easier to distinguish top-level processes from those which are nested.
|   - I want to be able to report on in-flight background processes, and I don't think we want to do this for *all* Measure blocks.
* run isortAmber Brown2018-07-091-16/+14
|
* Populate synapse_federation_client_sent_pdu_destinations:count again (#3386)Amber Brown2018-06-211-3/+7
|
* Remove run_on_reactor (#3395)Amber Brown2018-06-141-4/+0
|
* Consistently use six's iteritems and wrap lazy keys/values in list() if they're not meant to be lazy (#3307)Amber Brown2018-05-311-2/+4
|
* fixesAmber Brown2018-05-231-4/+4
|
* cleanup pep8 errorsAmber Brown2018-05-221-5/+17
|
* fixesAmber Brown2018-05-221-3/+3
|
* replacing portionsAmber Brown2018-05-211-28/+19
|
* Improve exception handling for background processesRichard van der Hoff2018-04-271-0/+2
|   There were a bunch of places where we fire off a process to happen in the background, but don't have any exception handling on it - instead relying on the unhandled error being logged when the relevant deferred gets garbage-collected.
|
|   This is unsatisfactory for a number of reasons:
|   - logging on garbage collection is best-effort and may happen some time after the error, if at all
|   - it can be hard to figure out where the error actually happened.
|   - it is logged as a scary CRITICAL error which (a) I always forget to grep for and (b) it's not really CRITICAL if a background process we don't care about fails.
|
|   So this is an attempt to add exception handling to everything we fire off into the background.
* Set all metrics at the same timeErik Johnston2018-04-121-6/+6
|
* Track last processed event received_tsErik Johnston2018-04-111-0/+11
|
* Track where event stream processing have gotten up toErik Johnston2018-04-111-0/+4
|
* Use run_in_background insteadErik Johnston2018-04-101-1/+1
|
* Preserve log contexts correctlyErik Johnston2018-04-101-1/+4
|
* Log event ID on exceptionErik Johnston2018-04-101-1/+4
|
* Handle all events in a room correctlyErik Johnston2018-04-091-1/+2
|
* Send federation events concurrentlyErik Johnston2018-04-091-4/+18
|
* Handle exceptions in get_hosts_for_room when sending events over federationErik Johnston2018-04-091-11/+16
|
* Add federation_domain_whitelist option (#2820)Matthew Hodgson2018-01-221-1/+3
|   federation_domain_whitelist gives a way to restrict which domains your HS is allowed to federate with. This is useful mainly for gracefully preventing a private but internet-connected HS from trying to federate to the wider public Matrix network.
* Metrics for events processed in appservice and fed senderRichard van der Hoff2018-01-151-0/+4
|   More metrics I wished I'd had
* Clear logcontext before starting fed txn queue runnerRichard van der Hoff2017-11-281-2/+8
|   These processes take a long time compared to the request, so there is lots of "Entering|Restoring dead context" in the logs. Let's try to shut it up a bit.
* Fix up logcontext handling in (federation) TransactionQueueRichard van der Hoff2017-10-061-16/+32
|   Avoid using preserve_context_over_function, which has problems with respect to logcontexts.
* Remove spurious log linesErik Johnston2017-06-071-1/+0
|
* Faster cache for get_joined_hostsErik Johnston2017-05-251-0/+2
|
* Make presence use cached users/hosts in roomErik Johnston2017-05-161-1/+1
|
* Add cache for get_current_hosts_in_roomErik Johnston2017-05-021-5/+1
|
* Merge pull request #2115 from matrix-org/erikj/dedupe_federation_replErik Johnston2017-04-121-10/+76
|\   Reduce federation replication traffic
| * CommentErik Johnston2017-04-121-2/+1
| |
| * Reuse get_interested_partiesErik Johnston2017-04-121-3/+3
| |
| * CommentErik Johnston2017-04-111-0/+2
| |
| * Move get_interested_remotes back to presence handlerErik Johnston2017-04-111-35/+6
| |
| * CommentsErik Johnston2017-04-111-1/+14
| |
| * Reduce federation presence replication trafficErik Johnston2017-04-101-9/+90
| |   This is mainly done by moving the calculation of where to send presence updates from the presence handler to the transaction queue, so we only need to send the presence event (and not the destinations) across the replication connection. Before we were duplicating by sending the full state across once per destination.
* | Add a counter metric for successfully-sent transactionsPaul "LeoNerd" Evans2017-04-111-0/+3
|/
* Bail early if remote wouldn't be retried (#2064)Erik Johnston2017-03-291-2/+8
|   * Bail early if remote wouldn't be retried
|   * Don't always return true
|   * Just use get_retry_limiter
|   * Spelling
* Batch sending of device list pokesErik Johnston2017-03-241-0/+1
|
* push federation retry limiter down to matrixfederationclientRichard van der Hoff2017-03-231-121/+95
|   rather than having to instrument everywhere we make a federation call, make the MatrixFederationHttpClient manage the retry limiter.
* Fix assertion to stop transaction queue getting wedgedRichard van der Hoff2017-03-151-0/+5
|   ... and update some docstrings to correctly reflect the types being used.
|
|   get_new_device_msgs_for_remote can return a long under some circumstances, which was being stored in last_device_list_stream_id_by_dest, and was then upsetting things on the next loop.
* Fix a race in transaction queueRichard van der Hoff2017-02-201-9/+21
|   It was theoretically possible for a PDU to get queued and not sent for ages.
|
|   On closer inspection I think there were bigger problems elsewhere, but we might as well fix this since it's easy.
* Correctly raise exceptions for ratelimiting. Ratelimit on 401Erik Johnston2017-02-011-1/+1
|
* Better handle 404 response for federation /send/Erik Johnston2017-01-311-0/+1
|
* Fix up sending of m.device_list_update edusErik Johnston2017-01-251-60/+61
|
* Add basic implementation of local device list changesErik Johnston2017-01-251-3/+21
|
* Lower the not retrying host log line to debugErik Johnston2017-01-171-1/+1
|
* Only send events that originate on this server.Mark Haines2017-01-051-0/+12
|   Or events that are sent via the federation "send_join" API.
|
|   This should match the behaviour from before v0.18.5 and #1635 landed.
* Get the destinations from the state from before the eventMark Haines2017-01-041-8/+9
|   Rather than the state after the event.
* Send ALL membership events to the server that was affected.Mark Haines2017-01-041-3/+5
|   Send all membership changes to the server that was affected. This ensures that if the last member of a room on a server was kicked or banned they get told about it.
* Correctly handle 500's and 429 on federationErik Johnston2016-11-241-0/+7
|
* CommentsErik Johnston2016-11-211-0/+3
|
* Remove explicit calls to send_pduErik Johnston2016-11-211-4/+9
|
* Fix testsErik Johnston2016-11-211-0/+3
|
* Store federation stream positions in the databaseErik Johnston2016-11-211-4/+17
|
* Handle sending events and device messages over federationErik Johnston2016-11-171-0/+32
|
* Use new federation_sender DIErik Johnston2016-11-161-0/+10
|
* Add transaction queue and transport layer to DIErik Johnston2016-11-161-2/+2
|
* Move logic into transaction_queueErik Johnston2016-11-161-3/+16
|
* Rename transaction queue functions to send_*Erik Johnston2016-11-161-5/+5
|
* Fix incorrect attribute nameErik Johnston2016-09-091-1/+1
|
* CommentErik Johnston2016-09-091-0/+1
|
* Add edu.type as part of key. Remove debug loggingErik Johnston2016-09-091-2/+3
|
* Clobber EDUs in send queueErik Johnston2016-09-091-3/+45
|
* Drop replication log levelsErik Johnston2016-09-091-1/+0
|
* Check if destination is ready for retry earlierErik Johnston2016-09-091-15/+16
|
* Fix tightloop on sending transactionErik Johnston2016-09-091-122/+134
|
* Correctly guard against multiple concurrent transactionsErik Johnston2016-09-091-38/+41
|
* Update last_device_stream_id_by_dest if there is nothing to sendErik Johnston2016-09-091-0/+1
|
* Add a new method to enqueue the device messages rather than sending a dummy EDUMark Haines2016-09-071-0/+11
|
* Move the check for federated device_messages.Mark Haines2016-09-071-11/+15
|   Move the check into _attempt_new_transaction. Only delete messages if there were messages to delete.
* Add stream change caches for device messagesMark Haines2016-09-071-1/+4
|
* Send device messages over federationMark Haines2016-09-061-7/+36
|
* PEP8Erik Johnston2016-08-101-1/+3
|
* Clean up TransactionQueueErik Johnston2016-08-101-215/+160
|
* Measure federation send transaction resourcesErik Johnston2016-08-101-5/+7
|
* Run transaction queue on reactorErik Johnston2016-05-091-0/+3
|   This ensures that any CPU work that happens doesn't block message sending.
* Fix up logcontextsErik Johnston2016-02-081-3/+0
|
* copyrightsMatthew Hodgson2016-01-071-1/+1
|
* Don't rearrange transaction_queueErik Johnston2015-11-031-12/+11
|
* Fix broken cache for getting retry times. This meant we retried remote destinations way more frequently than we shouldErik Johnston2015-11-031-23/+24
|
* Add txn_id to some log linesErik Johnston2015-05-221-6/+11
|
* Log less lines at INFO level, but include more helpful informationErik Johnston2015-05-221-6/+10
|
* Don't log enqueue_Erik Johnston2015-05-011-1/+0
|
* Appease pep8Paul "LeoNerd" Evans2015-03-121-3/+6
|
* Neater metrics from TransactionQueuePaul "LeoNerd" Evans2015-03-121-9/+11
|
* Use _ instead of . as a metric namespacing separator, for PrometheusPaul "LeoNerd" Evans2015-03-121-2/+2
|
* Rename Metrics' "keys" to "labels"Paul "LeoNerd" Evans2015-03-121-2/+2
|
* Put vector gauges on transaction queue pending PDU and EDU dictsPaul "LeoNerd" Evans2015-03-121-2/+14
|
* Fix bug in logging.Erik Johnston2015-03-101-5/+5
|
* Fix bug in logging.Erik Johnston2015-03-101-1/+1
|
* Must update pending_transactions map before yield'ingErik Johnston2015-02-261-2/+2
|
* Implement and use new batched get missing pduErik Johnston2015-02-231-1/+1
|
* Merge branch 'develop' of github.com:matrix-org/synapse into release-v0.7.1Erik Johnston2015-02-181-3/+27
|\
| * Restrict the destinations that synapse can talk toMark Haines2015-02-181-3/+27
| |
* | Add errback to all deferreds in transaction_queueErik Johnston2015-02-181-14/+23
| |
* | Discard destination 'localhost'Erik Johnston2015-02-181-2/+2
| |
* | Don't send failure to selfErik Johnston2015-02-181-0/+3
|/
* Fix pyflakesErik Johnston2015-02-181-1/+0
|
* Merge branch 'keyclient_retry_scheme' of github.com:matrix-org/synapse into developErik Johnston2015-02-181-92/+63
|\
| * Try to only back off if we think we failed to connect to the remoteErik Johnston2015-02-171-33/+33
| |
| * Add per server retry limiting.Erik Johnston2015-02-171-95/+66
| |   Factor out the per-destination retry logic from TransactionQueue so it can be reused in both get_pdu and crypto.keyring
* | Format the response of transaction request in a nicer wayErik Johnston2015-02-171-2/+20
|/
* Use consumeErrors=True on all DeferredLists.Erik Johnston2015-02-171-1/+1
|   This is so that the DeferredLists actually consume the error instead of propagating down the non-existent errback chain. This should reduce the number of unhandled errors we are seeing.
* Log all the exits from _attempt_new_transactionErik Johnston2015-02-101-2/+7
|
* Apply sanity to the transport client interface. Convert 'make_join' and 'send_join' to accept iterables of destinationsErik Johnston2015-02-041-5/+18
|
* Split up replication_layer module into client, server and transaction queueErik Johnston2015-01-261-3/+6
|
* Split out TransactionQueue from replication layerErik Johnston2015-01-221-0/+314