path: root/synapse/util/retryutils.py
Commit message, Author, Age, Files, Lines
* Fix a bug where servers could be marked as up when they were failing (#16506)Patrick Cloke2023-10-171-13/+17
| | | | After this change a server will only be reported as back online if they were previously having requests fail.
* Don't wake up destination transaction queue if they're not due for retry. (#16223)Erik Johnston2023-09-041-0/+25
* Don't reset retry timers on "valid" error codes (#16221)Erik Johnston2023-09-041-2/+16
|
* Allow config of the backoff algorithm for the federation client. (#15754)Mathieu Velten2023-08-031-13/+16
| Adds three new configuration variables:
| * destination_min_retry_interval is identical to before (10mn).
| * destination_retry_multiplier is now 2 instead of 5, the maximum value will be reached slower.
| * destination_max_retry_interval is one day instead of (essentially) infinity. Capping this will cause destinations to continue to be retried sometimes instead of being lost forever. The previous value was 2 ^ 62 milliseconds.
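The effect of the three settings described in #15754 can be sketched as follows. This is a minimal illustration, not the code in retryutils.py: the class `BackoffConfig` and the functions `next_retry_interval` and `schedule` are hypothetical names, the defaults are the values quoted in the commit message (10 minutes, multiplier 2, one day), and any jitter the real implementation applies is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackoffConfig:
    """Hypothetical container for the three settings named in the commit."""

    min_retry_interval_ms: int = 10 * 60 * 1000        # 10 minutes
    retry_multiplier: float = 2                         # was 5 before the change
    max_retry_interval_ms: int = 24 * 60 * 60 * 1000    # one day


def next_retry_interval(current_ms: int, cfg: BackoffConfig) -> int:
    """Return the interval to wait after one more failure.

    Starts at the minimum, then multiplies, and never exceeds the cap.
    """
    if current_ms <= 0:
        return cfg.min_retry_interval_ms
    return min(int(current_ms * cfg.retry_multiplier), cfg.max_retry_interval_ms)


def schedule(cfg: BackoffConfig, failures: int) -> List[int]:
    """List the successive intervals for a run of consecutive failures."""
    intervals = []
    interval = 0
    for _ in range(failures):
        interval = next_retry_interval(interval, cfg)
        intervals.append(interval)
    return intervals


if __name__ == "__main__":
    for multiplier in (5, 2):
        cfg = BackoffConfig(retry_multiplier=multiplier)
        minutes = [ms // 60000 for ms in schedule(cfg, 10)]
        print(f"multiplier={multiplier}: {minutes} (minutes)")
```

Under these assumptions a multiplier of 5 hits the one-day cap on the fifth consecutive failure, while a multiplier of 2 hits it on the ninth, which is the "reached slower" behaviour the commit message mentions.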
* Refactor MSC3030 `/timestamp_to_event` to move away from our snowflake pull from `destination` pattern (#14096)Eric Eastwood2022-10-261-1/+1
| 1. `federation_client.timestamp_to_event(...)` now handles all `destination` looping and uses our generic `_try_destination_list(...)` helper.
| 2. Consistently handling `NotRetryingDestination` and `FederationDeniedError` across `get_pdu`, backfill, and the generic `_try_destination_list` which is used for many places we use this pattern.
| 3. `get_pdu(...)` now returns `PulledPduInfo` so we know which `destination` we ended up pulling the PDU from
* Fix `RetryDestinationLimiter` re-starting finished log contexts (#12803)Sean Quah2022-05-191-2/+2
| | | | Signed-off-by: Sean Quah <seanq@matrix.org>
* Immediately retry any requests that have backed off when a server comes back online. (#12500)Erik Johnston2022-05-101-1/+23
| Otherwise it can take up to a minute for any in-flight `/send` requests to be retried.
* Bump `black` and `click` versions (#12320)David Robertson2022-03-291-1/+1
|
* Add types to synapse.util. (#10601)reivilibre2021-09-101-27/+42
|
* Don't hammer the database for destination retry timings every ~5mins (#10036)Erik Johnston2021-05-211-5/+3
|
* Remove redundant "coding: utf-8" lines (#9786)Jonathan de Jong2021-04-141-1/+0
| | | | | | | Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
* Tell Black to format code for Python 3.5 (#8664)Dan Callahan2020-10-271-1/+1
| | | | | | | | This allows trailing commas in multi-line arg lists. Minor, but we might as well keep our formatting current with regard to our minimum supported Python version. Signed-off-by: Dan Callahan <danc@element.io>
* Simplify super() calls to Python 3 syntax. (#8344)Patrick Cloke2020-09-181-1/+1
| This converts calls like super(Foo, self) -> super().
| Generated with:
|     sed -i "" -Ee 's/super\([^\(]+\)/super()/g' **/*.py
* Stop sub-classing object (#8249)Patrick Cloke2020-09-041-1/+1
|
* Convert some util functions to async (#8035)Patrick Cloke2020-08-061-10/+6
|
* Fix some spelling mistakes / typos. (#7811)Patrick Cloke2020-07-091-2/+2
|
* Fix errors storing large retry intervals.Erik Johnston2019-10-021-1/+1
| | | | | | | | | We have set the max retry interval to a value larger than a postgres or sqlite int can hold, which caused exceptions when updating the destinations table. To fix postgres we need to change the column to a bigint, and for sqlite we lower the max interval to 2**62 (which is still incredibly long).
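The integer limits behind this fix can be checked with a short sketch. It assumes the standard sizes: PostgreSQL `integer` is a signed 32-bit value, PostgreSQL `bigint` is signed 64-bit, and SQLite stores integers in at most 8 signed bytes. The helper `fits_signed` is a hypothetical name, and the 2**63 value is only an example of an interval that is too large, not a figure taken from the commit.

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if value is representable as a signed integer of the given width."""
    lower = -(2 ** (bits - 1))
    upper = 2 ** (bits - 1) - 1
    return lower <= value <= upper


NEW_MAX_INTERVAL = 2**62  # the lowered maximum named in the commit message

# Too large for a 32-bit PostgreSQL `integer` column...
print(fits_signed(NEW_MAX_INTERVAL, 32))
# ...but fine for a PostgreSQL `bigint` or an 8-byte SQLite integer.
print(fits_signed(NEW_MAX_INTERVAL, 64))
# An example of a value that no signed 64-bit column can hold.
print(fits_signed(2**63, 64))
```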
* Add 'failure_ts' column to 'destinations' table (#6016)Richard van der Hoff2019-09-171-1/+15
| | | | Track the time that a server started failing at, for general analysis purposes.
* Remove the cap on federation retry interval. (#6026)Richard van der Hoff2019-09-121-2/+2
| | | | | | Essentially the intention here is to end up blacklisting servers which never respond to federation requests. Fixes https://github.com/matrix-org/synapse/issues/5113.
* Fix bug in calculating the federation retry backoff period (#6025)Richard van der Hoff2019-09-121-2/+3
| | | | This was intended to introduce an element of jitter; instead it gave you a 30/60 chance of resetting to zero.
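One way a jitter step can reset an interval to zero is by truncating the random factor to an integer before multiplying. The sketch below illustrates that class of bug only; it is not the code changed in #6025, and the function names and the sample values are hypothetical.

```python
def buggy_next_interval(interval_ms: int, multiplier: int, jitter: float) -> int:
    """Truncates the jitter factor first, so any factor below 1.0 becomes 0."""
    return interval_ms * multiplier * int(jitter)


def fixed_next_interval(interval_ms: int, multiplier: int, jitter: float) -> int:
    """Applies the fractional jitter factor, then truncates the final result."""
    return int(interval_ms * multiplier * jitter)


if __name__ == "__main__":
    # In real use the factor would come from something like random.uniform(low, high).
    for jitter in (0.5, 1.25):
        print(
            jitter,
            buggy_next_interval(1000, 5, jitter),
            fixed_next_interval(1000, 5, jitter),
        )
```

With the buggy form, every draw below 1.0 collapses the interval to zero and every draw at or above 1.0 adds no jitter at all, so the share of the random range lying below 1.0 decides how often the backoff is lost.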
* Clean up some code in the retry logic (#6017)Richard van der Hoff2019-09-111-16/+13
| * remove some unused code
| * make things which were constants into constants for efficiency and clarity
* Replace returnValue with return (#5736)Amber Brown2019-07-231-9/+7
|
* Move logging utilities out of the side drawer of util/ and into logging/ (#5606)Amber Brown2019-07-041-2/+2
|
* Call RetryLimiter correctly (#5340)Richard van der Hoff2019-06-041-1/+6
| | | Fixes a regression introduced in #5335.
* Avoid rapidly backing-off a server if we ignore the retry intervalRichard van der Hoff2019-06-031-23/+37
|
* Improve the logging when handling a federation transaction (#3904)Richard van der Hoff2018-09-191-1/+1
| Let's try to rationalise the logging that happens when we are processing an incoming transaction, to make it easier to figure out what is going wrong when they take ages. In particular:
| - make everything start with a [room_id event_id] prefix
| - make sure we log a warning when catching exceptions rather than just turning them into other, more cryptic, exceptions.
* run isortAmber Brown2018-07-091-5/+4
|
* Use run_in_background in preference to preserve_fnRichard van der Hoff2018-04-271-2/+2
| | | | | | While I was going through uses of preserve_fn for other PRs, I converted places which only use the wrapped function once to use run_in_background, to avoid creating the function object.
* Add federation_domain_whitelist option (#2820)Matthew Hodgson2018-01-221-0/+12
| | | | | | Add federation_domain_whitelist gives a way to restrict which domains your HS is allowed to federate with. useful mainly for gracefully preventing a private but internet-connected HS from trying to federate to the wider public Matrix network
* replace 'except:' with 'except Exception:'Richard van der Hoff2017-10-231-1/+1
| | | | what could possibly go wrong
* Merge pull request #2050 from matrix-org/rav/federation_backoffRichard van der Hoff2017-03-231-4/+25
|\ | | | | push federation retry limiter down to matrixfederationclient
| * Ignore backoff history for invites, aliases, and roomdirsRichard van der Hoff2017-03-231-2/+11
| | | | | | | | | | Add a param to the federation client which lets us ignore historical backoff data for federation queries, and set it for a handful of operations.
| * push federation retry limiter down to matrixfederationclientRichard van der Hoff2017-03-231-2/+14
| | | | | | | | | | rather than having to instrument everywhere we make a federation call, make the MatrixFederationHttpClient manage the retry limiter.
* | Fix a couple of logcontext leaksRichard van der Hoff2017-03-231-2/+3
|/ | | | | Use preserve_fn to correctly manage the logcontexts around things we don't want to yield on.
* Correctly raise exceptions for ratelimitng. Ratelimit on 401Erik Johnston2017-02-011-3/+5
|
* Remove explicit < 400 check as apparently this is confusingErik Johnston2017-01-311-3/+1
|
* CommentErik Johnston2017-01-311-0/+2
|
* CommentErik Johnston2017-01-311-0/+4
|
* Better handle 404 response for federation /send/Erik Johnston2017-01-311-2/+13
|
* Use correct varErik Johnston2016-11-241-1/+1
|
* Correctly handle 500's and 429 on federationErik Johnston2016-11-241-1/+1
|
* Invalidate retry cache in both directionsErik Johnston2016-11-221-9/+12
|
* Fix retry utils to check if the exception is a subclass of CMEMark Haines2016-07-281-1/+1
|
* copyrightsMatthew Hodgson2016-01-071-1/+1
|
* Retry dead servers a lot less oftenErik Johnston2015-11-021-2/+5
|
* Make work in both Maria and SQLite. Fix testsErik Johnston2015-04-011-1/+1
|
* Remove unused importErik Johnston2015-02-181-2/+0
|
* Remove spurious comma. Remove temp run_on_reactorErik Johnston2015-02-181-2/+1
|
* Temporarily add a run_on_reactor() callErik Johnston2015-02-181-0/+3
|
* s/self._clock/self.clock/Erik Johnston2015-02-181-1/+1
|
* More docsErik Johnston2015-02-181-1/+5
|
* Docs.Erik Johnston2015-02-181-1/+33
|
* Try to only back off if we think we failed to connect to the remoteErik Johnston2015-02-171-2/+8
|
* Only update destination_retry_timings if we have succeeded when retryingErik Johnston2015-02-171-0/+3
|
* Remove spurious selfErik Johnston2015-02-171-1/+1
|
* Add per server retry limiting.Erik Johnston2015-02-171-0/+108
Factor out the pre destination retry logic from TransactionQueue so it can be reused in both get_pdu and crypto.keyring