| Commit message | Author | Age | Files | Lines |
... | |
|\ \ \
| | | |
| | | | |
Split TransactionQueue up
|
| | | | |
|
| | | |
| | | |
| | | |
| | | | |
This is easier than having to have a million fields keyed on destination.
|
|\| | |
| | | |
| | | | |
Move client receipt processing to federation sender worker.
|
| | |/
| |/|
| | |
| | |
| | | |
This is mostly a prerequisite for #4730, but also fits with the general theme
of "move everything off the master that we possibly can".
|
|/ /
| |
| |
| |
| |
| | |
endpoints (#4793)"
This reverts commit 290552fd836f4ae2dc1d893a7f72f7fff85365d3.
|
|/
|
|
|
| |
endpoints (#4793)
Server side of a solution towards #3622.
|
|
|
|
|
| |
A dollar sign is already appended to the end of each PATH, so there's
no need to add one in the PATH declaration as well.
|
|
|
|
|
| |
In worker mode, on the federation sender, when we receive an edu for sending
over the replication socket, it is parsed into an Edu object. There is no point
extracting the contents of it so that we can then immediately build another Edu.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* make 'event_id' a required parameter in federated state requests
As per the spec: https://matrix.org/docs/spec/server_server/r0.1.1.html#id40
Signed-off-by: Joseph Weston <joseph@weston.cloud>
* add changelog entry for bugfix
Signed-off-by: Joseph Weston <joseph@weston.cloud>
* Update server.py
|
| |
|
|\
| |
| |
| | |
anoa/public_rooms_federate_develop
|
| | |
|
| |\
| | |
| | | |
Config option to prevent showing non-fed rooms in fed /publicRooms
|
| | |\
| | | |
| | | |
| | | | |
anoa/public_rooms_federate
|
| | | | |
|
| |\ \ \
| | | | |
| | | | | |
Log tracebacks correctly
|
| | | |/
| | |/| |
|
| |/ / |
|
| | | |
|
| |\ \
| | | |
| | | | |
New listener resource for the federation API "openid/userinfo" endpoint
|
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Jason Robinson <jasonr@matrix.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This allows the OpenID userinfo endpoint to be active even if the
federation resource is not active. The OpenID userinfo endpoint
is called by integration managers to verify user actions using the
client API OpenID access token. Without this verification, the
integration manager cannot know that the access token is valid.
The OpenID userinfo endpoint will be loaded in the case that either
"federation" or "openid" resource is defined. The new "openid"
resource is defaulted to active in default configuration.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
* Reject large transactions on federation
* Add changelog
* lint
* Simplify large transaction handling
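
A minimal sketch of what rejecting an oversized transaction could look like. The limits (50 PDUs, 100 EDUs), the `TransactionTooLargeError` type and the `check_transaction_size` helper are assumptions made for illustration; none of them are taken from this commit.

```python
from typing import Any, Dict

# Assumed limits, chosen for illustration only.
MAX_PDUS_PER_TRANSACTION = 50
MAX_EDUS_PER_TRANSACTION = 100


class TransactionTooLargeError(Exception):
    """Hypothetical error type for an oversized transaction."""


def check_transaction_size(transaction: Dict[str, Any]) -> None:
    """Raise if the transaction carries more PDUs or EDUs than allowed."""
    pdu_count = len(transaction.get("pdus", []))
    edu_count = len(transaction.get("edus", []))
    if pdu_count > MAX_PDUS_PER_TRANSACTION or edu_count > MAX_EDUS_PER_TRANSACTION:
        raise TransactionTooLargeError(
            f"Transaction too large: {pdu_count} PDUs, {edu_count} EDUs"
        )
```

Checking the counts before any PDU is processed keeps the rejection cheap, which is presumably the point of a size limit.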
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | |
| | | |
| | | |
| | | |
| | | | |
In future versions, events won't have an event ID, so we won't be able to
do this check.
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We only process events sent to us from a server if the event ID matches
the server, to help guard against federation storms. We replace this
with a check against the event origin.
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The transaction queue only sends out events that we generate. This was
done by checking domain of event ID, but that can no longer be used.
Instead, we may as well use the sender field.
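
As a rough illustration of the sender-based check, assuming user IDs of the form `@localpart:domain`. The helper names below are stand-ins and the event is modelled as a plain dict.

```python
from typing import Any, Dict


def get_domain_from_id(identifier: str) -> str:
    """Return everything after the first ':' of an ID such as '@alice:example.com'."""
    _, sep, domain = identifier.partition(":")
    if not sep or not domain:
        raise ValueError(f"Invalid ID: {identifier!r}")
    return domain


def is_locally_generated(event: Dict[str, Any], server_name: str) -> bool:
    """True if the event's sender belongs to this server."""
    return get_domain_from_id(event["sender"]) == server_name
```

Splitting on the first colon keeps a port in the server name intact, so `@bob:example.com:8448` yields `example.com:8448`.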
|
| |\ \ \
| | | | |
| | | | | |
Refactor event building into EventBuilder
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This is so that everything is done in one place, making it easier to
change the event format based on room version
|
| |\ \ \ \
| | | | | |
| | | | | | |
Fix up calls to `compute_event_signature`
|
| | |/ / /
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
We currently pass FrozenEvent instead of `dict` to
`compute_event_signature`, which works by accident due to `dict(event)`
producing the correct result.
This fixes PR #4493 commit 855a151
|
| |/ / /
| | | |
| | | |
| | | |
| | | | |
If the room version is either 1 or 2 then a server should retry failed
`/v2/invite` requests with the v1 API
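
A sketch of the fallback logic, under stated assumptions: the two senders are passed in as callables, `HttpResponseError` is a hypothetical exception type, and treating 400/404 as "the remote does not know the v2 endpoint" is a guess rather than something this commit message states.

```python
from typing import Any, Callable


class HttpResponseError(Exception):
    """Hypothetical error carrying the HTTP status of a failed request."""

    def __init__(self, code: int) -> None:
        super().__init__(f"HTTP {code}")
        self.code = code


def send_invite(
    room_version: str,
    send_v2: Callable[[], Any],
    send_v1: Callable[[], Any],
) -> Any:
    """Try the v2 invite API, falling back to v1 for room versions 1 and 2."""
    try:
        return send_v2()
    except HttpResponseError as e:
        # Assumption: these codes mean the v2 endpoint is not recognised.
        if room_version in ("1", "2") and e.code in (400, 404):
            return send_v1()
        raise
```

The fallback is restricted to room versions 1 and 2 because, per the message above, only those are retried with the v1 API.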
|
| | | | |
|
| |\ \ \ |
|
| | |\ \ \
| | | | | |
| | | | | | |
Add room_version param to get_pdu
|
| | | | | | |
|
| | | |/ /
| | | | |
| | | | |
| | | | |
| | | | | |
When we add a new event format we'll need to know the event format or room
version when parsing events.
|
| | | | | |
|
| | | | | |
|
| | |/ /
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently they're stored as non-outliers even though the server isn't in
the room, which can be problematic in places where the code assumes it
has the state for all non outlier events.
In particular, there is an edge case where persisting the leave event
triggers a state resolution, which requires looking up the room version
from state. Since the server doesn't have the state, this causes an
exception to be thrown.
|
| | | | |
|
| |/ /
| | |
| | |
| | |
| | | |
We also implement `make_membership_event` converting the returned
room version to an event format version.
|
| | | |
|
| | | |
|
| | | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* Correctly retry and back off if we get a HTTPerror response
* Refactor request sending to have better exceptions
MatrixFederationHttpClient blindly reraised exceptions to the caller
without differentiating "expected" failures (e.g. connection timeouts
etc) versus more severe problems (e.g. programming errors).
This commit adds a RequestSendFailed exception that is raised when
"expected" failures happen, allowing the TransactionQueue to log them as
warnings while allowing us to log other exceptions as actual exceptions.
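
The split between "expected" and unexpected failures might look roughly like this. Only the `RequestSendFailed` name comes from the message above; its constructor, the choice of `ConnectionError`/`TimeoutError` as the "expected" set, and the helper functions are assumptions for illustration.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


class RequestSendFailed(Exception):
    """Wraps an expected transport failure (e.g. a timeout)."""

    def __init__(self, inner_exception: BaseException) -> None:
        super().__init__(
            f"Failed to send request: {type(inner_exception).__name__}: {inner_exception}"
        )
        self.inner_exception = inner_exception


def send_request(do_send: Callable[[], Any]) -> Any:
    """Run do_send, converting expected network errors to RequestSendFailed."""
    try:
        return do_send()
    except (ConnectionError, TimeoutError) as e:
        raise RequestSendFailed(e) from e


def attempt_transaction(do_send: Callable[[], Any]) -> bool:
    """Return True on success; log expected failures quietly, others loudly."""
    try:
        send_request(do_send)
        return True
    except RequestSendFailed as e:
        logger.warning("Could not send transaction: %s", e)
    except Exception:
        logger.exception("Unexpected error sending transaction")
    return False
```

Only the unexpected branch logs a stack trace, so routine connectivity problems do not drown out real bugs.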
|
| | |
| | |
| | | |
Co-Authored-By: erikjohnston <erikj@jki.re>
|
| | |
| | |
| | | |
Co-Authored-By: erikjohnston <erikj@jki.re>
|
| | |
| | |
| | |
| | |
| | |
| | | |
When we receive events over federation we will need to know the room
version to be able to correctly handle them, e.g. once we start changing
event formats. Currently, we attempt to handle events in unknown rooms.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* Add helpers for getting prev and auth events
This is in preparation for allowing the event format to change between
room versions.
|
| | |
| | |
| | |
| | |
| | | |
This is in preparation to refactor FrozenEvent to support different
event formats for different room versions
|
| |/ |
|
|/ |
|
|\
| |
| |
| | |
erikj/alias_disallow_list
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Broadly three things here:
* disable W504 which seems a bit whacko
* remove a bunch of `as e` expressions from exception handlers that don't use
them
* use `r""` for strings which include backslashes
Also, we don't use pep8 any more, so we can get rid of the duplicate config
there.
|
|/ |
|
|
|
|
|
|
|
|
|
| |
It's quite important that get_missing_events returns the *latest* events in the
room; however we were pulling event ids out of the database until we got *at
least* 10, and then taking the *earliest* of the results.
We also shouldn't really be relying on depth, and should be checking the
room_id.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Improve logging: log things in the right order, include destination and txids
in all log lines, don't log successful responses twice
- Fix the docstring on TransportLayerClient.send_transaction
- Don't use treq.request, which is overcomplicated for our purposes: just use a
twisted.web.client.Agent.
- simplify the logic for setting up the bodyProducer
- fix bytes/str confusions
|
|\
| |
| | |
remove spurious federation checks on localhost
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There's really no point in checking for destinations called "localhost" because
there is nothing stopping people creating other DNS entries which point to
127.0.0.1. The right fix for this is
https://github.com/matrix-org/synapse/issues/3953.
Blocking localhost, on the other hand, means that you get a surprise when
trying to connect a test server on localhost to an existing server (with a
'normal' server_name).
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
transactions (#3959)
when processing incoming transactions, it can be hard to see what's going on,
because we process a bunch of stuff in parallel, and because we may end up
recursively working our way through a chain of three or four events.
This commit creates a way to use logcontexts to add the relevant event ids to
the log lines.
|
|/
|
|
| |
trivial fixes for docstring
|
|\
| |
| | |
Comments and interface cleanup for on_receive_pdu
|
| |
| |
| |
| |
| |
| |
| |
| | |
Add some informative comments about what's going on here.
Also, `sent_to_us_directly` and `get_missing` were doing the same thing (apart
from in `_handle_queued_pdus`, which looks like a bug), so let's get rid of
`get_missing` and use `sent_to_us_directly` consistently.
|
|/
|
|
|
|
|
|
| |
ExpiringCache required that `start()` be called before it would actually
start expiring entries. A number of places didn't do that.
This PR removes `start` from ExpiringCache, and automatically starts the
background reaping process on creation instead.
|
|
|
|
|
|
|
|
|
|
| |
If we receive an event that doesn't pass its content hash check (e.g.
due to already being redacted) then we hit a bug which causes an
exception to be raised, which then promptly stops the event (and
request) from being processed.
This affects all sorts of federation APIs, including joining rooms with
a redacted state event.
|
| |
|
|\
| |
| | |
add some logging for the keyring queue
|
| | |
|
|/ |
|
|\ |
|
| |
| |
| |
| |
| | |
Use the actual origin for push transactions, rather than whatever the remote
server claimed.
|
| |
| |
| |
| |
| |
| | |
We should check that both the sender's server, and the server which created the
event_id (which may be different from whatever the remote server has told us
the origin is), have signed the event.
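
To make the rule concrete, here is a sketch that only works out which servers must appear in an event's `signatures` map. It assumes IDs of the form `@user:domain` and `$id:domain`, models the event as a plain dict, and does not verify the signatures cryptographically.

```python
from typing import Any, Dict, Set


def _domain_of(identifier: str) -> str:
    """Return the part after the first ':' (hypothetical helper)."""
    return identifier.partition(":")[2]


def required_signing_servers(event: Dict[str, Any]) -> Set[str]:
    """Servers that must have signed: the sender's and the event ID's."""
    return {_domain_of(event["sender"]), _domain_of(event["event_id"])}


def has_required_signatures(event: Dict[str, Any]) -> bool:
    """Check presence (not validity) of a signature from each required server."""
    signed_by = set(event.get("signatures", {}))
    return required_signing_servers(event) <= signed_by
```

Note that the claimed `origin` plays no part in deciding who must sign, which matches the intent described above.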
|
| | |
|
| |
| |
| |
| |
| | |
itervalues(d) calls d.itervalues() [PY2] and d.values() [PY3]
but SortedDict only implements d.values()
|
|\ \
| | |
| | | |
limt -> limit
|
| | | |
|
| | | |
|
| |/
|/|
| |
| |
| | |
Not being able to resolve or connect to remote servers is an expected
error, so we shouldn't log at ERROR with stacktraces.
|
| | |
|
|\ \
| | |
| | |
| | | |
erikj/split_federation
|
| | | |
|
| | | |
|
|\| |
| | |
| | |
| | | |
erikj/split_federation
|
| |\ \
| | | |
| | | | |
more metrics for the federation and appservice senders
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Reject make_join requests from servers which do not support the room version.
Also include the room version in the response.
|
| |/ /
| | |
| | |
| | | |
... to save me reverse-engineering this stuff again.
|
|/ / |
|
|\ \
| | |
| | | |
Clean up handling of errors from outbound requests
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This commit replaces SynapseError.from_http_response_exception with
HttpResponseException.to_synapse_error.
The new method actually returns a ProxiedRequestError, which allows us to pass
through additional metadata from the API call.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We really shouldn't be sending all CodeMessageExceptions back over the C-S API;
it will include things like 401s which we shouldn't proxy.
That means that we need to explicitly turn a few HttpResponseExceptions into
SynapseErrors in the federation layer.
The effect of the latter is that the matrix errcode will get passed through
correctly to calling clients, which might help with some of the random
M_UNKNOWN errors when trying to join rooms.
|
|\| |
| | |
| | |
| | |
| | | |
matrix-org/rav/refactor_federation_client_exception_handling
Factor out exception handling in federation_client
|
| | |
| | |
| | |
| | |
| | | |
Factor out the error handling from make_membership_event, send_join, and
send_leave, so that it can be shared.
|
|\ \ \
| |/ /
|/| | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When we get a federation request which refers to an event id, make sure that
said event is in the room the caller claims it is in.
(patch supplied by @turt2live)
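
In outline, the check amounts to something like the following, with an in-memory dict standing in for the event store. The function and exception names are hypothetical.

```python
from typing import Dict


class EventNotInRoomError(Exception):
    """Hypothetical error for an event ID that does not belong to the room."""


def check_event_in_room(
    events_by_id: Dict[str, Dict[str, str]], room_id: str, event_id: str
) -> None:
    """Raise unless event_id is known and belongs to room_id."""
    event = events_by_id.get(event_id)
    if event is None or event["room_id"] != room_id:
        raise EventNotInRoomError(f"{event_id} is not in room {room_id}")
```

Treating an unknown event the same as one in a different room avoids revealing which event IDs exist.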
|
| |/
|/| |
|
| |
| |
| | |
The field is never read from, and all the opportunities given to populate it are not utilized. It should be very safe to remove this.
|
| |
| |
| | |
It's still not used, however the parameter is an event ID not a transaction ID.
|
| |
| |
| |
| | |
when we get an exception handling a federation PDU, log the whole stacktrace.
|
| |
| |
| |
| |
| |
| |
| |
| | |
This fixes #3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.
(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces a mechanism for tracking resource usage by background
processes, along with an example of how it will be used.
This will help address #3518, but more importantly will give us better insights
into things which are happening but not being shown up by the request metrics.
We *could* do this with Measure blocks, but:
- I think having them pulled out as a completely separate metric class will
make it easier to distinguish top-level processes from those which are
nested.
- I want to be able to report on in-flight background processes, and I don't
think we want to do this for *all* Measure blocks.
|
|
|
|
|
|
| |
The method "assert_params_in_request" handles dicts, not requests, so a
request body has to be parsed from JSON before this method can be used.
|
| |
|
|
|
|
|
| |
... as described at
https://docs.google.com/document/d/1EttUVzjc2DWe2ciw4XPtNpUpIl9lWXGEsy2ewDS7rtw.
|
|
|
|
|
|
|
|
| |
We need to do a bit more validation when we get a server name, but don't want
to be re-doing it all over the shop, so factor out a separate
parse_and_validate_server_name, and do the extra validation.
Also, use it to verify the server name in the config file.
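
A simplified sketch of what such a function might do. It borrows the `parse_and_validate_server_name` name from the message above, but the validation rules shown (a basic hostname or bracketed IPv6 literal, plus an optional numeric port) are assumptions and not the actual implementation.

```python
import re
from typing import Optional, Tuple

_HOSTNAME_RE = re.compile(r"\A[0-9A-Za-z.-]+\Z")
_IPV6_LITERAL_RE = re.compile(r"\A\[[0-9A-Fa-f:.]+\]\Z")


def parse_and_validate_server_name(server_name: str) -> Tuple[str, Optional[int]]:
    """Split 'host[:port]' and raise ValueError if either part looks wrong."""
    host: str = server_name
    port: Optional[int] = None

    # A bracketed IPv6 literal without a port ends in ']', so only look
    # for a port when the name does not end that way.
    if not server_name.endswith("]"):
        head, sep, tail = server_name.rpartition(":")
        if sep:
            if not tail.isdigit():
                raise ValueError(f"Invalid port in server name {server_name!r}")
            host, port = head, int(tail)

    if not (_HOSTNAME_RE.match(host) or _IPV6_LITERAL_RE.match(host)):
        raise ValueError(f"Invalid host in server name {server_name!r}")
    return host, port
```

Keeping the parsing and the validation in one function means callers cannot get a parsed name without it having been checked.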
|
|
|
|
|
| |
Make sure that server_names used in auth headers are sane, and reject them with
a sensible error code, before they disappear off into the depths of the system.
|
|\
| |
| | |
Check the state of prev_events a bit more thoroughly when coming over federation
|
| | |
|
|/ |
|
|\
| |
| | |
Simplify get_persisted_pdu
|
| |
| |
| |
| |
| | |
it doesn't make much sense to use get_persisted_pdu on the receive path: just
get the event straight from the store.
|
| | |
|
|/ |
|
| |
|
| |
|
|
|
|
|
| |
Fixes a startup crash due to commit df9f72d9e5fe264b86005208e0f096156eb03e4b
"replacing portions".
|
|
|
|
| |
they're not meant to be lazy (#3307)
|
| |
|
| |
|
| |
|
| |
|
| |
|
|\
| |
| | |
transaction_id, destination defined twice
|
| | |
|
|\| |
|
| |
| |
| |
| |
| |
| |
| | |
* When creating a new event, cap its depth to 2^63 - 1
* When receiving events, reject any without a sensible depth
As per https://docs.google.com/document/d/1I3fi2S-XnpO45qrpCsowZv8P8dHcNZ4fsBsbOW7KABI
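
The two rules can be sketched as follows. The cap of 2^63 - 1 comes from the message above; reading "sensible" as "a non-negative integer no larger than the cap" is an assumption, as are the function names.

```python
from typing import Any, Dict

MAX_DEPTH = 2**63 - 1


def cap_depth(depth: int) -> int:
    """Clamp the depth of a newly created event to MAX_DEPTH."""
    return min(depth, MAX_DEPTH)


def check_depth(event: Dict[str, Any]) -> None:
    """Reject a received event whose depth is not an integer in range."""
    depth = event.get("depth")
    # type() rather than isinstance(), so that booleans are not accepted.
    if type(depth) is not int:
        raise ValueError(f"Depth {depth!r} is not an integer")
    if depth < 0 or depth > MAX_DEPTH:
        raise ValueError(f"Depth {depth} is out of range")
```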
|
|\ \ |
|
| |\ \ |
|
| | |/
| |/|
| | |
| | |
| | |
| | | |
While I was going through uses of preserve_fn for other PRs, I converted places
which only use the wrapped function once to use run_in_background, to avoid
creating the function object.
|
| |/
|/|
| |
| |
| |
| | |
plus a bonus next()
Signed-off-by: Adrian Tschira <nota@notafile.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were a bunch of places where we fire off a process to happen in the
background, but don't have any exception handling on it - instead relying on
the unhandled error being logged when the relevant deferred gets
garbage-collected.
This is unsatisfactory for a number of reasons:
- logging on garbage collection is best-effort and may happen some time after
the error, if at all
- it can be hard to figure out where the error actually happened.
- it is logged as a scary CRITICAL error which (a) I always forget to grep for
and (b) it's not really CRITICAL if a background process we don't care about
fails.
So this is an attempt to add exception handling to everything we fire off into
the background.
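
The same idea expressed with asyncio rather than Twisted deferreds, as a sketch only. `fire_and_log` is a hypothetical helper, not a function from the codebase.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable

logger = logging.getLogger(__name__)


def fire_and_log(
    func: Callable[..., Awaitable[Any]], *args: Any
) -> "asyncio.Future[Any]":
    """Start func(*args) in the background and log any exception it raises.

    Must be called while an event loop is running.
    """

    async def wrapper() -> Any:
        try:
            return await func(*args)
        except Exception:
            # Logged immediately, at the point of failure, with a traceback.
            logger.exception("Background process %r failed", func.__name__)
            return None

    return asyncio.ensure_future(wrapper())
```

Because the error is caught inside the wrapper, it is logged when it happens, not whenever the abandoned task is garbage-collected.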
|
|\
| |
| | |
Reject events which have lots of prev_events
|
| | |
|
|\ \
| | |
| | | |
Use six.itervalues in some places
|
| |/
| |
| |
| |
| |
| | |
There's more where that came from
Signed-off-by: Adrian Tschira <nota@notafile.com>
|
|\ \
| | |
| | | |
Refactor ResponseCache usage
|
| | | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Adds a `.wrap` method to ResponseCache which wraps up the boilerplate of a
(get, set) pair, and then use it throughout the codebase.
This will be largely non-functional, but does include the following functional
changes:
* federation_server.on_context_state_request: drops use of _server_linearizer
which looked redundant and could cause incorrect cache misses by yielding
between the get and the set.
* RoomListHandler.get_remote_public_room_list(): fixes logcontext leaks
* the wrap function includes some logging. I'm hoping this won't be too noisy
on production.
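
A sketch of the (get, set) boilerplate that a `.wrap` method folds away, written with asyncio futures rather than Twisted deferreds. Dropping entries as soon as the computation finishes is an assumption of this sketch.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Hashable, Optional


class ResponseCache:
    """Shares one in-flight computation between concurrent callers."""

    def __init__(self) -> None:
        self._pending: Dict[Hashable, "asyncio.Future[Any]"] = {}

    def get(self, key: Hashable) -> "Optional[asyncio.Future[Any]]":
        return self._pending.get(key)

    def set(self, key: Hashable, future: "asyncio.Future[Any]") -> "asyncio.Future[Any]":
        self._pending[key] = future
        # Assumption: forget the entry once the result is available.
        future.add_done_callback(lambda _: self._pending.pop(key, None))
        return future

    def wrap(
        self, key: Hashable, callback: Callable[..., Awaitable[Any]], *args: Any
    ) -> "asyncio.Future[Any]":
        """Return the pending result for key, starting callback if needed."""
        result = self.get(key)
        if result is None:
            # No await between get and set, so no other caller can slip in.
            result = self.set(key, asyncio.ensure_future(callback(*args)))
        return result
```

Since `wrap` never yields between the lookup and the store, two callers racing on the same key cannot both miss the cache.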
|
| |/
|/|
| |
| |
| |
| | |
It turns out that most of the time we were calling have_events, we were only
using half of the result. Replace have_events with have_seen_events and
get_rejection_reasons, so that we can see what's going on a bit more clearly.
|
| |
| |
| |
| | |
we were checking the wrong server_name on inbound requests
|
| | |
|
| | |
|
|/
|
|
|
|
|
|
|
|
|
| |
This reverts commit 9fbe70a7dc3afabfdac176ba1f4be32dd44602aa.
It turns out that sortedcontainers.SortedDict is not an exact match for
blist.sorteddict; in particular, `popitem()` removes things from the opposite
end of the dict.
This is trivial to fix, but I want to add some unit tests, and potentially some
more thought about it, before we do so.
|
|\
| |
| | |
Add metrics for ResponseCache
|
| | |
|
| | |
|
| | |
|
| | |
|
|\ \
| | |
| | | |
Synapse on PyPy
|
| |/
| |
| |
| |
| |
| |
| |
| | |
This commit drop-in replaces blist with SortedContainers. They are
written in pure Python so work with PyPy, but perform as well as
native implementations, at least in a couple of benchmarks:
http://www.grantjenks.com/docs/sortedcontainers/performance.html
|
|\ \
| | |
| | | |
Send federation events concurrently
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| |/ |
|
| | |
|
|/ |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The API is now under
/groups/$group_id/setting/m.join_policy
and expects a JSON blob of the shape
```json
{
"m.join_policy": {
"type": "invite"
}
}
```
where "invite" could alternatively be "open".
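
For illustration, a server might validate such a body along these lines. `parse_join_policy` is a hypothetical helper; only the JSON shape and the two policy values come from the description above.

```python
import json

VALID_JOIN_POLICIES = ("invite", "open")


def parse_join_policy(body: str) -> str:
    """Return the join policy type from a JSON request body, or raise ValueError."""
    content = json.loads(body)
    if not isinstance(content, dict):
        raise ValueError("Request body must be a JSON object")
    policy = content.get("m.join_policy")
    if not isinstance(policy, dict) or policy.get("type") not in VALID_JOIN_POLICIES:
        raise ValueError("Missing or invalid m.join_policy")
    return policy["type"]
```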
|
| |
|
|
|
|
|
| |
Adds API to set the 'joinable' flag, and corresponding flag in the
table.
|
|\
| |
| | |
Remove ReplicationLayer and use Client/Server directly
|
| | |
|
|\|
| |
| | |
Don't build handlers on workers unnecessarily
|
| | |
|
| | |
|
|\|
| |
| | |
Move property setting from ReplicationLayer to base classes
|
| | |
|
|/ |
|
| |
|
|
|
|
|
|
| |
Add federation_domain_whitelist
Gives a way to restrict which domains your HS is allowed to federate with.
Useful mainly for gracefully preventing a private but internet-connected HS from trying to federate to the wider public Matrix network.
|
|
|
|
| |
More metrics I wished I'd had
|
|
|
|
| |
Return a 400 rather than a 500 when somebody messes up their send_join
|
|
|
|
|
| |
turns out we have two copies of this, and neither needs to be an instance
method
|
| |
|
| |
|
|
|
|
|
| |
These processes take a long time compared to the request, so there is lots of
"Entering|Restoring dead context" in the logs. Let's try to shut it up a bit.
|
|
|
|
|
| |
Both of these functions are known to leak logcontexts. Replace the remaining
calls to them and kill them off.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
also includes renamings to make things more consistent.
|
| |
|
|\
| |
| |
| | |
erikj/group_fed_update_profile
|
| |
| |
| |
| | |
what could possibly go wrong
|
|/ |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
FederationServer doesn't have a send_failure (and nor does its subclass,
ReplicationLayer), so this was failing.
I'm not really sure what the idea behind send_failure is, given (a) we don't do
anything at the other end with it except log it, and (b) we also send back the
failure via the transaction response. I suspect there's a whole lot of dead
code around it, but for now I'm just removing the broken bit.
|
| |
|
| |
|
|\
| |
| | |
log pdu_failures from incoming transactions
|
| |
| |
| |
| |
| |
| |
| | |
... even if we have no EDUs.
This appears to have been introduced in
476899295f5fd6cff64799bcbc84cd4bf9005e33.
|
| | |
|
|\ \
| | |
| | | |
Initial Group Implementation
|
| |\ \ |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | | |
|
| | | |
| | | |
| | | |
| | | |
| | | | |
With luck, this will give a real-time improvement when there are many rooms and
the server ends up calling out to fetch missing events.
|
| |_|/
|/| |
| | |
| | |
| | | |
We don't want to process the same transaction multiple times concurrently, so
use a linearizer.
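
The idea of a linearizer, sketched with asyncio: work items that share a key run one at a time, while different keys proceed in parallel. This is an analogue written for illustration, not the Twisted-based class used in the codebase, and keying on `(origin, transaction_id)` is an assumption.

```python
import asyncio
from collections import defaultdict
from contextlib import asynccontextmanager
from typing import AsyncIterator, DefaultDict, Hashable, List, Tuple


class Linearizer:
    """Serialises access per key using one lock per key."""

    def __init__(self) -> None:
        # Locks are never removed here; a real implementation would clean up.
        self._locks: DefaultDict[Hashable, asyncio.Lock] = defaultdict(asyncio.Lock)

    @asynccontextmanager
    async def queue(self, key: Hashable) -> AsyncIterator[None]:
        async with self._locks[key]:
            yield


async def handle_transaction(
    linearizer: Linearizer, origin: str, txn_id: str, log: List[Tuple[str, str]]
) -> None:
    """Process one transaction; duplicates of the same one wait their turn."""
    async with linearizer.queue((origin, txn_id)):
        log.append(("start", txn_id))
        await asyncio.sleep(0)  # stand-in for the real processing
        log.append(("end", txn_id))
```

A second delivery of the same transaction then waits for the first to finish, at which point it can be answered from the already-computed response instead of being processed again.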
|
| | |
| | |
| | |
| | |
| | | |
Move as much as possible to after the have_responded check, and reduce the
number of times we iterate over the pdu list.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The response-building code expects there to be an entry in the `results` list
for each entry in the pdu_list, so the early `continue` was messing this
up. That doesn't really matter, because all that the federation client does is
log any errors, but it's pretty poor form.
|
| |/
|/|
| |
| |
| | |
Avoid using preserve_context_over_function, which has problems with respect to
logcontexts.
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| | |
preserve_fn is a no-op unless the wrapped function returns a
Deferred. verify_json_objects_for_server returns a list, so this is doing
nothing.
|
|/
|
| |
Demonstration of how you might add some hooks to filter out spammy events.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
might help us figure out if https://github.com/vector-im/riot-web/issues/3868
has happened.
|
|\
| |
| | |
Always mark remotes as up if we receive a signed request from them
|
| | |
|
| | |
|
|/ |
|
|
|
|
|
| |
When we're rejecting invites, ignore the backoff data, so that we have a better
chance of not getting the room out of sync.
|
|
|
|
|
| |
The documentation on get_json has been wrong ever since the very first commit
to synapse...
|