Diffstat (limited to 'docs/server-server')
-rw-r--r-- | docs/server-server/protocol-format.rst | 59
-rw-r--r-- | docs/server-server/signing.rst | 151
-rw-r--r-- | docs/server-server/specification.rst | 231
-rw-r--r-- | docs/server-server/versioning.rst | 11
4 files changed, 0 insertions, 452 deletions
diff --git a/docs/server-server/protocol-format.rst b/docs/server-server/protocol-format.rst
deleted file mode 100644
index 2838253ab7..0000000000
--- a/docs/server-server/protocol-format.rst
+++ /dev/null
@@ -1,59 +0,0 @@
-
-Transaction
-===========
-
-Required keys:
-
-============ =================== ===============================================
-    Key            Type                         Description
-============ =================== ===============================================
-origin       String              DNS name of homeserver making this transaction.
-ts           Integer             Timestamp in milliseconds on originating
-                                 homeserver when this transaction started.
-previous_ids List of Strings     List of transactions that were sent immediately
-                                 prior to this transaction.
-pdus         List of Objects     List of updates contained in this transaction.
-============ =================== ===============================================
-
-
-PDU
-===
-
-Required keys:
-
-============ ================== ================================================
-    Key            Type                         Description
-============ ================== ================================================
-context      String             Event context identifier
-origin       String             DNS name of homeserver that created this PDU.
-pdu_id       String             Unique identifier for PDU within the context for
-                                the originating homeserver.
-ts           Integer            Timestamp in milliseconds on originating
-                                homeserver when this PDU was created.
-pdu_type     String             PDU event type.
-prev_pdus    List of Pairs      The originating homeserver and PDU ids of the
-             of Strings         most recent PDUs the homeserver was aware of for
-                                this context when it made this PDU.
-depth        Integer            The maximum depth of the previous PDUs plus one.
-============ ================== ================================================
-
-Keys for state updates:
-
-================== ============ ================================================
-    Key               Type                         Description
-================== ============ ================================================
-is_state           Boolean      True if this PDU is updating state.
-state_key          String       Optional key identifying the updated state within
-                                the context.
-power_level        Integer      The asserted power level of the user performing
-                                the update.
-min_update         Integer      The required power level needed to replace this
-                                update.
-prev_state_id      String       The homeserver of the update this replaces
-prev_state_origin  String       The PDU id of the update this replaces.
-user               String       The user updating the state.
-================== ============ ================================================
-
-
-
-
diff --git a/docs/server-server/signing.rst b/docs/server-server/signing.rst
deleted file mode 100644
index dae10f121b..0000000000
--- a/docs/server-server/signing.rst
+++ /dev/null
@@ -1,151 +0,0 @@
-Signing JSON
-============
-
-JSON is signed by encoding the JSON object without ``signatures`` or ``meta``
-keys using a canonical encoding. The JSON bytes are then signed using the
-signature algorithm and the signature encoded using base64 with the padding
-stripped. The resulting base64 signature is added to an object under the
-*signing key identifier* which is added to the ``signatures`` object under the
-name of the server signing it which is added back to the original JSON object
-along with the ``meta`` object.
-
-The *signing key identifier* is the concatenation of the *signing algorithm*
-and a *key version*. The *signing algorithm* identifies the algorithm used to
-sign the JSON. The currently support value for *signing algorithm* is
-``ed25519`` as implemented by NACL (http://nacl.cr.yp.to/). The *key version*
-is used to distinguish between different signing keys used by the same entity.
-
-The ``meta`` object and the ``signatures`` object are not covered by the
-signature. Therefore intermediate servers can add metadata such as time stamps
-and additional signatures.
-
-
-::
-
-    {
-        "name": "example.org",
-        "signing_keys": {
-            "ed25519:1": "XSl0kuyvrXNj6A+7/tkrB9sxSbRi08Of5uRhxOqZtEQ"
-        },
-        "meta": {
-            "retrieved_ts_ms": 922834800000
-        },
-        "signatures": {
-            "example.org": {
-                "ed25519:1": "s76RUgajp8w172am0zQb/iPTHsRnb4SkrzGoeCOSFfcBY2V/1c8QfrmdXHpvnc2jK5BD1WiJIxiMW95fMjK7Bw"
-            }
-        }
-    }
-
-::
-
-    def sign_json(json_object, signing_key, signing_name):
-        signatures = json_object.pop("signatures", {})
-        meta = json_object.pop("meta", None)
-
-        signed = signing_key.sign(encode_canonical_json(json_object))
-        signature_base64 = encode_base64(signed.signature)
-
-        key_id = "%s:%s" % (signing_key.alg, signing_key.version)
-        signatures.setdefault(sigature_name, {})[key_id] = signature_base64
-
-        json_object["signatures"] = signatures
-        if meta is not None:
-            json_object["meta"] = meta
-
-        return json_object
-
-Checking for a Signature
-------------------------
-
-To check if an entity has signed a JSON object a server does the following
-
-1. Checks if the ``signatures`` object contains an entry with the name of the
-   entity. If the entry is missing then the check fails.
-2. Removes any *signing key identifiers* from the entry with algorithms it
-   doesn't understand. If there are no *signing key identifiers* left then the
-   check fails.
-3. Looks up *verification keys* for the remaining *signing key identifiers*
-   either from a local cache or by consulting a trusted key server. If it
-   cannot find a *verification key* then the check fails.
-4. Decodes the base64 encoded signature bytes. If base64 decoding fails then
-   the check fails.
-5. Checks the signature bytes using the *verification key*. If this fails then
-   the check fails. Otherwise the check succeeds.
-
-Canonical JSON
---------------
-
-The canonical JSON encoding for a value is the shortest UTF-8 JSON encoding
-with dictionary keys lexicographically sorted by unicode codepoint. Numbers in
-the JSON value must be integers in the range [-(2**53)+1, (2**53)-1].
-
-::
-
-    import json
-
-    def canonical_json(value):
-        return json.dumps(
-            value,
-            ensure_ascii=False,
-            separators=(',',':'),
-            sort_keys=True,
-        ).encode("UTF-8")
-
-Grammar
-+++++++
-
-Adapted from the grammar in http://tools.ietf.org/html/rfc7159 removing
-insignificant whitespace, fractions, exponents and redundant character escapes
-
-::
-
-    value     = false / null / true / object / array / number / string
-    false     = %x66.61.6c.73.65
-    null      = %x6e.75.6c.6c
-    true      = %x74.72.75.65
-    object    = %x7B [ member *( %x2C member ) ] %7D
-    member    = string %x3A value
-    array     = %x5B [ value *( %x2C value ) ] %5B
-    number    = [ %x2D ] int
-    int       = %x30 / ( %x31-39 *digit )
-    digit     = %x30-39
-    string    = %x22 *char %x22
-    char      = unescaped / %x5C escaped
-    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
-    escaped   = %x22 ; "    quotation mark  U+0022
-              / %x5C ; \    reverse solidus U+005C
-              / %x62 ; b    backspace       U+0008
-              / %x66 ; f    form feed       U+000C
-              / %x6E ; n    line feed       U+000A
-              / %x72 ; r    carriage return U+000D
-              / %x74 ; t    tab             U+0009
-              / %x75.30.30.30 (%x30-37 / %x62 / %x65-66) ; u000X
-              / %x75.30.30.31 (%x30-39 / %x61-66)        ; u001X
-
-Signing Events
-==============
-
-Signing events is a more complicated process since servers can choose to redact
-non-essential event contents. Before signing the event it is encoded as
-Canonical JSON and hashed using SHA-256. The resulting hash is then stored
-in the event JSON in a ``hash`` object under a ``sha256`` key. Then all
-non-essential keys are stripped from the event object, and the resulting object
-which included the ``hash`` key is signed using the JSON signing algorithm.
-
-Servers can then transmit the entire event or the event with the non-essential
-keys removed. Receiving servers can then check the entire event if it is
-present by computing the SHA-256 of the event excluding the ``hash`` object, or
-by using the ``hash`` object included in the event if keys have been redacted.
-
-New hash functions can be introduced by adding additional keys to the ``hash``
-object. Since the ``hash`` object cannot be redacted a server shouldn't allow
-too many hashes to be listed, otherwise a server might embed illict data within
-the ``hash`` object. For similar reasons a server shouldn't allow hash values
-that are too long.
-
-[[TODO(markjh): We might want to specify a maximum number of keys for the
-``hash`` and we might want to specify the maximum output size of a hash]]
-
-[[TODO(markjh) We might want to allow the server to omit the output of well
-known hash functions like SHA-256 when none of the keys have been redacted]]
diff --git a/docs/server-server/specification.rst b/docs/server-server/specification.rst
deleted file mode 100644
index 17cffafdd4..0000000000
--- a/docs/server-server/specification.rst
+++ /dev/null
@@ -1,231 +0,0 @@
-===========================
-Matrix Server-to-Server API
-===========================
-
-A description of the protocol used to communicate between Matrix home servers;
-also known as Federation.
-
-
-Overview
-========
-
-The server-server API is a mechanism by which two home servers can exchange
-Matrix event messages, both as a real-time push of current events, and as a
-historic fetching mechanism to synchronise past history for clients to view. It
-uses HTTP connections between each pair of servers involved as the underlying
-transport. Messages are exchanged between servers in real-time by active pushing
-from each server's HTTP client into the server of the other. Queries to fetch
-historic data for the purpose of back-filling scrollback buffers and the like
-can also be performed.
-
-
-   { Matrix clients }                              { Matrix clients }
-      ^          |                                    ^          |
-      |  events  |                                    |  events  |
-      |          V                                    |          V
-  +------------------+                            +------------------+
-  |                  |---------( HTTP )---------->|                  |
-  |   Home Server    |                            |   Home Server    |
-  |                  |<--------( HTTP )-----------|                  |
-  +------------------+                            +------------------+
-
-There are three main kinds of communication that occur between home servers:
-
- * Queries
-   These are single request/response interactions between a given pair of
-   servers, initiated by one side sending an HTTP request to obtain some
-   information, and responded by the other. They are not persisted and contain
-   no long-term significant history. They simply request a snapshot state at the
-   instant the query is made.
-
- * EDUs - Ephemeral Data Units
-   These are notifications of events that are pushed from one home server to
-   another. They are not persisted and contain no long-term significant history,
-   nor does the receiving home server have to reply to them.
-
- * PDUs - Persisted Data Units
-   These are notifications of events that are broadcast from one home server to
-   any others that are interested in the same "context" (namely, a Room ID).
-   They are persisted to long-term storage and form the record of history for
-   that context.
-
-Where Queries are presented directly across the HTTP connection as GET requests
-to specific URLs, EDUs and PDUs are further wrapped in an envelope called a
-Transaction, which is transferred from the origin to the destination home server
-using a PUT request.
-
-
-Transactions and EDUs/PDUs
-==========================
-
-The transfer of EDUs and PDUs between home servers is performed by an exchange
-of Transaction messages, which are encoded as JSON objects with a dict as the
-top-level element, passed over an HTTP PUT request. A Transaction is meaningful
-only to the pair of home servers that exchanged it; they are not globally-
-meaningful.
-
-Each transaction has an opaque ID and timestamp (UNIX epoch time in
-milliseconds) generated by its origin server, an origin and destination server
-name, a list of "previous IDs", and a list of PDUs - the actual message payload
-that the Transaction carries.
-
-    {"transaction_id":"916d630ea616342b42e98a3be0b74113",
-     "ts":1404835423000,
-     "origin":"red",
-     "destination":"blue",
-     "prev_ids":["e1da392e61898be4d2009b9fecce5325"],
-     "pdus":[...],
-     "edus":[...]}
-
-The "previous IDs" field will contain a list of previous transaction IDs that
-the origin server has sent to this destination. Its purpose is to act as a
-sequence checking mechanism - the destination server can check whether it has
-successfully received that Transaction, or ask for a retransmission if not.
-
-The "pdus" field of a transaction is a list, containing zero or more PDUs.[*]
-Each PDU is itself a dict containing a number of keys, the exact details of
-which will vary depending on the type of PDU. Similarly, the "edus" field is
-another list containing the EDUs. This key may be entirely absent if there are
-no EDUs to transfer.
-
-(* Normally the PDU list will be non-empty, but the server should cope with
-receiving an "empty" transaction, as this is useful for informing peers of other
-transaction IDs they should be aware of. This effectively acts as a push
-mechanism to encourage peers to continue to replicate content.)
-
-All PDUs have an ID, a context, a declaration of their type, a list of other PDU
-IDs that have been seen recently on that context (regardless of which origin
-sent them), and a nested content field containing the actual event content.
-
-[[TODO(paul): Update this structure so that 'pdu_id' is a two-element
-[origin,ref] pair like the prev_pdus are]]
-
-    {"pdu_id":"a4ecee13e2accdadf56c1025af232176",
-     "context":"#example.green",
-     "origin":"green",
-     "ts":1404838188000,
-     "pdu_type":"m.text",
-     "prev_pdus":[["blue","99d16afbc857975916f1d73e49e52b65"]],
-     "content":...
-     "is_state":false}
-
-In contrast to the transaction layer, it is important to note that the prev_pdus
-field of a PDU refers to PDUs that any origin server has sent, rather than
-previous IDs that this origin has sent. This list may refer to other PDUs sent
-by the same origin as the current one, or other origins.
-
-Because of the distributed nature of participants in a Matrix conversation, it
-is impossible to establish a globally-consistent total ordering on the events.
-However, by annotating each outbound PDU at its origin with IDs of other PDUs it
-has received, a partial ordering can be constructed allowing causallity
-relationships to be preserved. A client can then display these messages to the
-end-user in some order consistent with their content and ensure that no message
-that is semantically in reply of an earlier one is ever displayed before it.
-
-PDUs fall into two main categories: those that deliver Events, and those that
-synchronise State. For PDUs that relate to State synchronisation, additional
-keys exist to support this:
-
-    {...,
-     "is_state":true,
-     "state_key":TODO
-     "power_level":TODO
-     "prev_state_id":TODO
-     "prev_state_origin":TODO}
-
-[[TODO(paul): At this point we should probably have a long description of how
-State management works, with descriptions of clobbering rules, power levels, etc
-etc... But some of that detail is rather up-in-the-air, on the whiteboard, and
-so on. This part needs refining. And writing in its own document as the details
-relate to the server/system as a whole, not specifically to server-server
-federation.]]
-
-EDUs, by comparison to PDUs, do not have an ID, a context, or a list of
-"previous" IDs. The only mandatory fields for these are the type, origin and
-destination home server names, and the actual nested content.
-
-    {"edu_type":"m.presence",
-     "origin":"blue",
-     "destination":"orange",
-     "content":...}
-
-
-Protocol URLs
-=============
-
-All these URLs are namespaced within a prefix of
-
-  /_matrix/federation/v1/...
-
-For active pushing of messages representing live activity "as it happens":
-
-  PUT .../send/:transaction_id/
-    Body: JSON encoding of a single Transaction
-
-    Response: [[TODO(paul): I don't actually understand what
-    ReplicationLayer.on_transaction() is doing here, so I'm not sure what the
-    response ought to be]]
-
-  The transaction_id path argument will override any ID given in the JSON body.
-  The destination name will be set to that of the receiving server itself. Each
-  embedded PDU in the transaction body will be processed.
-
-
-To fetch a particular PDU:
-
-  GET .../pdu/:origin/:pdu_id/
-
-    Response: JSON encoding of a single Transaction containing one PDU
-
-  Retrieves a given PDU from the server. The response will contain a single new
-  Transaction, inside which will be the requested PDU.
-
-
-To fetch all the state of a given context:
-
-  GET .../state/:context/
-
-    Response: JSON encoding of a single Transaction containing multiple PDUs
-
-  Retrieves a snapshot of the entire current state of the given context. The
-  response will contain a single Transaction, inside which will be a list of
-  PDUs that encode the state.
-
-
-To backfill events on a given context:
-
-  GET .../backfill/:context/
-    Query args: v, limit
-
-    Response: JSON encoding of a single Transaction containing multiple PDUs
-
-  Retrieves a sliding-window history of previous PDUs that occurred on the
-  given context. Starting from the PDU ID(s) given in the "v" argument, the
-  PDUs that preceeded it are retrieved, up to a total number given by the
-  "limit" argument. These are then returned in a new Transaction containing all
-  off the PDUs.
-
-
-To stream events all the events:
-
-  GET .../pull/
-    Query args: origin, v
-
-    Response: JSON encoding of a single Transaction consisting of multiple PDUs
-
-  Retrieves all of the transactions later than any version given by the "v"
-  arguments. [[TODO(paul): I'm not sure what the "origin" argument does because
-  I think at some point in the code it's got swapped around.]]
-
-
-To make a query:
-
-  GET .../query/:query_type
-    Query args: as specified by the individual query types
-
-    Response: JSON encoding of a response object
-
-  Performs a single query request on the receiving home server. The Query Type
-  part of the path specifies the kind of query being made, and its query
-  arguments have a meaning specific to that kind of query. The response is a
-  JSON-encoded object whose meaning also depends on the kind of query.
diff --git a/docs/server-server/versioning.rst b/docs/server-server/versioning.rst
deleted file mode 100644
index ffda60633f..0000000000
--- a/docs/server-server/versioning.rst
+++ /dev/null
@@ -1,11 +0,0 @@
-Versioning is, like, hard for backfilling backwards because of the number of Home Servers involved.
-
-The way we solve this is by doing versioning as an acyclic directed graph of PDUs. For backfilling purposes, this is done on a per context basis.
-When we send a PDU we include all PDUs that have been received for that context that hasn't been subsequently listed in a later PDU. The trivial case is a simple list of PDUs, e.g. A <- B <- C. However, if two servers send out a PDU at the same to, both B and C would point at A - a later PDU would then list both B and C.
-
-Problems with opaque version strings:
-    - How do you do clustering without mandating that a cluster can only have one transaction in flight to a given remote home server at a time.
-      If you have multiple transactions sent at once, then you might drop one transaction, receive another with a version that is later than the dropped transaction and which point ARGH WE LOST A TRANSACTION.
-    - How do you do backfilling? A version string defines a point in a stream w.r.t. a single home server, not a point in the context.
-
-We only need to store the ends of the directed graph, we DO NOT need to do the whole one table of nodes and one of edges.
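
The removed versioning.rst says that only the ends of the per-context PDU graph need to be stored, and protocol-format.rst defines ``depth`` as the maximum depth of the previous PDUs plus one. The following is a minimal Python sketch of that bookkeeping, not code from the repository. The names ``ContextEnds``, ``on_pdu`` and ``make_pdu`` are hypothetical, a first PDU is assumed to have depth 1, and every PDU is assumed to arrive after the PDUs it lists in ``prev_pdus``.

```python
from dataclasses import dataclass, field

# A PDU reference is an (origin, pdu_id) pair, as in the prev_pdus field.
PduRef = tuple[str, str]


@dataclass
class ContextEnds:
    """Hypothetical helper: tracks only the ends of one context's PDU graph."""

    # Maps the reference of each current end to its depth.
    ends: dict[PduRef, int] = field(default_factory=dict)

    def on_pdu(self, origin, pdu_id, prev_pdus, depth):
        """Record a PDU that was received or sent for this context."""
        # Anything this PDU lists has now been referenced, so it stops
        # being an end. Assumes in-order delivery (see the lead-in).
        for ref in prev_pdus:
            self.ends.pop(tuple(ref), None)
        self.ends[(origin, pdu_id)] = depth

    def make_pdu(self, origin, pdu_id):
        """Build the graph-related keys of a new outbound PDU."""
        prev = sorted(self.ends)
        # Assumption: a PDU with no predecessors gets depth 1.
        depth = max(self.ends.values(), default=0) + 1
        pdu = {
            "origin": origin,
            "pdu_id": pdu_id,
            "prev_pdus": [list(ref) for ref in prev],
            "depth": depth,
        }
        self.on_pdu(origin, pdu_id, prev, depth)
        return pdu


# The "B and C both point at A" case from versioning.rst.
ctx = ContextEnds()
ctx.make_pdu("red", "A")
ctx.on_pdu("blue", "B", [["red", "A"]], 2)
ctx.on_pdu("green", "C", [["red", "A"]], 2)
later = ctx.make_pdu("red", "D")  # lists both B and C in prev_pdus
```

Storing one small mapping per context is enough for this case because a new PDU only ever needs the unreferenced PDUs. Handling out-of-order delivery would need extra state that this sketch leaves out.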