From c5cec1cc77029c21f0117c318c522ab320de3923 Mon Sep 17 00:00:00 2001 From: Mark Haines Date: Fri, 17 Oct 2014 16:50:04 +0100 Subject: Rename 'meta' to 'unsigned' --- docs/server-server/signing.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) (limited to 'docs/server-server') diff --git a/docs/server-server/signing.rst b/docs/server-server/signing.rst index dae10f121b..60c701ca91 100644 --- a/docs/server-server/signing.rst +++ b/docs/server-server/signing.rst @@ -1,13 +1,13 @@ Signing JSON ============ -JSON is signed by encoding the JSON object without ``signatures`` or ``meta`` +JSON is signed by encoding the JSON object without ``signatures`` or ``unsigned`` keys using a canonical encoding. The JSON bytes are then signed using the signature algorithm and the signature encoded using base64 with the padding stripped. The resulting base64 signature is added to an object under the *signing key identifier* which is added to the ``signatures`` object under the name of the server signing it which is added back to the original JSON object -along with the ``meta`` object. +along with the ``unsigned`` object. The *signing key identifier* is the concatenation of the *signing algorithm* and a *key version*. The *signing algorithm* identifies the algorithm used to @@ -15,8 +15,8 @@ sign the JSON. The currently support value for *signing algorithm* is ``ed25519`` as implemented by NACL (http://nacl.cr.yp.to/). The *key version* is used to distinguish between different signing keys used by the same entity. -The ``meta`` object and the ``signatures`` object are not covered by the -signature. Therefore intermediate servers can add metadata such as time stamps +The ``unsigned`` object and the ``signatures`` object are not covered by the +signature. Therefore intermediate servers can add unsigneddata such as time stamps and additional signatures. @@ -27,7 +27,7 @@ and additional signatures. 
"signing_keys": { "ed25519:1": "XSl0kuyvrXNj6A+7/tkrB9sxSbRi08Of5uRhxOqZtEQ" }, - "meta": { + "unsigned": { "retrieved_ts_ms": 922834800000 }, "signatures": { @@ -41,7 +41,7 @@ and additional signatures. def sign_json(json_object, signing_key, signing_name): signatures = json_object.pop("signatures", {}) - meta = json_object.pop("meta", None) + unsigned = json_object.pop("unsigned", None) signed = signing_key.sign(encode_canonical_json(json_object)) signature_base64 = encode_base64(signed.signature) @@ -50,8 +50,8 @@ and additional signatures. signatures.setdefault(sigature_name, {})[key_id] = signature_base64 json_object["signatures"] = signatures - if meta is not None: - json_object["meta"] = meta + if unsigned is not None: + json_object["unsigned"] = unsigned return json_object -- cgit 1.5.1 From bebca337c4c19b653d69536f9915ca185bade5c0 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 11 Nov 2014 20:43:36 +0200 Subject: this has been merged into matrix-doc/specification/30_server_server_api.rst --- docs/server-server/specification.rst | 231 ----------------------------------- 1 file changed, 231 deletions(-) delete mode 100644 docs/server-server/specification.rst (limited to 'docs/server-server') diff --git a/docs/server-server/specification.rst b/docs/server-server/specification.rst deleted file mode 100644 index 17cffafdd4..0000000000 --- a/docs/server-server/specification.rst +++ /dev/null @@ -1,231 +0,0 @@ -=========================== -Matrix Server-to-Server API -=========================== - -A description of the protocol used to communicate between Matrix home servers; -also known as Federation. - - -Overview -======== - -The server-server API is a mechanism by which two home servers can exchange -Matrix event messages, both as a real-time push of current events, and as a -historic fetching mechanism to synchronise past history for clients to view. It -uses HTTP connections between each pair of servers involved as the underlying -transport. 
Messages are exchanged between servers in real-time by active pushing -from each server's HTTP client into the server of the other. Queries to fetch -historic data for the purpose of back-filling scrollback buffers and the like -can also be performed. - - - { Matrix clients } { Matrix clients } - ^ | ^ | - | events | | events | - | V | V - +------------------+ +------------------+ - | |---------( HTTP )---------->| | - | Home Server | | Home Server | - | |<--------( HTTP )-----------| | - +------------------+ +------------------+ - -There are three main kinds of communication that occur between home servers: - - * Queries - These are single request/response interactions between a given pair of - servers, initiated by one side sending an HTTP request to obtain some - information, and responded by the other. They are not persisted and contain - no long-term significant history. They simply request a snapshot state at the - instant the query is made. - - * EDUs - Ephemeral Data Units - These are notifications of events that are pushed from one home server to - another. They are not persisted and contain no long-term significant history, - nor does the receiving home server have to reply to them. - - * PDUs - Persisted Data Units - These are notifications of events that are broadcast from one home server to - any others that are interested in the same "context" (namely, a Room ID). - They are persisted to long-term storage and form the record of history for - that context. - -Where Queries are presented directly across the HTTP connection as GET requests -to specific URLs, EDUs and PDUs are further wrapped in an envelope called a -Transaction, which is transferred from the origin to the destination home server -using a PUT request. 
- - -Transactions and EDUs/PDUs -========================== - -The transfer of EDUs and PDUs between home servers is performed by an exchange -of Transaction messages, which are encoded as JSON objects with a dict as the -top-level element, passed over an HTTP PUT request. A Transaction is meaningful -only to the pair of home servers that exchanged it; they are not globally- -meaningful. - -Each transaction has an opaque ID and timestamp (UNIX epoch time in -milliseconds) generated by its origin server, an origin and destination server -name, a list of "previous IDs", and a list of PDUs - the actual message payload -that the Transaction carries. - - {"transaction_id":"916d630ea616342b42e98a3be0b74113", - "ts":1404835423000, - "origin":"red", - "destination":"blue", - "prev_ids":["e1da392e61898be4d2009b9fecce5325"], - "pdus":[...], - "edus":[...]} - -The "previous IDs" field will contain a list of previous transaction IDs that -the origin server has sent to this destination. Its purpose is to act as a -sequence checking mechanism - the destination server can check whether it has -successfully received that Transaction, or ask for a retransmission if not. - -The "pdus" field of a transaction is a list, containing zero or more PDUs.[*] -Each PDU is itself a dict containing a number of keys, the exact details of -which will vary depending on the type of PDU. Similarly, the "edus" field is -another list containing the EDUs. This key may be entirely absent if there are -no EDUs to transfer. - -(* Normally the PDU list will be non-empty, but the server should cope with -receiving an "empty" transaction, as this is useful for informing peers of other -transaction IDs they should be aware of. This effectively acts as a push -mechanism to encourage peers to continue to replicate content.) 
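The Transaction envelope and the "previous IDs" check described above can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: the helper names ``make_transaction`` and ``missing_prev_ids`` are hypothetical, and using a random UUID for the opaque transaction ID is an assumption (the text only requires that the ID be opaque and generated by the origin).

```python
import time
import uuid


def make_transaction(origin, destination, prev_ids, pdus, edus=None):
    """Build a Transaction dict carrying the fields described in the text."""
    transaction = {
        # Opaque ID chosen by the origin; uuid4 is an assumption of this sketch.
        "transaction_id": uuid.uuid4().hex,
        # UNIX epoch time in milliseconds.
        "ts": int(time.time() * 1000),
        "origin": origin,
        "destination": destination,
        "prev_ids": list(prev_ids),
        # The PDU list may legitimately be empty.
        "pdus": list(pdus),
    }
    if edus:
        # The "edus" key is left out entirely when there is nothing to send.
        transaction["edus"] = list(edus)
    return transaction


def missing_prev_ids(transaction, received_ids):
    """Return the previous transaction IDs the destination has not seen yet."""
    return [tid for tid in transaction["prev_ids"] if tid not in received_ids]
```

A destination could call ``missing_prev_ids`` on every incoming Transaction, including an "empty" one, and request retransmission of whatever it returns.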
- -All PDUs have an ID, a context, a declaration of their type, a list of other PDU -IDs that have been seen recently on that context (regardless of which origin -sent them), and a nested content field containing the actual event content. - -[[TODO(paul): Update this structure so that 'pdu_id' is a two-element -[origin,ref] pair like the prev_pdus are]] - - {"pdu_id":"a4ecee13e2accdadf56c1025af232176", - "context":"#example.green", - "origin":"green", - "ts":1404838188000, - "pdu_type":"m.text", - "prev_pdus":[["blue","99d16afbc857975916f1d73e49e52b65"]], - "content":..., - "is_state":false} - -In contrast to the transaction layer, it is important to note that the prev_pdus -field of a PDU refers to PDUs that any origin server has sent, rather than -previous IDs that this origin has sent. This list may refer to other PDUs sent -by the same origin as the current one, or other origins. - -Because of the distributed nature of participants in a Matrix conversation, it -is impossible to establish a globally-consistent total ordering on the events. -However, by annotating each outbound PDU at its origin with IDs of other PDUs it -has received, a partial ordering can be constructed allowing causality -relationships to be preserved. A client can then display these messages to the -end-user in some order consistent with their content and ensure that no message -that is semantically in reply to an earlier one is ever displayed before it. - -PDUs fall into two main categories: those that deliver Events, and those that -synchronise State. For PDUs that relate to State synchronisation, additional -keys exist to support this: - - {..., - "is_state":true, - "state_key":TODO, - "power_level":TODO, - "prev_state_id":TODO, - "prev_state_origin":TODO} - -[[TODO(paul): At this point we should probably have a long description of how -State management works, with descriptions of clobbering rules, power levels, etc -etc... 
But some of that detail is rather up-in-the-air, on the whiteboard, and -so on. This part needs refining. And writing in its own document as the details -relate to the server/system as a whole, not specifically to server-server -federation.]] - -EDUs, by comparison to PDUs, do not have an ID, a context, or a list of -"previous" IDs. The only mandatory fields for these are the type, origin and -destination home server names, and the actual nested content. - - {"edu_type":"m.presence", - "origin":"blue", - "destination":"orange", - "content":...} - - -Protocol URLs -============= - -All these URLs are namespaced within a prefix of - - /_matrix/federation/v1/... - -For active pushing of messages representing live activity "as it happens": - - PUT .../send/:transaction_id/ - Body: JSON encoding of a single Transaction - - Response: [[TODO(paul): I don't actually understand what - ReplicationLayer.on_transaction() is doing here, so I'm not sure what the - response ought to be]] - - The transaction_id path argument will override any ID given in the JSON body. - The destination name will be set to that of the receiving server itself. Each - embedded PDU in the transaction body will be processed. - - -To fetch a particular PDU: - - GET .../pdu/:origin/:pdu_id/ - - Response: JSON encoding of a single Transaction containing one PDU - - Retrieves a given PDU from the server. The response will contain a single new - Transaction, inside which will be the requested PDU. - - -To fetch all the state of a given context: - - GET .../state/:context/ - - Response: JSON encoding of a single Transaction containing multiple PDUs - - Retrieves a snapshot of the entire current state of the given context. The - response will contain a single Transaction, inside which will be a list of - PDUs that encode the state. 
- - -To backfill events on a given context: - - GET .../backfill/:context/ - Query args: v, limit - - Response: JSON encoding of a single Transaction containing multiple PDUs - - Retrieves a sliding-window history of previous PDUs that occurred on the - given context. Starting from the PDU ID(s) given in the "v" argument, the - PDUs that preceded them are retrieved, up to a total number given by the - "limit" argument. These are then returned in a new Transaction containing all - of the PDUs. - - -To stream all the events: - - GET .../pull/ - Query args: origin, v - - Response: JSON encoding of a single Transaction consisting of multiple PDUs - - Retrieves all of the transactions later than any version given by the "v" - arguments. [[TODO(paul): I'm not sure what the "origin" argument does because - I think at some point in the code it's got swapped around.]] - - -To make a query: - - GET .../query/:query_type - Query args: as specified by the individual query types - - Response: JSON encoding of a response object - - Performs a single query request on the receiving home server. The Query Type - part of the path specifies the kind of query being made, and its query - arguments have a meaning specific to that kind of query. The response is a - JSON-encoded object whose meaning also depends on the kind of query.
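As a rough illustration of the URL layout listed above, the request paths could be assembled along these lines. The helper names are hypothetical, and two details are assumptions of this sketch rather than anything the text specifies: path components are percent-encoded, and multiple "v" values are sent as repeated ``v=`` query parameters.

```python
from urllib.parse import quote, urlencode

# Namespace prefix given in the text.
PREFIX = "/_matrix/federation/v1"


def _segment(value):
    """Percent-encode a single path component (assumed encoding)."""
    return quote(str(value), safe="")


def send_path(transaction_id):
    """Path for PUTting a Transaction."""
    return "%s/send/%s/" % (PREFIX, _segment(transaction_id))


def pdu_path(origin, pdu_id):
    """Path for fetching one PDU."""
    return "%s/pdu/%s/%s/" % (PREFIX, _segment(origin), _segment(pdu_id))


def state_path(context):
    """Path for fetching the current state of a context."""
    return "%s/state/%s/" % (PREFIX, _segment(context))


def backfill_path(context, versions, limit):
    """Path for backfilling; one "v" parameter per starting PDU ID."""
    query = urlencode([("v", v) for v in versions] + [("limit", limit)])
    return "%s/backfill/%s/?%s" % (PREFIX, _segment(context), query)


def query_path(query_type, args):
    """Path for a query; the arguments depend on the query type."""
    path = "%s/query/%s" % (PREFIX, _segment(query_type))
    query = urlencode(sorted(args.items()))
    return (path + "?" + query) if query else path
```

For example, ``backfill_path("#example.green", ["99d16afbc857975916f1d73e49e52b65"], 10)`` yields a path whose context component is encoded as ``%23example.green``.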
-- cgit 1.5.1 From 216d5f6b521b08ac29e1a8039968f2b6ffe2a5ed Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 11 Nov 2014 20:44:28 +0200 Subject: this is obsolete and lives in matrix-doc in specification/30_server_server_api.rst now --- docs/server-server/protocol-format.rst | 59 ---------------------------------- 1 file changed, 59 deletions(-) delete mode 100644 docs/server-server/protocol-format.rst (limited to 'docs/server-server') diff --git a/docs/server-server/protocol-format.rst b/docs/server-server/protocol-format.rst deleted file mode 100644 index 2838253ab7..0000000000 --- a/docs/server-server/protocol-format.rst +++ /dev/null @@ -1,59 +0,0 @@ - -Transaction -=========== - -Required keys: - -============ =================== =============================================== - Key Type Description -============ =================== =============================================== -origin String DNS name of homeserver making this transaction. -ts Integer Timestamp in milliseconds on originating - homeserver when this transaction started. -previous_ids List of Strings List of transactions that were sent immediately - prior to this transaction. -pdus List of Objects List of updates contained in this transaction. -============ =================== =============================================== - - -PDU -=== - -Required keys: - -============ ================== ================================================ - Key Type Description -============ ================== ================================================ -context String Event context identifier -origin String DNS name of homeserver that created this PDU. -pdu_id String Unique identifier for PDU within the context for - the originating homeserver. -ts Integer Timestamp in milliseconds on originating - homeserver when this PDU was created. -pdu_type String PDU event type. 
-prev_pdus List of Pairs The originating homeserver and PDU ids of the - of Strings most recent PDUs the homeserver was aware of for - this context when it made this PDU. -depth Integer The maximum depth of the previous PDUs plus one. -============ ================== ================================================ - -Keys for state updates: - -================== ============ ================================================ - Key Type Description -================== ============ ================================================ -is_state Boolean True if this PDU is updating state. -state_key String Optional key identifying the updated state within - the context. -power_level Integer The asserted power level of the user performing - the update. -min_update Integer The required power level needed to replace this - update. -prev_state_id String The PDU id of the update this replaces. -prev_state_origin String The homeserver of the update this replaces. -user String The user updating the state. -================== ============ ================================================ - - - - -- cgit 1.5.1 From b6c48a694b8f9d0ebef29024163a0763d40f1b30 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 11 Nov 2014 20:45:11 +0200 Subject: haven't i already moved you to matrix-doc twice? :/ --- docs/server-server/signing.rst | 151 ----------------------------------------- 1 file changed, 151 deletions(-) delete mode 100644 docs/server-server/signing.rst (limited to 'docs/server-server') diff --git a/docs/server-server/signing.rst b/docs/server-server/signing.rst deleted file mode 100644 index 60c701ca91..0000000000 --- a/docs/server-server/signing.rst +++ /dev/null @@ -1,151 +0,0 @@ -Signing JSON -============ - -JSON is signed by encoding the JSON object without ``signatures`` or ``unsigned`` -keys using a canonical encoding. The JSON bytes are then signed using the -signature algorithm and the signature encoded using base64 with the padding -stripped.
The resulting base64 signature is added to an object under the -*signing key identifier* which is added to the ``signatures`` object under the -name of the server signing it which is added back to the original JSON object -along with the ``unsigned`` object. - -The *signing key identifier* is the concatenation of the *signing algorithm* -and a *key version*. The *signing algorithm* identifies the algorithm used to -sign the JSON. The currently support value for *signing algorithm* is -``ed25519`` as implemented by NACL (http://nacl.cr.yp.to/). The *key version* -is used to distinguish between different signing keys used by the same entity. - -The ``unsigned`` object and the ``signatures`` object are not covered by the -signature. Therefore intermediate servers can add unsigneddata such as time stamps -and additional signatures. - - -:: - - { - "name": "example.org", - "signing_keys": { - "ed25519:1": "XSl0kuyvrXNj6A+7/tkrB9sxSbRi08Of5uRhxOqZtEQ" - }, - "unsigned": { - "retrieved_ts_ms": 922834800000 - }, - "signatures": { - "example.org": { - "ed25519:1": "s76RUgajp8w172am0zQb/iPTHsRnb4SkrzGoeCOSFfcBY2V/1c8QfrmdXHpvnc2jK5BD1WiJIxiMW95fMjK7Bw" - } - } - } - -:: - - def sign_json(json_object, signing_key, signing_name): - signatures = json_object.pop("signatures", {}) - unsigned = json_object.pop("unsigned", None) - - signed = signing_key.sign(encode_canonical_json(json_object)) - signature_base64 = encode_base64(signed.signature) - - key_id = "%s:%s" % (signing_key.alg, signing_key.version) - signatures.setdefault(sigature_name, {})[key_id] = signature_base64 - - json_object["signatures"] = signatures - if unsigned is not None: - json_object["unsigned"] = unsigned - - return json_object - -Checking for a Signature ------------------------- - -To check if an entity has signed a JSON object a server does the following - -1. Checks if the ``signatures`` object contains an entry with the name of the - entity. If the entry is missing then the check fails. -2. 
Removes any *signing key identifiers* from the entry with algorithms it - doesn't understand. If there are no *signing key identifiers* left then the - check fails. -3. Looks up *verification keys* for the remaining *signing key identifiers* - either from a local cache or by consulting a trusted key server. If it - cannot find a *verification key* then the check fails. -4. Decodes the base64 encoded signature bytes. If base64 decoding fails then - the check fails. -5. Checks the signature bytes using the *verification key*. If this fails then - the check fails. Otherwise the check succeeds. - -Canonical JSON --------------- - -The canonical JSON encoding for a value is the shortest UTF-8 JSON encoding -with dictionary keys lexicographically sorted by Unicode codepoint. Numbers in -the JSON value must be integers in the range [-(2**53)+1, (2**53)-1]. - -:: - - import json - - def canonical_json(value): - return json.dumps( - value, - ensure_ascii=False, - separators=(',',':'), - sort_keys=True, - ).encode("UTF-8") - -Grammar -+++++++ - -Adapted from the grammar in http://tools.ietf.org/html/rfc7159 removing -insignificant whitespace, fractions, exponents and redundant character escapes. - -:: - - value = false / null / true / object / array / number / string - false = %x66.61.6c.73.65 - null = %x6e.75.6c.6c - true = %x74.72.75.65 - object = %x7B [ member *( %x2C member ) ] %x7D - member = string %x3A value - array = %x5B [ value *( %x2C value ) ] %x5D - number = [ %x2D ] int - int = %x30 / ( %x31-39 *digit ) - digit = %x30-39 - string = %x22 *char %x22 - char = unescaped / %x5C escaped - unescaped = %x20-21 / %x23-5B / %x5D-10FFFF - escaped = %x22 ; " quotation mark U+0022 - / %x5C ; \ reverse solidus U+005C - / %x62 ; b backspace U+0008 - / %x66 ; f form feed U+000C - / %x6E ; n line feed U+000A - / %x72 ; r carriage return U+000D - / %x74 ; t tab U+0009 - / %x75.30.30.30 (%x30-37 / %x62 / %x65-66) ; u000X - / %x75.30.30.31 (%x30-39 / %x61-66) ; u001X - -Signing
Events -============== - -Signing events is a more complicated process since servers can choose to redact -non-essential event contents. Before signing the event it is encoded as -Canonical JSON and hashed using SHA-256. The resulting hash is then stored -in the event JSON in a ``hash`` object under a ``sha256`` key. Then all -non-essential keys are stripped from the event object, and the resulting object, -which includes the ``hash`` key, is signed using the JSON signing algorithm. - -Servers can then transmit the entire event or the event with the non-essential -keys removed. Receiving servers can then check the entire event if it is -present by computing the SHA-256 of the event excluding the ``hash`` object, or -by using the ``hash`` object included in the event if keys have been redacted. - -New hash functions can be introduced by adding additional keys to the ``hash`` -object. Since the ``hash`` object cannot be redacted a server shouldn't allow -too many hashes to be listed; otherwise a server might embed illicit data within -the ``hash`` object. For similar reasons a server shouldn't allow hash values -that are too long.
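The hash-then-redact procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the set of essential keys is passed in by the caller because the text does not define it, and encoding the digest as unpadded base64 is an assumption borrowed from how signatures are encoded. The final signing step (the JSON signing algorithm) is left out, since it needs an ed25519 key.

```python
import base64
import hashlib
import json


def canonical_json(value):
    """Canonical encoding as in the Canonical JSON section."""
    return json.dumps(
        value, ensure_ascii=False, separators=(",", ":"), sort_keys=True
    ).encode("UTF-8")


def compute_content_hash(event):
    """SHA-256 of the event excluding the ``hash`` object."""
    to_hash = {key: value for key, value in event.items() if key != "hash"}
    digest = hashlib.sha256(canonical_json(to_hash)).digest()
    # Unpadded base64 is an assumption of this sketch.
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def add_content_hash(event):
    """Return a copy of the event with ``hash.sha256`` filled in."""
    hashed = dict(event)
    hashed["hash"] = {"sha256": compute_content_hash(event)}
    return hashed


def redact(event, essential_keys):
    """Strip non-essential keys; the ``hash`` object always survives."""
    keep = set(essential_keys) | {"hash"}
    return {key: value for key, value in event.items() if key in keep}


def check_content_hash(event):
    """Check a complete (unredacted) event against its stored hash."""
    stored = event.get("hash", {}).get("sha256")
    return stored is not None and stored == compute_content_hash(event)
```

A sender would call ``add_content_hash``, then ``redact``, and sign the redacted object; a receiver holding the full event can run ``check_content_hash``, while a receiver holding only the redacted form relies on the signed ``hash`` object.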
- -[[TODO(markjh): We might want to specify a maximum number of keys for the -``hash`` and we might want to specify the maximum output size of a hash]] - -[[TODO(markjh): We might want to allow the server to omit the output of well -known hash functions like SHA-256 when none of the keys have been redacted]] -- cgit 1.5.1 From 7e1779d48c5ef9a90b60a409286f9830c76eb8ae Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 11 Nov 2014 20:49:03 +0200 Subject: this is ancient and has been moved to matrix-doc/drafts/federated_versioning_design_notes.rst --- docs/server-server/versioning.rst | 11 ----------- 1 file changed, 11 deletions(-) delete mode 100644 docs/server-server/versioning.rst (limited to 'docs/server-server') diff --git a/docs/server-server/versioning.rst b/docs/server-server/versioning.rst deleted file mode 100644 index ffda60633f..0000000000 --- a/docs/server-server/versioning.rst +++ /dev/null @@ -1,11 +0,0 @@ -Versioning is, like, hard for backfilling backwards because of the number of Home Servers involved. - -The way we solve this is by doing versioning as an acyclic directed graph of PDUs. For backfilling purposes, this is done on a per context basis. -When we send a PDU we include all PDUs that have been received for that context that haven't been subsequently listed in a later PDU. The trivial case is a simple list of PDUs, e.g. A <- B <- C. However, if two servers send out a PDU at the same time, both B and C would point at A - a later PDU would then list both B and C. - -Problems with opaque version strings: - - How do you do clustering without mandating that a cluster can only have one transaction in flight to a given remote home server at a time? - If you have multiple transactions sent at once, then you might drop one transaction, receive another with a version that is later than the dropped transaction, at which point ARGH WE LOST A TRANSACTION. - - How do you do backfilling? A version string defines a point in a stream w.r.t.
a single home server, not a point in the context. - -We only need to store the ends of the directed graph, we DO NOT need to do the whole one table of nodes and one of edges. -- cgit 1.5.1