diff options
Diffstat (limited to 'docs/specification.rst')
-rw-r--r-- | docs/specification.rst | 492 |
1 files changed, 330 insertions, 162 deletions
diff --git a/docs/specification.rst b/docs/specification.rst index fa085bac27..d4a01a3fc2 100644 --- a/docs/specification.rst +++ b/docs/specification.rst @@ -5,16 +5,20 @@ TODO(Introduction) : Matthew - Similar to intro paragraph from README. - Explaining the overall mission, what this spec describes... - "What is Matrix?" + - Draw parallels with email? Architecture ============ -- Basic structure: What are clients/home servers and what are their - responsibilities? What are events. +Clients transmit data to other clients through home servers (HSes). Clients do not communicate with each +other directly. :: - { Matrix clients } { Matrix clients } + How data flows between clients + ============================== + + { Matrix client A } { Matrix client B } ^ | ^ | | events | | events | | V | V @@ -22,22 +26,205 @@ Architecture | |---------( HTTP )---------->| | | Home Server | | Home Server | | |<--------( HTTP )-----------| | - +------------------+ +------------------+ + +------------------+ Federation +------------------+ + +A "Client" is an end-user, typically a human using a web application or mobile app. Clients use the +"Client-to-Server" (C-S) API to communicate with their home server. A single Client is usually +responsible for a single user account. A user account is represented by their "User ID". This ID is +namespaced to the home server which allocated the account and looks like:: + + @localpart:domain + +The ``localpart`` of a user ID may be a user name, or an opaque ID identifying this user. + + +A "Home Server" is a server which provides C-S APIs and has the ability to federate with other HSes. +It is typically responsible for multiple clients. "Federation" is the term used to describe the +sharing of data between two or more home servers. + +Data in Matrix is encapsulated in an "Event". An event is an action within the system. Typically each +action (e.g. sending a message) correlates with exactly one event. Each event has a ``type`` which is +used to differentiate different kinds of data. ``type`` values SHOULD be namespaced according to standard +Java package naming conventions, e.g. ``com.example.myapp.event``. Events are usually sent in the context +of a "Room". + +Room structure +-------------- + +A room is a conceptual place where users can send and receive events. Rooms +can be created, joined and left. Events are sent to a room, and all +participants in that room will receive the event. Rooms are uniquely +identified via a "Room ID", which look like:: + + !opaque_id:domain + +There is exactly one room ID for each room. Whilst the room ID does contain a +domain, it is simply for namespacing room IDs. The room does NOT reside on the +domain specified. Room IDs are not meant to be human readable. + +The following diagram shows an ``m.room.message`` event being sent in the room +``!qporfwt:matrix.org``:: + + { @alice:matrix.org } { @bob:domain.com } + | ^ + | | + Room ID: !qporfwt:matrix.org Room ID: !qporfwt:matrix.org + Event type: m.room.message Event type: m.room.message + Content: { JSON object } Content: { JSON object } + | | + V | + +------------------+ +------------------+ + | Home Server | | Home Server | + | matrix.org |<-------Federation------->| domain.com | + +------------------+ +------------------+ + | ................................. | + |______| Partially Shared State |_______| + | Room ID: !qporfwt:matrix.org | + | Servers: matrix.org, domain.com | + | Members: | + | - @alice:matrix.org | + | - @bob:domain.com | + |.................................| + +Federation maintains shared state between multiple home servers, such that when an event is +sent to a room, the home server knows where to forward the event on to, and how to process +the event. Home servers do not need to have completely shared state in order to participate +in a room. State is scoped to a single room, and federation ensures that all home servers +have the information they need, even if that means the home server has to request more +information from another home server before processing the event. + +Room Aliases +------------ + +Each room can also have multiple "Room Aliases", which looks like:: + + #room_alias:domain + +A room alias "points" to a room ID. The room ID the alias is pointing to can be obtained +by visiting the domain specified. Room aliases are designed to be human readable strings +which can be used to publicise rooms. Note that the mapping from a room alias to a +room ID is not fixed, and may change over time to point to a different room ID. For this +reason, Clients SHOULD resolve the room alias to a room ID once and then use that ID on +subsequent requests. + +:: + + GET + #matrix:domain.com !aaabaa:matrix.org + | ^ + | | + _______V____________________|____ + | domain.com | + | Mappings: | + | #matrix >> !aaabaa:matrix.org | + | #golf >> !wfeiofh:sport.com | + | #bike >> !4rguxf:matrix.org | + |________________________________| + -- How do identity servers fit in? 3PIDs? Users? Aliases -- Pattern of the APIs (HTTP/JSON, REST + txns) -- Standard error response format. -- C-S Event stream +Identity +-------- +- Identity in relation to 3PIDs. Discovery of users based on 3PIDs. +- Identity servers; trusted clique of servers which replicate content. +- They govern the mapping of 3PIDs to user IDs and the creation of said mappings. +- Not strictly required in order to communicate. -Rooms -===== -A room is a conceptual place where users can send and receive messages. Rooms -can be created, joined and left. Messages are sent to a room, and all -participants in that room will receive the message. Rooms are uniquely -identified via a room ID. There is exactly one room ID for each room. +API Standards +------------- +All communication in Matrix is performed over HTTP[S] using a Content-Type of ``application/json``. +Any errors which occur on the Matrix API level MUST return a "standard error response". This is a +JSON object which looks like:: + + { + "errcode": "<error code>", + "error": "<error message>" + } + +The ``error`` string will be a human-readable error message, usually a sentence +explaining what went wrong. The ``errcode`` string will be a unique string which can be +used to handle an error message e.g. ``M_FORBIDDEN``. These error codes should have their +namespace first in ALL CAPS, followed by a single _. For example, if there was a custom +namespace ``com.mydomain.here``, and a ``FORBIDDEN`` code, the error code should look +like ``COM.MYDOMAIN.HERE_FORBIDDEN``. There may be additional keys depending on +the error, but the keys ``error`` and ``errcode`` MUST always be present. + +Some standard error codes are below: + +:``M_FORBIDDEN``: + Forbidden access, e.g. joining a room without permission, failed login. + +:``M_UNKNOWN_TOKEN``: + The access token specified was not recognised. + +:``M_BAD_JSON``: + Request contained valid JSON, but it was malformed in some way, e.g. missing + required keys, invalid values for keys. + +:``M_NOT_JSON``: + Request did not contain valid JSON. + +:``M_NOT_FOUND``: + No resource was found for this request. + +Some requests have unique error codes: + +:``M_USER_IN_USE``: + Encountered when trying to register a user ID which has been taken. + +:``M_ROOM_IN_USE``: + Encountered when trying to create a room which has been taken. + +:``M_BAD_PAGINATION``: + Encountered when specifying bad pagination query parameters. + +:``M_LOGIN_EMAIL_URL_NOT_YET``: + Encountered when polling for an email link which has not been clicked yet. + +The C-S API typically uses ``HTTP POST`` to submit requests. This means these requests +are not idempotent. The C-S API also allows ``HTTP PUT`` to make requests idempotent. +In order to use a ``PUT``, paths should be suffixed with ``/{txnId}``. ``{txnId}`` is a +client-generated transaction ID which identifies the request. Crucially, it **only** +serves to identify new requests from retransmits. After the request has finished, the +``{txnId}`` value should be changed (how is not specified, it could be a monotonically +increasing integer, etc). It is preferable to use ``HTTP PUT`` to make sure requests to +send messages do not get sent more than once should clients need to retransmit requests. + +Valid requests look like:: + + POST /some/path/here + { + "key": "This is a post." + } + + PUT /some/path/here/11 + { + "key": "This is a put with a txnId of 11." + } + +In contrast, these are invalid requests:: + + POST /some/path/here/11 + { + "key": "This is a post, but it has a txnId." + } + + PUT /some/path/here + { + "key": "This is a put but it is missing a txnId." + } + +Receiving live updates on a client +---------------------------------- +- C-S longpoll event stream +- Concept of start/end tokens. +- Mention /initialSync to get token. + -- Aliases +Rooms +===== +- How are they created? PDU anchor point: "root of the tree". +- Adding / removing aliases. - Invite/join dance - State and non-state data (+extensibility) @@ -46,10 +233,8 @@ TODO : Room permissions / config / power levels. Messages ======== -This specification outlines several standard message types, all of which are -prefixed with "m.". - -- Namespacing? +This specification outlines several standard event types, all of which are +prefixed with ``m.`` State messages -------------- @@ -102,15 +287,15 @@ below: - ``body`` : "string" - The alt text of the image, or some kind of content description for accessibility e.g. "image attachment". -ImageInfo: - Information about an image:: + ImageInfo: + Information about an image:: - { - "size" : integer (size of image in bytes), - "w" : integer (width of image in pixels), - "h" : integer (height of image in pixels), - "mimetype" : "string (e.g. image/jpeg)", - } + { + "size" : integer (size of image in bytes), + "w" : integer (width of image in pixels), + "h" : integer (height of image in pixels), + "mimetype" : "string (e.g. image/jpeg)", + } ``m.audio`` Required keys: @@ -121,15 +306,14 @@ ImageInfo: - ``body`` : "string" - A description of the audio e.g. "Bee Gees - Stayin' Alive", or some kind of content description for accessibility e.g. "audio attachment". + AudioInfo: + Information about a piece of audio:: -AudioInfo: - Information about a piece of audio:: - - { - "mimetype" : "string (e.g. audio/aac)", - "size" : integer (size of audio in bytes), - "duration" : integer (duration of audio in milliseconds), - } + { + "mimetype" : "string (e.g. audio/aac)", + "size" : integer (size of audio in bytes), + "duration" : integer (duration of audio in milliseconds), + } ``m.video`` Required keys: @@ -140,18 +324,18 @@ AudioInfo: - ``body`` : "string" - A description of the video e.g. "Gangnam style", or some kind of content description for accessibility e.g. "video attachment". -VideoInfo: - Information about a video:: + VideoInfo: + Information about a video:: - { - "mimetype" : "string (e.g. video/mp4)", - "size" : integer (size of video in bytes), - "duration" : integer (duration of video in milliseconds), - "w" : integer (width of video in pixels), - "h" : integer (height of video in pixels), - "thumbnail_url" : "string (URL to image)", - "thumbanil_info" : JSON object (ImageInfo) - } + { + "mimetype" : "string (e.g. video/mp4)", + "size" : integer (size of video in bytes), + "duration" : integer (duration of video in milliseconds), + "w" : integer (width of video in pixels), + "h" : integer (height of video in pixels), + "thumbnail_url" : "string (URL to image)", + "thumbanil_info" : JSON object (ImageInfo) + } ``m.location`` Required keys: @@ -174,88 +358,59 @@ The following keys can be attached to any ``m.room.message``: Presence ======== -Each user has the concept of Presence information. This encodes a sense of the -"availability" of that user, suitable for display on other user's clients. - -The basic piece of presence information is an enumeration of a small set of -state; such as "free to chat", "online", "busy", or "offline". The default state -unless the user changes it is "online". Lower states suggest some amount of -decreased availability from normal, which might have some client-side effect -like muting notification sounds and suggests to other users not to bother them -unless it is urgent. Equally, the "free to chat" state exists to let the user -announce their general willingness to receive messages moreso than default. - -Home servers should also allow a user to set their state as "hidden" - a state -which behaves as offline, but allows the user to see the client state anyway and -generally interact with client features such as reading message history or -accessing contacts in the address book. - -This basic state field applies to the user as a whole, regardless of how many +Each user has the concept of presence information. This encodes the +"availability" of that user, suitable for display on other user's clients. This +is transmitted as an ``m.presence`` event and is one of the few events which +are sent *outside the context of a room*. The basic piece of presence information +is represented by the ``state`` key, which is an enum of one of the following: + + - ``online`` : The default state when the user is connected to an event stream. + - ``unavailable`` : The user is not reachable at this time. + - ``offline`` : The user is not connected to an event stream. + - ``free_for_chat`` : The user is generally willing to receive messages + moreso than default. + - ``hidden`` : TODO. Behaves as offline, but allows the user to see the client + state anyway and generally interact with client features. + +This basic ``state`` field applies to the user as a whole, regardless of how many client devices they have connected. The home server should synchronise this status choice among multiple devices to ensure the user gets a consistent experience. Idle Time --------- -As well as the basic state field, the presence information can also show a sense +As well as the basic ``state`` field, the presence information can also show a sense of an "idle timer". This should be maintained individually by the user's -clients, and the homeserver can take the highest reported time as that to -report. Likely this should be presented in fairly coarse granularity; possibly -being limited to letting the home server automatically switch from a "free to -chat" or "online" mode into "idle". - -When a user is offline, the Home Server can still report when the user was last -seen online, again perhaps in a somewhat coarse manner. +clients, and the home server can take the highest reported time as that to +report. When a user is offline, the home server can still report when the user was last +seen online. -Device Type ------------ -Client devices that may limit the user experience somewhat (such as "mobile" -devices with limited ability to type on a real keyboard or read large amounts of -text) should report this to the home server, as this is also useful information -to report as "presence" if the user cannot be expected to provide a good typed -response to messages. - -- m.presence and enums (when should they be used) +Transmission +------------ +- Transmitted as an EDU. +- Presence lists determine who to send to. Presence List ------------- Each user's home server stores a "presence list" for that user. This stores a -list of other user IDs the user has chosen to add to it (remembering any ACL -Pointer if appropriate). - -To be added to a contact list, the user being added must grant permission. Once -granted, both user's HS(es) store this information, as it allows the user who -has added the contact some more abilities; see below. Since such subscriptions +list of other user IDs the user has chosen to add to it. To be added to this +list, the user being added must receive permission from the list owner. Once +granted, both user's HS(es) store this information. Since such subscriptions are likely to be bidirectional, HSes may wish to automatically accept requests when a reverse subscription already exists. -As a convenience, presence lists should support the ability to collect users -into groups, which could allow things like inviting the entire group to a new -("ad-hoc") chat room, or easy interaction with the profile information ACL -implementation of the HS. - Presence and Permissions ------------------------ For a viewing user to be allowed to see the presence information of a target -user, either +user, either: - * The target user has allowed the viewing user to add them to their presence + - The target user has allowed the viewing user to add them to their presence list, or - - * The two users share at least one room in common + - The two users share at least one room in common In the latter case, this allows for clients to display some minimal sense of presence information in a user list for a room. -Home servers can also use the user's choice of presence state as a signal for -how to handle new private one-to-one chat message requests. For example, it -might decide: - - - "free to chat": accept anything - - "online": accept from anyone in my address book list - - "busy": accept from anyone in this "important people" group in my address - book list - Typing notifications ==================== @@ -274,18 +429,14 @@ human-friendly string. Profiles grant users the ability to see human-readable names for other users that are in some way meaningful to them. Additionally, profiles can publish additional information, such as the user's age or location. -It is also conceivable that since we are attempting to provide a -worldwide-applicable messaging system, that users may wish to present different -subsets of information in their profile to different other people, from a -privacy and permissions perspective. - A Profile consists of a display name, an avatar picture, and a set of other metadata fields that the user may wish to publish (email address, phone numbers, website URLs, etc...). This specification puts no requirements on the -display name other than it being a valid Unicode string. +display name other than it being a valid unicode string. - Metadata extensibility - Bundled with which events? e.g. m.room.member +- Generate own events? What type? Registration and login ====================== @@ -312,8 +463,8 @@ The login process breaks down into the following: step 2. As each home server may have different ways of logging in, the client needs to know how -they should login. All distinct login stages MUST have a corresponding ``'type'``. -A ``'type'`` is a namespaced string which details the mechanism for logging in. +they should login. All distinct login stages MUST have a corresponding ``type``. +A ``type`` is a namespaced string which details the mechanism for logging in. A client may be able to login via multiple valid login flows, and should choose a single flow when logging in. A flow is a series of login stages. The home server MUST respond @@ -359,17 +510,17 @@ subsequent requests until the login is completed:: } This specification defines the following login types: - - m.login.password - - m.login.oauth2 - - m.login.email.code - - m.login.email.url + - ``m.login.password`` + - ``m.login.oauth2`` + - ``m.login.email.code`` + - ``m.login.email.url`` Password-based -------------- -Type: - "m.login.password" -Description: +:Type: + m.login.password +:Description: Login is supported via a username and password. To respond to this type, reply with:: @@ -385,9 +536,9 @@ process, or a standard error response. OAuth2-based ------------ -Type: - "m.login.oauth2" -Description: +:Type: + m.login.oauth2 +:Description: Login is supported via OAuth2 URLs. This login consists of multiple requests. To respond to this type, reply with:: @@ -438,9 +589,9 @@ visits the REDIRECT_URI with the auth code= query parameter which returns:: Email-based (code) ------------------ -Type: - "m.login.email.code" -Description: +:Type: + m.login.email.code +:Description: Login is supported by typing in a code which is sent in an email. This login consists of multiple requests. @@ -473,9 +624,9 @@ the login process, or a standard error response. Email-based (url) ----------------- -Type: - "m.login.email.url" -Description: +:Type: + m.login.email.url +:Description: Login is supported by clicking on a URL in an email. This login consists of multiple requests. @@ -515,7 +666,7 @@ N-Factor Authentication ----------------------- Multiple login stages can be combined to create N-factor authentication during login. -This can be achieved by responding with the ``'next'`` login type on completion of a +This can be achieved by responding with the ``next`` login type on completion of a previous login stage:: { @@ -523,7 +674,7 @@ previous login stage:: } If a home server implements N-factor authentication, it MUST respond with all -``'stages'`` when initially queried for their login requirements:: +``stages`` when initially queried for their login requirements:: { "type": "<1st login type>", @@ -592,59 +743,62 @@ can also be performed. There are three main kinds of communication that occur between home servers: - * Queries +:Queries: These are single request/response interactions between a given pair of - servers, initiated by one side sending an HTTP request to obtain some + servers, initiated by one side sending an HTTP GET request to obtain some information, and responded by the other. They are not persisted and contain no long-term significant history. They simply request a snapshot state at the instant the query is made. - * EDUs - Ephemeral Data Units +:Ephemeral Data Units (EDUs): These are notifications of events that are pushed from one home server to another. They are not persisted and contain no long-term significant history, nor does the receiving home server have to reply to them. - * PDUs - Persisted Data Units +:Persisted Data Units (PDUs): These are notifications of events that are broadcast from one home server to any others that are interested in the same "context" (namely, a Room ID). They are persisted to long-term storage and form the record of history for that context. -Where Queries are presented directly across the HTTP connection as GET requests -to specific URLs, EDUs and PDUs are further wrapped in an envelope called a -Transaction, which is transferred from the origin to the destination home server -using a PUT request. +EDUs and PDUs are further wrapped in an envelope called a Transaction, which is +transferred from the origin to the destination home server using an HTTP PUT request. -Transactions and EDUs/PDUs --------------------------- +Transactions +------------ The transfer of EDUs and PDUs between home servers is performed by an exchange -of Transaction messages, which are encoded as JSON objects with a dict as the -top-level element, passed over an HTTP PUT request. A Transaction is meaningful -only to the pair of home servers that exchanged it; they are not globally- -meaningful. +of Transaction messages, which are encoded as JSON objects, passed over an +HTTP PUT request. A Transaction is meaningful only to the pair of home servers that +exchanged it; they are not globally-meaningful. + +Each transaction has: + - An opaque transaction ID. + - A timestamp (UNIX epoch time in milliseconds) generated by its origin server. + - An origin and destination server name. + - A list of "previous IDs". + - A list of PDUs and EDUs - the actual message payload that the Transaction carries. -Each transaction has an opaque ID and timestamp (UNIX epoch time in -milliseconds) generated by its origin server, an origin and destination server -name, a list of "previous IDs", and a list of PDUs - the actual message payload -that the Transaction carries. +:: - {"transaction_id":"916d630ea616342b42e98a3be0b74113", + { + "transaction_id":"916d630ea616342b42e98a3be0b74113", "ts":1404835423000, "origin":"red", "destination":"blue", "prev_ids":["e1da392e61898be4d2009b9fecce5325"], "pdus":[...], - "edus":[...]} + "edus":[...] + } -The "previous IDs" field will contain a list of previous transaction IDs that -the origin server has sent to this destination. Its purpose is to act as a +The ``prev_ids`` field contains a list of previous transaction IDs that +the ``origin`` server has sent to this ``destination``. Its purpose is to act as a sequence checking mechanism - the destination server can check whether it has successfully received that Transaction, or ask for a retransmission if not. -The "pdus" field of a transaction is a list, containing zero or more PDUs.[*] -Each PDU is itself a dict containing a number of keys, the exact details of -which will vary depending on the type of PDU. Similarly, the "edus" field is +The ``pdus`` field of a transaction is a list, containing zero or more PDUs.[*] +Each PDU is itself a JSON object containing a number of keys, the exact details of +which will vary depending on the type of PDU. Similarly, the ``edus`` field is another list containing the EDUs. This key may be entirely absent if there are no EDUs to transfer. @@ -653,25 +807,35 @@ receiving an "empty" transaction, as this is useful for informing peers of other transaction IDs they should be aware of. This effectively acts as a push mechanism to encourage peers to continue to replicate content.) -All PDUs have an ID, a context, a declaration of their type, a list of other PDU -IDs that have been seen recently on that context (regardless of which origin -sent them), and a nested content field containing the actual event content. +PDUs and EDUs +------------- + +All PDUs have: + - An ID + - A context + - A declaration of their type + - A list of other PDU IDs that have been seen recently on that context (regardless of which origin + sent them) [[TODO(paul): Update this structure so that 'pdu_id' is a two-element [origin,ref] pair like the prev_pdus are]] - {"pdu_id":"a4ecee13e2accdadf56c1025af232176", +:: + + { + "pdu_id":"a4ecee13e2accdadf56c1025af232176", "context":"#example.green", "origin":"green", "ts":1404838188000, "pdu_type":"m.text", "prev_pdus":[["blue","99d16afbc857975916f1d73e49e52b65"]], "content":... - "is_state":false} + "is_state":false + } -In contrast to the transaction layer, it is important to note that the prev_pdus +In contrast to Transactions, it is important to note that the ``prev_pdus`` field of a PDU refers to PDUs that any origin server has sent, rather than -previous IDs that this origin has sent. This list may refer to other PDUs sent +previous IDs that this ``origin`` has sent. This list may refer to other PDUs sent by the same origin as the current one, or other origins. Because of the distributed nature of participants in a Matrix conversation, it @@ -686,6 +850,8 @@ PDUs fall into two main categories: those that deliver Events, and those that synchronise State. For PDUs that relate to State synchronisation, additional keys exist to support this: +:: + {..., "is_state":true, "state_key":TODO @@ -704,6 +870,8 @@ EDUs, by comparison to PDUs, do not have an ID, a context, or a list of "previous" IDs. The only mandatory fields for these are the type, origin and destination home server names, and the actual nested content. +:: + {"edu_type":"m.presence", "origin":"blue", "destination":"orange", |