diff --git a/docs/specification.rst b/docs/specification.rst
index fa085bac27..c1559c886c 100644
--- a/docs/specification.rst
+++ b/docs/specification.rst
@@ -5,16 +5,20 @@ TODO(Introduction) : Matthew
- Similar to intro paragraph from README.
- Explaining the overall mission, what this spec describes...
- "What is Matrix?"
+ - Draw parallels with email?
Architecture
============
-- Basic structure: What are clients/home servers and what are their
- responsibilities? What are events.
+Clients transmit data to other clients through home servers (HSes). Clients do not communicate with each
+other directly.
::
- { Matrix clients } { Matrix clients }
+ How data flows between clients
+ ==============================
+
+ { Matrix client A } { Matrix client B }
^ | ^ |
| events | | events |
| V | V
@@ -22,22 +26,128 @@ Architecture
| |---------( HTTP )---------->| |
| Home Server | | Home Server |
| |<--------( HTTP )-----------| |
- +------------------+ +------------------+
+ +------------------+ Federation +------------------+
+
+A "Client" is an end-user, typically a human using a web application or mobile app. Clients use the
+"Client-to-Server" (C-S) API to communicate with their home server. A single Client is usually
+responsible for a single user account. A user account is represented by their "User ID". This ID is
+namespaced to the home server which allocated the account and looks like::
+
+ @localpart:domain
+
+The ``localpart`` of a user ID may be a user name, or an opaque ID identifying this user.
+
+
+A "Home Server" is a server which provides C-S APIs and has the ability to federate with other HSes.
+It is typically responsible for multiple clients. "Federation" is the term used to describe the
+sharing of data between two or more home servers.
+
+Data in Matrix is encapsulated in an "Event". An event is an action within the system. Typically each
+action (e.g. sending a message) correlates with exactly one event. Each event has a ``type`` which is
+used to differentiate different kinds of data. ``type`` values SHOULD be namespaced according to standard
+Java package naming conventions, e.g. ``com.example.myapp.event``. Events are usually sent in the context
+of a "Room".
+
+Room structure
+--------------
+
+A room is a conceptual place where users can send and receive events. Rooms
+can be created, joined and left. Events are sent to a room, and all
+participants in that room will receive the event. Rooms are uniquely
+identified via a "Room ID", which look like::
+
+ !opaque_id:domain
+
+There is exactly one room ID for each room. Whilst the room ID does contain a
+domain, it is simply for namespacing room IDs. The room does NOT reside on the
+domain specified. Room IDs are not meant to be human readable.
+
+The following diagram shows an ``m.room.message`` event being sent in the room
+``!qporfwt:matrix.org``::
+
+ { @alice:matrix.org } { @bob:domain.com }
+ | ^
+ | |
+ Room ID: !qporfwt:matrix.org Room ID: !qporfwt:matrix.org
+ Event type: m.room.message Event type: m.room.message
+ Content: { JSON object } Content: { JSON object }
+ | |
+ V |
+ +------------------+ +------------------+
+ | Home Server | | Home Server |
+ | matrix.org |<-------Federation------->| domain.com |
+ +------------------+ +------------------+
+ | ................................. |
+ |______| Partially Shared State |_______|
+ | Room ID: !qporfwt:matrix.org |
+ | Servers: matrix.org, domain.com |
+ | Members: |
+ | - @alice:matrix.org |
+ | - @bob:domain.com |
+ |.................................|
+
+Federation maintains shared state between multiple home servers, such that when an event is
+sent to a room, the home server knows where to forward the event on to, and how to process
+the event. Home servers do not need to have completely shared state in order to participate
+in a room. State is scoped to a single room, and federation ensures that all home servers
+have the information they need, even if that means the home server has to request more
+information from another home server before processing the event.
+
+Room Aliases
+------------
+
+Each room can also have multiple "Room Aliases", which looks like::
+
+ #room_alias:domain
+
+A room alias "points" to a room ID. The room ID the alias is pointing to can be obtained
+by visiting the domain specified. Room aliases are designed to be human readable strings
+which can be used to publicise rooms. Note that the mapping from a room alias to a
+room ID is not fixed, and may change over time to point to a different room ID. For this
+reason, Clients SHOULD resolve the room alias to a room ID once and then use that ID on
+subsequent requests.
+
+::
+
+ GET
+ #matrix:domain.com !aaabaa:matrix.org
+ | ^
+ | |
+ _______V____________________|____
+ | domain.com |
+ | Mappings: |
+ | #matrix >> !aaabaa:matrix.org |
+ | #golf >> !wfeiofh:sport.com |
+ | #bike >> !4rguxf:matrix.org |
+ |________________________________|
+
-- How do identity servers fit in? 3PIDs? Users? Aliases
-- Pattern of the APIs (HTTP/JSON, REST + txns)
-- Standard error response format.
-- C-S Event stream
+Identity
+--------
+- Identity in relation to 3PIDs. Discovery of users based on 3PIDs.
+- Identity servers; trusted clique of servers which replicate content.
+- They govern the mapping of 3PIDs to user IDs and the creation of said mappings.
+- Not strictly required in order to communicate.
-Rooms
-=====
-A room is a conceptual place where users can send and receive messages. Rooms
-can be created, joined and left. Messages are sent to a room, and all
-participants in that room will receive the message. Rooms are uniquely
-identified via a room ID. There is exactly one room ID for each room.
+API Standards
+-------------
+- All HTTP[S]
+- Uses JSON as HTTP bodies
+- Standard error response format { errcode: M_WHATEVER, error: "some message" }
+- C-S API provides POST for operations, or PUT with txn IDs. Explain txn IDs.
+
+Receiving live updates on a client
+----------------------------------
+- C-S longpoll event stream
+- Concept of start/end tokens.
+- Mention /initialSync to get token.
-- Aliases
+
+Rooms
+=====
+- How are they created? PDU anchor point: "root of the tree".
+- Adding / removing aliases.
- Invite/join dance
- State and non-state data (+extensibility)
@@ -46,10 +156,8 @@ TODO : Room permissions / config / power levels.
Messages
========
-This specification outlines several standard message types, all of which are
-prefixed with "m.".
-
-- Namespacing?
+This specification outlines several standard event types, all of which are
+prefixed with ``m.``
State messages
--------------
@@ -102,15 +210,15 @@ below:
- ``body`` : "string" - The alt text of the image, or some kind of content
description for accessibility e.g. "image attachment".
-ImageInfo:
- Information about an image::
+ ImageInfo:
+ Information about an image::
- {
- "size" : integer (size of image in bytes),
- "w" : integer (width of image in pixels),
- "h" : integer (height of image in pixels),
- "mimetype" : "string (e.g. image/jpeg)",
- }
+ {
+ "size" : integer (size of image in bytes),
+ "w" : integer (width of image in pixels),
+ "h" : integer (height of image in pixels),
+ "mimetype" : "string (e.g. image/jpeg)",
+ }
``m.audio``
Required keys:
@@ -121,15 +229,14 @@ ImageInfo:
- ``body`` : "string" - A description of the audio e.g. "Bee Gees -
Stayin' Alive", or some kind of content description for accessibility e.g.
"audio attachment".
+ AudioInfo:
+ Information about a piece of audio::
-AudioInfo:
- Information about a piece of audio::
-
- {
- "mimetype" : "string (e.g. audio/aac)",
- "size" : integer (size of audio in bytes),
- "duration" : integer (duration of audio in milliseconds),
- }
+ {
+ "mimetype" : "string (e.g. audio/aac)",
+ "size" : integer (size of audio in bytes),
+ "duration" : integer (duration of audio in milliseconds),
+ }
``m.video``
Required keys:
@@ -140,18 +247,18 @@ AudioInfo:
- ``body`` : "string" - A description of the video e.g. "Gangnam style",
or some kind of content description for accessibility e.g. "video attachment".
-VideoInfo:
- Information about a video::
+ VideoInfo:
+ Information about a video::
- {
- "mimetype" : "string (e.g. video/mp4)",
- "size" : integer (size of video in bytes),
- "duration" : integer (duration of video in milliseconds),
- "w" : integer (width of video in pixels),
- "h" : integer (height of video in pixels),
- "thumbnail_url" : "string (URL to image)",
- "thumbanil_info" : JSON object (ImageInfo)
- }
+ {
+ "mimetype" : "string (e.g. video/mp4)",
+ "size" : integer (size of video in bytes),
+ "duration" : integer (duration of video in milliseconds),
+ "w" : integer (width of video in pixels),
+ "h" : integer (height of video in pixels),
+ "thumbnail_url" : "string (URL to image)",
+ "thumbanil_info" : JSON object (ImageInfo)
+ }
``m.location``
Required keys:
@@ -174,88 +281,59 @@ The following keys can be attached to any ``m.room.message``:
Presence
========
-Each user has the concept of Presence information. This encodes a sense of the
-"availability" of that user, suitable for display on other user's clients.
-
-The basic piece of presence information is an enumeration of a small set of
-state; such as "free to chat", "online", "busy", or "offline". The default state
-unless the user changes it is "online". Lower states suggest some amount of
-decreased availability from normal, which might have some client-side effect
-like muting notification sounds and suggests to other users not to bother them
-unless it is urgent. Equally, the "free to chat" state exists to let the user
-announce their general willingness to receive messages moreso than default.
-
-Home servers should also allow a user to set their state as "hidden" - a state
-which behaves as offline, but allows the user to see the client state anyway and
-generally interact with client features such as reading message history or
-accessing contacts in the address book.
-
-This basic state field applies to the user as a whole, regardless of how many
+Each user has the concept of presence information. This encodes the
+"availability" of that user, suitable for display on other user's clients. This
+is transmitted as an ``m.presence`` event and is one of the few events which
+are sent *outside the context of a room*. The basic piece of presence information
+is represented by the ``state`` key, which is an enum of one of the following:
+
+ - ``online`` : The default state when the user is connected to an event stream.
+ - ``unavailable`` : The user is not reachable at this time.
+ - ``offline`` : The user is not connected to an event stream.
+ - ``free_for_chat`` : The user is generally willing to receive messages
+ moreso than default.
+ - ``hidden`` : TODO. Behaves as offline, but allows the user to see the client
+ state anyway and generally interact with client features.
+
+This basic ``state`` field applies to the user as a whole, regardless of how many
client devices they have connected. The home server should synchronise this
status choice among multiple devices to ensure the user gets a consistent
experience.
Idle Time
---------
-As well as the basic state field, the presence information can also show a sense
+As well as the basic ``state`` field, the presence information can also show a sense
of an "idle timer". This should be maintained individually by the user's
-clients, and the homeserver can take the highest reported time as that to
-report. Likely this should be presented in fairly coarse granularity; possibly
-being limited to letting the home server automatically switch from a "free to
-chat" or "online" mode into "idle".
+clients, and the home server can take the highest reported time as that to
+report. When a user is offline, the home server can still report when the user was last
+seen online.
-When a user is offline, the Home Server can still report when the user was last
-seen online, again perhaps in a somewhat coarse manner.
-
-Device Type
------------
-Client devices that may limit the user experience somewhat (such as "mobile"
-devices with limited ability to type on a real keyboard or read large amounts of
-text) should report this to the home server, as this is also useful information
-to report as "presence" if the user cannot be expected to provide a good typed
-response to messages.
-
-- m.presence and enums (when should they be used)
+Transmission
+------------
+- Transmitted as an EDU.
+- Presence lists determine who to send to.
Presence List
-------------
Each user's home server stores a "presence list" for that user. This stores a
-list of other user IDs the user has chosen to add to it (remembering any ACL
-Pointer if appropriate).
-
-To be added to a contact list, the user being added must grant permission. Once
-granted, both user's HS(es) store this information, as it allows the user who
-has added the contact some more abilities; see below. Since such subscriptions
+list of other user IDs the user has chosen to add to it. To be added to this
+list, the user being added must receive permission from the list owner. Once
+granted, both user's HS(es) store this information. Since such subscriptions
are likely to be bidirectional, HSes may wish to automatically accept requests
when a reverse subscription already exists.
-As a convenience, presence lists should support the ability to collect users
-into groups, which could allow things like inviting the entire group to a new
-("ad-hoc") chat room, or easy interaction with the profile information ACL
-implementation of the HS.
-
Presence and Permissions
------------------------
For a viewing user to be allowed to see the presence information of a target
-user, either
+user, either:
- * The target user has allowed the viewing user to add them to their presence
+ - The target user has allowed the viewing user to add them to their presence
list, or
-
- * The two users share at least one room in common
+ - The two users share at least one room in common
In the latter case, this allows for clients to display some minimal sense of
presence information in a user list for a room.
-Home servers can also use the user's choice of presence state as a signal for
-how to handle new private one-to-one chat message requests. For example, it
-might decide:
-
- - "free to chat": accept anything
- - "online": accept from anyone in my address book list
- - "busy": accept from anyone in this "important people" group in my address
- book list
-
Typing notifications
====================
@@ -274,18 +352,14 @@ human-friendly string. Profiles grant users the ability to see human-readable
names for other users that are in some way meaningful to them. Additionally,
profiles can publish additional information, such as the user's age or location.
-It is also conceivable that since we are attempting to provide a
-worldwide-applicable messaging system, that users may wish to present different
-subsets of information in their profile to different other people, from a
-privacy and permissions perspective.
-
A Profile consists of a display name, an avatar picture, and a set of other
metadata fields that the user may wish to publish (email address, phone
numbers, website URLs, etc...). This specification puts no requirements on the
-display name other than it being a valid Unicode string.
+display name other than it being a valid unicode string.
- Metadata extensibility
- Bundled with which events? e.g. m.room.member
+- Generate own events? What type?
Registration and login
======================
@@ -312,8 +386,8 @@ The login process breaks down into the following:
step 2.
As each home server may have different ways of logging in, the client needs to know how
-they should login. All distinct login stages MUST have a corresponding ``'type'``.
-A ``'type'`` is a namespaced string which details the mechanism for logging in.
+they should login. All distinct login stages MUST have a corresponding ``type``.
+A ``type`` is a namespaced string which details the mechanism for logging in.
A client may be able to login via multiple valid login flows, and should choose a single
flow when logging in. A flow is a series of login stages. The home server MUST respond
@@ -359,17 +433,17 @@ subsequent requests until the login is completed::
}
This specification defines the following login types:
- - m.login.password
- - m.login.oauth2
- - m.login.email.code
- - m.login.email.url
+ - ``m.login.password``
+ - ``m.login.oauth2``
+ - ``m.login.email.code``
+ - ``m.login.email.url``
Password-based
--------------
-Type:
- "m.login.password"
-Description:
+:Type:
+ m.login.password
+:Description:
Login is supported via a username and password.
To respond to this type, reply with::
@@ -385,9 +459,9 @@ process, or a standard error response.
OAuth2-based
------------
-Type:
- "m.login.oauth2"
-Description:
+:Type:
+ m.login.oauth2
+:Description:
Login is supported via OAuth2 URLs. This login consists of multiple requests.
To respond to this type, reply with::
@@ -438,9 +512,9 @@ visits the REDIRECT_URI with the auth code= query parameter which returns::
Email-based (code)
------------------
-Type:
- "m.login.email.code"
-Description:
+:Type:
+ m.login.email.code
+:Description:
Login is supported by typing in a code which is sent in an email. This login
consists of multiple requests.
@@ -473,9 +547,9 @@ the login process, or a standard error response.
Email-based (url)
-----------------
-Type:
- "m.login.email.url"
-Description:
+:Type:
+ m.login.email.url
+:Description:
Login is supported by clicking on a URL in an email. This login consists of
multiple requests.
@@ -515,7 +589,7 @@ N-Factor Authentication
-----------------------
Multiple login stages can be combined to create N-factor authentication during login.
-This can be achieved by responding with the ``'next'`` login type on completion of a
+This can be achieved by responding with the ``next`` login type on completion of a
previous login stage::
{
@@ -523,7 +597,7 @@ previous login stage::
}
If a home server implements N-factor authentication, it MUST respond with all
-``'stages'`` when initially queried for their login requirements::
+``stages`` when initially queried for their login requirements::
{
"type": "<1st login type>",
@@ -592,59 +666,62 @@ can also be performed.
There are three main kinds of communication that occur between home servers:
- * Queries
+:Queries:
These are single request/response interactions between a given pair of
- servers, initiated by one side sending an HTTP request to obtain some
+ servers, initiated by one side sending an HTTP GET request to obtain some
information, and responded by the other. They are not persisted and contain
no long-term significant history. They simply request a snapshot state at the
instant the query is made.
- * EDUs - Ephemeral Data Units
+:Ephemeral Data Units (EDUs):
These are notifications of events that are pushed from one home server to
another. They are not persisted and contain no long-term significant history,
nor does the receiving home server have to reply to them.
- * PDUs - Persisted Data Units
+:Persisted Data Units (PDUs):
These are notifications of events that are broadcast from one home server to
any others that are interested in the same "context" (namely, a Room ID).
They are persisted to long-term storage and form the record of history for
that context.
-Where Queries are presented directly across the HTTP connection as GET requests
-to specific URLs, EDUs and PDUs are further wrapped in an envelope called a
-Transaction, which is transferred from the origin to the destination home server
-using a PUT request.
+EDUs and PDUs are further wrapped in an envelope called a Transaction, which is
+transferred from the origin to the destination home server using an HTTP PUT request.
-Transactions and EDUs/PDUs
---------------------------
+Transactions
+------------
The transfer of EDUs and PDUs between home servers is performed by an exchange
-of Transaction messages, which are encoded as JSON objects with a dict as the
-top-level element, passed over an HTTP PUT request. A Transaction is meaningful
-only to the pair of home servers that exchanged it; they are not globally-
-meaningful.
+of Transaction messages, which are encoded as JSON objects, passed over an
+HTTP PUT request. A Transaction is meaningful only to the pair of home servers that
+exchanged it; they are not globally-meaningful.
-Each transaction has an opaque ID and timestamp (UNIX epoch time in
-milliseconds) generated by its origin server, an origin and destination server
-name, a list of "previous IDs", and a list of PDUs - the actual message payload
-that the Transaction carries.
+Each transaction has:
+ - An opaque transaction ID.
+ - A timestamp (UNIX epoch time in milliseconds) generated by its origin server.
+ - An origin and destination server name.
+ - A list of "previous IDs".
+ - A list of PDUs and EDUs - the actual message payload that the Transaction carries.
- {"transaction_id":"916d630ea616342b42e98a3be0b74113",
+::
+
+ {
+ "transaction_id":"916d630ea616342b42e98a3be0b74113",
"ts":1404835423000,
"origin":"red",
"destination":"blue",
"prev_ids":["e1da392e61898be4d2009b9fecce5325"],
"pdus":[...],
- "edus":[...]}
+ "edus":[...]
+ }
-The "previous IDs" field will contain a list of previous transaction IDs that
-the origin server has sent to this destination. Its purpose is to act as a
+The ``prev_ids`` field contains a list of previous transaction IDs that
+the ``origin`` server has sent to this ``destination``. Its purpose is to act as a
sequence checking mechanism - the destination server can check whether it has
successfully received that Transaction, or ask for a retransmission if not.
-The "pdus" field of a transaction is a list, containing zero or more PDUs.[*]
-Each PDU is itself a dict containing a number of keys, the exact details of
-which will vary depending on the type of PDU. Similarly, the "edus" field is
+The ``pdus`` field of a transaction is a list, containing zero or more PDUs.[*]
+Each PDU is itself a JSON object containing a number of keys, the exact details of
+which will vary depending on the type of PDU. Similarly, the ``edus`` field is
another list containing the EDUs. This key may be entirely absent if there are
no EDUs to transfer.
@@ -653,25 +730,35 @@ receiving an "empty" transaction, as this is useful for informing peers of other
transaction IDs they should be aware of. This effectively acts as a push
mechanism to encourage peers to continue to replicate content.)
-All PDUs have an ID, a context, a declaration of their type, a list of other PDU
-IDs that have been seen recently on that context (regardless of which origin
-sent them), and a nested content field containing the actual event content.
+PDUs and EDUs
+-------------
+
+All PDUs have:
+ - An ID
+ - A context
+ - A declaration of their type
+ - A list of other PDU IDs that have been seen recently on that context (regardless of which origin
+ sent them)
[[TODO(paul): Update this structure so that 'pdu_id' is a two-element
[origin,ref] pair like the prev_pdus are]]
- {"pdu_id":"a4ecee13e2accdadf56c1025af232176",
+::
+
+ {
+ "pdu_id":"a4ecee13e2accdadf56c1025af232176",
"context":"#example.green",
"origin":"green",
"ts":1404838188000,
"pdu_type":"m.text",
"prev_pdus":[["blue","99d16afbc857975916f1d73e49e52b65"]],
"content":...
- "is_state":false}
+ "is_state":false
+ }
-In contrast to the transaction layer, it is important to note that the prev_pdus
+In contrast to Transactions, it is important to note that the ``prev_pdus``
field of a PDU refers to PDUs that any origin server has sent, rather than
-previous IDs that this origin has sent. This list may refer to other PDUs sent
+previous IDs that this ``origin`` has sent. This list may refer to other PDUs sent
by the same origin as the current one, or other origins.
Because of the distributed nature of participants in a Matrix conversation, it
@@ -686,6 +773,8 @@ PDUs fall into two main categories: those that deliver Events, and those that
synchronise State. For PDUs that relate to State synchronisation, additional
keys exist to support this:
+::
+
{...,
"is_state":true,
"state_key":TODO
@@ -704,6 +793,8 @@ EDUs, by comparison to PDUs, do not have an ID, a context, or a list of
"previous" IDs. The only mandatory fields for these are the type, origin and
destination home server names, and the actual nested content.
+::
+
{"edu_type":"m.presence",
"origin":"blue",
"destination":"orange",
|