From 660129deb1bb60dac04ff59d1da8bbe53882149e Mon Sep 17 00:00:00 2001
From: Kegan Dougal
Date: Thu, 28 Aug 2014 09:45:05 +0100
Subject: Shuffle files around in /docs

---
 docs/code_style.rst                               | 18 -----
 docs/documentation_style.rst                      | 43 ------------
 docs/implementation-notes/code_style.rst          | 18 +++++
 docs/implementation-notes/documentation_style.rst | 43 ++++++++++++
 docs/implementation-notes/python_architecture.rst | 53 ++++++++++++++
 docs/model/protocol_examples.rst                  | 64 +++++++++++++++++
 docs/model/terminology.rst                        | 86 +++++++++++++++++++++++
 docs/protocol_examples.rst                        | 64 -----------------
 docs/python_architecture.rst                      | 53 --------------
 docs/server-server/versioning.rst                 | 11 +++
 docs/terminology.rst                              | 86 -----------------------
 docs/versioning.rst                               | 11 ---
 12 files changed, 275 insertions(+), 275 deletions(-)
 delete mode 100644 docs/code_style.rst
 delete mode 100644 docs/documentation_style.rst
 create mode 100644 docs/implementation-notes/code_style.rst
 create mode 100644 docs/implementation-notes/documentation_style.rst
 create mode 100644 docs/implementation-notes/python_architecture.rst
 create mode 100644 docs/model/protocol_examples.rst
 create mode 100644 docs/model/terminology.rst
 delete mode 100644 docs/protocol_examples.rst
 delete mode 100644 docs/python_architecture.rst
 create mode 100644 docs/server-server/versioning.rst
 delete mode 100644 docs/terminology.rst
 delete mode 100644 docs/versioning.rst

(limited to 'docs')

diff --git a/docs/code_style.rst b/docs/code_style.rst
deleted file mode 100644
index d7e2d5e69e..0000000000
--- a/docs/code_style.rst
+++ /dev/null
@@ -1,18 +0,0 @@
-Basically, PEP8
-
-- Max line width: 80 chars.
-- Use camel case for class and type names
-- Use underscores for functions and variables.
-- Use double quotes.
-- Use parentheses instead of '\' for line continuation where ever possible (which is pretty much everywhere)
-- There should be max a single new line between:
-    - statements
-    - functions in a class
-- There should be two new lines between:
-    - definitions in a module (e.g., between different classes)
-- There should be spaces where spaces should be and not where there shouldn't be:
-    - a single space after a comma
-    - a single space before and after for '=' when used as assignment
-    - no spaces before and after for '=' for default values and keyword arguments.
-
-Comments should follow the google code style. This is so that we can generate documentation with sphinx (http://sphinxcontrib-napoleon.readthedocs.org/en/latest/)
diff --git a/docs/documentation_style.rst b/docs/documentation_style.rst
deleted file mode 100644
index c365d09dff..0000000000
--- a/docs/documentation_style.rst
+++ /dev/null
@@ -1,43 +0,0 @@
-===================
-Documentation Style
-===================
-
-A brief single sentence to describe what this file contains; in this case a
-description of the style to write documentation in.
-
-
-Sections
-========
-
-Each section should be separated from the others by two blank lines. Headings
-should be underlined using a row of equals signs (===). Paragraphs should be
-separated by a single blank line, and wrap to no further than 80 columns.
-
-[[TODO(username): if you want to leave some unanswered questions, notes for
-further consideration, or other kinds of comment, use a TODO section. Make sure
-to notate it with your name so we know who to ask about it!]]
-
-Subsections
------------
-
-If required, subsections can use a row of dashes to underline their header. A
-single blank line between subsections of a single section.
-
-
-Bullet Lists
-============
-
- * Bullet lists can use asterisks with a single space either side.
-
- * Another blank line between list elements.
-
-
-Definition Lists
-================
-
-Terms:
-  Start in the first column, ending with a colon
-
-Definitions:
-  Take a two space indent, following immediately from the term without a blank
-  line before it, but having a blank line afterwards.
diff --git a/docs/implementation-notes/code_style.rst b/docs/implementation-notes/code_style.rst
new file mode 100644
index 0000000000..d7e2d5e69e
--- /dev/null
+++ b/docs/implementation-notes/code_style.rst
@@ -0,0 +1,18 @@
+Basically, PEP8
+
+- Max line width: 80 chars.
+- Use camel case for class and type names
+- Use underscores for functions and variables.
+- Use double quotes.
+- Use parentheses instead of '\' for line continuation where ever possible (which is pretty much everywhere)
+- There should be max a single new line between:
+    - statements
+    - functions in a class
+- There should be two new lines between:
+    - definitions in a module (e.g., between different classes)
+- There should be spaces where spaces should be and not where there shouldn't be:
+    - a single space after a comma
+    - a single space before and after for '=' when used as assignment
+    - no spaces before and after for '=' for default values and keyword arguments.
+
+Comments should follow the google code style. This is so that we can generate documentation with sphinx (http://sphinxcontrib-napoleon.readthedocs.org/en/latest/)
diff --git a/docs/implementation-notes/documentation_style.rst b/docs/implementation-notes/documentation_style.rst
new file mode 100644
index 0000000000..c365d09dff
--- /dev/null
+++ b/docs/implementation-notes/documentation_style.rst
@@ -0,0 +1,43 @@
+===================
+Documentation Style
+===================
+
+A brief single sentence to describe what this file contains; in this case a
+description of the style to write documentation in.
+
+
+Sections
+========
+
+Each section should be separated from the others by two blank lines. Headings
+should be underlined using a row of equals signs (===). Paragraphs should be
+separated by a single blank line, and wrap to no further than 80 columns.
+
+[[TODO(username): if you want to leave some unanswered questions, notes for
+further consideration, or other kinds of comment, use a TODO section. Make sure
+to notate it with your name so we know who to ask about it!]]
+
+Subsections
+-----------
+
+If required, subsections can use a row of dashes to underline their header. A
+single blank line between subsections of a single section.
+
+
+Bullet Lists
+============
+
+ * Bullet lists can use asterisks with a single space either side.
+
+ * Another blank line between list elements.
+
+
+Definition Lists
+================
+
+Terms:
+  Start in the first column, ending with a colon
+
+Definitions:
+  Take a two space indent, following immediately from the term without a blank
+  line before it, but having a blank line afterwards.
diff --git a/docs/implementation-notes/python_architecture.rst b/docs/implementation-notes/python_architecture.rst
new file mode 100644
index 0000000000..8beaa615d0
--- /dev/null
+++ b/docs/implementation-notes/python_architecture.rst
@@ -0,0 +1,53 @@
+= Server to Server =
+
+== Server to Server Stack ==
+
+To use the server to server stack, home servers should only need to interact with the Messaging layer.
+
+The server to server side of things is designed into 4 distinct layers:
+
+    1. Messaging Layer
+    2. Pdu Layer
+    3. Transaction Layer
+    4. Transport Layer
+
+Where the bottom (the transport layer) is what talks to the internet via HTTP, and the top (the messaging layer) talks to the rest of the Home Server with a domain specific API.
+
+1. Messaging Layer
+    This is what the rest of the Home Server hits to send messages, join rooms, etc. It also allows you to register callbacks for when it get's notified by lower levels that e.g. a new message has been received.
+
+    It is responsible for serializing requests to send to the data layer, and to parse requests received from the data layer.
+
+
+2. PDU Layer
+    This layer handles:
+        * duplicate pdu_id's - i.e., it makes sure we ignore them.
+        * responding to requests for a given pdu_id
+        * responding to requests for all metadata for a given context (i.e. room)
+        * handling incoming backfill requests
+
+    So it has to parse incoming messages to discover which are metadata and which aren't, and has to correctly clobber existing metadata where appropriate.
+
+    For incoming PDUs, it has to check the PDUs it references to see if we have missed any. If we have go and ask someone (another home server) for it.
+
+
+3. Transaction Layer
+    This layer makes incoming requests idempotent. I.e., it stores which transaction id's we have seen and what our response were. If we have already seen a message with the given transaction id, we do not notify higher levels but simply respond with the previous response.
+
+transaction_id is from "GET /send//"
+
+    It's also responsible for batching PDUs into single transaction for sending to remote destinations, so that we only ever have one transaction in flight to a given destination at any one time.
+
+    This is also responsible for answering requests for things after a given set of transactions, i.e., ask for everything after 'ver' X.
+
+
+4. Transport Layer
+    This is responsible for starting a HTTP server and hitting the correct callbacks on the Transaction layer, as well as sending both data and requests for data.
+
+
+== Persistence ==
+
+We persist things in a single sqlite3 database. All database queries get run on a separate, dedicated thread. This that we only ever have one query running at a time, making it a lot easier to do things in a safe manner.
+
+The queries are located in the synapse.persistence.transactions module, and the table information in the synapse.persistence.tables module.
+
diff --git a/docs/model/protocol_examples.rst b/docs/model/protocol_examples.rst
new file mode 100644
index 0000000000..61a599b432
--- /dev/null
+++ b/docs/model/protocol_examples.rst
@@ -0,0 +1,64 @@
+PUT /send/abc/ HTTP/1.1
+Host: ...
+Content-Length: ...
+Content-Type: application/json
+
+{
+    "origin": "localhost:5000",
+    "pdus": [
+        {
+            "content": {},
+            "context": "tng",
+            "depth": 12,
+            "is_state": false,
+            "origin": "localhost:5000",
+            "pdu_id": 1404381396854,
+            "pdu_type": "feedback",
+            "prev_pdus": [
+                [
+                    "1404381395883",
+                    "localhost:6000"
+                ]
+            ],
+            "ts": 1404381427581
+        }
+    ],
+    "prev_ids": [
+        "1404381396852"
+    ],
+    "ts": 1404381427823
+}
+
+HTTP/1.1 200 OK
+...
+
+======================================
+
+GET /pull/-1/ HTTP/1.1
+Host: ...
+Content-Length: 0
+
+HTTP/1.1 200 OK
+Content-Length: ...
+Content-Type: application/json
+
+{
+    origin: ...,
+    prev_ids: ...,
+    data: [
+        {
+            data_id: ...,
+            prev_pdus: [...],
+            depth: ...,
+            ts: ...,
+            context: ...,
+            origin: ...,
+            content: {
+                ...
+            }
+        },
+        ...,
+    ]
+}
+
+
diff --git a/docs/model/terminology.rst b/docs/model/terminology.rst
new file mode 100644
index 0000000000..cc6e6760ac
--- /dev/null
+++ b/docs/model/terminology.rst
@@ -0,0 +1,86 @@
+===========
+Terminology
+===========
+
+A list of definitions of specific terminology used among these documents.
+These terms were originally taken from the server-server documentation, and may
+not currently match the exact meanings used in other places; though as a
+medium-term goal we should encourage the unification of this terminology.
+
+
+Terms
+=====
+
+Backfilling:
+  The process of synchronising historic state from one home server to another,
+  to backfill the event storage so that scrollback can be presented to the
+  client(s). (Formerly, and confusingly, called 'pagination')
+
+Context:
+  A single human-level entity of interest (currently, a chat room)
+
+EDU (Ephemeral Data Unit):
+  A message that relates directly to a given pair of home servers that are
+  exchanging it. EDUs are short-lived messages that related only to one single
+  pair of servers; they are not persisted for a long time and are not forwarded
+  on to other servers. Because of this, they have no internal ID nor previous
+  EDUs reference chain.
+
+Event:
+  A record of activity that records a single thing that happened on to a context
+  (currently, a chat room). These are the "chat messages" that Synapse makes
+  available.
+  [[NOTE(paul): The current server-server implementation calls these simply
+  "messages" but the term is too ambiguous here; I've called them Events]]
+
+PDU (Persistent Data Unit):
+  A message that relates to a single context, irrespective of the server that
+  is communicating it. PDUs either encode a single Event, or a single State
+  change. A PDU is referred to by its PDU ID; the pair of its origin server
+  and local reference from that server.
+
+PDU ID:
+  The pair of PDU Origin and PDU Reference, that together globally uniquely
+  refers to a specific PDU.
+
+PDU Origin:
+  The name of the origin server that generated a given PDU. This may not be the
+  server from which it has been received, due to the way they are copied around
+  from server to server. The origin always records the original server that
+  created it.
+
+PDU Reference:
+  A local ID used to refer to a specific PDU from a given origin server. These
+  references are opaque at the protocol level, but may optionally have some
+  structured meaning within a given origin server or implementation.
+
+Presence:
+  The concept of whether a user is currently online, how available they declare
+  they are, and so on. See also: doc/model/presence
+
+Profile:
+  A set of metadata about a user, such as a display name, provided for the
+  benefit of other users. See also: doc/model/profiles
+
+Room ID:
+  An opaque string (of as-yet undecided format) that identifies a particular
+  room and used in PDUs referring to it.
+
+Room Alias:
+  A human-readable string of the form #name:some.domain that users can use as a
+  pointer to identify a room; a Directory Server will map this to its Room ID
+
+State:
+  A set of metadata maintained about a Context, which is replicated among the
+  servers in addition to the history of Events.
+
+User ID:
+  A string of the form @localpart:domain.name that identifies a user for
+  wire-protocol purposes. The localpart is meaningless outside of a particular
+  home server. This takes a human-readable form that end-users can use directly
+  if they so wish, avoiding the 3PIDs.
+
+Transaction:
+  A message which relates to the communication between a given pair of servers.
+  A transaction contains possibly-empty lists of PDUs and EDUs.
+
diff --git a/docs/protocol_examples.rst b/docs/protocol_examples.rst
deleted file mode 100644
index 61a599b432..0000000000
--- a/docs/protocol_examples.rst
+++ /dev/null
@@ -1,64 +0,0 @@
-PUT /send/abc/ HTTP/1.1
-Host: ...
-Content-Length: ...
-Content-Type: application/json
-
-{
-    "origin": "localhost:5000",
-    "pdus": [
-        {
-            "content": {},
-            "context": "tng",
-            "depth": 12,
-            "is_state": false,
-            "origin": "localhost:5000",
-            "pdu_id": 1404381396854,
-            "pdu_type": "feedback",
-            "prev_pdus": [
-                [
-                    "1404381395883",
-                    "localhost:6000"
-                ]
-            ],
-            "ts": 1404381427581
-        }
-    ],
-    "prev_ids": [
-        "1404381396852"
-    ],
-    "ts": 1404381427823
-}
-
-HTTP/1.1 200 OK
-...
-
-======================================
-
-GET /pull/-1/ HTTP/1.1
-Host: ...
-Content-Length: 0
-
-HTTP/1.1 200 OK
-Content-Length: ...
-Content-Type: application/json
-
-{
-    origin: ...,
-    prev_ids: ...,
-    data: [
-        {
-            data_id: ...,
-            prev_pdus: [...],
-            depth: ...,
-            ts: ...,
-            context: ...,
-            origin: ...,
-            content: {
-                ...
-            }
-        },
-        ...,
-    ]
-}
-
-
diff --git a/docs/python_architecture.rst b/docs/python_architecture.rst
deleted file mode 100644
index 8beaa615d0..0000000000
--- a/docs/python_architecture.rst
+++ /dev/null
@@ -1,53 +0,0 @@
-= Server to Server =
-
-== Server to Server Stack ==
-
-To use the server to server stack, home servers should only need to interact with the Messaging layer.
-
-The server to server side of things is designed into 4 distinct layers:
-
-    1. Messaging Layer
-    2. Pdu Layer
-    3. Transaction Layer
-    4. Transport Layer
-
-Where the bottom (the transport layer) is what talks to the internet via HTTP, and the top (the messaging layer) talks to the rest of the Home Server with a domain specific API.
-
-1. Messaging Layer
-    This is what the rest of the Home Server hits to send messages, join rooms, etc. It also allows you to register callbacks for when it get's notified by lower levels that e.g. a new message has been received.
-
-    It is responsible for serializing requests to send to the data layer, and to parse requests received from the data layer.
-
-
-2. PDU Layer
-    This layer handles:
-        * duplicate pdu_id's - i.e., it makes sure we ignore them.
-        * responding to requests for a given pdu_id
-        * responding to requests for all metadata for a given context (i.e. room)
-        * handling incoming backfill requests
-
-    So it has to parse incoming messages to discover which are metadata and which aren't, and has to correctly clobber existing metadata where appropriate.
-
-    For incoming PDUs, it has to check the PDUs it references to see if we have missed any. If we have go and ask someone (another home server) for it.
-
-
-3. Transaction Layer
-    This layer makes incoming requests idempotent. I.e., it stores which transaction id's we have seen and what our response were. If we have already seen a message with the given transaction id, we do not notify higher levels but simply respond with the previous response.
-
-transaction_id is from "GET /send//"
-
-    It's also responsible for batching PDUs into single transaction for sending to remote destinations, so that we only ever have one transaction in flight to a given destination at any one time.
-
-    This is also responsible for answering requests for things after a given set of transactions, i.e., ask for everything after 'ver' X.
-
-
-4. Transport Layer
-    This is responsible for starting a HTTP server and hitting the correct callbacks on the Transaction layer, as well as sending both data and requests for data.
-
-
-== Persistence ==
-
-We persist things in a single sqlite3 database. All database queries get run on a separate, dedicated thread. This that we only ever have one query running at a time, making it a lot easier to do things in a safe manner.
-
-The queries are located in the synapse.persistence.transactions module, and the table information in the synapse.persistence.tables module.
-
diff --git a/docs/server-server/versioning.rst b/docs/server-server/versioning.rst
new file mode 100644
index 0000000000..ffda60633f
--- /dev/null
+++ b/docs/server-server/versioning.rst
@@ -0,0 +1,11 @@
+Versioning is, like, hard for backfilling backwards because of the number of Home Servers involved.
+
+The way we solve this is by doing versioning as an acyclic directed graph of PDUs. For backfilling purposes, this is done on a per context basis.
+When we send a PDU we include all PDUs that have been received for that context that hasn't been subsequently listed in a later PDU. The trivial case is a simple list of PDUs, e.g. A <- B <- C. However, if two servers send out a PDU at the same to, both B and C would point at A - a later PDU would then list both B and C.
+
+Problems with opaque version strings:
+    - How do you do clustering without mandating that a cluster can only have one transaction in flight to a given remote home server at a time.
+      If you have multiple transactions sent at once, then you might drop one transaction, receive another with a version that is later than the dropped transaction and which point ARGH WE LOST A TRANSACTION.
+    - How do you do backfilling? A version string defines a point in a stream w.r.t. a single home server, not a point in the context.
+
+We only need to store the ends of the directed graph, we DO NOT need to do the whole one table of nodes and one of edges.
diff --git a/docs/terminology.rst b/docs/terminology.rst
deleted file mode 100644
index cc6e6760ac..0000000000
--- a/docs/terminology.rst
+++ /dev/null
@@ -1,86 +0,0 @@
-===========
-Terminology
-===========
-
-A list of definitions of specific terminology used among these documents.
-These terms were originally taken from the server-server documentation, and may
-not currently match the exact meanings used in other places; though as a
-medium-term goal we should encourage the unification of this terminology.
-
-
-Terms
-=====
-
-Backfilling:
-  The process of synchronising historic state from one home server to another,
-  to backfill the event storage so that scrollback can be presented to the
-  client(s). (Formerly, and confusingly, called 'pagination')
-
-Context:
-  A single human-level entity of interest (currently, a chat room)
-
-EDU (Ephemeral Data Unit):
-  A message that relates directly to a given pair of home servers that are
-  exchanging it. EDUs are short-lived messages that related only to one single
-  pair of servers; they are not persisted for a long time and are not forwarded
-  on to other servers. Because of this, they have no internal ID nor previous
-  EDUs reference chain.
-
-Event:
-  A record of activity that records a single thing that happened on to a context
-  (currently, a chat room). These are the "chat messages" that Synapse makes
-  available.
-  [[NOTE(paul): The current server-server implementation calls these simply
-  "messages" but the term is too ambiguous here; I've called them Events]]
-
-PDU (Persistent Data Unit):
-  A message that relates to a single context, irrespective of the server that
-  is communicating it. PDUs either encode a single Event, or a single State
-  change. A PDU is referred to by its PDU ID; the pair of its origin server
-  and local reference from that server.
-
-PDU ID:
-  The pair of PDU Origin and PDU Reference, that together globally uniquely
-  refers to a specific PDU.
-
-PDU Origin:
-  The name of the origin server that generated a given PDU. This may not be the
-  server from which it has been received, due to the way they are copied around
-  from server to server. The origin always records the original server that
-  created it.
-
-PDU Reference:
-  A local ID used to refer to a specific PDU from a given origin server. These
-  references are opaque at the protocol level, but may optionally have some
-  structured meaning within a given origin server or implementation.
-
-Presence:
-  The concept of whether a user is currently online, how available they declare
-  they are, and so on. See also: doc/model/presence
-
-Profile:
-  A set of metadata about a user, such as a display name, provided for the
-  benefit of other users. See also: doc/model/profiles
-
-Room ID:
-  An opaque string (of as-yet undecided format) that identifies a particular
-  room and used in PDUs referring to it.
-
-Room Alias:
-  A human-readable string of the form #name:some.domain that users can use as a
-  pointer to identify a room; a Directory Server will map this to its Room ID
-
-State:
-  A set of metadata maintained about a Context, which is replicated among the
-  servers in addition to the history of Events.
-
-User ID:
-  A string of the form @localpart:domain.name that identifies a user for
-  wire-protocol purposes. The localpart is meaningless outside of a particular
-  home server. This takes a human-readable form that end-users can use directly
-  if they so wish, avoiding the 3PIDs.
-
-Transaction:
-  A message which relates to the communication between a given pair of servers.
-  A transaction contains possibly-empty lists of PDUs and EDUs.
-
diff --git a/docs/versioning.rst b/docs/versioning.rst
deleted file mode 100644
index ffda60633f..0000000000
--- a/docs/versioning.rst
+++ /dev/null
@@ -1,11 +0,0 @@
-Versioning is, like, hard for backfilling backwards because of the number of Home Servers involved.
-
-The way we solve this is by doing versioning as an acyclic directed graph of PDUs. For backfilling purposes, this is done on a per context basis.
-When we send a PDU we include all PDUs that have been received for that context that hasn't been subsequently listed in a later PDU. The trivial case is a simple list of PDUs, e.g. A <- B <- C. However, if two servers send out a PDU at the same to, both B and C would point at A - a later PDU would then list both B and C.
-
-Problems with opaque version strings:
-    - How do you do clustering without mandating that a cluster can only have one transaction in flight to a given remote home server at a time.
-      If you have multiple transactions sent at once, then you might drop one transaction, receive another with a version that is later than the dropped transaction and which point ARGH WE LOST A TRANSACTION.
-    - How do you do backfilling? A version string defines a point in a stream w.r.t. a single home server, not a point in the context.
-
-We only need to store the ends of the directed graph, we DO NOT need to do the whole one table of nodes and one of edges.
-- 
cgit 1.5.1