summary refs log tree commit diff
diff options
context:
space:
mode:
authorErik Johnston <erik@matrix.org>2018-05-17 17:07:08 +0100
committerErik Johnston <erik@matrix.org>2018-05-17 17:08:36 +0100
commit0504d809fd42d065d7771aa4a64b51780c290704 (patch)
tree31082af9f610c794b5475201d5a467bbe5042a84
parentDocument case of unconnected chunks (diff)
downloadsynapse-0504d809fd42d065d7771aa4a64b51780c290704.tar.xz
More comments
-rw-r--r--CHANGES.rst3
-rw-r--r--synapse/storage/chunk_ordered_table.py18
-rw-r--r--synapse/util/katriel_bodlaender.py4
3 files changed, 18 insertions, 7 deletions
diff --git a/CHANGES.rst b/CHANGES.rst
index 6df4bd9e38..ea532c6cca 100644
--- a/CHANGES.rst
+++ b/CHANGES.rst
@@ -2,7 +2,8 @@ Changes in <unreleased>
 =======================
 
 This release adds an index to the events table. This means that on first
-startup there will be an inceased amount of IO until the index is created.
+startup there will be an inceased amount of IO until the index is created, and
+an increase in disk usage.
 
 
 Changes in synapse v0.29.0 (2018-05-16)
diff --git a/synapse/storage/chunk_ordered_table.py b/synapse/storage/chunk_ordered_table.py
index 06257d66d8..d9e331a75d 100644
--- a/synapse/storage/chunk_ordered_table.py
+++ b/synapse/storage/chunk_ordered_table.py
@@ -33,10 +33,20 @@ class ChunkDBOrderedListStore(OrderedListStore):
     """Used as the list store for room chunks, efficiently maintaining them in
     topological order on updates.
 
-    A room chunk is a connected portion of the room events DAG. As such it
-    inherits a DAG, i.e. if an event in one chunk references an event in a
-    second chunk, then we say that the first chunk references the second, and
-    thus forming a DAG.
+    A room chunk is a connected portion of the room events DAG. As such the set
+    of chunks in a room inherits a DAG, i.e. if an event in one chunk references
+    an event in a second chunk, then we say that the first chunk references the
+    second, and thus forming a DAG.
+
+    Chunks are constructed so that they have the aditional property that for all
+    events in the chunk, either all of their prev_events are in that chunk or
+    none of them are. This ensures that no event that is subsequently received
+    needs to be inserted into the middle of a chunk, since it cannot both
+    reference an event in the chunk and be referenced by an event in the chunk
+    (assuming no cycles).
+
+    Multiple chunks can therefore happen when the server misses some events,
+    e.g. due to the server being offline for a time.
 
     The server may only have a subset of all events in a room, in which case
     its possible for the server to have chunks that are unconnected from each
diff --git a/synapse/util/katriel_bodlaender.py b/synapse/util/katriel_bodlaender.py
index 11ba612dce..b0eab2b4b0 100644
--- a/synapse/util/katriel_bodlaender.py
+++ b/synapse/util/katriel_bodlaender.py
@@ -24,7 +24,7 @@ This ordering is therefore opposite to what one might expect when considering
 the room DAG, as newer messages would be added to the start rather than the
 end.
 
-***We therefore invert the direction of edges when using the algorithm***
+***The ChunkDBOrderedListStore therefore inverts the direction of edges***
 
 See:
     A tight analysis of the Katriel–Bodlaender algorithm for online topological
@@ -79,7 +79,7 @@ class OrderedListStore(object):
         self._insert_before(node_id, None)
 
     def add_edge(self, source, target):
-        """Adds a new edge is added to the graph and updates the ordering.
+        """Adds a new edge to the graph and updates the ordering.
 
         See module level docs.