summary refs log tree commit diff
diff options
context:
space:
mode:
authorpacien <pacien@users.noreply.github.com>2023-07-03 16:39:38 +0200
committerGitHub <noreply@github.com>2023-07-03 16:39:38 +0200
commit07d7cbfe69c239a7ffe5668c1166799370eef0d6 (patch)
tree912961198d7ac4c2923dabaeea660ae0c834e4f6
parentFix the `devenv up` configuration which was ignoring the config overrides. (#... (diff)
downloadsynapse-07d7cbfe69c239a7ffe5668c1166799370eef0d6.tar.xz
devices: use combined ANY clause for faster cleanup (#15861)
Old device entries for the same user were being removed in individual
SQL commands, making the batch take way longer than necessary.

This combines the commands into a single one with a IN/ANY clause.

Example of log entry before the change, regularly observed with
"log_min_duration_statement = 10000" in PostgreSQL's config:

    LOG:  duration: 42538.282 ms  statement:
    DELETE FROM device_lists_stream
    WHERE user_id = '@someone' AND device_id = 'someid1'
    AND stream_id < 123456789
    ;
    DELETE FROM device_lists_stream
    WHERE user_id = '@someone' AND device_id = 'someid2'
    AND stream_id < 123456789
    ;
    [repeated for each device ID of that user, potentially a lot...]

With the patch applied on my instance for the past couple of days, I
no longer notice overly long statements of that particular kind.

Signed-off-by: pacien <pacien.trangirard@pacien.net>
-rw-r--r--changelog.d/15861.misc1
-rw-r--r--synapse/storage/databases/main/devices.py14
2 files changed, 10 insertions, 5 deletions
diff --git a/changelog.d/15861.misc b/changelog.d/15861.misc
new file mode 100644
index 0000000000..6f320eab81
--- /dev/null
+++ b/changelog.d/15861.misc
@@ -0,0 +1 @@
+Optimised cleanup of old entries in device_lists_stream.
diff --git a/synapse/storage/databases/main/devices.py b/synapse/storage/databases/main/devices.py
index f677d048aa..d9df437e51 100644
--- a/synapse/storage/databases/main/devices.py
+++ b/synapse/storage/databases/main/devices.py
@@ -1950,12 +1950,16 @@ class DeviceStore(DeviceWorkerStore, DeviceBackgroundUpdateStore):
 
         # Delete older entries in the table, as we really only care about
         # when the latest change happened.
-        txn.execute_batch(
-            """
+        cleanup_obsolete_stmt = """
             DELETE FROM device_lists_stream
-            WHERE user_id = ? AND device_id = ? AND stream_id < ?
-            """,
-            [(user_id, device_id, min_stream_id) for device_id in device_ids],
+            WHERE user_id = ? AND stream_id < ? AND %s
+        """
+        device_ids_clause, device_ids_args = make_in_list_sql_clause(
+            txn.database_engine, "device_id", device_ids
+        )
+        txn.execute(
+            cleanup_obsolete_stmt % (device_ids_clause,),
+            [user_id, min_stream_id] + device_ids_args,
         )
 
         self.db_pool.simple_insert_many_txn(