summary refs log tree commit diff
path: root/docs/structured_logging.md
diff options
context:
space:
mode:
authorPatrick Cloke <clokep@users.noreply.github.com>2020-10-29 07:27:37 -0400
committerGitHub <noreply@github.com>2020-10-29 07:27:37 -0400
commit00b24aa545091395f9a92d531836f6bf7b4460e0 (patch)
tree10d7333f2d1d9aaa0a6888c9ce3afb7d6feebf58 /docs/structured_logging.md
parentDon't require hiredis to run unit tests (#8680) (diff)
downloadsynapse-00b24aa545091395f9a92d531836f6bf7b4460e0.tar.xz
Support generating structured logs in addition to standard logs. (#8607)
This modifies the configuration of structured logging to be usable from
the standard Python logging configuration.

This also separates the formatting of logs from the transport allowing
JSON logs to files or standard logs to sockets.
Diffstat (limited to 'docs/structured_logging.md')
-rw-r--r--docs/structured_logging.md164
1 files changed, 121 insertions, 43 deletions
diff --git a/docs/structured_logging.md b/docs/structured_logging.md
index decec9b8fa..b1281667e0 100644
--- a/docs/structured_logging.md
+++ b/docs/structured_logging.md
@@ -1,83 +1,161 @@
 # Structured Logging
 
-A structured logging system can be useful when your logs are destined for a machine to parse and process. By maintaining its machine-readable characteristics, it enables more efficient searching and aggregations when consumed by software such as the "ELK stack".
+A structured logging system can be useful when your logs are destined for a
+machine to parse and process. By maintaining its machine-readable characteristics,
+it enables more efficient searching and aggregations when consumed by software
+such as the "ELK stack".
 
-Synapse's structured logging system is configured via the file that Synapse's `log_config` config option points to. The file must be YAML and contain `structured: true`. It must contain a list of "drains" (places where logs go to).
+Synapse's structured logging system is configured via the file that Synapse's
+`log_config` config option points to. The file should include a formatter which
+uses the `synapse.logging.TerseJsonFormatter` class included with Synapse and a
+handler which uses the above formatter.
+
+There is also a `synapse.logging.JsonFormatter` option which does not include
+a timestamp in the resulting JSON. This is useful if the log ingester adds its
+own timestamp.
 
 A structured logging configuration looks similar to the following:
 
 ```yaml
-structured: true
+version: 1
+
+formatters:
+    structured:
+        class: synapse.logging.TerseJsonFormatter
+
+handlers:
+    file:
+        class: logging.handlers.TimedRotatingFileHandler
+        formatter: structured
+        filename: /path/to/my/logs/homeserver.log
+        when: midnight
+        backupCount: 3  # Does not include the current log file.
+        encoding: utf8
 
 loggers:
     synapse:
         level: INFO
+        handlers: [remote]
     synapse.storage.SQL:
         level: WARNING
-
-drains:
-    console:
-        type: console
-        location: stdout
-    file:
-        type: file_json
-        location: homeserver.log
 ```
 
-The above logging config will set Synapse as 'INFO' logging level by default, with the SQL layer at 'WARNING', and will have two logging drains (to the console and to a file, stored as JSON).
-
-## Drain Types
+The above logging config will set Synapse as 'INFO' logging level by default,
+with the SQL layer at 'WARNING', and will log to a file, stored as JSON.
 
-Drain types can be specified by the `type` key.
+It is also possible to figure Synapse to log to a remote endpoint by using the
+`synapse.logging.RemoteHandler` class included with Synapse. It takes the
+following arguments:
 
-### `console`
+- `host`: Hostname or IP address of the log aggregator.
+- `port`: Numerical port to contact on the host.
+- `maximum_buffer`: (Optional, defaults to 1000) The maximum buffer size to allow.
 
-Outputs human-readable logs to the console.
+A remote structured logging configuration looks similar to the following:
 
-Arguments:
+```yaml
+version: 1
 
-- `location`: Either `stdout` or `stderr`.
+formatters:
+    structured:
+        class: synapse.logging.TerseJsonFormatter
 
-### `console_json`
+handlers:
+    remote:
+        class: synapse.logging.RemoteHandler
+        formatter: structured
+        host: 10.1.2.3
+        port: 9999
 
-Outputs machine-readable JSON logs to the console.
+loggers:
+    synapse:
+        level: INFO
+        handlers: [remote]
+    synapse.storage.SQL:
+        level: WARNING
+```
 
-Arguments:
+The above logging config will set Synapse as 'INFO' logging level by default,
+with the SQL layer at 'WARNING', and will log JSON formatted messages to a
+remote endpoint at 10.1.2.3:9999.
 
-- `location`: Either `stdout` or `stderr`.
+## Upgrading from legacy structured logging configuration
 
-### `console_json_terse`
+Versions of Synapse prior to v1.23.0 included a custom structured logging
+configuration which is deprecated. It used a `structured: true` flag and
+configured `drains` instead of ``handlers`` and `formatters`.
 
-Outputs machine-readable JSON logs to the console, separated by newlines. This
-format is not designed to be read and re-formatted into human-readable text, but
-is optimal for a logging aggregation system.
+Synapse currently automatically converts the old configuration to the new
+configuration, but this will be removed in a future version of Synapse. The
+following reference can be used to update your configuration. Based on the drain
+`type`, we can pick a new handler:
 
-Arguments:
+1. For a type of `console`, `console_json`, or `console_json_terse`: a handler
+   with a class of `logging.StreamHandler` and a `stream` of `ext://sys.stdout`
+   or `ext://sys.stderr` should be used.
+2. For a type of `file` or `file_json`: a handler of `logging.FileHandler` with
+   a location of the file path should be used.
+3. For a type of `network_json_terse`: a handler of `synapse.logging.RemoteHandler`
+   with the host and port should be used.
 
-- `location`: Either `stdout` or `stderr`.
+Then based on the drain `type` we can pick a new formatter:
 
-### `file`
+1. For a type of `console` or `file` no formatter is necessary.
+2. For a type of `console_json` or `file_json`: a formatter of
+   `synapse.logging.JsonFormatter` should be used.
+3. For a type of `console_json_terse` or `network_json_terse`: a formatter of
+   `synapse.logging.TerseJsonFormatter` should be used.
 
-Outputs human-readable logs to a file.
+For each new handler and formatter they should be added to the logging configuration
+and then assigned to either a logger or the root logger.
 
-Arguments:
+An example legacy configuration:
 
-- `location`: An absolute path to the file to log to.
+```yaml
+structured: true
 
-### `file_json`
+loggers:
+    synapse:
+        level: INFO
+    synapse.storage.SQL:
+        level: WARNING
 
-Outputs machine-readable logs to a file.
+drains:
+    console:
+        type: console
+        location: stdout
+    file:
+        type: file_json
+        location: homeserver.log
+```
 
-Arguments:
+Would be converted into a new configuration:
 
-- `location`: An absolute path to the file to log to.
+```yaml
+version: 1
 
-### `network_json_terse`
+formatters:
+    json:
+        class: synapse.logging.JsonFormatter
 
-Delivers machine-readable JSON logs to a log aggregator over TCP. This is
-compatible with LogStash's TCP input with the codec set to `json_lines`.
+handlers:
+    console:
+        class: logging.StreamHandler
+        location: ext://sys.stdout
+    file:
+        class: logging.FileHandler
+        formatter: json
+        filename: homeserver.log
 
-Arguments:
+loggers:
+    synapse:
+        level: INFO
+        handlers: [console, file]
+    synapse.storage.SQL:
+        level: WARNING
+```
 
-- `host`: Hostname or IP address of the log aggregator.
-- `port`: Numerical port to contact on the host.
\ No newline at end of file
+The new logging configuration is a bit more verbose, but significantly more
+flexible. It allows for configuration that were not previously possible, such as
+sending plain logs over the network, or using different handlers for different
+modules.