1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
|
# Cancellation
Sometimes, requests take a long time to service and clients disconnect
before Synapse produces a response. To avoid wasting resources, Synapse
can cancel request processing for select endpoints marked with the
`@cancellable` decorator.
Synapse makes use of Twisted's `Deferred.cancel()` feature to make
cancellation work. The `@cancellable` decorator does nothing by itself
and merely acts as a flag, signalling to developers and other code alike
that a method can be cancelled.
## Enabling cancellation for an endpoint
1. Check that the endpoint method, and any `async` functions in its call
tree handle cancellation correctly. See
[Handling cancellation correctly](#handling-cancellation-correctly)
for a list of things to look out for.
2. Add the `@cancellable` decorator to the `on_GET/POST/PUT/DELETE`
method. It's not recommended to make non-`GET` methods cancellable,
since cancellation midway through some database updates is less
likely to be handled correctly.
## Mechanics
There are two stages to cancellation: downward propagation of a
`cancel()` call, followed by upwards propagation of a `CancelledError`
out of a blocked `await`.
Both Twisted and asyncio have a cancellation mechanism.
| | Method | Exception | Exception inherits from |
|---------------|---------------------|-----------------------------------------|-------------------------|
| Twisted | `Deferred.cancel()` | `twisted.internet.defer.CancelledError` | `Exception` (!) |
| asyncio | `Task.cancel()` | `asyncio.CancelledError` | `BaseException` |
### Deferred.cancel()
When Synapse starts handling a request, it runs the async method
responsible for handling it using `defer.ensureDeferred`, which returns
a `Deferred`. For example:
```python
def do_something() -> Deferred[None]:
...
@cancellable
async def on_GET() -> Tuple[int, JsonDict]:
d = make_deferred_yieldable(do_something())
await d
return 200, {}
request = defer.ensureDeferred(on_GET())
```
When a client disconnects early, Synapse checks for the presence of the
`@cancellable` decorator on `on_GET`. Since `on_GET` is cancellable,
`Deferred.cancel()` is called on the `Deferred` from
`defer.ensureDeferred`, ie. `request`. Twisted knows which `Deferred`
`request` is waiting on and passes the `cancel()` call on to `d`.
The `Deferred` being waited on, `d`, may have its own handling for
`cancel()` and pass the call on to other `Deferred`s.
Eventually, a `Deferred` handles the `cancel()` call by resolving itself
with a `CancelledError`.
### CancelledError
The `CancelledError` gets raised out of the `await` and bubbles up, as
per normal Python exception handling.
## Handling cancellation correctly
In general, when writing code that might be subject to cancellation, two
things must be considered:
* The effect of `CancelledError`s raised out of `await`s.
* The effect of `Deferred`s being `cancel()`ed.
Examples of code that handles cancellation incorrectly include:
* `try-except` blocks which swallow `CancelledError`s.
* Code that shares the same `Deferred`, which may be cancelled, between
multiple requests.
* Code that starts some processing that's exempt from cancellation, but
uses a logging context from cancellable code. The logging context
will be finished upon cancellation, while the uncancelled processing
is still using it.
Some common patterns are listed below in more detail.
### `async` function calls
Most functions in Synapse are relatively straightforward from a
cancellation standpoint: they don't do anything with `Deferred`s and
purely call and `await` other `async` functions.
An `async` function handles cancellation correctly if its own code
handles cancellation correctly and all the async function it calls
handle cancellation correctly. For example:
```python
async def do_two_things() -> None:
check_something()
await do_something()
await do_something_else()
```
`do_two_things` handles cancellation correctly if `do_something` and
`do_something_else` handle cancellation correctly.
That is, when checking whether a function handles cancellation
correctly, its implementation and all its `async` function calls need to
be checked, recursively.
As `check_something` is not `async`, it does not need to be checked.
### CancelledErrors
Because Twisted's `CancelledError`s are `Exception`s, it's easy to
accidentally catch and suppress them. Care must be taken to ensure that
`CancelledError`s are allowed to propagate upwards.
<table width="100%">
<tr>
<td width="50%" valign="top">
**Bad**:
```python
try:
await do_something()
except Exception:
# `CancelledError` gets swallowed here.
logger.info(...)
```
</td>
<td width="50%" valign="top">
**Good**:
```python
try:
await do_something()
except CancelledError:
raise
except Exception:
logger.info(...)
```
</td>
</tr>
<tr>
<td width="50%" valign="top">
**OK**:
```python
try:
check_something()
# A `CancelledError` won't ever be raised here.
except Exception:
logger.info(...)
```
</td>
<td width="50%" valign="top">
**Good**:
```python
try:
await do_something()
except ValueError:
logger.info(...)
```
</td>
</tr>
</table>
#### defer.gatherResults
`defer.gatherResults` produces a `Deferred` which:
* broadcasts `cancel()` calls to every `Deferred` being waited on.
* wraps the first exception it sees in a `FirstError`.
Together, this means that `CancelledError`s will be wrapped in
a `FirstError` unless unwrapped. Such `FirstError`s are liable to be
swallowed, so they must be unwrapped.
<table width="100%">
<tr>
<td width="50%" valign="top">
**Bad**:
```python
async def do_something() -> None:
await make_deferred_yieldable(
defer.gatherResults([...], consumeErrors=True)
)
try:
await do_something()
except CancelledError:
raise
except Exception:
# `FirstError(CancelledError)` gets swallowed here.
logger.info(...)
```
</td>
<td width="50%" valign="top">
**Good**:
```python
async def do_something() -> None:
await make_deferred_yieldable(
defer.gatherResults([...], consumeErrors=True)
).addErrback(unwrapFirstError)
try:
await do_something()
except CancelledError:
raise
except Exception:
logger.info(...)
```
</td>
</tr>
</table>
### Creation of `Deferred`s
If a function creates a `Deferred`, the effect of cancelling it must be considered. `Deferred`s that get shared are likely to have unintended behaviour when cancelled.
<table width="100%">
<tr>
<td width="50%" valign="top">
**Bad**:
```python
cache: Dict[str, Deferred[None]] = {}
def wait_for_room(room_id: str) -> Deferred[None]:
deferred = cache.get(room_id)
if deferred is None:
deferred = Deferred()
cache[room_id] = deferred
# `deferred` can have multiple waiters.
# All of them will observe a `CancelledError`
# if any one of them is cancelled.
return make_deferred_yieldable(deferred)
# Request 1
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
# Request 2
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
```
</td>
<td width="50%" valign="top">
**Good**:
```python
cache: Dict[str, Deferred[None]] = {}
def wait_for_room(room_id: str) -> Deferred[None]:
deferred = cache.get(room_id)
if deferred is None:
deferred = Deferred()
cache[room_id] = deferred
# `deferred` will never be cancelled now.
# A `CancelledError` will still come out of
# the `await`.
# `delay_cancellation` may also be used.
return make_deferred_yieldable(stop_cancellation(deferred))
# Request 1
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
# Request 2
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
```
</td>
</tr>
<tr>
<td width="50%" valign="top">
</td>
<td width="50%" valign="top">
**Good**:
```python
cache: Dict[str, List[Deferred[None]]] = {}
def wait_for_room(room_id: str) -> Deferred[None]:
if room_id not in cache:
cache[room_id] = []
# Each request gets its own `Deferred` to wait on.
deferred = Deferred()
cache[room_id]].append(deferred)
return make_deferred_yieldable(deferred)
# Request 1
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
# Request 2
await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org")
```
</td>
</table>
### Uncancelled processing
Some `async` functions may kick off some `async` processing which is
intentionally protected from cancellation, by `stop_cancellation` or
other means. If the `async` processing inherits the logcontext of the
request which initiated it, care must be taken to ensure that the
logcontext is not finished before the `async` processing completes.
<table width="100%">
<tr>
<td width="50%" valign="top">
**Bad**:
```python
cache: Optional[ObservableDeferred[None]] = None
async def do_something_else(
to_resolve: Deferred[None]
) -> None:
await ...
logger.info("done!")
to_resolve.callback(None)
async def do_something() -> None:
if not cache:
to_resolve = Deferred()
cache = ObservableDeferred(to_resolve)
# `do_something_else` will never be cancelled and
# can outlive the `request-1` logging context.
run_in_background(do_something_else, to_resolve)
await make_deferred_yieldable(cache.observe())
with LoggingContext("request-1"):
await do_something()
```
</td>
<td width="50%" valign="top">
**Good**:
```python
cache: Optional[ObservableDeferred[None]] = None
async def do_something_else(
to_resolve: Deferred[None]
) -> None:
await ...
logger.info("done!")
to_resolve.callback(None)
async def do_something() -> None:
if not cache:
to_resolve = Deferred()
cache = ObservableDeferred(to_resolve)
run_in_background(do_something_else, to_resolve)
# We'll wait until `do_something_else` is
# done before raising a `CancelledError`.
await make_deferred_yieldable(
delay_cancellation(cache.observe())
)
else:
await make_deferred_yieldable(cache.observe())
with LoggingContext("request-1"):
await do_something()
```
</td>
</tr>
<tr>
<td width="50%">
**OK**:
```python
cache: Optional[ObservableDeferred[None]] = None
async def do_something_else(
to_resolve: Deferred[None]
) -> None:
await ...
logger.info("done!")
to_resolve.callback(None)
async def do_something() -> None:
if not cache:
to_resolve = Deferred()
cache = ObservableDeferred(to_resolve)
# `do_something_else` will get its own independent
# logging context. `request-1` will not count any
# metrics from `do_something_else`.
run_as_background_process(
"do_something_else",
do_something_else,
to_resolve,
)
await make_deferred_yieldable(cache.observe())
with LoggingContext("request-1"):
await do_something()
```
</td>
<td width="50%">
</td>
</tr>
</table>
|