This provides a monitor for in-progress queries. The monitor keeps track of each registered query's deadline, client socket (channel), and possible PostgreSQL connection. Whenever the deadline is reached or the client disconnects (the query is abandoned), the monitor will attempt to kill the query, currently by invoking a pg_terminate() on the query's registered postgres pid.
The main focus is client disconnections since there didn't appear to be any easy way to detect/handle them otherwise, and because without the pg_terminate, the server might continue executing an expensive query for a long time after the client is gone (say via browser page refresh).
It's worth noting that, including this monitor, we have three different query timeout mechanisms. The other two are the time-limited-seq and the jdbc/update-local-timeouts operations in query-eng. We have all three because each has its own limitation: the time-limited-seq only works when rows are moving (i.e. not when blocked waiting on pg or blocked pushing json to the client); the pg timeouts have an unusual granularity, i.e. they're a per-pg-wire-batch timeout, not a timeout for an entire statement like a top-level select; and the pg_terminate()s used here in the monitor are more expensive than either of those (killing an entire pg worker process).
The core of the monitor is a traditional (here NIO based) socket select loop, which should be able to handle even a large number of queries reasonably efficiently, and without requiring some number of threads proportional to the in-progress query count.
The current implementation is intended to respect the Selector concurrency requirements, aided in part by limiting most work to the single monitor thread, though `forget` does compete with the monitor loop (coordinating via the `:terminated` promise).
No operations should block forever; they should all eventually (in some cases customizably) time out, and the current implementation is intended, overall, to try to let pdb keep running, even if the monitor (thread) dies somehow. The precipitating errors should still be reported to the log.
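As an illustration of how a single select loop can serve many deadlines without blocking forever, here is a minimal sketch. It is not the monitor's actual code: `expired-ids`, `select-timeout-ms`, the shape of the `deadlines` map, and the `max-ms` cap are all hypothetical names chosen for this example. The one fact it relies on is that `Selector.select(timeout)` treats a timeout of zero as "block indefinitely", so the computed wait is kept at 1 ms or more.

```clojure
(defn expired-ids
  "Returns the set of (hypothetical) query ids whose deadline has passed.
  `deadlines` maps an id to a System/nanoTime-style deadline."
  [deadlines now-ns]
  (set (for [[id deadline-ns] deadlines
             ;; nanoTime values are only meaningful as differences.
             :when (<= (unchecked-subtract (long deadline-ns) (long now-ns)) 0)]
         id)))

(defn select-timeout-ms
  "Returns how long, in milliseconds, a select call may wait before the
  nearest deadline. The result is never below 1 (a Selector treats a 0 ms
  timeout as 'block indefinitely') and never above max-ms, so the loop
  always wakes up eventually even when nothing is registered."
  [deadlines now-ns max-ms]
  (if (empty? deadlines)
    max-ms
    (let [remaining-ns (apply min
                              (map #(unchecked-subtract (long %) (long now-ns))
                                   (vals deadlines)))
          remaining-ms (quot remaining-ns 1000000)]
      (-> remaining-ms (max 1) (min max-ms)))))
```

A loop built this way would call something like `(.select selector (select-timeout-ms deadlines (System/nanoTime) max-ms))` and then handle both the ready keys and `(expired-ids deadlines (System/nanoTime))` on each pass.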
Every monitored query will have a SelectionKey associated with it. The key is cancelled during forget, but won't be removed from the selector's cancelled set until the next call to select. During that time, another query on the same socket/connection could try to re-register the cancelled key. This will throw an exception, which we suppress and retry until the select loop finally removes the cancelled key, and we can re-register the socket.
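The cancelled-key window described above can be reproduced with plain Java NIO interop. The following is a small sketch rather than the monitor's code: `cancelled-key-demo` is a hypothetical name, and a loopback `ServerSocketChannel` stands in for the client channel. It shows that `register` throws `CancelledKeyException` while the cancelled key is still pending, and succeeds once a select call has flushed the cancelled set.

```clojure
(import '(java.net InetSocketAddress)
        '(java.nio.channels CancelledKeyException SelectionKey Selector
                            ServerSocketChannel))

(defn cancelled-key-demo
  "Registers a channel, cancels its key, and reports what happens when the
  same channel is registered again before and after a select call."
  []
  (with-open [selector (Selector/open)
              server (ServerSocketChannel/open)]
    (.bind server (InetSocketAddress. "127.0.0.1" 0))
    (.configureBlocking server false)
    (let [key (.register server selector SelectionKey/OP_ACCEPT)
          _ (.cancel key)
          ;; The key is invalid now, but the selector has not dropped it yet.
          threw? (try
                   (.register server selector SelectionKey/OP_ACCEPT)
                   false
                   (catch CancelledKeyException _ true))
          ;; Any select operation processes the cancelled-key set.
          _ (.selectNow selector)
          new-key (.register server selector SelectionKey/OP_ACCEPT)]
      {:threw-while-cancelled? threw?
       :re-registered? (.isValid new-key)})))
```

In the monitor the select call happens on the monitor thread, which is why the registering side has to suppress the exception and retry rather than flush the cancelled set itself.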
Every monitored query may also have a postgres pid associated with it, and whenever it does, that pid should be terminated (in coordination with the :terminated promise) once the query has been abandoned or has timed out.
The terminated promise coordinates between the monitor and attempts to remove (forget) a query. The arrangement is intended to make sure that the attempt to forget doesn't return until any competing termination attempt has finished, or at least had a chance to finish (otherwise the termination could kill a pg worker that's no longer associated with the original query, i.e. it's handling a new query that jetty has picked up on that channel).
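The coordination described above can be pictured as a bounded wait on a promise. This is a simplified model under stated assumptions, not the actual implementation: `forget-sketch`, `forget-demo`, the atom layout (`id -> {:pid ... :terminated promise}`), and the ids and pids are hypothetical, and a daemon thread stands in for the monitor finishing a termination.

```clojure
(defn forget-sketch
  "Removes id from the queries atom, then waits up to timeout-ms for any
  in-flight termination to finish. Returns true, or ::timeout if the
  termination never reported back. Forgetting an unknown id is not an error."
  [queries id timeout-ms]
  (let [[before _after] (swap-vals! queries dissoc id) ; needs Clojure 1.9+
        terminated (get-in before [id :terminated])]
    (cond
      (nil? terminated) true
      (= ::timeout (deref terminated timeout-ms ::timeout)) ::timeout
      :else true)))

(defn forget-demo
  "Exercises forget-sketch against one termination that completes and one
  that never does."
  []
  (let [queries (atom {})
        fast (promise)
        stuck (promise)]
    (swap! queries assoc
           :q1 {:pid 1001 :terminated fast}
           :q2 {:pid 1002 :terminated stuck})
    ;; Stand-in for the monitor delivering once its kill attempt is done.
    (doto (Thread. (fn [] (Thread/sleep 50) (deliver fast true)))
      (.setDaemon true)
      (.start))
    (let [finished (forget-sketch queries :q1 2000)
          stuck-result (forget-sketch queries :q2 100)
          repeated (forget-sketch queries :q1 100)]
      {:finished finished
       :stuck-timed-out? (= ::timeout stuck-result)
       :repeated repeated
       :remaining (count @queries)})))
```

The point of waiting before returning is the one made above: once forget returns, the channel may be reused for a new query, so a late termination must not still be in progress.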
The client socket monitoring depends on access to the jetty query response which (at least at the moment) provides indirect access to the java socket channel which can be read to determine whether the client is still connected.
The current implementation is completely incompatible with http "pipelining", but it looks like that is no longer a realistic concern: https://daniel.haxx.se/blog/2019/04/06/curl-says-bye-bye-to-pipelining/
If that turns out to be an incorrect assumption, then we'll have to reevaluate the implementation and/or feasibility of the monitoring. That's because so far, the only way we've found to detect a client disconnection is to attempt to read a byte. At the moment, that's acceptable because the client shouldn't be sending any data during the response (which of course wouldn't be true with pipelining, where it could be sending additional requests).
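The read-a-byte check can be sketched with a loopback connection. This is an illustration only: `disconnected?` and `disconnect-demo` are hypothetical names, and the real monitor reaches the channel through the jetty response rather than opening sockets itself. On a non-blocking `SocketChannel`, `read` returns 0 when the peer is connected but silent and -1 once the peer has closed.

```clojure
(import '(java.io IOException)
        '(java.net InetSocketAddress)
        '(java.nio ByteBuffer)
        '(java.nio.channels ServerSocketChannel SocketChannel))

(defn disconnected?
  "Attempts to read one byte from a non-blocking channel. Returns true at
  end of stream or on an I/O error. Note that this consumes a byte if the
  client did send one, which is why pipelining would break it."
  [^SocketChannel channel]
  (let [buf (ByteBuffer/allocate 1)]
    (try
      (neg? (.read channel buf))
      (catch IOException _ true))))

(defn disconnect-demo
  "Connects a client over loopback and checks the server-side channel before
  and after the client closes."
  []
  (with-open [server (ServerSocketChannel/open)]
    (.bind server (InetSocketAddress. "127.0.0.1" 0))
    (with-open [client (SocketChannel/open (.getLocalAddress server))
                accepted (.accept server)]
      (.configureBlocking accepted false)
      (let [before (disconnected? accepted)
            _ (.close client)
            deadline (+ (System/nanoTime) 2000000000)
            ;; The close takes a moment to arrive, so poll for up to 2 s.
            after (loop []
                    (cond
                      (disconnected? accepted) true
                      (pos? (- (System/nanoTime) deadline)) false
                      :else (do (Thread/sleep 10) (recur))))]
        {:before-close before :after-close after}))))
```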
(forget {:keys [queries selector thread] :as _monitor} select-key)
Causes the monitor to forget the query specified by the select-key. Returns true if that succeeds, or ::timeout if the process does not complete within two seconds. Repeated calls for the same query will not crash. After a call for a given key, the monitor will have forgotten about it, but the final disposition of that query is undefined, i.e. it might or might not have been killed successfully. Calling this for a query that has already been forgotten is not an error (simplifies some error handling).

(map->PuppetDBQueryMonitor m40241)
Factory function for class PuppetDBQueryMonitor, taking a map of keywords to field values, but not much slower than ->x like the clojure.core version. (performance is fixed in Clojure 1.7, so this should eventually be removed.)

(monitor & {:keys [terminate-query] :or {terminate-query puppetlabs.puppetdb.query.monitor/terminate-query}})

(stop monitor)
(stop {:keys [exit selector thread] :as _monitor} timeout-ms)
Attempts to stop the monitor, waiting up to timeout-ms if provided, or forever. Returns false if the monitor thread is still running, true otherwise. May be called more than once. When a true value is returned, all monitor activities should be finished.

(stop-query-at-deadline-or-disconnect {:keys [selector queries thread] :as _monitor} id channel deadline-ns db)

(strict-map->PuppetDBQueryMonitor m40242 & [drop-extra-keys?__3495__auto__])
Factory function for class PuppetDBQueryMonitor, taking a map of keywords to field values. All keys are required, and no extra keys are allowed. Even faster than map->.