puppetlabs.puppetdb.query.monitor



compare-deadline-keys
forget
forget-pg-pid
map->PuppetDBQueryMonitor
monitor
ns-per-ms
register-pg-pid
start
state-summary
stop
stop-query-at-deadline-or-disconnect
strict-map->PuppetDBQueryMonitor

This provides a monitor for in-progress queries. The monitor keeps track of each registered query's deadline, client socket (channel), and possible postgresql connection, and whenever the deadline is reached or the client disconnects (the query is abandoned), the monitor will attempt to kill the query -- currently by invoking a pg_terminate() on the query's registered postgres pid.

The main focus is client disconnections since there didn't appear to be any easy way to detect/handle them otherwise, and because without the pg_terminate, the server might continue executing an expensive query for a long time after the client is gone (say via browser page refresh).

It's worth noting that, including this monitor, we have three different query timeout mechanisms. The other two are the time-limited-seq and the jdbc/update-local-timeouts operations in query-eng. We have all three because each has a limitation: the time-limited-seq only works when rows are moving (i.e. not when blocked waiting on pg or blocked pushing json to the client); the pg timeouts have an unusual granularity, i.e. they're a per-pg-wire-batch timeout, not a timeout for an entire statement like a top-level select; and the pg_terminate()s used here in the monitor are more expensive than either of those (killing an entire pg worker process).

The core of the monitor is a traditional (here NIO based) socket select loop, which should be able to handle even a large number of queries reasonably efficiently, and without requiring some number of threads proportional to the in-progress query count.
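As a rough illustration of the select-loop idea, here is a minimal sketch for a single channel. It is not the monitor's actual code: watch-channel is a hypothetical name, and the sketch assumes the deadline is a System/nanoTime based value and that reaching end-of-stream on a one-byte read means the client is gone. It also consumes any byte the client does send, which is only acceptable under the no-pipelining assumption.

```clojure
(import '(java.io IOException)
        '(java.nio ByteBuffer)
        '(java.nio.channels SelectionKey Selector SocketChannel))

(defn watch-channel
  "Hypothetical sketch: blocks until channel reaches end-of-stream
  (returns :disconnected) or deadline-ns passes (returns :deadline).
  deadline-ns is assumed to be on the System/nanoTime clock."
  [^SocketChannel channel deadline-ns]
  ;; A channel must be non-blocking before it can be registered.
  (.configureBlocking channel false)
  (with-open [selector (Selector/open)]
    (.register channel selector SelectionKey/OP_READ)
    (loop []
      (let [remaining-ms (quot (- deadline-ns (System/nanoTime)) 1000000)]
        (if-not (pos? remaining-ms)
          :deadline
          (do
            ;; Sleep until the channel is readable (data or EOF) or the
            ;; remaining time runs out.
            (.select selector (long remaining-ms))
            (.clear (.selectedKeys selector))
            ;; A read of -1 (or an I/O error) is treated as a disconnect;
            ;; 0 means nothing to read yet, so keep waiting.
            (if (try
                  (neg? (.read channel (ByteBuffer/allocate 1)))
                  (catch IOException _ true))
              :disconnected
              (recur))))))))
```

A real monitor would register many channels with one long-lived selector and wake for the earliest deadline, rather than opening a selector per channel as this sketch does.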

The current implementation is intended to respect the Selector concurrency requirements, aided in part by limiting most work to the single monitor thread, though forget does compete with the monitor loop (coordinating via the :terminated promise).

No operations should block forever; they should all eventually (in some cases customizably) time out, and the current implementation is intended, overall, to try to let pdb keep running, even if the monitor (thread) dies somehow. The precipitating errors should still be reported to the log.

Every monitored query will have a SelectionKey associated with it. The key is cancelled during forget, but won't be removed from the selector's cancelled set until the next call to select. During that time, another query on the same socket/connection could try to re-register the cancelled key. This will throw an exception, which we suppress and retry until the select loop finally removes the cancelled key, and we can re-register the socket.
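The retry described above might look roughly like the following sketch. register-with-retry, the attempt limit, and the one millisecond pause are all hypothetical; the only library behaviour relied on is that registering a channel whose existing key for that selector has been cancelled, but not yet flushed by a select, throws CancelledKeyException.

```clojure
(import '(java.nio.channels CancelledKeyException SelectableChannel
                            Selector))

(defn register-with-retry
  "Hypothetical sketch: registers channel with selector for ops,
  retrying while a stale cancelled key is still pending.  Rethrows the
  CancelledKeyException after `attempts` failed tries."
  [^SelectableChannel channel ^Selector selector ops attempts]
  (loop [n attempts]
    (or (try
          (.register channel selector (int ops))
          (catch CancelledKeyException e
            ;; Out of attempts: give up and surface the error.
            (when (<= n 1) (throw e))
            nil))
        (do
          ;; Give the select loop a chance to flush the cancelled key.
          (Thread/sleep 1)
          (recur (dec n))))))
```

The retry only terminates normally if some other thread (here, the monitor's select loop) eventually calls select on the same selector.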

Every monitored query may also have a postgres pid associated with it, and whenever it does, that pid should be terminated (in coordination with the :terminated promise) once the query has been abandoned or has timed out.

The terminated promise coordinates between the monitor and attempts to remove (forget) a query. The arrangement is intended to make sure that the attempt to forget doesn't return until any competing termination attempt has finished, or at least had a chance to finish (otherwise the termination could kill a pg worker that's no longer associated with the original query, i.e. it's handling a new query that jetty has picked up on that channel).
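The coordination can be pictured with a plain Clojure promise. This is a simplified sketch, not the monitor's implementation: terminate-query! and wait-for-termination are hypothetical names, kill-fn stands in for the pg_terminate call, and the sketch waits unconditionally, whereas the real code only needs to wait when a termination is actually competing.

```clojure
(defn terminate-query!
  "Hypothetical monitor-side step: runs kill-fn, then delivers the
  query's :terminated promise even if kill-fn throws."
  [{:keys [terminated]} kill-fn]
  (try
    (kill-fn)
    (finally
      (deliver terminated true))))

(defn wait-for-termination
  "Hypothetical forget-side step: returns true once the termination
  attempt has finished, or ::timeout after timeout-ms."
  [{:keys [terminated]} timeout-ms]
  (deref terminated timeout-ms ::timeout))
```

Because the promise is delivered in a finally clause, a waiting forget is released whether or not the kill succeeded, which matches the "or at least had a chance to finish" wording above.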

The client socket monitoring depends on access to the jetty query response, which (at least at the moment) provides indirect access to the java socket channel, which can be read to determine whether the client is still connected.

The current implementation is completely incompatible with http "pipelining", but it looks like that is no longer a realistic concern: https://daniel.haxx.se/blog/2019/04/06/curl-says-bye-bye-to-pipelining/

If that turns out to be an incorrect assumption, then we'll have to reevaluate the implementation and/or feasibility of the monitoring. That's because so far, the only way we've found to detect a client disconnection is to attempt to read a byte. At the moment, that's acceptable because the client shouldn't be sending any data during the response (which of course wouldn't be true with pipelining, where it could be sending additional requests).


compare-deadline-keys^clj

(compare-deadline-keys [deadline-1 skey-1] [deadline-2 skey-2])


forget^clj

(forget {:keys [queries selector thread] :as _monitor} select-key)

Causes the monitor to forget the query specified by the select-key. Returns true if that succeeds, or ::timeout if the process does not complete within two seconds. Repeated calls for the same query will not crash. After a call for a given key, the monitor will have forgotten about it, but the final disposition of that query is undefined, i.e. it might or might not have been killed successfully. Calling this for a query that has already been forgotten is not an error (simplifies some error handling).


forget-pg-pid^clj

(forget-pg-pid {:keys [queries thread] :as _monitor} select-key)


map->PuppetDBQueryMonitor^clj

(map->PuppetDBQueryMonitor m40241)

Factory function for class PuppetDBQueryMonitor, taking a map of keywords to field values. Unlike the clojure.core version, it is not much slower than the positional ->x constructor. (The performance gap is fixed in Clojure 1.7, so this should eventually be removed.)


monitor^clj

(monitor &
         {:keys [terminate-query]
          :or {terminate-query
                 puppetlabs.puppetdb.query.monitor/terminate-query}})


ns-per-ms^clj
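This var has no docstring here. Its name suggests a nanoseconds-per-millisecond conversion factor, presumably used to turn millisecond timeouts into deadline-ns values like the one stop-query-at-deadline-or-disconnect accepts. A minimal sketch, assuming the value is 1000000 and that deadlines are measured on the System/nanoTime clock (neither is confirmed by this page); deadline-ns-after is a hypothetical helper:

```clojure
;; Assumed value: one millisecond is 1,000,000 nanoseconds.
(def ns-per-ms 1000000)

(defn deadline-ns-after
  "Hypothetical helper: the deadline that falls timeout-ms after now-ns."
  [now-ns timeout-ms]
  (+ now-ns (* timeout-ms ns-per-ms)))

;; For example, a deadline two seconds from now:
;; (deadline-ns-after (System/nanoTime) 2000)
```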


register-pg-pid^clj

(register-pg-pid {:keys [queries thread] :as _monitor} select-key pid)


start^clj

(start {:keys [thread] :as monitor})


state-summary^clj

(state-summary {:keys [deadlines selector-keys]})


stop^clj

(stop monitor)

(stop {:keys [exit selector thread] :as _monitor} timeout-ms)

Attempts to stop the monitor, waiting up to timeout-ms if provided, or indefinitely if not. Returns false if the monitor thread is still running, true otherwise. May be called more than once. When a true value is returned, all monitor activities should be finished.


stop-query-at-deadline-or-disconnect^clj

(stop-query-at-deadline-or-disconnect {:keys [selector queries thread]
                                       :as _monitor}
                                      id
                                      channel
                                      deadline-ns
                                      db)


strict-map->PuppetDBQueryMonitor^clj

(strict-map->PuppetDBQueryMonitor m40242 & [drop-extra-keys?__3495__auto__])

Factory function for class PuppetDBQueryMonitor, taking a map of keywords to field values. All keys are required, and no extra keys are allowed. Even faster than map->PuppetDBQueryMonitor.

