puppetlabs.puppetdb.cli.benchmark

Benchmark suite

This command-line utility simulates catalog submission for a population of hosts. It requires a separate, running instance of PuppetDB to submit catalogs to.

We attempt to approximate a number of hosts submitting catalogs at the specified runinterval with the specified rate-of-churn in catalog content.

Running parallel Benchmarks

If you are running up against the upper limit at which Benchmark can submit simulated requests, you can run multiple instances of benchmark and use the --offset flag to shift the cert numbers.

Example (probably run on completely separate hosts):

benchmark --offset 0 --numhosts 100000
benchmark --offset 100000 --numhosts 100000
benchmark --offset 200000 --numhosts 100000
...
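A minimal sketch of how the offsets above partition the cert numbers, assuming each instance covers the half-open range from offset up to offset + numhosts. The helpers `instance-range` and `disjoint-instances?` are hypothetical and are not part of benchmark.

```clojure
;; Hypothetical helper: the cert numbers one benchmark instance would cover,
;; assuming --offset is the first number and --numhosts is the count.
(defn instance-range
  [offset numhosts]
  (range offset (+ offset numhosts)))

;; True when no cert number is claimed by more than one instance.
(defn disjoint-instances?
  [instances]
  (let [sorted (sort-by :offset instances)]
    (every? (fn [[a b]]
              (<= (+ (:offset a) (:numhosts a)) (:offset b)))
            (partition 2 1 sorted))))

;; The three invocations from the example above.
(def instances
  [{:offset 0      :numhosts 100000}
   {:offset 100000 :numhosts 100000}
   {:offset 200000 :numhosts 100000}])
```

Checking a planned set of offsets with `disjoint-instances?` before launching is one way to avoid two instances simulating the same host.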

Preserving host-map data

By default, each time Benchmark is run, it initializes the host-map catalog, factset and report data randomly from the given set of base --catalogs, --factsets and --reports files. When re-running benchmark, this causes excessive load on PuppetDB because the completely changed catalogs and factsets must all be processed.

To avoid this, set --simulation-dir to preserve all of the host-map data between runs as nippy/frozen files. Benchmark will then load and initialize a preserved host matching a particular host-# from these files at startup. Missing hosts (for example, if --numhosts exceeds the number of preserved hosts) will be initialized randomly, as by default.

Mutating Catalogs and Factsets

The benchmark tool automatically refreshes timestamps and transaction ids when submitting catalogs, factsets and reports, but the content does not change.

To simulate system drift, code changes and fact changes, use '--rand-catalog=PERCENT_CHANCE:CHANGE_COUNT' and '--rand-facts=PERCENT_CHANCE:PERCENT_CHANGE'.

The former indicates the chance that any given catalog will perform CHANGE_COUNT resource mutations (additions, modifications or deletions). The latter is the chance that any given factset will mutate PERCENT_CHANGE of its fact values. These may be set multiple times, provided that the PERCENT_CHANCE values do not sum to more than 100%.
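As a rough illustration of the PERCENT_CHANCE:VALUE format and the 100% constraint, here is a minimal sketch. The function names are hypothetical, and the cumulative-band selection in `pick-spec` is an assumption about how repeated flags could be combined, not a description of benchmark's actual implementation.

```clojure
(require '[clojure.string :as str])

;; Hypothetical parser for one "PERCENT_CHANCE:VALUE" argument.
(defn parse-rand-spec
  [s]
  (let [[chance value] (str/split s #":" 2)]
    {:chance (Double/parseDouble chance)
     :value  (Double/parseDouble value)}))

;; The chances of repeated flags must total at most 100%.
(defn valid-chances?
  [specs]
  (<= (reduce + (map :chance specs)) 100.0))

;; Assumed selection rule: pick the spec whose cumulative chance band contains
;; roll (0 <= roll < 100), or nil when the roll lands in the remaining
;; "no mutation" band.
(defn pick-spec
  [specs roll]
  (loop [[spec & more] specs
         lower 0.0]
    (when spec
      (let [upper (+ lower (:chance spec))]
        (if (< roll upper)
          spec
          (recur more upper))))))

;; Two flags: a 10% chance of 5 changes and a 5% chance of 20 changes.
(def specs (mapv parse-rand-spec ["10:5" "5:20"]))
```

With these two specs, 85% of rolls fall outside both bands and leave the catalog content unchanged.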

By default edges are not included in catalogs. If --include-edges is true, then add-resource and del-resource will involve edges as well.

  • adding a resource adds a single 'contains' edge with the source being one of the catalog's original (non-added) resources.
  • deleting a resource removes one of the added resources (if there are any) and its related leaf edge.

By ensuring we only ever delete leaves from the graph, we maintain the graph integrity, which is important since PuppetDB validates the edges on ingestion.

This provides only limited exercise of edge mutation, which seemed like a reasonable trade-off given that edge submission is deprecated. Running with --include-edges also impacts the nature of catalog mutation, since original resources will never be removed from the catalog.
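The leaf-only rule can be sketched on a toy catalog. The map shape below (:resources, :original-keys, :edges) and the function names are assumptions made for illustration; they are not PuppetDB's wire format or benchmark's internal representation.

```clojure
;; Toy catalog: resources keyed by name, edges as a set of maps.
(def catalog
  {:resources     {"File[a]" {} "File[b]" {}}
   :original-keys ["File[a]" "File[b]"]
   :edges         #{}})

;; Add a clone plus one 'contains' edge whose source is an original resource,
;; so the clone is always a leaf of the graph.
(defn add-clone
  [work-cat clone-key source-key]
  {:pre [(some #{source-key} (:original-keys work-cat))]}
  (-> work-cat
      (assoc-in [:resources clone-key] {})
      (update :edges conj {:source       source-key
                           :target       clone-key
                           :relationship "contains"})))

;; Remove a clone and the single edge that targets it. Originals are refused,
;; so no surviving edge can point at a missing resource.
(defn del-clone
  [work-cat clone-key]
  (if (some #{clone-key} (:original-keys work-cat))
    work-cat
    (-> work-cat
        (update :resources dissoc clone-key)
        (update :edges (fn [es] (set (remove #(= clone-key (:target %)) es)))))))

;; Every edge endpoint must still name an existing resource.
(defn graph-intact?
  [{:keys [resources edges]}]
  (every? #(and (contains? resources (:source %))
                (contains? resources (:target %)))
          edges))
```

Because clones never act as an edge source in this sketch, deleting one can only ever remove a single edge, which is what keeps validation cheap to reason about.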

See add-resource, mod-resource and del-resource for details of resource and edge changes.

TODO: Fact addition/removal
TODO: Mutating reports

Viewing Metrics

There are benchmark metrics which can be viewed via JMX.

WARNING: DO NOT DO THIS WITH A PRODUCTION OR INTERNET-ACCESSIBLE INSTANCE! This gives remote access to the JVM internals, potentially including secrets. If you absolutely must (you don't), read about using certs with JMX to do it securely. You are better off using the metrics API or Grafana metrics exporter.

Add the following properties to your Benchmark Java process on startup:

-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.port=5555
-Djava.rmi.server.hostname=127.0.0.1
-Dcom.sun.management.jmxremote.rmi.port=5556

Then with a tool like VisualVM, you can add a JMX Connection, and (with the MBeans plugin) view puppetlabs.puppetdb.benchmark metrics.

-main

(-main & args)

add-catalog-varying-fields

(add-catalog-varying-fields catalog)

This function adds the fields that change when there is a different catalog. code_id and catalog_uuid should be different whenever the catalog is different.

add-random-resource

(add-random-resource work-cat)

add-resource

(add-resource {:keys [original-keys include-edges] :as work-cat}
              resource-to-clone)

Adds a new resource. The new resource is built to be the same type and of a similar weight as the given resource. This helps keep the catalog relatively stable in overall weight when resources are dropped by del-resource.

If include-edges is true, a single leaf edge is created with a source from the given set of original-keys. This array is passed in and does not contain any of the 'clone-*' resources created by add-resource. This prevents nested relationships from forming between added resources, and in turn allows del-resource in the include-edges case to simply drop a cloned resource and its edge without breaking the graph validated by PuppetDB on ingestion.

benchmark-shutdown-timeout

change-resources

(change-resources operation {:keys [resource-hash] :as work-cat})

Dispatches resource change based on operation.

Makes two judgements:

  1. If the selected resource has large blob parameters it routes an :add or :del operation to :mod so as to preserve the overall lumpiness of the catalog. Without this, over time, deletes could drop the blob resources, evening out catalogs unintentionally.

  2. If there is only one resource a :del becomes a :mod.

NOTE: regarding uniformity, we probably need to revisit this, since over time, depending on the number of original resources, it grows more likely that the catalog will reach 1 resource and thereafter all resources will be clones of that single resource.
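The two judgements can be sketched as a small pure function. `route-operation` is a hypothetical name, and in this sketch the blob check and the resource count are passed in by the caller rather than derived from a work-cat.

```clojure
;; Hypothetical routing function mirroring the two judgements described above.
(defn route-operation
  [operation has-blob? resource-count]
  (cond
    ;; 1. An :add or :del that selected a blob resource becomes a :mod,
    ;;    so blob resources are never dropped.
    (and has-blob? (#{:add :del} operation)) :mod
    ;; 2. Never delete the last remaining resource.
    (and (= :del operation) (= 1 resource-count)) :mod
    :else operation))
```

For example, `(route-operation :del false 1)` yields `:mod`, while `(route-operation :del false 2)` leaves the operation as `:del`.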

cli

(cli args)

Runs the benchmark command as directed by the command line args and returns an appropriate exit status.

clone-resource

(clone-resource resource)

Build a new resource loosely based off the characteristics of the given resource.

Keeps type, tags, and approximate parameter size (in bytes).
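A minimal sketch of the idea, assuming resources are maps with :type, :title, :tags and :parameters, and that every parameter value is a string. The names `rand-string` and `clone-resource-sketch` are hypothetical; the real function may size and title clones differently.

```clojure
;; Random lowercase string of exactly n characters.
(defn rand-string
  [n]
  (apply str (repeatedly n #(char (+ (int \a) (rand-int 26))))))

;; Hypothetical clone: same :type and :tags, a fresh 'clone-' title, and
;; parameter values regenerated at the same length as the originals.
;; Only string values are handled in this sketch.
(defn clone-resource-sketch
  [{:keys [type tags parameters]}]
  {:type       type
   :title      (str "clone-" (rand-string 8))
   :tags       tags
   :parameters (into {}
                     (map (fn [[k v]] [k (rand-string (count v))]))
                     parameters)})

(def original
  {:type       "File"
   :title      "/tmp/x"
   :tags       ["file" "class"]
   :parameters {"content" "hello world" "mode" "0644"}})
```

Keeping value lengths fixed is what holds the clone at roughly the same weight as the resource it was built from.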

create-storage-dir

(create-storage-dir simulation-dir)

Returns a Path to the directory where simulation host-maps are stored.

If simulation-dir is set, then the path will be the absolute-path to simulation-dir. Otherwise a temporary directory will be created in tmpdir.

The directory is created as a side effect of calling this method if it does not already exist. Parent directories are not created.
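The behaviour described above maps naturally onto java.nio.file.Files. The following is a sketch under that assumption; `storage-dir-sketch` and the "pdb-bench-" prefix are hypothetical and may not match what the real function does.

```clojure
(import '(java.io File)
        '(java.nio.file Files LinkOption)
        '(java.nio.file.attribute FileAttribute))

;; Empty varargs arrays required by the java.nio.file.Files methods.
(def no-attrs (make-array FileAttribute 0))
(def no-link-opts (make-array LinkOption 0))

;; Hypothetical stand-in for the behaviour described above.
(defn storage-dir-sketch
  [^String simulation-dir]
  (if simulation-dir
    (let [path (.toAbsolutePath (.toPath (File. simulation-dir)))]
      ;; createDirectory creates only the final path element; it throws
      ;; NoSuchFileException when a parent directory is missing.
      (when-not (Files/exists path no-link-opts)
        (Files/createDirectory path no-attrs))
      path)
    ;; No directory given: fall back to a fresh temporary directory
    ;; under java.io.tmpdir.
    (Files/createTempDirectory "pdb-bench-" no-attrs)))
```

Using createDirectory rather than createDirectories is the design choice that makes a mistyped parent path fail loudly instead of silently creating a new tree.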

default-data-paths

del-random-resource

(del-random-resource work-cat)

del-resource

(del-resource {:keys [resource-hash include-edges] :as work-cat} rkey)

Return the resource hash with the chosen resource removed.

But if we have edges, instead choose a resource from the list of cloned resources (from add-resource actions). This is so we can just drop the single leaf edge associated with the cloned resource (we're careful in add-resource to only form a 'contains' relation with original uncloned resources).

If no cloned resources are available to choose from, do nothing, so as not to break the graph.

director

(director base-url
          ssl-opts
          scheduler
          {:keys [max-command-delay-ms] :as cmd-opts}
          event-ch
          seq-end)

jitter

(jitter stamp n)

Jitter a timestamp (rand-int n) seconds in the forward direction.
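A minimal sketch of forward-only jitter, assuming the timestamp is a java.time.Instant. The real function may use a different time type; `jitter-sketch` is a hypothetical name.

```clojure
(import '(java.time Instant))

;; Move stamp forward by a random whole number of seconds in [0, n).
(defn jitter-sketch
  [^Instant stamp n]
  (.plusSeconds stamp (rand-int n)))

(def t0 (Instant/parse "2024-01-01T00:00:00Z"))
```

Since (rand-int n) never returns n itself, the result is always at or after the input and strictly less than n seconds later.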

load-data-from-options

(load-data-from-options {:keys [archive] :as options})

load-sample-data

(load-sample-data dir from-classpath?)

Load all .json files contained in dir.

metrics

mod-random-resource

(mod-random-resource work-cat)

mod-resource

(mod-resource {:keys [resource-hash] :as work-cat} rkey)

Updates resource by touching parameters.

modify-title

(modify-title title prefix)

Regenerate a title of the same size as the one given matching cli.generate/pseudonym format. The original ordinal is kept to help with debugging, and to avoid the cost of scanning for a next value.

NOTE: the minimum title-size is 20, but there is still a chance of duplicates in long-running benchmarks.

mutate-resource-fns

Functions that randomly change a catalog's resources.

no-open-options

pdb-connection-info

(pdb-connection-info {:keys [config] :as options})

populate-hosts

(populate-hosts n
                offset
                pdb-host
                include-edges?
                catalogs
                reports
                facts
                storage-dir)

Returns a lazy sequence of host info maps, reading the data from storage-dir when a suitable file exists, and deriving the data from catalogs, reports, and facts otherwise.

process-tar-entry

(process-tar-entry tar-reader)

producers

progressing-timestamp

(progressing-timestamp num-hosts num-msgs run-interval-minutes end-commands-in)

Return a function that will return a timestamp that progresses forward in time.

prune-host-info

(prune-host-info info factsets catalogs reports)

Adjusts the info to match the current run; for example, if the current run didn't specify --catalogs, the catalog data is pruned. We might have extra data when using a simulation dir from a previous run with different arguments.

rand-catalog-mutation

(rand-catalog-mutation catalog randomize-count include-edges)

Updates id fields that change with a catalog change, and makes randomize-count additions, modifications and/or removals of resources (and edges if include-edges is true).

random-cmd-delay

random-producer

(random-producer)

randomize-map-leaf

(randomize-map-leaf leaf)

Randomizes a fact leaf.

randomize-map-leaves

(randomize-map-leaves rand-perc value)

Runs through a map and randomizes a random percentage of leaves.
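One way to picture this is a recursive walk over nested maps. The sketch below is an assumption-laden illustration: it gives each leaf an independent rand-perc percent chance of changing, which may differ from how the real function selects leaves, and both function names are hypothetical.

```clojure
;; Hypothetical leaf randomizer: keeps the leaf's type, changes its value.
;; Anything that is not a string, number or boolean is returned untouched.
(defn randomize-leaf
  [leaf]
  (cond
    (string? leaf)  (str (java.util.UUID/randomUUID))
    (integer? leaf) (rand-int 100000)
    (float? leaf)   (rand)
    (boolean? leaf) (not leaf)
    :else           leaf))

;; Walk nested maps; each non-map value is randomized with probability
;; rand-perc / 100.
(defn randomize-leaves
  [rand-perc value]
  (if (map? value)
    (into {} (map (fn [[k v]] [k (randomize-leaves rand-perc v)])) value)
    (if (< (rand 100) rand-perc)
      (randomize-leaf value)
      value)))

(def facts
  {"os"             {"family" "RedHat" "release" {"major" "8"}}
   "uptime_seconds" 1234
   "is_virtual"     true})
```

A rand-perc of 0 returns the input unchanged and a rand-perc of 100 touches every leaf, while the key structure is preserved in both cases.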

rebuild-parameters

(rebuild-parameters parameters)

Return resource parameters with changed keys and values of the same number and size.

In order to avoid key collisions, keys are rebuilt with at least five characters.

If changing keys still results in a collision, log an error and return the original parameters.
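The three rules above can be sketched as follows, assuming string keys and string values. `rand-string` and `rebuild-parameters-sketch` are hypothetical names, and printing to stderr stands in for whatever logging the real function uses.

```clojure
;; Random lowercase string of exactly n characters.
(defn rand-string
  [n]
  (apply str (repeatedly n #(char (+ (int \a) (rand-int 26))))))

;; Rebuild every key and value at its original length, with keys padded to
;; at least five characters. If two new keys collide, the rebuilt map ends
;; up smaller than the input, so report it and return the original map.
(defn rebuild-parameters-sketch
  [parameters]
  (let [rebuilt (into {}
                      (map (fn [[k v]]
                             [(rand-string (max 5 (count k)))
                              (rand-string (count v))]))
                      parameters)]
    (if (= (count rebuilt) (count parameters))
      rebuilt
      (do (binding [*out* *err*]
            (println "key collision while rebuilding parameters"))
          parameters))))

(def params {"a" "xyz" "ensure" "present"})
```

Comparing entry counts is a cheap way to detect a collision, since a duplicate key silently overwrites an earlier entry in the new map.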

register-resource-counts

(register-resource-counts numhosts)

Set up a metric to track catalog resource counts based on numhosts.

register-shutdown-hook!

(register-shutdown-hook! f)

resource-has-blob?

(resource-has-blob? resource)

True if the given resource has a BLOB parameter value. Sample catalogs created by the PuppetDB Generate command may have 'content_blob_*' parameters with large values.
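A minimal sketch of such a predicate, assuming a resource is a map with a :parameters map and that a blob is recognised purely by a parameter name starting with "content_blob_". The real predicate may inspect values instead; `has-blob-sketch?` is a hypothetical name.

```clojure
(require '[clojure.string :as str])

;; True when any parameter name starts with "content_blob_".
;; Works for both string and keyword parameter names.
(defn has-blob-sketch?
  [resource]
  (boolean (some #(str/starts-with? (name %) "content_blob_")
                 (keys (:parameters resource)))))
```

For example, a resource with parameters `{"content_blob_1" "..."}` is reported as a blob resource, while one with only `{"mode" "0644"}` is not.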

send-catalog

(send-catalog url certname version catalog opts)

send-commands

(send-commands options)

Feeds commands to PDB as requested by args. Returns a map of :join, a function to wait for the benchmark process to terminate (which only happens when you pass nummsgs), and :stop, a function to request termination of the benchmark process and wait for it to stop cleanly. These functions return true if shutdown happened cleanly, or false if there was a timeout.

send-commands-wrapper

(send-commands-wrapper args)

send-facts

(send-facts url certname version catalog opts)

send-queries

(send-queries options)

send-report

(send-report url certname version catalog opts)

start-rate-monitor

(start-rate-monitor rate-monitor-ch run-interval commands-per-puppet-run _state)

Start a task which monitors the rate of messages on rate-monitor-ch and prints it to the console every 5 seconds. Uses run-interval to compute the number of nodes that would produce that load.
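The node estimate is simple arithmetic. The sketch below assumes the run interval is given in minutes and that every node submits the same number of commands once per interval; `equivalent-nodes` is a hypothetical name and the real function's units may differ.

```clojure
;; Number of simulated nodes that would produce msgs-per-second, assuming
;; every node submits commands-per-run commands once per run interval.
(defn equivalent-nodes
  [msgs-per-second run-interval-minutes commands-per-run]
  (/ (* msgs-per-second run-interval-minutes 60)
     commands-per-run))
```

For example, 10 commands per second at a 30 minute run interval with 3 commands per run works out to 10 × 1800 / 3 = 6000 nodes.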

start-simulation-loop

(start-simulation-loop numhosts
                       run-interval
                       num-msgs
                       end-commands-in
                       rand-catalogs
                       rand-facts
                       simulation-threads
                       sim-ch
                       host-info-ch
                       read-ch
                       &
                       {:keys [facts catalogs reports include-edges?
                               storage-dir]})

Run a background process which takes host-state maps from read-ch, updates them with update-host, and puts them on write-ch. If num-msgs is not given, uses numhosts and run-interval to run the simulation at a reasonable rate. Close read-ch to terminate the background process.

touch-parameter-value (multimethod)

Return a new parameter of about the same size and type.

TODO: handle arrays and maps.

touch-parameters

(touch-parameters parameters)

Return resource parameters with one value changed. Size is the same.

try-load-file

(try-load-file file)

Attempt to read and parse the JSON in file. If this fails, an error is logged and nil is returned.

update-catalog

(update-catalog catalog include-edges rand-catalogs uuid stamp)

Updates catalog timestamps and transaction UUIDs that vary with every catalog run. Depending on settings in the rand-catalogs array, may make additional random changes to catalog resources.

update-factset

(update-factset factset rand-facts stamp)

Updates the producer_timestamp to be current, and randomly updates the leaves of the factset based on a percentage provided in rand-percentage.

update-host

(update-host {:keys [_host catalog report factset] :as state}
             include-edges
             rand-catalogs
             rand-facts
             get-timestamp)

Perform a simulation step on host-map. Always update timestamps and uuids; randomly mutate other data depending on rand-catalogs and rand-facts.

update-report

(update-report report uuid stamp)

configuration_version, start_time and end_time should always change on subsequent report submissions. This function changes those fields to avoid computing the same hash again (which would cause constraint errors in the DB).

update-report-resources

(update-report-resources resources stamp)

validate-options

(validate-options options)

warn-missing-data

(warn-missing-data catalogs reports facts)

write-host-info

(write-host-info info path)