# puppetlabs.puppetdb.cli.benchmark

Benchmark suite

This command-line utility simulates catalog submission for a
population. It requires a separate, running instance of PuppetDB to
submit catalogs to.

We attempt to approximate a number of hosts submitting catalogs at
the specified runinterval with the specified rate of churn in
catalog content.

### Running parallel Benchmarks

If you are running up against the upper limit at which Benchmark can
submit simulated requests, you can run multiple instances of benchmark and
use the --offset flag to shift the cert numbers.

Example (probably run on completely separate hosts):

```
benchmark --offset 0 --numhosts 100000
benchmark --offset 100000 --numhosts 100000
benchmark --offset 200000 --numhosts 100000
...
```

### Preserving host-map data

By default, each time Benchmark is run, it initializes the host-map catalog,
factset, and report data randomly from the given set of base --catalogs,
--factsets, and --reports files. When benchmark is re-run, this causes
excessive load on PuppetDB due to the completely changed catalogs/factsets
that must be processed.

To avoid this, set --simulation-dir to preserve all of the host-map data
between runs as nippy-frozen files. Benchmark will then load and initialize a
preserved host matching a particular host-# from these files at startup.
Missing hosts (if --numhosts exceeds the number preserved, for example) will
be initialized randomly, as by default.
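
A minimal sketch of reusing a simulation directory (the path and host count
are arbitrary examples):

```
# The first run initializes hosts randomly and freezes them under ./sim-state;
# re-running with the same flag reloads the preserved host maps instead.
benchmark --simulation-dir ./sim-state --numhosts 1000
```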

### Mutating Catalogs and Factsets

The benchmark tool automatically refreshes timestamps and transaction ids
when submitting catalogs, factsets, and reports, but the content does not
otherwise change.

To simulate system drift, code changes and fact changes, use
'--rand-catalog=PERCENT_CHANCE:CHANGE_COUNT' and
'--rand-facts=PERCENT_CHANCE:PERCENT_CHANGE'.

The former indicates the chance that any given catalog will perform CHANGE_COUNT
resource mutations (additions, modifications, or deletions). The latter is the
chance that any given factset will mutate PERCENT_CHANGE of its fact values. These
may be set multiple times, provided that the PERCENT_CHANCE values do not sum to
more than 100%.
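
For example, a hypothetical invocation (the percentages and counts are
illustrative):

```
# 20% of catalogs get 5 resource mutations, 5% get 50 (20 + 5 <= 100);
# 10% of factsets mutate 25% of their fact values.
benchmark --rand-catalog=20:5 --rand-catalog=5:50 --rand-facts=10:25
```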

By default edges are not included in catalogs. If --include-edges is true,
then add-resource and del-resource will involve edges as well.

* adding a resource adds a single 'contains' edge with the source
  being one of the catalog's original (non-added) resources.
* deleting a resource removes one of the added resources (if there are any)
  and its related leaf edge.

By ensuring we only ever delete leaves from the graph, we maintain graph
integrity, which is important since PuppetDB validates the edges on ingestion.

This provides only limited exercise of edge mutation, which seemed like a
reasonable trade-off given that edge submission is deprecated. Running with
--include-edges also impacts the nature of catalog mutation, since original
resources will never be removed from the catalog.

See add-resource, mod-resource and del-resource for details of resource and
edge changes.

TODO: Fact addition/removal
TODO: Mutating reports

### Viewing Metrics

There are benchmark metrics which can be viewed via JMX.

WARNING: DO NOT DO THIS WITH A PRODUCTION OR INTERNET-ACCESSIBLE INSTANCE! 
This gives remote access to the JVM internals, including potentially secrets.
If you absolutely must (you don't), read about using certs with JMX to do it
securely. You are better off using the metrics API or Grafana metrics
exporter.

Add the following properties to your Benchmark Java process on startup:

```
-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.port=5555
-Djava.rmi.server.hostname=127.0.0.1
-Dcom.sun.management.jmxremote.rmi.port=5556
```

Then with a tool like VisualVM, you can add a JMX Connection, and (with the
MBeans plugin) view puppetlabs.puppetdb.benchmark metrics.
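
For a quick look without VisualVM, JConsole (bundled with the JDK) can
connect to the same port; this assumes the example flags above and a local,
non-production instance:

```
# Connect JConsole to the unauthenticated local JMX port configured above
jconsole 127.0.0.1:5555
```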

# puppetlabs.puppetdb.cli.generate

# Data Generation utility

This command-line tool can generate a base sampling of catalog, fact and
report files suitable for consumption by the PuppetDB benchmark utility.

Note that it is only necessary to generate a small set of initial sample
data, since benchmark will permute per-node differences. So even if you want
to benchmark 1000 nodes, you don't need to generate initial
catalog/fact/report JSON for 1000 nodes.

If you want a representative sample with big differences between catalogs,
you will need to run the tool multiple times. For example, if you want a set
of 5 large catalogs and 10 small ones, you will need to run the tool twice
with the desired parameters to create the two different sets.
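
For instance, a hypothetical pair of runs (the flag values are illustrative;
see the flag notes below):

```
# 5 hosts with large catalogs
lein run generate --hosts 5 --num-resources 500 --output-dir samples-large
# 10 hosts with small catalogs
lein run generate --hosts 10 --num-resources 50 --output-dir samples-small
```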

## Flag Notes

### Catalogs

#### Resource Counts

The --num-resources flag is a total that includes --num-classes. So if you set
--num-resources to 100 and --num-classes to 30, you will get a catalog with a
hundred resources, thirty of which are classes.

#### Edges

A containment edge is always generated between the main stage and each
class, and non-class resources get a containment edge to a random class. So
there will always be a base set of containment edges equal to the resource
count. The --additional-edge-percent flag governs how many non-containment
edges are added on top of that to simulate further catalog structure. There is
no guarantee of relationship depth (for example, Stage(main) ->
Class(foo) -> Class(bar) -> Resource(biff)), but it does ensure some edges
between classes, as well as between class and non-class resources.

#### Large Resource Parameter Blobs

The --blob-count and --blob-size parameters control the inclusion of large
text blobs in catalog resources. By default one ~100 KB blob is
added per catalog.

Set --blob-count to 0 to exclude blobs altogether.
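
A hypothetical invocation combining these catalog flags (the values are
arbitrary examples):

```
# 150 resources total, 20 of them classes, with no large blobs
lein run generate --num-resources 150 --num-classes 20 --blob-count 0
```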

### Facts

#### Baseline Facts

Each fact set begins with a set of baseline facts from:
[baseline-agent-node.json](./resources/puppetlabs/puppetdb/generate/samples/facts/baseline-agent-node.json).

These provide some consistency, establishing a common set of baseline fact
paths present on any Puppet node. The generator then mutates half of the
values to provide variety.

#### Fact Counts

The --num-facts parameter controls the number of facts to generate per host.

There are 376 leaf facts in the baseline file. Setting --num-facts to less than
this will remove baseline facts to approach the requested number of facts.
(Empty maps and arrays are not removed from the factset, so it will never
pare down to zero.) Setting --num-facts to a larger number will add facts of
random depth based on --max-fact-depth until the requested count is reached.

#### Total Facts Size

The --total-fact-size parameter controls the total weight of the fact values
map in kB. Weight is added after the requested count is reached, so if the
weight of the adjusted baseline facts already exceeds total-fact-size, nothing
more is done. No attempt is made to pare facts back down to the requested
size, as this would likely require removing facts.

#### Max Fact Depth

The --max-fact-depth parameter is the maximum nesting depth a fact added to
the baseline facts may reach. For example, a max depth of 5 would mean that
an added fact would at most be a nest of four maps:

  {foo: {bar: {baz: {biff: boz}}}}

Since depth is picked randomly for each additional fact, this does not
guarantee facts of a given depth. Nor does it directly affect the average
depth of facts in the generated factset, although the larger the
max-fact-depth and num-facts, the more likely that the average depth will
drift higher.

#### Package Inventory

The --num-packages parameter sets the number of packages to generate for the
factset's package_inventory array. Set to 0 to exclude.
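
A hypothetical invocation combining these fact flags (the values are
arbitrary examples):

```
# ~500 facts weighing about 25 kB total, nested at most 7 deep, no packages
lein run generate --num-facts 500 --total-fact-size 25 --max-fact-depth 7 --num-packages 0
```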

### Reports

#### Reports per Catalog

The --num-reports flag governs the number of reports to generate per
generated catalog.  Since one catalog is generated per host, this means you
will end up with num-hosts * num-reports reports.

#### Variation in Reports

A report details change, or lack thereof, during enforcement of the Puppet
catalog on the host. Since the benchmark tool currently chooses randomly from the
given report files, a simple mechanism for determining the likelihood of
receiving a report of a particular size (with lots of changes, few changes, or
no changes) is to produce multiple reports of each type per host to generate
a weighted average. (If there are 10 reports, 2 large and 8 small,
then it's 80% likely any given report submitted by benchmark will
be of the small variety.)

The knobs to control this with the generate tool are:

* --num-reports, to determine the base number of reports to generate per catalog
* --high-change-reports-percent, percentage of that base to generate as
  reports with a high number of change events, as determined by:
* --high-change-resource-percent, percentage of resources in a high change
  report that will experience events (changes)
* --low-change-reports-percent, percentage of the base reports to generate
  as reports with a low number of change events as determined by:
* --low-change-resource-percent, percentage of resources in a low change
  report that will experience events (changes)

The leftover percentage of reports will be no-change reports (generally the
most common), indicating the run was steady-state with no changes.

By default, with a num-reports of 20, a high change percent of 5% and a low
change percent of 20%, you will get 1 high change, 4 low change and 15
unchanged reports per host.
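
A hypothetical invocation producing a different split (the numbers are
illustrative; the arithmetic follows the rules above):

```
# Per host: 20 reports; 10% (2) high change touching 90% of resources,
# 30% (6) low change touching 5% of resources; the remaining 12 are unchanged.
lein run generate --num-reports 20 \
  --high-change-reports-percent 10 --high-change-resource-percent 90 \
  --low-change-reports-percent 30 --low-change-resource-percent 5
```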

#### Unchanged Resources

In Puppet 8, by default, the agent no longer includes unchanged resources in
the report, reducing its size.

The generate tool also does this by default, but you can set
--no-exclude-unchanged-resources to instead include unchanged resources in
every report (for default Puppet 7 behavior, for example).

#### Logs

In addition to a few boilerplate log lines, random logs are generated for
each change event in the report. However, other factors, such as pluginsync,
puppet runs with debug lines, and additional logging in modules, can increase
log output (quite dramatically in the case of debug output from the agent).

To simulate this, you can set --num-additional-logs to include in a report,
and --percent-add-report-logs to indicate what percentage of reports have
this additional number of logs included.
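
For instance, a hypothetical invocation (the values are arbitrary):

```
# A quarter of generated reports carry 50 extra log lines each
lein run generate --num-additional-logs 50 --percent-add-report-logs 25
```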

### Random Distribution

The default generation produces relatively uniform structures.

* for catalogs it generates equal resource and edge counts and similar byte
  counts.
* for factsets it generates equal fact counts and similar byte counts.

Example:

```
jpartlow@jpartlow-dev-2204:~/work/src/puppetdb$ lein run generate --verbose --output-dir generate-test
...
:catalogs: 5

|     :certname | :resource-count | :resource-weight | :min-resource | :mean-resource | :max-resource | :edge-count | :edge-weight | :catalog-weight |
|---------------+-----------------+------------------+---------------+----------------+---------------+-------------+--------------+-----------------|
| host-sarasu-0 |             101 |           137117 |            90 |           1357 |        110246 |         150 |        16831 |          154248 |
| host-lukoxo-1 |             101 |           132639 |            98 |           1313 |        104921 |         150 |        16565 |          149504 |
| host-dykivy-2 |             101 |           120898 |           109 |           1197 |         94013 |         150 |        16909 |          138107 |
| host-talyla-3 |             101 |           110328 |           128 |           1092 |         82999 |         150 |        16833 |          127461 |
| host-foropy-4 |             101 |           136271 |           106 |           1349 |        109811 |         150 |        16980 |          153551 |

:facts: 5

|     :certname | :fact-count | :avg-depth | :max-depth | :fact-weight | :total-weight |
|---------------+-------------+------------+------------+--------------+---------------|
| host-sarasu-0 |         400 |       2.77 |          7 |        10000 |         10118 |
| host-lukoxo-1 |         400 |        2.8 |          7 |        10000 |         10118 |
| host-dykivy-2 |         400 |     2.7625 |          7 |        10000 |         10118 |
| host-talyla-3 |         400 |     2.7825 |          7 |        10000 |         10118 |
| host-foropy-4 |         400 |     2.7925 |          7 |        10000 |         10118 |
...
```

This mode is best used when generating several different sample sets with
distinct weights and counts to provide (when combined) an overall sample set
for benchmark that includes some fixed number of fairly well described
catalog, fact and report examples.

By setting --random-distribution to true, you can instead generate a more random
sample set, where the exact parameter values used per host will be drawn
from a normal distribution with the set value as the mean.

* for catalogs, this will affect the class, resource, edge, and total blob
  counts. Blobs will be distributed randomly through the set, so if you set
  --blob-count to 2 over --hosts 10, on average there will be two per catalog,
  but some may have none, others four, etc.
* for facts, this will affect the fact and package counts, the total weight,
  and the max fact depth.

This has no effect on generated reports at the moment.

Example:

```
jpartlow@jpartlow-dev-2204:~/work/src/puppetdb$ lein run generate --verbose --random-distribution
:catalogs: 5

|     :certname | :resource-count | :resource-weight | :min-resource | :mean-resource | :max-resource | :edge-count | :edge-weight | :catalog-weight |
|---------------+-----------------+------------------+---------------+----------------+---------------+-------------+--------------+-----------------|
| host-cevani-0 |             122 |            33831 |            93 |            277 |           441 |         193 |        22044 |           56175 |
| host-firilo-1 |              91 |           115091 |           119 |           1264 |         91478 |         130 |        14466 |          129857 |
| host-gujudi-2 |             129 |            36080 |           133 |            279 |           465 |         180 |        20230 |           56610 |
| host-xegyxy-3 |             106 |           120603 |           136 |           1137 |         92278 |         153 |        17482 |          138385 |
| host-jaqomi-4 |             107 |           211735 |            87 |           1978 |         98354 |         159 |        17792 |          229827 |

:facts: 5

|     :certname | :fact-count | :avg-depth | :max-depth | :fact-weight | :total-weight |
|---------------+-------------+------------+------------+--------------+---------------|
| host-cevani-0 |         533 |  3.4690433 |          9 |        25339 |         25457 |
| host-firilo-1 |         355 |  2.7464788 |          7 |        13951 |         14069 |
| host-gujudi-2 |         380 |       2.75 |          8 |        16111 |         16229 |
| host-xegyxy-3 |         360 |  2.7305555 |          7 |         5962 |          6080 |
| host-jaqomi-4 |         269 |  2.7695167 |          7 |        16984 |         17102 |
...
```

# puppetlabs.puppetdb.cli.pdb-dataset

Pg_restore and timeshift entries utility

This command-line tool restores an empty database from a backup file (a pg_dump-generated file), then updates all the
timestamps inside the database.
It does this by calculating the period between the newest timestamp inside the file and the provided date.
Then, every timestamp is shifted by that period.
It accepts two parameters:
 - [Mandatory] -d / --dumpfile
   Path to the dumpfile that will be used to restore the database.
 - [Optional] -t / --shift-to-time
   Timestamp to which all timestamps from the dumpfile will be shifted after the restore.
   If it's not provided, the system's current timestamp will be used.

!!! All timestamps are converted to a zero-offset (UTC) format, e.g. a timestamp like 2015-03-26T10:58:51+10:00
will become 2015-03-26T00:58:51Z !!!
!!! If the time difference between the latest entry in the dumpfile and the time provided to --shift-to-time is less
than 24 hours, this tool will fail !!!
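
A hypothetical invocation, assuming the namespace's -main is run directly via
Leiningen (the dump path and timestamp are arbitrary examples):

```
# Restore ./pdb.dump into an empty database, then shift all timestamps so the
# newest entry lands on the provided date
lein run -m puppetlabs.puppetdb.cli.pdb-dataset -d ./pdb.dump -t 2023-06-01T00:00:00Z
```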

# puppetlabs.puppetdb.cli.services


Main entrypoint

PuppetDB consists of several, cooperating components:

* Command processing

  PuppetDB uses a CQRS pattern for making changes to its domain
  objects (facts, catalogs, etc). Instead of simply submitting data
  to PuppetDB and having it figure out the intent, the intent
  needs to explicitly be codified as part of the operation. This is
  known as a "command" (e.g. "replace the current facts for node
  X").

  Commands are processed asynchronously; however, we try to do our
  best to ensure that once a command has been accepted, it will
  eventually be executed. Ordering is also preserved. To do this,
  all incoming commands are placed in a message queue which the
  command processing subsystem reads from in FIFO order.

  Refer to `puppetlabs.puppetdb.command` for more details, and see the
  sketch after this list for the general shape of a command submission.

* Message queue

  We use stockpile to durably store commands. The "in memory"
  representation of that queue is a core.async channel.

* REST interface

  All interaction with PuppetDB is conducted via its REST API. We
  embed an instance of Jetty to handle web server duties. Commands
  that come in via REST are relayed to the message queue. Read-only
  requests are serviced synchronously.

* Database sweeper

  As catalogs are modified, unused records may accumulate and stale
  data may linger in the database. We periodically sweep the
  database, compacting it and performing regular cleanup so we can
  maintain acceptable performance.
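
As referenced under command processing above, here is a rough, hypothetical
sketch of submitting a command over the REST API. The endpoint path, command
name, version, and payload fields are illustrative and vary across PuppetDB
versions; consult the command API documentation for the exact wire format:

```
# Illustrative only: submit a "replace facts" command as JSON over HTTP
curl -X POST http://localhost:8080/pdb/cmd/v1 \
  -H 'Content-Type: application/json' \
  -d '{"command": "replace facts", "version": 5,
       "payload": {"certname": "node-x.example.com"}}'
```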

# puppetlabs.puppetdb.cli.time-shift-export


Timestamp shift utility

This simple command-line tool updates all the timestamps inside a PuppetDB export.
It does this by calculating the period between the newest timestamp inside the export and the provided date.
Then, every timestamp is shifted by that period.
It accepts three parameters:
 - [Mandatory] -i / --input
   Path to the .tgz pdb export, which will be shifted.
 - [Optional] -o / --output
   Path to where the shifted export will be saved.
   If no path is given, the shifted export is sent as a stream to standard output. You may use it like this:
   lein time-shift-export -i export.tgz -o > shifted.tgz
 - [Optional] -t / --shift-to-time
   Timestamp to which all the export timestamps will be shifted.
   If it's not provided, the system's current timestamp will be used.

 !!! All timestamps are converted to a zero-offset (UTC) format, e.g. a timestamp like 2015-03-26T10:58:51+10:00
 will become 2015-03-26T00:58:51Z !!!

# puppetlabs.puppetdb.cli.tk-util

This namespace is separate from cli.util because we don't want to require any more than we have to there.


# puppetlabs.puppetdb.cli.util

As this namespace is required by both the tk and non-tk subcommands, it must remain very lightweight, so that subcommands like "version" aren't slowed down by loading the entire logging subsystem or trapperkeeper, etc.


# puppetlabs.puppetdb.cli.version

Version utility

This simple command-line tool prints information about the version of PuppetDB. It is useful for testing and other situations where you'd like to know some of the version details without having a running instance of PuppetDB.

The output is currently formatted like the contents of a Java properties file; each line contains a single property name, followed by an equals sign, followed by the property value.

