Edits to pgd_bench content
ebgitelman authored and djw-m committed Sep 19, 2023
1 parent 5cb7557 commit 281a447
Showing 2 changed files with 110 additions and 119 deletions.
169 changes: 84 additions & 85 deletions product_docs/docs/pgd/5/reference/testingandtuning.mdx
@@ -4,117 +4,116 @@ navTitle: Testing and tuning
indexdepth: 2
---

EDB Postgres Distributed has tools which help with testing and tuning of your PGD clusters. For background, read the [Testing and Tuning](../testingandtuning) section.
EDB Postgres Distributed has tools that help with testing and tuning your PGD clusters. For background, see [Testing and tuning](../testingandtuning).


## `pgd_bench`
## pgd_bench

### Synopsis

A benchmarking tool for PGD enhanced PostgreSQL.
A benchmarking tool for PGD-enhanced PostgreSQL.

```shell
pgd_bench [OPTION]... [DBNAME] [DBNAME2]
```

`DBNAME` may be a conninfo string of the format:
`DBNAME` can be a conninfo string of the format:
`"host=10.1.1.2 user=postgres dbname=master"`

Consult the [Testing and Tuning - Pgd_bench](../testingandtuning#pgd_bench) section for examples
of `pgd_bench` options and usage.
See [pgd_bench in Testing and tuning](../testingandtuning#pgd_bench) for examples
of pgd_bench options and usage.

### Options

`pgd_bench` specific options include:
pgd_bench-specific options include the following.

#### Setting mode

`-m` or `--mode`

Which can be set to `regular`, `camo`, or `failover`. It defaults to `regular`.
The mode can be set to `regular`, `camo`, or `failover`. The default is `regular`.

* regular — Only a single node is needed to run `pgd_bench`
* camo — A second node must be specified to act as the CAMO-partner (CAMO should be set up)
* failover — A second node must be specified to act as the failover.
* `regular` — Only a single node is needed to run pgd_bench.
* `camo` — A second node must be specified to act as the CAMO partner. (CAMO must be set up.)
* `failover` — A second node must be specified to act as the failover.

When using `-m failover`, an additional option `--retry` is available. This will
instruct `pgd_bench` to retry transactions when there is a failover. The `--retry`
option is automatically enabled with `-m camo`.
When using `-m failover`, an additional option `--retry` is available. This option
instructs pgd_bench to retry transactions when there's a failover. The `--retry`
option is automatically enabled with `-m camo`.
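
As a minimal sketch of the failover mode described above, the following helper prints a pgd_bench command line rather than running it, so you can review it first. The function name `failover_cmd` and the second host address are illustrative assumptions. Only `-m failover`, `--retry`, and the two `DBNAME` arguments come from this page.

```shell
# Sketch only: print a pgd_bench command for failover mode.
# failover_cmd is a hypothetical helper, not part of pgd_bench.
failover_cmd() {
  # $1 = conninfo for the node to benchmark
  # $2 = conninfo for the second node, which acts as the failover
  printf 'pgd_bench -m failover --retry "%s" "%s"\n' "$1" "$2"
}

# Placeholder connection strings; replace them with your own nodes.
failover_cmd "host=10.1.1.2 user=postgres dbname=master" \
             "host=10.1.1.3 user=postgres dbname=master"
```

Printing the command keeps the sketch safe to run on a machine that has no PGD cluster. To execute it, copy the printed line into a shell that can reach both nodes.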

#### Setting GUC variables

`-o` or `--set-option`

This option is followed by `NAME=VALUE` entries, which will be applied using the
Postgresql [`SET`](https://www.postgresql.org/docs/current/sql-set.html) command on each server, and only those servers, that `pgd_bench` connects to.

The other options are identical to the Community PostgreSQL `pgbench`. For more
details, consult the official documentation on
[`pgbench`](https://www.postgresql.org/docs/current/pgbench.html).

We list all the options (`pgd_bench` and `pgbench`) below for completeness.

#### Initialization options:
- `-i, --initialize` — invokes initialization mode
- `-I, --init-steps=[dtgGvpf]+` (default `"dtgvp"`) — run selected initialization steps
- `d` — drop any existing `pgbench` tables
- `t` — create the tables used by the standard `pgbench` scenario
- `g` — generate data client-side and load it into the standard tables, replacing any data already present
- `G` — generate data server-side and load it into the standard tables, replacing any data already present
- `v` — invoke `VACUUM` on the standard tables
- `p` — create primary key indexes on the standard tables
- `f` — create foreign key constraints between the standard tables
- `-F, --fillfactor=NUM` — set fill factor
- `-n, --no-vacuum` — do not run `VACUUM` during initialization
- `-q, --quiet` — quiet logging (one message each 5 seconds)
- `-s, --scale=NUM` — scaling factor
- `--foreign-keys` — create foreign key constraints between tables
- `--index-tablespace=TABLESPACE` — create indexes in the specified tablespace
- `--partition-method=(range|hash)` — partition `pgbench_accounts` with this method (default: range)
- `--partitions=NUM` — partition `pgbench_accounts` into `NUM` parts (default: 0)
- `--tablespace=TABLESPACE` — create tables in the specified tablespace
- `--unlogged-tables` — create tables as unlogged tables (Note: unlogged tables are not replicated)

#### Options to select what to run:
- `-b, --builtin=NAME[@W]` — add builtin script NAME weighted at W (default: 1). Use `-b list` to list available scripts.
- `-f, --file=FILENAME[@W]` — add script `FILENAME` weighted at W (default: 1)
- `-N, --skip-some-updates` — updates of pgbench_tellers and pgbench_branches. Same as `-b simple-update`
- `-S, --select-only` — perform SELECT-only transactions. Same as `-b select-only`

#### Benchmarking options:
- `-c, --client=NUM` — number of concurrent database clients (default: 1)
- `-C, --connect` — establish new connection for each transaction
- `-D, --define=VARNAME=VALUE` — define variable for use by custom script
- `-j, --jobs=NUM` — number of threads (default: 1)
- `-l, --log` — write transaction times to log file
- `-L, --latency-limit=NUM` — count transactions lasting more than NUM ms as late
- `-m, --mode=regular|camo|failover` — mode in which pgbench should run (default: `regular`)
- `-M, --protocol=simple|extended|prepared` — protocol for submitting queries (default: `simple`)
- `-n, --no-vacuum` — do not run `VACUUM` before tests
- `-o, --set-option=NAME=VALUE` — specify runtime SET option
- `-P, --progress=NUM` — show thread progress report every NUM seconds
- `-r, --report-per-command` — latencies, failures and retries per command
- `-R, --rate=NUM` — target rate in transactions per second
- `-s, --scale=NUM` — report this scale factor in output
- `-t, --transactions=NUM` — number of transactions each client runs (default: 10)
- `-T, --time=NUM` — duration of benchmark test in seconds
- `-v, --vacuum-all` — vacuum all four standard tables before tests
- `--aggregate-interval=NUM` — data over NUM seconds
- `--failures-detailed` — report the failures grouped by basic types
- `--log-prefix=PREFIX` — prefix for transaction time log file (default: `pgbench_log`)
- `--max-tries=NUM` — max number of tries to run transaction (default: 1)
- `--progress-timestamp` — use Unix epoch timestamps for progress
- `--random-seed=SEED` — set random seed ("time", "rand", integer)
- `--retry` — retry transactions on failover, used with "-m"
- `--sampling-rate=NUM` — fraction of transactions to log (e.g., 0.01 for 1%)
- `--show-script=NAME` — show builtin script code, then exit
- `--verbose-errors` — print messages of all errors
This option is followed by `NAME=VALUE` entries, which are applied using the
PostgreSQL [`SET`](https://www.postgresql.org/docs/current/sql-set.html) command on each server that pgd_bench connects to, and only those servers.
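
For illustration, the sketch below prints a pgd_bench command that passes one `NAME=VALUE` entry with `-o`. The helper name `set_option_cmd` and the connection string are assumptions. `client_min_messages` is a standard PostgreSQL setting, chosen here only as an example value.

```shell
# Sketch only: print a pgd_bench command that applies one GUC with -o.
# set_option_cmd is a hypothetical helper, not part of pgd_bench.
set_option_cmd() {
  # $1 = NAME=VALUE entry, applied with SET on each server pgd_bench connects to
  # $2 = conninfo string for the node to benchmark
  printf 'pgd_bench -o %s "%s"\n' "$1" "$2"
}

# Example: hide NOTICE messages by raising client_min_messages.
set_option_cmd "client_min_messages=warning" \
               "host=10.1.1.2 user=postgres dbname=master"
```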

The other options are identical to the PostgreSQL pgbench command. For
details, see the PostgreSQL
[pgbench](https://www.postgresql.org/docs/current/pgbench.html) documentation.

The complete list of options (pgd_bench and pgbench) follows.

#### Initialization options
- `-i, --initialize` — Invoke initialization mode.
- `-I, --init-steps=[dtgGvpf]+` (default `"dtgvp"`) — Run selected initialization steps.
- `d` — Drop any existing pgbench tables.
- `t` — Create the tables used by the standard pgbench scenario.
- `g` — Generate data client-side and load it into the standard tables, replacing any data already present.
- `G` — Generate data server-side and load it into the standard tables, replacing any data already present.
- `v` — Invoke `VACUUM` on the standard tables.
- `p` — Create primary key indexes on the standard tables.
- `f` — Create foreign key constraints between the standard tables.
- `-F, --fillfactor=NUM` — Set fill factor.
- `-n, --no-vacuum` — Don't run `VACUUM` during initialization.
- `-q, --quiet` — Quiet logging (one message every 5 seconds).
- `-s, --scale=NUM` — Scaling factor.
- `--foreign-keys` — Create foreign key constraints between tables.
- `--index-tablespace=TABLESPACE` — Create indexes in the specified tablespace.
- `--partition-method=(range|hash)` — Partition `pgbench_accounts` with this method. The default is `range`.
- `--partitions=NUM` — Partition `pgbench_accounts` into `NUM` parts. The default is `0`.
- `--tablespace=TABLESPACE` — Create tables in the specified tablespace.
- `--unlogged-tables` — Create tables as unlogged tables. (Note: Unlogged tables aren't replicated.)
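
The initialization options above can be combined as in the following sketch. It checks an `--init-steps` value against the documented step letters and then prints the resulting command. The helper names, the scale factor, and the connection string are illustrative assumptions.

```shell
# Sketch only: validate an --init-steps value and print an initialization command.
# valid_init_steps and init_cmd are hypothetical helpers, not part of pgd_bench.
valid_init_steps() {
  case "$1" in
    ''|*[!dtgGvpf]*) return 1 ;;  # empty, or contains an undocumented letter
    *) return 0 ;;
  esac
}

init_cmd() {
  # $1 = init steps, $2 = scale factor, $3 = conninfo string
  valid_init_steps "$1" || { echo "invalid init steps: $1" >&2; return 1; }
  printf 'pgd_bench -i -I %s -s %s "%s"\n' "$1" "$2" "$3"
}

# Default steps plus foreign keys (f), at an assumed scale factor of 10.
init_cmd dtgvpf 10 "host=10.1.1.2 user=postgres dbname=master"
```

Validating the step letters up front catches typos before any tables are dropped or created.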

#### Options to select what to run
- `-b, --builtin=NAME[@W]` — Add built-in script NAME weighted at W. The default is 1. Use `-b list` to list available scripts.
- `-f, --file=FILENAME[@W]` — Add script `FILENAME` weighted at W. The default is 1.
- `-N, --skip-some-updates` — Skip updates of `pgbench_tellers` and `pgbench_branches`. Same as `-b simple-update`.
- `-S, --select-only` — Perform SELECT-only transactions. Same as `-b select-only`.

#### Benchmarking options
- `-c, --client=NUM` — Number of concurrent database clients. The default is 1.
- `-C, --connect` — Establish new connection for each transaction.
- `-D, --define=VARNAME=VALUE` — Define variable for use by custom script.
- `-j, --jobs=NUM` — Number of threads. The default is 1.
- `-l, --log` — Write transaction times to log file.
- `-L, --latency-limit=NUM` — Count transactions lasting more than NUM ms as late.
- `-m, --mode=regular|camo|failover` — Mode in which to run pgd_bench. The default is `regular`.
- `-M, --protocol=simple|extended|prepared` — Protocol for submitting queries. The default is `simple`.
- `-n, --no-vacuum` — Don't run `VACUUM` before tests.
- `-o, --set-option=NAME=VALUE` — Specify runtime `SET` option.
- `-P, --progress=NUM` — Show thread progress report every NUM seconds.
- `-r, --report-per-command` — Latencies, failures, and retries per command.
- `-R, --rate=NUM` — Target rate in transactions per second.
- `-s, --scale=NUM` — Report this scale factor in output.
- `-t, --transactions=NUM` — Number of transactions each client runs. The default is 10.
- `-T, --time=NUM` — Duration of benchmark test, in seconds.
- `-v, --vacuum-all` — Vacuum all four standard tables before tests.
- `--aggregate-interval=NUM` — Aggregate data over NUM seconds.
- `--failures-detailed` — Report the failures grouped by basic types.
- `--log-prefix=PREFIX` — Prefix for transaction time log file. The default is `pgbench_log`.
- `--max-tries=NUM` — Max number of tries to run transaction. The default is `1`.
- `--progress-timestamp` — Use Unix epoch timestamps for progress.
- `--random-seed=SEED` — Set random seed (`time`, `rand`, or an integer).
- `--retry` — Retry transactions on failover, used with `-m`.
- `--sampling-rate=NUM` — Fraction of transactions to log, for example, 0.01 for 1%.
- `--show-script=NAME` — Show built-in script code, then exit.
- `--verbose-errors` — Print messages of all errors.

#### Common options:
- `-d, --debug` — print debugging output
- `-h, --host=HOSTNAME` — database server host or socket directory
- `-p, --port=PORT` — database server port number
- `-U, --username=USERNAME` — connect as specified database user
- `-V, --version` — output version information, then exit
- `-?, --help` — show help, then exit

- `-d, --debug` — Print debugging output.
- `-h, --host=HOSTNAME` — Database server host or socket directory.
- `-p, --port=PORT` — Database server port number.
- `-U, --username=USERNAME` — Connect as specified database user.
- `-V, --version` — Output version information, then exit.
- `-?, --help` — Show help, then exit.
60 changes: 26 additions & 34 deletions product_docs/docs/pgd/5/testingandtuning.mdx
@@ -1,6 +1,6 @@
---
title: Testing and Tuning PGD clusters
navTitle: Testing and Tuning
title: Testing and tuning PGD clusters
navTitle: Testing and tuning
---

You can test PGD applications using the following approaches:
@@ -29,26 +29,26 @@ of the application.
### pgd_bench

The Postgres benchmarking application
[`pgbench`](https://www.postgresql.org/docs/current/pgbench.html) has been
extended in PGD 5.0 in the form of a new applications: `pgd_bench`.
[`pgbench`](https://www.postgresql.org/docs/current/pgbench.html) was
extended in PGD 5.0 in the form of a new application: pgd_bench.

[`pgd_bench`](/pgd/latest/reference/testingandtuning#pgd_bench) is a regular command-line utility that's added to PostgreSQL's bin
directory. The utility is based on the Community PostgreSQL `pgbench` tool but
supports benchmarking CAMO transactions and PGD specific workloads.
[pgd_bench](/pgd/latest/reference/testingandtuning#pgd_bench) is a regular command-line utility that's added to the PostgreSQL bin
directory. The utility is based on the PostgreSQL pgbench tool but
supports benchmarking CAMO transactions and PGD-specific workloads.

Functionality of the `pgd_bench` is a superset of those of `pgbench` but
requires the BDR extension to be installed in order to work properly.
The functionality of pgd_bench is a superset of that of pgbench, but
pgd_bench requires the BDR extension to be installed to work properly.

Key differences include:

- Adjustments to the initialization (`-i` flag) with the standard
`pgbench` scenario to prevent global lock timeouts in certain cases
- `VACUUM` command in the standard scenario is executed on all nodes
- `pgd_bench` releases are tied to the releases of the BDR extension
and are built against the corresponding PostgreSQL flavour (this is
reflected in the output of `--version` flag)
pgbench scenario to prevent global lock timeouts in certain cases.
- The `VACUUM` command in the standard scenario is executed on all nodes.
- pgd_bench releases are tied to the releases of the BDR extension
and are built against the corresponding PostgreSQL flavor. This is
reflected in the output of the `--version` flag.

The current version allows users to run failover tests while using CAMO or
The current version allows you to run failover tests while using CAMO or
regular PGD deployments.

The following options were added:
@@ -93,24 +93,22 @@ transactions.

### Notes on pgd_bench usage

- When using custom init-scripts it is important to understand implications behind the DDL commands.
It is generally recommended to wait for the secondary nodes to catch-up on the data-load steps
before proceeding with DDL operations such as `CREATE INDEX`. The latter acquire global locks which
can't be acquired until the data-load is complete and thus may time out.

- No extra steps are taken to suppress client messages, such as `NOTICE`s and `WARNING`s emitted
by PostgreSQL and or any possible extensions including the BDR extension. It is the user's
responsibility to suppress them by setting appropriate variables (e.g. `client_min_messages`,
`bdr.camo_enable_client_warnings ` etc.).

- When using custom init-scripts, it's important to understand the implications of the DDL commands.
We generally recommend waiting for the secondary nodes to catch up on the data-load steps
before proceeding with DDL operations such as `CREATE INDEX`. These operations acquire global locks that
can't be acquired until the data load is complete and thus might time out.

- No extra steps are taken to suppress client messages, such as `NOTICE` and `WARNING` messages emitted
by PostgreSQL or any extensions, including the BDR extension. It's your
responsibility to suppress them by setting appropriate variables, such as `client_min_messages`,
`bdr.camo_enable_client_warnings`, and so on.

## Performance testing and tuning

PGD allows you to issue write transactions onto multiple master nodes. Bringing
those writes back together onto each node has a cost in performance.
those writes back together onto each node has a performance cost.

First, replaying changes from another node has a CPU cost, an I/O cost,
First, replaying changes from another node has a CPU cost and an I/O cost,
and it generates WAL records. The resource use is usually less
than in the original transaction since CPU overheads are lower as a result
of not needing to reexecute SQL. In the case of UPDATE and DELETE
@@ -135,7 +133,7 @@ If PGD is running slow, then we suggest the following:
1. Write a custom test script for pgd_bench, as close as you can make it
to the production system's problem case.
2. Run the script on one node to give you a baseline figure.
3. Run the script on as many nodes as occurs in production, using the
3. Run the script on as many nodes as occur in production, using the
same number of sessions in total as you did on one node. This technique
shows you the effect of moving to multiple nodes.
4. Increase the number of sessions for these two tests so you can
@@ -145,9 +143,3 @@ If PGD is running slow, then we suggest the following:

Use all of the normal Postgres tuning features to improve the speed
of critical parts of your application.
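
The baseline and multi-node steps in the list above could be planned with a sketch like the following. It prints one pgd_bench command per node and splits a fixed total number of sessions evenly across the nodes. The script name, session count, duration, host addresses, and the helper `plan_runs` are all assumptions for illustration. The sketch also assumes the total divides evenly by the node count.

```shell
# Sketch only: plan a single-node baseline and a multi-node run
# that use the same total number of sessions.
script="custom_test.sql"   # hypothetical custom test script
total_sessions=12          # assumed total number of sessions
duration=60                # assumed run time, in seconds

# plan_runs is a hypothetical helper: it prints one command per conninfo argument.
plan_runs() {
  [ "$#" -gt 0 ] || return 1
  per_node=$(( total_sessions / $# ))
  for node in "$@"; do
    printf 'pgd_bench -f %s -c %s -T %s "%s"\n' \
      "$script" "$per_node" "$duration" "$node"
  done
}

echo "# Baseline: all sessions on one node"
plan_runs "host=10.1.1.2 user=postgres dbname=master"

echo "# Same total number of sessions spread across three nodes"
plan_runs "host=10.1.1.2 user=postgres dbname=master" \
          "host=10.1.1.3 user=postgres dbname=master" \
          "host=10.1.1.4 user=postgres dbname=master"
```

For the multi-node comparison, the printed commands need to run at the same time, for example each in its own terminal.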





