postgresql pgbench Workload Simulation

pgbench is built into PostgreSQL.

To start using pgbench, you first need to initialize a test database. This creates a standard set of test tables, pre-populated with data for benchmarking:

pgbench -i -s 10 database_name

The `-s 10` flag sets the scale factor, which determines the size of the test data: each scale unit adds 100,000 rows to `pgbench_accounts`. A higher scale factor creates larger tables, simulating more intensive workloads.
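To sanity-check the result, count the generated rows; at scale 10, `pgbench_accounts` should contain 1,000,000 rows (assuming the same `database_name` as above):

psql -d database_name -c "SELECT count(*) FROM pgbench_accounts;"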

After initialization, you can run a benchmark test to evaluate how the database performs under load. For example:

pgbench -c 10 -t 1000 database_name

This command simulates 10 clients (-c 10), each executing 1,000 transactions (-t 1000), providing a realistic simulation of concurrent database activity. The tool then reports key performance metrics, such as:

  • Transactions per second (TPS): Indicates the throughput of the database.
  • Latency: Measures the response time for each transaction, helping identify potential delays.
  • Client-side and server-side performance: Allows you to evaluate both sides of the transaction process.
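To watch these figures while a run is still in progress, the `-P`/`--progress` option prints interim TPS and latency at a given interval in seconds, for example:

pgbench -c 10 -t 1000 -P 5 database_name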
Initialization can also be limited to specific steps (see the combined example after this list):

pgbench -i -I g -s 10

  • `-I init_steps` — each letter selects one step, executed in the order given:
    • d (Drop)
    • t (create Tables)
    • g or G (Generate data, client-side or server-side)
    • v (Vacuum)
    • p (create Primary keys)
    • f (create Foreign keys)
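A minimal combined example, assuming a database named `database_name`: the default step sequence is `dtgvp`, and swapping `g` for `G` (server-side generation, PostgreSQL 13 and later) avoids streaming every generated row from the client:

pgbench -i -I dtGvp -s 10 database_name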

#### Using partitioned tables

Initialization:

pgbench -i --partitions <integer> [--partition-method <method>]

By default, range partitioning is used:

$ pgbench -i -s 100 --partitions=10

To set the partitioning method explicitly:

$ pgbench -i -s 100 --partitions=10 --partition-method=hash
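Only the `pgbench_accounts` table is partitioned. To confirm the resulting layout (assuming `psql` access to the same database):

$ psql -c '\d+ pgbench_accounts'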

Test (one thread and one client per CPU core; pgbench connects to the database named after the current user unless one is given):

$ pgbench -j $(nproc) -c $(nproc) -T 30



#### Custom script
pgbench also supports advanced features like custom scripts, different transaction types, and adjustable workloads, making it highly customizable for various performance scenarios. Whether you're stress-testing your system or fine-tuning PostgreSQL for production workloads, pgbench provides valuable insights into how your database will perform under load. For example, the following script builds a query file, `queries.bench`, which is then fed to pgbench:

my_benchmark.sh:

```bash
#!/bin/bash

# jot is BSD/macOS; on Linux, "shuf -i 1-1000 -n 1" is an equivalent.
for run in {1..10}; do
  echo "select * from comments where blog_id = $(jot -r 1 1 1000);" >> queries.bench
done

echo "UPDATE comments SET view_count = 0 WHERE blog_id = 1;" >> queries.bench
```

Options used below:

* `-T`/`--time` — duration of the run in seconds
* `-j`/`--jobs` — number of worker threads
* `-c`/`--client` — number of concurrent clients
* `-M`/`--protocol` — query mode (here `prepared`)
* `-r`/`--report-latencies` — report average per-statement latency

pgbench --host localhost --port 5432 --username root --protocol prepared --time 60 --jobs 8 --client 8 --file queries.bench --report-latencies my_database_name


```bash
chmod +x my_benchmark.sh
./my_benchmark.sh
```
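Instead of pre-generating literal ids, pgbench's own scripting language (PostgreSQL 9.6+ syntax) can randomize values per transaction. A minimal sketch reusing the `comments` table from the example above; save it as, say, `comments.bench` and pass it with `--file` as before:

```
\set bid random(1, 1000)
SELECT * FROM comments WHERE blog_id = :bid;
```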



Another tool for PostgreSQL benchmarking is [HammerDB](https://github.com/TPC-Council/HammerDB). HammerDB ships with more realistic, TPC-derived workloads out of the box.


#### Initialize pgbench

./pgbench -i -p 5432 -d postgres

(Note: since PostgreSQL 17, `-d` specifies the database name; on older releases `-d` enables debug output and the database name is passed as the final positional argument.)

This creates the `pgbench_history`, `pgbench_tellers`, `pgbench_accounts`, and `pgbench_branches` tables used to run the benchmark transactions.
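The default transaction ("TPC-B (sort of)") that pgbench then runs against these tables looks like this, as documented in the PostgreSQL manual (`:aid`, `:bid`, `:tid`, and `:delta` are per-transaction random variables):

```
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
```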

```bash
./pgbench -c 10                 # 10 clients, default of 10 transactions per client
./pgbench -c 100 -T 300 -S -n   # -S: built-in read-only (SELECT-only) workload; -n: skip vacuuming the tables
```


By default, pgbench runs transactions against its own tables. If your workload is 50% reads and 50% writes (or, say, a 60:40 mix), you can create a script file with a set of statements that reproduces the expected ratio.

```
$ cat /tmp/bench.sql
INSERT INTO test_bench VALUES(1,'test');
INSERT INTO test_bench VALUES(1,'test');
SELECT * FROM test_bench WHERE id=1;
SELECT * FROM test_bench WHERE id=2;
```


./pgbench -c 100 -T 300 -n -f /tmp/bench.sql

(`-S` is dropped here: it would add the built-in SELECT-only script alongside the custom file and skew the intended 50:50 mix; the reads already come from the script itself.)
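For an uneven mix such as the 60:40 split mentioned above, pgbench (9.6 and later) can weight multiple script files directly with the `@weight` suffix; `read.sql` and `write.sql` are hypothetical files holding only the SELECTs and only the writes:

./pgbench -c 100 -T 300 -n -f /tmp/read.sql@60 -f /tmp/write.sql@40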


#### Fill the database with test data
* `-i` initialize and fill with data
* `-p port` The database server's port number
* `-s scale_factor` Multiply the number of rows generated by the scale factor. For example, -s 100 will create 10,000,000 rows in the `pgbench_accounts` table. Default is 1. 

time pgbench -p 6000 -i -s 1000
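Note that `-s 1000` generates 100,000,000 `pgbench_accounts` rows, a data set on the order of 15 GB including indexes, so timing the load (as above) is worthwhile. Afterwards the size can be checked with:

psql -p 6000 -c "SELECT pg_size_pretty(pg_database_size(current_database()));"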
