Scaling PostgreSQL: Beyond the Basics

When you start building applications, a single instance of PostgreSQL running with default configuration is usually more than enough. However, as your application scales and your transactional volume hits thousands of TPS (transactions per second), you'll quickly find that out-of-the-box configurations begin to buckle. In this post, we'll explore the critical components of scaling PostgreSQL: WAL management, connection pooling with pg_bouncer, and essential Linux kernel tuning.

PostgreSQL Primary-Replica with pg_bouncer Architecture Diagram
Typical high-availability PostgreSQL architecture with pg_bouncer.

Mastering the Write-Ahead Log (WAL)

The Write-Ahead Log (WAL) is the backbone of PostgreSQL's durability and replication. Every change is written to the WAL before it's applied to the actual data files. If your system crashes, PostgreSQL replays the WAL to recover.

As you scale, the default WAL configuration can become a bottleneck. You need to ensure that your wal_level is set appropriately (usually to replica or logical) and that your max_wal_size is large enough to prevent excessively frequent checkpoints.

# postgresql.conf
wal_level = replica
max_wal_size = 4GB
min_wal_size = 1GB
checkpoint_completion_target = 0.9

Setting the checkpoint_completion_target to 0.9 spreads the checkpoint I/O load over a longer period, preventing sudden spikes in disk utilization that can stall other queries.

Connection Pooling with pg_bouncer

PostgreSQL uses a process-per-connection model. Creating a new connection is expensive, and having thousands of idle connections consumes massive amounts of RAM. This is where a connection pooler like pg_bouncer becomes indispensable.

pg_bouncer sits between your application and the database. It maintains a pool of open connections to PostgreSQL and multiplexes incoming client connections over them. By utilizing transaction pooling mode, pg_bouncer can handle thousands of concurrent client connections with only a few dozen actual database connections.

A typical pgbouncer.ini snippet looks like this:

[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_port = 6432
listen_addr = *
auth_type = md5
auth_file = userlist.txt
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 50

Kernel Tuning (sysctl)

Even with a perfectly configured database and connection pooler, the underlying OS can limit performance. Linux is generally configured for general-purpose computing, not dedicated high-performance databases.

Tuning kernel parameters via sysctl is essential. Memory management, specifically transparent huge pages (THP) and swappiness, are critical areas.

  • vm.swappiness: You want the database to use RAM, not swap. Reduce swappiness to 1 (or 0, though 1 is generally safer in modern kernels).
  • vm.dirty_background_ratio: Adjusting how the kernel flushes dirty pages to disk can prevent massive I/O stalls during checkpoints.
# /etc/sysctl.d/99-postgres.conf
vm.swappiness = 1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 20
vm.overcommit_memory = 2

Furthermore, disabling Transparent Huge Pages (THP) at the OS level is highly recommended for PostgreSQL, as THP can lead to unpredictable latency spikes and memory bloat.

Conclusion

Scaling PostgreSQL requires a holistic approach. It’s not just about tweaking one configuration file; it’s about understanding the entire stack from the application connection behavior down to the Linux kernel's memory management. By mastering WAL tuning, utilizing connection poolers like pg_bouncer, and applying strict OS-level optimizations, you can ensure your data tier remains highly available and performant under massive load.