Database Connection Pooling Explained

A 20-pod Kubernetes deployment, each pod with a pool size of 20, is asking for 400 Postgres connections before a single user shows up. The default max_connections on Postgres is 100. The maths does not work, and the pager goes off at 09:00 on Monday when traffic returns. Connection pooling is one of those topics that looks like a tuning detail until it takes a service down.

This is a practical walk through what pools do, how to size them for Postgres and Node.js, and the failure modes that catch teams out: serverless cold starts, leaked connections, and the moment you realise you actually need PgBouncer.

Why connection pools exist

Opening a TCP connection to Postgres, performing the SSL handshake, and authenticating takes around 25 to 75 milliseconds on a healthy network. For an API that handles a 10 ms query, paying the connection cost on every request is absurd. Pools amortise that cost by keeping a small set of live connections ready to reuse.

There is a second, more important reason: Postgres pays a real price for each connection. Every backend is a separate OS process with its own memory footprint, typically 5 to 10 MB plus whatever work_mem and temp buffers it accumulates. A typical Postgres server can comfortably handle 100 to 300 active connections; past that, throughput drops sharply because the kernel and the database scheduler spend more time context switching than executing queries.

Brandur Leach’s well-known experiment on this is worth knowing: with a fixed CPU and a fixed working set, throughput peaks at a number of connections close to the core count, then degrades. More connections is not more parallelism. It is more queue.

How a pool actually works

Every pool has the same handful of moving parts:

  • min or idle: connections kept open even when nothing is happening
  • max: the hard ceiling on concurrent connections
  • acquire timeout: how long a caller will wait for a free connection before erroring
  • idle timeout: how long an idle connection sits before being closed
  • connection timeout: how long the pool waits for the database to accept a new connection

A request asks the pool for a client. If a free connection exists, it is handed back immediately. If not, and the pool is below max, a new connection is opened. If the pool is at max, the request waits in a FIFO queue until a connection is released or the acquire timeout fires.

[Diagram: three app requests (the third queued) hitting a pool with max 4 — two connections active, two idle — which in turn sits under the Postgres max_connections ceiling.]

The crucial point: max is a property of the pool, but the database has its own ceiling. If every app instance has its own pool, you have to count totals.

The pool-sizing formula, and what it actually means

The most cited rule of thumb comes from the HikariCP pool sizing wiki ↗:

connections = ((core_count * 2) + effective_spindle_count)

For a database server with 8 cores and SSD storage (effective spindle count ~1), the sweet spot is around 17 connections. That number is the total active connections the database can usefully handle, not the size of any individual client pool.

So if you have 4 application instances all pointing at the same Postgres, each pool should be roughly 17 / 4 = 4. Not 20. Not 50. Four.
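
If you prefer the arithmetic in code, a back-of-the-envelope helper looks something like this (the function is illustrative, not from any library):

function perInstancePoolMax(dbCores, appInstances, spindles = 1) {
  // HikariCP rule of thumb: the database's useful concurrency ceiling
  const dbSweetSpot = dbCores * 2 + spindles;
  // split that ceiling across every process that will hold its own pool
  return Math.max(1, Math.floor(dbSweetSpot / appInstances));
}

perInstancePoolMax(8, 4); // => 4, matching the worked example above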

Real-world starting points that work for most CRUD apps:

| Setup | Per-process pool | Notes |
| --- | --- | --- |
| Single Node app, small Postgres | 10 | Plenty for typical traffic |
| 2 to 4 Node instances, mid-size Postgres | 5 to 10 | Total stays under 50 |
| 10+ instances or autoscaling | use a connection proxy | Client-side pool of 5; PgBouncer in front |
| Serverless (Lambda, Workers) | 1 per instance | Always front with PgBouncer or RDS Proxy |

Always cross-check against the database’s max_connections. The Postgres connection settings docs ↗ describe the parameter; the default is 100, and most managed services let you raise it but charge in memory for doing so.
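
A quick way to check from any connected client (for example the pool defined in the next section):

async function serverMaxConnections() {
  const { rows } = await pool.query('SHOW max_connections');
  return Number(rows[0].max_connections); // e.g. 100
}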

A concrete Node.js example

Using node-postgres ↗:

import { Pool } from 'pg';

const pool = new Pool({
  host: process.env.PGHOST,
  user: process.env.PGUSER,
  password: process.env.PGPASSWORD,
  database: process.env.PGDATABASE,
  // sizing
  max: 10,
  min: 2,
  // timeouts (all in ms)
  connectionTimeoutMillis: 5000,
  idleTimeoutMillis: 30000,
  // let the event loop exit once every client in the pool is idle (handy for scripts and tests)
  allowExitOnIdle: true,
});

pool.on('error', (err) => {
  // background connections that error must not crash the app
  console.error('idle client error', err);
});

export async function getUser(id) {
  const client = await pool.connect();
  try {
    const { rows } = await client.query('SELECT * FROM users WHERE id = $1', [id]);
    return rows[0];
  } finally {
    client.release();
  }
}

The try / finally is non-negotiable. Forget the release() once and that connection is leaked for the life of the process. Wrap the pattern in a helper if your team keeps tripping over it.
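
One shape such a helper can take, in the same module as the pool above (withClient is a local name, not a node-postgres API):

// Callers never see the client, so they cannot forget to release it.
export async function withClient(fn) {
  const client = await pool.connect();
  try {
    return await fn(client);
  } finally {
    client.release();
  }
}

// getUser from above, rewritten on top of the helper:
export async function getUser(id) {
  const { rows } = await withClient((c) =>
    c.query('SELECT * FROM users WHERE id = $1', [id])
  );
  return rows[0];
}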

Common pitfalls

Too big a pool. The most expensive misconfiguration. A pool of 100 against a Postgres with max_connections = 100 means one app instance can monopolise the whole database, locking everyone else out. Throughput also drops because Postgres is now juggling 100 backends across 8 cores.

Unreleased connections. A handler that throws before release() runs leaks the connection forever. Symptoms: a slow climb in active connections, then sudden timeouts when the pool saturates. Fix with try / finally, an automatic-release helper, or your ORM’s transaction wrapper.

Long-running transactions. A transaction holding a row-level lock for 30 seconds is also holding a pool slot for 30 seconds. Long transactions effectively shrink your pool. Move heavy work outside transactions, and watch pg_stat_activity for long-lived idle in transaction rows.
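
A quick way to spot them from the application, reusing the pool from the earlier example (the 30-second threshold is arbitrary):

async function findStuckTransactions() {
  // sessions that have held a transaction open for more than 30 seconds without working
  const { rows } = await pool.query(`
    SELECT pid, usename, now() - xact_start AS xact_age, left(query, 80) AS last_query
    FROM pg_stat_activity
    WHERE state = 'idle in transaction'
      AND now() - xact_start > interval '30 seconds'
    ORDER BY xact_age DESC
  `);
  return rows;
}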

Serverless cold-start storms. A traffic spike spawns 200 Lambda containers, each opening its own pool, each at min connections. Postgres sees 200 to 1000 simultaneous handshakes and refuses connections. The fix is always the same: put a connection proxy between serverless and the database. Never let serverless functions speak directly to Postgres at scale.

Mismatched timeouts. If your acquire timeout is 30 seconds but your HTTP server’s request timeout is 10 seconds, the request errors before the pool even gives up trying. Timeouts should shrink as you move inward through the stack: the external client or load balancer waits longest, then the HTTP server, then the pool’s acquire timeout, then the query itself.
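
A sketch of timeouts that shrink as they go inward, using node-postgres and an Express-style server (values and framework are illustrative; in pg the acquire wait is governed by connectionTimeoutMillis, and statement_timeout is enforced by Postgres itself):

import { Pool } from 'pg';
import express from 'express'; // any HTTP framework works; Express is just for the sketch

const app = express();
const pool = new Pool({
  connectionTimeoutMillis: 5_000, // acquire: give up well before the HTTP layer does
  statement_timeout: 4_000,       // Postgres cancels any single statement after 4 s
});

const server = app.listen(3000);
server.requestTimeout = 10_000;   // HTTP: longer than acquire plus query
// whatever sits in front (load balancer, nginx, the browser) should wait longer still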

When to add PgBouncer or a proxy

Once the total of (instances * pool_max) approaches a third of max_connections, it is time for an external pooler. PgBouncer ↗ is the standard choice; on AWS the equivalent is RDS Proxy ↗.

PgBouncer offers three pooling modes; transaction pooling is the one that matters in practice:

| Mode | Backend reused after | Use when |
| --- | --- | --- |
| Session | Client disconnect | You need session features and clients are few |
| Transaction | Each transaction completes | Default for web apps; gives 100x+ multiplexing |
| Statement | Each statement | Rarely used; breaks multi-statement transactions |

Transaction pooling has real constraints. Prepared statements, SET outside a transaction, advisory locks held across queries, and LISTEN/NOTIFY all break because the next query may run on a different backend. Most ORMs need a flag to disable prepared statements when running through PgBouncer in transaction mode. Check your driver’s documentation for that.

A typical layout: 30 application pods, each with a pg pool of max: 5, all pointing at PgBouncer with default_pool_size = 25. The application sees up to 150 connection slots; Postgres only ever sees 25 backends. The numbers are made up but the ratio is the point.
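
On the PgBouncer side, that layout corresponds to something roughly like the following pgbouncer.ini (hosts, names, and values are placeholders):

[databases]
appdb = host=10.0.0.12 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction
default_pool_size = 25     ; backends opened to Postgres per database/user pair
max_client_conn = 500      ; client-side connections PgBouncer will accept
server_idle_timeout = 600  ; close idle backends after 10 minutes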

A sizing checklist before you ship

  1. Find your Postgres max_connections. Most managed services default to 100 to 200.
  2. Subtract 10 percent as headroom for admin tools and superuser slots.
  3. Divide by the maximum number of app instances you expect at peak (autoscaling included).
  4. Round down. That is your per-instance pool max.
  5. Set min to roughly 20 to 30 percent of max so you have warm connections.
  6. Set acquire timeout shorter than your HTTP request timeout.
  7. Add a metric for pool wait time and alarm on the p95 (a minimal sketch follows this list).
  8. If you cannot make the maths work, add PgBouncer.
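
For step 7, one minimal way to get that metric is to time pool.connect() yourself; recordPoolWaitMs below is a hypothetical hook into whatever metrics library you use:

async function acquireClient() {
  const start = process.hrtime.bigint();
  const client = await pool.connect();
  const waitedMs = Number(process.hrtime.bigint() - start) / 1e6;
  recordPoolWaitMs(waitedMs); // hypothetical: feed a histogram and alarm on the p95
  return client;
}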

If you are starting from scratch with a database choice, the trade-offs around concurrency, connection cost, and pooling vary a lot by engine. The companion piece how to choose the right database for your project walks through that decision. Once you are running on Postgres, the next two performance levers worth pulling are usually indexing and schema shape; the developer’s guide to database indexing and how to design a database schema that scales cover the practical side.

The short version

Pools are not a tuning knob you twist until the dashboard goes green. They are a contract between your app and the database about how many parallel clients each side can tolerate. Get that contract right and Postgres will run boringly. Get it wrong and the failure mode is a slow climb in connection counts, then a sudden cliff. Pick a small pool, keep an eye on acquire latency, and add a proxy before you actually need one.

Frequently asked questions

What is the right pool size for a Node.js app talking to Postgres?

Start small. For a single Node process, a pool of 10 to 20 is plenty for most CRUD workloads. The HikariCP guidance of (cores * 2) + spindle_count refers to the database server, not the client; pick a per-process pool size that, multiplied by the number of running processes, stays well under your Postgres max_connections.

Why does a bigger pool sometimes make the app slower?

More connections mean more context switches, more lock contention, and more memory pressure on Postgres. Past the point where active queries equal CPU cores, extra connections just queue inside the database instead of inside your app. Throughput plateaus and tail latency gets worse.

Do I need PgBouncer if I am already using a client-side pool?

If you have one or two app instances, no. If you have dozens of instances or a serverless platform that spawns short-lived workers, yes. PgBouncer multiplexes thousands of client connections onto a small number of real Postgres connections, which is the only way to keep max_connections sane at scale.

What is the difference between transaction and session pooling?

Session pooling assigns a backend connection to a client for the entire session. Transaction pooling reassigns the backend after every transaction, which is far more efficient but disallows session-scoped features like prepared statements, LISTEN/NOTIFY, and session-level SET.

How do I know my pool is too small?

Watch acquire wait times. If clients regularly wait more than a handful of milliseconds for a connection, the pool is the bottleneck. If acquire is fast but query latency is high, the bottleneck is the database itself and adding more pool slots will make things worse, not better.
