PgDog vs. Citus
Mar 20th, 2025
Lev Kokotov
I don’t like writing these. More often than not, they aren’t particularly neutral and state facts selectively. So, I’m going to write one focusing on PgDog’s architecture and point out some key differences with Citus. I will also suggest use cases where one is, in my opinion, better than the other. Here we go.
Processes vs. threads
The year is 2025 and we’re still talking about the merits of IPC and mutexes. Not a debate I’m that interested in, but just to clarify: PgDog uses threads. Well, to be exact, it uses tasks, which are executed on a multi-threaded asynchronous runtime called Tokio. This point is actually important, because Tokio’s concurrency is much, much higher than that of a simple multi-threaded process.
Applications that use asynchronous I/O organize their work into encapsulated units. These units are distributed evenly between CPU threads, and if one of them touches a socket (or any file descriptor), it’s immediately put to sleep until that FD has data in its input buffer or has finished writing data to the device. Meanwhile, the executing thread goes off to do other useful work, like running tasks that have finished waiting for I/O.
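To make that concrete, here is a minimal sketch of the task-per-connection pattern (not PgDog’s actual code, just the model described above): each accepted connection becomes a Tokio task, and every time that task awaits a read, the worker thread is free to run other tasks. It assumes Tokio with the full feature set and a placeholder listen address.

```rust
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpListener;

#[tokio::main(flavor = "multi_thread")]
async fn main() -> std::io::Result<()> {
    // Placeholder listen address, not a PgDog default.
    let listener = TcpListener::bind("127.0.0.1:6432").await?;

    loop {
        let (mut socket, _) = listener.accept().await?;

        // Each connection becomes a lightweight task. While `read` waits for
        // bytes on the socket, the task is parked and the worker thread picks
        // up other tasks instead of blocking.
        tokio::spawn(async move {
            let mut buf = [0u8; 4096];
            loop {
                match socket.read(&mut buf).await {
                    Ok(0) | Err(_) => break, // client disconnected or errored
                    Ok(n) => {
                        // Echo the bytes back as a stand-in for real work.
                        if socket.write_all(&buf[..n]).await.is_err() {
                            break;
                        }
                    }
                }
            }
        });
    }
}
```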
This architecture has dramatic effects on performance. For I/O-bound workloads, async applications are several orders of magnitude faster than just using threads or multiple processes. Tokio uses epoll on Linux (and kqueue on BSDs), so PgDog, theoretically, can handle hundreds of thousands (if not millions) of connections.
An I/O-bound application takes bytes from socket A and writes them to socket B. What happens in between requires CPU time, and the more of it you need, the less benefit you get from using an async runtime. Since PgDog is written in Rust, it’s pretty fast even for things that require the CPU, which we still try to keep to a minimum.
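In proxy terms, that data path can be sketched like this (again, not PgDog’s implementation): accept a client, connect to a hypothetical upstream Postgres at 127.0.0.1:5432, and let tokio::io::copy_bidirectional move bytes between the two sockets, waking the task only when one of them is ready.

```rust
use tokio::net::{TcpListener, TcpStream};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Placeholder addresses for the listener and the upstream server.
    let listener = TcpListener::bind("127.0.0.1:6432").await?;

    loop {
        let (mut client, _) = listener.accept().await?;

        tokio::spawn(async move {
            let Ok(mut server) = TcpStream::connect("127.0.0.1:5432").await else {
                return; // upstream unavailable, drop the client
            };

            // Shuttle bytes client <-> server until either side closes.
            // The CPU is only involved to copy buffers; everything else is waiting.
            let _ = tokio::io::copy_bidirectional(&mut client, &mut server).await;
        });
    }
}
```

A real pooler does a lot more in between, like authentication, parsing the Postgres protocol, and routing, and that’s exactly the CPU work we try to keep to a minimum.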
In practice, even with the simplest of proxies, like PgBouncer, the CPU gets used quite a bit. So my recommendation is to run these at no more than 50% CPU utilization and to add more instances to spread the load. You want to leave some headroom for outliers.
Citus uses a process-based architecture. It’s an extension that runs inside Postgres. It’s limited by how many connections Postgres itself can handle, which usually tops out at about 5,000. This is enough for a lot of deployments, but coming from the massively parallel world of container-orchestrated Rails apps, I’m used to serving 150,000+ connections. That requires an async pooler like PgDog, PgBouncer, pgcat, or many others.
Nothing is stopping you from placing a pooler in front of a Citus instance. However, this will only work if the majority of clients are idle, since process-based Postgres can’t serve more than one client connection per process.
In practice, Postgres concurrency is much lower than 5,000. The recommendation I’ve seen in the wild is to limit your connections to 2 per CPU core. So, even on the biggest of machines available from AWS, you can probably have up to 384 (192 vCPUs x 2) active connections.
Well, there you have it. The threads vs. processes debate settled, once and for all. The answer: it depends on what you’re doing. PgDog was written for high-throughput OLTP workloads, with lots of bytes passing through. A fast, async, multi-threaded runtime is exactly what we need. If your focus is OLAP, then Postgres will saturate your disks long before process-based concurrency becomes an issue. For such use cases, Citus can work just fine.
More is, technically, better
Since the theme of the day is horizontal scaling, let’s discuss how PgDog is deployed in production. PgDog is a stateless network proxy. This means that connections going through PgDog don’t require a specific instance of PgDog. They can be sent through any one of them, as long as the instances are identically configured. Synchronizing configuration can be done using any number of orchestration tools like Kubernetes, ECS, or just Bash and scp.
If you’ve used PgBouncer (or pgcat) before, you can deploy PgDog exactly the same way. PgDog instances don’t talk to each other, don’t use external storage beyond a couple config files, and only talk to databases when serving transactions and running health checks.
Conversely, the Citus architecture uses a single coordinator Postgres instance. Citus stores authoritative information about its worker nodes in metadata tables inside the coordinator. So, all requests that need to talk to the workers have to go through the coordinator. This works well for OLAP: most of the work is done by the workers anyway, like reading terabytes of data from tables and calculating aggregates. The coordinator just assembles the final result, which is quick.
For OLTP, the coordinator is bombarded with thousands of requests per second. Most queries are just fetching a single row from a couple tables. This requires the coordinator to do most of the work, while the data nodes are relatively idle, fetching hot data from shared buffers. If you’re looking to scale OLTP, a horizontally scalable coordinator is necessary.
It’s possible to put multiple PgBouncer instances in front of a Citus coordinator. This architecture, however, still has a choke point, the coordinator, and the same 2 × CPU cores limit on active connections.
So, what’s the catch?
As of this writing, PgDog has only basic support for cross-shard aggregates. That’s a feature we’ll continue to improve. In the meantime, if you have complex analytical queries, Citus will probably work better. If you’re using Postgres for online use cases, like lots of fast queries, PgDog can be an optimal, albeit still experimental, way to scale your database. This project is only a few months old, so keep that in mind when you’re considering your options.
That being said, if you like what you’ve seen so far and want to help build what we think is the next iteration of horizontal scalability for Postgres, get in touch. PgDog is looking for early design partners to help shape its future.