Gajus Kuizinas
Scaling databases is hard. However, perhaps the lowest hanging fruit is introducing read-only replicas.
A typical load balancing requirement is to route all "logical" read-only queries to a read-only instance. This requirement can be implemented in 2 ways:
Create two database clients (read-write and read-only) and pass them around the application as needed.
Use a middleware to assign query to a connection pool based on the query itself.
Option 1: Two distinct clients
The first option is preferable as it is the most explicit. However, it also has the most overhead to implement – adding this to an existing codebase could mean modifying thousands lines of code. The other downside of this approach is that it that it is easy to overlook when a read-only query is using a read-write database client, and while that isn't a major issue, it would be nice if the inverse it true – we would be warned if we are using the wrong connection for the query. That might be possible with a middleware that routes queries.
Option 2: Middleware
The second option routes queries based on the query itself and the context in which it was initiated. The following has to be true for us to safely route a query to a read-only instance:
It must be a SELECT query.
It mustn't be part of a transaction.
It mustn't execute a volatile function
The first two requirements are relatively straightforward to implement. The third one requires that we introduce a convention.
Handling volatile functions
A volatile function in PostgreSQL is any function that has side-effects (e.g. writes data) or that can change output even within a single table scan (e.g. random()). In our context, only the first part of volatile function is relevant. However, there is no way of telling just by looking at the query if it is volatile or not.
Therefore, we need to have a convention for marking such queries as not safe for read-only routing. There is more than one way of doing it, but we chose a very simple approach: a magic comment that marks query as volatile. Just add @volatile anywhere in the query (comment) to indicate that it has side-effects and have middleware check for that keyword.
We could also do the inverse here and use a convention to mark which queries are safe for routing to the read-only instances using @readOnly keyword. This way we could avoid checking the query for other keywords altogether and entirely rely on what the engineer instruments. However, this approach suffers from the same short-comings as the #1 option: Easy to miss opportunities and requires editing every query one-by-one.
Implementing query routing using Slonik
We are using Slonik PostgreSQL client in our project, and lucky for us, Slonik has a beforePoolConnection middleware that can be used to implement all 3 requirements. In short, beforePoolConnection is called just before the query is assigned a connection. All we have to do is create a second read-only pool and route queries to that read-only pool when all 3 conditions are satisfied. The code is mighty straightforward:
A few things that are worth clarifying about the above code:
We are not handling SELECT INTO
It is worth emphasizing that if a client initiates connection using pool#connect() or pool#transaction(), then connectionContext.query is null, i.e., we are going to fall back to using the main pool.
And that's it! Short and sweet. We now have an efficient query routing mechanism that reduced the load on our main instance and allows us to better scale our service.
2022