Recently, my team and I successfully completed a daunting task. We were tasked with solving a problem that involved enhancing our application’s security. At the time the front-end was sending raw SQL statements to a third-party API for data retrieval. We needed to create a new flow where the backend would inspect the query before proxying the request to the existing API.
By interjecting the backend into the flow, we were able to accomplish two key objectives:
- Restrict the input with additional business rules;
- Restrict access to data by applying object-level permissions.
For our efforts to be successful, it was paramount that the front-end would be open to adopting this new endpoint. We knew that the front-end could already send any arbitrary query to the existing API. Therefore, we concluded the only way we could get adoption is to ensure the interface could continue to support any arbitrary query.
To adopt this endpoint, the front-end was asked to change a raw SQL string into a structured JSON, such as:
SELECT * FROM my_table
{ "select": [{ "column": "*" }], "from": [{ "operator": "FROM", "tableName": "my_table" }] }
In this series of articles, you’ll learn about how we updated our Nestjs backend by adding a new POST endpoint having a fully documented, validated, and parsed input JSON body with complete SQL support.
In this first article, you’ll get a glimpse of the io-ts schema we created. In later articles, we’ll explore more details of the schema.
Along the way you’ll read about how we overcame a repeating challenge in this endeavor: mutual recursion.
What is Mutual Recursion?
If you’re unfamiliar with the term mutual recursion, here is a definition from Wikipedia:
In mathematics and computer science, mutual recursion is a form of recursion where two mathematical or computational objects, such as functions or datatypes, are defined in terms of each other.
You might have encountered recursion before and probably have written recursive algorithms. For example, you can implement the math operator modulo with a recursive function. Another common implementation of a function utilizing recursion is tree traversal.
A straightforward example of recursion is:
// hello callstack limit const foo = () => { foo(); }
The kind of recursion shown above is not too difficult to achieve: the function foo
is allowed to call itself because foo
is in scope.
Mutual recursion adds another wrinkle. With mutual recursion, two or more functions or data types not only refer to themselves but also to each other.
Now, consider the goal at hand and the complexity of the problem becomes apparent: define a JSON structure that supports complete SQL syntax.
Consider some of these possibilities when writing a SQL SELECT
statement. Such a statement could have a great deal of mutual recursion such as:
😁
SELECT
some columnsFROM
a table (maybe);
🤔 Some of those columns could be nested
SELECT
statements themselves;
😖 Some of those tables might also be nested
SELECT
statements;
😵 The statement might begin with a
WITH
clause, which is a collection of all the above, accessible anywhere in theSELECT
statement;
☠ By the way
WITH
also supports theRECURSIVE
keyword, allowing for internal recursion.
This is just the beginning of describing a SQL SELECT
statement.
While trying to implement this complex input, we ran into a limitation with the Nestjs framework.
Decorators Do Not Support Mutual Recursion
When we tried to define the structure of a SQL statement in our Nestjs data transfer objects (DTOs), we immediately encountered a limitation.
Nestjs controllers rely heavily on decorators for both API documentation and validation but unfortunately, decorators do not support mutual recursion.
The problem lies with how mutual recursion is solved: returning the reference to the mutually recursive class lazily requires a deferred function call. If you don’t do this, you inevitably encounter a runtime error trying to access a value in the temporal dead zone.
This limitation stems from the lack of support of decorators in class expressions.
Because of this limitation, we could not accurately express the entire SQL structure with using the traditional Nestjs decorator-based approach.
Validation Solution
It is unfortunate that the SQL endpoint could not use the existing and amazing class validators we had used elsewhere in the Nestjs application. We needed another validator that could handle mutual recursion.
Arriving at the final solution took a couple of attempts to get right.
Attempt One: Joi
Inspired by the Nestjs documentation, we initially tried using joi
to validate the incoming SQL request. We followed the existing examples by creating joi
schemas and validation pipes.
At first blush, the outcome was actually quite pleasant.
Since joi
has support for mutual recursion, we were able to express complex, recursive SQL structures in our validation schemas. Additionally, the validation errors were very succinct and helpful.
Unfortunately, this approach created a new problem.
Since we wanted to use the OpenAPI decorators to generate Swagger documentation, we still needed to create DTOs for our application.
The problem with the joi
approach is that if either the DTOs or the joi
validation schema needed to change, both needed to be kept in sync. To make matters worse, there was no compiler support to prevent drift between the two structures.
Attempt Two: io-ts
To ensure a more stable solution, we wanted to rely on the TypeScript compiler to flag the code in places where the two structures were out of sync.
To achieve this, we scrapped the joi
schema approach and implemented the validations with another npm package: io-ts
. The motivation for choosing io-ts
is the package’s clever use of TypeScript’s type assertions to provide both static type checking and runtime type checking.
In making this move, we defined the specifications of the SQL structure using the available io-ts
types and combinators. From this, we were then able to infer the TypeScript definition of those types. Finally, to achieve the goal of tying the DTOs to the validations, we applied the implements
keyword.
Although we still had to keep both the io-ts
schema and the DTOs in sync, the new workflow was more palatable. Any change introduced in the schema would cause the compiler to complain until the implementation was updated.
The Sub-Select Schema
When defining the interface for our SQL endpoint, we will start by introducing the structure of the sub-select.
🍿 In later articles, we’ll explore in greater detail the more interesting shapes such as
ExpressionItem
andFromItems
.
This structure describes the top-level properties of the standard SQL SELECT
clauses you’d expect to see: WITH
, SELECT
(optionally DISTINCT
), FROM
, WHERE
, GROUP BY
, HAVING
, ORDER BY
, LIMIT
and OFFSET
.
There are two parts to this snippet: the SubSelect
interface and the io-ts
runtime shape TSubSelect
, as shown below:
import * as t from 'io-ts'; export interface SubSelect { with?: WithItem[]; select: SelectItem[]; distinct?: boolean; from?: FromItems; where?: ExpressionItem[]; groupBy?: ExpressionItem[]; having?: ExpressionItem[]; orderBy?: OrderByItem[]; limit?: number | null; offset?: number | null; } export const TSubSelect: t.Type<SubSelect> = t.recursion('SubSelect', () => t.intersection([ t.type({ select: t.array(TSelectItem), }), t.partial({ with: t.array(TWithItem), distinct: t.boolean, from: TFromItems, where: t.array(TExpressionItem), groupBy: t.array(TExpressionItem), having: t.array(TExpressionItem), orderBy: t.array(TOrderByItem), limit: t.union([t.number, t.null]), offset: t.union([t.number, t.null]), }), ]), );
😵 There is a lot to unpack here! We’ll go through various examples of each of these structures to further explain what’s going on. To further understand this structure, begin reading the innermost properties and then expand outward.
If this is your first time seeing an io-ts
example, then the above is a great introduction for you. Basically, every foundational element you would need is present.
This table compares each property’s TypeScript type against its corresponding io-ts
type:
Property | Description | Typescript | io-ts |
---|---|---|---|
with | Array of CTEs | WithItem[] | t.array(TWithItem) |
select | Array of columns | SelectItem[] | t.array(TSelectItem) |
distinct | Adds DISTINCT | boolean | t.boolean |
from | Array of tables | FromItems | TFromItems |
where | Array of expressions | ExpressionItem[] | t.array(TExpressionItem) |
groupBy | Array of expressions | ExpressionItem[] | t.array(TExpressionItem) |
having | Array of expressions | ExpressionItem[] | t.array(TExpressionItem) |
orderBy | Array of sorts | OrderByItem[] | t.array(TOrderByItem) |
limit | Adds LIMIT | limit | null | t.union([t.number, t.null]) |
offset | Adds OFFSET | limit | null | t.union([t.number, t.null]) |
These define the properties of an object, and objects can be made with two io-ts
types:
t.type
Defines an object where each property is required;t.partial
Defines an object where each property is optional.
As you can see by our TypeScript interface, we want the select
property to be required but the remaining properties are optional. This is done in TypeScript by specifying ?
after a property.
To make a composite object that has a mixture of required and optional properties we create an intersection of the t.type
and t.partial
interfaces using t.intersection
.
Lastly, when creating a recursive structure, we must first define the interface and then we re-implement that in io-ts
by creating a function that can lazily return the shape of the SubSelect
. This is accomplished using the t.recursion
type.
😬 Unfortunately, both parts are necessary because
io-ts
cannot implement recursion without a hint.
While it is unfortunate that we cannot avoid some repetition, we can ensure that the two structures are equivalent. Specifying the t.Type<>
type annotation on the left-hand side of the assignment operation causes the TypeScript compiler to verify the value being assigned is compatible with that type. In this case, the type of the return value is inferred and then checked against the static template.
We’ve covered the top-level properties of the SubSelect
at a high level. We’ve also introduced the basics of io-ts
shapes. In later articles, we’ll explore more details about the SubSelect
. Finally, we’ll build on this foundation to create even more complex structures and achieve the FullSelect
.
Example
Here is an example of the types of queries the schema supports along with the generated SQL the parser produces.
This SQL SELECT
statement will return the estimated number of rows in a table. The query relies on physical disk usage attributes to compute an estimate.
SELECT CAST( CASE WHEN reltuples < 0 THEN NULL WHEN relpages = 0 THEN CAST(0 AS REAL) ELSE reltuples / relpages END * ( PG_RELATION_SIZE(oid) / CAST(CURRENT_SETTING('block_size') AS INTEGER) ) AS BIGINT ) FROM pg_catalog.pg_class WHERE oid = CAST('my_table.my_schema' AS REGCLASS)
To implement this query, the schema looks like this:
{ "select": [ { "operator": "CAST", "expression": { "operator": "*", "source": { "operator": "CASE", "when": [ { "where": { "operator": "LT", "source": { "column": "reltuples" }, "target": { "value": 0 } }, "then": { "value": null } }, { "where": { "operator": "EQ", "source": { "column": "relpages" }, "target": { "value": 0 } }, "then": { "operator": "CAST", "expression": { "value": 0 }, "dataType": "REAL" } } ], "else": { "operator": "/", "source": { "column": "reltuples" }, "target": { "column": "relpages" } } }, "target": { "operator": "()", "value": { "operator": "/", "source": { "functionName": "PG_RELATION_SIZE", "arguments": [{ "column": "oid" }] }, "target": { "operator": "CAST", "expression": { "functionName": "CURRENT_SETTING", "arguments": [{ "value": "block_size" }] }, "dataType": "INTEGER" } } } }, "dataType": "BIGINT" } ], "from": [ { "alias": "", "operator": "FROM", "allowedTableName": "pg_catalog.pg_class" } ], "where": [ { "operator": "EQ", "source": { "column": "oid" }, "target": { "operator": "CAST", "expression": { "value": "my_schema.my_table" }, "dataType": "REGCLASS" } } ] }
What’s Next?
Building the schema for the SQL endpoint was both challenging and rewarding. Along the way, we learned how to build immensely rich and detailed io-ts
schemas. We also learned more about previously unexplored SQL syntax.
The schema we created employed a number of techniques such as enumerations and discriminating unions, intersections of unions, and mutual recursion. It is highly extensible and quite fun to maintain.
What’s more, the front-end was able to consume this new API and convert all of their existing, raw SQL strings into structured JSON requests.
In the next article, we’ll dive deeper into the components that define the schema. In later articles, we’ll explore how the structure is parsed and the SQL string is created. I can tell you that we would never have built this level of capability if the parser was difficult to develop. When you see the parser, you’ll be amazed at how very little code was required to add all these features.
Even later articles will explore the highly detailed Swagger documentation we generated through Nestjs Swagger decorators.