tscs37's Blog

Graph-based Request Routing

HTTP Routers are largely based around string-matching with placeholders. When a request for /api/customers/1/billing comes in, a patterns like /api/:relation:/:id:/:relation: might match on the router, prompting it to call a very specific function with the mentioned parameters.

But what if it could be made easier? This is where graphs come in. Graphs are a interesting structure, more so when they are a data structure in a computer. When a computer needs to resolve dependencies for a software installation, you’re walking a type of graph, most likely a DAG, a directed, acyclic graph. The internet can be represented with a directed graph, arrows pointing to and from websites as they link to eachother.

Graph databases are most ideal to representing social relationships between humans and graphs can be used to represent the block history of bitcoin. Almost all tree structures we use in today’s databases are a type of graph by nature.

In short, graphs are everywhere. So why aren’t they in HTTP routers? The dependency relations of data and parameters should be representable in a graph without problem.

Dissecting Requests

The first step in this journey is to prepare requests by taking them apart.

A Request is turned into a collection of linear, directed graphs, meaning the graphs do not have any branches, similar to a chain. One of these graphs would be the URL, another the Parameters. Additionally it contains the other metadata but not as a graph since that would be wasteful. HTTP Headers don’t make good graphs since any consumer might need to touch some random HTTP header.

The request also contains response and metadata generated by the routing.

/api/customers/1/billing now turns into api -> customers -> 1 -> billing, a fairly simple graph.

Graph Consumers

The next step is to build a graph out of request consumers. A consumer is a node on the graph that consumes 0 or more nodes of the request part and will either finish the request or pass it on. The graph is defined by what node in the request graph allows it to be routed to it. If no more nodes can route the request, the dataset the nodes have produced is returned.

The node api for example will only accept requests where the first node is api. The node will set a metadata entry to prevent looping the request, otherwise requests like /api/customers/api/customers would return the customer dataset twice. If it finds the metadata entry set, it should finish the request.

The node customers will require the URL node customers and consume either no or additional nodes if these nodes are a number, assuming it is a customer ID. If it finds no customer ID, it will return the entire customer data set. If it finds the ID, it returns the specified customer. It will consume customer IDs until it hits a routable URL node. The interesting part is that now the node can still continue to route more data. The returned dataset is merely saved but not yet written to the HTTP response.

This is where the billing node comes in. It will react to a billing url route. It checks the result data so far, if it contains multiple customers, it returns the billing data for each customer, if it finds one, it returns only that one. If it finds, for example, a store relation, it can return all billing data for a store instead.

The billing node is simply a node that takes existing data and returns related billing data.

This is where the magic happens

Let’s introduce a new node; sortby. Sortby consumes 1 node, which is a JSON path specification to point to an element by which to sort. It returns the current dataset sorted by the specified path.

The sortby node is composable! /api/customers/1/2/3/billing/sortby/.customer.postcode will return all billing data for the customers 1, 2 and 3, sorted by postcode.

We can introduce other, similar nodes that perform other functions. Each node can compose data based on the URL query graph and already existing data. If you want sorting, you don’t need to tell each REST endpoint to sort the output, you introduce a single sort function that works on every dataset equally.

Graph based routing also allows one to compose middlewares fairly easily in a composable manner.

A node like auth can require user credentials and add user metadata to the request. Nodes following it can use this data and see that authentication was successfull, if not they can restrict to only public data, which might only be no data at all.

Conclusion

This is still a rather short post and I’m merely exploring the idea of how this could work. Consider it a “write it down to see how bad this idea is” post.

I think Graph-based routing is an idea worth exploring at some point but I imagine performance of these routers would be subpar.