N+1 problem in GraphQL
Last updated 10 October 2021
In my last article, I talked about the differences between GraphQL and REST and the scenarios where one is better than the other. I also mentioned that GraphQL solves the over-fetching issue that REST APIs have, which helps developers improve the performance of their applications. GraphQL, however, has a common problem of its own, called the n+1 problem. In this article, I will discuss the n+1 problem and how we can mitigate it.
The n+1 problem occurs because GraphQL executes a separate function, called a resolver, for each field. A REST endpoint, on the other hand, is served by a single handler. These additional resolvers put GraphQL at risk of making extra round trips to the database.
{
  authors {    # fetches the authors (1 query)
    id
    name
    posts {    # fetches the posts of each author (n queries, one per author)
      id
      name
    }
  }
}
In the above example, the server makes one database call to fetch the authors, then one database call per author (n calls in total) to fetch that author's posts. For example, if we have 100 authors, the server will make 101 database calls: one call to get the authors, and 100 calls to get the posts for each author. The cost of these extra database calls becomes massive when applied to a larger dataset. Let's say a single database call takes ten milliseconds and the calls run one after another. Then the total time needed to resolve this request will be 101 × 10 = 1,010 milliseconds. What happens if we have a million authors?
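The arithmetic above can be reproduced with a small simulation. This is only a sketch: `FakeDb`, `makeDb` and `resolveNaively` are hypothetical names, the "database" is a pair of in-memory arrays that counts the queries it serves, and each author is assumed to have exactly one post. It does not use a real GraphQL server; it only mimics the order in which naive resolvers would hit the database.

```typescript
interface Author { id: number; name: string }
interface Post { id: number; name: string; authorId: number }

// Hypothetical in-memory "database" that counts every query it serves.
class FakeDb {
  queryCount = 0;
  private authors: Author[];
  private posts: Post[];

  constructor(authors: Author[], posts: Post[]) {
    this.authors = authors;
    this.posts = posts;
  }

  // Stands in for: SELECT * FROM authors
  findAuthors(): Author[] {
    this.queryCount++;
    return this.authors;
  }

  // Stands in for: SELECT * FROM posts WHERE author_id = ?
  findPostsByAuthor(authorId: number): Post[] {
    this.queryCount++;
    return this.posts.filter((p) => p.authorId === authorId);
  }
}

// Builds a data set with `n` authors and one post per author (an assumption).
function makeDb(n: number): FakeDb {
  const authors: Author[] = [];
  const posts: Post[] = [];
  for (let i = 1; i <= n; i++) {
    authors.push({ id: i, name: `Author ${i}` });
    posts.push({ id: i, name: `Post ${i}`, authorId: i });
  }
  return new FakeDb(authors, posts);
}

// Mimics naive field-by-field resolution of the query:
// one call for `authors`, then one `posts` call per author.
function resolveNaively(db: FakeDb) {
  return db.findAuthors().map((author) => ({
    id: author.id,
    name: author.name,
    posts: db.findPostsByAuthor(author.id).map((p) => ({ id: p.id, name: p.name })),
  }));
}

const db = makeDb(100);
const result = resolveNaively(db);
const msPerQuery = 10; // latency assumed in the article, calls run sequentially

console.log(db.queryCount);              // 101
console.log(db.queryCount * msPerQuery); // 1010
```

Changing the argument of `makeDb` shows how the query count grows linearly with the number of authors.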
Also, in GraphQL, neither clients nor servers can predict how expensive a request is until the query is executed. GraphQL exposes only a single endpoint, and the potential size of incoming requests is unknown. In REST, on the other hand, costs are more predictable because there's only one trip per endpoint requested.
Facebook introduced a solution to the GraphQL n+1 problem by creating DataLoader, a batching library written for Node.js. With this technique, servers define batch loaders that describe how to group and fetch similar data. The benefits of batching are huge: the server no longer makes unnecessary database round trips, so processing the same request takes far less computing power.
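To make the batching idea concrete, here is a minimal hand-written loader. It is not the DataLoader library and does not reproduce its full behaviour (there is no caching or de-duplication of keys); `TinyLoader`, `createPostSource` and `demo` are hypothetical names, and the data is an in-memory array with one post per author. The sketch only illustrates the core trick: every `load` call made in the same tick is collected, and a single batch function then fetches all the keys at once.

```typescript
interface Post { id: number; name: string; authorId: number }

// Hypothetical data source: 100 authors with one post each, plus a query counter.
function createPostSource() {
  const allPosts: Post[] = Array.from({ length: 100 }, (_, i) => ({
    id: i + 1,
    name: `Post ${i + 1}`,
    authorId: i + 1,
  }));
  let queries = 0;

  // Batch function: one query for many authors.
  // Stands in for: SELECT * FROM posts WHERE author_id IN (...)
  // Returns one array of posts per key, in the same order as `authorIds`.
  async function batchPostsByAuthor(authorIds: number[]): Promise<Post[][]> {
    queries++;
    const wanted = new Set(authorIds);
    const byAuthor = new Map<number, Post[]>();
    for (const post of allPosts) {
      if (!wanted.has(post.authorId)) continue;
      const list = byAuthor.get(post.authorId) ?? [];
      list.push(post);
      byAuthor.set(post.authorId, list);
    }
    return authorIds.map((id) => byAuthor.get(id) ?? []);
  }

  return { batchPostsByAuthor, queryCount: () => queries };
}

type Pending<K, V> = {
  key: K;
  resolve: (value: V) => void;
  reject: (err: unknown) => void;
};

// Minimal loader: queues keys, then resolves them all with one batch call.
class TinyLoader<K, V> {
  private pending: Pending<K, V>[] = [];
  private batchFn: (keys: K[]) => Promise<V[]>;

  constructor(batchFn: (keys: K[]) => Promise<V[]>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.pending.push({ key, resolve, reject });
      // First key of a new batch: schedule one flush that runs after the
      // current synchronous code has finished queuing its keys.
      if (this.pending.length === 1) {
        Promise.resolve().then(() => this.flush());
      }
    });
  }

  private flush(): void {
    const batch = this.pending;
    this.pending = [];
    this.batchFn(batch.map((p) => p.key)).then(
      (values) => batch.forEach((p, i) => p.resolve(values[i])),
      (err) => batch.forEach((p) => p.reject(err)),
    );
  }
}

// The `posts` lookup is still requested once per author,
// but all 100 requests end up in a single query.
async function demo(): Promise<number> {
  const source = createPostSource();
  const postLoader = new TinyLoader<number, Post[]>(source.batchPostsByAuthor);
  const authorIds = Array.from({ length: 100 }, (_, i) => i + 1);
  await Promise.all(authorIds.map((id) => postLoader.load(id)));
  return source.queryCount();
}

demo().then((count) => console.log(count)); // 1
```

Together with the one query for the authors, the request from the example would then need 2 database calls instead of 101. In a real server, a loader like this is usually created per request so that batches from different users are not mixed.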