How to avoid N+1 Query

Bakti Pratama
3 min readFeb 18, 2021

Why is it so slow???” One of the top reasons for slow performance in backend is N+1 query. It is a query to load the main model and N queries for loading the associated model, inefficient query, and a common performance anti-pattern.

It is much faster to issue 1 query which returns 100 results than to issue 100 queries which each return 1 result.

Let’s take a look at this example with rails active record. Let’s say you have two tables on your DB, cars and variants, where an entity in cars can have many variants.

class Car < ActiveRecord::Base

has_many :variants
...
end
class Variant < ActiveRecord::Base

belongs_to :car
...
end

You want to process all the variants for all of the BMW brand from your cars, you may be doing something like this:

cars = Car.where(brand: 'BMW')
cars.each do |car|
variants = car.variants
# Process....
end
> SELECT `cars`.* FROM `cars` WHERE `cars`.`brand` = 'BMW'
> SELECT `variants`.* FROM `variants` WHERE `variants`.`car_id` = 1 > SELECT `variants`.* FROM `variants` WHERE `variants`.`car_id` = 2
> SELECT `variants`.* FROM `variants` WHERE `variants`.`car_id` = 3
> SELECT `variants`.* FROM `variants` WHERE `variants`.`car_id` = 4
> SELECT `variants`.* FROM `variants` WHERE `variants`.`car_id` = 5
..............

The above program is inefficient and N+1 query. This is still okay if you have a small number of cars but if you have a large number, let’s say 1000 BMW cars (I wish), then you will make 1000 queries to the variants table.

How to fix it

Batching the query
For the above example, you can load all variants in batch instead of querying them for each car.

car_ids = Car.where(brand: 'BMW').pluck(:id)
variants = Variant.where(car_id: car_ids)
....

From the above code, you will always make 2 queries no matter how many cars do you have. It’s better than previous one, it’s no longer n+1 query, but it can still lead to a problem when the size of car_ids is too big. You may want to consider to limit the batch size, for example:

car_ids = Car.where(brand: 'BMW').pluck(:id)
car_ids.in_groups_of(100).each do |grouped_car_ids|
variants = Variant.where(car_id: grouped_car_ids)
....
end

Using include method

ActiveRecord has a method called :includes which loads associated records in advance and limits the number of SQL queries made to the database. Depending on what your query is, :includes will use either the ActiveRecord method :preload or :eager_load.

Preload

Car.includes(:variants).where(brand: 'BMW')> SELECT `cars`.* FROM `cars` WHERE `cars`.`brand` = 'BMW'
> SELECT `variants`.* FROM `variants` WHERE `variants`.`car_id` in (1,2,3,4,5,..)

Yes, it makes the same result with the batching on the above section. Preload always generates two SQLs. It fetches the records for the leading model first (cars) and then load the associated records with the leading model (variants).

If your main model is too large, you can still do batching with include. For example:

Car.includes(:variants).where(brand:'BMW')
.in_batches(of:100) do |grouped_car_ids|
....
end

Eager Load

Car.includes(:variants).where(variants:{type:"vehicle"})

> SELECT ... FROM "cars" LEFT OUTER JOIN "variants" ON "variants"."car_id" = "cars"."id" WHERE (variants.type = "vehicle")

Eager loading loads all associations in a single query using LEFT OUTER JOIN. The include method will be forced to use eager load when the where clause is using an attribute from the associated table.

Highlight N+1 query during development

Code review can help to catch the N+1 issue before it goes to prod but sometimes it’s not straightforward to catch. Thanks to bullet gem, it can help you to track and report inefficient queries.

You can integrate bullet gem with your rails project and enable it in test mode. It will print N+1 query into log/bullet.log

Please check here for more details

--

--

Bakti Pratama

Software Engineer | Architect | Manager | A father of two