Using EntityGraphs to solve the N+1 Query problem

The application performance represents the biggest challenge that developers focus on the most, in order to build applications and websites with a good user experience. This allows delivering an appropriate product to users. Since the user experience is often related to the average response time, it becomes necessary to minimise it, and usually the bottleneck is the number of queries sent to the database.

Regarding backend performance, loading related entities with JPA and Hibernate is a convenient feature, but it's also the most common cause of performance problems, which might be triggered by one of the following factors :

  • related entities are eagerly loaded, even if they are not needed or.
  • related entities are lazily loaded which creates N+1 query issue if they are needed.

In this article we're going to explain how to prevent N+1 query problem using Spring Data JPA EntityGraphs.

What is N+1 Query Problem?

The N+1 query problem is a performance anti-pattern that happens when an ORM (Object Relational Mapping) like Hibernate executes N additional query statements to fetch the same data that could have been retrieved when executing the primary query. The query count is N+1, with N being the number of queries for every result of the initial query.

How to identify N+1 Query Problem?

To detect this problem in your application, Hibernate provides tracing options that enable SQL logging in the console. By analysing these logs we can easily detect if hibernate is issuing N+1 queries for a given call.

To activate these logs we can add the following settings to your application.properties file :

#Show all queries
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.format_sql=true
#Turn Statistics on
spring.jpa.properties.hibernate.generate_statistics=true
logging.level.org.hibernate.stat=debug

Example

Consider the following model, a product can have a category and list of stores associated with it. Below is a simplified entity for a product.

@Data
@Entity
public class Product {
    @Id
    @GeneratedValue
    private Long id;

    private String productName;

    @ManyToMany(fetch = FetchType.LAZY)
    private Set<Stores> stores;
    
    @ManyToOne(fetch = FetchType.LAZY)
    private Category category;

}
Product.java

Imagine a user interface that displays products and its availability in all the stores. For that we'll need to fetch a list of Products and their Stores at the same time.

However, because Product and Store have a ManyToMany relationship in the model, the default fetching style is Lazy. We are unable to get stores from the Product directly.

To get stores, the first thing that comes to mind is to execute one request to the database to get a list of Products, and then execute another request to the database for each Product to retrieve the product's stores. As is shown in the following example :

@Repository
public interface ProductRepository extends JpaRepository<Product, Long> {
    List<Product> findAll();
}
ProductRepository.java

And bellow is the logic of fetching data :

@AllArgsConstructor
@Service
public class ProductService {
    private final StoreRepository storeRepository;
    private final ProductRepository productRepository;

    public List<ProductDTO> getProductsWithStorePosition() {
        List<ProductDTO> productDTOList = new ArrayList<>();
        List<Product> productList = productRepository.findAll();
        
        for (Product product : productList) {
            List<Store> stores = storeRepository.getStoreByProducts_Id(product.getId());
            productDTOList.add(ProductDTO.buildProductDTO(product, stores));
        }
        
        return productDTOList;
    }

}
ProductService.java

After inspecting the logs, we find that we first fetch a list of Products :

Hibernate:
select
product0_.id as id1_7_,
product0_.category_id as category3_7_,
product0_.product_name as product_2_7_
from
product product0_

Then a series of queries are executed to get a list of Stores for each Product, one of which is shown below :

Hibernate:
select
store0_.id as id1_9_,
store0_.description as descript2_9_,
store0_.position as position3_9_
from
store store0_
left outer join
store_products products1_
on store0_.id=products1_.store_id
left outer join
product product2_
on products1_.products_id=product2_.id
where
product2_.id=?

We can fix this problem by fetching each list of products with their stores. To do that we can use Join fetch, explained in the next section.

Join fetch

To avoid the N+1 query problem, we can use JOIN FETCH. The FETCH keyword of the JOIN FETCH statement is JPA-specific. It instructs the persistence provider to not only join the two database tables contained in the query, but also initialize the association on the returned entity. It works with both JOIN and LEFT JOIN statements.
Creating a JPQL query with a join fetch clause to retrieve products with stores :

@Repository
public interface ProductRepository extends JpaRepository<Product, Long> {
    @Query("SELECT p FROM Product p inner JOIN FETCH p.stores")
    List<Product> findAll();

}
ProductRepository.java (using join fetch)
@AllArgsConstructor
@Service
public class ProductService {
    private final ProductRepository productRepository;

    public List<ProductDTO> getProductsWithStorePositionUsingJoinFetch() {
        List<ProductDTO> productDTOList = new ArrayList<>();
        List<Product> productList = productRepository.findAll();
        
        for (Product product : productList) {
            Set<Store> stores = product.getStores();
            productDTOList.add(ProductDTO.buildProductDTO(product, stores));
        }
        
        return productDTOList;
    }
}
ProductService.java (using join fetch)

Now when we call the findAll method, it runs a single SELECT query with JOINs :

Hibernate:
select
product0_.id as id1_7_0_,
store2_.id as id1_9_1_,
product0_.category_id as category3_7_0_,
product0_.product_name as product_2_7_0_,
store2_.description as descript2_9_1_,
store2_.position as position3_9_1_,
stores1_.product_id as product_1_8_0__,
stores1_.stores_id as stores_i2_8_0__
from
product product0_
inner join
product_stores stores1_
on product0_.id=stores1_.product_id
inner join
store store2_
on stores1_.stores_id=store2_.id

Although performing join queries when necessary is sufficient, in a large-scale enterprise project, the number of such methods will almost certainly expand, and there will likely be many almost identical JPQL queries required if not for the fetched associations. As a result, EntityGraphs are a better fit in these situations.

JPA EntityGraphs

EntityGraphs are introduced in JPA 2.1 and used to allow partial or specified fetching of objects. When an entity has references to other entities we can specify a fetch plan by EntityGraphs in order to determine which fields or properties should be fetched together.

- Fetch Graph vs Load Graph

There are two types of EntityGraphs, Fetch and Load, which define if the entities not specified by attributeNodes of EntityGraphs should be fetched lazily or eagerly.

FETCH: It is the default graph type. When it is selected, the attributes that are specified by attribute nodes of the entity graph are treated as FetchType.EAGER and attributes that are not specified are treated as FetchType.LAZY.
LOAD: When this type is selected, the attributes that are specified by attribute nodes of the entity graph are treated as FetchType.EAGER and attributes that are not specified are treated according to their specified or default FetchType.

Below is an example of using EntityGraphs with Load fetch type :

@Repository
public interface ProductRepository extends JpaRepository<Product, Long> {
    @EntityGraph(attributePaths = {"stores"}, type = EntityGraph.EntityGraphType.LOAD)
    List<Product> findAll();
}
ProductRepository.java (using EntityGraph)
@AllArgsConstructor
@Service
public class ProductService {
    private final ProductRepository productRepository;
  
    public List<ProductDTO> getProductsWithStorePositionUsingEntityGraph() {
        List<ProductDTO> productDTOList = new ArrayList<>();
        List<Product> productList = productRepository.findAll();
        
        for (Product product : productList) {
            Set<Store> stores = product.getStores();
            productDTOList.add(ProductDTO.buildProductDTO(product, stores));
        }
        
        return productDTOList;
    }
}
ProductService.java (using EntityGraph)

And, when executing the above findAll method, Hibernate generates the following SQL SELECT query :

Hibernate:
select
product0_.id as id1_7_0_,
store2_.id as id1_9_1_,
product0_.category_id as category3_7_0_,
product0_.product_name as product_2_7_0_,
store2_.description as descript2_9_1_,
store2_.position as position3_9_1_,
stores1_.product_id as product_1_8_0__,
stores1_.stores_id as stores_i2_8_0__
from
product product0_
inner join
product_stores stores1_
on product0_.id=stores1_.product_id
inner join
store store2_
on stores1_.stores_id=store2_.id

For more complex and reusable graphs we can describe a fetch plan with its paths and boundaries with @NamedEntityGraph annotation in the entity class.

- NamedEntityGraph

The definition of a named entity graph is done by the @NamedEntityGraph annotation at the entity level. It defines a unique name and a list of attributes (the attributeNodes) that shall be loaded. The following example shows the definition of the entity graph "graph.Product.stores" which will load the list of Stores of a Product.

@Data
@Entity
@NamedEntityGraph(name = Product.WITH_STORES_GRAPH,
        attributeNodes = @NamedAttributeNode("stores"))
public class Product {
    public static final String WITH_STORES_GRAPH = "graph.Product.stores";
  
    @Id
    @GeneratedValue
    private Long id;

    private String productName;

    @ManyToMany(fetch = FetchType.LAZY)
    private Set<Store> stores;

    @ManyToOne(fetch = FetchType.LAZY)
    private Category category;
}
Product.java (using NamedEntityGraph)

And we can assign the entity graph to repository methods as we like :

@Repository
public interface ProductRepository extends JpaRepository<Product, Long> {
    @EntityGraph(value = Product.WITH_STORES_GRAPH, type = EntityGraph.EntityGraphType.LOAD)
    List<Product> findAll();
}
ProductRepository.java (using NamedEntityGraph)

- Entity SubGraphs

For the sake of the article, let's consider this code snippet that represents a store entity which has a ManyToOne relation with a lazily fetched City.

@Data
@Entity
public class Store {
    @Id
    @GeneratedValue
    private Long id;
    private String storeName;
    private String position;

    @ManyToOne(fetch = FetchType.LAZY)
    private City city;
}
Store.java

Let's suppose we want to retrieve a product with a list of stores and their city using EntityGraph to avoid N+1 query problem. We can specify nested relations in the “attributePaths” using @EntityGraph in a specific repository method by separating them like so :

@Repository
public interface ProductRepository extends JpaRepository<Product, Long> {
  
   @EntityGraph(attributePaths = {"stores.city"}, type = EntityGraph.EntityGraphType.LOAD)
   List<Product> findAll();
}
ProductRepository.java

In case we opted for NamedEntityGraph we can achieve the same results using the subgraph attribute of this annotation.

We can use subgraphs to create complicated EntityGraphs with several levels. A subgraph is an entity graph, that is part of another entity graph or entity subgraph.

@Data
@Entity
@NamedEntityGraph(name = Product.WITH_STORES_AND_CITY_GRAPH,
        attributeNodes = {
                @NamedAttributeNode(
                        value = "stores",
                        subgraph = "city-subgraph"
                )
        },
        subgraphs = {
                @NamedSubgraph(
                        name = "city-subgraph",
                        attributeNodes =
                                {
                                        @NamedAttributeNode("city")
                                }
                )
        }
)
public class Product {
    public static final String WITH_STORES_AND_CITY_GRAPH = "graph.Product.stores.city";
    @Id
    @GeneratedValue
    private Long id;

    private String productName;

    @ManyToMany(fetch = FetchType.LAZY)
    private Set<Store> stores;

    @ManyToOne(fetch = FetchType.LAZY)
    private Category category;
}
  • Subgraph : attribute to specify the subgraph name for a particular entity graph attribute.
  • Subgraphs : attribute of the @NamedEntityGraph annotation to specify the subgraph details.
  • @NamedSubgraph : annotation specifies the subgraph details. The name attribute is the name of the subgraph.
  • @NamedAttributeNode : annotation to add the attributes to the subgraph.

Conclusion

As we've seen, JPA's EntityGraphs provides a relatively simple and flexible API to boost the performance of applications with complex domain associations that require heavy database interactions and therefore makes for an elegant yet powerful solution to the N+1 query issue.

In this article, we covered how to use annotations to generate EntityGraphs and entity subgraphs. You can also use EntityManager API to define EntityGraphs by calling the createEntityGraph() method and then integrating it as a hint to Criteria and JPQL queries through the setHint() method.