
Hibernate Tips: Efficient Usage of Bag for Association

More and more people are adopting Hibernate for database development, because its ORM implementation lets developers manipulate rows in a relational database in an object-oriented way. However, using Hibernate is not as easy as it looks. Let’s look at a simple NHibernate example that adds two new children to a father loaded from the session:

Sample Hibernate Code

            var fatherId = Guid.NewGuid(); // placeholder: in practice, the id of an existing Person

            var father = Session.Load<Person>(fatherId);

            var boy = new Person("terry");

            var girl = new Person("mary");

            father.Children.Add(boy);

            father.Children.Add(girl);

            Session.Save(father);

 

And the mapping file for the class Person is as follows:

 

Sample Mapping File

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" default-access="field" assembly="Domain.Test" namespace="Domain.Test">

  <class name="Person" table="person">

    <id name="id"  column="personId" type="Guid" >

      <generator class="guid"/>

    </id>

    <bag name="children" inverse="false">

      <key column="id"/>

      <one-to-many class="Domain.Test.Person, Domain.Test"/>

    </bag>

  </class>

</hibernate-mapping>

 

Run it and it works fine! However, if you attach a SQL profiler to your database, you will find that Hibernate first issues a delete statement to remove all the children and then adds them back to the father one by one. That means that when you add two children, Hibernate issues one delete statement and two insert statements! As the father’s collection of children keeps growing, the number of SQL statements issued for a simple add grows with it, which is intolerable.

 

The reason Hibernate needs to delete the children and add them all back is that the association between father and children is mapped as a bag whose inverse attribute is false! A bag is unordered and may contain duplicates. It is exactly these two properties that leave Hibernate no way to tell which objects in the collection were newly added and which were updated or deleted. The only correct way for Hibernate to handle a change is therefore to remove all the old children and save everything currently in the collection, which hurts performance a lot! What’s worse, bags do not combine well with fetch joins, which are commonly used to resolve the N+1 problem. The recommended mapping for this usage is a set instead of a bag. A set does not allow duplicates, so Hibernate can tell which objects were newly added and issue only the necessary insert statements.
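
As a rough sketch of the set recommendation, the same association could be mapped as shown below. The fragment reuses the key column and class name from the sample mapping. It assumes the children field is declared as an ISet<Person> (from Iesi.Collections in older NHibernate versions, from System.Collections.Generic in NHibernate 4 and later), which this article does not show.

```xml
    <!-- Sketch only: the same association, mapped as a set instead of a bag -->
    <set name="children" inverse="false">
      <key column="id"/>
      <one-to-many class="Domain.Test.Person, Domain.Test"/>
    </set>
```

Because a set relies on element equality, Person would likely need a sensible Equals/GetHashCode if instances are ever compared across sessions.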

 

After this analysis, using a bag looks like a really bad practice in Hibernate, so why does it still exist? Actually, if you use a bag correctly, it can also improve your performance. The change to make is this:

 

Change inverse to true

    <bag name="children" inverse="true">

      <key column="id"/>

      <one-to-many class="Domain.Test.Person, Domain.Test"/>

    </bag>

 

Changing inverse to true means that Hibernate no longer persists modifications to the father’s children collection. In other words, the association between father and children is maintained by the children. Let’s look at the code that adds children to the father:

 

Adding Children to Father

            father.Children.Add(boy);

            father.Children.Add(girl);

 

Executing father.Children.Add(boy) no longer requires Hibernate to fetch the contents of the Children collection from the database before the add, which of course improves performance. This efficient way of using a bag effectively makes the father’s children collection “read only”: after the children are loaded from the database and assigned to the father, further changes to the collection are not persisted by Hibernate. So if you want to add children, you should assign each child’s parent property instead:

 

           

Assign Father to Children

            boy.Parent = father;

            girl.Parent = father;
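
These assignments only reach the database if the Parent side is itself mapped. The sample mapping does not show that side, so the fragment below is a sketch under assumptions: a field named parent (a hypothetical name) on Person, mapped to the same column as the bag's key element, since both sides of a bidirectional association have to refer to the same foreign key column.

```xml
    <!-- Sketch only: the owning side of the association; "parent" is an assumed field name -->
    <many-to-one name="parent" column="id" class="Domain.Test.Person, Domain.Test"/>
```

This element would sit inside the class element for Person, next to the bag.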

 

To summarize the use of bag:

  1. Always try set for association mapping first.
  2. Use bag only when the collection is expected to be “read only” for Hibernate.
  3. Remember to set inverse="true".
  4. Let the objects on the other side maintain the association.
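
One way to follow the last two points without scattering Parent assignments through the calling code is a small helper on the entity. The class below is a hypothetical sketch, not the Person class used in this article (which is never shown): the field names, the protected constructor and the AddChild method are all assumptions made for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical sketch of the Person entity; the article shows only how it is used.
public class Person
{
    // Lowercase field names line up with default-access="field" in the mapping.
    private IList<Person> children = new List<Person>();
    private Person parent;   // assumed field behind the Parent property
    private string name;

    protected Person() { }   // parameterless constructor for NHibernate

    public Person(string name) { this.name = name; }

    public virtual string Name { get { return name; } }
    public virtual IList<Person> Children { get { return children; } }
    public virtual Person Parent
    {
        get { return parent; }
        set { parent = value; }
    }

    // Hypothetical helper: sets the side that is persisted when inverse="true"
    // (Parent) and keeps the in-memory collection consistent with it.
    public virtual void AddChild(Person child)
    {
        child.Parent = this;
        Children.Add(child);
    }
}
```

With such a helper, father.AddChild(boy) would replace the separate boy.Parent = father assignment while still leaving the association to be maintained by the child.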