Books




Articles‎ > ‎

Reducing SQL Calls

Why reduce the number SQL calls?

You may have a naive implementation which does too much work.  You may have poorly designed code.  However, the best reason is because you profiled the code and the large number of SQL calls is your current bottleneck.

Measure Twice, Cut Once

Like carpentry, when doing any sort of optimization you should profile your code before you make any changes.  Then you should actually understand what the profile is telling you.  Get context for the profile by asking questions. What are the metrics from you database server telling you? Are there simply too many calls?  Are there a few calls that take a really long time, because of IO, CPU?  How many times is the code path that we are profiling executed?  And my favorite question, what could be missing from the metrics?

Anecdote

A while back I had been trying to find an explanation for why every dynamic web page with a database component seemed uniformly slower than I expected.  It felt like there was a lot of work going on, even in the pages where there should have had been very little.

I figured I could shortcut the profiling by just looking at the database since it should have a perfect picture of all the database calls.  I got a list of SQL queries, but nothing in the list looked unusual.  I got operating system metrics and it seems like both CPU and overall network usage is high. I then look at the application server and saw network usage is also higher than expected.

I decided to bite the bullet and profile a page.  I process the raw call profile into a summary roll-up of SQL calls getting total counts for each SQL operation.  I immediately saw the problem was the number of database pings.

Some legacy code we have, had been doing an inordinate amount of db pings. The pings never showed up in the SQL listing because our database so heavily optimized them that they didn't even show up in the SQL statement list. I fixed the code and got 10%-15% percent reduction in network traffic, database load, and page response time.

Fix the Problem, Not the SQL

Now you know where the bottleneck is and you probably have a rough idea of how to fix the SQL.  DON'T fix the SQL.  Instead you should ask your self what you actually need from the database.  Once you determine that you actually need the data from the problematic SQL, then you can start looking at how to fix it.

If your constantly requesting a fixed number of rows using unique IDs, you can use an IN statement to reduce N calls to 1.  Be careful, apprise yourself about the limitations of IN statements. There is a another tradeoff to using in statements.  Most databases cache based on the query.  If you dynamically construct your IN statement you may not be using the database cache as well as you could if you avoided the IN statement all together.  Usually it's better to make sure you are effectively using the database's cache.

If your getting most of the data from a small table, you might as well read in the entire table in one call.

Obviously if your getting the same data over and over again, don't. Cache the data in memory for the life time of the call.

If you need to get more aggressive you can use joins, materialized views, stored procedures, etc to roll-up data that would normally be in different db calls.  The trade off is more complexity in maintenance and possibly more moving parts.