# Wednesday, 18 April 2007

Working on a particularly data-intensive project has taught me a lot about SQL optimization, both structure optimization and stored procedure optimization. We all know about introducing redundancy and indexes to improve read performance, but I've picked up a couple of tricks from our resident SQL expert that I thought I'd pass along. Please bear in mind that these are only my observations of how our systems behave; your mileage may vary.

Main Problem Area

For our web solutions the performance issues we encounter with an application are usually related to the data tier. I can recall maybe two instances where the application tier was the culprit; otherwise it's always the data tier. When diagnosing performance issues you can save yourself a lot of time by eliminating the frontend and going directly to the database, and determining whether you have a problem in the data tier is pretty straightforward using the profiling tools of SQL Server. Granted, my experience with Windows applications is limited, and I would imagine issues related to memory and CPU usage crop up more often there, but you'll probably still need to deal with some kind of database if you're doing anything in business.

Temporary Tables / Table Variables

My number one tip for performance optimization when dealing with large amounts of data is to limit the number of rows you're dealing with. Using filters to do so is a no-brainer, but my .NET brain needed some time to get used to the idea of doing it with temporary tables and table variables. For some reason I had a hard time with the notion of pulling data out of a comfortable data structure and sticking it into a table variable somewhere; maybe because I'm used to avoiding transferring a lot of data across the database boundary into my application. Keeping in mind that we don't actually move the data outside the database, and that the temporary tables only live for the duration of the stored procedure's execution, it makes a lot of sense to pick out the rows we need and work only on the smaller set.

I had a case yesterday where a query would execute for almost a solid two minutes, 1 minute and 40 seconds to be exact. The case was a view joined onto itself to do some rolling calculations. By extracting my set from the view once and placing it into a table variable I was able to squeeze execution down to 6 seconds. That's an improvement in the order of 16x, people! :)
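To make the pattern concrete, here's a minimal sketch with invented names (dbo.WorkloadView, ItemDate, Amount and the date range are my placeholders, not the actual schema): hit the expensive view exactly once, then run the rolling self-join against the small set.

```sql
DECLARE @From datetime, @To datetime;
SELECT @From = '20070416', @To = '20070422';

-- Pull only the rows we need out of the expensive view, once.
DECLARE @Work TABLE (ItemDate datetime, Amount money);

INSERT INTO @Work (ItemDate, Amount)
SELECT ItemDate, Amount
FROM   dbo.WorkloadView
WHERE  ItemDate BETWEEN @From AND @To;

-- The rolling calculation joins the small set onto itself instead
-- of forcing SQL Server to assemble the full view twice.
SELECT w.ItemDate, SUM(r.Amount) AS RunningTotal
FROM   @Work w
JOIN   @Work r ON r.ItemDate <= w.ItemDate
GROUP BY w.ItemDate
ORDER BY w.ItemDate;
```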

Default Parameter Values vs. Execution Plans

An interesting case came up a while back when we were trying to figure out why users were experiencing long execution times for a particular operation. However much we tried, we were unsuccessful at reproducing the problem; whenever we ran the procedure the performance was great and very much in line with what we were expecting to see. Doing the operation from the application interface ourselves, however, did yield the same lack of performance the users were reporting. Basically we were seeing great performance when executing the procedure directly in Management Studio and poor performance when executing the same procedure from the application UI.

Eventually we figured it out: the stored procedure had default values set for a couple of optional search parameters. The application UI would not pass these parameters to the stored procedure, so SQL Server would select an execution plan optimized for the default scenario. In Management Studio we were passing all the parameters to the procedure, causing SQL Server to select a better execution plan, which gave us the better performance and the impression that "something was happening between ASP.NET and SQL Server".

Removing the default parameter values and always passing in the same number of parameters fixed the issue. Probably more of a gotcha than anything else :)
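A contrived sketch of the gotcha (dbo.SearchItems, dbo.Items and the parameter names are all invented): with defaults in place, a plan built for the wide-open default call gets reused for calls that pass selective values.

```sql
-- Before: optional parameters with defaults. Callers that omit them
-- get a plan optimized for the default scenario, and that plan gets
-- reused for callers that pass narrow, selective values.
CREATE PROCEDURE dbo.SearchItems
    @FromDate datetime = '17530101',
    @ToDate   datetime = '99991231'
AS
    SELECT ItemId, CreatedDate
    FROM   dbo.Items
    WHERE  CreatedDate BETWEEN @FromDate AND @ToDate;
GO

-- After: no defaults, so every caller passes both parameters and the
-- plan is compiled against representative values.
ALTER PROCEDURE dbo.SearchItems
    @FromDate datetime,
    @ToDate   datetime
AS
    SELECT ItemId, CreatedDate
    FROM   dbo.Items
    WHERE  CreatedDate BETWEEN @FromDate AND @ToDate;
```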

WITH RECOMPILE

Ah, WITH RECOMPILE. I remember it fondly from some years ago when I was studying for the SQL Server 2000 exam and thought to myself, "I'm NEVER going to need that". Well, it turns out that I did need it, and the scenario was a textbook example too. We have an application in production which visualizes the work load for an entire week. When the week starts out, the users begin scheduling that work for individual work days, attaching a date and the actual work needing to be handled. The result is that the dataset Monday morning is vastly different from the dataset Tuesday, when most of the work is scheduled and good to go.

The main takeaway is this: when your datasets change a lot from execution to execution, you should look into adding WITH RECOMPILE to the stored procedures working on that data.
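The syntax is just a clause on the procedure definition; everything here apart from WITH RECOMPILE itself (procedure, table and column names) is made up for illustration:

```sql
CREATE PROCEDURE dbo.GetWeeklyWorkload
    @WeekStart datetime
WITH RECOMPILE  -- build a fresh execution plan on every call
AS
    SELECT WorkDay, SUM(HoursScheduled) AS TotalHours
    FROM   dbo.ScheduledWork
    WHERE  WorkDay >= @WeekStart
       AND WorkDay <  DATEADD(day, 7, @WeekStart)
    GROUP BY WorkDay;
```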

IF Statements

Yes, IF statements! The rule of thumb is that they're evil. OK, maybe not, but I've seen a couple of instances where an IF statement actually affected performance negatively because SQL Server was unable to select the best execution plan for the branching procedure. Also, in most cases you can embed the IF logic in a single SQL statement using the WHERE clause.
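Something along these lines, sketched with invented names (dbo.Orders and @CustomerId are hypothetical): instead of branching on the parameter with IF, let the test ride along in the WHERE clause so the optimizer sees a single statement.

```sql
-- Instead of branching:
--
-- IF @CustomerId IS NULL
--     SELECT OrderId, Total FROM dbo.Orders;
-- ELSE
--     SELECT OrderId, Total FROM dbo.Orders WHERE CustomerId = @CustomerId;
--
-- ...fold the condition into one statement:
SELECT OrderId, Total
FROM   dbo.Orders
WHERE  @CustomerId IS NULL OR CustomerId = @CustomerId;
```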

Redundancy

I know that redundancy is a pretty well known technique, but I still want to mention it because you can gain massive performance increases by sprinkling a little bit of redundancy into your schemas. The basic idea is to avoid repeating a calculation by storing its result, but of course this introduces the possibility of contradictory data, so tread carefully.

For this one we had data structures containing status history for inventory. For each item we knew when it had been set to a specific status, so we just went ahead and used that existing knowledge to determine the inventory status of a particular item. Not what I would call a speed demon once we started adding approx. 150,000 items to the inventory. Simply adding the current status code to the item itself gave us many times the performance we were seeing, and it scales well.
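A sketch of the idea with invented names (dbo.Item, dbo.ItemStatusHistory and the variables are mine, not the actual schema): the current status is duplicated onto the item row and maintained in the same transaction as the history insert, so the redundant copy can't silently drift.

```sql
-- Redundant copy of the latest status, stored directly on the item.
ALTER TABLE dbo.Item ADD CurrentStatusCode int NULL;
GO

DECLARE @ItemId int, @StatusCode int;
SELECT @ItemId = 42, @StatusCode = 3;

-- Record the status change and refresh the redundant copy together.
BEGIN TRANSACTION;

INSERT INTO dbo.ItemStatusHistory (ItemId, StatusCode, StatusDate)
VALUES (@ItemId, @StatusCode, GETDATE());

UPDATE dbo.Item
SET    CurrentStatusCode = @StatusCode
WHERE  ItemId = @ItemId;

COMMIT TRANSACTION;
```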

Views

Views are a great way of simplifying queries, but unfortunately in many cases we can't push our filtering into them, so we end up with a lot of views assembled from various tables. Before we can limit the resultset, SQL Server has to hand us back the entire thing, which isn't a very cost effective way of doing things. We tried to create very general purpose views, but that just got us into the situation I just described, so instead we went with more specific views with a lot more filtering inside them. Alternatively, simply moving the results from the view into a temporary store also gave us some great results.
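A contrived illustration of the difference (all the names are invented): the general view hands every consumer the full joined set, while the specific view bakes the restriction in so we only ever assemble the rows we care about.

```sql
-- General purpose: every consumer pulls the full joined set and
-- filters afterwards.
CREATE VIEW dbo.AllOrderLines AS
SELECT o.OrderId, o.OrderDate, o.StatusCode, l.ProductId, l.Quantity
FROM   dbo.Orders o
JOIN   dbo.OrderLines l ON l.OrderId = o.OrderId;
GO

-- Specific: the filtering lives inside the view itself.
CREATE VIEW dbo.ActiveOrderLines AS
SELECT o.OrderId, o.OrderDate, l.ProductId, l.Quantity
FROM   dbo.Orders o
JOIN   dbo.OrderLines l ON l.OrderId = o.OrderId
WHERE  o.StatusCode = 1;  -- active orders only
```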

In many cases we need to join a view onto itself in order to do rolling summations and determine what data will be available in the future; basically a kind of what-if calculation without actually going ahead and updating anything. This is a very intensive process where the general purpose views needed to be assembled a couple of times before we could get our results back. Limiting the resultset up front was the only way to make this stuff perform at acceptable levels.

Finally

This is not a very scientific SQL optimization post, nor is it intended to be. You could probably pick apart my observations and start talking about the what, the why, exclusive lock this, dirty read that, etc., but instead think of it as a more practical look at SQL optimization. I bring you this post with some examples because you might have an easier time determining what action to take if you face situations similar to the ones I've just described.

For me, starting out on a SQL optimization is pretty vexing because looking at the SQL statement reveals nothing. I really need to get into the meat of things before the possibilities start to reveal themselves. Once I pry open the statement, the problem areas blossom before my very eyes and reveal where I can get my results. The moment I see a query speed up from minutes to mere seconds is probably the most satisfying part of my job. Quick wins don't come along very often in software development, but SQL affords some of the best ones :)

posted on Wednesday, 18 April 2007 16:32:32 (Romance Daylight Time, UTC+02:00)
# Wednesday, 08 December 2004

A whole bunch of webcasts about SQL Server 2005 were recorded yesterday and are up for your viewing pleasure. Be sure to check out some of these, as they're sure to contain information you'll need as a developer in the future.

The Microsoft SQL product team is hosting a special week of live MSDN Webcasts to provide you a sneak peek at what Microsoft SQL Server 2005 Beta 3 has in store for the future of database development. SQL Server 2005 offers a new paradigm for database development that integrates SQL Server and the CLR to provide several major benefits, including an enhanced programming model, enhanced safety and security, user defined types and aggregates, and a common development environment where database development is integrated into the Microsoft Visual Studio 2005 development environment.

SQL Server 2005 Webcasts

posted on Wednesday, 08 December 2004 08:28:05 (Romance Standard Time, UTC+01:00)
# Friday, 12 November 2004

OMG the Reporting Services capabilities in SQL Server 2005 are going to be way too cool! I have been doing some reports in Reporting Services for SQL Server 2000 and something seemed to be missing: The ability for the business user to create simple ad hoc reports. I am watching a webcast where they are demoing that exact capability. Very cool stuff.

By the way I found the webcast via Eric Nelson's post SQL Server 2005 Reporting Services Webcast available on demand from Technet.

posted on Friday, 12 November 2004 11:34:58 (Romance Standard Time, UTC+01:00)
# Monday, 09 August 2004

I have been wanting to post about this for a while but have forgotten to do so. Roy Osherove did a really cool post on how to use Enterprise Services (which come with the framework) to simplify the clean-up after database testing.

He basically uses the built-in transaction support to roll back after each test completes. This is definitely something I will be incorporating in my future tests of DAL components.

Anyway he has some great code examples you should check out if this stuff at all interests you.

posted on Monday, 09 August 2004 11:29:29 (Romance Daylight Time, UTC+02:00)
# Tuesday, 27 July 2004

I passed this exam a while back and took a whole bunch of notes in the process of preparing for it, which I typed up and forgot about. I found the notes this week and thought I would post them here so they wouldn't get lost again ... and maybe, just maybe, some of you might find them useful.

Oh yeah - they are in Danish. Sorry, I just can't find the time to translate them to English so this will be for all the Danish database developers out there :)

70-229 Designing and Implementing SQL Server 2000 Databases - noter.doc (101 KB)
posted on Tuesday, 27 July 2004 14:22:15 (Romance Daylight Time, UTC+02:00)
# Saturday, 19 June 2004

Darrell Norton posted a stored procedure which will change the owner of all objects to dbo. Pretty useful if you, like me, sometimes forget to specify the owner when creating an object.

posted on Saturday, 19 June 2004 16:53:48 (Romance Daylight Time, UTC+02:00)