# Wednesday, 18 April 2007

Working on a particular data-intensive project has taught me a lot about SQL optimization, both in terms of structure and stored procedures. We all know about introducing redundancy and indexes to improve read performance, but I've picked up a couple of tricks from our resident SQL expert I thought I'd pass along. Please bear in mind that these are only my observations of how our systems behave; your mileage may vary.

Main Problem Area

For our web solutions the performance issues we encounter with an application are almost always related to the data tier. I can recall maybe two instances where the application tier was the culprit; otherwise it has always been the data tier. When diagnosing performance issues you can save yourself a lot of time by eliminating the frontend and going directly to the database. Determining whether you actually have a problem in the data tier is also pretty straightforward using the profiling tools of SQL Server. Granted, my experience with Windows applications is limited, and I would imagine memory and CPU issues crop up more often there, but you'll probably still need to deal with some kind of database if you're doing anything in business.

Temporary Tables / Table Variables

My number one tip for performance optimization when dealing with large amounts of data is to limit the number of rows you're working with. Using filters to do so is a no-brainer, but my .NET brain needed some time to get used to the idea of doing it with temporary tables and table variables. For some reason I had a hard time with the fact that I should pull data out of a comfortable data structure and stick it into a table variable somewhere; maybe because I'm used to avoiding transferring a lot of data across the database boundary into my application. Keeping in mind that we don't actually move the data outside the database, and only keep the temporary tables around for the duration of the stored procedure's execution, it makes a lot of sense to pick out the rows we need and work only on the smaller set.
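To give a feel for the pattern, here's a minimal sketch, with made-up table and column names, of filtering rows into a table variable up front and running the rest of the work against that smaller set:

```sql
-- Hypothetical tables (Orders, Customers) used purely for illustration.
DECLARE @RecentOrders TABLE
(
    OrderID    INT PRIMARY KEY,
    CustomerID INT,
    OrderDate  DATETIME,
    Total      MONEY
);

-- Filter as early as possible so later joins touch far fewer rows.
INSERT INTO @RecentOrders (OrderID, CustomerID, OrderDate, Total)
SELECT OrderID, CustomerID, OrderDate, Total
FROM   dbo.Orders
WHERE  OrderDate >= '20070416';

-- All subsequent work happens against the small in-memory set.
SELECT c.CompanyName, r.OrderDate, r.Total
FROM   @RecentOrders AS r
       INNER JOIN dbo.Customers AS c ON c.CustomerID = r.CustomerID;
```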

I had a case yesterday where a query would execute for almost a solid two minutes, 1 minute and 40 seconds actually. The case was a view joined onto itself to do some rolling calculations. By extracting my set from the view once and placing it in a table variable I was able to squeeze execution down to 6 seconds. That's an improvement on the order of 16x, people! :)

Default Parameter Values vs. Execution Plans

An interesting case came up a while back when we were trying to figure out why users were experiencing long execution times for a particular operation. However much we tried, we were unable to reproduce the problem. Whenever we ran the procedure the performance was great and very much in line with what we expected to see. Doing the operation from the interface ourselves, however, did reproduce the poor performance. Basically we were seeing great performance when executing the procedure directly in Management Studio and poor performance when executing the same procedure from the application UI.

Eventually we figured it out: the stored procedure had default values set for a couple of optional search parameters. The application UI would not pass these parameters to the stored procedure, and SQL Server would select an execution plan optimized for the default scenario. In Management Studio we were passing all the parameters to the procedure, causing SQL Server to select a better execution plan, which gave us the better performance and the impression that "something was happening between ASP.NET and SQL Server".

Removing the default parameter values and always passing in the same number of parameters fixed the issue. Probably more of a gotcha than anything else :)
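Here's a hedged sketch of the setup we ran into, with the procedure, table, and parameter names invented for illustration; the point is simply that the two call patterns below end up compiling very different plans for the same procedure:

```sql
-- Hypothetical procedure with optional search parameters and defaults.
CREATE PROCEDURE dbo.SearchOrders
    @CustomerID INT      = NULL,
    @FromDate   DATETIME = NULL,
    @ToDate     DATETIME = NULL
AS
BEGIN
    SELECT OrderID, CustomerID, OrderDate, Total
    FROM   dbo.Orders
    WHERE (@CustomerID IS NULL OR CustomerID = @CustomerID)
      AND (@FromDate   IS NULL OR OrderDate >= @FromDate)
      AND (@ToDate     IS NULL OR OrderDate <= @ToDate);
END
GO

-- The UI called it like this, so the plan was built for the default scenario:
EXEC dbo.SearchOrders;

-- In Management Studio we always supplied every parameter,
-- which produced a different (and much better) plan:
EXEC dbo.SearchOrders @CustomerID = 42,
                      @FromDate   = '20070101',
                      @ToDate     = '20070418';
```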

WITH RECOMPILE

Ah, WITH RECOMPILE. I remember it fondly from some years ago when I was studying for the SQL Server 2000 exam and thought to myself, "I'm NEVER going to need that". Well, it turns out that I did need it, and the scenario was a textbook example too. We have an application in production which visualizes the workload for an entire week. When the week starts out the users begin scheduling that work for individual work days, attaching a date and the actual work needing to be handled. The result is that the dataset Monday morning is vastly different from the dataset on Tuesday, when most of the work is scheduled and good to go.

The main takeaway is this: when your datasets change a lot from execution to execution, you should look into adding WITH RECOMPILE to the stored procedures working on them.
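The syntax itself is trivial. A minimal sketch, with an invented procedure name, of what adding the option looks like:

```sql
-- Hypothetical procedure over a dataset that looks completely different
-- from one day to the next; a cached plan is rarely a good fit.
CREATE PROCEDURE dbo.GetWeeklyWorkload
    @WeekStart DATETIME
WITH RECOMPILE   -- build a fresh execution plan on every call
AS
BEGIN
    SELECT WorkItemID, ScheduledDate, Description
    FROM   dbo.WorkItems
    WHERE  ScheduledDate >= @WeekStart
      AND  ScheduledDate <  DATEADD(DAY, 7, @WeekStart);
END
```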

IF Statements

Yes, IF statements! The rule of thumb is that they're evil. OK, maybe not, but I've seen a couple of instances where an IF statement actually hurt performance because SQL Server was unable to select the best execution plan. Also, in most cases you can embed the IF logic in a single SQL statement using the WHERE clause, along the lines of the sketch below.
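A small sketch of the pattern, with made-up names; the idea is to replace the branch with a single statement:

```sql
-- With an IF statement the procedure contains two separate statements,
-- and the plan SQL Server picks may only suit one of the branches.
IF @CustomerID IS NULL
    SELECT OrderID, OrderDate, Total FROM dbo.Orders;
ELSE
    SELECT OrderID, OrderDate, Total FROM dbo.Orders WHERE CustomerID = @CustomerID;

-- The same logic folded into a single statement via the WHERE clause:
SELECT OrderID, OrderDate, Total
FROM   dbo.Orders
WHERE  @CustomerID IS NULL OR CustomerID = @CustomerID;
```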

Redundancy

I know that redundancy is a pretty well known technique, but I still want to mention it because you can gain massive performance increases by sprinkling a little bit of redundancy into your schemas. The basic idea is to avoid doing calculations by storing the result, but of course this introduces the possibility of contradictory data, so tread carefully.

For this one we had data structures containing status history for inventory. For each item we knew when it had been set to a specific status, so we just went ahead and used that existing knowledge to determine the inventory status for a particular item. Not what I would call a speed demon once we started adding approx. 150,000 items to the inventory. Simply storing the current status code on the item itself gave us many times the performance we were seeing, and it scales well.
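Here's a rough sketch of the kind of denormalization this describes, with invented table and column names: instead of deriving the current status from the history table on every read, store it on the item and keep it in sync when the status changes.

```sql
-- Expensive: derive the current status from the history on every read.
SELECT i.ItemID, h.StatusCode
FROM   dbo.InventoryItems AS i
       INNER JOIN dbo.InventoryStatusHistory AS h ON h.ItemID = i.ItemID
WHERE  h.StatusDate = (SELECT MAX(StatusDate)
                       FROM   dbo.InventoryStatusHistory
                       WHERE  ItemID = i.ItemID);

-- Redundant but fast: keep the current status on the item itself...
ALTER TABLE dbo.InventoryItems ADD CurrentStatusCode INT NULL;

-- ...and keep it in sync wherever a new history row is written
-- (@ItemID and @NewStatusCode come from that surrounding procedure).
UPDATE dbo.InventoryItems
SET    CurrentStatusCode = @NewStatusCode
WHERE  ItemID = @ItemID;
```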

VIEW

Views are a great way of simplifying queries, but unfortunately we can't apply filtering early in many cases, so we end up with a lot of views assembled from various tables. Before we can limit the resultset, SQL Server has to hand us back the entire thing, which isn't a very cost-effective way of doing things. We tried to create very general-purpose views, but that just got us into the situation I just described, so instead we went with more specific views with a lot more filtering inside them. Alternatively, simply moving the results from the view into a temporary store also gave us some great results.

In many cases we need to join a view onto itself in order to do rolling summations to determine what data will be available in the future. Basically a kind of what-if calculation without actually going ahead and updating anything. This is a very intensive process where the general-purpose views needed to be assembled a couple of times before we could get our results back. Limiting the resultset was the only way to make this stuff perform at acceptable levels.
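As a rough sketch of that approach, with invented names for the view and columns: pull the filtered view result into a table variable once, then do the rolling summation as a self-join against that instead of having SQL Server assemble the view on both sides of the join.

```sql
-- Materialize the filtered view result once instead of assembling the
-- view for each side of the self-join.
DECLARE @Availability TABLE
(
    ItemID        INT,
    PeriodStart   DATETIME,
    QuantityDelta INT
);

INSERT INTO @Availability (ItemID, PeriodStart, QuantityDelta)
SELECT ItemID, PeriodStart, QuantityDelta
FROM   dbo.vwFutureAvailability
WHERE  ItemID = @ItemID;   -- @ItemID supplied by the surrounding procedure

-- Rolling summation via a self-join on the small in-memory set.
SELECT a.PeriodStart,
       SUM(b.QuantityDelta) AS RunningQuantity
FROM   @Availability AS a
       INNER JOIN @Availability AS b
           ON  b.ItemID = a.ItemID
           AND b.PeriodStart <= a.PeriodStart
GROUP BY a.PeriodStart
ORDER BY a.PeriodStart;
```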

Finally

This is not a very scientific SQL optimization post, nor is it intended to be. You could probably pick apart my observations and start talking about the what, the why, exclusive lock this, dirty read that, etc., but instead think of it as a more practical look at SQL optimization. I bring you this post with some examples because you might have an easier time deciding what action to take if you face situations similar to the ones I just described.

For me, starting out on a SQL optimization is pretty vexing because looking at the SQL statement alone reveals nothing. I really need to get into the meat of things before the possibilities start to reveal themselves. Once I pry open the statement, the problem areas show themselves and reveal where I can get my results. The moment I see a query speed up from minutes to mere seconds is probably the most satisfying part of my job. Quick wins don't come along very often in software development, but SQL affords some of the best ones :)

posted on Wednesday, 18 April 2007 16:32:32 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback
# Tuesday, 17 April 2007

Well, it's been out for a while, but I was having trouble finding out from the main site what is new in this release. Jon Galloway has the lowdown on that one:

  • Multiple Database support! Now you can generate a full DAL for as many databases as you like.
  • Enterprise Library 3.0 Support. Just added this in and it works nicely.
  • All-new command-line tool. You can now use our command-line tool (called SubCommander) to do all kinds of fun things, like generate your code and version your database. You can even hook into it using Visual Studio's External Tools - this will allow you to generate your project's code with the click of a button, and it will use your project settings (look for a blog post on Rob's blog).
  • Improved querying. You can now use our Query tool to run OR's, IN's, and aggregates. You can even type what you want to see:
    IDataReader rdr = new SubSonic.Query("Products").WHERE("CategoryID = 5").ExecuteReader();
    We've also renamed many of our methods (well, we've added aliases) to make the query more readable. You can now use WHERE, AND, OR, IN, BETWEEN_AND, etc. to make your calls that much more readable.
  • New Controls. You can now use our Smart Dropdown, which loads itself:
    <subsonic:DropDown id=mySmarty runat=server tablename="categories" />
    You can also use our new ManyToMany Checkbox list helper to both list and save information for many to many relationships:
    <subsonic:ManyManyList id=myList runat=server MapTableName="Product_Category_Map" PrimaryTableName="Products" PrimaryKeyValue="1" ForeignTableName="Categories" />
  • A new AutoScaffold page that you can drop right into your project to admin all your tables. This thing reads your tables and creates scaffolds for you automagically (thanks Eric!).
  • A completely reworked code-generation system that uses an ASP-style templating system. You can now code up your templates like you would an ASP page (also just like CodeSmith). Test them in your web site to make sure they work, then replace (or add to) the bits that get generated at runtime. You can override our templates by specifying a template directory in the web.config:
    <SubSonicService defaultProvider="Northwind" fixPluralClassNames="false" templateDirectory="D:\PathToMyTemplates">
  • Regular Expression Naming Engine. If you don't like what our convention is, then you can use your own with some simple regex. Pass a simple string, or a dictionary replacement and all of your tables/views will be renamed as you specify.
  • Query Inspection. Want to know what's happening to your query, and how long it's taking? You can simply use the new Inspect() method which outputs the results and statistics to HTML for you to review.
  • Improved Trace/Debug. We've added tracing to (almost) every facet of SubSonic, so if you turn tracing on you can see what SubSonic's trying to do. We're always adding to this and if you see something we've missed, let us know :).
  • Improved Scaffolding. Your foreign key values are now pulled into the GridView and displayed instead of their keys. Eric figured this one out - go dude!
posted on Tuesday, 17 April 2007 12:13:10 (Romance Daylight Time, UTC+02:00)  #    Comments [2] Trackback
# Thursday, 12 April 2007

Do not play this game at work, if you have a deadline, or if you need to go to the toilet real bad. It will suck up your time like you would have never thought possible.

"A live action puzzle game written for fun in flash, based on a Warcraft 3 custom map. You have to stop your enemies, or 'creeps', from travelling all the way across the screen. Tower pieces can be purchased and placed on the map to kill the creeps before they make it across."

Desktop Tower Defense

posted on Thursday, 12 April 2007 14:11:25 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback
# Wednesday, 11 April 2007

I got turned on to Google Co-op when reading a post on some blog whose name I can't remember off the top of my head, but I didn't really get it until later, when I got around to listening to the Dan Appleman episode of .NET Rocks.

The problem with my first introduction to Google Co-op was that it only searched a single site. What Dan has done is basically add multiple sites to his custom search, and thereby he has created SearchDotNet, an engine that searches only quality .NET sites for content, with Dan Appleman as a kind of editor-in-chief. Check out his post SearchDotNet.com – Google custom search for .NET developers.

This really opened my eyes to the possibilities of Google Co-op. I've been toying with the idea of creating a couple of sites with the main purpose of aggregating great content, but I never got around to it due to time constraints. Now, thanks to Google, I can create the sites easily, and I can do so with very little HTML.

The first thing everybody should do is integrate Google search into their blogs if they haven't already done so, and then go ahead and start creating searches for their favorite sites. I know I have a couple I visit when I'm trying to solve some kind of problem.

posted on Wednesday, 11 April 2007 16:33:00 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback

Great news for the people annoyed by the fact that Team System isn't available to everyone: some of the unit testing features from Team System will start appearing in Visual Studio Professional in the Orcas release. Microsoft is also looking into adding more stuff from Team System into Visual Studio, so be sure to voice your opinion on that.

"Due to popular demand we have decided to add the majority of the unit testing features of Team System to the Pro Sku of Visual Studio.  With the release of Orcas, the support for authoring, generating and running unit tests, will become available to the users of the Pro Sku of Visual Studio. Pro customers will also take advantage of the some of the unit testing improvements we have added into Orcas"

[Unit Testing Trickling into Pro!]

I'm still thinking that we need to see much more, especially with projects like TestDriven.NET and NCover providing so much more. Also, Team Foundation Server is a big part of the equation. Of course we have the ability to duct-tape together an open source solution which does approximately the same thing. At this point I would even argue that an open source solution would be better, but taking the future into consideration, the open source solution will never be able to keep up with Team System, simply because the one thing Microsoft does extremely well is product integration. They have all the basics down and they have been moving on to round two for some time now.

posted on Wednesday, 11 April 2007 11:23:01 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback
# Monday, 09 April 2007

To create awareness of the Mozilla Developer Center, Mozilla Corporation has created a very cute desktop background for us. Check it out. I just had to add it as my own desktop image :)

Get larger versions of the background image.

posted on Monday, 09 April 2007 20:43:18 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback

The final release of Enterprise Library 3.0 is upon us. I have been looking forward to this release for some time, and the timing couldn't be better, as I'm currently architecting two new solutions which will benefit significantly from the new features offered by EntLib 3.0, such as the Validation Block; the Policy Injection Block also looks very exciting. Can't wait to get my hands dirty with those two. Of course the ever popular logging and data access blocks are still included; using those is just a no-brainer as far as I'm concerned.

Here's what program manager Tom Hollander has to say:

"If you've been keeping up with the Community Technology Preview releases, there shouldn't be anything too surprising in the final release, although the overall quality should be much higher. All of the major new features have already been described on this blog or in other places, but here is a quick summary of the most significant changes"

[Just Released! Enterprise Library 3.0 - April 2007]

Download Enterprise Library 3.0 - April 2007

posted on Monday, 09 April 2007 10:16:29 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback
# Sunday, 08 April 2007

For the average computer user, finding out whether his or her computer will run a particular application has been somewhat of a hassle, if not downright impossible. Game distributors have traditionally printed minimum and recommended requirements on the box of a game or application, but what average computer user actually knows what kind of hardware is inside their computer? Heck, I even have trouble remembering all those parts.

To end that particular frustration, Windows Vista includes a feature called the Windows Experience Index. Basically this is a single number, as opposed to a list of hardware specifications, which tells you whether your computer will be able to run a certain application or game. The overall score is determined by the lowest subscore (in my case memory) and tells the user where an upgrade would make the most sense.

Add to this that the Windows Experience Index is nicely integrated into the Games Explorer, and you have yourself a very easy way of determining whether you will have a good time playing a particular game.

Now all that remains to be seen is whether PC game makers will show more sense when determining the recommended experience index than they do today with their recommended hardware requirements. In any event, a single number makes it much easier for the user to get some sense of what is needed to run a particular game.

posted on Sunday, 08 April 2007 18:22:08 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback
# Wednesday, 04 April 2007

In an effort to make my unit tests easier to create, I am looking into doing more interface-based programming. I highly recommend the MSDN Architecture Webcast: .NET Interface-Based Programming with Juval Lowy of IDesign for a good start.

posted on Wednesday, 04 April 2007 12:58:46 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback
# Tuesday, 03 April 2007

Blogging is about writing. Many claim that content is king. If content is king, then the army that protects and defends the king is the written word.

Here are some things to think about next time you put your army to work on your blog.

[Blogging Is About Writing]

posted on Tuesday, 03 April 2007 14:04:12 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback

Here is something you do not see every day:

It’s cruel to even want to watch these videos. You do know that, right? Not just because people are attacked by animals. The added cruelty is you’ll be enjoying an animal attack on a human who only wanted to enlighten you.

[5 Best Videos: Animals Attacking Reporters]

posted on Tuesday, 03 April 2007 13:59:45 (Romance Daylight Time, UTC+02:00)  #    Comments [0] Trackback