A couple of days ago I pushed some changes to our custom ePrescription application into production. The changes had been made a couple of weeks ago and tested fine. However, after deploying we began to receive calls from angry doctors about how slow the system was, and how long it was taking to complete tasks. During this same period, we had a major network issue with out anti-virus software (name withheld as we are awaiting an official response from the company). A routine upgrade of the anti-virus software brought our entire network to a halt. The upgrade closed all network connections on every server and client machine until a reboot was initiated. We wound up having to manually connect to every server in our agency (30+ at last count) and click a confirmation on a dialog box to finish the installation and reboot the machines. A mass email was also sent out for users to perform the same steps. Eventually the problem was taken care of and network traffic resumed. Due to this failure, I wasn’t sure if the complaints about the ePrescription software was related to the anti-virus debacle, or the code changes I made. The answer became clear when after the network cleared up, the application’s performance was still suffering. My job then became to analyze the queries running from the application. For this I turned to SQL Server Profiler. If you haven’t used this awesome tool, I recommend you begin immediately. With profiler, you can set up a trace that will listen to and display information about all kinds of events and activities being executed on a SQL Server. For my purposes, I selected a built in Tuning Trace and filtered it to listen to TSQL statements from the database used by the software. I ran the trace for approximately 10 minutes and was able to immediately see that one query was taking anywhere from 8 to 15 seconds to execute, and it was a query that I had put in as part of the upgrades. I don’t know why I didn’t catch this in testing. Our test server is extremely slow by nature. It has been beat up to no ends and I may have subconciously considered the slow execution just a result of running on the test system. I won’t make that mistake again. The query that was causing all the commotion was relatively simple, selecting four columns from a table joined with a view. I remembered in the past a project that I had taken over for a departed employee. There was a function that was used to delete records from a table. When run, it would often cause the ColdFusion server to time out. Upon analysis, I saw that this function was calling a view that called another function. I changed the procedure to call one function only, and the process completed in no more than 2 seconds after that. I figured the view in my new query was to blame. I did some research and came across this article that explains performance hits encountered when using views, especially when joining. I restructred the query to omit the use of the view and it is now running on average of 20 miliseconds. That’s a far better figure than 8 to 15 seconds! I’ve just deployed the new query into production, and am confident that our Dr.’s will have a much better experience tomorrow.
Hope this helps. Happy profiling / analyzing / viewing!