Data Accelerator Blog
Chatty SQL Applications
We see a lot of chatty applications, applications that run fine on a LAN but become frustratingly slow when run over a WAN or the internet. I’m going to explain a little about what we mean by chattiness and how we solve the problem.
I’ve put together a series of queries against the AdventureWorks database that represent a typical workload when a user clicks on an order in a sales application.
Imagine the user clicks on an order with the ID 43659. The application issues a query to get the sales order header to display information such as the customer and the sales order number.
SELECT SalesOrderNumber, OrderDate, CustomerID, TerritoryID, SalesPersonID, TotalDue FROM Sales.SalesOrderHeader WHERE SalesOrderID=43659
The next query gets the sales order details with the same ID to get all the line items for the order.
SELECT OrderQty, UnitPrice, ProductID, SpecialOfferID FROM Sales.SalesOrderDetail WHERE SalesOrderID=43659
The next statement is just a SET statement, setting the connection state for the following queries. Who knows why it’s there; perhaps it was thought to have some performance benefit at some point (we see a lot of this!).
SET NOCOUNT ON
The next 5 queries might be used to populate drop down boxes to allow the user to select a customer, sales territory, product, sales person and special offer.
SELECT CustomerID, AccountNumber FROM Sales.Customer
SELECT TerritoryID, Name FROM Sales.SalesTerritory
SELECT ProductID, ProductNumber, Name FROM Production.Product WHERE SellEndDate IS NULL OR SellEndDate > GetDate()
SELECT SalesPerson.BusinessEntityID, Person.LastName FROM Sales.SalesPerson INNER JOIN Person.Person ON SalesPerson.BusinessEntityID = Person.BusinessEntityID
SELECT SpecialOfferID, Description FROM Sales.SpecialOffer WHERE EndDate IS NULL OR EndDate > GetDate()
Then there’s another SET statement, turning off what was set above.
SET NOCOUNT OFF
The last query gets the sales reasons using the sales order ID so the user can see the reasons why the customer made the purchase.
SELECT SalesReason.Name AS SalesReasonName FROM Sales.SalesOrderHeaderSalesReason INNER JOIN Sales.SalesReason ON SalesOrderHeaderSalesReason.SalesReasonID = SalesReason.SalesReasonID WHERE SalesOrderID=43659
This would run fine on a LAN: running the script in Management Studio with no latency shows an execution time of 62ms. Run it over a WAN with 100ms of latency, however, and the execution time jumps to 1078ms, a whole second more, thanks to the 10 round trips to the database. Yikes!
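A back-of-the-envelope model makes the arithmetic plain: each round trip pays the link latency on top of the server's work. The function below is my own illustration, not output from any measurement tool; it predicts 1062ms for the WAN case, close to the 1078ms actually measured.

```python
# Toy model: total elapsed time is the server's processing time plus one
# latency charge per round trip. Numbers are taken from the measurements
# described in the text; the model itself is a simplification.

def total_time_ms(round_trips, latency_ms, server_time_ms):
    """Each round trip pays the link latency once on top of server work."""
    return server_time_ms + round_trips * latency_ms

lan = total_time_ms(round_trips=10, latency_ms=0, server_time_ms=62)
wan = total_time_ms(round_trips=10, latency_ms=100, server_time_ms=62)
one_trip = total_time_ms(round_trips=1, latency_ms=100, server_time_ms=62)

print(lan, wan, one_trip)  # 62 1062 162
```

The model also shows why collapsing the 10 round trips into 1 recovers almost all of the lost time.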
Response Times: The 3 Important Limits
There are 3 main time limits (which are determined by human perceptual abilities) to keep in mind when optimizing application performance:
- 0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.
- 1.0 second is about the limit for the user’s flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data.
- 10 seconds is about the limit for keeping the user’s attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.
There are a few possible solutions. Maybe you could change the application to make fewer round trips? While this is good advice in theory, in practice making those changes across the application can be impractically expensive. Perhaps you could use a caching product such as RiverBed or SafePeak? Alas, none of the products out there maintain SQL Server’s consistency model. This means you either have to spend a long time configuring it (or rather, spend a lot of money configuring it) or risk it introducing bugs into your application.
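To sketch the first option: the five drop-down queries could be sent as one T-SQL batch, so the client pays a single round trip for all five result sets. With a driver such as pyodbc you would execute the combined batch once and walk the result sets with Cursor.nextset(); the helper below is my own illustration and only builds the batch text, so it needs no database.

```python
# Sketch: collapse N statements into one batch so the client pays one
# round trip instead of N. Building the batch string is all we do here;
# executing it and iterating result sets is driver-specific.

def make_batch(queries):
    """Join individual statements into a single T-SQL batch string."""
    return ";\n".join(q.strip().rstrip(";") for q in queries)

queries = [
    "SELECT CustomerID, AccountNumber FROM Sales.Customer",
    "SELECT TerritoryID, Name FROM Sales.SalesTerritory",
]
batch = make_batch(queries)
print(batch)
# Two statements, one round trip.
```

The catch, as noted above, is that retrofitting this kind of batching across every screen of an existing application is rarely cheap.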
It is possible to create a caching system that maintains SQL Server’s consistency model; we built such a system as a forerunner to Data Accelerator. But it is highly complex and introduces scaling limits. We understand why none of our competitors guarantee to maintain the consistency model!
As a simple example, take the WHERE clause in the 6th query above:
WHERE SellEndDate IS NULL OR SellEndDate > GetDate()
This states that the results change with time; caching would freeze them. Perhaps that doesn’t matter for your application, but the point is that the caching layer does not obey the same rules as the database: the proxy is not transparent. It will change the behaviour of your application, perhaps in ways that introduce bugs.
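A toy simulation (purely illustrative, not Data Accelerator's code) shows the failure mode: a naive cache keyed only on the query text keeps serving a product after its SellEndDate has passed, while the database would have dropped it.

```python
# Products as (name, sell_end) pairs; sell_end is a "day number" standing
# in for SellEndDate, and None means the product is still on sale.
products = [("Bike", None), ("Old Helmet", 5)]

def sellable(now):
    """What the database returns: SellEndDate IS NULL OR SellEndDate > now."""
    return [name for name, end in products if end is None or end > now]

cache = {}

def cached_sellable(now):
    # The cache key ignores the clock -- exactly the mistake a
    # transparent-looking proxy makes with time-dependent predicates.
    if "sellable" not in cache:
        cache["sellable"] = sellable(now)
    return cache["sellable"]

print(cached_sellable(1))   # warmed on day 1: ['Bike', 'Old Helmet']
print(cached_sellable(10))  # day 10, still ['Bike', 'Old Helmet'] -- stale!
print(sellable(10))         # the database would say ['Bike']
```

The cache and the database now disagree, and the application shows a product that can no longer be sold.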
Data Accelerator solves this problem by predicting what queries your application will issue so that it can prefetch them and stream the results to the user’s PC. This avoids the network round trips.
Running the script in SQL Management Studio via Data Accelerator for the first time shows no improvement, as it’s still learning, but the second time the execution time plummets to 156ms. The 10 round trips have been reduced to 1, making it almost as fast as it was on the LAN. And that’s how fast it will be from now on: you don’t have to go through the slow training phase again. The learning data is stored on the server, so once one user has completed the operation it will be fast for all users from then on.
Data Accelerator allowing an application to get the results of 10 queries with only 1 network round trip
Ok, so what happens when the user clicks on another sales order, one with a different ID? That’s fine: Data Accelerator learns how your application uses parameters and makes predictions accordingly. If we change 43659 to 44106 in the three places it appears in the script above and run it again via Data Accelerator, the first execution time is 360ms and the second is back down to 155ms.
Why was the first run slower? In that run Data Accelerator predicted that the ID parameter was a constant that would always have the value 43659. On the 2nd run it knows the ID cannot be a constant, having seen a counter-example, so it instead predicts that the ID used in the first query is the same one used in queries 2 and 9, as they correlate in all the examples it has seen. This turns out to be correct, taking the 2nd run back down to 155ms. Now that Data Accelerator understands this model, the form will open quickly no matter what the ID is.
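As a rough sketch of this kind of inference (my own toy version, not Data Accelerator's actual algorithm), you can classify each query's parameter as a constant, as correlated with an earlier query's parameter, or as free, based on the traces seen so far:

```python
# Each trace maps a query slot (q1, q2, q9 in the example above) to the
# parameter value observed in one run of the script.

def classify(traces):
    """Infer a parameter model from a list of observed traces."""
    slots = list(traces[0])
    model = {}
    for i, slot in enumerate(slots):
        values = [t[slot] for t in traces]
        if len(set(values)) == 1:
            # Every example agrees: predict a constant.
            model[slot] = ("constant", values[0])
            continue
        for earlier in slots[:i]:  # only an earlier query can be the source
            if all(t[earlier] == t[slot] for t in traces):
                model[slot] = ("same_as", earlier)
                break
        else:
            model[slot] = ("free", None)
    return model

# After one run, everything looks constant.
print(classify([{"q1": 43659, "q2": 43659, "q9": 43659}]))
# A second run with a new ID is the counter-example: q2 and q9 are now
# predicted to reuse whatever ID q1 was called with.
print(classify([
    {"q1": 43659, "q2": 43659, "q9": 43659},
    {"q1": 44106, "q2": 44106, "q9": 44106},
]))
```

Once the model says "q2 and q9 use q1's parameter", the ID seen in the first query is enough to prefetch the rest, whatever its value.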
If you want to try this out yourself, download the SQL script and sign up for our trial below. I ran the script with “Include Client Statistics” turned on, which shows the “Total execution time” in the output. I also used this WAN emulator to introduce the latency.