For anyone trying to troubleshoot a Slow SQL server, I wanted to come up with a test that will take the SQL issue and generalize it. Why does this need to be generalized? I have found that a customer or a support team may introduce a bias in all aspects of the tests. Begin with the Data. Data is impossible to to show a unique result. You may say this database does not go as fast as my favorite one, on a separate server. you cannot accurately prove one server is faster or slower then another server. Why?; for a basic Idea, take look at another case, where I lay out some basic testing tenets to go by.I will re-state them here. They sound like car rules, but they are universal testing rules you can apply to any situation.
From Car Rules to Computers
- The performance should be documented and repeatable.
- More than one test should be run, and simple is usually more realistic.
- Tests should be standardized, down to a science, so that if applied to another matching scenario, you would expect similar results.
- Keep the time down to a short test. The longer the test, the more variables can be introduced.
- Do not focus on two separate car models not functioning the same, find a way to introduce a baseline into what a reasonable car will perform like. Then prove or disprove your baseline.
In order to get a good unbiased test result for SQL, I came of with a dynamically created SQL database, that gets created once. Once Created, you can run some test on this standardize database, and compare with results on, say your laptop, or another machine, where your processor, Memory and disk resources are similar. All you have to do is to follow the method. One simply must not use ones own data.
Read on-Grab the download from here or the top of the page.
The SQL Baseline for Customers who report server A is slow then Server B.
When a customer claims that one machine is slower than the other, there is always the possibility the customer has an actual baseline. However, when they say one is slower than another, this usually indicates they don’t know what a baseline is.
A Baseline is a collection of metrics, about the server, when it is installed at Greenfield time. When the Server is first Deployed with SQL, a baseline should be taken. Then, future claims as to a slow server, should be taken against itself; not another server.
When a person wants to compare two servers, this is almost an impossible ask. It’s like asking us to compare why two people do not complete a personality test in a similar way. From a support standpoint, it is a fruitless pursuit, and often creates a bad CE, in trying to fulfill their request.
The goal of this process, is to give Support and the customer a way to meet on common ground. The customer claim that the server is slow may as well be translated into, The Data on my servers does not match!! And they are correct. And we don’t support data. The Key word is Data. This question of “SLOW-ER” pulls us into the customers’ data sets.
This process gives us a way to use our own data set. The advantage cannot be understated. We will be telling them one machine is slower or it’s not.
Accepting that generally one machine is slower, do not underestimate this result, as the customer re-introduces his production elements. If the Baseline test show a machine 20% slower, then any difference, more than 20%, will be due to specific workloads introduced by the customer. All of the SQL Subject matter experts have known this, but we all spend weeks trying to find the leverage to prove it. Without an “absolute”, we could not substantiate that claim. This caused these cases to last for months. This method below, should cut these cases into a two day case, at most.
In the following test for SQL you will see four files, which compose of a method of base-lining SQL performance, without using Bias Data from the Customer, or a third party company. This Test Is devoid of the implication of using cashing or indexing, so it is a perfectly simple test to illustrate capabilities between two machines.
The reason this test has been devised is due to customer demand. Customers often ask us to compare two adjacent machines. Often times these comparisons can only be done using apples to oranges methods. Often times, these cases end up being a point of contention for the customer and for support teams. The goal of this test is to mitigate that disparity.
Here are the files you will need
Figure 1. Files you will need
The Excel Spread sheet is to be filled out and returned to Support. We keep a master copy of this spread to monitor the scripts performance against a variety of machines and situations. Over time, we will have a database of how this script performs, on average, across a multitude of platforms. And the simple measure we are obtaining, is time. How long does it take the baseline query to complete?
The Results of the script, should answer the question, is my “server” really slower than average, or slower than another server? In order to do this, strict adherence to rules must occur. This test must be run, with all other operations terminated on the SQL server. There should be no Antivirus running, there should be no other applications running. Other than a baseline Windows machine, with core applications and services running, the server should be running SQL with no client connections. In other words the SQL machine needs to be out of production. There are columns In SupportBaseline.xlsx, but it will be noted in the analysis, that the machine was in production, and the results may not reflect a true baseline.
Several baseline runs can be collected with the single variable as the total number of rows, this script will create. The default is set to 1 million. The recommendation is a million rows, on average. However, depending on how powerful the server is, or how much down time you are allowed, you can adjust this variable to fit into your needs CreateSupportTest.sql is the file where this change is made see below.
Figure 2. Where to adjust how long the script will run
How long do I run the initial test?
As a general rule, 1 million rows should take less than 15 minutes on a reasonable SQL server. However Performance degrades fast. For example, A SQL VM with only 3 GB of ram, will take 121 minutes to run the query. So the first run should be 100,000. Then multiply the length of time it takes to complete by 10.
This is how long a million rows should take to complete. You can judge how many rows you should choose, depending on the amount of time you want the query to take to complete.
- Determine how long you want to run the query. Follow How long do I run the initial test?
- Set the value of the # of Rows. Follow Test Parameters
- Record the initial values of the server in the XL spread SupportBaseline.xlsx
- Run the Query named CreateSupportTest.sql here is a how to if you need it
- Record the results in the Excel spread sheet SupportBaseline.xlsx. use start and stop time and it will auto populate the time of execution
- Repeat as necessary, populating the spread sheet, and returning to Louis Reeves in Support support. He is keeping the overall list of how the query runs in several different scenarios and can give you more information about how your query results compare to other machines running the same query
- When you are finished testing a server, there are two scripts that are cleanup scripts. Run DropSupportTest.SQL. here is a how to if you need it
- Then run DropSupportMaster.SQL. here is a how to if you need it
That’s it, Now you can complicate things, by running things like Diskspd against these machines, but, it will be best to just keep it simple and stay with the program laid out. If you desire to look at diskspd, go ahead and read The Fallcy of Performance or; Are you bringing your Support Agent Apples or Oranges? This will help you the plan for running Diskspd commands. So here you really have two ways to testing the claim of a SLOW:
I hope this series of articles is helpful in troublsehooting issues with model data.