How to repair: the Software Protection service won’t start on a Domain Controller, or Windows Software Protection will not start with “Access Denied” (error 5) on Server 2012 R2.

Good day all,

I had the strangest activation issue today, and I decided to document it in case I ever see it again. I must admit, the whole idea came from searching the Core Team blog. My issue was that the Software Protection service would not start, which caused all of the activation-related items to fail from the customer’s perspective.

My particular error code was a little different, but the error verbiage was the same. I considered the verbiage close enough to try a few things, and I found success. It’s always important to share success.

The Core Team can solve your issue without me, so feel free to consult their article. I am just showing the folder and registry locations I needed to add the sppsvc service account to before my Software Protection service would start. This apparently only happens on a Domain Controller, hence the title.

The key was to add NT SERVICE\sppsvc to some specific locations, both in the file system and the registry. There was a little trick here: you have to deselect the domain when choosing the account. This caused me to wallow along much longer than necessary, as I never would have thought of it on my own. That is where the Core Team saved me. Thank you, guys.

Screenshots of my 42DC server look like this:

 

 

Just to be clear about what I am saying: do not use Active Directory groups when making your search. If you do, NT SERVICE\sppsvc will not be there. Select your local machine instead and add the account; it should be there.

 

Now that we understand how to make an otherwise mysterious service account show up on a Domain Controller, here are the folder and registry locations.

  1. The store folder, located at:

C:\Windows\System32\spp\ – right-click and choose the store folder’s permissions

2. The sppsvc registry key, located at:

Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\sppsvc

3. The Software Protection registry key, located at:

Computer\HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Software Protection

4. This one I am not sure was necessary, so I would not do it unless it is your last gasp:

  • Take ownership of C:\Windows\System32\sppsvc.exe
  • Make sure you screenshot the original permissions and put them back afterward
  • Add NT SERVICE\sppsvc as needed.
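
For reference, below is a minimal PowerShell sketch of granting those permissions (steps 1 through 3) from an elevated prompt. This is my own shorthand, not from the Core Team article, and the paths are exactly the ones listed above; screenshot your existing ACLs before touching anything.

# Step 1: grant NT SERVICE\sppsvc read access to the spp store folder.
# (OI)(CI)R = inherit to files and subfolders, read-only.
icacls "C:\Windows\System32\spp" /grant "NT SERVICE\sppsvc:(OI)(CI)R" /T

# Steps 2 and 3: grant read access on the two registry keys.
$keys = 'HKLM:\SYSTEM\CurrentControlSet\Services\sppsvc',
        'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Software Protection'
foreach ($key in $keys) {
    $acl  = Get-Acl $key
    $rule = New-Object System.Security.AccessControl.RegistryAccessRule(
        'NT SERVICE\sppsvc', 'ReadKey', 'ContainerInherit', 'None', 'Allow')
    $acl.AddAccessRule($rule)
    Set-Acl -Path $key -AclObject $acl
}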

Those were the locations I found. I did not find all of this in one Microsoft KB, so I certainly don’t claim it is the official fix for the issue. I simply found it in the moment, trying to get a customer back to functionality. Please look to the Core Team for updated information, which will supersede anything I have here today.

 

Update 6/22/2017

 

In my one particular case, there was an additional location that had changed. I had to find it with Procmon from Sysinternals. Using Procmon, I followed the Microsoft blog to capture the traffic; I did not even need to filter it. It clearly showed that the WPA key in the registry was missing a permission. Since the service started after I made this change, it was definitely this missing service-account permission that was the cause:

Here is the Key Location in regedit:

Computer\HKEY_LOCAL_MACHINE\SYSTEM\WPA
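
If you want to confirm what Procmon is showing before changing anything, here is a quick way to dump that key’s ACL from an elevated PowerShell prompt (this is my addition, not from the blog I followed):

# List the current access entries on the WPA key, so you can see
# whether the service account is present before you touch anything.
(Get-Acl 'HKLM:\SYSTEM\WPA').Access |
    Select-Object IdentityReference, RegistryRights, AccessControlType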

 

I hope this may help get you out of a jam, and I hope your licensing functions well.

Louis


Das Cache – Help Support Help You with SanDisk Das Cache

Good day,

I wanted to write a little something about Das Cache. You may not have heard of it; unless you purchase a Dell server, you may never be exposed to it. When you purchase a Dell server, you may also purchase an application called Das Cache.

Please be advised that support for this product is available only if you have a ProSupport contract with Dell. See the sales information on this product here.

“San Disk DAS Cache software is fully supported by Dell Services and available with the purchase of a next-generation Dell Power Edge server”

 


Tell Dell Support you have Das Cache

After spending a little time with this product, I found the main support incident happens when the support person is not told that Das Cache is installed on the system. The fact is, this product is simple and will not likely fail. But the key point is that when Das Cache is deployed, it takes the place of Windows in terms of disk manageability and control.

What happens is that either the IT person or the support agent treats the disk as a plain Windows disk and tries to run operations on it directly. You can’t treat a Das Cache disk as a Windows disk. I believe this one fact is the cause of most supportability problems with the product.


Use the Das Cache GUI

The Das Cache GUI is simple. You select a disk to use as the scratch (cache) disk and a volume to serve as your data volume. Das Cache keeps copies of the most-used data on an SSD drive, although I found it doesn’t even have to be SSD; it could be just two SATA JBOD disks.

To set the product up, you select “Add Cache” at the right side of the application window, select disk 1, then disk 2, and finish. The product is now using Disk 1 as a cache for the data volume on Disk 2.

If you want to do anything to the data disk or the cache disk, the only option you have is to use the GUI and assign or un-assign cache from the Cache tab:

[Screenshot 1: the Cache tab]

Or you may stop or start acceleration from the Volumes tab:

[Screenshot 2: the Volumes tab]

The only other things you can really do are reset the performance statistics or build an incident report. This menu is off to the right-hand side of the GUI.

[Screenshot 3: the statistics and incident report menu]


Troubleshooting vs. re-creating the cache

In all honesty, the amount of troubleshooting that can be done on this simple product is very limited. The danger is going into Windows and doing things to the disk there. Instead, go into the Das Cache GUI, stop acceleration, and remove the cache. Then create the same cache again and choose to start accelerating. That should be the extent of troubleshooting.

Most Common errors

My understanding is that the most common errors seen with this technology are File Server Resource Manager (FSRM) errors. For example, here are some known issues with FSRM.

When reviewing File Server Resource Manager, my advice would be to disable it and find out whether the errors stop. If they do, it becomes a question of disabling specific FSRM settings to find out which one is incompatible.

I could see certain aspects of FSRM not being happy with Das Cache. On the other hand, these errors are likely false positives, as the cache itself is not part of the actual file storage system. As long as the cache is working, I would say you just have some erroneous errors to contend with.

At the most extreme, just stop caching, stop accelerating, and re-configure the cache. The only reason this technology would give you real trouble is if you made changes to the disk inside Windows. Stick to the Das Cache GUI only, and you will be fine.

Takeaway

To troubleshoot with Windows tools, first remove caching and acceleration so you can troubleshoot the disk by itself.

In conclusion, I hope this is helpful. If you have to take this to your support team, make sure the first thing you tell them is that you have a Das Cache setup. That is an important point; otherwise, the support team is left looking at what they think is a Windows disk with some very unusual failures.

Thank you

 

Louis

 

#I work for Dell Services.

Use a Baseline Database Generator Script for reviewing performance of SQL Instance


For anyone trying to troubleshoot a slow SQL Server, I wanted to come up with a test that takes the SQL issue and generalizes it. Why does this need to be generalized? I have found that a customer or a support team may introduce bias into every aspect of a test, beginning with the data. Customer data cannot show a repeatable, unique result. You may say “this database does not go as fast as my favorite one on a separate server,” but that cannot accurately prove one server is faster or slower than another. Why? For the basic idea, take a look at another case where I lay out some basic testing tenets to go by; I will re-state them here. They sound like car rules, but they are universal testing rules you can apply to any situation.

From Car Rules to Computers

  1. The performance should be documented and repeatable.
  2. More than one test should be run, and simple is usually more realistic.
  3. Tests should be standardized, down to a science, so that if applied to another matching scenario, you would expect similar results.
  4. Keep the time down to a short test. The longer the test, the more variables can be introduced.
  5. Do not focus on two separate car models not performing the same; find a way to establish a baseline for what a reasonable car should do, then prove or disprove your baseline.

In order to get a good, unbiased test result for SQL, I came up with a dynamically created SQL database that gets created once. Once created, you can run some tests against this standardized database and compare the results with those from, say, your laptop or another machine whose processor, memory, and disk resources are similar. All you have to do is follow the method. One simply must not use one’s own data.

Read on, and grab the download from here or from the top of the page.

The SQL baseline for customers who report Server A is slower than Server B.

Disclaimer

When a customer claims that one machine is slower than the other, there is always the possibility the customer has an actual baseline. Usually, however, the claim indicates they don’t know what a baseline is.

A baseline is a collection of metrics about the server taken at greenfield time, when the server is first deployed with SQL. Future claims of a slow server should then be measured against the server itself, not another server.

When a person wants to compare two servers, it is almost an impossible ask. It’s like asking us to explain why two people do not complete a personality test the same way. From a support standpoint it is a fruitless pursuit, and trying to fulfill the request often creates a bad customer experience.

The goal of this process is to give Support and the customer a way to meet on common ground. The customer’s claim that the server is slow may as well be translated into “the data on my servers does not match!” And they are correct. But we don’t support data; the key word is data. The question of “slower” pulls us into the customer’s data sets.

This process gives us a way to use our own data set. The advantage cannot be overstated: we will be able to tell them definitively whether one machine is slower or not.

Once you accept that one machine is generally slower, do not underestimate this result as the customer re-introduces production elements. If the baseline test shows a machine 20% slower, then any difference beyond 20% is due to the specific workloads introduced by the customer; for example, if production runs 45% slower on that machine, roughly 25 points of the gap belong to the workload, not the machine. All of the SQL subject-matter experts have known this, but we all spend weeks trying to find the leverage to prove it. Without an absolute, we could not substantiate that claim, and these cases used to last for months. The method below should cut them down to a two-day case at most.

 

The Process

 

In the following test for SQL you will see four files, which together compose a method of base-lining SQL performance without using biased data from the customer or a third-party company. The test avoids any reliance on caching or indexing, so it is a perfectly simple way to illustrate the capabilities of two machines.

The reason this test was devised is customer demand. Customers often ask us to compare two adjacent machines. Often these comparisons can only be done with apples-to-oranges methods, and the cases end up being a point of contention for the customer and for support teams. The goal of this test is to mitigate that disparity.

 

Here are the files you will need

Figure 1. Files you will need

 

SupportBaseline.xlsx

The Excel spreadsheet is to be filled out and returned to Support. We keep a master copy of this spreadsheet to monitor the script’s performance against a variety of machines and situations. Over time, we will have a database of how the script performs, on average, across a multitude of platforms. And the simple measure we are obtaining is time: how long does it take the baseline query to complete?

 

Test Parameters

The results of the script should answer the question: is my server really slower than average, or slower than another server? In order to do this, strict adherence to the rules must occur. The test must be run with all other operations terminated on the SQL Server: no antivirus running, no other applications running. Other than a baseline Windows machine with core applications and services running, the server should be running SQL with no client connections. In other words, the SQL machine needs to be out of production. There are columns in SupportBaseline.xlsx to record this; if the machine was in production, it will be noted in the analysis that the results may not reflect a true baseline.
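
Before each run, it is worth verifying the box really is quiet. This spot-check is not part of the baseline kit, just generic PowerShell:

# List the busiest processes and the running services, so you can confirm
# nothing unexpected is competing with SQL during the baseline run.
Get-Process | Sort-Object CPU -Descending |
    Select-Object -First 10 Name, CPU, WorkingSet
Get-Service | Where-Object Status -eq 'Running' |
    Select-Object Name, DisplayName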

Several baseline runs can be collected, with the single variable being the total number of rows the script will create. The default is set to 1 million, and a million rows is the recommendation on average. However, depending on how powerful the server is, or how much downtime you are allowed, you can adjust this variable to fit your needs. CreateSupportTest.sql is the file where this change is made; see below.

 

Figure 2. Where to adjust how long the script will run

How long do I run the initial test?

As a general rule, 1 million rows should take less than 15 minutes on a reasonable SQL Server. However, performance degrades fast: a SQL VM with only 3 GB of RAM took 121 minutes to run the query. So make the first run 100,000 rows, then multiply its completion time by ten; that estimate is how long a million rows should take. For example, if 100,000 rows completes in 90 seconds, expect roughly 15 minutes for a million. From there, you can judge how many rows to choose, depending on how long you want the query to take.

Process

  1. Determine how long you want to run the query. Follow “How long do I run the initial test?”
  2. Set the value of the number of rows. Follow “Test Parameters.”
  3. Record the initial values of the server in the Excel spreadsheet SupportBaseline.xlsx.
  4. Run the query named CreateSupportTest.sql (here is a how-to if you need it).
  5. Record the results in SupportBaseline.xlsx; enter the start and stop times and it will auto-populate the execution time.
  6. Repeat as necessary, populating the spreadsheet and returning it to Louis Reeves in Support. He keeps the overall list of how the query runs in several different scenarios and can tell you how your results compare to other machines running the same query.
  7. When you are finished testing a server, there are two cleanup scripts. Run DropSupportTest.SQL (here is a how-to if you need it).
  8. Then run DropSupportMaster.SQL (here is a how-to if you need it).
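
If you prefer to script the run, here is a minimal sketch of steps 4, 5, 7, and 8. It assumes the SqlServer PowerShell module is installed and that the .sql files from the download sit in the current directory; the spreadsheet still does the official bookkeeping.

# Run the baseline and capture a clean start/stop time for SupportBaseline.xlsx.
Import-Module SqlServer

$elapsed = Measure-Command {
    Invoke-Sqlcmd -ServerInstance 'localhost' `
                  -InputFile '.\CreateSupportTest.sql' `
                  -QueryTimeout 0    # 0 = no timeout; long runs exceed the default
}
'Baseline run took {0:N1} minutes' -f $elapsed.TotalMinutes

# Cleanup when you are finished testing (steps 7 and 8):
Invoke-Sqlcmd -ServerInstance 'localhost' -InputFile '.\DropSupportTest.SQL'
Invoke-Sqlcmd -ServerInstance 'localhost' -InputFile '.\DropSupportMaster.SQL'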

That’s it. Now, you can complicate things by running tools like DiskSpd against these machines, but it is best to keep it simple and stay with the program laid out here. If you do want to look at DiskSpd, go ahead and read “The Fallacy of Performance, or: Are You Bringing Your Support Agent Apples or Oranges?” That article will help you plan your DiskSpd commands. So here you really have two ways of testing the claim of “slow”:

 

I hope this series of articles is helpful in troubleshooting issues with model data.

Louis.

 

 

 

 

 

The Fallacy of Performance, or: Are You Bringing Your Support Agent Apples or Oranges? VM Virtualization Performance with DiskSpd.

The Fallacy of performance.

You may well already know this. But my experience tells me that we all group things together naturally, and sometimes the performance issues we find are really assertions made from one piece of evidence. This kind of performance claim is generally hard for a support agent to frame. Not that your case won’t be worked on; it will. It just may take support teams hours or weeks to get to the truth of your statements.

I have to write this because it is so prevalent. When someone calls me wanting to open a support case, I generally try to reduce the case to some truthful statements which I can prove, disprove, or alter.

However, one case type that does not fit into such neat lines is the virtualization performance case. Rather than describe it in computer terms, let’s use the American auto makers Ford and Chevy.

Baselines Matter

I studied Ford and Chevy specifications for months, and I know the performance characteristics of each very well. Let’s say I own a Ford and am now looking for a second Ford of equal specifications. I purchase that second car, and on the way home I find that the new car does not seem to meet the specifications of the first!

Of course I must call the car company and complain that one car is not as fast as the other, or not as quick to brake, or some other specification. How about this: the air conditioning is not as cold in one as it is in the other! This is what happens in performance calls.

Just so you know, we technically should not even entertain these types of questions. But in support we do, to some degree, because we want to help, and we’re not sure what you’re showing us yet. We don’t see the pieces for several calls; you can’t force it, it just takes time. This is because you are asking us to form a relationship between ideas that are not related. Two cars are not related; two computers are not related.

 

How about taking a cross-country trip? One car crossed the country in three days; the other took four. At this point you may be seeing my point: trying to get down to the difference between two different items of any type can be like comparing apples to oranges. What’s even worse is when one of the items is off limits. So my Ford is definitely slower, but my wife takes the other Ford to work, so we really can’t use that one for testing! Now what do we do?

Basic Performance assumptions

So there are some simple rules that you should apply to any performance problem:

 

  1. The performance should be documented and repeatable.
  2. More than one test should be run, and simple is usually more realistic.
  3. Tests should be standardized, down to a science, so that if applied to another matching scenario, you would expect similar results.
  4. Keep the time down to a short test. The longer the test, the more variables can be introduced.
  5. Do not focus on two separate car models not performing the same; find a way to establish a baseline for what a reasonable car should do, then prove or disprove your baseline.

 

Now obviously, the complexity of computers can result in more rules, but if you follow these basics you can at least find some sanity in your test results. In fact, Support has an absolute need for this to happen. It is very possible that nothing is really wrong, but we won’t know until we get down to brass tacks.

So real world

Here is the real-world version: someone calls into support and reports that when they run this command on one machine, things are fine, but when they run it in the other environment, things fail. The command is DiskSpd, the replacement for SQLIO. I really like this tool.

 

  • diskspd -b8K -d30 -o4 -t8 -h -r -w25 -L -Z1G -c20G D:\iotest.dat > DiskSpeedResults.txt
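
For reference, here is what each switch in that command asks for, per the DiskSpd documentation (double-check against the decoder sheet for your version):

#   -b8K    8 KB block size               -d30    run for 30 seconds
#   -o4     4 outstanding I/Os per thread -t8     8 threads per target
#   -h      disable software caching and hardware write caching
#   -r      random I/O                    -w25    25% writes / 75% reads
#   -L      capture latency statistics    -Z1G    1 GB random-content write buffer
#   -c20G   create a 20 GB test file if it does not already exist
diskspd -b8K -d30 -o4 -t8 -h -r -w25 -L -Z1G -c20G D:\iotest.dat > DiskSpeedResults.txt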

 

However, what is hiding in this statement violates all five rules above. It is an assertion based on one command. Furthermore, you ran the command in the test location over and over, while other VMs were also running and randomly creating a random pattern of storage fragmentation, whereas in the production environment it was run only once, in a very controlled situation. These commands were not run in a scientific fashion.

It literally took me days to think of a way to baseline this situation and test it correctly. This is where the five rules came from; I think they are solid rules for support to go by. So here is how you test to make your case to Support:

User Guide and Product here

Introduce a baseline. Anything is better than nothing.

The DiskSpd command above is complex and long. Come up with some simple tests instead, and run more of them over time. Second, run your commands on a laptop or desktop with a specific RAM, storage, and processor profile. Once you have recorded all the results on the client machine, duplicate the test in the virtual machine. Make sure it is the only virtual machine running, and that nothing is running on the host but this one VM with specific resources.

Below, I am not giving you results; I am just giving you the commands, along with some instructions on how to use DiskSpd. I am also leaving you the articles that VMware and Microsoft Hyper-V point to when asking for baseline testing. Notice how many little requirements they have. Seem familiar? There is a reason for this: we are all trying to be scientific.

Tests to establish a Baseline.

  • .\diskspd -c100M -d20 c:\test1 d:\test2
  • .\diskspd -c2G -b4K -F8 -r -o32 -W30 -d30 -Sh d:\testfile.dat
  • .\diskspd -t1 -o1 -s8k -b8k -Sh -w100 -g80 c:\test1.dat d:\test2.dat
  • .\diskspd.exe -c5G -d60 -r -w90 -t8 -o8 -b8K -h -L
  • .\diskspd.exe -c10G -d10 -r -w0 -t8 -o8 -b8K -h -L d:\testfile.dat
  • .\scriptname.ps1
  • Same as above- second location
  • .\Diskspd -b8k -d30 -Sh  -o8 -t8  -w20 -c2G d:\iotest.dat

 

This list will generate about 15 unique results. Any of these will run on a laptop or a server. Just make sure you read the parameter decoder sheet available with the product.
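
If you want to run the list unattended, a hypothetical wrapper like this keeps one output file per test. The two command strings shown are just the first two from the list above; fill in the rest the same way.

# Run each baseline test and save its output for comparison. Assumes
# diskspd.exe is in the current directory and the test paths exist.
$tests = @(
    '-c100M -d20 c:\test1 d:\test2',
    '-c2G -b4K -F8 -r -o32 -W30 -d30 -Sh d:\testfile.dat'
)
$i = 0
foreach ($t in $tests) {
    $i++
    & .\diskspd.exe $t.Split(' ') | Out-File ("baseline_{0:D2}.txt" -f $i)
}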

The instructions are very simple: the specs on the Hyper-V or VMware VM must be the same as the laptop’s. My laptop has 16 GB of RAM and 8 processors.

The VM must be the only one running, and the OS should be a fresh install. If the results of testing are in the ballpark of your comparison client, then you are not having a performance issue.

The moral of the story is: test from different perspectives, and use the scientific method as much as you are able.

I hope this is helpful in your troubleshooting.

 

A few other Details

Here is a way to manually pre-create the test files if desired (the three sizes below are roughly 20 MB, 2 GB, and 20 GB):

  • fsutil file createnew d:\iotest.dat 20000000
  • fsutil file createnew d:\iotest.dat 2000000000
  • fsutil file createnew d:\iotest.dat 20000000000

Here are some of the best articles on storage and I/O online right now. I was surprised that so many storage-performance resources are all in one place.

This could be an important point: if you came to this site because your numbers are not matching reality, your monitoring tools may not be collecting the right Perfmon counters. In that case you may need the Hyper-V performance script to see your actual VM numbers; try using this tool, running it on the host while using DiskSpd in your VM.

Do not run more than one instance of DiskSpd at once! This will invalidate your tests!

 

Finally, as promised, here is how  VMware or Microsoft  handle these issues:

 

Louis

Microsoft Licensing Issues May Require a Tool Called MGADiag, or the Web Version of Microsoft Genuine Advantage!

Good day,

 

A friend of mine hit me up looking for an application called MGADiag. Wow, what an old tool! But yes, I still have a copy, and I sent it to him. After review, I decided not to attach a copy to this article. I did post a link to it, but I want to let people know they should really look at the new web-based tool (here). The new tool does not do the same thing at first glance, but on further review, the web page relies on a plugin which seemingly collects similar information.

We don’t have much choice on newer systems. Just look at Figure 1: MGADiag doesn’t work too well with Windows 10. Other tabs have worse errors, but some tabs work OK.

 

Figure 1.

Reason for most failures

Hey, in its time this was a great tool. It collected a lot of the licensing information, all in a few tabs. Great for Windows XP, and maybe Vista and Windows 7. Beyond that, caveat emptor!

It looks like MGADiag was retired for a reason. So my evaluation of MGADiag in 2017 is that it is in need of a revamp.

Well, the revamp is basically this article and the web tool, here.

Basically, there is a 90% chance your activation issue is captured in this simple document (here). Did you already activate the key from this server? One license equals one machine. These are basic tenets that will tell you whether you are genuine or not.

So my licensing is really broken

If everything checks out against the article, then go ahead and check your licensing with the Genuine Advantage web tool. If that checks out too, then you can check with Microsoft.

What to do about Activation (MS)

If you end up with Microsoft, and you are not in the wrong, you should use the standard procedures to get activation and license issues fixed. So let’s get that out of the way.

So you can just do the right thing: go to a command line and type SLUI 3 or SLUI 4. These launch command-line activation and voice (phone) activation, respectively. When voice activation answers the phone, you will be made to explain why you think your license is genuine. This may require a supervisor; apparently only the supervisors have the information to make a determination, or perhaps only the supervisors have the authority to fix a wrong determination.
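
For convenience, here are those commands; the slmgr line is my addition, and it is just the stock licensing script, handy for seeing your activation status before you dial:

slui 3        # opens the enter-a-product-key / activation dialog
slui 4        # starts phone (voice) activation
slmgr /dlv    # optional: detailed license and activation status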

So here is the thing: Microsoft can see all license keys, and they can see whether they are activated. Your story needs to explain why a license would show as already activated in their system. If your story makes sense, and you don’t mind a deactivation in order to activate your machine, or multiple activations are justified for the key you are using, they will generally make good on your license.

To conclude, we have MGADiag, the Genuine Advantage website, and SLUI as tools we can use to work on licensing issues.

But be careful! MGADiag is no longer a public Microsoft download.

For example, look at this link: it is wrapped in another application completely! Beware!

Here is the newer tool, in all its glory:

 

And finally, another blog example of the basic activation steps, as they have not changed much over the years:

 

Louis Reeves

Windows Performance Recorder, Xperf123 and CLUE all collect ETW traces for use with Windows Performance Analyzer!

Good Evening.

I wanted to make a quick article for those support cases where I need to perform an analysis on the issue in a way that will let me see the data set in the most creative way possible.

I do think you will prefer the graphical interface method of doing this, but the site where it is hosted is going to close down at some point. So I will be attaching a link to the download, in case it becomes a lost web site.

Actually, there are a couple of tools we should be aware of. So this article is about ways to use Xperf to collect logs for support evaluations. Specifically, I am calling out four: the command line, the core Windows Performance Recorder, and two additional tools.

XPERF

All of these tools require Xperf to be installed. It is part of the Windows Performance Toolkit, which also contains Windows Performance Recorder and Windows Performance Analyzer. The truth is you can just run Windows Performance Recorder, and that alone will achieve the objective of this article. But you can’t just let the recorder run in perpetuity: there is a hit to the system while it runs, and it will eventually fill up your hard drive.

Not that the other tools have methods which are any better. The main thing you need to know is that you must monitor resources and know when to start and stop these tools yourself. They can be dangerous if not used by an IT person with experience. The bottom line is: use caution!

 

Command Line

For the command-line option, I am just going to show you how to start and stop a trace and obtain the log (ETL) file. You will then return the file to the support department, and they can give you an analysis.

Some examples of commands which will result in a file you can give to your support team:

  • List available kernel groups
    • Xperf -providers KG
  • Start trace
    • Xperf -on DiagEasy
  • Stop trace (and generate the ETL file)
    • Xperf -d trace.etl
  • Display trace
    • Xperf trace.etl
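
Put together, one complete session looks like this (assuming the Windows Performance Toolkit is installed and xperf is on the PATH):

xperf -on DiagEasy    # start kernel tracing with the DiagEasy group
# ... reproduce the problem here ...
xperf -d trace.etl    # stop tracing and write the merged trace.etl
xperf trace.etl       # open the trace in the viewer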

Well, that was easy, wasn’t it? The rest of this is not too much harder. This is some complex stuff, but we want to make it easy to collect if possible. The next tool on the block is Windows Performance Recorder. I will not spend any article time on this one; it is the simple next-next-finish Windows method, and you can do some searching on the internet if you need a few screenshots.

Now to the meat of the show: two tools I think you may find helpful, Xperf123 and the Clue tool. Clue stands for Collection of Logs and the User Experience. Xperf123 is at CodePlex, but they say CodePlex is closing, so I will not include links to their site; instead, I will have a copy of the tool in this article. Xperf123 Download

We are starting with Xperf123, the tool from CodePlex. Download it here – Xperf123 Download

This tool will let you form the syntax of a shell command to start and stop a log collection. It allows for all the variables you would want, like circular logging, etc.

The basic article that was on the CodePlex site is reproduced below for your convenience, in case the CodePlex data disappears from the internet:

XPERF123

Project Description
This tool is used to automate the process of collecting xperf traces easily, without the user worrying about the various settings and configuration options.

UPDATE
The tool does not package XPerf.exe, perfctrl.dll, xbootmgr.exe, xbootmgrSleep.exe or xperfview.exe. Please download the Windows Performance Toolkit separately from http://msdn.microsoft.com/en-us/performance/cc752957 and then run this tool from the same location as the files.

Why this tool?
Collecting ETW traces was never this easy. With this new utility, xperf/xbootmgr logs can be collected without breaking a sweat. Just a few clicks and the required data gets collected; you no longer need to enter complicated commands. Just select the kind of data/monitoring you desire and XPerf123 is going to get that data for you, just like 1 – 2 – 3.
It also creates a simultaneous Perfmon collection running at a 5-second interval.

System Requirements
.NET Framework 3.0
Administrator rights on the machine.
Windows 2003/Windows Vista/Windows 7/Windows Server 2008/Windows 2008R2.

So how do I use it?
1. Follow the wizard interface of the tool.
2. From the drop-down menu, select the kind of trace you want to capture.
3. Click the Start button.
4. Reproduce the issue.
5. Click the Stop button.
6. The file will be created in the same location as XPerf123.exe.

Main features
– In Normal mode, the default parameters for BufferSize, MinBuffers and MaxBuffers are 1024.
– It can be customized with advanced settings.
– There is an option to log the trace file in circular mode, which is enabled by default. If required, it can be unchecked.
– Logs are created in the same directory by default.
– We can also save the logs to a different location than the one we run it from.
– It also creates a perfmon counter and starts it when we start the xperf capture.
– If Perfmon was also collected, the Perfmon logs are located in the C:\PerfLogs\ directory with the name perflognnnnnn.blg
– If we select stack walk, the default stack walks for the respective traces will be enabled unless the user manually selects the stackwalk parameters. This is beneficial for someone who wants to do stack tracing but doesn’t know which options to select for the stack walk.
– The creation of the registry values and the reboot prompts for stack walks have been automated. In the next build, I will try to log that information as well, so that we know which registry keys were modified or created.
– Advanced options in the xbootmgr parameters set the Buffer Options and the Enable property.
– The Pool Trace will only work if we are using a version of xperf that supports the feature.

What do I need to get started
We need to have all of these files in the same directory as xperf123.exe:
XPerf.exe
perfctrl.dll
xbootmgr.exe
xbootmgrSleep.exe

Unless necessary, the General option should be able to get all the required information.
The program is designed to auto-elevate, but if it is not getting the required results, please try running it as an administrator.
For reviewing XPerf logs, we need xperfview.exe.

Figure: Starting up Xperf123.exe

Figure: Select the kind of data collection you need

Figure: Enable Perfmon logging (if you want)

Figure: And we are done. Click Start to start the capture

CLUE TOOL

Now we have one tool left, and it is the newest I have seen. It collects logs when there is a problem on the system, which makes it a good tool under some circumstances. This tool is Clue:

Clue stands for Collection of Logs and the User Experience. It is an automated way to collect the logs only while the issue is occurring. This is helpful, because the log collection itself can be part of a slowness or latency problem.

 

Requirements for this tool:

 

  1. Download the tool from – http://aka.ms/ClueTool
  2. Download and install the Windows Performance Toolkit (WPT)
  3. The toolkit can also be installed during setup; see the Clue Usage Guide.docx
  4. Right-click the zip file, choose Properties, and choose Unblock
  5. Unzip to a long-term location
  6. Run the Setup.bat file with admin rights

 

All features of the application run out of the C:\ProgramData\Clue directory. If you need it to run from a different directory, change the config.xml file.

Output files will be located at \Microsoft\Windows\Clue\IncidentFolderManagement, again unless you specify otherwise in the config.xml file.

The bottom line is there are two things you want to check out. One is the scheduled tasks, which start with CLUE_. Make sure they meet your needs as to when data is collected and for how long; see the snippet below.
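
A quick way to review those tasks without digging through Task Scheduler (assuming a Windows 8 / Server 2012-era or newer machine):

# List the CLUE_ scheduled tasks and their state:
Get-ScheduledTask | Where-Object TaskName -like 'CLUE_*' |
    Select-Object TaskName, State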

The second is the config.xml file. You can set many things before the install, which saves you from making multiple changes after the install.

Below is what you will see in the scheduled tasks in Windows. Inside the CLUE folder, you will see the tasks you can change to meet your needs.

This is a great tool, in that you have some control over when and why the log collection runs, and it can even survive a reboot. That makes it a great tool when you don’t know when the problem is going to occur.

To conclude, I have presented four ways you may get an ETL log collected and ready to send to your support person. If you have any issues, call your support team and they should be able to help you out with it.

Windows Performance Recorder, Xperf, Xperf123, and Clue all try to do the same thing. However, it is our way of having many ways that makes us a great country!! Well, maybe a great world, because I am certain the players behind these tools are quite diverse. Indeed, hail diversity! And hail Molvania!
Louis

Getting Accurate Latency from Dynamically expanding Hyper-V virtual Machine Disks

 

This article is about the tool called Hyper-V Performance Monitor Tool (PowerShell).

You can download it from the TechNet article down the page, or use the link above.

Hyper-V gets thrown around loosely these days when you talk about virtualization, performance tuning, planning, or any other aspect of the product life-cycle of a new host deployment.

Over the last few years, we have made rapid changes from physical host machines for production workloads to these virtual monstrosities that now host our whole company.

Along with this change, you may recall that early Hyper-V documentation gently let us know that monitoring inside the virtual machine was not going to give results at parity with the physical counters, depending on configuration. This is for a few reasons that are beyond this article’s scope, but I would like to shine a light on it so more people can think differently about their virtual performance.

The most common measure of how well a server is performing is latency in milliseconds. Everyone is most concerned with how much latency is in the storage system, perhaps with good reason. SAN storage vendors can perform so fast nowadays that you can throw the Empire State Building at a server and the latency is less than 10 milliseconds (ms). Or is that throughput?

To be clear, we are interested more in latency than throughput. Latency should be minimized; throughput will then generally increase.
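
If it helps to keep the two straight, Little’s Law ties them together: outstanding I/Os = IOPS × average latency. The numbers below are my own example, not a measurement from any tool in this article.

# At a queue depth of 32 sustaining 8,000 IOPS, average latency must be ~4 ms.
$queueDepth = 32
$iops       = 8000
$latencyMs  = ($queueDepth / $iops) * 1000
'Average latency: {0:N1} ms' -f $latencyMs    # -> Average latency: 4.0 ms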

Can I make a case that Counters are not reliable?

Well, let me tell you: actual latency is not as easily obtained as you would think. If I told you that your tests are lying to you, would you believe me? Let’s say you’re in shock. Without knowing your design, my generalized answer has a very high chance of being correct. The issue is that you have a SAN, and you are trying to measure latency against a file whose I/O goes through four different filters and then waits to be queued to a disk subsystem that is always expanding. I promise you: your numbers are incorrect.

If you cannot quantify how latency and throughput are different but related, then I would say you should keep reading.

Storage Latency of VM guests

There are many problems with calculating storage latency, but disk is the model we are going to use to illustrate how tricky it can be to find out how your VM is performing.

The most common approach to getting latency information is a command-line tool, and normally the tool works fine. The model breaks down when the disk itself is changing, along with the RAM and processor availability. The bottom line is that a virtual machine may lie to you about resource numbers at any given time. Add to the mix that the clock cycle is a weakness in any virtualization platform, which means the calculation of time itself can produce poor results: good math run on bad numbers.

There is a crowd of you who will say that is bull. All I can say is: don’t read this, and good luck solving your latency issues.

Let me list some areas where the numbers may go awry. I am just making a one-line explanation with a link so you can read more; I don’t want this to be about the problems. Below, I talk more about the solution. Read more if you have a specific issue:

 

I could keep going. Do you get the feeling there are a ton of variables that change how storage latencies should be calculated?

From my experience, I have found that every set of servers is its own data set of network behavior. There are some basic assumptions I have found to pass along to admins who want to find out the latency of virtual machines.

Guidelines for VM latency Study

Who to Blame

So again, the basic message is that the calculation of latency is totally based on the sum of the deployment factors. In one data center you may find under-reporting, in another over-reporting. Support agents do not carry the onus to prove why one is slower than the other. We will look at your design and deployment and try to build a story from the things we can identify; it is not likely we will find the exact moment the deployment deviated from your baseline storage latency measurements. We offer best effort, but we encourage you to strip down your deployment and record a core baseline latency for a dynamically expanding VM. All VMs will be compared against that one, and we go from there.

Using the Stop Gap solutions for Monitoring Virtual Machines

So, just a few years ago this issue with VM monitoring was not easily remedied. You could certainly use the Perfmon counters to get VM stats, but customers just want to run DiskSpd or SQLIO and get an output to look at. That did not exist for quite some time. Thankfully, there is now a script that carves out some parity with those tools. The link is at the TechNet Gallery:

Gallery.TechNet.Com

Hyper-V Performance Monitor Tool (PowerShell)

Below is a walk-through of a basic performance collection.

You just run the script from an admin PowerShell. There are a few ways to run it:

.\Monitor-HyperVGuestPerformance.ps1

### export data to csv via GUI, defaults to current dir
.\Monitor-HyperVGuestPerformance.ps1 -ExportToCsv

### retrieve data as PSObjects, great for parsing and logging; -Name is optional and defaults to automatic discovery
.\Monitor-HyperVGuestPerformance.ps1 -PSobjects

### specify hosts and interval/samples manually
.\Monitor-HyperVGuestPerformance.ps1 -Name host1,host2 -PSobjects -Interval 2 -MaxSamples 5

### accepts pipeline input
'Host1','Host2' | .\Monitor-HyperVGuestPerformance.ps1 -PSobjects

### log to SQL Server with Write-ObjectToSQL; this example uses SQL auth
.\Monitor-HyperVGuestPerformance.ps1 -PSobjects | Write-ObjectToSQL -TableName table -Database db -Server server -Credential (Get-Credential)

 


If the domain connection fails, it tries a local connection:

 

[Screenshot: falling back to a local connection]

In my case, I ran the tool on the host, and the GUI below popped up. All I did was hit Monitor, and I got an exported vm_perfmon_stats file. This file can be used to find your latency.

[Screenshot: the tool’s monitoring GUI]

While this method may not be pretty, it does follow the rules for a Hyper-V guest. The main purpose of this tool is to use it instead of SQLIO or DiskSpd; tools like those should be used for hardware testing. A Hyper-V server running on iSCSI shared storage with two VHDX files attached will likely come back with erroneous latencies from those tools. This script may not be perfect either, but I do believe you will see a consistent result that is not a totally unbelievable number.

See, I changed the sample and interval:

[Screenshot: adjusted sample and interval settings]

And I get a time-frame to wait for the test results:

[Screenshot: the countdown to test results]

Find the link at the Microsoft TechNet Gallery. Thank you for taking the time to read about storage performance for Hyper-V virtual machines.

I hope this helps in your Baseline Studies.

The result is a nice little Excel display of the data, which I cleaned up a little by adding colors to the Excel fields.

 

[Screenshot: the Excel output]

Louis