I’ve got a customer who has deployed Direct Access quite a while ago. Something which we have observed for a while now is that the CPU usage of the servers is rather high. Some details about our setup: we got 2 Direct Access servers which are load balanced using Windows NLB. They are running Windows 2012 R2, have 4 vCPU’s and 8 GB of RAM. When troubleshooting this issue, we were seeing 400 active users, roughly 200 for each server. Here’s what the CPU usage looked like:
![image001 image001]()
As you can see sqlservr.exe is using 67% CPU. Now that’s quite a lot… I would hope a DA server had other things to do with it’s CPU instead of running an SQL instance. Now I know where this instance comes from. We configured inbox accounting on the Direct Access servers. This allows an administrator to pull up reports about who connected when to what resources. You can choose between Radius and Windows Internal Database (WID) for the auditing data targets. We choose the WID approach. We configured our accounting to hold data for 3 months. So I started wondering, is the SQL database instance having troubles with the amount of data? Or is there an issue with indexes that are fragmented or… In order to investigate this, we’d had to do some SQL talking to this instance. As it’s a WID instance, we can only talk to it from the box itself. So we can either install the SQL commandline tools or the SQL Management Studio. I’m not an SQL guru, so I prefer to do my troubleshooting using the SQL Management Studio. In order to determine what version you can use you can check the location for the sqlservr.exe binary:
![image003 image003]()
And from the details you can see that a WID on a Windows 2012 R2 is actually build 11.0.2100.60 which, if bing is correct, is a SQL 2012 edition.
![image005 image005]()
So I took the SQL 2012 iso and installed the SQL Management Studio on the DA servers. Watch out when going through the setup, we don’t want to install another SQL instance! Just the management tools. Here’s the string we can use to connect to the instance: \\.\pipe\MICROSOFT##WID\tsql\query
![image007 image007]()
After connecting to the instance we see that there’s only one Database (RaAcctDb) which has 4 tables. This query I found here: TechNet Gallery and also resembles the query that is presented here: KB2755960. To check all indexes for fragmentation issues, execute the following query:
SELECTOBJECT_NAME(ind.OBJECT_ID)ASTableName,
ind.nameASIndexName,indexstats.index_type_descASIndexType,
indexstats.avg_fragmentation_in_percent
FROMsys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,NULL)indexstats
INNERJOINsys.indexesind
ONind.object_id=indexstats.object_id
ANDind.index_id=indexstats.index_id
WHEREindexstats.avg_fragmentation_in_percent> 30
ORDERBYindexstats.avg_fragmentation_in_percentDESC
No indexes were returned. Another thing which I hear from time to time is rebuild “statistics”. So I checked them and I saw they were two weeks old. I figured rebuilding them couldn’t hurt:
useRaAcctDb;
UPDATESTATISTICSconnectionTable
WITHFULLSCAN
GO
useRaAcctDb;
UPDATESTATISTICSEndpointsAccessedTable
WITHFULLSCAN
GO
useRaAcctDb;
UPDATESTATISTICSServerEndpointTable
WITHFULLSCAN
GO
useRaAcctDb;
UPDATESTATISTICSSessionTable
WITHFULLSCAN
GO
Again no real change in CPU usage… Ok, back to the drawing board. I googled a bit for “high cpu usage SQL” and I found the following blog: http://mssqlfun.com/2013/04/01/dmv-3-what-is-currently-going-on-sys-dm_exec_requests-2/ One of the queries there is this one:
SELECT
R.SESSION_ID,
R.REQUEST_IDASSESSION_REQUEST_ID,
R.STATUS,
S.HOST_NAME,
C.CLIENT_NET_ADDRESS,
CASEWHENS.LOGIN_NAME=S.ORIGINAL_LOGIN_NAMETHENS.LOGIN_NAMEELSES.LOGIN_NAME+'('+S.ORIGINAL_LOGIN_NAME+')'ENDASLOGIN_NAME,
S.PROGRAM_NAME,
DB_NAME(R.DATABASE_ID)ASDATABASE_NAME,
R.COMMAND,
ST.TEXTASQUERY_TEXT,
QP.QUERY_PLANASXML_QUERY_PLAN,
R.WAIT_TYPEASCURRENT_WAIT_TYPE,
R.LAST_WAIT_TYPE,
R.BLOCKING_SESSION_ID,
R.ROW_COUNT,
R.GRANTED_QUERY_MEMORY,
R.OPEN_TRANSACTION_COUNT,
R.USER_ID,
R.PERCENT_COMPLETE,
CASER.TRANSACTION_ISOLATION_LEVEL
WHEN 0 THEN'UNSPECIFIED'
WHEN 1 THEN'READUNCOMITTED'
WHEN 2 THEN'READCOMMITTED'
WHEN 3 THEN'REPEATABLE'
WHEN 4 THEN'SERIALIZABLE'
WHEN 5 THEN'SNAPSHOT'
ELSECAST(R.TRANSACTION_ISOLATION_LEVELASVARCHAR(32))
ENDASTRANSACTION_ISOLATION_LEVEL_NAME
FROM
SYS.DM_EXEC_REQUESTSR
LEFTOUTERJOINSYS.DM_EXEC_SESSIONSSONS.SESSION_ID=R.SESSION_ID
LEFTOUTERJOINSYS.DM_EXEC_CONNECTIONSCONC.CONNECTION_ID=R.CONNECTION_ID
CROSSAPPLYSYS.DM_EXEC_SQL_TEXT(R.SQL_HANDLE)ST
CROSSAPPLYSYS.DM_EXEC_QUERY_PLAN(R.PLAN_HANDLE)QP
WHERE
R.STATUSNOTIN('BACKGROUND','SLEEPING')
The result:
![image009 image009]()
It returns one ore more queries the SQL instance is currently working on. It’s actually pretty easy and very powerful. The first record is a sample entry we care about. The others are me interacting with the SQL management studio. Scroll to the right and you’ll see both the execution plan and the actual query. Now how cool is that?!
![image011 image011]()
There we can get the query being executed (QUERY_TEXT)
CREATEPROCEDURE raacct_InsertSession(
@Hostname NVARCHAR(256),
@ClientIPv4Address BINARY(4),
@ClientIPv6Address BINARY(16),
@ClientISPAddressType SMALLINT,
@ClientISPAddress VARBINARY(16),
@ConnectionType TINYINT,
@TransitionTechnology INT,
@TunnelType INT,
@SessionHandle BIGINT,
@Username NVARCHAR(256),
@SessionStartTime BIGINT,
@AuthMethod INT,
@HealthStatus INT)AS
BEGIN
DECLARE @SessionId BIGINT
DECLARE @ConnectionId BIGINT
DECLARE @NumActiveSessions SMALLINT
IF (@SessionHandle ISNULLOR @SessionHandle = 0)
BEGIN
-- error (BAD PARAMETER)
RETURN (1)
END
IF (@SessionStartTime ISNULLOR @SessionStartTime = 0)
BEGIN
-- error (BAD PARAMETER)
RETURN (1)
END
SELECT @SessionId = 0
BEGINTRANSACTION
SELECT @SessionId = [SessionId]
FROM [dbo].[SessionTable]
WHERE @SessionHandle = [SessionHandle]
AND @SessionStartTime = [SessionStartTime]
IF (@@ROWCOUNT> 0)
BEGIN
-- error (session already exists)
ROLLBACKTRANSACTION
RETURN (2)
END
-- check if connection exists
SELECT @ConnectionId = connTbl.[ConnectionId]
FROM [dbo].[ConnectionTable] AS connTbl, [dbo].[SessionTable] AS sessTbl
WHERE sessTbl.SessionState = 1
AND connTbl.ConnectionId = sessTbl.ConnectionId
AND connTbl.Hostname = @Hostname
AND connTbl.ClientIPv4Address = @ClientIPv4Address
AND connTbl.ClientIPv6Address = @ClientIPv6Address
AND connTbl.ClientISPAddressType = @ClientISPAddressType
AND connTbl.ClientISPAddress = @ClientISPAddress
AND connTbl.ConnectionType = @ConnectionType
AND connTbl.TransitionTechnology = @TransitionTechnology
AND connTbl.TunnelType = @TunnelType
IF@@ROWCOUNT= 0
BEGIN
-- create connection record
INSERTINTO [dbo].[ConnectionTable]([Hostname],
[ClientIPv4Address],
[ClientIPv6Address],
[ClientISPAddressType],
[ClientISPAddress],
[ConnectionType],
[TransitionTechnology],
[TunnelType]
)
VALUES (@Hostname,
@ClientIPv4Address,
@ClientIPv6Address,
@ClientISPAddressType,
@ClientISPAddress,
@ConnectionType,
@TransitionTechnology,
@TunnelType
)
IF@@ERROR<> 0
BEGIN
-- error (failed to create connection), return from here
ROLLBACKTRANSACTION
RETURN (99)
END
SET @ConnectionId =@@IDENTITY
END
SELECT @NumActiveSessions =COUNT(SessionHandle)
FROM [dbo].[SessionTable]
WHERE [SessionState] = 1
SET @NumActiveSessions = @NumActiveSessions + 1
INSERTINTO [dbo].[SessionTable]([ConnectionId],
[SessionHandle],
[Username],
[SessionStartTime],
[AuthMethod],
[HealthStatus],
[NumConcurrentConnections]
)
VALUES (@ConnectionId,
@SessionHandle,
@Username,
@SessionStartTime,
@AuthMethod,
@HealthStatus,
@NumActiveSessions
)
IF@@ERROR<> 0
BEGIN
ROLLBACKTRANSACTION
RETURN (4)
END
COMMITTRANSACTION
END
The only thing I can see, with my limited SQL knowledge, is that potentially performance hits might occur on the where statements:
FROM [dbo].[SessionTable]
WHERE @SessionHandle = [SessionHandle]
AND @SessionStartTime = [SessionStartTime]
And
FROM [dbo].[ConnectionTable] AS connTbl, [dbo].[SessionTable] AS sessTbl
WHERE sessTbl.SessionState = 1
AND connTbl.ConnectionId = sessTbl.ConnectionId
AND connTbl.Hostname = @Hostname
AND connTbl.ClientIPv4Address = @ClientIPv4Address
AND connTbl.ClientIPv6Address = @ClientIPv6Address
AND connTbl.ClientISPAddressType = @ClientISPAddressType
AND connTbl.ClientISPAddress = @ClientISPAddress
AND connTbl.ConnectionType = @ConnectionType
AND connTbl.TransitionTechnology = @TransitionTechnology
AND connTbl.TunnelType = @TunnelType
The where statements act as filters and columns they use are often indexed. Without an index the SQL server would have to scan the complete table looking for the records. Now on smaller tables that’s not an issue but the SessionTable table contains 14.482.972 records!
![image013 image013]()
So if we check the indexes for that table, one would hope SessionHandle, SessionStartTime and SessionState to be present:
![image015 image015]()
The last one UQ_SessionT… seems to have both SessionHandle and SessionStartTime in it. So I guess that should satisfy the first where statement:
![image017 image017]()
Now what about SessionState? I can’t seem to find that one… Now back to our query that showed us the query being executed. There’s also an XML_QUERY_PLAN. It’s clickable in the Management Studio:
![image019 image019]()
See how this query cost shows 50%? Further down there’s another Query that shows the other 50%. Both show “missing index”:
![image021 image021]()
As previously stated, I’m not an experienced SQL engineer/DBA. I try to crosscheck stuff I find online before applying it. Also I wouldn’t do this kind of stuff on a FIM Service or SCCM database. Those are pretty complex databases. But I made a personal assessment and the Direct Access auditing database seems simple enough to tinker with it. So I decided to give it a try and create the index. Undoing this is pretty straightforward, so I guess there’s no real harm in going forward. Right-click one of the existing indexes and choose Script Index as > CREATE To > New Query Editor Window
![image023 image023]()
Simply change both the Index name and the column to “SessionState”. And execute the query. After refreshing the UI you can see the index:
![image025 image025]()
And there goes the CPU usage:
![image027 image027]()
Conclusion: to me it looks like the DA team just forgot this particular index. From the other indexes you can tell they actually did something for those. I’m not really sure why we didn’t just log a case with Microsoft. Partially I guess because we were afraid/guessing we’d get the answer: by design with that amount of auditing data. But after this troubleshooting session we can clearly see there’s shortcoming in the SQL database setup. As with most stuff you read on the internet: be careful when applying in your environment. If you do not know what commands/queries you’re executing, look them up and do some background reading.