Working diligently as any good DBA might, you have established maintenance routines for each of the SQL Servers under your purview.
As a part of this maintenance you have scripted solutions to intelligently manage and maintain the fragmentation levels for each of the indexes within each database on each instance.
To further show how diligent you are as a DBA, the outcomes of each maintenance run are logged and you review the logs each morning. This routine helps keep you on top of everything that is happening within the environment.
For months, maybe even years, things are running smoothly. Never a failure. Never an error. Just routine log review day after day. Then one day it happens – there is an error. The index maintenance script failed one night.
You receive the error message similar to the following:
Msg 2552, Level 16, State 1, Line 1 The index “blah” (partition 1) on table “blah_blah_blah” cannot be reorganized because page level locking is disabled
Immediately, you start double-checking yourself and verifying that it worked the previous night. You even go so far as to confirm that the same index was previously reorganized. How is it possible that it is failing now on this index. What has changed? Has something changed?
Time for a little digging and investigating, so the dirty work begins. On one side of the coin you are relieved to e able to do something different. On the other side of that coin, you are rather annoyed that something seems to have changed. These feelings are perfectly normal!
First things first – you investigate the indexes in question to confirm what the error is saying. This is easily done with a query such as the following:
SELECT i.name AS IdxName
, SCHEMA_NAME(o.schema_id) AS SchName
, OBJECT_NAME(i.object_id) AS ObjName
FROM sys.indexes i
INNER JOIN sys.objects o
ON o.object_id = i.object_id
WHERE OBJECTPROPERTY(i.object_id,'IsMSShipped') = 0;
Scrolling through the results, you notice (eventually) that the IX_SpecialOfferProduct_ProductID in the AdventureWorks2014 database has page locks disabled. You are absolutely certain that this index was created allowing page locks. Pondering the problem for a moment, you recall having read about the default trace (there are several articles on the default trace – here) and the thought occurs to try and see if there is a breadcrumb there about the change. Using the query from that default trace article, a picture starts to unscramble. Here is that query reposted and a snippet of the results:
DECLARE @DBName sysname = 'AdventureWorks2014'
, T.TextData AS SQLStatement
, CASE EventClass
END AS DDLOperation
( SELECT REVERSE(SUBSTRING(REVERSE(path),
CHARINDEX('\',REVERSE(path)),256)) + 'log.trc'
WHERE is_default = 1)), DEFAULT) T
WHERE EventClass in (46,47,164)
AND EventSubclass = 0
AND ObjectType <> 21587-- don't bother with auto-statistics as it generates too much noise
AND DatabaseName = @DBName;
SELECT @d1 = MIN(StartTime)
SET @diff= DATEDIFF(hh,@d1,GETDATE());
SELECT @diff AS HrsSinceFirstChange
, @d1 AS FirstChangeDate
, tt.StartTime AS LastChange
, sv.name AS obj_type_desc
, tt.ServerName,tt.LoginName, tt.NTUserName
, (dense_rank() OVER (ORDER BY ObjectName,ObjectType ) )%2 AS l1
, (dense_rank() OVER (ORDER BY ObjectName,ObjectType,StartTime ))%2 AS l2
FROM #temp_trace tt
INNER JOIN sys.trace_events AS te
ON tt.EventClass = te.trace_event_id
INNER JOIN sys.trace_subclass_values tsv
ON tt.EventClass = tsv.trace_event_id
AND tt.ObjectType = tsv.subclass_value
INNER JOIN master.dbo.spt_values sv
ON tsv.subclass_value = sv.number
AND sv.type = 'EOD'
ORDER BY StartTime DESC;
DROP TABLE #temp_trace;
This is a great start. Not seen in the results is the timestamp showing when it was done – which was due solely to snipping. Also not shown is the text of the statement that was run for those three events. So close, yet so far away. This is not quite enough to have the smoking gun evidence to show Jason (me) that I did something wrong and unauthorized. What to do now?
All is not lost yet! Your stunning memory kicks in and you recall several articles about using Extended Events to audit server and database changes. Better yet, you recall that you deployed an XE session to the server where this error occurred. And yes, you are slightly embarrassed that you failed to remove the XE session after fiddling with it. We won’t tell anybody that you deployed a test XE session to a production server as if it were your sandbox. The session currently deployed is trapping all object changes unlike the following session that has it filtered down to just objects that are indexes.
-- Create the Event Session
IF EXISTS ( SELECT *
WHERE name = 'ObjChangeAudit' )
DROP EVENT SESSION ObjChangeAudit
EXECUTE xp_create_subdir 'C:\Database\XE';
CREATE EVENT SESSION ObjChangeAudit ON SERVER
ADD EVENT sqlserver.object_altered ( SET collect_database_name = ( 1 )
ACTION ( sqlserver.sql_text,sqlserver.nt_username,sqlserver.server_principal_name,sqlserver.client_hostname,
WHERE database_name = 'AdventureWorks2014'
AND object_type = 'INDEX'
OR object_type = 'OBJ'
ADD TARGET package0.event_file ( SET filename = N'C:\Database\XE\ObjChangeAudit.xel' )
WITH (STARTUP_STATE = OFF
,TRACK_CAUSALITY = ON);
/* start the session */
ALTER EVENT SESSION ObjChangeAudit
STATE = START;
You query the trace file with a query like this:
SELECT CAST ([t2].[event_data] AS XML) AS event_data, t2.file_offset,t2.file_name, cte1.event_session_id--, '' AS event_predicate
FROM ( SELECT REPLACE(CONVERT(NVARCHAR(128),sesf.value),'.xel','*.xel') AS targetvalue, ses.event_session_id
FROM sys.server_event_sessions ses
INNER JOIN sys.server_event_session_fields sesf
ON ses.event_session_id = sesf.event_session_id
--INNER JOIN sys.server_event_session_events sese
-- ON ses.event_session_id = sese.event_session_id
WHERE sesf.name = 'filename'
AND ses.name = 'ObjChangeAudit'
OUTER APPLY sys.fn_xe_file_target_read_file(cte1.targetvalue, NULL, NULL, NULL) t2
SELECT event_data.value('(event/@name)', 'varchar(50)') AS event_name
'varchar(max)') AS DBName
'varchar(max)') AS DDLPhase
'varchar(max)') AS object_type
, event_data.value('(event/@timestamp)', 'varchar(50)') AS [TIMESTAMP]
'varchar(max)') AS SystemTime
'varchar(max)') AS ClientHostName
'varchar(max)') AS ServerPrincipalName
'varchar(max)') AS nt_username
'varchar(max)') AS sql_text
FROM #xmlprocess x
LEFT OUTER JOIN sys.server_event_session_events sese
ON x.event_data.value('(event/@name)', 'varchar(50)') = sese.name
AND x.event_session_id = sese.event_session_id
ORDER BY timestamp,event_data.value('(event/action[@name="event_sequence"]/value)',
DROP TABLE #xmlprocess;
Wow! Look at the results! There is a smoking gun finally.
Two results happen to pin the root cause of the change squarely on my shoulders. Yup, Jason changed that index to disallow page locks. I highlighted three different areas of interest in the results. The yellow and green indicate the DDL phase. One row for the start of the statement, and another row for the commit of that statement. The red highlight shows me the exact change that was made to this index. This is all very good info!
It really is great to have the smoking gun. If something is broke and it worked previously, it is essential to find the root cause. With a root cause under the belt, what needs to be done to fix the failure? That is a little bit easier that finding the root cause. That is, unless there is a technical reason for the index to no longer allow page locks (maybe that smoking gun is less of a smoking gun and more like baby spittle after all).
Here is how you fix the problem:
ALTER INDEX IX_SpecialOfferProduct_ProductID ON Sales.SpecialOfferProduct SET (ALLOW_PAGE_LOCKS = ON)
The Extended Events session would be very noisy and capture every alter index statement, right? It should capture statements like the following, right?
ALTER INDEX IX_SpecialOfferProduct_ProductID ON Sales.SpecialOfferProduct REBUILD
ALTER INDEX IX_SpecialOfferProduct_ProductID ON Sales.SpecialOfferProduct REORGANIZE
ALTER INDEX IX_SpecialOfferProduct_ProductID ON Sales.SpecialOfferProduct REBUILD WITH (DATA_COMPRESSION = PAGE)
The answer to that question is: Yes, Yes, Yes. These statements are all captured due to the use of the ALTER statement. Here are the results of the XE session with all of these scripts having been executed:
If you want to audit when the indexes are changing and how they are changing, this will do just the trick. If there are a ton of changes, then be prepared for a deluge of information.