Labels

Friday, 23 September 2016

Common Significant Wait types with BOL explanations

                                                          WaitTypes 

Network Related Waits

       ASYNC_NETWORK_IO :Occurs on network writes when the task is blocked behind the network

 Locking Waits

1. LCK_M_IX: Occurs when a task is waiting to acquire an Intent Exclusive (IX) lock.

2. LCK_M_IU: Occurs when a task is waiting to acquire an Intent Update (IU) lock.

3. LCK_M_S: Occurs when a task is waiting to acquire a Shared lock.

I/O Related Waits

1.ASYNC_IO_COMPLETION: Occurs when a task is waiting for I/Os to finish.

2. IO_COMPLETION: Occurs while waiting for I/O operations to complete.              This wait type generally represents non-data page I/Os. Data page I/O completion waits appear  as

 PAGEIOLATCH waits

3. PAGEIOLATCH_SH:Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Shared mode. Long waits may indicate problems with the disk subsystem.

4. PAGEIOLATCH_EX:Occurs when a task is waiting on a latch for a buffer that is in an I/O request.  The latch request is in Exclusive mode. Long waits may indicate problems with the disk subsystem.

5. WRITELOG : Occurs while waiting for a log flush to complete. Common operations that cause log flushes are checkpoints and transaction commits.

6. PAGELATCH_EX: Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Exclusive mode.

7. BACKUPIO:Occurs when a backup task is waiting for data, or is waiting for a buffer in which to store data

CPU Related Waits

1. SOS_SCHEDULER_YIELD:  Occurs when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.
2. THREAD POOL : Occurs when a task is waiting for a worker to run on.
  This can indicate that the maximum worker setting is too low, or that batch executions are taking  unusually long, thus reducing the number of workers available to satisfy other batches.
3. CX_PACKET: Occurs when trying to synchronize the query processor exchange iterator .You may consider lowering the degree of parallelism if contention on this wait type becomes a problem.

SQL Server : Slow running query resons

                                                    Query Performing Slowly

1) Identify the SPID of the query from Sys.sysprocesses

2) Find any blockings exists or not and if it exists find the LEAD Blocker.

3) Identify how many connections are there, Benchmark if exceeds is an issue

4) CPU Utilization, Memory Utilization, System Memory Information and how much SQL Server is consuming. If AWE is enabled or not.

5) Any jobs are running both at OS level and SQL Server level.

6) Affinity Mask for Processor and IO should be checked.

7) Query observation, Coding standards have been followed or not. Commit issued at regular intervals in the code.

8) Table statistics are outdated. Need to update stats.

9) Indexes are created on correct columns are not. Verify the query and check the columns listed under WHERE condition and ensure that there are indexes created.

10) Even if indexes are created, they are fragmented or not.

11) Run server side trace to track what other activities are currently being rolled out to find the cause of the issue. If approved Profiler also can be used.

12) Network Latency between Client and the Server can cause performance issue.

13) MAXDOP feature can be utilized to improve the parallel execution of the query.

14) If clustered instance and Active-Active (N+1/N+M) Multiple Instance. If all nodes fail the last surviving node has all instances running on it causing performance issue.

15 )  Check disk space availability for that database and also for system databases.

16 ) Verify if any Application or Database specific jobs are running or not.

17 ) Network bandwidth to be verified and also Storage SAN bandwidth when accessing data. Contact Network/Storage team.

18) Identify top 10 long running queries and see if they are causing the performance lag.

19) Ask requester for the Query. Verify the Estimated Execution Plan of the Query.

20) Check if any Table Scans are present in the Plan. Table Scans are very expensive from resource perspective.

Also verify the Load on the system, Example average load is 2500 but we could see 10,000 connections.

SQL Server DBA : Troubleshooting 100% Memory / Memory Leak(s)

                        Troubleshooting 100% Memory / Memory Leak(s):

ü  Verify Task Manager for basic understanding of Memory utilization by which Process.

ü  If memory is consumed by other processes contact respective team to get it fixed.

ü  If memory is consumed high by SQL Server follow below steps

 Verify SQL Server Error logs for any memory related errors

MTL Based Errors:

 i) SQL Server 2000

WARNING: Failed to reserve contiguous memory of Size

 ii) SQL Server 2005

Failed Virtual Allocate Bytes: FAIL_VIRTUAL_RESERVE

 iii)SQL Server 2005

Failed to initialize the Common Language Runtime (CLR)

BPool Based Errors:

 i) BPool::Map: no remappable address found.

ii) BufferPool out of memory condition

iii)LazyWriter: warning, no free buffers found.

BPool (or) MemToLeave errors:

i) Error: 17803 “Insufficient memory available..”

ii) Error: 701, Severity: 17, State: 123.

There is insufficient system memory to run this query.

b) If MTL is the reason for the Memory issue we have to determine  whether it is SQL Server or some non-SQL component that is using the most MemToLeave memory

Query:

select sum(multi_pages_kb)  from sys.dm_os_memory_clerks

In MTL if SQL Server Owned memory is very less ,then determine if there are COM objects, SQL Mail, or 3rd party extended stored procedures being used, and move them out of process if possible(or contact App Team).

c) If MTL is not the reason then we need to focus on BPool portion and who is occupying more in BPool.

To find out who is consuming more in BPool fire below query:

select  *  from sys.dm_os_memory_clerks order by  Single_pages_kb  desc

To calculate the BPool approximate usage size use below command:

select  sum(single_pages_kb) from sys.dm_os_memory_clerks

Extra Counters to Monitor:-

Monitor PLE(Perfmon Counter):-
PLE is the expected time a read page from file is maintained in Buffer Pool.

Monitor BCHR(Perfmon Counter):-

Buffer Cache Hit Ratio is the utilization ratio of Buffer Pool towards I/O operation. Microsoft recommends >95% for OLTP and >90% for OLAP.