Concept

When designing High Availability for database it is important to set the practical expectations of RPO and RTO, this ensures there is minimal impact for critical applications when database environment is restored. Thus it is always recommended to follow best practices of Maximum Availability Architecture(MAA) keeping in mind that complexity is always the enemy of High Availability and at the same time it is worth to forecast the scalability with proper planning.

So in Oracle majority of the systems does implement Data Guard as part of their High Availability(HA) solution, but technically how do we derive RTO when either Switchover OR Failover is performed ? With careful thoughts we can break up timing it took for each of the stages during Failover and Switchover.

In brief switchover can be broken down into four main key operations.

  1. Convert to Standby: On old primary all the existing sessions will be killed, database control file will be converted into standby control file type and propagates an 'End of Redo Mark' acknowledgement to standby to proceed with the switchover.
  2. Cancel Recovery: On old standby database recovery will be stopped after ensuring that any remaining redo is applied.
  3. Convert to Primary: On old standby database it will be brought into mount mode, clears online redo log files and converts control file into primary control file type.
  4. Open new Primary: Database will be opened, in case of RAC all the instances will be opened in parallel.

In the same way failover can be broken down into four main key operations all of which are performed in old standby.

  1. Cancel Recovery: Recovery of database will be stopped and brings database into mount state, in case of RAC all instances will be brought into mount state in parallel.
  2. Terminal Recovery: Standby redo log files will be archived and ensures that any unapplied redo is recovered.
  3. Convert to Primary: Online log files will be cleared and control file will be converted.
  4. Open Primary: Database will be opened, in case of RAC all instances will opened in parallel.

Approach to scrutinize role transition

Biggest challenge will be to find where the time being spent during database role transition as in few stages mentioned above the timing information deviates due to multiple reasons like size of the database SGA, roll back of uncommitted transactions by initializing undo, total number of active sessions needs to be killed, even in case of Active Data Guard total number active sessions needs to be killed, number of data files in a database and many more reasons. The goal behind finding the time spent on each stage of role transition is to bring down the RTO by identifying the root cause of delay in stage. Thus it is very important to understand the details of each stage and tune up the system to meet expectations of RTO.

From 11.2.0.4 onwards there are two ways to analyze the role transition performance, Oracle recommends to use Time Management Interface(TMI) to portray accurately the delineation of each Switchover and Failover stages and the other method is by querying X$DBGALERTEXT table which is not so accurate when compared to TMI but acceptable in most cases.

Time Management Interface(TMI)

TMI is enabled by setting 16453 event at level 15 in the database and it is very low overhead as it adds a line in alert log file of the database for each call made by Oracle during role transition. Each line of TMI event in the alert log file depicts the start and end of the event in each stages for switchover and failover. This break up of delineation for begin and end call provides accurate information of time it took in each stages of switchover and failover.

TMI event can be set as shown below.

ALTER SYSTEM SET EVENT=‘16453 trace name context forever, level 15’ scope=spfile sid=’*’

After 16453 event is set we will find that lot of TMI events are stamped into alert log file, to precisely track the time it took for each stage after switchover or failover we need to correlate the exact event tags. Each of these tags which delineate the begin and end call will have its corresponding timestamp and we can use this timestamp to calculate the total time it took for each call.

Below table describes the tags post database switchover in 12.1 release with TMI event set.

Stage

Call BEGIN/END

Text String

Convert To Standby - (Old Primary)

BEGIN

END

TMI: dbsdrv switchover to target BEGIN <DATE> <TIMESTAMP>

TMI: kcv_switchover_to_target send 'switchover to primary' msg BEGIN <DATE> <TIMESTAMP>

Cancel Recovery - (Old Standby)

BEGIN

END

TMI: kcv_commit_to_so_to_primary wait for MRP to die BEGIN <DATE> <TIMESTAMP>

TMI: kcv_commit_to_so_to_primary wait for MRP to die END <DATE> <TIMESTAMP>

Convert to Primary - (Old Standby)

BEGIN

END

TMI: kcv_commit_to_so_to_primary BEGIN CTSO to primary <DATE> <TIMESTAMP>

TMI: adbdrv BEGIN 10 <DATE> <TIMESTAMP>

Open Primary - (Old Standby)

BEGIN

END

TMI: adbdrv BEGIN 10 <DATE> <TIMESTAMP>

TMI: adbdrv END 10 <DATE> <TIMESTAMP>

Below table describes the tags post database failover in 12.1 release with TMI event set.

Stage

Call BEGIN/END

Text String

Cancel Recovery

BEGIN

END

TMI: adbdrv termRecovery BEGIN <DATE> <TIMESTAMP>

TMI: adbdrv termRecovery END <DATE> <TIMESTAMP>

Terminal Recovery

BEGIN

END

TMI: krdsmr full BEGIN Starting media recovery <DATE> <TIMESTAMP>

TMI: krdemr full END end media recovery <DATE> <TIMESTAMP>

Convert to Primary

BEGIN

END

TMI: kcv_commit_to_so_to_primary BEGIN CTSO to primary <DATE> <TIMESTAMP>

TMI: adbdrv BEGIN 10 <DATE> <TIMESTAMP>

Convert to Primary

BEGIN

END

TMI: adbdrv BEGIN 10 <DATE> <TIMESTAMP>

TMI: adbdrv END 10 <DATE> <TIMESTAMP>

For example in 12.1 idle database having no workload and without any application sessions connected, TMI event provides below information derived by calculating begin and end of each call through shell script which mines the alert log file.

Switchover timing results in 12.1 with TMI event set

Stage

Starte Time

End Time

Time Taken (Seconds)

Convert To Standby (Old Primary) 2016-10-18 05:18:57.156908 2016-10-18 05:19:02.885444 5
Cancel Recovery (Old Standby) 2016-10-18 05:19:03.392786 2016-10-18 05:19:06.407796 3
Convert To Primary (Old Standby) 2016-10-18 05:19:06.408927 2016-10-18 05:19:21.961796 15
Open Primary (Old Standby) 2016-10-18 05:19:21.961796 2016-10-18 05:24:05.947789 284

Failover timing results in 12.1 with TMI event set

Stage

Start Time

End Time

Time Taken (seconds)

Cancel Recovery  2016-10-18 06:00:45.965034 2016-10-18 06:00:49.691050 4
Terminal Recovery  2016-10-18 06:00:49.695405 2016-10-18 06:00:52.795999 3
Convert To Primary 2016-10-18 06:00:52.800402 2016-10-18 06:00:54.394868 2
Open Primary  2016-10-18 06:00:54.394868 2016-10-18 06:03:58.518725 184

As you could see most of the time is spent while opening the primary database and in case of switchover it has taken more time than failover due to additional coordination to ensure there is no data loss between primary and standby when Data Guard is configured as asynchronous. During switchover, standby needs to either clear or create the online redo log files before it could be opened as primary database. In case of failover there will be no coordination between primary and standby as data loss is controlled by Data Guard protection mode and total RTO is accounted just for role transition of standby into primary database. There are many other accountable additional factors such as RAC, ADG, sessions and SGA size which influence time to open and close the database having impact on overall role transition performance.

X$DBGALERTEXT

This method of querying table X$DBGALERTEXT is not preferable when compared to TMI due to overhead of querying this table directly proportional to the size of alert log file. So it is not optimal to query this table for deliberation of stages related to database role transition and is not so accurate as to what information TMI can provide. This fixed table directly maps to the xml version of alert log file facilitating sql queries against database alert log file.

Below table describes the text strings corresponding to MESSAGE_TEXT column of table X$DBGALERTEXT post database switchover in 12.1 release.

Stage

Call BEGIN/END

Text String

Convert To Standby- (Old Primary)

BEGIN

END

‘SWITCHOVER VERIFY: Send VERIFY request to switchover target%’

‘SWITCHOVER: received request 'ALTER DTABASE COMMIT TO SWITCHOVER TO PRIMARY' from primary database%’

Cancel Recovery - (Old Standby)

BEGIN

END

‘Role Change: Canceling MRP - no more redo to apply%’

‘Role Change: Canceled MRP%’

Convert to Primary - (Old Standby)

BEGIN

END

‘Role Change: Canceled MRP%’

‘SWITCHOVER: completed request from primary database%’

Open Primary - (Old Standby)

BEGIN

END

‘ALTER DATABASE OPEN%’

‘Completed: ALTER DATABASE OPEN%’

Below table describes the text strings corresponding to MESSAGE_TEXT column of table X$DBGALERTEXT post database failover in 12.1 release.

Stage

Call BEGIN/END

Text String

Cancel Recovery

BEGIN

END

'Terminal Recovery requested%'

'Attempt to do a Terminal Recovery%'

Terminal Recovery

BEGIN

END

'Attempt to do a Terminal Recovery%'

'Terminal Recovery: successful completion%'

Convert to Primary

BEGIN

END

'Terminal Recovery: successful completion%'

'Switchover: Complete - Database mounted as primary%'

Open Primary

BEGIN

END

‘ALTER DATABASE OPEN%’

‘Completed: ALTER DATABASE OPEN%’

We can use below query to find the timestamp for begin and end of the related text strings and then calculate the total time taken for each stages of database role transition. If you notice in the sql we have used ORIGINATING_TIMESTAMP for timestamp of text string 'Convert to Standby/Cancel Recovery BEGIN timestamp' to consider it as begin timestamp of first step during role transition so that we can derive timing information of other stages by using this sql.

SELECT MIN(ORIGINATING_TIMESTAMP)
FROM X$DBGALERTEXT
WHERE MESSAGE_TEXT LIKE '[Text String]'
AND ORIGINATING_TIMESTAMP > [Timestamp of message - Convert to Standby/Cancel Recovery BEGIN timestamp];

Role transition optimization

There are few recommendations that can employ to minimize the RTO during role transition of database particularly for planned outage (switchover) and also few of the recommendations are applicable for un-planned outage (failover) as well.

  • Configure standby database to use real time apply so that redo is applied immediately when it is arrived into standby redo log files, if broker is configured with DelayMins=0 then broker always uses real time apply. In 12.1 release and above there is no need of specifying 'using current logfile' clause as by default real time apply will be used when standby redo log files exists.
  • Configure LOG_FILE_NAME_CONVERT parameter to pre-create standby redo log files when MRP is initiated first time. Can also be achieved manually as well by clearing the online redo log files. This reduces the time to open the database as primary after switchover by avoiding the creation/clearing of online redo log files on standby before it could be open as primary database.
  • Minimize the number of archiver process LOG_ARCHIVE_MAX_PROCESSES required for database to both local and remote archiving as it takes more time to shutdown the database when having high number of archiver process, in case of switchover this can be reduced before switchover and then increase it after completion of role transition.
  • Configure flashback to bring database back into protection level immediately in case of failover, can also be used for quickly recovering logical or user erros.
  • Use broker wherever possible to reduce the complexity of manual steps to be taken for role transition and management of log archiving parameters.
  • Rely on Fast Start Fail Over(FSFO) to automatically failover the database as part of disaster recovery plan.

Conclusion

To minimize the impact on application we need to minimize the downtime and derive the RTO beforehand to predict practically achievable target. Many factors influences database role transition time and hence it is important to track down the accurate timing information of each stage during switchover or failover. Varying workload on the database and multiple configuration factors like RAC, ADG, SGA size, number of sessions and number of data files in the database enforces the difficulty to accurately tune the role transition timing. This is where TMI can be leveraged to precisely capture the role transition time information which will be handy to investigate the performance further. If we measure the timeline across different database versions it shows that role transition time have been steadily improving with the latest versions.