Promising Zero Data LAG to Applications - Active Data Guard

Oracle Community

Promising Zero Data LAG to Applications - Active Data Guard

Written By

Introduction

To achieve zero data LAG to the applications with the Active Data Guard. Apart from disaster recovery purpose there are many things we can capture the features with Active Data Guard by running the applications on standby database but we need to configure few more additional things to achieve zero data lag on standby under any circumstances. This article explains all.

Zero Data LAG

We know well that starting from 11g, Standby database can be opened read only mode and same time recovery can be started together with Active Data Guard feature/license which is providing a large scope to the applications to use for any reporting jobs which uses only Select queries. From here many customers preferring to use Active Data Guard so that we can offload the Reporting jobs, backups and much more to the standby database which can free up the resources on primary/production database.

In the earlier days of Active Data Guard, most of them are in assumption that Physical standby is in Sync with primary database and applications are using up to date data. It is true if the configuration is proper and the things are going fine but it's a blunder if something gone wrong on standby database. So let's suppose a Banking application running for the quarterly report which is very critical and same time due to some issue standby was behind the primary for 30 minutes or 40 minutes but still reports executed successfully but without fresh data which misleads our assumptions. Usually such reports were configured on primary for the guarantee of data but they completely rely on Active Data Guard as it meets the jobs requirement. But this is true if you have configured Data Guard/Active Data Guard in such a way to achieve Zero Data Loss. Let me provide small example on data lag and how to identify the Lag in simple way.

The configuration looks good and no issues found with it.

DGMGRL> show configuration;

Configuration - ckpt12c

Protection Mode: MaxPerformance

Databases:

canada - Primary database

   india - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:

SUCCESS

DGMGRL>

From Primary database when I ran query on primary database, it returned 5 rows.

SQL> select db_unique_name,database_role from v$database;

DB_UNIQUE_NAME DATABASE_ROLE

--------------- ----------------

CANADA         PRIMARY

SQL> select * from adg_tab;

QNAME             QID QTYPE

---------- ---------- ----------

AAAA             1001 CURRENT

BBBB             1002 SAVINGS

CCCC             1003 CREDIT

CCCC             1004 SAVINGS

DDDD             1005 SAVINGS

 

But from standby database the same table returned only 3 rows. So there is something wrong but you still able to fetch database which is historical and no matter whether its 1 minute past or 30 minutes. Still even one transaction is also a big data loss if it's not present on Standby database.

SQL> select db_unique_name,database_role from v$database;

DB_UNIQUE_NAME DATABASE_ROLE

--------------- ----------------

INDIA           PHYSICAL STANDBY

SQL> select * from adg_tab;

QNAME             QID QTYPE

---------- ---------- ----------

AAAA             1001 CURRENT

BBBB             1002 SAVINGS

CCCC             1003 CREDIT

SQL>

So maybe the past reporting jobs are lying that standby having the updated information same as primary database? In order to justify we can use simple queries from Primary and Standby database. Of course primary SCN will be bit ahead to standby but we have to ensure that standby SCN is advancing.

SQL> select current_scn from v$database;

CURRENT_SCN

-----------

   3902260

SQL> select current_scn from v$database;

CURRENT_SCN

-----------

   3901748

Apart from that we can use various queries to estimate LAG between Primary and Standby database. There are few very basic things we must consider for Zero Data LAG they are Real-Time apply and Database protection mode. Of course Oracle recommends strongly to use Real-Time apply in any protection modes.

Prerequisites for Promising zero data lag

To achieve "zero data loss", the protection mode of database is a crucial role, With the Maximum Protection mode by default behavior is Zero Data loss, but when it comes to Maximum Availability or Maximum Performance we need to take care of configuring parameters and settings.
Regardless of any protection mode , the standby should be in real time query mode and of course which can achieved only through Real-Time apply. In order to configure Zero Data Loss ensure that there is no DELAY_MINS configured on primary which points to the remote destination. DELAY_MINS basically the data in standby will be received from primary with delay, The DELAY_MINS parameter is of course good feature/parameter so that any changes done on primary will not immediately applied on standby, hence if any truncated tables, dropped schemas from primary but still we can retrieve them from standby because the changes were not applied because of the parameter DELAY_MINS. So before configuring zero data lag we must check the primary database remote destinations does not have configured. You can also check through views whether configured or not.

SQL> select dest_id,delay_mins from v$archive_dest where dest_id in (2,3,4);

   DEST_ID DELAY_MINS

---------- ----------

         2         0

         3         0

         4         0

STANDBY_MAX_DATA_DELAY

This parameter is an session level parameter , by setting this parameter to a value for example 10 seconds, then it means whenever the application starts querying the data and it can tolerate maximum of 10 seconds. We can also configure parameter to "0" hence there will be no chance for tolerance and finally session will be ended without error but it won't results the stale data.

STANDBY_MAX_DATA_DELAY parameter default value is "NONE" that means the parameter will not be applicable and if the use query the data it will not check whether the data is stale or not.

SQL> show parameter STANDBY_MAX_DATA_DELAY

SQL>

The parameter will not be visible at database level, because its designed for session level to configure. Still if you try to set this parameter and it ends up with warning as this parameter is applicable only for non-SYS users.

SQL> alter session set standby_max_data_delay=30;

ERROR:

ORA-03174: STANDBY_MAX_DATA_DELAY does not apply to SYS users

We will see demonstration how this parameter actually works when actual user queries the data. Before proceeding with that we will check broker status of standby database and it performing Real-Time query and also there is no apply lag as well.

DGMGRL> show database india

Database - india

Role:             PHYSICAL STANDBY

Intended State:   APPLY-ON

Transport Lag:     0 seconds (computed 0 seconds ago)

Apply Lag:         0 seconds (computed 1 second ago)

Apply Rate:       0 Byte/s

Real Time Query:   ON

Instance(s):

   drmcdb

Database Status:

SUCCESS

DGMGRL>

Now Connect to the normal user and set the parameter "standby_max_data_delay" to 30 seconds. As soon as we set this parameter the data lag tolerance is acceptable only for 30 seconds, If staleness of data exceeded 30 seconds then it will results us error but never produce the stale/old data which can mislead the things.

Standby:

 

SQL> conn c##nassyam

Enter password:

Connected.

SQL> show user

USER is "C##NASSYAM"

SQL> select * from std_lag;

       QID QNAME

---------- ----------

       101 AAAA

       102 BBBB

       103 CCCC

SQL> alter session set standby_max_data_delay=30;

Session altered.

SQL>

After setting the parameter standby_max_data_delay to 30 seconds then to interpret the error we have forced to stop the MRP and then inserted a new row from primary database. Then the new rows are visible in primary.

DGMGRL> edit database india set state='apply-off';

Succeeded.

DGMGRL>

Primary:

SQL> insert into std_lag values (104,'DDDD');

1 row created.

SQL> commit;

Commit complete.

SQL> select * from std_lag;

       QID QNAME

---------- ----------

       104 DDDD               <-- New Row

       101 AAAA

       102 BBBB

       103 CCCC

 

SQL>

If you see above the new row inserted is visible on primary but when we run the select query across the same table then it results an error but not given old data without new row. So this example promises how the zero data lag will be ensured with the parameter.

Standby:

SQL> show parameter standby_max_data_delay

SQL> select * from std_lag;

select * from std_lag

*

ERROR at line 1:

ORA-00604: error occurred at recursive SQL level 1

ORA-03172: STANDBY_MAX_DATA_DELAY of 30 seconds exceeded

SQL>

We can crosscheck how much lag is standby with the primary database using Broker.

DGMGRL> show database india

Database - india

Role:             PHYSICAL STANDBY

Intended State:   APPLY-OFF

Transport Lag:     0 seconds (computed 0 seconds ago)

Apply Lag:         10 minutes 17 seconds (computed 0 seconds ago)

Apply Rate:       (unknown)

Real Time Query:   OFF

Instance(s):

   drmcdb

Database Status:

SUCCESS

DGMGRL>

It's not necessary that we should check from Broker but the results will be viewable through below view.

SQL> select name,value from v$dataguard_stats;

NAME                           VALUE

------------------------------ ----------------------------------------

transport lag                 +00 00:00:00

apply lag                     +00 00:10:25

apply finish time             +00 00:00:00.052

estimated startup time         18

SQL>

Finally, it's not mandatory to use when connecting to SQL prompt but we can also configure triggers so that whenever applicable tries to access application it will verify the standby delay with primary.

SQL> show user

USER is "C##NASSYAM"

SQL> ;

1 CREATE OR REPLACE TRIGGER STD_LAG

2 AFTER LOGON ON database

3 BEGIN

4 IF (SYS_CONTEXT('USERENV','DATABASE_ROLE') in ('PHYSICAL STANDBY'))

5 THEN

6 execute immediate 'ALTER SESSION SET STANDBY_MAX_DATA_DELAY=20';

7 END IF;

8 END;

9*

SQL> /

Trigger created.

SQL>

With the above trigger the staleness of data will be allowed only until 20 seconds and after that the user cannot query the data and it will be ended up with the errors. Again we have stopped MRP and inserted one more row in the table.

DGMGRL> edit database india set state='apply-off';

Succeeded.

DGMGRL>

DGMGRL> show database india

Database - india

 

Role:             PHYSICAL STANDBY

Intended State:   APPLY-OFF

Transport Lag:     0 seconds (computed 1 second ago)

Apply Lag:         25 seconds (computed 1 second ago)

Apply Rate:       (unknown)

Real Time Query:   OFF

Instance(s):

   drmcdb

Database Status:

SUCCESS

DGMGRL>

So we have now lag of 25 seconds which should raise exception with trigger.

Primary:

SQL> insert into std_lag values (104,'DDDD');

1 row created.

SQL> commit;

Commit complete.

SQL> select * from std_lag;

       QID QNAME

---------- ----------

       104 DDDD

       101 AAAA

       102 BBBB

       103 CCCC

SQL>

 

Standby:

When the lag is reached more than 20 seconds and the session won't even allow you to connect the session. Of course the trigger settings depend on you how to configure.

SQL> conn c##nassyam/oracle

ERROR:

ORA-00604: error occurred at recursive SQL level 2

ORA-03172: STANDBY_MAX_DATA_DELAY of 20 seconds exceeded

Warning: You are no longer connected to ORACLE.

SQL>

Of course the information we can also track from the alert log as below

Sat Jun 20 06:24:23 2015

opidcl aborting process unknown ospid (13546) as a result of ORA-3172

Sat Jun 20 06:24:29 2015

opidcl aborting process unknown ospid (13554) as a result of ORA-3172

Sat Jun 20 06:24:35 2015

opidcl aborting process unknown ospid (13558) as a result of ORA-3172

Sat Jun 20 06:24:37 2015

opidcl aborting process unknown ospid (13566) as a result of ORA-3172

Sat Jun 20 06:24:44 2015

opidcl aborting process unknown ospid (13569) as a result of ORA-3172

-bash-3.2$ oerr ora 3172

03172, 00000, "STANDBY_MAX_DATA_DELAY of %s seconds exceeded"

// *Cause: Standby recovery fell behind the STANDBY_MAX_DATA_DELAY

//         requirement.

// *Action: Tune recovery and retry the query later, or switch to another

//         standby database within the data delay requirement.

-bash-3.2$

By thus feature of standby_max_data_delay it will ensure there is no stale data.

Apply Synchronization

Related to data lag issues over Active Data Guard, Oracle introduced one more command "alter session sync with primary" so that after connection to the user first it tries to sync the standby with primary and the it leaves to you for further use. This parameter indirectly checks and confirms there is no data lag if any issues with MRP of standby and unable to synchronizes the standby with primary then session will be available to query the tables without waiting for the redo apply. We will see how it works.

DGMGRL> edit database india set state='apply-off';

Succeeded.

DGMGRL> show database india;

Database - india

Role:             PHYSICAL STANDBY

Intended State:   APPLY-OFF

Transport Lag:     0 seconds (computed 0 seconds ago)

Apply Lag:         9 seconds (computed 0 seconds ago)

Apply Rate:       (unknown)

Real Time Query:   OFF

Instance(s):

   drmcdb

Database Status:

SUCCESS

DGMGRL>

 

From the above command of DGMGRL we ensured that MRP is not running. Now we will connect to the standby using application user and we can check whether the standby is in sync with primary using the command. This command can be used by any user and there are no restrictions with it.

SQL> show user

USER is "C##NASSYAM"

SQL> alter session sync with primary;

ERROR:

ORA-03173: Standby may not be synced with primary

SQL> !oerr ora 3173

03173, 00000, "Standby may not be synced with primary"

// *Cause: ALTER SESSION SYNC WITH PRIMARY did not work because either the

//         standby was not synchronous or in a data-loss mode with regard

//         to the primary, or the standby recovery was terminated.

// *Action: Make the standby synchronous and no-data-loss with regard

//         to the primary. Make the standby recovery active.

SQL>

 

Finally, the command can be enforced from the trigger so that whenever user login, before querying data first it will check the standby synchronization with the primary and then it proceeds to access tables/data.

SQL> conn c##nassyam/oracle

Connected.

SQL> ;

1 CREATE OR REPLACE TRIGGER STD_SYNC

2 AFTER LOGON ON DATABASE

3 BEGIN

4 IF (SYS_CONTEXT('USERENV','DATABASE_ROLE') in ('PHYSICAL STANDBY'))

5 THEN

6 execute immediate 'ALTER SESSION SYNC WITH PRIMARY';

7 END IF;

8 END;

9*

SQL> /

Trigger created.

SQL>

If in case there is any LAG on standby with primary database, then first it interprets or enforce the command to make standby synchronize before leaving to applications.

DGMGRL> show database india;

Database - india

Role:             PHYSICAL STANDBY

Intended State:   APPLY-OFF

Transport Lag:     0 seconds (computed 0 seconds ago)

Apply Lag:         17 seconds (computed 0 seconds ago)

Apply Rate:       (unknown)

Real Time Query:   OFF

Instance(s):

   drmcdb

Database Status:

SUCCESS

DGMGRL>

After having GAP on standby, when we connect to standby user then of course it connects but it will wait until the trigger is completely executed.

-bash-3.2$ sqlplus c##nassyam/oracle

SQL*Plus: Release 12.1.0.1.0 Production on Sat Jun 20 06:50:34 2015

Copyright (c) 1982, 2013, Oracle. All rights reserved.

-- waited ---

Last Successful login time: Sat Jun 20 2015 06:47:42 +05:30

Connected to:

Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production

With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL>

Of course there will be no errors will appear at session level but you can track them from alert log that standby may not be synced.

Alert Log

Sat Jun 20 06:50:39 2015

Errors in file /u01/app/oracle/diag/rdbms/india/drmcdb/trace/drmcdb_ora_14743.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-03173: Standby may not be synced with primary

ORA-06512: at line 4

Sat Jun 20 06:50:44 2015

Errors in file /u01/app/oracle/diag/rdbms/india/drmcdb/trace/drmcdb_ora_14743.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-03173: Standby may not be synced with primary

ORA-06512: at line 4

 

Trace File:

*** 2015-06-20 06:50:39.340

*** SESSION ID:(39.87) 2015-06-20 06:50:39.340

*** CLIENT ID:() 2015-06-20 06:50:39.340

*** SERVICE NAME:(SYS$USERS) 2015-06-20 06:50:39.340

*** MODULE NAME:(sqlplus@CKPT-ORA-04 (TNS V1-V3)) 2015-06-20 06:50:39.340

*** ACTION NAME:() 2015-06-20 06:50:39.340

*** CONTAINER ID:(1) 2015-06-20 06:50:39.340

Skipped error 604 during the execution of SYS.STD_SYNC

*** 2015-06-20 06:50:39.341

dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0)

----- Error Stack Dump -----

ORA-00604: error occurred at recursive SQL level 1

ORA-03173: Standby may not be synced with primary

ORA-06512: at line 4

*** 2015-06-20 06:50:44.374

Skipped error 604 during the execution of C##NASSYAM.STD_SYNC

*** 2015-06-20 06:50:44.374

dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0)

----- Error Stack Dump -----

ORA-00604: error occurred at recursive SQL level 1

ORA-03173: Standby may not be synced with primary

ORA-06512: at line 4

Conclusion:

The standby lag reason whatever it may be but to avoid the misconception or staleness of data to the applications, we can use above the techniques mentioned. Hence there is large scope to provide accuracy of DATA instead of assumption that our standby is in complete SYNC.

3918 2 /
Follow / 21 Jun 2015 at 5:12am

Good Article Basha!

Follow / 21 Jun 2015 at 10:00am

Thank You for the feedback