Default Blame Acceptor: Oracle RAC with OCR mirrored by ASM pitfalls

Since 11gR2, the OCR and voting files can be placed in ASM. Here is an overview:

ASM Redundancy Level	OCR Mirrors	Voting Disks	Failgroups	Min. Number of Disks
EXTERNAL	1	1	1	1
NORMAL	2	3	3	3
HIGH	3	5	5	5

The most common RAC configuration is a two-node RAC, for example the Oracle Database Appliance.
If a RAC has one storage system, there is no problem: ASM mirroring is not necessary and EXTERNAL redundancy can be used. If the RAC has two storage systems, there is a problem with the voting disks: a node must be able to access more than half of them, so a third location is needed. In most configurations two locations are expensive enough, and a third location is not available.
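The voting disk problem can be illustrated with a small sketch. It assumes the usual majority rule (a node stays in the cluster only while it can access more than half of the voting files); the function names and the site layouts are made up for illustration and are not part of any Oracle tool.

```python
def has_majority(visible, total):
    """True if `visible` is strictly more than half of `total` voting files."""
    return 2 * visible > total


def survives_site_loss(votes_per_site, failed_site):
    """Check whether the remaining sites still see a majority of voting files
    after `failed_site` (and every voting file on it) is gone."""
    total = sum(votes_per_site.values())
    visible = total - votes_per_site[failed_site]
    return has_majority(visible, total)


# Hypothetical layouts: three voting files on two sites vs. three sites.
two_sites = {"A": 2, "B": 1}
three_sites = {"A": 1, "B": 1, "C": 1}

print(survives_site_loss(two_sites, "A"))   # False: only 1 of 3 left
print(survives_site_loss(two_sites, "B"))   # True: 2 of 3 left
print(all(survives_site_loss(three_sites, s) for s in three_sites))  # True
```

With only two sites, one of them always holds the majority, and losing that site takes the cluster down. A third location removes this asymmetry.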
Here is a real situation:
A customer has two nodes and two storage systems. All files are mirrored by ASM with normal redundancy, including the OCR diskgroup:
Site A: 2x voting, secondary RAC node
Site B: 1x voting, master RAC node
Suddenly Site A breaks down due to a site disaster. A few seconds later the RAC node on Site B shuts down due to OCR errors.
Why does this happen?
ASM mirroring is done at block/extent level.

EXTERNAL = no mirroring by ASM
NORMAL = each extent is mirrored in one other failgroup
HIGH = each extent is mirrored in two other failgroups

What follows from the number of disks needed and the number of mirror copies used?

EXTERNAL = not usable for two storage systems
NORMAL = 3 disks with 3 voting disks and a NORMAL OCR mirror
The disks are therefore split 2:1 across the sites, but each OCR extent has only 2 copies, so both copies of an extent may sit on Site A.
HIGH = 5 disks with 5 voting disks and a HIGH OCR mirror
The disks are therefore split 3:2 across the sites, but each OCR extent has only 3 copies, so all three copies of an extent may sit on Site A.

So what can be done? The solution: a NORMAL redundancy diskgroup with a HIGH redundancy OCR mirror. This gives the following layout:
The disks are split 2:1 and each OCR extent has 3 copies, so every OCR extent is mirrored on every disk. Whichever site the disaster hits, at least one copy of the OCR remains available.
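The placement argument can be checked by simple enumeration. The sketch below is a simplified model, not ASM's real allocation algorithm: it assumes one disk per failgroup and that the copies of an extent may land on any set of distinct disks. The disk names `d1` to `d5` and the function name are hypothetical.

```python
from itertools import combinations


def count_single_site_placements(site_a, site_b, copies):
    """Return (placements entirely on site A, all possible placements).

    ASSUMPTION: one disk per failgroup, and the copies of an extent may land
    on any set of `copies` distinct disks.
    """
    disks = list(site_a) + list(site_b)
    placements = list(combinations(disks, copies))
    on_a_only = [p for p in placements if set(p) <= set(site_a)]
    return len(on_a_only), len(placements)


# NORMAL diskgroup, NORMAL OCR: 3 disks split 2:1, 2 copies per extent.
print(count_single_site_placements(["d1", "d2"], ["d3"], copies=2))              # (1, 3)
# HIGH diskgroup, HIGH OCR: 5 disks split 3:2, 3 copies per extent.
print(count_single_site_placements(["d1", "d2", "d3"], ["d4", "d5"], copies=3))  # (1, 10)
# NORMAL diskgroup, HIGH OCR template: 3 disks split 2:1, 3 copies per extent.
print(count_single_site_placements(["d1", "d2"], ["d3"], copies=3))              # (0, 1)
```

In the first two layouts at least one possible placement keeps every copy of an extent on Site A, so a loss of Site A can make that extent unreadable. Only the third layout leaves no such placement.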
Here is the demonstration on 12.1.0.1 GI:
1. Create the cluster with a normal redundancy cluster diskgroup DG_CLUSTER
2. Check the ASM template of the OCR diskgroup:

SQL> select * from v$asm_template where group_number=1;

GROUP_NUMBER ENTRY_NUMBER REDUND STRIPE S NAME                           PRIM MIRR     CON_ID
------------ ------------ ------ ------ - ------------------------------ ---- ---- ----------
           1          123 MIRROR COARSE Y VOTINGFILE                     COLD COLD          0
           1          343 MIRROR COARSE Y OCRFILE                        COLD COLD          0

3. Check the mirroring of the OCR file:

ASMCMD> ls -l +DG_CLUSTER/vmsvr-clu2/OCRFILE
Type     Redund  Striped  Time             Sys  Name
OCRFILE  MIRROR  COARSE   JUL 23 23:00:00  Y    REGISTRY.255.821572803

4. Check the ASM extent distribution:

SQL> -- list every extent copy of the OCR file and the disk it is stored on
SQL> select g.name
           ,d.path
           ,e.xnum_kffxp extent
           ,decode(e.lxn_kffxp,0,'primary',1,'mirror-normal','mirror-high') mirrormeta
     from x$kffxp          e
         ,v$asm_alias      a
         ,v$asm_disk       d
         ,v$asm_diskgroup  g
     where e.number_kffxp = a.file_number
       and e.group_kffxp  = a.group_number   -- match file and disk within the same diskgroup
       and e.disk_kffxp   = d.disk_number
       and e.group_kffxp  = d.group_number
       and d.group_number = g.group_number
       and a.name = 'REGISTRY.255.821572803'
     order by 3,4 desc;

NAME                         PATH                               EXTENT MIRRORMETA
---------------------------------------------------------------------- -------------
DG_CLUSTER                   ORCL:ORA_DISK_2                         0 primary
DG_CLUSTER                   ORCL:ORA_DISK_1                         0 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_3                         1 primary
DG_CLUSTER                   ORCL:ORA_DISK_1                         1 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_1                         2 primary
DG_CLUSTER                   ORCL:ORA_DISK_3                         2 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_2                         3 primary
DG_CLUSTER                   ORCL:ORA_DISK_3                         3 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_3                         4 primary
DG_CLUSTER                   ORCL:ORA_DISK_2                         4 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_1                         5 primary
DG_CLUSTER                   ORCL:ORA_DISK_2                         5 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_2                         6 primary
DG_CLUSTER                   ORCL:ORA_DISK_1                         6 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_3                         7 primary
DG_CLUSTER                   ORCL:ORA_DISK_1                         7 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_1                         8 primary
DG_CLUSTER                   ORCL:ORA_DISK_3                         8 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_2                         9 primary
DG_CLUSTER                   ORCL:ORA_DISK_3                         9 mirror-normal
DG_CLUSTER                   ORCL:ORA_DISK_3                        10 primary
DG_CLUSTER                   ORCL:ORA_DISK_2                        10 mirror-normal
...

As you can see, the diskgroup is made up of 3 disks (ORA_DISK_1 to ORA_DISK_3). Furthermore, there are only two copies of each extent (the primary and one mirror).
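To make the risk concrete, the extent list above can be replayed against a site failure. The post does not say which disks belong to which site, so the sketch below assumes that ORA_DISK_1 and ORA_DISK_2 sit on Site A and ORA_DISK_3 on Site B; the placement table is transcribed from the listing, and the helper name is hypothetical.

```python
# Extent placement transcribed from the step 4 listing:
# extent number -> (primary disk, mirror disk).
PLACEMENT = {
    0: ("ORA_DISK_2", "ORA_DISK_1"),
    1: ("ORA_DISK_3", "ORA_DISK_1"),
    2: ("ORA_DISK_1", "ORA_DISK_3"),
    3: ("ORA_DISK_2", "ORA_DISK_3"),
    4: ("ORA_DISK_3", "ORA_DISK_2"),
    5: ("ORA_DISK_1", "ORA_DISK_2"),
    6: ("ORA_DISK_2", "ORA_DISK_1"),
    7: ("ORA_DISK_3", "ORA_DISK_1"),
    8: ("ORA_DISK_1", "ORA_DISK_3"),
    9: ("ORA_DISK_2", "ORA_DISK_3"),
    10: ("ORA_DISK_3", "ORA_DISK_2"),
}

# ASSUMPTION: the two disks on Site A (not stated in the post).
SITE_A = {"ORA_DISK_1", "ORA_DISK_2"}


def lost_extents(placement, failed_disks):
    """Extents whose copies all live on failed disks."""
    return sorted(e for e, disks in placement.items() if set(disks) <= failed_disks)


print(lost_extents(PLACEMENT, SITE_A))  # [0, 5, 6]
```

Under this assumption, extents 0, 5 and 6 have no copy outside Site A, which would be enough to make the OCR unreadable on the surviving node.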
5. Back up the OCR

[root ~]# ocrconfig -manualbackup
vmsvredu3     2013/07/24 23:51:17     /opt/oracle/12.1/grid/cdata/vmsvr-clu2/backup_20130724_235117.ocr
vmsvredu3     2013/07/23 22:52:10     /opt/oracle/12.1/grid/cdata/vmsvr-clu2/backup_20130723_225210.ocr
vmsvredu3     2013/07/23 22:45:11     /opt/oracle/12.1/grid/cdata/vmsvr-clu2/backup_20130723_224511.ocr

6. To correct this problem if you don't have an OCR mirror, stop the cluster and start one node in exclusive mode without crsd:

crsctl start crs -excl -nocrs

7. Change the ASM template

SQL> alter diskgroup dg_cluster modify template OCRFILE attributes (HIGH);

Diskgroup altered.

SQL> select * from v$asm_template where group_number=1;

GROUP_NUMBER ENTRY_NUMBER REDUND STRIPE S NAME                           PRIM MIRR     CON_ID
------------ ------------ ------ ------ - ------------------------------ ---- ---- ----------
           1          120 MIRROR COARSE Y PARAMETERFILE                  COLD COLD          0
           1          121 MIRROR COARSE Y ASMPARAMETERFILE               COLD COLD          0
           1          123 MIRROR COARSE Y VOTINGFILE                     COLD COLD          0
           1          124 MIRROR COARSE Y DUMPSET                        COLD COLD          0
           1          125 HIGH   FINE   Y CONTROLFILE                    COLD COLD          0
           1          126 MIRROR COARSE Y FLASHFILE                      COLD COLD          0
           1          127 MIRROR COARSE Y ARCHIVELOG                     COLD COLD          0
           1          128 MIRROR COARSE Y ONLINELOG                      COLD COLD          0
           1          129 MIRROR COARSE Y DATAFILE                       COLD COLD          0
           1          230 MIRROR COARSE Y TEMPFILE                       COLD COLD          0
           1          231 MIRROR COARSE Y BACKUPSET                      COLD COLD          0
           1          232 MIRROR COARSE Y XTRANSPORT BACKUPSET           COLD COLD          0
           1          233 MIRROR COARSE Y INCR XTRANSPORT BACKUPSET      COLD COLD          0
           1          234 MIRROR COARSE Y AUTOBACKUP                     COLD COLD          0
           1          235 MIRROR COARSE Y XTRANSPORT                     COLD COLD          0
           1          237 MIRROR COARSE Y CHANGETRACKING                 COLD COLD          0
           1          238 MIRROR COARSE Y FLASHBACK                      COLD COLD          0
           1          239 MIRROR COARSE Y KEY_STORE                      COLD COLD          0
           1          340 MIRROR COARSE Y AUTOLOGIN_KEY_STORE            COLD COLD          0
           1          341 MIRROR COARSE Y AUDIT_SPILLFILES               COLD COLD          0
           1          342 MIRROR COARSE Y DATAGUARDCONFIG                COLD COLD          0
           1          343 HIGH   COARSE Y OCRFILE                        COLD COLD          0

22 rows selected.

SQL>

8. Remove the old OCR file

ASMCMD> ls -l
Type     Redund  Striped  Time             Sys  Name
OCRFILE  MIRROR  COARSE   JUL 24 10:00:00  Y    REGISTRY.255.821572803
ASMCMD> rm -f REGISTRY.255.821572803

9. Restore the OCR

[root ~]# ocrconfig -restore /opt/oracle/12.1/grid/cdata/vmsvr-clu2/backup_20130724_235117.ocr

10. Check the new OCR file

ASMCMD> ls -l
Type     Redund  Striped  Time             Sys  Name
OCRFILE  HIGH    COARSE   JUL 24 10:00:00  Y    REGISTRY.255.821615711
ASMCMD>

11. Check that crsd starts

[oracle ~]$ crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.crf' on 'vmsvredu3'
CRS-2672: Attempting to start 'ora.storage' on 'vmsvredu3'
CRS-2676: Start of 'ora.storage' on 'vmsvredu3' succeeded
CRS-2676: Start of 'ora.crf' on 'vmsvredu3' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'vmsvredu3'
CRS-2676: Start of 'ora.crsd' on 'vmsvredu3' succeeded

12. Restart the cluster normally

[root ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.crsd' on 'vmsvredu3'
CRS-2677: Stop of 'ora.crsd' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.evmd' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.storage' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'vmsvredu3'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'vmsvredu3'
CRS-2677: Stop of 'ora.storage' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'vmsvredu3'
CRS-2677: Stop of 'ora.drivers.acfs' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.evmd' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.asm' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'vmsvredu3'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'vmsvredu3' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'vmsvredu3'
CRS-2677: Stop of 'ora.cssd' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'vmsvredu3'
CRS-2677: Stop of 'ora.crf' on 'vmsvredu3' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'vmsvredu3'
CRS-2677: Stop of 'ora.gipcd' on 'vmsvredu3' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'vmsvredu3' has completed
CRS-4133: Oracle High Availability Services has been stopped.

[root@vmsvredu3 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root ~]#

13. Check ASM mirroring again:

SQL> -- same check as in step 4, now for the restored OCR file
SQL> select g.name
           ,d.path
           ,e.xnum_kffxp extent
           ,decode(e.lxn_kffxp,0,'primary',1,'mirror-normal','mirror-high') mirrormeta
     from x$kffxp          e
         ,v$asm_alias      a
         ,v$asm_disk       d
         ,v$asm_diskgroup  g
     where e.number_kffxp = a.file_number
       and e.group_kffxp  = a.group_number   -- match file and disk within the same diskgroup
       and e.disk_kffxp   = d.disk_number
       and e.group_kffxp  = d.group_number
       and d.group_number = g.group_number
       and a.name = 'REGISTRY.255.821615711'
     order by 3,4 desc;

NAME                           PATH                               EXTENT MIRRORMETA
------------------------------ ------------------------------ ---------- -------------
DG_CLUSTER                     ORCL:ORA_DISK_1                         0 primary
DG_CLUSTER                     ORCL:ORA_DISK_3                         0 mirror-normal
DG_CLUSTER                     ORCL:ORA_DISK_2                         0 mirror-high
DG_CLUSTER                     ORCL:ORA_DISK_2                         1 primary
DG_CLUSTER                     ORCL:ORA_DISK_3                         1 mirror-normal
DG_CLUSTER                     ORCL:ORA_DISK_1                         1 mirror-high
DG_CLUSTER                     ORCL:ORA_DISK_3                         2 primary
DG_CLUSTER                     ORCL:ORA_DISK_1                         2 mirror-normal
DG_CLUSTER                     ORCL:ORA_DISK_2                         2 mirror-high
DG_CLUSTER                     ORCL:ORA_DISK_1                         3 primary
DG_CLUSTER                     ORCL:ORA_DISK_3                         3 mirror-normal
DG_CLUSTER                     ORCL:ORA_DISK_2                         3 mirror-high
DG_CLUSTER                     ORCL:ORA_DISK_2                         4 primary
DG_CLUSTER                     ORCL:ORA_DISK_3                         4 mirror-normal
DG_CLUSTER                     ORCL:ORA_DISK_1                         4 mirror-high
DG_CLUSTER                     ORCL:ORA_DISK_3                         5 primary
DG_CLUSTER                     ORCL:ORA_DISK_1                         5 mirror-normal
DG_CLUSTER                     ORCL:ORA_DISK_2                         5 mirror-high
DG_CLUSTER                     ORCL:ORA_DISK_1                         6 primary
DG_CLUSTER                     ORCL:ORA_DISK_3                         6 mirror-normal
DG_CLUSTER                     ORCL:ORA_DISK_2                         6 mirror-high
...

All done. Now a disaster can come.
References:

https://twiki.cern.ch/twiki/bin/view/PDBService/ASM_Internals
How to restore ASM based OCR after complete loss of the CRS diskgroup on Linux/Unix systems (Doc ID 1062983.1)

Sunday, August 18, 2013
