Friday, October 23, 2015

About Turkey postponing the 2015 DST switch by 2 weeks, and rebooting an Exadata Storage Cell without affecting ASM

Hi,

As you all know, the DST change happens on 25/10/2015, but in my country, Turkey, it has been postponed by 2 weeks. So all systems must be patched to handle the situation. If your database has columns with a %TIME ZONE% data type, you must apply the DB patch. In this post, I will show you part of the work we did while patching. On Exadata systems, the cell nodes and DB nodes must be rebooted after the required OS patch is applied.


Thanks to Oracle, patches for servers and databases have been released. They can be found at this link.

Together with all the other admins, we are ready for 8 NOV :)

Patching Exadata is a tough challenge. I am too busy to blog about all of the work, so I will just show you how to reboot an Exadata Storage Cell during the patch process.

You do this job one cell at a time, so here are the steps for one node. And we don't want to get into trouble with ASM.

First, let's look at the state of the ASM disks.

# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

 DATA_DISK1                 ONLINE  Yes
 DATA_DISK2                 ONLINE  Yes
 PROD_DATA_DISK1            ONLINE  Yes
 PROD_DATA_DISK2            ONLINE  Yes
 RECO_DISK1                 ONLINE  Yes
 RECO_DISK2                 ONLINE  Yes
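
The check above can also be done by a script rather than by eye. Here is a minimal sketch, assuming the three-column output shown above (name, asmmodestatus, asmdeactivationoutcome). The function name griddisks_safe_to_deactivate is my own invention and not part of cellcli.

```shell
# Hypothetical helper (not part of cellcli): reads rows of
# "name asmmodestatus asmdeactivationoutcome" on stdin and succeeds only
# if every grid disk is ONLINE and can be deactivated (outcome "Yes").
griddisks_safe_to_deactivate() {
    awk '
        NF == 0 { next }                                   # skip blank lines
        { rows++ }                                         # count every grid disk row
        $2 != "ONLINE" || $3 != "Yes" { bad++; print "not safe: " $0 }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }   # empty input counts as failure
    '
}
```

A possible way to use it on the cell: cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome | griddisks_safe_to_deactivate && echo "OK to continue"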

Alternatively, you can check the status of the disks with the following:

# cellcli -e list griddisk

 DATA_DISK1                 active
 DATA_DISK2                 active
 PROD_DATA_DISK1            active
 PROD_DATA_DISK2            active
 RECO_DISK1                 active
 RECO_DISK2                 active


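OK, we know Exadata can cope while a Storage Cell is missing. But ASM keeps an offline disk only for the duration of disk_repair_time before dropping it, so it is important that this value is long enough to cover the reboot. Let's look at the current values of disk repair time.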

SQL> select dg.name,a.value from v$asm_diskgroup dg, v$asm_attribute a where dg.group_number=a.group_number and a.name='disk_repair_time';

DATA         12H
PROD_DATA    12H
RECO         12H

If any of the disk groups still has the default disk repair time, make it longer with the following SQL.

SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'DISK_REPAIR_TIME'='12H';
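
With several disk groups, the ALTER statements can be generated from the query output instead of typed one by one. This is only a sketch: the function name repair_time_fixups and the 12h default target are my assumptions, and it flags any value that differs from the target rather than only shorter ones. It prints the statements and does not run them.

```shell
# Hypothetical helper: reads "DISKGROUP VALUE" rows on stdin (the query
# output above) and prints an ALTER DISKGROUP statement for every disk
# group whose disk_repair_time differs from the target (default 12h).
repair_time_fixups() {
    target=${1:-12h}
    awk -v target="$target" -v q="'" '
        NF >= 2 && tolower($2) != tolower(target) {
            printf "ALTER DISKGROUP %s SET ATTRIBUTE %sdisk_repair_time%s=%s%s%s;\n",
                   $1, q, q, q, target, q
        }'
}
```

Review the generated statements before running them in the ASM instance.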

After that, make all the grid disks on this cell inactive. Then reboot the Storage Cell.

# cellcli -e alter griddisk all inactive

DATA_DISK1                  successfully altered
DATA_DISK2                  successfully altered
PROD_DATA_DISK1             successfully altered
PROD_DATA_DISK2             successfully altered
RECO_DISK1                  successfully altered
RECO_DISK2                  successfully altered
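
Before the actual reboot, it may be worth confirming that ASM really has taken the disks offline. A minimal sketch, assuming the asmmodestatus column reports OFFLINE (or UNUSED for grid disks that do not belong to any disk group) once deactivation has finished. The function name griddisks_all_offline is hypothetical.

```shell
# Hypothetical helper: reads "name asmmodestatus ..." rows on stdin and
# succeeds only if every grid disk reports OFFLINE or UNUSED.
griddisks_all_offline() {
    awk '
        NF > 0 { rows++; if ($2 != "OFFLINE" && $2 != "UNUSED") bad++ }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }   # empty input counts as failure
    '
}
```

For example: cellcli -e list griddisk attributes name,asmmodestatus | griddisks_all_offline && echo "safe to reboot"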

When the cell is back up, bring the disks online. What I see here is that resyncing the disks takes quite a while: on my test Exadata it took 15 minutes to sync, with more than 18 TB of used space in the RECO diskgroup.

# cellcli -e alter griddisk all active

DATA_DISK1                  successfully altered
DATA_DISK2                  successfully altered
PROD_DATA_DISK1             successfully altered
PROD_DATA_DISK2             successfully altered
RECO_DISK1                  successfully altered
RECO_DISK2                  successfully altered


# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

 DATA_DISK1                 ONLINE          Yes
 DATA_DISK2                 ONLINE          Yes
 PROD_DATA_DISK1            SYNCING         Yes
 PROD_DATA_DISK2            ONLINE          Yes
 RECO_DISK1                 SYNCING         Yes
 RECO_DISK2                 SYNCING         Yes
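
Instead of re-running the listing by hand until the SYNCING entries disappear, the wait can be scripted. A rough sketch under my own assumptions: the function names count_not_online and wait_for_online, and the POLL_SECONDS variable, are made up for illustration. The listing command is passed in as arguments so nothing about cellcli is hard-coded.

```shell
# Count grid disks whose asmmodestatus (2nd column) is not ONLINE.
count_not_online() {
    awk 'NF > 0 && $2 != "ONLINE" { n++ } END { print n + 0 }'
}

# Hypothetical helper: run the given listing command repeatedly until
# every grid disk it reports is ONLINE. POLL_SECONDS (default 60) is an
# assumed knob for the delay between checks.
wait_for_online() {
    while :; do
        out=$("$@") || return 1        # give up if the listing command fails
        pending=$(printf '%s\n' "$out" | count_not_online)
        if [ -n "$out" ] && [ "$pending" -eq 0 ]; then
            return 0                   # all disks are ONLINE
        fi
        echo "$pending grid disk(s) not ONLINE yet, waiting..."
        sleep "${POLL_SECONDS:-60}"
    done
}
```

For example: wait_for_online cellcli -e list griddisk attributes name,asmmodestatus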

If you query the disks from another Storage Cell while the sync is running, you will see something like this:

# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

 DATA_DISK1                 ONLINE            Yes
 DATA_DISK2                 ONLINE            Yes
 PROD_DATA_DISK1            ONLINE            "Cannot deactivate because partner disk  PROD_DATA_DISK1 is not online"
 PROD_DATA_DISK2            ONLINE            Yes
 RECO_DISK1                 ONLINE            "Cannot deactivate because partner disk  RECO_DISK1 is not online"
 RECO_DISK2                 ONLINE            "Cannot deactivate because partner disk  RECO_DISK2 is not online"


Once the sync has finished, verify the disk status again. And we're done.

# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

 DATA_DISK1                 ONLINE  Yes
 DATA_DISK2                 ONLINE  Yes
 PROD_DATA_DISK1            ONLINE  Yes
 PROD_DATA_DISK2            ONLINE  Yes
 RECO_DISK1                 ONLINE  Yes
 RECO_DISK2                 ONLINE  Yes

# cellcli -e list griddisk

 DATA_DISK1                 active
 DATA_DISK2                 active
 PROD_DATA_DISK1            active
 PROD_DATA_DISK2            active
 RECO_DISK1                 active
 RECO_DISK2                 active


OK, that is all.

Thanks for reading.

Enjoy & share.