Views expressed here are solely that of my own. Please make sure that you test the code/queries that appear on my blog before applying them to the production environment.

Thursday, February 11, 2010

Oracle background processes still use and lock the dropped ASM disks and these disks can not be removed and assigned to another server without restart

Oracle background processes still use and lock the dropped ASM disks and these disks can not be removed and assigned to another server without restarting the databases which are locking this ASM disks.

Hi All,

Today, our Unix Admins group wanted to take back a storage LUN (/dev/hdisk19) which had been given temporarly to be used for a project and assigned to the the ASM of one of our Oracle databases.

We use Oracle Database Enterprise Edition and ASM version 10.2.0.4.

As a normal procedure I dropped the ASM disk which is corresponding to this LUN.
First I connected to the ASM instance and checked the ASM disk name of the related ASM device.

SQL> select name, substr(path, 1, 20) from v$asm_disk order by name;

NAME                           SUBSTR(PATH,1,20)
------------------------------ --------------------
DG_DB_ASM_DB01_0000            /dev/ASM_Disk1
DG_DB_ASM_DB01_0001            /dev/ASM_Disk2
DG_DB_ASM_DB01_0002            /dev/ASM_Disk3
DG_DB_ASM_DB01_0003            /dev/ASM_Disk4
DG_DB_ASM_DB01_0004            /dev/ASM_Disk5

NAME                           SUBSTR(PATH,1,20)
------------------------------ --------------------
DG_DB_ASM_DB01_0005            /dev/ASM_Disk6
DG_DB_ASM_DB01_0006            /dev/ASM_Disk13
DG_DB_ASM_DB01_0007            /dev/ASM_Disk14
DG_DB_ASM_DB01_0008            /dev/ASM_Disk15
DG_DB_ASM_DB01_0009            /dev/ASM_Disk17

10 rows selected.    

I used the following command to drop the ASM disk.
ALTER DISKGROUP DG_DB_ASM_DB01 DROP DISK DG_DB_ASM_DB01_0009;

After this command completes, Unix Admin group tried to remove the hdisk (/dev/hdisk19) corresponding to this device and they said that there were some processes in the Unix server (IBM AIX) which were using this hdisk and because of that they can not remove it.

I checked again to find out which processes use this ASM disk corresponding this ASM device.
[osuser01@os01]:/oracle > fuser /dev/ASM_Disk17
/dev/ASM_Disk17:   200796  430276  675882  712952  745630  749608  757766  782390  892942 1220622 1515754 1613864 1765430 1921252 1945674
[osuser01@os01]:/oracle > ps -ef|grep  200796
oracle  200796       1   0 16:52:57      -  0:00 oracle+ASM (LOCAL=NO)
oracle 1146924 1589428   0 17:56:00  pts/2  0:00 grep 200796
[osuser01@os01]:/oracle > ps -ef|grep  430276
oracle  430276       1   0   Oct 06      - 20:01 asm_gmon_+ASM
oracle 1589430 1146926   0 17:56:05  pts/2  0:00 grep 430276
[osuser01@os01]:/oracle > ps -ef|grep  675882
oracle  675882       1   0   Oct 06      - 26:20 ora_cjq0_DB01
oracle 1400942 1065140   0 17:56:09  pts/2  0:00 grep 675882
[osuser01@os01]:/oracle > ps -ef|grep  712952
oracle  712952       1   0   Oct 06      - 10:15 ora_smon_DB01
oracle 1065142 1400944   0 17:56:14  pts/2  0:00 grep 712952
[osuser01@os01]:/oracle > ps -ef|grep  745630
oracle  745630       1   0   Oct 06      - 207:01 ora_ckpt_DB01
oracle 1380554 1400950   0 17:56:19  pts/2  0:00 grep 745630
[osuser01@os01]:/oracle > ps -ef|grep  749608
oracle  749608       1   0   Oct 06      - 58:08 ora_lgwr_DB01
oracle 1065156 1380564   0 17:56:25  pts/2  0:00 grep 749608
[osuser01@os01]:/oracle > ps -ef|grep  757766
oracle  757766       1   0 09:27:29      -  0:11 ora_j000_DB01
oracle 1876066 1589452   0 17:56:31  pts/2  0:00 grep 757766
[osuser01@os01]:/oracle > ps -ef|grep  782390
oracle  782390       1   0   Oct 06      - 3793:04 ora_dbw0_DB01
oracle 1589454 1876068   0 17:56:42  pts/2  0:00 grep 782390
[osuser01@os01]:/oracle > ps -ef|grep  892942
oracle  892942       1   0   Oct 06      -  1:59 ora_rbal_DB01
oracle 1589456 1065172   0 17:56:48  pts/2  0:00 grep 892942
[osuser01@os01]:/oracle > ps -ef|grep  1220622
oracle 1065174 1589460   0 17:56:54  pts/2  0:00 grep 1220622
oracle 1220622       1   0   Oct 06      - 15:45 ora_mmon_DB01
[osuser01@os01]:/oracle > ps -ef|grep  1515754
oracle 1400968 1876080   0 17:57:02  pts/2  0:00 grep 1515754
oracle 1515754       1   0   Oct 06      -  2:55 asm_rbal_+ASM
[osuser01@os01]:/oracle > ps -ef|grep  1613864
oracle 1065184 1876088   0 17:57:13  pts/2  0:00 grep 1613864
oracle 1613864       1   0   Oct 06      -  4:20 asm_lgwr_+ASM
[osuser01@os01]:/oracle > ps -ef|grep  1765430
oracle 1765430       1   0   Oct 06      -  3:19 asm_dbw0_+ASM
oracle 1876090 1065186   0 17:57:18  pts/2  0:00 grep 1765430
[osuser01@os01]:/oracle > ps -ef|grep  1921252
oracle 1065188 1876092   0 17:57:25  pts/2  0:00 grep 1921252
oracle 1921252       1   0 15:34:14      -  0:00 oracle+ASM (LOCAL=NO)
[osuser01@os01]:/oracle > ps -ef|grep  1945674
oracle 1065192 1876098   0 17:57:30  pts/2  0:00 grep 1945674
oracle 1945674       1   0   Oct 06      - 49:47 ora_mmnl_DB01

As we can see, the locking processes belong to the Oracle database and ASM instance background processes.

I also checked the database side whether dropped disk is still there or not.

I connected to the ASM instance and run the following query.

SQL> select name, substr(path, 1, 20) from v$asm_disk order by name;

NAME                           SUBSTR(PATH,1,20)
------------------------------ --------------------
DG_DB_ASM_DB01_0000            /dev/ASM_Disk1
DG_DB_ASM_DB01_0001            /dev/ASM_Disk2
DG_DB_ASM_DB01_0002            /dev/ASM_Disk3
DG_DB_ASM_DB01_0003            /dev/ASM_Disk4
DG_DB_ASM_DB01_0004            /dev/ASM_Disk5

NAME                           SUBSTR(PATH,1,20)
------------------------------ --------------------
DG_DB_ASM_DB01_0005            /dev/ASM_Disk6
DG_DB_ASM_DB01_0006            /dev/ASM_Disk13
DG_DB_ASM_DB01_0007            /dev/ASM_Disk14
DG_DB_ASM_DB01_0008            /dev/ASM_Disk15
/dev/ASM_Disk17

10 rows selected.    

ASM device name still appears there but without any ASM disk name, this looks a bit strange because we successfully dropped this ASM disk and there are still Oracle background processes which does not release this ASM device.

After searching a bit about this problem, I found a Metalink note in Oracle Metalink website explaining that this problem occurs because of an Oracle bug which will be solved in Oracle version 10.2.0.5 (we use Oracle version 10.2.0.4). (Oracle Metalink note : Asm Devices Are Still Held Open After Dismount or Drop [ID 402526.1])

As a quick solution, since this database is not a production database, I restarted ASM and Oracle database instance to get rid of these background processes locking the ASM device and problem was solved.

Regards,
Ural

No comments: