Kamran Agayev's Oracle Blog

Oracle Certified Master

Getting ORA-01105 during RAC db startup

Posted by Kamran Agayev A. on July 30th, 2014

Today, while starting RAC instances of 2 node RAC database (10gR2 on Linux), I got the following error in the first node:

ORA-01105: mount is incompatible with mounts by other instances
ORA-01677: standby file name convert parameters differ from other instance

 

I checked the alert.log file, but there was no enough information to solve this issue:

Wed Jul 30 09:58:48 AZST 2014
Setting recovery target incarnation to 2
ORA-1105 signalled during: ALTER DATABASE MOUNT…
Wed Jul 30 09:58:58 AZST 2014
SUCCESS: diskgroup DATA was dismounted

 

After playing with some initialization parameters, I found a metalink note where it was defined as a bug (bug13001004)

Check out the following metalink note:

Spfile defined in OCR is not used if one exists in $ORACLE_HOME/dbs (Doc ID 1373622.1)

 

The solution is – to move parameter file to the centralized directory (/ocfs) and remove any instance_name parameter

Posted in Administration, RAC issues | No Comments »

www.OracleVideoTutorials.com

Posted by Kamran Agayev A. on June 13th, 2014

Just came across a wonderful blog where you can find dozens of different Oracle related video tutorials

www.OracleVideoTutorials.com

oraclevideotutorials

Posted in Administration | 2 Comments »

Using odd number of disks for Voting disk

Posted by Kamran Agayev A. on May 29th, 2014

As you’ve already known, you should use odd number of disks for voting disk. A node must be able to strictly access more than half of the voting disks at any time. Let me show you how it works. I have installed and configured two node 11gR3 RAC on VirtualBox and use the following case to show how it works:

- Create a diskgroup with 3 failure groups and 3 different disks

- Move voting disk to the new diskgroup. Shutdown the second node and deattach one of the disks. In this case, cluster should start as it can access more than half of the voting disks (2 from 3)

- Start the second node. The cluster should be up. Shut the second node again and deattach the second voting disk. And start it. The cluster will not start. Check the ocssd.log file

- Shut down all node, attach the previous disks and start it again. Cluster will be up

Here’re the detailed steps:

- Create a diskgroup :

Pic1

- Mount the diskgroup at the second node:

SQL> ALTER DISKGROUP vdisk MOUNT;

 

- Replace voting disk, move it to the new diskgroup and query the voting disk:

Pic2

 

 

 

 

 

 

 

 

- Shutdown the second instance and reattach one of the disks of VDISK diskgroup :

Pic3

 

-

 

 

 

 

 

 

 

 

 

- Star the second node, query the Voting disk and check if the clusterware is up:

Pic4

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

- Shutdown the second node again, remove the second disk from the Voting diskgroup and start the node:

pic5

 

 

 

 

 

 

- Check the log file at $GRID_HOME/log/node2/cssd/ocssd.log :

2014-05-29 01:51:23.055: [ CSSD][2946955008]clssnmvVerifyCommittedConfigVFs: Insufficient voting files found, found 1 of 3 configured, needed 2 voting files
2014-05-29 01:51:23.055: [ CSSD][2946955008](:CSSNM00020:)clssnmvVerifyCommittedConfigVFs: voting file 0, id 279c162c-1b964f88-bfb1d622-aecc9e4e not found
2014-05-29 01:51:23.055: [ CSSD][2946955008](:CSSNM00020:)clssnmvVerifyCommittedConfigVFs: voting file 1, id 7e282f3f-5e514f42-bfb79396-c69fda76 not found
2014-05-29 01:51:23.055: [ CSSD][2946955008](:CSSNM00021:)clssnmCompleteVFDiscovery: Found 1 voting files, but 2 are required. Terminating due to insufficient configured voting files

- As you see, cluster is down. Now, shutdown both nodes, add disks to the second node and check the status of the clusterware:

Pic6

 

 

 

 

 

 

 

 

 

Posted in RAC issues | No Comments »

How to troubleshoot CRSCTL REPLACE VOTEDISK error?

Posted by Kamran Agayev A. on May 27th, 2014

It took me some time to investigate why CRSCTL REPLACE VOTEDISK command is not working.
[oracle@node1 ~]$ crsctl replace votedisk VDISK
CRS-4264: The operation could not be validated
CRS-4000: Command Replace failed, or completed with errors.
When you get an error during VOTEDISK replacement, make sure you check the following items:

- Make sure the disk group you’re moving the voting disk is mounted on all nodes.

- Make sure the compatibility parameter is set to the version of Grid software you’re using. You can change it using the following command:

alter diskgroup VDISK set attribute ‘compatible.asm’=’11.2′;

Query V$ASM_DISKGROUP view to make sure it’s the same with the rest disk groups and with the version of the Grid Software:

select group_number, name, compatibility, database_compatibility from v$asm_diskgroup;

- Check alert.log file of an ASM instance, any available trace file of the ASM instance. Check /var/log/messages file and trace the replace command usint strace file. See if you can catch any error from the log file:

[grid@node5 ~]strace crsctl replace votedisk VDISK
- Make sure you’ve an odd number of votedisk

- Make sure there’s enough space in the diskgroup

- Make sure disk permissions is correct

- Make sure you’re running the command using Grid Software owner

Today, all above mentioned checks are failed :) . In my case, the problem was using incorrect “crsctl” command. After upgrading the RAC environment from 11.2.0 to 11.2.3 I was still using old crsctl (by accident, forgot to set environment variables correctly). But no need to worries, it was a test database.

Let me know if you have any additional check to investigate voting disk replace failure

Cheers

Posted in RAC issues | 4 Comments »

Cluster won’t start if diagnostic_dest folder is missing

Posted by Kamran Agayev A. on March 3rd, 2014

One of the reason of why cluster won’t start is DIAGNOSTIC_DEST folder is missing. Here it is what I got today in of the nodes of the RAC environment:

db-bash: crs_stat -t

HA Resource Target State
———– —— —–
error connecting to CRSD at [(ADDRESS=(PROTOCOL=IPC)(KEY=ora_crsqs))] clsccon 184

 

While checking alert log file of the clusterware ($GRID_HOME/log/node1/alertnode1.log

[/home/oracle/11.2.0/grid_1124/bin/oraagent.bin(4745)]CRS-5011:Check of resource “+ASM” failed: details at “(:CLSN00006:)” in “/home/oracle/11.2.0/grid_1124/log/node01/agent/ohasd/oraagent_oracle/oraagent_oracle.log”

 

ASM instance failed to start. I connected to ASM instance and tried to start it manually:

db-bash-$ asm

SQL*Plus: Release 11.2.0.4.0 Production on Mon Mar 3 10:32:24 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

ASM> startup

ORA-48108: invalid value given for the diagnostic_dest init.ora parameter

ORA-48140: the specified ADR Base directory does not exist [/home/oracle/11.2.0/dbhome]

ORA-48187: specified directory does not exist

HPUX-ia64 Error: 2: No such file or directory

Additional information: 1

ASM>

 

ADR Base Directory is missing. After creating it, I successfully started the CRS and got the happiest message :) :

db-bash-$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
db-bash-$

Posted in Administration, RAC issues | 2 Comments »

How to configure Enterprise Manager 11gR2 for RAC environment

Posted by Kamran Agayev A. on March 2nd, 2014

After migrating 10gR2 single instance to 11gR2 RAC db, the next step should be configuring Enterprise Manager (if you don’t use centralized Grid Control)

 

It can be done by a simple command. But as usual (at least for me) it didn’t go smooth and I got some errors :) . In this article I will show you solutions for some of them.

The command should be run is as follows:

emca -config dbcontrol db -repos create –cluster

 

It will ask you to provide the following information:

db-bash-$ emca -config dbcontrol db -repos create -cluster

 

STARTED EMCA at Mar 2, 2014 3:03:19 PM

EM Configuration Assistant, Version 11.2.0.3.0 Production

Copyright (c) 2003, 2011, Oracle.  All rights reserved.

 

Enter the following information:

Database unique name:

Service name:

Listener port number:

Listener ORACLE_HOME

Password for SYS user:

Password for DBSNMP user:

Password for SYSMAN user:

Cluster name:

Email address for notifications (optional):

Outgoing Mail (SMTP) server for notifications (optional):

ASM ORACLE_HOME :

ASM port :

ASM username [ ASMSNMP ]:

ASM user password:

 

-          Get Database unique name from db_unique_name parameter from the database

SQL> show parameter db_unique_name

-          To provide a service name, get the SCAN listener name from the host (ps –ef | grep tns), get the status of SCAN listener by running lsnrctl status LISTENER_SCAN1 command from $GRID_HOME and get the service name from the output

Service “RACDB.MYCOMP.COM” has 3 instance(s).
     Instance “RACDB1″, status READY, has 1 handler(s) for this service…
     Instance “RACDB2″, status READY, has 1 handler(s) for this service…
     Instance “RACDB3″, status READY, has 1 handler(s) for this service…

 

-          Make sure all password files at all nodes – orapwNODE1[2-3] – are same. If not, create a password file at each node and provide the same password

-          As a Cluster Name, provide the name of the cluster. To get the cluster name, use the following command:

cd $GRID_HOME/bin

./cemutlo –n

-          Make sure you’ve ASMSNMP user created at ASM instance and SYSDBA privilege is granted. Do it in the first node, then copy ora+ASM1 to the different nodes under $GRID_HOME/dbs/+ASM[1-2]

 

Hope above mentioned notes will help you to create EM db control successfully for your RAC environment.

 

To get more information from metalink, use the following MOS:

How to manage DB Control 11.x for RAC Database with emca (Doc ID 578011.1)

 

Posted in Administration | No Comments »

V$ASM_DISKGROUP displays information from the header of ASM disks

Posted by Kamran Agayev A. on January 17th, 2014

While playing with OCR recovery, suddenly I realized that V$ASM_DISKGROUP view gets information from the headers of the ASM disk files that are specified at *.ASM_DISKSTRING parameter. Here’s the description of V$ASM_DISKGROUP view from documentation:

V$ASM_DISKGROUP displays one row for every ASM disk group discovered by the ASM instance on the node.

http://docs.oracle.com/cd/E18283_01/server.112/e17110/dynviews_1027.htm

 

I got explain plan of V$ASM_DISKGROUP to know which X$ table stand behind it and got – X$KFGRP

SQL> set autotrace on

SQL> select count(1) from v$asm_diskgroup;

COUNT(1)

———-

3

 

pic1

 

 

 

 

 

 

 

SQL> select name_kfgrp from x$kfgrp;

NAME_KFGRP

——————————

 

DATA

FLASH

OCR

 

SQL> select GRPNUM_KFDSK, NUMBER_KFDSK, STATE_KFDSK, ASMNAME_KFDSK, PATH_KFDSK from x$KFDSK;

pic2

 

 

 

 

 

 

 

 

Then I queried ASM_DISKGROUPS parameter :

 

SQL> show parameter disk

 

NAME                                 TYPE        VALUE

———————————— ———– ——————————

asm_diskgroups                       string      OCR, DATA, FLASH

asm_diskstring                       string      /dev/oracleasm/disks

 

No I will create a new tablespace under FLASH diskgroup, create a new table, change owner of the disk of FLASH diskgroup and make it #*disappear* from V$ASM_DISKGROUP view, and then return everything back 

SQL> create tablespace new_tbs datafile ‘+FLASH’;

Tablespace created.

SQL> create table new_table (id number) tablespace new_tbs;

Table created.

SQL> insert into new_table values(1);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from new_table;

ID
———-
1

 

I create a parameter file from spfile, change ASM_DISKGROUPS parameter to OCR,DATA (remove FLASH) and mount the ASM instance again using a parameter file with

ASM_DISKGROUPS=’OCR’,'DATA’ specified:

 

pic7

 

So DISK6 is member and as the disk is discovered by ASM instance, FLASH diskgroup is dismounted, but still there.

Let’s change the owner of the disk and check it again. But before checking the owner, let’s read it’s header by KFED:

[root@node1 disks]# kfed read DISK6

pic4

 

Now let’s start the instance and check V$ASM_DISKGROUP view:
pic5

 

Query X$KFGRP view:


 
SQL> select NAME_KFGRP from X$KFGRP;

NAME_KFGRP

——————————

 

DATA

OCR

SQL>

 

Switch to the database and check if you can query the table:

SQL> select * from new_table;

select * from new_table

*

ERROR at line 1:

ORA-01157: cannot identify/lock data file 8 – see DBWR trace file

ORA-01110: data file 8: ‘+FLASH/rac/datafile/new_tbs.257.837080939′

SQL>

 

 

Now shutdown the ASM instance, return the owner back and check V$ASM_DISKGROUP again:

[root@node1 disks]# chown -R oracle:dba DISK6
pic6

 

FLASH diskgroup appeared, however it’s not specified at ASM_DISKGROUPS parameter. Now mount the diskgroup and query the table again:

SQL> alter diskgroup flash mount;

Diskgroup altered.

SQL>

SQL> select * from new_table;

ID
———-
1

SQL>

 

This means that if you want to move the ASM instance to another host, it’s enough to specify ASM_DISKSTRING parameter, V$ASM_DISKGROUP will discover all diskgroups

Posted in Administration, RAC issues | 1 Comment »

Step by Step Installing Oracle 12c RAC on VirtualBox – Video Tutorial

Posted by Kamran Agayev A. on September 22nd, 2013

It’s 3.50 AM at San Francisco and I’m finalizing a new video tutorial that I have been creating for the users who will attend RAC Attack at Oracle Openworld 2013 (Ninja Operation). Tomorrow is the first day of OOW13 and I’m very happy that I was able to finish it in time. The first minutes of this tutorial was created in the plane when I was flying to SFO this week :)

For the step by step instruction, I’ve used the following article

http://en.wikibooks.org/wiki/RAC_Attack_-_Oracle_Cluster_Database_at_Home/RAC_Attack_12c

 

You can watch the video from the following link:
12c_rac_installation

You can download the video from this link:

http://www.kamranagayev.com/Video_Tutorials/Step_by_Step_installing_Oracle_12c_RAC.mp4

 

Hope you’ll enjoy the video!

Posted in Administration, RAC issues | 4 Comments »

I’m RACAttack Ninja!

Posted by Kamran Agayev A. on September 19th, 2013

I’m proud to announce that I’m one of the 10 RAC Attack Ninjas this year! If you’re attending OOW13, don’t forget to come and say “hi” to Ninjas. We’ll be at OTN Lounge in the lobby of Moscone South on 24th and 25th September between 10am-2pm

Using documentation created by Ninjas, you’ll be able to install and run Oracle 12c RAC on your laptop!

For the step by step introduction, check the following link:

http://racattack.org/12c

To get more information about RacAttack event, check the following link:

http://www.pythian.com/blog/rac-attack-at-oracle-openworld-2013-operation-ninja/

You can check our facebook page:

https://www.facebook.com/racattackoow13

And don’t forget to use #RACAttack hashtag.

I’m working on “Step by Step Installing Oracle 12c RAC Video tutorial” and hope it will be ready for the RACAttack event. If you don’t have time to install it during OOW days, come and watch the video tutorial and learn how to install Oracle 12c RAC

See you all soon!

131982_634077923291835_2111683842_o

Posted in Administration, RAC issues | 1 Comment »

My book is published!

Posted by Kamran Agayev A. on September 13th, 2013

My book is out!

I would like to announce to all my fellow DBA friends that my book is published and is available at Amazon.

http://www.amazon.com/Oracle-Backup-Recovery-secrets-In-Focus/dp/0984428232/

It took me more than one year to write it. 3 years ago I got an email from Rampant Techpress where they asked me if I can write a book about RMAN Backup and Recovery. I wasn’t ready at that time, but decided to accept that challenge and agreed. The first man came in my mind to help me at that time was my friend Aman Sharma. He also agreed and we started the journey together.

After writing the first chapter I realized that we really need to have technical reviewers, so I asked Syed Sabdar and Hemant K. Chitale if they can help us. Fortunately they agreed.  I would like to thank them both as they did a great job and made a lot of corrections

All chapters are based on practical scenarios and deep research and I hope it will help you to understand RMAN in depth, to create your own RMAN backup strategy and to recover from any type of failures you can face in your production database. There’re a lot of scenarios under each chapter with an explanation and step by step guide to help you to reproduce them on your test environment

book

Posted in Administration, RMAN Backup and Recovery | 13 Comments »