Thursday, 23 April 2015

root.sh failed with "clscfg: Error in retrieving own node information" while adding node to the RAC cluster

Oracle Grid Infrastructure 11.2.0.4 and database 11.2.0.3

All cluster-related files on this node were accidentally deleted, so I had to remove the node from the cluster and add it back (the standard delete-node/add-node sequence is sketched below).
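For context, a minimal sketch of the 11gR2 delete-node/add-node sequence, assuming d001 is the surviving node and d002 is the node being re-added; the VIP name is a placeholder and the exact steps can vary with your inventory and GNS setup:

# On the surviving node, as root: remove the failed node from the cluster
/u01/app/11.2.0.4/grid/home_2/bin/crsctl delete node -n d002

# On the surviving node, as grid: update the inventory with the remaining nodes
/u01/app/11.2.0.4/grid/home_2/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0.4/grid/home_2 "CLUSTER_NODES={d001}" CRS=TRUE -silent

# On the surviving node, as grid: extend the Grid home to the new node
/u01/app/11.2.0.4/grid/home_2/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={d002}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={d002-vip}"

# Then run orainstRoot.sh and root.sh on d002 as root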

While running root.sh on this node, the log showed the error message below:
---------------------------------------------------------------------
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/11.2.0.4/grid/home_2
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0.4/grid/home_2/crs/install/crsconfig_params
Installing Trace File Analyzer
Unable to retrieve local node number 1.
Internal Error Information:
  Category: 0
  Operation:
  Location:
  Other:
  Dep: 0
clscfg: Error in retrieving own node information
/u01/app/11.2.0.4/grid/home_2/bin/clscfg -add failed
/u01/app/11.2.0.4/grid/home_2/perl/bin/perl -I/u01/app/11.2.0.4/grid/home_2/perl/lib -I/u01/app/11.2.0.4/grid/home_2/crs/install /u01/app/11.2.0.4/grid/home_2/crs/install/rootcrs.pl execution failed


I tried a lot of options, but all of them failed. In the end I tried to deconfigure Clusterware on the node; however, this was not successful.
-------------------------------------------------------------------------
[root@ ~]# /u01/app/11.2.0.4/grid/home_2/crs/install/rootcrs.pl -deconfig
Using configuration parameter file: /u01/app/11.2.0.4/grid/home_2/crs/install/crsconfig_params
Oracle Clusterware stack is not active on this node
Restart the clusterware stack (use /u01/app/11.2.0.4/grid/home_2/bin/crsctl start crs) and retry
Failed to verify resources

The same command with the -force option did the trick:
------------------------------------------------------------------------------
[root@ ~]# /u01/app/11.2.0.4/grid/home_2/crs/install/rootcrs.pl -deconfig -force
Using configuration parameter file: /u01/app/11.2.0.4/grid/home_2/crs/install/crsconfig_params
PRCR-1119 : Failed to look up CRS resources of ora.cluster_vip_net1.type type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd

CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
Removing Trace File Analyzer
Successfully deconfigured Oracle clusterware stack on this node

Then I executed orainstRoot.sh once again:
---------------------------------------------------------------------------------
[root ~]# /u01/app/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.

This time root.sh was successful.
----------------------------------------------------------------------------
[root ~]# /u01/app/11.2.0.4/grid/home_2/root.sh
Check /u01/app/11.2.0.4/grid/home_2/install/root_d002_2015-04-23_14-57-51.log for the output of root script

[root ~]# tail -100f /u01/app/11.2.0.4/grid/home_2/install/root_d002_2015-04-23_14-57-51.log
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/11.2.0.4/grid/home_2
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0.4/grid/home_2/crs/install/crsconfig_params
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to upstart
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node d001, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Preparing packages for installation...
cvuqdisk-1.0.9-1

Configure Oracle Grid Infrastructure for a Cluster ... succeeded
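Once root.sh completed, a quick sanity check that the node rejoined the cluster cleanly; a hedged sketch, run as the grid user:

/u01/app/11.2.0.4/grid/home_2/bin/crsctl check cluster -all
/u01/app/11.2.0.4/grid/home_2/bin/olsnodes -n -s -t
/u01/app/11.2.0.4/grid/home_2/bin/crsctl stat res -t

The first command should report CSS, CRS and EVM online on every node, olsnodes should show both nodes as Active, and the resource listing should include the VIP and instance resources for the re-added node.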

Monday, 17 November 2014

RMAN Incremental Backups to Roll Forward a Physical Standby Database

* DataGuard setup between primary and standby databases - ORACLE 11gR2

* Some of the archive logs on the primary were deleted accidentally and were not shipped to the standby. We realized it after 3 days.

* We followed the approach described in the link below, taking an incremental backup on the primary to roll the standby database forward (a short sketch follows the link).

https://docs.oracle.com/cd/B19306_01/server.102/b14239/scenarios.htm#CIHIAADC
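For reference, a minimal sketch of that roll-forward procedure; the SCN, the backup location /tmp/stby_incr and the tag are illustrative:

-- 1. On the standby: note the SCN to roll forward from
SQL> SELECT CURRENT_SCN FROM V$DATABASE;

-- 2. On the primary: take an incremental backup starting at that SCN
RMAN> BACKUP INCREMENTAL FROM SCN 1234567 DATABASE FORMAT '/tmp/stby_incr/fwd_%U' TAG 'STBY_ROLLFWD';

-- 3. Copy the backup pieces to the standby and catalog them there
RMAN> CATALOG START WITH '/tmp/stby_incr/';

-- 4. On the standby: stop managed recovery, apply the incremental, then restart recovery
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
RMAN> RECOVER DATABASE NOREDO;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;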

Friday, 14 November 2014

ORACLE 11gR2 - ASM Start failed with ORA-04031

This is a 2-node RAC with ASM.

ASM and the database were up and running on Node 1, whereas ASM startup on Node 2 (11gR2) failed with ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^f6c30305","kglHeapInitialize:temp").


The DB is configured to use the Automatic Memory Management (AMM) feature.

See the Oracle documentation on the AMM feature for background.

I doubled the values of memory_max_target and memory_target to get rid of the error.

Logged in as the 'grid' user, increased the values of both parameters, and started ASM:

 alter system set memory_max_target=4096m scope=spfile;
 alter system set memory_target=1024m scope=spfile;

As scope=spfile was used, the new values will take effect after the next instance restart on Node 2.
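A quick way to confirm the values after the restart (a hedged sketch, run as the grid user against the local ASM instance):

$ export ORACLE_SID=+ASM2
$ sqlplus / as sysasm
SQL> show parameter memory_max_target
SQL> show parameter memory_target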

Saturday, 4 October 2014

File Transfer failed with "File too large"

Most of you might be aware of this already.

I was trying to move files from ServerA to ServerB. After copying 10GB, file transfer failed with "File too Large".

Log in to the target server as root and add the following entry to /etc/security/limits for that particular user:

user1:
        fsize = -1
        nofiles = -1

This sets the user's file-size (and open-file) limits to unlimited; the change applies to new login sessions.
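Since /etc/security/limits is the AIX user limits file, the same change can also be made with chuser and verified with lsuser and ulimit; a sketch assuming AIX and a user named user1:

# As root: set unlimited file size and open file descriptors for user1
chuser fsize=-1 nofiles=-1 user1

# Verify the stanza values
lsuser -a fsize nofiles user1

# As user1, in a fresh session: confirm the effective limits
ulimit -f
ulimit -n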

Hope this helps.


Sunday, 14 September 2014

Oracle Goldengate - ERROR: sending message to REPLICAT (Timeout waiting for message).

ORACLE 11gR2 - GoldenGate

Recently we observed lag on one of the Replicats in GoldenGate. We tried to run 'stats', 'lag' and 'stop' against this Replicat; however, all of these commands failed with
"ERROR: sending message to REPLICAT <Replicat Name> (Timeout waiting for message)."

We identified the Unix PID for this Replicat, killed the process, and then restarted the Replicat:

ps -ef | grep <replicat name>
kill -9 <pid>
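After killing the OS process, the Replicat was restarted and checked from GGSCI; a minimal sketch (the Replicat name is a placeholder): start brings the process back up, info should report RUNNING, and lag shows the lag coming down.

GGSCI> start replicat <replicat name>
GGSCI> info replicat <replicat name>
GGSCI> lag replicat <replicat name>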

This resolved the issue, and we could see the lag decreasing afterwards.



Friday, 12 September 2014

ORA-01555 on Standby

DB - 11gR2

Users were running SELECT queries on the primary, whereas the same SELECT was failing with ORA-01555 on the Active Data Guard standby.

Undo_retention =   900 sec --Primary
Undo_retention = 21600 sec --Standby

In spite of the higher retention on the standby, users were facing ORA-01555.

Later we realized that undo_retention set on the standby has no significance: Data Guard simply carries over the undo_retention value from the primary. Increasing undo_retention on the primary got rid of this error (a sketch of the change follows).
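For reference, a hedged sketch of the change on the primary; the 21600-second value is illustrative (choose a retention long enough to cover the longest-running standby queries):

SQL> ALTER SYSTEM SET undo_retention = 21600 SCOPE=BOTH;
SQL> SHOW PARAMETER undo_retention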

The note below really helped in understanding this.
http://alexeymoseyev.wordpress.com/2013/10/24/ora-01555-on-standby/



ORA-12502: TNS:listener received no CONNECT_DATA from client.

Users were able to connect from one client machine, whereas they were having issues from another client; jobs were failing with an ORA-12502 error.

This is an 11gR2 RAC environment.

From both clients:
1. we could telnet and ping the DB hostnames
2. we could telnet and ping the SCAN

However, telnet to the VIPs was not working from the problematic client. We asked the apps team to raise a firewall request to open access to the VIPs from that client, and this resolved the issue. The quick checks are sketched below.
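A minimal sketch of the connectivity checks run from each client; hostnames and the 1521 listener port are placeholders:

# Basic reachability
ping db-node1
ping db-scan

# Listener port on the node, SCAN and VIP addresses
telnet db-node1 1521
telnet db-scan 1521
telnet db-node1-vip 1521

# Optional: resolve the service through the client's TNS/SCAN configuration
tnsping MYSERVICE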

The two links below helped me in troubleshooting the issue.

http://levipereira.wordpress.com/2011/05/03/configuring-client-to-use-scan-11-2-0/
http://stelliosdba.blogspot.sg/2012/02/ora-12502-tnslistener-received-no.html