Install OEM agent 10.2.0.4

Install Oracle Enterprise Manager 10.2.0.4 agents.

This document outlines the steps to install OEM 10.2.0.4 agents and to troubleshoot errors such as "Suspended on Agent Unreachable", "Instance Health Check initialization failed" or "Unreachable Start". See the Troubleshooting section of this document for more detail and possible fixes.

If you are here looking for a fix for the error "Suspended on Agent Unreachable", here is the quick fix, so you don't have to read this entire document:

Situation 1) Getting this error right after installing 10.2.0.4 agents

Disable the Health Check Metric Collection in Grid Control 10.2:
Targets -> Database -> click the desired database. At the bottom of the page, click Metric and Policy Settings, search the page (Ctrl+F) for "status" and look for Instance Status. Click its frequency (the default is Every 15 Seconds), click Disable, click Continue, then click OK. Then bounce the agents.
If disabling the Health Check Metric does not fix your problem, you may want to read the rest of this document.

Situation 2) Getting this error outside of a new 10.2.0.4 installation:

cd $AGENT_HOME/bin

emctl stop agent

Back up, then delete, all files in:

$AGENT_HOME/sysman/emd/upload and $AGENT_HOME/sysman/emd/state

emctl clearstate agent

emctl secure agent

emctl start agent
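
The "back up, then delete" step in Situation 2 is easy to get wrong by hand, so here is a minimal sketch of it as a shell function. The function name backup_and_clear and the backup paths are my own placeholders, not part of emctl, and it assumes that only regular files need to be removed and that the directories themselves must stay in place.

```shell
#!/bin/sh
# Sketch only. backup_and_clear is a hypothetical helper, not an emctl feature.
# It copies everything in a directory to a backup location, then deletes the
# regular files in the original directory (the directory itself is kept).
backup_and_clear() (
    src=$1     # e.g. $AGENT_HOME/sysman/emd/upload
    dest=$2    # backup directory of your choice (assumption: outside $src)
    [ -d "$src" ] || { echo "missing directory: $src" >&2; exit 1; }
    mkdir -p "$dest" || exit 1
    # Copy the directory contents, preserving timestamps and permissions.
    cp -Rp "$src"/. "$dest"/ || exit 1
    # Delete files only after the copy succeeded.
    find "$src" -type f -exec rm -f {} \;
)

# Intended use, run as the agent owner after "emctl stop agent":
#   backup_and_clear "$AGENT_HOME/sysman/emd/upload" /some/backup/upload
#   backup_and_clear "$AGENT_HOME/sysman/emd/state"  /some/backup/state
# then continue with "emctl clearstate agent" as listed above.
```

Nothing is deleted unless the copy succeeds, so a full backup file system stops the function before any agent file is lost.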


Reference:
Naming:

OS is HPUX.
Target server where OMS is installed: myoms
Target server where agent is installed: myagent
oma is the OS account that owns the OEM agent.

How to download agent 10.2.0.4:
Download agent 10.2.0.4 from Oracle OTN:
Downloads -> Enterprise Manager -> section: Mass Agent Deployment
Unzip HPUX_Grid_Control_agent_download_10_2_0_4_0.zip into a staging area.

Install OEM 10.2.0.4 agents.


1) First, uninstall the previous agent (if any).
I currently have agent 10.2.0.3, which I need to uninstall using these steps:
vi /var/opt/oracle/oraInst.loc
Make sure it points to the oraInventory location that holds the 10.2.0.3 agent
(e.g. /u14/app/oracle/product/OEM10.2/oraInventory).
Stop the agent:
cd /u14/app/oracle/product/OEM10.2/agent10g/bin
./emctl stop agent


Set environment:
TEMPDIR=/u02/temp
export TEMPDIR
TEMP=/u02/temp
export TEMP
TMP=/u04/tmp
export TMP
DISPLAY=172.16.52.156:0.0
export DISPLAY
unset OBJECT_MODE
Start Hummingbird -> Exceed
unset ORACLE_HOME
ORACLE_HOME=/u14/app/oracle/product/OEM10.2/agent10g
export ORACLE_HOME


un-install:
as oma (OS owner of agents)
cd /u14/stage/OMA/agent10204/hpunix/agent
./runInstaller

Once OUI starts, click Installed Products and remove agent10g, which is in
/u14/app/oracle/product/OEM10.2/agent10g
This ran for 10 minutes and successfully removed agent 10.2.0.3.


2) Install agent 10.2.0.4
cd /var/opt/oracle
vi /var/opt/oracle/oraInst.loc
Make sure it points to a new oraInventory location within the new home where you
will install the 10.2 agent. Create this oraInventory location if it does not already exist
(/u14/app/oracle/product/OEM10.2/oraInventory).

As the OS user that owns the 10.2.0.4 agent, edit .profile (vi .profile) and make sure it contains the following line, then log out and log back in:
ulimit -n 1024

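Remove the 10.2.0.3 OS files. Check that pwd prints /u14/app/oracle/product/OEM10.2 before running rm, because rm -Rf * deletes everything in the current directory (including any oraInventory directory under it):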
cd /u14/app/oracle/product/OEM10.2
pwd
rm -Rf *

Set environment:
TEMPDIR=/u02/temp
export TEMPDIR
TEMP=/u02/temp
export TEMP
TMP=/u04/tmp
export TMP
DISPLAY=172.16.52.156:0.0
export DISPLAY
unset OBJECT_MODE
Start Hummingbird -> Exceed
unset ORACLE_HOME
ORACLE_HOME=/u14/app/oracle/product/OEM10.2/agent10g
export ORACLE_HOME

install:
as oma (OS owner of agents)
cd /u14/stage/OMA/agent10204/hpunix/agent
./runInstaller
Once OUI starts and prompts you for the parent directory, set it to
/u14/app/oracle/product/OEM10.2
Make sure agent10g does not exist under /u14/app/oracle/product/OEM10.2
Set the Management Service host name to myoms.mydomain.com with the default port 4889.
This ran for 15 minutes with no errors, and then it told me to run root.sh.

Log on as root or as an account with sudo:
cd /u14/app/oracle/product/OEM10.2/agent10g
./root.sh
-> This ran for 3 seconds with no errors.

Now connect to each target database instance monitored by this agent and run:
alter user dbsnmp identified by oem_password;
Then open a browser at https://myoms.mydomain.com:1159/em and navigate to Targets -> Database -> database name -> Configure
This unlocks or recreates the monitoring account dbsnmp, and the instance starts being monitored by GRID.
All success.


If necessary, the following may help clear any errors you get after installation:
cd /u14/app/oracle/product/OEM10.2/agent10g/bin
./emctl clearstate agent
./emctl upload

Troubleshooting:
If you get these errors:

Agent is Unreachable. Also, all jobs send email with Status=Suspended on Agent Unreachable.

and you see errors like the ones below in /u14/app/oracle/product/OEM10.2/agent10g/sysman/log/emagent.trc.1:

Error detail:
2008-08-01 06:22:20,466 Thread-12293 ERROR util.files: ERROR: nmeufos_new: failed in lfiopn on file: /u14/app/oracle/product/OEM10.2/agent10g/sysman/emd/agntstmp.txt.error = 24 (Too many open files)
2008-08-01 06:22:20,466 Thread-12293 ERROR pingManager: Error in updating the agent time stamp file
2008-08-01 06:22:20,467 Thread-12293 ERROR http: snmehl_connect: failed to create socket: Too many open files (error = 24)
2008-08-01 06:22:20,467 Thread-12293 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://myoms.mydomain.com:1159/em/upload: retStatus=-32
2008-08-01 06:22:20,468 Thread-12293 ERROR util.files: ERROR: nmeufos_new: failed in lfiopn on file: /u14/app/oracle/product/OEM10.2/agent10g/sysman/emd/agntstmp.txt.error = 24 (Too many open files)
2008-08-01 06:22:20,468 Thread-12293 ERROR pingManager: Error in updating the agent time stamp file
2008-08-01 06:22:26,483 Thread-12294 ERROR util.fileops: ERROR: snmeuf_dirlist can't list directory: /u14/app/oracle/product/OEM10.2/agent10g/sysman/emd/upload: Too many open files (errno=24)
2008-08-01 06:22:26,484 Thread-12295 ERROR engine: Failed when generating a new ECID.
2008-08-01 06:22:26,491 Thread-12295 ERROR fetchlets.healthCheck: GIM-00104: file not found
LEM-00031: file not found; arguments: [lempgmh] [lmserr]
LEM-00033: file not found; arguments: [lempgfm] [Couldn't open message file]
LEM-00031: file not found; arguments: [lempgmh] [lmserr]
2008-08-01 06:22:26,491 Thread-12295 ERROR engine: [oracle_database,mysid.mydomain.com,health_check] : nmeegd_GetMetricData failed : Instance Health Check initialization failed due to one of the following causes: the owner of the EM agent process is not same as the owner of the Oracle instance processes; the owner of the EM agent process is not part of the dba group; or the database version is not 10g (10.1.0.2) and above.

End of paste from log emagent.trc.1
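
To check quickly whether another agent is hitting the same problem, you can count the trace lines that mention the errno 24 message. This is only a sketch: count_fd_errors is a helper name I made up, and it assumes the message text "Too many open files" appears exactly as in the paste above.

```shell
#!/bin/sh
# Sketch only. count_fd_errors is a hypothetical helper.
# It prints the number of lines in a trace file that contain the
# "Too many open files" message seen in emagent.trc.1 above.
count_fd_errors() (
    trc=$1    # e.g. $AGENT_HOME/sysman/log/emagent.trc.1
    [ -r "$trc" ] || { echo "cannot read: $trc" >&2; exit 1; }
    # grep -c exits non-zero when the count is 0, so mask that case.
    grep -c 'Too many open files' "$trc" || :
)

# Example:
#   count_fd_errors /u14/app/oracle/product/OEM10.2/agent10g/sysman/log/emagent.trc.1
```

A count above zero points at the file descriptor limit rather than at a network or OMS problem.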


Analysis:
Using glance (Shift+F) on the PID of the oma emagent process, and sar -v, I can see that oma continuously opens files, especially /u02/…/product/10.2.0/dbs/hc_mysid.dat, until it reaches the 1024 limit, and then the agent crashes. It currently has over 650 open files and the count is still increasing; the agent was started close to 4 hours ago.
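
As a rough worked example of the numbers above: if the open file count keeps growing at the average rate seen so far, you can estimate how long the agent has before it hits the limit. This is a sketch under a loud assumption, namely linear growth from zero open files at agent start; the function name minutes_until_fd_limit is hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes the open-file count grows linearly from 0 at agent
# start, which is an approximation. minutes_until_fd_limit is hypothetical.
minutes_until_fd_limit() (
    open_now=$1      # files currently open (e.g. from glance)
    minutes_up=$2    # minutes since the agent was started
    limit=$3         # nofiles limit for the agent owner
    [ "$open_now" -gt 0 ] || { echo "open file count must be > 0" >&2; exit 1; }
    if [ "$open_now" -ge "$limit" ]; then
        echo 0
        exit 0
    fi
    # remaining descriptors divided by average growth per minute
    echo $(( ($limit - $open_now) * $minutes_up / $open_now ))
)

minutes_until_fd_limit 650 240 1024   # prints 138
```

With 650 files after about 4 hours (240 minutes) and a limit of 1024, that gives roughly 138 more minutes before the agent runs out of descriptors.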

A quick fix for "Suspended on Agent Unreachable":

Disable the Health Check Metric Collection in Grid Control 10.2:
Targets -> Database -> click the desired database. At the bottom of the page, click Metric and Policy Settings, search the page (Ctrl+F) for "status" and look for Instance Status. Click its frequency (the default is Every 15 Seconds), click Disable, click Continue, then click OK.
If disabling the Health Check Metric does not fix your problem, you may want to read the rest of this document.

Note 430805.1 says to apply the following patches to fix this problem, which seems to be specific to HP-UX.

1) Stop or kill the agent.
2) As oma (owner of the agent):
cd $AGENT_HOME/rdbms/lib
cp ins_rdbms.mk ins_rdbms.mk.org
vi ins_rdbms.mk
Comment out the $(GENOCCISH) line so that this part of the makefile looks like:
client_sharedlib:
	$(GENCLNTSH)
#	$(GENOCCISH)
	$(GENAGTSH) $(LIBAGTSH) 1.0
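
If you have several agent homes to fix, the manual vi edit in step 2 can be sketched with sed instead. This is not from Note 430805.1: comment_genoccish is a name I made up, and the sketch assumes the line to comment out consists of optional leading whitespace followed by $(GENOCCISH), as in the makefile excerpt above.

```shell
#!/bin/sh
# Sketch only. comment_genoccish is a hypothetical helper that does the same
# as the manual vi edit: back up the makefile, then comment out $(GENOCCISH).
comment_genoccish() (
    mk=$1    # e.g. $AGENT_HOME/rdbms/lib/ins_rdbms.mk
    [ -f "$mk" ] || { echo "no such file: $mk" >&2; exit 1; }
    # Keep the first backup; do not overwrite it on a second run.
    [ -f "$mk.org" ] || cp "$mk" "$mk.org" || exit 1
    # Prefix the line with '#'. Lines already commented do not match.
    sed 's/^\([[:space:]]*\$(GENOCCISH)\)/#\1/' "$mk" > "$mk.tmp" &&
        mv "$mk.tmp" "$mk"
)

# Example:
#   comment_genoccish "$AGENT_HOME/rdbms/lib/ins_rdbms.mk"
```

The function writes to a temporary file and renames it, since sed on HP-UX has no in-place option that I would rely on. Compare the result against ins_rdbms.mk.org with diff before relinking.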

3) Apply Patch 5854190 in the agent Oracle home by following the instructions in the README:
For 10.2.0.3 / 10.2.0.4 Agent on HP-UX PA-RISC, download the version 10.2.0.3 of Patch 5854190
For 10.2.0.3 Agent on HP-UX Itanium, download the version 10.2.0.2 of Patch 5854190
For 10.2.0.4 Agent on HP-UX Itanium, download the version 10.2.0.3 of Patch 5854190

Download the patch into a staging area.
Make sure /var/opt/oracle/oraInst.loc points to where you installed agent 10.2.0.4.
cd to the staging area and unzip p5854190_10203_HP64.zip
cd /u14/stage/OMA/5854190/5854190
ORACLE_HOME=/u14/app/oracle/product/OEM10.2/agent10g
export ORACLE_HOME
/u14/app/oracle/product/OEM10.2/agent10g/OPatch/opatch apply
This ran for 3 minutes and finished successfully with the message: OPatch succeeded.
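
The choice of which version of Patch 5854190 to download in step 3 is easy to mix up, so here it is as a small lookup. The mapping itself comes straight from the list in step 3; the function name and the pa-risc / itanium labels are placeholders I chose for this sketch.

```shell
#!/bin/sh
# Sketch only. Encodes the version table from step 3.
# patch_5854190_version and the architecture labels are hypothetical names.
patch_5854190_version() (
    agent=$1    # agent version: 10.2.0.3 or 10.2.0.4
    arch=$2     # pa-risc or itanium
    case "$agent:$arch" in
        10.2.0.3:pa-risc|10.2.0.4:pa-risc) echo 10.2.0.3 ;;
        10.2.0.3:itanium)                  echo 10.2.0.2 ;;
        10.2.0.4:itanium)                  echo 10.2.0.3 ;;
        *) echo "no mapping for agent $agent on $arch" >&2; exit 1 ;;
    esac
)

patch_5854190_version 10.2.0.4 pa-risc   # prints 10.2.0.3
```

Anything outside the three documented combinations returns an error rather than a guess.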

4) Relink the agent by following Note 273189.1.
Set ORACLE_HOME to the agent home:
ORACLE_HOME=/u14/app/oracle/product/OEM10.2/agent10g
export ORACLE_HOME
Change into the bin directory of the 10g central agent home:
cd $ORACLE_HOME/bin

Make sure the agent is down:
./emctl stop agent
./emctl status agent
cd $ORACLE_HOME/sysman/lib
make -f ins_emagent.mk agent
-> This ran for 2 minutes with no errors.
cd $ORACLE_HOME
pwd
/u14/app/oracle/product/OEM10.2/agent10g
su - root (or sudo)
cd /u14/app/oracle/product/OEM10.2/agent10g
./root.sh
Answer no when asked to overwrite "dbhome", "oraenv" and "coraenv".
exit
su - oma
ORACLE_HOME=/u14/app/oracle/product/OEM10.2/agent10g
export ORACLE_HOME
cd $ORACLE_HOME/bin
emctl start agent

5) Patch target databases:
Do the following steps in each monitored ORACLE_HOME:
su - oracle
10.1.0.3, 10.1.0.2: consider patching or upgrading.
10.2.0.2: apply Patch 4559294 in the RDBMS Oracle home by following the instructions in the README.
10.2.0.3 (and above): nothing to be done. The fix is already included.
Shut down the monitored database and patch it.
Make sure /var/opt/oracle/oraInst.loc points to where the target database is installed.
cd /u14/stage/patches/db/4559294
unzip p4559294_10202_HP64.zip
cd 4559294
/u02/app/oracle/product/10.2.0/OPatch/opatch apply
This ran for 5 minutes with no errors: OPatch succeeded.

Rename or delete the health check file $RDBMS_HOME/dbs/hc_<SID>.dat:
cd $ORACLE_HOME/dbs
cp hc_mysid.dat hc_mysid.dat.org
rm hc_mysid.dat

Restart the monitored database. This recreates the $RDBMS_HOME/dbs/hc_<SID>.dat file.
Bounce the agents.
Applying the agent patch and the target database patch fixed the problem.
For PROD, plan to go to either 10.2.0.4 or 11g.

6) Disable the Health Check Metric Collection in Grid Control 10.2
If you cannot patch your database at this moment, see Note 379423.1.
Disabling this metric means that database availability will rely on the Response metric, which is collected by default every 5 minutes.
From the GRID home page https://myoms.mydomain.com:1159/em
Targets -> Database -> click the desired database. At the bottom of the page, click Metric and Policy Settings, search the page (Ctrl+F) for "status" and look for Instance Status. Click its frequency (the default is Every 15 Seconds), click Disable, click Continue, then click OK.
Restart the agents.
In my case, just disabling Health Check Metric Collection on the target database did not fix the problem. I had to patch the 10.2.0.4 agent and the target database and also disable Health Check Metric Collection, as outlined in steps 3, 4, 5 and 6 on this page.


Fixed. The problem did not come back, and all jobs started running successfully again on their own.

Fix for Message Too many open files:

To fix this, shut down or kill the running agent. Then, as oma, run ulimit -a. For me this showed a current (soft) setting of 60 for nofiles(descriptors), while ulimit -aH, which shows the hard limits, showed 1024 for nofiles(descriptors). So I edited oma's .profile, added the line ulimit -n 1024 at the end of the file, then logged out, logged back in as oma and restarted the agents:

cd /u14/app/oracle/product/OEM10.2/agent10g/bin
ps -aef | grep -i oma
Kill any remaining oma processes.

./emctl start agent
./emctl clearstate agent
./emctl upload
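
Before editing .profile, it helps to compare the soft and hard limits, because ulimit -n 1024 only works for a non-root user when the hard limit allows it. The function below is a sketch of that check; fd_limit_action and its three result words are names I made up, and it assumes the shell prints either a number or the word unlimited for each limit.

```shell
#!/bin/sh
# Sketch only. fd_limit_action is a hypothetical helper that compares the
# soft and hard nofiles limits with the value the agent needs.
fd_limit_action() (
    soft=$1      # output of: ulimit -n
    hard=$2      # output of: ulimit -Hn
    wanted=$3    # e.g. 1024
    if [ "$soft" = unlimited ]; then
        echo ok
    elif [ "$soft" -ge "$wanted" ]; then
        echo ok
    elif [ "$hard" = unlimited ] || [ "$hard" -ge "$wanted" ]; then
        echo raise-soft        # "ulimit -n $wanted" in .profile is enough
    else
        echo hard-too-low      # the hard limit must be raised first
    fi
)

fd_limit_action 60 1024 1024   # prints raise-soft
# On a live system:
#   fd_limit_action "$(ulimit -n)" "$(ulimit -Hn)" 1024
```

With the values from my system (soft 60, hard 1024) the answer is raise-soft, which matches the .profile change described above.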
