Grid Control (11)

Friday 06 June, 2008 - 13:27

I have been working with alerts on gridctrl. I tried, without success, to use the Grid Control console to get three (3) components active, so I had to use the shell:

$ opmnctl status

Processes in Instance: EnterpriseManager0.gridctrl.yaocm.id.au
-------------------+--------------------+---------+---------
ias-component      | process-type       |     pid | status
-------------------+--------------------+---------+---------
DSA                | DSA                |     N/A | Down
HTTP_Server        | HTTP_Server        |    2938 | Alive
LogLoader          | logloaderd         |     N/A | Down
dcm-daemon         | dcm-daemon         |   11529 | Alive
OC4J               | home               |    2891 | Stopped
OC4J               | OC4J_EMPROV        |    2890 | Stopped
OC4J               | OC4J_EM            |    2951 | Alive
WebCache           | WebCache           |    3001 | Alive
WebCache           | WebCacheAdmin      |    2888 | Stopped

$ opmnctl startproc ias-component=OC4J
opmnctl: starting opmn managed processes...
================================================================================
opmn id=gridctrl.yaocm.id.au:6201
no processes matched this request
$ opmnctl startproc ias-component=OC4J_EMPROV
opmnctl: starting opmn managed processes...
================================================================================
opmn id=gridctrl.yaocm.id.au:6201
No processes match the specified configuration.
$ opmnctl startproc process-type=OC4J_EMPROV
opmnctl: starting opmn managed processes...
================================================================================
opmn id=gridctrl.yaocm.id.au:6201
no processes matched this request
$ opmnctl stopall
opmnctl: stopping opmn and all managed processes...

Communication error with the OPMN server local port.
Check the OPMN log files

opmnctl: graceful stop of processes failed, trying forceful shutdown...
$ opmnctl status
Unable to connect to opmn.
Opmn may not be up.
$ opmnctl startall
opmnctl: starting opmn and all managed processes...
$ opmnctl status

Processes in Instance: EnterpriseManager0.gridctrl.yaocm.id.au
-------------------+--------------------+---------+---------
ias-component      | process-type       |     pid | status
-------------------+--------------------+---------+---------
DSA                | DSA                |     N/A | Down
HTTP_Server        | HTTP_Server        |   14664 | Alive
LogLoader          | logloaderd         |     N/A | Down
dcm-daemon         | dcm-daemon         |     N/A | Down
OC4J               | home               |   14665 | Alive
OC4J               | OC4J_EMPROV        |   14666 | Alive
OC4J               | OC4J_EM            |   14668 | Alive
WebCache           | WebCache           |   14676 | Alive
WebCache           | WebCacheAdmin      |   14669 | Alive

The good old "shut everything down and start it up again" worked: the three stopped components (home, OC4J_EMPROV and WebCacheAdmin) are now Alive, although dcm-daemon is now Down. The reason for persisting with such a crude procedure is that, in operations, time is of the essence. Rather than trawling through myriad logs, a quick fix is better. Analysis is left for later, or for when the quick fix does not work.
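The restart-and-check routine above can be sketched as a pair of shell functions. This is only a rough illustration, not a tested operations script: the function names opmn_not_alive and opmn_bounce are my own, and it assumes opmnctl is on the PATH and that the status output keeps the four-column layout shown above.

```shell
# List every row of `opmnctl status` output whose status is not "Alive".
# Reads the status text on stdin and prints "ias-component/process-type status".
# (Hypothetical helper; assumes the four '|'-separated columns shown above.)
opmn_not_alive() {
    awk -F'|' '
        NF == 4 {
            # Trim leading and trailing blanks from each column.
            for (i = 1; i <= 4; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            # Skip the header row; report anything that is not Alive.
            if ($1 != "ias-component" && $4 != "Alive")
                print $1 "/" $2 " " $4
        }'
}

# Crude quick fix: stop everything, start everything, then report
# whatever is still not Alive. (Hypothetical wrapper; needs opmnctl on PATH.)
opmn_bounce() {
    opmnctl stopall
    opmnctl startall
    opmnctl status | opmn_not_alive
}
```

Running opmn_bounce after the restart shown above would be expected to list the DSA, LogLoader and dcm-daemon rows, since those are the ones still Down.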

The following logs in $ORACLE_BASE/OracleHomes/oms10g/opmn/logs are useless for problem diagnosis:

  • ipm.log
  • WebCache~WebCacheAdmin~1
  • WebCache~WebCache~1
  • OC4J~OC4J_EM~default_island~1
  • OC4J~OC4J_EMPROV~default_island~1
  • HTTP_Server~1

On the other hand, $ORACLE_BASE/OracleHomes/agent10g/sysman/log/emagent.trc has too much information, including errors from the upload process.
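One way to cut an over-full trace file down to size is to look only at the most recent error lines. The sketch below is a guess at a useful filter rather than a documented procedure: the function name trc_errors is made up, and it assumes that error lines in the trace contain the literal word ERROR, which may need adjusting.

```shell
# Print the last N lines of a trace file that mention ERROR.
# Assumption: error lines contain the literal word "ERROR".
# Usage: trc_errors FILE [N]   (N defaults to 20)
trc_errors() {
    file=$1
    count=${2:-20}
    grep 'ERROR' "$file" | tail -n "$count"
}
```

For example, trc_errors "$ORACLE_BASE/OracleHomes/agent10g/sysman/log/emagent.trc" 10 would show the ten most recent matching lines.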

At this stage, I have no idea why the opmn processes were not working correctly, and I am left with an XML upload problem.