2017-11-25 ASM Fails to Start

Overview

Even with the correct UDEV settings, ASM and CRS fail to start up.

References

Investigation

Attempt to Start ASM

When I found ASM not to be running, I attempted to start ASM manually:

[oracle@redfern1 ~]$ sqlplus / as sysdba SQL*Plus: Release 12.1.0.2.0 Production on Sat Nov 18 17:42:22 2017 Copyright (c) 1982, 2014, Oracle. All rights reserved. Connected to an idle instance. SQL> startup ORA-01078: failure in processing system parameters ORA-29701: unable to connect to Cluster Synchronization Service

There were no entries in the ASM alert log.

CRS Log

With no entries in the ASM alert log, and the message above indicating a problem with CSS, I went to the CRS log first. The CRS log (/opt/app/oracle/diag/crs/redfern1/crs/trace/alert.log) shows:

2017-11-18 17:27:03.996000 +11:00 2017-11-18 17:27:03.991 [OHASD(2307)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 2307 2017-11-18 17:27:03.995 [OHASD(2307)]CRS-0714: Oracle Clusterware Release 12.1.0.2.0. 2017-11-18 17:27:04.032 [OHASD(2307)]CRS-2112: The OLR service started on node redfern1. 2017-11-18 17:27:04.313 [OHASD(2307)]CRS-1301: Oracle High Availability Service started on node redfern1. 2017-11-18 17:27:04.351 [OHASD(2307)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred 2017-11-18 17:27:05.559000 +11:00 2017-11-18 17:27:05.558 [CSSDMONITOR(2413)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 2413 2017-11-18 17:27:05.744 [CSSDAGENT(2411)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 2411 2017-11-18 17:27:06.360 [ORAAGENT(2405)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 2405 2017-11-18 17:27:06.929000 +11:00 2017-11-18 17:27:06.927 [ORAROOTAGENT(2409)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 2409 2017-11-18 17:27:08.413000 +11:00 2017-11-18 17:27:08.412 [ORAAGENT(2463)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 2463 2017-11-18 17:27:08.591 [MDNSD(2475)]CRS-8500: Oracle Clusterware MDNSD process is starting with operating system process ID 2475 2017-11-18 17:27:08.669 [EVMD(2478)]CRS-8500: Oracle Clusterware EVMD process is starting with operating system process ID 2478 2017-11-18 17:27:09.629000 +11:00 2017-11-18 17:27:09.628 [GPNPD(2514)]CRS-8500: Oracle Clusterware GPNPD process is starting with operating system process ID 2514 2017-11-18 17:27:10.807000 +11:00 2017-11-18 17:27:10.805 [GIPCD(2533)]CRS-8500: Oracle Clusterware GIPCD process is starting with operating system process ID 2533 2017-11-18 17:27:11.100 [GPNPD(2514)]CRS-2328: GPNPD started on node redfern1. 2017-11-18 17:27:17.670000 +11:00 2017-11-18 17:27:17.664 [CSSDMONITOR(2605)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 2605 2017-11-18 17:27:17.824 [CSSDAGENT(2622)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 2622 2017-11-18 17:27:17.830 [ORAROOTAGENT(2624)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 2624 2017-11-18 17:27:18.364 [OCSSD(2661)]CRS-8500: Oracle Clusterware OCSSD process is starting with operating system process ID 2661 2017-11-18 17:27:19.416000 +11:00 2017-11-18 17:27:19.415 [OCSSD(2661)]CRS-1713: CSSD daemon is started in hub mode 2017-11-18 17:27:19.512 [OCSSD(2661)]CRS-1714: Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /opt/app/oracle/diag/crs/redfern1/crs/trace/ocssd.trc 2017-11-18 17:27:34.525000 +11:00

OCSSD Log

Based on the entry from the CRS log highlighted above, the OCSSD log (/opt/app/oracle/diag/crs/redfern1/crs/trace/ocssd.trc) was examined and it shows:

2017-11-18 17:27:19.365144 : CSSD:1471755904: (TLM) Starting CSS daemon, version 12.1.0.2.0 with uniqueness value 1510986439 2017-11-18 17:27:19.365197 : CSSD:1471755904: clsu_load_ENV_levels: Module = CSSD, LogLevel = 2, TraceLevel = 0 2017-11-18 17:27:19.365228 : CSSD:1471755904: clsu_load_ENV_levels: Module = CSSDNMC, LogLevel = 2, TraceLevel = 0 2017-11-18 17:27:19.365252 : CSSD:1471755904: clsu_load_ENV_levels: Module = CSSDGMPC, LogLevel = 2, TraceLevel = 0 2017-11-18 17:27:19.365274 : CSSD:1471755904: clsu_load_ENV_levels: Module = CSSDGMCC, LogLevel = 2, TraceLevel = 0 2017-11-18 17:27:19.365294 : CSSD:1471755904: clsu_load_ENV_levels: Module = CSSDBCMC, LogLevel = 2, TraceLevel = 0 2017-11-18 17:27:19.365313 : CSSD:1471755904: clsu_load_ENV_levels: Module = CLSF, LogLevel = 0, TraceLevel = 0 2017-11-18 17:27:19.365332 : CSSD:1471755904: clsu_load_ENV_levels: Module = SKGFD, LogLevel = 0, TraceLevel = 0 2017-11-18 17:27:19.365350 : CSSD:1471755904: clsu_load_ENV_levels: Module = GPNP, LogLevel = 1, TraceLevel = 0 2017-11-18 17:27:19.365366 : CSSD:1471755904: clsu_load_ENV_levels: Module = CLSINET, LogLevel = 0, TraceLevel = 0 2017-11-18 17:27:19.365386 : CSSD:1471755904: clsu_load_ENV_levels: Module = CLSFA, LogLevel = 3, TraceLevel = 0 CSSD:1471755904: clsugetconf : Configuration type [4]. 2017-11-18 17:27:19.370526 : CSSD:1471755904: clssscGetParameterOLR: OLR fetch for parameter auth rep (9) failed with rc 21 2017-11-18 17:27:19.370530 : CSSD:1471755904: clssscagStartAgLsnr: Failed to get auth location from OLR, constructing manually 2017-11-18 17:27:19.370533 : CSSD:1471755904: clssscagStartAgLsnr: auth location '/opt/app/grid_infra/12.1.0/grid/auth/css/' 2017-11-18 17:27:19.370853 : CSSD:1471755904: clssscGPNPInit: PERF_TIME Initializing GPNP … 2017-11-18 17:27:19.491527 : CSSD:1471755904: clssnmReadDiscoveryProfile: voting file discovery string() 2017-11-18 17:27:19.491529 : CSSD:1471755904: clssnkInit: NK generic layer initializing. 2017-11-18 17:27:19.492218 : CSSD:1312909056: clssscthrdmain: Starting thread clssnmvDDiscThread 2017-11-18 17:27:19.492401 : GPNP:1312909056: clsgpnp_Init: [at clsgpnp0.c:654] '/opt/app/grid_infra/12.1.0/grid' in effect as GPnP home base. 2017-11-18 17:27:19.492407 : GPNP:1312909056: clsgpnp_Init: [at clsgpnp0.c:708] GPnP pid=2661, cli=clsuGpnpg GPNP comp tracelevel=1, depcomp tracelevel=0, tlsrc:ORA_DAEMON_LOGGING_LEVELS, apitl:0, complog:1, tstenv:0, devenv:0, envopt:0, flags=2003 2017-11-18 17:27:19.494824 : GPNP:1312909056: clsgpnpkwf_initwfloc: [at clsgpnpkwf.c:402] Using FS Wallet Location : /opt/app/grid_infra/12.1.0/grid/gpnp/redfern1/wallets/peer/ 2017-11-18 17:27:19.494838 : GPNP:1312909056: clsgpnpkwf_initwfloc: [at clsgpnpkwf.c:414] Wallet readable. Path: /opt/app/grid_infra/12.1.0/grid/gpnp/redfern1/wallets/peer/ 2017-11-18 17:27:19.511333 : GPNP:1312909056: clsgpnp_Term: [at clsgpnp0.c:1347] GPnP cli=clsuGpnpg 2017-11-18 17:27:19.512115 : CLSF:1312909056: Allocated CLSF context 2017-11-18 17:27:19.512119 : CSSD:1312909056: clssnmvDDiscThread: using discovery string for initial discovery 2017-11-18 17:27:19.512126 : SKGFD:1312909056: Discovery with str:: 2017-11-18 17:27:19.512130 : SKGFD:1312909056: UFS discovery with :: 2017-11-18 17:27:19.512134 : SKGFD:1312909056: Execute glob on the string /dev/sd* 2017-11-18 17:27:19.512186 : SKGFD:1312909056: running stat on disk:/dev/sdf 2017-11-18 17:27:19.512233 : SKGFD:1312909056: running stat on disk:/dev/sdd 2017-11-18 17:27:19.512237 : SKGFD:1312909056: running stat on disk:/dev/sde 2017-11-18 17:27:19.512242 : SKGFD:1312909056: running stat on disk:/dev/sdc 2017-11-18 17:27:19.512245 : SKGFD:1312909056: running stat on disk:/dev/sdb 2017-11-18 17:27:19.512249 : SKGFD:1312909056: running stat on disk:/dev/sda 2017-11-18 17:27:19.512255 : SKGFD:1312909056: Fetching UFS disk :/dev/sda: 2017-11-18 17:27:19.512257 : CLSF:1312909056: Ignoring 0-byte file /dev/sda 2017-11-18 17:27:19.512259 : SKGFD:1312909056: Fetching UFS disk :/dev/sdb: 2017-11-18 17:27:19.512260 : CLSF:1312909056: Ignoring 0-byte file /dev/sdb 2017-11-18 17:27:19.512262 : SKGFD:1312909056: Fetching UFS disk :/dev/sdc: 2017-11-18 17:27:19.512263 : CLSF:1312909056: Ignoring 0-byte file /dev/sdc 2017-11-18 17:27:19.512264 : SKGFD:1312909056: Fetching UFS disk :/dev/sde: 2017-11-18 17:27:19.512266 : CLSF:1312909056: Ignoring 0-byte file /dev/sde 2017-11-18 17:27:19.512267 : SKGFD:1312909056: Fetching UFS disk :/dev/sdd: 2017-11-18 17:27:19.512268 : CLSF:1312909056: Ignoring 0-byte file /dev/sdd 2017-11-18 17:27:19.512270 : SKGFD:1312909056: Fetching UFS disk :/dev/sdf: 2017-11-18 17:27:19.512271 : CLSF:1312909056: Ignoring 0-byte file /dev/sdf 2017-11-18 17:27:19.512272 : SKGFD:1312909056: OSS discovery with :: 2017-11-18 17:27:19.512714 : CSSD:1312909056: clssnmvDiskVerify: Successful discovery of 0 disks 2017-11-18 17:27:19.512717 : CSSD:1312909056: clssnmCompleteInitVFDiscovery: Completing initial voting file discovery 2017-11-18 17:27:19.512719 : CSSD:1312909056: clssnmvFindInitialConfigs: No voting files found 2017-11-18 17:27:19.512757 : CSSD:1312909056: (:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds

Points to note from this extract from the CSS log:

    1. There was no OLR file set up
    2. Devices prefixed by /dev/sd* are scanned by default for the voting disk
    3. The real voting disk is mounted on /dev/xvdh

Analysis

Reviewing my system build log, I found that the installation of GI failed as described in 06 Install GI 12.1.0.2 was not successful.

Deinstall GI

Tried following the procedure in How to Deconfigure/Reconfigure(Rebuild OCR) or Deinstall Grid Infrastructure (Doc ID 1377349.1), except that I found that perl was not installed.