Describe Cluster hardware best practices

Hardware Design

In Oracle® Clusterware Administration and Deployment Guide 11g Release 2 (11.2), the section, Oracle Clusterware Hardware Concepts and Requirements, says that:

…However, a server that is part of a cluster, otherwise known as a node or a cluster member, requires a second network. This second network is referred to as the interconnect. For this reason, cluster member nodes require at least two network interface cards: one for a public network and one for a private network. The interconnect network is a private network using a switch (or multiple switches) that only the nodes in the cluster can access.
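
To make this concrete, a two-node cluster typically resolves one public name, one virtual IP (VIP) name, and one private interconnect name per node. A hypothetical /etc/hosts fragment follows; the names and addresses are illustrative only, and in practice the public and VIP names are usually registered in DNS:

    # Public network -- client-facing, routable
    192.168.10.11   racnode1
    192.168.10.12   racnode2

    # Node VIPs -- managed by Oracle Clusterware on the public network
    192.168.10.21   racnode1-vip
    192.168.10.22   racnode2-vip

    # Private interconnect -- non-routable, cluster nodes only
    10.0.0.11       racnode1-priv
    10.0.0.12       racnode2-priv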

…If you are implementing a cluster for high availability, then configure redundancy for all of the components of the infrastructure as follows:

    • At least two network interfaces for the public network, bonded to provide one address
    • At least two network interfaces for the private interconnect network, also bonded to provide one address
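
On Linux, this bonding is commonly done with the kernel bonding driver. A minimal sketch, assuming Red Hat-style network scripts, interfaces eth0 and eth1, and a hypothetical public address:

    # /etc/sysconfig/network-scripts/ifcfg-bond0 -- public bond, one address
    DEVICE=bond0
    IPADDR=192.168.10.11
    NETMASK=255.255.255.0
    BOOTPROTO=none
    ONBOOT=yes
    BONDING_OPTS="mode=active-backup miimon=100"

    # /etc/sysconfig/network-scripts/ifcfg-eth0 -- repeat for eth1
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    BOOTPROTO=none
    ONBOOT=yes

A second bond (for example, bond1 over two more interfaces) would carry the private interconnect address in the same way.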

The cluster requires cluster-aware storage that is connected to each server in the cluster. This may also be referred to as a multihost device. Oracle Clusterware supports NFS, iSCSI, Direct Attached Storage (DAS), Storage Area Network (SAN) storage, and Network Attached Storage (NAS).
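
If you choose NFS, for example, Oracle requires specific mount options so that writes behave correctly under cluster concurrency. A representative /etc/fstab entry for Linux follows; the server name and paths are placeholders, and the exact option list for your platform should be taken from the installation guide:

    nfs-filer:/export/oradata  /u02/oradata  nfs  rw,bg,hard,nointr,tcp,vers=3,timeo=600,rsize=32768,wsize=32768,actimeo=0  0 0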

To provide redundancy for storage, provide at least two connections from each server to the cluster-aware storage; depending on your I/O requirements, you may need more. It is important to consider the I/O requirements of the entire cluster when choosing your storage subsystem.
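
On Linux, redundant SAN connections are typically coalesced into a single logical device with dm-multipath, so that the loss of one HBA, cable, or switch does not interrupt I/O. A minimal /etc/multipath.conf sketch, with a hypothetical WWID and alias:

    defaults {
        user_friendly_names yes
    }
    multipaths {
        multipath {
            # WWID of the shared LUN (hypothetical); list paths with: multipath -ll
            wwid  360a98000486e2f66426f583133366a71
            alias ocrvote1
        }
    }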

IPMI

In Oracle® Grid Infrastructure Installation Guide 11g Release 2 (11.2) for Linux, the section, 2.13 Enabling Intelligent Platform Management Interface (IPMI), says that:

Intelligent Platform Management Interface (IPMI) provides a set of common interfaces to computer hardware and firmware that system administrators can use to monitor system health and manage the system. With Oracle 11g release 2, Oracle Clusterware can integrate IPMI to provide failure isolation support and to ensure cluster integrity.
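
Outside of Clusterware, the same interfaces can be exercised with the standard ipmitool utility, which gives a feel for the health data IPMI exposes. For example (the remote address and credentials in the last command are illustrative):

    # Read sensor data (fan speeds, temperatures, voltages) from the local BMC
    ipmitool sdr list

    # Review the hardware System Event Log
    ipmitool sel list

    # Query a node's BMC remotely over its LAN interface
    ipmitool -I lanplus -H 10.0.0.101 -U ipmiadmin -P secret sdr list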

You can configure node-termination during installation by selecting a node-termination protocol, such as IPMI. You can also configure IPMI after installation with crsctl commands.
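
A post-installation configuration sketch using the crsctl commands documented in the same guide (run as root on each node; the administrator name and BMC address are placeholders):

    # Store the IPMI administrator credentials (prompts for the password)
    crsctl set css ipmiadmin ipmiadmin

    # Record this node's BMC address
    crsctl set css ipmiaddr 10.0.0.101

    # Confirm the stored address
    crsctl get css ipmiaddr

    # Remove the IPMI configuration entirely, if required
    crsctl unset css ipmiconfig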

In Oracle® Clusterware Administration and Deployment Guide 11g Release 2 (11.2), the section, About Using IPMI for Failure Isolation, says that:

Failure isolation is a process by which a failed node is isolated from the rest of the cluster to prevent whatever happened to the failed node from happening to other nodes. You must configure and use an external mechanism capable of restarting a problem node without cooperation either from Oracle Clusterware or from the operating system running on that node. To provide this capability, Oracle Clusterware 11g release 2 (11.2) supports the Intelligent Platform Management Interface (IPMI) specification, an industry-standard management protocol.
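
In practice the external mechanism is the node's baseboard management controller (BMC), which can power-cycle the host even when the operating system is hung. With ipmitool, for instance (address and credentials are again illustrative):

    # Check, then hard-reset, a hung node through its BMC -- no OS cooperation needed
    ipmitool -I lanplus -H 10.0.0.101 -U ipmiadmin -P secret chassis power status
    ipmitool -I lanplus -H 10.0.0.101 -U ipmiadmin -P secret chassis power reset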