A virtual IP is required so that applications can be designed to be highly available. A highly available system needs to eliminate single points of failure (SPOFs). In Oracle, clients connected to a RAC database must be able to survive a node failure. Client applications connect to an Oracle instance and access the database through that instance, so a node failure brings down the instance to which the client may have connected. The first design available from Oracle was Transparent Application Failover (TAF). With TAF, a session can fail over to a surviving instance and continue processing.
TAF had various limitations; for instance, only query failover is supported. Also, to reduce the latency of failing over to the surviving node, Oracle tweaked the TCP timeout (platform dependent; it defaults to 10 minutes in most UNIX ports). It wouldn’t be a good idea to design a system in which a client takes 10 minutes to detect that there is no response from the node to which it has connected.
To address this, Oracle 10g introduced cluster VIPs: cluster virtual IP addresses that the outside world uses to connect to the database. A VIP needs to be different from the other IP addresses within the cluster. Traditionally, listeners listened on the public IP of the box, and clients contacted the listener on that IP. If the node died, the client took the full TCP timeout to detect the death of the node. In 10g, each node of the cluster has a VIP configured in the same subnet as its public IP. The VIP name and address must be registered in DNS in addition to the standard static IP information. Listeners are configured to listen on the VIPs instead of the public IP.
When a node goes down, its VIP is automatically failed over to one of the other nodes. During the failover, the node that gets the VIP will “re-ARP” to the world, announcing the new MAC address of the VIP. Clients already connected to this VIP are immediately sent a reset packet, so they get errors at once rather than waiting for the TCP timeout. A client that tries to connect to the failed node’s VIP has its connection refused right away, and the client application then chooses the next available node from the descriptor list to get a connection. Applications need to be written to catch the reset errors and handle them. Typically, for queries, applications should see an ORA-3113 error.
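The "try the next address in the descriptor list" behavior described above can be illustrated with a minimal Python sketch. This is not Oracle Net's implementation; it only assumes plain TCP sockets, and the function name `connect_first_available` and the default timeout are hypothetical choices made for the example.

```python
import socket


def connect_first_available(addresses, timeout=3.0):
    """Try each (host, port) in order and return (socket, address) for the
    first one that accepts the connection.

    A refused connection raises immediately, so the loop moves on to the
    next address without waiting for a long TCP timeout.
    """
    errors = []
    for host, port in addresses:
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
            return sock, (host, port)
        except OSError as exc:  # includes ConnectionRefusedError and timeouts
            errors.append(((host, port), exc))
    raise ConnectionError(f"no address accepted the connection: {errors}")
```

A caller would pass the VIP addresses in the order they appear in its connect descriptor, for example `[("node1-vip", 1521), ("node2-vip", 1521)]`, where the host names are placeholders.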
Note: In computer networking, the Address Resolution Protocol (ARP) is the method of finding a host’s hardware address (MAC address) when only its IP address is known. Hosts use ARP when they want to communicate with each other on the same network. Routers also use it to forward a packet from one host through another router. In cluster VIP failovers, the node that takes over the VIP advertises the new MAC address for that IP to the world. This is known as gratuitous ARP; during this operation, the old hardware address is invalidated in the ARP caches, and all new connections get the new hardware address.
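To make the gratuitous ARP in the note concrete, here is a sketch that builds (but does not send) such a frame in Python. It assumes Ethernet and IPv4 and uses the common "ARP request to broadcast with sender IP equal to target IP" form; some stacks send an ARP reply instead. The function name `build_gratuitous_arp` is hypothetical, and the MAC and IP in the usage line are placeholder values.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Return a 42-byte Ethernet frame carrying a gratuitous ARP request
    announcing that `ip` is now reachable at `mac`."""
    hw = bytes.fromhex(mac.replace(":", ""))
    if len(hw) != 6:
        raise ValueError("MAC address must be 6 bytes")
    addr = socket.inet_aton(ip)          # 4-byte IPv4 address
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth = broadcast + hw + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender MAC/IP, then target MAC (zeros) and target IP.
    # Sender IP == target IP is what makes the ARP "gratuitous".
    arp += hw + addr + b"\x00" * 6 + addr
    return eth + arp


frame = build_gratuitous_arp("00:11:22:33:44:55", "192.0.2.10")
```

Neighbors that receive such a frame update the cached MAC address for the announced IP, which is why new connections reach the node that now holds the VIP.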
Virtual Interface or Virtual IP
In Oracle Clusterware 10g, a virtual IP (VIP) is a logical, public IP address assigned to a node.
It is not physically assigned to the network card.
Because it is logical, CRS can easily start, stop, and migrate it.
Two types of VIP implementations are supported by Oracle Clusterware:
1) Database VIP
> The advantage of using VIP when making connections to the database compared to the traditional TCP method is that it overcomes the delay in receiving a failure signal that is encountered by the user connection when a node is not reachable.
> When the VIP fails over and the client tries to connect to, say, port 1521, it gets an immediate failure rather than having to wait for a TCP timeout. The failure (a NAK, or negative acknowledgement) is immediate because the IP is active, but nothing behind that IP has opened the port the client is trying to connect to. This saves roughly 10 minutes of TCP timeout wait.
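The difference between an immediate refusal and a timeout can be observed from the client side with a short Python sketch. It assumes an ordinary TCP connect; the function name `probe` and the 3-second timeout are hypothetical and are not Oracle client settings.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Attempt a TCP connect and return (outcome, elapsed_seconds).

    outcome is "open" if something is listening, "refused" if the IP is up
    but nothing has opened the port (the failed-over VIP case), or
    "timeout" if no answer arrived within `timeout` (the dead-host case).
    """
    start = time.monotonic()
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        outcome = "open"
    except ConnectionRefusedError:
        outcome = "refused"
    except socket.timeout:
        outcome = "timeout"
    finally:
        sock.close()
    return outcome, time.monotonic() - start
```

Against a failed-over VIP, the expected outcome is "refused" almost instantly, whereas a host with no VIP failover would leave the client waiting until the timeout expires.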
2) Application VIP
> Similar to the database VIP, with the difference that the application is bound to the VIP: when the application fails over, the VIP fails over with it. Clients continue to make network requests to the VIP and continue to operate as normal.