Clustering Oracle RAC Virtual Machines across physical and ESX hosts

Thursday Aug 16th 2007 by Tarry Singh

Part 4 of this series examines the options for building clusters across several physical and ESX hosts and takes a quick look at upgrading clustered Virtual Machines in all three scenarios.

Brief intro

In our previous article, we looked at the clustering possibilities across two or more ESX Servers. In this article, we take a detailed look at building clusters across several physical and ESX hosts, a topic we did not get to last time. In addition, we take a quick look at upgrading clustered Virtual Machines in all three scenarios. There is a very good chance that you have an Oracle RAC test or development cluster on ESX 2.5 and want to move to the latest ESX 3.x version (the latest being ESX 3.0.2 as of last week).

Clustering Oracle RAC Virtual Machines across physical and ESX hosts

I speak to several clients who are running their production Oracle environments on VMware. The choice of running Oracle RAC differs per organization, but I firmly believe that it is possible to run a DSS (Decision Support System) on Oracle RAC on ESX Servers in production; a DSS normally has large transactions and fewer concurrent users. In a typical OLTP environment, it might not be smart to deploy RAC on ESX without careful planning, but a DSS can surely run fine. Moreover, there is no reason not to try it on your ESX system. There are already plenty of test, development and staging deployments running on ESX servers.

Here is a quick look at the tasks we need to perform to build the cluster:

  • Physical Node: It must have two network adapters (NICs) and access to the same storage (SAN LUN volumes) as the ESX server, so that both the guest Virtual Machines and the physical machine can see the shared volume. The OS version and patches must be identical on all platforms (virtual or physical). Also, note that there shouldn’t be multipathing software running on the physical node.
  • Virtual Node: Here the steps are pretty much the same. The ESX host must have at least two physical NICs, although three or more are advisable (one for the service console, two for teaming/bonding and redundancy). The VM must have two vNICs (Virtual NICs): one for outbound traffic and the other connected to a private VLAN for the high-speed interconnect used by cache fusion.
  • Adding shared storage: Follow the same steps we used when clustering VMs across multiple ESX hosts. On the physical node this is simple: assign your HBA to the mapped SAN LUN and you are done. On the Virtual Machine, click add storage and choose Mapped SAN LUN, so that the hard disks point to the LUN using RDM (Raw Device Mapping). In the LUN selection, choose the same LUN (Logical Unit Number) that is being accessed by the physical node(s). Then select a virtual device node on a different SCSI controller, which creates a new SCSI controller. Edit the properties of the new SCSI controller (1:0) and change the sharing to “physical”. Carry out the same step for all the shared disks (OCR.vmdk, VOTINGDISK.vmdk, SPFILEASM.vmdk, ASM01.vmdk and so on). Upon clicking finish, you are done.
  • The final step is to install and configure the Oracle RAC clusterware and database.
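The node requirements above lend themselves to a quick pre-flight check before installing the clusterware. What follows is only a minimal sketch in Python: the `Node` record, the `check_cluster` function, and the node, LUN and OS labels are all hypothetical names made up for illustration, and you would have to fill in the inventory by hand or from your own tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical inventory record for one cluster node."""
    name: str
    kind: str                  # "physical" or "virtual"
    nic_count: int             # NICs on a physical node, vNICs on a VM
    os_release: str            # OS version plus patch level, as one label
    visible_luns: set = field(default_factory=set)
    multipathing: bool = False


def check_cluster(nodes, shared_luns):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []

    # OS version and patches must be identical on all nodes.
    releases = {n.os_release for n in nodes}
    if len(releases) > 1:
        problems.append("OS release differs across nodes: "
                        + ", ".join(sorted(releases)))

    for n in nodes:
        # Every node needs two NICs: public network plus interconnect.
        if n.nic_count < 2:
            problems.append(f"{n.name}: needs at least 2 NICs, "
                            f"has {n.nic_count}")
        # Every node must see every shared LUN.
        missing = set(shared_luns) - n.visible_luns
        if missing:
            problems.append(f"{n.name}: cannot see shared LUN(s) "
                            + ", ".join(sorted(missing)))
        # No multipathing software on the physical node.
        if n.kind == "physical" and n.multipathing:
            problems.append(f"{n.name}: multipathing software must not "
                            "run on the physical node")
    return problems


if __name__ == "__main__":
    # Made-up example inventory: one physical node, one VM.
    nodes = [
        Node("rac-phys01", "physical", 2, "os-release-A", {"lun-ocr"}),
        Node("vmrac01", "virtual", 1, "os-release-A", {"lun-ocr"}),
    ]
    for problem in check_cluster(nodes, {"lun-ocr"}):
        print(problem)
```

The check only compares what you tell it; it does not query the ESX host or the SAN, so treat a clean result as a reminder list rather than proof.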

Upgrading your RAC cluster

Upgrading your ESX server or your cluster software is not an easy task. We will not go too deep into the ESX server upgrade itself, as it is out of the scope of this article. Instead, we will concentrate on several scenarios: upgrading clusters on one ESX server, across physical hosts, or in a typical heterogeneous cluster (physical and virtual nodes):

  • Upgrading the cluster on one ESX host: Power off your VMs and let your system admin upgrade your ESX server from 2.5 to 3.x. Then upgrade your VMFS2 volume to VMFS3 by opening the VI client, selecting the volume and clicking “upgrade to VMFS3”. Upgrade the shared RDM files if necessary, then right-click each cluster VM in the inventory panel and click upgrade virtual hardware. Restart the cluster. Should you run into an error for any reason, try importing the backup vmdks like this:

      vmkfstools -i /vmfs/volumes/vol1/<old-disk>.vmdk

    Then rename the old-disk.vmdk and edit the .vmx file to point to the new-disk.vmdk. Restart the cluster.

  • Upgrading a cluster across ESX hosts: You can do this either with shared pass-through RDMs or with files on shared file systems.

    1. Using shared pass-through RDM: First upgrade your ESX server from 2.5 to 3.x. In the VI client, upgrade your shared pass-through RDM files from VMFS2 to VMFS3, then right-click the cluster VM and select “upgrade virtual hardware”. Do the same for the boot disk and you are done. Turn on your cluster and verify the upgrade.

    2. Using files in shared (VMFS2) volumes: Do the following before upgrading to VMFS3:

      vmkfstools -L lunreset vmhba<C:T:L>:0
      vmkfstools -F public vmhba<C:T:L:P>

      This makes the shared files public. Then upgrade the ESX hosts from ESX 2.5 to ESX 3.0. Select the first upgraded node, open the configuration tab and click “storage”, then upgrade the VMFS2 disks in your cluster by clicking “Upgrade to VMFS3”. Create a LUN for each of the shared RAC disks, create an RDM for each shared disk, and import the virtual disk into this RDM:

      vmkfstools -i /vmfs/volumes/vol1/<old-disk>.vmdk
      /vmfs/volumes/vol2/<RACDir>/<rdm-for-vmrac01>/<myrdm.vmdk> -d

      old-disk.vmdk: Our RAC vmdk, which is to be imported
      myrdm.vmdk: New RDM for vmrac01 (our first node)
      vmhba1.2:3: The LUN that backs the RDM

      Now edit the virtual machine’s configuration file (vmrac01.vmx) to point to the RDM instead of the shared file by doing the following: scsi<X>:<Y>.<filename> = “rdm-fxy vmrac01/.vmdk”. Restart the cluster and check that it comes up correctly.
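Both upgrade paths above end with editing the .vmx file by hand so that a SCSI node points at a different disk file. If you have several nodes and several shared disks, a small script makes the edit repeatable. The following is a minimal sketch in Python, assuming the entry takes the usual `scsi<X>:<Y>.fileName = "..."` form; the function name `repoint_disk` and the file names in the usage comment are hypothetical.

```python
import re


def repoint_disk(vmx_text, device, new_path):
    """Return vmx_text with the fileName entry of `device` set to new_path.

    `device` is a SCSI node such as "scsi1:0". A KeyError is raised when
    no matching entry exists, so a typo does not silently leave the
    configuration unchanged.
    """
    # Match one whole line of the form: scsi1:0.fileName = "something"
    # Only spaces and tabs are allowed around the tokens, so neighbouring
    # lines and blank lines are never swallowed by the match.
    pattern = re.compile(
        r'^([ \t]*' + re.escape(device) + r'\.fileName[ \t]*=[ \t]*)'
        r'"[^"\n]*"[ \t]*$',
        re.IGNORECASE | re.MULTILINE,
    )
    # A function is used as the replacement so that backslashes in
    # new_path are inserted literally.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + '"' + new_path + '"', vmx_text)
    if count == 0:
        raise KeyError(device + ".fileName not found")
    return new_text


# Usage sketch (paths are placeholders; keep a backup of the .vmx first):
#
#   with open("vmrac01.vmx") as fh:
#       text = fh.read()
#   text = repoint_disk(text, "scsi1:0", "myrdm.vmdk")
#   with open("vmrac01.vmx", "w") as fh:
#       fh.write(text)
```

Run it only while the VM is powered off, and compare the result with the original file before restarting the cluster.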


Although the VMware ESX server has several models of clustering and HA, we should not forget that some mission-critical applications like Oracle RAC cannot be fully replaced by other OS-level or even infrastructure-level clustering and high availability. The purpose of demonstrating Oracle RAC on ESX is not only to make the business case for a consolidated test and development setup, but also to let you as an administrator have the “RAC running under your desk!” The fact that we can set up, run, test and benchmark mission-critical applications on our own premises gives us the power to stay on top of our applications and businesses.
