FireWire Real Application Cluster

Tuesday Jan 21st 2003 by DatabaseJournal.com Staff

Brian Carr and William Garrett

Combine RAC, Linux and a FireWire disk for a low-cost development environment.

Toolbox: Oracle 9i (9.0.1.4.0) Database Server with RAC option; Red Hat Linux 7.1. Topics include Oracle database installation, Linux kernel versions, and kernel settings. You should have a good knowledge of Oracle database administration and Linux (or UNIX) operating system management, but whether you're an established DBA and UNIX administrator or merely a DBA and Linux "newbie," the basic advice and techniques here will save you lots of time and aggravation.

With the demands of a 24/7 marketplace, a highly available and scalable database is increasingly important. In the past, you had to choose between two kinds of cluster. High availability clusters were used to protect your database from hardware failure. Load-balanced clusters with many nodes were used to support a much larger volume of traffic than single multi-processor implementations. Redundant components such as additional nodes, multiple interconnects, and arrays of disks helped provide high availability. Such redundant hardware architectures avoid single points of failure and provide exceptional fault resilience.

RAC takes the cluster architecture even further, providing improved fault resilience and incremental system growth by offering connection failover and load balancing in the same cluster. In the event of a system failure, RAC ensures your database will still be available. In the event of a large spike in traffic, RAC can distribute the load over many nodes. RAC makes good sense not only from a data availability and performance point of view but also on cost: with large SMP servers selling at a premium, a pair of 2-processor servers can be half the cost of a single 4-processor server. RAC gives you the availability and scalability that enterprises demand.

RAC provides the following key benefits to e-business application deployments:

  • Flexible and effortless scalability, so that adding nodes to the database is easy and manual intervention is not required to re-partition data when processor nodes are added.
  • A high availability solution that masks node and component failures from end-users.
  • A single management interface for DBAs to perform installation, configuration, backup, upgrade, and monitoring functions once. Oracle automatically distributes the management functions to the appropriate nodes. This means the DBA manages one virtual server.

An alternative to using a Storage Area Network (SAN) or Network Attached Storage (NAS) is an external FireWire hard drive enclosure. This gives you a low-cost way to test your systems in a RAC environment before rolling them out to your production RAC. By using commodity hardware you can build your development environment for a fraction of the cost.

This article walks through the process and offers tips to help you configure RAC for your development environment using a low-cost alternative to a SAN. After reading it you should be able to set up your RAC environment more quickly and with fewer headaches.

Background / Overview

We set up our environment as a low-cost way to see how RAC fit into our application environment. We had two Compaq 700 MHz Pentium III desktops, each with 512MB RAM and a 10GB internal drive. We also had a spare switch to use for the interconnect. The only hardware we needed to purchase was an external FireWire enclosure, an IDE hard drive and two FireWire adapters. These additional hardware components came to less than $400.

The focus of this article is using a FireWire drive with RAC on Linux. Therefore we will not rehash the installation of RAC on Linux, since this is well documented in Note ID 184821.1, available on Metalink. You should also read the Oracle RAC installation documentation. We will focus on configuring Linux for FireWire support and on testing your RAC configuration for failover. By the end of this article you will understand the steps necessary to set up your own RAC environment for testing clustered applications. For those familiar with Linux and Oracle, we estimate approximately 1-2 days' worth of work to set up your development RAC environment.

1: Ensure your configuration is certified.

The setup we used was Red Hat 7.1 with RAC 9.0.1. If you're interested in using another distribution of Linux or 9iR2, be sure to check the certified configurations.

  • Connect and log in to http://metalink.oracle.com
  • Click on the "Certify and Availability" button on the menu frame
  • Click on View Certifications by Product hyperlink
  • Choose "Real Application Clusters"
  • Choose the correct platform.

2: Obtain proper FireWire chipset and adapters.

Some FireWire chipsets are better than others at handling multiple logins. In order for RAC to work properly, both nodes need to be logged in to the external FireWire hard drive simultaneously. We used an external hard drive enclosure that contained an Oxford Semiconductor OXFW911 sbp2 chipset, which supports up to four concurrent logins. The hardware savings are quite noticeable here, as two FireWire adapters and a shared disk can be bought for less than a single Fibre Channel controller (let alone the cost of a full SAN implementation).

3: Kernel configuration

In order for Linux to recognize FireWire devices, we recommend using an updated 2.4.19 kernel. We downloaded and unpacked the updated kernel from http://linux1394.otncast.otnxchange.oracle.com/

4: Driver modification

As specified in the sbp2.c file of the new kernel source, you'll want to change this file to allow support of multiple logins. To do this, change the line:

static int sbp2_exclusive_login=1 

to:

static int sbp2_exclusive_login=0

This modification is well documented in the file and will allow both nodes to have access to the external FireWire hard drive simultaneously. You will want to read through the rest of the source code in this file as there are several tuning parameters which can be set here.
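If you rebuild the kernel on more than one machine, you may prefer to script the edit instead of making it by hand. The following is a minimal sketch, assuming the declaration appears exactly once in sbp2.c and looks like the line quoted above (optional spaces around "=" are tolerated). The function name disable_exclusive_login is our own and is not part of the kernel source.

```python
import re

# Matches the declaration quoted in the article, capturing everything up to
# the value so that only the trailing 1 is replaced.
_EXCLUSIVE_LOGIN = re.compile(
    r"^(\s*static\s+int\s+sbp2_exclusive_login\s*=\s*)1\b", re.MULTILINE
)


def disable_exclusive_login(source):
    """Return sbp2.c source text with sbp2_exclusive_login set to 0.

    Raises ValueError when the declaration is not found exactly once, for
    example because the file has already been patched.
    """
    patched, count = _EXCLUSIVE_LOGIN.subn(r"\g<1>0", source)
    if count != 1:
        raise ValueError(
            "expected exactly one declaration to patch, found %d" % count
        )
    return patched
```

To use it, read sbp2.c into a string, pass it through the function, and write the result back after keeping a copy of the original file.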

5: Compile the kernel

Prior to compiling the kernel be sure to run "make xconfig" from the /usr/src/linux directory. Choose the IEEE 1394 (FireWire) support (EXPERIMENTAL) menu option. Set the following options to "y":

  • IEEE 1394 (FireWire) support (EXPERIMENTAL)
  • OHCI-1394 support
  • SBP-2 support (Hard disks, etc)
  • Enable Phys DMA support for SBP2 (Debug)
  • Raw IEEE1394 I/O support
  • IEC61883-1 Plug support
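Before starting a long build, it can help to confirm that the saved configuration really has these options enabled. The sketch below checks the text of a kernel .config file. The CONFIG_ symbol names are our assumption of how the menu entries above map to configuration symbols in a 2.4 kernel tree, so verify them against your own .config before relying on the result.

```python
# Assumed mapping of the xconfig menu entries to .config symbols.
# Check these names against your kernel tree; they are not from the article.
REQUIRED_OPTIONS = [
    "CONFIG_IEEE1394",                # IEEE 1394 (FireWire) support
    "CONFIG_IEEE1394_OHCI1394",       # OHCI-1394 support
    "CONFIG_IEEE1394_SBP2",           # SBP-2 support (Hard disks, etc)
    "CONFIG_IEEE1394_SBP2_PHYS_DMA",  # Enable Phys DMA support for SBP2
    "CONFIG_IEEE1394_RAWIO",          # Raw IEEE1394 I/O support
    "CONFIG_IEEE1394_CMP",            # IEC61883-1 Plug support
]


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required symbols that are not set to "y" in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [option for option in required if option not in enabled]
```

An empty list means every option in REQUIRED_OPTIONS is built in. An option built as a module ("=m") is reported as missing, because the article sets these options to "y".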

Next, build the kernel according to your distribution's instructions.

You may run across the error "nodemgr.c: 1307: parse error before 'else'" while the kernel is compiling. The offending code belongs to a verbose debug option. We commented out line 1304 and recompiled the kernel.

Another error we ran into, this one while installing Oracle rather than while building the kernel, was "Error invoking target install of makefile /op/oracle/product/9.0.1/plsql/lib/ins_plsql.mk". To resolve this problem you'll need to edit the file $ORACLE_HOME/bin/genclntsh and change the following line:

LD_SELF_CONTAINED="-z defs"
to read:
LD_SELF_CONTAINED=""

6: Detect FireWire devices

The easiest way to add/detect new FireWire devices is to run the shell script rescan-scsi-bus.sh. The script may be found at: http://www.garloff.de/kurt/linux/rescan-scsi-bus.sh

When you run this script you should see the following type of response:

Host adapter 0 (ide-scsi) found.
Host adapter 1 (sbp2) found.
Scanning for device 1 0 0 0 ...
NEW: Host: scsi1 Channel: 00 Id: 00 Lun: 00
      Vendor: WDC WD12 Model: 00JB-75CRA0      Rev:
      Type:   Direct-Access                    ANSI SCSI revision: 06
1 new device(s) found.
0 device(s) removed.

This indicates your FireWire drive was detected by Linux.
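If you run the rescan from a setup script on each node, you can check its output programmatically instead of reading it by eye. This is a minimal sketch that assumes the output has the same shape as the sample above. The function name summarize_rescan is hypothetical, and other versions of the script may word their messages differently.

```python
import re


def summarize_rescan(output):
    """Summarize rescan-scsi-bus.sh output shaped like the sample above."""
    # Lines such as "Host adapter 1 (sbp2) found." yield (number, driver).
    adapters = re.findall(
        r"^Host adapter (\d+) \(([^)]+)\) found\.", output, re.MULTILINE
    )
    # The line "1 new device(s) found." carries the count of new devices.
    new = re.search(r"^(\d+) new device\(s\) found\.", output, re.MULTILINE)
    return {
        "sbp2_adapter_present": any(driver == "sbp2" for _, driver in adapters),
        "new_devices": int(new.group(1)) if new else 0,
    }
```

On the first run after connecting the enclosure, a healthy node should report the sbp2 adapter as present and at least one new device.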

7: Partitioning your drive

If you decide to use only one external disk, you'll need to be aware of a couple of things. If you do not use Logical Volume Manager (LVM) or Oracle Cluster File System (OCFS), you'll most likely have to use fdisk to partition the disk into raw devices. You will be limited to 3 primary partitions, 1 extended partition and 11 logical partitions. Because the extended partition only acts as a container for the logical ones, this leaves 14 usable partitions, which is not enough room for all the default tablespaces the Database Configuration Assistant (DBCA) uses. For this reason we decided to drop the USERS and TOOLS tablespaces during the DBCA setup. This obviously doesn't follow the Optimal Flexible Architecture (OFA), but since this is a development system it will work just fine. You could also use multiple disks, which lets you set up additional partitions.

It's recommended that you use LVM or OCFS.
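If you do partition a single disk with fdisk, it helps to plan the layout before you start. The sketch below assigns a list of raw volumes to partition numbers under the limit described above: numbers 1-3 are primary partitions, 4 is the extended container, and 5-15 are logical partitions. The device name /dev/sda and the volume names are hypothetical, so substitute the device your FireWire disk appears as and the files your own database needs.

```python
MAX_PRIMARY = 3   # partitions 1-3 hold data directly
MAX_LOGICAL = 11  # partitions 5-15 live inside the extended partition (4)


def plan_partitions(names, disk="/dev/sda"):
    """Map each volume name to a partition device, skipping partition 4.

    Raises ValueError when more volumes are requested than the 14 usable
    partitions (3 primary + 11 logical) a single disk offers.
    """
    limit = MAX_PRIMARY + MAX_LOGICAL
    if len(names) > limit:
        raise ValueError(
            "%d volumes requested but only %d partitions are usable"
            % (len(names), limit)
        )
    plan = {}
    for index, name in enumerate(names):
        # The first three volumes take 1-3; the rest start at 5 because
        # partition 4 is the extended container.
        number = index + 1 if index < MAX_PRIMARY else index + 2
        plan[name] = "%s%d" % (disk, number)
    return plan
```

If the function raises ValueError for your list of volumes, you need to drop tablespaces, as we did, or add a second disk.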

8: Oracle patch

Install Oracle 9i Database server with the RAC option according to Note ID 184821.1, available on Metalink.

An issue was filed for shutdown immediate taking up to 60 seconds to unregister. This was filed as Bug 1841387, which was fixed in the 9.0.1.1 patch set. However, Oracle Support recommended we apply the 9.0.1.4.0 patch set. You can download this from http://metalink.oracle.com and simply follow the installation instructions. Once you've created and started your database, you're ready to connect to the RAC from a client workstation.

9: Testing SESSION Failover

You may use SESSION or SELECT type failover. SESSION is the simplest type. When the connection to an instance is lost, SESSION failover results only in the establishment of a new connection to a backup instance. Any work in progress is lost.

SELECT failover is implemented by transparently re-executing the SELECT statement and then bringing the cursor up to the same point as it was before the failure. There's no automatic recovery mechanism built into SELECT failover to handle DML statements, such as INSERTs and UPDATEs, that are in progress when a failover occurs. Your application will still need to use error-checking routines and transactions, but now if a failure occurs you can try the transaction again on the same connection. If it was a node failure, the connection has already reestablished itself to another node.
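The retry logic this calls for in the application can be sketched as follows. The example is deliberately driver-agnostic: ConnectionFailedOver is a hypothetical exception standing in for whatever error your database driver raises when a transaction is interrupted by a failover, and transaction and rollback are callables you supply. It is a sketch of the pattern under those assumptions, not the API of any Oracle client library.

```python
class ConnectionFailedOver(Exception):
    """Hypothetical error: the transaction was interrupted by a failover."""


def run_with_retry(transaction, rollback, retries=3):
    """Run transaction(), retrying after a failover interrupts it.

    After each failure the partial work is rolled back, then the whole
    transaction is attempted again on the same (now failed-over)
    connection. The error is re-raised once the retries are used up.
    """
    failures = 0
    while True:
        try:
            return transaction()
        except ConnectionFailedOver:
            failures += 1
            rollback()  # discard the interrupted work before retrying
            if failures > retries:
                raise
```

The important design point is that the unit of retry is the whole transaction, not the single statement that failed, because any uncommitted work from before the failover is lost.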

The METHOD parameter determines whether Oracle pre-establishes a connection to the backup node.

BASIC : A new connection to the backup node is established only at failover time, so the backup node isn't used until the node in use crashes.

PRECONNECT : A connection to the backup node is also established up front, so that the switch from one instance to the other is quick.

Now comes the exciting part: testing your Real Application Cluster for failover. The following is the connect string in the client's TNSNAMES.ORA file for SELECT failover (notice that TYPE=SELECT):

CLUST = 
(DESCRIPTION = 
  (FAILOVER=ON) 
  (ADDRESS_LIST = 
     (LOAD_BALANCE=ON) 
     (ADDRESS = (PROTOCOL = TCP)(HOST = rac1)(PORT = 1521)) 
     (ADDRESS = (PROTOCOL = TCP)(HOST = rac2)(PORT = 1521)) 
  ) 
  (CONNECT_DATA = 
     (SERVICE_NAME = clust) 
     (FAILOVER_MODE=(TYPE=SELECT)(RETRIES=3)(TIMEOUT=3)) 
  ) 
)
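The entry above does not set METHOD, so it relies on the default behavior. If you want to experiment with the different TYPE and METHOD combinations, a small generator for the FAILOVER_MODE clause avoids typos in the nested parentheses. This is a minimal sketch. The function name and its defaults are our own, and the BACKUP subparameter (the net service name to pre-connect to) is included as an option because Oracle's documentation recommends specifying one with PRECONNECT; check the Net Services reference for your release.

```python
VALID_TYPES = ("SESSION", "SELECT", "NONE")
VALID_METHODS = ("BASIC", "PRECONNECT")


def failover_mode(failover_type="SELECT", method="BASIC", retries=3, backup=None):
    """Build a FAILOVER_MODE clause for a TNSNAMES.ORA CONNECT_DATA section."""
    failover_type = failover_type.upper()
    method = method.upper()
    if failover_type not in VALID_TYPES:
        raise ValueError("TYPE must be one of %s" % ", ".join(VALID_TYPES))
    if method not in VALID_METHODS:
        raise ValueError("METHOD must be one of %s" % ", ".join(VALID_METHODS))
    parts = ["(TYPE=%s)" % failover_type, "(METHOD=%s)" % method]
    if backup is not None:
        # Net service name of the instance to fail over to.
        parts.append("(BACKUP=%s)" % backup)
    parts.append("(RETRIES=%d)" % retries)
    return "(FAILOVER_MODE=%s)" % "".join(parts)
```

For example, failover_mode("session", "preconnect", backup="clust1") builds a clause for SESSION failover with a pre-established backup connection, where clust1 is assumed to be a net service name defined elsewhere in your TNSNAMES.ORA.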

From a client SQL*Plus session, connect to your RAC database. To determine which instance you connected to, you can run the following:

 SQL> select instance_number,instance_name
        2   from v$instance;

In our case we were connected to instance "clust2". So if we were to run an SQL statement and take down this instance, then we should be failed over to instance "clust1" on the other node.

The next step is to run an SQL SELECT statement. Make sure it will run long enough for you to shut down the instance you're currently connected to. If you need a table large enough for this, you could import one from your production system via the imp/exp utilities.

While the SQL is processing, type the following on your server to shut down the instance where your SQL is processing:

 [oracle@rac1]$>  srvctl stop -p clust -i clust2 -s inst

where clust is the name of your cluster database and clust2 is the specific instance you want to take down. If everything is working properly you should see your SQL results pause for a moment and then pick back up. Once the SQL has completed verify that you are connected to the other instance by running:

 SQL> select instance_number,instance_name
        2   from v$instance;

Conclusion

Now you've seen the steps necessary to configure RAC on Linux using FireWire drives. Oracle Real Application Clusters with FireWire on Linux enable you to build a robust clustered system on a shared disk using inexpensive hardware. This will allow you to test your clustered applications and get experience managing a cluster.

http://otn.oracle.com/tech/linux/open_source.html

http://metalink.oracle.com

Brian Carr (editor@oraclegiants.com) is a founding member of Oracle Giants, an Oracle Certified Professional and an Oracle ACE. William Garrett (wgarrett@neo.rr.com) is a Senior Application Developer/Web Technologies at a manufacturing company in Akron, Ohio.
