We all know the path up the physical and logical chains of an Oracle database. A fairly typical statement in most database administration books states that a tablespace is where the physical meets the logical. Further, these books go into great detail about the building blocks of each chain. The lowest starting point on the physical chain is the operating system block and how it compares to a database block (usually one-to-one or where the database block is a multiple of the operating system block size). Whats typically left out of the physical description is something lower than the block, and that is what comprises a block, namely bytes. The ordering of bytes is an important aspect of how programs and operating systems deal with data, and this is one of those areas where Oracle does things that users take for granted.
A byte is a byte, right? For the most part this is true, but when you factor in the platform on which a byte lives, the details of what a byte is starts to differ. Its still eight bits to a byte, and the ordering of bits is generally not an issue. What is an issue is the ordering of bytes, where byte ordering is referred to as endianness. Lets use a common analogy about computer words and spoken words. In a language, how would you say aloud the number 24? In English, we say twenty-four. In other languages, 24 is spoken as four and twenty. The difference between the two methods is based on where the most significant digit or part occurs. In a language where twenty comes first, being where the most significant part comes first (20 is more significant that 4), the ordering is big endian. In the opposite case, where the least significant digit comes first, the ordering is said to be little endian (or Little-Endian or little-endian, same formatting for big).
In a white paper, Intel defines endianness as follows.
Endianness is the format to how multi-byte data is stored in computer memory. It describes the location of the most significant byte (MSB) and least significant byte (LSB) of an address in memory. Endianness is dictated by the CPU architecture implementation of the system. The operating system does not dictate the endian model implemented, but rather the endian model of the CPU architecture dictates how the operating system is implemented.
There is no better or worse involved when deciding which ordering to use, as each has distinct advantages. Little-Endian form supports multiple precision math routines, while Big-Endian form is very good at performing binary to decimal conversions (plus the advantage of how it writes out values from big to small because that is how the bytes are arranged). If youve looked through any trace or dump files, more than likely youve seen values written in hexadecimal format (words beginning with 0x signify hexadecimal format).
The number 287454020, for example, can be written as 0x11223344. Computer science books where computer math is involved typically have conversion problems related to the base of a number. Convert 1234 to base 2, convert 12.34 to binary, and convert 287454020 to base 16 are examples. What is assumed when converting base whatever into base 16 or hexadecimal is the byte ordering. In a Big-Endian system, the conversion of 287454020 to 0x11223344 would be correct. But what if your system was little-endian? The hex value would be stored as 0x44332211.
The most important sentence in the white paper should immediately bring to mind something Oracle deals with.
If the target application is currently running on a Big Endian platform and the goal is to port to a Little Endian platform, or visa-versa, byte ordering may become an issue.
For those who are relatively new to Oracle, this statement precisely frames the issue of using transportable tablespaces. If anyone ever tells you that *NIX (whatever version of UNIX) is all the same except for some commands and how shells behave, ask them about byte ordering or the endianness of brand A versus that of brand B. The bottom line is that with respect to UNIX, there is a significant difference when it comes to how data is stored. In platform ID order (what was platform 14?), a query out of V$TRANSPORTABLE_PLATFORM shows the following.
PLATFORM_ID PLATFORM_NAME ENDIAN_FORMAT ----------- -------------------------------- -------------- 1 Solaris[tm] OE (32-bit) Big 2 Solaris[tm] OE (64-bit) Big 3 HP-UX (64-bit) Big 4 HP-UX IA (64-bit) Big 5 HP Tru64 UNIX Little 6 AIX-Based Systems (64-bit) Big 7 Microsoft Windows IA (32-bit) Little 8 Microsoft Windows IA (64-bit) Little 9 IBM zSeries Based Linux Big 10 Linux IA (32-bit) Little 11 Linux IA (64-bit) Little 12 Microsoft Windows 64-bit for AMD Little 13 Linux 64-bit for AMD Little 15 HP Open VMS Little 16 Apple Mac OS Big 17 Solaris Operating System (x86) Little
Solaris and HP flavored UNIX have to deal with both formats, and so does Linux, depending on where it comes from (thanks, IBM). Windows is consistent (and were not going into the debate about why one should or should not put a production system on Windows), and Oracle on the Apple Mac OS will be a short-lived side venture.
The Conversion Command
So what if you were doing a double-sided migration, that is, going from Oracle version X to X plus one, and going from HP Tru64 to 64-bit Linux? The endianness of the operating systems differs (technically, the difference lies within the CPU architecture), so that requires a transformation or conversion and that is if the versions of Oracle support that to begin with. To use a simple case, lets suppose a tablespace named USERS has one datafile. The problem comes down to the how of How do you convert the little-endian datafile to one that is big-endian? The answer is simple: do some byte swapping.
Fortunately, Oracle 10g abstracts this process for us via RMAN. It may seem odd at first that RMAN would be involved in converting a file from one format to another, but in a way it makes sense when considering what RMAN can do with respect to managing files. Although there are some restrictions (see the details in the Backup and Recovery Reference), the ability of Oracle to perform the endian conversion for us lifted a major limitation on using transportable tablespaces. In prior versions, you had to be on the same platform, period. With the CONVERT command in RMAN, under the right, but not so narrow conditions, you can perform a database version upgrade in significantly less time than what it had been.
The growth in the size of databases reminds me of the old joke about how money is tossed around by Congress. A billion here, a billion there, and pretty soon youre talking about some real money. The same thing applies with data. A terabyte here, a terabyte there, and pretty soon were talking about some real size. If youve ever had to use export and import on a near terabyte-sized database, the number of hours involved was daunting. Transportable tablespaces helped a good bit, but you were limited to being on the same platform. The ability to go cross-platform now gives you much more freedom with respect to choosing an operating system that suits your needs.
Knowing why (in the past) you were limited to using transportable tablespaces on the same platform is easily understood now. Quite simply, Oracle didnt have the capability to perform the byte swapping needed to go from one endian format to the other. Knowing the limitations or features of older versions of Oracle has relevance in at least two areas. One is that this knowledge gives you an appreciation of how Oracle has developed over the years, and the second is that one day you may find yourself working with an older version.
And where did endian come from? This explanation comes from Webopedia.
The terms big-endian and little-endian are derived from the Lilliputians of Gulliver's Travels, whose major political issue was whether soft-boiled eggs should be opened on the big side or the little side. Likewise, the big-/little-endian computer debate has much more to do with political issues than technological merits.