6.3. Endian Issues
Endianness comes in two varieties: big and little.
The origin of the odd terms big-endian and little-endian can be traced to the 1726 book Gulliver's Travels, by Jonathan Swift. In one part of the story, resistance to an imperial edict to break soft-boiled eggs on the "little end" escalates to civil war. The plot satirizes the break of England's King Henry VIII with the Catholic Church. A few hundred years later, in 1981, Danny Cohen applied the terms and the satire to our current situation in IEEE Computer (vol. 14, no. 10).

6.3.1. Endianness in Devices

Endianness doesn't matter on a single system. It matters only when two computers are trying to communicate. Every processor and every communication protocol must choose one type of endianness or the other. Thus, two processors with different endianness will conflict if they communicate through a memory device. Similarly, a little-endian processor trying to communicate over a big-endian network will need to reorder bytes in software.

Intel's 80x86 processors and their clones are little-endian. Sun's SPARC, Motorola's 68K, and the PowerPC families are all big-endian. Some processors even have a bit in a register that allows the programmer to select the desired endianness. The PXA255 processor supports both big- and little-endian operation via bit 7 in Control Register 1 (Coprocessor 15 (CP15) register 1).

An endianness difference can cause problems if a computer unknowingly tries to read binary data written in the opposite format from a shared memory location or file. Figure 6-2(a) shows the memory contents for the data 0x12345678 (a long), 0xABCD (a word), and 0xEF (a byte) on a little-endian machine. The same data represented on a big-endian machine is shown in Figure 6-2(b).

Figure 6-2. (a) Little-endian memory, (b) big-endian memory

6.3.2. Endianness in Networking

As it turns out, all of the protocol layers in the TCP/IP suite are defined as big-endian.
In other words, any 16- or 32-bit value within the various layer headers (for example, an IP address, a packet length, or a checksum) must be sent and received with its most significant byte first.

Let's say you wish to establish a TCP socket connection to a computer whose IP address is 192.0.1.2. IPv4 uses a unique 32-bit integer to identify each network host. The dotted decimal IP address must be translated into such an integer. The multibyte integer representation used by the TCP/IP protocols is sometimes called network byte order.

Suppose an 80x86-based, little-endian PC is talking to a SPARC-based, big-endian server over the Internet. Without further manipulation, the 80x86 processor would convert 192.0.1.2 to the little-endian integer 0x020100C0 and transmit the bytes in the following order: 0x02, 0x01, 0x00, 0xC0. The SPARC would receive the bytes in the following order: 0x02, 0x01, 0x00, 0xC0. The SPARC would reconstruct the bytes into the big-endian integer 0x020100C0 and misinterpret the address as 2.1.0.192. The correct network-order integer is 0xC0000102, sent as 0xC0, 0x00, 0x01, 0x02.

Preventing this sort of confusion leads to an annoying little implementation detail for TCP/IP stack developers. If the stack will run on a little-endian processor, it will have to reorder (at runtime) the bytes of every multibyte data field within the various layers' headers. If the stack will run on a big-endian processor, there's nothing to worry about. For the stack to be portable (that is, to be able to run on processors of both types), it will have to decide whether or not to do this reordering. The decision is typically made at compile time.

A common solution to the endianness problem is to define a set of four preprocessor macros, conventionally named htons(), htonl(), ntohs(), and ntohl(): host-to-network and network-to-host conversions for 16-bit (short) and 32-bit (long) values.
Following is an example of the implementation of these macros. We will take a look at the left shift (<<) and right shift (>>) operators in Chapter 7.

#if defined(BIG_ENDIAN) && !defined(LITTLE_ENDIAN)

If the processor on which the TCP/IP stack is to be run is itself also big-endian, each of the four macros will be defined to do nothing, and there will be no runtime performance impact. If, however, the processor is little-endian, the macros will reorder the bytes appropriately. These macros are routinely called when building and parsing network packets and when socket connections are created.

Runtime performance penalties can occur when using TCP/IP on a little-endian processor. For that reason, it may be unwise to select a little-endian processor for use in a device with an abundance of network functionality, such as a router or gateway.

Embedded programmers must be aware of the issue and be prepared to convert between the different representations as required.
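The macro scheme described above can be sketched roughly as follows. This is a minimal illustration, not the original listing: the names net_htons, net_htonl, net_ntohs, net_ntohl, NET_SWAP16, NET_SWAP32, HOST_IS_BIG_ENDIAN, put_u32_wire, and get_u32_wire are hypothetical (chosen to avoid clashing with a platform's own htons() family), and the fallback to the GCC/Clang predefined macro __BYTE_ORDER__ is an assumption for builds that define neither BIG_ENDIAN nor LITTLE_ENDIAN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Pick the host byte order at compile time. BIG_ENDIAN and LITTLE_ENDIAN are
 * the symbols tested in the text; the __BYTE_ORDER__ fallback is an
 * assumption (GCC/Clang predefined macro) for builds that set neither or both.
 */
#if defined(BIG_ENDIAN) && !defined(LITTLE_ENDIAN)
#define HOST_IS_BIG_ENDIAN 1
#elif defined(LITTLE_ENDIAN) && !defined(BIG_ENDIAN)
#define HOST_IS_BIG_ENDIAN 0
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
#define HOST_IS_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#else
#error "Define exactly one of BIG_ENDIAN or LITTLE_ENDIAN."
#endif

/* Unconditional byte swaps. The argument is evaluated more than once,
 * so avoid passing expressions with side effects. */
#define NET_SWAP16(x) ((uint16_t)((((uint16_t)(x) & 0xFF00u) >> 8) | \
                                  (((uint16_t)(x) & 0x00FFu) << 8)))
#define NET_SWAP32(x) ((uint32_t)((((uint32_t)(x) & 0xFF000000u) >> 24) | \
                                  (((uint32_t)(x) & 0x00FF0000u) >> 8)  | \
                                  (((uint32_t)(x) & 0x0000FF00u) << 8)  | \
                                  (((uint32_t)(x) & 0x000000FFu) << 24)))

#if HOST_IS_BIG_ENDIAN
/* Host order already matches network order: the macros do nothing. */
#define net_htons(x) ((uint16_t)(x))
#define net_htonl(x) ((uint32_t)(x))
#define net_ntohs(x) ((uint16_t)(x))
#define net_ntohl(x) ((uint32_t)(x))
#else
/* Little-endian host: every conversion is a byte swap. */
#define net_htons(x) NET_SWAP16(x)
#define net_htonl(x) NET_SWAP32(x)
#define net_ntohs(x) NET_SWAP16(x)
#define net_ntohl(x) NET_SWAP32(x)
#endif

/* Write a 32-bit host value into a packet buffer, most significant byte first. */
void put_u32_wire(uint8_t *buf, uint32_t host_value)
{
    uint32_t net = net_htonl(host_value);

    assert(buf != NULL);
    memcpy(buf, &net, sizeof net);
}

/* Read a 32-bit network-order field from a packet buffer into host order. */
uint32_t get_u32_wire(const uint8_t *buf)
{
    uint32_t net;

    assert(buf != NULL);
    memcpy(&net, buf, sizeof net);
    return net_ntohl(net);
}
```

With the address 192.0.1.2 held as the host value 0xC0000102, put_u32_wire() leaves the buffer holding 0xC0, 0x00, 0x01, 0x02 on either kind of host, which is the byte order the big-endian protocol headers expect. On a big-endian build the conversion compiles away; on a little-endian build each call costs a swap.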
Saturday, November 7, 2009