Fundamentals of Data Encoding, Big Endian, Little Endian

The processor can work with different types of data. These include integers of different sizes, floating point numbers, text, structures and even single bits. All of these are stored in memory as one or more bytes.

Integers

Integer data types can be 8, 16, 32 or 64 bits long. If the encoded number is unsigned, it is stored in natural binary representation; if the value is signed, the representation is two's complement. In two's complement representation, the most significant bit (MSB) indicates the sign of the number: zero means a non-negative number, one means a negative value. Table 1 shows the integer data types with their ranges.

Table 1: Integer binary numbers

Number of bits	Minimum value (hexadecimal)	Maximum value (hexadecimal)	Minimum value (decimal)	Maximum value (decimal)
8	0x00	0xFF	0	255
8 signed	0x80	0x7F	-128	127
16	0x0000	0xFFFF	0	65 535
16 signed	0x8000	0x7FFF	-32 768	32 767
32	0x0000 0000	0xFFFF FFFF	0	4 294 967 295
32 signed	0x8000 0000	0x7FFF FFFF	-2 147 483 648	2 147 483 647
64	0x0000 0000 0000 0000	0xFFFF FFFF FFFF FFFF	0	18 446 744 073 709 551 615
64 signed	0x8000 0000 0000 0000	0x7FFF FFFF FFFF FFFF	-9 223 372 036 854 775 808	9 223 372 036 854 775 807
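The ranges in Table 1 follow directly from the bit width, and the two's complement pattern of a negative number can be obtained by masking. The sketch below illustrates this; the function names are hypothetical and chosen only for this example.

```python
def integer_range(bits: int, signed: bool) -> tuple[int, int]:
    """Return (minimum, maximum) for an integer of the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def to_twos_complement(value: int, bits: int) -> int:
    """Return the bit pattern (as an unsigned int) that encodes a signed value."""
    lowest, highest = integer_range(bits, signed=True)
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    # Masking maps a negative value v to 2**bits + v, its two's complement pattern.
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern: int, bits: int) -> int:
    """Interpret an unsigned bit pattern as a two's complement signed value."""
    if pattern & (1 << (bits - 1)):  # MSB set: the value is negative
        return pattern - (1 << bits)
    return pattern


if __name__ == "__main__":
    for bits in (8, 16, 32, 64):
        print(bits, integer_range(bits, False), integer_range(bits, True))
    print(hex(to_twos_complement(-128, 8)))
    print(from_twos_complement(0xFFFF, 16))
```

The same 8-bit pattern 0x80 therefore reads as 128 when treated as unsigned and as -128 when treated as signed; the interpretation is decided by the instruction or data type, not by the stored bits.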

Floating point

Integer calculations do not always cover all the mathematical requirements of an algorithm. To represent real numbers, floating point encoding is used. A floating point number represents the value A using three fields:

Sign bit
Exponent (E)
Mantissa (M)

fulfilling the equation

$$A = (-1)^{S} \cdot M \cdot 2^{E}$$

where S is the value of the sign bit.

There are two main floating point formats. A single precision number is encoded in 32 bits; a double precision number is encoded in 64 bits. Both are presented in Fig. 1.

Figure 1: Illustration of single and double precision real numbers

Table 2 shows the number of bits used for the exponent and the mantissa in single and double precision floating point numbers. It also presents the smallest and largest values which can be stored using these formats (they are absolute values, and can be positive or negative depending on the sign bit).

Table 2: Floating point numbers

Precision	Exponent	Mantissa	The smallest (normalised)	The largest
Single (32 bit)	8 bits	23 bits	≈ 1.18 × 10^-38	≈ 3.40 × 10^38
Double (64 bit)	11 bits	52 bits	≈ 2.23 × 10^-308	≈ 1.80 × 10^308
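The limits of each format can be derived from the field widths alone. The sketch below assumes IEEE 754 binary formats, normalised numbers only (subnormal values are ignored), and the convention that the all-zeros and all-ones exponent codes are reserved; the function name ieee754_range is hypothetical.

```python
import sys


def ieee754_range(exponent_bits: int, mantissa_bits: int) -> tuple[float, float]:
    """Return (smallest normalised value, largest finite value) as absolute values."""
    bias = (1 << (exponent_bits - 1)) - 1
    # Smallest normalised number: mantissa 1.0 with the lowest usable exponent.
    smallest_normal = 2.0 ** (1 - bias)
    # Largest finite number: mantissa of all ones with the highest usable exponent.
    largest = (2.0 - 2.0 ** -mantissa_bits) * 2.0 ** bias
    return smallest_normal, largest


if __name__ == "__main__":
    print("single:", ieee754_range(8, 23))
    print("double:", ieee754_range(11, 52))
    # Python floats are double precision, so the double limits can be cross-checked.
    print(ieee754_range(11, 52) == (sys.float_info.min, sys.float_info.max))
```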

The most common representation of real numbers in computers is standardised in IEEE Standard 754. It uses two techniques which make calculations easier for computers:

Biased exponent
Normalised mantissa

A biased exponent means that a constant bias value is added to the real exponent value before it is stored (127 for single precision, 1023 for double precision). As a result, all stored exponents are non-negative, which makes it easier to compare numbers. A normalised mantissa is adjusted to have exactly one bit of the value “1” to the left of the binary point, which requires an appropriate adjustment of the exponent.
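To make the biased exponent and the normalised mantissa concrete, the following sketch splits a single precision number into its three fields and rebuilds the value from them. It assumes the IEEE 754 single precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits, bias 127) and handles normalised numbers only; the function names are hypothetical.

```python
import struct


def decode_single(value: float) -> tuple[int, int, int]:
    """Split a value, rounded to single precision, into (sign, biased exponent, fraction)."""
    # Pack as a 32-bit float, then reread the same four bytes as an unsigned integer.
    (pattern,) = struct.unpack(">I", struct.pack(">f", value))
    sign = pattern >> 31
    biased_exponent = (pattern >> 23) & 0xFF
    fraction = pattern & 0x7FFFFF  # the 23 stored mantissa bits
    return sign, biased_exponent, fraction


def rebuild_normal(sign: int, biased_exponent: int, fraction: int) -> float:
    """Rebuild a normalised value from its fields (zero, subnormals, inf, NaN excluded)."""
    mantissa = 1 + fraction / 2**23  # the leading 1 of the normalised mantissa is not stored
    return (-1) ** sign * mantissa * 2.0 ** (biased_exponent - 127)


if __name__ == "__main__":
    for number in (1.0, -6.5, 0.15625):
        fields = decode_single(number)
        print(number, fields, rebuild_normal(*fields))
```

For example, -6.5 is binary 1.101 × 2^2, so the stored exponent is 2 + 127 = 129 and the sign bit is 1.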

Texts

Texts are represented as a series of characters. In modern operating systems, texts are encoded using Unicode, typically in the UTF-8 or UTF-16 encoding, which is capable of encoding not only the 26 basic Latin letters but also the language-specific characters of many different languages. In simpler computers, such as embedded systems, 8-bit ASCII codes are often used. In that case, every byte of the text representation in memory contains the ASCII code of a single character. It is quite common in assembler programs to use the zero value (NUL) as the end character of the string, similar to the C/C++ null-terminated string convention.
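The zero-terminated ASCII layout described above can be sketched as follows. The helper names to_c_string and from_c_string are hypothetical; the code simply builds and reads the byte sequence that an assembler or C program would place in memory.

```python
def to_c_string(text: str) -> bytes:
    """Encode text as ASCII bytes followed by a terminating zero byte."""
    # Raises UnicodeEncodeError if the text contains non-ASCII characters.
    return text.encode("ascii") + b"\x00"


def from_c_string(memory: bytes) -> str:
    """Read characters from the start of memory up to the first zero byte."""
    end = memory.index(0)  # raises ValueError if no terminator is present
    return memory[:end].decode("ascii")


if __name__ == "__main__":
    stored = to_c_string("Hello")
    print(stored.hex(" "))
    print(from_c_string(stored + b"leftover bytes"))
    # Outside ASCII, one character may need more than one byte.
    print(len("ż".encode("utf-8")), len("ż".encode("utf-16-le")))
```

Because the terminator is part of the data, a string of n characters occupies n + 1 bytes in this convention.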

Endianness

Data encoded in memory must be compatible with the processor. Memory is usually organised as a sequence of bytes, which means that every byte can be individually addressed. For processors wider than 8 bits, the question of byte order in multi-byte data types arises. There are two possibilities:

Little Endian - low-order byte is stored at a lower address in the memory.
Big Endian - high-order byte is stored at a lower address in the memory.

These two methods for a 32-bit class processor are shown in Fig. 2.

Figure 2: Illustration of Little and Big Endian data placement in the memory
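The two byte orders can also be observed in code. The sketch below stores a 32-bit value as a sequence of bytes, with index 0 standing for the lowest address; the value 0x12345678 and the function names store and load are chosen only for illustration.

```python
import sys


def store(value: int, size: int, byteorder: str) -> bytes:
    """Return the bytes of value in memory order (index 0 = lowest address)."""
    return value.to_bytes(size, byteorder)


def load(memory: bytes, byteorder: str) -> int:
    """Read an unsigned integer back from bytes using the given byte order."""
    return int.from_bytes(memory, byteorder)


if __name__ == "__main__":
    value = 0x12345678
    print("little:", store(value, 4, "little").hex(" "))
    print("big:   ", store(value, 4, "big").hex(" "))
    # Reading with the wrong byte order silently produces a different number.
    print(hex(load(store(value, 4, "little"), "big")))
    print("this machine is", sys.byteorder, "endian")
```

Reading data with the wrong byte order does not raise an error; it yields a different value, which is why the byte order must be agreed on whenever multi-byte data is exchanged between systems.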

en/multiasm/cs/chapter_3_11.txt · Last modified: 2025/01/08 18:29 by ktokarz