According to the Intel 64 and IA-32 Architectures Software Developer's Manual, Intel 64 and IA-32 processors are "little-endian" machines. Let us look into this concept of endianness and byte ordering, and find out why Intel chose little-endian over big-endian byte ordering.
TL;DR: Because little-endian is more consistent, more logical, and simpler for the hardware.
"In computing, endianness, also known as byte sex, is the order or sequence of bytes of a word of digital data in computer memory. Endianness is primarily expressed as big-endian (BE) or little-endian (LE). A big-endian system stores the most significant byte of a word at the smallest memory address and the least significant byte at the largest. A little-endian system, in contrast, stores the least-significant byte at the smallest address." - wikipedia
Big-endian is pretty logical for us humans (Martians excluded). It follows our familiar "big first, small second" system of writing. Say you want to write the number "four thousand, three hundred and twenty-one". The thousands are written first ( 4,000 ), then the hundreds ( 300 ), the tens ( 20 ) and the units ( 1 ), giving ( 4,321 ). As far as I know, this is the writing system everyone uses. We may not know the origin of this system, but it seems the most logical, right?
We are only comfortable with the big-endian system because it is the only one we've been taught, but that does not make it the most logical choice, especially for data processing and computing. An alternative to this system would be the little-endian system, where the number "four thousand, three hundred and twenty-one" is written as ( 1234 ), with the least significant (smaller) digits written first. I'm not proposing we switch our writing system and start writing in reverse, but could there be benefits to using this seemingly bizarre little-endian system, either for humans or for machines? In this article I'll focus on comparing these two systems in the context of computer processors and data storage.
The big-endian vs. little-endian debate is much like driving on the left side of the road vs. driving on the right. The Japanese date convention uses a big-endian format ( yyyy/mm/dd ). Programmers love such ordering because you can use a simple string-compare with the usual first-character-is-most-significant rule to sort data by date, though I doubt choosing this system was inspired by algorithmic efficiency.
Imagine you have an 8-bit processor with a single 16-bit register that can load only a single byte from memory in a clock cycle. If you want to load a 16-bit value into your 16-bit register, you can:
1. load a byte from the fetch location
2. shift that byte to the left 8 places
3. increment the memory fetch location by 1
4. load the next byte
The outcome: you only ever increment the fetch location, you only ever load into the low-order part of your wider register, and you only need to be able to shift left.
The result is that the 16-bit data gets stored in order (big-endian), i.e. most significant byte to least significant.
If you instead tried to load using little-endian, you would need to load a byte into the lower part of your 16-bit wide register, then load the next byte into a staging area, shift it, and then pop it into the top of your register. Or use a more complex arrangement of gating to be able to selectively load into the top or bottom byte.
The result of trying to go little-endian is you either need more silicon (switches and gates), or more operations.
These days, with our super-fast 64-bit processors, these considerations are pretty much irrelevant.
In a little-endian system, the address of a given value in memory, either taken as a 32, 16, or 8 bit width, is always the same.
In other words, if you have in memory a two byte value:
address: 0x00f0 | value: 16
address: 0x00f1 | value: 0
taking that '16' as a 16-bit value (a C 'short' on most 32-bit systems) or as an 8-bit value (generally a C 'char') changes only the fetch instruction you use — not the address you fetch from.
On a big-endian system, with the above laid out as:
address: 0x00f0 | value: 0
address: 0x00f1 | value: 16
you would need to increment the pointer and then perform the narrower fetch at the new address.
When you add or subtract multi-byte numbers, you have to start with the least significant byte. If you're adding two 16-bit numbers, for example, there may be a carry out of the least significant byte into the most significant one, so you have to process the least significant byte first. This is the same reason you start with the rightmost digit when doing longhand addition. You can't start from the left.
Consider an 8-bit system that fetches bytes sequentially from memory. If it fetches the least significant byte first, it can start doing the addition while the most significant byte is being fetched from memory. This parallelism is why performance is better in little-endian on such a system. If it had to wait until both bytes were fetched from memory, or fetch them in the reverse order, it would take longer.
That advantage applied to old 8-bit systems. On a modern CPU I doubt the byte order makes any difference, but we still use little-endian for historical reasons and backwards compatibility.
It is also easier in little-endian to check whether a multi-byte number is odd or even: the least significant byte comes first, and its lowest bit determines the parity of the entire number.
Almost all modern processors use little-endian, especially the Intel 32-bit and 64-bit machines that dominate the computer market. It is important to note, however, that there are big-endian processors out there, and believe it or not, bi-endian, middle-endian, and mixed-endian machines exist too.
Thank you for reading,
Give me a follow on Twitter to get updated on my new articles, and join my Discord server if you would like to see cool open-source projects I am building with an awesome community of software enthusiasts.
Consider donating, or becoming a sponsor, to help me transition into making open-source software full-time.