Notes on the Intel 8086 processor's arithmetic-logic unit
Posted by elpocko 1 day ago
Comments
Comment by kens 1 day ago
Comment by gruturo 1 day ago
Out of curiosity: Is there anything you feel they could have done better in hindsight? Useless instructions, or inefficient ones, or "missing" ones? Either down at the transistor level, or in high-level design/philosophy (the segment/offset mechanism creating 20-bit addresses out of two 16-bit registers with thousands of overlaps sure comes to mind - if not a flat model, but that's probably asking too much of a 1979 design and its transistor limitations, I guess)?
Thanks!
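For concreteness, a minimal C sketch of the overlap being described (this is just the standard 8086 real-mode rule, physical = segment * 16 + offset, wrapped at 20 bits; roughly 4096 different segment:offset pairs alias each physical address):

    #include <stdint.h>
    #include <stdio.h>

    /* 8086 real-mode address formation: a 16-bit segment shifted left by 4,
     * plus a 16-bit offset, kept to 20 bits. Because the segment is only
     * shifted by 4 bits, many segment:offset pairs name the same byte. */
    static uint32_t phys(uint16_t seg, uint16_t off) {
        return (((uint32_t)seg << 4) + off) & 0xFFFFF;  /* wraps at 1 MB */
    }

    int main(void) {
        /* Three different pairs, one physical address. */
        printf("%05X\n", (unsigned)phys(0x1234, 0x0005));  /* 12345 */
        printf("%05X\n", (unsigned)phys(0x1230, 0x0045));  /* 12345 */
        printf("%05X\n", (unsigned)phys(0x1000, 0x2345));  /* 12345 */
        return 0;
    }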
Comment by kens 23 hours ago
Given those constraints, the design of the 8086 makes sense. In hindsight, though, considering that the x86 architecture has lasted for decades, there are a lot of things that could have been done differently. For example, the instruction encoding is a mess and didn't have an easy path for extending the instruction set. Trapping on invalid instructions would have been a good idea. The BCD instructions are not useful nowadays. Treating a register as two overlapping 8-bit registers (AL, AH) makes register renaming difficult in an out-of-order execution system. A flat address space would have been much nicer than segmented memory, as you mention. The concept of I/O operations vs memory operations was inherited from the Datapoint 2200; memory-mapped I/O would have been better. Overall, a more RISC-like architecture would have been good.
I can't really fault the 8086 designers for their decisions, since they made sense at the time. But if you could go back in a time machine, you could certainly give them a lot of advice!
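To illustrate the overlapping-register point, here is a rough C model of the aliasing (not the hardware, just the visible behaviour; little-endian host assumed): a write to AL leaves AH untouched, so the value of AX after a partial write has to be merged from the new byte and the old register, which is what makes renaming awkward.

    #include <stdint.h>
    #include <stdio.h>

    /* AL and AH alias the low and high bytes of the 16-bit AX register.
     * A write to AL changes only the low byte, so the new AX depends on
     * both the written byte and the previous contents of AH. */
    typedef union {
        uint16_t ax;
        struct { uint8_t al, ah; } b;   /* little-endian layout assumed */
    } reg_a;

    int main(void) {
        reg_a a = { .ax = 0x1234 };
        a.b.al = 0xFF;                             /* partial-register write */
        printf("AX = %04X\n", (unsigned)a.ax);     /* prints AX = 12FF */
        return 0;
    }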
Comment by bonzini 6 hours ago
AAA/AAS/DAA/DAS were used quite a lot by COBOL compilers. These days ASCII and BCD processing doesn't use them, but writing efficient routines without them takes very fast data paths (the microcode sequencer in the 8086 was pretty slow), large ALUs, and very fast multipliers (to divide by constant powers of 10).
I/O ports have always been weird though. :)
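For readers who haven't met these instructions, a rough C model of what ADD followed by DAA accomplishes for packed BCD (the real instruction consults the AF and CF flags left by the ADD; this sketch just recomputes them and is only meant to show the idea for valid BCD operands):

    #include <stdint.h>
    #include <stdio.h>

    /* Binary-add two packed-BCD bytes, then adjust each nibble that
     * overflowed past 9 by adding 6, the way DAA does. */
    static uint8_t add_daa(uint8_t al, uint8_t x, int *cf) {
        unsigned sum = (unsigned)al + x;
        int af = ((al & 0x0F) + (x & 0x0F)) > 0x0F;  /* carry out of low nibble */
        *cf = sum > 0xFF;                            /* carry out of the byte   */
        if ((sum & 0x0F) > 9 || af) sum += 0x06;
        if (sum > 0x9F || *cf)      { sum += 0x60; *cf = 1; }
        return (uint8_t)sum;
    }

    int main(void) {
        int carry;
        uint8_t r = add_daa(0x58, 0x64, &carry);     /* BCD 58 + 64 */
        printf("%d%02X\n", carry, (unsigned)r);      /* prints 122  */
        return 0;
    }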
Comment by gruturo 23 hours ago
Thanks for capturing my feeling very precisely! I was indeed thinking about what they could have done better with roughly the same number of transistors and the benefit of a time traveler :) And yes, the constraints you mention (8080 compatibility, etc.) do limit their leeway, so maybe we'd have to point the time machine a few years earlier and influence the 8080 first.
Comment by mjevans 14 hours ago
There are also the needs of the moment. Wasn't the 8086 a 'drop-in' replacement for the 8080, and also (offhand recollection) limited by the number of pins on some of its package options? This was still an era when it was common even for multiple series of computers from the same vendor to have incompatible architectures that required, at the very least, recompiling software if not writing whole new programs.
Comment by bcrl 1 day ago
A more personal question: is your reverse engineering work just a hobby or is it tied in with your day to day work?
Comment by kens 1 day ago
Comment by rogerbinns 21 hours ago
Comment by kens 21 hours ago
To understand why the 8086 uses little-endian, you need to go back to the Datapoint 2200, a 1970 desktop computer / smart terminal built from TTL chips (since this was pre-microprocessor). RAM was too expensive at the time, so the Datapoint 2200 used Intel shift-register memory chips along with a 1-bit serial ALU. To add numbers one bit at a time, you need to start with the lowest bit to handle carries, so little-endian is the practical ordering.
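A sketch of what bit-serial addition looks like (a 16-bit word chosen just for illustration); the carry is why the bits have to be consumed starting from the least significant end:

    #include <stdint.h>
    #include <stdio.h>

    /* One full adder plus a saved carry, fed one bit per step,
     * least-significant bit first. */
    static uint16_t serial_add(uint16_t a, uint16_t b) {
        uint16_t sum = 0;
        unsigned carry = 0;
        for (int i = 0; i < 16; i++) {               /* one bit per "clock" */
            unsigned ai = (a >> i) & 1, bi = (b >> i) & 1;
            sum |= (uint16_t)((ai ^ bi ^ carry) << i);
            carry = (ai & bi) | (carry & (ai ^ bi)); /* full-adder carry out */
        }
        return sum;                                  /* final carry dropped */
    }

    int main(void) {
        printf("%u\n", (unsigned)serial_add(12345, 6789));  /* prints 19134 */
        return 0;
    }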
Datapoint talked to Intel and Texas Instruments about replacing the board full of TTL chips with a single-chip processor. Texas Instruments created the TMX1795 processor and Intel slightly later created the 8008 processor. Datapoint rejected both chips and continued using TTL. Texas Instruments tried to sell the TMX1795 to Ford as an engine controller, but they were unsuccessful and the TMX1795 disappeared. Intel, however, marketed the 8008 chip as a general-purpose processor, creating the microprocessor as a product (along with the unrelated 4-bit 4004). Since the 8008 was essentially a clone of the Datapoint 2200 processor, it was little-endian. Intel improved the 8008 with the 8080 and 8085, then made the 16-bit 8086, which led to the modern x86 line. For backward compatibility, Intel kept the little-endian order (along with other influences of the Datapoint 2200). The point of this history is that x86 is little-endian because the Datapoint 2200 was a serial processor, not because little-endian makes sense. (Big-endian is the obvious ordering. Among other things, it is compatible with punch cards where everything is typed left-to-right in the normal way.)
Comment by variaga 20 hours ago
E.g. a 1 in bit 7 on an LE system always represents 2^7 for 8/16/32/64/whatever-bit word widths.
This is emphatically not true in BE systems, and as evidence I offer that IBM (natively BE), MIPS (natively BE), and ARM (natively LE but with a BE mode) all have different mappings of bit and byte indices/lanes in larger word widths* while all LE systems assign the bit/byte lanes the same way.
Using the bit 7 example:
- IBM 8-bit: bit 7 is in byte 0 and is equal to 2^0
- IBM 16-bit: bit 7 is in byte 0 and is equal to 2^8
- IBM 32-bit: bit 7 is in byte 0 and is equal to 2^24
- MIPS 16-bit: bit 7 is in byte 1 and is equal to 2^7
- MIPS 32-bit: bit 7 is in byte 3 and is equal to 2^7
- ARM 32-bit BE: bit 7 is in byte 0 and is equal to 2^31
Vs. every single LE system, regardless of word width:
- bit N is in byte (N//8) and is equal to 2^N
(And of course none of these match how ethernet orders bits/bytes, but that's a different topic)
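A small C demonstration of that formula on a little-endian host (function name here is just for illustration): the byte holding the 2^7 bit is byte 0 of the object at every width, whereas on a big-endian host it would be byte 1, 3, or 7 depending on the width.

    #include <stdint.h>
    #include <stdio.h>

    /* Return the index of the byte whose 0x80 bit is set, i.e. the byte
     * that holds the value 2^7 within the object. */
    static int byte_holding_bit7(const void *p, size_t size) {
        const uint8_t *bytes = p;
        for (size_t i = 0; i < size; i++)
            if (bytes[i] & 0x80) return (int)i;
        return -1;
    }

    int main(void) {
        uint16_t w16 = 1u << 7;
        uint32_t w32 = 1u << 7;
        uint64_t w64 = 1ull << 7;
        /* On a little-endian host all three print byte 0. */
        printf("16-bit: byte %d\n", byte_holding_bit7(&w16, sizeof w16));
        printf("32-bit: byte %d\n", byte_holding_bit7(&w32, sizeof w32));
        printf("64-bit: byte %d\n", byte_holding_bit7(&w64, sizeof w64));
        return 0;
    }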
Comment by mjevans 14 hours ago
However, I've always viewed Little Endian as 'bit 0' being on the leftmost / lowest-address part of the string of bits, while in Big Endian 'bit 0' is all the way to the right / highest address (but the smallest power).
If encoding or decoding an analog value, it makes sense to begin with the most significant bit first - but that mostly matters in a serial / output sense, not for machine-word transfers, which (at least in that era) were parallel (today, of course, we have multiple high-speed serial links between most chips, sometimes in parallel for wide paths).
Aside from the reduced complexity of aligned-only access, forcing the bus to a machine word naturally also aligns / packs fractions of that word on RISC systems, which tended to be the big-endian systems.
From that logical perspective, it might even make sense to think of RAM not in units of bytes but rather in units of whole machine words, parts of which might be accessed as fractions of a word.
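In that spirit, a toy sketch (entirely hypothetical; little-endian byte lanes assumed) of RAM modelled as an array of 32-bit machine words, where a byte access becomes one aligned word access plus a lane select:

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t ram[256];                        /* 1 KiB, word-addressed */

    static uint8_t load_byte(uint32_t byte_addr) {
        uint32_t word = ram[byte_addr / 4];          /* one aligned bus access  */
        unsigned lane = byte_addr % 4;               /* which quarter of a word */
        return (uint8_t)(word >> (8 * lane));
    }

    int main(void) {
        ram[0] = 0x44332211;                         /* bytes 0..3 of word 0 */
        for (uint32_t a = 0; a < 4; a++)
            printf("byte %u = %02X\n", (unsigned)a, (unsigned)load_byte(a));
        return 0;                                    /* prints 11 22 33 44   */
    }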