X86.FR | Doc TB's R&D Lab

The Universal Chip Analyzer now supports Intel 286

Another milestone, albeit not the hardest one, has been reached! The 80286 is the second-generation 16-bit x86 CPU, introduced by Intel in 1982. Its most important improvement was the use of separate data and address buses. Its predecessor, the famous Intel 8086, used the same pins to send the address and then the data. Dropping that slow time-multiplexed bus gave the 286 a major performance boost, sometimes more than 100% at a similar clock speed. The microarchitecture also evolved, with a more advanced (dedicated) address calculation unit and a faster multiplier. The 80286 was also able to address up to 16 MB of RAM, thanks to its 24-bit address bus.
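As a quick sanity check on the 16 MB figure, here is a minimal Python sketch that converts an address bus width into addressable memory. The helper names are mine, and the 20-bit width used for the 8086 is general knowledge rather than something stated in this post.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given bus width."""
    return 1 << address_bits


def human_size(n_bytes: int) -> str:
    """Format a power-of-two byte count using binary units (1 KB = 1024 B)."""
    for unit in ("B", "KB", "MB", "GB"):
        if n_bytes < 1024:
            return f"{n_bytes} {unit}"
        n_bytes //= 1024
    return f"{n_bytes} TB"


# 20 bits is the usual figure for the 8086 (assumption, not from this post);
# 24 bits is the 80286 address bus width quoted above.
for cpu, bits in (("8086", 20), ("80286", 24)):
    print(f"{cpu}: {bits}-bit address bus -> {human_size(addressable_bytes(bits))}")
```

The same helper gives 4 GB for a 32-bit address bus.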

The 80286 also introduced protected mode, designed to allow much more advanced memory management, with the ability to build multi-user systems running multitasking applications. Unfortunately, due to several limitations in that first implementation, along with several hardware errata found in early steppings, protected mode wasn’t really used by software developers on the 286. Intel only solved all these issues with the 80386. The Intel 80286 was initially released at 4, 6 and 8 MHz on a 1.5 µm nMOS process. Later releases reached 12.5 MHz on a 1 µm CMOS process. Several other companies, such as AMD, Siemens and Harris, produced CPUs fully based on Intel’s 286 microcode, with speeds up to 25 MHz!

UCA 286 Adapter testing LCC (left) & PLCC (right) 286s

Three common 68-pin packages were used for the vast majority of the 80286s ever produced: the original ceramic PGA, a leadless LCC (also ceramic) and a plastic PLCC. In the picture above, you can see an AMD R80286-8 (LCC) and a Harris CS80C286-25 (PLCC). The Universal Chip Analyzer is able to test all three packages just by plugging the matching socket into the PGA DIP socket. Available frequencies (set by DIP switches or software) are 4, 8, 10, 12.5, 16 and 20 MHz.

Why not 25 MHz? Because the 80286 requires a clock-doubled input, and feeding it a 50 MHz clock to get a 25 MHz core frequency would have required an external PLL. Not a big deal, but there is only one rare 286-class CPU that supports this frequency (the Harris/Intersil CS80C286-25 pictured above) and its timings are not fully compliant with the 286 specifications. Designing a special UCA adapter just for this chip would be trivial, but quite useless because the 286 Adapter is already able to test it at 20 MHz. Speaking of “high” frequencies, using the right socket is crucial.

The UCA 286 Adapter is fitted with a high-quality DIP socket. Directly plugging in a PGA 286 CPU is possible but not convenient for testing multiple CPUs in a row. I was able to secure some ZIF sockets for 68-pin PGA like the blue one pictured here (from AMP) and also some 3M LCC sockets complete with top cap. For PLCCs, I first tried some cheap sockets from eBay. That was a disaster: the contact pins were too weak and bent after 2-3 insertions. Worse, the maximum usable frequency was 8-10 MHz. Replacing these crappy sockets with better ones from Foxconn or 3M solved all the issues. I also bought some awesome Yamaichi test sockets for PLCC (on the right), but unfortunately, they use a specific pinout. As the 286 Adapter uses a simple 2-layer PCB, I will consider designing a specific PCB just for them.

With the hardware finished, I’ll later tune the software testing code to see if I can detect the various steppings, and maybe also the manufacturer.

The UCA 386 Adapter supports Ti & Cyrix 486s

Adding support for Cyrix & TI 486s was supposed to be a matter of hours. It finally took almost one month and gave me many headaches. I almost burned everything to the ground several times in rage and begged for help from FPGA gurus, who told me that what I was trying to achieve was like squaring the circle, but I did not give up. Let’s try to explain why it was so hard.

— always(@TLDR; Technical stuff) —

FPGAs are synchronous beasts used to create finite-state machines: almost everything inside an FPGA is synchronized to a clock signal. Each time the clock ticks, the HDL code analyzes the inputs and sets a pre-defined state (that itself defines registers, outputs, the next state, …). To add support for a CPU, you must read the datasheet and write some HDL code that will provide the correct outputs (from the FPGA to the CPU) within the required timings. All these timings are linked to the base clock, so synchronization between the CPU and the FPGA is crucial. For all other CPUs I’ve worked on for the UCA, the FPGA provides the base clock to the CPU. Both the FPGA and the CPU share the same clock and synchronization is easy. But 386s require a clock-doubled input (80 MHz for a 386DX-40 MHz) that I’m not able to provide directly from the FPGA because the 3.3-to-5 volt translators are too slow. So I use an external clock-doubler PLL, but doing so prevents the FPGA from having access to the CPU clock. That’s the root of all the issues I had.
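To give a feel for the numbers involved, here is a small Python sketch that derives the doubled clock input and its period for the 386-class speed grades mentioned on this page. The function names are hypothetical and this is only arithmetic, not the UCA firmware.

```python
def clk2_mhz(core_mhz: float) -> float:
    """Clock input a 286/386-style CPU needs for a given core frequency (2x)."""
    return 2.0 * core_mhz


def period_ns(freq_mhz: float) -> float:
    """Duration of one clock cycle in nanoseconds."""
    return 1000.0 / freq_mhz


for core in (12.5, 16.0, 20.0, 25.0, 33.0, 40.0):
    clk2 = clk2_mhz(core)
    print(f"{core:5.1f} MHz core -> {clk2:5.1f} MHz input, {period_ns(clk2):5.2f} ns per cycle")
```

At 80 MHz, a full input clock cycle lasts only 12.5 ns, which leaves very little margin for the delay added by level translators.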

Fortunately, using an external phase-locked loop (PLL) means the clock input phase is synchronized with the clock-doubled output signal: the rising edges of both clocks occur at the same time. Knowing the transmission delays added by the voltage converters at a given frequency, you can still synchronize your FPGA with the CPU without having access to the base clock. That works fine as long as you don’t change the frequency. But that was too easy: I want to be able to switch frequency on the fly and within a large range (from 12.5 MHz to 40 MHz to cover all 386s). That’s still possible if you build many bitfiles (compiled HDL “FPGA firmware”), one for each frequency. Nah! I want to use the same bitfile for everything, including support for both microarchitectures (Cyrix & Intel) despite their different timing requirements. That’s hell but I almost succeeded.

The current firmware is not perfect but I’m quite happy with it because it works as expected in most cases. The remaining issue is a hole between ~21 and ~28 MHz where the FPGA can’t reliably catch the required inputs from the CPU at the rising or falling edge of the clock. My logic analyzer is unfortunately too slow to solve this but it’s not a big deal. The HDL code works fine at 12.5 MHz, 16 MHz, 20 MHz, 33 MHz and 40 MHz. The only “retail” frequency I’m not able to do is 25 MHz. I built another bitfile for this frequency only and I hope to find a way to merge everything into the same bitfile later. To avoid losing my mind, I’ll wait until I have enough money to buy a faster logic analyzer (like the lovely DSLogic U3Pro32) before working on this again.

— End —

But here it is: the UCA supports all Cyrix-based 386-class CPUs like the 486DLC. Here are the ones I used for the test:

Unlike 386-class CPUs from AMD, which are based on Intel’s microcode and are exact clones, the Cx486DLC introduced in June 1992 uses a custom microarchitecture built from scratch by Cyrix. While still using the 32-bit 386 bus, it comes with 486-class features like an embedded L1 cache and some new instructions. The Cyrix 486DLC is not a perfect pin-to-pin replacement for Intel 386s, as timings are a bit different and cache control lines must be handled by the chipset. Compatibility issues with many motherboards, especially older ones, are well known. The original 486DLC was available at 25 MHz, 33 MHz and 40 MHz. All of these were manufactured by Texas Instruments on the 0.8 µm CHMOS node. Ti also launched its own rebranded 486DLC chips, which were exactly the same except for the marking. Note the vicious 90° rotation between the printing and pin 1 on the Ti486DLC. Fortunately, the Universal Chip Analyzer has strong short-circuit protection built in…

Cyrix also later released a special, clock-doubled version called the 486DRx². It was available at 16/32 MHz, 20/40 MHz, 25/50 MHz and even 33/66 MHz. The latter was the fastest PGA132 CPU ever released.

Cyrix 486DLC-40 & Cyrix 486DRx²-25/50 Tested on the UCA

The original Cyrix 486DLC exists in two steppings: the earliest one with CPUID 0x420 and a later one with CPUID 0x421. The proprietary “DIR” identification registers are only available on newer Cyrix CPUs. None of the 486DLCs tested have them. The 486DRx² is the only one to have DIR registers and reports itself as Model = 0x07. The UCA happily tested the 486DLC at 40 MHz and was even able to overclock my 486DRx² 25/50 to 33/66 MHz for a short time. Cyrix 486s run hot and deserve a proper heatsink. Power consumption is as high as that of a 486 DX2 and can reach 4 watts (4 times higher than a later Intel 386 DX-33)!

Much later in the development process, I felt confident enough to try a blank 486DLC engineering sample I got many years ago.

This ES is not a clock-doubled CPU like the DRx² and was able to run properly at 33 MHz. Its CPUID is 0x421 and, surprise, it has DIR registers, identifying itself as Model = 0x01 (the expected value for a Cyrix 486DLC) and stepping 0x22, which seems to match the handwritten value (2/2) marked on top. The DRx² 25/50 tested above comes with stepping 0x21, so this ES seems newer. I don’t know at this point if any 486DLCs were released commercially with this stepping, or even if any retail 486DLCs have DIR registers enabled.

Let’s now talk about the Ti 486SXL. After having simply renamed the Cyrix 486DLC to Ti 486DLC, Texas Instruments released a new, reworked core they called the “486SXL”. It was available with PGA132 (386) and PGA168 (486) pinouts. Two models were released for PGA132 Socket: the TI 486 SXL40 and the TI 486 SXL2-50. Here they are:

They come with two major differences compared to the Cyrix 486DLC. First, TI boosted the L1 cache from 1 KB to 8 KB (same size as the Intel 486). Second, the clock-doubling feature (also available on the SXL-40 despite its name) is not always active by default as it is on the DRx². It must be enabled after boot by software: you basically have to mess with internal proprietary registers to switch on the clock-doubling mode.

Very few 386 motherboards support the Ti 486SXL but the UCA happily tested it with and without clock-doubling. Just for fun, I ran some benchmarks on all 386s now supported by the Universal Chip Analyzer. The code is not really well-tuned and is only based on some register manipulations and a lot of integer math operations (add, sub, imul and idiv). Here are the results:

Intel 386s appear as the slowest of them all. As expected, AMD 386s perform exactly the same at equal clock speed, but their famous 40 MHz model offers a 20% boost over the Intel 386 DX-33. Cyrix 486DLCs are much faster. When they were introduced, Cyrix claimed “up to 2x faster than 386DX at same clock frequency”. Our tests showed a ~50% improvement between the Intel 386DX-33 and the Cyrix 486DLC-33. The 486DLC-40 is ~80% faster than the fastest Intel 386.

Anyway, the most impressive performance comes from the DRx²: the rare 33/66 MHz version is actually ~7x faster than the original Intel 386 DX released at 12.5 MHz in 1986! Results from the TI486SXL show it’s entirely based on the Cyrix 486DLC core with no tuning at all of the microarchitecture. The effect of the larger 8 KB cache is invisible because the UCA has extremely fast RAM without any wait states (similar to an L1 cache). In any case, even real-world applications don’t gain much from it (no more than 3-5% at best).

Stay tuned for more exciting news from the UCA!

The UCA 386 Adapter now supports Intel RapidCAD

The elusive Intel RapidCAD Engineering CoProcessor is a weird and rare 2-chip set designed to upgrade 386 computers. It was released in February 1992 for $499 and sold as a coprocessor. Technically, the RapidCAD is a 486DX assembled inside a 132-pin ceramic package that plugs into a standard 386 socket. It features an integrated FPU but Intel removed the 8 KB L1 cache and the 486-specific instructions. A second chip (RapidCAD-2), which plugs into the 387 socket, is only needed to provide the #FERR signal used to handle FPU exceptions.

This early sample was assembled in April 1992 with dies from December 1991. The RapidCAD is able to work at any frequency from 16 to 33 MHz. Given the lack of L1 cache and the slower 386 bus, it does not provide a significant boost in integer performance, but its FPU is the fastest available for 386s. The Universal Chip Analyzer is now able to fully test the RapidCAD up to 33 MHz.

For some reason, my sample was unable to run at 12.5 MHz, but works fine from 16 to 33 MHz. It’s probably due to the modification of the internal PLL needed to adapt a 486 CPU (1x clock signal expected) to a 386 socket (2x clock required). PLLs often have a limited frequency lock range at both the top and the bottom end.

The reported CPUID is 0x340 and the power consumption is quite high for a 386 (~2W typical in INT, ~2.5W in FPU). I ran an INT benchmark at 33 MHz only and got a score of 105.7, while a standard Intel 386DX-33 (or Am386DX-33) got 99.6. That’s only a 6% increase. The RapidCAD is much faster on FPU, being up to 70% faster than an Intel 387.
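The 6% figure follows directly from the two scores quoted above. A minimal Python sketch of that comparison (the function name is mine):

```python
def percent_gain(score: float, baseline: float) -> float:
    """Relative improvement of `score` over `baseline`, in percent."""
    return (score / baseline - 1.0) * 100.0


RAPIDCAD_33 = 105.7  # INT benchmark score quoted above
I386DX_33 = 99.6     # Intel 386DX-33 / Am386DX-33 score quoted above

print(f"RapidCAD vs 386DX-33: +{percent_gain(RAPIDCAD_33, I386DX_33):.1f}%")
```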

The Odd Story of Factory-Downgraded 486s

Counterfeit CPUs were very common in the mid-90s. The worst period was between 1993 (just after the launch of the Intel 486 DX2) and 1998 (when the Pentium II started to be multiplier-locked). It was extremely easy for tricksters to remove the original marking and reprint another one with a higher frequency rating. Many DX4-75s were remarked as DX4-100s, and even more Pentium 133/150s were remarked as Pentium 166s or 200s.

Genuine factory-remarked CPUs also exist, but they’re generally uncommon. The most well-known example is the double-sigma (ΣΣ) sign added on early 386s after they had been tested free of the infamous 32-bit multiplier bug. Some rare Intel 486 SXs were also later remarked with a higher speed grade. Here are two of them:

As with all factory remarks, the addition is quite obvious. Intel probably binned these CPUs a second time at the request of a big customer (IBM?) and added the second rating later. Today’s story about factory remarks is much more unusual because it concerns standard models.

Am486DX4-100SV8B (remarked 5×86)

After I published this analysis some weeks ago, a reader told me he had a strange Am486DX4-100 that seemed to be an AMD 5×86. After a careful look at the printing, which looked 100% genuine at first sight, he was kind enough to lend it to me for further investigation with the UCA. Here it is:

The “9626” date code tells us it was manufactured in late June or early July 1996, which is quite late for an Am486DX4. I immediately noticed the 25544 package code, only used for the 350 nm die. This die was the basis of all Am486DX5 and Am5x86. The “C” stepping was also unusual, as the Am5x86 is based on the A-step (from November 1995) or B-step (from March 1997). A “C” stepping built in 1996 is inconsistent with the 5×86 line, but entirely consistent with the 486DX4 (late 486DX4s in the latest “C” stepping were built on the 25498 package in May/June 1996). So it was time for a test on the Universal Chip Analyzer:

WOW! There is no doubt: this CPU is really based on the standard 350 nm die with a fully enabled 16 KB Write-Back L1 cache and a working 4x multiplier. Actually, it can even be overclocked easily to 133 MHz. All specs, including power consumption and CPUID (0x4F4), make it indistinguishable from an AMD 5×86. This CPU can of course also work with a 3x multiplier like an AMD 486DX4-100 (CPUID drops to 0x494).

After some research, it seems that all CPUs based on the 25544/C package are marked as 486DX4-100SV8B while being really DX5 SV16B (5×86). AMD produced them for quite some time between February 1996 and March 1997. They probably stopped the production of the old 500 nm die in early ’96 but still had some demand from customers for DX4s, so they just used the new 350 nm die and marked these CPUs as DX4-100s. As long as you use the default x3 multiplier, they behave exactly like the old one … except for the cache size.

Has Intel also done such weird things? I could have sworn there was no way. I was wrong…

Intel 486DX2-66 SK080 (remarked DX4)

The same reader also sent me a DX2-66 that could be “really a DX4-100”. That sounded odd and really unlikely to me because Intel has a strict policy on S-Specs. Intel DX4s also have a specific CPUID to help software distinguish them from DX2s. Unlike on AMD 486s, this CPUID does NOT change with the multiplier used, so it’s strange to have a DX2 with a DX4’s CPUID. Here is the original CPU:

Everything looks genuine here. SK080 is one of the least common S-Specs for Intel DX2s. The only other S-Spec beginning with “SK” is the extremely rare SK058. The SK080 is a 3.3V SL-Enhanced part which seems to have been produced only between WW18’94 (May 1994) and WW48’94 (November 1994). Let’s plug it into the UCA:

Awesome! This is really a DX4 factory-downgraded to DX2-66. The 0x480 CPUID leaves no doubt about the original die used here. The usual power consumption and the ability to work fine at 3.3V at 100 MHz lead me to think it’s probably a DX4-100. With the multiplier set to 2x, the SK080 also works at 2×33 MHz, as expected for a CPU marked as a DX2-66. To be 100% sure, I was able to find another sample to confirm these findings.

The UCA 386 Adapter now supports AMD & IBM 386s

IBM 386

As with all its previous microprocessors, Intel licensed the i386 design to third parties. AMD was the only one legally allowed to sell Intel-based 386s to customers (as bare CPUs), but IBM was granted the right to produce some Intel 386s for its own use. They don’t look like standard ceramic CPUs: IBM used a plastic substrate and a metal cover to protect the die and help with thermal dissipation. Here is what they look like.

While the packaging is different, the internal die is the same as on Intel 386s. Seven different IBM part numbers are currently known: 23F7189 (?? MHz), 32G6633 (25 MHz), 51F0352 (20 MHz), 51F1783 (20? MHz), 51F1784 (20 MHz), 51F1797 (25 MHz) and 63F7615 (25 MHz). I was able to get my hands on a 51F1784 and the later 63F7615. I tested both on the Universal Chip Analyzer. There is no “Pin 1” mark so I had to guess where pin 1 is. Fortunately, the UCA has strong over-current and short-circuit protection. Let’s start with the 63F7615:

This one is able to work fine up to 33 MHz, with a CPUID set at 0x305, similar to Intel 386s based on the D0 stepping. I don’t know its real rated frequency for sure, but it’s probably only 25 MHz. The other one (51F1784ESD) is not able to work at 33 MHz, and not even at 25 MHz. The current (early) UCA firmware only offers 16/25/33/40 MHz, so I can only confirm that my 51F1784ESD works at 16 MHz but not at 25 MHz. According to various sources, it’s probably rated at 20 MHz.

AMD Am386

AMD also produced a lot of Am386DX at 20, 25, 33 and 40 MHz. While the microcode is 100% from Intel, the manufacturing process is different and they had lower power consumption (thanks to the 0.8µm process used by AMD instead of Intel’s 1µm CMOS-IV on the latest i386s).

Let’s start with the standard Am386 DX/DXL. I tested one Am386 DX/DXL-25 “B-Step”, one Am386 DX/DXL-33 “D-Step” and another Am386 DX/DXL-40 “C-Step”. All came in the 23936 package from Kyocera.

The UCA tool is not yet able to detect them as AMD, but I’m working on a new algorithm based on power consumption to distinguish them from Intel 386s. The B-Step identifies itself as 0x305, the same CPUID used on Intel’s 386 D0-Step. The DXL-25 was able to work up to 33 MHz. Both C- and D-Step have a CPUID set at 0x308, like the later Intel 386s (D1 step and up).

The last CPU to try was the Am386DE-33, an uncommon embedded model. Like the Am386DXL, it uses a fully static design, meaning it can be clocked down to DC (0 Hz) while retaining the contents of all its internal registers. The biggest difference between the usual Am386DX/DXL and the Am386DE is that the latter has its paging unit disabled in protected mode. Bit 31 of CR0 (used to enable paging) is reserved on the Am386DE. Another feature specific to the Am386DE is the ability to work at its rated frequency with a much lower voltage (down to 3.0V). And it works fine:

At 3.3V, the power needed drops by a huge margin, from 1.1 W to as low as 461 mW (0.46 W). That’s a power reduction of nearly 60%!
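This drop is in line with the usual first-order rule that CMOS dynamic power scales with the square of the supply voltage at a fixed frequency. Below is a rough Python sketch of that check. It assumes the 1.1 W figure was measured at 5 V and the same clock, which the text above does not state explicitly, and it ignores static power.

```python
def scaled_dynamic_power(p_ref_w: float, v_ref: float, v_new: float) -> float:
    """First-order CMOS estimate: dynamic power scales with V^2 at fixed frequency."""
    return p_ref_w * (v_new / v_ref) ** 2


def percent_reduction(before: float, after: float) -> float:
    """How much smaller `after` is than `before`, in percent."""
    return (before - after) / before * 100.0


P_AT_5V_W = 1.1             # assumed to be the 5 V measurement
P_AT_3V3_MEASURED_W = 0.461  # measurement quoted above

predicted = scaled_dynamic_power(P_AT_5V_W, v_ref=5.0, v_new=3.3)
print(f"predicted at 3.3 V: {predicted:.3f} W")
print(f"measured reduction: {percent_reduction(P_AT_5V_W, P_AT_3V3_MEASURED_W):.0f}%")
```

The estimate lands within a few percent of the measured 461 mW, so the measurement looks plausible for a simple voltage drop without any other change.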

[Guide] Am486 Die & Packaging

After weeks spent testing A LOT of AMD 486s with the Universal Chip Analyzer, messing with a gas torch to decap some of them and speaking with a former AMD engineer who worked on them back in the 90s, I’m happy to publish here all the information I was able to get! Here it is:

The Ultimate AMD 486 Die & Packaging Guide

PS: If you have any more information about AMD 486, please leave a comment. Thanks!

The Universal Chip Analyzer now supports Intel 386!

Another milestone has been reached. A new iconic CPU family is now supported by the UCA: the Intel 80386, the very first 32-bit x86 microprocessor! The i386 was originally released in 1985 at 12.5 MHz and 16 MHz. It added a 3-stage instruction pipeline and an on-chip MMU (Memory Management Unit) able to address up to 4 GB of RAM. A giant capacity for that time. The original Intel 386 – then renamed 386DX – comes in a PGA132 package and its frequency was later upgraded to 20 MHz, 25 MHz and finally 33 MHz. Later clones from AMD & Cyrix – not yet supported by the UCA – were also released at frequencies up to 40 MHz.

The design of the UCA 386 Adapter was challenging because of the high frequencies involved. While all 486s work with a standard clock input, 386s require a clock-doubled signal with a strong voltage swing (CMOS) between 0 and +5V. Generating frequencies up to 80 MHz (for a 386DX-40) with these requirements was not possible with the UCA architecture, so the UCA 386 Adapter includes an external clock-doubling PLL. For the first try, I used an NB3N511 clock multiplier from On Semiconductor but I was unable to reliably run 386s at more than 20 MHz. The UCA is based on an FPGA and timings are crucial. With the NB3N511, I was unable to match timings because of the lack of phase synchronization between the input clock generated by the FPGA and the doubled frequency fed to the CPU. So I needed another PLL, with zero delay between input and output.

After some research, I gave the ICS570A a try and it worked perfectly fine. I was able to sync the internal FPGA logic to the clock-doubled CPU signal up to 40/80 MHz, without having direct access to that signal. Some high-quality ceramic decoupling caps were also mandatory to achieve the highest frequencies. I had a bad surprise while looking for ZIF sockets for the 386 Adapter: for some reason, PGA132 ZIF sockets are extremely expensive and I was unable to source them at a decent price. I bought some at ~$25 each, but if you can help me find more at a lower price, please drop me an email!

The four standard test frequencies for 386s are set to 12.5 MHz, 25 MHz, 33 MHz and 40 MHz. Right now, only Intel 386s are supported, but I’m working on the AMD, Cyrix, etc. clones and I’m confident the UCA will support them all very soon. Here are the i386s I have for testing:

The first one is an early 386 clocked at 16 MHz and produced in 1986. The engraved ΣΣ sign shows that it has been tested free of the infamous 32-bit multiplier bug (more on this in a later post). According to this source, the S40344 S-Spec is a B1 stepping. This is confirmed by the CPUID displayed by the UCA Analyzer tool: 0x303. At 12.5 MHz, this CPU requires about 188 mA at 5V (a bit less than 1W).

The second one is very similar to the first one, except for its rated speed of 20 MHz. Still the same B1 stepping, same CPUID, same ΣΣ, same power consumption, same everything. It also can’t be overclocked to 25 MHz.
The third one is also rated at 20 MHz but doesn’t have any S-Spec. It was assembled in November 1988, more than one year after the previous one. The CPUID is different at 0x305, which indicates a D0 stepping. Surprisingly, the power consumption is more than 10% higher, at ~212 mA for 5V at 12.5 MHz. Maybe Intel added some logic to fix the numerous errata in previous steppings, maybe it’s just sample variation. Anyway, the D0 stepping still uses the CHMOS III (1.5 µm) process. This chip can be overclocked to 25 MHz.

The fourth one is a 16 MHz part marked SX236 and built with the more advanced CHMOS IV process. The CPUID is 0x308, which is used by Intel for the D1, D2 and E steppings. The E stepping is usually marked on the chip, so we can guess this one is a D1 or D2 stepping. Power consumption drops from 212 mA to 147 mA, thanks to the 1 µm process (instead of 1.5 µm). Unfortunately, it can’t work at 25 MHz.
The fifth one is an A80386DX-33 made in 1992 with S-Spec SX366. It’s the fastest clock speed Intel released for a 386DX. The CPUID is still set at 0x308 but the stepping is clearly more advanced: the power consumption drops to 126 mA while using the same CHMOS IV process as the previous one. This particular CPU requires 126 mA at 12.5 MHz, 206 mA at 25 MHz and up to 261 mA at 33 MHz. It can’t be overclocked to 40 MHz.

The last one is much newer than the others. It was manufactured in 2000 and uses the E stepping, as stated by the last letter of the lot code. Aside from this, all specs look identical to the SX366. Measurements are also the same and it doesn’t work at 40 MHz.
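To compare these chips in watts rather than milliamps, here is a minimal Python sketch that applies P = V × I to the currents quoted above. It assumes the supply sits at exactly 5.0 V, and the dictionary labels are only my shorthand for the samples.

```python
SUPPLY_V = 5.0  # assumption: nominal 5 V rail, as quoted in the text

# Currents quoted above, all measured at 12.5 MHz
CURRENT_MA_AT_12_5_MHZ = {
    "B1 stepping (16 MHz part)": 188,
    "D0 stepping (20 MHz part)": 212,
    "SX236 (CHMOS IV)": 147,
    "SX366 (CHMOS IV)": 126,
}


def power_w(current_ma: float, voltage: float = SUPPLY_V) -> float:
    """Electrical power in watts from a current in mA and a voltage in volts."""
    return voltage * current_ma / 1000.0


for chip, ma in CURRENT_MA_AT_12_5_MHZ.items():
    print(f"{chip:28s} {ma:4d} mA -> {power_w(ma):.2f} W")
```

The same helper gives about 1.3 W for the SX366 at 33 MHz (261 mA).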

Stay tuned for more exciting news about the UCA!

[UCA CPU Analysis] Prototype UMC Green CPU U5S-SUPER33

While sorting some new Engineering Samples I received lately, I exhumed some prototypes from my collection. They came without missing pins, so they are good candidates for an advanced investigation with the Universal Chip Analyzer.

Let’s begin with the first one, a UMC Green CPU U5S-SUPER33:

It’s marked “Confidential” on the last line, which means it’s an engineering sample. The date code is quite early: 9416. It was manufactured in the third week of April 1994. This CPU is not one of the very first samples of the whole U5S line regardless of frequency, but probably a prototype for the specific 33 MHz version. Also notice the famous “Not for U.S. sale or import” line, written here because UMC was afraid, and rightly so, of the legal consequences of infringing Intel’s ‘338 patent.

Let’s try it on the Universal Chip Analyzer:

The prototype works fine up to 33 MHz. One of the first interesting points to check is whether such an early prototype supports the CPUID instruction. A few weeks ago, I was chatting with mtx500 (another well-known and very technically minded CPU collector) about ways to detect UMC CPUs. He told me he uses the SALC/FS method to distinguish UMCs. The idea is to use the undocumented Intel opcode 0xD6 “SALC” (Set AL on Carry) together with the 0x64 “FS:” prefix. Only on a UMC does the combined 0xD6 0x64 sequence return the “magic” constant 0xAB6B1B07 in the EAX register.

I wondered why one would use this method when the CPUID instruction is supported. Mtx500 told me that early U5S might not support the CPUID instruction, so I was impatient to try it on an early U5S like this one. It looks like the CPUID instruction is well supported, with the expected “UMC UMC UMC” string reported as well as the usual 0x423 family/model/revision of the U5S(X). At first sight, this prototype looks strictly identical to the retail version. I ran some benchmarks to compare it with later U5S and the cycle count of the test instruction flow is exactly the same.
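For readers unfamiliar with how a value like 0x423 splits into family, model and revision, here is a small Python sketch using the standard layout of the x86 CPUID signature (stepping in bits 3:0, model in bits 7:4, family in bits 11:8). The names are mine, and applying the same split to the reset-time identifiers of chips without a CPUID instruction is a convenience, not something the UCA tool is confirmed to do.

```python
from typing import NamedTuple


class Signature(NamedTuple):
    family: int
    model: int
    stepping: int


def decode_signature(value: int) -> Signature:
    """Split a 12-bit x86 signature into family / model / stepping nibbles."""
    return Signature(
        family=(value >> 8) & 0xF,
        model=(value >> 4) & 0xF,
        stepping=value & 0xF,
    )


# 0x423 is the U5S value quoted above; the others appear elsewhere on this page.
for sig in (0x423, 0x480, 0x4F4, 0x494):
    print(f"0x{sig:03X} -> {decode_signature(sig)}")
```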

However, on closer inspection, I found a noteworthy difference: power consumption. I ran the same INT benchmark on my 4 U5S with the voltage set at 5.11V exactly on all of them. The UCA is quite precise at measuring power consumption. All of them were tested at a fixed 33.3 MHz frequency, no matter their rated maximum speed.

The results are quite interesting. As you can see, all my retail U5S draw (almost) the same current: about 308 mA at 33 MHz while running my benchmark code. For an unknown reason, the U5SD is a bit higher at 321 mA, but the difference in power is only about 66 mW. In contrast, the prototype U5S-33 needs MUCH more power to process the exact same code in the same time at the same frequency with the same voltage: 421 mA, which translates to about 2.15 watts.

The most obvious explanation is a switch in manufacturing process between the early prototype and the commercial revision. UMC was, at that time, one of the two major IC manufacturers in Taiwan, along with TSMC. We can assume they could switch easily between processes. In the very first U5S datasheet, published in 1993, UMC indicates the U5S is built using a 0.6 µm CMOS process. This is consistent with the power consumption seen on the sample. I was able to find a table of the manufacturing process evolution in Taiwan in the 90s.

UMC and TSMC switched from 0.6 µm in 1993 to 0.4 µm in 1994. According to this table, it seems likely that the prototype was built using the ’93 process (0.6 µm) while UMC switched to the ’94 process (0.4 µm) for their retail mass-volume production. The retail U5S tested here need about 27% less power than the prototype. The theoretical reduction in power consumption between 0.6 µm and 0.4 µm is 33%, so that makes perfect sense. This engineering sample is probably an early U5S manufactured with the original 1993 0.6 µm process. It is unknown at this point if retail (non-prototype) UMC Green CPUs have ever been built with that process. Another analysis with a retail U5S produced earlier than August 1994 (Week 33’94) would allow us to be sure…
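Both percentages can be reproduced from the numbers in this post. In the Python sketch below, the 33% "theoretical" figure is read as the linear shrink ratio between the two nodes; that interpretation is my assumption, and the function name is hypothetical.

```python
def percent_reduction(before: float, after: float) -> float:
    """How much smaller `after` is than `before`, in percent."""
    return (before - after) / before * 100.0


PROTOTYPE_MA = 421  # prototype U5S-33 at 33 MHz, 5.11 V
RETAIL_MA = 308     # typical retail U5S, same conditions

measured = percent_reduction(PROTOTYPE_MA, RETAIL_MA)
theoretical = percent_reduction(0.6, 0.4)  # linear shrink from 0.6 um to 0.4 um

print(f"measured:    {measured:.1f}% less current")
print(f"theoretical: {theoretical:.1f}% (linear feature-size ratio)")
```

Since both chips ran at the same voltage, the reduction in current equals the reduction in power.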

Identifying “blank” 486s with the UCA #1

The Universal Chip Analyzer is useful to test and spot counterfeit CPUs, but also to help identify CPUs without markings. The lack of printing on a CPU can be caused by poor-quality ink that gradually faded over the years, by abrasion against other ICs (common when you save a nice CPU from a “scrap lot”) or because it’s an early engineering sample (prototype). Here are two examples.

Let’s start with the first one.

It’s supposed to be an early engineering sample coming from ST Microelectronics. The handwritten markings on top are “X2, Y4 #7”, probably related to the coordinates of this particular die on the wafer (X=2, Y=4) and the wafer number (#7), and also “PLL”, which probably means it was designed to test the integrated phase-locked loop (clock multiplier). The back of the CPU shows that the PLL was configured for “3XCLK”. So it’s a DX4-class CPU. But it could also be a 5×86 ES. I tested it on the UCA at 3.45V.

All 486s from ST are just rebranded Cyrix 486s and this one is no exception. It identifies itself as a Cyrix Cx486DX4 (M0.7). The most interesting point is the stepping. I have one It’s ST ST486 DX4-100 (you can see it here) and another Cyrix Cx486DX-100GP4. Both come with stepping 3.6. This sample uses stepping 4.0. I do not have any Cyrix 486s (or IBM or It’s ST) with such a late stepping. I am not even sure this stepping ever reached commercial status. This sample works perfectly fine at 120 MHz (3×40 MHz) with 3.45V, while my Cyrix DX4-100 requires 3.6V to work at 120 MHz and the retail It’s ST doesn’t work at all when overclocked to 120 MHz.

At this point, I have no proof that this sample comes from ST and not directly from Cyrix. Anyway, it could be an engineering sample for a hypothetical Cx486DX4-120 that was finally canceled to avoid hurting 5×86 sales. Interesting.

Here is the other one.

The package marking (25253) tells us it’s an early AMD 486 assembled by Kyocera, but there is nothing more written on the top or back of the chip. This package number is most often used for 3.3V parts (while 5V parts from the same era come in the 25220 package). Time to plug it into the UCA!

Early AMD 486s use the Intel 486 microcode, so they’re virtually indistinguishable by software. I’m testing a very nice way to distinguish them but that’s another story (I’m waiting for a new PCB and I’ll tell you more if it works as expected). The CPU has 8 KB of L1 write-through cache and doesn’t support write-back, so it’s a (N)V8T revision and not a later SV8B. It doesn’t support the 3x multiplier, so it’s a DX2 and not a DX4. Testing various frequencies shows it can work fine at 40 MHz. This unmarked CPU is probably an Am486DX2-80 NV8T, or maybe an Am486DX2-66 NV8T with good overclocking capability. Nothing suggests it’s an engineering sample. The markings have probably faded over time (or have been removed by mechanical action).

[UCA CPU Analysis] Intel 486 DX2-66 SYE36 ES

I’m starting a new section: IC Analysis! The goal is to study odd or rare CPUs with the Universal Chip Analyzer. As an avid CPU collector, I have many of them. Since I started collecting back in the early 00s, I have only been interested in engineering samples. These are basically prototypes of retail CPUs. Knowing their specifications is often very interesting for historical purposes (i.e., to retrace the timeline of the development).

Let’s start with this 486 DX2-66 Engineering Sample:

This processor is uncommon in many respects, even for an engineering sample. It comes with the standard Intel i486 DX2 logo but other writings are printed instead of being laser-engraved. The part number on the first line (“A80486DX2-66”) is the retail one, while Intel often used the code number (“P24” or “A80P24”) on early prototypes of the 486DX2.

The second line shows the date when the die (the piece of silicon on which the CPU is etched) was assembled inside the ceramic package: week 22 of 1992, so between May 25th and May 31st, 1992. The date when the die itself was produced is marked on the back: week 17 of 1992 (between April 20th and 26th, 1992). Intel officially introduced the first clock-doubled 486DX2 at 50 MHz on March 3rd, 1992. The 486DX2 at 66 MHz was launched five months later, on August 10th, 1992. This sample was therefore produced before the initial production of the 486DX2-66.
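Week-based date codes like these can be converted to calendar dates with a few lines of Python. The sketch below assumes the manufacturer's week numbering matches ISO 8601 weeks (Monday to Sunday), which happens to reproduce the dates given above but is not guaranteed for every vendor. The `decode_yyww` helper and its `century` parameter are hypothetical conveniences for four-digit codes such as "9626".

```python
from datetime import date, timedelta
from typing import Tuple


def week_range(year: int, week: int) -> Tuple[date, date]:
    """Monday and Sunday of the given ISO week (requires Python 3.8+)."""
    monday = date.fromisocalendar(year, week, 1)
    return monday, monday + timedelta(days=6)


def decode_yyww(code: str, century: int = 1900) -> Tuple[date, date]:
    """Decode a four-digit YYWW date code, e.g. '9626' -> week 26 of 1996."""
    year = century + int(code[:2])
    week = int(code[2:])
    return week_range(year, week)


start, end = week_range(1992, 22)
print(f"week 22 of 1992: {start} to {end}")
start, end = decode_yyww("9626")
print(f"date code 9626:  {start} to {end}")
```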

Another very rare feature of this CPU is its Intel product spec number. From the 70s until today, Intel has used a 5-character alphanumeric code (named “S-Spec”) to identify all their retail products. An S-Spec always starts with the letter “S” (e.g., SX366 is an 80386 DX-33 and SR147 is a Core i7 4770K). The only exception is for prototypes (engineering or qualification samples), where the code begins with a “Q”. The presence of that “Q-Spec” (also named QDF) on a CPU is the most effective way to distinguish a pre-production sample from its standard production counterpart. On this obvious engineering sample (also marked “ES” on the front), the QDF starts with S: “SYE36”. For a very short period (1991/1992), Intel produced some 386/486 engineering samples with a spec code starting with “SXE”, “SYE” and “SZE”. The reason is still unknown, but this sample is one of them.
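The naming rules described above can be summarized as a tiny classifier. This Python sketch is only a heuristic built from this paragraph: the function name and return labels are mine, "Q0000" is a made-up code used purely for illustration, and a genuine retail S-Spec could in principle also start with one of the three sample prefixes.

```python
import re

# Prefixes reported above for some 1991/1992 386/486 engineering samples
SAMPLE_S_PREFIXES = ("SXE", "SYE", "SZE")


def classify_spec(code: str) -> str:
    """Rough classification of a 5-character Intel spec code."""
    code = code.strip().upper()
    if not re.fullmatch(r"[A-Z0-9]{5}", code):
        return "invalid"
    if code.startswith("Q"):
        return "sample (Q-Spec / QDF)"
    if code.startswith(SAMPLE_S_PREFIXES):
        return "possible sample (S-prefixed, 1991/1992)"
    if code.startswith("S"):
        return "retail (S-Spec)"
    return "unknown"


for code in ("SX366", "SR147", "SK080", "SYE36", "Q0000"):
    print(f"{code}: {classify_spec(code)}")
```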

It’s now time to test this SYE36 sample with the UCA:

And it works fine! This early sample does not support the CPUID instruction, but the value at reset is 0x433. The first commercial stepping is A2, with a CPUID set at 0x432. Only a DX2-50 was released with this stepping, which didn’t seem able to run properly at 66 MHz. This sample uses the B1 stepping, like the first retail 486 DX2-66 (SX645) released. Power consumption measured in FPU benchmark mode is quite high (4.3 W) but still within specs (4.5 W). Later DX2-66s need less energy.

Despite its unusual markings, it seems this sample was a qualification sample rather than a “true” engineering sample. It was probably sent to Intel’s customers for validation some weeks before the official launch. Other than that, it’s strictly identical to an SX645 486 DX2-66.