[UCA CPU Analysis] Prototype UMC Green CPU U5S-SUPER33

While sorting some new Engineering Samples I received lately, I exhumed some prototypes from my collection. They came without missing pins, so they are good candidates for an advanced investigation with the Universal Chip Analyzer.

Let’s begin with the first one, a UMC Green CPU U5S-SUPER33:

It’s marked “Confidential” on the last line, which means it’s an engineering sample. The date code is quite early: 9416, meaning it was manufactured in the third week of April 1994. This CPU is not one of the very first samples of the U5S line as a whole, but it is probably a prototype for the specific 33 MHz version. Also notice the famous “Not for U.S. sale or import” line, written here because UMC was afraid, and rightly so, of the legal consequences of infringing Intel’s ‘338 patent.

Let’s try it on the Universal Chip Analyzer:

The prototype works fine up to 33 MHz. One of the first interesting points to check is whether such an early prototype supports the CPUID instruction. A few weeks ago, I was chatting with mtx500 (another well-known and very technically minded CPU collector) about how to detect UMC CPUs. He told me he uses the SALC/FS method to identify them. The idea is to execute the undocumented Intel opcode 0xD6, “SALC” (Set AL on Carry), with the 0x64 “FS:” segment-override prefix in front of it. Only on a UMC does the combined 0x64 0xD6 sequence return the “magic” constant 0xAB6B1B07 in the EAX register.

I wondered why anyone would use this method when the CPUID instruction is supported. Mtx500 told me that early U5S chips might not support CPUID, so I was eager to try it on an early U5S like this one. It turns out the CPUID instruction is fully supported, with the expected “UMC UMC UMC” string reported as well as the usual 0x423 family/model/revision of the U5S(X). At first sight, this prototype looks strictly identical to the retail version. I ran some benchmarks to compare it with later U5S chips, and the cycle count of the test instruction flow is exactly the same.
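To make the two CPUID values above concrete, here is a minimal Python sketch of how a tool could turn raw register contents into the vendor string and the family/model/revision nibbles. The function names are mine, not part of the UCA software, and I am assuming the usual x86 convention (vendor string in EBX, EDX, ECX order, with a trailing space to fill 12 characters).

```python
import struct

def vendor_string(ebx: int, edx: int, ecx: int) -> str:
    """Rebuild the 12-character vendor string from CPUID leaf 0 (EBX, EDX, ECX order)."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")

def decode_signature(sig: int) -> dict:
    """Split a 486-era signature such as 0x423 into its three nibbles."""
    return {
        "family": (sig >> 8) & 0xF,
        "model": (sig >> 4) & 0xF,
        "stepping": sig & 0xF,
    }

# "UMC " packed as a little-endian 32-bit word (assumed register content).
UMC_WORD = int.from_bytes(b"UMC ", "little")

print(repr(vendor_string(UMC_WORD, UMC_WORD, UMC_WORD)))
print(decode_signature(0x423))
```

With 0x423 this gives family 4, model 2, revision 3.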

However, on closer inspection, I found a noteworthy difference: power consumption. I ran the same INT benchmark on my 4 U5S with the voltage set at 5.11V exactly on all of them. The UCA is quite precise at measuring power consumption. All of them were tested at a fixed 33.3 MHz frequency, no matter their rated maximum speed.

The results are quite interesting. As you can see, all my retail U5S draw (almost) the same current: about 308 mA at 33 MHz while running my benchmark code. For an unknown reason, the U5SD is a bit higher at 321 mA, but the difference in power is only about 65 mW. By contrast, the prototype U5S-33 needs MUCH more power to process the exact same code in the same time at the same frequency with the same voltage: 421 mA, which translates to about 2.15 watts.
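The conversion from measured current to power is simply P = V × I. A small sketch using the figures quoted above (the dictionary labels are just my shorthand):

```python
def power_watts(current_ma: float, voltage_v: float) -> float:
    """Power drawn by the CPU, from supply current (mA) and supply voltage (V)."""
    return current_ma / 1000.0 * voltage_v

VCC = 5.11  # volts, same setting for every chip in the comparison

readings_ma = {
    "retail U5S (typical)": 308,
    "U5SD": 321,
    "prototype U5S-33": 421,
}

for name, current in readings_ma.items():
    print(f"{name}: {power_watts(current, VCC):.2f} W")
```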

The most obvious explanation is a switch in manufacturing process between the early prototype and the commercial revision. UMC was, at that time, one of the two major IC manufacturers in Taiwan, along with TSMC, so we can assume it could switch easily between processes. In the very first U5S datasheet, published in 1993, UMC indicates that the U5S is built on a 0.6 µm CMOS process. This is consistent with the power consumption seen on the sample. I was able to find a table of the manufacturing process evolution in Taiwan in the 90s.

UMC and TSMC switched from 0.6 µm in 1993 to 0.4 µm in 1994. According to this table, it seems likely that the prototype is built using the ’93 process (0.6 µm) while UMC switched to the ’94 process (0.4 µm) for its retail volume production. The retail U5S tested here need about 27% less power than the prototype. The theoretical reduction in power consumption between 0.6 µm and 0.4 µm is 33% (0.4/0.6 ≈ 0.67, assuming power scales linearly with feature size), so that makes perfect sense. This engineering sample is probably an early U5S manufactured with the original 1993 0.6 µm process. It is unknown at this point whether retail (non-prototype) UMC Green CPUs were ever built with that process. Another analysis with a retail U5S produced earlier than August 1994 (week 33’94) would allow us to be sure…
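The two percentages can be checked in a few lines. This sketch assumes the simplest possible model, where power scales linearly with feature size; real process shrinks are more complicated, so treat the "theoretical" figure as a rough yardstick only.

```python
def reduction_pct(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

# Measured: supply current in mA at identical voltage and frequency.
measured = reduction_pct(421, 308)

# Estimate: linear scaling with feature size in µm (an assumption, see above).
linear_shrink = reduction_pct(0.6, 0.4)

print(f"measured: {measured:.1f}%   linear-shrink estimate: {linear_shrink:.1f}%")
```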

Identifying “blank” 486s with the UCA #1

The Universal Chip Analyzer is useful to test and spot counterfeit CPUs, but also to help identify CPUs without markings. Missing markings can be caused by poor-quality ink that gradually faded over the years, by abrasion against other ICs (common when you save a nice CPU from a “scrap lot”) or by the chip being an early engineering sample (prototype). Here are two examples.

Let’s start with the first one.

It’s supposed to be an early engineering sample coming from ST Microelectronics. Handwritten on top are “X2, Y4 #7”, probably the coordinates of this particular die on the wafer (X=2, Y=4) and the wafer number (#7), and also “PLL”, which probably means it was made to test the integrated Phase-Locked Loop (clock multiplier). The back of the CPU shows that the PLL was configured for “3XCLK”, so it’s a DX4-class CPU. But it could also be a 5×86 ES. I have tested it on the UCA at 3.45V.

All 486s from ST are just rebranded Cyrix 486s and this one is no exception. It identifies itself as a Cyrix Cx486DX4 (M0.7). The most interesting point is the stepping. I have an It’s ST ST486 DX4-100 (you can see it here) and a Cyrix Cx486DX-100GP4. Both come with stepping 3.6. This sample uses stepping 4.0. I do not have any Cyrix 486s (or IBM or It’s ST) with such a late stepping. I am not even sure this stepping ever reached commercial status. This sample works perfectly fine at 120 MHz (3×40 MHz) with 3.45V, while my Cyrix DX4-100 requires 3.6V to work at 120 MHz and the retail It’s ST doesn’t work at all when overclocked to 120 MHz.

At this point, I have no proof that this sample comes from ST and not directly from Cyrix. Anyway, it could be an engineering sample for a hypothetical Cx486DX4-120 that was finally canceled to avoid hurting 5×86 sales. Interesting.

Here is the other one.

The package marking (25253) tells us it’s an early AMD 486 assembled by Kyocera, but there is nothing more written on the top or back of the chip. This package number is most often used for 3.3V parts (while 5V parts from the same era come in the 25220 package). Time to plug it into the UCA!

Early AMD 486s use the Intel 486 microcode, so they’re virtually indistinguishable by software. I’m testing a very nice way to distinguish them but that’s another story (I’m waiting for a new PCB and I’ll tell you more if it works as expected). The CPU has 8 KB of L1 write-through cache and doesn’t support write-back, so it’s a (N)V8T revision and not a later SV8B. It doesn’t support the 3x multiplier, so it’s a DX2 and not a DX4. Testing various frequencies shows it works fine with a 40 MHz bus clock. This unmarked CPU is probably an Am486DX2-80 NV8T, or maybe an Am486DX2-66 NV8T with good overclocking headroom. Nothing suggests it’s an engineering sample. The markings have probably faded over time (or have been removed by mechanical action).

[UCA CPU Analysis] Intel 486 DX2-66 SYE36 ES

I’m starting a new section: IC Analysis! The goal is to study odd or rare CPUs with the Universal Chip Analyzer. As an avid CPU collector, I have many of them. Since I started collecting back in the early 00s, I have only been interested in Engineering Samples. These are basically prototypes of retail CPUs. Knowing their specifications is often very interesting for historical purposes (i.e., to retrace the timeline of the development).

Let’s start with this 486 DX2-66 Engineering Sample:

This processor is uncommon in many respects, even for an engineering sample. It comes with the standard Intel i486 DX2 logo but other writings are printed instead of being laser-engraved. The part number on the first line (“A80486DX2-66”) is the retail one, while Intel often used the code number (“P24” or “A80P24”) on early prototypes of the 486DX2.

The second line shows the date when the die (the piece of silicon on which the CPU is etched) was assembled inside the ceramic package: week 22 of 1992, so between May 25th and May 31st, 1992. The date when the die itself was produced is marked on the back: week 17 of 1992 (between April 20th and 26th, 1992). Intel officially introduced the first clock-doubled 486DX2 at 50 MHz on March 3rd, 1992. The 486DX2 at 66 MHz was launched five months later, on August 10th, 1992. This sample was produced before the initial production of the 486DX2-66.
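Converting a year/week date code into calendar dates is easy to get wrong by hand, so here is a small sketch. It assumes the date code follows ISO 8601 week numbering (weeks start on Monday), which reproduces the ranges given above; I have no confirmation that Intel used exactly that convention. It needs Python 3.8 or later for `date.fromisocalendar`.

```python
from datetime import date, timedelta
from typing import Tuple

def week_range(year: int, week: int) -> Tuple[date, date]:
    """Return the Monday and Sunday of the given ISO week."""
    monday = date.fromisocalendar(year, week, 1)
    return monday, monday + timedelta(days=6)

for label, week in (("assembly", 22), ("die", 17)):
    start, end = week_range(1992, week)
    print(f"{label}: week {week} of 1992 = {start} to {end}")
```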

Another very rare feature of this CPU is the Intel product spec number used. From the 70s until today, Intel has used a 5-character alphanumeric code (named “S-Spec”) to identify all their retail products. An S-Spec always starts with the letter “S” (e.g., SX366 is an 80386 DX-33 and SR147 is a Core i7 4770K). The only exception is for prototypes (engineering or qualification samples), where the code begins with a “Q”. The presence of that “Q-Spec” (also named QDF) on a CPU is the most effective way to distinguish a pre-production sample from its standard production counterpart. On this obvious engineering sample (also marked “ES” on the front), the spec code nevertheless starts with an S: “SYE36”. For a very short period (1991/1992), Intel produced some 386/486 engineering samples with a spec code starting with “SXE”, “SYE” and “SZE”. The reason is still unknown, but this sample is one of them.
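The rules above can be written down as a rough classifier. This is only a heuristic sketch of the prefixes described in the text: the function name is mine, and I cannot rule out that some retail S-Spec also happens to start with “SXE”, “SYE” or “SZE”, so the first branch should be read as "worth a closer look", not as proof.

```python
def classify_spec(code: str) -> str:
    """Classify an Intel spec code by its prefix (heuristic, see caveats above)."""
    code = code.strip().upper()
    if code[:3] in ("SXE", "SYE", "SZE"):
        return "possible 1991/1992 engineering sample"
    if code.startswith("Q"):
        return "pre-production sample (Q-Spec)"
    if code.startswith("S"):
        return "retail (S-Spec)"
    return "unknown"

for code in ("SYE36", "SX366", "SR147"):
    print(code, "->", classify_spec(code))
```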

It’s now time to test this SYE36 sample with the UCA:

And it works fine! This early sample does not support the CPUID instruction, but the value at reset is 0x433. The first commercial stepping is A2, with a CPUID of 0x432. Only a DX2-50 was released with this stepping, which didn’t seem able to run properly at 66 MHz. This sample uses the B1 stepping, like the first retail 486 DX2-66 (SX645) released. Power consumption measured in FPU benchmark mode is quite high (4.3 W) but still within specs (4.5 W). Later DX2-66s need less energy.

Despite its unusual markings, it seems this sample was a qualification sample rather than a “true” engineering sample. It was probably sent to Intel’s customers for validation some weeks before the official launch. Other than that, it’s strictly identical to a SX645 486 DX2-66.

The UCA 486 Adapter now supports UMC 486s!

Last but not least, UMC’s 486s can now be tested on the Universal Chip Analyzer. All 486s ever manufactured are now supported! UMC is a Taiwanese semiconductor company still active today, albeit much smaller than its well-known competitors (TSMC, GlobalFoundries, …). In the mid-90s, UMC produced some rare 486-compatible CPUs named “UMC Green CPU”. They were an in-house design and not Intel-licensed like AMD 486s. Almost immediately after the initial announcement in 1993, Intel sued UMC and its distributors over patent infringements (including the infamous ‘338 patent, read more here). In response, UMC filed an antitrust suit against Intel, but finally chose to give up and cease production of x86-compatible CPUs. Here are the retail UMC 486s I have:

UMC Green CPU

The most common one was the U5SX (sometimes marked U5S), clocked at 25, 33 or 40 MHz and without FPU. Some UMC Green CPUs labeled “U5SD” were also released. Contrary to what you might think, they don’t include an FPU (verified with the UCA) but are just supposed to use the “DX pinout”. The meaning of this mention is unclear because the i486SX and the i486DX share the same layout for non-FPU-related pins. Some people have claimed that the difference is a relocation of the NMI pin, but I have not noticed that. UMC also released the extremely rare U5D (with FPU) and U486DX2 (clock-doubled with FPU), but only a couple of samples are known worldwide today.

U5S SUPER-40 Tested on the UCA

The Universal Chip Analyzer is able to test all UMC 486s, even if I still have an issue capturing every I/O at 33 MHz and 40 MHz. Code fetch works fine, but the outputs on I/O ports are sometimes dropped. It’s probably easy to solve, but I really need a copy of the UMC 486 manual / datasheet. Unfortunately, it seems nobody in the CPU collectors’ community has one. If you can help, you’re more than welcome!

Looking at threads about the UMC 486 on cpu-world, vogons and vcfed, I saw that some people were wondering if the CPUs marked “SUPER” were different from the “non-SUPER” ones. So, I ran some INT benchmarks with the UCA.

As you can see, all UMC CPUs (U5S-SUPER, U5SX and U5SD) give the exact same results when clocked at the same frequency. You also probably noticed the awesome relative performance of the UMC 486 versus the Intel 486s. Clocked at 33 MHz, the UMC 486 is as fast as an Intel 486DX2-66 and, when clocked at 40 MHz, it’s almost as fast as an Intel DX4-75. That’s not a bug. The UCA doesn’t have any wait states on the memory subsystem and the benchmark uses a mix of heavy integer divide and multiply instructions. This result comes from the ultra-fast ALU designed by UMC, especially on divides. While Intel 486s require 40 cycles to perform an integer divide, UMC 486s only need 7 cycles. That’s more than 5 times faster! However, in real-world applications, the UMC 486-SUPER at 40 MHz was on par with an Intel i486SX2-66. Still excellent!
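To see why a divide-heavy benchmark flatters the UMC chip so much, it helps to convert the cycle counts into time per divide. This sketch uses only the cycle counts quoted above and assumes every divide takes exactly that many core clocks, which is a simplification.

```python
def divide_time_ns(cycles: int, core_mhz: float) -> float:
    """Time for one integer divide, given its cycle count and the core clock."""
    return cycles / core_mhz * 1000.0

umc_33 = divide_time_ns(7, 33.3)      # UMC U5S at 33.3 MHz
intel_66 = divide_time_ns(40, 66.6)   # Intel DX2 at 2 x 33.3 MHz

print(f"UMC U5S   @ 33.3 MHz: {umc_33:.0f} ns per divide")
print(f"Intel DX2 @ 66.6 MHz: {intel_66:.0f} ns per divide")
print(f"cycle-count ratio: {40 / 7:.1f}x")
```

Even at half the core clock, the UMC part finishes each divide well ahead of the DX2, which is why it can match the DX2-66 on a benchmark that also contains slower (for UMC) instructions.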

The UCA now supports Ti486s (featuring the interesting Ti486SXL)

In the 90s, Texas Instruments (TI) manufactured some 486-class CPUs under its own brand. TI was one of the third parties that produced Cyrix processors (Cyrix was a fabless company). Consequently, the vast majority of TI-branded CPUs were just rebranded Cyrix 486s in PGA132 (386 pinout) or PGA168 (486 pinout). But Texas Instruments also designed its own 486 micro-architecture: the short-lived TI486SXL.

Good news: the Universal Chip Analyzer with the 486 adapter now supports all TI 486s!

Let’s start with the fastest one, the TI486DX4-G100-GA.

TI486DX4-G100-GA

The “Colorful” Ti486DX4 is a rebranded Cyrix 486DX4-100: clock-tripled with 8 KB of L1 WB cache. Compared to the IBM 486DX4 or ST 486DX4, both also rebranded Cyrix parts, the Ti486DX4 includes a tiny difference. IBM and ST CPUs are indistinguishable from a Cyrix CPU, but Texas Instruments insisted on adding a way to distinguish their CPUs. This was implemented by setting a bit (DIR1[7]) in one of the Cyrix-specific registers. If the bit is set, the CPU is Ti-branded. If the bit is clear, it’s a Cyrix, IBM or ST 486. Other than that, nothing changed.
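Once DIR1 has been read from the CPU, the brand test described above is a single bit mask. A minimal sketch (the function name and the two sample values are illustrative, not readings from real chips):

```python
TI_BRAND_BIT = 1 << 7  # DIR1[7], as described in the text

def is_ti_branded(dir1: int) -> bool:
    """True if the Ti identification bit is set in a DIR1 value."""
    return bool(dir1 & TI_BRAND_BIT)

# Hypothetical DIR1 values, for illustration only.
print(is_ti_branded(0x80))  # bit 7 set
print(is_ti_branded(0x36))  # bit 7 clear
```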

TI also released some 486 DX2, like this TI486DX2-G80-GA.

TI486DX2-G80-GA

Again, it’s just a rebranded clock-doubled Cyrix 486DX2 with a specific identification bit set. The DX2-80 seems more common than the DX2-66. The Universal Chip Analyzer is perfectly able to test it at 2×40 MHz.

But the most interesting Texas Instruments 486s are the ones based on their own micro-architecture like this TI486SXL2-G66-GA.

TI486SXL2-G66-GA

Codenamed “Potomac”, this clock-doubled CPU with 8 KB of L1 cache doesn’t integrate an FPU. Ti was probably thinking of developing an FPU later. The mechanism to enable the clock-doubling PLL differs from that of all other CPUs, which use a specific pin to toggle between 3x and 2x (DX4s) or just boot at 2x by default (DX2s). The Ti 486SXL2 powers up in non-clock-doubled mode. To enable the 2x PLL, you have to set a specific bit in a proprietary register called CCR0 (with the same mechanism involving reads and writes to ports 0x22/0x23 as on Cyrix 486s). Setting bit 6 of CCR0 switches on the clock-doubling PLL almost instantly (within 20 µs). I was able to overclock this CPU to 80 MHz at 2×40 MHz (still at 3.45V).
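The index/data port sequence is easier to follow as code. The sketch below simulates the register file in Python instead of touching real hardware: `FakeCyrixRegisters` is a stand-in I made up for illustration, and the CCR0 index value 0xC0 is an assumption based on the Cyrix 486 convention, not something taken from the UCA code. The point is the read-modify-write pattern: write the register index to port 0x22, then read or write the data at port 0x23, with a fresh index write before each data access.

```python
class FakeCyrixRegisters:
    """Simulated index (0x22) / data (0x23) port pair. Illustration only."""
    INDEX_PORT = 0x22
    DATA_PORT = 0x23

    def __init__(self):
        self.regs = {}
        self._index = None

    def outb(self, port: int, value: int) -> None:
        if port == self.INDEX_PORT:
            self._index = value
        elif port == self.DATA_PORT and self._index is not None:
            self.regs[self._index] = value & 0xFF
            self._index = None  # each data access needs a new index write

    def inb(self, port: int) -> int:
        if port == self.DATA_PORT and self._index is not None:
            value = self.regs.get(self._index, 0)
            self._index = None
            return value
        return 0xFF

CCR0_INDEX = 0xC0          # assumption: same index as on Cyrix 486s
CLOCK_DOUBLE_BIT = 1 << 6  # CCR0 bit 6, as described in the text

def enable_clock_doubling(bus) -> None:
    """Read-modify-write CCR0 so that only bit 6 changes."""
    bus.outb(0x22, CCR0_INDEX)
    ccr0 = bus.inb(0x23)
    bus.outb(0x22, CCR0_INDEX)
    bus.outb(0x23, ccr0 | CLOCK_DOUBLE_BIT)

bus = FakeCyrixRegisters()
bus.regs[CCR0_INDEX] = 0x02  # arbitrary starting value for the demo
enable_clock_doubling(bus)
print(hex(bus.regs[CCR0_INDEX]))
```

Preserving the other CCR0 bits matters because the same register also holds unrelated configuration bits.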

Texas Instruments also released non-clockdoubled “Potomac” CPUs like this Ti486SXL-40.

Ti486SXL-40

As you can see on the UCA Analyzer screenshot, there seems to be no way to distinguish them from the SXL2 from a hardware point of view. Of course, it’s possible to check whether CCR0 bit 6 is set to know if clock-doubled mode is enabled, but the SXL2 can also work as an SXL in 1x mode.

While messing with CCR0 and the x86 code that runs on the CPU, I found something very interesting: my SXL-40 also supports the clock-doubling mode! I had to run several benchmarks to be sure, but yes: that Ti486 SXL-40 can work as an SXL2 with the exact same power consumption and the same performance.

Ti486SXL - Benchmarks

I used the preliminary Benchmark Mode on the UCA. These integer scores are not calibrated, so they’re only valid to compare these CPUs on a relative scale. We can see that the Ti SXL microarchitecture is ~30% slower than the Intel 486 one at equal clock frequency, but the SXL-40 is just 18% slower than an Intel 486DX-33. The Ti SXL2-66 is much faster than the DX-33, but the i486DX2-66 is far ahead.

Finally, it seems the SXL and SXL2 – at least on B0 stepping – are the exact same CPU. The B0 stepping is probably the only one that can work in 2x mode. “Potomac” A0 doesn’t support SMI and probably doesn’t have the integrated PLL. I also tried to activate the 2x mode and keep the 40 MHz FSB on the Ti486SXL-40 and … it works, actually doubling the performance (similar to a hypothetical “486SXL2-80”).

Spotting Counterfeit Am486 with the UCA

While I was adding support for AMD CPUs on the Universal Chip Analyzer, I spotted what looked like a strange chip at first sight. I was then working on the L1 cache size detection, to distinguish between CPUs with 8 KB and others with 16 KB. In their BIOS Development Guide, AMD published specific code that checks the status of a tag bit in a test register (TR4). After implementing this test path in the x86 code run by the CPU on the UCA, I needed a 486 with 16 KB of L1 cache to try it on (5x86s were OK). I found this uncommon Am486:

This is a nice Am486 DX4-100V16BGI. This part number decodes as follows: a clock-tripled (“DX4”) CPU rated at 100 MHz (“100”) and 3.3V (“V”), with a 16 KB (“16”) Write-Back (“B”) L1 cache in a 168-pin PGA package (“G”), qualified for the Industrial temperature range (“I”). This last point is uncommon because the vast majority of Am486s are “Commercial” grade (0°C to 85°C) and not “Industrial” (-40°C to +100°C). That’s probably why I bought this CPU years ago.
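The decoding above can be sketched as a small parser. The regular expression and field names are my own, built only from the part numbers that appear in these posts; it is not AMD's official ordering-code grammar. Two assumptions are flagged in the comments: “T” meaning write-through (inferred from the NV8T parts) and “C” meaning commercial grade.

```python
import re

# Pattern inferred from part numbers such as DX4-100V16BGI and DX2-80NV8T.
PART_RE = re.compile(
    r"(?P<core>[SD]X\d?)-(?P<mhz>\d+)"
    r"(?P<variant>[NS]?)(?P<voltage>[VW])(?P<cache_kb>\d+)(?P<policy>[TB])"
    r"(?P<package>[A-Z])?(?P<grade>[A-Z])?$"
)

POLICY = {"T": "write-through", "B": "write-back"}   # "T" is inferred from NV8T parts
GRADE = {"C": "commercial", "I": "industrial"}       # "C" is an assumption

def decode_am486(part: str) -> dict:
    """Split an Am486 part number into the fields described in the text."""
    match = PART_RE.search(part.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized part number: {part!r}")
    f = match.groupdict()
    return {
        "core": f["core"],
        "rated_mhz": int(f["mhz"]),
        "voltage_code": f["variant"] + f["voltage"],
        "l1_cache_kb": int(f["cache_kb"]),
        "cache_policy": POLICY[f["policy"]],
        "package_code": f["package"],
        "temperature_grade": GRADE.get(f["grade"], f["grade"]),
    }

print(decode_am486("Am486 DX4-100V16BGI"))
print(decode_am486("Am486DX2-80NV8T"))
```

A parser like this only tells you what the marking claims. Comparing that claim with what the CPU actually reports is what exposes a remark.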

But the AMD code was not working: the size of the cache detected was 8 KB instead of 16 KB. I began to have doubts about the genuineness of this CPU. I started to play with the UCA. No way to enable Write-Back: the CPU stays in Write-Through Mode and the CPUID does not change accordingly as on “SV8B” AMD 486s. This CPU does not support Write-Back. I suspected a remarked early “NV8T” DX4-100, but that was not the case: they come with a CPUID 0x484 and this CPU was 0x482 in 3x Mode and 0x432 in 2x Mode.

I was able to find a very early Am486DX2-80 V8T (notice the lack of “N”) manufactured in 1994 with the first A-Stepping. The UCA detects a CPUID of 0x432, which matches my fake DX4 (in 2x Mode). Early Am486DX4-100 V8T parts also exist, with a CPUID of 0x482 in 3x Mode. Some of them seem to have been later remarked as Am486 DX4-100V16BGI.

On closer inspection, several points should have caught my attention about this CPU. No way to be certain of what it really was without the UCA, but the fact that it was a fake could have been known sooner.

    1. Package code is wrong

The AMD package code is written at the bottom left of all AMD CPUs from this era. The first AMD Am486s, like the Am486DX-33/40 or very early Am486SX2/DX2s, use the “24361” package. Later 486DX2 “V8T” and “NV8T” CPUs come in the “25220” or “25253” package. Enhanced “SV8B” DX4s (with SMI and Write-Back) are assembled in the “25398” package. Then we have package “25498” for newer CPUs like the Am486DE2. Later models (SV16B and 5×86) use the “25544” package. The latter was expected for a genuine Am486DX4-100V16BGI, but the fake CPU comes with an old “25253” (N)V8T package.

Package code is “25253”, similar to old (N)V8T Am486

    2. Markings without hatching

As you can see in the picture below, AMD markings on CPUs from this era use a typical hatching pattern. This pattern is not present at all on the fake CPU.

    3. Marking error

But the most obvious error is a big mistake on printing. Here you can see the word “COMPATIBLE” is actually spelled “COMPATTBLE”, with a double “T”.
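Of the three checks, the package-code one is the easiest to automate. Here is a sketch of a lookup built from the package codes listed earlier; the table only contains the pairings mentioned in this post, so it is certainly incomplete, and the function name is my own.

```python
# Package codes expected for each marking suffix, from the list above.
EXPECTED_PACKAGES = {
    "V8T": {"25220", "25253"},
    "NV8T": {"25220", "25253"},
    "SV8B": {"25398"},
    "SV16B": {"25544"},
    "V16B": {"25544"},
}

def package_consistent(marked_suffix: str, package_code: str):
    """True/False if the package fits the marking, None if the suffix is not in the table."""
    expected = EXPECTED_PACKAGES.get(marked_suffix)
    if expected is None:
        return None
    return package_code in expected

# The chip analyzed here: marked V16B, but in a 25253 package.
print(package_consistent("V16B", "25253"))
```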

There is no doubt at this point that this CPU is a counterfeit Am486DX4. The only remaining question is when it was remarked. Counterfeit CPUs (especially 486s) were common in the 90s, usually remarked to a higher frequency, but here the original CPU was already an Am486DX4-100 (albeit a very early one with 8 KB of L1 Write-Through cache instead of the expected 16 KB of L1 Write-Back cache). More recently, in the mid-2010s, old CPUs from the 90s were also faked to target CPU collectors all over the world.

Looking at eBay listings right now (2020-04-23), I found 4 vendors selling Am486 DX4-100V16BGI for a (very) high price. Two of them, including one who only sells lots of 30 pcs, are obviously the same fake as the sample analyzed here. The other two look different but still highly suspicious: the first has a Windows logo that doesn’t match AMD’s usual Windows printing, and the second uses a very odd font (and also seems to be marked “COMPATIBLF”).

Collectors beware of these CPUs!

The UCA 486 Adapter now supports Cyrix/IBM/ST 486s & 586s

Along with AMD, Cyrix was one of the biggest Intel challengers in the 486 era. While most of the AMD Am486s used the exact same microcode as Intel 486s, Cyrix was the first to release a 100% compatible processor based on a custom design. Because Cyrix was a fabless company, Cx486s were manufactured by IBM, ST Microelectronics and Texas Instruments. All of them sold Cyrix 486s under their own brand.

Adding support for Cyrix-based 486 and 586 was more challenging than expected. As many of you probably remember, 486 motherboards were full of jumpers because of the many different pinouts. I want the UCA to be able to test every CPU out of the box without messing with jumpers, so I had to use many tricks to accommodate the different pinouts. I also wasted a lot of time trying to understand the erratic bugs I had when adding more x86 code to detect Cyrix CPUs. The cause was finally obvious, but I had a hard time spotting it: two address lines (A11 & A9) had been swapped in the FPGA code for more than a year!

This stupid typo came on top of another Cyrix-only peculiarity I had to deal with. All the HDL code I wrote for the UCA is focused on achieving 0 wait states. Unfortunately, when I started to work on support for the Cx486, it crashed almost instantly, even at low frequencies. I rewrote a lot of Verilog to get timing that matches the original Intel datasheet almost perfectly, but the Cx486 kept crashing. I had to wire everything to my 32-channel logic analyzer to understand why all Cyrix 486s failed to work on the UCA. The answer is shown on this screenshot:

Cyrix added an unexpected (normally chipset-related) mechanism that adds hardware wait states to every I/O. And not just a couple of them: 32 clock cycles for every I/O by default! The state machine that handles the decoding of CPU cycles inside the FPGA couldn’t understand why the CPU didn’t resume operation after an I/O and assumed a timeout had happened. As soon as I changed the HDL code to handle this case, Cyrix CPUs started to work properly on the UCA. I could have saved myself a lot of effort if I had RTFM more carefully: this behavior is indeed described in the 5×86 CPU BIOS Writer’s Guide, page 12:
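The failure mode can be illustrated with a toy model. This is not the UCA's Verilog, just a Python sketch of a tester that waits a fixed number of clocks for the CPU to resume after an I/O. Only the 32-clock figure comes from the text; the watchdog values are hypothetical.

```python
CYRIX_DEFAULT_IO_DELAY = 32  # clock cycles per I/O, from the text

def io_cycle_outcome(delay_clocks: int, watchdog_clocks: int) -> str:
    """Toy model: count clocks until the CPU resumes, or give up at the watchdog limit."""
    for clock in range(1, watchdog_clocks + 1):
        if clock > delay_clocks:
            return f"resumed after {clock} clocks"
    return "timeout"

# Hypothetical watchdog limits: one shorter than the Cyrix delay, one longer.
print(io_cycle_outcome(CYRIX_DEFAULT_IO_DELAY, watchdog_clocks=8))
print(io_cycle_outcome(CYRIX_DEFAULT_IO_DELAY, watchdog_clocks=64))
print(io_cycle_outcome(0, watchdog_clocks=8))  # a CPU with no added delay
```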

Maybe I’ll add a software path later to change this setting with the UCA Analyzer tool. Messing with Cyrix-specific internal registers on the fly is an upcoming feature already planned! As with AMD, I grouped all the non-ES Cyrix CPUs I had in a tray and started testing.

(1) Let’s start with the Cyrix Cx486S-40, one of the first 486-class CPUs released by Cyrix in March 1993. It features 2 KB of write-back L1 cache, quite unusual for the time. The CPUID at reset is 0x450, which does not correspond to any Intel 486 (i486SX are 42x). Power consumption is quite high. Also note that the screenshot was taken at 25 MHz for a rated maximum clock of 40 MHz. For an unknown reason that deserves a longer investigation, very early Cyrix 486s like this one cannot run at 33 MHz or more on the UCA with the current HDL code. Maybe it’s due to added electrical interference from the logic analyzer, or maybe it comes from a regression in the code after I messed with timings, but that Cx486S-40 was able to run at 40 MHz some days ago, so I’m quite confident it will be fixed soon. I was just too lazy to unwire everything to take the screenshot.

(2) Cyrix Cx486DX2-66. A clock-doubled 486 with FPU. CPUID after reset is 0x480 (similar to Intel DX4s) but the CPU does not support the CPUID instruction. Cyrix CPUs have two registers named DIR0 and DIR1 for identification. This one contains 0x1B in DIR0, the value for a Cx486DX2. DIR1 contains 0x0B. DIR1[7:4] is the “CPU Step Identification Number” (here 0x0) and DIR1[3:0] is the “CPU Revision Identification” (here 0xB, or 11 in decimal). The actual “Cyrix stepping” is therefore 0.11. This CPU is marked A3CM434M and was manufactured in week 34 of 1994. It’s an early example. Like the Cx486S, it does not work at more than 25 MHz on the UCA yet (but it will soon).
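The DIR0/DIR1 decoding just described fits in a few lines. A sketch with my own function name: the DIR0 table contains only the single value quoted above, since I don't want to guess the others.

```python
DIR0_NAMES = {0x1B: "Cx486DX2"}  # only the value quoted in the text

def decode_dir(dir0: int, dir1: int) -> dict:
    """Turn DIR0/DIR1 into a device name and a 'step.revision' string."""
    step = (dir1 >> 4) & 0xF   # DIR1[7:4]
    revision = dir1 & 0xF      # DIR1[3:0]
    return {
        "device": DIR0_NAMES.get(dir0, f"unknown (0x{dir0:02X})"),
        "stepping": f"{step}.{revision}",
    }

print(decode_dir(0x1B, 0x0B))
```

With the same encoding, a DIR1 of 0x32 would read as stepping 3.2 and 0x36 as 3.6.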

(5) IBM “Blue Lightning DX2” 486-V666GA. An IBM-branded Cyrix 486DX2-66. They are strictly identical from a microarchitectural point of view but are supposed to come with a stricter QC (Quality Control). This one is a 3.45-3.6V part, and not a 5V CPU like the previous one. It is also much newer (manufactured in March/April 1995). Stepping/Revision is 3.2. No problem running it on the Universal Chip Analyzer at 66 MHz (2 x 33.3 MHz).

(6) IBM 486 DX4 / 486-4V3100GIC. Well, I can’t remember where this CPU came from, but it doesn’t work. Not a single sign of life on the UCA, nor on a standard 486 motherboard. The power drawn seems linked to the clock signal applied (so the internal die is not shorted), but when wired to the logic analyzer, not a single pin toggles after reset. Unfortunately, it looks dead. 🙁

(7) It’s ST ST486 DX2-66. While IBM-branded Cx486s are often known for their higher QC (and higher overclocking), ST’s 486s are usually less overclockable. This CPU was manufactured in February 1995 but still uses stepping 0.12, a single step newer than the very old Cyrix Cx486DX2-66 but much older than the IBM Blue Lightning DX2. It works as expected at 66 MHz.

(8) It’s ST ST486 DX2-80. This part is very close to the IBM Blue Lightning DX2. It uses the same 3.2 Stepping but works at 5V instead of 3.45V. Power consumption is quite high (~4.5W) and it runs hot. CPUID is 0x480. No problem to have it running on the UCA at 80 MHz (2 x 40 MHz).

(9) It’s ST ST486 DX4-100. Very late 3.45V clock-tripled CPU manufactured in 1997. The stepping is 3.6, which corresponds to the latest Cyrix 486 revision ever produced. CPUID is still 0x480 and the L1 cache is limited to 8 KB Write-Back (instead of 16 KB for the latest Intel 486 DX4s). It runs fine at 66 MHz (2×33.3 MHz) and 100 MHz (3×33.3 MHz).

(3) Cyrix 5×86-100GP. The 5×86 is a short-lived, stripped-down version of the Cyrix 6×86. It features 16 KB of L1 Write-Back cache and a 5th generation (Pentium class) microarchitecture. The vast majority of 5×86 processors run at 2x or 3x multipliers. This example is a quite early “1.3” revision. CPUID changes from 0x429 at 2x to 0x42D at 3x. It can work on the UCA at 120 MHz (3×40 MHz) with 3.6V.

(4) Cyrix 5×86-120GP. Some late (and rare) 5x86s are able to work with a 4x multiplier (in addition to the default 3x multiplier). For some unknown reason, the revision/stepping drops to 0.5 even though the CPU was manufactured well after the previous one (in 1996). CPUID at 4x is 0x42C (and stays at 0x42D at 3x). Here is that nice 5×86-120 running at its rated 120 MHz (3×40 MHz) and then overclocked to 133 MHz (4×33.3 MHz) @ 3.6 volts.

For fun, I also tried to increase the voltage to 3.7V before restarting the UCA at … 160 MHz (4×40 MHz)! To my surprise, it successfully completed a test pass. I stopped to avoid any damage to the CPU, but that was probably the fastest pass ever run on the UCA.

Awesome!

Next step is to add support for the remaining 486 brands (and solve the frequency regression on early Cx486s). I also have a nice feature upgrade for the 486 adapter planned soon. Stay tuned!

The UCA 486 Adapter now supports AMD 486 & 5×86

After the initial support for Intel 486s, the Universal Chip Analyzer with the new 486 adapter now supports all 486s from AMD. I’m an avid CPU collector but I only collect engineering samples (check my collection here!). Of course, some analysis of these ES will be published here soon, but to add support for AMD 486s, I bought some “retail” Am486 and Am5x86. Here they are:

Good news: they all work well on the UCA! Here are some notes I took while testing.

Am486DX2-50 (1) and Am486DX2-80NV8T (2): 8 KB L1 Write-Through Cache. Both have CPUID 0x432, with the CPUID instruction not supported. Virtually indistinguishable from an Intel DX2: same microcode, same power consumption, exact same performance. Maybe distinguishable from an Intel DX2 with JTAG. Work in progress on this point.

Am486DX2-80NV8T

Am486DX4-100NV8T (3): 8 KB L1 Write-Through Cache. CPUID also 0x432 without CPUID instruction. Real nightmare to distinguish from Intel. B-Step a bit lower power (-5%) vs AY-Step. INT performance is lower than Intel DX4 (-20%). FP performance almost identical. Power consumption is also lower (-25%). One interesting thing: in 2x mode, the CPU is exactly as fast as an Intel DX2. However, in 3x mode, it seems significantly slower than a DX4, but only in INT. Performance looks like 2.5x in INT (according to cycle count) and “real” 3x in FP. Strange. That deserves some additional investigation.

Am486DX4-120SV8B (4) and Am486DX4-100SV8B (5): Newer core with SMI and Write-Back L1 Cache (still 8 KB). CPUID instruction supported. 2x/3x Mode and WT/WB change CPUID (0x434/0x474 in 2x mode, 0x484/0x494 in 3x mode). Exact same performance as NV8T. Slightly higher power consumption (+10%).

Am486DX4SV8B in 2x Write-Through Mode
Same Am486DX4SV8B, in 3x Write-Back Mode
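Since the SV8B signature depends on both the multiplier and the cache mode, a lookup table is a convenient way to read it back. The values are the four listed above for the SV8B parts. I am reading each “WT/WB” pair as write-through first, write-back second, and the function name is mine.

```python
# CPUID signatures reported by Am486DX4 SV8B parts, per the notes above.
SV8B_MODES = {
    0x434: ("2x", "write-through"),
    0x474: ("2x", "write-back"),
    0x484: ("3x", "write-through"),
    0x494: ("3x", "write-back"),
}

def describe_sv8b(signature: int) -> str:
    """Describe the multiplier and cache mode encoded in an SV8B signature."""
    mode = SV8B_MODES.get(signature)
    if mode is None:
        return f"0x{signature:03X}: not an SV8B signature listed here"
    multiplier, cache = mode
    return f"0x{signature:03X}: {multiplier} mode, {cache} L1"

for sig in (0x434, 0x494, 0x432):
    print(describe_sv8b(sig))
```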

Am5x86-P75 (Am486DX5-133V16BGC) (6). 16 KB Write-Back L1 Cache! CPUID instruction supported. 3x/4x Mode and WT/WB change CPUID (0x484/0x494 in 3x mode, 0x4E4/0x4F4 in 4x mode). INT performance identical to 486DX4-100 (looks like INT units are locked in 3x mode). FPU performance is far better (real 4x). Doesn’t support overclocking to 160 MHz.

Am5x86-P75 In 4x Write-Back Mode

Am5x86-P75 (Am486DX5-133W16BGC) (7). Same as the previous one, but supports overclocking to 160 MHz with 3.6V!

Am5x86-P75 at 4×40 MHz = 160 MHz @3.6V

More 486s will soon be supported!

The Universal Chip Analyzer, ready for up to 486!

When I started studying FPGAs more than two years ago to build a simple IC tester, I didn’t expect to support anything faster than the Intel 8086. The learning curve for Verilog has been quite steep since then, but I’m now much more comfortable with complex logic, state machines and timing diagrams. I also learned a lot about how old CPU architectures work in depth and how to interface vintage circuits with modern hardware. Acquiring this knowledge step by step is quite exciting, even if I still have a lot to understand.

By the end of 2018, I was able to successfully interface an Intel 486 DX-33. That was quite a challenge and the HDL code was horrible (and the electronics as well, to be honest), but I knew that it was possible for the modest FPGA I use (a Xilinx Spartan 6 LX9) to support up to 32-bit architectures with quite fast bus frequencies and 0 wait states. Awesome!

* Now Truly Universal

In 2019, I worked on a truly universal hardware platform that can support anything from the original Intel 4004 to much more advanced 32-bit CPUs like the Motorola 68040 or Intel 486 (and also MCU, RAM, the Xeon ID platform and almost anything else). The goal was to raise (a bit) the cost of the UCA base board to keep the top adapters as cheap (and easy to design) as possible. The Universal Chip Analyzer now consists of a 3-layer stack.

    • The first (bottom) layer is still the modified Mojo V3 development board with the Spartan 6 LX9 FPGA. I thought for a long time about redesigning the board, but the Embedded Micro team did quite a good job and cheap Mojo V3 clones are available. The main changes I made to the board are a much bigger Flash (256 Mb) to accommodate many different FPGA configuration files, a rewritten firmware to support all the new features and some tuning of the FPGA power stage.
    • The second (middle) layer is the heart of the Universal Chip Analyzer. It integrates all the bus transceivers, an internal management bus, and a much more advanced power delivery stage for the CPU under test. The bus transceivers can handle up to a 32-bit bus. The internal management bus can create interconnections between all the components (MCU, FPGA, etc.) spread over the 3 PCB layers. It’s used to bypass the FPGA for some later adapters and to support some nice features out of the box, like adapters with embedded displays. The most interesting part is the power stage. The first iAPX86 UCA Shield was powered by USB only. That’s not possible anymore because advanced CPUs sometimes require lower voltages (+3.3V, +3.45V or +3.6V) and much higher current. A DC-DC converter is now integrated, along with a precision current/voltage monitor circuit that is also able to act as a programmable fuse. The voltage converter can be set externally to 5V or 3.3V, but it can also be set by the UCA software to any voltage from 2.2V to 5.5V in 50 mV steps! I also added a standard 3-pin fan connector. Mandatory to keep some CPUs like 486 DX4s cool! PS: The final PCB color will be black. The green one is cheaper and faster for prototypes…
    • The third (top) layer is a simple passive adapter with just two 50-pin connectors and a socket (standard or ZIF) to accommodate the CPU under test. The UCA auto-detects the correct FPGA firmware to load according to the adapter plugged in (except for later adapters able to support more than one CPU family). A jumper is often present to set a fixed voltage (usually 5V) or to select a software-defined voltage.

This hardware platform was designed for many upcoming adapters and should not evolve any further. It allows a much faster development of adapters, both in hardware and in FPGA HDL code. Many of them are already in development – some almost finished – and they will be released throughout 2020. Some very cool adapters are planned and yes, even for non-CPU!

* Now with i486 support (and soon much more)

I started the development of this base platform with the fastest CPU with a 32-bit bus in mind: the clock-tripled 486 DX4. Because who can do more can do less. If the platform – and especially the FPGA – can support the DX4, it can also support every CPU down to the 4004. That was a giant step up from the Intel 8086, the fastest CPU supported on the first iAPX86 shield. The 486DX4 is much more advanced, way faster and much more complex to interface. Tricks like forcing the bus down to 16 bits or adding wait states were off the table.

Right now, the UCA supports any Intel 486 SX, DX, DX2 and DX4 CPU, from early engineering samples to QFP-on-adapter parts to late DX4s with write-back cache. The UCA is also able to set different bus frequencies, from 16 MHz to 40 MHz, with CPU frequencies up to 120 MHz (Intel 486 DX4-100 overclocked with a 40 MHz FSB).
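The 120 MHz figure is simply the bus clock multiplied by the internal multiplier of the clock-tripled DX4. A minimal sketch of that arithmetic, with the 16 to 40 MHz bus range treated as inclusive (an assumption) and a hypothetical helper name:

```python
# Bus frequency range quoted in the text for the UCA (assumed inclusive).
MIN_BUS_MHZ = 16
MAX_BUS_MHZ = 40


def core_clock_mhz(bus_mhz: float, multiplier: float) -> float:
    """Core clock = bus (FSB) clock x internal multiplier."""
    if not MIN_BUS_MHZ <= bus_mhz <= MAX_BUS_MHZ:
        raise ValueError(f"bus frequency {bus_mhz} MHz is outside the 16-40 MHz range")
    return bus_mhz * multiplier


# Clock-tripled DX4 on a 40 MHz bus, as in the overclocked example above.
print(core_clock_mhz(40, 3), "MHz")
```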

At this point, you probably wonder: what about the only 486-class CPU with a 50 MHz bus, the 486 DX-50? Well, it works fine at 40 MHz, but after many weeks of work, I’m now sure the UCA platform will not be able to support a “true” 50 MHz FSB. That’s just too close to the limits of the bus transceivers I use. That doesn’t mean the UCA will never support a 486 DX-50 running at 50 MHz (or even higher); it only means it will not support it without wait states. Back in the day, I don’t think there was a single 486 DX-50 based computer without wait states. When released, this specific CPU was well known for its instabilities and manufacturers had to add (many) wait states to make it work properly.

Now it’s time to talk about non-Intel 486s. The hardest part was to build an adapter able to support all the different 486 manufacturers out of the box (from AMD to Texas Instruments to Cyrix) without any DIP switches, despite the different pinouts they use. I’m confident I have now found the right hardware tricks and, even if they’re not working right now due to the lack of software implementation, the 486 Adapter for the UCA will (very) soon support all 486s. Yes, that includes the AMD & Cyrix 5x86, and some outsiders like UMC 486s and the elusive Texas Instruments 486 SXL2-66!

 

* Now Truly an Analyzer

I always wanted the UCA to be a true Analyzer (and not just a tester), able to dig into the microarchitecture by allowing the user to mess with internal registers and CPU pins. The path is quite complex because the data has to be transferred between several layers. The FPGA is directly connected to the CPU under test and loaded with custom HDL code (the “FPGA Firmware”). A microcontroller IP is inferred inside the FPGA and internally connected to various control signals, to the main RAM used by the CPU, and also to the external microcontroller of the base board. Some C code, written with the Xilinx SDK, is executed by this internal MCU to handle the communication between the FPGA parts and the external MCU. I call it the “iMPU Firmware”. More C code is needed on the external MCU (an ATMega32u4; this part is called the “UCA Firmware”) to pass the data to the USB connector. Finally, a Windows 10 program written in C# communicates with the UCA.

Here is the current UCA Analyzer tool, running with an Intel 486 DX4-100 overclocked to 120 MHz.

On the main tab: the testing status (INT or FP Test in progress, Pass, etc.), various version information on the internal firmware of the UCA, and the detected CPU under test with its current CPU & FSB frequencies, multiplier, voltage, current, power and process. You can also select a different frequency and reset the CPU.

On the specific 486 tab, the actual CPUID of the CPU (acquired from the CPUID instruction or at reset if CPUID instruction is not supported), various information on supported/enabled features and a work-in-progress “control” section to assert some pins of the CPU. For example, we have an Intel 486 DX4-100 in Write-Back mode with CPUID 0x490. If you set the cache to Write-Through mode, the CPUID for this CPU changes to 0x483. Many new features are planned here.
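For readers unfamiliar with these values, a three-digit 486 CPUID packs the family, model and stepping into one hex digit each. The helper below is a hypothetical sketch of that decoding, not the code used in the UCA software:

```python
def decode_cpuid_signature(signature: int) -> dict:
    """Split a 486-style CPUID signature into its 4-bit fields."""
    return {
        "family": (signature >> 8) & 0xF,   # bits 11..8
        "model": (signature >> 4) & 0xF,    # bits 7..4
        "stepping": signature & 0xF,        # bits 3..0
    }


# The two values reported by the DX4-100 in the example above.
for label, signature in (("Write-Back", 0x490), ("Write-Through", 0x483)):
    print(label, hex(signature), decode_cpuid_signature(signature))
```

With this decoding, switching the cache mode on that DX4-100 changes both the model and the stepping fields while the family stays at 4.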

Another work-in-progress tab is for a benchmark feature. The values displayed are random for now, but everything is already implemented to support INT & FP benchmarks. The goal is very different from that of traditional benchmarks. Here, we will focus on microarchitecture benchmarks only. There are none of the bottlenecks usually found on a “real-world” motherboard, like the chipset or EDO/FPM RAM. The UCA is able to feed the CPU with 0 wait states, so all data is read/written immediately, as if it were in cache. The true power of the internal ALU/FPU, at their maximum processing capabilities, can be revealed. More about the benchmark mode later.

The “Power” tab is one of the most significant improvements. In the top block, you can precisely monitor the voltage and current needed by the CPU, but also configure the internal ADCs (averaging & conversion time) and set the type and value of the protection (usually overcurrent). You can also reset the Alert Flag after a shutdown due to overcurrent. In the bottom block, you can set the actual voltage for the CPU under test, from 2.2V to 5.5V in 50 mV increments. This is useful to test undervolting or overclocking. For example, the 486 DX4-100 I used for the screenshot doesn’t work at 120 MHz at 3.3V, but everything is fine at 3.45V.
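To give an idea of what a 2.2V to 5.5V range in 50 mV increments means in practice, here is a small sketch that enumerates the available setpoints and snaps a requested value onto that grid. The function names are hypothetical, and how the UCA firmware actually encodes the setting is not covered here.

```python
# Limits and step quoted in the text, expressed in millivolts to avoid
# floating-point rounding issues.
V_MIN_MV = 2200
V_MAX_MV = 5500
STEP_MV = 50


def setpoints_mv() -> list:
    """All selectable voltages, in millivolts."""
    return list(range(V_MIN_MV, V_MAX_MV + STEP_MV, STEP_MV))


def snap_to_setpoint_mv(requested_mv: int) -> int:
    """Clamp a request to the supported range and round it to the nearest step."""
    clamped = max(V_MIN_MV, min(V_MAX_MV, requested_mv))
    steps = round((clamped - V_MIN_MV) / STEP_MV)
    return V_MIN_MV + steps * STEP_MV


print(len(setpoints_mv()), "setpoints")
print(snap_to_setpoint_mv(3450), "mV")
```

Both 3.3V and 3.45V from the DX4-100 example fall exactly on this grid.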

The original Mojo V3 comes with a 60+ MB FPGA bitfile (firmware) uploader written in Java. I had adapted it to handle many different bitfile “slots”, but it was time to rewrite it from scratch. I integrated the FPGA firmware uploader into the UCA Analyzer, with the same features. You can still check all installed bitfiles, slot by slot, and upload/delete them as needed. Now with a tool smaller than 1 MB and without Java.

The “Debug” tab logs all sent/received communication for debugging purposes. I also added a feature on the last “About” tab to easily upload the base UCA firmware (for the external microcontroller).

* Soon truly on sale everywhere?

Many CPU collectors and retro-enthusiasts have already expressed their interest in the universal shield described above. A beta-testing phase will begin soon, starting with the 486 Adapter and continuing with many others. I’m still wondering if it’s better to sell the adapters one by one, as soon as they’re ready, or to wait until at least 3 or 4 are available. The latter option can help keep costs as low as possible by panelizing the adapters. The final “retail” price for the UCA (base FPGA board + Universal Shield, without top adapter) is expected to be around $150. Adapters will cost around $50 or less, depending on the socket included. Finding ZIF PGA sockets at a decent price is the most challenging part, especially for CPUs like the 286s. All those who have already bought a UCA with the iAPX 86 Shield can reuse the FPGA board with a firmware update, so they can expect a big rebate.

Progress is going to be fast, so stay tuned for more exciting information and in-depth CPU analysis!

ATX2AT Smart Converter – Negative Rails

While designing the ATX2AT Smart Converter (and especially when I did the modifications after the Kickstarter campaign – I’ll write about them later), I had to study how the negative voltages (-12V & -5V) are used in order to implement everything correctly.

-12V is not really a problem: this power rail is still available in the current ATX Specification. I measured the current consumption on many motherboards and ISA cards and finally decided to protect the rail with a 500 mA resettable fuse (PPTC).

Generating the -5V rail needed by some ISA cards was a bigger challenge, because the components have to be sized properly. Basically, there are two methods to get -5V from a modern ATX PSU:

  •  DC-DC converter. It can deliver tremendous power (several amps) on a -5V rail converted from +12V or +5V. But it has drawbacks: DC-DC converters are electrically noisy. They also require more components and are more expensive.
  •  Linear regulator. It derives -5V from -12V by converting the unneeded voltage (7V here) to heat. Linear regulators don’t generate electrical noise and are quite cheap these days. Their major drawback is the heat they generate, which limits the maximum current they can deliver.
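To put a number on the heat issue: a linear regulator dissipates roughly the voltage it drops multiplied by the load current. The sketch below applies that rule to the -12V to -5V case. The two load currents are illustrative values chosen for the example, and the regulator’s own quiescent current is ignored.

```python
def regulator_dissipation_mw(vin_mv: int, vout_mv: int, load_ma: float) -> float:
    """Approximate heat dissipated by a linear regulator, in milliwatts.

    P = |Vin - Vout| x Iload (quiescent current ignored).
    """
    dropped_mv = abs(vin_mv - vout_mv)
    return dropped_mv * load_ma / 1000


# -12V in, -5V out: the regulator drops 7V whatever the load is.
for load_ma in (10, 1000):  # illustrative loads: 10 mA and 1 A
    heat_mw = regulator_dissipation_mw(-12000, -5000, load_ma)
    print(f"{load_ma} mA load -> {heat_mw} mW of heat")
```

A few tens of milliwatts are trivial to dissipate, while several watts would require a heatsink, which is why the real current requirement matters so much here.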

Should I go for a DC-DC converter and accept a noisier rail at a higher cost (in order to get more current), or stay with a linear regulator? It was time to study the actual needs.

A lot of old ISA cards require -5V internally. Many of them integrate their own 7905 linear regulator and don’t use the -5V rail provided by the power supply through the motherboard. The Media Vision Pro Audio Spectrum 16 (PAS16) is one of them: you can see the 7905 at the top left of the board.

But many boards rely on the -5V provided by the PSU. One of them is the famous Sound Blaster 2.0 (CT1350):

But what is the -5V used for? In all the ISA cards I studied, the -5V is used for the same stage: the preamp. Basically, a sound card generates an analog signal from a digital bus with a DAC (Digital to Analog Converter). In the picture above, the DAC is the tiny IC in the center of the board marked “Y3014B”. It’s a Yamaha YM3014B. The output signal from this DAC is very weak and must be amplified before reaching the speakers.

First, the analog signal from the DAC is preamplified by the two ICs marked KA3403 just above it. This is a signal conditioning stage only. They are simple low-power operational amplifiers used to boost the voltage amplitude of the DAC signal. But in order to connect speakers, you need a much higher current/power capability. So the signal is then fed to a power amplifier: the ST TEA2025B located on the right of the board. This design is very common and used in almost all ISA sound cards.

-5V is only required for the KA3403s (the preamplifier stage). The datasheet available here tells us the output load is specified at 2 kΩ, so each KA3403 can output 2.5 mA x 4 opamps = ~10 mA. We have two of them, and once we add the supply current (specified at 7 mA max per chip), the Sound Blaster 2.0 will never need more than 2 x (10+7) = 34 mA on the -5V rail. That’s an absolute maximum in a worst-case scenario. In practice, the “typical” supply current is 2.8 mA and the load impedance is much higher than 2 kΩ.
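The worst-case estimate can be replayed step by step. In this sketch, the 2.5 mA per opamp is assumed to come from a 5V swing across the 2 kΩ specified load (my reading of the figure, not something stated in the datasheet excerpt), and all names are made up for the example.

```python
# Figures taken from the paragraph above.
RAIL_MV = 5000           # magnitude of the -5V rail, in millivolts
LOAD_OHM = 2000          # specified output load per opamp
OPAMPS_PER_CHIP = 4      # the KA3403 is a quad opamp
SUPPLY_MAX_MA = 7        # maximum supply current per chip
CHIPS = 2                # two KA3403s on the Sound Blaster 2.0

# Assumption: worst case is the full 5V across the specified load.
output_ma_per_opamp = RAIL_MV / LOAD_OHM                   # millivolts / ohms = mA
output_ma_per_chip = output_ma_per_opamp * OPAMPS_PER_CHIP
worst_case_ma = CHIPS * (output_ma_per_chip + SUPPLY_MAX_MA)

print(f"{output_ma_per_opamp} mA per opamp, {output_ma_per_chip} mA per chip")
print(f"worst case on -5V: {worst_case_ma} mA")
```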

After the math, let’s check in real world:

As you can see, the actual load on -5V is only ~6 mA. This is a low current, but it’s required to get sound: as soon as I disconnected the -5V line, the sound card stopped working.

Preamplifiers are very sensitive to electrical noise. For such a low current requirement, a DC-DC converter is useless and even harmful to sound quality.

The same design is found on all sound cards that require -5V. Here is a Sound Blaster CT1920 AWE Upgrade “Goldfinch”:

Here, the -5V line is used for the TL074C quad opamp (very similar to the KA3403), which outputs only a low-power “Line Out” signal. I measured a typical current slightly below 10 mA.

Finally, you probably wonder about the other ISA cards that are NOT sound cards. Short answer: same! None of them requires more than a few milliamps on -5V. For example, here is one of the rare ISA network cards that needs -5V to work properly, the 3COM Etherlink 16 TV (3C507-TP):

Connected to an Ethernet 10Base-T switch, it only requires 2.5 mA on -5V.

Conclusion: the maximum current required on -5V is very low, and we need a low-noise voltage because it is usually used to feed preamplifiers. A linear regulator is definitely the way to go. On the ATX2AT Smart Converter, you will find a linear regulator protected with a 100 mA resettable PPTC fuse.