I, Cringely - The Survival of the Nerdiest with Robert X. Cringely

The Pulpit
Weekly Column

Beat the clock: Why the next generation of microprocessors may go faster by having no clock at all

By Robert X. Cringely
bob@cringely.com

At least for a few weeks, Apple seems to be in the lead of the microprocessor speed race, flaunting its so-called third generation (G3) Macintosh computers with their ultrafast PowerPC 750 chips. Apparently, these chips do some real-world computing jobs up to twice as fast as Intel's top Pentium II. Steve Jobs can enjoy himself for a moment, but within days, Intel will announce its latest masterpiece, a 450 MHz wonder complete with custom motherboard running at a bus speed of 100 MHz. And don't forget Digital, whose Alpha chips have long claimed the speed championship at 600 MHz with a Level 3 cache. Nobody can outrun an Alpha -- yet. In the personal computer business, speed sells, but the definition of speed may be about to change.

Power users like fast computers. There is a psychological effect that makes us first thrill to the speed of a new computer, then over time get used to the speed until it seems slow again. That's when it is time for a new PC. Hardware makers love to cater to our need for speed, quite simply, because we seem to be willing to pay 30 to 50 percent more for a computer that's 10 to 20 percent faster. If they make it, we will come.

Of course, most people don't really need the extra performance. It's interesting to calculate the implied hourly wage built into a 333-MHz Pentium II or a G3 Mac. For most of us, the cumulative daily time savings from having a faster processor can be measured in a few seconds, perhaps 30 seconds per day. We work five days per week, 50 weeks per year, and hand the PC down the corporate food chain after 18 months. That's 375 work days times 30 seconds, or just over three hours of time saved for that extra $500 it cost to buy the fast PC instead of the sensible PC. I don't make $160 per hour, do you? Yet that's what it costs to save those three hours and eight minutes.
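For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The numbers are the ones used above; the function name and its parameters are made up for illustration.

```python
# Implied hourly rate of a faster PC, using the figures from the column.
# The function and parameter names are illustrative, not from any library.

def implied_hourly_rate(extra_cost, seconds_saved_per_day,
                        days_per_week, weeks_per_year, years_of_use):
    """Return (work days, hours saved, dollars paid per hour saved)."""
    work_days = days_per_week * weeks_per_year * years_of_use
    hours_saved = work_days * seconds_saved_per_day / 3600
    return work_days, hours_saved, extra_cost / hours_saved

# $500 premium, 30 seconds saved per day, 5 days x 50 weeks x 1.5 years
work_days, hours_saved, rate = implied_hourly_rate(500, 30, 5, 50, 1.5)
print(work_days, hours_saved, rate)  # 375.0 3.125 160.0
```

Change the seconds saved per day to match your own work and the implied wage moves accordingly.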

Sure, if you are rendering Titanic, doing video editing, or cracking a critical enemy code, you need the extra horsepower, but most of us do not. But then we don't need cars that can go 130 mph, either, yet we buy millions of them every year.

Here are the four major techniques for building faster personal computers. Yes, Mr. Wizard, I know there are more than four, but these are the four I think are most important:

1) Set your clock ahead. Faster chips running at faster internal clock speeds can do more instructions per second. Faster bus speeds help, too.

2) Accept only cache. To a microprocessor, the time it takes to copy data from the hard disk is forever and even the time it takes to copy data from main memory is one thousandth of forever, which is still forever. So it's better to hold needed data as close as possible to the chip in ultra high-speed cache memory. Level 1 caches are right in the chip, level 2 caches are in the chip package, but not on the same piece of silicon. (Pentium Pro, Pentium II, and PowerPC 750 chips do this, communicating with cache memory over an ultra-high-speed bus that runs only to the cache and back.) Level 3 cache is on the motherboard, but still runs much faster than main memory. DEC Alpha chips use all three types of cache, where the rule is bigger is better.

3) Actually, smaller is better. The smaller you can make a microprocessor, the shorter distance electrical signals have to travel within the chip. It sounds kind of crazy, but even at the speed of light, size counts. So smaller chips are faster. They also use less power, dissipate less heat, and -- despite being more powerful -- cost less to build because you can fit more of them on a silicon wafer. The 300 MHz Pentium II is built with a CMOS manufacturing process that can etch lines on the chip as small as 0.35 micron (a micron is a millionth of a meter). The 333 MHz Pentium II is built using a 0.25 micron process, which makes the chip smaller, faster, lower-power and cheaper. Yet you'll notice the 333 MHz machines still cost more, most of which is extra profit that goes into Andy Grove's retirement package.

4) Do more than one thing at a time. Today, using techniques like superscalar architecture, high-end processors can perform more than one operation at a time. Tasks are split apart, performed in parallel, then their results are put in the correct order. Right now, the state-of-the-art in this bit juggling seems to be around four operations per clock tick.
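To put a rough number on point 3, here is a back-of-the-envelope sketch in Python. It assumes an idealized shrink in which the same design simply scales down with the process, so die area goes with the square of the feature size. Real chips rarely shrink that cleanly, and the sketch ignores yield and wafer-edge losses, so treat the result as an upper bound rather than a fact about any actual Pentium II.

```python
# Idealized die shrink: area scales with the square of the feature size.
# Assumption (not from the column): the design scales uniformly and
# yield and wafer-edge losses are ignored.

def area_ratio(old_feature_um, new_feature_um):
    """Relative die area after moving to a smaller process."""
    return (new_feature_um / old_feature_um) ** 2

ratio = area_ratio(0.35, 0.25)   # 0.35 micron -> 0.25 micron
dies_multiplier = 1 / ratio      # how many more dies fit per wafer

print(f"{ratio:.2f}")            # 0.51
print(f"{dies_multiplier:.2f}")  # 1.96
```

Under those assumptions the same wafer yields nearly twice as many chips, which is where the lower cost comes from.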

These are the performance techniques developed over the last 50 years to make Von Neumann computers, which is to say nearly every computer around today, run faster. But there ARE other speed techniques, one of which owes at least some of its existence to John Von Neumann's first computer -- the one that wasn't like nearly every other computer around today.

Von Neumann's first computer, a design he quickly abandoned for very good reasons, was asynchronous: it had no clock. Well, what goes around comes around, and it is looking like asynchronous logic may be the next great advance in computing power.

In most ways, clocks are good for processors. The clock acts exactly like that guy who beat the drum to keep the galley slaves rowing together properly in Ben Hur. Clocks keep everything synchronized. But there are disadvantages to having a clock, too. Keeping the chip synchronized means that every operation runs at the speed of the slowest operation. Getting rid of the clock would let faster operations run like the wind, though there is then the problem of resynchronization, which is why Von Neumann dropped the idea of going without a clock.
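A toy model in Python makes the tradeoff concrete. The operation latencies and the per-operation handshake cost (standing in for the resynchronization overhead) are invented for illustration; they are not measurements of any real chip.

```python
# Toy comparison of clocked versus clockless execution of a few operations.
# All numbers are made up; handshake_ns is a hypothetical stand-in for the
# cost of resynchronizing after each operation.

def synchronous_time(latencies_ns):
    """Every operation takes one clock period, set by the slowest one."""
    return len(latencies_ns) * max(latencies_ns)

def asynchronous_time(latencies_ns, handshake_ns=0.0):
    """Each operation runs at its own speed, plus a handshake cost."""
    return sum(latency + handshake_ns for latency in latencies_ns)

ops = [1, 1, 2, 5]  # latencies in nanoseconds

sync_total = synchronous_time(ops)
async_total = asynchronous_time(ops, handshake_ns=0.5)

print(sync_total)   # 20
print(async_total)  # 11.0
```

The clocked version pays for the slowest operation four times over. The clockless version wins only as long as the handshake cost stays small compared with the time it saves.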

Clocks take energy to run, and they are running and using energy even when no actual computing is being done at all. And clock circuitry takes space on the chip, which of course makes the chip more expensive than it might be otherwise.

Getting rid of the clock has many advantages, including some that aren't at all obvious. Not only can an asynchronous chip run as fast as it naturally wants to, but when it has nothing to do it consumes no power at all. None. This makes asynchronous logic good for low-powered or battery-powered devices. But wait, it gets better! Without the clock circuitry, asynchronous chips can be a little smaller than synchronous chips, saving both power and money. This space saving actually doesn't amount to much because asynchronous chips require a bit more cache memory to help reorder those jumbled instructions. Where the power can be really saved with asynchronous circuits, though, is by lowering the voltage. Drop the voltage on a traditional CPU and it generally stops running. Drop the voltage on an asynchronous chip and it just slows down. And the rate at which power consumption drops is greater than the rate at which performance drops.
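Why power falls faster than performance can be sketched with the usual first-order model of CMOS logic, in which dynamic power is proportional to the supply voltage squared times the switching rate. The sketch below, in Python, adds a simplifying assumption of its own: that the speed of a clockless chip scales linearly with voltage. Neither the model nor the numbers come from any particular chip, and leakage is ignored.

```python
# First-order CMOS scaling sketch. Assumptions, not figures from the column:
#   - speed scales linearly with supply voltage
#   - dynamic power scales with voltage squared times switching rate
#   - leakage and threshold-voltage effects are ignored

def relative_speed(v_ratio):
    """Speed relative to full voltage (assumed linear in voltage)."""
    return v_ratio

def relative_power(v_ratio):
    """Dynamic power relative to full voltage: V^2 times switching rate."""
    return v_ratio ** 2 * relative_speed(v_ratio)

for v in (1.0, 0.8, 0.5):
    print(f"voltage {v:.1f}: speed {relative_speed(v):.3f}, "
          f"power {relative_power(v):.3f}")
```

In this model, halving the voltage halves the speed but cuts power to one eighth, which is the kind of trade a battery-powered device is happy to make.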

And just as asynchronous chips will run at lower voltages than synchronous chips, they are also less sensitive to manufacturing variations, which means a lot as the world starts building 0.18 micron chips. This added robustness means more working asynchronous chips can be got from each silicon wafer, lowering costs yet again.

So the good news is that asynchronous chips are faster, cheaper to make, and will run, literally, on the power generated by sticking two bare wires in a potato. And when it is not working, the asynchronous chip consumes no power at all. How much faster an asynchronous chip can be than its synchronous counterpart depends very much on who designs it. For a very good designer, the ratio is at least 10-to-1, meaning that a 333 MHz Pentium II, if it were redesigned as an asynchronous device, would operate at the equivalent of more than three gigahertz! But remember, this asynchronous chip would still have to operate in a synchronous device, so not all that performance gain would be convertible into real work. The real performance improvements will come when asynchronous processors are designed from scratch to be part of asynchronous computers. Then watch out.

There is always a catch, of course, and with asynchronous circuit design the catch is that it is very hard to do well. But industrial competition and the perpetual drive among academics to be anointed top smarty-pants are steadily bringing asynchronous logic into the mainstream. A small processor from Sharp already uses some asynchronous logic. Intel has an upcoming Pentium chip that uses an asynchronous instruction decoder. Sun Microsystems also has a group working toward an asynchronous SPARC processor. But the first real asynchronous systems may appear in your kids' room. Imagine a battery-powered, handheld Sony Playstation with 10 times the performance of today's unit that plugs into the wall. That's the 3-D graphics equivalent of a $30,000 SGI workstation for $100. I am not making this up.
