I, Cringely - The Survival of the Nerdiest with Robert X. Cringely

The Pulpit


Weekly Column

Cool Threads: The right sort of multi-threaded network programming makes computing both faster and simpler.

By Robert X. Cringely
bob@cringely.com


A couple of columns ago we touched on the practical rebirth of parallel computing. In case you missed that column (it's in this week's links), the short version is that Moore's Law is letting us down a bit when it comes to the traditional way of increasing the power of microprocessors, which is by raising clock speeds. We've hiked them to the point where processors are so small and running so hot that they are in danger of literally melting. Forget about higher clock speeds then; instead we'll just pack two or four or 1000 processor cores into the same can, running them in parallel at slower speeds. Instantly we can jump back onto the Moore's Law performance curve, except our software generally doesn't take advantage of this because most programs were written for single cores. So we looked back at the lessons of parallel supercomputers, circa 1985, and how some of today's software applies those lessons, such as avoiding dependencies and race conditions.

But we didn't really talk much in that column about the use of threads, which are independent streams of execution within a single program. Each time a program takes on a new task, it can spin off a thread for that task. If the threads are running on the same processor core they are multiplexed using time-slicing and only appear to run in parallel. But if the threads are assigned to different processors or different cores they can run truly in parallel, which can get a lot of work done in a short amount of time.
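To make that concrete, here's a minimal Java sketch (the class and task names are mine, purely for illustration) that spins off a thread per task. On a multi-core machine the two tasks can genuinely run at the same time; on a single core the OS time-slices them:

    public class TwoTasks {
        public static void main(String[] args) throws InterruptedException {
            // One thread per task; start() hands each to the scheduler.
            Thread a = new Thread(() -> System.out.println(
                    "task A running on " + Thread.currentThread().getName()));
            Thread b = new Thread(() -> System.out.println(
                    "task B running on " + Thread.currentThread().getName()));
            a.start();
            b.start();
            a.join();   // wait for both tasks to finish
            b.join();
        }
    }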

Most traditional PC applications are single-threaded, meaning the only way to make them go faster without a completely new architecture is to run the CPU at a faster clock rate. Single-threaded apps are simpler in that they are immune to the dependencies and race conditions that can plague true parallel code. But they are also stuck completing tasks in one specific order, so they can be dramatically slower, or dramatically more resource-intensive, than multi-threaded apps.

For an example of where single-threaded applications are just plain slower, consider Eudora, which is still my favorite e-mail client (I'm old, you don't have to tell me). Until not very long ago Eudora still couldn't send mail in background, so everything (and everyone, including the user -- me) had to wait until the mail was sent before completing anything else, like writing a new message or replying to an old one. I KNOW THIS IS NO LONGER THE CASE, SMARTY-PANTS -- THIS IS JUST AN EXAMPLE. The program was single-threaded and, since sending mail is a very slow activity, users were generally aware that they were waiting. Today Eudora sends mail in background, which is the same as saying "in another thread."

Multithreading has been great for user interactivity because nothing should ever block the input of data -- typing, mouse movements, and so on.

There are many ways to use threads and before we consider some let's think about scale -- literally how many threads are we talking about? To run at true clock speed we'd have only one thread per CPU core, but a fast processor can multiplex hundreds or even thousands of threads and multi-core processors can do even more. So the EFFICIENT shift to multi-threaded programming requires a significant change in thinking on the part of developers.

Here's an example: A hard problem with programming games is when you want something to happen every so often. That's not very efficient to code because it traditionally requires a program loop that spins as fast as the CPU will let it (making the CPU go to 100 percent) and keeps checking the time to see if it is time to do that thing. But threads are different: With threads you can very easily put them to sleep for any period of time, or even put them to sleep indefinitely until some event occurs. It's not only easier for the programmer, it takes nearly no CPU power compared to the looping system.
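Here's a minimal Java sketch of the difference (the class name is mine, purely for illustration): the worker thread sleeps between ticks, costing essentially nothing, instead of spinning on the clock at 100 percent CPU:

    public class PeriodicTask {
        public static void main(String[] args) throws InterruptedException {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        Thread.sleep(5_000);   // parked by the OS; near-zero CPU
                        System.out.println("time to do that thing");
                    }
                } catch (InterruptedException e) {
                    // interruption is our shutdown signal
                }
            });
            worker.start();
            Thread.sleep(16_000);   // meanwhile the main thread is free to do other work
            worker.interrupt();     // wake the sleeper and shut it down
            worker.join();
        }
    }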

How do you use threads to write an e-mail server that handles thousands of simultaneous incoming e-mails? Well, you write it as if you were writing a server that can only handle ONE e-mail at a time. Just write very simple code that knows how to accept e-mail, then test it by sending in an e-mail. It works? Cool. Now send 1000 threads into that same piece of code. Each thread has its own state as to what e-mail (FROM, TO, SUBJECT, etc.) it is receiving, but despite the fact that the content is different, the process is exactly the same. Now you have an e-mail server capable of serving a thousand simultaneous connections.
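Here's a minimal Java sketch of that shape, using a toy line-based listener rather than real SMTP (the class name and port are mine, purely for illustration). The handler is written as if only one connection ever existed; the accept loop just sends a fresh thread into that same code for every new connection:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class ToyMailServer {
        // Written as if there were only ONE connection in the world.
        static void handleOneConnection(Socket s) {
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    // Per-thread state: each thread sees only its own message.
                    System.out.println("received: " + line);
                }
            } catch (IOException e) {
                // connection dropped; this thread simply ends
            }
        }

        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(2525)) {
                while (true) {
                    Socket s = server.accept();
                    // Send another thread into the same single-connection code.
                    new Thread(() -> handleOneConnection(s)).start();
                }
            }
        }
    }

Each thread's local variables are its own, which is exactly the "own state" described above.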

See, writing multi-threaded apps may require a different approach but the benefits from doing so can be fantastic.

A new area of multi-threaded programming that is REALLY hard (hard even for developers who do normal multi-threaded programming really well) is the use of optimistic concurrency. Two columns ago I alluded to this in my example of Appistry's decision to forgo using a database for a credit card processing application. I said I would show a hack that was yet another approach to the same problem. Well, here comes the hack.

Let's consider this problem: My wife (the young and lovely Mary Alyce) and I happen to be in different parts of town, each of us standing in front of a bank ATM. I am a thread, Mary Alyce is a thread, and the ATM is main memory. Contrary to our usual behavior in which we only take money OUT of the bank, we are paradoxically planning near-simultaneous bank deposits. Our balance starts at $1000. I am depositing $200 while Mary Alyce is depositing $300.

The ATM processes a deposit like this:

  1. Retrieve existing balance
  2. Add new deposit to that balance
  3. Update new total balance

The trick is that we are both doing this same thing at the same time. What if this happens:

Bob starts deposit transaction
Mary Alyce starts deposit transaction
Bob's ATM grabs the balance ($1000)
Mary Alyce's ATM grabs the balance ($1000)
Bob deposits $200 (his ATM computes a new balance of $1200)
Mary Alyce deposits $300 (her ATM computes a new balance of $1300)
Bob's ATM updates the main database with the new balance ($1200)
Mary Alyce's ATM updates the main database with the new balance ($1300)

This is a race condition. Mary Alyce updated the balance last, so my deposit is lost completely. Sometimes the timing works out and nothing is lost, but there are plenty of interleavings where it doesn't. And any traditional fix requires a LOT of back-end reconciliation and computation. We need something simpler.
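Here's a minimal Java sketch of exactly that lost update (the names are mine, purely for illustration): two threads share one balance, and a short pause between the read and the write makes the bad interleaving all but guaranteed:

    public class LostUpdateDemo {
        static long balance = 1000;   // shared "main memory", deliberately unguarded

        static void deposit(long amount) {
            long read = balance;                  // 1. retrieve existing balance
            try { Thread.sleep(10); } catch (InterruptedException ignored) {}
            balance = read + amount;              // 2 and 3. add deposit, write it back
        }

        public static void main(String[] args) throws InterruptedException {
            Thread bob  = new Thread(() -> deposit(200));
            Thread mary = new Thread(() -> deposit(300));
            bob.start(); mary.start();
            bob.join();  mary.join();
            // Almost certainly prints 1200 or 1300, not the correct 1500.
            System.out.println("final balance: " + balance);
        }
    }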

Classic concurrency control would basically LOCK the database. This is called, not surprisingly, "pessimistic concurrency." So I go up to the ATM and my ATM asks for the database to be locked on my behalf. If Mary Alyce then went to another ATM, it would tell her she has to wait because the database was being updated elsewhere. This ensures that the race condition can't happen, but it also holds up Mary Alyce, who does not like to be kept waiting.
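In code, pessimistic concurrency is just a lock held around the whole read-modify-write. A minimal Java sketch (names mine, purely for illustration):

    public class PessimisticDeposit {
        static long balance = 1000;
        static final Object dbLock = new Object();  // stands in for the database lock

        static void deposit(long amount) {
            synchronized (dbLock) {       // the other ATM blocks right here, waiting
                long read = balance;      // retrieve existing balance
                balance = read + amount;  // add deposit and write it back
            }                             // lock released; the next ATM may proceed
        }

        public static void main(String[] args) throws InterruptedException {
            Thread bob  = new Thread(() -> deposit(200));
            Thread mary = new Thread(() -> deposit(300));
            bob.start(); mary.start();
            bob.join();  mary.join();
            System.out.println("final balance: " + balance);  // always 1500
        }
    }

Correct, but serialized: while one ATM holds the lock, every other ATM waits.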

Optimistic concurrency control says, "We KNOW there could be a race condition, but we'll add a very cheap way to detect if it occurred. And if so, we'll pay the very expensive cost of restarting one of the transactions from the beginning."

The only changes from the above sequence are in the last two lines:

Bob's ATM updates the main database with the new balance ($1200)
Mary Alyce's ATM updates the main database with the new balance ($1300)

These become:

Bob's ATM updates the main database with the new balance ($1200) so long as the current balance is still $1000.
Mary Alyce's ATM updates the main database with the new balance ($1300) so long as the current balance is still $1000.

That's easy to write but trickier to implement. The check of the old balance and the write of the new balance must happen inside the processor as a single atomic action -- one indivisible operation. Some processors have had such instructions for decades, and they're common now, going by names like "test-and-set" and "compare-and-swap" (or "compare-and-set").

If the "so long as" part fails, the whole transaction must be restarted from scratch. In our example, Mary Alyce's $300 would pop back out of the machine and she'd have to start over.

That's very expensive for Mary Alyce, but the race condition itself occurs very rarely. The redo costs a lot when it happens, but it hardly ever happens -- and in exchange, nobody ever waits on somebody else's update.

Apply this to 25,000 ATMs and suddenly the database is decoupled from transaction processing, and the system is protected against internal race conditions while running with less code and at full speed, which is saying something. Suddenly the system can be 100 times faster (cascading 10X improvements) or run 10 times faster on one tenth the hardware (take your pick), all thanks to the timely embrace of clever multi-threaded programming.

Comments from the Tribe


Today, multicore processors are great for end users to have because:

1. Botware/malware/trojans are often CPU intensive

2. Flash is frequently CPU intensive, and appears on a large percentage of web pages

The CPU has become a dumping ground for malicious, runaway, and inefficient processes.

Andrew S | Oct 28, 2008 | 12:23AM

Yes, multiple threads can run on a single processor; the OS just switches between them. Better, it can switch away from a thread that would be suspended waiting on something like disk or network IO -- that is, if the program is properly written. Improperly written, it won't do any good. Breaking a program into too many threads will also slow the system down: it spends too much time doing context switches. Also, Bob, your email server example doesn't really scale; that's not how web servers are written. There are other technologies, like IO completion ports, for building scalable code. Your concurrency example is good though. Other interesting things are deadlocks.
Brian

Brian K | Nov 18, 2008 | 8:05PM

