I, Cringely - The Survival of the Nerdiest with Robert X. Cringely

The Pulpit

Weekly Column

Flying Blind: How Obfuscation Is Emerging as a Technology That Is Critical to .NET, Yet Somehow Not Owned by Microsoft

By Robert X. Cringely
bob@cringely.com

Years ago, I was flying a small airplane that I had built myself and the engine stopped in flight. Despite the picture portrayed in most news accounts of small aircraft engines failing and planes dropping from the sky like plump raindrops, engine failure is actually quite rare. But it happened to me. One moment, I was flying across Arizona, and the next moment, I was gliding across Arizona. I looked down in that first moment of silence and noticed directly beneath me was that giant meteor crater near Winslow — the crater where Jeff Bridges went back to his intergalactic mothership in the movie "Starman." Spectacular as crash landing in that giant crater might have been, I still preferred the idea of a more normal landing in Winslow, which was barely visible on the horizon. As a glider, my plane wasn't really that bad. It was capable of gliding 18 feet forward for every foot of altitude lost, which with luck put Winslow almost within reach. And luck was with me. I caught a couple thermals — rising columns of air — and made Winslow with altitude to spare, landing silently and using my last bit of momentum to roll up in front of the big Native American Air Ambulance hangar just as though I did it every day. I spent most of the next week in that hangar, fixing my plane and learning a little bit about Native American culture.

Winslow sits where it does in northeast Arizona because of the railroad, but it also happens to lie between two giant Indian reservations, those of the Hopi and the Navajo. When I was buying a new fuel pump at the local Ford dealer (this was a homebuilt plane, remember), I asked the salesman whether there was any difference between selling a car to a Hopi and selling one to a Navajo. He said Hopis and Navajos alike bought pickup trucks, not cars, but if there was a difference between the two tribes, it came down to financing: A Hopi always makes his truck payment, while Navajos are somewhat less reliable. "Hopis are pueblo Indians," he explained. "They have lived next door to the same neighbors for a thousand years, so their reputation in the village is very important. Navajos are nomadic plains Indians who don't even think of themselves as having neighbors, so what others think is less important than what they think of themselves. They'll pay when they get around to it."

This discussion came to mind when I learned that opening this week in the U.S. is a new movie (John Woo's Windtalkers) about the Navajo Code Talkers — U.S. Marines from the Navajo tribe who used their native language and a special glossary of military terms to communicate by radio during World War II battles in the Pacific Theater. Code talking worked spectacularly because Navajo was an unwritten language — it had no alphabet and it is hard to crack a code that uses no symbols — and because there were almost no non-native Navajo speakers and certainly no Japanese Navajo speakers. A controversial part of the movie is the idea that each code talker was assigned to a non-Navajo partner and that partner's duties included killing the code talker, if necessary, rather than allow him to be taken prisoner. The U.S. Marine Corps says that part wasn't accurate.

Maybe, maybe not. Certainly, it was true a couple wars later in Vietnam. I have a friend who never allows me to use his name in print because doing so would cause silent black helicopters to shortly hover over my house, scaring the dog. This friend served in Vietnam repairing secret listening stations hidden in North Vietnam. Each repair mission was very dangerous. He wasn't allowed to carry any documentation, so all repair manuals and schematics had to be memorized. He was accompanied each time by a squad of soldiers from the Republic of Korea whose job was to protect him, or failing that, kill him to avoid capture. So it does happen. In this case, the bag men were foreigners not subject to the U.S. Code of Military Conduct. There's always a loophole.

Keeping secrets is a big part of high technology, too, primarily because nearly everything finally comes down to software, and software has no real substance except for the knowledge of how it works, so keeping that a secret is vital for business. Traditionally, there have been two major ways of keeping software secrets. One way is by distributing only object code, meaning code that has already been compiled for a target processor and operating system. The uncompiled source code remains like the secret recipe for Coca-Cola, locked in a safe back at software corporate HQ. Nobody gets source code from Microsoft, for example, because doing so lays open the innards of how programs work. Part of the current Microsoft antitrust dispute with nine states comes down to sharing source code, which the states think is a good idea, but Bill Gates sees as corporate suicide.

The second common technique for keeping secrets is encryption, which is scrambling the program so that it can't be read without first decoding, and decoding can't happen without the secret password. Encryption is good not just for software when we don't know how it works, but also for software when we do know how it works. An example of the latter is music or video. We know how "I Left My Heart in San Francisco" goes, but encryption keeps us from hearing Tony Bennett sing it unless we pay for the password.
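To make the "no password, no decoding" idea concrete, here is a minimal sketch using the standard javax.crypto classes. The class name PasswordLockSketch, the methods lock and unlock, the sample password, and the choice of AES-GCM with a PBKDF2-derived key are all assumptions made for illustration. This is not how any particular music or video service actually protects its content.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration: scramble bytes so they can only be read back with the password.
class PasswordLockSketch {
    private static final int SALT_LEN = 16;   // bytes of random salt for key derivation
    private static final int IV_LEN = 12;     // bytes of random nonce for AES-GCM
    private static final int TAG_BITS = 128;  // size of the GCM authentication tag

    // Stretch the password into a 128-bit AES key.
    private static SecretKeySpec deriveKey(char[] password, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 100_000, 128);
        byte[] keyBytes = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(keyBytes, "AES");
    }

    // Returns salt + nonce + ciphertext, so unlock() needs nothing but the password.
    static byte[] lock(byte[] content, char[] password) throws GeneralSecurityException {
        SecureRandom rng = new SecureRandom();
        byte[] salt = new byte[SALT_LEN];
        byte[] iv = new byte[IV_LEN];
        rng.nextBytes(salt);
        rng.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt),
                new GCMParameterSpec(TAG_BITS, iv));
        byte[] sealed = cipher.doFinal(content);
        byte[] out = new byte[SALT_LEN + IV_LEN + sealed.length];
        System.arraycopy(salt, 0, out, 0, SALT_LEN);
        System.arraycopy(iv, 0, out, SALT_LEN, IV_LEN);
        System.arraycopy(sealed, 0, out, SALT_LEN + IV_LEN, sealed.length);
        return out;
    }

    // Throws a GeneralSecurityException if the password is wrong or the data was altered.
    static byte[] unlock(byte[] locked, char[] password) throws GeneralSecurityException {
        byte[] salt = Arrays.copyOfRange(locked, 0, SALT_LEN);
        byte[] iv = Arrays.copyOfRange(locked, SALT_LEN, SALT_LEN + IV_LEN);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt),
                new GCMParameterSpec(TAG_BITS, iv));
        return cipher.doFinal(locked, SALT_LEN + IV_LEN, locked.length - SALT_LEN - IV_LEN);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] song = "I Left My Heart in San Francisco".getBytes(StandardCharsets.UTF_8);
        byte[] locked = lock(song, "paid-up".toCharArray());
        byte[] played = unlock(locked, "paid-up".toCharArray());
        System.out.println(new String(played, StandardCharsets.UTF_8));
    }
}
```

Everyone may know what the locked bytes contain, as with the Tony Bennett song, but calling unlock with the wrong password fails rather than returning anything playable.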

These two techniques are not mutually exclusive and are often used together.

A third method of keeping computer code secret is obfuscation, and that's the topic of this column. Obfuscation is used primarily with languages that are interpreted, rather than compiled. BASIC is usually an interpreted language, as are PostScript and many other computer languages. But the most popular interpreted language by far is Java. Strictly speaking, Java source is compiled to bytecode, but that bytecode is what ships, it keeps the names of classes, methods, and fields, and it can be decompiled into readable source. So Java has to have all the code available and in full view prior to runtime so it can then be interpreted by the Java Virtual Machine. Nothing is kept back in the lab, nothing is hidden. It is all right there to be seen, and sometimes, to be stolen. This is a problem for a language promoted for corporate and commercial software development, and obfuscation is the solution that emerged.

Obfuscation used to work primarily by padding the code with extra lines that didn't actually do anything. You could try to read the program listing and all the extra junk was supposed to make it impossible to understand the program flow. Yeah, right. Good Java programmers could get past all the garbage, but it took a lot of work, so obfuscation was generally successful.

Then there appeared the first bytecode optimizer for Java, called Dash-O, which comes from a company in Cleveland, Ohio, called PreEmptive Solutions. Dash-O's job was to make Java code smaller and faster by removing all the parts that weren't actually used. If you think about it, that means Dash-O would also remove all the padding code added for purposes of obfuscation. And that's exactly what happened. So Dash-O, the first Java bytecode optimizer, also became the first Java DE-obfuscator. Uh-oh.
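Here is a rough sketch of what old-style padding looks like, and why a tool that strips unused code undoes it. The class PaddingSketch and both methods are made up for illustration. They show the general idea only, not the output of Dash-O or of any real obfuscator.

```java
// Hypothetical illustration of obfuscation by padding.
class PaddingSketch {
    // The honest version: what the programmer actually wrote.
    static int sumOfSquares(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i * i;
        }
        return total;
    }

    // The padded version: same result, buried in junk that never affects it.
    static int sumOfSquaresPadded(int n) {
        int total = 0;
        int decoy = n * 31 + 7;                // junk: never feeds into total
        for (int i = 1; i <= n; i++) {
            decoy ^= (i << 3);                 // junk: only touches decoy
            total += i * i;                    // the only line that matters
            if (decoy != decoy) {              // junk: a value always equals itself
                total -= decoy;                // never runs
            }
        }
        long moreJunk = (long) decoy * decoy;  // junk: computed and then ignored
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4));        // prints 30
        System.out.println(sumOfSquaresPadded(4));  // prints 30
    }
}
```

Delete every line marked as junk and the second method collapses back into the first. That deletion is exactly the job of a tool that removes code that isn't used, which is why the optimizer doubled as a de-obfuscator.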

So the boys at PreEmptive Solutions (they have been friends of mine for years, but I don't own any of their stock, so stop worrying) had to come up with a new type of obfuscator just to undo the damage done by their optimizer. Now Dash-O not only optimizes, it obfuscates, too, and the method it uses is unique. When Dash-O is finished, every named identifier (a variable, for example) has the same name and that name is "a." Trying to figure out code that is nearly all "a"s is almost impossible, which is the whole point. Through this patented technique called "overload induction," obfuscation has become more powerful, perhaps even more powerful than the other techniques mentioned above.
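To get a feel for what code that is nearly all "a"s looks like, here is a hand-made sketch at the source level. The Invoice class and everything in it are hypothetical, and the renaming was done by hand rather than by Dash-O. A real obfuscator works on bytecode, where the rules are looser than in Java source. In source, two parameters of the same method still need different names, so one "b" sneaks in.

```java
// Hypothetical illustration of renaming everything to "a" by leaning on overloading.
class OverloadInductionSketch {

    // Readable original.
    static class Invoice {
        private final double rate;

        Invoice(double rate) { this.rate = rate; }

        double tax(double amount) { return amount * rate; }

        double total(double amount, int quantity) {
            return quantity * (amount + tax(amount));
        }

        String label(String customer) { return "Invoice for " + customer; }
    }

    // Same logic after renaming. The compiler tells the three methods apart
    // by their parameter lists, so the class, field, and methods can share one name.
    static class a {
        private final double a;

        a(double a) { this.a = a; }

        double a(double a) { return a * this.a; }

        double a(double a, int b) { return b * (a + a(a)); }

        String a(String a) { return "Invoice for " + a; }
    }

    public static void main(String[] args) {
        Invoice readable = new Invoice(0.25);
        a scrambled = new a(0.25);
        System.out.println(readable.total(10.0, 2));  // prints 25.0
        System.out.println(scrambled.a(10.0, 2));     // prints 25.0
        System.out.println(scrambled.a("Bob"));       // prints Invoice for Bob
    }
}
```

Both classes behave identically, but someone reading the second one has to work out from argument types alone which "a" is being called on every line.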

Here is why. We could compare encryption to locking a six-item meal into a box. Only the intended diner (the Virtual Machine) has the key, and we don't want anyone else to know what the diner is going to eat. Unfortunately, if someone can pick the lock (or find the key hidden on the bottom of the box), the food is in plain view. Obfuscation works more like putting the six-item meal into a blender and sending it to the diner in a baggie. Sure, everyone can see the food in transit, but apart from a lucky pea or some beef-colored goop, they can't tell what the original meal was. The diner still gets the intended delivery, and it still provides the same nutritional value as it did before (Virtual Machines aren't picky). The trick of an obfuscator is to confuse observers, while still giving VMs the same delivery.


