I, Cringely - The Survival of the Nerdiest with Robert X. Cringely

The Pulpit
Pulpit Comments
October 03, 2008 -- Data Debasement

Here's what I'm struggling with. Google can take a great number of shortcuts because it's read-only and doesn't have to be 100% accurate. But I don't want those kinds of shortcuts with my bank account.

Sure, I can believe that there are newer approaches which can deal with these issues in a much more parallel architecture. But there's just a little bit of programming to do first, eh? Not sure I'd write off Oracle anytime soon.

Carl D | Oct 03, 2008 | 3:55PM

Is this just another implementation of the thought "the network is the computer"? By thinking of the network as delivering a daisy chain of if-then results, do we obviate the need to pool our knowledge within a database? As long as the end nodes remain consistent, the entire network and implied data store remains consistent? Or has my mind reeled a bit too much here?

Ted Murphy | Oct 03, 2008 | 4:04PM

Yes, that's exactly it, Bob. A disappointingly large number of modern developers have never built production software without having an SQL database involved. They quite literally can't conceive of other options. It's the modern equivalent of 80-column mind.

That's not to say that databases aren't useful for some problems, but they're terrible for many others. Not knowing the other approaches means they can't recognize, like the credit card processing team you mention, that they're going about things the wrong way.

Developers interested in getting their feet wet with other approaches can take a look at frameworks like Prevayler, Mnesia, and CouchDB, as well as the MapReduce framework already mentioned. Service offerings include Google's AppEngine and Amazon's SimpleDB. A number of Stonebraker's recent papers are also very interesting; he's one of the fathers of the relational database, so it's amazing to hear him say that it should be scrapped and replaced with a variety of different solutions.

William Pietri | Oct 03, 2008 | 4:08PM

Interesting article, which reminded me of a few products that already exist (whether they can be used in this case is another story).

First of all, the idea of the application performing all the calculation in memory is not new. I know of a few projects (like at BT or Telstra) where they used an object database to have the whole content of the database in memory (the database is still there to commit changes to disk).

There are also companies such as Dataupia which sell a software+hardware database solution which they claim can scale way beyond what an Oracle or a SQL Server can do.

Laurent | Oct 03, 2008 | 4:11PM

I think you are missing a few critical points.

1) Databases do not have to be relational. In fact, I would suggest that for many of the problems your article explores, a hierarchical database is a much better fit. That is part of what MapReduce/Google File System does, but they don't refer to it as a database.

2) For critical data, like financial transactions, it is possible to leverage parallelism to provide fault-tolerant storage as well. When a transaction is received, send it to 2 different threads, one to store it and another to process it. Then, after the processing step, store that result as well.
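
A rough sketch of that two-thread idea in Python; store_raw, process, and store_result are invented stand-ins, not a real API:

    # Hand an incoming transaction to two workers at once: one persists the raw
    # transaction while the other processes it, then the result is stored too.
    import threading

    def store_raw(txn):       # e.g. append to a replicated log
        print("stored raw:", txn)

    def process(txn):         # e.g. validate and post the transaction
        return {"id": txn["id"], "status": "posted"}

    def store_result(result):
        print("stored result:", result)

    def handle(txn):
        t1 = threading.Thread(target=store_raw, args=(txn,))
        t1.start()
        result = process(txn)     # processing proceeds in parallel with storage
        store_result(result)
        t1.join()

    handle({"id": 42, "amount": 19.95})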

While it is nice to hear about these issues now, they are not new; there have been approaches to solve them since the mainframe days that still hold true and work fine today on non-mainframe platforms. They are just not in the mainstream today, and the knowledge about how they worked is being lost as the old-timers retire and the new blood tries to reinvent the wheel rather than learn from the past.

Mark

Mark | Oct 03, 2008 | 4:19PM

How is security handled in this methodology?

Steve Dean | Oct 03, 2008 | 4:19PM

Since you mention Larry, what about Oracle 10g, where the "g" stands for "grid"? It's supposed to distribute the database across multiple servers and it's been out for a couple years.

Peter | Oct 03, 2008 | 4:45PM

The big issue with this is, of course, being able to audit a transaction process at every step along the way. Writing things to disk provides safety and accountability. Is a bank going to be comfortable letting all these open transactions just float around in the cloud without an acknowledgment of the atomicity of a transaction? Is a bank going to be comfortable letting someone take money out of an ATM without knowing that transaction has been safely written to a database somewhere? I don't think so.

Jason | Oct 03, 2008 | 5:12PM

It may turn out that relational databases have fundamental problems in a parallel distributed environment. The ACID properties that RDBMSs guarantee are predicated on transactions executing in a defined order, and presenting a consistent view of the data to all processes using the database. As a previous commenter pointed out, this is not a new problem; relational databases have been distributed across servers for a long time. I don't agree that the solutions are lost in the mists of ancient mainframe lore, nor do I agree that returning to hierarchical databases is a solution, but that isn't what I take issue with.

What Google does with GFS and MapReduce is data management, but it's not comparable to an RDBMS. In Google's world of indexing and searching web content there is usually no need for ACID or guarantees of data integrity or mechanisms for processing data in a specific and roll-backable order. It's acceptable for Google to miss a chunk of web content during a page crawl, or index the pages of a site in an unpredictable order, or process the same page multiple times redundantly, or return different search results for the same query. It would not be acceptable for a bank's database of financial transactions or an air traffic control system to work like that. Different problems, different solutions.

The database experts Michael Stonebraker and David DeWitt went off into the same weeds last January. Like you they confused Google's MapReduce algorithm and GFS with database technology. See their article here:

http://www.databasecolumn.com/2008/01/mapreduce-a-major-step-back.html

And my comments on their article:

http://typicalprogrammer.com/?p=16

MapReduce and GFS are impressive and have many applications, maybe even replacing some of the relational databases that are managing data that doesn't need the integrity enforcement. But MapReduce doesn't mean RDBMSs are now obsolete any more than pickup trucks make semis and trains obsolete.


Greg Jorgensen | Oct 03, 2008 | 5:14PM

Microsoft's SQL Server Data Services (SSDS) is a hybrid Entity-Attribute-Value database running on SQL Server 2005 in MSFT data centers. It takes an approach that's different from MapReduce (e.g., BigTable) implementations in that it defines regions of consistency (Containers), whereas other implementations (Google API, SimpleDB) only provide eventual consistency.

For a recent comparison of relational databases deployed to the cloud versus the SSDS approach, see http://oakleafblog.blogspot.com/2008/10/sql-server-data-services-team-architect.html.

--rj

Roger Jennings | Oct 03, 2008 | 5:16PM

Small correction to the closing of your article: Google didn't really coin the term "MapReduce". They borrowed it from functional programming, where map represents a transformation that is applied to each of the elements of a collection and then a final reduce step is performed on them to generate some singular result.
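
For a concrete taste of that functional shape, here is a tiny word-count in plain Python using the built-in map and functools.reduce:

    # The functional-programming roots of "map" and "reduce": transform each
    # element with map(), then fold the results into a single value with reduce().
    from functools import reduce

    words = "the quick brown fox jumps over the lazy dog the end".split()

    # map step: each word becomes a (word, 1) pair
    pairs = map(lambda w: (w, 1), words)

    # reduce step: fold the pairs into a single dict of counts
    def combine(counts, pair):
        word, n = pair
        counts[word] = counts.get(word, 0) + n
        return counts

    print(reduce(combine, pairs, {}))   # {'the': 3, 'quick': 1, ...}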

Danno | Oct 03, 2008 | 5:55PM

OMG the entire INTERW3B at once!!!11ONE

C'mon. They have the text of every web page that they feel is worth keeping track of. Gigantic, to be sure, but hardly the entire internet.

Sean Benton | Oct 03, 2008 | 6:25PM

I'd like to know what Chris Date and Fabian Pascal have to say about some of these concepts. And IIRC they haven't been too impressed with Stonebraker's logic in the past.

The point of a relational database is to represent some aspects of reality consistently and correctly - and hopefully fast enough to be useful. A lot of so-called "databases" - such as MySQL, which for a long time was barely a "database" and more like a "file manager" - have shown that trying to obsolete relational theory in the pursuit of performance carries considerable risks of making the reality they represent considerably less correct and consistent - and sometimes not even faster.

A fast database that produces wrong results is nothing to the purpose.

Richard Steven Hack | Oct 03, 2008 | 7:05PM

Databases are slow because they are dependent on I/O speeds. I remember hearing several years ago that if a CPU cycle were scaled to 1 second, a disk read would take 2 weeks to complete. I doubt that relationship has improved by even 1 order of magnitude.
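
The exact figure depends on the numbers you assume, but the order of magnitude is easy to check; a quick back-of-the-envelope in Python, with an assumed 10 ns CPU cycle and a 10 ms disk access:

    # Back-of-the-envelope check of the analogy (assumed figures, not measurements):
    # scale a CPU cycle up to one second and see how long a disk read then takes.
    cpu_cycle_s = 10e-9    # assume a 10 ns cycle (a ~100 MHz CPU of that era)
    disk_read_s = 10e-3    # assume a ~10 ms random disk access

    ratio = disk_read_s / cpu_cycle_s      # cycles "wasted" per disk read
    scaled_days = ratio * 1.0 / 86400      # with 1 cycle == 1 second
    print(f"{ratio:.0e} cycles -> about {scaled_days:.1f} days")   # ~11.6 days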


So all applications that require scalable, high performance should avoid I/O as much as possible. In those markets, databases will be a hard, if not impossible sell. But while those markets are growing, it hardly means that they are causing markets where databases make sense to shrink. Also, the level of expertise required to build HPC applications is pretty expensive, and likely to remain so. There are probably some at the low end that would accept the performance hit in return for the cost savings.


Meanwhile, I expect that Larry Ellison, et al, are considering how they might be able to enter those markets. But I'm pretty sure they aren't losing any sleep over it. If anything, they are more concerned about the low end DBs, such as MySQL.


Later . . . Jim

JJS | Oct 03, 2008 | 7:22PM

William Pietri wrote: "A disappointingly large number of modern developers have never built production software without having an SQL database involved. They quite literally can't conceive of other options. It's the modern equivalent of 80-column mind."

Actually a disappointingly large number of modern developers have built lots of production software without understanding the underlying relational database. They conceive of all kinds of other options: XML databases, thousands of hand-carved text and binary formats, serialized objects, object-oriented databases, rediscovered fossils like hierarchical and network databases, etc. And every one of those options is inferior to a relational database in a serious application (one where data integrity matters). I'm amazed at the lengths programmers will go to in their quest to avoid learning how to design and use a relational database.

"That's not to say that databases aren't useful for some problems, but they're terrible for many others. Not knowing the other approaches means they can't recognize, like the credit card processing team you mention, that they're going about things the wrong way."

The implication is that relational databases are only appropriate for some small subset of data management problems, and they are only widespread because of programmer sloth. That opinion demonstrates significant ignorance of the history of relational databases and why the many earlier data management solutions were replaced. If any of the alternatives mentioned were superior in even a couple of dimensions they would not be niche or toy products; does anyone think hundreds of thousands of customers pay Oracle and Microsoft licenses because they are too stupid to adopt better and cheaper technologies that are right in front of their noses? If that was the case Appistry wouldn't have any customers and IBM would be delivering mainframes by rail.

The credit card processing mainframe anecdote in Cringely's article doesn't ring true. As described, the problem that prevented Appistry from delivering as promised sounds more like a lot of issues and assumptions in the entire application architecture, not a fundamental flaw in relational databases per se. And the example of so-called "false dependencies" is not realistic, either; it's not credible to claim that simultaneous queries lock up entire tables or single-thread all database operations in any modern enterprise-worthy RDBMS. Either the story is a canard or some serious application design issues are being blamed on the database.

Since credit card transactions don't have interdependencies during processing (validation and posting) that can't be reconciled after they are all posted, the whole story comes off as manufactured to illustrate yet another "end of the relational database regime" story. These kinds of stories continuously popping up on technical forums are analogous to the persistent and equally silly claims of car engines that run on water, and how great things would be if only the big companies weren't holding the miracle technologies back.

Greg Jorgensen | Oct 03, 2008 | 7:31PM

name says it all. please stop trying to re-invent things that work perfectly well

i agree with greg | Oct 03, 2008 | 8:21PM


lol.

That approach is only good for read only data and relatively simple transactions. Once you move into complicated transactions you have to maintain data integrity. To do that you need read consistency and you have to store data on disk because servers crash. If you have multiple copies in memory you have to keep them in sync because you need read consistency.

Google doesn't have to worry about any of that. If something is not in memory, they can read it in. If something gets wiped out completely, they can just go to the internet and get the data again. There are no transactions.

The "cloud" isn't going to replace databases. This is ludicrous.

dimitri | Oct 03, 2008 | 8:25PM

Want a highly parallel and faster than fast DB? Try MonetDB. Those guys are a decade ahead.

Just have a look at their papers and their already available code.

Alecco | Oct 03, 2008 | 9:09PM

Those living in the internet programming world can draw a lot of false lessons from it. Google is a very big, very impressive application for a lot of reasons, and a lot of internet-oriented problems map to it well. But those problems aren't actually as relevant to enterprise applications as a lot of programmers think they are.

Google is, from the perspective of an application architect, almost entirely a read-only application. The problems that databases are meant to solve don't affect it much, because the overwhelming majority of things that google does don't involve ACID transactions. When you can serve all your front-end activities out of a cache, you don't really benefit much from a proper database.

Applications like that get a lot of press and a lot of attention these days, but most applications aren't like that.

Yes, ACID databases have a lot of room to improve in scalability. Yes, utility computing is likely the scalability model they'll need to adapt to. But that scaling doesn't require dumping the traditional requirements of a database...it just means changing the implementation, so that only the things which must be centralized (locking, basically) are centralized. There's no intrinsic reason that primary data and locking metadata need to be stored and delivered the same way, the way traditional databases do it. But separating them isn't a rejection of the database model...just a re-engineering of it.
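
A toy sketch of that split, with only the lock manager centralized; every class and name here is hypothetical, just to illustrate the shape:

    # Centralize only the coordination (locking); keep the data itself on
    # separate shards. In-process toy, not a real distributed system.
    import threading

    class LockManager:                        # the only centralized piece
        def __init__(self):
            self._locks = {}
            self._guard = threading.Lock()
        def acquire(self, key):
            with self._guard:
                lock = self._locks.setdefault(key, threading.Lock())
            lock.acquire()
        def release(self, key):
            self._locks[key].release()

    class Shard:                              # data storage is distributed
        def __init__(self):
            self.data = {}

    shards = [Shard() for _ in range(4)]
    locks = LockManager()

    def write(key, value):
        locks.acquire(key)                    # centralized coordination...
        try:
            shards[hash(key) % len(shards)].data[key] = value   # ...distributed data
        finally:
            locks.release(key)

    write("account:17", 125.00)
    print(shards[hash("account:17") % 4].data)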

So yes. Databases have a lot to learn from the cloud model, and those database vendors who don't learn it are going to be crowded out of the market soon by those who do. But that's a very different thing than simply rejecting the database model entirely in favor of something supposedly "better" which in reality simply dumps over the side _necessary_ features in the name of scalability.

Matt | Oct 03, 2008 | 9:15PM


There does seem to be an interesting convergence happening - memcache, Amazon SimpleDB, attribute-value, stream databases, popularity of functional programming concepts, map-reduce, multithread / manycore.

Think of the overhead and the layers of cruft needed to make a web app with a data store back-end...

We never question this assumption... but, I think the ugly duckling is clearly SQL - we need to throw that away, rather than extending it [for stream data clauses etc]

Replace it with a legit modern programming language [be it python/arc/lisp/ocaml/ruby/F#...] which has succinct, readable ways to express operations on data. Have an event mechanism built into the language - you won't need to have 12 flavours of stored procedure dialects just to run some code when data changes, or new data arrives.
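
A rough illustration of that idea in Python, with invented data; the comprehension plays the role of the query and an ordinary callback plays the role of the trigger:

    # Succinct operations on data plus a built-in "event" hook, instead of
    # SQL plus a stored-procedure dialect. Purely illustrative.
    orders = [
        {"customer": "alice", "total": 40.0},
        {"customer": "bob",   "total": 15.5},
        {"customer": "alice", "total": 22.0},
    ]

    # the "query": a comprehension instead of SELECT ... WHERE ...
    big_spenders = {o["customer"] for o in orders if o["total"] > 20}
    print(big_spenders)                       # {'alice'}

    # the "trigger": a plain callback fired when new data arrives
    listeners = []
    def on_insert(fn):
        listeners.append(fn)
        return fn

    def insert(order):
        orders.append(order)
        for fn in listeners:
            fn(order)

    @on_insert
    def recheck(order):
        if order["total"] > 20:
            print("new big spender:", order["customer"])

    insert({"customer": "carol", "total": 99.0})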

There is a better way.

gord | Oct 03, 2008 | 10:17PM

Relational databases hold data in a relational model.
That model is modelling some part of a business process or application process.
Relational modelling is based on Relational Calculus.

If you need ACID compliance you need it. If you believe that in-memory everything with occasional writes gets you enough D (Durability), so be it; but when the whole stack dies ... whoops, what was the state of the data?

At the root of this, the question is: Do you care about your data or not?
Does Google care about whether a page is slightly out-of-date, or changed since the last spider... no.
Does the bank care if you withdraw money from an ATM and your account balance remains the same... yes.

Paul | Oct 04, 2008 | 1:37AM

This happens to all recording systems as the price of recording drops and reporting capability grows.

Early papyrus/clay tablet records held things that were really important like religious texts, crop records and histories.
Early paper writing in the west held the same and financial records.
Hardly anyone could read these things.

As recording writing became cheaper (the printing press) we started to produce pictures (yes, porn) and fiction.
At the present time, the cost of writing and recording text and pictures on this scale is very small. We deliberately write complete trash for fun and enjoy reading it.
There is also mass literacy.

The early records held on a computer were about valuable things (crop records and financial details) and we worked out ways of ensuring that anyone using that data in any way would view coherent information and, if changing anything, leave it in a good state. This is called SQL.
If you structure your database correctly, you can make it behave in a fairly parallel manner - or not. Look at large ERP systems.
Hardly anyone could read these things.

The cost of recording and reporting on data is now also very small, we are in the early stages of datafiction and dataporn - where we can make money out of reporting (and recording) on something that is probably not completely correct - and never will be as it always changes.
We are heading towards mass (Google-type) literacy.

We have travelled in a strange loop back to the place of the caveman, who would observe nature and draw conclusions, sometimes inaccurately as the weather changes.
We now observe Google and draw conclusions, sometimes inaccurately as the index changes.

andrew | Oct 04, 2008 | 3:29AM

A whole host of techniques exist such as master-slave, table partitioning across servers, solid state or in memory storage, application level caching, database caching etc. All these allow databases and applications to scale depending on the usage, read/write ratio and other factors.

They all have one thing in common - Smart people using all the above tools correctly to meet the needs of the application or service.

So the big change is that a database is not an install-and-forget solution to your data-storage needs.

Mark Hewis | Oct 04, 2008 | 5:47AM

Can I please make one gripe: The spelling is "metre". The French invented this system of measurement, and the name they use is "metre".

As my high school physics teacher used to say, there is a big difference between a micrometer (a device that measures little things), and a micrometre (10^-6 metres).

Damien | Oct 04, 2008 | 9:21AM

Somehow, Google's very read-intensive, if not read-only, application can't quite be the generic model against which the other 99% of applications in our computerized world should be measured. It really isn't representative - not that we can't learn a lot from the architecture, but it is a very different animal. Furthermore, many of those false dependencies that Bob writes about are called "data integrity" actions. Actions that make sure debits match credits and your flight reservation matches the overall airplane manifest of available seats, for example. These are the types of "boring" applications that we are dependent on daily.

It is further worth noting, or at least emphasizing, that of the zillions of applications that use databases, only a small fraction are candidates for moving to a cloud model. The vast majority are just not that big, or not growing all that fast, so conventional computing in either client-server or mainframe architectures remains sufficient. Don't count Larry out so fast!

steve | Oct 04, 2008 | 11:06AM

There are several aspects to data storage. One is how the application accesses it, and the other is how it's stored. (Others are how transactions are done correctly, etc.) I think Cringely is really talking about the application interface, specifically how the application-database system performs when the application (or applications) is (or are) making lots of independent queries at the same time. Whether transactions are always correct or whether the data is safely stored and not lost due to failures, is mostly a separate issue. But is making lots of SQL queries and waiting for them to return the best way for your application to get stored data? There are other ways. It could still all be backed by a "database", but the interface can be considered and designed for the best match with the application.

fool | Oct 04, 2008 | 12:14PM

Damien, your name isn't spelled that way either. The Druids borrowed it from the Huns, who borrowed it from the Mongolians, who borrowed it from the Chinese, who spell it Damian in today's pinyin system. Or if you really want to use the proper spelling, use the Chinese characters. Sorry, I can't type them here.

Whatever | Oct 04, 2008 | 1:18PM

I think that Larry is as worried about the end of databases as Bill Gates is worried about the end of the desktop.

IT doesn't stop on a dime; it's a huge oil tanker moving in the ocean. This would explain the amount of production COBOL code and why you'll never see a Linux desktop.

I believe in this new technology, but it's hard to convince people to give up their SQL knowledge, etc.

Alex Birch | Oct 04, 2008 | 2:00PM

Interesting column - thanks! With 49 state attorneys general calling for minors' age verification (which has scary implications for children's privacy), it'd be interesting to hear how this sea change would affect their privacy. Not at all? Would all the data still just be archived in databases?

Anne Collier | Oct 04, 2008 | 5:50PM

I have to disagree with the basis of this article. I do agree with many of the RDBMS posters.

A) Many programmers do not understand the black box they call a database. They treat it as a read/write mechanism, like the OS's I/O layer.

B) Relational databases were created to keep a transaction consistent until it is complete. Meaning multiple related components (inserts, updates, deletes) all in one consistent image that is not exposed to the world unless every part of it is done.
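
A minimal illustration of that all-or-nothing behaviour, using Python's built-in sqlite3 with an invented accounts table:

    # Two related writes either both happen or neither does.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance REAL)")
    db.executemany("INSERT INTO accounts VALUES (?, ?)",
                   [("checking", 100.0), ("savings", 0.0)])
    db.commit()

    try:
        with db:   # one transaction: commits on success, rolls back on exception
            db.execute("UPDATE accounts SET balance = balance - 60 WHERE name = 'checking'")
            db.execute("UPDATE accounts SET balance = balance + 60 WHERE name = 'savings'")
            raise RuntimeError("crash before the transaction commits")
    except RuntimeError:
        pass

    # Both writes were rolled back together; no half-finished transfer is visible.
    print(db.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())
    # [('checking', 100.0), ('savings', 0.0)]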

This logic can be broken out. I think programmers in the parallel world will find the need for read consistency across multiple threads a must. So they may be trying to build a better RDBMS, but what they are doing is componentizing the RDBMS idea into threadable pieces of work.

There is a massive difference between read-only (data warehousing) and OLTP RDBMSs. Each has a reason for existence; however, it is much easier to take a Google-style simulated data warehouse RDBMS and re-create it using a custom app than it is to replace the logic provided by an OLTP system.

Sorry, but you are a bit off here.

Rob | Oct 04, 2008 | 6:42PM

Rob, you arrive at the key insight through a different route -- you assume that read consistency is important for web applications, but it's not! Not all the time, anyway.

The standard database theory idea of "Atomic, Consistent, Isolated, Durable" (ACID) gives way to a new idea.

This paper (from 1998!) expressed the idea as "BASE" (geddit?): "Basically available, soft-state, eventual consistency" (http://www.ccs.neu.edu/groups/IEEE/ind-acad/brewer/sld001.htm). You don't need the bid price of your eBay auction to be exactly correct at the time you see it -- network latency means that by the time it's on your screen, it might be wrong anyway. But as long as you can guarantee that *at the close of the auction* the price is right, that's good enough.
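
A crude sketch of that eventual-consistency idea in Python; the replicas and the replication queue are simulated in-process, nothing here is a real system:

    # "Eventually consistent": writes land on one replica and are propagated
    # lazily, so reads may briefly see stale data but converge.
    import collections

    replicas = [{"highest_bid": 10}, {"highest_bid": 10}]   # two copies of the auction
    pending = collections.deque()                            # simulated replication lag

    def write(key, value):
        replicas[0][key] = value          # accepted immediately on one replica
        pending.append((key, value))      # copied to the others later

    def read(key, replica):
        return replicas[replica][key]

    def replicate_one():
        key, value = pending.popleft()
        replicas[1][key] = value

    write("highest_bid", 12)
    print(read("highest_bid", 1))   # 10 -- stale, and for a bid display that's fine
    replicate_one()
    print(read("highest_bid", 1))   # 12 -- consistent by the time it matters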

That insight has led to a huge amount of innovation over the past ten years, including Google's MapReduce.

Brendan.
(PS the backup slides in that presentation are quaintly funny, comparing HotBot to AltaVista and Yahoo! Ah, times were simpler ten years ago...)

Brendan | Oct 04, 2008 | 9:14PM

Isn't this much like the whole Prevayler/object prevalence idea which has been around for the better part of a decade now? Keep the transactions in memory and just use a plain file system as a log + full state snapshots, so that in the event of failure the current state can be recovered by recommitting all logged transactions since the most recent snapshot.
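
Roughly, the recipe looks like this; a toy Python version (Prevayler itself is Java, and real implementations add periodic snapshots so replay starts from a checkpoint rather than the beginning of the log):

    # Object-prevalence sketch: keep all state in memory, append every command
    # to a log, and recover after a crash by replaying the log.
    import json, os

    LOG = "commands.log"
    state = {}                                   # the whole "database", in memory

    def apply(cmd):                              # the only way state ever changes
        if cmd["op"] == "set":
            state[cmd["key"]] = cmd["value"]

    def execute(cmd):
        with open(LOG, "a") as f:                # 1. durably log the command
            f.write(json.dumps(cmd) + "\n")
            f.flush()
            os.fsync(f.fileno())
        apply(cmd)                               # 2. then apply it in memory

    def recover():                               # on restart, replay the log
        if os.path.exists(LOG):
            with open(LOG) as f:
                for line in f:
                    apply(json.loads(line))

    recover()
    execute({"op": "set", "key": "balance:42", "value": 100})
    print(state)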

test test | Oct 05, 2008 | 12:04AM

BTW, there's a Common Lisp version of object prevalence, cl-prevalence, that's quite nice.

test test | Oct 05, 2008 | 12:07AM

Brendan,
Your statements "Not all the time, anyway" and "You don't need the bid price of your eBay auction to be exactly correct at the time you see it -- network latency means that by the time it's on your screen, it might be wrong anyway. But as long as you can guarantee that *at the close of the auction* the price is right, that's good enough." prove my point.

Databases have different types of reads in their system. You can turn on "dirty reads/writes" to meet this same workflow. However, the point remains: you cannot bid on an item until it exists.

Databases have this logic in them. The fact that programmers do not know when to utilize/leverage the dirty read vs. the consistent read, and are re-creating the RDBMS at the application layer, is what I was agreeing to.

The transition is to integrate the RDBMS logic into the application layer, not to avoid it. That means for Larry the huge move to applications, and providing more RDBMS application-logic components that they can license, is the right transition.

That is why I disagree with the article. If you think Oracle does not understand the extremely large data market or enterprise application requirements, then perhaps you should interview Larry and/or a VP in charge of application technology for Oracle or Microsoft.

This is the direction they will go: objects you include in your application that allow your threads to utilize the RDBMS logic and provide the management and prioritizing of the threads.
Maybe you and I are agreeing; however, it does not sound like it. My point is simply "the RDBMS will exist". How it is customized, in which part of the application, and who the major players in the market are, is where I disagree. Also, a standard RDBMS can be leveraged today, if you know the RDBMS you are using. Some logic may be included on the application side, which is why Oracle makes so much money developing applications for people using their RDBMS today.

I think Mr. Cringely stepped out of his safe zone on this one.

Rob

P.S.
Custom long-transaction logic has been around for years. Ford developed one form that is similar to another in use by ESRI and other companies. They allow the multiple-write, multiple-read, and read-consistency issues to scale using a standard RDBMS. But you have to understand the RDBMS and how to leverage it at a level not many outside of the vendor or the Oak Table have.


Rob | Oct 05, 2008 | 12:15PM

Great post. Parallel computing is the only way to keep up with data volumes as the world continues to digitize. Of note, MapReduce and other innovations in distributed computing are making their way into commercial databases, as well. In-Database MapReduce (http://www.asterdata.com/product/mapreduce.html) is a method to unite the power and analytic expressiveness of MapReduce with the structure and features of SQL that those in the relational database world love (and require). Doing this, many of the bottlenecks in traditional RDBMS are overcome, and businesses can take advantage of parallel computing without having to throw out the reports/analytics/applications which rely on an RDBMS.

Steve Wooledge | Oct 06, 2008 | 11:45AM

How about this type of solution: run the DB on a memory disk. It would require a special version, and we would need to do some additional work to occasionally write information to the disk.

So we have a "dirty" flag that signals the data needs to be updated, and during some idle time it gets written. All data, though, would reside on the RAM disk, making it all very fast.
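
That is essentially a write-behind cache; a toy sketch in Python (the backing file name and the timings are invented):

    # Keep everything in memory, mark it dirty on writes, and flush to disk
    # during idle moments.
    import json, time, threading

    BACKING_FILE = "snapshot.json"
    data = {}                  # the whole store lives in RAM
    dirty = threading.Event()  # "needs to be written out" flag

    def write(key, value):
        data[key] = value
        dirty.set()

    def flusher():             # background thread: persist on idle, if dirty
        while True:
            time.sleep(1.0)    # pretend this is "some idle time"
            if dirty.is_set():
                with open(BACKING_FILE, "w") as f:
                    json.dump(data, f)
                dirty.clear()

    threading.Thread(target=flusher, daemon=True).start()
    write("order:1", {"total": 19.95})
    time.sleep(1.5)            # give the flusher a chance to run
    print(open(BACKING_FILE).read())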

Everything would have to have powerful and reliable UPS systems but it could really be useful and thus reduce the I/O times for the underlying system.

TonyK | Oct 06, 2008 | 3:16PM

Parallel computing will never be a mainstream development methodology. Almost all programs that we run are inherently serial. There just isn't enough parallelism to exploit in them. Parallel computing has been around for years and it is still in the same place as it was. It is used for computation-heavy, simulation-type work.

parallelMan | Oct 07, 2008 | 10:21AM

Small business is the underlying fabric of our economy. I don't see small businesses using a cloud computing application unless they're contracting with a service provider who does. I'm not worried about my sqlserver and mysql knowledge becoming obsolete anytime soon.

Lareman | Oct 07, 2008 | 12:43PM

You should check out CouchDB:

http://incubator.apache.org/couchdb/

"Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.

CouchDB is written in Erlang, but can be easily accessed from any environment that provides means to make HTTP requests. There are a multitude of third-party client libraries that make this even easier for a variety of programming languages and environments."
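
As a quick taste of that HTTP/JSON interface, a short Python sketch; it assumes a CouchDB instance is running locally on the default port (5984), and the database and document names are made up:

    # Talking to CouchDB with nothing but HTTP and JSON.
    import json
    import urllib.request

    BASE = "http://127.0.0.1:5984"

    def put(path, doc=None):
        body = json.dumps(doc).encode() if doc is not None else None
        req = urllib.request.Request(BASE + path, data=body, method="PUT",
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    print(put("/pulpit_comments"))                 # create a database
    print(put("/pulpit_comments/comment-1",        # create a document
              {"author": "Noah Slater", "text": "You should check out CouchDB"}))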

Noah Slater | Oct 08, 2008 | 7:27AM

Exadata.

J Peters | Oct 09, 2008 | 10:50AM

Encyclopedias used to be a shelf full of books. Along came the CD-ROM (and DVD later) and that shelf vanished. But the rest of the books stayed.

For the Googles, Amazons and EBays, "data" was too cumbersome for the old database paradigm. Vast quantities of data, lots of concurrent users, minimal consistency issues. They used and developed new technologies.

But the vast majority of existing systems work the way they are, and won't need or benefit from any revolutionary changes in that super-scale field.

Gary | Oct 10, 2008 | 5:29PM
