Truth, Fads and Principles: 
What's Wrong with the Database Industry? 

An Interview with Fabian Pascal

This issue of Data Discussions is with Fabian Pascal, an independent industry analyst, consultant, author and lecturer specializing in database management.  What makes Mr. Pascal especially “unique” is his reputation for taking a highly critical, contrarian position in almost all his public opinions.  For example, he’s a prolific writer who calls his regular columns “Setting Matters Straight” (quarterly, on www.TDAN.com) and “Against the Grain” (on www.DBAzine.com).  His own web site is called DATABASE DEBUNKINGS (www.dbdebunk.com).  And after a recent presentation, I told him the audience obviously liked his talk, to which he replied “I must have done something wrong.”  Clearly he doesn’t suffer fools, so lest I demonstrate my own ignorance and incur his wrath, I enlisted some help for this interview from my colleagues Bob Seiner of TDAN.com and Craig Mullins of BMC Software and DBAzine.com.  Both gentlemen are well aware of Fabian’s hot buttons.

Tony Shaw, Wilshire Conferences (Wilshire): Fabian, you’ve successfully positioned yourself as the curmudgeon of the database world.  Your opinions are highly likely to disagree with, and in many cases anger, vendors and the trade press.  This is rather an unusual marketing strategy for a consultant, yet you seem to relish the reputation.  Have you always had this contrarian personality or was there a pivotal moment of disillusionment that put you on this path?  

Fabian Pascal (FP): I’ve been called worse, which does not bother me. If telling the truth and relying on science--rather than uninformed personal opinions, poor reasoning, or regurgitating vendor press releases--invites anger from vested interests—which is expected--so be it. This is what marketing should be based on, it should be the rule, not the exception. The issue is not disagreement per se, but the basis for it: it’s OK to disagree, but you better reason and ground your position in knowledge. It is that which is lacking in the industry: people disagree for no good reason, they just don’t know what they’re talking about and, what is worse, they don’t want to know.  

The question is how you define success. There is little doubt that I would have been much better off financially had I not bucked the industry. But I must be able to look myself in the mirror, and I wouldn’t be able to do that had I been saying, like everybody else, that which is popular, but incorrect. So my measure of success is the degree to which I do not compromise on principles.

Wilshire: You are particularly critical of the corruption of the relational model by vendors.  And I’ve read where you’ve said there are no relational database management systems, but rather they are all SQL DBMSs.  Can you explain please?

FP: The fact is that the vast majority of professionals do not know or understand the relational model and, what is worse, they do not bother to learn it; but that does not stop them from criticizing it. If they really knew it, they would be able to compare the products to it and see what the discrepancies are, as well as their practical implications. Without such background it is very difficult to understand and appreciate what’s wrong.

I keep hearing “If the relational model is so good, why hasn’t it been implemented right yet?” But this is a copout. The answer is implicit in the question: If you don’t bother to educate yourself on the subject, why should you rely on somebody else to do the right thing and bring you the right solution, and how could you tell that’s indeed what it is? In that state I could sell you anything, and vendors do. Why should vendors bother, if their customers buy all the marketing nonsense (like “it’s theory, and therefore not practical”) and whatever fad they come up with? If vendors or the press tell you that “post-relational” DBMSs are better than RDBMSs, when in reality they are old, nonrelational technologies, how can you figure it out if you don’t know what an RDBMS is, and what it’s supposed to do for you? It’s so much easier for vendors to sell to uninformed customers.

The relational model is simply the application of logic to database management. Whether practitioners are aware of it or not, whether they like it or not, databases are collections of predicates and DBMSs are essentially logic inference engines. And they had better like it, because logic is what guarantees correctness of the information recorded in databases, and of the answers obtained from them (with correctness defined as consistency, both internal and with the business rules in effect). Can anybody say with a straight face that corrupting the foundation of database management is acceptable? Yet, this is precisely what they say when they ignore or violate relational principles. Do any of the proponents of other approaches or technologies really believe they can replace logic as a basis for database management, and what exactly do they propose to replace it with that is better? I have yet to get an answer to this question.
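[Editor's illustration, not part of the interview.] The idea that a database is a collection of predicates and a query is an inference can be sketched in a few lines of Python. The predicates, names and data below are hypothetical, chosen only for illustration; each tuple stands for a proposition asserted to be true, and the derived relation contains exactly the propositions that follow from the stored ones.

```python
# Hypothetical predicate: "Employee EMP works in department DEPT."
# Every tuple in the set is a proposition asserted to be true.
works_in = {("alice", "sales"), ("bob", "sales"), ("carol", "research")}

# Hypothetical predicate: "Department DEPT is located in city CITY."
located_in = {("sales", "london"), ("research", "paris")}


def join_and_project(r, s):
    """Derive the predicate "Employee EMP works in city CITY".

    Joins the two relations on DEPT and projects DEPT away, so each
    result tuple is a proposition that logically follows from one
    tuple in r and one tuple in s.
    """
    return {
        (emp, city)
        for (emp, dept) in r
        for (dept2, city) in s
        if dept == dept2
    }


works_in_city = join_and_project(works_in, located_in)
```

Because relations are sets, the result contains no duplicates, and a proposition that cannot be derived from the stored facts, such as ("alice", "paris"), is simply absent from it.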

SQL itself, as well as its commercial implementations, failed to adhere to the model and violated it in a plethora of ways, all of which cause numerous practical problems. What is more, it is a poorly designed language, difficult to implement and, therefore, products suffer from serious implementation flaws on top of relational and language weaknesses. Ted Codd, Chris Date, David McGoveran, myself and others have amply documented this so anybody who is interested can find the information in our writings. It is generally thought that the problems are due to SQL products being relational, while in reality they are due to their not being relational enough. If logic is ignored, what do you think the consequences will be? It’s like building bridges ignoring the laws of physics.

Wilshire: Well I’m guessing you’re not on Larry Ellison’s Christmas card list. Yet, for the time being we’re stuck with the products that the vendors have built for us today.  So what advice do you have for practitioners who have the bulk of their data in SQL database systems?  What can they do to mitigate the potential problems and pitfalls you’ve identified?

FP: There is no magic solution; if products are bad, they’re bad. Learn data fundamentals and the relational model and assess technologies, products and practices accordingly (you can take my, or Chris Date’s seminars for that purpose :)). Know and understand what the deficiencies are and how to minimize their impact. Don’t rely exclusively on vendors, and do not believe anything you read in the trade press. Have your own, solid base of knowledge to evaluate things, not industry claims.

Without such knowledge there is no reason to assume that what practitioners and the industry are doing makes sense. I mean, consider: SQL DBMSs, object DBMSs, “universal DBMSs”, multivalue DBMSs, XML DBMSs--do we really need a new technology every few years? Why, if each is claimed to be the right solution? Do our informational needs fundamentally change so often? That in itself indicates something is wrong. It’s profitable for the industry, but a costly proposition for users.

Wilshire: Craig points out that you’ve written about a “True” Relational DBMS (TRDBMS). What would it take to build one?  And what would happen if some vendor ever did create one? How would that vendor help to make the TRDBMS successful, and what would happen to all the legacy SQL out there?

FP: It is difficult to develop and sell a TRDBMS, given the way in which the industry operates. The big vendors, like Microsoft, Oracle and IBM, are vested in their existing technologies, with large installed user bases; they are not likely to make fundamental changes. And a small company does not have the resources to compete with them.

Be that as it may, there are two startups that came up with some goods. One is Alphora, which implemented Dataphor, a product based on the proposals of Chris Date and Hugh Darwen in THE THIRD MANIFESTO. The other is Required Technologies, but unfortunately I cannot say much about this one for all sorts of reasons. Stay tuned to DATABASE DEBUNKINGS! There is little doubt that these are superior to SQL products (although they are forced, unfortunately, to support SQL) and much closer to the model, and those who tried Dataphor recognized the benefits. But they require informed, educated users who take risks, and I’m afraid that’s a scarce commodity.

It’s precisely because languages never die and make migrations very costly and difficult, that things should be done right in the first place. But the industry operates the wrong way: they keep forcing users to migrate and map constantly, instead of doing productive work. It’s really mindless.

Wilshire: Both Craig and Bob suggested I would get a strong reaction if I ask you about XML – both as a language, and the use of XML for data management. 

FP: XML was invented by text publishers, who had no knowledge of data management, purportedly for data exchange. But exchange requires a physical format, not a data model. First, there are tons of formats in the industry and any one could have been used; why invent yet another? And second, XML is actually a bad physical format for exchange; it is highly and unnecessarily inefficient, to the point where it is increasingly violated to get performance out of it.

Now they are adding a data model to it, to be able to do any data management (see Tags Do Not a Language Make) and, as Chris Date points out, the first thing they had to do to define their “model” was to discard the notion of an XML document as the fundamental data object! What can you conclude from this fact? The model they did come up with is the same hierarchic model which we discarded 30 years ago and replaced with SQL, because it was too complex, inflexible and lacked rigor. I call the whole insanity “The Exchange Tail and the Management Dog”, the title of my new seminar. Would such regressions be accepted if practitioners understood data fundamentals? No way.

Wilshire: There are obviously lots of people and companies that are pushing XML as a standard for data integration.  If XML is not the answer, what is?

FP: Whenever so many push something so hard, it is suspect: it smacks of yet another fad.

Integration is to IT what motherhood and apple pie are to American culture: everybody wants it and promises it. But very few really understand what it means. For example, they are talking about the “semantic web” and all that jazz. But XML came out with very little semantics (no integrity and no manipulation), so what kind of integration can you have based on that? And how comfortable should you be with semantics added post-hoc to a physical data format, and without a theoretical foundation?

For data management you need truly relational DBMSs (TRDBMSs). For data exchange you can have any efficient physical format, as long as it is agreed upon. But the industry has never been able to agree even on standard physical formats; how likely are they to agree on semantics, particularly something so complicated as the semantics of hierarchies? They can’t even come up with specifications because of this. When XML pushers talk about it, it is almost always about structure, not the purpose of the structure, integrity and manipulation. And it’s there that complexities crop up. That’s what the relational model was devised to eliminate.

Wilshire: At the conference coming up in April (the DAMA International Symposium and Wilshire Meta-Data Conference in Orlando, April 27-May 1) you’re conducting a workshop called “The Dangerous Illusion: Normalization, Performance, Integrity and the Logical-Physical Confusion.”  What will you be talking about in the workshop?

FP: This is an excellent example of how ignorance of fundamentals can lead people astray. Most practitioners believe that relational design—normalization—is bad for performance. But this is, of course, logically impossible: how can logic, which governs the truth or falsehood of propositions about the real world, have anything to do with implementation details and performance? Practitioners denormalize for performance, without realizing that if they get any performance gains (and that is by no means certain), it does not come from denormalization, but actually from ignoring the integrity implications of that redesign. In fact, they trade unlikely performance gains for almost certain corruption, without being aware of it.

When I bring this up with practitioners, most do not know what I mean. Some, however, say that they are aware of it and that they took the necessary steps to prevent corruption. But I know this is not true. First, when I ask them what are the integrity constraints that were added for this purpose, they have no idea how to formulate them (for that you need to know and understand the relational model). And second, had they added those constraints, they would have realized the futility of denormalization as a performance enhancer, because they would not have gotten any gains! Nobody who understands the fundamentals would want to denormalize for performance. And they sure would not blame the model, rather than the products.
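[Editor's illustration, not part of the interview.] The interview does not spell out the constraints in question. As a minimal sketch under assumed names, consider a hypothetical denormalized orders table that repeats the customer's city on every order row. The design is only safe if the functional dependency customer_id -> customer_city is enforced, and the check below shows what that enforcement has to verify.

```python
# Hypothetical denormalized rows: (order_id, customer_id, customer_city).
# The customer's city is stored redundantly on every order.
orders = [
    (1, "c1", "london"),
    (2, "c1", "london"),
    (3, "c2", "paris"),
]


def fd_violations(rows):
    """Return customer_ids recorded with more than one city.

    An empty list means the functional dependency
    customer_id -> customer_city holds for these rows.
    """
    cities_by_customer = {}
    for _order_id, customer_id, city in rows:
        cities_by_customer.setdefault(customer_id, set()).add(city)
    return sorted(c for c, seen in cities_by_customer.items() if len(seen) > 1)


def update_city(rows, order_id, new_city):
    """Change the city on a single order row, as a careless update would."""
    return [
        (o, c, new_city if o == order_id else city)
        for (o, c, city) in rows
    ]


before = fd_violations(orders)                           # no contradictions yet
after = fd_violations(update_city(orders, 1, "berlin"))  # customer c1 now has two cities
```

In this sketch, a single-row update leaves customer c1 with two different cities unless something like fd_violations is run on every change. That recurring check is the kind of extra work the answer above argues must be counted against any gain from denormalizing.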

Wilshire: And then later on you’re doing a Night School session on “To Laugh or Cry: More Fundamental Fallacies in Database Management.”  What new observations do you have in store for us?

FP: The first part of the title, “To Laugh or to Cry?” says it all. I will discuss blatant examples, from the trade media and industry practice, of the prevalent ignorance, and the high cost that you can end up paying if you don’t know your fundamentals well enough to see through it all. These are the “best” specimens from the tons I collect during the year, and it’s not easy to select them, believe me. I post others as weekly quotes at DATABASE DEBUNKINGS, but there are too many even for that.

Wilshire: Let me get your short takes on some of the “hot topics” in the data management community today.  What do you see as appropriate to “debunk” about these?

FP: I normally frown upon “sound bites”, which tend to contribute to the problems I’ve been referring to. These topics require time to discuss intelligently and meaningfully, particularly with an audience that is not strong on the fundamentals. So what I can do is provide links to writings on the subjects (either by Chris Date, or by me, or exchanges I had with readers).

a) UML

b) Dimensional modeling

c) OO (or OODBMS, your choice)

d) Business rules

e) ORM (Object Role Modeling): see c)

f) Agile methods

g) MySQL (the open source database)

Wilshire: I like this one--Bob suggested I ask you this question.  You’ve written widely about your admiration for the relational model.  But is there anything in the 30 years since then that you would consider to have been a positive “revelation”, or a contributing factor in the advancement of the database management industry?  Are there in fact ANY positive revelations or are they just, as you call them, fads?

FP: In science revelations are rare. Science advances slowly, gradually and carefully, via the efforts of many involved in thinking, writing, reviewing, discussing, testing, duplicating for validation, correcting errors, etc. And in this sense there has been considerable advancement. We know much more about the relational model, and understand it better, today than we did in the 70s; and we find new insights and benefits all the time, an indication that it was indeed a revelation. Progress is clear from works such as Date and Darwen’s THE THIRD MANIFESTO and, with Lorentzos, TEMPORAL DATA AND THE RELATIONAL MODEL. But it’s hard to come up with revelations such as Codd’s; they don’t come that often anyway, let alone in a field that operates the way the database field does.

Only in industry and business can somebody wake up one morning and think he invented something new, without ever bothering to check and find out that it was tried and discarded in the past (like the hierarchic model). This is how “revelations” such as XML or “universal DBMS” come about, and they turn out to be nothing of the sort. In fact, in my editorials and lectures I demonstrate how even academia has renounced its educational and scientific functions and is rapidly becoming a certifying and training vehicle for vendors, so not much can be expected from that quarter.

And yet once in a while, against all odds, some revelation does occur, and the implementation technology invented by Required Technologies promises to be one. How it’s going to play in the industry, however, is a tossup. The past is littered with superior technologies, products and practices that failed because of the inefficient ways of the market.

Wilshire: Finally Fabian, any predictions on what’s next for database technology, or for the industry?

FP: You wouldn’t know it from the media, vendors and pundits, but to be honest, rather than do the usual optimistic closing, I must say that things are continually deteriorating. Practitioners today know far less of the fundamentals than previous generations did, and if this trend continues, future generations will know still less. As I already stated, with all these fads coming and going, most of the time and effort goes into migrating from one fad to another, into mapping from one model to another, and into trying to make all the disparate technologies and acronyms work together. It’s a colossal waste of resources, but in the absence of knowledge and appreciation thereof (which is not only unrewarded, but punished), what’s there to stop it?

Let me close by quoting somebody much smarter than myself, who has recently passed away (thanks to Paul Vernon for bringing this quote to my attention). Note when he said it! Sound familiar?

"I hope very much that computing science at large will become more mature, as I am annoyed by two phenomena that both strike me as symptoms of immaturity.

The one is the widespread sensitivity to fads and fashions, and the wholesale adoption of buzzwords and even buzznotions. Write a paper promising salvation, make it a "structured" something or a "virtual" something, or "abstract", "distributed" or "higher-order" or "applicative" and you can almost be certain of having started a new cult.

The other one is the sensitivity to the market place, the unchallenged assumption that industrial products, just because they are there, become by their mere existence a topic worthy of scientific attention, no matter how grave the mistakes they embody. In the sixties the battle that was needed to prevent computing science from degenerating to "how to live with the 360" has been won, and "courses" -- usually "in depth"! -- about MVS or what have you are now confined to the not so respectable subculture of the commercial training circuit. But now we hear that the advent of the microprocessors is going to revolutionize computing science! I don't believe that, unless the chasing of dayflies is confused with doing research. A similar battle may be needed."

--E.W. Dijkstra, My hopes of computing science, 1979

Wilshire: Fabian, thank you. 





This "Data Discussions" is a series of interviews with leading data management experts and practitioners, presented by Wilshire Conferences.



©2003 Wilshire Conferences, Inc. May be forwarded to colleagues and quoted with full attribution.