Data Discussions is a series of interviews with leading data management experts and practitioners,
presented by Wilshire Conferences. Click here to sign up to receive future editions.
FORWARDING THIS NEWSLETTER TO YOUR COLLEAGUES IS ENCOURAGED.

October 16, 2003

Universal Data Models:
Simsion Interviews Silverston

Len Silverston
President
Universal Data Models, LLC
Graeme Simsion
Senior Fellow
University of Melbourne

This edition of Data Discussions brings together two of data modeling’s leading writers and educators. Len Silverston is one of the foremost proponents of, and authorities on, universal data models. His two books (The Data Model Resource Book, Volumes 1 & 2) are among the most widely read and recommended on the subject. Len’s questioner is Graeme Simsion – his Data Modeling Essentials (co-written with Graham Witt) is considered a seminal book in the field. And as readers of Data Discussions will know, Graeme also tends to be a provocateur, recognized for his willingness to question conventional wisdom.

Both Graeme and Len will be speaking at the upcoming Enterprise Data Forum in Philadelphia, November 3-6. Len is teaching a tutorial on Implementing Universal Data Models to Integrate Data, and Graeme will be chairing “the Modeling Forum” – an entire conference track devoted to current issues in Data Modeling.

Putting these two gentlemen together seemed like a good way to delve into the topic of Universal Data Models, so let’s see if we were right…

Graeme Simsion (Simsion): Len, basics first…What exactly is a Universal Data Model?

Len Silverston (Silverston): Webster’s dictionary defines “universal” to mean “applying to a great variety of uses: comprehending, affecting or extending to the whole”. Therefore, a Universal Data Model is a template or re-usable data model that is generally applicable and that can be used by a great number of organizations to save time and effort while offering holistic perspectives.

The idea of re-using common data models is more obvious than the other aspect of Universal Data Models: providing a holistic perspective. Data modelers (including myself) are sometimes focused on particular data requirements and may not always see the entire picture. This is where a “universal” view can help.

For instance, when building a product pricing data model, the modeler may model a PRODUCT PRICE COMPONENT entity without realizing that these price components apply not only to the base price of a product but also to other things, for example, discounts or surcharges that are based on geography, on an agreement, or on the type of customer. Therefore, setting up a more generic PRICE COMPONENT entity offers a more re-usable and holistic approach than having multiple entities to maintain pricing structures. Likewise, when developing a CRM application, instead of adding fields to the CUSTOMER entity, Universal Data Models can offer alternatives illustrating that maybe the name or contact information should be associated with a PERSON, ORGANIZATION or PARTY. This way the party’s information is consistent when this same party is involved in another role, for example as a PROSPECT or WEB SITE VISITOR.
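As an editorial illustration (not taken from the interview or from The Data Model Resource Book), the generic PRICE COMPONENT idea can be sketched in Python. The field names, the three component kinds, and the rule that discounts are percentages of the base price are all assumptions made for this example; a real model would be tailored to the organization.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class PriceComponent:
    """One generic pricing rule; optional fields narrow when it applies."""
    product_id: str
    kind: str                          # "BASE", "DISCOUNT" or "SURCHARGE" (assumed kinds)
    amount: Optional[Decimal] = None   # flat amount, used by BASE and SURCHARGE
    percent: Optional[Decimal] = None  # percentage of base price, used by DISCOUNT
    geo_area: Optional[str] = None     # None means "any geography"
    party_type: Optional[str] = None   # None means "any type of customer"

    def applies_to(self, geo_area: str, party_type: str) -> bool:
        return (self.geo_area in (None, geo_area)
                and self.party_type in (None, party_type))


def quote_price(components: List[PriceComponent], product_id: str,
                geo_area: str, party_type: str) -> Decimal:
    """Combine every applicable component for one product into a price."""
    applicable = [c for c in components
                  if c.product_id == product_id
                  and c.applies_to(geo_area, party_type)]
    base = sum((c.amount for c in applicable if c.kind == "BASE"), Decimal("0"))
    surcharge = sum((c.amount for c in applicable if c.kind == "SURCHARGE"), Decimal("0"))
    discount_pct = sum((c.percent for c in applicable if c.kind == "DISCOUNT"), Decimal("0"))
    return base + surcharge - base * discount_pct / Decimal("100")


# Hypothetical sample data: one structure holds base price, discount and surcharge.
COMPONENTS = [
    PriceComponent("P1", "BASE", amount=Decimal("100")),
    PriceComponent("P1", "DISCOUNT", percent=Decimal("10"), party_type="WHOLESALE"),
    PriceComponent("P1", "SURCHARGE", amount=Decimal("5"), geo_area="AK"),
]
```

The point of the sketch is that adding a new kind of pricing rule means adding a row, not a new entity.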

Universal Data Models include common data constructs applying to most organizations as well as industry specific data constructs. For example, common data constructs that apply to most organizations would include data models for information about people, organizations, roles, relationships between people and organizations, contact information, products, services, inventory, pricing, requirements, quotes, orders, agreements, shipments, projects, invoicing, payments, budgeting and accounting.

There are Universal Data Models for many industries that build upon these common constructs and offer additional extensions that may only be applicable to a certain industry. For example, a manufacturing Universal Data Model includes many of the previously mentioned common data constructs but also includes additional data constructs such as design engineering models. Likewise, the insurance Universal Data Models include additional common constructs for claims processing, which are actually an extension of the invoicing models since they both represent a request for reimbursement.

Additionally, there are data warehouse Universal Data Models offering common ways of modeling data warehouse and star schema constructs, for example for sales analysis, human resource analysis or financial analysis.
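For readers unfamiliar with the term, a star schema for sales analysis can be sketched as one fact table surrounded by dimension tables. The following is a minimal editorial example using Python's built-in sqlite3 module; the table names, columns and sample rows are hypothetical and are not the published Universal Data Model designs.

```python
import sqlite3

# In-memory database holding a tiny, hypothetical sales star schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_key   INTEGER PRIMARY KEY,
    product_name  TEXT,
    category      TEXT
);
CREATE TABLE time_dim (
    time_key       INTEGER PRIMARY KEY,
    calendar_year  INTEGER,
    quarter        INTEGER
);
-- The fact table holds the measures, keyed by the dimensions.
CREATE TABLE sales_fact (
    product_key   INTEGER REFERENCES product_dim(product_key),
    time_key      INTEGER REFERENCES time_dim(time_key),
    quantity      INTEGER,
    sales_amount  REAL
);
INSERT INTO product_dim VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Publications');
INSERT INTO time_dim VALUES (1, 2003, 3), (2, 2003, 4), (3, 2002, 4);
INSERT INTO sales_fact VALUES (1, 1, 10, 100.0), (1, 2, 5, 50.0),
                              (2, 2, 2, 30.0), (1, 3, 1, 10.0);
""")


def sales_by_category(connection, year):
    """Total sales amount per product category for one calendar year."""
    return connection.execute(
        """
        SELECT p.category, SUM(f.sales_amount)
        FROM sales_fact AS f
        JOIN product_dim AS p ON p.product_key = f.product_key
        JOIN time_dim AS t ON t.time_key = f.time_key
        WHERE t.calendar_year = ?
        GROUP BY p.category
        ORDER BY p.category
        """,
        (year,),
    ).fetchall()
```

Every analysis query follows the same shape: join the fact table to the dimensions of interest, filter, and aggregate.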


Simsion: How do clients respond to you having “ready made” answers before you’ve ascertained their requirements?

Silverston: I make it extremely clear to my clients that I do not know their particular requirements before I arrive and that it is extremely important to keep a very open mind when collecting information requirements. However, with a toolkit of many “best practice” data constructs for common data modeling problems, along with alternatives and the pros and cons behind these alternatives, data modelers (such as myself) can be much more prepared and data modeling efforts can be tremendously streamlined.

If a data modeler has available to him/her a set of Universal Data Models to re-use, then when a common requirement is stated, the template models can be applied. For example, if the data modeler discovers that they need to maintain customer demographics as well as various types of phone numbers, email addresses, and postal addresses for a customer, then they can apply Universal Data Models that have been through many iterations over the years and provide ideas about ways to best model these structures.
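To make the contact information example concrete, here is a small editorial sketch in Python of a party that holds any number of contact mechanisms and roles, instead of fixed phone and address fields on a CUSTOMER record. The class names, attributes and the kind and purpose codes are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ContactMechanism:
    kind: str     # assumed codes: "PHONE", "EMAIL", "POSTAL"
    value: str
    purpose: str  # assumed codes, e.g. "HOME", "WORK", "BILLING"


@dataclass
class Party:
    """A person or organization, independent of the roles it plays."""
    party_id: int
    name: str
    roles: Set[str] = field(default_factory=set)
    contacts: List[ContactMechanism] = field(default_factory=list)

    def add_contact(self, kind: str, value: str, purpose: str) -> None:
        self.contacts.append(ContactMechanism(kind, value, purpose))

    def contacts_of(self, kind: str) -> List[ContactMechanism]:
        return [c for c in self.contacts if c.kind == kind]


# Hypothetical usage: the same party is both a customer and a prospect,
# and its contact details are stored once.
acme = Party(1, "ACME Corporation", roles={"CUSTOMER"})
acme.roles.add("PROSPECT")
acme.add_contact("PHONE", "555-0100", "WORK")
acme.add_contact("PHONE", "555-0101", "BILLING")
acme.add_contact("EMAIL", "info@example.com", "WORK")
```

Because contact mechanisms are a list, a third phone number or a second postal address needs no change to the structure.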

I have made many mistakes in implementing database and data warehouse designs over the years. As a result, I have continued to find better and better ways of modeling data and implementing databases. The Universal Data Models provide not only my experience but lessons from many very experienced professionals about effective methods for modeling common constructs. Mistakes in data models and in databases are one of the most costly aspects of systems development since they represent a foundational component of the system. It is critical to be aware of pitfalls in modeling common structures so we don’t keep on repeating the same mistakes!

Another benefit of having “ready made” modeling solutions is that questions can be asked proactively regarding possible information requirements. For example, if the subject data area is customer relationship management, the template data structures can highlight possible areas to discuss, such as how non-solicitation requests are handled, whether preferred calling times are needed, or whether there is a need to maintain numerous last names, first names, or middle names for a person.


Simsion: Do you see your model for a given subject area as constituting the one right answer or only one possible answer?

Silverston: Yes, I have the only one right answer and that makes everyone else’s model wrong. I’m, of course, being facetious, but we’ve all seen this type of attitude in data modeling efforts and it does a disservice to our community. I believe the data management community is largely about integration, which involves working together – not proving each other “right” or “wrong”.

Pardon the tongue in cheek response, but my real answer is that the models I am providing are NOT the only one right answer for a subject data area, or even for a very specific data construct! In my opinion, there is no one right answer, especially when offering “universal” constructs that can be generally applied to different situations. In my books, I sometimes show alternatives to modeling various structures and point out the pros and cons of each. When consulting, I will often provide models that are not what I have in my Universal Data Model repository but they are variations of the Universal Data Models based upon the specific needs of the client.

I believe the debate over being “right” when it comes to modeling a specific data construct can be very costly. I have been involved in many efforts where data modelers debated the “right” way to model a particular construct. I was involved in one effort where the company spent 150 million dollars on a data model that was ultimately shelved, and a great deal of this was spent on very experienced modelers arguing over whose construct was “right”.

I also believe that knowing and understanding various perspectives and possibilities is very powerful. When data modelers have differences of opinions, I will usually ask them to model the data requirements as they see it and very often several excellent models emerge. From these various alternatives, along with their pros and cons, an informed decision can be made.

I have attended Karen Lopez’s (moderator of the Data Modeling List) outstanding conference session entitled “Data Modeling Contention Issues”. She brings up various issues in data modeling (such as abstract versus specific modeling, use of surrogate keys, use of a conceptual data model, and even the idea of using template models) to a group of experienced modelers and has participants publicly rate their responses on a scale from 1 (strongly agree) to 5 (strongly disagree). What I loved about attending this session is that even though there are near-religious debates about what participants believe is the “right” way, she constantly brings awareness that “the most successful discussions are ones where both sides learn something new about the other’s viewpoint”.

The question is not whether a Universal Data Model is the one “right” answer: the question is can I re-use a Universal Data Model construct that has successfully worked in other situations at other organizations, in this current situation? Does the Universal Data Model construct offer a useful perspective that I did not consider before I looked at it? How can this Universal Data Model save me time and effort in meeting a data requirement so I don’t have to re-invent effective models for modeling common constructs?

Simsion: How do your universal models compare with David Hay’s data model patterns?

Silverston: Again, the idea of having multiple perspectives is powerful. David Hay has provided valuable contributions to our field and has diligently and courageously offered his perspectives regarding ways to model common constructs in his books, articles, speaking and consulting. I believe that the basic idea behind “universal data models” or “data model patterns” is the same – offering possibilities about ways of modeling common constructs.

While we have both offered our data modeling perspectives on some of the same subject data areas such as parties, orders, accounting and manufacturing, we have each also provided more extensive focus on certain subject data areas. For instance (and not to be exhaustive), David has contributed valuable, extensive template data models regarding modeling metadata, business rules, and laboratory data models, while I have offered comprehensive models for data warehousing and many industries such as telecommunications, health care, insurance, professional services, financial services, travel, and e-commerce.

The first sentence of my book (“The Data Model Resource Book, Volume 1”) is “If you can see more of the whole, you are moving closer to the truth”. Therefore, the more perspectives that you can understand, the more possibilities exist for you and the better position you are in to make an effective, informed decision.


Simsion: What about industry models?

Silverston: Good point. There are many industry models that exist that are also good sources for ideas, perspectives, and effective ways of modeling industry constructs.

For example, I reviewed the health care HL7 model (which is publicly available on the web) and gained some insight and understanding about the type of data needed and the ways in which data could be modeled in the health care models. There are many industry models available such as the ARTS data model for retail or the CDISC data model for the pharmaceutical industry.

I would highly encourage using numerous sources for re-usable, proven data modeling constructs that are applicable to your task at hand. Why not take advantage of other people’s knowledge and offerings as opposed to re-inventing data constructs that have already been researched and modeled?

Simsion: Why do you publish only data models – shouldn’t we also have corresponding process models?

Silverston: Yes, there is a huge need for Universal Process Models! The functions of marketing, sales, order processing, logistics, accounting, budgeting, and many others are often common, and Universal Process Models could greatly help save time and again offer additional perspectives.

I actually had a draft chapter dedicated to Universal Process Models in my last book and it was cut because of time constraints – I wanted to do my best on developing and updating the industry data models, which was a big enough challenge! However, publishing Universal Process Models is definitely on my “to do” list.

I also believe that we shouldn’t just stop there, but as a mature industry, we should have universal models for all cells in the Zachman framework. Why not have various template models for all aspects of systems development?

Simsion: Do universal models mean that we have less need for professional data modelers?

Silverston: No. In my opinion, for the foreseeable future, there is a huge need to have professional data modelers who can understand and model information requirements. Universal Data Models are not a substitute for this. They are designed to provide an effective toolkit for the professional data modeler.

Perhaps, if we as data modeling professionals have better tools and methods, then the demand for data modelers could increase as a result of increased effectiveness, proficiency and maturity in the data modeling field.

Simsion: How relevant are the universal models to an object oriented development project?

Silverston: Very relevant. Most of my clients implementing Universal Data Models are in an object oriented environment. Very often, the Universal Data Models serve as the foundation for a relational database and then an object oriented class structure is superimposed on the Universal Data Models to allow object oriented programmatic access. Sometimes a relational database is not even involved and the Universal Data Model (customized to the organization) is used as a basis for the object class structures in object oriented programs. Other clients use the Universal Data Models as a “universal” method for passing data, for example via XML.
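As an editorial sketch of the last two uses mentioned above, the following Python example defines a class based on a PERSON construct and passes it as XML using the standard library's xml.etree.ElementTree module. The element and attribute names (Party, FirstName, LastName, id, type) are hypothetical and do not represent a published interchange format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Person:
    """Object class derived from a (customized) PERSON entity."""
    party_id: str
    first_name: str
    last_name: str


def to_xml(person: Person) -> str:
    """Serialize a Person into a small XML document."""
    root = ET.Element("Party", {"id": person.party_id, "type": "PERSON"})
    ET.SubElement(root, "FirstName").text = person.first_name
    ET.SubElement(root, "LastName").text = person.last_name
    return ET.tostring(root, encoding="unicode")


def from_xml(document: str) -> Person:
    """Rebuild a Person from the XML produced by to_xml."""
    root = ET.fromstring(document)
    return Person(
        party_id=root.get("id"),
        first_name=root.findtext("FirstName"),
        last_name=root.findtext("LastName"),
    )
```

When sender and receiver agree on the same underlying model, the XML structure follows directly from the entity definitions.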

Simsion: What are the main objections you’ve faced to the use of universal models?

Silverston: Some of the main objections are:

“Our organization is very unique - There is no such thing as a standard or universal model”

I usually respond that, yes, your organization is very unique, however the types of data that you capture are usually very common. There are many standard data requirements such as personal demographics, contact information, order, invoicing, web activity, accounting, etc. that are needed for most organizations.

“Generic modeling is a great theoretical idea that has no basis in reality”

The reality is that a great many small and large companies have successfully implemented these models. Additionally, many of these universal data constructs are now found in the latest versions of very popular software packages.

“They are too high level to add real value”

I will point out that while some universal data models have abstractions and generalization for integration purposes, there are many models that offer a significant level of valuable detail and research based upon real world experience.

“I can figure it out myself just as fast”

However, are there possible ideas or pitfalls that you may not have considered? And what about the time that it takes to document the models and train less experienced modelers on the rationale and explanation behind these standard constructs?

“I already have a model, so what good do these do me?”

Good that you have a model. Then using Universal Data Models can provide a checkpoint against your data model to ensure completeness and possibilities for alternative ways of modeling the information.

“We aren’t doing custom development - we are just implementing packages”

Most of my clients implement packages. The Universal Data Model is often used to show the integrated information requirements across these packages in order to manage consistent, complete, and accurate information across the enterprise. The packages offer a physical database design while the universal data models are a jump-start towards providing the enterprise’s information requirements.

“We are just primarily maintaining systems”

Universal Data Models are often used to evaluate new database changes as new needs emerge, thus providing possible paths toward more flexible, integrated systems.

“We are focusing on data warehousing”

I believe that data modeling is a fundamental step in developing data warehouses. Additionally, there are Universal Data Model constructs for common star schema designs.

Simsion: As more and more organizations choose to buy “off the shelf” software, are we living in the past in trying to improve data modeling practices?

Silverston: I don’t think that buying “off the shelf” packages eliminates the need to properly define an organization’s information requirements. As data models represent a statement of information requirements, I have used data models to help organizations evaluate and select suitable application packages. Furthermore, application packages usually offer a great amount of flexibility and in order for organizations to know how to best implement these packages, they really need to fully understand their information requirements. Finally and maybe most importantly, there exists a huge need to integrate and synchronize data from various packages since most organizations cannot fulfill their needs with a single application package. An enterprise data model is a very effective means for showing how information relates across these application packages as well as for building integrated enterprise-wide data stores.

Regarding improving data modeling practices, this is something that is critical. Our track record for data modeling has not been great. Many data modeling efforts have struggled because they have cost more and taken longer than the associated perceived business value. I believe the answer is not to stop doing data modeling (or to stop gathering information requirements) but that we need to do it better, in less time, with better tools and methods, such as using re-usable components as we do in other aspects of systems development.

Simsion: You’ve made the models available in electronic format: what do you think is the current position and future of CASE technology?

Silverston: Yes, I offer a Universal Data Model repository in the CASE, or data modeling tool, known as ERwin. CASE tools are vital in providing a practical facility to properly capture our data models.

There is a huge gap between the current CASE tools and the sophisticated needs of capturing information requirements. We need CASE tools with greater abilities to synchronize and map between various models so that we can properly maintain conceptual data models, business data models, logical data models, physical data models, object models, process models and any other models that make sense for capturing these requirements.

I believe the future of CASE technology is that it supports a need that won’t go away: the need to capture information requirements. While the capture of information may take many forms, such as entity relationship modeling, object oriented modeling, UML or ORM, one of the most foundational elements of systems development is capturing information requirements. Thus, in my opinion, data modeling (in some form) and tools for data modeling (e.g., CASE) will remain an essential aspect of quality information system delivery as we move into the future.

Simsion: Anything else I should have asked?

Silverston: You have covered the highlights with great, insightful questions, Graeme. I appreciate your time and your amazing contributions to our field. Many thanks also to Tony Shaw from Wilshire Conferences for allowing me to share my perspectives here in Data Discussions.







©2003 Wilshire Conferences, Inc. May be quoted with full attribution.