CONFERENCE TRIP REPORT
 
5th Annual Wilshire Meta-Data Conference  
& 13th Annual DAMA Symposium  
Hilton Anaheim · Anaheim, California · March 4-8, 2001
Hosted by Wilshire Conferences, Inc. & DAMA International

   

This report compiled and edited by 
Tony Shaw, Chairman, Wilshire Conferences, Inc. 

Contributing Trip Report Authors:  
A debt of gratitude, and congratulations on a huge task exceptionally well done,  
is owed by all conference attendees, to the work of the following individuals:
Davida Berger, Carey Clark, Arnie Hook, Ron Klein, Dale Kohlmoos,
Linda Kresl, Margaret O’Hara, David Plotkin, Anne Marie Smith

   

The Meta-Data Conference and DAMA Symposium were again co-located in 2001. The combined event drew an audience of over 1000 attendees and speakers. The exhibit floor included 35 companies showing the latest data management and development products. To receive more information about this conference, and related future events, go to http://www.wilshireconferences.com

This report contains summaries of the key discussions and conclusions from virtually all of the 60+ conference sessions, tutorials and workshops.

Reproduction policy:
This Conference Summary is intended for the use of the attendees at the 2001 Wilshire Meta-Data Conference and DAMA International Symposium.  As such, attendees may excerpt or reproduce any portion of the report for the purpose of sharing information with colleagues and within their own organizations.  Any other reproduction, publication or editing of the report is not permitted without the specific written authorization of Wilshire Conferences, Inc., including the placement of the report on other web sites.  However, links to the report on the Wilshire Conferences web site may be made without express permission.
 

©2001 Wilshire Conferences, Inc.  
"Meta-Data Conference" and the Meta-Data Conference logo are service marks of Wilshire Conferences, Inc. 

Join us next year…

   
The 6th Annual Wilshire Meta-Data Conference  
and the 14th Annual DAMA International Symposium
 
April 28 – May 2, 2002 · San Antonio, TX  
www.wilshireconferences.com


META-DATA CONFERENCE & DAMA INTERNATIONAL SYMPOSIUM 
March 4-8, 2001 – Anaheim, California

  Links Pages to Summaries Below

 

Sunday March 4 - WORKSHOPS

Half Day

W1:

Data Modeling Essentials: Things Have Changed

Graeme Simsion & Graham Witt, Simsion & Bowles

W2:

The Operational Data Store: An Evolution of the Data Warehouse

Jonathan Geiger

Braun Consulting, Inc.

W3:

Knowledge For Action: The Discipline of Spreading Knowledge

Robert S. Seiner

TDAN & CIBER, Inc.

W4:

Meta Data 101

Peter Aiken

Institute for Data Research

Virginia Commonwealth University

   

Monday, March 5 - TUTORIALS

Full Day

T1

Zen and the Art of Data Modeling

Alec Sharp

Damex Consulting

T2

Applying Quality Principles to Data Definition and Data Modeling

Larry P. English

INFORMATION IMPACT International, Inc.

T3

Developing a High-Quality Data Resource to Support Information Needs

Michael Brackett

Data Resource Design & Remodeling

T4

Enterprise Architecture

John Zachman

Zachman International

T5

Building and Managing the Meta Data Repository

David Marco

Enterprise Warehouse Solutions

T6

XML for Data Management

Debbi Walsh & Hal Davis

XML Solutions

T7

Application and Data Integration

Sridhar Iyengar

Unisys Corporation

T8

Data Architectures for Scalable E-Commerce

Michael Stonebraker

Cohera Corporation

 


 
Tuesday, March 6 – CONFERENCE SESSIONS

8:45-9:45  

KEYNOTE PRESENTATION: The Agile Organization, Tom DeMarco, Atlantic Systems Guild  

10:15 -11:15

C1

Pouring the Foundation for the Information Age: Data Architecture at USAA

Andres Perez

USAA

C2

Data Quality as a Profit Center

Wendy Wood

SBC Services

C3

Introduction to the Unified Modeling Language

Eric Naiburg

Rational Software Corporation

C4

Implementation/Use of Operational Meta Data to Improve Data Quality in the Data Warehouse

Michael Jennings

Hewitt Associates

C5

The Data Resource Repository

David Hay

Essential Strategies

C6

XML Without Fear

Alan Perkins

Visible Systems Corporation

11:25 - 12:25

C7

Data Management Support for Enterprise Architecture

Brett Champlin

Allstate Insurance Company

C8

Business Rule Specification, Validation and Transformation: Advanced Aspects  

Terry Halpin

Microsoft Corporation

C9

Business Processes and Logical Process Modeling

Anne Marie Smith

LaSalle University

C10

Redefining Meta Data Strategy in the 21st Century

Ron Klein

Carswell Thomson Professional Publishing

C11

Build Your Own Web-Based Meta Data Repository

Joseph Newcum

Bank One

C12

The Role of Data Administration in Managing an Enterprise Portal

Arvind Shah

Performance Development Corporation

1:45 - 2:45 

C13

Corporate Data Architecture in a Federated World

Deborah Henderson

Hydro One Networks Inc.

& Vladimir Pantic, IBM Global Services

C14

Facilitation and the Successful Architect

Shelley Lieberman

Mathtech

C15

The Practical Use of a Universal Data Model in the Data Warehouse

David Lepley

Tyco Electronics

 

C16

Understanding and Managing Reference Data

Malcolm Chisholm

Deloitte & Touche

C17

Architecting and Implementing a Web-Based Corporate Meta Data Repository (CMR) at the Census Bureau

Gail Wright

Oracle Corporation

C18

Building the XML Meta Data Repository

David Plotkin

Longs Drugs

3:15 - 4:15 

C19

The 7 Deadly Sins of CRM

Jill Dyche

Baseline Consulting Group

C20

Elevating the Role of Information Resource Management for Business Effectiveness

Larry P. English

INFORMATION IMPACT International, Inc.

C21

PANEL: Comparison of Modeling Techniques

Graham Witt

Alec Sharp

Terry Halpin

Eric Naiburg

 

C22

Meta Data - Myth and Realities

John Ladley

Knowledge InterSpace, Inc.

C23

The UPS Meta Data Repository - A Success Story: Taking the Next Steps

Patti Munier

United Parcel Service

C24

Universal Data Models for Web Information Management

Len Silverston

Universal Data Models

   

Wednesday, March 7 – CONFERENCE SESSIONS

8:00 - 9:00  

KEYNOTE & DAMA Achievement Award: The ER Model, XML and the Web, Peter Chen

9:30 - 10:30  

C25

Business Information Management at J&J

Larry Dziedzic

Johnson & Johnson

C26

Measuring The Quality of Models

Peter A. McDougall

Insurance Corporation of British Columbia

C27

Building an Enterprise Data Warehouse: The Delta Airlines Story

Brent Lautenschlegar

Reflection Technology Corporation

C28

Data - The Good, The Bad, and the Ugly

Is Meta Data the Way to Knowledge Management?

Gil Laware

Purdue University

Frank Kowalkowski

Knowledge Consultants, Inc.

C29

Meta Data Standards at Object Management Group

Andrew Watson, OMG

C30

Embracing XML - Strategic Implications for Data Administrators/Architects

Peter Aiken

Institute for Data Research

Virginia Commonwealth University

10:40 – 11:40  

C31

Enterprise Data Management without an Enterprise Data Model: Working in the Real World

Sheri Dumire-Hamilton

Kodak

C32

How do You Convince Management to Fund Your Proposal?

David Davis

Bank One

C33

Data Warehouse Project Planning

Sid Adelman

Sid Adelman & Associates

C34

Meta Directories vs. Meta Data Repositories

James Jonas

Oracle Corporation

C35

Ramping up for Meta Data and Knowledge Management

Don Soulsby

Computer Associates

 

C36

Building the Scalable Data "E-frastructure"

Tim McBreen

Knightsbridge Solutions

1:00 – 2:00 

C37

Data Architecture on a Shoestring

Becky Kirkpatrick

Union Pacific Technologies

C38

Mapping the UML to the Zachman Framework

Neal Fishman

Equifax

C39

Managing Customer Information for CRM

Danette McGilvray

Agilent Technologies

C40

Just In Time Meta Data

Bob Carasik

Wells Fargo Bank

C41

Architectures for Marrying Online Applications with Information Repositories

Faisal Shah

Knightsbridge Solutions

C42

Getting the Rest of Your Organization Ready for XML

Korki Whitaker

Progressive Insurance

2:30 – 3:30 

C43

Data Modeling Contentious Issues

Karen Lopez

InfoAdvisors, Inc.

C44

Data Stewardship - Fact or Fiction?

Diana C. Young

Applied Information Strategies

C45

How to Make Your Business Processes Smarter

Ronald G. Ross

Business Rule Solutions

C46

Meta-Architecture and Enterprise Meta Data Management

E. Manning Butterworth

Reynolds & Reynolds

C47

E-Business Chaos: Protecting Yourself Against Problem Imported Data

Michael Scofield

Experian

C48

Same Old Work, New Dilemma: A New Approach to Data Design for Interactive Web Portal Applications

Ho-Chun Ho

PointandQuote.com

   

Thursday, March 8 – CONFERENCE SESSIONS

8:30 – 9:30 

C49

Enterprise Information Architecture: "Starter Kit" Models

Jane Carbone

DATANOMICS, Inc.

C50

Data Standardization

Michael Gorman

Whitemarsh Information Systems

C51

Conceptual Data Modeling in an Object-Oriented Process

Scot Becker

InConcept, Inc.

C52

A Success Story: Enterprise Customer Meta Data Definition/Implementation

Barbara Peterson

Agilent Technologies

C53

eRepository for eBusiness

Warren Selkow

Consultant

C54

Web Usage Mining and Analysis

Patricia Klauer,

&

Robert Cooley,

Apex Solutions, Inc

9:50 - 10:50

C55

Action Business Rules – Getting to Yes

Judi Reeder

Consultant

 

C56

Enterprise Model In Action

Natalie Arsenault

First Union National Bank

C57

Meta Data in the Trenches

Dave Buch,

Capital One

C58

Synchronizing Your Operational Systems with Your Enterprise Information Portal (EIP) using Meta Data Management

Joe Danielewicz

Motorola, SPS

C59

PANEL: New Trends in Meta Data

Robert Seiner, TDAN and CIBER (moderator)

Don Soulsby

Computer Associates

James Jonas, Oracle

 

C60

OMG CWM - An Architecture for Enterprisewide E-Business Intelligence Integration

Sridhar Iyengar

Unisys

11:10-12:30

CLOSING KEYNOTE PANEL: Data Management – Where to From Here?  
Graeme Simsion (Moderator), John Zachman, Michael Brackett, Ronald Ross, Andres Perez

 

 

 

SUNDAY WORKSHOPS  
March 4, 2001

 

Workshop 1

Speakers:

Data Modeling Essentials:

Things Have Changed

 

Graeme Simsion and Graham Witt

Simsion Bowles & Associates

 

Summary by Carey Clark

In an industry known for its hype and self-importance, the two Grahams sparkle for their self-deprecating honesty. They speak their minds and welcome rebuttal. Controversy is a good thing and not to be avoided.

Graeme’s first book was based on the still-held contention that data modeling is design, not analysis, and not simply an act of discovery. Hence there can be different workable data models based on the same requirements documents. Data modeling is more than a skill; it is a discipline, and deserves to be valued as such.

What’s different now than when his first book was published?

Object Orientation has not made conventional data modeling obsolete. It’s great for some software projects but can create more headaches than help for “persistent data” applications. Typical pitfalls:

UML is not their preferred modeling nomenclature for a number of reasons:

Data types are now more complex and include more user-defined data. Spatial, video, audio and image data deserve their own types. One must resist the habit of converting data into characters or numbers when a richer data type makes sense. For example, an address can be its own data type and treated as a single thing.
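As a minimal sketch of the address example, assuming Python and hypothetical names (Address, Customer and their fields are illustrative and were not part of the workshop), an address can be modeled as one typed value rather than as several loose character fields:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Hypothetical address type; the field names are illustrative only."""
    street: str
    city: str
    region: str
    postal_code: str

    def one_line(self) -> str:
        # Format the whole address as a single unit.
        return f"{self.street}, {self.city}, {self.region} {self.postal_code}"


@dataclass
class Customer:
    name: str
    # The address travels as one value instead of four unrelated strings.
    mailing_address: Address


customer = Customer(
    name="Example Customer",
    mailing_address=Address("1 Main St", "Anaheim", "CA", "92802"),
)
print(customer.mailing_address.one_line())
```

Because the type is frozen, an address is replaced as a whole rather than edited field by field, which is one way to honor the "single thing" idea.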

Derived data needs to be modeled and defined even when it won’t be stored in a database.
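One way to picture a derived attribute that is defined but never stored, as a sketch only (OrderLine and its attribute names are hypothetical, not taken from the workshop):

```python
from dataclasses import dataclass


@dataclass
class OrderLine:
    """Hypothetical entity; attribute names are illustrative only."""
    quantity: int          # stored attribute
    unit_price_cents: int  # stored attribute

    @property
    def extended_price_cents(self) -> int:
        # Derived attribute: its definition lives with the model,
        # but the value is computed on demand and never stored.
        return self.quantity * self.unit_price_cents


line = OrderLine(quantity=3, unit_price_cents=250)
print(line.extended_price_cents)
```

The derivation rule is documented in one place, so every consumer of the model computes the value the same way.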

Business rules need to be captured. How they are stored depends on their volatility. All such rules are subject to challenge. It is the modeler’s responsibility to suggest changing existing rules when they don’t make data modeling or business sense.

In some circumstances one must allow a rule to be broken. The problem arises when there is a need to collect data that doesn’t comply with a rule.

Naming data is extremely important. When names don’t denote what they mean, the ambiguity becomes widespread. This is even truer with XML and the increased interaction with other businesses. When there is an industry naming standard (e.g., in XML), it is best to go with it. Even if not optimum, it beats having to translate more than necessary.

Data modelers need to be involved with the use of their models after they are completed. It is not uncommon to see developers ignore them or misuse them. The effort of many hours of confirmation can be jettisoned when a developer assumes a mistake and simply overrules the model.

Meta data needs to be available to everyone who needs to see it: users, process modelers, and developers. If it’s not used, it doesn’t add value. “You can’t have data quality without meta data quality.”

In one survey they found that, for a good percentage of the decisions that data modelers felt were theirs to make, the data administrators thought they should make them. It behooves the two functions to reach agreement on responsibilities.

Although there wasn’t time to go into detail, they touched on how to present large data models (e.g., corporate data models) to executives. The consensus was to break the model up into small chunks. The whole model tends to bewilder the uninitiated.

They presented a process diagram of the data modeling process. It was realistic and useful.

 

Workshop 2

Speaker

The Operational Data Store: An Evolution of the Data Warehouse

 

Jonathan Geiger, Vice President, Braun Consulting, Inc.

Summary by Dale Kohlmoos

Jon Geiger gave a three-hour presentation on an operational data store (ODS) designed for the tactical analysis of subject-oriented data. Jon mapped out the essential steps that should be taken in the development of an ODS in an integrated enterprise system.

He began with a high level evaluation of enterprise systems and described where the ODS could be positioned for optimal use as a tactical tool for analysis. The ODS was presented as a tool used to complement a warehouse and its associated data marts. The intent of the ODS is the tactical execution of the strategies identified in the warehouse. In order to accomplish this, he described the ODS as demanding a high degree of query performance and availability.

Characteristics of the ODS are that it is subject-oriented, integrated, current and volatile. It is intended to be a central point of data integration for business management. This view was further broken down into four classes. Each class was described by the update frequency, degree of integration, transformation and summarization.

ETL tools play a significant role in the management of the ODS and were described as an architectural consideration. The high level of integration, transformation and summarization precludes most other forms of loading.

Jon introduced the concept of Oper-Marts, or ODS data marts, which are much like the familiar OLAP reporting cubes, summary tables and small star schemas. The Oper-Marts are frequently rebuilt because they only reflect data at a specific point in time and lag behind the ODS data update.

ODS data model examples and aspects of tuning and scheduling were presented to help give the audience a good background for consideration of an ODS implementation. From there Jon went into overall architectural considerations with respect to e-Business, CRM, Finance and Insurance.

The methodology for implementation included examples from project management, design phases, project phases, project definition, process definition, process modeling, deployment and all the associated deliverables. Good examples were given to demonstrate what needs to be considered to drive a successful implementation of an ODS.

Last but not least, Jon reviewed data quality issues and expectations for an ODS. These are much the same as what is seen throughout the enterprise, but he offered suggestions for when and where those issues may be caught and cleaned up, with a look at the impact on the tactical analysis performed on ODS data.

 

Workshop 3

Speaker

Knowledge for Action:

Creating Competitive Advantage through Knowledge Management

 

Robert Seiner, Publisher, TDAN

and BI/DW Director & Principal, CIBER, Inc.

 

Summary by Margaret O’Hara

Using the example of a grocery chain opening a new store, Bob Seiner stepped the audience through the process of creating competitive advantage through knowledge management (KM). After noting that this was the first workshop on Knowledge Management presented at a Meta-Data/DAMA-I conference, Seiner stated that there was a logical progression from managing data to managing information to managing knowledge.

Seiner first defined KM as the discipline of spreading the knowledge of individuals and groups across the organization in ways that directly affect performance. He emphasized that the spread of knowledge was critical, as knowledge cannot be helpful unless it is shared. The vision of KM is that the right information, in the correct format, gets to the right person, at the right time, for the right business purpose.

He noted that the amount of information being produced annually is approximately 250 MB for every person on the planet, and that this amount is expected to increase. Thus, managing the knowledge is a daunting task for all organizations.

To set the stage, Seiner offered the first of his many store-opening examples. As part of a KM project, he interviewed employees from two recently opened stores in the grocery chain. Employees in Store #1 reported significant problems with one aspect of the opening: receiving deliveries. The store’s managers solved the problem and learned to manage its deliveries. When interviewing employees in Store #2, he discovered that they had experienced the same problem, but because the first store had not shared its knowledge, Store #2 went through the same painful process of solving the problem.

The first business impact of Knowledge for Action is that information is provided 24x7 in a customizable and detailed view to everyone who needs it. Thus, knowledge is recorded (i.e., becomes an artifact). Moreover, the knowledge is well managed and employees learn from past decisions. There exists a sharing of best practices and innovation. Most importantly, a KM program reduces the risk from attrition. To sell KM initiatives to senior management, Seiner recommends you start small and focus on investment rather than costs (i.e., on the payback of the project).

KM project planning should start with executive business sponsorship and should involve a knowledge audit. Audits employ both qualitative and quantitative assessments as well as a readiness assessment. Scoping sessions are important: Seiner recommends starting with “a slice of a slice of the pie”, and identifying the “most ready” of all documented knowledge. Questions to ask within the organization during the audit include:

Seiner also stressed that changing behavior was important to the process and suggested that accountability for knowledge become part of people’s jobs, in most cases by being written into their job descriptions.

Building a Knowledge Factory was next on the agenda. This involves first identifying the artifacts (recorded pieces of knowledge) and determining where to store them. Samples of artifacts include SOPs, checklists, event planning schedules, etc. Key to the process is an effective stewardship program. Knowledge for the factory can be collected in a variety of ways, including threaded discussions, virtual meetings, emails and on-line chats. Seiner recommends that the organization step slowly through the process, first carefully planning the factory, then building a prototype, and finally piloting the project. Keeping the scope narrow is also important.

To develop the Enterprise Knowledge Platform, Seiner suggests careful assessments, performing the knowledge audit, planning for the short, intermediate and long-term, creating the employee portal, and making the standard build vs. buy decision.

Although time ran short in the presentation due to the number of questions and comments from the very involved audience, Seiner did have time to stress that the knowledge portal was not the only consideration in KM. While the portal’s functional design, graphic capabilities and degree of personalization were important, the portal is only a starting point web site where employees can enter, find and access knowledge.

 

Workshop 4

Speaker

Meta Data 101

 

Peter Aiken, Institute for Data Research,

Virginia Commonwealth University

Summary by Linda Kresl

Peter is a proponent of meta data management. He began the presentation by pointing out that meta data isn’t a very accurate term. Many IT and business managers don’t understand the importance of meta data, and many ask, “Why do meta data?” Managing meta data is one of the most important activities within Data Resource Management. Another term for meta data is data resource data. Meta data is data describing business processes, both technical and business related.

Deriving a legacy architecture is a major reason to create meta data. Every system has an architecture, however poor or rich. Meta data is the language of the architecture; it is how we understand and articulate the architecture. Meta data describing system data can be considered multidimensional data. A lack of meta data is the primary reason for re-engineering failures.

A data model is an excellent place to begin the process of meta data creation and definition. A model depicts the data implementation, data design or system data requirements. Meta data engineering and data re-engineering are inextricably linked. What is a meta data data model? It is a data model that describes or characterizes system components, not business data. Tools that reverse engineer meta data include SAS and Evoke (best used if the organization doesn’t have a data model). These tools have a built-in QA function.

Meta data Engineering:

A meta data model is the key to quickly implementing data conversions, understanding business processes, and gaining knowledge about packages (PeopleSoft, SAP, etc.).


TUTORIALS  
Monday, March 5, 2001

 

Tutorial 1

Speaker

Zen and the Art of Data Modeling

 

Alec Sharp, Founder,

Damex Consulting

 

Summary by Arnie Hook

Alec teaches an outline and guidelines for good practices to arrive at a data model that satisfies business needs. He embeds humor to make a point and to keep the audience involved with his inspirational messages. The analyst must be able and willing to do a variety of things in order to arrive at the appropriate data model.

Alec’s Messages: Design the content to fit your needs. Extend and communicate the use of data management. Communicate across the business and the objectives of the design practices. Reverse engineering to the blank page. What is the direction?

Level set – agree on the basics. Consistency is key to success. There are many ways to describe a business. The data model describes what the business needs information about. It is a non-technical description of the business, not a database. The model must be maintained at all levels.

Level set to the 3 types of data model:

Do not violate the four ‘Ds’ of modeling:

Alec describes the ‘facilitated session’ process for analyzing the business requirement. The technique ensures consistency and keeps the scope tied to the objectives. Make an agenda and schedule for each subject session. Participants need to understand their roles and responsibilities (‘establish the behavior contract’) for each session. Alec coaches a ‘bus tour’ recipe for facilitating a session toward a correct model.

The last step is to review with ‘rhetorical context’. Know the audience, occasion, and purpose. Then answer the data questions with a storyboard format.

Alec takes the attendees through the course to practice modeling principles and techniques for each level of analysis. The tutorial presented a great workshop for the novice or expert. Even if you know it all, this tutorial should be on your list.

 

Tutorial 2

Speaker

Applying Quality Principles to Data Definition and Data Modeling

 

Larry English, President,

INFORMATION IMPACT International

 

Summary by Margaret O’Hara

The premise of English’s presentation was that, since information is the product of a process, Deming’s quality principles can be applied to develop Information Quality. English defines information quality as “consistently meeting knowledge worker and end-customer expectations through information and information services”. This involves quality of data definitions, data content and data presentation. English offered the following as an example of poor data quality:

            Data Element:           Payment Date

            Definition:                   Date of Payment

As in this example, very often the stated definitions for data elements are too vague to be of much use to the organization. Does this date refer to the date the check was received, the date it was written, the date the monies were credited, or the date the transaction was entered?
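To make the ambiguity concrete, here is a minimal sketch, assuming Python, in which each of the four possible meanings gets its own precisely named attribute. The Payment record, its attribute names and the sample dates are hypothetical and are not from English’s material:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Payment:
    """Hypothetical record; attribute names and dates are illustrative only."""
    check_written_date: date        # date written on the check
    check_received_date: date       # date the check was received
    funds_credited_date: date       # date the monies were credited
    transaction_entered_date: date  # date the transaction was entered


payment = Payment(
    check_written_date=date(2001, 3, 1),
    check_received_date=date(2001, 3, 5),
    funds_credited_date=date(2001, 3, 7),
    transaction_entered_date=date(2001, 3, 6),
)

# The four candidate meanings of "Payment Date" can disagree by days.
distinct_dates = {
    payment.check_written_date,
    payment.check_received_date,
    payment.funds_credited_date,
    payment.transaction_entered_date,
}
spread_days = (max(distinct_dates) - min(distinct_dates)).days
print(spread_days)
```

With a single “Payment Date” element, any report built on it silently picks one of these values; naming each one removes the guesswork.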

The benefits of information are that work processes are transformed and that clerical workers are “informated” (i.e., they become knowledge workers). All too often, knowledge workers either use data for something other than the purpose for which it was defined or have no idea that anyone else in the organization is using the same data. An IQ initiative can help avoid these problems.

English proposes that we eliminate the word “user” from our vocabulary and instead describe those employees who use information in their jobs as knowledge workers.

English set forth several quality principles. These involve a customer focus, process improvement, scientific methods and management accountability. Most organizations do not hold managers responsible for the information their departments generate. English spent considerable time explaining Kaizen (the art of continuous improvement) and its application to the Information Resource Management area.

Principle #1: Create a constancy of purpose for improvement of the information product and service. Since the obligation to the customer never ceases, the information quality ramifications are that the IRM mission and objectives are defined to include total quality for both its services and products, and that plans are developed with both long- and short-term deliverables that support strategic business objectives.

Principle #2: Adopt the new philosophy of Quality Information Management that will transform both the business and IS management. The quality information philosophy means reliable information management and shared information to reduce costs.

English next focused on how to assure data definition quality. He believes that instead of data documentation we should engage in data definitions that would state precisely the meaning of words. He stressed that the definition should not be more difficult to understand than the word it defines. English also feels that we should avoid the term “meta data” except in technical forums. The Knowledge Worker (not the user!) will better understand the phrase Information Product Specification (IPS). An IPS is a detailed, exact statement of particulars. Among the goals of data definition are (1) to enhance communication, assuring that the transmitted information, thoughts and feelings are satisfactorily received and understood, and (2) to increase productivity.

English then presented the concept of Total Quality data Management (TQdM), which will establish the Information Quality Environment. He proposes that TQdM is not a program but instead a value system and habit of continuous improvement of both application and data development processes and business processes. English illustrated the TQdM process using a data flow diagram. The steps in establishing the IQ environment are:

Process                                                Output

Assess the data definition & IQ architecture quality   Data definition quality assessment

Assess Information Quality                             Information Quality Assessment

Measure Non-Quality Information Costs                  Information Value / Cost Analysis

Reengineer and Cleanse Data                            Corrected Data

Improve Information Process Quality                    Information Process Improvements

English discussed data definition quality characteristics such as conformance to meaningful enterprise standards, consistency of data names, and complete domain values with definition. He also stressed the importance of data standards quality, including such issues as enterprise wide guidelines, meaningful abbreviations and complete, precise, non-overlapping class words. English illustrated the importance of determining all definitions of a word with the business term “volume”. He presented three diverse definitions of the word, each used by a different business segment.

After giving several examples of data definitions and business rules that illustrated high and low quality, English had the audience assess a specific attribute definition using a Data Definition Quality / Usefulness Assessment Form. Working in small groups, the attendees assessed one attribute definition. This brief exercise generated much discussion, which demonstrates the complexity in achieving even one small part of information quality.

English then presented the basics of Information Architecture (IA) quality and suggested guidelines for achieving a high quality architecture. Such architectures are characterized by completeness, stability, and flexibility. Moreover, these architectures can be reused with a minimal degree of modification. “A well-defined architecture supports tomorrow’s business needs as well as today’s”.

English then described the TQdM process #5: Improving Information Process Quality by presenting the Quality, Time, Money triangle. Essentially, maximizing any one of the three points means the other two will suffer. Typically, an organization can achieve two, but not three of the objectives.

Toward the end of the day, English provided metrics to measure information quality, stating that choosing the lowest-price alternative may result in the costliest action. He believes that organizations, instead of asking for a cost/benefit analysis of “shared” DBs and enterprise data modeling, should ask what the cost is of redundant applications as well as the cost of change requests to the original product specifications. He reminded the audience once again that Total Quality data Management is not a program; it is a value system, mind set and habit of continuous improvement.

Tutorial 3

Speaker

Developing a High-Quality Data Resource to Support Information Needs

 

Michael Brackett, Consulting Data Architect, Data Resource Design & Remodeling

Summary by Dale Kohlmoos

Michael Brackett gave a full day presentation that addressed a lot of the commonly experienced limitations of our current data resources. He discussed how we can turn those limitations around for more refined data resources that could better meet information demands.

He reviewed and discussed current data situations, data resource concepts, resolving data disparity and cultural considerations. The current data situation is that disparate data is a truism. The result of this disparate data is the inability to integrate data to meet the information demand. He described four basic data problems that are commonly seen throughout most organizations:    

The demand for integrated data to support business needs is high, yet disparate data continues to be produced at a rapidly increasing pace. Mr. Brackett described the current status quo as potentially leading the organization to failure due to information deprivation. He emphasized that tools do not understand technology, nor do they automate understanding; people are the key, and tools support people.

Mr. Brackett discussed the structure of the Business Intelligence Value Chain and noted that the data resource is the foundation of all the other structures. This is a sobering reminder that we all need to revisit every so often. Mr. Brackett also raised the debate on whether the data resource is considered an asset or a resource.

Further discussion reviewed data architecture and the corresponding position of the data resource within that architecture. From within that architectural perspective, Mr. Brackett identified ways and means to both halt and resolve existing data disparity. From there, the session delved into detailed examples, principles, and practices for refining data definitions, data structures, data integrity, data documentation, data orientation, data availability, data responsibility, data vision, and data recognition.

The next step was to discuss the data resource transition and how to implement better practices, along with the cultural considerations that would have to be addressed to make it happen.

Mr. Brackett concluded his presentation by demonstrating that there is no “silver bullet.” The techniques are available, and it is time to develop a high-quality data resource that can meet the information demands of each organization.

Tutorial 4

Speaker

Enterprise Architecture

 

John Zachman, President,

Zachman International

 

No summary is available for this tutorial

 

Tutorial 5

Speaker

Building and Managing the Meta Data Repository

 

David Marco, President,

Enterprise Warehouse Solutions

 

Summary by Carey Clark

 

David Marco’s presentation was aimed at those new to the meta data imperative and included sections on basic meta data terms, definitions, concepts and justifications. He also made the case for treating repository creation as a project and for using formal project management methods. The presentation is drawn from David’s book of the same title.

It is important to relate and document the business benefit of the repository. This benefit is usually to increase revenue or reduce costs. Repositories need to be built iteratively with value added at intermediate stages.

David likes to put data quality in the repository rather than in the data warehouse because more people can get to it and it can be related to more systems.

He estimates that 35% of the IT budgets are spent on integration. His experience is that a company’s data will double every 4 years. Hence the need to manage this data is critical to effective growth.

He separated meta data into business related and technical related areas. Most of what one audience needs to see, the other audience doesn’t.

A lot of his projects are aimed at data warehouse construction. They deal mostly with extraction, transformation, and load (ETL) activities rather than business names, definitions and their maintenance.

His list of MUSTS includes:

David discussed various repository products and their pros and cons. He emphasized that there isn’t a “best one” because it depends on your requirements, your in-house skill set and your budget. Again, it is important that you document the purpose and benefit of the repository before you look at the products.

Be sure that whatever repository you buy (or build) can import from your existing sources, e.g. ER/WIN, Meta Designer, Oracle, etc.

David uses a classic decision matrix for determining the best tool. Each requirement has an importance and a complexity (= cost). Each tool is then matched against this matrix.
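As an illustration only, a decision matrix of this kind might be sketched as below. The requirement names, the 1-5 scales, the tool names and the scores are all hypothetical and do not come from the presentation. The summary does not say how importance and complexity are combined, so this sketch weights each tool's score by importance only and simply carries complexity along as the cost figure.

```python
# Hypothetical requirements: importance (1-5) and complexity, i.e. cost (1-5).
requirements = {
    "import from existing modeling tools": {"importance": 5, "complexity": 3},
    "business-user access": {"importance": 4, "complexity": 2},
    "impact analysis": {"importance": 3, "complexity": 4},
}

# Hypothetical scores (0-5) for how well each candidate tool meets each requirement.
tool_scores = {
    "Tool A": {
        "import from existing modeling tools": 4,
        "business-user access": 2,
        "impact analysis": 5,
    },
    "Tool B": {
        "import from existing modeling tools": 3,
        "business-user access": 5,
        "impact analysis": 3,
    },
}


def weighted_score(scores, requirements):
    """Sum of (importance x score) over all requirements for one tool."""
    return sum(requirements[name]["importance"] * scores[name] for name in requirements)


def rank_tools(tool_scores, requirements):
    """Return (tool, total) pairs, best total first."""
    totals = {tool: weighted_score(scores, requirements) for tool, scores in tool_scores.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for tool, total in rank_tools(tool_scores, requirements):
        print(tool, total)
```

The point of the exercise is the one David made: the ranking depends entirely on the requirements and weights you document first, not on the tools themselves.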

 

Tutorial 6

Speaker

XML for Data Management

 

Debbi Walsh, Technical Director, &

Hal Davis, Consultant,

XMLSolutions

Summary by David Plotkin

Introduction and Business Case

The tutorial began with a brief introduction of what XML is, including an intuitive diagramming technique for showing how XML labels data – giving it more meaning and making it more understandable than a simple flat file. The design goals of XML were reviewed, giving us a good idea of the reasons why we might want to introduce XML into our organizations. As part of this justification, a series of business scenarios were presented, and in each case the advantages that XML provides were made clear.

Documents and Structure

The tutorial continued with the definition of the rules for creating a "well-formed" XML document, including the single root element, proper element nesting, quoting of attribute values, and the naming conventions.
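A minimal sketch of what these rules mean in practice, using Python's standard-library XML parser (the sample documents are invented for illustration and were not part of the tutorial): a non-validating parser accepts a well-formed document and rejects one that breaks any of the rules listed above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule.
good = '<order id="42"><item>widget</item></order>'
two_roots = '<order/><order/>'                              # more than one root element
bad_nesting = '<order><item>widget</order></item>'          # elements overlap
unquoted_attr = '<order id=42><item>widget</item></order>'  # attribute value not quoted

if __name__ == "__main__":
    for label, doc in [("good", good), ("two roots", two_roots),
                       ("bad nesting", bad_nesting), ("unquoted attribute", unquoted_attr)]:
        print(label, is_well_formed(doc))
```

Well-formedness is only the first hurdle; checking a document against a DTD or XML Schema requires a validating parser, which this sketch does not use.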

Validation of an XML document can take place – either via the well-accepted DTDs, or the newer, and more powerful XML Schemas. The syntax for defining DTDs was discussed, including the details of processing instructions, the XML declaration, elements, attributes, and comments. The different types of elements were covered, such as text, empty, mixed, and element (a content model that consists of sub elements). The different types of attributes were also covered, as well as how to declare optionality and cardinality. Namespaces (for reusing element names) were covered with examples. XML Schemas were discussed in significant detail, including simple and complex data types, and declaring your own data types. In addition, the reuse aspects of XML Schemas (one of the primary advantages of XML Schemas) were shown.

After discussing how to build validation documents, the details of connecting a DTD to an XML document for use by a validating parser were covered. In addition, general and parameter entities (both internal and external) were covered with an excellent and concise chart.

RDF

The presenters covered RDF, although it was somewhat difficult to see the application of RDF in the context of XML. There are some similarities, but not strong ones.

Transformations

One of the most useful parts of the whole presentation was the section on transformations. Using XSLT (.xsl), Hal put on a demonstration of displaying an XML document using a style sheet in XML, and showed how the entire "look" of the document could be changed by changing the associated style sheet. He also demonstrated how the XML document could be converted into another form – be it another XML document, a plain text file, or whatever. The presentation covered the exact flow of how the XML content was converted, including using a parser, and even included a brief rundown of some of the more common XSLT commands. He also covered XSL (.fo) for applying formatting to convert the output of XSLT to PDF, HTML, or printer output.

The parser uses either DOM or SAX, and Hal covered the advantages and disadvantages of both types of parser. DOM needs more memory and is not as quick as SAX, but since it maintains the "tree" in memory, it is possible for the program using the parser output to navigate the nodes of the tree more freely.
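The trade-off can be sketched with the two parser styles in Python's standard library. This is an illustrative example only; the sample document and the function names are assumptions, not material from the tutorial. The SAX handler sees each element once as it streams past and keeps only what it chooses to keep, while the DOM version builds the whole tree and can then move around it freely.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = '<catalog><item sku="a1">widget</item><item sku="b2">gadget</item></catalog>'


class SkuCollector(xml.sax.ContentHandler):
    """SAX: event callbacks fire as the document streams by; no tree is kept."""

    def __init__(self):
        super().__init__()
        self.skus = []

    def startElement(self, name, attrs):
        if name == "item":
            self.skus.append(attrs.getValue("sku"))


def skus_via_sax(text):
    handler = SkuCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.skus


def last_item_via_dom(text):
    """DOM: the whole tree is in memory, so we can jump to any node and back up."""
    dom = minidom.parseString(text)
    last = dom.getElementsByTagName("item")[-1]
    return last.firstChild.data, last.parentNode.tagName


if __name__ == "__main__":
    print(skus_via_sax(DOC))
    print(last_item_via_dom(DOC))
```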

Resources

XML has a considerable number of resources available – standards, products, and information on the internet. Hal and Debbi briefly covered these topics, and provided a CD that contained all of the XML standards being considered. They were less thorough with the editors, databases, transformation tools, and servers that are available today, merely stating that there were plenty of choices.

Data Management/Schema Design

The last two sections briefly covered two topics of considerable importance to Data Administrators getting involved with XML. The first are the challenges that we face in managing these new flavors of schema, and this whole new environment. They provided some recommendations on managing names, accuracy and descriptiveness, and modularity and reuse. There ARE industry standards emerging, and where possible, it is a good idea to try and use the common schemas for an industry. Finally, they covered what you should be concerned with when trying to manage your schemas centrally, including the ability to browse, do impact analysis, impose good design practices, dynamically access and generate schemas, and import and export schemas from various sources.



Tutorial 7

Speaker

Application and Data Integration:

Implementing Model Driven Architectures (using OMG CWM, XMI and MOF)

Sridhar Iyengar, Chair, OMG Object Analysis & Design Task Force &

Unisys Fellow, Unisys Corporation

 

Summary by Ron Klein

Sridhar’s insights and knowledge contributed greatly to our awareness of what is coming in the standards area. His tutorial presentation included discussion of various evolving OMG (Object Management Group) standards, models and protocols, such as:

CWM – Common Warehouse Metamodel

UML – Unified Modeling Language

XMI – XML Meta data Interchange

MOF – Meta Object Facility

Much of the discussion was driven by questions from the audience, hence this summary draws substantially on those questions.

Quick history of OMG: founded 1989, now more than 800 vendors.

1991 - CORBA 1.0  
1995 - CORBA 2.0  
1997 - MOF and UML  
1999 - XMI and CORBA Components  
2000 - CWM, XML.Value, EDOC (Enterprise Distributed Computing), XMI for XML Schema  
2001 - UML for EDOC, UML 2.0, Better XML and E-Business integration

OMG is broadening the scope of technologies moving to Model Driven Architecture. It is targeting middleware technologies in the data management and application development realms.

The Meta Data Coalition (MDC) merged into OMG during 2000. CWM became the common standard last June (2000) and had a revision published last week (Feb 26, 2001) based on vendor experiences.

The Data Integration Problem

-         Emerging XML issues include new XML data types, integrating XML with middleware technologies and into core database technologies.

-         The Internet is driving us from small to large databases.

-         The transformation of information from one technology to another leads us to CWM as a solution.

What is needed to solve the Integration Problem?

-         Meta data becomes more and more important.

-         Moving to XML APIs.

-         New APIs such as JMI, JOLAP, JDATAMINING

-         SOAP development: marries HTTP and XML

E-‘Muddleware’ Architect’s Dilemma

-         What is the data exchange protocol?

-         Ignore the middleware when you are doing Design and Analysis, use Mapping techniques.

-         Integration at higher level is as important as in lower levels.

-         XML won’t solve all the problems! Others will not go away.

AUDIENCE QUESTION (Q): What is your definition of components?  
INSTRUCTOR ANSWER (A): Pieces of a program with interfaces that have been captured.

SPE – Software Process Engineering: Best practices forming Objects for life cycle – an extension of UML. (IBM, Rational Rose and others)  

Q: What about Open Process?  
A: Not involved with SPE but it is with UML 2.0.


Q: Data Structure?  
A: There is no model that fits it all! UML can define what the data structures are. It addresses the static part of it. If you can represent your legacy in UML then you can use XML.

Q: What about Workflow Management?  
A: Activity diagram is included in UML, State Machines.

Q: What about Batch File Model?  
A: Look at CWM model, it is more focused on extraction and transformation. UML is weak here.

Model the Data, Model the Application, and Model the Interface

Every three years comes a new protocol. The guts of business rules change very slowly because they are abstract concepts of the business. It is fundamental to focus on your business.

Enterprise Portals are in a rudimentary stage now. The elements are already in CWM. Integration technology brings process, content, application. It is not Data or Process or Presentation integration but all of them.

Work together with common shared metamodels

- There is more & more meta data lurking everywhere!

- There are specific meta data to manage the DW in CWM. This makes meta data clearer and easier to use and represent.

IDA – Enterprise Modeling from OMG

OMG Modeling and Meta data Framework

Modeling Concepts:

- Platform Independent Model (PIM)

- Platform Specific Model (PSM) Meta data technology

- Mappings from Independent Model to Platform Specific Model

1)     Create concrete mapping from neutral to specific through data model and rules

2)     UML profiles: AD going from neutral to specific (UML-> C++, Smalltalk, JAVA)
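To make the idea of mapping a neutral model to platform-specific ones more concrete, here is a deliberately tiny sketch. It is not OMG's MDA tooling or any real generator; the model format, the type names and both rule tables are assumptions invented for this illustration. One platform-independent description of an entity is turned into two platform-specific forms by swapping the mapping rules.

```python
# Hypothetical platform-independent model: entity name -> (attribute, neutral type).
PIM = {"Customer": [("id", "Integer"), ("name", "String")]}

# Hypothetical mapping rules from neutral types to platform-specific types.
SQL_RULES = {"Integer": "INTEGER", "String": "VARCHAR(255)"}
JAVA_RULES = {"Integer": "int", "String": "String"}


def to_sql(entity, attributes, rules=SQL_RULES):
    """Platform-specific model for a relational database: a CREATE TABLE statement."""
    columns = ", ".join(f"{name} {rules[neutral]}" for name, neutral in attributes)
    return f"CREATE TABLE {entity} ({columns});"


def to_java(entity, attributes, rules=JAVA_RULES):
    """Platform-specific model for Java: a bare class declaration."""
    fields = " ".join(f"{rules[neutral]} {name};" for name, neutral in attributes)
    return f"class {entity} {{ {fields} }}"


if __name__ == "__main__":
    for entity, attributes in PIM.items():
        print(to_sql(entity, attributes))
        print(to_java(entity, attributes))
```

The neutral model stays unchanged when a new target platform arrives; only a new rule table and generator function are added.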

Q: Which one is the META META model?  
A: MOF Meta Meta Model, it is a subset of UML.

Q: What about legacy ER with UML -> Use Case?  
A: When you deal with data a bridge is needed. Work is ongoing to map UML and ER. CA, Rational, Sybase are supporting. You need to make decisions to map models. There is a mismatch. CWM includes UML, ER and Transformation Model. The heart is MOF.

Q: What tool are you using to generate XMI and IDL?
A: Rational Rose

Roles of UML in CWM

CWM 1.0 Overview {02/2001} Common Warehouse metamodel

Q: Where do I see security?  
A: It is part of the systems management.

Q: Is this the persistent metamodel?  
A: Yes

Q: Notation, classes becoming stereotype in UML?
A: Yes.

 

Tutorial 8

Speaker

Data Architectures for Scalable E-Commerce

 

Michael Stonebraker, Chief Technology Officer,

Cohera Corporation

 

Summary by Linda Kresl

In this full-day tutorial Dr. Stonebraker predicted that the US will lead B2B eCommerce. Major B2B players are Ariba, CommerceOne, Oracle, SAP, IBM. He covered data architecture designed for eCommerce (B2C and B2B): its inception, types of products (Portals, DBMS, protocols, components, N-tier architectures) and the standards associated with eCommerce.

A B2C application example is a query catalog of items for sale. B2C players are Broadvision, and Openmarket. The interface is usually to a fulfillment system. Gizmos like Palm Pilots and cell phones will be major players in the future.

Any web architecture should be designed using components. The component protocols should be built using Java Beans – a safe bet for general-purpose applications. Don’t build your components in ActiveX; it is not supported by any non-MS OS.

Another choice is XML as a component protocol. XML is also a messaging system – XML will soon be ubiquitous even on gizmos. XML goes through firewalls and it’s easy to parse. XML is a safe bet for low-performance applications; use it only for small and slow applications. XML isn’t a good idea for large amounts of data because the meta data is coupled with the data. He favors Java as a web language. C++ will be used for complex applications. He favors the following scripting languages: JavaScript and XSL. These products are ODBC compliant and talk to the DBMS. Michael suggests that you stay with ODBC to move from database to database.

Components can run in 3 areas:

1.      Thick client – on a browser – screen intensive logic should run as close to the screen as possible

2.      Thick middle – applications that are in between should run in middleware

3.      Thick database (an OR DBMS) – data mining should be run as close to the database as possible. Logic in the DB is always faster!!! Move the code to the engine!

Michael states that the obvious goal is Universal components. Write the component once and reuse it at any level. The industry is nowhere near universal components. Java Beans are the closest component at this point in the game.

How should we interface to legacy systems? We can use two approaches, an EAI system or a messaging system. Please use your favorite EAI system. An EAI system helps you package up a message and transform it over the network and have the user unpack it and understand it. The top EAI packages are: MQ Series (IBM), Webmethods, Vitria, CommerceQuest, CrossWorlds, Mercator.

Content Management deals with locally authored information in rich content (text and images) with little if any structure. This data is fairly static. This data may also be purchased. There are two approaches to content management.

1.      Store content in HTML/XML via a file system (don’t grow your own)

Packages are Plumtree, Viador, Interwoven, Vignette

2.      Object-Relational DBMS – use this if you have an enormous amount of content, these are scalable.

The Web changes data warehousing with a new set of data – clickstream analysis (CSA) – every time a user clicks to a new page – this is stored. This data source is outside the enterprise. Now, this data is outside the firewall. CSA looks exactly like traditional data warehousing. Web site scraping is used to get data from web sites. This is a way to get the data if the enterprise doesn’t own the data. One of the weaknesses of DW is that data is stale by ½ the refresh interval; the scalability issue is also great. Trends in this space include automatic data mining, federators should get traction, and visualization systems will get traction to complement data cubes.
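The "½ the refresh interval" figure can be checked with a little arithmetic. The sketch below rests on an assumption the summary does not state: that queries arrive evenly spread across the period between two refreshes. Under that assumption the age of the data seen by a query ranges from zero (just after a refresh) to the full interval (just before the next one), and averages out to half the interval. The function name and the 24-hour figure are illustrative only.

```python
def average_staleness(refresh_interval_hours, samples=1000):
    """Average age of warehouse data as seen by queries spread evenly
    between two refreshes (an assumption of this sketch)."""
    step = refresh_interval_hours / samples
    # Age of the data at the midpoint of each small time slice since the last refresh.
    ages = [(i + 0.5) * step for i in range(samples)]
    return sum(ages) / samples


if __name__ == "__main__":
    # With a nightly (24-hour) refresh the average comes out at about 12 hours.
    print(average_staleness(24))
```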

Michael suggests the following to improve web design.

1.      Plan for short design cycles - web cycle time appears to be about 3 months and the rapid prototyping mentality is really required.

2.      Scalability is key. Test a design for scale before it goes live. Make sure that you hire serious system software expertise. Availability is a must. Replicate your data and make sure to turn RAID on.

3.      Do only what you are good at. Figure out your core competency and out-source everything else.

4.      Do everything only once. This means run one ETL system, one EAI, one Federator, etc.

5.      Fewer islands of information. Use fewer system administrators, less training, fewer manuals, etc. Converge federator and EAI, and converge app server into OR DBMS.

6.      Use XML appropriately – use as a transport protocol not a storage format

   

CONFERENCE SESSIONS  
Tuesday, March 6th, 2001

 

KEYNOTE

Speaker

The Agile Organization

 

Tom DeMarco, Principal,

Atlantic Systems Guild

 

Summary by Linda Kresl

Tom kicked off the Meta-Data DAMA conference with a flair. He said the systems we build today are characterized by: more stakeholders, conflict, shorter schedules, tighter budgets, more visibility, and risk. And modern day systems are harder because we built all the easy ones years ago.

The major point that Tom is making in this presentation is that we need to introduce “slack” in our work environment. His definition of slack is the degree of freedom (in time and budget and manpower and space, etc.) necessary to make change possible.

What is a quality focus today? Most of our quality programs focus on defects. How do we live with the fact that many of our products are chock full of defects? Take Microsoft’s IE, for example. Does the software transform your world? The fact that it has defects is of no consequence; it transforms the way a person does one particular thing.

We must consider human capital as the most important asset of an agile organization. The agility principle is based on prioritization. Tom’s view on priority is a great departure from the norm. He suggests rank-ordering priorities and putting projects on hold when their priority doesn’t justify doing them yet.

Tom’s Prescription for a new era

·        Become less “efficient”

·        Lighten process (strive for light process and heavy skills)

·        Learn to Prioritize

·        Choose your projects very wisely; what you decide not to build is more important than how you build

·        Invest in human capital

People must spend time thinking today. Tom spent one whole summer just thinking. Don’t spend all your efforts strategizing. Everyone should put some slack back into their life. Put some slack back into your organization.


 

Conference Session

Presenter

Data Architecture at USAA

 

Andres Perez, Enterprise Data Architect, USAA

 

Summary by Linda Kresl

Andres is chartered with bringing more rigor to USAA’s data architecture. USAA is an automobile insurance agency that prides itself in serving its members with superior information. Andres hopes that what he shares today will be something that you can take home with you and use in your own organizations.

He discussed the fact that they have a large IMS legacy system. It is extremely difficult to do data mining with data in this format. Much of the data isn’t defined correctly and it is conflicting. Semantic problems are those in which data attributes don’t match up from different reports. Also the data is constrained to a given channel. The web may help alleviate this problem.

Most of the applications at USAA have more interfaces than users. One application alone has 4,500 interfaces. There are several translations that must take place for any single application to run. This has created a fur ball of data! Andres states that 50% of the total IT budget is spent maintaining the interfaces.

The single reason that data is inconsistent is what the individuals believe their business processes are. Every individual truly believes they are doing the right thing, when in reality they are not doing what is best for the business. Because of the focus on projects and not the enterprise, USAA has redundant data.

USAA’s data architecture is based on the Zachman Framework. USAA still has many obstacles to understanding its customers’ needs. By implementing the Zachman Framework they hope to understand and relate relevant data. Andres is proposing a common data model and definition. He also proposes a reference guide to manage and control meta data.

The desired data architecture for USAA is creating data structures that are subject oriented and in canonical form. Once the data is moved to subject areas Andres proposes creating data marts based on these subject areas.

 

Conference Session

Speaker

Data Quality as a Profit Center

 

Wendy Wood, Data Quality Analyst,

SBC Services

 

Summary by Margaret O’Hara

Wood began her presentation with a comment about “dirty data” being a renewable resource, and thus offering her job security. She then explained the mission of her department at her firm: data quality. She discussed how high data quality can help the company achieve its goals of faster and better market response, improved business flowthrough and customer delight.

Wood believes that the main questions to ask in your company are: (1) Are you getting the data you’re expecting, and (2) what is it worth to you and your company? Wood believes that customer addresses are a good place to start a data quality initiative because most firms have address data, and many areas of the company have problems with the address data. At PacBell, customer addresses are a major issue. This is because a single customer may have up to three addresses: the service address, the billing address, and the listing address. While the company can handle “less-than-perfect” addresses for service (e.g., the second house behind the gas station on the corner), the Post Office cannot. More importantly, discounts available for complete addresses were threatened.

To correct the problem, Wood found users who cared about the data. She stressed that data quality was not something that IT could achieve by itself; a user-sponsor was critical. She urged the audience not to take such projects on by themselves, and to be sure there is buy-in from the business. She then briefly stepped the audience through the cleansing process. First, take the highest level “one” table – country is a good example as it has few values. Examine and correct the data in that table, then move down to state, then to city, etc. She cautioned that the lowest level tables are the ones with data quality issues that are the hardest to identify. She also advised looking at small samples (perhaps 10% of the data) before undertaking the project. At the very least such an examination will allow the firm to learn more about its data.
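The top-down order described above might be sketched as follows. Everything in the example is hypothetical: the reference values, the record layout and the function names were invented to illustrate the idea and are not PacBell's process. Each record is checked at the highest level first, and lower levels are only examined once the level above is known to be good. The sampling helper takes every tenth record, which is one simple way to look at roughly 10% of the data.

```python
# Hypothetical reference data, from the highest-level table downward.
VALID_COUNTRIES = {"US"}
VALID_STATES = {"US": {"CA", "TX"}}
VALID_CITIES = {("US", "CA"): {"Anaheim", "San Diego"}, ("US", "TX"): {"Austin"}}


def first_problem(record):
    """Return the highest level at which the record fails, or None if it is clean."""
    country, state, city = record["country"], record["state"], record["city"]
    if country not in VALID_COUNTRIES:
        return "country"
    if state not in VALID_STATES.get(country, set()):
        return "state"
    if city not in VALID_CITIES.get((country, state), set()):
        return "city"
    return None


def audit(records):
    """Count problems per level, attributing each record to its highest failing level."""
    counts = {"country": 0, "state": 0, "city": 0}
    for record in records:
        level = first_problem(record)
        if level is not None:
            counts[level] += 1
    return counts


def sample_every_tenth(records):
    """A simple systematic sample of roughly 10% of the records."""
    return records[::10]
```

Running `audit` on the sample before the full data set gives an early, inexpensive picture of where the problems are concentrated.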

 

Conference Session

Speaker

Introduction to the Unified Modeling Language

 

Eric Naiburg, Rational Software

Summary by Anne Marie Smith

Eric Naiburg (presenting for Terry Quatrani, who was unable to attend due to weather) introduced the concepts of the Unified Modeling Language, how it can be used, and some examples of UML in modeling.

History of UML: created by Booch, Rumbaugh and Jacobson – all were working on methodologies / languages for visualizing, specifying, constructing and documenting the artifacts of a software system. These methodologies were synthesized with the assistance of Rational Software Corp., and have evolved into a unified format, notation and language designed for modeling applications and data.

Eric explained the various diagrams in the UML:

Activity Diagrams: Show the flow of control in a system, from start to finish. They represent processes (activities) and the order in which each occurs. An activity diagram can be used to illustrate the data entities needed, as the basis for database design.

Use Case Diagrams: Use cases and actors are the 2 components of a use case diagram. An actor is someone or something that must interact with the system to perform an action. A Use Case is a pattern of behavior that the system can exhibit. Each use case is a sequence of related transactions performed for an activity, involving one or more than one actor. Use cases are a high level requirements gathering and documentation method, and are essential to an object-oriented system development.

Sequence Diagrams: Display object interactions in the order in which they will be performed.

Collaboration Diagrams: Displays object interactions organized around objects and their links to one another.  

Class Diagrams: Shows the existence of a class and its relationships in the logical view of a system. Classes are collections of objects with a common structure, common behavior and common relationships. Eric explained the concepts of association, aggregation, dependency and inheritance relationships in classes. Eric mentioned the similarity between entities and classes, to demonstrate the commonality between ER modeling and UML modeling. He showed the essential nature of “classes” in object-orientation, and the modeling of classes and relationships within the UML.
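For readers coming from the data side, the four relationship types can be illustrated in code. The classes below are hypothetical examples chosen for this sketch and were not shown in the session.

```python
class Party:
    """Base class for anyone we do business with."""

    def __init__(self, name):
        self.name = name


class Person(Party):
    """Inheritance: a Person is a kind of Party."""


class OrderLine:
    def __init__(self, product, quantity):
        self.product = product
        self.quantity = quantity


class Order:
    def __init__(self, customer):
        self.customer = customer  # association: an Order refers to a Party
        self.lines = []           # aggregation: an Order is made up of OrderLines

    def add_line(self, line):
        self.lines.append(line)


class InvoicePrinter:
    def render(self, order):
        """Dependency: uses an Order passed in, without holding on to it."""
        total_items = sum(line.quantity for line in order.lines)
        return f"{order.customer.name}: {total_items} item(s)"
```

The parallel with ER modeling is visible here: `Order` and `OrderLine` read much like a parent and child entity, while inheritance corresponds roughly to a supertype and subtype.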

State Diagrams: Show the life history of an application, and are similar to an activity diagram at a point in time. This diagram type is not used as frequently as activity diagrams or sequence diagrams for application development. They are more frequently used for networking implementation.

Component Diagrams: Shows the physical implementation of a class and its actions (DLL, programs, interfaces). Deployment diagrams represent the processor and devices used in implementing a system.

Eric concluded by explaining some of the extensions to the UML that are frequently used, and discussed how to bring UML and its concepts into the “data world”. He cited the universality of the UML in business modeling, requirements modeling and application development. He encouraged attendees to learn more about UML and to apply its concepts and techniques in their data activities.

Questions for Eric were mostly technical and documentation-oriented, and showed the high level of interest in the UML and its place in data management.

 

Conference Session

Speaker

Implementation/Use of Operational Meta Data to Improve Data Quality in the Data Warehouse

 

Mike Jennings, Architect and Manager, Hewitt Associates LLC

 

Summary by Ron Klein

Mike Jennings discussed the Meta Data Repository (MDR) and the Data Warehouse. He assumes that the MDR should be independent of both ETL tool selection, and of the Dimensional Modeling technique used.

The purposes of the Repository in the BIE (Business Intelligence Environment) are:

-         The repository product and its data model allow the various function areas in the data warehouse environment to communicate

-         To provide context to the data content, processes and reports

-         Central hub of the data warehouse environment

-         Allow project teams to focus on the operational source system and data warehouse data models, not the repository

-         Provide a single location for integration between the operational source systems, data warehouse, ETL processes, business views, reports, and operational statistics

Mike presented a Generic Meta Data Repository Model (see slide #8 in the speaker’s materials on the conference CD-Rom). He reviewed the various types of business meta data (e.g. Business terms and definitions for tables and columns, subject area names, query and report definitions, report mappings) and technical meta data (Physical table and column names, Data mapping and transformation logic, Source systems, Foreign keys and indexes, Security, ETL process names).

Operational meta data is an extension of the design and architecture of the data warehouse that provides processing optimizations in data acquisition design, maintenance activities, end user reconciliation and auditing of information. It provides an extra bridge between the meta data repository and the data warehouse through the addition of physical columns in the design for ease of use, both technical and business. Operational meta data use will require additional ETL processing steps and time. If a meta data repository cannot be extended for operational meta data or is not available, lookup tables can be used as an alternative in the warehouse model. Operational meta data provide a detailed, micro-level explanation of the information content in the data warehouse. The key distinction of this method is the direct association of meta data with each row of information in the data warehouse, which allows for a detailed (row level) explanation of information content, versus a repository (table/column level).

Transforming the Logical Data Model into the Data Warehouse Data Model

There are eight (8) basic Inmon transformation rules to be applied to the Logical Data Model in order to convert it into a Data Warehouse Data Model. These transformation rules should typically be applied in sequence. Mike’s own “modified” version of these rules is:

1. Removal of purely operational data

2. Addition of an element of time to the key structure and operational meta data

3. Addition of derived data

4. Transformation of data relationships into artifacts

5. Accommodations of different levels of granularity

6. Merging like data from different tables

7. Creation of arrays of data

8. Separation of data attributes based on their stability

Operational Meta Data Examples

There can be various technical meta data columns (tags) utilized in the data warehouse data model and ETL processes for enhanced automated support.

- Load Cycle Identifier

- Current Flag Indicator

- Load Date

- Update Date

- Operational System(s) Identifier

- Active in Operational System Flag

- Confidence Level Indicator

- Cyclic Redundancy Check (CRC)

These columns are added during transformation of the Business Logical model into the Dimensional or Data Warehouse data model. Use of certain operational meta data depends on the type of table in question (e.g., Update date on a fact table would result in little value since these tables are not typically updated in a standard warehouse). Mike discussed an example of a strategy for operational meta data use for slowly changing dimensions (SCD). This can be reviewed in his paper on the conference CD.
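As a rough illustration of row-level operational meta data, the sketch below tags a warehouse row with several of the columns listed above during a load step, and uses the CRC to detect whether an incoming source row differs from the stored one. The column names, the row layout and the helper functions are assumptions made for this example; they are not taken from Mike's paper.

```python
import zlib
from datetime import date


def row_crc(row):
    """CRC-32 over the business columns, taken in a fixed (sorted) order."""
    payload = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return zlib.crc32(payload.encode("utf-8"))


def tag_row(row, load_cycle_id, source_system_id, load_date):
    """Return a copy of the row with hypothetical operational meta data columns added."""
    tagged = dict(row)
    tagged.update({
        "load_cycle_id": load_cycle_id,
        "current_flag": "Y",
        "load_date": load_date.isoformat(),
        "source_system_id": source_system_id,
        "crc": row_crc(row),
    })
    return tagged


def has_changed(incoming_row, stored_row):
    """Compare the incoming row's CRC with the CRC stored on the warehouse row."""
    return row_crc(incoming_row) != stored_row["crc"]


if __name__ == "__main__":
    stored = tag_row({"name": "Pat", "city": "Anaheim"}, 17, "CRM", date(2001, 3, 6))
    print(stored["load_cycle_id"], stored["current_flag"], stored["load_date"])
    print(has_changed({"name": "Pat", "city": "Irvine"}, stored))
```

Comparing one stored checksum instead of every column is what makes this kind of tag useful for change detection in slowly changing dimensions.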

 

Conference Session

Speaker

The Data Resource Repository

 

David Hay, President, Essential Strategies

 

Summary by Carey Clark

David Hay creates the most readable data models in the world (in this author’s humble opinion). In this presentation he presents over 30 logical models and meta models covering all aspects of the information systems development process itself. Models presented describe the entities and relationships of the artifacts created during analysis, design, and programming. He also showed models for data transformations, business rules, screen design, and object oriented programming. Doing this not only provides a basis for storing the relevant meta data that would reside in a repository, but also goes a long way in helping us to understand what we ourselves do.

David avoids the term “meta data” in reference to repositories. He thinks it’s too restricted. Instead he defers to Michael Brackett’s designation, “The Data Resource Repository”.

He reviewed historical efforts to create a repository and provided his assessment of their success. The OIM and OMG versions he felt were too abstract. They hold lots of stuff but not the stuff a typical data modeler would recognize. Oracle Designer is promising. TDAN and Aera Energy were potentially workable. But he decided to have a go at it himself.

He plugged the TDAN newsletter at www.TDAN.com as required reading. His own three articles on his Repository Models are there as well. He started simple and progressed with more, and more elaborate, repository meta models. All are worth studying and I recommend viewing them.

He contends that UML is only a data modeling notation and that there is nothing fundamentally different from other notations. It does some things okay but is not easy to read. He therefore defers to the ER (crows feet) notation instead. UML also tends to focus the modeler on the application (physical) rather than on the business (logical).

Dave explained using the models how certain issues were handled. For example there is the need to have a way to describe elements that initially may be populated but eventually must be populated. Most tools make you decide one way or the other up front. He includes derived data in his model. Whether that data is derived when viewed or stored is an implementation decision. The logical model is the same.

The problem with most meta models is that they are too abstract for anyone but data modelers. In order to make models readable to the user community he added the concept of “virtual entities” that derive from the abstract one. Thus one can display the entity Customer in a model view, even though Customer is really the Role of a Party (where Party is a Person or Organization).
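The idea of a virtual entity can be sketched in a few lines of Python. This is only an illustration of the concept as summarized above, not David's actual meta model: the class names, fields, and the `virtual_entity` helper are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Party:
    party_id: int
    name: str
    kind: str  # "Person" or "Organization"

@dataclass(frozen=True)
class PartyRole:
    party: Party
    role_type: str  # e.g. "Customer", "Supplier"

def virtual_entity(roles, role_type):
    """Derive a readable 'virtual entity' view: the parties playing one role."""
    return [r.party for r in roles if r.role_type == role_type]

acme = Party(1, "Acme Corp", "Organization")
jane = Party(2, "Jane Doe", "Person")
roles = [PartyRole(acme, "Customer"), PartyRole(acme, "Supplier"),
         PartyRole(jane, "Customer")]

customers = virtual_entity(roles, "Customer")
print([p.name for p in customers])  # ['Acme Corp', 'Jane Doe']
```

The abstract structure (Party and PartyRole) stays the single source of truth, while users see the familiar name Customer.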

He believes that use cases are awkward because they assume you understand the process you’re modeling. They are essentially context level data flow diagrams but lack some of the formality and rigor.

Dave is currently working on a business rules meta model with the Business Rules Group. This group is sort of a replacement for Guide. Check it out at businessrulesgroup.org.

Not everything about a business belongs in a Repository. He doesn’t claim his models cover every possible modeling subject. For example, work flow models, events, policies might be better stored in their own data store. In none of his models does one see foreign keys. It’s a mechanism for implementing relationships. At the logical level they are implied by the relationship link. Putting them in the model is redundant.

His models are particularly readable and elegant. He uses Oracle Designer, which allows subtypes to be nested and entities to be stretched so that relationship lines rarely overlap and never bend. The bad news is that it’s expensive.

 

Conference Session

Speaker

XML Without Fear

 

Alan Perkins, Vice President,

Visible Systems

 

Summary by David Plotkin

This presentation introduced the basics of XML, including the fact that it is content-based, not presentation-based. It also identified what tags are used for, and briefly discussed Elements, Attributes, and Entities, with examples.

The main point of the talk is that XML Without Fear is based on documenting Enterprise Meta data in the form of business rules. The types of business rules were listed, including definitions, data integrity constraints, derivations, inferences, processing sequences, and relationships among facts. The presentation discussed the advantages of managing business rules, and the characteristics of a "good" business rule.

The bulk of the presentation discussed modeling of business rules. In general, constraint-type business rules and derivations cannot be modeled in a "standard" data modeling tool. However, using Visible Systems' tool, Alan demonstrated how data modeling could be extended to model these types of "impossible to model" business rules.

 

Conference Session

Speaker

Data Management Support for Enterprise Architecture

 

Brett Champlin, Architecture Consultant, Allstate Insurance Company

 

Summary by Linda Kresl

This presentation offered valuable insights on how your company can manage the data for your enterprise architecture. Brett’s examples from Allstate Insurance give practical suggestions to handle this difficult task. The key is to manage the models that support the architecture, but an Enterprise Architecture is much more than just models. Enterprise architecture is models, principles, and standards. It includes data and process modeling and application and technologies architecture.

In this presentation Brett explained architecture definitions. His first definition was an engineering definition of architecture – the art and science of building. And the purpose of architecture is to convey a design. Information systems architecture is the blueprints, drawings and models, which define and describe what is needed.

Brett presented many schematics and diagrams to show different architectural frameworks, e.g. Zachman, Gorman’s Knowledge Worker, Framework for 3-tier C/S development. Brett compared Enterprise architecture to city planning, comparing the buildings in a city to systems in an enterprise. The most important element is the infrastructure – what is underneath supporting the buildings and systems.

Data management support includes defining the processes, choosing a framework, and integrating the EA with key business processes. Brett mentioned several tools to help manage the EA, including Corporate Modeler by CASEwise, Metis by NCR, and Architect by ZTI.

 

 

Conference Session

Speaker

Business Rule Specification, Validation & Transformation: Advanced Aspects

 

Terry Halpin, Technical Lead in Database Design, Microsoft

 

Summary by Margaret O’Hara

Halpin began his presentation by asking the audience how many used data use cases and object-role modeling (ORM) in their work. About one third of the audience had used them. Halpin’s basic premise in the presentation was that data use cases and ORM are:

- more understandable because they state facts and rules in English and/or intuitive graphics  
- more reliable because they validate rules using English and sample populations  
- more expressive because they capture more business rules graphically  
- more stable because they minimize the impact of change in models.

Halpin used the example of birth date. Instead of stating that a person has a birth date, with ORM this becomes, “I was born on ____” -- a much more natural way for the user to state the date. For the remainder of the presentation, Halpin presented ORM examples.
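To make the idea of validating a rule with English and a sample population concrete, here is a small Python sketch. It is not one of Halpin's examples: the fact wording, the sample names and dates, and the helper functions are hypothetical, and ORM tools do this graphically rather than in code.

```python
from collections import defaultdict

# Fact type (hypothetical wording): "Person was born on Date".
# Constraint verbalized in English: "Each Person was born on at most one Date".
SAMPLE_POPULATION = [
    ("Ann", "1960-05-17"),
    ("Bob", "1972-11-02"),
    ("Ann", "1961-05-17"),  # deliberately conflicting fact
]

def verbalize(person, born_on):
    """Render one fact instance as an English sentence for user review."""
    return f"{person} was born on {born_on}."

def uniqueness_violations(population):
    """Return persons who appear with more than one birth date."""
    dates = defaultdict(set)
    for person, born_on in population:
        dates[person].add(born_on)
    return sorted(p for p, d in dates.items() if len(d) > 1)

for fact in SAMPLE_POPULATION:
    print(verbalize(*fact))
print(uniqueness_violations(SAMPLE_POPULATION))  # ['Ann']
```

Showing the user a counterexample row such as the second "Ann" fact is a quick way to confirm whether the constraint really holds in the business.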

In his concluding remarks, Halpin stated that ER was useful for basic data modeling, but that commercial versions were restricted with regard to business rules. UML is useful for OO code design but not for information analysis as its use cases are too process-oriented. For the ER and UML users, Halpin suggested they use ORM for analysis and then map to ER or UML, supplement ER and UML with data use cases, or enhance ER and UML to make them more ORM-like.

 

 

Conference Session

Speaker

Business Process Analysis and Logical Process Modeling

 

Anne Marie Smith, Assistant Professor, LaSalle University

 

Summary by Anne Marie Smith

Anne Marie Smith, assistant professor of MIS at LaSalle University and a data architect consultant, gave an overview of the concepts of business process analysis and its relationship to data analysis, with a brief overview of the methods used to model logical processes and that model’s relationship to a logical data model.

Anne Marie noted that process analysis should be used in all systems development, whether for transaction processing or decision support/data warehousing, and for both traditional and electronic commerce applications. She cited the failure rate of application development projects of all types and the lack of understanding of the processes that occur, which causes frustration in the user and IT communities.

Business Processes do not operate in a vacuum: they need data to validate the reason for the processes’ existence. As such, Anne Marie described the interaction between data analysis and process analysis, and the need to have BOTH analyses for full application development and user effectiveness.

Anne Marie’s presentation was enhanced by actual experiences from her consulting and information management career, which demonstrated the interaction between data and process in successful implementations across different types of development.

With a very brief overview of logical process modeling, Anne Marie introduced this method to the data analysts in attendance. She concluded by reiterating the ideas from the introduction and by relating the needs for understanding processes to data analysts’ understanding of the need for data analysis.

Some reactions/questions to this presentation showed that DAMA needs more exposure to processes and processes’ intimate relationship to data – more process-oriented presentations were requested for future conferences.

 

Conference Session

Speaker

Build Your Own Web-Based Meta Data Repository

 

Joseph Newcum, Senior Data Architect, Bank One

 

Summary by Carey Clark

There are several reasons for building your repository rather than buying one. Vendor versions tend to be costly and can be difficult to modify. On the build-your-own side of the issue, you must have the skill and patience in house to attempt the project.

Joseph separates meta data into operational and developmental. The first deals with the flow of information in the enterprise such as for loading a data warehouse. These activities happen day in and day out. Development meta data concerns the creation of applications, the analysis, models, and constructs used on a project. Your repository will be different depending on your emphasis.

Bank One spent two years evaluating third party repositories. Their focus was using the repository to build a data warehouse. Building their own repository wasn’t straightforward. It took 4 tries. The first failed because it was too difficult to load data from their CASE tools. The second failed for lack of skilled object-oriented programmers. The third was a purchased repository that didn’t fill the bill. The fourth try succeeded.

The successful approach to building their repository was to create a prototype in Microsoft Access, prove the design, and then rebuild it in HTML and JavaScript for dissemination over the Web. They used Microsoft tools (Active Server Pages, ActiveX Data Objects, Java, etc.). Their modeling tool is ER/WIN. They don’t have XML incorporated yet.

Joseph walked through and discussed the various display screens in the Access prototype. The initial application ended up smaller in many ways because certain meta data simply wasn’t available. The resulting application primarily supports a data warehouse environment.

They made the interface look like Business Objects. Users were already familiar with it so the learning curve was reduced. The user interface is clean and robust. What goes on under the covers is something of a jumble but is constantly being improved. He believes this is the right approach. Make the interface elegant and robust and don’t worry so much about internals. You can change those without the end user being affected. Right now they are modularizing it into VB classes and moving data into business objects. Subject matter experts input definitions directly.

He showed the meta models underlying the repository. They started out as a very abstract thing-thing model used by Knowledgeware’s Application Development Warehouse. Later it was redone to be less abstract.

An audience member asked if data models themselves are viewable on-line. The answer was yes, but he found that few developers ever used those views: there is just not enough space or resolution. Instead most of them plotted the models out on large plotter paper and pinned them up in their cubicles.

He recommended the books: Visual Basic 6 Business Objects and Visual Basic 6 Distributed Objects. These, he said, would be valuable for their architectural insights even if you didn’t use Visual Basic.

 

 

Conference Session

Speaker

The Role of Data Administration in Managing the Enterprise Portal

 

Arvind Shah, President,

Performance Development Corporation

Summary by David Plotkin

This presentation defined the many kinds of personalized portals (such as consumer, vertical, B2B, and Corporate) and their purposes. It discussed the typical problems with B2B portals, and the roles of data administration in solving these problems.

The roles included some roles that are typically considered part of data administration, and some (like performance tuning, security, and supply chain standardization) that are not. The roles typically considered part of data administration included Planning-Architecture development, Content Management, and Information Quality Management.

Architecture Development consists of managing Enterprise architecture, establishing a process model, building the data model, setting up the business rules, and creating strategies for information, technology, and BPR initiatives. Content management consists of managing data architecture, enforcing data standards, assuring data timeliness & quality, and assuring security levels. It also means managing meta data.

 

 

Conference Session

Speaker

Developing a Corporate Data Architecture in a Federated World

 

Deborah Henderson, IT Architect,

Hydro One Networks, Inc. &

Vladimir Pantic, IBM

 

Summary by Linda Kresl

Deborah presented first and described the business of Hydro One Networks. Hydro is a wholesale retail electric utility. She gave several examples of the work that Hydro One is creating in defining their data architecture. They have a high re-use of data and processes across the enterprise. She stated that they are leveraging their data warehouse – this is the driver for the data architecture.

The data architecture is composed of local data, OLAP and details, external and historical data and the ODS source. Meta data ties everything together.

The physical database architecture includes Oracle 8i, RI, multi-dimensional cubes, and a meta data repository through hooks.

Hydro One is using IBM’s LOVEM methodology to develop and document processes and implement procedures. This methodology tracks the life cycle of these deliverables.

At Hydro One business rules are implemented via the ETL. The ETL then feeds the data marts where additional information is stored to support the data architecture.

 

Conference Session

Speaker

Facilitation and the Successful Architect

 

Shelly Lieberman, Director, Strategic Directions, Mathtech

Summary by Margaret O’Hara

In this well-organized and entertaining presentation, Lieberman shared her experiences at the Division of Alcoholic Beverage Control (ABC) in NJ and the part that facilitation played in achieving a successful business process reengineering effort. She began by defining facilitation as the process of harnessing user knowledge and expertise in a group to accomplish objectives and develop deliverables.

Her presentation included discussion of when and why one should use facilitation, an overview of the ABC project, the facilitation approach she used, the results of the facilitation sessions with the ABC and the critical success factors for the sessions. The facilitation process consists of careful planning, execution and follow-up, very often with the follow-up activities feeding directly into the next planning session. A knowledge of the organizational culture is critical, as not all techniques work in all cultures. Not all sessions are facilitated; only those involving major issues among the involved parties.

Once the sessions have been scheduled, it is important to follow a strict agenda. Each session is split into three parts: an opening module where the stage is set, the work module, and the closure module where the wrap-up and summary take place. “Boarding” issues (writing them in a public space in the room for everyone to see) often defuses conflict, because people are assured they are being heard.

Lieberman presented the rules for sessions, including everyone is equal, critique ideas, not people, etc. and shared the evaluation forms she uses for the sessions. She also presented the critical success factors for the sessions. Among these were: commitment from management for change, knowledgeable participants, open communication, and extensive follow-up. Lieberman also spent some time dealing with the challenges, such as groups not wanting to follow structured agendas (stay focused on the issues, but let the group do their thing), the director having most of the say (talked to director in background), and “naysayers” who didn’t want change (persuaded to join group by the director).

The session concluded with Lieberman sharing some resources for further information (iaf-world.org).

 

Conference Session

Speaker

The Practical Use of a Universal Data Model in the Data Warehouse

 

David Lepley, Data Analyst,

Tyco Electronics

 

Summary by Anne Marie Smith

To demonstrate the need for “context” with data, David gave an overview of the electronics environment and his company’s history before launching into a presentation on the Tyco global data warehouse development and its reliance on universal data models.

David’s presentation gave us:

Business Rules Approach: explained the rationale for business rules in a Data Warehouse, showed the drivers of the business as fundamental for understanding the data contained in a data warehouse, and described why these factors pointed Tyco to using a universal model for its data warehouse

The Universal Database Concept and the Universal Database Tables: this is a database design where business rules about data are stored and used to facilitate development of new and enhanced applications. David briefly described how Tyco has implemented this universal database in Oracle, using partitioning and other DBMS facilities.

David’s presentation answered the question “Where do these concepts fit into the Data Warehouse Architecture?” He explained the roles of data quality in data warehousing and showed how Tyco is changing its culture to verify and ensure data quality. David referenced Barbara von Halle and David Hay throughout the presentation, providing reinforcement from experts for his organization’s approach.

He stressed how this approach was unique to his organization, and the risk the team took in using a universal data model for the Tyco Data Warehouse. Thankfully, this approach has been successful to date, and has been helped by their use of flexible structures, business rules and committed IS and business team members.

 

 

Conference Session

Speaker

Understanding and Managing Reference Data

 

Malcolm Chisholm, Manager,

Deloitte & Touche

 

Summary by Ron Klein

What is Reference Data?

Reference data is any kind of data that is used solely to categorize other data found in a database, or solely for relating data in a database to information beyond the boundaries of the enterprise.  

Reference Data…at best, like Cinderella, is forgotten

Reference Data…at worst, the “Rodney Dangerfield” of the world of data – “No respect at all”

Characteristics of Reference Data

1 – Rate of Change - Table structures change rarely, though there can be exceptions, such as in the world of foreign exchange rates

2 – Volume – Reference data tables typically have few rows and columns, but there may be many reference tables in a data model  


Q: How do you distinguish reference data from domain?  
A: Yes, it can be hidden in the domain causing problems for reporting

3 – Scope - One Reference Data table can have relationships to many other tables in a single database, or across an enterprise

4 – Meta data and Meaning - Individual values of Reference Data can have meaning, very unlike other data where attribute definitions suffice

Reference Data Management Issues

-         Implementation is typically in Program Code, not Database Tables. Using values taken from Reference Data tables is fine; defining values in program logic that can be used in updates is not

-         Usage of External Standards. External standards can be useful, however they may suffer from “information float” and may not always match the requirements of the enterprise

-         Divergence - Different applications have independent functionality for updating their own Reference Data tables. This leads to divergence in data. The result is MAPPING whenever data has to be shared between the different databases. Mapping typically involves semantic analysis, data quality checking, and resolving granularity problems
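The mapping problem described in the last item above can be sketched as follows. This is a minimal Python illustration, assuming two applications that keep their own country codes; the application names, codes, and helper functions are hypothetical and were not part of the presentation.

```python
# Hypothetical shared reference codes, managed centrally.
SHARED_CODES = {"US", "CA", "GB"}

# Hypothetical mapping from each application's local code to a shared code.
# A coarser local code such as "EU" has no single target, which is the
# kind of granularity problem that mapping has to resolve by hand.
CODE_MAP = {
    ("billing", "USA"): "US",
    ("billing", "CAN"): "CA",
    ("shipping", "U.S."): "US",
    ("shipping", "UK"): "GB",
}

def map_code(application, local_code):
    """Translate a local reference code, or return None if it is unmapped."""
    shared = CODE_MAP.get((application, local_code))
    return shared if shared in SHARED_CODES else None

def unmapped(application, local_codes):
    """List local codes needing semantic analysis before data can be shared."""
    return [code for code in local_codes if map_code(application, code) is None]

print(map_code("shipping", "UK"))                 # GB
print(unmapped("billing", ["USA", "EU", "CAN"]))  # ['EU']
```

Keeping the mapping in a table rather than in program logic follows the first management issue listed above: the values stay visible and maintainable as data.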

  Suggestions for Managing Reference Data

-         Accept that Reference Data is a distinct class of data that is different to other classes of data

-         Assign an “owner” for reference data. It needs to be centrally managed. Perhaps the data administration function.

-         Develop a strategy for assigning codes and acronyms as primary keys

-         Controlled redundancy can be a good strategy

-         Publish the content and meaning of reference data for use by developers and users

Q: Are you sure you can’t find this reference data. What are the obstacles?  
A: No one wants to touch it. Ownership usually goes to the Data Administration group. On the other hand, business users can sometimes own classification schemas.

Q: Multiple owners that do not co-share?  
A: 3rd category -> a central repository, non trivial

 

 

Conference Session

Speaker

Architecting and Implementing a Web-Based Corporate Meta Data Repository at the Census Bureau

Gail Wright, Technical Director,

Oracle Corporation

Summary by Carey Clark

The Census Bureau does a lot more than count people every 10 years. It is chartered to conduct community, demographic, and economic surveys of organizations and businesses throughout the country. For example, every business in the country will receive a questionnaire in 2002.

The questionnaires ask different sets of the same questions depending on the industry and audience. Creating these questionnaires on paper took months. Analyzing the results was equally labor intensive. So the goal was to make a corporate meta data repository that would use meta data to generate surveys, collect and collate the data, and disseminate the results.

Gail covered their reasons for the repository, what was included in the repository, how it was architected, designed and implemented. Lastly she showed how the repository is now poised to be used for nine other major governmental departments. Because of this effort, work that took months can now take days. Data is more reliable, and different kinds of studies are possible. The whole survey process is now meta data driven.
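A toy Python sketch of what "meta data driven" survey generation can mean: each question is stored once with meta data about where it applies, and a questionnaire is assembled by filtering. The question texts, industry names, and the `build_questionnaire` function are invented for illustration and do not describe the Census Bureau's actual design.

```python
# Hypothetical question registry: each question is defined once, with
# meta data recording which industries it applies to.
QUESTIONS = [
    {"id": "Q1", "text": "Number of employees?", "industries": {"retail", "mining"}},
    {"id": "Q2", "text": "Total retail floor space?", "industries": {"retail"}},
    {"id": "Q3", "text": "Tons of ore extracted?", "industries": {"mining"}},
]

def build_questionnaire(questions, industry):
    """Select, in registry order, the questions that apply to one industry."""
    return [q["text"] for q in questions if industry in q["industries"]]

print(build_questionnaire(QUESTIONS, "mining"))
# ['Number of employees?', 'Tons of ore extracted?']
```

Because the shared question Q1 exists only once, answers to it can be collated across industries without any matching step.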

This repository is remarkable in many respects. It’s large, comprehensive, and based on open industry standards, and it contains tabular and non-tabular data with reference materials and full text search. While most of us aspire to making a car, they have a space ship.

Their repository includes data content, quality, its condition, context and meaning. It includes data models, business models, screen layouts, mappings and transformations, hierarchies, aggregation rules, formulas, schedules, access controls and actual code. The repository is composed of the following components:

Nothing is application specific. Industry standards are followed where they exist. XML is used extensively. No software is created or modified directly. All of it goes into a modeling tool and is generated from there. The custom stuff is passed through but is forced to follow the required standards and process.

Gail described the repository as having a “tightly-to-loosely coupled architecture”. She described it and the tools used in detail. It’s scalable, provides for open APIs, is self-documenting and easy to maintain.

She walked us through the interface screens and showed how the navigation worked and how versatile it was. Security is underneath a set of “portlets” that determines who gets to see what. The public can see quite a bit at the web site, American Fact Finder (factfinder.census.gov).

The effort has gone from being a good idea to being mission critical. The Census Bureau wouldn’t think of running their business now without it.

Questions and Answers

Their repository doesn’t overlap much with the Common Warehouse Meta Model. CWM is more focused on tool development at the technical level. Theirs is more focused on the business level.  

It took 5 people a year to create the data element registry. She has 13 people in her group working on various projects.  

They decided not to do it in Java. They didn’t have the skill set. They mostly use Oracle Designer and generate PL/SQL.  

Michael Gorman, who introduced Gail, emphasized the importance of pointing out to executives and others how much savings and benefits a successful project achieved. Memories are short. “Selling after the sale” enables you to get funding for further projects.

 

 

Conference Session

Speaker

Building the XML Repository

 

David Plotkin, Senior Data Administrator, Longs Drug Stores

 

Summary by David Plotkin

This presentation covered XML from the data administrator's viewpoint, including what XML is and isn't, the various items (entities, elements, attributes), the rules for well-formed XML documents, and how to define the valid structure of an XML document using a DTD.  

Then, the complete metamodel for a repository designed to store DTDs and XML instance documents was presented. The major sections included DTDs and entities, DTDs and elements, elements and attributes, and physical implementation of elements and attributes.

The presenter also covered the functionality that is needed from a Repository, including scanning in DTDs, making changes, creating revised DTD output, building sample XML documents from DTDs, and doing impact analysis for changes. In addition, he pointed out that although this application is called a "repository", it is a limited-function implementation, and is not that difficult to design and build. However, you still need to use "industrial strength" tools -- no desktop databases need apply!
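Two of the functions listed above, scanning in a DTD and doing impact analysis, can be sketched in Python. This is a deliberately simplified illustration, not the presenter's design: the sample DTD is invented, the regular-expression scan ignores comments, parameter entities, and conditional sections, and an in-memory dictionary stands in for the database that a real repository would use.

```python
import re

# Hypothetical sample DTD used only for this illustration.
SAMPLE_DTD = """
<!ELEMENT order (customer, item+)>
<!ELEMENT customer (name, address?)>
<!ELEMENT item (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ATTLIST item sku CDATA #REQUIRED>
"""

ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([\w.-]+)\s+(.*?)>", re.S)
NAME = re.compile(r"[A-Za-z_][\w.-]*")

def scan_elements(dtd_text):
    """Map each declared element to the set of child element names it uses."""
    children = {}
    for name, model in ELEMENT_DECL.findall(dtd_text):
        # Drop keywords such as #PCDATA, then collect the remaining names.
        model = re.sub(r"#\w+", "", model)
        children[name] = set(NAME.findall(model)) - {"EMPTY", "ANY"}
    return children

def impacted_by(children, changed):
    """Return every element that directly or indirectly contains `changed`."""
    impacted, frontier = set(), {changed}
    while frontier:
        frontier = {parent for parent, kids in children.items()
                    if kids & frontier and parent not in impacted}
        impacted |= frontier
    return impacted

model = scan_elements(SAMPLE_DTD)
print(sorted(impacted_by(model, "address")))  # ['customer', 'order']
```

A change to `address` therefore needs review wherever `customer` or `order` documents are produced or consumed.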

 

Conference Session

Speaker

The Seven Deadly Sins of CRM

 

Jill Dyche, Partner,

Baseline Consulting Group

 

Summary by Anne Marie Smith

Jill Dyche, a partner at Baseline Consulting Group, presented the major mistakes of CRM from a data focus. Many sins are data-related and can be resolved by better attention to data management. According to Jill, those sins that are not data-related can be solved in part by a focus on data (and meta data, in the author’s opinion). However, data analysis cannot be done “in a vacuum” or bad actions can result.

She used references from her recent book, “e-Data: Turning Data into Information” from Addison-Wesley Publishing, offering “real-life examples” of each sin and its possible solution. Since “there is no such thing as plug-and-play in CRM” each example and possible solution must be evaluated in light of an organization’s goals and objectives.

The many different definitions of CRM are at the root of many of the problems and sins in CRM implementation. Data’s reliance on definitions can assist CRM in developing a solid and reusable definition to use in all CRM projects.

Sins:

  1. No Unified CRM Strategy (multiple CRM projects occurring simultaneously)

  2. Failing to Manage Staff Expectations of the benefits and costs of CRM

  3. Failure to Define Success in Customer Management

  4. Outsourcing Hastily (or Not at All)

  5. Failure to Change Business Processes (Failure to differentiate customers and change processes based on that customer’s value to the organization)

  6. Not Understanding Product Features and Differences in CRM Approaches (operational CRM versus analytical CRM)

  7. Lack of Integration, Understanding and Executive Attention (No “Single Version of the Truth”)  

Closing with Critical Success Factors, Jill reinforced the ideas she opened the presentation with, concluding with some examples of successful CRM implementation. Questions to Jill demonstrated the need for education in CRM, its concepts, implementation and approaches to solving these “7 Deadly Sins”.

 

 

Conference Session

Speaker

Elevating the Role of IRM for Business Effectiveness

 

Larry English, Principal,

INFORMATION IMPACT International

 

Summary by Margaret O’Hara

English began his presentation by explaining why traditional approaches to data administration have failed to create positive impact and acceptance in the enterprise. The cause, he believes, is that we are still operating under an industrial age paradigm. We fail to view information as a strategic enterprise resource because we have overlaid IT on obsolete structures. The industrial age is vertical; the information age is horizontal. To illustrate this, one example English used was that all managers (not just HR) can read organizational charts, all managers (not just financial) can read balance sheets, but only IT managers can read data models.  

To move from data administration to information stewardship (which English recommends), the organization must view information as a strategic resource with a resource management life cycle. This means that information must be planned for, acquired, applied, maintained and disposed of in the same manner as other resources.  

English presented some trends in data / information quality to illustrate that it is getting worse:

- in one firm, 66% of 6 million records were useless  
- DA influence seems to be decreasing  
- DRM is moving away from the business  
- 65% of data warehouse initiatives fail outright

English believes that the term meta-data should not be used because it has no meaning to non-IT people.

To elevate IRM effectiveness:

English believes we must move from data administration to information leadership, and from being data bigots to business bigots. He also told us: Don’t sell – listen!

 

Conference Session

Speaker

Comparison of Data Modeling Techniques

 

 

Panel: Davida Berger (moderator)

Graham Witt, Alec Sharp, Terry Halpin, Eric Naiburg

Summary by Davida Berger

This was a very lively advanced session with renowned modeling experts discussing the benefits and drawbacks of ERM (Entity Relationship Modeling), ORM (Object Role Modeling), and UML (Unified Modeling Language).

ERM

ORM

UML

No matter what methodology is used, the model must be designed for, and readable by, the business community. Special attention should be given to the presentation and arrangement of the diagram. Names of entities, attributes, and relationships should not be cryptic and should represent business terms, not computer or system concepts or functions.

 

 

Conference Session

Speaker

Meta Data – Myth and Realities

 

John Ladley, President

Knowledge InterSpace, Inc.

 

Summary by Ron Klein

John outlined his experience – he did “James Martin stuff”. He worked for Meta Group. He worked at integrating everything and doing Data Administration.

John makes the point that business is “gray” – not black & white. Collaborative Intelligence comes about when tacit and unstructured information is factored into a business decision.

The reality of meta data is that there are no comprehensive tools, repositories are not capable enough, there are 2-3 standards, and there is too much in-house development. However, CWM is a tremendous step in standards. Remember that CWM's scope is limited to data warehouse (DW)- and analytic application-relevant metadata, while the OIM schema is supposedly capable of handling knowledge management and business-process constructs. Therefore, enterprises considering panoramic metadata/repository initiatives may find CWM limiting, though more broadly supported.

Don’t be afraid to build your meta data bottom up.

Despite his apparent despair at the state of meta data products and management, John actually believes the importance of meta data will increase in the future. His summary slide said:

 

 

Conference Session

Speaker

The UPS Meta Data Repository – A Success Story

 

Patti Munier, Senior Data Analyst and Manager, United Parcel Service

 

Summary by Carey Clark

UPS is a large company. Every year it handles 3.28 billion parcels using 1700 facilities, 575 aircraft, 149,000 vehicles, and 344,000 employees. It is 93 years old.

UPS uses Computer Associates’ Platinum Repository, and rather than acting as a gatekeeper for new development, the repository group acts more as a watchdog. They use Platinum’s scanners to scan all production databases and programs throughout the enterprise. They then compare what they find to the meta data in the repository. Entries that aren’t recognized or don’t meet standards are flagged for review and brought into compliance. What passes is parsed and loaded.
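
The watchdog comparison described above can be sketched roughly as follows. This is only an illustration of the idea, not the Platinum Repository's actual interface: the `repository` dictionary, the `scan_results` list, the `review_scan` function and the sample element names are all hypothetical.

```python
# Hypothetical sketch of the scan-and-compare step.
# None of these names come from the Platinum product.

def review_scan(scan_results, repository):
    """Split scanned element names into loadable and flagged lists."""
    to_load, flagged = [], []
    for name in scan_results:
        entry = repository.get(name)
        if entry is None:
            # The scanner found something the repository does not know about.
            flagged.append((name, "not in repository"))
        elif entry["standing"] != "approved":
            # Known, but not yet up to standard.
            flagged.append((name, "standing is " + entry["standing"]))
        else:
            to_load.append(name)
    return to_load, flagged


repository = {
    "PKG_WGT_NR": {"standing": "approved"},
    "SHPR_ACCT_CD": {"standing": "skeleton"},
}
scan_results = ["PKG_WGT_NR", "SHPR_ACCT_CD", "TMP_FLD_7"]

to_load, flagged = review_scan(scan_results, repository)
```

In this sketch only `PKG_WGT_NR` would be loaded; the other two names would go to a reviewer along with the reason they were flagged.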

Developers use the repository and are required to involve data administration from the outset of a project. But because Patti’s group is constantly scanning the end result, they know what is real.

They track over 5000 key words, 30,000 data elements, database structures, and copybooks. The repository is updated twice a month. This data is then distributed through an intranet. The site gets 24,000 hits a day from every level of user.

One of the key processes is what they call rationalization. All representations of data are documented and linked back to the master name and definition. The data description is stored only once. This enables UPS to do impact analyses quickly. Anyone can find out what data is being used, where it is being used, and whether or not it’s official. The benefit of this cannot be overestimated.
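
A minimal sketch of why rationalization makes impact analysis quick, assuming each physical representation is stored as a link to one master name. The system names, physical names and helper functions below are invented for illustration and are not taken from the UPS repository.

```python
from collections import defaultdict

# Hypothetical data: the description is stored once, under the master name.
master_definitions = {
    "Package Weight Number": "Weight of a package as recorded at pickup.",
}

# Each representation is (system, physical name, master name).
representations = [
    ("billing_db", "PKG_WGT", "Package Weight Number"),
    ("tracking_db", "PACKAGE_WEIGHT", "Package Weight Number"),
    ("rate_copybook", "PKG-WGT-NR", "Package Weight Number"),
]


def build_usage_index(representations):
    """Map each master name to every place it is physically represented."""
    index = defaultdict(list)
    for system, physical_name, master_name in representations:
        index[master_name].append((system, physical_name))
    return index


def impact_of(master_name, index):
    """List the systems touched if the master element changes."""
    return sorted(system for system, _ in index.get(master_name, []))


usage = build_usage_index(representations)
affected = impact_of("Package Weight Number", usage)
```

Because every representation points back to a single master entry, answering "where is this data used?" becomes one lookup rather than a search across systems.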

Meta data types include abbreviation name, full English name, physical name, standing (approved, non approved, skeleton), source (e.g. vendor name), descriptions, warehouse description and history. Every data element ends in a “class word” (e.g., number, text, code, etc.) as part of its formal name.
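
The class word convention lends itself to a simple automated check. The sketch below assumes formal names are space-separated English words; the `CLASS_WORDS` set contains only the three examples given in the talk, and the sample element names are made up.

```python
# Class words taken from the examples in the text; a real list would be longer.
CLASS_WORDS = {"number", "text", "code"}


def class_word_of(element_name):
    """Return the trailing class word of a formal name, or None if absent."""
    words = element_name.strip().lower().split()
    if words and words[-1] in CLASS_WORDS:
        return words[-1]
    return None


names = ["Package Weight Number", "Shipper Account Code", "Delivery Notes"]

# Names that do not end in a recognized class word.
missing = [n for n in names if class_word_of(n) is None]
```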

The success of this effort has reduced data disparity and allowed them to decommission the other dictionaries at hubs and distribution centers. The repository is used for training new employees who are able to learn the corporate vocabulary quickly.

In the future Patti’s group plans to complete the data element quality application, provide support for XML, DTDs, and Schemas, automate scanning and loading of SQL Server data, and add business rules.

Patti presented some of the repository’s screens, which were straightforward, understandable and powerful.

 

Conference Session

Speaker

Universal Data Models for Web Constructs

 

Len Silverston, Founder,

Universal Data Models

 

Summary by David Plotkin

The motto of the presentation was: "The more you see the whole, the closer you move towards the truth".

Len presented a series of generalized (or "universal") models for the following subjects: Web Parties, Web Party Contact Mechanism, Web Login, Web Site Content, Web Object Usage, Web Visits and Hits, and Web Star Schema (data warehouse). The common characteristic of these models is that they did not contain any aspects of the business at the entity level. Instead, they used very generic terms such as "Party" (a person, organization, or automated agent that participates in a process or transaction), "Party Type" (a generalized way of classifying parties) and "Party Role" (customer, referrer, supplier, etc.). Although Len did not model the relationships themselves in the limited time available, he did state that the roles could not exist without a relationship. For example, the role "customer" could not exist without a relationship between parties.
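
One possible reading of the statement that roles cannot exist without a relationship is to derive a party's roles from the relationships it takes part in, rather than storing roles on the party itself. The sketch below is an interpretation only; Len's actual models were not reproduced in this summary, and the class names, fields and sample data are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    name: str
    party_type: str  # e.g. "person", "organization", "automated agent"


@dataclass(frozen=True)
class PartyRelationship:
    from_party: Party
    from_role: str
    to_party: Party
    to_role: str


def roles_of(party, relationships):
    """Derive a party's roles from the relationships it takes part in."""
    roles = set()
    for rel in relationships:
        if rel.from_party == party:
            roles.add(rel.from_role)
        if rel.to_party == party:
            roles.add(rel.to_role)
    return roles


alice = Party("Alice", "person")
shop = Party("Example Web Shop", "organization")
relationships = [PartyRelationship(alice, "customer", shop, "supplier")]
```

With this structure a party that appears in no relationship has no roles at all, which matches the "customer" example in the paragraph above.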


Tuesday, March 6th, 2001

 

KEYNOTE

Speaker

The ER Model, XML and the Web

(and DAMA Individual Achievement Award)

Peter Chen, Professor,

Louisiana State University

 

Summary by Anne Marie Smith

 

Rose Romero, DAMA International VP of Communication, presented the 2001 DAMA International Individual Achievement Award to Dr. Peter Aiken and Dr. E.F. Codd. This is the first time that two individuals were the recipients of the Individual Achievement Award. Drs. Aiken and Codd received this award for their significant contributions in the field of Information Resource Management. As educators, consultants and authors, they have assisted numerous companies in developing and maintaining data resource management environments, thereby expanding and enhancing the roles of information management professionals. It should be noted that Dr. Aiken is a member of the DAMA International Board of Advisors.

 

Other nominees for the 2001 Individual Achievement Award were:

Larry P. English, David Marco, Dr. James Martin, Dr. Richard Nolan

 

After the award ceremony, Dr. Peter Chen, the originator of the ER model, delivered a keynote address on the relationships among the ER model, XML and the World Wide Web. Dr. Chen was the recipient of the 2000 DAMA International Individual Achievement Award. He gave the attendees an understanding of XML and ER modeling, as well as several good, new buzzwords.

His entertaining and very informative presentation focused on:

Dr. Chen provided numerous references and links for further study in XML and ER modeling. These references are found on the CD of the conference’s presentations.

Dr. Chen concluded with his insights on other interesting research directions in XML and web modeling. He stressed the need for methodology for modeling in all arenas, and urged the attendees to actively participate in the expansion and development of understanding of XML and ER modeling.

 

 

Conference Session

Speaker

Business Information Management at Johnson and Johnson: Beginning the Process

 

Larry Dziedzic, Information Management Architect,

Johnson & Johnson

 

Summary by Margaret O’Hara

Larry Dziedzic began his presentation by offering a brief history of Johnson and Johnson and his personal background in the Information Management discipline. With 198 diverse companies scattered throughout 52 countries, coming to agreement on any enterprise-wide standards is a daunting task. The companies are grouped together into three primary divisions: consumer products (shampoo, Band-Aids, Tylenol), medical devices and diagnostics (hips, shoulders, glucose monitors) and pharmaceuticals.

He then presented the initial plan for establishing the business Information Management (BIM) program at Johnson and Johnson. Using some basic and easy-to-understand examples, he explained the particular problems J&J experiences. For example, when a new fragrance is added to a shampoo, does it become a new product or a variation on the existing product? Because of the nature of the J&J culture (with all companies retaining some degree of autonomy), questions such as this have myriad answers.

Other surprising issues he encountered included only 70% of information being correct, and management being satisfied with that statistic. Moreover, the Information Management Architecture group did not typically talk to the customers, relying instead on pre-existing information, which was sometimes inaccurate. Thus, the lack of attention paid by IM to the business side, and the resulting lack of appropriate information, were fundamental problems.

Dziedzic went on to illustrate some classic examples of “bad” information making the news to the detriment of the organization to which the information applied. Among the specific challenges that J&J faces are: the level of autonomy of the 198 diverse companies, the varying level of resources for these firms, and the lack of a standard ERP package among the three primary groups (one has selected JD Edwards and two have selected SAP).

To alleviate the situation, global competency centers (GCCs) are being formed to act as liaisons to the business community. Thus far, GCCs have been established for two of the groups, with the third one coming later this year. These GCCs will work with the global partners to establish unified applications and implement global strategies. Consultants (internal and external) will be used to help develop the BIM strategies, best practices and tools.

One major problem J&J faces is that the SAP and JD Edwards packages will eventually have to interface. More importantly, the task of implementing the GCCs is very much a people problem – with listening, educating and communicating being top priorities.

 

Conference Session

Speaker

Measuring the Quality of Models

 

Peter A. McDougall, Senior Data Administrator, Insurance Corporation of British Columbia

 

Summary by Linda Kresl

 

This presentation focused on an approach for measuring model quality that Peter developed over five years ago. The criteria for evaluating a model are based upon the aspects of communication. Furthermore, since a data model is a composite object, the presentation described how a model’s quality is actually derived from the collective quality of its components. Thus any quality measures shouldn’t be applied to the model as a whole, but instead to its smaller, atomic-level pieces. As such, five communications based yardsticks – Accuracy, Clarity, Consistency, Conciseness and Completeness were discussed.

 

Peter also focused on the model review process. He described two techniques, Direct Feedback and Business-Based Questioning, and how the quality measures are used with them. These techniques focus on understanding the business unit’s relationship to the message from the model. They take a nonjudgmental perspective and are designed to develop a collaborative framework for working towards a quality product. Lastly, the presentation described how communications-based criteria ultimately produce better models.

The following topics were discussed by Peter:

·        Why communications-based measures are useful to evaluate the quality of a model

·        The five criteria used to measure quality

·        A set of techniques for applying the measures

·        Why the approach creates models that have quality built-in, instead of “inspected in”

 

Conference Session

Speaker

Organizational and Development Strategies for Creating a High-ROI Enterprise Data Warehouse

 

Brent Lautenschlegar, Principal,

Reflection Technology Corporation

 

 

 

Summary by Anne Marie Smith

Brent has much experience in enterprise applications and data warehousing. He used these experiences to describe the implementation of an enterprise data warehouse at Delta Air Lines.

Brent gave an overview of the history of the data warehouse at Delta, which had a focus of incremental growth. Business users at Delta were not well served by Information Technology at Delta, and this lack formed the rationale for developing and implementing an enterprise data warehouse. As a result, Brent’s presentation was more business-oriented than technical, although he did discuss some very technical topics in answering questions. The teams of users and IT specialists included subject areas of HR, Operations, Finance and Marketing/Sales. Eventually, this data warehouse was able to “establish a single version of the truth”. Having a conceptual data model for the enterprise was essential to the success of planning this massive project, despite the fact that many subject areas did not have transactional level data models to use as a basis for the data warehouse. Capturing requirements and feedback from the user community was a hallmark of the quality effort within Delta and the data warehouse project.

Brent outlined the technologies used in this project: Teradata for the DW database; Brio for querying and reporting, SAS for statistical analysis; Informatica for extraction, transformation and loading (ETL) and Essbase for multi-dimensional database management.

Each module of the data warehouse was developed within a 60-day period, to counter the perception of a data warehouse as a monolithic project. Incremental development has many benefits to both IS and users, and gives ownership and control to the development and implementation teams, as well as demonstrating the progress of data warehousing to the organization’s management. One disadvantage to this rapid, incremental development effort was the need to alter the habits and expectations of database administrators and data administrators / modelers. These team members were not accustomed to working in this rapid environment, and some culture change was necessary. Brent explained the steps the teams used to meet this development deadline, and described some of the challenges the teams encountered in some subject areas.

Questions to Brent were both business-oriented (cost-benefits, information use approach, skill development) and technical (reasons for choosing certain technology, interfaces and their construction). Questions lasted into the break period.

Conference Session

Speaker

The Good, Bad & Ugly: Is Meta data the Way to Knowledge Management?

 

Gil Laware, Assistant Professor,

Purdue University &

Frank Kowalkowski, President

Knowledge Consultants, Inc.

Summary by Ron Klein

The purpose of the Library of Alexandria was to gather material from conquered countries in order to subjugate them. A heck of a business value!

Start with Robert Anthony’s Framework for looking at enterprises (see page 11 of speaker’s paper on CD-ROM). Consider that knowledge can be viewed in a similar manner (see page 12). Now propose an architected view of knowledge – a Library model is not a good model for the business.

Gil and Frank stressed the following key presentation points:

- The meta views and knowledge content are important to an enterprise

- Meta views are needed to successfully implement critical applications in a business such as:

– Enterprise application integration

– Business performance measurement

– Customer relationship management

– Enterprise resource planning

- Knowledge fills or is connected to many meta structures

- A meta-data strategy is needed to get best business value

- Businesses without meta views will gradually fall behind with failed implementations or only partial realization of benefits

Integration will get money because it saves money!

 

 

Conference Session

Speaker

OMG Technical Update

 

Andrew Watson, Technical Director, Object Management Group

 

Summary by Carey Clark

Andrew described the Object Management Group. It is a not-for-profit body with over 800 members, where decisions are proposed and accepted by the members. OMG is not an official standards body like ISO and no one is obligated to conform; however, most do. Anyone can access and download their specifications. There are no fees or passwords.

 Some of the main specifications to come out of OMG include:

OMG has numerous task forces and special interest groups covering all manner of subjects and industries.

UML

OO modeling, like ER modeling, has a wide variety of notations. By 1994 it was a real mess: similar concepts, incompatible notations, few support tools. Methodologists are often very stubborn, and getting agreement is extremely difficult. In ’95 Jacobson and Soley began to push for modeling standards. By 1997 UML was accepted by all parties. The current version is 1.4.

UML is designed for visualizing and documenting software. It was not designed for database modeling. UML is not a method but a convention for representing software constructs. Because of this standard, lots of tools have been built and over 60 books written. It is now used in over 70% of IT shops. Until it was adopted no one was willing to invest the capital to develop tools.

Version 2.0 of the specification is under development and if you want to influence it, now is the time to speak up. Thirty seven companies are already on board.

MOF

The Meta Object Facility (MOF) is a meta data architecture (i.e. for repositories). It works in cooperation with UML. It leans heavily on XMI, a meta data exchange specification. XMI enables meta data to be passed between modeling tools. This in turn enables DTDs, and later XML Schemas, to go in and out of modeling tools seamlessly.

CWM

The volume of data in an organization doubles every 5 years. Much of it is redundant and inconsistent. CWM provides a standard way of handling data warehouse problems. It supports ETL, OLAP, XMI, and UML. In addition specifications are being developed by, and for, specific industries.

CORBA

CORBA is a middleware specification. It is a set of APIs that allow data to be moved from legacy systems to new ones and back. There is still a lot of COBOL code that needs to integrate with VB, Java, DBMSs, the Web, etc. CORBA facilitates this integration while staying vendor independent.

CORBA has been extended to include XML and DOM (Document Object Model). It enables XML structures to be compacted into a binary format for easy transport.

Domain Specific Standards

PIDS, or Person Identification Service, provides a way for health care providers to identify individuals. There is no reliable unique identifier for people, and misidentification can mean wrong treatment. Hence algorithms determine the probability of a match.
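
To illustrate what "determining the probability of a match" can look like in its simplest form, here is a toy weighted-agreement score. It is not taken from the PIDS specification: the field names, the weights and the `match_score` function are all assumptions, and real matching algorithms are considerably more sophisticated.

```python
# Field weights are arbitrary illustration values, not taken from PIDS.
WEIGHTS = {"family_name": 0.4, "birth_date": 0.4, "postal_code": 0.2}


def match_score(a, b):
    """Sum the weights of the fields on which two records agree (0.0 to 1.0)."""
    score = 0.0
    for field, weight in WEIGHTS.items():
        # A missing value never counts as agreement.
        if a.get(field) and a.get(field) == b.get(field):
            score += weight
    return score


rec1 = {"family_name": "Example", "birth_date": "1960-01-02", "postal_code": "92802"}
rec2 = {"family_name": "Example", "birth_date": "1960-01-02", "postal_code": "90210"}
rec3 = {"family_name": "Sample", "birth_date": "1975-05-06"}

likely = match_score(rec1, rec2)
unlikely = match_score(rec1, rec3)
```

A caller would then compare the score against a threshold to decide whether two records refer to the same person or need manual review.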

Resource Access Decision (RAD) specifies how to secure access to healthcare data. It helps to implement and enforce access policies and procedures.

Andrew showed diagrams of how all the specifications relate to each other. The future is to make application development as model driven as possible. The goal is to have all code generated by a modeling tool and not modifiable directly.

Conference Session

Speaker

Embracing XML Strategic Implications for Data Administrators/Architects

 

Peter Aiken, Institute for Data Research

Virginia Commonwealth University

 

Summary by Arnie Hook

Dr. Aiken looks at the organization's legacy assets to locate opportunities to integrate data with the management of meta data. The focus is on the evolution of systems. He advises against trying to develop all components at once, because of the time and expense involved.

The presentation identifies XML Benefit and XML application Integration ratings for various business and technology classes. XML is ‘meta data wrapped around data’ and associated with business problems and planning.
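
The phrase "meta data wrapped around data" can be made concrete with a very small example. The element and attribute names below are invented for illustration and do not come from Dr. Aiken's slides; the sketch uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Element and attribute names here are invented for illustration.
record = ET.Element("customer", attrib={"id": "1001"})
ET.SubElement(record, "name").text = "Pat Example"
ET.SubElement(record, "balance", attrib={"currency": "USD"}).text = "250.00"

# Serialize: every value travels with a tag that says what it is.
xml_text = ET.tostring(record, encoding="unicode")

# A receiver can recover both the data and its labels without a fixed layout.
parsed = ET.fromstring(xml_text)
fields = {child.tag: child.text for child in parsed}
```

The tags and attributes (`name`, `balance`, `currency`) are the meta data; the text between them is the data.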

XML equips organizations with the tools and technology to develop programmatic solutions for managing data interchange environments using economies of scale. Peter explained the metrics and time problems for engineering the legacy. The 7-hour-per-attribute definition metric used in creating project plans is a myth.

Aiken uses real life examples for the audience to understand the implications of XML, data architecture/engineering, and data management practices to approach and define data solutions. The scenario of systems operations using XML manages business rules and data interchanges.

Using XML expands the definition, roles, and preparation required of data management for e-business development. Attendees of this session benefit from early XML adopters and the role XML will play in future data management.

 

Conference Session

Speaker

Enterprise Data Management Without the Enterprise Data Model: Working in the Real World

 

Sheri Dumire-Hamilton, Senior Systems/Business Analyst, Kodak

 

Summary by Margaret O’Hara

The goals of Sheri’s presentation were to demonstrate how Enterprise Data (ED) Management would benefit the firm, to present some different approaches to resolving issues and to identify some sources and issues of technology change. The goals of ED Management are to increase data sharing across the organization, to increase reuse of data and maintain control, to enable evolution of new technology, and to integrate new needs and maintain stability of DBs over time – in effect, as data evolves, the DM must keep up.

An ED Model does several things. It supports the use of data as a corporate asset; it provides a vehicle for communicating and agreeing about data meaning and usage; and it supports the sharing and reuse of data across functional areas. Still, ED Models are often not constructed. There are many reasons for this, among them: construction requires support and direction from senior management, it absorbs resources and may not provide immediate measurable value, and it is often perceived as a corporate mandate with little value to specific functional areas.

So, where can you start to develop ED Management? First, select a problem that data management will address with high probability of success. Symptoms of DM problems include: lots of interfaces being written, customer complaints about supplying information repeatedly and errors due to bad data, data unavailable for decision making, problems in enterprise data management, and difficulty in meeting changing business needs.

To handle the problems, first define the problem domain, then plan the approach to solve it. It is important to publish the approach and review it with affected areas. One thing that can “bite you” is that there is a common ground, but everyone is fighting for a piece of it. To alleviate this, find a champion and form a steering committee. Power struggles occur because data is not seen as a corporate asset. By educating the concerned parties about the nature of data management, data comes to be viewed more as a corporate asset. Finally, it is important to network, communicate and educate the involved parties. Build relationships with individuals to increase their comfort level, their trust and your own credibility.

 

Conference Session

Speaker

How do you Convince Management to fund your Proposal?

 

David Davis, Vice-President, Enterprise Data Management Group, Bank One

 

Summary by Linda Kresl

This presentation focused on the political maneuvering required to persuade and convince management to fund projects. David explained that people with technical backgrounds often stress the technical aspects of a proposal to their detriment. The context of the proposal, its timing and how it is presented often affect the acceptance or disapproval of a good proposal. Various anecdotes, analogies, marketing and forming alliances can lead to successful, approved proposals and projects. The best implementation, technique, new technology and method do not guarantee acceptance and funding.

This presentation further explained the following steps to ensure success:

·        Learn that the work involved in “selling” a proposal may be as difficult and necessary as the project

·        A technique of creating analogies

·        Share successes and failures

·        Learn the importance of ‘sound bites’, charts and diagrams to sell proposals

 

Conference Session

Speaker

Data Warehouse Project Planning

 

Sid Adelman, Founder, Sid Adelman & Associates

 

Summary by Anne Marie Smith

Sid Adelman, consultant and co-author of the book “Data Warehouse Project Management” presented a roadmap for developing a successful data warehouse project plan.

Sid outlined the history of data warehouse project planning, why project planning is critical to the success of any development effort, what constitutes a proper data warehouse project plan and how to relate the project plan to the technical infrastructure.

To date, many organizations have taken the approach of not planning a data warehouse project for many reasons. Almost without exception, these non-planned projects have failed, and according to Sid’s research, this failure can be traced to the lack of a concrete project plan. This presentation showed the similarities between traditional systems development and data warehouse development and the few differences.

Major points in Sid’s presentation included:

·        Project Selection: choose sponsors and users who really want the project to succeed, a project with importance to the organization, a project that WILL succeed (not necessarily a high profile or controversial project), and a project with measurable success factors, a project with reasonable size (database and interfaces) and reasonable time expectations, project control

·        Function: source data (from where are you getting the data, is it reliable and clean?); determine needed summaries, aggregation and integration methods; develop appropriate canned queries, issues in the meta data repository for a DW (user-oriented). User and technical functionality are different, and the differences must be understood and evaluated.

·        User Expectations: performance (sub-second response time is unrealistic from a DW), simplicity (ease of use of the user tools, easy to understand navigation), accuracy (clean data, correct data – these are different), availability (do you really need 24x7, 365? This is very expensive and usually not a true requirement), timeliness (data refresh expectations must be established), difference between summary and detail data access needs. Traditionally, success is not well-defined, and can be achieved through communication of expectations

·        Scheduling: taking a phased approach (by subject area and user role delineation) is the foundation of a successful data warehouse, task estimation (a difficult task and experience contributes to amount of time needed to complete a task), actual hours worked versus elapsed time (which measurement will you use? – use both), essential to build contingency factors into a plan since interruptions will always occur, schedule responsibly since too-tight schedules force people to do re-work. Delivering low-quality results quickly is NOT a method for success! Sid felt that a 60-day phase was a bit too short, and recommended a 3-month phase.

·        User Responsibilities: co-project management (IT and user managers), users must define requirements (NOT the IT staff), security requirements (essential in web access to data), determining roles in query and reporting tool selection (not necessary to involve users in infrastructure tool selections), user involvement in training material development and implementation

·        Tools and Service Agreements: performance and response time requirements are not appropriate for a DW, but availability and problem response time requirements are appropriate for a DW, DW implications on the work of the Help Desk or other support mechanisms

·        DW Project Planning: the usual steps of application development project planning apply, each task should not exceed a 40-hour period, each task should have a primary responsible party (even if there is more than one person on the task), each task should have a defined deliverable, each deliverable should be evaluated for completeness and contribute to a defined milestone, progress monitoring and change control management are also important and frequently forgotten

·        Resources: people versus roles (some people can fill multiple roles, but should they?), development and maintenance of a capabilities and skills assessment for all team members, direct reporting relationship (100% focus on the DW project), importance of management commitment and active support across and through the organization
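
Several of the planning rules in the list above (the 40-hour task limit, a primary responsible party, a defined deliverable, and built-in contingency) are mechanical enough to check automatically. The sketch below is one way to do so; the `Task` class, the helper functions, the sample tasks and the 20% contingency figure are illustrative assumptions rather than anything Sid prescribed.

```python
from dataclasses import dataclass

MAX_TASK_HOURS = 40  # limit stated in the talk


@dataclass
class Task:
    name: str
    estimated_hours: float
    owner: str        # primary responsible party
    deliverable: str


def plan_problems(tasks):
    """Return (task name, problem) pairs for tasks that break the rules."""
    problems = []
    for task in tasks:
        if task.estimated_hours > MAX_TASK_HOURS:
            problems.append((task.name, "exceeds 40 hours"))
        if not task.owner:
            problems.append((task.name, "no primary responsible party"))
        if not task.deliverable:
            problems.append((task.name, "no defined deliverable"))
    return problems


def hours_with_contingency(tasks, contingency=0.2):
    """Total estimate padded by a contingency factor (0.2 is an arbitrary example)."""
    return sum(t.estimated_hours for t in tasks) * (1 + contingency)


tasks = [
    Task("Profile source data", 32, "Lee", "Data quality report"),
    Task("Build load jobs", 60, "", "Tested ETL jobs"),
]
problems = plan_problems(tasks)
```

Here the second task would be reported twice: once for exceeding 40 hours and once for lacking a primary responsible party.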

Sid concluded with offers of some reference material (web links, task examples, suggested vendors) to interested attendees.

There were numerous questions, and they included the issue of data cleansing at the source (do you go back and clean up data that is clean in the DW and not clean in the source?), the best format of a project plan for a DW (iterative or spiral), cost-benefit analysis of a DW (see an accountant!), choices in various tool categories, and specific roles to be included in any data warehouse project. These questions showed the level of interest in data warehousing and its “resurgence”. It also demonstrated the need for more presentations on data warehousing and project management.

 

Conference Session

Speaker

Meta data Directory vs. Meta Repository

 

James Jones, Product Manager,

Oracle Corporation

 

Summary by Ron Klein

James started by citing Oracle’s own experience in streamlining its business using its own solutions, i.e. “eating our own dog food”.

Lightweight Directory Access Protocol (LDAP) is the Exploding Standard. It is a light, browser friendly client implementation.

What are Directory Services?

-         “A flexible, special-purpose distributed database designed to the storage and retrieval of entry-oriented information for a wide range of applications.”  

-         DS are a type of universe of meta data
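
To show what "entry-oriented" and hierarchical mean in practice, here is a toy in-memory directory keyed by distinguished names. It is not an LDAP client or server and does not use any LDAP library; the entries, the `directory` dictionary and the `search_subtree` function are hypothetical.

```python
# A toy in-memory directory; this is not an LDAP client or server.
# Keys are distinguished names (DNs), values are the entry's attributes.
directory = {
    "dc=example,dc=com": {"objectClass": "domain"},
    "ou=People,dc=example,dc=com": {"objectClass": "organizationalUnit"},
    "cn=Pat Example,ou=People,dc=example,dc=com": {
        "objectClass": "person",
        "mail": "pat@example.com",
    },
    "cn=printer1,ou=Devices,dc=example,dc=com": {"objectClass": "device"},
}


def search_subtree(base_dn, object_class=None):
    """Return DNs at or under base_dn, optionally filtered by objectClass."""
    matches = []
    for dn, attrs in directory.items():
        in_subtree = dn == base_dn or dn.endswith("," + base_dn)
        if in_subtree and (object_class is None or attrs["objectClass"] == object_class):
            matches.append(dn)
    return sorted(matches)


people = search_subtree("ou=People,dc=example,dc=com", "person")
```

The hierarchy lives entirely in the names: an entry belongs to a subtree when its DN ends with the base DN.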

The Meta Directory Paradigm:

-         Touches everything and is everywhere

-         A single directory that connects everything

Stretching the idea of meta data persistency and sharing:

Nodes + Hubes = Nubes <- ETL

 

Meta Directory:

-         Metadata (Hierarchical)

-         Security

-         Party

-         Network

-         Device

-         Security Integration

-         Device Integration

-         Giant Installed Base

Meta Data Repository:

-         Metadata (Any)

-         Managing files and folders

-         Dependency management

-         Versioning

-         Configuration management

-         Tool Integration

-         Small Installed Base

Q: Is the Meta Directory usually a source for the repository? 
A: A place where it can store this information, but it is not strong enough to hold the complexity.

Q: Should we be hanging off these directories underneath a repository?
A: Underneath a portal, yes.

 

Conference Session

Speaker

Ramping up for Meta Data and Knowledge Management

 

Don Soulsby, Director of Architecture Strategies, Computer Associates

 

Summary by Carey Clark

In the beginning was Electronic Data Processing (EDP). The focus was handling files and getting data in. In the 80’s the focus was on getting the data out (DSS, EIS, Queries). This age will be known as the knowledge management era. Tabular data needs to merge with documents, graphics, and video. The buzzwords are integration and access, and like before, tools follow the need.

Knowledge Management

Knowledge is information (data) at work. Eighty to ninety percent of corporate information is not tabular in nature. The issue is how to store and retrieve it efficiently and combine it with pertinent tabular data. The difficulty is compounded by the tribal nature of various disciplines: Data Processing, Library Science, multimedia, desktop applications, Web technologies etc.

Legacy systems tend to resemble spaghetti. When using third party packages one must use not only other people’s products, but other people’s models. How, then, do you find what you are looking for? In 15 years the baby boomers begin to retire and their knowledge goes with them. It behooves organizations to capture as much as possible before they go.

It’s a massive problem, not unlike building the Empire State Building or the Queen Mary. As in those cases, a key factor was having the right tools (e.g. the rivet gun).

The Solution: An Enterprise Information Portal

Create a single place where all information can be accessed and displayed. Integrate the various forms of information. Where possible, provide dynamic personalization. Make it easy to find, easy to understand, easy to navigate, and believable. Provide information in context as a way to recognize what you have.

Most knowledge architectures are hierarchical. This is efficient for getting somewhere fast but not for finding stuff in the first place. A better model is the Knowledge Mall. You can find stuff alphabetically, by category, by context, and by wandering around. Still there’s a need for a map.

Don’s technique was to classify information using the Zachman Framework. Going vertically you have rows for Business, Operational, and Technical. Going horizontal are columns for Who, What, and Where. This could be expanded to match Zachman’s 6x6 matrix.
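
The classification grid described above can be represented as a small lookup structure. The row and column labels come from the talk; the `empty_grid` and `classify` helpers and the sample items are hypothetical.

```python
ROWS = ("Business", "Operational", "Technical")
COLUMNS = ("Who", "What", "Where")


def empty_grid():
    """Create one empty cell for every row/column combination."""
    return {(row, col): [] for row in ROWS for col in COLUMNS}


def classify(grid, item, row, col):
    """File an information item under a cell of the grid."""
    if (row, col) not in grid:
        raise ValueError(f"unknown cell: {row}/{col}")
    grid[(row, col)].append(item)


grid = empty_grid()
classify(grid, "Org chart", "Business", "Who")
classify(grid, "Customer table DDL", "Technical", "What")
classify(grid, "Warehouse locations list", "Operational", "Where")
```

Expanding toward the full Zachman matrix would only require adding labels to `ROWS` and `COLUMNS`.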

Personalization involves knowing specifics about the user. Might be buying patterns, sales patterns, demographics. Based on these the user sees different screens, menus, options etc. Software behind the scenes is able to learn, predict, adapt, and optimize. Patterns are recognized and used extensively to present information or suggest new resources.

Observations/Predictions

Metadata repositories are likely to adopt parallels to retail’s UPC codes. Data will have truly unique identifiers.

Meta data must be collected in response to a business event. If people have to enter it manually, it most likely will not be maintained. The imperative is to decrease the number of duplicate instances. Store once, distribute many.

He expects knowledge navigation and supporting software to resemble the neural net: It recognizes patterns, learns from experience, adapts dynamically, and predicts outcomes.

 

Conference Session: Building the Scalable Data E-Frastructure

Speaker: Tim McBreen, Senior Principal and E-business Practice Leader, Knightsbridge Solutions

 

Summary by Arnie Hook

 

The theme of the talk is making sure we are ‘building the enterprise infrastructure’. Tim describes a high-performance data solution that is robust, scalable, and cost effective, and explains why performance matters in relation to data volumes, quality of use, and the influx of data.

Mr. McBreen says that scalability rules the day; build it once; build it right, scale often. The e-frastructure data engine includes:

            Data acquisition processes

            Data repository

            Data mart creation processes

Tim describes a typical solution encompassing data extraction, transformation, aggregation, balancing/controls, and loading. A ‘tool of the month club’ approach will not work for managing the e-frastructure environment: changing tools creates chaos for impact analysis and for applying new requirements.
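The stages named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Knightsbridge methodology: the function names, the sample rows, and the particular control check (source total must equal aggregated total) are all hypothetical.

```python
from collections import defaultdict


def extract(rows):
    # Stand-in for reading from a source system.
    return list(rows)


def transform(rows):
    # Normalize region codes; amounts are kept as integer cents.
    return [{"region": r["region"].strip().upper(), "cents": r["cents"]} for r in rows]


def aggregate(rows):
    totals = defaultdict(int)
    for r in rows:
        totals[r["region"]] += r["cents"]
    return dict(totals)


def balance(source_rows, totals):
    # Control check: the aggregated total must equal the source total.
    return sum(r["cents"] for r in source_rows) == sum(totals.values())


def load(totals, target):
    target.update(totals)


# Hypothetical source data.
source = [
    {"region": " west", "cents": 1200},
    {"region": "West ", "cents": 300},
    {"region": "east", "cents": 500},
]
staged = extract(source)
totals = aggregate(transform(staged))
mart = {}
if balance(staged, totals):
    load(totals, mart)
```

Loading only after the balance check passes is one simple way to keep a failed run from reaching the data mart.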

 The methodology is a three-path approach:

            Business path – end user focus

            Data path – design, development focus

            Infrastructure path – design, configuration, implementation focus

Each path in the methodology describes the required components and activities for a data warehouse/mart project. The architecture philosophy is a cohesive and integrated environment that enables applications to draw on the best capabilities and performance of the underlying technologies. Tim walks through each of the components in describing the e-frastructure environment.

Mr. McBreen stresses the importance of data management solutions that allow companies to enable the ‘power enterprise’. It is a compelling message for the data practitioner who needs an approach to delivering a data warehouse application.

 

 

Conference Session: Data Administration on A Shoestring

Speaker: Becky Kirkpatrick, Data Architect, Union Pacific Technologies

 

Summary by Margaret O’Hara

Becky began by describing how Union Pacific IM has adjusted to staff downsizing, mergers, and lack of funding to provide an online metadata repository that was put together quickly, is very functional, and continues to grow in use and capability.

The result of the work by Becky and 1.5 full-time employees is a web-enabled site that uses the Zachman Framework as the architecture for its development.

The problem, as Kirkpatrick detailed, is that end users and IT project managers want to know immediately where they can get state and country data, customer number information and values from a railroad equipment master.

None of these important questions could be answered easily by any means then available. Kirkpatrick’s management asked that her group of 2.5 people put something together within a 3 to 4 month period.

The team ‘piggybacked’ on existing files (manual and automated), used existing technologies (e.g. Access, Excel), coupled those with web development, and produced a product that was successfully implemented and accepted.

Kirkpatrick then walked the audience through a demonstration of the online site that they developed.

 

 

Conference Session: Mapping UML to the Zachman Framework

Speaker: Neal Fishman, Enterprise Architect, Equifax

 

Summary by Linda Kresl

This presentation focused on why it is important to map the UML to the Zachman framework. The number one reason is to model systems, from concept to executable artifact, using object-oriented techniques.

• To address the issue of scale inherent in complex, mission-critical systems.

• To create a modeling language usable by both humans and machines.

• Use the UML for...

– Visualizing

– Specifying

– Constructing

– Documenting

Neal explained that the UML consists of nine models and the Object Constraint Language (OCL). The Zachman Framework for Enterprise Architecture identifies at least thirty models. The presentation reviewed each UML model type (use case, class, object, component, deployment, activity, statechart, collaboration, sequence, and OCL) and which of the Zachman cells each maps to. It then explored the use of stereotypes to augment the native UML models, creating more model types, to demonstrate how to complete the mapping to the framework.

·        Identifying the UML models

·        The Zachman Cells

·        Using stereotypes

·        Mapping the models  

 

Conference Session: Managing Customer Information for CRM

Speaker: Danette McGilvray, Customer Information Quality Program Manager, Agilent Technologies

 

Summary by Anne Marie Smith

Danette asked and answered the questions “Can you claim to know your customer if the information in your systems about that customer is wrong?” and “How can you manage the relationship with your customer if the basic processes for acquiring, maintaining and using that information are not working?”

Danette’s presentation focused on these points:  

Danette presented a case study in CRM, using Agilent’s customers as the basis of a CRM initiative. The case examined a customer information system (one of many at this client) to determine the level of effectiveness for CRM. The system was developed with the framework mentioned above, and was used as a method to re-engineer the customer approach at this client. She concluded with some examples of uses of information in a CRM pilot system and a list of challenges to CRM.

Questions, taken throughout the presentation, concerned the framework’s development, uses of information in CRM, and reasons for CRM failure. Danette’s presentation showed the role data plays in a CRM effort and the need for quality data in CRM.

 

Conference Session: Just in Time Meta data

Speaker: Bob Carasik, Systems Architect, Wells Fargo Bank

 

Summary by Ron Klein

Bob has worked with Data Dictionary for two decades and is still doing the same. He has worked with CASE, repository, XML and messaging standards. He currently coordinates the meta data initiative for the enterprise portal. Wells Fargo is a leader in Internet banking, eBay and account aggregation for customers. You will hear more and more that all your financial services can be bundled in one site. Clients do one log-in and have access to many financial services.

Doing the Portal = Reality hits people in the face. We need to know about meta data. It is quick to explain why it is important, but getting it into the project plan is another story.

Mapping is a hot spot to help find inconsistencies.

The goal is to make the transitions easier. Has to be bottom up and has to be distributed.

You find meta data automatically through the Web.

Messages are way under cover in systems. Now it surfaces as meta data and turns out to be as important as database schema meta data.

End Users don’t need to understand the meta data, but do quick searches.

Meta data Paradigms: The Ideal

The old idea of centralizing everything did not work. The enterprise-wide model is also a challenge. I can see the advantage of that, e.g. DHL expanded the package ID field by one character, and has been dealing with this issue for many years.

Bob strongly suggests the federated approach to meta data. This accepts that semantics differ across the enterprise but provides a common format for meta data.

He also proposes a lightweight meta data strategy that builds step by step. It recognizes that high-quality meta data frequently costs too much to provide, relative to its benefits to users. You don’t need a full repository to begin with.
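To make the federated, lightweight idea concrete, here is a small sketch in which each group keeps its own entries and meanings but shares one record format. It is an assumption-laden illustration, not Wells Fargo's design: the `MetaRecord` fields, the `search` helper, and the two sample definitions are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaRecord:
    # Common format shared by every group; the field names are illustrative.
    owner: str        # business unit that supplies and maintains the entry
    name: str         # local term
    description: str  # local meaning, which may differ between units


def search(catalogs, term):
    """Return matching records from every unit's catalog without merging meanings."""
    term = term.lower()
    return [
        rec
        for catalog in catalogs
        for rec in catalog
        if term in rec.name.lower() or term in rec.description.lower()
    ]


# Hypothetical catalogs: the same term carries different semantics in each unit.
retail = [MetaRecord("Retail", "customer", "Person holding at least one deposit account")]
lending = [MetaRecord("Lending", "customer", "Party named as borrower on a loan")]

hits = search([retail, lending], "customer")
for rec in hits:
    print(f"{rec.owner}: {rec.name} = {rec.description}")
```

Nothing here requires a central repository; a quick search simply reads whatever catalogs exist, which is the step-by-step spirit of the approach.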

Resources can come when you show how much conversation will be needed.

Lower your standards! You’ll feel good when you deliver.

Q: When you gather meta data, are you building processes to keep it fresh?
A: Share tags and retrieve them on a project-by-project basis – it is a negotiation process. It is not necessarily repeatable. That is the just in time concept here.

Q: What tools?  
A: ORACLE, sometimes a Web resource. Repository technology meets some needs but not all. Have a standard for XML development.

Q: How do you handle change management?  
A: Specific for each project.

Bob gave a very good piece-by-piece presentation on a current issue that most of us are dealing with, namely developing portals and how to piggyback on them to develop and gather meta data. Relevant notes on the project: we have a two-level taxonomy. Allies: our technical library, our internal web team, PMO. Lots of goodwill for meta data creation. XML-Schema as a documentation tool: Document Language to a Data Language. Modified Dublin Core for defining Meta Tags.

 

Conference Session: Architectures for Marrying Online Applications with Information Repositories

Speaker: Faisal Shah, Chief Technology Officer, Knightsbridge Solutions

 

Summary by Carey Clark

Knightsbridge Solutions works with Fortune 500 companies to marry data from transaction processing systems with data from data warehouses. Conceptually this is trivial. One might suppose you just create a front end to display data from both environments.

In practice, however, doing this is very difficult. The difficulty arises from the fundamentally different “quality of service” requirements of each environment. Explaining and resolving this difficulty is the subject of the presentation.

So why marry these two data sources in the first place?

A bank wants to provide customers with analytical information about their investments, i.e., reporting tools to compare their portfolio with industry indices (e.g. Dow Jones, Standard & Poor’s). Customers want to compare their performance with newsletter or broker recommendations. This is a serious competitive advantage if the bank’s competitors don’t offer it.

An Internet hosting service wants to provide its advertisers with real-time statistics: the number of visits to a site, the kind of visitors they were, what web pages were visited, etc. This must be done on an hourly basis; two days late is unacceptable. In both cases historical computed data is displayed along with real-time transaction data.

Quality Service Levels

A typical online transaction processing (OLTP) system requires 24x7 uptime, sub-second response times, and 100% accuracy. It must be fault tolerant even against disasters. It has very narrow maintenance windows, usually minutes.

Data warehouse systems are typically up 12 hours a day, 6 days a week (12x6). The off hours are needed for batch processing. They don’t need transaction monitors, the data can be a day or two old, response time can be several minutes, and if something goes wrong, you’re not out of business.

And herein lies the problem: Transaction systems can’t tolerate warehouse service levels and warehouse systems can’t realistically achieve transaction service levels. Users who see both data types at the same time assume the same service level.

What to do

It is really important to perform careful ROI analysis. Ambitious requirements can be outrageously expensive, even for large companies. The best solution is a set of trade-offs.

The first reality is that one cannot put both transaction and warehouse data on the same box. Just not feasible.

The second reality is you can’t divvy up warehouse data into mini warehouses. Doing so forces you to decide in advance what queries will be asked. If you partition by time, then geography is a performance problem; if by type, then time is a performance problem.

In almost all cases, the online transaction system must remain fast and reliable so every effort is made to impact it as little as possible. One successful technique is to precalculate and preaggregate a small standard set of queries and load that data on the transaction system. This precludes complex and ad hoc queries, but still provides immense value.
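The precalculation technique can be sketched as a batch step that produces a small summary table and an online step that only does a key lookup. This is an illustrative sketch, not the speaker's architecture; the row layout, the `preaggregate` and `monthly_total` names, and the sample figures are assumptions.

```python
from collections import defaultdict

# Hypothetical warehouse rows: (customer_id, month, amount in cents).
warehouse_rows = [
    ("c1", "2001-01", 10000),
    ("c1", "2001-01", 2500),
    ("c1", "2001-02", 4000),
    ("c2", "2001-01", 700),
]


def preaggregate(rows):
    """Batch step: compute monthly totals per customer, one fixed query shape."""
    summary = defaultdict(int)
    for customer_id, month, cents in rows:
        summary[(customer_id, month)] += cents
    return dict(summary)


# Loaded onto the transaction system as a small lookup table.
monthly_totals = preaggregate(warehouse_rows)


def monthly_total(customer_id, month):
    """Online step: a key lookup, with no scan of warehouse detail."""
    return monthly_totals.get((customer_id, month), 0)
```

Because only the precomputed shape is available online, any question outside it (for example, totals by product) cannot be answered, which matches the limitation on ad hoc queries noted above.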

Another technique is to limit analytical data to the time dimension. Data can sometimes be distributed across multiple database instances. A thousand trade-offs are made, for example weighing refresh times, or whether to put analytical data in with the transaction data or in a separate instance. Backup and restore can be handled, but the data currency is different for the two environments. How different is part of the trade-off analysis.

In a few situations the amount of data was so large that putting data in a relational database was cost prohibitive. In these cases the client resorted to creating massive flat files.

A favorite technique is to toggle between multiple database servers, or multiple database instances, or multiple database tables. This technique doubles or triples refresh times and hardware costs, but it works.
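The toggle idea can be shown with two in-memory copies standing in for two tables or instances: readers always see the active copy while the standby is rebuilt, and then the roles swap. This is a simplified sketch under that assumption; the `ToggledStore` class and its data are hypothetical, and a real system would swap a view, synonym, or connection target instead.

```python
class ToggledStore:
    """Two copies of the analytical data: one serves reads while the other is refreshed."""

    def __init__(self, initial):
        self._copies = [dict(initial), {}]
        self._active = 0  # index of the copy that readers see

    def read(self, key):
        return self._copies[self._active].get(key)

    def refresh(self, new_data):
        standby = 1 - self._active
        # Rebuild the standby copy in full; readers are unaffected meanwhile.
        self._copies[standby] = dict(new_data)
        # Switch readers to the fresh copy in a single step.
        self._active = standby


store = ToggledStore({"visits": 100})
before = store.read("visits")
store.refresh({"visits": 140})
after = store.read("visits")
```

Holding a second full copy is where the extra hardware cost mentioned above comes from.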

 

Conference Session: Getting the Rest of Your Organization Ready for XML

Speaker: Korki Whitaker, Progressive Insurance

 

Summary by Arnie Hook

Ms. Whitaker presents advice concerning the introduction of XML discovery activities, employee indoctrination, and training needs. The talk is based on experience gained at the Progressive Insurance Co., where she is responsible for data-related teaching and development.

Korki’s advantage in promoting XML is that Progressive’s management understands the benefits, has a place for new technologies, and allocates resources for their promotion.

The Data Engineering group led the activity by surveying management and querying software acquisition areas about XML tool requests. They also examined projects for interfaces to internal and external systems, and proactively got involved with the detailed requirements of projects.

Up-front analysis included documenting and highlighting accomplishments on current projects. This was critical for showing management the progress and successes with XML. XML projects require a high-level management sponsor in order to form a project and a development group (internal XML forum) in alignment with business requirements. The XML forum is an established common interest group with regular sessions and subcommittees.

Ms. Whitaker’s group developed a database of XML best practices and a training program to extend knowledge throughout the organization. The group objectives are equivalent to learning a new programming language. A core group promoting and mentoring XML as a new technology benefits automated project progress.

Korki’s experience and presentation set up any new XML advocate with material for introducing XML as a new technology.

 

 

Conference Session: Data Modeling Contentious Issues

Speaker: Karen Lopez, Principal Consultant, InfoAdvisors, Inc.

 

Summary by Margaret O’Hara

This presentation was a highly interactive look at issues debated by people who subscribe to e-mail, web, and newsgroup-based discussion groups. The format was simple: Karen presented an issue, discussed it briefly and then asked the audience to vote on it. Then, there was a brief discussion as to why the answers were what they were.

Voting was performed in an interesting manner. People in the audience were given Post-It notes and they could stick them to one of 5 boards, depending on how strongly they felt about an issue. Not everyone had the notes (the group was too large for that) but there were enough people with voting ability to make the results interesting.

Among the issues discussed were: whether conceptual data models were used (the results were evenly distributed on a 1-5 scale), whether a good data model needed classwords (results were definitely skewed toward 1 for always), and whether surrogate or natural keys were preferred (results were dead center at 3). The surrogate key issue generated a great deal of discussion; it was obvious that the audience felt very strongly about this issue.

The session pointed out two significant things. First, even a group of data administrators and data managers cannot agree on everything. Secondly, the voting method used was quite effective for taking a quick pulse of a large group and can be used in other similar situations.

 

 

Conference Session: Data Stewardship-Fact or Fiction?

Speaker: Diana C. Young, President, Applied Information Strategies

 

Summary by Linda Kresl

Diana began this presentation by explaining that the term data stewardship has been tossed around for the past decade. Stewardship is … the recognition that all individual components of an enterprise serve to ensure the future of the total organization.

The main stewardship objective is to provide high quality information that meets the needs of the business:

·        Getting the Right Information

·        To the Right People

·        At the Right Time

Ultimately, the successful path to stewardship is based upon an understanding of the principles of information stewardship, aligning those principles with the business in a value-added approach, and planning and achieving both short and long term improvements in the business. This presentation addressed:

·        The four factions of stewardship: strategic, tactical, operational, and technical – what they are and how they align with business processes and functions

·        Stewardship roles, responsibilities, and the “A” word – accountability

·        The four pillars of successful implementation: policy, program, practice, and promotion.

Lastly, in our work, we are all information producers just as we are all information consumers. As we work within our companies, we all must strive to see that all functions of the company succeed. Therefore, it is in our best interest for us all to champion the practices of sound information management. Because, in reality, we are all Information Stewards.  

 

Conference Session: How to Make Your Business Processes Smarter

Speaker: Ronald G. Ross, Principal, Business Rule Solutions

 

Summary by Anne Marie Smith

Ron Ross, renowned consulting expert in business rules and the editor of the “Data to Knowledge” newsletter, is one of the information management field’s primary speakers and practitioners, and was the winner of the DAMA International 1995 Individual Achievement Award.

This presentation introduced the concept of a business rule approach to business process “education”, outlined three steps in the process of applying the business rules approach and offered some suggestions for implementing this approach in various organizations. Ron used cases from his consulting to demonstrate the fundamental verities of business rules as the point of control for the business.

Points in Ron’s presentation included:

·        The inevitability of business rules: Business rules can assist organizations in doing things “faster, cheaper, better” and can teach organizations about their company’s activities and culture. Organizational trends from the 1960’s (automation) through the 1990’s (warehousing and networking) concentrated on technology. The trend in the 2000’s is knowledge management, a non-technical trend that needs business rules to succeed. Business rules have “guidance spheres” that include policies, rules, guidelines, instruction points and suggestions. These enable an organization to be effective and to achieve the goals and objectives that the company has expressed. Guidance spheres are fragmented, compartmentalized and not well-understood. Business rules can make those guidance spheres cohesive, cross-functional and understandable.

·        The need to trace the rules to their sources: Rules cannot be valuable unless their sources have been identified, interpreted and captured to retain corporate memory. Sources can be valid or invalid for each rule, and the reason for a rule’s origination at a particular source must be captured as part of the rule’s meta data, and rules should be managed as part of meta data in all instances. “Outsourcing the business rules to the business” means that an organization gives control of the business rules to the business users.

·        Finding a single source for each rule: One rule should only have ONE source for consistency and traceability. Multiple sources for a rule cause confusion in users and can severely affect the value of the data instances from that rule. Viewing business rules as meta data would include the versioning of each rule, controlling the vocabulary in rules management, etc…

·        The crucial role of data and meta data in business rules: Data administrators should be given the responsibility to manage the meta data of the organization’s business rules, and should have a business rules repository as part of the DA toolkit. Data administrators should be trained in business rule management, just as they are trained in data management. Business rule management includes the development of a rule vocabulary, logic for rule construction, and techniques for the CRUD process of rule maintenance.

·        The needs of knowledge workers in the 21st century: Unlike previous centuries, knowledge workers need “knowledge” to perform their tasks properly. “Knowledge” of the data used to create information requires an understanding of the logic behind the creation of a data instance. This logic is a “business rule”. According to Ron, “the idea of not using a rule engine to run your rules management will seem as strange in 5 years as not using a DBMS to manage your data”.
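The points above about a single source per rule, versioning, and CRUD maintenance can be sketched as a tiny in-memory rule repository. This is only an illustration of managing rules as meta data, not a rule engine and not Ron's method; the `BusinessRule` fields, the `RuleRepository` class, and the sample rule are invented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BusinessRule:
    rule_id: str
    statement: str
    source: str         # exactly one source per rule
    source_reason: str  # why the rule originates at that source
    version: int = 1


class RuleRepository:
    def __init__(self):
        self._history = {}  # rule_id -> list of versions, oldest first

    def create(self, rule):
        if rule.rule_id in self._history:
            raise ValueError(f"rule {rule.rule_id} already exists")
        self._history[rule.rule_id] = [rule]

    def read(self, rule_id):
        return self._history[rule_id][-1]

    def update(self, rule_id, statement):
        # A new version is appended; the source stays the same.
        current = self.read(rule_id)
        revised = replace(current, statement=statement, version=current.version + 1)
        self._history[rule_id].append(revised)
        return revised

    def delete(self, rule_id):
        del self._history[rule_id]


repo = RuleRepository()
repo.create(BusinessRule(
    "R1",
    "A rush order must ship within 24 hours",
    source="Shipping policy manual",
    source_reason="Service commitment to customers",
))
revised = repo.update("R1", "A rush order must ship within 48 hours")
```

Keeping earlier versions in the history list is one way to retain the corporate memory the summary refers to.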

Ron concluded with a discussion of some “first steps” in the business rule development process and what he sees as the future of business rules. Closing the communication gaps in an organization can be accomplished by adopting a business rule approach to organizational knowledge.

 

Conference Session: Meta-Architecture and Enterprise Meta Data Management

Speaker: E. Manning Butterworth, Senior Manager of Data Architecture, Reynolds & Reynolds

 

Summary by Ron Klein

Dr. Butterworth is experienced in Business Delivery Architecture. He was a principal engineer of DOD, Air Force and holds an Astrophysics Ph.D. Opening his presentation, he stated that we need to give more emphasis to the business issues rather than act just on behalf of IT.

The business goals of meta data management are to accelerate growth and drive down the costs of doing business. Butterworth described his meta-architecture process, moving from the big picture to example fragments showing how artifacts were defined in the architecture. He chose ARIS as the enterprise modeling tool; it is popular in Europe and originated in Germany. It is an OO DB and is methodology neutral, or almost. Butterworth says the model will never be “finished”, but will be useful along the way, providing incremental value.

This part of the presentation gave rise to many questions:

Q: What are the 17 model types?  
A: It is the metamodel.

Q: Would there be a constraint for each dimension?  
A: Yes.

Q: Synthesizing the process, how did you come to 17 model types?  
A: It is based on what is required to capture at this point in time.  

Q: How do you deal with an object type that is part of more than one model?  
A: You can assign objects from one model to another.

Q: What about Change Management?
A: This is work in progress.

Q: Where do you put an instance?  
A: Maybe in a document and associate to the object.

Q: Do you get your meta data manually or are there any automated extractions to populate?  
A: Both.

Q: Will this data model support architecture over time?  
A: It will require change management.
 

The audience perception was that Dr. Butterworth’s work has enormous potential across industries due to its generic construct.

   

Conference Session: E-Business Chaos

Speaker: Mike Scofield, Director of Data Quality, Experian

 

Summary by Carey Clark

Experian (formerly TRW) is one of the three big credit reporting agencies. They store data on 260 million consumers and a billion credit card accounts. The architecture to do this is very complex and largely proprietary.

Data quality is their major concern and ensuring it is a massive undertaking. Massive because they update a billion records every month and because the data comes from 5000 separate outside sources! The data from these sources have different file formats, different schedules, different quality levels, and varying data conventions. At the same time accuracy and reliability are of utmost importance.

Determining data quality is difficult because you can’t physically validate it and sampling is often undoable. So you’re left with two kinds of tests: Conformance with absolute rules and reasonability testing.

Absolute rules are like:

  Reasonability tests are more subtle:

Data that passes the tests is allowed to be loaded into the database of record. Data that does not is diverted to a suspense database. Suspense data is then examined by humans and the data’s source organization is often called.

Obviously the incentive is to automate as much as possible and to reduce the amount of human intervention. To this end they make it easy to study the problem, easy to decide what to do about it, and easy to execute the decision. All of this must be done without slowing down the timely loading of good data.

Lessons learned:

Test data as soon as it’s available and certify it before loading into the master database. If you scrub the data, test it again before loading it.

Have several architectural components: A data rules database so that rules don’t have to be hard coded. A historical context database to remember exactly what was received last time. A metrics database to store quantifiable measures of quality. And a feedback mechanism to the supplier of the data.
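The testing and routing flow described in this session can be sketched as follows. Experian's actual rules and architecture are proprietary and are not given in the summary, so everything here is hypothetical: the two absolute rules, the 50% record-count threshold used as a reasonability test, and the `process` function that splits a delivery into loadable records, suspense items, and simple metrics.

```python
# Rules held as data rather than hard coded; all names and thresholds are hypothetical.
ABSOLUTE_RULES = [
    ("balance is non-negative", lambda rec: rec["balance"] >= 0),
    ("account id present", lambda rec: bool(rec["account_id"])),
]
MAX_COUNT_CHANGE = 0.5  # reasonability: record count may not swing more than 50%


def failed_rules(record):
    return [name for name, check in ABSOLUTE_RULES if not check(record)]


def is_reasonable(batch, previous_count):
    """Compare this delivery with what the same supplier sent last time."""
    if previous_count == 0:
        return True
    change = abs(len(batch) - previous_count) / previous_count
    return change <= MAX_COUNT_CHANGE


def process(batch, previous_count):
    """Return (records to load, suspense items, metrics)."""
    if not is_reasonable(batch, previous_count):
        # The whole delivery looks wrong, so hold it all for human review.
        suspense = [(rec, ["record count out of range"]) for rec in batch]
        return [], suspense, {"received": len(batch), "loaded": 0}
    good, suspense = [], []
    for rec in batch:
        failures = failed_rules(rec)
        if failures:
            suspense.append((rec, failures))
        else:
            good.append(rec)
    return good, suspense, {"received": len(batch), "loaded": len(good)}


# Hypothetical delivery from one supplier.
batch = [
    {"account_id": "A1", "balance": 250},
    {"account_id": "", "balance": 90},
    {"account_id": "A3", "balance": -10},
]
good, suspense, metrics = process(batch, previous_count=3)
```

Each suspense item carries the names of the rules it failed, which is the kind of detail that could feed both human review and feedback to the supplier.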

It is not uncommon for them to know more about the quality of supplier data than the supplier does. And they sometimes provide quality assessments for a fee. Without violating confidentiality agreements they can sometimes let suppliers know that their data is not as good as their competitors’.

Experian believes that knowledge and maintenance of data quality is what differentiates them from other credit bureaus. They are a learning corporation and their sophistication resembles an expert system.

 

 

Conference Session: Same Old Work, New Dilemma: A New Approach to Data Design for Interactive Web Portal Applications

Speaker: Ho-Chun Ho, Director of Information Systems, PointandQuote.com

 

Summary by Arnie Hook

Ho-Chun defines E-business as ‘the transformation of key business processes through the use of Internet technologies’. He presents an E-business maturity model that streamlines applications for integrated enterprise architecture. The stages of maturity are:

Web presence

Interactive

Transaction

Inter-enterprise integration

The new dilemma is that customers are demanding services through technology such as the WWW. Organization developers want to use the best and fastest automation to deliver applications and technical services. Executives want to use information and technology to gain market share and increase competitive channels and to produce new products.

Ho-Chun presents basic e-business terms and describes application considerations. The content prepares an organization for planning and development of the Web enterprise. He provides a vocabulary of Web and development terms.

The attendees are better equipped to consider the stateless design and data performance issues in addressing Web applications.  

Thursday, March 8th, 2001

 

Conference Session: Enterprise Information Architecture: "Starter Kit" Models

Speaker: Jane Carbone, Director of Information Architecture Services, DATANOMICS, Inc.

 

Summary by Linda Kresl

This presentation reflects Jane’s experience in building and using enterprise architecture frameworks to create architecture models and related data models. The presentation provides a “drill-down” for the “models” dimension of the “data” component of the “Starter Kit” architecture framework. It introduces a standardized approach to building conceptual information architecture models. It describes the link from architecture model to conceptual data model. It includes examples and guidelines for construction of Current State (AS-IS) and Target State (TO-BE) information architecture models. She focused on the following:

·        How to construct a standardized information architecture model

·        How to decompose standardized information architecture models

·        How to create a conceptual data model from a Level n information architecture model.

Jane stressed the importance of defining the difference between information architectures and data architectures. An information architecture is the graphical representation of the business view of data functions, technology, people and processes and the relationships and/or communications between them. A data architecture is the graphical representation of the business view of data functions, processes and the relationships and/or communications between them. She advised always building the enterprise architecture models before attempting to create the enterprise data models. A data model is not the same as an architecture model, but the two have a strong relationship.

 

 

Conference Session: Data Standardization

Speaker: Michael Gorman, President, Whitemarsh Information Systems

 

 

There is currently no summary for this session.  Please check the Wilshire Conferences web site shortly for the summary.

 

Conference Session: Conceptual Data Modeling in an Object-Oriented Process

Speaker: Scot Becker, Principal Consultant, InConcept, Inc.

 

Summary by Anne Marie Smith

Scot Becker, principal consultant at InConcept and the editor of the Object Modeling newsletter, presented the Object Role Modeling technique. Scot defined the various components of an object model and how components are defined, briefly defined iterations (“mini waterfalls”), and gave an overview of the analysis phase of an object modeling session (“what is the problem”, NOT “how to solve the problem”). The verification and deployment phases are iterative, as is appropriate to such development, since components are iterative in themselves.

Typical uses of data modeling in an OO process are just for mapping objects to tables/columns, so Scot proposed that there is no need for modeling in OO. Data issues are usually ignored in OO analysis, and in the design phase data issues are mostly concerned with denormalization. Scot is a fan of use cases for process requirements, but not for data. The main difference between class diagrams and ER diagrams is the abstraction inherent in class diagrams, which does not translate well to data analysis efforts.

 Scot offered some benefits to the OO approach to requirements and analysis:

Scot offered some weaknesses of class diagrams, including inflexibility, lack of clear cardinality, and the need to actively focus on entities and attributes separately. Components are not perfect since too many important things rely on getting all the components right the first time (lack of iterative capabilities). Use case formats are too variable, and can be “overloaded” with information that may not be essential. According to Scot, the lack of formal OO modeling techniques can contribute to errors and misunderstandings in analysis and design.

Scot believes that the way to overcome the weaknesses of OO modeling is to develop a new method, called ORM (Object Role Modeling). This is not really a new idea, and it is centered on the concept that entities and attributes are simply objects playing one or more roles. ORM uses a natural language (English or other languages) and uses “data use cases” as well as “process use cases”. It has a very rich set of constraints and it is set-based (mathematically valid).

Scot gave an overview of the Conceptual Schema Design Procedure as a way to explain the ORM approach, using the various OO components as a comparison to ORM. He suggested that OO and ORM can be merged and work together to achieve a full method for analysis, design, construction and implementation. He concluded with a review of a case study that enabled him to employ the OO/ORM combination approach.

 

 

Conference Session

Speaker

A Success Story: Enterprise Customer Data Standard Definition/Implementation

 

Barbara Peterson, Enterprise Data Standards Program Manager

Agilent Technologies

 

Summary by Ron Klein

Barbara works with Customer Data Standards and Web Applications, and she has 21 years’ experience at HP. She is not an IT professional. Her father asked, “What do you do, Barbie?” After answering with an explanation, she invited him to attend the presentation. He said, “Wonderful, how about I come in for lunch?”

Information Quality can be no better than the supporting data standards. How do you get standards that really make sense to the business, not for IT?

Applications must use the same naming convention. Agilent had a Quality person but no Program Manager for Standards, and the need arose when the mappings among customers began to surface nightmares. This, Barbara said, was the reason she was hired for the job.

Q: How do you get the business to understand the logical model?  
A: By including them on the creation through the Data Standards Council involvement.

Q: How do you get buy-in?  
A: There are 30 representatives on the Council. We hold a two-hour weekly conference call.

Q: Are there any metrics to qualify the benefits?  
A: Manpower hours saved and quality improvement could be candidates, but we don’t currently have any metrics on this.  


Q: After you define your standards and buy an ORACLE ERP system, do you change the business or do you adjust the software?  
A: We determined that we can live without customizations and are able to implement standard ORACLE ERP.

Q: What attitude do you have on implementing a package? In my case we changed the business. Maybe you have other answers?  
A: There are exceptions because there is flexibility within the product, and this is where the standards come in very handy to resolve things that the vendor does not answer on your behalf.

Q: Is the Council defining and gathering information, or disseminating it throughout the organization?  
A: Both.

Q: Is this council a single level or multiple levels?  
A: It includes high-level decision makers and also experts. Get the right people in the room. The decision-makers have a vote.

Q: Where is IT in the council?  
A: Just below the CIO level.

Data Standards Process Roles

Don’t go to the Council without a predefined draft that is available online on the Web. The feedback that comes in is what you go and discuss in the Council. Because the draft is online, it is also common knowledge.

Q: Your position is in IT?  
A: I am in Relationship Marketing.

Q: Does your Council deal with physical data?  
A: Yes, with the vanilla ORACLE ERP implementation, but it works more at a high, conceptual level.

Q: How many people are there in your organization and what skills do you need?  
A: The skills required are facilitation and the ability to work with people who disagree. We will create groups by subject.

Q: Is it focused on Customer only?  
A: It is huge around customers.

Q: How do you attract people?  
A: More people want this role because it gives them decision power. Information is shared. People on the Council come from all over the business. Have a one-day, face-to-face kick-off meeting and then move to conference calls of no more than two hours. Always use the same dial-in number. Get members to commit for one year. They need to know your expertise. No feedback is bad feedback. Teleconference recommendation: no cell phones. 80% consensus = done deal.

Q: Who pays for your function?  
A: At HP it was funded by the business; now it is funded internally.

Q: Before talking physical?  
A: If you have it going it has to address both.

Q: How do you manage multiple instances in multiple countries, and follow through over time? What are your guidelines?  
A: At HP there was a Steward and nobody was policing. Now it is the Council. The members are the ones who enforce the standards and bring changes back to the Council.

 

Conference Session

Speaker

eRepository for eBusiness

 

Warren Selkow, Consultant

 

Summary by Arnie Hook

Why repository? It is where the business knowledge is. It is all about information technology. Decision-makers need to know about the business/technology environment. If you are a big enough company, your mistakes become standards. The repository contains the organization’s motivation.

Mr. Selkow describes the environment and what it means to business leaders. He outlines the knowledge paradox for future business trends and technology applications. Today, unprecedented business pressure and rapidly changing technology are changing the nature of business. Eighty percent of what we know is obsolete. Technology costs are decreasing and information costs are increasing, with a dramatic impact on business practices.

Experts say whatever will sell their ideas and products. Warren opines that the IT professional must learn to communicate in terms of the executive’s motives: ‘time to respond’ and ‘time to market’.

Mr. Selkow speaks of applying standards and frameworks to the environment to show how technology is configured, with its historical views all captured in an enterprise repository. The repository is about deployment and the creation of measurable results. We must educate management, emphasizing ‘the need to know and share’ and how it relates to customer benefits.

 

Conference Session

Speaker

Web Usage Mining and Analysis

 

Patricia Klauer, Senior Consultant, &

Robert Cooley, Senior Consultant

Apex Solutions, Inc.

 

Summary by Margaret O’Hara

The topics covered in this presentation were: clickstream data, web content and structure charts, E-commerce data and the analysis of such data. Clickstream data is very difficult to analyze – everyone wants it but thus far no one has provided an effective solution. We want to analyze data in terms of user actions and behavior (typically referred to as clickstream).

For even the most basic sites, however, a single click may generate 3-4 pages of data that require lots of Extract, Transform, Load (ETL) operations to get anything meaningful from them. And, after all that analysis, we sometimes get nothing meaningful in the log. Some specific problems are: IP addresses may not be unique. We can use cookies, but they raise privacy concerns and are not useful when more than one person uses the machine – they are browser specific, not user specific.
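The identification problem can be sketched in Python. This is only an illustration under stated assumptions, not the speakers' method: the log records are simplified dictionaries with hypothetical fields (`ip`, `user_agent`, `cookie_id`, `time`, `url`), and the 30-minute inactivity cut-off is a commonly used convention that was not given in the session. The sketch groups hits into sessions by cookie, falling back to IP address plus user agent when no cookie is present.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cut-off, not from the talk


def visitor_key(hit):
    """Prefer the cookie; fall back to IP plus user agent when no cookie is present."""
    if hit.get("cookie_id"):
        return ("cookie", hit["cookie_id"])
    return ("ip+agent", hit["ip"], hit["user_agent"])


def sessionize(hits):
    """Group hits into sessions per visitor key, splitting on long inactivity gaps."""
    sessions = []      # list of sessions, each a list of hits
    open_session = {}  # visitor key -> index of that visitor's current session
    for hit in sorted(hits, key=lambda h: h["time"]):
        key = visitor_key(hit)
        idx = open_session.get(key)
        if idx is not None and hit["time"] - sessions[idx][-1]["time"] <= SESSION_TIMEOUT:
            sessions[idx].append(hit)
        else:
            sessions.append([hit])
            open_session[key] = len(sessions) - 1
    return sessions


t0 = datetime(2001, 3, 5, 9, 0)
hits = [
    # Two people behind the same proxy IP, told apart only by their cookies.
    {"ip": "10.0.0.1", "user_agent": "A", "cookie_id": "c1", "time": t0, "url": "/"},
    {"ip": "10.0.0.1", "user_agent": "A", "cookie_id": "c2",
     "time": t0 + timedelta(minutes=1), "url": "/"},
    {"ip": "10.0.0.1", "user_agent": "A", "cookie_id": "c1",
     "time": t0 + timedelta(minutes=2), "url": "/cart"},
    # Same cookie an hour later: treated as a new session.
    {"ip": "10.0.0.1", "user_agent": "A", "cookie_id": "c1",
     "time": t0 + timedelta(minutes=62), "url": "/"},
]
sessions = sessionize(hits)
```

With the sample data, grouping by IP address alone would merge both visitors into one stream, while the cookie keeps them apart. The cookie still identifies a browser rather than a person, so the limitation noted above remains.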

The real problem arises when one wants to track E-commerce events – those site visits associated with a single user. There are two main areas: product-oriented and visit-oriented. For product-oriented events, we need to see things such as click-throughs (when a user clicks on an item to obtain more information), shopping cart changes, and purchases. For visit-oriented events, we need to see how the session begins and ends and what degree of personalization results from the visit.
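As a rough illustration of the two event categories, the following Python sketch tags each event as product-oriented or visit-oriented and counts them per user. The event names and the `(user, event_type)` record shape are assumptions chosen for the example; a real site would derive them from its own logs.

```python
from collections import Counter

# Hypothetical event names; the two categories follow the summary above.
PRODUCT_EVENTS = {"click_through", "cart_add", "cart_remove", "purchase"}
VISIT_EVENTS = {"session_start", "session_end", "personalization"}


def categorize(event_type):
    """Map an event type to 'product', 'visit', or 'other'."""
    if event_type in PRODUCT_EVENTS:
        return "product"
    if event_type in VISIT_EVENTS:
        return "visit"
    return "other"


def summarize(events):
    """Count events per (user, category) pair."""
    counts = Counter()
    for user, event_type in events:
        counts[(user, categorize(event_type))] += 1
    return counts


events = [
    ("u1", "session_start"),
    ("u1", "click_through"),
    ("u1", "cart_add"),
    ("u1", "purchase"),
    ("u1", "session_end"),
    ("u2", "session_start"),
    ("u2", "page_view"),
]
summary = summarize(events)
```

Here `summary[("u1", "product")]` counts the click-through, cart change, and purchase for user `u1`, while plain page views fall into the `other` bucket.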

After collecting all the above data, the real analysis can begin – we can actually determine what the users are doing. Some caveats from the speakers: you can’t just buy tools, attach them to a web site, and expect to collect data (although vendors often claim you can!). A web usage analysis methodology would include: careful planning, an iterative build process with prototyping, and determining measures to perform analysis and assess results.

 

 

Conference Session

Speaker

Action Business Rules – Getting to Yes

 

Judi Reeder, Consultant

 

Summary by Linda Kresl

This presentation focused on action business rules, which test conditions and, upon finding them true, start a transaction or event. When capturing action business rules, one of the key tasks is to discover and document the conditions, and their values, that impact the decision. The presentation discussed examples of decision areas where action business rules were developed using facilitated sessions.
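The shape of an action business rule, conditions that are tested and an action that starts when all of them are true, can be sketched in Python. This is a minimal illustration, not Judi's notation; the `ActionRule` class and the order-approval rule with its threshold are hypothetical examples.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ActionRule:
    """A hypothetical action business rule: conditions to test and an action to start."""
    name: str
    conditions: list  # predicates over a dictionary of facts
    action: Callable  # called with the facts when every condition is true

    def evaluate(self, facts):
        """Run the action and return its result if all conditions hold; otherwise None."""
        if all(condition(facts) for condition in self.conditions):
            return self.action(facts)
        return None


# Illustrative rule (not from the session): approve small orders
# from customers whose credit status is good.
approve_order = ActionRule(
    name="auto-approve order",
    conditions=[
        lambda f: f["credit_status"] == "good",
        lambda f: f["order_total"] <= 5000,
    ],
    action=lambda f: f"start transaction: approve order {f['order_id']}",
)

result_yes = approve_order.evaluate(
    {"order_id": 17, "credit_status": "good", "order_total": 1200}
)
result_no = approve_order.evaluate(
    {"order_id": 18, "credit_status": "hold", "order_total": 1200}
)
```

Keeping each condition as a separate, named value makes it easier to document which conditions and values drive the decision, which is the discovery task described above.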

Judi suggests the following steps be completed.