CONFERENCE TRIP REPORT

5th Annual Wilshire Meta-Data Conference & 13th Annual DAMA Symposium
Hosted by Wilshire Conferences, Inc. & DAMA International
This report compiled and edited by Tony Shaw, Chairman, Wilshire Conferences
Contributing Trip Report Authors:
A debt of gratitude, and congratulations on a huge task exceptionally well done, is owed by all conference attendees to the following individuals:
Linda Kresl, Margaret O’Hara, David Plotkin, Anne Marie Smith
The Meta-Data Conference and DAMA Symposium were again
co-located in 2001. The combined event drew an audience of over 1000 attendees
and speakers. The exhibit floor included 35 companies showing the latest data
management and development products. To receive more information about this
conference, and related future events, go to http://www.wilshireconferences.com
This report contains summaries of the key discussions and conclusions from virtually all of the 60+ conference sessions, tutorials and workshops.
Reproduction policy:
This Conference Summary is intended for the use of the attendees at the 2001
Wilshire Meta-Data Conference and DAMA International Symposium.
As such, attendees may excerpt or reproduce any portion of the report for
the purpose of sharing information with colleagues and within their own
organizations. Any other
reproduction, publication or editing of the report is not permitted without the
specific written authorization of Wilshire Conferences, Inc., including the
placement of the report on other web sites.
However, links to the report on the Wilshire Conferences web site may be
made without express permission.
©2001
Wilshire Conferences, Inc.
"Meta-Data Conference" and the Meta-Data Conference logo are service
marks of Wilshire Conferences, Inc.
Join us next year…

The 6th Annual Wilshire Meta-Data Conference and the 14th Annual DAMA International Symposium
April 28 – May 2, 2002
www.wilshireconferences.com
META-DATA CONFERENCE & DAMA INTERNATIONAL SYMPOSIUM
March 4-8, 2001 – Anaheim, California

Sunday, March 4 - WORKSHOPS (Half Day)
W1 | Data Modeling Essentials: Things Have Changed | Graeme Simsion & Graham Witt, Simsion & Bowles
W2 | The Operational Data Store: An Evolution of the Data Warehouse | Jonathan Geiger, Braun Consulting, Inc.
W3 | Knowledge For Action: The Discipline of Spreading Knowledge | Robert S. Seiner, TDAN & CIBER, Inc.
W4 | Peter Aiken, Institute for Data Research, Virginia Commonwealth University

Monday, March 5 - TUTORIALS (Full Day)
T1 | Zen and the Art of Data Modeling | Alec Sharp, Damex Consulting
T2 | Applying Quality Principles to Data Definition and Data Modeling | Larry P. English, INFORMATION IMPACT International, Inc.
T3 | Developing a High-Quality Data Resource to Support Information Needs | Michael Brackett, Data Resource Design & Remodeling
T4 | John Zachman, Zachman International
T5 | Building and Managing the Meta Data Repository | David Marco, Enterprise Warehouse Solutions
T6 | Debbi Walsh & Hal Davis, XML Solutions
T7 | Application and Data Integration | Sridhar Iyengar, Unisys Corporation
T8 | Data Architectures for Scalable E-Commerce | Michael Stonebraker, Cohera Corporation

Tuesday, March 6 – CONFERENCE SESSIONS

8:45-9:45
KEYNOTE PRESENTATION: The Agile Organization, Tom DeMarco, Atlantic Systems Guild

10:15-11:15
C1 | Pouring the Foundation for the Information Age: Data Architecture at USAA | Andres Perez, USAA
C2 | Data Quality as a Profit Center | Wendy Wood, SBC Services
C3 | Introduction to the Unified Modeling Language | Eric Naiburg, Rational Software Corporation
C4 | Implementation/Use of Operational Meta Data to Improve Data Quality in the Data Warehouse | Michael Jennings, Hewitt Associates
C5 | David Hay, Essential Strategies
C6 | Alan Perkins, Visible Systems Corporation

11:25-12:25
Data Management Support for Enterprise Architecture | Brett Champlin, Allstate Insurance Company
C8 | Business Rule Specification, Validation and Transformation: Advanced Aspects | Terry Halpin, Microsoft Corporation
C9 | Business Processes and Logical Process Modeling | Anne Marie Smith, LaSalle University
C10 | Redefining Meta Data Strategy in the 21st Century | Ron Klein, Carswell Thomson Professional Publishing
C11 | Build Your Own Web-Based Meta Data Repository | Joseph Newcum, Bank One
The Role of Data Administration in Managing an Enterprise Portal | Arvind Shah, Performance Development Corporation

1:45-2:45
C13 | Corporate Data Architecture in a Federated World | Deborah Henderson, Hydro One Networks Inc. & Vladimir Pantic, IBM Global Services
C14 | Facilitation and the Successful Architect | Shelley Lieberman, Mathtech
C15 | The Practical Use of a Universal Data Model in the Data Warehouse | David Lepley, Tyco Electronics
C16 | Understanding and Managing Reference Data | Malcolm Chisholm, Deloitte & Touche
C17 | Architecting and Implementing a Web-Based Corporate Meta Data Repository (CMR) at the Census Bureau | Gail Wright, Oracle Corporation
C18 | Building the XML Meta Data Repository | David Plotkin, Longs Drugs

3:15-4:15
C19 | Jill Dyche, Baseline Consulting Group
C20 | Elevating the Role of Information Resource Management for Business Effectiveness | Larry P. English, INFORMATION IMPACT International, Inc.
C21 | PANEL: Comparison of Modeling Techniques | Graham Witt, Alec Sharp, Terry Halpin, Eric Naiburg
C22 | Meta Data - Myth and Realities | John Ladley, Knowledge InterSpace, Inc.
C23 | The UPS Meta Data Repository - A Success Story: Taking the Next Steps | Patti Munier, United Parcel Service
C24 | Universal Data Models for Web Information Management | Len Silverston, Universal Data Models

Thursday, March 8 – CONFERENCE SESSIONS

8:30-9:30
Enterprise Information Architecture: "Starter Kit" Models | Jane Carbone, DATANOMICS, Inc.
C50 | Michael Gorman, Whitemarsh Information Systems
C51 | Conceptual Data Modeling in an Object-Oriented Process | Scot Becker, InConcept, Inc.
C52 | A Success Story: Enterprise Customer Meta Data Definition/Implementation | Barbara Peterson, Agilent Technologies
C53 | Warren Selkow, Consultant
C54 | Patricia Klauer & Robert Cooley, Apex Solutions, Inc

9:50-10:50
C55 | Action Business Rules – Getting to Yes | Judi Reeder, Consultant
C56 | Natalie Arsenault, First Union National Bank
C57 | Dave Buch, Capital One
C58 | Joe Danielewicz, Motorola, SPS
C59 | PANEL: New Trends in Meta Data | Robert Seiner, TDAN and CIBER (moderator); Don Soulsby, Computer Associates; James Jonas, Oracle
C60 | OMG CWM - An Architecture for Enterprisewide E-Business Intelligence Integration | Sridhar Iyengar, Unisys

11:10-12:30
CLOSING KEYNOTE PANEL: Data Management – Where to From Here?

SUNDAY WORKSHOPS
Workshop 1: Things Have Changed
Speakers: Graeme Simsion and Graham Witt, Simsion Bowles & Associates
Summary by Carey Clark
In an industry known for its hype and self-importance, the two Grahams sparkle for their self-deprecating honesty. They speak their minds and welcome rebuttal. Controversy is a good thing and not to be avoided.
What’s different now than
when his first book was published?
Object Orientation has not made
conventional data modeling obsolete. It’s great for some software projects but
can create more headaches than help for “persistent data” applications.
Typical pitfalls:
UML is not their preferred modeling nomenclature for a number of reasons:
Data Types are now more complex and include more user-defined data. Spatial, video, audio and image data deserve their own type. One must resist the habit of converting data into characters or numbers when a richer data type makes sense. For example, an address can be its own data type and treated as a single thing.
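As a rough illustration of treating an address as a single thing, the sketch below models it as its own type instead of a handful of loose character fields. The field names, the `Customer` class and the sample values are assumptions made for the example; they were not part of the workshop material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Hypothetical address type; the field list is an assumption."""
    street: str
    city: str
    region: str
    postal_code: str
    country: str

    def one_line(self) -> str:
        # Render the address as one value, e.g. for a mailing label.
        return (f"{self.street}, {self.city}, {self.region} "
                f"{self.postal_code}, {self.country}")


@dataclass
class Customer:
    """Hypothetical entity that holds an address as one attribute."""
    name: str
    mailing_address: Address


# Made-up sample data.
hq = Address("123 Example Street", "Anaheim", "CA", "92802", "USA")
customer = Customer("Example Customer", hq)
print(customer.mailing_address.one_line())
```

Because the address is one value, it can be compared, passed around and validated as a unit, which is the point the speakers made about richer data types.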
Derived data needs to be modeled and defined even when it won’t be stored in a database.
Business rules need to be
captured. How they are stored depends on their volatility. All such rules are
subject to challenge. It is the modeler’s responsibility to suggest changing
existing rules when they don’t make data modeling or business sense.
In some circumstances one
must allow a rule to be broken. The problem arises when there is a need to
collect data that doesn’t comply with a rule.
Naming data is extremely important. When names don’t denote what they mean the ambiguity becomes widespread. This is even truer with XML and the increased interaction with other businesses. When there is an industry naming standard (e.g., an XML-based one) it is best to go with it. Even if not optimal, it beats having to translate more than necessary.
Data modelers need to be
involved with the use of their models after they are completed. It is not
uncommon to see developers ignore them or misuse them. The effort of many hours
of confirmation can be jettisoned when a developer assumes a mistake and simply
overrules the model.
Meta data needs to be available to everyone who needs to see it: users, process modelers, and developers. If it’s not used it doesn’t add value. “You can’t have data quality without meta data quality.”
In one survey they found that a good percentage of the decisions data modelers felt were theirs to make were decisions the data administrators thought they should make. It behooves the two functions to reach agreement on responsibilities.
Although there wasn’t time to go into detail they touched on how to present large data models (e.g., corporate data models) to executives. The consensus there was to break the model up into small chunks. The whole model tends to bewilder the uninitiated.
They presented a process
diagram of the data modeling process. It was realistic and useful.
Workshop 2: The Operational Data Store: An Evolution of the Data Warehouse
Speaker: Jonathan Geiger, Vice President, Braun Consulting, Inc.
Summary by Dale Kohlmoos
Jon Geiger gave a
three-hour presentation on an operational data store (ODS) designed for the
tactical analysis of subject-oriented data. Jon mapped out the essential steps
that should be taken in the development of an ODS in an integrated enterprise
system.
He began with a high level
evaluation of enterprise systems and described where the ODS could be positioned
for optimal use as a tactical tool for analysis. The ODS was presented as a tool
used to complement a warehouse and its associated data marts. The intent of the
ODS is the tactical execution of the strategies identified in the warehouse. In
order to accomplish this, he described the ODS as demanding a high degree of
query performance and availability.
Characteristics of the ODS
are that it is subject-oriented, integrated, current and volatile. It is
intended to be a central point of data integration for business management. This
view was further broken down into four classes. Each class was described by the
update frequency, degree of integration, transformation and summarization.
ETL tools play a significant role in the management of the ODS and were described as an architectural consideration. The high level of integration, transformation and summarization precludes most other forms of loading.
Jon introduced the concept of Oper-Marts, or ODS Data Marts, which are much like the familiar OLAP reporting cubes, summary tables and small star schemas. Oper-Marts are frequently rebuilt because they only reflect data at a specific point in time and lag behind the ODS data update.
ODS data model examples and
aspects of tuning and scheduling were presented to help give the audience a good
background for consideration of an ODS implementation. From there Jon went into
overall architectural considerations with respect to e-Business, CRM, Finance
and Insurance.
The methodology for
implementation included examples from project management, design phases, project
phases, project definition, process definition, process modeling, deployment and
all the associated deliverables. Good examples were given to demonstrate what
needs to be considered to drive a successful implementation of an ODS.
Last but not least, Jon reviewed data quality issues and expectations for an ODS. These are much the same as what is seen throughout the enterprise, but he offered suggestions for when and where those issues may be caught and cleaned up, with a look at their impact on the tactical analysis performed on ODS data.
Workshop 3: Creating Competitive Advantage through Knowledge Management
Speaker: Robert Seiner, Publisher, TDAN and BI/DW Director & Principal, CIBER, Inc.
Summary by Margaret O’Hara
Using the example of a
grocery chain opening a new store, Bob Seiner stepped the audience through the
process of creating competitive advantage through knowledge management (KM).
After noting that this was the first workshop on Knowledge Management presented at a Meta-Data/DAMA-I conference, Seiner stated that there was a logical progression from managing data to managing information to managing knowledge.
Seiner first defined KM as
the discipline of spreading the knowledge of individuals and groups across the
organization in ways that directly affect performance. He emphasized that the spread
of knowledge was critical, as knowledge cannot be helpful unless it is shared.
The vision of KM is that the right information – in the correct format – gets to the right person, at the right time, for the right business purpose.
Seiner noted that approximately 250 MB of information is produced annually for every person on the planet, and that this amount is expected to increase. Thus, managing the knowledge is a daunting task for all organizations.
To set the stage, Seiner
offered the first of his many store-opening examples. As part of a KM project,
he interviewed employees from two recently opened stores in the grocery chain.
Employees in Store #1 reported significant problems with one aspect of the
opening – receiving deliveries. Store managers solved the problem and learned to manage the store’s deliveries. When interviewing employees in Store #2, he discovered
that they had experienced the same problem, but because the first store had not
shared its knowledge, Store #2 went through the same painful process of solving
the problem.
The first business impact
of Knowledge for Action is that information is provided 24x7 in a customizable
and detailed view to everyone who needs it. Thus, knowledge is recorded (i.e.,
becomes an artifact). Moreover, the knowledge is well managed and employees
learn from past decisions. There exists a sharing of best practices and
innovation. Most importantly, a KM program reduces the risk from attrition. To
sell KM initiatives to senior management, Seiner recommends you start small and
focus on investment rather than costs (i.e., on the payback of the project).
KM project planning should
start with executive business sponsorship and should involve a knowledge audit.
Audits employ both qualitative and quantitative assessments as well as a
readiness assessment. Scoping sessions are important: Seiner recommends starting
with “a slice of a slice of the pie”, and identifying the “most ready”
of all documented knowledge. Questions to ask within the organization during the
audit include:
Seiner also stressed that changing behavior was important to the process and suggested that accountability for knowledge become part of people’s jobs – in most cases being written into the job descriptions.
To develop the Enterprise
Knowledge Platform, Seiner suggests careful assessments, performing the
knowledge audit, planning for the short, intermediate and long-term, creating
the employee portal, and making the standard build vs. buy decision.
Although time ran short in
the presentation due to the number of questions and comments from the very
involved audience, Seiner did have time to stress that the knowledge portal was
not the only consideration in KM. While the portal’s functional design,
graphic capabilities and degree of personalization were important, the portal is
only a starting point web site where employees can enter, find and access
knowledge.
Workshop 4
Speaker: Peter Aiken, Institute for Data Research, Virginia Commonwealth University
Summary by Linda Kresl
Peter is a proponent of meta data management. He began the presentation by pointing out that meta data isn’t a very accurate term. Many IT and business managers don’t understand the importance of meta data, and many ask why they should do meta data at all. Meta data management is one of the most important activities within Data Resource Management. Another term for meta data is data resource data. Meta data is data describing business processes, both technical and business related.
Deriving a legacy architecture is a major reason to create meta data. Every system has an architecture, however poor or rich. Meta data is the language of the architecture; it is how we understand and articulate the architecture. Meta data describing system data can be considered multidimensional data. A lack of meta data is the primary reason for re-engineering failures.
A data model is an
excellent place to begin the process of meta data creation and definition. A
model depicts the data implementation, data design or system data requirements.
Meta data engineering and data re-engineering are inextricably linked. What is a
meta data data model? A data model that describes or characterizes system
components, not business data. Tools that reverse engineer meta data include SAS and Evoke (best used if the organization doesn’t have a data model). These tools have a built-in QA function.
Meta data Engineering:
As-is data implementation assets
- Reconstitute data design
- Recreate data implementation
As-is data design assets
- Recreate data design
As-is information requirements assets
- Reconstitute requirements
- Recreate requirements
- Redevelop requirements
To-be requirements assets
To-be data implementation assets
To-be design assets
Redesign data
A meta data model is the key to quickly implementing data conversions, understanding business processes, and gaining knowledge about packages (PeopleSoft, SAP, etc.).
TUTORIALS
Tutorial 1: Zen and the Art of Data Modeling
Speaker: Alec Sharp, Founder, Damex Consulting
Summary by Arnie Hook
Alec teaches an outline of, and guidelines for, good practices for arriving at a data model that satisfies business needs. He embeds humor to make a point and keep the audience involved with his inspirational messages. The analyst must be able and willing to do a variety of things in order to arrive at the appropriate data model.
Alec’s Messages: Design
the content to fit your needs. Extend and communicate the use of data
management. Communicate across the business and the objectives of the design
practices. Reverse engineering to the blank page. What is the direction?
Level set – agree on the basics. Consistency is key to success. There are many ways to describe a business. What the business needs information about: the data model. The data model is a non-technical description of the business, not a database. The model must be maintained at all levels.
Level set to the 3 types of
data model:
Do not violate the four
‘Ds’ of modeling:
Alec describes the
‘facilitated session’ process to analyze the business requirement. The
technique ensures consistency and scope to the objectives. Make an agenda and
schedule for each subject session. Participants need to understand their roles
and responsibilities (‘establish the behavior contract’) for each session.
Alec offers a ‘bus tour’ recipe for facilitating sessions toward a correct model.
The last step is to review
with ‘rhetorical context’. Know the audience, occasion, and purpose. Then
answer the data questions with a storyboard format.
Alec takes the attendees
through the course to practice modeling principles and techniques for each level
of analysis. The tutorial presented a great workshop for the novice or expert.
Even if you know it all, this tutorial should be on your list.
Tutorial 2: Applying Quality Principles to Data Definition and Data Modeling
Speaker: Larry English, President, INFORMATION IMPACT International
Summary by Margaret O’Hara
The premise of English’s presentation was that since information is the product of a process, Deming’s quality principles can be applied to develop Information Quality. English defines information quality as “consistently meeting knowledge worker and end-customer expectations through information and information services”. This involves quality of data definitions, data content and data presentation. English offered the following as an example of poor data quality:
Data Element: Payment Date
Definition: Date of Payment
As in this example, very often the stated definitions for data elements are too vague to be of much use to the organization. Does this date refer to the date the check was received, the date it was written, the date the monies were credited, or the date the transaction was entered?
The benefits of information are that work processes are transformed and that clerical workers are “informated” (i.e., they become knowledge workers). All too often, knowledge workers either use data for something other than what it was defined for or have no idea that anyone else in the organization is using the same data. An IQ initiative can help avoid these problems.
English proposes that we eliminate the word “user” from our vocabulary and instead describe those employees who use information in their jobs as knowledge workers.
English set forth several
quality principles. These involve a customer focus, process improvement,
scientific methods and management accountability. Most organizations do not hold
managers responsible for the information their departments generate. English
spent considerable time explaining Kaizen (the art of continuous improvement)
and its application to the Information Resource Management area.
Principle #1: Create a constancy of purpose for improvement of the information product and service. Since the obligation to the customer never ceases, the information quality ramifications are that the IRM mission and objectives are defined to include total quality for both its services and products, and that plans are developed with both long- and short-term deliverables that support strategic business objectives.
Principle #2: Adopt the new
philosophy of Quality Information Management that will transform both the
business and IS management. The quality information philosophy means reliable
information management and shared information to reduce costs.
English next focused on how to assure data definition quality. He believes that instead of data documentation we should engage in data definitions that would state precisely the meaning of words. He stressed that the definition should not be more difficult to understand than the word it defines. English also feels that we should avoid the term “meta data” except in technical forums. The Knowledge Worker (not the user!) will better understand the phrase Information Product Specification (IPS). An IPS is a detailed, exact statement of particulars. Among the goals of data definition are (1) to enhance communication, ensuring that transmitted information, thoughts and feelings are satisfactorily received and understood, and (2) to increase productivity.
English then presented the
concept of Total Quality data Management (TQdM), which will establish the
Information Quality Environment. He proposes that TQdM is not a program but
instead a value system and habit of continuous improvement of both application
and data development processes and business processes. English illustrated the
TQdM process using a data flow diagram. The steps in establishing the IQ
environment are:
Process | Output
Assess the data definition & IQ architecture quality | Data definition quality assessment
Assess Information Quality | Information Quality Assessment
Measure Non-Quality Information Costs | Information Value / Cost Analysis
Reengineer and Cleanse Data | Corrected Data
Improve Information Process Quality | Information Process Improvements
English discussed data
definition quality characteristics such as conformance to meaningful enterprise
standards, consistency of data names, and complete domain values with
definition. He also stressed the importance of data standards quality, including
such issues as enterprise wide guidelines, meaningful abbreviations and
complete, precise, non-overlapping class words. English illustrated the
importance of determining all definitions of a word with the business term
“volume”. He presented three diverse definitions of the word, each used by a
different business segment.
After giving several examples of data definitions and business rules that illustrated high and low quality, English had the audience assess a specific attribute definition using a Data Definition Quality / Usefulness Assessment Form. Working in small groups, the attendees assessed one attribute definition. This brief exercise generated much discussion, which demonstrates the complexity in achieving even one small part of information quality.
English then presented the
basics of Information Architecture (IA) quality and suggested guidelines for
achieving a high quality architecture. Such architectures are characterized by
completeness, stability, and flexibility. Moreover, these architectures can be
reused with a minimal degree of modification. “A well-defined architecture
supports tomorrow’s business needs as well as today’s”.
English then described the
TQdM process #5: Improving Information Process Quality by presenting the
Quality, Time, Money triangle. Essentially, maximizing any one of the three
points means the other two will suffer. Typically, an organization can achieve
two, but not three of the objectives.
Toward the end of the day, English provided metrics to measure information quality, stating that choosing the lowest-price alternative may result in the costliest action. He believes that organizations, instead of asking for a cost/benefit analysis of “shared” DBs and enterprise data modeling, should ask what the cost is of redundant applications as well as the cost of change requests to the original product specifications. He reminded the audience once again that Total Quality data Management is not a program; it is a value system, mind set and habit of continuous improvement.
Tutorial 3: Developing a High-Quality Data Resource to Support Information Needs
Speaker: Michael Brackett, Consulting Data Architect, Data Resource Design & Remodeling
Summary by Dale Kohlmoos
Michael Brackett gave a
full day presentation that addressed a lot of the commonly experienced
limitations of our current data resources. He discussed how we can turn those
limitations around for more refined data resources that could better meet
information demands.
He reviewed and discussed
current data situations, data resource concepts, resolving data disparity and
cultural considerations. The current data situation is that disparate data is a
truism. The result of this disparate data is the inability to integrate data to
meet the information demand. He described four basic data problems that are
commonly seen throughout most organizations:
The demand for integrated data to support business needs is high, yet disparate data continues to be produced at a rapidly increasing pace. Mr. Brackett described the current status quo as potentially leading the organization to failure due to information deprivation. An emphasis was placed on the notion that tools neither understand technology nor automate understanding; people are the key, and tools support people.
Mr. Brackett discussed the structure of the Business Intelligence Value Chain and noted that the data resource is the foundation of all the other structures. This is a sobering reminder that we all need to revisit every so often. Mr. Brackett also brought to mind the debate on whether the data resource is considered an asset or a resource.
Further discussion reviewed
data architecture and the corresponding position of the data resource within
that architecture. From within that architectural perspective, Mr. Brackett
identified ways and means to both halt and resolve existing data disparity. From
there, the session delved into detailed examples, principles, and practices for
refining data definitions, data structures, data integrity, data documentation,
data orientation, data availability, data responsibility, data vision, and data
recognition.
The next step was to discuss the data resource transition and how to implement better practices, along with the cultural considerations that would have to be addressed to make it happen.
Mr. Brackett concluded his presentation by demonstrating that there is no “silver-bullet.” He stressed that the techniques are available and that it is time to develop a high-quality data resource that can meet the information demands of each organization.
Tutorial 4
Speaker: John Zachman, President, Zachman International
No summary is available for this tutorial
Tutorial 5: Building and Managing the Meta Data Repository
Speaker: David Marco, President, Enterprise Warehouse Solutions
Summary by Carey Clark
David Marco’s presentation was aimed at those new to the meta data imperative and included sections on basic meta data terms, definitions, concepts and justifications. He also makes the case for treating repository creation as a project and for using formal project management methods. The presentation is drawn from David’s book by the same title.
It is important to relate and document the business benefit of the repository. This benefit is usually to increase revenue or reduce costs. Repositories need to be built iteratively with value added at intermediate stages.
David likes to put data quality in the repository rather than in the data warehouse because more people can get to it and it can be related to more systems.
He estimates that 35% of IT budgets are spent on integration. His experience is that a company’s data will double every 4 years. Hence the need to manage this data is critical to effective growth.
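To put the doubling estimate in more familiar terms, the arithmetic can be sketched as below. The function names and the 10 TB starting volume are assumptions chosen only for illustration; the 4-year doubling period is the figure quoted in the session.

```python
def projected_volume(current_volume: float, years: float,
                     doubling_period_years: float = 4.0) -> float:
    """Volume after `years` if data doubles every `doubling_period_years`."""
    return current_volume * 2 ** (years / doubling_period_years)


def implied_annual_growth(doubling_period_years: float = 4.0) -> float:
    """Constant yearly growth rate that produces the given doubling period."""
    return 2 ** (1 / doubling_period_years) - 1


# Hypothetical starting point of 10 TB.
for year in (0, 4, 8, 12):
    print(year, projected_volume(10.0, year))

print(f"{implied_annual_growth():.1%}")
```

Doubling every 4 years works out to compound growth of roughly 19% per year, so a repository sized for today's volume would face about eight times that volume after 12 years.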
He separated meta data into business related and technical related areas. Most of what one audience needs to see, the other audience doesn’t.
A lot of his projects are aimed at data warehouse construction. They deal mostly with extraction, transformation, load (ETL) activities rather than business names, definitions and their maintenance.
His list of MUSTS includes:
David uses a classic decision matrix for determining the best tool. Each requirement has an importance and a complexity (= cost). Each tool is then matched against this matrix.
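A decision matrix of this kind can be sketched in a few lines. Everything in the example is hypothetical: the requirement names, the 1 to 5 scales, the tool names and their fit scores are made up, and the two scoring formulas are one common way to use such a matrix, not necessarily the method David presented.

```python
# Each requirement carries an importance and a complexity (= cost),
# both on an assumed 1-5 scale.
requirements = {
    "business definitions": {"importance": 5, "complexity": 2},
    "ETL lineage capture": {"importance": 4, "complexity": 4},
    "web-based access": {"importance": 3, "complexity": 1},
}

# How well each candidate tool meets each requirement (assumed 0-5 scale).
tool_fit = {
    "Tool A": {"business definitions": 3, "ETL lineage capture": 5, "web-based access": 2},
    "Tool B": {"business definitions": 5, "ETL lineage capture": 2, "web-based access": 5},
}

MAX_FIT = 5


def weighted_score(fit: dict) -> int:
    # Benefit side: importance-weighted sum of the fit scores.
    return sum(req["importance"] * fit[name] for name, req in requirements.items())


def gap_cost(fit: dict) -> int:
    # Cost side: complexity-weighted shortfall, a rough proxy for the
    # effort needed to build what the tool does not provide.
    return sum(req["complexity"] * (MAX_FIT - fit[name]) for name, req in requirements.items())


scores = {tool: weighted_score(fit) for tool, fit in tool_fit.items()}
costs = {tool: gap_cost(fit) for tool, fit in tool_fit.items()}
best = max(scores, key=scores.get)
print(scores, costs, best)
```

With these made-up numbers the tool with the higher weighted score also has the larger gap cost, which is exactly the kind of trade-off the matrix is meant to make visible before a tool is chosen.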
Tutorial 6
Speakers: Debbi Walsh, Technical Director, & Hal Davis, Consultant, XMLSolutions
Summary by David Plotkin
Introduction and Business Case
The tutorial began with a
brief introduction of what XML is, including an intuitive diagramming technique
for showing how XML labels data – giving it more meaning and making it more
understandable than a simple flat file. The design goals of XML were reviewed,
giving us a good idea of the reasons why we might want to introduce XML into our
organizations. As part of this justification, a series of business scenarios
were presented, and in each case the advantages that XML provides were made
clear.
Documents and Structure
The tutorial continued with
the definition of the rules for creating a "well-formed" XML document,
including the single root element, proper element nesting, quoting of attribute
values, and the naming conventions.
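The well-formedness rules listed above can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample documents are assumptions made for illustration and were not part of the tutorial.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "single root, nested, quoted": '<order id="1"><item>Widget</item></order>',
    "two root elements": "<order/><invoice/>",
    "improper nesting": "<order><item></order></item>",
    "unquoted attribute value": "<order id=1></order>",
}

for label, doc in samples.items():
    print(label, is_well_formed(doc))
```

Only the first sample passes; each of the others breaks one of the rules named in the tutorial. Note that this checks well-formedness only, not validity against a DTD or XML Schema.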
Validation of an XML
document can take place – either via the well-accepted DTDs, or the newer, and
more powerful XML Schemas. The syntax for defining DTDs was discussed, including
the details of processing instructions, the XML declaration, elements,
attributes, and comments. The different types of elements were covered, such as
text, empty, mixed, and element (a content model that consists of sub elements).
The different types of attributes were also covered, as well as how to declare
optionality and cardinality. Namespaces (for reusing element names) were covered
with examples. XML Schemas were discussed in significant detail, including
simple and complex data types, and declaring your own data types. In addition,
the reuse aspects of XML Schemas (one of the primary advantages of XML Schemas)
were shown.
After discussing how to
build validation documents, the details of connecting a DTD to an XML document
for use by a validating parser were covered. In addition, general and parameter
entities (both internal and external) were covered with an excellent and concise
chart.
RDF
The presenters covered RDF,
although it was somewhat difficult to see the application of RDF in the context
of XML. There are some similarities, but not strong ones.
Transformations
One of the most useful
parts of the whole presentation was the section on transformations. Using XSLT
(.xsl), Hal put on a demonstration of displaying an XML document using a style
sheet in XML, and showed how the entire "look" of the document could
be changed by changing the associated style sheet. He also demonstrated how the
XML document could be converted into another form – be it another XML
document, a plain text file, or whatever. The presentation covered the exact
flow of how the XML content was converted, including using a parser, and even
included a brief rundown of some of the more common XSLT commands. He also
covered XSL (.fo) for applying formatting to convert the output of XSLT to PDF,
HTML, or printer output.
The parser uses either DOM
or SAX, and Hal covered the advantages and disadvantages of both types of
parser. DOM needs more memory and is not as quick as SAX, but since it maintains
the "tree" in memory, it is possible for the program using the parser
output to navigate the nodes of the tree more freely.
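The trade-off can be seen in a small sketch using Python's standard library parsers (`xml.dom.minidom` for DOM, `xml.sax` for SAX). The sample document and the handler class are invented for this example and were not part of Hal's demonstration.

```python
import xml.sax
from xml.dom import minidom

DOC = "<orders><order id='1'>widget</order><order id='2'>gadget</order></orders>"

# DOM: the whole tree is held in memory, so the program can jump to any node.
dom = minidom.parseString(DOC)
orders = dom.getElementsByTagName("order")
last_id = orders[-1].getAttribute("id")     # go straight to the last order
first_text = orders[0].firstChild.data      # then back to the first one


# SAX: the parser streams events; the handler sees each element once, in order,
# and must keep any state it needs itself.
class OrderIdCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        if name == "order":
            self.ids.append(attrs.getValue("id"))


handler = OrderIdCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
```

The DOM code can revisit nodes in any order because the tree stays in memory; the SAX handler never holds more than the current event plus whatever it chose to remember.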
Resources
XML has a considerable
number of resources available – standards, products, and information on the
internet. Hal and Debbi briefly covered these topics, and provided a CD that
contained all of the XML standards being considered. They were less thorough
with the editors, databases, transformation tools, and servers that are
available today, merely stating that there were plenty of choices.
Data Management/Schema
Design
The last two sections
briefly covered two topics of considerable importance to Data Administrators
getting involved with XML. The first is the set of challenges that we face in managing
these new flavors of schema, and this whole new environment. They provided some
recommendations on managing names, accuracy and descriptiveness, and modularity
and reuse. There ARE industry standards emerging, and where possible, it is a
good idea to try to use the common schemas for an industry. Finally, they
covered what you should be concerned with when trying to manage your schemas
centrally, including the ability to browse, do impact analysis, impose good
design practices, dynamically access and generate schemas, and import and export
schemas from various sources.
Summary
by Ron Klein
Sridhar’s
insights and knowledge contributed greatly to our awareness of what is coming in
the standards area. His tutorial presentation included discussion of various
evolving OMG (Object Management Group) standards, models and protocols, such as:
CWM – Common Warehouse Metamodel
UML – Unified Modeling Language
XMI – XML Meta data Interchange
MOF – Meta Object Facility
Much of the
discussion was driven by questions from the audience, hence this summary draws
substantially on those questions.
Quick history of OMG: founded 1989, now more than 800 vendors.
1991 - CORBA 1.0
1995 - CORBA 2.0
1997 - MOF and UML
1999 - XMI and CORBA Components
2000 - CWM, XML.Value, EDOC (Enterprise Distributed Object Computing), XMI for XML Schema
2001 - UML for EDOC, UML 2.0, Better XML and E-Business integration
OMG is
broadening the scope of technologies moving to Model Driven Architecture. It is
targeting middleware technologies in the data management and application
development realms.
The Meta Data
Coalition (MDC) merged into OMG during 2000. CWM became the common standard last
June (2000) and had a revision published last week (Feb 26, 2001) based on
vendor experiences.
The Data Integration Problem
- Emerging XML issues include new XML data types, integrating XML with
middleware technologies and into core database technologies.
- The Internet is driving us from small to large databases.
- The transformation of information from one technology to another leads us
to CWM as a solution.
What is needed to solve the Integration Problem?
- Meta data becomes more and more important.
- Moving to XML APIs.
- New APIs such as JMI, JOLAP, JDATAMINING
- SOAP development: marries HTTP and XML
E-‘Muddleware’
Architect’s Dilemma
- What is the data exchange protocol?
- Ignore the middleware when you are doing Design and Analysis, use Mapping
techniques.
- Integration at higher level is as important as in lower levels.
- XML won’t solve all the problems! Others will not go away.
AUDIENCE
QUESTION (Q): What is your definition of components?
INSTRUCTOR
ANSWER (A): Pieces of a program with interfaces that have been captured.
SPE –
Software Process Engineering: Best practices forming Objects for life cycle –
an extension of UML. (IBM, Rational Rose and others)
Q: What about
Open Process?
A: Not involved
with SPE but it is with UML 2.0.
Q: Data
Structure?
A: There is no
model that fits it all! UML can define what the data structures are. It
addresses the static part of it. If you can represent your legacy in UML then
you can use XML.
Q: What about
Workflow Management?
A: Activity
diagram is included in UML, State Machines.
Q: What about
Batch File Model?
A: Look at CWM
model, it is more focused on extraction and transformation. UML is weak here.
Model the Data,
Model the Application, and Model the Interface
Every three
years comes a new protocol. The guts of business rules change very slowly
because they are abstract concepts of the business. It is fundamental to focus
on your business.
Enterprise
Portals are in a rudimentary stage now. The elements are already in CWM.
Integration technology brings process, content, application. It is not Data or
Process or Presentation integration but all of them.
Work together with common shared metamodels
- There is more & more meta data lurking everywhere!
- There are specific meta data to manage the DW in CWM. They make meta data
clearer and easier to use and represent.
IDA – Enterprise Modeling from OMG
OMG Modeling
and Meta data Framework
Modeling
Concepts:
- Platform
Independent Model (PIM)
- Platform
Specific Model (PSM) Meta data technology
- Mappings from
Independent Model to Platform Specific Model
1) Create concrete mapping from neutral to specific through data model and rules
2) UML profiles: AD going from neutral to specific (UML -> C++, Smalltalk, Java)
Q: Which one is
the META META model?
A: MOF Meta
Meta Model, it is a subset of UML.
Q: What about
legacy ER with UML -> Use Case?
A: When you
deal with data a bridge is needed. Work is ongoing to map UML and ER; CA,
Rational, and Sybase are supporting it. You need to make decisions to map models. There
is a mismatch. CWM includes UML, ER and Transformation Model. The heart is MOF.
Q: What
tool are you using to generate XMI and IDL?
A:
Rational Rose
Roles of UML in CWM
CWM 1.0
Overview {02/2001} Common Warehouse metamodel
Q: Where do I
see security?
A: It is part
of the systems management.
Q: Is this the
persistent metamodel?
A: Yes
Q: Notation,
classes becoming stereotype in UML?
A: Yes.
|
Tutorial 8 |
Speaker |
|
Data Architectures for Scalable E-Commerce |
Michael Stonebraker, Chief Technology Officer, Cohera Corporation |
Summary by Linda Kresl
In this full-day tutorial
Dr. Stonebraker predicts that the US will lead B2B eCommerce. Major B2B players
are Ariba, CommerceOne, Oracle, SAP, IBM. He covered data architecture designed
for eCommerce, B2C/B; its inception, types of products (Portals, DBMS,
protocols, components, N-tier architectures) and the standards associated with
eCommerce.
A B2C application example
is a query catalog of items for sale. B2C players are Broadvision, and
Openmarket. The interface is usually to a fulfillment system. Gizmos like Palm
Pilots and cell phones will be major players in the future.
Any web architecture should
be designed using components. The component protocols should be built using
JavaBeans – a safe bet for general-purpose applications. Don’t build your
components in ActiveX; it is not supported by any non-MS OS.
Another choice is XML as a
component protocol. XML is also a messaging system – XML will soon be
ubiquitous even on gizmos. XML goes through firewalls and it’s easy to parse.
XML is a safe bet for low performance applications; use it only for small and slow
applications. XML isn’t a good idea for large amounts of data because the
meta data is coupled with the data. He favors Java for a web language. C++ will
be used for complex applications. He favors the following scripting languages:
JavaScript and XSL. These products are ODBC compliant and talk to the DBMS. Michael
suggests that you stay with ODBC to move from database to database.
Components can run in 3
areas:
1. Thick client – on a browser – screen intensive logic should run as
close to the screen as possible
2. Thick middle – applications that are in between should run in
middleware
3. Thick database (an OR DBMS) – data mining should be run as close to the
database as possible. Logic in the DB is always faster!!! Move the code to the
engine!
Michael states that the
obvious goal is Universal components. Write the component once and reuse it at
any level. The industry is nowhere near universal components. Java Beans are the
closest component at this point in the game.
How should we interface to
legacy systems? We can use two approaches, an EAI system or a messaging system.
Please use your favorite EAI system. An EAI helps you package up a message,
transport it over the network, and have the recipient unpack it and understand it. The
top EAI packages are: MQ Series (IBM), Webmethods, Vitria, CommerceQuest,
CrossWorlds, Mercator.
Content Management deals with
locally authored information in rich content (text and images) with little if any
structure. This data is fairly static. This data may also be
purchased. There are two solutions for managing content.
1. Store content in HTML/XML via a file system (don’t grow your own).
Packages are Plumtree, Viador, Interwoven, Vignette
2. Object-Relational DBMS – use this if you have an enormous amount of
content; these are scalable.
The Web changes data
warehousing with a new set of data – clickstream analysis (CSA) – every time
a user clicks to a new page – this is stored. This data source is outside the
enterprise. Now, this data is outside the firewall. CSA looks exactly like
traditional data warehousing. Web site scraping is used to get data from web
sites. This is a way to get the data if the enterprise doesn’t own the data.
One of the weaknesses of DW is that data is stale by ½ the refresh interval,
and the scalability issue is great. Trends in this space include automatic data
mining; in addition, federators should get traction, and visualization systems
will get traction to complement data cubes.
Michael suggests the
following to improve web design.
1. Plan for short design cycles - web cycle time appears about 3 months and
the rapid prototyping mentality is really required.
2. Scalability is key. Test a design for scale before it goes live. Make
sure that you hire serious system software expertise. Availability is a must.
Replicate your data and make sure to turn RAID on.
3. Do only what you are good at. Figure out your core competency and
out-source everything else.
4. Do everything only once. This means run one ETL system, one EAI, one
Federator, etc.
5. Fewer islands of information. Use fewer system administrators, less
training, fewer manuals, etc. Converge federator and EAI and converge app server
into OR DBMS
6. Use XML appropriately – use as a transport protocol not a storage
format
CONFERENCE
SESSIONS
|
KEYNOTE |
Speaker |
|
|
Tom DeMarco, Principal, Atlantic Systems Guild |
Summary by Linda Kresl
Tom kicked off the
Meta-Data DAMA conference with flair. He said the systems we build today are characterized by: more stakeholders,
conflict, shorter schedules, tighter budgets, more visibility, and risk. And
modern day systems are harder because we built all the easy ones years ago.
The
major point that Tom is making in this presentation is that we need to introduce
“slack” in our work environment. His definition of slack is the degree of
freedom (in time and budget and manpower and space, etc.) necessary to make
change possible.
What is a quality focus
today? Most of our quality programs focus on defects. How do we live with the
fact that many of our products are chock full of defects? Take, for example,
Microsoft’s IE. Does the software transform your world? If so, the fact that it has
defects is of no consequence; it transforms the way a person does one particular
thing.
We must consider human
capital as the most important asset of an agile organization. The agility
principle is based on prioritization. Tom’s view on priority is a great
departure from the norm. He suggests rank order priorities and putting projects
on hold when their priority doesn’t justify doing them yet.
Tom’s Prescription for a
new era
· Become less “efficient”
· Lighten process (strive for light process and heavy skills)
· Learn to Prioritize
· Choose your projects very wisely; what you decide not to build is more
important than how you build
· Invest in human capital
People must spend time
thinking today. Tom spent one whole summer just thinking. Don’t spend all your
efforts strategizing. Put some slack back into your life. Put
some slack back into your organization.
|
Conference Session |
Presenter |
|
|
Andres Perez, Enterprise Data Architect, USAA |
Summary
by Linda Kresl
Andres is chartered with
bringing more rigor to USAA’s data architecture. USAA is an automobile
insurance agency that prides itself on serving its members with superior
information. Andres hopes that what he shares today will be something that you
can take home with you and use in your own organizations.
He discussed the fact that
they have a large IMS legacy system. It is extremely difficult to do data mining
with data in this format. Much of the data isn’t defined correctly and it is
conflicting. Semantic problems are those in which data attributes don’t match
up from different reports. Also the data is constrained to a given channel. The
web may help alleviate this problem.
Most of the applications at
USAA have more interfaces than users. One application alone has 4,500
interfaces. There are several translations that must take place for any single
application to run. This has created a fur ball of data! Andres states that 50%
of the total IT budget is spent maintaining the interfaces.
The single reason that data
is inconsistent is what the individuals believe their business processes are.
Every individual truly believes they are doing the right thing, when in reality
they are not doing what is best for the business. Because of the focus on
projects and not the enterprise, USAA has redundant data.
USAA’s data architecture
is based on the Zachman Framework. USAA still has many obstacles to understanding
its customers’ needs. By implementing the Zachman framework they hope to
understand and relate relevant data. Andres is proposing a common data model and
definition. He also proposes a reference guide to manage and control meta data.
The desired data
architecture for USAA is creating data structures that are subject oriented and
in canonical form. Once the data is moved to subject areas Andres proposes
creating data marts based on these subject areas.
|
Conference Session |
Speaker |
|
Data Quality as a Profit Center |
Wendy Wood, Data Quality Analyst, SBC Services |
Summary
by Margaret O’Hara
Wood began her presentation
with a comment about “dirty data” being a renewable resource, and thus
offering her job security. She then explained the mission of her department at
her firm: data quality. She discussed how high data quality can help the company
achieve its goals of faster and better market response, improved business
flowthrough and customer delight.
Wood believes that the main
questions to ask in your company are: (1) Are you getting the data you’re
expecting, and (2) what is it worth to you and your company? Wood believes that
customer addresses are a good place to start a data quality initiative because
most firms have address data, and many areas of the company have problems with
it. At PacBell, customer addresses are a major issue. This is because
a single customer may have up to three addresses: the service address, the
billing address, and the listing address. While the company can handle
“less-than-perfect” addresses for service (e.g., the second house behind the
gas station on the corner), the Post Office cannot. More importantly, discounts
available for complete addresses were threatened.
To correct the problem,
Wood found users who cared about the data. She stressed that data quality was
not something that IT could achieve by itself; a user-sponsor was critical. She
urged the audience not to take such projects on themselves – to be sure there
is buy-in from the business. She then briefly stepped the audience through the
cleansing process. First, take the highest level “one” table – country is
a good example as it has few values. Examine and correct the data in that table,
then move down to state, then to city, etc. She cautioned that the lowest level
tables are the ones with data quality issues that are the hardest to identify.
She also advised looking at small samples (perhaps 10% of the data) before
undertaking the project. At the very least such an examination will allow the
firm to learn more about its data.
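The top-down order Wood described can be illustrated with a short Python sketch. Everything in it is hypothetical (the reference tables, the sample addresses, and the function name); it only shows the idea of checking the highest-level table first and descending only where the parent value is already trusted.

```python
# Hypothetical reference data, cleaned level by level starting from the top.
VALID_COUNTRIES = {"US"}
VALID_STATES = {"US": {"CA", "TX"}}

# Hypothetical sample of address rows.
addresses = [
    {"id": 1, "country": "US", "state": "CA", "city": "Anaheim"},
    {"id": 2, "country": "USA", "state": "CA", "city": "San Ramon"},
    {"id": 3, "country": "US", "state": "Calif", "city": "Oakland"},
]


def first_failing_level(rows):
    """Map row id -> highest level of the hierarchy at which the row fails."""
    problems = {}
    for row in rows:
        if row["country"] not in VALID_COUNTRIES:
            # Do not judge the state until the country is fixed.
            problems[row["id"]] = "country"
        elif row["state"] not in VALID_STATES[row["country"]]:
            problems[row["id"]] = "state"
    return problems


problems = first_failing_level(addresses)
```

Here row 2 is flagged at the country level and row 3 at the state level; a city check would follow the same pattern once states are clean.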
|
Conference Session |
Speaker |
|
Introduction to the Unified Modeling Language |
Eric Naiburg, Rational Software |
Summary
by Anne Marie Smith
Eric Naiburg, (presenting
for Terry Quatrani who was unable to attend due to weather), introduced the
concepts of the Unified Modeling Language, how it can be used, and some examples
of UML in modeling.
History of UML: created by
Booch, Rumbaugh and Jacobson – all were working on methodologies / languages
for visualizing, specifying, constructing and documenting the artifacts of a
software system. These methodologies were synthesized with the assistance of
Rational Software Corp., and have evolved into a unified format, notation and
language designed for modeling applications and data.
Eric explained the various
diagrams in the UML:
Activity
Diagrams: show flow of control in a system, from start to finish. It represents
processes (activities) and the order in which each occurs. This activity diagram
can be used to illustrate the data entities needed, as the basis for database
design.
Use
Case Diagrams: Use cases and actors are the 2 components of a use case diagram.
An actor is someone or something that must interact with the system to perform
an action. A Use Case is a pattern of behavior that the system can exhibit. Each
use case is a sequence of related transactions performed for an activity,
involving one or more than one actor. Use cases are a high level requirements
gathering and documentation method, and are essential to an object-oriented
system development.
Sequence
Diagrams: Display object interactions in the order in which they will be
performed.
Collaboration
Diagrams: Displays object interactions organized around objects and their links
to one another.
Class
Diagrams: Shows the existence of a class and its relationships in the logical
view of a system. Classes are collections of objects with a common structure,
common behavior and common relationships. Eric explained the concepts of
association, aggregation, dependency and inheritance relationships in classes.
Eric mentioned the similarity between entities and classes, to demonstrate the
commonality between ER modeling and UML modeling. He showed the essential nature
of “classes” in object-orientation, and the modeling of classes and
relationships within the UML.
State Diagrams:
Show the life history of an application, and are similar to an activity diagram
at a point in time. This diagram type is not used as frequently as activity
diagrams or sequence diagrams for application development. They are more
frequently used for networking implementation.
Component
Diagrams: Shows the physical implementation of a class and its actions (DLL,
programs, interfaces). Deployment diagrams represent the processor and devices
used in implementing a system.
Eric concluded by
explaining some of the extensions to the UML that are frequently used, and
discussed how to bring UML and its concepts into the “data world”. He cited
the universality of the UML in business modeling, requirements modeling and
application development. He encouraged attendees to learn more about UML and to
apply its concepts and techniques in their data activities.
Questions for Eric were
mostly technical and documentation-oriented, and showed the high level of
interest in the UML and its place in data management.
|
Conference Session |
Speaker |
|
Implementation/Use of Operational Meta Data to Improve Data Quality in the Data Warehouse |
Mike Jennings, Architect
and Manager, Hewitt Associates LLC |
Summary
by Ron Klein
Mike Jennings discussed the
Meta Data Repository (MDR) and the Data Warehouse. He assumes that the MDR
should be independent of both ETL tool selection, and of the Dimensional
Modeling technique used.
The purposes of the
Repository in the BIE (Business Intelligence Environment) are:
- The repository product and its data model allow the various
function areas in the data warehouse environment to communicate
- To provide context to the data content, processes and reports
- Central hub of the data warehouse environment
- Allow project teams to focus on the operational source system
and data warehouse data models, not the repository
- Provide a single location for integration between the operational source systems, data warehouse, ETL processes, business views, reports, and operational statistics
Mike presented a Generic Meta Data Repository Model (see slide #8 in the speaker’s materials on the conference CD-Rom). He reviewed the various types of business meta data (e.g. Business terms and definitions for tables and columns, subject area names, query and report definitions, report mappings) and technical meta data (Physical table and column names, Data mapping and transformation logic, Source systems, Foreign keys and indexes, Security, ETL process names).
Operational meta data is an extension of the design and architecture of the data warehouse
that provides processing optimizations in data acquisition design, maintenance
activities, end user reconciliation and auditing of information. It
provides an extra bridge
between the meta data repository and the data warehouse through addition of
physical columns in the design for ease of use, both technical and business. Operational
meta data use will require additional ETL processing steps and time. If a meta
data repository cannot be extended for operational meta data or is not
available, lookup tables can be used as an alternative in the warehouse model. Operational
meta data provide a detailed, micro level explanation of the information
content in the data warehouse. The key distinction of this method is the direct
association of meta data to each row of information in the data warehouse, which
allows for detailed (row level) explanation of information content, versus a
repository (table/column level).
Transforming the Logical Data Model into the Data Warehouse Data
Model
There are
eight (8) basic Inmon transformation rules to be applied to the Logical Data
Model in order to convert it into a Data Warehouse Data Model. These
transformation rules should typically be applied in sequence. Mike’s own
“modified” version of these rules is:
1. Removal of purely operational data
2. Addition of an element of time to the key structure and operational meta data
3. Addition of derived data
4. Transformation of data relationships into artifacts
5. Accommodations of different levels of granularity
6. Merging like data from different tables
7. Creation of arrays of data
8. Separation of data attributes based on their stability
Operational
Meta Data Examples
There can be
various technical meta data columns (tags) utilized in the data warehouse data
model and ETL processes for enhanced automated support.
- Load Cycle Identifier
- Current Flag Indicator
- Load Date
- Update Date
- Operational System(s) Identifier
- Active in Operational System Flag
- Confidence Level Indicator
- Cyclic Redundancy Check (CRC)
These columns are added during transformation of the Business Logical model into the Dimensional or Data Warehouse data model. Use of certain operational meta data depends on the type of table in question (e.g., Update date on a fact table would result in little value since these tables are not typically updated in a standard warehouse). Mike discussed an example of a strategy for operational meta data use for slowly changing dimensions (SCD). This can be reviewed in his paper on the conference CD.
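To make the idea of row-level tags concrete, here is a minimal Python sketch of an ETL step that adds several of the columns listed above to a row. It is not taken from Mike's paper: the column names, the `tag_row` helper, the sample row, and the choice of `zlib.crc32` over the sorted business columns are all assumptions for illustration.

```python
import zlib
from datetime import date


def tag_row(row, load_cycle_id, source_system_id, load_date):
    """Return a copy of a warehouse row with operational meta data columns added."""
    # CRC over the business columns in a fixed (sorted) order, so a later load
    # can compare CRCs instead of comparing every column.
    payload = "|".join(str(row[key]) for key in sorted(row))
    tagged = dict(row)
    tagged.update({
        "load_cycle_id": load_cycle_id,
        "load_date": load_date,
        "source_system_id": source_system_id,
        "current_flag": "Y",
        "crc": zlib.crc32(payload.encode("utf-8")),
    })
    return tagged


row = {"customer_id": 17, "name": "Acme", "city": "Anaheim"}
first_load = tag_row(row, 42, "BILLING", date(2001, 3, 5))
reload_same = tag_row(row, 43, "BILLING", date(2001, 3, 6))
```

An unchanged source row produces the same CRC on the next load cycle, so the ETL process can skip it; a differing CRC signals that the row needs to be versioned or updated.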
|
Conference Session |
Speaker |
|
|
David Hay, President, Essential Strategies |
Summary
by Carey Clark
David Hay creates the most
readable data models in the world (in this author’s humble opinion). In this
presentation he presents over 30 logical models and meta models covering all
aspects of the information systems development process itself. Models presented
describe the entities and relationships of the artifacts created during
analysis, design, and programming. He also showed models for data
transformations, business rules, screen design, and object oriented programming.
Doing this not only provides a basis for storing the relevant meta data that
would reside in a repository, but also goes a long way in helping us to
understand what we ourselves do.
David avoids the term
“meta data” in reference to repositories. He thinks it’s too restricted.
Instead he defers to Michael Brackett’s designation, “The Data Resource
Repository”.
He reviewed historical
efforts to create a repository and provided his assessment of their success. The
OIM and OMG versions he felt were too abstract. They hold lots of stuff but not
the stuff a typical data modeler would recognize. Oracle Designer is promising.
TDAN and Aera Energy were potentially workable. But he decided to have a go at
it himself.
He plugged the TDAN
newsletter at www.TDAN.com as required reading. His own three articles on his
Repository Models are there as well. He started simple and progressed with more,
and more elaborate, repository meta models. All are worth studying and I
recommend viewing them.
He contends that UML is
only a data modeling notation and that there is nothing fundamentally different
from other notations. It does some things okay but is not easy to read. He
therefore defers to the ER (crows feet) notation instead. UML also tends to
focus the modeler on the application (physical) rather than on the business
(logical).
Dave explained using the
models how certain issues were handled. For example there is the need to have a
way to describe elements that initially may be populated but eventually must
be populated. Most tools make you decide one way or the other up front. He
includes derived data in his model. Whether that data is derived when viewed or
stored is an implementation decision. The logical model is the same.
The problem with most meta
models is that they are too abstract for anyone but data modelers. In order to
make models readable to the user community he added the concept of “virtual
entities” that derive from the abstract one. Thus one can display the entity Customer
in a model view, even though Customer is really the Role of a Party
(where Party is a Person or Organization).
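The "virtual entity" idea can be illustrated with a small Python sketch. This is an interpretation for illustration only, not David's model: the class names, fields, and sample data are hypothetical. The abstract structure stores only Party and the roles it plays, and a user-facing entity such as Customer is derived from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    party_id: int
    name: str
    kind: str  # "Person" or "Organization"


@dataclass(frozen=True)
class PartyRole:
    party_id: int
    role: str


# Hypothetical abstract-model data: no Customer table exists anywhere.
parties = [Party(1, "Acme Corp", "Organization"), Party(2, "Pat Lee", "Person")]
roles = [PartyRole(1, "Customer"), PartyRole(2, "Customer"), PartyRole(2, "Employee")]


def virtual_entity(role_name):
    """Derive a user-facing entity as the set of Parties playing the given role."""
    ids = {r.party_id for r in roles if r.role == role_name}
    return [p for p in parties if p.party_id in ids]


customers = virtual_entity("Customer")
employees = virtual_entity("Employee")
```

Users see "Customer" and "Employee" as if they were entities in their own right, while the underlying model keeps the more general Party and Role structure.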
He believes that use
cases are awkward because they assume you understand the process you’re
modeling. They are essentially context level data flow diagrams but lack some of
the formality and rigor.
Dave is currently working
on a business rules meta model with the Business Rules Group. This group is sort
of a replacement for Guide. Check it out at businessrulesgroup.org.
Not everything about a business belongs in a Repository. He doesn’t claim his models cover every possible modeling subject. For example, work flow models, events, policies might be better stored in their own data store. In none of his models does one see foreign keys. They are a mechanism for implementing relationships. At the logical level they are implied by the relationship link. Putting them in the model is redundant.
His models are particularly
readable and elegant. He uses Oracle Designer, which allows subtypes to be nested
and entities to be stretched so that relationship lines rarely overlap and
never bend. The bad news is that it’s expensive.
|
Conference Session |
Speaker |
|
|
Alan Perkins, Vice President, Visible Systems |
Summary
by David Plotkin
This presentation
introduced the basics of XML, including the fact that it is content-based, not
presentation-based. It also identified what tags are used for, and briefly
discussed Elements, Attributes, and Entities, with examples.
The main point of the talk
is that XML Without Fear is based on documenting Enterprise Meta data in the
form of business rules. The types of business rules were listed, including
definitions, data integrity constraints, derivations, inferences, processing
sequences, and relationships among facts. The presentation discussed the
advantages of managing business rules, and the characteristics of a
"good" business rule.
The bulk of the
presentation discussed modeling of business rules. In general, constraint-type
business rules and derivations cannot be modeled in a "standard" data
modeling tool. However, using Visible Systems' tool, Alan demonstrated how data
modeling could be extended to model these types of "impossible to
model" business rules.
|
Conference Session |
Speaker |
|
Data
Management Support for Enterprise Architecture |
Brett Champlin, Architecture Consultant, Allstate Insurance Company |
Summary by Linda Kresl
This presentation offered
valuable insights on how your company can manage the data for your enterprise
architecture. Brett’s examples from Allstate Insurance give practical
suggestions to handle this difficult task. The key is to manage the models that
support the architecture, but an Enterprise Architecture is much more than just
models. Enterprise architecture is models, principles, and standards. It
includes data and process modeling and application and technologies
architecture.
In this presentation Brett
explained architecture definitions. His first definition was an engineering
definition of architecture – the art and science of building. And the purpose
of architecture is to convey a design. Information systems architecture is the
blueprints, drawings and models, which define and describe what is needed.
Brett presented many
schematics and diagrams to show different architectural frameworks, e.g.
Zachman, Gorman’s Knowledge Worker, Framework for 3-tier C/S development.
Brett compared Enterprise architecture to city planning, comparing the buildings
in a city to systems in an enterprise. The most important element is the
infrastructure – what is underneath supporting the buildings and systems.
Data management support
includes defining the processes, choosing a framework, and integrating the EA
with key business processes. Brett mentioned several tools to help manage
the EA. These tools include: Corporate Modeler by CASEwise, Metis by NCR, and
Architect by ZTI.
|
Conference Session |
Speaker |
|
Business Rule Specification, Validation & Transformation: Advanced Aspects |
Terry Halpin, Technical Lead in Database Design, Microsoft |
Summary
by Margaret O’Hara
Halpin began his
presentation by asking the audience how many used data use cases and object-role
modeling (ORM) in their work. About one third of the audience had used them.
Halpin’s basic premise in the presentation was that data use cases and ORM
are:
- more understandable because they state facts and rules in English and/or intuitive graphics
- more reliable because they validate rules using English and sample populations
- more expressive because they capture more business rules graphically
- more stable because they minimize the impact of change in models.
Halpin used the example of
birth date. Instead of stating that a person has a birth date, with ORM this
becomes, “I was born on ____” -- a much more natural way for the user to
state the date. For the remainder of the presentation, Halpin presented ORM
examples.
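As a rough illustration of the idea (not Halpin's notation or any ORM tool's API), the sketch below treats "Person was born on Date" as a fact type, verbalizes a sample population in English, and checks one rule against it. The class name, field names, and sample data are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FactType:
    """A binary fact type such as 'Person was born on Date' (hypothetical structure)."""
    subject: str
    predicate: str
    obj: str
    # Assumed rule: when True, each subject instance may appear in at most one fact.
    unique_subject: bool = False
    population: list = field(default_factory=list)

    def verbalize(self):
        """Return each sample fact as an English sentence."""
        return [f"{self.subject} {s} {self.predicate} {self.obj} {o}."
                for s, o in self.population]

    def violations(self):
        """Return subjects that break the uniqueness rule in the sample population."""
        if not self.unique_subject:
            return []
        seen, dupes = set(), []
        for s, _ in self.population:
            if s in seen and s not in dupes:
                dupes.append(s)
            seen.add(s)
        return dupes


# Invented sample population; the second "Ann" row deliberately breaks the rule.
born_on = FactType("Person", "was born on", "Date", unique_subject=True)
born_on.population = [("Ann", "1960-02-15"), ("Bob", "1946-02-15"),
                      ("Ann", "1961-03-01")]
```

Reading the verbalized sentences back to a subject matter expert, and showing which sample rows break a rule, is the kind of validation by "English and sample populations" described in the list of ORM benefits.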
In his concluding remarks,
Halpin stated that ER was useful for basic data modeling, but that commercial
versions were restricted with regard to business rules. UML is useful for OO
code design but not for information analysis as its use cases are too
process-oriented. For the ER and UML users, Halpin suggested they use ORM for
analysis and then map to ER or UML, supplement ER and UML with data use cases,
or enhance ER and UML to make them more ORM-like.
|
Conference Session |
Speaker |
|
Business Process Analysis and Logical Process Modeling |
Anne Marie Smith, Assistant Professor, LaSalle University |
Summary by Anne Marie Smith
Anne Marie Smith,
assistant professor of MIS at LaSalle University and a data architect
consultant, gave an overview of the concepts of business process analysis and
its relationship to data analysis, with a brief overview of the methods used to
model logical processes and that model’s relationship to a logical data model.
Anne Marie noted that
process analysis should be used in all systems development, whether for transaction
processing or decision support/data warehousing, and for traditional applications
as well as electronic commerce applications. She cited the failure rate of
application development projects of all types and the lack of understanding of
the processes that occur, which causes frustration in the user and IT communities.
Business Processes do not
operate in a vacuum: they need data to validate the reason for the processes’
existence. As such, Anne Marie described the interaction between data analysis
and process analysis, and the need to have BOTH analyses for full application
development and user effectiveness.
Anne Marie’s presentation
was enhanced by the use of actual experiences of her consulting and information
management career, and demonstrated the interaction between data and process in
a successful implementation in different types of development.
With a very brief overview
of logical process modeling, Anne Marie introduced this method to the data
analysts in attendance. She concluded by reiterating the ideas from the
introduction and by relating the needs for understanding processes to data
analysts’ understanding of the need for data analysis.
Some reactions/questions to
this presentation showed that DAMA needs more exposure to processes and
processes’ intimate relationship to data – more process-oriented
presentations were requested for future conferences.
|
Conference Session |
Speaker |
|
Build Your Own Web-Based Meta Data Repository |
Joseph Newcum, Senior
Data Architect, Bank One |
Summary
by Carey Clark
There are several reasons
for building your repository rather than buying one. Vendor versions tend to be
costly and can be difficult to modify. On the build-your-own side of the issue,
you must have the skill and patience in house to attempt the project.
Joseph separates meta data
into operational and developmental. The first deals with the flow of information
in the enterprise such as for loading a data warehouse. These activities happen
day in and day out. Development meta data concerns the creation of applications,
the analysis, models, and constructs used on a project. Your repository will be
different depending on your emphasis.
Bank One spent two years
evaluating third party repositories. Their focus was using the repository to
build a data warehouse. Building their own repository wasn’t straightforward.
It took four tries. The first failed because it was too difficult to load data from
their CASE tools. The second failed for lack of skilled object-oriented programmers.
The third was a purchased repository that didn’t fill the bill. The fourth try
succeeded.
The successful approach to
building their repository was to create a prototype in Microsoft Access, prove
the design, and then rebuild it in HTML and JavaScript for dissemination over
the Web. They used Microsoft tools (Active Server Pages, ActiveX Data Objects,
Java, etc.). Their modeling tool is ER/WIN. They don’t have XML incorporated
yet.
Joseph walked through and
discussed the various display screens in the Access prototype. The initial
application ended up smaller in many ways because certain meta data simply
wasn’t available. The resulting application primarily supports a data
warehouse environment.
They made the interface
look like Business Objects. Users were already familiar with it so the learning
curve was reduced. The user interface is clean and robust. What goes on under
the covers is something of a jumble but is constantly being improved. He
believes this is the right approach. Make the interface elegant and robust and
don’t worry so much about internals. You can change those without the end user
being affected. Right now they are modularizing it into VB classes and moving
data into business objects. Subject matter experts input definitions directly.
He showed the meta models
underlying the repository. They started out as a very abstract thing-thing model
used by Knowledgeware’s Application Development Warehouse. Later it was redone
to be less abstract.
An audience member asked if
data models themselves are viewable on-line. The answer was yes, but he found
that few developers ever used those views: there was just not enough space or resolution.
Instead, most of them plotted the models out on large plotter paper and pinned
them up in their cubicles.
He recommended the books: Visual
Basic 6 Business Objects and Visual Basic 6 Distributed Objects.
These, he said, would be valuable for their architectural insights even if you
didn’t use Visual Basic.
|
Conference Session |
Speaker |
|
The Role of Data Administration in Managing the Enterprise Portal |
Arvind Shah, President, Performance Development Corporation |
Summary
by David Plotkin
This presentation defined
the many kinds of personalized portals (such as consumer, vertical, B2B, and
Corporate) and their purposes. It discussed the typical problems with B2B
portals, and the roles of data administration in solving these problems.
The roles included some
roles that are typically considered part of data administration, and some (like
performance tuning, security, and supply chain standardization) that are not.
The roles typically considered part of data administration included
Planning-Architecture development, Content Management, and Information Quality
Management.
Architecture Development consists of managing Enterprise architecture, establishing a process model, building the data model, setting up the business rules, and creating strategies for information, technology, and BPR initiatives. Content management consists of managing data architecture, enforcing data standards, assuring data timeliness & quality, and assuring security levels. It also means managing meta data.
|
Conference Session |
Speaker |
|
Developing
a Corporate Data Architecture in a Federated World |
Deborah Henderson, IT
Architect, Hydro One Networks, Inc.
& Vladimir Pantic, IBM |
Summary by Linda Kresl
Deborah presented first and
described the business of Hydro One Networks. Hydro is a wholesale retail
electric utility. She gave several examples of the work that Hydro One is
creating in defining their data architecture. They have a high re-use of data
and processes across the enterprise. She stated that they are leveraging their
data warehouse – this is the driver for the data architecture.
The data architecture is
composed of local data, OLAP and details, external and historical data and the
ODS source. Meta data ties everything together.
The physical database
architecture includes Oracle 8i, RI, multi-dimensional cubes, and a meta data
repository through hooks.
Hydro One is using IBM’s
LOVEM methodology to develop and document processes and implement procedures.
This methodology tracks the life cycle of these deliverables.
At Hydro One business rules
are implemented via the ETL. The ETL then feeds the data marts where additional
information is stored to support the data architecture.
|
Conference Session |
Speaker |
|
Facilitation and the Successful Architect |
Shelly Lieberman, Director,
Strategic Directions, Mathtech |
Summary
by Margaret O’Hara
In this well-organized and
entertaining presentation, Lieberman shared her experiences at the Division of
Alcoholic Beverage Control (ABC) in NJ and the part that facilitation played in
achieving a successful business process reengineering effort. She began by
defining facilitation as the process of harnessing user knowledge and expertise
in a group to accomplish objectives and develop deliverables.
Her presentation included
discussion of when and why one should use facilitation, an overview of the ABC
project, the facilitation approach she used, the results of the facilitation
sessions with the ABC and the critical success factors for the sessions. The
facilitation process consists of careful planning, execution and follow-up, very
often with the follow-up activities feeding directly into the next planning
session. A knowledge of the organizational culture is critical, as not all
techniques work in all cultures. Not all sessions are facilitated; only those
involving major issues among the involved parties.
Once the sessions have been
scheduled, it is important to follow a strict agenda. Each session is split into
three parts: an opening module where the stage is set, the work module, and the
closure module where the wrap-up and summary take place. “Boarding” issues
(writing them in a public space in the room for everyone to see) often
defuses conflict, because people are assured they are being heard.
Lieberman presented the
rules for sessions, including everyone is equal, critique ideas, not people,
etc. and shared the evaluation forms she uses for the sessions. She also
presented the critical success factors for the sessions. Among these were:
commitment from management for change, knowledgeable participants, open
communication, and extensive follow-up. Lieberman also spent some time dealing
with the challenges, such as groups not wanting to follow structured agendas
(stay focused on the issues, but let the group do their thing), the director
having most of the say (talked to director in background), and “naysayers”
who didn’t want change (persuaded to join group by the director).
The session concluded with
Lieberman sharing some resources for further information (iaf-world.org).
|
Conference Session |
Speaker |
|
The Practical Use of a Universal Data Model in the Data Warehouse |
David Lepley, Data Analyst, Tyco Electronics |
Summary
by Anne Marie Smith
To demonstrate the need for
“context” with data, David gave an overview of the electronics environment
and his company’s history before launching into a presentation on the Tyco
global data warehouse development and its reliance on universal data models.
David’s presentation gave
us:
- Business Rules Approach: explained the rationale for business rules in a Data Warehouse, showed the drivers of the business as fundamental for understanding the data contained in a data warehouse, and described why these factors pointed Tyco to using a universal model for its data warehouse
- The Universal Database Concept and the Universal Database Tables: this is a database design where business rules about data are stored and used to facilitate development of new and enhanced applications. David briefly described how Tyco has implemented this universal database in Oracle, using partitioning and other DBMS facilities.
David’s presentation
answered the question “Where do these concepts fit into the Data Warehouse
Architecture?” He explained the roles of data quality in data warehousing and
showed how Tyco is changing its culture to verify and ensure data quality. David
referenced Barbara von Halle and David Hay throughout the presentation,
providing reinforcement from experts to his organization’s approach.
He stressed how this
approach was unique to his organization, and the risk the team took in using a
universal data model for the Tyco Data Warehouse. Thankfully, this approach has
been successful to date, and has been helped by their use of flexible
structures, business rules and committed IS and business team members.
|
Conference Session |
Speaker |
|
Understanding and Managing Reference Data |
Malcolm Chisholm, Manager, Deloitte & Touche |
Summary
by Ron Klein
What is Reference Data?
Reference data is any
kind of data that is used solely to categorize other data found in a database,
or solely for relating data in a database to information beyond the boundaries
of the enterprise.
Reference Data…at Best,
like Cinderella, is forgotten
Reference Data…at Worst,
the “Rodney Dangerfield” of the world of data – “No respect at all”
1 – Rate of Change - Table structures change rarely, though there can be exceptions, such as in the world of foreign exchange rates
2 – Volume - Reference data tables typically have few rows and columns, but there may be many reference tables in a data model
Q: How do you distinguish
reference data from domain?
A: Yes, it can be hidden in
the domain causing problems for reporting
3 – Scope - One Reference Data table can have relationships to many other tables in a single database, or across an enterprise
4 – Meta data and Meaning - Individual values of Reference Data can have meaning, very unlike other data where attribute definitions suffice
Reference Data
Management Issues
- Implementation is typically in Program Code, not Database Tables. Using values taken from Reference Data tables is fine; defining values in program logic that can be used in updates is not
- Usage of External Standards. External standards can be useful, however they may suffer from “information float” and may not always match the requirements of the enterprise
- Divergence - Different applications have independent functionality for updating their own Reference Data tables. This leads to divergence in data. The result is MAPPING whenever data has to be shared between the different databases. Mapping typically involves semantic analysis, data quality checking, and resolving granularity problems
- Accept that Reference Data is a distinct class of data that is different to other classes of data
- Assign an “owner” for reference data. It needs to be centrally managed. Perhaps the data administration function.
- Develop a strategy for assigning codes and acronyms as primary keys
- Controlled redundancy can be a good strategy
- Publish the content and meaning of reference data for use by developers and users
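Several of the points in the list above (values held in database tables rather than program code, codes used as primary keys, published meanings, and mapping between divergent applications) can be sketched with a small in-memory SQLite database. This is only an illustration: the table names, column names, and currency codes are invented for the example and were not part of the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE currency (              -- centrally owned reference table (hypothetical)
    currency_code TEXT PRIMARY KEY,  -- the code itself is the primary key
    meaning       TEXT NOT NULL      -- published meaning of the individual value
);
CREATE TABLE invoice (               -- ordinary data categorized by reference data
    invoice_id    INTEGER PRIMARY KEY,
    currency_code TEXT NOT NULL REFERENCES currency(currency_code)
);
CREATE TABLE legacy_currency_map (   -- mapping for an application whose codes diverged
    legacy_code   TEXT PRIMARY KEY,
    currency_code TEXT NOT NULL REFERENCES currency(currency_code)
);
""")
conn.executemany("INSERT INTO currency VALUES (?, ?)",
                 [("USD", "United States dollar"), ("CAD", "Canadian dollar")])
conn.executemany("INSERT INTO legacy_currency_map VALUES (?, ?)",
                 [("US$", "USD"), ("C$", "CAD")])


def to_standard(legacy_code):
    """Translate an application-specific code to the shared reference code."""
    row = conn.execute(
        "SELECT currency_code FROM legacy_currency_map WHERE legacy_code = ?",
        (legacy_code,)).fetchone()
    return row[0] if row else None


def insert_invoice(invoice_id, currency_code):
    """Return True if the row was accepted, False if the code is not reference data."""
    try:
        conn.execute("INSERT INTO invoice VALUES (?, ?)", (invoice_id, currency_code))
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the valid values live in one table, an application cannot quietly invent a new code in program logic: `insert_invoice(2, "XXX")` is rejected by the foreign key, and divergent codes have to be resolved through the mapping table.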
Q: Are you sure you
can’t find this reference data. What are the obstacles?
A: No one wants to touch
it. Ownership usually goes to the Data Administration group. On the other hand,
business users can sometimes own classification schemas.
Q: Multiple owners that do
not co-share?
A: 3rd category
-> a central repository, non trivial
|
Conference Session |
Speaker |
|
Architecting and Implementing
a Web-Based Corporate Meta Data Repository at the Census Bureau |
Gail Wright, Technical Director, Oracle Corporation |
Summary by Carey Clark
The Census Bureau does
a lot more than count people every 10 years. It is chartered to conduct
community, demographic, and economic surveys of organizations and businesses
throughout the country. For example, every business in the country will receive
a questionnaire in 2002.
The questionnaires ask
different sets of the same questions depending on the industry and audience.
Creating these questionnaires on paper took months. Analyzing the results was
equally labor intensive. So the goal was to make a corporate meta data
repository that would use meta data to generate surveys, collect and collate the
data, and disseminate the results.
Gail covered their reasons
for the repository, what was included in the repository, how it was architected,
designed and implemented. Lastly she showed how the repository is now poised to
be used for nine other major governmental departments. Because of this effort,
work that took months can now take days. Data is more reliable, and different
kinds of studies are possible. The whole survey process is now meta data driven.
This repository is
remarkable in many respects. It’s large, comprehensive, and based on open industry
standards, and it contains tabular and non-tabular data with reference materials and
full text search. While most of us aspire to making a car, they have a
spaceship.
Their repository includes
data content, quality, condition, context and meaning. It includes data
models, business models, screen layouts, mappings and transformations,
hierarchies, aggregation rules, formulas, schedules, access controls and actual
code. The repository is composed of the following components:
Nothing is application
specific. Industry standards are followed where they exist. XML is used
extensively. No software is created or modified directly. All of it goes into a
modeling tool and is generated from there. The custom stuff is passed through
but is forced to follow the required standards and process.
Gail described the
repository as having a “tightly-to-loosely coupled architecture”. She
described it and the tools used in detail. It’s scalable, provides for open
APIs, is self-documenting and easy to maintain.
She walked us through the
interface screens and showed how the navigation worked and how versatile it was.
Security is underneath a set of “portlets” that determines who gets to see
what. The public can see quite a bit at the web site, American Fact Finder
(factfinder.census.gov).
The effort has gone from
being a good idea to being mission critical. The Census Bureau wouldn’t think
of running their business now without it.
Questions and Answers
Their repository doesn’t
overlap much with the Common Warehouse Meta Model. CWM is more focused on tool
development at the technical level. Theirs is more focused on the business
level.
It took 5 people a year to
create the data element registry. She has 13 people in her group working on
various projects.
They decided not to do it
in Java. They didn’t have the skill set. They mostly use Oracle Designer and
generate PL/SQL.
Michael Gorman, who
introduced Gail, emphasized the importance of pointing out to executives and
others how much savings and benefits a successful project achieved. Memories are
short. “Selling after the sale” enables you to get funding for further
projects.
|
Conference Session |
Speaker |
|
|
David Plotkin, Senior Data Administrator, Longs Drug Stores |
Summary
by David Plotkin
Then, the complete
metamodel for a repository designed to store DTDs and XML instance documents was
presented. The major sections included DTDs and entities, DTDs and elements,
elements and attributes, and physical implementation of elements and attributes.
The presenter also covered the functionality that is needed from a Repository, including scanning in DTDs, making changes, creating revised DTD output, building sample XML documents from DTDs, and doing impact analysis for changes. In addition, he pointed out that although this application is called a "repository", it is a limited-function implementation, and is not that difficult to design and build. However, you still need to use "industrial strength" tools -- no desktop databases need apply!
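To make "scanning in DTDs" and "impact analysis" concrete, here is a minimal sketch that scans element declarations and reports where an element is used. It is not the presenter's metamodel or implementation: the function names and sample DTDs are invented, the data is held in memory rather than in the "industrial strength" database the presenter called for, and the regular expression handles only simple `<!ELEMENT ...>` declarations (no parameter entities or conditional sections).

```python
import re
from collections import defaultdict

# Matches simple declarations such as <!ELEMENT order (customer, item+)>
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([\w.:-]+)\s+([^>]*)>")
NAME = re.compile(r"[A-Za-z_][\w.:-]*")
KEYWORDS = {"EMPTY", "ANY", "PCDATA"}  # content-model keywords, not element names


def scan_dtd(dtd_text):
    """Return {element name: set of child element names} for one DTD."""
    children = {}
    for name, model in ELEMENT_DECL.findall(dtd_text):
        children[name] = {n for n in NAME.findall(model) if n not in KEYWORDS}
    return children


def build_usage_index(dtds):
    """Map each element name to the (dtd, parent element) pairs that use it."""
    usage = defaultdict(set)
    for dtd_name, text in dtds.items():
        for parent, kids in scan_dtd(text).items():
            for kid in kids:
                usage[kid].add((dtd_name, parent))
    return usage


# Invented sample DTDs for illustration only.
dtds = {
    "order.dtd": ("<!ELEMENT order (customer, item+)>\n"
                  "<!ELEMENT customer (#PCDATA)>\n"
                  "<!ELEMENT item (#PCDATA)>"),
    "invoice.dtd": ("<!ELEMENT invoice (customer, total)>\n"
                    "<!ELEMENT customer (#PCDATA)>\n"
                    "<!ELEMENT total (#PCDATA)>"),
}
usage = build_usage_index(dtds)
```

With this index, `usage["customer"]` lists every DTD and parent element that would be affected by a change to the `customer` element, which is the question an impact analysis has to answer.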
|
Conference Session |
Speaker |
|
|
Jill Dyche, Partner, Baseline Consulting Group |
Summary
by Anne Marie Smith
Jill Dyche, a
partner at Baseline Consulting Group, presented the major mistakes of CRM from a
data focus. Many sins are data-related and can be resolved by better attention
to data management. According to Jill, those sins that are not data-related can
be solved in part by a focus on data (and meta data, in the author’s opinion).
However, data analysis cannot be done “in a vacuum” or bad actions can
result.
She
used references from her recent book, “e-Data: Turning Data into
Information” from Addison-Wesley Publishing, offering “real-life examples”
of each sin and its possible solution. Since “there is no such thing as
plug-and-play in CRM” each example and possible solution must be evaluated in
light of an organization’s goals and objectives.
The
many different definitions of CRM are at the root of many of the problems and
sins in CRM implementation. Data’s reliance on definitions can assist CRM in
developing a solid and reusable definition to use in all CRM projects.
Sins:
- No Unified CRM Strategy (multiple CRM projects occurring simultaneously)
- Failing to Manage Staff Expectations of the benefits and costs of CRM
- Failure to Define Success in Customer Management
- Outsourcing Hastily (or Not at All)
- Failure to Change Business Processes (Failure to differentiate customers and change processes based on that customer’s value to the organization)
- Not Understanding Product Features and Differences in CRM Approaches (operational CRM versus analytical CRM)
- Lack of Integration, Understanding and Executive Attention (No “Single Version of the Truth”)
Closing
with Critical Success Factors, Jill reinforced the ideas she opened the
presentation with, concluding with some examples of successful CRM
implementation. Questions to Jill demonstrated the need for education in CRM,
its concepts, implementation and approaches to solving these “7 Deadly
Sins”.
|
Conference Session |
Speaker |
|
Elevating the Role of IRM for Business Effectiveness |
Larry English, Principal, INFORMATION IMPACT International |
Summary
by Margaret O’Hara
English began his
presentation by explaining why traditional approaches to data administration
have failed to create positive impact and acceptance in the enterprise. The
cause, he believes, is that we are still operating under an industrial age
paradigm. We fail to view information as a strategic enterprise resource because
we have overlaid IT on obsolete structures. The industrial age is vertical; the
information age is horizontal. To illustrate this, one example English used was
that all managers (not just HR) can read organizational charts, all managers
(not just financial) can read balance sheets, but only IT managers can read data
models.
To move from data
administration to information stewardship (which English recommends), the
organization must view information as a strategic resource with a resource
management life cycle. This means that information must be planned for,
acquired, applied, maintained and disposed of in the same manner as other
resources.
English presented some
trends in data / information quality to illustrate that it is getting worse:
- in one firm, 66% of 6 million records were useless
- DA influence seems to be decreasing
- DRM is moving away from the business
- 65% of data warehouse initiatives fail outright
English believes that the
term meta-data should not be used because it has no meaning to non-IT people.
To elevate IRM
effectiveness:
English believes we must
move from data administration to information leadership, and from being data
bigots to business bigots. He also told us: Don’t sell – listen!
|
Conference Session |
Speaker |
|
Comparison of Data
Modeling Techniques |
Panel: Davida Berger (moderator) Graham Witt |
Summary
by Davida Berger
This was a very lively
advanced session with renowned modeling experts.
ERM
- Provides for the complete definition of information requirements in an understandable format such as entities, attributes, relationships, generalizations/subtypes.
- Well-defined integration with process models. CRUD matrix relates entities to processes in the DFD (data flow diagram).
- Entities and attributes can be easily visualized as tables and columns and implemented in a relational or object-relational database management system
ORM
- Best use is for conceptual informational analysis
- Focus on fact types where objects play roles. Fact instances, types and rules are verbalized in a formal, graphical and textual language
- Mature and well defined
- Limited modeling tool support
- May be better than ERM for conveying requirements to designers but not good for dialog with the business
UML
- Can capture additional elements such as triggers and indexes
- Data and process not as well integrated as in ERM
- Has limitations for database modeling. No key constraints but very useful, and may be better than ERM for object-oriented code design
No matter what methodology
is used the model must be designed and readable for the business community.
Special attention should be given to the presentation and arrangement of the
diagram. Names of entities, attributes, and relationship should not be cryptic
and should represent business terms and not computer or system concepts or
functions.
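The CRUD matrix mentioned under ERM is easy to picture as a small table of processes against entities. The sketch below is only an illustration with invented process and entity names; it records which operations (Create, Read, Update, Delete) each process performs and reports entities that no process ever creates, a common completeness check on such a matrix.

```python
# Each process lists the operations (C, R, U, D) it performs on each entity.
# Process and entity names are hypothetical.
crud = {
    "Take Order":   {"Customer": "R", "Order": "C"},
    "Ship Order":   {"Order": "RU"},
    "Report Sales": {"Customer": "R", "Order": "R"},
}
entities = ["Customer", "Order", "Product"]


def entities_without(op, matrix, entity_names):
    """Return the entities that no process performs operation `op` on."""
    return [e for e in entity_names
            if not any(op in ops.get(e, "") for ops in matrix.values())]


def format_matrix(matrix, entity_names):
    """Render the matrix as plain text, one row per process."""
    width = max(len(p) for p in matrix)
    lines = [" " * width + " | " + " | ".join(entity_names)]
    for process, ops in matrix.items():
        cells = [ops.get(e, "").ljust(len(e)) for e in entity_names]
        lines.append(process.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)
```

Here `entities_without("C", crud, entities)` flags Customer and Product, which tells the modeler that either a process is missing or those entities are maintained outside the scope of the model.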
|
Conference Session |
Speaker |
|
Meta Data – Myth and Realities |
John Ladley, President
Knowledge
InterSpace, Inc. |
Summary
by Ron Klein
John outlined his
experience – he did “James Martin stuff”. He worked for Meta Group. He
worked at integrating everything and doing Data Administration.
John makes the point that business is “gray” – not black & white. Collaborative Intelligence comes about when tacit and unstructured information is factored into a business decision.
The reality of meta data is that there are no comprehensive tools, repositories are not capable enough, there are two or three standards, and there is too much in-house development. However, CWM is a tremendous step in standards. Remember that CWM’s scope is limited to metadata relevant to data warehouse (DW) and analytic applications, while the OIM schema is supposedly capable of handling knowledge management and business-process constructs. Therefore, enterprises considering panoramic metadata/repository initiatives may find CWM limiting, though more broadly supported.
Don’t be
afraid to build your meta data bottom up.
Despite his
apparent despair at the state of meta data products and management, John
actually believes the importance of meta data will increase in the future. His
summary slide said:
|
Conference Session |
Speaker |
|
The UPS Meta Data Repository – A Success Story |
Patti Munier, Senior Data Analyst and Manager, United Parcel Service |
Summary
by Carey Clark
UPS is a large company.
Every year it handles 3.28 billion parcels using 1700 facilities, 575 aircraft,
149,000 vehicles, and 344,000 employees. It is 93 years old.
UPS uses Computer
Associates’ Platinum Repository. Rather than being a gatekeeper for new
development, Patti’s group acts more as a watchdog. They use Platinum’s scanners to scan
all production databases and programs throughout the enterprise. They then
compare what they find to the meta data in the repository. Entries that aren’t
recognized or don’t meet standards are flagged for review and brought into
compliance. What passes is parsed and loaded.
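The scan-and-compare step can be pictured with a short sketch. This is not the Platinum Repository's scanner or its API; it is a hypothetical stand-in in which a scan yields column names and data types, the repository is a plain dictionary of approved elements, and anything unrecognized or mismatched is flagged for review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column found by a scan of a production database (hypothetical shape)."""
    name: str
    data_type: str


def review_scan(scanned, repository):
    """Split scanned columns into those to load and those flagged for review.

    `repository` maps an approved element name to its expected data type.
    """
    to_load, flagged = [], []
    for col in scanned:
        expected = repository.get(col.name)
        if expected is None:
            flagged.append((col, "not in repository"))
        elif expected != col.data_type:
            flagged.append((col, f"type {col.data_type}, expected {expected}"))
        else:
            to_load.append(col)
    return to_load, flagged


# Invented sample data for illustration only.
repository = {"PKG_TRACKING_NUMBER": "CHAR(18)", "SHIP_DATE": "DATE"}
scanned = [Column("PKG_TRACKING_NUMBER", "CHAR(18)"),
           Column("SHIP_DATE", "CHAR(8)"),
           Column("TMP_FLAG", "CHAR(1)")]
to_load, flagged = review_scan(scanned, repository)
```

Working from what is actually in production, rather than from what projects say they built, is what lets a watchdog group "know what is real".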
Developers use the
repository and are required to involve data administration from the outset of a
project. But because Patti’s group is constantly scanning the end result, they
know what is real.
They track over 5000 key
words, 30,000 data elements, database structures, and copybooks. The repository
is updated twice a month. This data is then distributed through an intranet. The
site gets 24,000 hits a day from every level of user.
One of the key processes is
what they call rationalization. All representations of data are documented and
linked back to the master name and definition. The data description is stored
only once. This enables UPS to do impact analyses quickly. Anyone can find out
what data is being used, where it is being used, and whether or not it’s
official. The benefit of this cannot be overestimated.
Meta data types include
abbreviation name, full English name, physical name, standing (approved, non
approved, skeleton), source (e.g. vendor name), descriptions, warehouse
description and history. Every data element ends in a “class word” (e.g.,
number, text, code, etc.) as part of its formal name.
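A class-word rule like this is simple to check mechanically. The sketch below is an assumption-laden illustration, not UPS's standard: the approved list contains only the three class words the report names as examples, and the element names are invented.

```python
# Hypothetical list of approved class words; the report names only
# "number", "text" and "code" as examples.
CLASS_WORDS = {"number", "text", "code"}


def class_word(element_name):
    """Return the final word of a data element name, lower-cased."""
    words = element_name.replace("_", " ").split()
    return words[-1].lower() if words else ""


def has_class_word(element_name):
    """True when the name ends in one of the approved class words."""
    return class_word(element_name) in CLASS_WORDS
```

For example, `has_class_word("Package Tracking Number")` passes, while a name such as `package_weight` would be flagged because "weight" is not on the assumed list.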
The success of this effort
has reduced data disparity and allowed them to decommission the other
dictionaries at hubs and distribution centers. The repository is used for
training new employees who are able to learn the corporate vocabulary quickly.
In the future Patti’s
group plans to complete the data element quality application, provide support for
XML, DTDs, and Schemas, automate scanning and loading of SQL Server data, and
add business rules.
Patti presented some of the
repository’s screens: Straightforward, understandable and powerful.
|
Conference Session |
Speaker |
|
Universal Data Models for Web Constructs |
Len Silverston, Founder, Universal Data Models |
Summary
by David Plotkin
The motto of the
presentation was: "The more you see the whole, the closer you move towards
the truth".
Len presented a series of
generalized (or "universal") models for the following subjects: Web
Parties, Web Party Contact Mechanism, Web Login, Web Site Content, Web Object
Usage, Web Visits and Hits, and Web Star Schema (data warehouse). The common
characteristic of these models is that they did not contain any aspects of the
business at the entity level. Instead, they used very generic terms such as
"Party" (person, organization, or automated agent who participates in
a process or transaction), "Party Type" (a generalized way of classifying parties)
and "Party Role" (customer, referrer, supplier, etc.). Although Len did not model
the relationships themselves in the limited time available, he did state that
the roles could not exist without a relationship. For example, the role
"customer" could not exist without a relationship between parties.
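The generic terms above can be sketched in a few lines. This is not one of Len's published models; the class names, fields, and the `relate` helper are assumptions chosen to show one way of enforcing the stated rule that a role exists only through a relationship between parties.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    name: str
    party_type: str            # e.g. "person", "organization", "automated agent"
    roles: set = field(default_factory=set)


@dataclass
class PartyRelationship:
    from_party: Party
    from_role: str
    to_party: Party
    to_role: str


def relate(from_party, from_role, to_party, to_role):
    """Create a relationship and record the role each party plays in it.

    In this sketch roles are assigned only here, so a party has no role
    until it takes part in a relationship.
    """
    from_party.roles.add(from_role)
    to_party.roles.add(to_role)
    return PartyRelationship(from_party, from_role, to_party, to_role)


# Invented sample parties for illustration only.
ann = Party("Ann", "person")
shop = Party("Web Shop", "organization")
bob = Party("Bob", "person")
rel = relate(ann, "customer", shop, "supplier")
```

Nothing in the structure mentions orders, products, or any specific business, which is the point of a universal model: Ann becomes a "customer" only by virtue of her relationship to the shop, and Bob has no role at all yet.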
Tuesday, March 6th, 2001
|
KEYNOTE |
Speaker |
|
(and DAMA
Individual Achievement Award) |
Peter Chen, Professor, Louisiana State University |
Summary by Anne Marie Smith
Rose Romero, DAMA
International VP of Communication, presented the 2001 DAMA International
Individual Achievement Award to Dr. Peter Aiken and Dr. E.F. Codd. This is the
first time that two individuals were the recipients of the Individual Achievement
Award. Drs. Aiken and Codd received this award for their significant
contributions in the field of Information Resource Management. As educators,
consultants and authors, they have assisted numerous companies in developing and
maintaining data resource management environments, thereby expanding and
enhancing the roles of information management professionals. It should be noted
that Dr. Aiken is a member of the DAMA International Board of Advisors.
Other nominees for the 2001
Individual Achievement Award were:
Larry P. English, David
Marco, Dr. James Martin, Dr. Richard Nolan
After the award ceremony,
Dr. Peter Chen, the originator of the ER model, delivered a keynote address on
the relationships among the ER model, XML and the World Wide Web. Dr. Chen was
the recipient of the 2000 DAMA International Individual Achievement Award. He gave the attendees
an understanding of XML and ER modeling, as well as several good, new buzzwords.
His entertaining and very
informative presentation focused on:
Dr. Chen concluded with his
insights on other interesting research directions in XML and web modeling. He
stressed the need for methodology for modeling in all arenas, and urged the
attendees to actively participate in the expansion and development of
understanding of XML and ER modeling.
|
Conference Session |
Speaker |
|
Business Information Management at Johnson and Johnson: Beginning the Process |
Larry Dziedzic, Information
Management Architect, Johnson & Johnson |
Summary
by Margaret O’Hara
Larry Dziedzic began his
presentation by offering a brief history of Johnson and Johnson and his personal
background in the Information Management discipline. With 198 diverse companies
scattered throughout 52 countries, coming to agreement on any enterprise-wide
standards is a daunting task. The companies are grouped together into three
primary divisions: consumer products (shampoo, Band-Aids, Tylenol), medical
devices and diagnostics (hips, shoulders, glucose monitors) and pharmaceuticals.
He then presented the
initial plan for establishing the Business Information Management (BIM) program
at Johnson and Johnson. Using some basic and easy-to-understand examples, he
explained the particular problems J&J experiences. For example, when a new
fragrance is added to a shampoo, does it become a new product or a variation on
the existing product? Because of the nature of the J&J culture (with all
companies retaining some degree of autonomy), questions such as this have myriad
answers.
Other surprising issues he
encountered included: only 70% of information being correct, and management
being satisfied with that statistic. Moreover, the Information Management
Architecture group did not typically talk to the customers, relying instead on
pre-existing information – which was sometimes inaccurate. Thus, the lack of
attention paid by IM to the business side, and the resulting lack of appropriate
information, were fundamental problems.
Dziedzic went on to
illustrate some classic examples of “bad” information making the news to the
detriment of the organization to which the information applied. Among the
specific challenges that J&J faces are: the level of autonomy of the 198
diverse companies, the varying level of resources for these firms, and the lack
of a standard ERP package among the three primary groups (one has selected JD
Edwards and two have selected SAP).
To alleviate the situation,
global competency centers (GCCs) are being formed to liaise with the business
community. Thus far, GCCs have been established for two of the groups, with the
third one coming later this year. These GCCs will work with the global partners
to establish unified applications and implement global strategies. Consultants
(internal and external) are helping to develop the BIM strategies, and best
practices and tools will be utilized.
One major problem J&J
faces is that the SAP and JD Edwards packages will eventually have to interface.
More importantly, the task of implementing the GCCs is very much a people
problem – with listening, educating and communicating being top priorities.
|
Conference Session |
Speaker |
|
Measuring the Quality of Models |
Peter A. McDougall, Senior Data Administrator, Insurance Corporation of British Columbia |
Summary by Linda Kresl
This presentation focused
on an approach for measuring model quality that Peter developed over five years
ago. The criteria for evaluating a model are based upon the aspects of
communication. Furthermore, since a data model is a composite object, the
presentation described how a model’s quality is actually derived from the
collective quality of its components. Thus, quality measures shouldn’t be
applied to the model as a whole, but instead to its smaller, atomic-level
pieces. Five communications-based yardsticks (Accuracy, Clarity,
Consistency, Conciseness and Completeness) were discussed.
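As a rough illustration of rating a model piece by piece, the sketch below scores each component against the five yardsticks and then rolls the results up. The component names, the pass/fail ratings and the simple averaging are assumptions made for this example; the summary does not say how Peter combines the measures.

```python
# The five yardsticks named in the session.
CRITERIA = ("accuracy", "clarity", "consistency", "conciseness", "completeness")

# Hypothetical review results: True means the component met the criterion.
reviews = {
    "Customer (entity)": {
        "accuracy": True, "clarity": True, "consistency": True,
        "conciseness": True, "completeness": False,
    },
    "Customer.name (attribute)": {
        "accuracy": True, "clarity": True, "consistency": True,
        "conciseness": True, "completeness": True,
    },
}


def component_score(result):
    """Share of the five criteria that one component meets (0.0 to 1.0)."""
    return sum(result[criterion] for criterion in CRITERIA) / len(CRITERIA)


def model_score(all_reviews):
    """Assumed roll-up: the plain average of the component scores."""
    scores = [component_score(result) for result in all_reviews.values()]
    return sum(scores) / len(scores)


for name, result in reviews.items():
    print(f"{name}: {component_score(result):.2f}")
print(f"model: {model_score(reviews):.2f}")
```

Scoring the components first keeps the weak spots visible (here, the incomplete entity) instead of hiding them inside one overall number.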
Peter also focused on the
model review process. Two techniques called Direct Feedback and Business-Based
questioning, plus how the quality measures are used with these methods, were
described. These techniques focus on understanding the business unit’s
relationship to the message from the model. They take a nonjudgmental
perspective and are designed to develop a collaborative framework used for
working towards a quality product. Lastly, the presentation described how
communications-based criteria ultimately produce better models.
The following topics were
discussed by Peter:
·
Why communications-based measures
are useful to evaluate the quality of a model
·
The five criteria used to measure
quality
·
A set of techniques for applying
the measures
·
Why the approach creates models
that have quality built-in, instead of “inspected in”
|
Conference Session |
Speaker |
|
Organizational and Development Strategies for Creating a High-ROI Enterprise Data Warehouse |
Brent
Lautenschlegar, Principal, Reflection
Technology Corporation |
Summary by Anne Marie Smith
Brent has much experience in
enterprise applications and data warehousing. He used these experiences to
describe the implementation of an enterprise data warehouse at Delta Air Lines.
Brent gave an overview of the
history of the data warehouse at Delta, which focused on incremental growth.
Business users at Delta were not well served by Information Technology,
and this gap formed the rationale for developing and implementing an enterprise
data warehouse. As a result, Brent’s presentation was more business-oriented
than technical, although he did discuss some very technical topics in answering
questions. The teams of users and IT specialists included subject areas of HR,
Operations, Finance and Marketing/Sales. Eventually, this data warehouse was
able to “establish a single version of the truth”. Having a conceptual data
model for the enterprise was essential to the success of planning this massive
project, despite the fact that many subject areas did not have transactional
level data models to use as a basis for the data warehouse. Capturing
requirements and feedback from the user community was a hallmark of the quality
effort within Delta and the data warehouse project.
Brent outlined the
technologies used in this project: Teradata for the DW database; Brio for
querying and reporting; SAS for statistical analysis; Informatica for
extraction, transformation and loading (ETL); and Essbase for multi-dimensional
database management.
Each module of the data
warehouse was developed within a 60-day period, to counter the perception of a
data warehouse as a monolithic project. Incremental development has many
benefits to both IS and users, and gives ownership and control to the
development and implementation teams, as well as demonstrating the progress of
data warehousing to the organization’s management. One disadvantage to this
rapid, incremental development effort was the need to alter the habits and
expectations of database administrators and data administrators / modelers.
These team members were not accustomed to working in this rapid environment, and
some culture change was necessary. Brent explained the steps the teams used to
meet this development deadline, and described some of the challenges the teams
encountered in some subject areas.
Questions to Brent were both
business-oriented (cost-benefits, information use approach, skill development)
and technical (reasons for choosing certain technology, interfaces and their
construction). Questions lasted into the break period.
Summary
by Ron Klein
The purpose of the Library of Alexandria
was to gather material from conquered countries in order to subjugate
them. A heck of a business value!
Start with Robert
Anthony’s Framework for looking at enterprises (see page 11 of speaker’s
paper on CD-Rom). Consider that knowledge can be viewed in a similar manner (see pg
12). Now propose an architected view of knowledge – a Library model is not a good
model for the business.
Gil and Frank stressed the
following key presentation points:
- The meta views and knowledge content are important to an enterprise
- Meta views are needed to successfully implement critical applications in a business, such as:
  – Enterprise application integration
  – Business performance measurement
  – Customer relationship management
  – Enterprise resource planning
- Knowledge fills or is connected to many meta structures
- A meta-data strategy is needed to get best business value
- Businesses without meta views will gradually fall behind with failed implementations or only partial realization of benefits
Integration will get money because it saves money!
|
Conference Session |
Speaker |
|
|
Andrew Watson, Technical
Director, Object Management Group |
Summary
by Carey Clark
Andrew described the Object
Management Group. It’s a not-for-profit body with over 800 members, where
decisions are proposed and accepted by the members. OMG is not an official
standards body like ISO and no one is obligated to conform; however, most do.
Anyone can access and download their specifications. There are no fees or
passwords.
OMG has numerous task
forces and special interest groups covering all manner of subjects and
industries.
UML
OO modeling, like ER
modeling, has a wide variety of notations. By 1994 it was a real mess: similar
concepts, incompatible notations, few support tools. Methodologists are often
very stubborn, and getting agreement is extremely difficult. In ’95 Jacobson
and Soley began to push for modeling standards. By 1997 UML was accepted by all
parties. The current version is 1.4.
UML is designed for
visualizing and documenting software. It was not designed for database
modeling. UML is not a method but a convention for representing software
constructs. Because of this standard, lots of tools have been built and over 60
books written. It is now used in over 70% of IT shops. Until it was adopted no
one was willing to invest the capital to develop tools.
Version 2.0 of the
specification is under development and if you want to influence it, now is the
time to speak up. Thirty seven companies are already on board.
MOF
The Meta Object Facility (MOF) is
a meta data architecture (i.e. for repositories). It works in cooperation with
UML. It leans heavily on XMI, a meta data exchange specification. XMI enables
meta data to be passed between modeling tools. This in turn enables DTDs and,
later, XML Schemas to go in and out of modeling tools seamlessly.
CWM
The volume of data in an
organization doubles every 5 years. Much of it is redundant and inconsistent.
CWM provides a standard way of handling data warehouse problems. It supports
ETL, OLAP, XMI, and UML. In addition, specifications are being developed by, and
for, specific industries.
CORBA
CORBA is a middleware
specification. It’s a list of APIs that allow data to be moved from legacy
systems to new ones and back. There is still a lot of COBOL code that needs to
integrate with VB, Java, DBMSs, the Web, etc. CORBA facilitates this integration
while staying vendor independent.
CORBA has been extended to
include XML and DOM (Document Object Model). It enables XML structures to be
compacted into a binary format for easy transport.
Domain Specific Standards
PIDS, or Personal
Identification Services, provides a way for health care providers to identify
individuals. There is no reliable unique identifier for people and
misidentification can mean wrong treatment. Hence algorithms determine the
probability of a match.
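To give a feel for the matching idea, here is a minimal sketch that scores how well two person records agree. It is not the PIDS specification: the field names, the weights and the use of a weighted agreement score as a crude stand-in for a match probability are all assumptions made for this example.

```python
# Illustrative only: field names and weights are assumptions, not part of PIDS.
FIELD_WEIGHTS = {"last_name": 0.4, "birth_date": 0.4, "postal_code": 0.2}


def match_score(a: dict, b: dict) -> float:
    """Return the weighted share of fields on which two records agree (0.0 to 1.0)."""
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        left, right = a.get(field), b.get(field)
        # A missing value never counts as agreement.
        if left is not None and left == right:
            score += weight
    return score


# Hypothetical records, invented for the example.
patient = {"last_name": "Rivera", "birth_date": "1960-02-14", "postal_code": "92802"}
candidate = {"last_name": "Rivera", "birth_date": "1960-02-14", "postal_code": "92805"}

score = match_score(patient, candidate)
print(f"match score: {score:.2f}")
```

A real service would decide what to do with borderline scores (for example, refer them to a person for review) rather than treat any single number as proof of identity.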
Resource Access Decision
(RAD) specifies how to secure access to healthcare data. It helps to implement
and enforce access policies and procedures.
|
Conference Session |
Speaker |
|
Embracing XML Strategic Implications for Data Administrators/Architects |
Peter Aiken, Institute for Data Research Virginia Commonwealth University |
Summary
by Arnie Hook
Dr. Aiken looks at the
organization/legacy assets to locate opportunities to integrate data with the
management of meta data. The focus is on the evolution of systems. He advises
against trying to develop components all at once. Time and expense equation?
The presentation identifies
XML Benefit and XML application Integration ratings for various business and
technology classes. XML is ‘meta data wrapped around data’ and associated
with business problems and planning.
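The "meta data wrapped around data" description can be seen in a few lines of code. The sketch below parses a small XML fragment with Python's standard library; the customer record and its element names are invented for this example and do not come from Dr. Aiken's presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical customer record; the element and attribute names are invented.
document = """
<customer>
  <name>Acme Corp</name>
  <creditLimit currency="USD">50000</creditLimit>
</customer>
""".strip()

root = ET.fromstring(document)
for element in root:
    # element.tag and element.attrib are the meta data; element.text is the data.
    print(element.tag, element.attrib, element.text)
```

Because the description travels with each value, a receiving program can tell what "50000" means without consulting a separate record layout.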
XML equips
organizations with the tools and technology to develop programmatic solutions to
manage data interchange environments using economies of scale. Peter explains
the metrics and time problems for engineering the legacy. The 7-hour per
attribute definition metric used in creating project plans does not exist (it is a myth).
Aiken uses real life
examples for the audience to understand the implications of XML, data
architecture/engineering, and data management practices to approach and define
data solutions. The scenario of systems operations using XML manages business
rules and data interchanges.
Using XML expands the
definition, roles, and preparation required of data management for e-business
development. Attendees of this session benefit from the experience of early XML adopters and
learn the role XML will play in future data management.
|
Conference Session |
Speaker |
|
Enterprise Data Management Without the Enterprise Data Model: Working
in the Real World |
Sheri Dumire-Hamilton, Senior Systems/Business Analyst, Kodak |
Summary
by Margaret O’Hara
The goals of Sheri’s
presentation were to demonstrate how ED Management would benefit the firm, to
present some different approaches to resolving issues and to identify some
sources and issues of technology change. The goals of ED Management are to
increase data sharing across the organization, to increase reuse of data and
maintain control, to enable evolution of new technology, and to integrate new
needs and stability of DBs over time – in effect as data evolves, the DM must
keep up.
An ED Model does several
things. It supports the use of data as a corporate asset; it provides a vehicle
for communication and agreeing about data meaning and usage, and it supports the
sharing and reuse of data across functional areas. Still, ED Models are often not
constructed. There are many reasons for this. Among the reasons are:
construction requires support and direction from senior management, it absorbs
resources and may not provide immediate measurable value, and it is often
perceived as a corporate mandate with little value to specific functional areas.
So, where can you start to
develop ED Management? First, select a problem that data management will address
with high probability of success. Symptoms of DM problems include: lots of
interfaces being written, customer complaints about supplying information
repeatedly and errors due to bad data, data unavailable for decision making,
problems in enterprise data management, and difficulty in meeting changing
business needs.
To handle the problems,
first define the problem domain then plan the approach to solve it. It is
important to publish the approach and review it with affected areas. Some things
that can “bite you” are: there is a common ground, but everyone is fighting
for a piece of it. To alleviate this, find a champion and form a steering
committee. Power struggles occur because data is not seen as a corporate asset.
By educating the concerned parties about the nature of data management, data comes to be
viewed more as a corporate asset. Finally, it is important to network,
communicate and educate the involved parties. Build relationships with
individuals to increase their comfort level, their trust and your own
credibility.
|
Conference Session |
Speaker |
|
How do you
Convince Management to fund your Proposal? |
David Davis, Vice-President, Enterprise Data Management Group, Bank One |
Summary
by Linda Kresl
This presentation focused
on the political maneuvering required to persuade and convince management to
fund projects. David explained that people with technical backgrounds often
stress the technical aspects of a proposal to their detriment. The context of
the proposal, its timing and how it is presented often affect the acceptance
or disapproval of a good proposal. Various anecdotes, analogies, marketing and
forming alliances can lead to successful, approved proposals and projects. The
best implementation, technique, new technology and method do not guarantee
acceptance and funding.
This presentation further
explained the following steps to ensure success:
·
Learn that the work involved in
“selling” a proposal may be as difficult and necessary as the project
·
A technique of creating analogies
·
Share successes and failures
·
Learn the importance of ‘sound
bites’, charts and diagrams to sell proposals
|
Conference Session |
Speaker |
|
Data Warehouse Project Planning |
Sid
Adelman, Founder, Sid Adelman & Associates |
Summary
by Anne Marie Smith
Sid Adelman, consultant and
co-author of the book “Data Warehouse Project Management” presented a
roadmap for developing a successful data warehouse project plan.
Sid outlined the history of
data warehouse project planning, why project planning is critical to the success
of any development effort, what constitutes a proper data warehouse project plan
and how to relate the project plan to the technical infrastructure.
To date, many organizations
have taken the approach of not planning a data warehouse project for many
reasons. Almost without exception, these non-planned projects have failed, and
according to Sid’s research, this failure can be traced to the lack of a
concrete project plan. This presentation showed the similarities between
traditional systems development and data warehouse development and the few
differences.
Major points in Sid’s
presentation included:
·
Project Selection: choose
sponsors and users who really want the project to succeed, a project with
importance to the organization, a project that WILL succeed (not necessarily a
high profile or controversial project), and a project with measurable success
factors, a project with reasonable size (database and interfaces) and reasonable
time expectations, project control
·
Function: source data (from
where are you getting the data, is it reliable and clean?); determine needed
summaries, aggregation and integration methods; develop appropriate canned
queries, issues in the meta data repository for a DW (user-oriented). User and
technical functionality are different, and the differences must be understood
and evaluated.
·
User Expectations: performance
(sub-second response time is unrealistic from a DW), simplicity (ease of use of
the user tools, easy to understand navigation), accuracy (clean data, correct
data – these are different), availability (do you really need 24x7, 365? This
is very expensive and usually not a true requirement), timeliness (data refresh
expectations must be established), difference between summary and detail data
access needs. Traditionally, success is not well-defined, and can be achieved
through communication of expectations
·
Scheduling: taking a phased
approach (by subject area and user role delineation) is the foundation of a
successful data warehouse, task estimation (a difficult task and experience
contributes to amount of time needed to complete a task), actual hours worked
versus elapsed time (which measurement will you use? – use both), essential to
build contingency factors into a plan since interruptions will always occur,
schedule responsibly since too-tight schedules force people to do re-work.
Delivering low-quality results quickly is NOT a method for success! Sid felt
that a 60-day phase was a bit too short, and recommended a 3-month phase.
·
User Responsibilities:
co-project management (IT and user managers), users must define requirements
(NOT the IT staff), security requirements (essential in web access to data),
determining roles in query and reporting tool selection (not necessary to
involve users in infrastructure tool selections), user involvement in training
material development and implementation
·
Tools and Service Agreements:
performance and response time requirements are not appropriate for a DW, but
availability and problem response time requirements are appropriate for a DW, DW
implications on the work of the Help Desk or other support mechanisms
·
DW Project Planning: the usual
steps of application development project planning apply, each task should not
exceed a 40-hour period, each task should have a primary responsible party (even
if there is more than one person on the task), each task should have a defined
deliverable, each deliverable should be evaluated for completeness and
contribute to a defined milestone, progress monitoring and change control
management are also important and frequently forgotten
·
Resources: people versus roles
(some people can fill multiple roles, but should they?), development and
maintenance of a capabilities and skills assessment for all team members, direct
reporting relationship (100% focus on the DW project), importance of management
commitment and active support across and through the organization
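The task rules in the DW Project Planning point above (no task over 40 hours, one primary responsible party, a defined deliverable) lend themselves to a simple automated check. The sketch below is one possible way to do it; the `Task` structure and the sample tasks are hypothetical and were not part of Sid's material.

```python
from dataclasses import dataclass

MAX_TASK_HOURS = 40  # from the summary: each task should not exceed a 40-hour period


@dataclass
class Task:
    name: str
    estimated_hours: float
    owner: str        # primary responsible party
    deliverable: str  # defined deliverable


def plan_issues(tasks):
    """Return a list of human-readable problems found in the plan."""
    issues = []
    for task in tasks:
        if task.estimated_hours > MAX_TASK_HOURS:
            issues.append(f"{task.name}: exceeds {MAX_TASK_HOURS} hours")
        if not task.owner:
            issues.append(f"{task.name}: no primary responsible party")
        if not task.deliverable:
            issues.append(f"{task.name}: no defined deliverable")
    return issues


# Hypothetical tasks, invented for the example.
plan = [
    Task("Profile source data", 32, "Data analyst", "Data quality report"),
    Task("Build customer subject area", 120, "", "Loaded tables"),
]

issues = plan_issues(plan)
for issue in issues:
    print(issue)
```

A task flagged as too long is a prompt to break it into smaller tasks, each with its own owner and deliverable.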
Sid concluded with offers of
some reference material (web links, task examples, suggested vendors) to
interested attendees.
There were numerous questions,
and they included the issue of data cleansing at the source (do you go back and
clean up data that is clean in the DW and not clean in the source?), the best
format of a project plan for a DW (iterative or spiral), cost-benefit analysis
of a DW (see an accountant!), choices in various tool categories, and specific
roles to be included in any data warehouse project. These questions showed the
level of interest in data warehousing and its “resurgence”. They also
demonstrated the need for more presentations on data warehousing and project
management.
|
Conference Session |
Speaker |
|
Meta data Directory vs. Meta Repository |
James Jones, Product Manager, Oracle Corporation |
Summary
by Ron Klein
James started by citing
Oracle’s own experience in streamlining its business using its own solutions, i.e.
“Eating our own dog food”.
Lightweight Directory
Access Protocol (LDAP) is the Exploding Standard. It is a light, browser
friendly client implementation.
What are Directory
Services?
-
“A flexible, special-purpose distributed database designed
to the storage and retrieval of entry-oriented information for a wide range of
applications.”
-
DS are a type of universe of meta data
The Meta Directory
Paradigm:
-
Touches everything and is everywhere
- A single directory that connects everything
Stretching the idea of meta data persistency and sharing:
Nodes
+ Hubes = Nubes <- ETL
|
Meta Directory |
Meta Data Repository |
|
Metadata (Hierarchical): Security, Party, Network, Device; Security Integration; Device Integration; Giant Installed Base |
Metadata (Any); Managing files and folders; Dependency management; Versioning; Configuration management; Tool Integration; Small Installed Base |
Q: Is the Meta Directory
usually a source for the repository?
A: A place where it can
store this information, but it is not strong enough to hold the complexity.
Q: Should we be hanging off
these directories underneath a repository?
A: Underneath a portal,
yes.
|
Conference Session |
Speaker |
|
Ramping up for Meta Data and
Knowledge Management |
Don
Soulsby, Director
of Architecture Strategies, Computer Associates |
Summary
by Carey Clark
In
the beginning was Electronic Data Processing (EDP). The focus was handling files
and getting data in. In the 80’s the focus was on getting the data out (DSS,
EIS, Queries). This age will be known as the knowledge management era. Tabular
data needs to merge with documents, graphics, and video. The buzzwords are integration
and access, and like before, tools follow the need.
Knowledge
Management
Knowledge
is information (data) at work. Eighty to ninety percent of corporate information
is not tabular in nature. The issue is how to store and retrieve it efficiently
and combine it with pertinent tabular data. The difficulty is compounded by the
tribal nature of various disciplines: Data Processing, Library Science,
multimedia, desktop applications, Web technologies etc.
Legacy
systems tend to resemble spaghetti. When using third party packages one must not
only use other people’s products, but other people’s models. How, then, do
you find what you are looking for? In 15 years the baby boomers begin to retire
and their knowledge goes with them. It behooves organizations to capture as much
as possible before they go.
It’s
a massive problem, not unlike building the Empire State Building or the Queen
Mary. As in those cases, a key factor is having the right tools (e.g. the rivet
gun).
The
Solution: An Enterprise Information Portal
Create
a single place where all information can be accessed and displayed. Integrate
the various forms of information. Where possible, provide dynamic
personalization. Make it easy to find, easy to understand, easy to navigate, and
believable. Provide information in context as a way to recognize what you have.
Most
knowledge architectures are hierarchical. This is efficient for getting
somewhere fast but not for finding stuff in the first place. A better model is
the Knowledge Mall. You can find stuff alphabetically, by category, by context,
and by wandering around. Still there’s a need for a map.
Don’s
technique was to classify information using the Zachman Framework. Going
vertically you have rows for Business, Operational, and Technical. Going
horizontally are columns for Who, What, and Where. This could be expanded to match
Zachman’s 6x6 matrix.
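The 3x3 grid Don described can be pictured as a small lookup structure. The sketch below uses the row and column names from the session; the items being classified and the function names are invented for this example.

```python
ROWS = ("Business", "Operational", "Technical")
COLUMNS = ("Who", "What", "Where")

# One bucket per cell of the 3x3 grid.
grid = {(row, column): [] for row in ROWS for column in COLUMNS}


def classify(item, row, column):
    """File an information item under one cell of the grid."""
    if (row, column) not in grid:
        raise ValueError(f"unknown cell: {row}/{column}")
    grid[(row, column)].append(item)


def lookup(row=None, column=None):
    """Return items matching a row, a column, or both (None means 'any')."""
    return [
        item
        for (r, c), items in grid.items()
        if (row is None or r == row) and (column is None or c == column)
        for item in items
    ]


# Hypothetical items, invented for the example.
classify("Organization chart", "Business", "Who")
classify("Customer data model", "Business", "What")
classify("Server locations", "Technical", "Where")

print(lookup(row="Business"))
print(lookup(column="Where"))
```

Extending `ROWS` and `COLUMNS` is all it would take to grow the grid toward the full framework.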
Personalization
involves knowing specifics about the user. These might be buying patterns, sales
patterns, or demographics. Based on these the user sees different screens, menus,
options etc. Software behind the scenes is able to learn, predict, adapt, and
optimize. Patterns are recognized and used extensively to present information or
suggest new resources.
Observations/Predictions
Metadata
repositories are likely to adopt parallels to retail’s UPC codes. Data will
have truly unique identifiers.
Meta
data must be collected in response to a business event. If people have to enter
it manually, it most likely will not be maintained. The imperative is to
decrease the number of duplicate instances. Store once, distribute many.
He
expects knowledge navigation and supporting software to resemble the neural net:
It recognizes patterns, learns from experience, adapts dynamically, and predicts
outcomes.
|
Conference Session |
Speaker |
|
Building the Scalable Data E-Frastructure |
Tim McBreen, Senior Principal and E-business Practice Leader,
Knightsbridge Solutions |
Summary
by Arnie Hook
The theme of the talk is
to make sure we are ‘building the enterprise infrastructure’. Tim describes
the high-performance data solution, which is robust, scalable and cost
effective. He explains why performance matters in relation to data volumes, quality of use,
and the influx of data.
Mr. McBreen says that
scalability rules the day; build it once; build it right, scale often. The
e-frastructure data engine includes:
Data acquisition processes
Data repository
Data mart creation processes
Tim describes a typical
solution encompassing data extraction, transformation, aggregation, and
balancing/controls & loading. The tool of the month club will not work to
manage the e-frastructure environment. Changing tools creates chaos for impact
analysis and applying new requirements.
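The steps of the typical solution Tim describes (extraction, transformation, aggregation, balancing/controls and loading) can be sketched as a tiny pipeline. Everything specific here is an assumption made for illustration: the sample rows, the function names and the choice of a control-total comparison as the balancing step.

```python
from collections import defaultdict
from decimal import Decimal


def extract():
    """Stand-in for reading a source system; the rows are hypothetical."""
    return [
        {"region": "west", "amount": "100.50"},
        {"region": "east", "amount": "20.00"},
        {"region": "west", "amount": "4.50"},
    ]


def transform(rows):
    """Standardize region codes and convert amounts to exact decimals."""
    return [
        {"region": row["region"].upper(), "amount": Decimal(row["amount"])}
        for row in rows
    ]


def aggregate(rows):
    """Summarize amounts by region."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def balance(rows, totals):
    """Control check: the aggregated totals must equal the sum of the detail rows."""
    if sum(row["amount"] for row in rows) != sum(totals.values()):
        raise ValueError("control totals do not balance")


def load(totals, target):
    """Stand-in for writing to the data mart; here the target is just a dict."""
    target.update(totals)


warehouse = {}
detail = transform(extract())
summary = aggregate(detail)
balance(detail, summary)  # refuse to load if the numbers do not reconcile
load(summary, warehouse)
print(warehouse)
```

Running the control check before the load means a reconciliation failure stops the run instead of putting unbalanced figures in front of users.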
Business path – end user focus
Data path – design, development focus
Infrastructure path – design, configuration, implementation focus
Mr. McBreen stresses the
importance of data management solutions that allow companies to enable the
‘power enterprise’. It was a compelling message for the data practitioner needing an
approach to deliver a data warehouse application.
|
Conference Session |
Speaker |
|
Data Administration on A Shoestring
|
Becky Kirkpatrick, Data
Architect
Union Pacific Technologies |
Summary by Margaret O’Hara
Becky began by describing how
Union Pacific IM has adjusted to downsizing of staff, mergers and lack of
funding to provide an online metadata repository that was quickly put together,
is very functional and continues to grow in use and in capability.
The results of her and 1.5
full time employees is an enabled website using the Zachman
The problem, as Kirkpatrick
detailed, is that end users and IT project managers want to know immediately
where they can get state and country data, customer number information and
values from a railroad equipment master.
None of these important
questions could easily be answered by any means that were currently available.
Kirkpatrick’s management asked that her group of 2.5 people put something
together within a 3 to 4 month period.
The team ‘piggy backed’
on existing files (manual and automated) and utilized existing technologies
(e.g. Access, Excel), coupled those with web development, and produced a product
that was successfully implemented and accepted.
Kirkpatrick then walked the
audience through a demonstration of the online site that they developed.
|
Conference Session |
Speaker |
|
Mapping
UML to the Zachman Framework |
Neal Fishman, Enterprise Architect , Equifax |
Summary by Linda Kresl
This presentation focused
on why it is important to map the UML to the Zachman framework. The number one
reason is to model systems, from concept to executable artifact, using
object-oriented techniques.
•
To address the issue of scale inherent in complex, mission- critical systems.
•
To create a modeling language usable by both humans and machines.
•
Use the UML for...
–
Visualizing
–
Specifying
–
Constructing
–
Documenting
Neal explained that the UML
consists of nine models and the Object Constraint Language (OCL). The Zachman
Framework for Enterprise Architecture identifies at least thirty models. This
presentation reviewed each UML model type (use case, class, object, component,
deployment, activity, statechart, collaboration, sequence, and OCL), and reviewed
which of the Zachman cells they map to. The presentation then explored the use
of stereotypes to augment the native UML models in creating more model types to
demonstrate how to complete the mapping to the framework.
·
Identifying the UML models
·
The Zachman Cells
·
Using stereotypes
·
Mapping the models
|
Conference Session |
Speaker |
|
Managing Customer Information for CRM |
Danette McGilvray, Customer
Information Quality Program Manager, Agilent Technologies |
Summary
by Anne Marie Smith
Danette
asked and answered the questions “Can you claim to know your customer if the
information in your systems about that customer is wrong?” and “How can you
manage the relationship with your customer if the basic process for acquiring,
maintaining and using that information are not working?”
Danette’s
presentation focused on these points:
Danette
presented a case study in CRM, using Agilent’s customers as the basis of a CRM
initiative. The case examined a customer information system (one of many at this
client) to determine the level of effectiveness for CRM. The system was
developed with the framework mentioned above, and was used as a method to
re-engineer the customer approach at this client. She concluded with some
examples of uses of information in a CRM pilot system and a list of challenges
to CRM.
Questions,
taken throughout the presentation, were around the framework’s development,
uses of information in CRM, and explorations of reasons for CRM failure. Danette’s
presentation showed the role data plays in a CRM effort, and the need for
quality data in CRM.
|
Conference Session |
Speaker |
|
|
Bob Carasik, Systems Architect, Wells Fargo Bank |
Summary
by Ron Klein
Bob has worked
with Data Dictionary for two decades and is still doing the same. He worked with
Case, Repository, XML and messaging standards. He currently co-ordinates the
meta data initiative for the enterprise portal. Wells Fargo is a leader in
Internet banking, eBay and account aggregation for customers. You will hear more
and more that all your financial services can be bundled in one site. Clients do
one log-in and have access to many financial services.
Doing
the Portal = Reality hits people in the face. We need to know about meta data. It is quick to explain why it is
important, but getting it into the project plan is another story.
Mapping is a
hot spot to help find inconsistencies.
The goal is to
make the transitions easier. Has to be bottom up and has to be distributed.
You find meta
data automatically through the Web.
Messages used
to be hidden deep inside systems. Now they surface as meta data and turn out to
be as important as database schema meta data.
End users
don’t need to understand the meta data, but they do run quick searches.
Meta data
Paradigms: The Ideal
The old idea
of centralizing everything did not work. The enterprise-wide model is also a
challenge, although I can see its advantage: DHL, for example, expanded the
package ID field by one character and has been dealing with that issue for many
years.
Bob strongly
suggests the federated approach to meta data. This accepts that semantics differ
across the enterprise but provides a common format for meta data.
He also proposes a
lightweight meta data strategy built step by step. It recognizes that high
quality meta data frequently costs too much to provide, relative to its benefits
to users. You
don’t need a full repository to begin with.
Resources can
come when you show how much conversation will be needed.
Lower your
standards! You’ll feel good when you deliver.
Q: When you
gather meta data, are you building processes to keep it fresh?
A: Share tags
and retrieve them on a project-by-project basis – it is a negotiation process.
It is not necessarily repeatable. That is the just in time concept here.
Q: What tools?
A: ORACLE,
sometimes a Web resource. Repository technology meets some needs but not all.
Have a standard for XML development.
Q: How do you
handle change management?
A: Specific
for each project.
Bob gave a very good piece-by-piece presentation on a current issue that
most of us are dealing with, namely developing portals and how to piggyback on
them to develop and gather meta data. Relevant notes on the project: we have a
two-level taxonomy. Allies: our technical library, our internal web team, PMO. Lots
of goodwill for meta data creation. XML-Schema as a documentation tool: Document
Language to a Data Language. Modified Dublin Core for defining Meta Tags.
|
Conference Session |
Speaker |
|
Architectures for Marrying Online Applications with
Information Repositories |
Faisal Shah, Chief Technology Officer, Knightsbridge Solutions |
Summary
by Carey Clark
Knightsbridge
Solutions works with Fortune 500 companies to marry data from transaction
processing systems with data from data warehouses. Conceptually this is trivial.
One might suppose you just create a front end to display data from both
environments.
In
practice, however, doing this is very difficult. The difficulty arises from the
fundamentally different “quality of service” requirements of each
environment. Explaining and resolving this difficulty is the subject of the
presentation.
So why
marry these two data sources in the first place?
A
bank wants to provide customers with analytical information about their
investments, i.e., reporting tools to compare their portfolio with industry
indices (e.g. Dow Jones, Standard & Poor’s). Customers want to compare their
performance with newsletter or broker recommendations. This is a serious
competitive advantage if the bank’s competitors don’t offer it.
An
Internet hosting service wants to provide their advertisers with real time
statistics: The number of visits to a site, the kind of visitors they were, what
web pages were visited etc. This must be done on an hourly basis; two days late
is unacceptable. In both cases historical computed data is displayed along with
real time transaction data.
Quality
Service Levels
A
typical online transaction processing (OLTP) system requires 24x7 uptime,
sub-second response times, and 100% accuracy. It must be fault tolerant, even
against disasters. It has very narrow maintenance windows, usually minutes.
A
data warehouse system is typically up 12 hours a day, 6 days a week (12x6). The
off hours are needed for batch processing. It doesn’t need transaction
monitors, the data can be a day or two old, response time can be several
minutes, and if something goes wrong, you’re not out of business.
And
herein lies the problem: Transaction systems can’t tolerate warehouse service
levels and warehouse systems can’t realistically achieve transaction service
levels. Users who see both data types at the same time assume the same service
level.
What to
do
It
is really important to perform careful ROI analysis. Ambitious requirements can be
outrageously expensive, even for large companies. The best solution is a set of
trade-offs.
The
first reality is that one cannot put both transaction and warehouse data on the
same box. Just not feasible.
The
second reality is you can’t divvy up warehouse data into mini warehouses.
Doing so forces you to decide in advance what queries will be asked. If you
divide by time, then queries by geography become a performance problem; if you
divide by type, then queries by time become a performance problem.
In
almost all cases, the online transaction system must remain fast and reliable so
every effort is made to impact it as little as possible. One successful
technique is to precalculate and preaggregate a small standard set of queries
and load that data on the transaction system. This precludes complex and ad hoc
queries, but still provides immense value.
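The precalculation technique can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not something shown in the session: the fact rows, the single "monthly total per account" query, and the names `preaggregate_monthly_totals` and `SummaryStore` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical warehouse fact rows: (account_id, month, amount).
WAREHOUSE_FACTS = [
    ("A1", "2001-01", 100.0),
    ("A1", "2001-01", 50.0),
    ("A1", "2001-02", 25.0),
    ("B7", "2001-01", 10.0),
]


def preaggregate_monthly_totals(facts):
    """Batch step on the warehouse side: one total per (account, month)."""
    totals = defaultdict(float)
    for account_id, month, amount in facts:
        totals[(account_id, month)] += amount
    return dict(totals)


class SummaryStore:
    """Stand-in for a small summary table loaded onto the transaction system."""

    def __init__(self):
        self._totals = {}

    def load(self, totals):
        # Replace the summary data with the latest batch result.
        self._totals = dict(totals)

    def monthly_total(self, account_id, month):
        # Only the precomputed query is answerable here; there is no ad hoc path.
        return self._totals.get((account_id, month))


store = SummaryStore()
store.load(preaggregate_monthly_totals(WAREHOUSE_FACTS))
```

The online side only ever performs a key lookup, which is why the transaction system is barely affected, and also why complex or ad hoc queries are ruled out.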
Another
technique is to limit analytical data to the time dimension. Data can sometimes
be distributed across multiple database instances. A thousand trade-offs are
made, for example, weighing refresh times, or whether to put analytical data in
with the transaction data or in a separate instance. Backup and restore can be
handled, but the data currency is different for the two environments. How
different is part of the trade-off analysis.
In
a few situations the amount of data was so large that putting data in a
relational database was cost prohibitive. In these cases the client resorted to
creating massive flat files.
A
favorite technique is to toggle between multiple database servers, or multiple
database instances, or multiple database tables. This technique doubles or
triples refresh times and hardware costs, but it works.
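A rough Python sketch of the toggle idea, reduced to two in-memory copies of one table. The class name `ToggledTables` and its methods are hypothetical; a real deployment would switch a synonym, view, or connection target rather than a list index.

```python
class ToggledTables:
    """Keeps two copies of the data; readers see one while the other is refreshed."""

    def __init__(self):
        self._copies = [[], []]  # two full copies, hence the doubled storage cost
        self._active = 0         # index of the copy that readers currently see

    def read(self):
        # Readers always get the active copy, even while a refresh is running.
        return self._copies[self._active]

    def refresh(self, new_rows):
        standby = 1 - self._active
        # The slow load goes into the standby copy, out of the readers' way.
        self._copies[standby] = list(new_rows)
        # The switch itself is a single cheap assignment.
        self._active = standby


tables = ToggledTables()
tables.refresh(["row-1", "row-2"])
previous_view = tables.read()
tables.refresh(["row-3"])
```

Readers that picked up `previous_view` before the second refresh still hold a consistent copy, which is the point of paying for the extra storage.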
|
Conference Session |
Speaker |
|
Getting the Rest of Your Organization Ready for XML |
Korki Whitaker, Progressive
Insurance |
Summary
by Arnie Hook
Ms. Whitaker presents
advice concerning the introduction of XML discovery activities, the employee
indoctrination, and the needs for training. The talk is based on experience
gained at the Progressive Insurance Co. where she is responsible for
data-related teaching and development.
Korki’s advantage in promoting XML
is that Progressive’s management understands the benefits, makes a
place for new technologies, and allocates resources for their
promotion.
The Data Engineering group
led the activity, surveying management and checking software acquisition
areas for XML tool requests. They also examined projects for interfaces to
internal and external systems, and proactively got involved in the detailed
requirements of projects.
Up-front analysis included
documenting and highlighting accomplishments on current projects. This was
critical for showing management the progress and successes with XML. XML projects
require a high-level management sponsor in order to form a project and a
development group (internal XML forum) in alignment with business requirements.
The XML forum is an established common interest group with regular sessions and
subcommittees.
Ms. Whitaker’s group
developed a database of XML best practices and a training program to extend
knowledge throughout the organization. The group objectives are equivalent to
learning a new programming language. A core group promoting and mentoring XML as
a new technology benefit automated project progress.
Korki’s experience and
presentation equip any new XML advocate with material for introducing XML as a
new technology.
|
Conference Session |
Speaker |
|
Data Modeling Contentious Issues |
Karen Lopez, Principal Consultant, InfoAdvisors, Inc. |
Summary
by Margaret O’Hara
This presentation was a
highly interactive look at issues debated by subscribers to e-mail, web, and
newsgroup based discussion groups. The format was
simple: Karen presented an issue, discussed it briefly and then asked the
audience to vote on it. Then, there was a brief discussion as to why the answers
were what they were.
Voting was performed in an
interesting manner. People in the audience were given Post-It notes and they
could stick them to one of 5 boards, depending on how strongly they felt about
an issue. Not everyone had the notes (the group was too large for that) but
there were enough people with voting ability to make the results interesting.
Among the issues discussed
were: whether conceptual data models were used (the results were evenly
distributed on a 1-5 scale), whether a good data model needed classwords (results
were definitely skewed toward 1 for always), and whether surrogate or natural
keys were preferred (results were dead center at 3). The surrogate key issue
generated a great deal of discussion; it was obvious that the audience felt very
strongly about this issue.
The session pointed out two
significant things. First, even a group of data administrators and data managers
cannot agree on everything. Secondly, the voting method used was quite effective
for taking a quick pulse of a large group and can be used in other similar
situations.
|
Conference Session |
Speaker |
|
Data Stewardship-Fact or Fiction? |
Diana C. Young, President, Applied Information Strategies |
Summary by Linda Kresl
Diana began
this presentation by explaining that the term data stewardship has been tossed
around for the past decade. Stewardship is … the recognition that all individual
components of an enterprise serve to ensure the future of the total
organization.
The main stewardship objective is to provide high quality information that
meets the needs of the business:
· Getting the Right Information
· To the Right People
· At the Right Time
Ultimately, the successful path to stewardship is based upon an
understanding of the principles of information stewardship, aligning those
principles with the business in a value-added approach, and planning and
achieving both short and long term improvements in the business. This
presentation addressed:
· The four factions of stewardship: strategic, tactical, operational, and
technical – what are they and how they align with business processes and
functions
· Stewardship roles, responsibilities, and the “A” word –
accountability
· The four pillars of successful implementation: policy, program, practice,
and promotion.
Lastly, in our work, we are all information producers just as we are all
information consumers. As we work within our companies, we all must strive to
see that all functions of the company succeed. Therefore, it is in our best
interest for us all to champion the practices of sound information management.
Because, in reality, we are all Information Stewards.
|
Conference Session |
Speaker |
|
How to Make Your Business Processes Smarter |
Ronald G. Ross, Principal, Business Rule Solutions |
Summary
by Anne Marie Smith
Ron Ross, renowned consulting expert in business rules and the editor of
the “Data to Knowledge” newsletter, is one of the information management
field’s primary speakers and practitioners, and was the winner of the DAMA
International 1995 Individual Achievement Award.
This presentation introduced the concept of a business rule approach to
business process “education”, outlined three steps in the process of
applying the business rules approach and offered some suggestions for
implementing this approach in various organizations. Ron used cases from his
consulting to demonstrate the fundamental verities to business rules as the
point of control for the business.
Points in Ron’s presentation included:
· The inevitability of business rules: Business rules can assist
organizations in doing things “faster, cheaper, better” and can teach
organizations about their company’s activities and culture. Organizational
trends from the 1960’s (automation) through the 1990’s (warehousing and
networking) concentrated on technology. The trend in the 2000’s is knowledge
management, a non-technical trend that needs business rules to succeed. Business
rules have “guidance spheres” that include policies, rules, guidelines,
instruction points and suggestions. These enable an organization to be effective
and to achieve the goals and objectives that the company has expressed. Guidance
spheres are fragmented, compartmentalized and not well-understood. Business
rules can make those guidance spheres cohesive, cross-functional and
understandable.
· The need to trace the rules to their sources: Rules cannot be valuable
unless their sources have been identified, interpreted and captured to retain
corporate memory. Sources can be valid or invalid for each rule, and the reason
for a rule’s origination at a particular source must be captured as part of
the rule’s meta data, and rules should be managed as part of meta data in all
instances. “Outsourcing the business rules to the business” means that an
organization gives control of the business rules to the business users.
· Finding a single source for each rule: One rule should only have ONE source for consistency and traceability. Multiple sources for a rule cause confusion among users and can severely affect the value of the data instances from that rule. Viewing business rules as meta data would include the versioning of each rule, controlling the vocabulary in rules management, etc.
· The crucial role of data and meta data in business rules: Data
administrators should be given the responsibility to manage the meta data of the
organization’s business rules, and should have a business rules repository as
part of the DA toolkit. Data administrators should be trained in business rule
management, just as they are trained in data management. Business rule
management includes the development of a rule vocabulary, logic for rule
construction, and techniques for the CRUD process of rule maintenance.
· The needs of knowledge workers in the 21st century: Unlike
previous centuries, knowledge workers need “knowledge” to perform their
tasks properly. “Knowledge” of the data used to create information requires
an understanding of the logic behind the creation of a data instance. This logic
is a “business rule”. According to Ron, “the idea of not using a rule
engine to run your rules management will seem as strange in 5 years as not using
a DBMS to manage your data”.
Ron concluded with a discussion of some “first steps” in the business
rule development process and what he sees as the future of business rules.
Closing the communication gaps in an organization can be accomplished by
adopting a business rule approach to organizational knowledge.
|
Conference Session |
Speaker |
|
Meta-Architecture and Enterprise Meta Data Management |
E. Manning Butterworth, Senior Manager of Data Architecture, Reynolds & Reynolds |
Summary
by Ron Klein
Dr. Butterworth is
experienced in Business Delivery Architecture. He was a principal engineer for
the DOD (Air Force) and holds a Ph.D. in astrophysics. Opening his presentation, he
stated that we need to give more
emphasis to business issues rather than acting only on behalf of IT.
The business goals of meta
data management are to accelerate growth and drive down the costs of doing
business. Butterworth described his meta-architecture process, moving from
the big picture to examples of how individual artifacts were defined in the
architecture. As his enterprise modeling tool he chose ARIS, which
originated in Germany and is popular in Europe. It
is an OO DB, methodology neutral or almost. Butterworth says the model will
never be “finished”, but will be useful along the way, providing incremental
value.
This part of the
presentation gave rise to many questions:
Q: What are the 17 model
types?
A: It is the metamodel.
Q: Would there be a
constraint for each dimension?
A: Yes.
Q: Synthesizing the
process, how did you come to 17 model types?
A: It is based on what is
required to capture at this point in time.
Q: How do you deal with an
object type that is part of more than one model?
A: You can assign objects
from one model to another.
Q: What about Change
Management?
A: This is work in
progress.
Q: Where do you put an
instance?
A: Maybe in a document and
associate to the object.
Q: Do you get your meta
data manually or are there any automated extractions to populate?
A: Both.
Q: Will this data model
support architecture over time?
A: It will require change
management.
The audience
perception was that Dr. Butterworth’s work has an enormous potential across
industries due to its generic construct.
|
Conference Session |
Speaker |
|
|
Mike Scofield, Director
of Data Quality, Experian |
Summary
by Carey Clark
Experian
is one of the three big credit reporting agencies (formerly TRW). They store
data on 260 million consumers and a billion credit card accounts. The
architecture to do this is very complex and largely proprietary.
Data
quality is their major concern and ensuring it is a massive undertaking. Massive
because they update a billion records every month and because the data comes from 5000
separate outside sources! The data from these sources have different file
formats, different schedules, different quality levels, and varying data
conventions. At the same time accuracy and reliability are of utmost importance.
Determining
data quality is difficult because you can’t physically validate it and
sampling is often undoable. So you’re left with two kinds of tests:
Conformance with absolute rules and reasonability testing.
Absolute
rules are like:
Data
that passes the tests is allowed to be loaded into the database of record. Data
that doesn’t is diverted to a suspense database. Suspense data is then
examined by humans and the data’s source organization is often called.
Obviously
the incentive is to automate as much as possible and to reduce the amount of
human intervention. To this end they make it easy to study the problem, easy to
decide what to do about it, and easy to execute the decision. All of this must
be done without slowing down the timely loading of good data.
Lessons
learned:
Test
data as soon as it’s available and certify it before loading into the master
database. If you scrub the data, test it again before loading it.
Have
several architectural components: A data rules database so that rules don’t
have to be hard coded. A historical context database to remember exactly what
was received last time. A metrics database to store quantifiable measures of
quality. And a feedback mechanism to the supplier of the data.
It
is not uncommon for them to know more about the quality of supplier data than
the supplier does. And they sometimes provide quality assessments for a fee.
Without violating confidentiality agreements they can sometimes let suppliers
know that their data is not as good as their competitors’.
Experian
believes that knowledge and maintenance of data quality is what differentiates
them from other credit bureaus. They are a learning corporation and their
sophistication resembles an expert system.
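The testing flow described in this summary (rules kept as data, absolute rules plus reasonability tests, failures diverted to suspense) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Experian's architecture is proprietary, the two rules and the 50% volume tolerance are invented for the example, and the names `ABSOLUTE_RULES`, `reasonable_volume` and `route_records` are hypothetical.

```python
# Hypothetical absolute rules, held as data so they need not be hard coded
# into the load programs. Each entry is (rule name, check function).
ABSOLUTE_RULES = [
    ("account_id_present", lambda rec: bool(rec.get("account_id"))),
    ("balance_not_negative",
     lambda rec: rec.get("balance") is not None and rec["balance"] >= 0),
]


def reasonable_volume(record_count, last_count, tolerance=0.5):
    """Reasonability test: is this delivery's size close to the previous one?"""
    if last_count == 0:
        return True  # no history yet, so nothing to compare against
    return abs(record_count - last_count) / last_count <= tolerance


def route_records(records):
    """Split records into those fit to load and those diverted to suspense."""
    loadable, suspense = [], []
    for rec in records:
        failed = [name for name, check in ABSOLUTE_RULES if not check(rec)]
        if failed:
            # Keep the failed rule names so a person can study the problem quickly.
            suspense.append((rec, failed))
        else:
            loadable.append(rec)
    return loadable, suspense


incoming = [
    {"account_id": "X1", "balance": 10.0},
    {"account_id": "", "balance": 5.0},
    {"account_id": "X2", "balance": -1.0},
]
loadable, suspense = route_records(incoming)
```

The counts of loadable and suspended records per supplier are the kind of measure that could feed a metrics database and the feedback to the supplier.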
|
Conference Session |
Speaker |
|
Same Old Work, New Dilemma: A New Approach to Data Design for Interactive Web Portal Applications |
Ho-Chun Ho, Director of
Information Systems, PointandQuote.com |
Summary
by Arnie Hook
Ho-Chun defines E-business
as ‘the transformation of key business processes through the use of Internet
technologies’. He presents an E-business maturity model that streamlines
applications for integrated enterprise architecture. The stages of maturity are:
· Web presence
· Interactive
· Transaction
· Inter-enterprise integration
The new dilemma is that
customers are demanding services through technology such as the WWW.
Organization developers want to use the best and fastest automation to deliver
applications and technical services. Executives want to use information and
technology to gain market share and increase competitive channels and to produce
new products.
Ho-Chun presents basic
e-business terms and describes application considerations. The content prepares
an organization for planning and development of the Web enterprise. He provides
a vocabulary of Web and development terms.
The attendees are better
equipped to consider the stateless design and data performance issues in
addressing Web applications.
Thursday, March 8th, 2001
|
Conference Session |
Speaker |
|
Enterprise
Information Architecture: "Starter Kit" Models |
Jane Carbone, Director of Information Architecture Services, DATANOMICS, Inc. |
Summary by Linda Kresl
This presentation reflects
Jane’s experience in building and using enterprise architecture frameworks
to create architecture models and related data models. The presentation provides
a “drill-down” for the “models” dimension of the “data” component of
the “Starter Kit” architecture framework. It introduces a standardized
approach to building conceptual information architecture models. It describes
the link from architecture model to conceptual data model. It includes examples
and guidelines for construction of Current State (AS-IS) and Target State
(TO-BE) information architecture models. She focused on the following:
· How to construct a standardized information architecture model
· How to decompose standardized information architecture models
· How to create a conceptual data model from a Level n information
architecture model.
Jane stressed the
importance of defining the difference between information architectures and data
architectures. An information architecture is the graphical representation of
the business view of data functions, technology, people and processes and the
relationships and/or communications between them. A data architecture is the
graphical representation of the business view of data functions, processes and
the relationships and/or communications between them. She advises always building the
enterprise architecture models before attempting to create the enterprise data
models. A data model is not the same as an architecture model, but the two have
a strong relationship.
|
Conference Session |
Speaker |
|
|
Michael Gorman, President, Whitemarsh Information Systems |
There is currently no
summary for this session. Please
check the Wilshire Conferences web site shortly for the summary.
|
Conference Session |
Speaker |
|
Conceptual Data Modeling in an Object-Oriented Process |
Scot
Becker, Principal Consultant, InConcept, Inc. |
Summary by Anne Marie Smith
Scot
Becker, principal consultant at InConcept and the editor of the Object Modeling
newsletter, presented the Object Role Modeling technique. Scot defined the
various components of an object model, how components are defined, briefly
defined iterations (“mini waterfalls”), and gave an overview of the analysis
phase of an object modeling session (“what is the problem”, NOT “how to
solve the problem”). The verification and deployment phases are iterative, as
is appropriate to such development, since components are iterative in
themselves.
Typical
uses of data modeling in an OO process are just for mapping objects to
tables/columns, so Scot proposed that there is no need for modeling in OO. Data
issues are usually ignored in OO analysis, and in the design phase data issues
are mostly concerned with denormalization. Scot is a fan of use cases for
process requirements, but not for data. The main difference between class
diagrams and ER diagrams is the abstraction inherent in class diagrams, which does
not translate well to data analysis efforts.
Scot
offered some benefits to the OO approach to requirements and analysis:
Scot
offered some weaknesses of class diagrams, including inflexibility, lack of
clear cardinality, and the need to actively focus on entities and attributes
separately. Components are not perfect since too many important things rely on
getting all the components right the first time (lack of iterative
capabilities). Use case formats are too variable, and can be “overloaded”
with information that may not be essential. According to Scot, the lack of
formal OO modeling techniques can contribute to errors and misunderstandings in
analysis and design.
Scot
believes that the way to overcome the weaknesses of OO modeling is to develop a
new method, called ORM (Object Role Modeling). This is not really a new
idea, and is centered around the concept that entities and attributes are simply
objects playing one or more roles. ORM uses a natural language (English or other
languages) and uses “data use cases” as well as “process use cases”. It
has a very rich set of constraints and it is set-based (mathematically valid).
Scot
gave an overview of the Conceptual Schema Design Procedure as a way to explain
the ORM approach, using the various OO components as a comparison to ORM. He
suggested that OO and ORM can be merged and work together to achieve a full
method for analysis, design, construction and implementation. He concluded with
a review of a case study that enabled him to employ the OO/ORM combination
approach.
|
Conference Session |
Speaker |
|
A Success Story: Enterprise Customer Data Standard Definition/Implementation |
Barbara Peterson, Enterprise Data Standards Program Manager, Agilent Technologies |
Summary
by Ron Klein
Barbara works with
Customer Data Standards and Web Applications, and she has 21 years of experience at HP.
She is not an IT professional. Her father asked, “What do you do, Barbie?” After
answering with an explanation, she invited him to attend the presentation. He
said, “Wonderful, how about I come in for lunch?”
Information Quality can be no better than the supporting data standards. How do you get standards that really make sense to the business, not just to IT?
Applications must use the same naming convention. Agilent had a Quality person but not a Program Manager for Standards, and the need arose when the mappings among customers began to surface nightmares. This was the reason Barbara was hired for this job.
Q: How do you get the
business to understand the logical model?
A: By including them on the
creation through the Data Standards Council involvement.
Q: How do you get buy-in?
A: There are 30
representatives in the Council. We maintain a 2 hours weekly conference call.
Q: Are there any metrics to
qualify the benefits?
A: Manpower hours saved and
quality improvement could be the ones, but we don’t currently have any metrics
on this.
Q: After you define your
standards and buy ORACLE ERP system, do you change the business or do you adjust
the software?
A: We concluded that we can
live without customizations and are able to implement standard ORACLE ERP.
Q: What attitude do you
have on implementing a package? In my case we changed the business? Maybe you
have other answers?
A: There are exceptions
because there is flexibility within the product, and this is where the standards
come in very handy to resolve things that the vendor does not answer on your
behalf.
Q: Is the Council defining
information, or taking information and disseminating it throughout the organization?
A: Both.
Q: Is this council a single
level or multiple levels?
A: It includes high-level
decision makers and also experts. Get the right people in the room. The
decision-makers have a vote.
Q: Where is IT in the
council?
A: Just below the CIO
level.
Data Standards Process
Roles
Don’t go to the Council
without a pre-defined draft that is available on-line on the Web. The feedback
that comes in is what you go and discuss in the Council. It is also common
knowledge.
Q: Your position is in IT?
A: I am in Relationship
Marketing.
Q: Does your Council deal
with physical data?
A: Yes with ORACLE ERP
vanilla implementation, but it is more on high level conceptual.
Q: How many people are
there in your organization and what skills do you need?
A: The skill set required
is facilitation skills and the ability to work with people who disagree. We will create
groups by subject.
Q: Is it focused on
Customer only?
A: It is huge around
customers.
Q: How do you attract
people?
A: More people want to have
this role because they have decision power. Information is shared. People on
Council come from all over the business.
Q: Who pays for your
function?
A: At HP it was funded by the
business; now it is internal.
Q: Before talking physical?
A: If you have it going it
has to address both.
Q: How do you manage
multiple instances in multiple countries, and follow through over time? What are your
guidelines?
A: At HP there was a Steward
and nobody was policing. Now it is the Council. They are the ones who enforce
and bring changes back to the Council.
|
Conference Session |
Speaker |
|
|
Warren Selkow, Consultant |
Summary
by Arnie Hook
Why repository? It is where
the business knowledge is. It is all about information technology.
Decision-makers need to know about the business/technology environment. If you
are a big enough company, your mistakes become standards. The repository
contains the organization’s motivation.
Mr. Selkow describes the
environment and what it means to business leaders. He outlines the knowledge
paradox for future business trends and technology applications. Today there is
unprecedented business pressure and rapidly changing technology changing the
nature of business. Eighty percent of what we know is obsolete. Technology costs
are decreasing and information costs are increasing – a dramatic impact to
business practices.
Experts say whatever will
sell their ideas and products. Warren opines that the IT professional must learn to
communicate in terms of the executive’s motives: ‘time to respond’ and ‘time to
market’.
Mr. Selkow speaks of the
environment as applied to standards and frameworks, showing how technology is
configured and how its historical views are all captured in an enterprise repository.
The repository is about deployment and the creation of measurable results. We
must educate management, emphasizing ‘the need to know and share’ and how it
relates to customer benefits.
|
Conference Session |
Speaker |
|
|
Patricia Klauer, Senior Consultant, & Robert Cooley, Senior Consultant, Apex Solutions, Inc. |
Summary
by Margaret O’Hara
The topics covered in this
presentation were: clickstream data, web content and structure charts,
E-commerce data and the analysis of such data. Clickstream data is very
difficult to analyze – everyone wants it but thus far no one has provided an
effective solution. We want to analyze data in terms of user actions and
behavior (typically referred to as clickstream).
For even the most basic
sites, however, a single click may generate 3-4 pages of data that require lots
of Extract, Transform, Load (ETL) operations to get anything meaningful from them.
And, after all that analysis, we sometimes get nothing meaningful in the log.
Some specific problems are: IP addresses may not be unique. We can use cookies,
but they raise privacy concerns and are not useful when more than one person
uses the machine – they are browser specific, not user specific.
The real problem arises
when one wants to track E-commerce events – those site visits associated with
a single user. There are two main areas: product-oriented and visit-oriented.
For product-oriented events, we need to see things such as click-throughs (when
a user clicks on an item to obtain more information), shopping cart changes, and
purchases. For visit-oriented events, we need to see how the session begins and
ends and what degree of personalization results from the visit.
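To make the visit-oriented view concrete, here is a minimal Python sketch that groups raw hits into visits. It is not the speakers' method: the 30-minute inactivity cutoff, the use of a cookie value as the visitor key, and the function name `split_into_visits` are all assumptions, and the cookie caveat noted above (browser specific, not user specific) still applies.

```python
from collections import defaultdict

SESSION_TIMEOUT_SECONDS = 30 * 60  # assumed inactivity cutoff between visits


def split_into_visits(hits):
    """Group hits into visits.

    hits: iterable of (visitor_key, timestamp_seconds, url) tuples, where
    visitor_key is whatever identifier is available (here, a cookie value).
    Returns a list of (visitor_key, [(timestamp, url), ...]) tuples.
    """
    by_visitor = defaultdict(list)
    for visitor_key, timestamp, url in hits:
        by_visitor[visitor_key].append((timestamp, url))

    visits = []
    for visitor_key, entries in by_visitor.items():
        entries.sort()  # order each visitor's hits by time
        current = [entries[0]]
        for previous, entry in zip(entries, entries[1:]):
            if entry[0] - previous[0] > SESSION_TIMEOUT_SECONDS:
                # A long silence ends the visit; the next hit begins a new one.
                visits.append((visitor_key, current))
                current = []
            current.append(entry)
        visits.append((visitor_key, current))
    return visits


sample_hits = [
    ("cookie-1", 0, "/home"),
    ("cookie-1", 60, "/item/5"),
    ("cookie-1", 4000, "/home"),
    ("cookie-2", 10, "/cart"),
]
visits = split_into_visits(sample_hits)
```

The first and last entries of each visit give the session's begin and end points; product-oriented events such as click-throughs would be derived from the URLs inside each visit.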
After collecting all the
above data, the real analysis can begin – we can actually determine what the
users are doing. Some caveats from the Speakers: you can’t just buy tools and
attach them to a web site and expect to collect data (although vendors often
claim you can!). A web usage Analysis Methodology would include: careful
planning, an iterative build process with prototyping, and determining measures
to perform analysis and assess results.
|
Conference Session |
Speaker |
|
Action Business Rules – Getting to Yes |
Judi Reeder, Consultant |
Summary by Linda Kresl
This presentation focused
on action business rules, which test conditions and, upon finding them true, start a
transaction or event. When capturing action business rules, one of the key tasks
is to discover and document the conditions, and their values, that impact the
decision. This presentation discussed examples of decision areas where action
business rules were developed using facilitated sessions.
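As a small illustration of the idea, an action business rule can be written as a set of condition values plus an action that fires when all of them hold. The Python sketch below is hypothetical throughout: the order-shipping rule, the fact names, and the functions `make_rule` and `fire_rules` are invented for the example and do not come from the session.

```python
def make_rule(name, conditions, action):
    """An action rule: the condition values that must all hold, plus the action."""
    return {"name": name, "conditions": conditions, "action": action}


def fire_rules(rules, facts):
    """Test each rule's conditions against the facts; run the action when all are true."""
    results = []
    for rule in rules:
        if all(facts.get(key) == value for key, value in rule["conditions"].items()):
            results.append(rule["action"](facts))
    return results


# Hypothetical rule: ship an order once it is received and credit is approved.
RULES = [
    make_rule(
        "ship_when_credit_ok",
        {"status": "received", "credit_ok": True},
        lambda facts: "ship order %d" % facts["order_id"],
    ),
]

fired = fire_rules(RULES, {"status": "received", "credit_ok": True, "order_id": 42})
```

Writing the conditions down as explicit name and value pairs mirrors the capture task described above: the documented conditions and their values are exactly what the rule tests.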
Judi suggests the following
steps be completed.