|
Problem: A VP demands a consolidated customer
sales report across product lines from different divisions. As you dig
into the request, it becomes clear that “sale” is defined differently across
divisions. Some define it as gross, some as net before taxes, and some as net
after taxes. Worse yet, there are codes everywhere. Some share the same code
name but have different value sets and different meanings.
This tutorial introduces the key topics and scenarios involved in creating
interoperable data environments. It also identifies the problems that
commonly occur, such as complexity and latency.
The tutorial defines the levels of data interoperability to be achieved.
It also presents the overall content and construction of a metadata repository
environment, a critical component of a successful strategy.
The tutorial details the various scenarios that must occur to achieve
a data interoperability environment including enterprise architectures,
information systems plans, data model engineering, and then both reverse
and forward engineering.
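The opening problem can be made concrete with a small sketch. Everything in it is hypothetical: the division names, the three "basis" labels, the assumed relationship between gross, net-before-tax and net-after-tax amounts, and the status code values are all invented for illustration and are not taken from the tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaleRecord:
    division: str
    amount_cents: int    # amount as the division reports it
    basis: str           # "gross", "net_pre_tax", or "net_post_tax"
    discount_cents: int  # discounts applied to this sale
    tax_cents: int       # tax charged on this sale
    status_code: str     # division-local code value


def to_net_pre_tax(rec: SaleRecord) -> int:
    """Restate a reported amount on one agreed basis: net before taxes.

    Assumed definitions (illustrative only):
      gross        = amount before discounts and taxes
      net_pre_tax  = gross - discounts
      net_post_tax = net_pre_tax + tax
    """
    if rec.basis == "gross":
        return rec.amount_cents - rec.discount_cents
    if rec.basis == "net_pre_tax":
        return rec.amount_cents
    if rec.basis == "net_post_tax":
        return rec.amount_cents - rec.tax_cents
    raise ValueError(f"unknown basis: {rec.basis!r}")


# Same code name, different value sets: the meaning depends on the division.
CANONICAL_STATUS = {
    ("retail", "A"): "ACTIVE",
    ("retail", "C"): "CLOSED",
    ("wholesale", "A"): "AWAITING_PAYMENT",
    ("wholesale", "C"): "CANCELLED",
}


def canonical_status(division: str, code: str) -> str:
    """Translate a division-local status code into the shared vocabulary."""
    try:
        return CANONICAL_STATUS[(division, code)]
    except KeyError:
        raise ValueError(f"unmapped code {code!r} for {division!r}") from None


def consolidated_total(records) -> int:
    """Sum sales only after every record is restated on the same basis."""
    return sum(to_net_pre_tax(r) for r in records)


RECORDS = [
    SaleRecord("retail", 10000, "gross", 1000, 720, "A"),
    SaleRecord("wholesale", 9000, "net_pre_tax", 1000, 720, "A"),
    SaleRecord("retail", 9720, "net_post_tax", 1000, 720, "C"),
]
```

The three sample records describe the same underlying sale reported three ways; summing the raw amounts would mix bases, while `consolidated_total(RECORDS)` adds them only after restating each one. Likewise, `canonical_status` makes explicit that code `"A"` does not mean the same thing in both divisions.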
In today's distributed, web-based development environment,
the biggest frustration for developers is having to "hard-couple" their
applications to specific database structures, particularly normalized base
tables. The key to creating flexible, reusable data structures that can
support web services and application objects is to abstract, or decouple,
the functionality of the database from its underlying structure. The end
goal is to create a rule-based (or "policy-based") data abstraction layer
that can be easily changed and used by multiple application objects and
web services. The techniques that will be presented include:
- Views
- Data abstraction layers
- Data access objects
- Data integration services
- Triggers
- Fundamental stored procedures
- Complex data types
- User-defined functions
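Two of the listed techniques, views and data access objects, can be sketched together as follows. This is a minimal illustration using Python's built-in `sqlite3` module with an in-memory database; the table, view, and class names are hypothetical and are not taken from the session.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized base tables: the structure applications should not depend on.
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        total_cents INTEGER NOT NULL
    );

    -- The view is the contract that applications see. The base tables can
    -- be restructured as long as the view keeps returning these columns.
    CREATE VIEW customer_sales AS
        SELECT c.id   AS customer_id,
               c.name AS customer_name,
               COALESCE(SUM(o.total_cents), 0) AS total_cents
        FROM customer c
        LEFT JOIN customer_order o ON o.customer_id = c.id
        GROUP BY c.id, c.name;

    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO customer_order VALUES (1, 1, 1500), (2, 1, 2500);
""")


class CustomerSalesDAO:
    """Data access object: callers never write SQL or name base tables."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection

    def total_for(self, customer_id: int) -> int:
        row = self._conn.execute(
            "SELECT total_cents FROM customer_sales WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        if row is None:
            raise KeyError(customer_id)
        return row[0]


dao = CustomerSalesDAO(conn)
```

Application code calls `dao.total_for(1)` and depends only on the view's column names, so a change to the underlying tables requires updating the view definition rather than every caller.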
Both Business Intelligence and Service-Oriented Architecture (SOA) solutions require a data integration architecture.
This session takes attendees through the development of a best-practices data integration architecture that supports both the delivery of Business Intelligence and the incorporation of an SOA foundation.
Learn about the many facets of data integration architectures including:
- History of data integration architectures
- Key role of metadata
- Iterative 11-step development process
- Including Data Governance & Stewardship
- Distributed vs. Centralized Models
- Architectural considerations
- Team composition and resourcing
Real-world examples and an open question-and-answer period will allow attendees to learn about the development of data integration architectures while receiving practical guidance and advice.
Managing health care information looms as one of
the most important issues of the next decades. Scores of organizations have
been gathering data on the state of Americans’ health, and the effort will
accelerate as the baby boomers age and require more and more accurate tracking
of their health and treatment status.
The United States Health Information Knowledgebase (USHIK) is one response
to making sense of the plethora of diverse health care datasets. As a metadata
registry for health care information, USHIK contains and links to the data
elements and information models of Standards Development Organizations (SDOs)
and other health care organizations, making it easier for public
and private organizations to harmonize the information formats of health care
standards. USHIK implements a metadata registry methodology based on ISO/IEC
11179, Information technology – Metadata Registries. It is sponsored by
the Agency for Healthcare Research and Quality (AHRQ) and has been guided by the
American National Standards Institute’s Health Informatics Standards Board.
With over twelve thousand data elements and related items, USHIK supports
data sharing with cross-system and cross-organization descriptions of common
units of health data. Since 2004, USHIK has been used to register selected
Consolidated Health Informatics (CHI) standards under sponsorship of the
Federal Health Architecture’s CHI Council. Most recently, the Biosurveillance
Technical Committee of the Healthcare Information Technology Standards Panel
(HITSP) utilized USHIK to perform comparisons among selected standards to
document and support their decision-making process.
Among the capabilities of USHIK are:
- Describing data using common characteristics. Promoting development
of good data names and descriptions helps users of shared data reach
a common understanding of a unit of data's meaning, representation,
and identification. This helps ensure the quality of shared information.
- Providing multiple ways to locate data descriptions. Providing
both standard and custom 'drill-down' methods lets users approach
data descriptions from different points of view and narrow their focus
to the data definitions they want to retrieve, so the number of data
definitions does not overwhelm them.
- Allowing Web access to provide easy access and promote use of
standards. Good data descriptions become standards. When these standards
are re-used, interoperability between systems becomes easier and more
efficient, and data quality improves.
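The first two capabilities above (describing data with common characteristics and locating descriptions in more than one way) can be illustrated with a toy registry. This is only a sketch: the field names are loosely inspired by the meaning, representation, and identification characteristics mentioned above, and they are neither USHIK's actual schema nor the ISO/IEC 11179 metamodel. The sample data elements and organization names are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    identifier: str      # identification: unique within the registry
    name: str
    definition: str      # meaning
    representation: str  # e.g. a data type or format
    steward: str         # hypothetical: the organization that owns it


class MetadataRegistry:
    def __init__(self) -> None:
        self._elements: dict[str, DataElement] = {}

    def register(self, element: DataElement) -> None:
        if element.identifier in self._elements:
            raise ValueError(f"duplicate identifier: {element.identifier}")
        self._elements[element.identifier] = element

    def get(self, identifier: str) -> DataElement:
        return self._elements[identifier]

    def search(self, term: str) -> list[DataElement]:
        """One way in: case-insensitive keyword search."""
        needle = term.lower()
        return [
            e for e in self._elements.values()
            if needle in e.name.lower() or needle in e.definition.lower()
        ]

    def by_steward(self, steward: str) -> list[DataElement]:
        """Another way in: drill down by owning organization."""
        return [e for e in self._elements.values() if e.steward == steward]


registry = MetadataRegistry()
registry.register(DataElement(
    "DE-001", "Patient Birth Date",
    "The date on which the patient was born.", "date (YYYY-MM-DD)", "Org A"))
registry.register(DataElement(
    "DE-002", "Date of Birth",
    "Calendar date of the person's birth.", "date (MMDDYYYY)", "Org B"))
registry.register(DataElement(
    "DE-003", "Diagnosis Code",
    "Coded value identifying a diagnosis.", "string", "Org A"))
```

Searching for `"birth"` surfaces two elements from different organizations that describe the same concept with different representations, which is the kind of side-by-side comparison that makes harmonization possible.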
Within the two main paradigms of integration, data
integration and application integration, there exists a “data divide.” Integration
Competency Centers (ICCs) were established to try to bridge this gap, increasing
integration consistency and productivity by coordinating integration across
the enterprise and loosely coupling the two paradigms with data dictionaries,
metadata management, and best practices. ICCs have had a positive effect on
integration consistency and productivity, but they suffer from a lack of
appropriate end-to-end, role-related tool support, which limits their influence
and effectiveness.
What they really need are tools and processes that naturally bridge the
“divide” without requiring a large upfront investment or disrupting
existing work processes. A reusable, pervasive, executable transformation
specification mechanism is the only way to bridge the gap between an ICC
and the implementations in the field.
In this session, Itemfield CTO Peter Cousins will cover specification-driven
data transformation, an excellent solution for bridging the data divide.
In his experience working with some of the largest companies in financial
services and telecommunications, Peter has been most successful when leveraging
the tool that is most comfortable and familiar to both business analysts and data
modelers – Excel spreadsheets.
From this presentation, audience members will learn:
- Why ICCs cannot bridge the data divide
- Importance of specification-driven transformation for structured,
semi-structured and unstructured data
- Definition of well-defined spreadsheet templates and tools
- Benefits of using Excel as a mapping tool
- Anecdotal customer evidence that supports the use of this tool
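The idea of an executable, spreadsheet-style mapping specification can be sketched in a few lines. This is not the presenter's product or template format: the column names, field names, and transform names below are hypothetical, and CSV text stands in for an Excel sheet so that the example needs only the standard library.

```python
import csv
import io
from decimal import Decimal

# Hypothetical mapping specification, one rule per row, as an analyst
# might maintain it in a spreadsheet.
SPEC_CSV = """source_field,target_field,transform
cust_nm,customer_name,strip
cust_ctry,country_code,upper
amt,amount_cents,dollars_to_cents
"""

# Named transforms that the specification is allowed to reference.
TRANSFORMS = {
    "strip": lambda s: s.strip(),
    "upper": lambda s: s.strip().upper(),
    "dollars_to_cents": lambda s: int(Decimal(s.strip()) * 100),
}


def load_spec(text: str) -> list[dict]:
    """Parse the specification rows and reject unknown transform names."""
    rules = list(csv.DictReader(io.StringIO(text)))
    for rule in rules:
        if rule["transform"] not in TRANSFORMS:
            raise ValueError(f"unknown transform: {rule['transform']!r}")
    return rules


def apply_spec(rules: list[dict], record: dict) -> dict:
    """Execute the specification against one source record."""
    return {
        rule["target_field"]:
            TRANSFORMS[rule["transform"]](record[rule["source_field"]])
        for rule in rules
    }


rules = load_spec(SPEC_CSV)
result = apply_spec(
    rules, {"cust_nm": "  Acme Corp ", "cust_ctry": "us", "amt": "19.99"}
)
```

Because the mapping lives in data rather than in code, a business analyst can change a rule by editing a row, and the same specification is both the documentation and the thing that runs.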
Data Integration (DI) is a hot topic, with hundreds
of vendors in the space. A great deal has been written, most of which addresses
the tools and technology that facilitate physical integration. Tools are
great, but they are still just tools. To ensure a successful DI initiative, a
framework composed of a strategy, standards, designs, and governance needs
to be included. The presentation covers:
- Understanding of DI and the nature of data (relational, states, types).
- DI Framework
- Structure for integration (standards, guidelines, processes, policies,
DI "rules", integration patterns).
- Integration decisions made by the business through data stewardship
and DI issue resolution (security, compliance, quality standards).
- DI Design/Plan (business design, source data research/analysis,
target integrated design, transformation rules/mapping).
- Strategy and governance for ongoing maintenance (change management,
data quality program).
This presentation is meant to raise awareness of the importance of a DI
framework to the success of a data integration initiative.
Do you need to integrate data from multiple
structured and unstructured sources? Are you concerned about duplicate
data and which records are accurate? These are just some of the challenges
that IT professionals and project managers who work on systems integration
initiatives face on a daily basis. When integrating data from multiple systems,
sometimes containing both structured and unstructured data, two critical components
of Data Cleansing emerge: Entity Resolution and Entity Extraction.
Entity Resolution is a form of Data Cleansing, better known as the
“de-duplication” of data or, more accurately, the process of identifying and
linking records that could be the same entity. Entity Resolution
is generally performed on data formatted in fixed fields and residing
in a structured format.
Entity Extraction is a form of Data Cleansing used during Data Integration
that focuses specifically on unstructured data. Sometimes referred to as “Text
Mining” or “Information Extraction,” Entity Extraction is the process by
which unstructured data in files such as Word documents, email, and PDF files
can be searched and given meaning from the body of text.
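As a minimal sketch of Entity Extraction, the snippet below pulls two kinds of entities out of free text with regular expressions. This is a deliberately simple, pattern-based stand-in; real extraction tools typically rely on much richer linguistic techniques, and the patterns and entity types chosen here are assumptions made for illustration.

```python
import re

# Simplified patterns: they cover common cases, not every valid form.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ISO_DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_entities(text: str) -> dict[str, list[str]]:
    """Return the e-mail addresses and ISO-style dates found in the text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "dates": ISO_DATE_RE.findall(text),
    }


sample = (
    "Contact jane.doe@example.com by 2024-03-15. "
    "Copy the summary to bob@example.org."
)
entities = extract_entities(sample)
```

The result is structured data (`entities["emails"]`, `entities["dates"]`) that can be loaded into fixed fields and then passed to downstream steps such as Entity Resolution.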
- Entity Resolution & Entity Extraction defined
- Data Integration Pillars
- Entity Resolution
- Standardization
- Matching
- Survivorship
- Entity Extraction
- Business Need
- Benefits of Entity Resolution & Entity Extraction
- Entity Resolution Case Study (FDIC CAS)
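The three Entity Resolution steps named in the outline above (standardization, matching, and survivorship) can be sketched as a tiny pipeline. The rules below are assumptions chosen for illustration: names are compared with `difflib.SequenceMatcher` at an arbitrary 0.85 threshold, records must share a five-digit ZIP code to match, and the most recently updated record survives. The record layout and sample data are hypothetical and unrelated to the case study.

```python
import re
from difflib import SequenceMatcher


def standardize(record: dict) -> dict:
    """Standardization: lower-case names, drop punctuation, trim ZIP+4."""
    name = re.sub(r"[^a-z0-9 ]", "", record["name"].lower())
    name = re.sub(r"\s+", " ", name).strip()
    return {**record, "name": name, "zip": record["zip"].strip()[:5]}


def is_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Matching: same ZIP and sufficiently similar standardized names."""
    if a["zip"] != b["zip"]:
        return False
    return SequenceMatcher(None, a["name"], b["name"]).ratio() >= threshold


def survivor(cluster: list[dict]) -> dict:
    """Survivorship: keep the most recently updated record.

    Dates are ISO-8601 strings in one format, so they sort correctly as text.
    """
    return max(cluster, key=lambda r: r["updated"])


def resolve(records: list[dict]) -> list[dict]:
    """Group records that match a cluster's first member, then pick survivors."""
    clusters: list[list[dict]] = []
    for rec in map(standardize, records):
        for cluster in clusters:
            if is_match(cluster[0], rec):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return [survivor(c) for c in clusters]


SAMPLE = [
    {"name": "Acme Corp.", "zip": "20001", "updated": "2023-01-10"},
    {"name": "ACME  Corp", "zip": "20001-1234", "updated": "2024-06-01"},
    {"name": "Globex Inc", "zip": "10001", "updated": "2022-05-05"},
]
resolved = resolve(SAMPLE)
```

Here the two Acme records collapse into one because they become identical after standardization, and the later update survives. Comparing each record only against the first member of a cluster keeps the sketch short; production matching usually uses blocking and more careful clustering.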
A key to any data integration effort is understanding
the personal, cultural, and political environment and consciously employing
proven principles to enable success. The most successful data integration
efforts usually share one thing in common: they developed and implemented
effective strategies that provided fertile cultural and political ground
for success. This seminar will share techniques to help understand key principles
and empower participants in meeting objectives and moving toward effective
integration. It will provide case histories of successful and unsuccessful
efforts, illustrating why some integration programs succeed and others fail.
The instructor will share principles and actions that can either help or
hinder integration efforts. The instructor will also share various insights,
showing pitfalls where data integration efforts can go, and have gone, off
course. There will be interactive exercises where participants can practice
handling difficult issues that commonly arise by applying principles leading
to effective integration. Participants in this session will gain:
- An understanding of the political and cultural factors that
successful data integration teams need to be aware of and prepared for.
- Tools and principles to enable data integration, such as keys
to developing trust, gaining funding, delivering value, facilitating
a common vision, managing conflict, developing effective integration procedures,
building on others’ work, and gaining buy-in.
- Real-life stories of how culture and politics either killed
or fostered effective integration programs.
- Case examples and exercises allowing participants to practice
overcoming challenges that integration professionals often face.
- Education and experience in preparing for cultural and political
challenges, as well as in applying powerful techniques for developing
more effective environments, in a non-threatening classroom setting.
|