DAMA + Wilshire Meta-Data Conference - Data Strategy Track
Geographic information systems (GIS) are a growing category of data assets. A GIS is a collection of computer hardware, software, and geographic data for capturing, managing, analyzing, and displaying all forms of geographically referenced information. The data include traditional maps (albeit digitized) and components of maps (“layers”), as well as a variety of image types. All of these can be integrated to create unique displays and presentations and to support powerful analysis of geographically dependent issues.

GIS databases pose unique challenges in data management because of their complexity and unique structural requirements. While a number of standard data models have been created for specific topics (such as census, water features, transportation, and utilities), not all of the available data conforms to these models. And the very nature of geography on an imperfect and not quite spherical globe introduces new challenges to achieving accuracy in positional data.
  • Why is geospatial data different?
  • Why is geospatial data so difficult to integrate?
  • What are some of the data quality issues?
  • How is the GIS culture different from that of commercial data exchange?
  • How do projections, datums, and scale influence the suitability of GIS data?
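As a rough illustration of why positional data is sensitive to the underlying Earth model, the sketch below computes great-circle distance with the haversine formula. It assumes a perfectly spherical Earth with a mean radius of 6371.0088 km, which is exactly the kind of simplification a datum refines; the function name and sample coordinates are illustrative only.

```python
import math

# Assumption: spherical Earth with this mean radius. A real datum uses
# an ellipsoid, so results here are only approximate.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude covers far less ground at 60 degrees north
# than at the equator, so raw lat/lon cannot be treated as a flat grid.
one_degree_at_equator = haversine_km(0, 0, 0, 1)
one_degree_at_60_north = haversine_km(60, 0, 60, 1)
```

The two results differ by roughly a factor of two, which is one reason a projection suited to one region or scale can be unsuitable for another.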


How do a federal executive’s field visits lead to a revelation, then to a Presidential Executive Order, and finally to a decade-old federal program that spans the country and influences the world? The journey had ups and downs and encountered curves and rocky roads. Obstructionists and “nay-sayers” were plentiful. How does a small program overcome the inertia to covet, protect, and internalize data? This presentation shares the progression of the program from duckling to fledgling stage and focuses on the program’s activities that resulted in successes in the federal, state, local, academic, and Tribal sectors.
  •  The why and what of the Executive Order
  •  “Corporate” metadata support
  •  “Don’t duck Metadata” - the program
  •  Stumbling blocks
  •  Program successes
  •  Performance strategies

The convergence of media and information technology is creating a need for new concepts in information storage. The merged-media future will still need the data we have seen for the last fifty years, but it adds new requirements: storing and retrieving new types of data, both structured and unstructured; handling dramatic increases in the volume of that data; coping with uncontrolled data entry that produces vast amounts of data of unknown quality; and mixing data sources in a single information access. The new data storage system must understand the information framework that it manages and the context of the source and destination of information.

This presentation will examine the application of concepts of human understanding to electronic data storage systems. Both the theoretical aspects of computer understanding and practical applications of this new technology will be presented along with case studies of some initial efforts in this area.


    Wednesday, March 7th
    5:30 pm - 6:30 pm

  •  Digital Fingerprints - Bonnie O'Neil, PPC; John Murphy, John Alton Murphy, Inc.
Computers used in the commission of a crime often leave a trail of evidence that is trapped in the data stores, archives, and mailboxes of the corporation. The challenge lies not so much in storing that evidence as in recovering it. Litigation has repeatedly pointed to the need not only for “storing” data but also for effective and timely mechanisms for its recovery. Delivering the “facts” without the context provided by metadata can result in bad decision making and errant judgments. Metadata, when properly used, can significantly reduce the resources, time, and costs required for the e-discovery process.

This presentation includes:
  •  Several strategies related to forensic electronic discovery (e-discovery)
  •  Standards in metadata for documents, e-mail, presentations, web content as well as relational data and blogging content
  •  Specific examples of metamodel content as well as real world cases
  •  The construction of a governance and compliance metamodel to contain specific regulatory and potential litigation defense information
  •  Metadata for data retention: used to track data lifecycle and to enforce specific retention periods and authorities associated with its management and disposal
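The retention bullet above can be sketched as a small metadata record. This is only an illustration under stated assumptions: the class name, its fields, the `legal_hold` flag, and the sample values are all hypothetical and are not taken from the presenters' metamodel.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class RetentionRecord:
    """Hypothetical retention metadata attached to one data asset."""
    asset: str
    created: date
    retention_days: int       # required retention period
    authority: str            # rule or regulation that mandates the period
    legal_hold: bool = False  # assumption: a hold blocks disposal

    def disposal_date(self):
        """First day on which the asset may be disposed of."""
        return self.created + timedelta(days=self.retention_days)

    def may_dispose(self, today):
        """True once the period has elapsed and no hold is in place."""
        return not self.legal_hold and today >= self.disposal_date()


record = RetentionRecord(
    asset="mail-archive-2001",          # made-up asset name
    created=date(2001, 1, 1),
    retention_days=365,
    authority="example retention policy",
)
```

Keeping the authority next to the period lets a reviewer see why an asset was kept or destroyed, which is the kind of context the abstract argues metadata should supply.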

Social network analysis (SNA) is the process of analyzing the relationships, interactions, and flows between discrete objects (such as people, locations, or things). Much flows through social networks: ideas, fashions, market tips, viruses, drugs, and money. Many applications rely on social networks, ranging from the mundane (movie fans), to the business-oriented (creating a word-of-mouth marketing campaign), to the criminal (assessing illegal drug activity), to the extremely critical (tracking terrorist activity).

Although social networks are represented in a straightforward manner (objects are represented as nodes, and a connection between any two nodes is represented as a link between them), the simplicity of the connection between two objects hides potentially complex associations. The interesting part lies in understanding what those links really mean, and how to measure and assess affinity between different objects. Consequently, the result of the analysis depends on the metadata, taxonomies, and semantics associated with the defined nodes and relationships.

Attendees will learn:
  • Nodes and Edges: The Model for Social Networks
  • Basic analysis: degrees, betweenness, closeness, and other measures
  • SNA metadata: how data about your connections drives the analysis
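The node-and-link model and two of the basic measures listed above can be sketched in a few lines. This is a minimal illustration, not the presenter's method: the sample links, the relationship kinds, and the function names are invented, and betweenness is left out to keep the example short.

```python
from collections import deque

# Hypothetical links: (node, node, kind of relationship).
LINKS = [
    ("Ann", "Bob", "coworker"),
    ("Ann", "Cat", "coworker"),
    ("Ann", "Dan", "friend"),
    ("Dan", "Eve", "friend"),
]


def build_graph(links, kinds=None):
    """Undirected adjacency sets, optionally limited to some link kinds."""
    graph = {}
    for a, b, kind in links:
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if kinds is None or kind in kinds:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree(graph):
    """Number of direct connections per node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def closeness(graph, start):
    """Reachable nodes divided by the sum of shortest-path lengths (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


everyone = build_graph(LINKS)
friends_only = build_graph(LINKS, kinds={"friend"})
```

Comparing `degree(everyone)` with `degree(friends_only)` shows how the metadata on a link changes the answer: Ann is the best-connected node overall but has only one connection once the analysis is restricted to friendships.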


Do you need to integrate data from multiple structured and unstructured sources? Are you concerned about duplicate data and which records are accurate? These are just some of the challenges that IT professionals and project managers face daily on systems integration initiatives. When integrating data from multiple systems, some containing structured data and some unstructured, two critical components of Data Cleansing emerge: Entity Resolution and Entity Extraction.

Entity Resolution is a form of Data Cleansing better known as the “de-duplication” of data or, more accurately, the process of identifying and linking records that could represent the same entity. Entity Resolution is generally performed on structured data formatted in fixed fields.
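The session outline names standardization, matching, and survivorship as parts of Entity Resolution. A minimal sketch of those three steps follows; the records, the 0.9 similarity threshold, the greedy grouping, and the "newest non-empty value wins" rule are all assumptions chosen for illustration, not the approach used in the case study.

```python
import re
from difflib import SequenceMatcher

# Made-up records from two hypothetical source systems.
RECORDS = [
    {"id": "A1", "name": "ACME Corp.", "city": "Boston", "updated": "2006-01-10"},
    {"id": "B7", "name": "Acme Corp", "city": "", "updated": "2006-05-02"},
    {"id": "C3", "name": "Zenith Bank", "city": "Denver", "updated": "2005-11-30"},
]


def standardize(name):
    """Standardization: lowercase, drop punctuation, collapse spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def is_match(a, b, threshold=0.9):
    """Matching: compare standardized names by string similarity."""
    ratio = SequenceMatcher(
        None, standardize(a["name"]), standardize(b["name"])
    ).ratio()
    return ratio >= threshold


def cluster(records):
    """Greedy grouping: join the first group whose first record matches."""
    groups = []
    for rec in records:
        for group in groups:
            if is_match(rec, group[0]):
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups


def survive(group):
    """Survivorship: per field, keep the newest non-empty value."""
    newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
    return {
        field: next((r[field] for r in newest_first if r[field]), "")
        for field in newest_first[0]
    }


golden = [survive(group) for group in cluster(RECORDS)]
```

Here the two ACME records collapse into one surviving record that takes its name from the newer source and its city from the older one, because the newer record left that field empty.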

Entity Extraction is a form of Data Cleansing used during Data Integration that focuses specifically on unstructured data. Sometimes referred to as “Text Mining” or “Information Extraction”, Entity Extraction is the process by which unstructured data in files such as Word documents, e-mail, and PDF files is searched and meaning is derived from the body of text.
  •  Entity Resolution & Entity Extraction defined
  •  Data Integration Pillars
  •  Entity Resolution
  •  Standardization
  •  Matching
  •  Survivorship
  •  Entity Extraction
  •  Business Need
  •  Benefits of Entity Resolution & Entity Extraction
  •  Entity Resolution Case Study (FDIC CAS)
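Entity Extraction, as described in this session, can be illustrated at its simplest with pattern matching over free text. The sketch below is a deliberately naive stand-in: the three regular expressions, the labels, and the sample sentence are assumptions, and production text mining generally relies on far richer techniques.

```python
import re

# Hypothetical patterns for three kinds of entity.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "money": re.compile(r"\$\d[\d,]*(?:\.\d{2})?"),
}


def extract_entities(text):
    """Return every match for each pattern, keyed by entity label."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}


sample = (
    "On 2007-03-07 jane.doe@example.com approved a transfer "
    "of $12,500.00 to the vendor."
)
entities = extract_entities(sample)
```

The extracted values turn a sentence into fielded data, which could then feed an Entity Resolution step like any other structured source.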

