Geographic information systems are a growing category of data assets. A GIS is a collection of computer hardware, software, and geographic data for capturing, managing, analyzing, and displaying all forms of geographically referenced information. The data include traditional maps (albeit digitized) and components of maps (“layers”) as well as a variety of image types. All of these can be integrated to create unique displays and presentations, and to support powerful analysis of issues that are geographically dependent.
GIS databases pose unique challenges in data management because of their complexity and unique structural requirements. While a number of standard data models have been created for specific topics (such as census, water features, transportation, and utilities), not all of the available data conforms to these models. And the very nature of geography on an imperfect and not quite spherical globe introduces new challenges to achieving accuracy in positional data.
- Why is geospatial data different?
- Why is geospatial data so difficult to integrate?
- What are some of the data quality issues?
- How is the GIS culture different from commercial data exchange?
- How do projections, datums, and scale influence the suitability of GIS data?
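As a rough illustration of why a not quite spherical globe affects positional accuracy, the sketch below computes the same great-circle distance with the haversine formula on two different spheres: one using the WGS 84 equatorial radius and one using a commonly quoted mean Earth radius. The function name `haversine_m` and the sample coordinates are illustrative assumptions, not part of the session material, and a real GIS would use an ellipsoidal model rather than either sphere.

```python
import math

WGS84_EQUATORIAL_RADIUS_M = 6_378_137.0  # WGS 84 semi-major axis
MEAN_EARTH_RADIUS_M = 6_371_008.8        # commonly quoted mean radius


def haversine_m(lat1, lon1, lat2, lon2, radius_m):
    """Great-circle distance in metres on a sphere of the given radius."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


# Hypothetical sample points (approximate city locations, for illustration only)
point_a = (38.9, -77.0)
point_b = (39.7, -105.0)

d_equatorial = haversine_m(*point_a, *point_b, WGS84_EQUATORIAL_RADIUS_M)
d_mean = haversine_m(*point_a, *point_b, MEAN_EARTH_RADIUS_M)
print(f"difference between the two models: {d_equatorial - d_mean:.0f} m")
```

The two results differ by roughly a tenth of a percent purely because of the assumed shape of the Earth, before any datum or projection differences are considered.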
How do a federal executive’s field visits
produce a revelation, lead to a Presidential Executive Order, and then to a
decade-old federal program that spans the country and influences the world?
The journey had ups and downs and encountered curves and rocky roads. Obstructionists
and “nay-sayers” were plentiful. How does a small program overcome the institutional
inertia to covet, protect, and internalize data? This presentation shares the progression
of the program from duckling to fledgling stage and focuses on the program’s
activities that resulted in successes in the federal, state, local, academic,
and Tribal sectors.
- The why and the what of the Executive Order
- “Corporate” metadata support
- “Don’t duck Metadata” - the program
- Stumbling blocks
- Program successes
- Performance strategies
The convergence of media and information technology
is creating a need for new concepts in information storage. The merged
media future will still need the data we have seen for the last fifty years,
but it adds new requirements: storing and retrieving new types of
data, both structured and unstructured; dramatic increases in the volume
of that data; uncontrolled data entry resulting in
vast amounts of data of unknown quality; and a new ability to mix data sources
in a single information access. The new data storage system must understand
the information framework that it manages and the context of the source
and destination of information.
This presentation will examine the application of concepts of human understanding
to electronic data storage systems. Both the theoretical aspects of computer
understanding and practical applications of this new technology will be
presented along with case studies of some initial efforts in this area.
Wednesday, March 7th 5:30 pm - 6:30 pm
- Digital Fingerprints
- Bonnie O'Neil, PPC; John Murphy, John Alton Murphy, Inc.
Computers used in the commission of a crime often
leave a trail of evidence that is trapped in the data stores, archives
and mail boxes of the corporation. The challenge is not so much in its
storage as in its recovery. Litigation has repeatedly pointed to the need
for not only “storing” data but for having effective and timely mechanisms
for its recovery. The delivery of the “facts” without the context provided
by metadata can result in bad decision making and errant judgments. Metadata,
when properly used, can significantly reduce the resources, time and costs
required for the e-discovery process.
This presentation includes:
- Several strategies related to forensic electronic discovery
(e-discovery)
- Standards in metadata for documents, e-mail, presentations,
web content as well as relational data and blogging content
- Specific examples of metamodel content as well as real world
cases
- The construction of a governance and compliance metamodel to
contain specific regulatory and potential litigation defense information
- Metadata for data retention: used to track data lifecycle and
to enforce specific retention periods and authorities associated with
its management and disposal
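The retention item above can be sketched as a small metadata record. This is a minimal illustration under stated assumptions: the class `RetentionRecord`, its field names, the asset identifiers, and the policy label are all hypothetical and are not taken from the presentation or from any regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRecord:
    """Hypothetical retention metadata attached to one data asset."""
    asset_id: str
    created: date
    retention_days: int       # how long the asset must be kept
    authority: str            # policy or regulation that sets the period
    legal_hold: bool = False  # a litigation hold blocks normal disposal

    def disposal_date(self) -> date:
        """Earliest date on which the asset may be disposed of."""
        return self.created + timedelta(days=self.retention_days)

    def may_dispose(self, today: date) -> bool:
        """True only if the period has elapsed and no legal hold applies."""
        return not self.legal_hold and today >= self.disposal_date()


# Illustrative records; identifiers and policy label are made up
record = RetentionRecord("mail-0001", date(2000, 1, 1), 365 * 7, "hypothetical-policy-7y")
held = RetentionRecord("mail-0002", date(2000, 1, 1), 365 * 7, "hypothetical-policy-7y",
                       legal_hold=True)

print(record.disposal_date(), record.may_dispose(date(2010, 1, 1)))
print(held.disposal_date(), held.may_dispose(date(2010, 1, 1)))
```

Keeping the authority and the hold flag next to the retention period is one possible design; it lets a disposal job explain why an asset was kept or deleted.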
Social network analysis (SNA) is the process of analyzing the relationships, interactions, and flows between discrete objects (such as people, locations, or things). Much flows through social networks: ideas, fashions, market tips, viruses, drugs, and money. Many applications rely on social networks, ranging from the mundane (movie fans), to the business-oriented (creating a word-of-mouth marketing campaign), to the criminal (assessing illegal drug activity), to the extremely critical (tracking terrorist activity).
Although social networks are represented in a straightforward manner (objects are represented as nodes, and a connection between any two nodes is represented as a link between those two nodes), the simplicity of the connection between two objects hides potentially complex associations. The interesting part lies in understanding what those links really mean, and how to measure and assess affinity between different objects. Consequently, the result of the analysis depends on the metadata, taxonomies, and semantics associated with the defined nodes and relationships.
Attendees will learn:
- Nodes and Edges: The Model for Social Networks
- Basic analysis: degrees, betweenness, closeness, and other measures
- SNA metadata: how data about your connections drives the analysis
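The node-and-edge model and two of the basic measures listed above can be sketched in a few lines. This is only an illustration: the network, the names, and the helper functions are hypothetical, closeness is computed with a plain breadth-first search under the assumption that the graph is connected, and betweenness is left out for brevity.

```python
from collections import deque

# Hypothetical undirected network: each pair is one link between two nodes
edges = [("Ann", "Bob"), ("Bob", "Cat"), ("Cat", "Dan"),
         ("Bob", "Dan"), ("Dan", "Eve")]


def build_graph(edges):
    """Adjacency sets for an undirected graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph, node):
    """Number of direct connections of a node."""
    return len(graph[node])


def closeness(graph, node):
    """(n - 1) divided by the sum of shortest-path lengths from node."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    return (len(graph) - 1) / total if total else 0.0


graph = build_graph(edges)
for name in sorted(graph):
    print(name, degree(graph, name), round(closeness(graph, name), 2))
```

Both measures treat every link as identical, which is exactly the limitation the abstract points at: without metadata describing what a link means, a family tie and a single e-mail count the same.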
Do you need to integrate data from multiple
structured and unstructured sources? Are you concerned about duplicate
data and which records are accurate? These are just some of the challenges
that face IT professionals and project managers who work on systems integration
initiatives on a daily basis. When integrating data from multiple systems,
sometimes containing both structured and unstructured data, two critical components
of Data Cleansing emerge: Entity Resolution and Entity Extraction.
Entity Resolution is a form of Data Cleansing and is better known as the
“de-duplication” of data or, more accurately, the process of identifying and
linking records that could be the same entity. Entity Resolution
is generally performed on data that is formatted in fixed fields and resides
in a structured format.
Entity Extraction is a form of Data Cleansing used during Data Integration
that focuses specifically on unstructured data. Sometimes referred to as “Text
Mining” or “Information Extraction”, Entity Extraction is the process by
which unstructured data in files such as Word documents, e-mail, and PDF files
is searched and meaningful items are identified in the body of text.
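A very small sketch of the idea, assuming that simple regular expressions are good enough for illustration: it pulls e-mail addresses and ISO-style dates out of free text. The patterns, the function `extract_entities`, and the sample sentence are hypothetical, and production entity extraction typically relies on far more robust rules or language models than this.

```python
import re

# Deliberately simple patterns; real extraction needs much more robust rules
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_entities(text):
    """Return the e-mail addresses and ISO dates found in unstructured text."""
    return {"email": EMAIL.findall(text), "date": ISO_DATE.findall(text)}


sample = "Contact jane.doe@example.com before 2007-03-07 or write to ops@example.org."
entities = extract_entities(sample)
print(entities)
```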
- Entity Resolution & Entity Extraction defined
- Data Integration Pillars
- Entity Resolution
- Standardization
- Matching
- Survivorship
- Entity Extraction
- Business Need
- Benefits of Entity Resolution & Entity Extraction
- Entity Resolution Case Study (FDIC CAS)
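The three Entity Resolution steps named in the outline (standardization, matching, survivorship) can be sketched as follows. This is a toy illustration under stated assumptions, not the method used in the case study: the field names, the 0.85 similarity threshold, the "ZIP codes must agree" rule, and the "longer value wins" survivorship rule are all hypothetical choices, and the string similarity comes from Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def standardize(record):
    """Standardization: lower-case, trim, and collapse whitespace in each field."""
    return {k: " ".join(str(v).lower().split()) for k, v in record.items()}


def is_match(a, b, threshold=0.85):
    """Matching: same ZIP code and sufficiently similar names (assumed rule)."""
    similarity = SequenceMatcher(None, a["name"], b["name"]).ratio()
    return a["zip"] == b["zip"] and similarity >= threshold


def survive(a, b):
    """Survivorship: per field, keep the longer (assumed more complete) value."""
    return {k: max(a[k], b[k], key=len) for k in a}


# Hypothetical duplicate records from two source systems
r1 = standardize({"name": "Jon  Smith", "zip": "20001", "phone": ""})
r2 = standardize({"name": "John Smith ", "zip": "20001", "phone": "202-555-0100"})

merged = survive(r1, r2) if is_match(r1, r2) else None
print(merged)
```

Separating the three steps keeps each rule replaceable: a stricter matcher or a source-priority survivorship rule can be swapped in without touching the others.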