Data Discussions is a series of interviews with leading data management experts and practitioners,
presented by Wilshire Conferences.

June 19, 2003 - Contents

Data Management's Next Big Thing
Predictions and Debate on the Future of Data Management

During a conference panel session at the recent Wilshire Meta-Data Conference and DAMA International Symposium in Orlando, we invited four prominent individuals from the industry to discuss their views of the "Next Big Thing" in data management. The result was a wide-ranging discussion, influenced by technological, business, regulatory and management issues. The panelists spoke of managing larger and larger databases, dealing with security requirements, fast-changing storage media, and changing roles of data administration and modeling professionals.

But if there was a single dominant theme during the conversation, it was the emerging regulatory requirements that are starting to shape the data management requirements of all companies, large and small. These regulations require new privacy and security safeguards on customer data, long-term storage of historical records, stronger auditability, or all of the above. Quite apart from the technological and design challenges they pose, the new requirements are often ambiguous, and they create a potentially significant liability for senior executives. The pressure on data managers to come up with reliable solutions is only likely to increase in the next few years. The complete transcript of the panel discussion is below.

This edition of Data Discussions is sponsored by Avellino



Data Management’s Next Big Thing

A discussion of trends and futures
with four industry experts

Donald Soulsby
Director, Architecture Strategies
Computer Associates

Lisa Cash
President and CEO
Princeton Softech

Mike Jennings
HRe Technology, Data Warehouse & Architecture
Hewitt Associates

Karen Lopez
Principal Consultant
InfoAdvisors

Throughout history, fortunes have famously been made and lost in pursuit of “The Next Big Thing.”  Few would deny that the IT industry is shaped and driven in large part by our enthusiasm for the latest technology or trend.  We use the Next Big Thing moniker as a generic descriptor -- sometimes seriously, sometimes derisively – because by now we know to approach most technology predictions with a strong degree of caution.  And yet, despite the warning labels, we nonetheless enjoy the notion of the Next Big Thing because if it does exist, then we sure as heck want to be part of it.  Speculation about the impact of new technologies is part of the inherent excitement – and frustration – upon which the IT industry is built, so who are we to fight it?  Let’s just have some fun with it!

At the recent Wilshire Meta-Data Conference and DAMA International Symposium in Orlando, we invited four prominent individuals from the industry to offer up what they think is the Next Big Thing in data management.  Here’s what they said…

Tony Shaw (Moderator): Let’s introduce you to our panel this morning.  Lisa Cash is the CEO of Princeton Softech, a software firm that makes storage and archiving products for large data management problems.  Then we have Karen Lopez…Karen will be familiar to many of you as a consultant with InfoAdvisors, and as the moderator of a variety of discussion boards through her website.  Then Mike Jennings with Hewitt Associates.  He’s a specialist in meta data and data warehousing.  And Don Soulsby.  Don is, as always, the best-dressed guy at the conference.  Don works for Computer Associates.  He’s an avid theater and ballroom dance guy, so he’s going to tiptoe through this very adeptly no doubt.  He’s one of the folks from the vendor community who is a thought leader on issues like meta data and knowledge management. 

Before we start, I want to review the issues identified as “Next Big Things” from last year’s meeting (San Antonio, April 30, 2002): 

  • Karen Lopez is our only repeat panelist.  She made a good call last year in saying that “privacy” was one of the next big things.  There’s no doubt that privacy and security issues are really key today, and will continue to be so.  Even as individuals we understand this. 
  • Len Silverston talked about people.  I wish we could say that people had become a more important priority, but the business environment is working against this.
  • Bob Seiner talked about unstructured data.  I think he hit the nail on the head also.  Clearly, content management and unstructured data management is becoming more a part of the job description of the data community. 
  • And Graeme Simsion, in one of the more controversial declarations of last year’s panel, said that centralized data administration isn’t working.  I asked him to explain that statement later because many people took him the wrong way.  He took some time to clarify his position in a previous edition of Data Discussions.

Anyway, there was certainly some controversy on the panel at last year’s meeting, and we did have some fun.  So, we’re going to try to carry that spirit into this year’s discussion. 

Here we go.  Don has a few slides.  He’ll speak first.

Don Soulsby: Thank you very much.  Just a quick mention about the ballroom dancing thing.  My wife and I have been doing it for about 2 years.  The one thing I’ve learned is if you move your hands, they won’t look at your feet.  Part of the controversial thing I want to say about meta data is that we’ve got to stop worrying about our feet (which is how to do the meta data), and start to get the business people interested.  They like to see this.  They tend to go for that flashy moving object.  And we keep saying, well, we’ve got to work on our footwork.  So that’s kind of the controversial point I want to make, is that we’ve got to stop worrying about our feet, and start shaking our hands to get the attention of the business people. 

Which leads to my topic, which came out of the DAMA meeting in London last year.  It’s Mad Cow Disease, which is obviously not a light-hearted topic.  I saw that Marks & Spencer, a rather large department store in England, was running a series of ads in the newspapers.  They showed a big chunk of very lovely red meat, and a caption reading, “we can trace it so you can trust it.”  And that was the whole point.  We almost lost the European meat market as a result of the disease, and a number of countries wanted people to bring together software that would provide that traceability.  What we’re talking about here is the mad cow disease of computers.  And that’s frankly where we’re at today.  We have a lot of tainted data in our systems.  If you get a report, you don’t know what it means.  Why?  Because you don’t know if the data is any good.  What they did to save the European market was to create systems that could trace the bloodline from the marketplace to the slaughterhouse to the original farm to the original genus that the animal came from.

To look at the same parallels…if we don’t believe or trust our data, we need a similar type of thing, and that’s what meta data gives us.  Going forward, we can look at emerging government programs regarding the security and privacy of health care, the Patriot Act, and money laundering (by the way it’s not just banks – they’re going after jewelry, and cars next – anywhere you can do a large transaction for cash).  Do you remember what President Reagan said: “Trust, yes, but verify.”  And I think that’s what we’re looking to in terms of government compliance.  The government is saying, we trust you when you say things to us but we’d like you to verify it. 

And, of course, my personal favorite, having worked on a number of meta data strategy consulting projects, was the good old ship “S.S. Y2K”.  I worked with a number of clients where I was doing meta data strategy, and next door they were working on Y2K.  I wandered over one day and said, “you guys must have this thing called a legacy application inventory or something; in other words, you look for all the places where Y2K was… do you guys have that?”  They said, “yeah, we did, but we got rid of that.”  So, think to yourselves, when you did Y2K, did you keep that inventory?  Because that was the foundation of where you could go with your legacy map.

Well, the good news is that, though that boat has sailed, there’s a new boat in town called the “S.S. SSN”.  California’s already stated that by 2005 you can no longer use the social security number for medical record reporting, because social security numbers have been blamed for most of the issues related to identity theft.  So, I think that’s going to be the next Y2K: the Social Security Administration, which owns that code and is being blamed for a lot of the theft of identity, cracking down and saying, no, you can’t use that.  So think of your own companies here in America where you have SSN embedded in your banking systems, insurance systems, state government systems.  And they’re going to say that you can’t use it.  In fact, you have to prove to us that you no longer use it.  So, I think that’s a new boat for us going forward.  And, of course, all of this is based on the fact that they want you to prove that you know where everything is.  So, once again, that legacy map will become critical, not even so much to remedy the problem, but once again to prove and verify to the governmental organizations that you have it safe and secure.  That’s mine…

Tony Shaw:  Thanks Don.  Panelists, anybody want to challenge that one? 

Mike Jennings: In our outsourcing business at Hewitt, I’ve already seen a number of examples from our corporate clients, where they’ve asked us to change our system, originally based on social security number, and protect it with other kinds of system identifiers.  So, we’re seeing a lot of that happening already in our outsourcing business.

Tony Shaw:  A homegrown identifier, then?  Is that going to form the basis of a log-in identity for customers?  Are we going to have to remember a different log-in now for every occasion?

Mike Jennings: Probably.  I know the SSN, and other numbers like that, are probably easier for users to remember, and more prominent.  But we’re trying to get away from that, especially with all the concerns about privacy.
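
The substitution Mike describes, replacing the social security number with a system-generated identifier, might look something like the sketch below.  This is only an illustration in Python, not Hewitt’s implementation: the class name IdentifierVault, its method, the "P-" prefix and the placeholder SSN are all hypothetical, and a real system would keep the lookup table in an access-controlled, encrypted store.

```python
import secrets


class IdentifierVault:
    """Hypothetical lookup that hands out surrogate IDs in place of SSNs."""

    def __init__(self):
        # Assumption: in practice this mapping lives in a restricted,
        # encrypted store, not in application memory.
        self._ssn_to_id = {}

    def surrogate_for(self, ssn: str) -> str:
        """Return the same random surrogate ID every time for a given SSN."""
        normalized = ssn.replace("-", "").strip()
        if normalized not in self._ssn_to_id:
            # Random token: nothing about the SSN can be derived from it.
            self._ssn_to_id[normalized] = "P-" + secrets.token_hex(8)
        return self._ssn_to_id[normalized]


vault = IdentifierVault()
member_id = vault.surrogate_for("000-00-0000")  # made-up placeholder value

# Downstream systems store only the surrogate, never the SSN itself.
claim_record = {"member_id": member_id, "amount": 125.00}
```

Because the surrogate is random rather than derived from the SSN, it cannot be reversed without access to the lookup, which is also why users cannot be expected to remember it the way they remember an SSN.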

Tony Shaw:  Anyone have any questions for Don on that one?

Audience Question: Do you see the government going back to the old scheme of assigning a sequential ID to the military?

Don Soulsby:  When you sleep next to an elephant, you know when they’re going to roll.  In terms of the military structures, it’s whoever owns the ID who should be allowed to say who gets to use it.  I think it’s going to be more on the Social Security Administration to say what usage is going to be allowed for that number.  And again in terms of cooperation between departments, the history has never been that we’re going to ask first before we tell you.  So that would be my concern on that level.

Tony Shaw: Here’s a question from the audience:

Audience Question:  I’ve been to 5 meta data presentations (here at the conference).  I want to know if there are any meta data managers here who have addressed the issue of traceability and feel that they really have a good handle on it?  There are a lot of data dictionaries out there, but we’re just trying to get into integration meta data with messaging and FTPs as well as ETL, and it is a really sticky problem.  We need tooling badly.

Don Soulsby:  Just from some of the background we’ve done with the consulting, we see a lot of forward traceability which comes from the ETL tools.  So we see a lot of source-to-target mapping going that direction.  And again, it’s tool based.  The trick is… well, as they say, it’s easy to take the skin off a cat, the trick is getting it back on.  And that’s really what it was.  There are lots of ways to put data into warehouses with ETL tools.  The trick is really reassembling it going back the other way.  It is hard.
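
To make the forward-versus-backward distinction concrete, here is a minimal Python sketch of tracing a report field back to its sources.  It assumes the source-to-target mappings have already been captured as simple pairs; the field names and the functions build_reverse_index and trace_back are invented for illustration and do not come from any particular ETL tool.

```python
from collections import defaultdict

# Hypothetical source-to-target mappings of the kind an ETL tool records.
# Each pair reads "this source field feeds that target field".
MAPPINGS = [
    ("crm.customer.name", "staging.customer.full_name"),
    ("billing.account.owner", "staging.customer.full_name"),
    ("staging.customer.full_name", "warehouse.dim_customer.name"),
    ("warehouse.dim_customer.name", "report.top_customers.customer"),
]


def build_reverse_index(mappings):
    """Invert forward mappings so each target lists its direct sources."""
    sources_of = defaultdict(set)
    for source, target in mappings:
        sources_of[target].add(source)
    return sources_of


def trace_back(field, sources_of):
    """Walk backward from a field to every upstream field that feeds it."""
    upstream, stack = set(), [field]
    while stack:
        current = stack.pop()
        for source in sources_of.get(current, ()):
            if source not in upstream:  # guards against cycles
                upstream.add(source)
                stack.append(source)
    return upstream


reverse_index = build_reverse_index(MAPPINGS)
origins = trace_back("report.top_customers.customer", reverse_index)
```

The walk itself is the easy part.  It can only follow hops that were recorded, so the messaging and FTP transfers the questioner mentions would have to be captured as mappings before this kind of trace is complete.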

Tony Shaw:  OK Lisa.  You’re up next. 

Lisa Cash:  I’m going to speak a little bit more specifically to structured data.  It ties very closely into the comment just made about data warehousing, that getting it in is easier than getting it out.  I think one of the next big things in managing explosive data growth is dealing with the data that resides on relational databases.  There are many statistics detailing the growth rate of data on relational databases.  Estimates run anywhere from 40% to more than 125%, due to mergers, acquisitions, ERP, CRM, supply chain, just a host of different applications.  The problem that ensues with the growing amount of data is that these are mission critical applications and they are the drivers in the business.  They manage data that generates the revenue for the business.  They manage the back office applications that impact productivity and bottom line.  So the growth rates today really pose a management problem, and it’s being exacerbated by some of the points that we mentioned.

Not only is data growing organically, but there are environmental forces that are forcing companies to retain data on these relational databases for far longer periods of time than was previously anticipated.  You have regulations like HIPAA or 21 CFR Part 11 (an FDA regulation on electronic records and signatures) or SEC Rule 17a-4 (concerning the retention of electronic records, including email).  So you have all of this legislation that is forcing companies to keep data in its relational context, in its business context, in its meaningful context, in case it needs to be reproduced.  So, today the only place people can keep it is on their relational database.

The problem is that I don’t know one CIO whose operating expense budget is growing at the same rate as data growth.  And so one of the things we see (and obviously this is the space that we’re in) is that the concept of database archiving is coming up on the radar screen and really starting to take hold.  It’s a very simple concept.  It’s the ability to take data from these relational databases, maintain its meta data, and convert it to a format that can be stored much more cost effectively in a back-end storage mechanism.  What we’ve seen in the marketplace is that customers who have effectively employed a database archiving strategy have managed to decrease their operating expenses from about a million dollars a month to about $200,000 a month.  And that’s done just by managing that data more effectively, by storing it on more cost-effective storage media, like tape or optical disk, rather than on a very, very expensive production database, while still being able to maintain that meta data, the data that gives that information its meaning, and still meet those regulatory requirements.  So that’s really what we see, the concept of database archiving.  That’s what we see in our space as the next big thing.  I’ll throw that out there for challenge.

Audience Question:  Aren’t there some issues with the lifespan of certain media like tape, and do you think there are any new incentives to find other media for long term storage?

Lisa Cash:  There are issues with the lifespan of tape, and you absolutely have to consider the lifespan of the medium that you’re moving data to.  But those issues are really no different from the lifespan of the database that you’re saving your data on.  Look at the versions of databases as we’ve gone through the course of history.  We’ve gone through IMS and the relational database du jour.  So you have the same impact, but certainly you absolutely must go to a vendor that has stability, long-term strategy and dedication to that particular side of their business.  So, yes there are issues.  And we see customers doing a tremendous amount of due diligence up front so that they understand that that medium complies with what those regulations require.

Specifically, we deal with data—and any kind of data, whether it’s large objects, etc.—that reside in relational databases.  And so, if the data is being stored in the relational database, there are capabilities for a migration path for that data.  Our area of focus isn’t taking something from a disc and upgrading it to the “latest-greatest.”  What we really concentrate on is big huge applications that just grind to a halt because of massive amounts of data that continue to accumulate.  What we do is get it out of there but maintain the meta data by putting it in a cheaper storage media.  So, we focus on the major relational databases, and moving the data out, so that we can enhance the performance and availability of those systems. 

Karen Lopez:  But I think the format thing is a big issue, especially because recent formats, like small floppies, are going away.  My recent laptop came with an SD card reader, but no floppy.  And that’s fine, but now I’ve been forced to buy a 6-in-1 card reader so that if I sit in a meeting and someone wants to give me a file, I’ll have a reader to do it.  As the market is now struggling with vendors creating proprietary removable storage media that are large (1 gig, 2 gig CompactFlash cards, 10 gig, 20 gig, 40 gig iPod storage devices), I think this is another thing we’re going to have to deal with for mobile data.  Forget what the file format is, we won’t even be able to physically read the thing.  When we did that transition from larger floppies to smaller floppies you always had to worry about that.  Now we have 8 removable media things that we have to deal with.

Mike Jennings:  I think we’re already starting to see that kind of movement now with CDs.  Nobody can imagine CDs going away today, but DVDs are becoming more prominent every day.  My latest computer has a DVD burner and there are a lot of things as far as video and mp3, and we’re already moving to and past that as far as capacity.  At some point you should see all of the video and audio mixture going together, and they’ll replace DVDs down the line.

Don Soulsby:  I think that we missed a point.  When I was in records management we spent most of our time figuring out when we could get rid of stuff.  This was around the time of the great “Pinto” letter, for everybody who remembers that one, the accident with the gas tank.  They found the letter in a back copy of an engineer’s file cabinet.  Based on that, we spent a lot of time figuring out what is the minimum time we can keep stuff so that we can get rid of it.  I think we kind of missed the point about keeping data. 

I come from a logistics background and we had LIFO (last in first out) and FIFO (first in first out).  Now we have FISH – first in, still here.  I am very much concerned about that.  Because when the lawyers figure out that you have all this data which goes back 25+ years, they’re going to go… fishing!  And that’s my big concern, that we’re making it a technology issue, and not a policy and management issue.  And the other thing again, of course-- which is my new hobby-horse—is how much does this year vs. last year mean anymore?  Why are we actually keeping all this historical data? 

Lisa Cash:  I think it’s actually important where you keep the historical data because the legislative regulations just become more and more onerous for companies out there.  It’s not really a question of can you get rid of the data.  We deal with a lot of insurance companies who feel like they can never get rid of data.  So, it’s really not a matter of “at this point in time, what can we get rid of?”  Although I do believe in complete data life cycle management, and there is a time when you need to get rid of it.  The point that we try to make is that you need to manage the data based on its value to the organization.  So if it’s highly accessed, highly used, then that’s data you put on your most expensive storage mechanism, which is your production environment.  Then there is the data that is infrequently accessed (which all of the analysts are saying is something like 70-80% of data in an enterprise) but must still be kept by law.  So we have to figure out how to manage it more cost effectively.  That’s really the trend we’re seeing: the theory that you store all your data, or keep all your business critical data, in a production environment is now really being tested.  Better to keep the minimum amount of business critical data in the production environment, move everything else to a back end storage mechanism, and then delete what you can delete.
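
The rule Lisa outlines (keep highly accessed data in production, move the rest to cheaper storage, delete what you can) can be expressed as a small decision function.  The Python sketch below is an illustration only: the function name storage_tier, the 90-day “hot” window and the sample dates are assumptions, not figures from the panel or from any product.

```python
from datetime import date


def storage_tier(last_accessed, retain_until, today, hot_window_days=90):
    """Pick a tier for one record.

    hot_window_days is an assumed threshold, not a figure from the panel.
    """
    if (today - last_accessed).days <= hot_window_days:
        return "production"   # frequently used: keep on the expensive tier
    if today <= retain_until:
        return "archive"      # rarely used but must still be kept by law
    return "delete"           # retention period over: eligible for disposal


today = date(2003, 6, 19)
tiers = [
    storage_tier(date(2003, 6, 1), date(2010, 1, 1), today),
    storage_tier(date(2001, 3, 15), date(2010, 1, 1), today),
    storage_tier(date(1994, 3, 15), date(2001, 1, 1), today),
]
```

The three sample records land in production, archive and delete respectively.  Setting the retention dates is the policy question Don raises; the code only applies whatever dates the business and its lawyers agree on.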

Don Soulsby:  OK, but I’m not seeing the balance between policy and why the business wants to keep the data vs. oh yeah, we can do it.  Once again, it’s another Y2K.  Just because we can technically do it doesn’t mean we necessarily should. 

Tony Shaw:  Mike, you’re up next.

Mike Jennings:  The next big thing from my perspective (a consulting and outsourcing perspective) is data security and data protection.  A lot of the issues we’re running into today with the complexities and interdependencies of our information technologies and environments make securing that data very difficult.  The increase of web-enabled applications that span the enterprise and touch the Internet again makes protecting that data much more difficult to do.  The increased focus on data security by corporations, data continuity after 9-11, privacy and legislation issues again make data security more and more prevalent, and more of a focus coming into this next year.  Then there’s the publicity around network and data software exploits.  I’m sure everybody can remember at least one news story over this past year where some large software vendor was embarrassed by security vulnerabilities in a particular application.

I’m seeing increased focus by vendors on bringing data security up because it gives them a way of differentiating themselves in the marketplace and increasing their market share as well.  I’m seeing companies give data security the same weight that they gave performance and functionality.  Before this, at least with organizations and clients I’ve dealt with, everything was focused on what the functionality of the product was.  Data security is now equal with those two, as data is being more and more opened up.  There’s also an increased focus on outsourcing.  Again, data security becomes more of an issue, more of a concern than when everything was in a single server database inside of a corporation.  There’s increased use of geographically distributed data repositories.  The use of storage area networks, network-attached storage and other things is making data security more complicated as well, as data gets more dispersed geographically.  And then just general privacy concerns of companies, individuals and legislation are going to drive interest and demand further into new security products and services.

Karen Lopez:  As I’m hearing people speak, what’s going through my mind is what my grandmother used to tell me, which is be careful what you wish for because you just might get it.  So we spent all of these last decades trying to build better data that could be used as information that could be used anywhere.  And now it’s come along with the price of securing it and dealing with distributed databases and geographically distributed, and worrying about legislation and those sorts of things.  Our jobs as data professionals have become about much more than figuring out the best way to build a table, don’t you think? 

Tony Shaw:  How many people in the room think of security as something that the “security folks” do, as opposed to something that I’m responsible for?  (About 50% of hands raised in the audience.)  So that’s a fairly widely held opinion.  Do you see, Mike, that this audience (i.e. the data architecture/administration audience) is likely to be getting deep into these security issues, or is it just a business issue they need to be more cognizant of?

Mike Jennings:  Probably a little bit of both.  I know I spent the last year evaluating different applications, mostly ones from the front-end perspective, and in a lot of cases everything was focused on functionality and performance.  But we’re starting to get more challenges and questions about security.  We’ve brought in 3rd party security auditors to help look at the different applications.  In a lot of applications (both internal ones we developed and ones we’ve purchased) we’re finding big security holes, things where you can get administration rights or root access on a server, things I never would have thought about.  Now when we go in to evaluate a product, we’re asking security questions at the very beginning that may not have been asked in the past.

Audience Question:  I agree with the security, but I don’t think there’s enough emphasis on the privacy part.  There has been so much legislation.  If you’re a worldwide company, you’re dealing with the European works councils and the policies regarding where employee data can go.  You’ve probably heard about the new database you’re supposed to be able to call and say, “telemarketers don’t call me…”.  That’s really well and good, but does your company have the data management processes in place to actually comply with that?  All these privacy things are great, but if you don’t have the field in the database that can actually keep up with the processes, then the privacy isn’t there.  I think the privacy piece doesn’t get talked about enough, and tends not to have equal billing with the security issue, though they are two sides of the same coin.

Audience Comment:  Here’s one of the problems with leaving it up to your security group.  There are three levels of security—low, medium and high.  Everything that’s in high security will be encrypted on disk.  That’s a rather large sledgehammer.  I’m not sure if David Hay or Len Silverston or anyone is coming up with some standard fine grain design patterns because we have to actually have a meta-model of security and roles and privacy and start to get some consensus in the industry and start to actually get down to some cases. 

Don Soulsby:  Security is part of risk management.  If you go to the classical equation on risk management, the elements are that you take the value of the asset, the vulnerability of the asset, and then the statistical likelihood of the attack happening.  There’s a slight flaw in that when we get to data security.  The first element on any insurance program is what is the value of the asset.  Because sometimes you allow the breach to happen if the asset doesn’t have a large value, and then take the insurance, rather than trying to build up the fortifications to protect it from ever happening.  Once again, the big policy issue is—what is the value of your data asset?  We have never been able to assess that because it doesn’t appear on your balance sheet, and if it doesn’t appear on your balance sheet, nobody cares. 
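
The classical equation Don refers to can be written out as a short calculation.  The Python sketch below is a simplification with made-up figures: it treats the three elements as a straight product, and it assumes that a security control, once paid for, removes the loss entirely.  The function names expected_annual_loss and worth_protecting are invented for illustration.

```python
def expected_annual_loss(asset_value, vulnerability, attack_likelihood):
    """Classical risk estimate: value x vulnerability x likelihood.

    vulnerability: chance (0-1) that an attack succeeds.
    attack_likelihood: chance (0-1) of an attack in a given year.
    """
    return asset_value * vulnerability * attack_likelihood


def worth_protecting(asset_value, vulnerability, attack_likelihood, control_cost):
    """Protect only if the control costs less than the loss it is meant to avoid."""
    return control_cost < expected_annual_loss(
        asset_value, vulnerability, attack_likelihood
    )


# Made-up figures for illustration only.
loss = expected_annual_loss(2_000_000, 0.5, 0.25)
decision_small = worth_protecting(40_000, 0.5, 0.25, control_cost=20_000)
decision_large = worth_protecting(2_000_000, 0.5, 0.25, control_cost=20_000)
```

With these figures the same 20,000 control is not worth buying for the small asset (expected loss 5,000) but is for the large one (expected loss 250,000).  Every answer depends on the first argument, which is exactly the number Don says nobody has for data.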

Tony Shaw:  Last year we started to get involved in some of the accounting issues having to do with that… which we won’t do this time!  Time is marching on, so I’ll ask Karen to take the next topic. 

Karen Lopez:  My topic really isn’t something brand new, but it’s something I’m starting to see in multiple organizations and projects.  It’s becoming that “elephant sleeping beside you” that we were talking about earlier.  I’ll talk about some of the things that I’m having to deal with that I think the rest of you are dealing with more and more as well.  I’m starting to have to work in a data modeling / management environment where there are other tools and other disciplines coming in and trying to help us out.  So, the first thing is the encroachment of non-data management tools and trying to leverage them as data management tools: things like Rational, Visio, PowerPoint, Microsoft Access to prepare enterprise data models.  It’s not a fun thing to do, but it often comes down to budgetary constraints, or to the fact that a larger UML tool (which is very expensive) has something in it that does data-something, so could you use that instead of your current data tool?  So I’m seeing a lot of that.

The other thing that’s changing is the different roles of people I’m working with who are also helping me build these data models.  The number one thing is outsourcing, both onshore and offshore outsourcing.  This has added another layer of complexity to our data managing efforts.  And I’m not talking about cultural differences or language differences.  I’m talking about time zone differences.  I’m talking about tools that weren’t developed to remotely work 3,000 or 10,000 miles away in a collaborative manner, or to really deal with the security issue.  We’re talking about developers who have now discovered something called modeling through their UML modeling and their object modeling who now all of a sudden need to be part of every modeling process and also impose their modeling standards on our standards, and vice versa.  We can talk about notational differences, but I’m not worried about that (though I have my preferences).  I’m worried about the fact that a class model isn’t a logical model and vice versa.  How do we justify and how do we work together?  It may mean that on certain projects I might have to give up my logical model and work on a class model and try to use it to try to derive benefits from those things. 

The other thing that I think is good about these encroaching disciplines is I’ve seen the re-emergence and re-emphasis on process modeling.  Even though we’re here at a data management conference, those of us who have been in this for a long time realize that data and process go together.  And that now that UML has come onto the scene, now we’re starting to see again a process model that actually can be at the analytical or conceptual level and also can work down at the detail level.  So I’m very happy to see sequence diagrams and use cases coming back onto my projects.  The encroachment of brand new methodologies and techniques has started to have an impact on how we go about building these models.  Things like object whatever -- whatever you want to call it -- whether your project is an OOPS project (which I think is a great acronym for talking about this development effort); whether you’re having to deal with Extreme Programming—which is sort of a new version going into production every 24 hours and 2 people do everything on the project including security, coding and compliance; or whether you are doing something Agile Whatever—which implies re-factoring your work, working to a deadline, also with people having different roles and responsibilities and trying to work in a very reactive manner. 

And the last thing is the encroachment of other professions onto the work that we are trying to do.  Other traditional professions, like accounting, engineering and medicine, are now getting into the IT profession.  They initially get into the profession as a result of providing domain expertise on our projects… and that’s important.  I think if you’re designing a medical operating system that’s going to deliver pharmaceuticals to patients, then probably someone who knows a little bit about medicine ought to be involved with that project.  Do you agree with that?  But what’s happening, especially with the engineering profession, is that we are seeing a great deal of overlap in our foundational education, in that more and more professions are learning programming or macro writing or spreadsheets.  We’ve always seen that there are many different paths into the IT profession.  But I’ve never had to deal with it from a data management point of view, because most people who were trying to make that transition really liked getting into the technical, geeky part of it and couldn’t understand what these model-things were.  But I think it’s important that we understand that while it’s good to have a variety of viewpoints, especially from people coming from the domain of the business, it’s also important that people understand the foundations of why we’re doing these models, why we’re doing this process.  They have a great deal of input to provide, but we don’t need to give up all the goals and all of the benefits that we’d like to deliver with these models just because this model happens to look like a table and it may lead to a database.  We have these roles as professionals where we need to be sure that the business value goes into our models, and we also need to be aware of the fact that we need to explain these models to people.

With increasing budget pressure and the fact that we’re having to do more with less, we need to be able to understand the points of view of the other people working on our project, we need to be able to clearly explain our process, and we need to be able to make the products of our work very visible.  So if your tool allows you to publish your work out onto an intranet, you ought to make sure that that’s happening.  Graeme Simsion is right: you might have to work without the things you really need to do your work.  Even if you don’t have enterprise class tools, you can still have enterprise class models.

Tony Shaw:  Panel, your first comments…

Don Soulsby:  Two things.  One: something you didn’t mention, which I found surprising, was the education and certification of our data modeling people.

Karen Lopez:  That’s another topic.

Don Soulsby:  I ran into a surgeon one time who was designing my database for me, and I suggested that I could help him with his operation next time.  He didn’t think I was funny.  But the fact remains that doctors used to be barbers until they accidentally killed a prince, and someone said, “wow, we better get letters after these guys’ names so that we can trust them.”  So clearly, the fact that there is no demand to have a certified data modeler / data architect designation after your name means that you are not a threat.  In other words, all of these other domains that are encroaching on our profession feel that they can do so without accreditation or training because, frankly, what are the results?  How badly are you going to hurt a company by getting cardinality wrong?

Karen Lopez:  6 million dollars.

Don Soulsby:  Exactly!  That’s the point!  I think there are situations where we do not have qualified people doing our job, and we’re just not perceived to be a big enough threat.  And I think that’s one of the biggest issues we have as far as advocacy for our association and our world: the letters after our name have to become important to somebody, such that they would demand that you have them in order to be certified, to be trusted, and to be able to go out there and do the job.

The second point I wanted to make was on the tools.  We were joking about the capability maturity model for meta data.  And you can usually judge it by its tools.  Level I is PowerPoint and Excel.  Level II is Visio and Access.  And Level III is a repository of some form, whether it’s homegrown or a vendor package.  So, for future reference, if you want to do a CMM for meta data, just look to the tools.

Audience Question:  Can you define IV and V?

Don Soulsby:  I wish! 

Tony Shaw:  Karen, I’m not sure whether you come down on the threat or the opportunity side here. 

Karen Lopez:  Being a good consultant… both sides.  It’s a threat if we’re not ready for it. 

Audience Question:  I’m going to play the devil’s advocate for a second, because I’ve kind of seen it from both sides—from the data management side as well as from the developers’ side. 

Tony Shaw:  Which do you represent at this point in time?

Audience Question:  Both.  I work for a small company.  To play the devil’s advocate, I think part of the problem is that data management is perceived as being very inflexible in an enterprise setting.  And obviously things are moving very quickly nowadays.  Companies need to be able to adapt not only year-by-year but also month-by-month.  So I think along the lines of encroachment, often you do see these Access databases popping up throughout an enterprise.  I think you could blame the people building these Access databases, but usually it’s because they don’t feel that they can actually have somebody produce a real solution.  They perceive that they have to go through some set of processes to become a full-fledged part of the standard with security, and everything else, handled.  So they go around it and they throw together something in Access.  This probably won’t be perfect, but at least it’s something.  What I would say is look for these things as red flags.  These people have a need, and it’s probably not being met.  Maybe we can look at how we, as data managers, can be more flexible and adapt readily in these agile type movements.  People want more rapid response.  Of course sometimes this comes back to bite us, but I definitely see a red flag there…

Tony Shaw:  If somebody’s developing an Access database somewhere, should you treat that as someone you should go talk to and offer some “professional help”?

Karen Lopez:  Absolutely.  I mean, I create my own Access databases all the time, and I make all kinds of compromises and pay the price later.  We always think we’re smarter than what we tell people we are.  In terms of the maturity of our profession, we haven’t necessarily balanced out the needs to either offer professional help or to help a business apply their own standards to those already developed.  An analogy is that you can build a shed in your backyard, and you can cook meals for 40 people in your home.  But at some point you cross a line where the protection of the company and of the public needs to be addressed.  So, if you’re building that shed by ripping out walls of your house, you probably need some professional help to do that.  If you’re going to cook a meal for a couple of thousand conference attendees then you probably need to know what you’re doing because there’s a greater potential for harm.  But I agree with you that we are all learning to be reactive.  The point is where do you draw the line? 

Audience Question:  Two of you mentioned outsourcing.  Do you see companies doing real risk assessments around the outsourcing issue? 

Mike Jennings:  At least from my side, yes.  Part of my company’s business is outsourcing.  Many clients make us go through a several-month assessment and risk analysis, asking us questions around security, data management, continuity, all the way down to what our disaster recovery plan is, and things of that nature.  Plus, we go through our own internal analysis including values, pluses, minuses…

Karen Lopez:  And I think the hard part is that along with outsourcing, according to projects I’ve worked on, comes fixed price.  This has been around forever, but what I tell people is that when you fix the price and the schedule, the only things you have left to move are quality and risk.  That’s definitely a great trade-off to make, but it’s much easier to negotiate time and materials with your vendor than quality.

Tony Shaw:  Same as the old printer’s adage.  We’ll give you quality, price or delivery time… you can have 2 out of 3. 

Audience Question:  I would like to assert that we have an opportunity in those encroaching roles to actually be looking at how could we learn a little bit more about what the other jobs are.  So if we spread ourselves out a little more, and learn about the tools and jobs that are close to what we do, we could actually be a little more knowledgeable ourselves.  If we learn what they’re trying to do, we may be able to head them off at the pass a little bit more, or be perceived as more valuable. 

Karen Lopez:  And not only that, but feel their pain.  Larry English talks about feeling the business pain. 

Tony Shaw:  What about the other stuff going on in the world.  Buzzwords?  Trends?  Stuff on the CEO/CIO/CTO desk right now that’s going to eventually affect the folks in this room.  Do you see any buzzwords out there that you want to debunk?  Let’s play a little Fabian Pascal game for a minute. 

Don Soulsby:  Actually, I’d like to play the CEO/CFO side of the house.  It concerns me that most of the legislation I am seeing that relates to information only speaks of punitive damages against the CEO and CFO.  Very quickly, however, I think we’re going to start seeing internal lawsuits.  We’ve seen it at Senate hearings, right?  “Oh Gee, I didn’t know” just isn’t going to cut it anymore legislatively.  I suspect that as the CEO and CFO are brought up, the reaction toward the CIO is going to be real quick.  In both cases where we’ve got to protect our business management at the senior levels, I think we’re going to have to protect our CIOs more than anything else.  They’re the ones that are going to be pointed at when the finger gets pointed to other members of the senior management team.

Tony Shaw:  Going to have to protect them?  Meaning what exactly?

Don Soulsby:  If I have the risk of going to jail on a decision I made from information I was provided, you can bet that the provider of information is not going to be the one going to jail, but the one who said, here are the rules. 

Tony Shaw:  You think that the people managing the data are going to be held accountable for that? 

Don Soulsby:  You bet! 

Tony Shaw:  That’s a rather long stretch for the arm of the law to go down into. 

Don Soulsby:  But you can bet that there are insurance policies for Boards of Directors and CFOs, and there just aren’t for CIOs.  So you take the path of least resistance.  If you know there’s protection, you go to the places where there are no protections.  CFOs and CEOs have a lot of really good protection already.

Tony Shaw:  Lisa, you’re a CEO.

Lisa Cash:  And I’m anxious to respond to that one.  It’s not going to be blamed on the CIO.  What’s very clear in the law is who the owner of the responsibility is.  It clearly is the CEO and the CFO.  There’s no path to point to anyone else in the organization.  I think the real area of concern is that there’s just so much vagueness in these laws.  If I’m a CEO sitting in the hot seat, I’m not going to be pointing to someone internally.  The problem is that much of the law is really up for interpretation.  I think it’s how the companies themselves interpret different aspects of it.  And that’s where there is concern in the IT organization.  I actually see that as a tremendous opportunity for consultants out there to help companies implement and comply with this legislation.  But I think that’s a bit over the top.

Tony Shaw:  Sorry, Don, rather than give you the right to respond, we’ll just leave you hanging there and move on (smile).  Another buzzword or trend…?

Karen Lopez:  I have a list of them, actually, because someone was asking me what all of these new terms meant, so I prepared a list. 

Somebody asked me, “What does this agile stuff mean?”  It means, “Developer is King.”  What does “Re-factoring” mean?  It means, “Warning: Massive Re-work ahead.”  What’s this Zachman thing?  It means “Modelers are King.”  “What’s that in the real world?”  It means, “we have way too many kings.”  What’s this “Extreme Programming?”  It means “Everybody’s King,” because in Extreme Programming everybody wears every hat.  What’s “EAI”?  First of all, that depends whether the “A” means application or architecture.  But what it really means is that we can rebuild it faster and stronger.  What’s this “OMG MOF” stuff?  It means, “oh my God it’s the mother of all frameworks.”  Again, someone else asked me, “What does this Zachman stuff mean?”  It means we should all be in cells. 

Tony Shaw:  What do you see as the role of this community in the web services arena?  What about defining the meta data that’s involved in web services?

Karen Lopez:  We’re all on that.

Tony Shaw:  So I’m being passé?  It’s happening?  (audience confirms by show of hands).  OK, next up? 

Audience Comment:  Flexibility.  I think that’s one of the biggest issues right now.  Web Services… a group of people in our community got together to discuss web services and integration issues.  The assumption was that web services would solve all of the integration issues.  I think it’s something for us to be mindful of, to watch out for, that web services are actually just another means to an end.  Web services are not going to solve all of our problems. 

Tony Shaw:  I agree.  Over the course of the last 6-7 years, I’ve seen numerous new things like this come along.  The perspective that each can be valuable in certain circumstances is good to keep in mind through all these topics and threads.  Anybody else?

Audience Comment:  For Karen, you might consider adding these to your list.  RTS—The Vendor is King.  BPM—The business analyst is King. 

Audience Comment:  What I have seen in our environment is that the data modeling profession is polarizing toward three ends: interactive data modeling, analytical data modeling and transaction data modeling, which is the back end, the front end and the analytical aspect.  If you’re thinking that people have to change some of their skills, it’s mostly a restructuring of their understanding of what processes come into play, and also of what tools we have to do all of this.  And in between are the data accesses and all of the transformations that contain the rules.

Audience Question:  What about the whole issue of semantic modeling and ontologies moving into the world of business as a framework for richer data modeling, and using our technology to leverage the actual data itself?

Karen Lopez:  I think I’ve spent a couple of decades trying to get people to understand what an entity is.  I don’t look forward to the day I have to explain to them what an ontology is.  But it’s definitely something that should be on our radar, you’re right.  We barely have time to do our own jobs, but we have to keep looking ahead to our new jobs.

Tony Shaw:  If I could add to that, certainly the sessions we’ve had related to content and unstructured data management, taxonomy construction, semantics, etc. are not the largest sessions of the conference but they are very well attended.  And there are many of them, so I think there’s definitely an increasing interest.

Don Soulsby:  As far as the conference goes, what kinds of topics do you, Tony, and the panelists see coming and going over the years?  And, going out 3-5 years from now, what do we think the next topics will be?

Tony Shaw:  From the standpoint of developing the conference, this year we added a database administration track in response to the perception that people needed to become more familiar with what the job of database administrators was.  The unstructured data and content management stuff has really happened in the last 2 years.  XML has been a big thing over a four-year span.  The core meta data, modeling and architecture material is still what attracts a lot of people, so that hasn’t changed since we started doing the conference.  I’m trying to think if there is really anything that has dropped off the horizon at all.  Nothing’s coming to mind.  I think next year, security/privacy related issues are likely to be much larger. 

Karen Lopez:  He got all of my answers.

Don Soulsby:  The two I think I’d predict are legal and finance.  Again, security is just the beginning of what I’m seeing in the legal side.  Within 5 years you will see a legal track at the conference.  And I think you will see a finance or accounting track.  It’s that whole “value of assets” thing.

Tony Shaw:  I would say the value of assets portion, yes, creating some financial measure of what we do.  There haven’t been any particularly good proposals in that area yet but we’ll be looking.

Don Soulsby:  Hmmm, maybe that should be my topic for next year! 

Tony Shaw:  Okay folks, well time has run away from us.  Thank you very much for joining us.  Thank you to all our panelists. 



Join us for the
Wilshire Meta-Data Conference
and DAMA International Symposium

May 2-6, 2004 • Century Plaza Hotel • Los Angeles, California USA

The World's Largest Vendor-Neutral Data Management Conference

The 16th annual DAMA International Symposium and 8th annual Wilshire Meta-Data Conference will be held May 2-6, 2004 at the Century Plaza Hotel in Los Angeles, a beautiful venue adjacent to Beverly Hills. Hear 40 case studies outlining strategies of companies that have implemented successful data management projects. There will be more than 120 speakers in all, covering meta data, enterprise architecture, data and process modeling, unstructured data, business rules, data integration, XML, business intelligence, data warehousing, information stewardship, and more. Keynote Speaker Chris Date.


For sponsorship information, contact Rick Froton at 603-305-0660.

©2003 Wilshire Conferences, Inc. May be quoted with full attribution.