Why should I consider if data needs converting into
different formats?
We have already experienced the difficulty of dealing with
legacy systems. The structure of somebody else's database may not be quite up
to scratch, or there maybe data anomalies. How do you cope?
Similarly, you may be faced with a range of existing systems
that are not conventional relational databases. How do you deal with a process
that uses a spreadsheet to record product data? How can the information be
included within a new system?
There are times when we must convert data from one format to
another. We can do this in 'real-time' by using applications to do the conversions
(such as ASP or Java to generate XML files), or we can do a one-off conversion
when we are scrapping an existing system. If you can't remember how we come
to the conclusion to write a system off, refresh
your memory here.
So, you have an old spreadsheet with some data in it. Download
a sample spreadsheet and take a look at it. In
it we have some customer records, nothing special. But we need to include
this with our new information system.
How can we use this
information in the new system?
We have a few options:
Go back to the spreadsheet you opened in MS Excel.
Is this file of any use to your system?
Repeat the same exercise, but this time save the file as a
Comma Separated Value (CSV) format.
After you have done this, close Excel and Notepad down and
open up Access. Create a new blank database and then:
See if you can successfully import the text file, then the csv version. If you can export/import files in the above
formats, then you can just about access data from any data repository,
whether it is an old database or an obscure spreadsheet.
The last but one exercise is to test your proficiency with
XML. Have a go at re-writing the customer.xls file in XML. Look at last
session's tutorial and think about the three components; Structure, content and
display.
Do you need a DTD or schema? How would you write one?
Once you have done this, take a look at the available export
formats in Excel and Access. Can you find an easier way of creating the XML?
If you want to read a little further about data conversion
issues try the following text:
Chapter 4, A
Methodology for Client/Server and Web Application Development, First Edition,
Roger Fournier, Yourdon Press, p274-282, ISBN 0-13-598426-2.
However
this is only a starting point and you are advised to look further for other
resources, such as alternative texts or on the internet.
Once you
have completed your research and made some notes, try the review questions below.
As a guide,
this piece of learning should take about two hours to complete.
Review Questions
Test your understanding of this topic with the following
questions.
What activities
should we consider if we are designing a data conversion program? |
|
What advantages does
the use of XML offer over traditional 'middleware' products? |
|
|
What is meant by
'manual' and 'automated' data conversion processes? Give some examples of
each. |
|
Using some modelling
notation, describe the data conversion process. |
|
What are the common
characteristics of legacy data? |
|
Describe a clean-up
strategy for legacy files. |
|