This database question should be asked of a Database Administrator
(DBA) with advanced mainframe experience to show the candidate's
understanding of the use of ASCII, EBCDIC and UNICODE code
pages. A DBA with 3-4 years of experience who has kept his or
her knowledge up to date should be able to express a solution
to this problem fairly quickly.
We have an existing
application that provides information for our customers throughout
the United States who call into a customer care center. We'd
like to do two things. The first is to expand the application
to our European customers. The second is to make the application
accessible via the Web.
Currently,
the database uses an EBCDIC code page. Are there any issues
moving forward using this code page with the different European
currencies, including the Euro? What other things do we need
to be aware of as we roll out this application via the Web?
A good candidate
will explain that there are many different EBCDIC code pages,
a code page being a mapping that assigns a unique bit pattern
to each character. Within DB2 there is the ability to change the
CCSID from one that doesn't support the Euro symbol to one
that does. The CCSID (coded character set identifier) tells DB2
what code page to use. A good
candidate will also note that the process to change CCSIDs
is not a simple matter. The data will need to be unloaded,
the table space definition altered and the data reloaded using
a LOAD utility.
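
As a rough illustration of why the choice of code page matters for the Euro, the following Python sketch compares two EBCDIC code pages using Python's built-in codecs. It assumes code page 037 as the existing one and code page 1140 (the same layout with the Euro sign added) as the target. The article does not say which CCSIDs are actually in use, so both choices are assumptions, and the helper name `can_encode` is made up for the example.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in text exists in the given code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Assumed code pages: cp037 (no Euro sign) and cp1140 (cp037 plus the Euro sign).
OLD_CODEC = "cp037"
NEW_CODEC = "cp1140"

price = "Total: 25€"

old_ok = can_encode(price, OLD_CODEC)
new_ok = can_encode(price, NEW_CODEC)

# The byte that holds the Euro sign in the new code page...
euro_byte = "€".encode(NEW_CODEC)
# ...is a different character when read with the old code page.
old_meaning = euro_byte.decode(OLD_CODEC)

print(f"{OLD_CODEC} can store the Euro sign: {old_ok}")
print(f"{NEW_CODEC} can store the Euro sign: {new_ok}")
print(f"Euro byte in {NEW_CODEC}: 0x{euro_byte.hex()}, read as {OLD_CODEC}: {old_meaning!r}")
```

The same byte value maps to a different character under each code page, which is one reason existing data has to be checked when the CCSID changes rather than simply relabeled.
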
The good candidate
will also follow up on the second question, noting that non-mainframe
servers generally use ASCII code pages. He or she would also
know that the EBCDIC and ASCII encoding schemes have different
sort orders. This needs to be taken into account if the mainframe
database is going to stay in EBCDIC, but be used by people
who are used to seeing things in an ASCII sort order.
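
The sort-order difference can be sketched in a few lines of Python by sorting the same strings on their encoded bytes, which mimics a binary sort under each encoding scheme. The sample values are invented, and code page 037 is only an assumed stand-in for whatever EBCDIC code page the database really uses.

```python
names = ["apple", "Zebra", "42 Main St", "Baker", "zulu"]

# Sorting on the encoded bytes mimics a binary sort in each encoding scheme.
ascii_order = sorted(names, key=lambda s: s.encode("ascii"))
# cp037 is an assumed EBCDIC code page chosen for illustration.
ebcdic_order = sorted(names, key=lambda s: s.encode("cp037"))

print("ASCII order: ", ascii_order)
print("EBCDIC order:", ebcdic_order)
```

In ASCII, digits sort before uppercase letters and uppercase before lowercase. In EBCDIC the order is reversed: lowercase, then uppercase, then digits. A report that looks correctly sorted on the mainframe can therefore look wrong to a Web user.
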
An excellent candidate
would also know that the EBCDIC data is translated into ASCII
data by DB2 Connect when it is retrieved from the mainframe.
If ASCII data is sent to the mainframe, it is translated to
EBCDIC at the mainframe. An excellent candidate might also
mention that this translation is not always perfect, especially
when special characters are involved, and that thorough testing
is a good way to determine if any of these problems exist.
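
One kind of test can be sketched as a round-trip check: encode a string the way the sender would, decode it the way the receiver assumes, and compare. The Python example below is not how DB2 Connect performs its conversion. It only shows how special characters can go wrong when the two sides disagree about the code page. The function name `round_trips` and the pairing of code pages 037 and 500 are assumptions made for illustration.

```python
def round_trips(text: str, sender_codec: str, receiver_codec: str) -> bool:
    """Encode as the sender would, decode as the receiver assumes, then compare."""
    received = text.encode(sender_codec).decode(receiver_codec)
    return received == text


# Invented sample values; cp037 and cp500 are two EBCDIC code pages that
# agree on letters and digits but differ on some punctuation.
samples = ["ORDER 1234", "Total [net]!"]

results = {s: round_trips(s, "cp037", "cp500") for s in samples}

for text, ok in results.items():
    status = "survives" if ok else "is altered"
    print(f"{text!r} {status} when cp037 data is read as cp500")
```

Plain letters and digits pass such a check, so a test set made only of simple values would miss the problem. Test data should include brackets, currency signs and accented characters.
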
A follow-up question
might be, "How do I avoid the extra overhead
of conversion and a different sort order?" Most candidates
will know that data can be stored on the mainframe in ASCII
format.
A final question
might be, "I've heard about UNICODE.
What is that?"
If the candidate
tells you that UNICODE will translate data from one language
to another (don't laugh -- some people believe this), he or
she is not a very knowledgeable candidate. A good candidate
will tell you that UNICODE is another encoding scheme that
contains information for the majority of international characters,
including double-byte character sets such as those used in
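Japan. An excellent candidate will tell you that there are
two main variations on UNICODE -- UTF-8 and UTF-16, the numbers
telling you the size in bits of the basic code unit each encoding
uses. Both are variable-width: a character occupies one to four
bytes in UTF-8 and two or four bytes in UTF-16.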
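
To make the UTF-8 and UTF-16 names concrete, this short Python sketch prints how many bytes a few sample characters need in each encoding. The sample characters are arbitrary choices, and `utf-16-be` is used so that no byte order mark is added to the count.

```python
samples = ["A", "é", "€", "日", "😀"]

# Map each character to (bytes in UTF-8, bytes in UTF-16).
sizes = {}
for ch in samples:
    utf8_len = len(ch.encode("utf-8"))
    utf16_len = len(ch.encode("utf-16-be"))  # big-endian, no byte order mark
    sizes[ch] = (utf8_len, utf16_len)

for ch, (utf8_len, utf16_len) in sizes.items():
    print(f"U+{ord(ch):04X}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

A plain Latin letter takes one byte in UTF-8 but two in UTF-16, while a Japanese character takes three bytes in UTF-8 and two in UTF-16, so the better choice for storage depends on the data.
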
About the author
Casey Young is an eclectic writer whose work spans poetry,
children's writing, fiction and technical writing. She has
worked with DB2 for over seventeen years, and is currently
manager of the DB2 performance group at IBM's Silicon Valley
Lab. She has worked on designing and tuning some of the world's
largest databases and is author of "Exploring
IBM e-business Software" available at www.maxpress.com.