CHINESE CHARACTERS
The corpus has two versions:
The Chinese version
The CHAT version
The Chinese version
The Chinese version requires MS Chinese Windows 95/98 with
HKSCS (Hong Kong Supplementary Character Set) support,
because it contains Cantonese characters that are commonly
used in Hong Kong but not in mainland China or Taiwan, and
that are not found in the standard GB or Big-5 character
sets. Anyone using the Chinese version of the corpus will
need to download and install the HKSCS software
available at http://www.info.gov.hk/digital21/eng/hkscs/index.html/.
Download the Chinese version of
the corpus from http://www.arts.cuhk.edu.hk/~cancorp/archive/tagdata.zip.
The CHAT version
The CHAT version now in the CHILDES archive
incorporates the Chinese characters on a '%can' tier,
with the romanizations on the main tier. This
amalgamation was done first by Brian MacWhinney, and then
checked by the research team. Ann Law and Brian
MacWhinney provided programming help in the conversion of
the user-defined internal codes of Cantonese characters,
used in earlier versions of the corpus, to the now
standardized codes of the Hong Kong Government's
Supplementary Character Set (HKSCS). This has made the
display of the Cantonese characters in both the Chinese
and CHAT versions relatively easy. The help and advice of
Brian MacWhinney in the final stages of the corpus
preparation, as well as his continual support for the
updating of the corpus, are gratefully acknowledged. This
version has passed the CHECK test for format consistency.
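
The tier layout described above (romanization on the main tier, characters on a '%can' tier) can be illustrated with a minimal sketch. It assumes the usual CHAT conventions: main tiers begin with `*SPEAKER:`, dependent tiers begin with `%name:`, and continuation lines begin with a tab. The function name `pair_main_with_can` and the sample utterances are invented for illustration and are not taken from the corpus.

```python
def pair_main_with_can(lines):
    """Pair each main-tier utterance with its %can tier, if present.

    Returns a list of (speaker, romanization, characters) tuples;
    characters is None when an utterance has no %can tier.
    """
    pairs = []
    current = None  # [speaker, romanization, characters]
    last = None     # index of the field the previous tier line filled
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("*"):
            # A new main tier closes the previous utterance.
            if current is not None:
                pairs.append(tuple(current))
            speaker, _, text = line[1:].partition(":")
            current = [speaker, text.strip(), None]
            last = 1
        elif line.startswith("%can:") and current is not None:
            current[2] = line[len("%can:"):].strip()
            last = 2
        elif line.startswith("\t") and current is not None and last is not None:
            # Continuation line: append to the tier being read.
            current[last] = current[last] + " " + line.strip()
        else:
            # Headers (@...) and other dependent tiers are skipped.
            last = None
    if current is not None:
        pairs.append(tuple(current))
    return pairs


# Hypothetical sample, not real corpus data.
sample = [
    "@Begin",
    "*CHI:\tngo5 jiu3 .",
    "%can:\t我 要 .",
    "*MOT:\thou2 .",
    "@End",
]

for speaker, roman, chars in pair_main_with_can(sample):
    print(speaker, roman, chars)
```

Utterances without a '%can' tier are kept with `None` for the characters, so gaps in the character tier remain visible rather than being silently dropped.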
Displaying the Cantonese characters in this version requires
MS Chinese Windows 95/98 with HKSCS (Hong Kong Supplementary
Character Set) support, because the characters are commonly
used in Hong Kong but not in mainland China or Taiwan, and
are not found in the standard GB or Big-5 character sets.
Anyone who wishes to view the Cantonese characters in the
corpus will need to download and install the HKSCS software
available at http://www.info.gov.hk/digital21/eng/hkscs/index.html/.
Download the CHAT version of the
corpus from http://www.arts.cuhk.edu.hk/~cancorp/archive/chatfile.zip.
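
For readers working outside Chinese Windows, the following is a minimal sketch of decoding HKSCS text in Python, which ships a `big5hkscs` codec in its standard library. It assumes the downloaded files are stored as Big5-HKSCS bytes, which this documentation does not state explicitly; the helper name `decode_hkscs` is hypothetical.

```python
def decode_hkscs(raw: bytes) -> str:
    """Decode bytes assumed to be Big5-HKSCS into a Python string.

    Undecodable bytes become U+FFFD instead of raising, so a wrong
    encoding assumption shows up as replacement characters.
    """
    return raw.decode("big5hkscs", errors="replace")


if __name__ == "__main__":
    # Round-trip a character commonly written in Cantonese text.
    encoded = "嘢".encode("big5hkscs")
    print(decode_hkscs(encoded))
```

If the output contains many replacement characters, the file is probably in a different encoding and the assumption above does not hold.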