Early Vocabulary Development in English, Mandarin, and Cantonese: A Cross-Linguistic Study Based on CHILDES
Early language development is an exciting topic in the field of child language acquisition.
Only a limited number of cross-linguistic studies have attempted to investigate the
similarities and differences in child language development across different languages. In
this thesis, I present a study based on English, Mandarin, and Cantonese corpora extracted
from the Child Language Data Exchange System (CHILDES; MacWhinney, 2000). I
investigated the lexical composition of three lexical categories (nouns, verbs, and
adjectives) in children’s and their caregivers’ vocabularies across eight age groups
ranging from 13 to 60 months. ANOVA, frequency analysis, and cluster
analysis were used to analyze the data. The developmental trajectories of the lexical diversity
and complexity of children’s speech were also analyzed with two techniques: the D-measure
and the Mean Length of Utterance (MLU). My research clearly shows that (1) in all
three cultures, children’s early language development exhibits roughly similar patterns: an
increasing diversity in lexicon and increasingly complicated speech patterns emerge as a
function of time, and children’s vocabularies become more similar to those of their
parents over time; and (2) cultural variations in children’s linguistic input strongly
influence their language output, as reflected in the noun-to-verb ratio and in the
varying percentages of nouns, verbs, and adjectives among the total words children are able to
produce in the three cultures.
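
As an illustration of two of the quantities named above, the following minimal sketch computes a Mean Length of Utterance and a noun-to-verb ratio from a toy transcript. It is not the procedure used in the thesis: the function names, the tag labels ("noun", "verb", "adjective", "other"), and the sample data are hypothetical, utterance length is counted in words rather than morphemes, and the category shares are computed over distinct word types, which is an assumption about how "total words" is defined.

```python
from collections import Counter


def mean_length_of_utterance(utterances):
    """Average number of tokens per non-empty utterance (MLU in words)."""
    non_empty = [u for u in utterances if u]
    if not non_empty:
        return 0.0
    return sum(len(u) for u in non_empty) / len(non_empty)


def category_counts(tagged_tokens):
    """Count distinct word types per lexical category."""
    types = set(tagged_tokens)  # each (word, tag) pair is counted once
    return Counter(tag for _, tag in types)


def noun_verb_ratio(tagged_tokens):
    """Ratio of noun types to verb types; None if there are no verbs."""
    counts = category_counts(tagged_tokens)
    if counts["verb"] == 0:
        return None
    return counts["noun"] / counts["verb"]


def category_shares(tagged_tokens, categories=("noun", "verb", "adjective")):
    """Share of each category among all distinct word types."""
    counts = category_counts(tagged_tokens)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in categories}


# Hypothetical toy data: three child utterances and their tagged tokens.
utterances = [["more", "juice"], ["doggie", "run"], ["want", "big", "ball"]]
tagged = [
    ("more", "other"), ("juice", "noun"),
    ("doggie", "noun"), ("run", "verb"),
    ("want", "verb"), ("big", "adjective"), ("ball", "noun"),
]

mlu = mean_length_of_utterance(utterances)
ratio = noun_verb_ratio(tagged)
shares = category_shares(tagged)
print(f"MLU: {mlu:.2f}, noun/verb ratio: {ratio:.2f}, shares: {shares}")
```

Computing the same quantities separately for each age group and each language would give trajectories that can be compared across the three corpora.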