Is the English Writing System Phonographic or Lexical/Morphological?
The graphemic distinctiveness of simple word stems in written English (henceforth stems) is usually discussed in terms of the discrimination of homophones: Two or more distinct stems that share a phonological form each have a unique graphemic form (e.g., meat / meet; pair / pear / pare), and in some cases the different spellings cannot be ascribed to etymology: scent ‘should’ be spelled sent given its history (borrowed from French sentir and Latin sentire). The lists of heterographic words in Carney (1994) and Venezky (1999) show that a considerable number of homophones are discriminated in spelling. But there are also many homographic cases (e.g., bank, can), so any stipulated ‘principle of heterography’ is not universal. In this paper, we determine the scope and limitations of this principle empirically. Using the CELEX corpus as well as printed dictionaries, we first determine the number of homophonous simple stems in our data (like bank / bank or pair / pear / pare). Of these, we determine the fraction that have distinct spellings (like pair / pear / pare). The overall ratio is well below 50%, which means that the principle is not as far-reaching as often assumed. Historically, it appears that in many cases we are dealing not with a graphemic differentiation of stems but with a conservation of spellings. That is, most distinctive spellings probably corresponded to distinctive sound forms at some point in their history. Sound change then led to homophony, but the graphemic form often remained distinct (as with, e.g., loan / lone). Expressing lexical differences in the written form of stems does not seem to be overly important to English writers; there is no widespread lexical or morphological principle at work when it comes to the spelling of English stems.
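The counting procedure described above (find the homophonous stems, then the fraction spelled distinctly) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the ad hoc phonological strings, the function names, and the criterion that a homophone set counts as heterographic only when all of its members are spelled differently. It does not use CELEX data, and the ratio it yields says nothing about the ratio reported in the paper.

```python
from collections import defaultdict

# Hypothetical toy lexicon: (stem identifier, spelling, phonological form).
# The phonological strings are ad hoc labels, not CELEX transcriptions.
LEXICON = [
    ("meat", "meat", "mi:t"),
    ("meet", "meet", "mi:t"),
    ("pair", "pair", "pe@"),
    ("pear", "pear", "pe@"),
    ("pare", "pare", "pe@"),
    ("bank_1", "bank", "baNk"),  # two distinct stems, one spelling
    ("bank_2", "bank", "baNk"),
    ("can_1", "can", "kan"),
    ("can_2", "can", "kan"),
    ("dog", "dog", "dQg"),       # no homophone partner
]


def homophone_groups(lexicon):
    """Group spellings by phonological form; keep forms shared by 2+ stems."""
    by_sound = defaultdict(list)
    for _stem_id, spelling, sound in lexicon:
        by_sound[sound].append(spelling)
    return {s: sp for s, sp in by_sound.items() if len(sp) > 1}


def heterography_ratio(lexicon):
    """Fraction of homophone sets whose members all have distinct spellings.

    Assumption: a set with any repeated spelling counts as homographic.
    """
    groups = homophone_groups(lexicon)
    if not groups:
        return 0.0
    distinct = sum(
        1 for spellings in groups.values()
        if len(set(spellings)) == len(spellings)
    )
    return distinct / len(groups)


groups = homophone_groups(LEXICON)
ratio = heterography_ratio(LEXICON)
print(f"{len(groups)} homophone sets, heterographic ratio = {ratio:.2f}")
```

Counting per homophone set rather than per stem pair is one of several possible choices; a set such as pair / pear / pare would weigh more heavily under a pairwise count.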