In Defence of Conservatism in English Orthography,

by Philip T. Smith, Ph.D.*

*Dept. of Psychology, Univ. of Stirling, Scotland.
*Presented at the 1979 Conf. on Reading & Spelling, Nene College.

The argument in this paper is that spelling systems that carry more than purely phonemic information are better suited to the requirements of fluent adult readers, and that different systems of spelling can induce different reading strategies in young readers, strategies that may not be the best adapted for fluent reading. Accordingly, we can conclude that there are advantages to the use of current English orthography.

Consider first a problem that I want to argue has many parallels with the problem of designing writing systems, namely the problem of designing mathematical systems of notation. It is often the case that the gap between the way a mathematical idea is written down and the way it is spoken is quite large. For example: ½(x+2)²=20, which is spoken (in my dialect) as "half x plus two, all squared equals twenty." Note that although the symbol "2" appears four times, it is spoken in a different way on each occasion, and that some symbols (the brackets) affect not so much the way each element is spoken in isolation, but the way they are grouped together: ½(x+2)² and (½x+2)² are spoken with a different rhythm. Now this notation undoubtedly poses problems for the learner, and any mathematics teacher will be able to tell you of pupils who confuse ½(x+2)² and (½x+2)², and, more fundamentally, even of some pupils who confuse "multiplying by 2" with "raising to the power 2". Despite this state of affairs, there is no great pressure to reform algebraic notation comparable with the pressure to reform English orthography. This is because algebraic notation is a particularly successful way of expressing a variety of concepts (multiplication, division, raising to a power, etc.) and this is crucial when algebraic problems of any complexity are attempted: a notation that translated directly into words would leave most of the essential concepts more obscure. When "reforms" have occurred in mathematical notation (the replacement of Roman by Arabic numerals, the preference for Leibniz's rather than Newton's notation for the differential calculus) they have always been in the direction of making thinking easier, not necessarily of making speaking easier.

Consider now the case of alphabetic systems and the problems of reading and writing. Some recently invented orthographies are not strictly phonemic (Faroese orthography, invented by a linguist in the 19th century, contains morphemic information (O'Neil, 1972)) and, as we shall see later, even the best shorthand systems, which are the subject of quite frequent reforms, contain much that is not phonemic or phonetic. However, most alphabetic orthographies have begun by coding only phonemic information, yet in the course of time, largely because the rate of change of spelling is slower than the rate of change of speech, the spelling system has become related to the speech system only in a rather indirect way. Most notoriously this is the case with English spelling, where the writing of vowels is more closely related to the way these vowels were pronounced in pre-Tudor English, and where over the centuries many foreign words have been absorbed into the language, with their pronunciation being adjusted but their original spelling being retained.

Now we argue that just as systems of algebraic notation achieve distinct advantages through distancing themselves from the pronunciation of the propositions they express, so an alphabetic system that expresses linguistic information in an abstract way has advantages over a system that seeks to express only the phonemic form of language. In discussing the optimal design for a writing system, we need feel no more constrained by the observation that alphabets were originally designed to express sounds than a mathematician should worry that geometry was originally developed to measure the areas of fields.

In what ways can alphabets code linguistic information abstractly? First note that even a phonemic system is itself an abstraction. In my dialect, for example, the /p/ of pun is pronounced with aspiration [pʰʌn], but the /p/ of spun is pronounced without aspiration [spʌn]. In writing both these sounds with the same letter p, a writing system is making an abstraction, deciding which distinct phonetic items should be classed together. The reason why such a convention is entirely acceptable is that it is entirely regular and hence predictable: all my word-initial /p/'s are aspirated, all /p/'s in the consonant cluster /sp/ are unaspirated. So provided I have a writing system that marks word boundaries, I can always derive the pronunciation of /p/ by rule. Linguists such as Chomsky and Halle (1968) give this type of observation the status of a fundamental principle: what can be derived by rule need not be marked in an orthography. For example, there exist pairs of words in English that differ only in the location of primary stress (a súrvey (noun), to survéy (verb)) but it is unnecessary to indicate stress in the orthography, according to Chomsky and Halle, because stress location in English is derivable by rule; similarly the vowel alternations in word pairs such as divine-divinity, serene-serenity are rule governed and need not be marked. While we lose information in failing to mark these distinctions in sound patterns, we gain by being able to have similar visual forms representing related ideas (the noun and verb forms of survey, obviously related ideas, are written the same; the related pair divine-divinity have more letters in common than they have sounds in common).
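The aspiration example can be made concrete with a small sketch. It encodes only the two regularities stated above (word-initial /p/ is aspirated, /p/ after /s/ is not) and leaves every other context untouched, since the text says nothing about them. The function name and the list-of-phonemes input format are assumptions made for illustration.

```python
def realise(phonemes):
    """Turn a word's phoneme list into a phonetic string.

    Only the rule from the text is applied: a word-initial /p/ is
    aspirated. A /p/ after /s/ (and every other sound) is left plain.
    """
    phones = []
    for position, phoneme in enumerate(phonemes):
        if phoneme == "p" and position == 0:
            phones.append("pʰ")  # word-initial /p/: aspirated
        else:
            phones.append(phoneme)
    return "".join(phones)


print(realise(["p", "ʌ", "n"]))       # pun
print(realise(["s", "p", "ʌ", "n"]))  # spun
```

Because the word boundary is enough to decide between the two pronunciations, a single letter p loses nothing that the reader cannot restore.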

Now I want to emphasise that this paper is far from being a full endorsement of Chomsky and Halle's position. Much of their phonology, it seems to me, is highly implausible at a psychological level, and I am grateful to Valerie Yule (1978) for pointing out the large numbers of exceptions there are to their rules when we first look at the sort of high frequency words that a beginning reader is first exposed to. But I think Chomsky and Halle's essential insight - that an abstract writing system has the power to express important linguistic relations that are missing from a more directly phonemic spelling - should not be ignored by spelling reformers. I now give several examples where the conservatism of English orthography has produced features that could help a fluent reader.

(1) Word stress.

The location of primary stress in polysyllabic words in English is not easily predicted, since it depends on several phonemic, morphemic and syntactic factors, and stress is not marked directly in English orthography. However, stress placement is rule-governed, and we have shown in a series of studies (Baker and Smith, 1976; Smith and Baker, 1976; Groat, 1979) that English speakers know quite a lot about these rules, to the extent that when subjects read aloud written nonsense words embedded in normal English sentences, the location of stress is affected by such factors as whether the final vowel in the word is tense, whether the word ends in two consonants, and whether the word is a noun or a verb. These skills are present even in seven-year-old children.

One feature of stress assignment is that it can sometimes be predicted more directly from the written form of the word than from the phonemic form. For example, three-syllable nouns with lax vowels take stress on the first syllable if the second vowel is immediately followed by one consonant (cínema, cátapult), but stress is placed on the second syllable if the second vowel is immediately followed by two consonants (veránda, fiásco). Some words are apparent exceptions to this rule: umbrélla, regátta, where only one consonant follows the second vowel in the spoken form of the word. These exceptions are neatly handled with reference to the written form of the word: umbrella has two 'visual' consonants following the second vowel, putting it in the same class as veranda. Similarly, while two-syllable nouns with final lax vowels take stress on the first syllable (témpest, búcket) some exceptions such as giráffe and grotésque can be accounted for with reference to an underlying three-syllable form, like veránda, from which the third syllable is deleted: although we do not hear this third syllable, its presence is still signaled in the written form by the silent final e. Our experiments have shown that readers do take account of double consonants and silent final e's in pronouncing nonsense words, and we can conclude that such conventions will help a reader in handling unfamiliar words whose pronunciation he might be uncertain about.
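A minimal sketch of the written-form rule just described may help. It is an illustration under strong assumptions, not the procedure used in the experiments: every vowel letter (a, e, i, o, u) is counted as one written vowel, the word must contain exactly three of them, and the vowels are taken to be lax. The function name is hypothetical.

```python
VOWEL_LETTERS = set("aeiou")


def stressed_syllable_from_spelling(word):
    """Return 1 or 2: the syllable predicted to carry primary stress.

    Counts the consonant letters written between the second and third
    vowel letters: two or more predict stress on the second syllable,
    one predicts stress on the first.
    """
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWEL_LETTERS]
    if len(vowel_positions) != 3:
        raise ValueError("sketch only covers words with three written vowels")
    second, third = vowel_positions[1], vowel_positions[2]
    consonants_between = third - second - 1
    return 2 if consonants_between >= 2 else 1


for noun in ["cinema", "catapult", "veranda", "fiasco",
             "umbrella", "regatta", "giraffe"]:
    print(noun, stressed_syllable_from_spelling(noun))
```

Note that umbrella, regatta and giraffe fall out correctly only because the count is made over letters: the doubled consonant and the silent final e are visible in the spelling but not in the spoken form.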

(2) Effects of spelling system on reading strategies.

  One issue, which has received little attention in studies of spelling systems, concerns the influence of the type of information contained in a spelling system on the way a child or adult carries out fluent reading. The novice reader has to move from a strategy of laboriously reading aloud all the words he comes across to a strategy of 'reading for meaning' which can be many times faster than natural speech and where any conversion of a word into its full spoken form might actually interfere with efficient reading. It seems to me there is a possibility that if, say, a child is brought up on a highly phonemic alphabet, his attitude to reading and his reading strategies might over-emphasise the phonemic aspects of reading, to the detriment of the lexical and semantic aspects. In this respect, a more abstract system might encourage the child to look beyond simple grapheme-phoneme correspondences.

To be fair, I do not think that such effects, if they exist, will be very large, but given one of our major educational aims is to teach people to read fluently and with comprehension, I think in our research we should be paying more attention to the effects of teaching methods, spelling systems, reading materials, etc. on the reading abilities of children who should be achieving reasonable fluency (15-year-olds, say) rather than concentrating only on the first few years of learning to read.

In our own research, we have one small piece of evidence bearing on this. Groat (1979) looked at the use of stress assignment rules by two groups of seven-year-old children. One group had used traditional orthography throughout their schooling. The second group had been taught to read with the (more phonemic) initial teaching alphabet, but had recently transferred to traditional orthography. Groat found that the two groups performed in similar ways (in particular, both groups had a sophisticated appreciation of the complexities of English stress assignment rules) but in one respect i.t.a. children were different. Recall that, according to some linguists, words like giraffe and grotesque have an underlying three-syllable form (like veranda) which leads to the final form of the word having stress on the second syllable when the third syllable is deleted. Now children taught with i.t.a. operate just in this fashion - a nonsense noun such as gevespe is quite likely to be treated either as a three-syllable word or as a two-syllable word with stress on the second syllable, whereas children taught only with traditional orthography appear to ignore the final e in gevespe, treating it as a normal two-syllable noun with stress on the first syllable (like tempest). So children taught with a more phonemic alphabet have a different strategy for analyzing the stress patterns of long words, though of course we do not know whether this habit persists into adult life or is, as I suspect, merely a temporary strategy in the transition from i.t.a. to traditional orthography.

(3) The three-letter rule.

  Albrow (1972) has pointed out that content words in English must be spelt with at least three letters, thus there are many words with apparently redundant consonant doubling or silent final e's (e.g. inn, bee, bye, sow, two, ore, contrast with in, be, by, so, to, or). I believe this has some significant implications for reading. Recent studies of eye-movements during reading have shown that word-length plays an important part in the way readers scan a text. For example, McConkie and Rayner (1973) have developed an ingenious computer-controlled display of text which allows them to change the text while the subject is in the process of reading it.

Performance is measured by fixation duration (how long the subject needs to spend looking at each part of the text: the longer the fixation, the less efficient the performance). Now if changes are made in the text more than 12 letters ahead of where the subject is currently looking, his performance is unaffected; if changes are made less than 8 letters ahead of where he is looking, his performance is disrupted; but, significantly, if changes are made between 8 and 12 letters ahead, performance is not disrupted if the changes preserve the shape, length and initial and final letters. If a sentence reads:
The cat is near the back.
and the subject is looking at the word is, we could change back to book or bank without disrupting performance, but changing back to sack (initial letter change) or back to brook (length change) would disrupt performance. This means that information about word length and shape is being processed by the skilled reader well ahead of actual word identification (words cannot be accurately identified when they are 8 to 12 letters from fixation). Moreover studies by O'Regan (1979) have shown that readers are able to control their eye-movement patterns in such a way as to avoid what are normally uninformative parts of the text occupied by short function words. Accordingly it seems that the skilled reader can be guided to the most informative parts of the text by peripheral cues to do with word shape and word length, and this process is facilitated by the three-letter rule which distinguishes two-letter function words from three-letter content words.
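The condition under which a change 8 to 12 letters ahead goes unnoticed can be written as a simple predicate. The sketch below is only an illustration: the text does not define "shape", so it is approximated here by classing each letter as an ascender, a descender or neither, and both the letter classes and the function names are assumptions.

```python
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def shape(word):
    """Crude outline of a lower-case word: A = ascender, D = descender, x = neither."""
    return "".join(
        "A" if ch in ASCENDERS else "D" if ch in DESCENDERS else "x"
        for ch in word
    )


def preserves_preview(old, new):
    """True if the change keeps length, first and last letters, and shape."""
    return (
        len(old) == len(new)
        and old[0] == new[0]
        and old[-1] == new[-1]
        and shape(old) == shape(new)
    )


for replacement in ["book", "bank", "sack", "brook"]:
    print("back ->", replacement, preserves_preview("back", replacement))
```

On the examples in the text, book and bank pass the test while sack and brook fail it, which matches the pattern of disruption reported.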

In this respect, note also that it is an advantage for an orthography to distinguish homophones by words of different shape or length (e.g. threw, through; seen, scene).
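The three-letter rule itself amounts to a length check that a reader could apply to a word seen only in outline. The following sketch uses the pairs quoted from Albrow; the function name is hypothetical, and the check runs in one direction only, since some function words (the, and) also have three letters.

```python
# (function word, content word) pairs quoted in the text
PAIRS = [("in", "inn"), ("be", "bee"), ("by", "bye"),
         ("so", "sow"), ("to", "two"), ("or", "ore")]


def could_be_content_word(word):
    """Under the three-letter rule a content word has at least three letters."""
    return len(word) >= 3


for function_word, content_word in PAIRS:
    print(function_word, could_be_content_word(function_word),
          content_word, could_be_content_word(content_word))
```

A two-letter outline can therefore be dismissed as a function word before it is identified, whereas a longer outline only might be a content word.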

(4) Preservation of morphemic information.

It is a simple observation that syntactically organized text is easier to read than totally disorganized text. It is not even necessary that the text makes sense: syntactic organization by itself helps reading, as Lewis Carroll was well aware ("'Twas brillig, and the slithy toves did gyre and gimble in the wabe..."). Note that Carroll creates syntactic organization by the judicious use of function words (the, did) and the use of certain bound morphemes (-y, -s). Now I argue that those features of current orthography that help us to identify morphemes are making a significant contribution to the ease with which we can extract syntactic structure, and thus these features should be preserved. More formal evidence than Lewis Carroll is available, e.g. Epstein (1961) who showed that nonsense syntactically organized in the manner of Jabberwocky was easier to learn than unorganized nonsense. There are two ways that preservation of morphemes can help organization. First it can help indicate whether a word consists of a single unbound morpheme or an unbound morpheme plus a bound morpheme (so we distinguish the homophones band and banned, please and pleas); second, morphemes that sound different in different environments still look the same (e.g. -s in cats and dogs, -ed in walked, climbed, floated).
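The second point, that one written morpheme covers several pronunciations, can be illustrated with the past-tense ending. The conditioning rule used below is the usual textbook generalisation about English rather than anything argued in this paper, and the set of voiceless sounds and the function name are assumptions made for the sketch.

```python
VOICELESS = {"p", "k", "f", "s", "ʃ", "tʃ", "θ"}


def past_tense_sound(stem_final_sound):
    """Pronunciation of written -ed, chosen from the last sound of the stem."""
    if stem_final_sound in {"t", "d"}:
        return "ɪd"  # floated
    if stem_final_sound in VOICELESS:
        return "t"   # walked
    return "d"       # climbed


# stem, last sound of the stem
for stem, final in [("walk", "k"), ("climb", "m"), ("float", "t")]:
    print(stem + "ed", "is spelt -ed, spoken", past_tense_sound(final))
```

A strictly phonemic spelling would need three different endings here; the current spelling needs one, and the reader supplies the sound by rule.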

Evidence that a reader's information-seeking strategies are strongly influenced by certain bound morphemes and function words comes from work I have been doing using letter cancellation (Smith and Groat, 1979; Smith, Pattison and Groat, in preparation). Subjects (university students) are required to read a text while at the same time cancelling all the e's that they notice in the text. Artificial though this technique may sound, it does not seem to disrupt reading greatly, and it has the merit of telling us exactly what parts of a text a subject notices in making a detailed analysis. Results show that the e in the definite article the, and the e in the bound morpheme -ed frequently fail to be cancelled, and this failure rate is strongly dependent on such variables as the difficulty and coherence of the text, and whether the subject had been instructed to attend to the meaning of the text or not. Moreover there are large sequential effects whereby these sorts of e are especially likely to be missed in particular (syntactically defined) parts of the text. For these reasons we call the e's in the and -ed syntactic e's, in contrast with the other e's, which we call lexical e's and which show small sequential effects and little sensitivity to manipulation of text structure. This dissociation of syntactic and lexical e's suggests to us that readers are using words in the text in two different ways: content words (containing lexical e's) are read in much the same way as words in isolation, their meaning and, if necessary, their pronunciation being looked up in some central dictionary in the brain; but certain function words and bound morphemes (containing syntactic e's) are not analysed in such detail, being used rather to guide the reader through the text, and for this purpose their invariant form is crucial.

(5) Semantic information in spelling.

Semantic information in the spelling of a word, over and above the morphemic information, can appear in English in four ways: (1) Many words are introduced into English from other languages with their non-English spellings retained: spaghetti, Pavlov (the latter being a straight transliteration from the Russian alphabet). (2) Sometimes an English letter is used unconventionally to represent a non-English sound in a loan word: Iraq, Qatar. (3) A substantial number of words have been invented with non-English (usually Latin or Greek) components: psychology, architecture, chromium, cholesterol. (4) Sometimes particular misspellings have become accepted, presumably because they seemed particularly apposite: ghastly, ghost, ghoul.

These processes have some relevant implications for reading: we can guess that spaghetti comes from Italy because of its characteristic Italian spelling; the non-English spelling of Pavlov ('native' English words cannot end in a v) indicates his Slavonic origin; likewise the non-English q in Iraq and Qatar indicates an Arabic origin; the hard ch in psychology, architecture, etc. often indicates a recently invented word (based on Greek) and hence such spellings are likely to indicate words of scientific or technological origin; and ghastly, ghost, ghoul can be seen to be semantically related, thanks to a slip by William Caxton.

To be honest, we do not know how important these semantic cues are for the reader and the speller: certainly educated adults, when asked about the meaning of an unfamiliar word, will often use its spelling as a clue to its meaning, and certainly there is plenty of evidence in the psychological literature that the meanings of words can be assessed directly by the reader without recourse to the full phonemic form of the words, but I am inclined to think that the purely semantic information available directly from English spelling is present too sporadically to make a substantial contribution to normal reading. But this is no argument for removing all traces of such information from spelling; rather we should be looking to exploit and systematize such information as is present (it is, for example, unfortunate that Tchaikovsky and Chekhov are not spelt in British English with the same initial letters, when a systematic transliteration of Russian to English would require this).

Shorthand Systems. Finally I want briefly to discuss shorthand systems. These systems provide further examples of writing systems that demonstrate the advantages of going beyond strictly graphemic-phonemic correspondences. Shorthand systems are interesting because they are reformed quite frequently, there are several systems competing for students, and there is a strong pressure for them to achieve a well-defined criterion, namely to permit rapid and error-free transcription of speech. In short, there are just the sort of pressures, largely missing from traditional orthographies, that should lead to the development of efficient systems.

We have reviewed English shorthand systems recently (Smith and Patterson, to appear). Our conclusion is that their relation to speech is just as abstract as traditional orthography. For example, consider Pitman New Era, one of the fastest and one of the most phonemic systems. There exist in Pitman New Era abstract phonological conventions like voicing neutralization (ass and as, prices and prizes, Confucian and confusion, would constitute pairs of homographs), rules that operate differently within a word and at the ends of words (sleep and asleep, honest and honesty, would be written in fundamentally different ways, because abbreviations for clusters such as sl- and -st are only available when these occur at the beginning or end of a word) and several abbreviatory devices that ignore syllable structure (spring and separate would begin with an abbreviation for spr-, despite the fact that in one case spr- stands for a true consonant cluster and in the other case for two syllables from which the vowels have been deleted). The moral is that rapid writing systems need not stay close to phonemic detail to be efficient. Psychological studies of shadowing (repeating back a message at the same time as listening to it) make much the same point (Marslen-Wilson, 1975): a wide range of linguistic information (morphemic, lexical, semantic) is computed by a listener with remarkably short latency, and there is no evidence that all information must be fully represented in phonemic form before we can start to understand it. Hence there is no reason why an efficient writing system should dwell exclusively on phonemic detail.
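Voicing neutralization of the kind mentioned above can be shown with a toy model. This is not a description of Pitman's actual signs: it simply writes each voiced sibilant as its voiceless partner and checks that the three word pairs collapse into homographs. The broad phonemic transcriptions and the function name are assumptions made for illustration.

```python
# Map each voiced sibilant onto its voiceless counterpart.
NEUTRALISE = str.maketrans({"z": "s", "ʒ": "ʃ"})


def neutralised(transcription):
    """Written form of a transcription when sibilant voicing is not marked."""
    return transcription.translate(NEUTRALISE)


# Broad transcriptions, assumed for illustration.
PAIRS = [
    ("æs", "æz"),                  # ass, as
    ("praɪsɪz", "praɪzɪz"),        # prices, prizes
    ("kənfjuːʃən", "kənfjuːʒən"),  # Confucian, confusion
]

for first, second in PAIRS:
    print(first, second, neutralised(first) == neutralised(second))
```

Each pair differs in speech but not in the neutralised written form, which is the sense in which a fast and nominally phonemic system is still an abstraction from speech.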


Let me first deal with one objection to the arguments I have been putting forward. It is unnecessary, it is claimed, to distinguish homophones (know, no), to preserve morphemes (walked, climbed, floated) or to have a three-letter rule to aid in discrimination of function and content words (or, ore) because context will almost always allow us to resolve any ambiguities. First, let me remark that the use of "context" is very much a two-edged weapon: we could equally well invoke context to justify all sorts of non-phonemic reforms, such as dropping nearly all the vowels as Semitic orthographies do. Second, writing typically provides less context than speech: when I say, The sun's rays meet or The sons raise meat, it is likely that a gesture I make, or perhaps the rhythm of the sentence, will give some hint to the meaning, and these contexts are absent on the printed page. Third, and most important, fluent reading is faster than speech, and needs all the help it can get to be efficient: one reason nobody pushes for vowel deletion as a spelling reform (think of all the space that would save) is that although intelligibility would scarcely be affected, the removal of useful supportive information would probably reduce the reading rate considerably. Let us put as much information into spelling as the reader can usefully handle.

Looking back over my arguments, and having listened to some of the papers at the Northampton conference of the Simplified Spelling Society, what should I recommend about spelling reform? First, I would be against deleting the second l from umbrella or the e from giraffe, since they help with the correct assignment of stress; I would be against dropping the h from spaghetti, and against using k to stand for ch in psychology or for q in Iraq, since useful information is given. However, with these examples I acknowledge my position is elitist: these conventions help good readers squeeze a little more information out of difficult or low frequency words. It seems to me an open empirical question whether these slight advantages outweigh the disadvantages for less able readers.

However there are some reforms that I would much more confidently oppose, because they affect processes that are involved in some of the most central parts of reading. May I re-emphasise that efficient reading depends on much more than accurate phonemics, and that word shape, word length and morphemic structure are important guides for rapid reading. With this perspective, I would be against destroying morphemic invariance (-s, -ed, etc.), against dropping redundant letters in three-letter content words (add, axe, egg, etc.) and against destroying different spellings for homophones (gate, gait). On the other hand, preserving the close visual similarity of divine and divinity is probably less important (the words will begin with the same letters and have roughly the same shape no matter how we spell the second vowel).

I return to my point that spelling should contain as much information as the reader and speller can usefully handle. It seems to me beyond dispute that much of this information should be phonemic, and that in the early stages of reading, the phonemic aspects of spelling need to be stressed. But if we want to develop an orthography that does justice to the richness of the English language and permits fluent and intelligent reading and writing, we should take great care to incorporate into any reformed orthography information that refers to deeper levels of linguistic knowledge.

Acknowledgement. This paper was written while the author was visiting the Max-Planck-Gesellschaft Projektgruppe für Psycholinguistik, Nijmegen, Holland, whose support is gratefully acknowledged.


