List of Phonetic Symbols and Signs
a Cardinal Vowel no. 4 (approximately as in French patte); used for first element of Eng. diphthong [aɪ]
æ front vowel between open and open-mid (Eng. vowel in cat)
ɑ Cardinal Vowel no. 5 (approximately as in French pas); used for Eng. [ɑː] in car
ɒ open rounded Cardinal Vowel no. 5 (Eng. vowel in dog)
b voiced bilabial plosive (Eng. b in labour)
ɓ voiced ingressive bilabial plosive
β voiced bilabial fricative
c voiceless palatal plosive
ç voiceless palatal fricative
ɔ Cardinal Vowel no. 6 (approximately as in German Sonne); used for Eng. [ɔː] in saw, and first element of diphthong [ɔɪ]
d voiced alveolar plosive (Eng. d in lady)
ɗ voiced ingressive alveolar plosive
dʒ voiced palato-alveolar affricate
ð voiced dental fricative (Eng. th in other)
e Cardinal Vowel no. 2 (approximately as in French thé); used for Eng. [e] in bed, and first element of diphthong [eɪ]
ə unrounded central vowel (Eng. initial and final vowels in another)
ɚ retroflexed central vowel (American er in water)
ɛ Cardinal Vowel no. 3 (approximately as in French père); used for first element of diphthong in [ɛə]
ɜ unrounded central vowel (Eng. vowel in bird)
ɝ retroflexed central vowel
f voiceless labiodental fricative (Eng. f in four)
ɟ voiced palatal plosive
g voiced velar plosive (Eng. g in eager)
ɠ voiced velar implosive
h voiceless glottal fricative (Eng. h in house)
ɦ voiced glottal fricative (sometimes Eng. h in behind)
i Cardinal Vowel no. 1 (approximately as in French si); used for Eng. /iː/ in see
ɨ unrounded close central vowel
ɪ centralized unrounded close-mid vowel (Eng. vowel in sit)
j (unrounded) palatal approximant (Eng. y in you)
ɾ voiced alveolar tap (sometimes r in Eng. very)
k voiceless velar plosive (Eng. c in car)
l voiced alveolar lateral approximant (Eng. l in lay)
ɫ voiced alveolar lateral approximant with velarization (Eng. ll in ill)
ɬ voiceless alveolar lateral fricative (Welsh ll)
m voiced bilabial nasal (Eng. m in me)
ɱ voiced labiodental nasal (Eng. m in comfort)
ɯ Cardinal Vowel no. 16 (like Eng. /uː/ with spread lips)
n voiced alveolar nasal (Eng. n in no)
ŋ voiced velar nasal (Eng. ng in sing)
ɲ voiced palatal nasal (French gn in vigne)
o Cardinal Vowel no. 7 (approximately as in French eau)
ø Cardinal Vowel no. 10 (approximately as in French peu)
œ Cardinal Vowel no. 11 (approximately as in French peur)
θ voiceless dental fricative (Eng. th in thing)
p voiceless bilabial plosive (Eng. p in pea)
r voiced alveolar trill (an emphatic pronunciation of r in Scottish English)
ɹ voiced post-alveolar approximant (Eng. r in red)
ɻ voiced retroflex approximant
ʀ voiced uvular trill
ʁ voiced uvular fricative or approximant
s voiceless alveolar fricative (Eng. s in see)
ʃ voiceless palato-alveolar fricative (Eng. sh in she)
t voiceless alveolar plosive (Eng. t in tea)
tʃ voiceless palato-alveolar affricate
ǀ dental click
u Cardinal Vowel no. 8 (approximately as in French doux); used for Eng. /uː/ in do
ʉ rounded close central vowel
ʊ centralized rounded close-mid vowel (Eng. u in put)
v voiced labiodental fricative (Eng. v in ever)
ʌ Cardinal Vowel no. 14; used for Eng. /ʌ/ in cup
ʋ voiced labiodental approximant
w labial-velar semi-vowel (Eng. w in we)
ʍ voiceless labial-velar fricative (sometimes Eng. wh in why)
x voiceless velar fricative (Scottish ch in loch)
y Cardinal Vowel no. 9 (approximately as in French du)
ʎ voiced palatal lateral approximant (Italian gl in egli)
ɤ Cardinal Vowel no. 15
ɣ voiced velar fricative
z voiced alveolar fricative (Eng. z in lazy)
ʒ voiced palato-alveolar fricative (Eng. s in measure)
ɸ voiceless bilabial fricative
ǁ alveolar lateral click
ʔ glottal plosive
ː indicates full length of preceding vowel
ˑ indicates half length of preceding vowel
. high unaccented pre-nuclear syllable
ˋ high falling nuclear tone (and used to indicate primary accent in citation forms)
ˎ low falling nuclear tone
ˊ high rising nuclear tone
ˏ low rising nuclear tone
ˇ falling-rising nuclear tone
ˆ rising-falling nuclear tone
˃ mid-level nuclear tone
= stylized tone (high level followed by mid-level)
ˈ syllable carrying (high) secondary accent
ˌ syllable carrying (low) secondary accent
◌̃ nasalization, e.g. [õ]
◌̈ centralization, e.g. [ö]
◌̞ more open quality, e.g. [o̞]
◌̝ closer quality, e.g. [o̝]
◌̥ devoiced lenis consonant, e.g. [z̥] (above in the case of [ŋ̊, ʒ̊, g̊])
◌̩ syllabic consonant, e.g. [n̩] (above in the case of [ŋ̍])
◌̪ dental articulation, e.g. [t̪]
◌̠ post-alveolar articulation
[ ] phonetic transcription
/ / phonemic transcription
> changed to
< developed from
→ is realized as
* common in RP (Figs. 8-26 and in Chapter 10)
PART I
Speech and Language
Communication
1.1 Speech
One of the chief characteristics of human beings is their ability to communicate to their fellows complicated messages concerning every aspect of their activity. A man possessing the normal human faculties achieves this exchange of information mainly by means of two types of sensory stimulation, auditory and visual. Children learn from a very early age to respond to the sounds and tunes which their elders habitually use in talking to them; and, in due course, from a need to communicate, they begin to imitate the recurrent sound patterns with which they have become familiar. In other words, they begin to make use of speech; and their constant exposure to the spoken form of their own language, together with their need to convey increasingly subtle types of information, leads to a rapid acquisition of the framework of spoken language. Nevertheless, with all the conditions in favour, a number of years pass before they master the sound system used in their community. It is no wonder, therefore, that the learning of another language later in life, acquired artificially in brief and sporadic spells of activity and often without the stimulus arising from an immediate need for communication, will tend to be tedious and rarely more than partially successful. In addition, the more firmly consolidated the basis of a first language becomes and the later in life a second language is begun, the more learners will be subject to resistances and prejudices deriving from the framework of their original language. As we grow older, the acquisition of a new language will normally entail a great deal of conscious, analytical effort, instead of children's ready and facile imitation.
1.2 Writing
Later in childhood children will be taught the conventional visual representation of speech: they will learn to use writing. Today, in considering those languages which have long possessed a written form, we are apt to forget that the writing was originally an attempt at reflecting the spoken language, and that the latter precedes the former for both the individual and the community. Indeed, in many languages, so parallel are the two forms felt to be that the written form may be responsible for changes in pronunciation or may at least tend to impose restraints upon its development. In the case of English, this sense of parallelism, rather than of derivation, may be encouraged by the obvious lack of consistent relationship between sound and spelling. A written form of English, based on the Latin alphabet, has existed for more than 1,000 years and, though the pronunciation of English has been constantly changing during this time, few basic changes of spelling have been made since the fifteenth century. The result is that written English is often an inadequate and misleading representation of the spoken language of today. Clearly it would be unwise, to say the least, to base our judgements concerning the spoken language on prejudices derived from the orthography. Moreover, if we are to examine the essence of the English language, we must make our approach through the spoken rather than the written form. The primary concern of this book will be the production, transmission, and reception of the sounds of English: in other words, the phonetics of English.
1.3 Language
From the moment that we abandon orthography as our starting-point, it is clear that the analysis of the spoken form of English is by no means simple. Each of us uses an infinite number of different speech sounds when we speak English. Indeed, it is true to say that it is difficult to produce two sounds which are precisely identical from the point of view of instrumental measurement: two utterances by the same person of the word cat may well show quite marked differences when measured instrumentally. Yet we are likely to say that the same sound sequence has been repeated. Additionally we may hear clear and considerable differences of quality in the vowel of cat as, for instance, in the London and Manchester pronunciations of the word; yet, though we recognize differences of vowel quality, we are likely to feel that we are dealing with a 'variant' of the 'same' vowel. It seems, then, that we are concerned with two kinds of reality: the concrete, measurable reality of the sounds uttered, and another kind of reality, an abstraction made in our minds, which appears to reduce this infinite number of different sounds to a 'manageable' number of categories. In the first, concrete, approach, we are dealing with sounds in relation to speech; at the second, abstract, level, our concern is the behaviour of sounds in a particular language. A language is a system of conventional signals used for communication by a whole community. This pattern of conventions covers a system of significant sound units (the phonemes), the inflexion and arrangement of 'words', and the association of meaning with words. An utterance, an act of speech, is a single concrete manifestation of the system at work. As we have seen, several utterances which are plainly different on the concrete, phonetic, level may fulfil the same function, i.e. are the 'same', on the systematic language level. 
It is important in any analysis of spoken language to keep this distinction in mind and we shall later be considering in some detail how this dual approach to the utterance is to be made. It is not, however, always possible or desirable to keep the two levels of analysis entirely separate: thus, as we shall see, we will draw upon our knowledge of the linguistically significant units to help us in determining how the speech continuum shall be divided up on the concrete, phonetic, level; and again, our classification of linguistic units will be helped by our knowledge of their phonetic features.
1.4 Redundancy
Finally, it is well to remember that, although the sound system of our spoken languages serves us primarily as a medium of communication, its efficiency as such an instrument of communication does not depend upon the perfect production and reception of every single element of speech. A speaker will, in almost any utterance, provide the listener with far more cues than he needs for easy comprehension. In the first place, the situation, or context, will itself delimit very largely the purport of an utterance. Thus, in any discussion about a zoo, involving a statement such as 'We saw the lions and tigers', we are predisposed by the context to understand lions, even if the n is omitted and the word actually said is liars. Or again, we are conditioned by grammatical probabilities, so that a particular sound may lose much of its significance; e.g. in the phrase 'These men are working', the quality of the vowel in men is not as vitally important for deciding whether it is a question of men or man as it would be if the word were said in isolation, since here the plurality is determined in addition by the demonstrative adjective preceding men and the verb form following. Then again, there are particular probabilities in every language as to the different combinations of sounds which will occur. Thus in English, if we hear an initial th sound [ð], we expect a vowel to follow, and of the vowels some are much more likely than others. We distinguish such sequences as -gl and -dl in final positions, e.g. in beagle and beadle, but this distinction is not relevant initially, so that even if dloves is said, we understand gloves. Or again, the total rhythmic shape of a word may provide an important cue to its recognition: thus, in a word such as become, the general rhythmic pattern may be said to contribute as much to the recognition of the word as the precise quality of the vowel in the first, weakly accented, syllable.
Indeed, we may come to doubt the relative importance of vowels as a help to intelligibility, since we can replace our twenty English vowels by the single vowel [ə] in any utterance and still, if the rhythmic pattern is kept, retain a high degree of intelligibility. An utterance, therefore, will provide a large complex of cues for the listener to interpret, but a great deal of this information will be redundant, as far as the listener's needs are concerned. On the other hand, such an over-proliferation of cues will serve to offset any disturbance such as noise or to counteract the sound-quality divergences which may exist between speakers of two dialects of the same language. But to insist, for instance, upon exaggerated articulation in order to achieve clarity may well be to go beyond the requirements of speech as a means of communication; indeed, certain obscurations of quality are, and have been for many centuries, characteristic of English. Aesthetic judgements on speech, such as those which deplore the use of the 'intrusive r', take into account social considerations of a somewhat different order from those involved in a study of speech as communication.
1.5 Phonetics and Linguistics
This book describes the sound system of English, but it should be remembered that such a description forms only part of the total description of a language. A complete description of the current state of a language provides information on a number of interrelated components.
The phonetics of a language concerns the concrete characteristics (articulatory, acoustic, auditory) of the sounds used in languages, while phonology concerns how sounds function in a systemic way in a particular language. The traditional approach to phonology is through phonemics, which analyses the stream of speech into a sequence of contrastive segments, 'contrastive' here meaning 'contrasting with other segments which might change the meaning' (see further §5.3 below). The phonemic approach to phonology is not the only type of phonological theory but it is the most accessible to those with no training in linguistic theory, besides being more relatable to the writing system. Hence the major part of this book is set within phonemic analysis. Besides being concerned with the sounds of a language, both phonetics and phonology must also describe the combinatory possibilities of the sounds (the phonotactics or syllable structure) and the prosody of the language, that is, how features of pitch, loudness, and length work to produce accent, rhythm, and intonation. Additionally, a study can be made of the relationship between the sounds of a language and the letters used in its writing system (graphology or graphemics).
While this book presents a detailed description of the phonetics and phonemics of English, reference will need to be made from time to time to other components of the language:
(1) The lexicon: the words of the language, the sequence of phonemes of which they are composed, together with their meanings.
(2) The morphology: the structure of words, in particular their inflexion (e.g. start/started, where the past-tense morpheme is added to the stem morpheme). Statements can also be made of the phonemic structure of morphemes, i.e. the morphophonemics. So the morphophonemics of the English plural morpheme involve the morphophonemic alternations illustrated by the /s/ in cats, the /z/ in dogs, and the /ɪz/ in losses.
(3) The syntax: the description of categories like noun and verb, and the system of rules governing the structure of phrases, clauses, and sentences in terms of order and constituency.
(4) The semantics: the meaning of words and the relationship between word meanings, and the way such meanings are combined to give the meanings of sentences.
(5) The pragmatics: the influence of situation on the interpretation of utterances.
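The plural alternation mentioned under (2) can be sketched as a small rule. The text gives only the three examples (cats, dogs, losses); the conditioning assumed here (/ɪz/ after sibilants, /s/ after other voiceless consonants, /z/ elsewhere) is the usual textbook statement, and the names plural_allomorph, SIBILANTS, and VOICELESS are illustrative only.

```python
# A minimal sketch of the English plural alternation /s/ ~ /z/ ~ /ɪz/.
# Assumption: the choice depends only on the final phoneme of the stem.

SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}   # stems ending here take /ɪz/
VOICELESS = {"p", "t", "k", "f", "θ"}          # other voiceless finals take /s/


def plural_allomorph(final_phoneme: str) -> str:
    """Return the plural ending for a stem ending in the given phoneme."""
    if final_phoneme in SIBILANTS:
        return "ɪz"
    if final_phoneme in VOICELESS:
        return "s"
    # Voiced consonants and all vowels.
    return "z"


if __name__ == "__main__":
    for word, final in [("cat", "t"), ("dog", "g"), ("loss", "s")]:
        print(word, "->", "/" + plural_allomorph(final) + "/")
```

The sibilant check has to come first, since /s/ and /ʃ/ are themselves voiceless and would otherwise be assigned /s/.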
Moreover, various other aspects of linguistics will involve phonetics and phonology. Stylistics concerns the variations involved in different situations and in different styles of speech. Sociolinguistics concerns the interaction between language and society (e.g. the variation involved across classes and between the sexes). Dialectology (often considered a branch of sociolinguistics) concerns the variation in the same language in different regions. Psycholinguistics concerns the behaviour of human beings in their production and perception of language (e.g. how far do we plan ahead and how much of an utterance do we decode at a time?). Language acquisition concerns children's learning of their first language, whereas applied linguistics principally concerns the acquisition of a second language.
Finally, it is clear that the various components of a language are always undergoing change in time. The state of a language at any (synchronic) moment must be seen against a background of its historical (diachronic) evolution. It is for this reason that this book includes information on earlier states of the sound system of English, with some speculation on possible developments in the future.
2
The Production of Speech: The Physiological Aspect
2.1 The Speech Chain
Any manifestation of language by means of speech is the result of a highly complicated series of events. The communication in sound of such a simple concept as 'It's raining' involves a number of activities on the part of the speaker. In the first place, the formulation of the concept will take place at a linguistic level, i.e. in the brain; the first stage may, therefore, be said to be psychological. The nervous system transmits this message to the so-called 'organs of speech' and these in turn behave in a conventional manner, which, as we have learned by experience, will have the effect of producing a particular pattern of sound; the second important stage for our purposes may thus be said to be articulatory or physiological. The movement of our organs of speech will create disturbances in the air, or whatever the medium may be, through which we are talking; these varying air pressures may be investigated and they constitute the third stage in our chain, the physical, or acoustic. Since communication generally requires a listener as well as a speaker, these stages will be reversed at the listening end: the reception of the sound waves by the hearing apparatus (physiological) and the transmission of the information along the nervous system to the brain, where the linguistic interpretation of the message takes place (psychological). Phonetic analysis has often ignored the role of the listener. But any investigation of speech as communication must ultimately be concerned with both the production and the reception ends.
Our immediate concern, however, is with the speaker's behaviour and more especially, on the concrete speech level, with the activity involved in the production of sounds. For this reason, we must now examine the articulatory stage (the speech mechanism) to discover how the various organs behave in order to produce the sounds of speech.
2.2 The Speech Mechanism
Man possesses, in common with many other animals, the ability to produce sounds by using certain of his body's mechanisms. The human being differs from other animals in that he has been able to organize the range of sounds which he can emit into a highly efficient system of communication. Non-human animals rarely progress beyond the stage of using the sounds they produce as a reflex of certain basic stimuli to signal fear, hunger, sexual excitement, and the like. Nevertheless, like other animals, man when he speaks makes use of organs whose primary physiological function is unconnected with vocal communication; in particular, those situated in the respiratory tract.
2.2.1 Sources of Energy: The Lungs
The most usual source of energy for our vocal activity is provided by an airstream expelled from the lungs. There are languages which possess sounds not requiring lung (pulmonic) air for their articulation, and, indeed, in English we have one or two extralinguistic sounds, such as the one we write as tut-tut and the noise of encouragement made to horses, which are produced without the aid of the lungs; but all the essential sounds of English use lung air for their production. Our utterances are, therefore, largely shaped by the physiological limitations imposed by the capacity of our lungs and by the muscles which control their action. We are obliged to pause in articulation in order to refill our lungs with air, and the number of energetic peaks of exhalation which we make will to some extent condition the division of speech into sense-groups. In those cases where the airstream is not available for the upper organs of speech, as when, after the removal of the larynx, lung air does not reach the mouth but escapes from an artificial aperture in the neck, a new source of energy, such as stomach air, has to be employed; a new source of this kind imposes restrictions of quite a different nature from those exerted by the lungs, so that the organization of the utterance into groups is changed and variation of energy is less efficiently controlled.
A number of techniques are available for the investigation of the activity in speech of the lungs and their controlling muscles. At one time air pressure within the lungs was observed by the reaction of an air-filled balloon in the stomach. On the basis of such evidence from a gastric balloon, it was at one time claimed that syllables were formed by chest pulses.¹ Such a primitive procedure was replaced by the technique of electromyography, which demonstrated the electrical activity of those respiratory muscles most concerned in speech, notably the internal intercostals; this technique disproved the relationship between chest pulses and syllables.² X-ray photography can reveal the gross movements of the ribs and hence by inference the surrounding muscles, although the technique of Magnetic Resonance Imaging (MRI) is now preferred on medical grounds.
¹ Stetson (1951). ² Ladefoged (1967).
Fig. 1. Organs of speech.
2.2.2 The Larynx and Vocal Folds
The airstream provided by the lungs undergoes important modifications in the upper parts of the respiratory tract before it acquires the quality of a speech sound. First of all, in the trachea or windpipe, it passes through the larynx, containing the so-called vocal folds, often, less correctly, called the vocal cords, or even vocal chords (see Fig. 1).
The larynx is a casing, formed of cartilage and muscle, situated in the upper part of the trachea. Its forward portion is prominent in the neck below the chin and is commonly called the 'Adam's apple'. Housed within this structure from back to front are the vocal folds, two folds of ligament and elastic tissue which may be brought together or parted by the rotation of the arytenoid cartilages (attached at the posterior end of the folds) through muscular action. The inner edge of these folds is typically about 17 to 22 mm long in males and about 11 to 16 mm in females.³ The opening between the folds is known as the glottis. Biologically, the vocal folds act as a valve which is able to prevent the entry into the trachea and lungs of any foreign body, or which may have the effect of enclosing the air within the lungs to assist in muscular effort on the part of the arms or the abdomen. In using the vocal folds for speech, the human being has adapted and elaborated upon this original open-or-shut function in the following ways (see Fig. 2).
(1) The glottis may be held tightly closed, with the lung air pent up below it. This 'glottal stop' [ʔ] frequently occurs in English, e.g. when it precedes the energetic articulation of a vowel as in apple [ʔæpl] or when it reinforces /p,t,k/ as in clock [klɒʔk] or even replaces them, as in cotton [kɒʔn]. It may also be heard in defective speech, such as that arising from cleft palate, when [ʔ] may be substituted for the stop consonants, which, because of the nasal air escape, cannot be articulated with proper compression in the mouth cavity.
³ Clark and Yallop (1990).
Fig. 2. The vocal cords as seen from above (arytenoid cartilages labelled): [a] tightly together as for [ʔ]; [b] loosely together and vibrating as for voiced sounds; [c] open for normal breathing and voiceless sounds.
(2) The glottis may be held open as for normal breathing and for voiceless sounds like [s] in sip and [p] in peak.
(3) The action of the vocal folds which is most characteristically a function of speech consists in their role as a vibrator set in motion by lung air, that is, the production of voice, or phonation; this vocal-fold vibration is a normal feature of all vowels or of such a consonant as [z] compared with voiceless [s]. In order to achieve the effect of voice, the vocal folds are brought sufficiently close together that they vibrate when subjected to air pressure from the lungs. This vibration, of a somewhat undulatory character, is caused by compressed air forcing the opening of the glottis and the resultant reduced air pressure permitting the elastic folds to come together once more; the vibratory effect may easily be felt by touching the neck in the region of the larynx or by putting a finger over each ear flap when pronouncing a vowel or [z] for instance. In the typical speaking voice of a man, this opening and closing action is likely to be repeated between 100 and 150 times in a second, i.e. there are that number of cycles of vibration per second (called Hertz, which is abbreviated to Hz); in the case of a woman's voice, this frequency of vibration might well be between 200 and 325 Hz. We are able, within limits, to vary the speed of vibration of our vocal folds or, in other words, are able consciously to change the pitch of the voice produced in the larynx; the more rapid the rate of vibration, the higher is the pitch (an extremely low rate of vibration being partly responsible for what is usually called creaky voice). Normally the vocal folds come together rapidly and part more slowly, the opening phase of each cycle thus being longer than the closing phase. This gives rise to 'modal' (or 'normal') voice which is used for most of English speech. Other modes of vibration result in other voice qualities, most notably breathy and creaky voice, which are used contrastively in a number of languages. (See also §5.8.)
Moreover, we are able, by means of variations in pressure from the lungs, to modify the size of the puff of air which escapes at each vibration of the vocal folds; in other words, we can alter the amplitude of the vibration, with a corresponding change of loudness of the sound heard by a listener. The normal human being soon learns to manipulate his glottal mechanism so that most delicate changes of pitch and loudness are achieved. Control of this mechanism is, however, very largely exercised by the ear, so that such variations are exceedingly difficult to teach to those who are born deaf, and a derangement of pitch and loudness control is liable to occur among those who become totally deaf later in life.
(4) One other action of the larynx should be mentioned. A very quiet whisper may result merely from holding the glottis in the voiceless position. But the more normal whisper, by means of which we are able to communicate with some ease, can be felt to involve energetic articulation and considerable stricture in the glottal region. Such a whisper may in fact be uttered with an almost total closure of the glottis and an escape of air in the region of the arytenoids.
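The vibration rates quoted under (3) can be turned into the duration of a single opening-and-closing cycle, since the period is simply the reciprocal of the frequency. The following is a minimal sketch of that arithmetic; the function name period_ms is illustrative and not taken from the text.

```python
# Period of one vocal-fold cycle, in milliseconds, from the rate in Hz.
# period (s) = 1 / frequency (Hz), so period (ms) = 1000 / frequency.

def period_ms(frequency_hz: float) -> float:
    """Return the duration of one cycle of vibration in milliseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / frequency_hz


if __name__ == "__main__":
    # The ranges given in the text for typical male and female voices.
    for label, hz in [("male, low", 100), ("male, high", 150),
                      ("female, low", 200), ("female, high", 325)]:
        print(f"{label}: {hz} Hz -> {period_ms(hz):.2f} ms per cycle")
```

At 100 Hz each cycle lasts 10 ms and at 200 Hz it lasts 5 ms, so a higher pitch corresponds to a shorter cycle.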
The simplest way of observing the behaviour of the vocal cords is by the use of a laryngoscope, which gives a stationary mirrored image of the glottis. Using stroboscope techniques, it is possible to obtain a moving record, and high-speed films have been made of the vocal cords, showing their action in ordinary breathing, producing voice and whisper, and closed as for a glottal stop. The modern technique of observation is to use fiberoptic endoscopy coupled if required with a videocamera.
2.2.3 The Resonating Cavities
The air stream, having passed through the larynx, is now subject to further modification according to the shape assumed by the upper cavities of the pharynx and mouth, and according to whether the nasal cavity is brought into use or not. These cavities function as the principal resonators of the voice produced in the larynx.
2.2.3.1 The Pharynx
The pharyngeal cavity (see Fig. 1) extends from the top of the trachea and oesophagus, past the epiglottis and the root of the tongue, to the region at the rear of the soft palate. It is convenient to identify these sections of the pharynx by naming them: laryngopharynx, oropharynx, nasopharynx. The shape and volume of this long chamber may be considerably modified by the constrictive action of the muscles enclosing the pharynx, by the movement of the back of the tongue, by the position of the soft palate which may, when raised, exclude the nasopharynx, and by the raising of the larynx itself. The position of the tongue in the mouth, whether it is advanced or retracted, will affect the size of the oropharyngeal cavity; the modifications in shape of this cavity should, therefore, be included in the description of any vowel. It is a characteristic of some kinds of English pronunciation that certain vowels, e.g. the [æ] vowel in sad, are articulated with a strong pharyngeal contraction; in addition, a constriction may be made between the lower rear part of the tongue and the wall of the pharynx so that friction, with or without voice, is produced, such fricative sounds being a feature of a number of languages.
The pharynx may be observed by means of a laryngoscope or fiberoptic nasendoscopy, and its constrictive actions are revealed by lateral x-ray photography or, nowadays, preferably by MRI.
The escape of air from the pharynx may be effected in one of three ways:
(1) The soft palate may be lowered, as in normal breathing, in which case the air may escape through the nose and the mouth. This is the position taken up by the soft palate in articulation of the French nasalized vowels in such a phrase as un bon vin blanc [œ̃ bɔ̃ vɛ̃ blɑ̃], the particular quality of such vowels being achieved through the function of the nasopharyngeal cavity. Indeed, there is no absolute necessity for nasal airflow out of the nose, the most important factor in the production of nasality being the sizes of the posterior oral and nasal openings (some speakers may even make the nasal cavities vibrate through nasopharyngeal mucus or through the soft palate itself).⁴
(2) The soft palate may be lowered so that a nasal outlet is afforded to the airstream, but a complete obstruction is made at some point in the mouth, with the result that, although air enters all or part of the mouth cavity, no oral escape is possible. A purely nasal escape of this sort occurs in such nasal consonants as [m,n,ŋ] in the English words ram, ran, rang. In a snore and some kinds of defective speech, this nasal escape may be accompanied by friction between the rear side of the soft palate and the pharyngeal wall.
(3) The soft palate may be held in its raised position, eliminating the action of the nasopharynx, so that the air escape is solely through the mouth. All normal English sounds, with the exception of the nasal consonants mentioned, have this oral escape. Moreover, if for any reason the lowering of the soft palate cannot be effected, or if there is an enlargement of the organs enclosing the nasopharynx or a blockage brought about by mucus, it is often difficult to articulate either nasalized vowels or nasal consonants. In such speech, typical of adenoidal enlargement or the obstruction caused by a cold, the French phrase mentioned above would have its nasalized vowels turned into their oral equivalents and the English word morning would have its nasal consonants replaced by [b,d,g]. On the other hand, an inability to make an effective closure by means of the raising of the soft palate (either because the soft palate itself is defective or because an abnormal opening in the roof of the mouth gives access to the nasal cavity) will result in the general nasalization of vowels and the failure to articulate such oral stop consonants as [b,d,g]. This excessive nasalization (or hypernasality) is typical of such a condition as cleft palate.
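The replacement of nasal consonants by [b,d,g] described under (3) amounts to keeping the place of articulation and the voicing while removing the nasal escape. A minimal sketch of that substitution follows; the transcription of morning as the phoneme list /m ɔː n ɪ ŋ/ and the names DENASALIZED and denasalize are assumptions made for illustration.

```python
# Each nasal is replaced by the voiced plosive made at the same place:
# bilabial m -> b, alveolar n -> d, velar ŋ -> g.
DENASALIZED = {"m": "b", "n": "d", "ŋ": "g"}


def denasalize(phonemes: list[str]) -> list[str]:
    """Replace nasal consonants by their oral plosive counterparts."""
    return [DENASALIZED.get(p, p) for p in phonemes]


if __name__ == "__main__":
    morning = ["m", "ɔː", "n", "ɪ", "ŋ"]   # assumed transcription
    print("".join(denasalize(morning)))
```

Segments other than the three nasals pass through unchanged, which reflects the statement that all other normal English sounds already have a purely oral escape.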
It is evident that the action of the soft palate is accessible to observation by direct means, as well as by lateral x-ray photography and MRI; the pressure of the air passing through the nasal cavities may be measured at the nostrils or within the cavities themselves.
The Mouth
Although all the cavities so far mentioned play an essential part in the production of speech sounds, most attention has traditionally been paid to the behaviour of the cavity formed by the mouth. Indeed, in many languages the word tongue is used to refer to our speech and language activity. Such a preoccupation with the oral cavity is doubtless due to the fact that it is the most readily accessible and easily observed section of the vocal tract; but there is in such an attitude a danger of gross oversimplification. Nevertheless, it is true that the shape of the mouth determines finally the quality of the majority of our speech sounds.
4 Laver (1980).
14 Speech and Language
The Production of Speech 15
Far more finely controlled variations of shape are possible in the mouth than in any other part of the speech mechanism.
The only boundaries of this oral chamber which may be regarded as relatively fixed are, in the front, the teeth; in the upper part, the hard palate; and, in the rear, the pharyngeal wall. The remaining organs are movable: the lips, the various parts of the tongue, and the soft palate with its pendant uvula (see Fig. 1). The lower jaw, too, is capable of very considerable movement; its movement will control the gap between the upper and lower teeth and also to a large extent the disposition of the lips. The space between the upper and lower teeth will often enter into our description of the articulation of sounds; in all such cases, it is clear that the movement of the lower jaw is ultimately responsible for the variation described. Movement of the lower jaw is also one way of altering the distance between the tongue and the roof of the mouth.
It is convenient for our descriptive purposes to divide the roof of the mouth into three parts: moving backwards from the upper teeth, first, the teeth ridge (adjective: alveolar), which can be clearly felt behind the teeth; secondly, the bony arch which forms the hard palate (adjective: palatal), which varies in size and arching from one individual to another; and finally, the soft palate (adjective: velar), which, as we have seen, is capable of being raised or lowered, and at the extremity of which is the uvula (adjective: uvular). All these parts can be readily observed by means of a mirror.
(1) Of the movable parts, the lips (adjective: labial) constitute the final orifice of the mouth cavity whenever the nasal passage is shut off. The shape which they assume will, therefore, affect very considerably the shape of the total cavity. They may be shut or held apart in various ways. When they are held tightly shut, they form a complete obstruction or occlusion to the airstream, which may either be momentarily prevented from escaping at all, as in the initial sounds of pat and bat, or may be directed through the nose by the lowering of the soft palate, as in the initial sound of mat. If the lips are held apart, the positions they assume may be summarized under five headings:
(a) held sufficiently close together over all their length that friction occurs between them. Fricative sounds of this sort, with or without voice, occur in many languages and the voiced variety [β] is sometimes wrongly used by foreign speakers of English for the first sound in the words vet or wet;
(b) held sufficiently far apart for no friction to be heard, yet remaining fairly close together and energetically spread. This shape is taken up for vowels like that in see and is known as the spread lip position;
(c) held in a relaxed position with a lowering of the lower jaw. This is the position taken up for the vowel of get and is known as the neutral position;
(d) tightly pursed, so that the aperture is small and rounded, as in the vowel of do, or more markedly so in the French vowel of doux. This is the close rounded position;
(e) held wide apart, but with slight projection and rounding, as in the vowel of got. This is the open rounded position.
Variations of these five positions may be encountered, e.g. in the vowel of saw, for which a type of lip-rounding between open and close is commonly used. It will be seen from the examples given that lip position is particularly significant in the formation of vowel quality. English consonants, on the other hand, with the exception of [p,b,m,w], whose primary articulation involves lip action, will tend to share the lip position of the adjacent vowel. In addition, the lower lip is an active articulator in the pronunciation of [f,v], a light contact being made between the lower lip and the upper teeth.
(2) Of all the movable organs within the mouth, the tongue is by far the most flexible, and is capable of assuming a great variety of positions in the articulation of both vowels and consonants. The tongue is a complex muscular structure which does not show obvious sections; yet, since its position must often be described in considerable detail, certain arbitrary divisions are made. When the tongue is at rest, with its tip lying behind the lower teeth, that part which lies opposite the hard palate is called the front and that which faces the soft palate is called the back, with the region where the front and back meet known as the centre (adjective: central). These areas together with the root are sometimes collectively referred to as the body of the tongue. The tapering section facing the teeth ridge is called the blade (adjective: laminal) and its extremity the tip (adjective: apical). The edges of the tongue are known as the rims.
Generally, in the articulation of vowels, the tongue tip remains low behind the lower teeth. The body of the tongue may, however, be 'bunched up' in different ways, e.g. the front may be the highest part, as when we say the vowel of he; or the back may be most prominent, as in the case of the vowel in who; or the whole surface may be relatively low and flat, as in the case of the vowel in ah. Such changes of shape can be felt if the above words are said in succession. These changes, moreover, together with the variations in lip position, have the effect of modifying very considerably the size of the mouth cavity and of dividing this chamber into two parts: that cavity which is in the forward part of the mouth behind the lips and that which is in the rear, in the region of the pharynx.
The various parts of the tongue may also come into contact with the roof of the mouth. Thus, the tip, blade, and rims may articulate with the teeth, as for the th sounds in English, or with the upper alveolar ridge, as in the case of /t,d,s,z,n/, or the apical contact may be only partial, as in the case of /l/ (where the tip makes firm contact whilst the rims make none), or intermittent in a trilled /r/ as in some forms of Scottish English. In some languages, notably those of India, Pakistan, and Sri Lanka, the tip contact may be retracted to the very back of the teeth ridge or even slightly behind it; the same kind of retroflexion, without the tip contact, is typical of some kinds of English /r/, e.g. those used in south-west England and in the USA.
The front of the tongue may articulate against or near to the hard palate. Such a raising of the front of the tongue towards the palate (palatalization) is an essential part of the [ʃ,ʒ] sounds in English words such as she and measure, being additional to an articulation made between the blade and the alveolar ridge; or again, it is the main feature of the [j] sound initially in yield.
The back of the tongue can form a total obstruction by its contact with the soft palate, raised in the case of [k,g] and lowered for [ŋ], as in sing; or again, there may merely be a narrowing between the soft palate and the back of the tongue, so that friction of the type occurring finally in the Scottish pronunciation of loch is heard. Finally, the uvula may vibrate against the back of the tongue, or there may be a narrowing in this region which causes uvular friction, as at the beginning of the French word rouge.
It will be seen from these few examples that, whereas for vowels the tongue is generally held in a position which is convex in relation to the roof of the mouth, some consonant articulations, such as the southern British English /r/ in red and the /l/ in table, will involve the 'hollowing' of the body of the tongue so that it has, at least partially, a concave relationship with the roof of the mouth.
Moreover, the surface of the tongue, viewed from the front, may take on various forms: there may be a narrow groove running from back to front down the mid line as for the /s/ in see, or the grooving may be very much more diffuse as in the case of the /ʃ/ in ship; or again, the whole tongue may be laterally contracted, with or without a depression in the centre (sulcalization), as is the case with various kinds of r sounds.
(3) The oral speech mechanism is readily accessible to direct observation as far as the lip movements are concerned, as are many of the tongue movements which take place in the forward part of the mouth. A lateral view of the shape of the tongue over all its length and its relationship with the palate and the velum may be obtained by means of still and moving x-ray photography and by MRI. It is not, however, to be expected that pictures of the articulation of, say, the vowel in cat will show an identical tongue position for the pronunciation of a number of individuals. Not only is the sound itself likely to be different from one individual to another, but, even if the sound is for all practical purposes the 'same', the tongue positions may be different, since the boundaries of the mouth cavity are not identical for two speakers; and, in any case, two sounds judged to be the same may be produced by the same individual with different articulations. When, therefore, we describe an articulation in detail, it should be understood that such an articulation is typical for the sound in question, but that variations are to be expected.
Palatography, showing the extent of the area of contact between the tongue and the roof of the mouth, has long been a more practical and informative way of recording tongue movements. At one time the palate was coated with a powdery substance, the articulation was made, and the 'wipe-off' subsequently photographed. But the modern method uses electropalatography, whereby electrodes on a false palate respond to any tongue contact, the contact points being simultaneously registered on a visual display. This has the advantage of showing a series of representations of the changing contacts between the tongue and the palate during speech. Electropalatograms of this sort are used to illustrate the articulations of consonants in Chapter 9.
2.3 Articulatory Description
We have now reviewed briefly the complex modifications which are made to the original airstream by a mechanism which extends from the lungs to the mouth and nose. The description of any sound necessitates the provision of certain basic information:
(1) The nature of the airstream; usually, this will be expelled by direct action of the lungs, but we shall later consider cases where this is not so.
(2) The action of the vocal folds; in particular, whether they are closed, wide apart, or vibrating.
(3) The position of the soft palate, which will decide whether or not the sound has nasal resonances.
(4) The disposition of the various movable organs of the mouth, i.e. the shape of the lips and tongue, in order to determine the nature of the related oral and oropharyngeal cavities.
In addition, it may be necessary to provide other information concerning, for instance, a particular secondary narrowing, or tenseness which may accompany the primary articulation; or again, when it is a question of a sound with no steady state to describe, an indication of the kind of movement which is taking place. A systematic classification of possible speech sounds is given in Chapter 4.
The Sounds of Speech 19
The Sounds of Speech: The Acoustic and Auditory Aspects
3.1 Sound Quality
To complete an act of communication, it is not normally sufficient that our speech mechanism should simply function in such a way as to produce sounds; these in turn must be received by a hearing mechanism and interpreted, after having been transmitted through a medium, such as the air, which is capable of conveying sounds. We must now, therefore, examine briefly the nature of the sounds which we hear, the characteristics of the transmission phase of these sounds, and the way in which these sounds are perceived by a listener.
When we listen to a continuous utterance, we perceive an ever-changing pattern of sound. As we have seen, when it is a question of our own language, we are not conscious of all the complexities of pattern which reach our ears: we tend consciously to perceive and interpret only those sound features which are relevant to the intelligibility of our language. Nevertheless, despite this linguistic selection which we ultimately make, we are aware that this changing pattern consists of variations of different kinds: of sound quality (we hear a variety of vowels and consonants); of pitch (we appreciate the melody, or intonation, of the utterance); of loudness (we will agree that some sounds or syllables sound 'louder' than others); and of length (some sounds will be appreciably longer to our ears than others). These are judgements made by a listener in respect of a sound continuum emitted by a speaker and, if the sound stimulus from the speaker and response from the listener are made in terms of the same linguistic system, then the utterance will be meaningful for speaker and listener alike. It is reasonable to assume, therefore, that there is some constant relationship between the speaker's articulation and the listener's reception of sound variations. In other words, it should be possible to link through the transmission phase the listener's impressions of changes of quality, pitch, loudness, and length to some articulatory activity on the part of the speaker. It will in fact be seen that an exact parallelism or correlation between the production, transmission, and reception phases of speech is not always easy to establish, the investigation of such relationships being one of the tasks of present-day phonetic studies.
The formation of any sound requires that a vibrating medium should be set in motion by some kind of energy. We have seen that in the case of the human speech mechanism the function of vibrator is often fulfilled by the vocal folds, and that these are activated by air pressure from the lungs. In addition, any such sound produced in the larynx is modified by the resonating chambers of the pharynx, mouth, and, in certain cases, the nasal cavities. The listener's impression of sound quality will be determined by the way in which the speaker's vibrator and resonators function together.
Speech sounds, like other sounds, are conveyed to our ears by means of waves of compression and rarefaction of the air particles (the commonest medium of communication). These variations in pressure, initiated by the action of the vibrator, are propagated in all directions from the source, the air particles themselves vibrating at the same rate (or frequency) as the original vibrator. In speech, these vibrations may be of a complex but regular pattern, producing 'tone' such as may be heard in a vowel sound; or they may be of an irregular kind, producing 'noise', such as we have in the consonant /s/; or there may be both regular and irregular vibrations present, i.e. a combination of tone and noise, as in /z/. In the production of normal vowels, the vibrator is normally provided by the vocal folds; in the case of many consonant articulations, however, a source of air disturbance is provided by constriction at a point above the larynx, with or without accompanying vocal fold vibrations.
Despite the fact that the basis of all normal vowels is the glottal tone, we are all capable of distinguishing a large number of vowel qualities. Yet the glottal vibrations in the case of [ɑ:] are not very different from those for [i:], when both vowels are said with the same pitch. The modifications in quality which we perceive are due to the action of the supraglottal resonators which we have previously described. To understand this action, it is necessary to consider a little more closely the nature of the glottal vibrations.
It has already been mentioned that the glottal tone is the result of a complex, but mainly regular, vibratory motion. In fact, the vocal folds vibrate in such a way as to produce, in addition to a basic vibration over their whole length (the fundamental frequency), a number of overtones or harmonics having frequencies which are simple multiples of the fundamental or first harmonic. Thus, if there is a fundamental frequency of vibration of 100 Hz, the upper harmonics will be of the order of 200, 300, 400, etc. Hz. Indeed, there may be no energy at the fundamental frequency, but merely the harmonics of higher frequency such as 200, 300, 400 Hz. Nevertheless, we still perceive a pitch which is appropriate to a fundamental frequency of 100 Hz; i.e. the fundamental frequency is the highest common factor of the harmonic frequencies present.
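The arithmetic behind this relationship can be illustrated by a minimal sketch in Python. The function names harmonics and implied_fundamental are illustrative only, and the frequencies are assumed to be exact whole numbers of Hz; real speech is only approximately periodic, so this shows the principle and is not a method of measuring pitch.

```python
from functools import reduce
from math import gcd


def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics: simple multiples of the fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def implied_fundamental(frequencies_hz):
    """Return the highest common factor of a set of whole-number frequencies.

    Assumption: every frequency is an exact integer multiple of the
    fundamental, which is an idealisation of a real glottal tone.
    """
    return reduce(gcd, frequencies_hz)


# A 100 Hz fundamental with its first four harmonics.
print(harmonics(100, 4))

# The fundamental itself is absent, yet the common factor is still 100 Hz.
print(implied_fundamental([200, 300, 400]))
```

With the fundamental removed from the list, the highest common factor of 200, 300, and 400 Hz remains 100 Hz, which is the pitch the text says a listener still perceives.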
are consonants, while in beat, bit, bet, but, bought, the sounds represented by [...] are vowels. This reference to the functioning of sounds in syllables in a particular language is a phonological definition. But once any attempt is made to define what sorts of sounds generally occur in these different syllable-positions, then we are moving to a phonetic definition. This type of definition might define vowels as median (air must escape over the middle of the tongue, thus excluding the lateral [l]), oral (air must escape through the mouth, thus excluding nasals like [n]), frictionless (thus excluding fricatives like [s]), and continuant (thus excluding plosives like [p]); all sounds excluded from this definition would be consonants. But difficulties arise in English with this definition (and with others of this sort) because English /j,w,r/, which are consonants phonologically (functioning at the edges of syllables) are vowels phonetically. Because of this, these sounds are often called semi-vowels. The reverse type of difficulty is encountered in words like sudden and little, where the final consonants /n/ and /l/ form syllables on their own and hence must be the centre of such syllables even though they are phonetically consonants, and even though /n/ and /l/ more frequently occur at the edges of syllables, as in net and let. When occurring in words like sudden and little, nasals and laterals are called syllabic consonants.
In this chapter we will be describing and classifying speech sounds phonetically (in the next chapter we return to the phonological definitions). We shall find that consonants can be voiced or voiceless, and are most easily described wholly in articulatory terms, since we can generally feel the contacts and movements involved. Vowels, on the other hand, are voiced, and, depending as they do on subtle adjustments of the body of the tongue, are more easily described in terms of auditory relationships.
4.3 Consonants
We have seen, in the preceding chapters, that the production of a speech sound may involve the action of a source of energy, a vibrator, and the movement of certain supraglottal organs. In the case of consonantal articulations, a description must provide answers to the following questions:
(1) Is the airstream set in motion by the lungs or by some other means? (pulmonic or non-pulmonic)
(2) Is the airstream forced outwards or sucked inwards? (egressive or ingressive)
(3) Do the vocal folds vibrate or not? (voiced or voiceless)
(4) Is the soft palate raised, directing the airstream wholly through the mouth, or lowered, allowing the passage of air through the nose? (oral, or nasal or nasalized)
(5) At what point or points and between what organs does the closure or narrowing take place? (place of articulation)
(6) What is the type of closure or narrowing at the point of articulation? (manner of articulation)
In the case of the sound [z], occurring medially in the word easy, the following answers would be given:
(1) pulmonic
(2) egressive
Description and Classification of Speech Sounds 29
(3) voiced
(4) oral
(5) tongue tip-alveolar ridge
(6) fricative
These answers provide a concise phonetic label for the sound; a more detailed description would include additional information concerning, for instance, the shape of the remainder of the tongue, the relative position of the jaws, and the lip position.
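The six questions above amount to a fixed record with one field per answer. A minimal sketch in Python follows, using the answers given for [z] in easy; the class name ConsonantDescription, its field names, and the label method are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsonantDescription:
    """One field per question (1)-(6); all names here are illustrative."""
    airstream: str   # (1) "pulmonic" or "non-pulmonic"
    direction: str   # (2) "egressive" or "ingressive"
    voicing: str     # (3) "voiced" or "voiceless"
    velum: str       # (4) "oral", "nasal", or "nasalized"
    place: str       # (5) place of articulation
    manner: str      # (6) manner of articulation

    def answers(self):
        """Return the six answers in question order."""
        return (self.airstream, self.direction, self.voicing,
                self.velum, self.place, self.manner)

    def label(self):
        """Return a short voicing-place-manner label."""
        return f"{self.voicing} {self.place} {self.manner}"


# The medial [z] of "easy", as answered in the text.
z_in_easy = ConsonantDescription(
    airstream="pulmonic",
    direction="egressive",
    voicing="voiced",
    velum="oral",
    place="tongue tip-alveolar ridge",
    manner="fricative",
)
print(z_in_easy.label())
```

Keeping the record frozen reflects the fact that the six answers together identify one sound; a fuller description (tongue shape, jaw position, lip position) would need further fields.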
4.3.1 Egressive Pulmonic Consonants
Most speech sounds are made with egressive lung air. Virtually all English sounds are so made, the exception being [p,t,k], which in some dialects become ejectives (see §4.3.9 below).
4.3.2 Voicing
At any place of articulation, a consonantal articulation may be voiceless or voiced.
4.3.3 Place of Articulation
The chief points of articulation are the following:
Bilabial. The two lips are the primary articulators, e.g. [p,b,m].
Labiodental. The lower lip articulates with the upper teeth, e.g. [f,v].
Dental. The tongue tip and rims articulate with the upper teeth, e.g. [θ,ð], as in think and then.
Alveolar. The tip or blade of the tongue articulates with the alveolar ridge, e.g. [t,d,l,n,s,z].
Post-alveolar. The tip (and rims) of the tongue articulate with the rear part of the alveolar ridge, e.g. [ɹ] as at the beginning of English red.
Retroflex. The tip of the tongue is curled back to articulate with the part of the hard palate immediately behind the alveolar ridge, e.g. [ɻ] such as is found in south-west British and American English pronunciation of red.
Palato-alveolar. The blade, or the tip and blade, of the tongue articulates with the alveolar ridge and there is at the same time a raising of the front of the tongue towards the hard palate, e.g. [ʃ,ʒ,tʃ,dʒ] as in English ship, measure, beach, edge.1
Palatal. The front of the tongue articulates with the hard palate, e.g. [j] or [ç] as in queue [kju:] or [kçu:], or a very advanced type of [k,g] = [c,ɟ], as in French quitter or guide.
Velar. The back of the tongue articulates with the soft palate, e.g. [k,g,ŋ], the last as in sing.
Uvular. The back of the tongue articulates with the uvula, e.g. [ʁ] as in French rouge.
1 Note that these are called post-alveolar on the chart of the International Phonetic Alphabet (Table 1).
Glottal. An obstruction, or a narrowing causing friction but not vibration, between the vocal folds, e.g. [h].
In the case of some consonantal sounds, there may be a secondary place of articulation in addition to the primary. Thus, in the so-called 'dark' [ɫ], as at the end of pull, in addition to the partial alveolar contact, there is an essential raising of the back of the tongue towards the velum (velarization); or, again, some post-alveolar articulations of [ɹ] are accompanied by slight lip-rounding (labialization). The place of primary articulation is that of the greatest stricture, that which gives rise to the greatest obstruction to the airflow. The secondary articulation exhibits a stricture of lesser rank. Where there are two coextensive strictures of equal rank, an example of double articulation results.
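The ranking of strictures described here can be sketched in Python. The three-step scale in STRICTURE_RANK and the function name classify_strictures are assumptions made for illustration; the text speaks only of greater and lesser rank and does not fix a numerical scale.

```python
# Assumed scale: a larger number means a greater obstruction to the airflow.
STRICTURE_RANK = {"closure": 3, "friction": 2, "open": 1}


def classify_strictures(strictures):
    """Split (place, degree) pairs into primary and secondary articulations.

    The primary articulation is the stricture of greatest rank; two
    coextensive strictures of equal top rank count as double articulation.
    """
    top = max(STRICTURE_RANK[degree] for _, degree in strictures)
    primary = [place for place, degree in strictures
               if STRICTURE_RANK[degree] == top]
    secondary = [place for place, degree in strictures
                 if STRICTURE_RANK[degree] < top]
    return {"primary": primary, "secondary": secondary,
            "double": len(primary) > 1}


# 'Dark' l: alveolar contact plus a raising of the back towards the velum.
dark_l = classify_strictures([("velar", "open"), ("alveolar", "closure")])
print(dark_l)

# Two closures of equal rank give a double articulation.
print(classify_strictures([("bilabial", "closure"), ("velar", "closure")]))
```

For the dark l the alveolar contact outranks the velar raising, so the velar component is reported as secondary (velarization).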
4.3.4 Manner of Articulation
The obstruction made by the organs may be total, intermittent, or partial, or may merely constitute a narrowing sufficient to cause friction. The chief types of articulation, in decreasing degrees of closure, are as follows:
(1) Complete Closure
Plosive. A complete closure at some point in the vocal tract, behind which the air pressure builds up and can be released explosively, e.g. [p,b,t,d,k,g,ʔ].
Affricate. A complete closure at some point in the mouth, behind which the air pressure builds up; the separation of the organs is however slow compared with that of a plosive, so that friction is a characteristic second element of the sound, e.g. [tʃ,dʒ].
Nasal. A complete closure at some point in the mouth but, the soft palate being lowered, the air escapes through the nose. These sounds are continuants and, in the voiced form, have no noise component; they are, to this extent, vowel-like, e.g. [m,n,ŋ].
(2) Intermittent Closure
Trill (or roll). A series of rapid intermittent closures made by a flexible organ on a firmer surface, e.g. [r], where the tongue tip trills against the alveolar ridge as in Spanish perro, or [ʀ] where the uvula trills against the back of tongue, as in a stage pronunciation of French rouge.
Tap. A single tap made by a flexible organ on a firmer surface, e.g. [ɾ] where the tongue tip taps once against the teeth ridge, as in many Scottish pronunciations of English /r/.
(3) Partial Closure
Lateral. A partial (but firm) closure is made at some point in the mouth, the airstream being allowed to escape on one or both sides of the contact. These sounds may be continuant and frictionless and therefore vowel-like (i.e. approximants like the sounds in (5) below), as in [l,ɫ], as pronounced in southern British little [lɪtɫ], or they may be accompanied by a little friction [l̥] as in fling or by considerable friction [ɬ] as in please.
(4) Narrowing
Fricative. Two organs approximate to such an extent that the airstream passes between them with friction, e.g. [f,v,θ,ð,s,z,ʃ,ʒ,ç,x,h]. In the bilabial region, a distinction is to be made between those purely bilabial such as [ɸ,β], where the friction occurs between spread lips, and a labial-velar sound like [ʍ], where the friction occurs between rounded lips and is accompanied by a characteristic modification of the mouth cavity brought about by the raising of the back of the tongue towards the velum. [ç] occurs at the beginning of huge, [x] and [ʍ] in Scottish pronunciations of loch and which, and [β] in Spanish haber.
(5) Narrowing without Friction
Approximant (or Frictionless Continuant). A narrowing is made in the mouth but the narrowing is not quite sufficient to cause friction. In being frictionless and continuant, approximants are vowel-like; however, they function phonologically as consonants, i.e. they appear at the edges of syllables. They also differ phonetically from such sounds functioning as vowels in either of two ways. Firstly, the articulation may not involve the body of the tongue, e.g. post-alveolar [ɹ] and labiodental [ʋ], the former the usual pronunciation in RP at the beginning of red, the latter a speech-defective pronunciation of the same sound. Secondly, where they do involve the body of the tongue, the articulations represent only brief glides to a following vowel: thus [j] in yet is a glide starting from the [i] region and [w] in wet is a glide starting from the [u] region.
4.3.5 Obstruents and Sonorants
It is sometimes found useful to classify categories of sounds according to their noise component. Those in whose production the constriction impeding the airflow through the vocal tract is sufficient to cause noise are known as obstruents. This category comprises plosives, fricatives, and affricates. Sonorants are those voiced sounds in which there is no noise component (i.e. voiced nasals, approximants, and vowels).
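This two-way division can be expressed as a small lookup. The sketch below, in Python, uses only the categories named in the paragraph; the function name sound_class and the decision to return None for anything the paragraph does not classify (for example a voiceless nasal) are assumptions made for illustration.

```python
# Manners listed in the text as producing a noise component.
OBSTRUENT_MANNERS = {"plosive", "fricative", "affricate"}

# Manners listed in the text as sonorant when voiced.
SONORANT_MANNERS = {"nasal", "approximant", "vowel"}


def sound_class(manner, voiced):
    """Return "obstruent", "sonorant", or None if the text gives no class."""
    if manner in OBSTRUENT_MANNERS:
        return "obstruent"
    if manner in SONORANT_MANNERS and voiced:
        return "sonorant"
    return None  # not covered by the definition quoted above


print(sound_class("fricative", voiced=False))  # e.g. [s]
print(sound_class("nasal", voiced=True))       # e.g. [m]
```

Voicing matters only on the sonorant side: an obstruent remains an obstruent whether voiced or voiceless, whereas the definition of a sonorant requires voice.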
4.3.6 Fortis and Lenis
A voiceless/voiced pair such as English /s,z/ are distinguished not only by the presence or absence of voice but also by the degree of breath and muscular effort involved in the articulation. Those English consonants which are usually voiced tend to be articulated with relatively weak energy, whereas those which are always voiceless are relatively strong. Indeed, we shall see that in certain situations the so-called voiced consonants may have very little voicing, so that the energy of articulation becomes a significant factor.
4.3.7 Classification of Consonants
The chart of the International Phonetic Alphabet (IPA) (see Table 1) shows manner of articulation on the vertical axis; place of articulation on the horizontal axis; and
[Table 1: The International Phonetic Alphabet]
[...] 6, [ɔ]; 7, [o]; 8, [u].
It is to be noticed that the front series [i,e,ɛ,a] and [ɑ] of the back series are pronounced with spread or open lips, whereas the remaining three members of the back series have varying degrees of lip-rounding. The combinations of tongue and lip positions in the primary Cardinal Vowels are the most frequent in languages; i.e. front and open vowels are most commonly unrounded while back vowels other than in the open position are most commonly rounded. A secondary series can be obtained by reversing the lip positions, e.g. lip-rounding applied to the [i] tongue position, or lip-spreading applied to the [u] position. Such a secondary series is denoted by the following numbers and symbols: 9, [y]; 10, [ø]; 11, [œ]; 12, [ɶ]; 13, [ɒ]; 14, [ʌ]; 15, [ɤ]; 16, [ɯ].
This complete series of sixteen Cardinal Vowel values may be divided into two lip shape categories, with corresponding tongue positions:
unrounded: [i,e,ɛ,a,ɑ,ʌ,ɤ,ɯ]; rounded: [y,ø,œ,ɶ,ɒ,ɔ,o,u].
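The pairing of the two series can be checked with a short sketch in Python. The symbols are the standard IPA ones for Cardinal Vowels 1 to 16; the names PRIMARY, SECONDARY, and by_lip_shape are illustrative, and the only rule encoded is the one stated above: primaries 6 to 8 are rounded, the other primaries are unrounded, and each secondary vowel reverses the lip position of the primary at the same tongue position.

```python
PRIMARY = ["i", "e", "ɛ", "a", "ɑ", "ɔ", "o", "u"]    # Cardinal Vowels 1-8
SECONDARY = ["y", "ø", "œ", "ɶ", "ɒ", "ʌ", "ɤ", "ɯ"]  # Cardinal Vowels 9-16


def by_lip_shape():
    """Return (unrounded, rounded) symbol lists ordered by tongue position."""
    unrounded, rounded = [], []
    for position in range(8):
        first, second = PRIMARY[position], SECONDARY[position]
        if position >= 5:
            # Cardinals 6, 7, 8 are the rounded primaries, so their
            # secondary counterparts (14, 15, 16) are unrounded.
            rounded.append(first)
            unrounded.append(second)
        else:
            # Cardinals 1-5 are unrounded; 9-13 are their rounded pairs.
            unrounded.append(first)
            rounded.append(second)
    return unrounded, rounded


unrounded, rounded = by_lip_shape()
print("unrounded:", ",".join(unrounded))
print("rounded:", ",".join(rounded))
```

Walking through the eight tongue positions in order reproduces the two lists as given in the text, each arranged from Cardinal 1's tongue position to Cardinal 8's.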
Such a scale is useful because (a) the vowel qualities are unrelated to particular values in languages, though many may occur in various languages, and (b) the set is recorded, so that reference may always be made to a standard, invariable scale.2 Thus a vowel quality can be described as being, for instance, similar to that of Cardinal 2 ([e]), or another as being a type half-way between Cardinal 6 ([ɔ]) and Cardinal 7 ([o]), but somewhat centralized. Diacritics are available in the IPA alphabet to show modifications of Cardinal values, e.g. a subscript [˕] to mean more open, a subscript [˔] meaning closer, and raised dots [¨] to mean centralized. The last example given above might in this way be symbolized as [ɔ̝̈] or [ö̞].
It is, moreover, possible to give a visual representation of these vowel relationships on a chart which is based on the Cardinal Vowel tongue positions. The simplified diagram shown in Fig. 5 is obtained by plotting the highest point of tongue-raising for each of the primary Cardinal Vowels and joining the points together. The internal triangle, corresponding to the region of central or [ə]-type vowel sounds, is made by dividing the top line into three approximately equal sections and drawing lines parallel to the two sides, so that they meet near the base of the figure. On such a figure, the sound symbolized by [ɔ̝̈] or [ö̞] may have its relationship to the Cardinal scale shown visually (see the black circle on Fig. 5).
2 Copies of the original recording of the Cardinal vowels by Daniel Jones are available from the Phonetics Laboratory, Department of Linguistics, University of Manchester, Manchester M13 9PL.
It must be understood that this diagram is a highly conventionalized one which shows, above all, quality relationships. Some attempt is, however, made to relate the shape of the figure to actual tongue positions: thus the range of movement is greater at the top of the figure, and the tongue-raising of front vowels becomes more retracted as the tongue position lowers. Nevertheless, it has been shown that it is possible to articulate vowel qualities without the tongue and lip positions which this diagram seems to postulate as necessary. It is, for instance, possible to produce a sound of the Cardinal 7 ([o]) type without the lip-tongue relationship suggested. But, on the whole, it may be assumed that a certain auditorily identified vowel quality will be produced by an articulation of the kind presupposed by the Cardinal Vowel diagram. Moreover, it is a remarkable fact that the auditory judgements as to vowel relationships made by Daniel Jones have been largely supported by recent acoustic analysis; in fact, a chart based on an acoustic analysis of Cardinal Vowel qualities corresponds very well with the traditional Cardinal Vowel figure.
4.4.3 Nasality
Besides the information concerning lip and tongue positions which the chart and symbolization denote, a vowel description must also indicate whether the vowel is purely oral or whether it is nasalized. The sixteen Cardinal Vowels mentioned may all be transformed into their nasalized counterparts if the soft palate is lowered. It is unusual, however, to find such an extensive series of nasalized vowels, since it is unusual (though not unknown) for languages to make such fine, significant,
[Figure labels: C.1 [i], C.8 [u]]
or [mi:s], where the change [u:] > [y:] can be explained by the fronting of [u:] under the influence of the [i:] of the following syllable. Such a combinative change belongs to OE, but a more recent change of this type is exemplified by words such as swan. This word was probably pronounced [swan] or [swæn] in about 1600, but the [w] sound has rounded and retracted the vowel to give the modern form [swɒn]. The large majority of earlier [w] + [a] sequences have now given [w] + [ɒ], or [ɔ:], by reason of this combinative change affecting this particular sound sequence, e.g. want, quality, war, water.
(3) Some changes are neither independent nor dependent upon the phonetic context; they may be said to be external to the main line of evolution. Thus it was fashionable in Elizabethan times to pronounce such words as servant and heard with [ær] or [ar], perhaps originally a dialect form, rather than with [er], the regular form of development; these words, with some exceptions such as clerk, have reverted to the normal development of ME [er] > [ɜ:] rather than [ɑ:]. It was also fashionable to pronounce the termination -ing as [in], only now retained as a special form of affectation or in some dialects. Such changes, involving a change of distribution of phonemes among words and morphemes, do not affect the phonemic system of the language. The introduction of foreign words may, however, at least temporarily and in the speech of a restricted number of individuals, disturb the number of phonemes or their distribution as regards position in the word. Thus, if the French word beige is used in English with the pronunciation /beɪʒ/, we have a case of a final /ʒ/ previously unknown in English words; or again, if restaurant is pronounced with any kind of nasalized vowel in the last syllable, the possibility of a new kind of vocalic opposition is introduced into the language. However, such foreign borrowings generally tend to conform to the English system: words with a final French /ʒ/, such as prestige or camouflage, may be realized in the English form with /dʒ/, and a word with a nasalized vowel like restaurant will be normalized to /ˈrestərɒn/, /ˈrestərɒnt/, or /ˈrestrənt/.
(4) In addition to changes of quality, there have also to be taken into account changes involving length and accentual pattern. Thus the vowel in such words as path, half, pass, still short three hundred years ago, is now long in the south of England. Or again, the vowels in good, book and breath, death, once long, are now relatively short. Changes of accent are particularly striking in the case of words which have come into the language from French: in ME, such words as village or necessary retained their accent on the penultimate syllable: /vɪˈla:dʒə/ and /neseˈsa:rɪə/. Now, the accent has shifted to an earlier syllable, together with associated changes of quality: /ˈvɪlɪdʒ/, /ˈnesəsrɪ/ (the latter may retain the ME pattern in American English). Later borrowings, or those in less common use, often retain the French accentual pattern; thus hotel or machine have the accent on the final syllable, whereas, if they had conformed to the English system, we might have had such modern forms as /ˈhəʊtl/ and /ˈmætʃɪn/ or /ˈmeɪtʃɪn/, in the same way that the thoroughly anglicized form of garage gives /ˈgærɪdʒ/. (See §7.5 on current changes.)
6.2.2 Rate and Route of Vowel Change
The English vowels have been subject to more striking changes than have the consonants. This is not surprising, for a consonantal articulation usually involves
66 The Sounds of English
an approximation of organs which can be felt; such an articulation tends to be more stable, in that it is more easily identified and transmitted more exactly from one generation to another. Changes in the consonantal system comparatively rarely involve a modification of sound (an example of such a modification would be the affrication, for combinative reasons, of the OE palatal plosives [c,ɟ] to [tʃ,dʒ], as in church < OE cirice and bridge < OE brycg). Far more common is the type of distributional change involving the conferment of phonemic status on an existing sound (e.g. [v,ð,z], allophones of /f,θ,s/ in OE, later obtain contrastive, phonemic, significance), the disappearance of an allophone (e.g. postvocalic [x] and [ç] in such words as brought and right were largely lost in the south of England by the seventeenth century), or the insertion of an existing phoneme in a particular class of words (e.g. the initial /h/ in words of French origin such as herb, homage). Whether it is a question of consonantal change, loss, or addition, it is usually possible to explain the type of modification which has taken place and the approximate period during which it occurred.
A modification of vowel quality will, however, result from very slight changes of tongue or lip position, and there may be a series of imperceptible gradations before an appreciable quality change is evident (or is capable of being expressed by means of the Latin vowel letters). It is particularly difficult to assess rate and phonetic route of change in the case of those internal independent vowel changes which affect a phoneme throughout the language. It is known, for instance, that the modern homophones meet and meat had in ME different vowel forms, approximately of the value [e:] and [ɛ:]. The [e:] vowel of meet became [i:] by about 1500, and it might be postulated that by a process of gradual change the [ɛ:] of meat first closed to [e:] and then, by the eighteenth century, coalesced with the [i:] in meet. The available evidence, however, suggests that the change [ɛ:] > [i:] may not have been either simple or gradual, but that two pronunciations existed side by side for a long period (the conservative [ɛ:] beside another form [i:] which had resulted from an early coalescence with the meet vowel). In other vowel changes, it may be agreed that the change was gradual, but it is difficult to date precisely the stages of development. Thus the modern /ai/ of time results from a ME [i:] value; it is clear that the change has been one of progressive, widening diphthongization, but there may have been a period of incipient diphthongization when there was hesitation between the pure vowel [i:] and some such diphthong as [ɪi] or [əi]. It is well to remember, therefore, that at any particular time in history there are likely to be a number of different, coexistent realizations of vowel phonemes, not only between regions but also between generations and social groups.
An example of such variety in modern English is provided by the vowel at the end of city, which in the south of England may be rendered as [ɪ] by the older generation and as something more like [i] by younger people. The speech of any community may, therefore, be said to reflect the pronunciation of the previous century and to anticipate that of the next.
6.2.3 Sound Change and the Linguistic System
It is convenient to study sound change in terms of the development of particular phonemes or sounds, but it is misleading to ignore the relationship of the sound units to the system within which they function and which may, in fact, not be
The Historical Background 67
changing. In other words, although there may be considerable qualitative changes, the number and pattern of the terms within the system may show relative stability. The ME /i:/ phoneme, for instance, is now realized as [ai], but there is still a phonemic opposition which contrasts such words as time, team, tame, term, tomb, and, in any case, a new phoneme /i:/ has emerged in words of the team type. On the other hand, the system may change because a sound, without itself changing, may receive a new, phonemic, value; e.g. the sound [ŋ] has always existed in English as a realization of /n/ followed by the velars /k/ or /g/, but when the final /g/ in a word like sing was no longer pronounced, /ŋ/ contrasted significantly with /n/ and /m/, e.g. ram, ran, and rang.
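The acquisition of phonemic status by the velar nasal can be illustrated with a small minimal-pair search. The following is only a sketch: the broad transcriptions, the names EARLIER and MODERN, and the function minimal_pairs are assumptions made for illustration, not forms taken from the text. Two words count as a minimal pair when they have the same number of segments and differ in exactly one.

```python
from itertools import combinations

# Assumed broad transcriptions (illustrative only).
# EARLIER: final /g/ still pronounced, so the velar nasal occurs only before a velar.
EARLIER = {
    "ram": ("r", "æ", "m"),
    "ran": ("r", "æ", "n"),
    "rang": ("r", "æ", "ŋ", "g"),
}
# MODERN: final /g/ lost, so the velar nasal stands in word-final position.
MODERN = {
    "ram": ("r", "æ", "m"),
    "ran": ("r", "æ", "n"),
    "rang": ("r", "æ", "ŋ"),
}

def minimal_pairs(lexicon):
    """Yield (word_a, word_b, seg_a, seg_b) for words differing in exactly one segment."""
    for (word_a, segs_a), (word_b, segs_b) in combinations(lexicon.items(), 2):
        if len(segs_a) != len(segs_b):
            continue  # different lengths can never form a minimal pair
        diffs = [(x, y) for x, y in zip(segs_a, segs_b) if x != y]
        if len(diffs) == 1:
            yield word_a, word_b, diffs[0][0], diffs[0][1]

for name, lexicon in (("earlier", EARLIER), ("modern", MODERN)):
    print(name, list(minimal_pairs(lexicon)))
```

In the earlier lexicon the velar nasal takes part in no minimal pair, since it is always followed by /g/; in the modern lexicon it contrasts with both /m/ and /n/, although the sound itself is unchanged.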
Since the system of our language consists of a framework of significant oppositions by means of which we communicate, it may be assumed that there is a tendency for the system to remain stable, the loss of an opposition involving a possibility of confusion. In fact, of course, the redundancy of English is such that some degree of neutralization of phonemes is easily tolerated: today, few speakers in the south of England distinguish saw and sore by means of an opposition /ɔ:/-/ɔə/, yet the loss of the /ɔə/ diphthong is no impediment to communication. An example of an earlier coalescence of vowel phonemes is that illustrated by the homophony of meet and meat. On the other hand, new oppositions may emerge in the language, e.g. the phonemes /v,ð,z,ŋ/, as we have seen. Nevertheless, despite the adjustments in the number of phonemes which have taken place, the history of the English sound system displays, over the last 1,000 years, a considerable degree of stability.
Though the relationships within the system may tend to remain stable, a change of phonetic realization of any phoneme is likely to have qualitative repercussions throughout the system. Such a disturbance may be observed in modern English. The phonetic relationship of the vowel phonemes in set and sat, in one type of pronunciation, is of a front vowel between close-mid and open-mid to a front vowel between open-mid and open. If, however, the vowel of sat has a closer articulation than that described, that of set must be raised too. A limit of raising is imposed by the presence of sit and seat, for it is not possible to raise the vowel of sit to any extent without danger of confusion with that of seat, unless the latter vowel becomes strongly diphthongal. (It may be objected that a quantitative as well as qualitative difference distinguishes /i:/ from /ɪ/; but in the examples given, seat and sit, the phonetic context imposes a quantity on /i:/ which is practically the same as that of /ɪ/. If /ɪ/ were too close to the region of /i:/, the opposition would be maintained only by realizing /i:/ as fully long at the expense of the shortening influence of the final /t/ (or by a process such as diphthongization).) Alternatively, if the vowel phoneme of sat is realized as a front open vowel, as in many English regional dialects, the vocalic area in which the phoneme of set can be realized becomes more extensive; in fact, in those kinds of English where this occurs, the vowel in set tends to be of an open-mid variety. Such considerations of the phonetic relationship of phonemes have a relevance in the historical, diachronic study of English. In ME there were, for instance, four long vowels in the front region: /i:,e:,ɛ:,a:/. By 1600 /i:/ had diphthongized and the remaining vowels closed up. Such a movement may have been caused by pressure upwards from /a:/ or by the creation of an empty space brought about by the diphthongization of the pure vowel /i:/.
Although, therefore, it is often convenient in diachronic studies to investigate the development of individual phonemes in terms of the quality of their realization, it is clear that many sound changes can be explained only by reference to a readjustment of the phonetic relationships of the phonemes of the system as a whole. Moreover, any particular point in the development of the sound system of a language is not simply to be considered as a stage in the process of change of a number of sound units but rather as the presentation of the functioning of a system at a certain historical moment. The primary significance of the sounds of modern English is their function in the system of today; in the same way, the English sounds of 1600 are to be viewed in terms not only of their past and future forms but also of their contemporary, synchronic relationships and functions.
Some sound changes are, indeed, the result of an influence which applies to the system as a whole. Those drastic changes of vowel quality known as the Great Vowel Shift mainly affect vowels in accented syllables. But vowels in most unaccented syllables (especially those in word-final positions) have undergone, in the last thousand years, an equally striking, though different, type of change. Henry Sweet has called OE the period of full endings, stanas being realized as [ˈsta:nas]; ME, the period of levelled endings, when stones was pronounced [ˈstɔ:nəs]; and eModE and later English, the period of lost endings, when stones is [sto:nz], [stəʊnz]. There is, therefore, a general tendency for all unaccented vowels to shorten (if long) and to gravitate towards the weak centralized vowels [ɪ] or [ə], or sometimes [ʊ], if not to disappear altogether. This fact accounts for the high frequency of occurrence of [ɪ] and [ə] in PresE and for the complete elision of many vowels in unaccented syllables in rapid colloquial speech, e.g. suppose [spəʊz], probably [prɒbblɪ].
6.2.4 Sources of Evidence for Reconstruction
Whether our aim is to reconstruct the phonological system of English at any particular moment in history or to estimate the nature of the development affecting particular phonemes, it is necessary to establish the sound values which were used in the pronunciation of the language—relative values in the case of the system, absolute values as far as possible in the case of sound development. An investigation of the phonological structure of PresE would have to include direct observation of its phonetic features. For this purpose, future generations will have the benefit of recordings of the speech of today. Obviously, this type of evidence cannot be used for the reconstruction of past states of the spoken language. The further back we go into history the scantier the evidence of spoken forms becomes. Our conclusions will, therefore, be based on information mostly of an indirect kind; yet such is the agreement generally amongst the various types of evidence that the broad lines of sound change can be conjectured with reasonable certainty.
(1) Theoretical paths of development. If, in dealing with the changing realization of a particular phoneme, we can be reasonably sure of its sound value at two points in history, we can, from our knowledge of phonetic possibilities and probabilities, infer theoretically the intervening stages of development. We can, of course, be sure of the pronunciation of PresE. If, then, the evidence suggested unequivocally that, for instance, the vowel in home was pronounced as [a:] in OE, the development to
be described and accounted for would be [a:] > [əʊ]. It is likely that the articulation has always involved the back, rather than the front, of the tongue; the change has clearly meant a closing of the tongue position, to which at some stage there has been added a gliding (diphthongal) movement. We might, therefore, postulate such developments as [a: > au > ou > əʊ] or [ɔ: > o: > ou > əʊ]. The available evidence will then confirm or refute the hypothesis, in this case the second solution being more in keeping with the information. Such recognition of phonetic probabilities will always be implicit in the tracing of change. It must be considered unlikely that [a:] on its way to [ou] or [əʊ] would have passed through a stage of front articulation, without any combinative influence. Nevertheless, the possibility of a type of change which is not the most probable theoretically must never be excluded. The rounded close-mid back ME [ʊ] developed by the nineteenth century to an unrounded open-mid centralized back [ʌ]; and in the London area this vowel has now become more open and more front [a]. Yet, at the same time, there is a tendency to make the vowel in sad more open. There is here a potential conflict, and the future development of these vowels is uncertain. It would, therefore, be dangerous to predict, merely according to phonetic probabilities, the way our present sound system will develop.
(2) Old English. It is most important in an investigation of the development of English sounds over the last thousand years that the pronunciation of OE should be established with some certainty. If this can be done, we shall have a 'starting-point' for the phonetic route of change to PresE. The term Old English, however, spans a period of some four hundred years from about AD 700 to AD 1100. Moreover, the invasion of the Angles, Saxons, and Jutes in the fifth and sixth centuries introduced four separate varieties of English: the Angles, in the Midlands, north-east England, and the south of Scotland, using types of English known as Mercian and Northumbrian (or, in general terms, Anglian); the Saxons, in the south and south-west, using the West Saxon dialect; and the Jutes, settling mainly in the region of Kent and using a dialect called Kentish. Of the four dialects, West Saxon, which was to become a kind of standard language, is the one about which most is known from the extant texts. In its later form, that in use between about AD 900 and AD 1100, it is referred to as Classical OE.
The broad lines of the pronunciation of this language can be conjectured from a comparison of the development of the other members of the West Germanic group of languages to which it is related. But by far the most explicit evidence concerning its sounds is to be inferred from the alphabet in which it is written. The earlier runic spelling was replaced by a form of the Latin alphabet. This alphabet was probably introduced into the country in the seventh century by Irish missionaries. It can be assumed, therefore, that the sounds of OE were represented as far as possible by the Latin letters with their Latin values, with some modifications of an Irish kind. A great deal is known about the pronunciation of Vulgar Latin, whose sound system had much in common with that of modern Italian. If an Italian, knowing no English, were today asked to write down with his own spelling the PresE pronunciation of the word milk [mɪɫk], he would have no difficulty in representing the first sound, which he could spell as m; the vowel [ɪ] might, however, seem to him to resemble the sound he would write in Italian as e rather than as i; the 'dark' [ɫ] would appear to have a back vowel glide accompanying it, requiring a spelling such as ol; and, since he has no k letter, he would spell the final [k] as c. His transcription
of the word might, therefore, be meolc, which is, in fact, a West Saxon spelling of the word now written milk. This is a fortuitous example, and must not be taken to suggest that OE was pronounced in the same way as PresE. But it does demonstrate that OE spellings, which may appear to be very different, are often less surprising when we keep in mind the Latin values originally attached to the letters.
Sometimes the simple forms of the Latin alphabet were evidently inadequate for representing the English sound: thus the joined form æ was used to symbolize a sound between C[a] and C[ɛ]; the sounds [θ] and [ð] were written in the earlier manuscripts as th initially and d medially and finally in a word, and later as ð or the rune þ, regardless of the sound's position in the word or its voiced or voiceless quality; the rune ƿ frequently replaced the earlier w or uu. The vowel values of the OE system were particularly difficult to represent with the five Latin vowel letters. Sometimes the spelling used hesitated between two letters: thus the vowel of mann, probably of a C[a] or [ɒ] quality, was written either with a or o, indicating a vowel between the unrounded open central value of the Latin letter a and the rounded open-mid to close-mid back value of o. Unaccented vowels, too, already beginning to be obscured and levelled, presented a problem to the scribes, the Latin alphabet offering no way of showing a central vowel of the [ə] type. Unaccented æ, e, and i soon began all to be written as e, and unaccented a, u, o later tended to be used indifferently, indicating that the vowel distinction was being lost. A diphthong such as the one written as ea must probably be interpreted as a glide to a central [ə] quality.
Quantity is often shown in the case of vowels by doubling the letter or by the use of an accent and in the case of consonants by doubling the letter. The accent in a word is also sometimes shown by the use of a mark; but, in any case, it is agreed, from a comparison of the West Germanic languages, that the word accent in OE fell generally on the first syllable of words, with the exception of certain compounds.
The written form of OE provides us, therefore, with considerable information concerning the language's pronunciation; we have a working hypothesis from which to begin our investigations. The study of later forms of English will often, in fact, confirm that the OE pronunciation postulated from the spelling and the comparison of Germanic languages is the only one from which later forms can be expected to have developed.
(3) Middle English. Spelling forms can also help us to deduce the pronunciation of the ME period, roughly AD 1100-1450. Generally speaking, it may be said that the letters still had their Latin values and that those letters which were written were meant to be sounded. Thus, the initial k in a word such as knokke was still pronounced and the vowel in time would have an [i] quality. This persistence of Latin values in spelling was no doubt due to the influence of the Church, which was still the centre of teaching and writing, and the absence of a thoroughly standardized spelling accounts for its predominantly phonetic character. However, English spelling was modified by French influences. Notably, the French ch spelling was introduced to represent the [tʃ] sound in a word such as chin (formerly spelt cinn), where the new spelling form indicates no change of pronunciation; in addition ou, or ow, represents the sound [u], formerly written u, e.g. hous, in OE hus. The simple u spelling was retained to express both the French sound [y] in words like duke and fortune and the OE short [u] sound, though this latter sound is often written as o,
especially when juxtaposed with letters of the w, m, n type, e.g. wonne rather than wunne, to avoid confusion between the letter shapes.
Rhymes, too, have their value, especially as, in this period, they are likely to have been satisfactory to the ear as well as to the eye; in the whole of Chaucer's work, for instance, there are very few rhymes which appear to involve the pairing of different vowel sounds. Nevertheless, evidence from rhymes is valueless unless it is possible to be certain, from other sources of evidence, of the pronunciation of one member of the pair. Thus, the Chaucerian rhyme par cas :: was, because we can be sure that the French word cas had a vowel of the [a] quality, is evidence to confirm the view that the [w] of was had not yet retracted and rounded the vowel to [ɒ] and, the final s in the two words being still likely to represent [s], that the word was probably pronounced [was].
Again, words imported from French can give us information concerning the timing of sound changes. Thus French words such as age and couch, which we know from French sources had [a:] and [u:] at the time of their introduction into English, fell in with the English vowel development [a:] > [ei] and [u:] > [au] in words like name and house; we can conclude, therefore, that at the time the French words came into the language the [a:] and [u:] vowels had not begun their change.
Moreover, after the ME period, as we shall see, a great deal of direct evidence is available to us, so that our conjectures from about 1500 onwards can be made with considerable certainty. We may often, therefore, be able to deduce from our knowledge of pronunciation in the sixteenth century, the stage probably reached in the ME period in the development of a sound from OE. The OE [i:] sound in time, for example, was beginning to be diphthongized generally very early in the sixteenth century. It is reasonable to suppose (even if other evidence to support the theory did not exist) that time still had a relatively pure [i:] for much of the ME period.
Finally, the metre of verse reveals the accent of words. It is for this reason that we know that French words, in Chaucer's verse, generally retained their original accentual pattern, e.g. courage [kuˈra:dʒə], and that the accent shift in these cases is a phenomenon of at least late ME.
(4) Early Modern English. The same sources of evidence which we have already considered remain available for the eModE period, roughly AD 1450-1600. The introduction of printing brought standardization of spelling, and already the spoken and written forms of the language were beginning to diverge. But individuals, especially in their private correspondence, often used spellings of a largely phonetic kind, in the same unsophisticated and logical way that children still do. If a modern child writes He must have gone as He must of gone, he is only representing the phonetic identity of the weak forms of have and of ([əv]), an identity which he will learn to ignore when he adopts the conventional spelling distinction. In the same way, if fifteenth- and sixteenth-century spellings show the word sweet occasionally written as swit, it may be assumed that this original ME [e:] was by now so close that it could be represented by i with its Latin value. Or again, the spelling form sarvant instead of servant reflects an open type of vowel in the first syllable which was current throughout the eModE period in such words. Moreover, the conventional adoption of an unphonetic spelling can sometimes provide us with positive evidence as to its value: thus, when words like delight (formerly delite) began to be spelt with gh, this spelling form gh clearly no longer had the
consonantal fricative value which it had formerly represented in light, since there never was a consonantal sound between the vowel and final [t] in delight. We may conclude, therefore, that gh no longer had its former phonetic significance in words such as light. Care must, of course, be taken to identify the increasing number of learned or technical spellings adopted by printers. The initial letter group gh in ghost (OE gast) indicates no change in pronunciation; goose was also sometimes spelt ghoose in this period. Again, spellings which aim at revealing the etymology (true or false) of a word must usually be discarded as phonetically valueless, e.g. debt, island. Thus from the writings of individuals some general indications concerning sound changes may be gathered and used to supplement evidence derived from other sources.
Rhymes, too, continue to be useful as complementary evidence. A rhyme such as night :: white confirms the view that post-vocalic gh no longer had a consonantal value; or again, can :: swan suggests that the rounding of [a] after [w] had not yet taken place. Yet, just as in the case of ME, rhymes must be treated with caution, more particularly as eye-rhymes were doubtless beginning to become more prevalent. In Elizabethan literature, however, additional evidence is afforded by the frequent use of puns, which usually rely for their effect upon similarities, if not identities, of phonetic value. Shakespeare, for instance, plays on the phonetic identity of such pairs as suitor, shooter (both capable of being pronounced [ʃu:tər]) and known, none (both [no:n]); such puns suggest that the pronunciation of the two words was commonly sufficiently close to make an immediate impression upon an audience.
The most important and fruitful evidence for this period is, however, of a direct kind. It is provided by the published works of the contemporary grammarians, orthoepists, and schoolmasters, some of whom have been mentioned in §6.1. They are of unequal value and their statements have often to be interpreted in the light of other evidence; yet they provide us with the first direct descriptive accounts of the pronunciation of English. From the sixteenth century onwards, our conclusions rely more and more on their descriptive statements and less on clues of an indirect kind. Sometimes there appears to be a conflict between the phonetic probabilities, the statements of grammarians, and evidence from other sources. Frequently the solution must be that there existed at any time a variety of current pronunciations, resulting from differences of dialect, generation, fashion, and place in society, in the same way that a description of PresE (even that of a restricted area such as the south of England) would have to take into account a large number of variants.
The following representative systems are conjectures of one possible set of phonemes current in the periods in question.
6.2.5 The Classical Old English Sound System
Vowels
i:,i,y:,y    u:,u
e:,e    o:,ɔ
æ:,æ
ɑ:,ɑ (allophone [ɒ] before nasal consonants)
[ə] occurs in certain weakly accented syllables
Diphthongs e:a,ea; e:a,ea
Consonants
p,b,t,d,k,g (allophone [ɣ])
m,n (allophone [ŋ] before velar consonants)
l,r
f,θ,s (medial allophones [v,ð,z])
ʃ,h (allophones [x,ç])
j,w
Consonants may be long or short.
The spellings hn, hl, hr, hw may be interpreted as phoneme sequences /h/ + [n,l,r,w]; alternatively, if it is assumed that h is here an indication of voiceless [n,l,r,w], these four sounds may be counted as contrastive, i.e. of phonemic status.
Text (St John, Chapter 14, verses 22, 23)
22 ju:dos kwaiO to: him. naes no.: se: skamt dnctsn, hwaet is jswardan flanOu: wilt 0e: sylfna jaswotelijsn us naes middcm cards.
23 se: hae:bnd ondswtuoda Dnd kwae6 him; jif hwa: me: lwvafl he: hilt mi:na sprae:tfa ond mi:n fasdar lova© hina Dnd we: kumad to: him Dnd we: wyrkiaG eardongsto:w3 mid him.
Authorized Version
22 Judas saith unto him, not Iscarioth, Lord, how is it that thou wilt manifest thyself unto us, and not unto the world?
23 Jesus answered and said unto him, If a man love me, he will keep my words; and my Father will love him, and we will come to him, and make our abode with him.
6.2.6 The Middle English Sound System
Vowels
i:,i    u:,u
a:,a    a:
[ə] occurs in unaccented syllables
Diphthongs ɛi,(æi),ɔi, iu,(eu), eu,ɔu,(au)
Consonants
p,b,t,d,k,g,tʃ,dʒ
m,n (allophone [ŋ] before velar consonants)
l,r
f,v,θ,ð,s,z,ʃ,h (allophones [x,ç])
j,w (allophone [ʍ] after /h/)
Text (from the Prologue to the Canterbury Tales)⁵
hwan 0at a:pnl, wi8 his /u:ras so:ta
8a druxt of mart/ haß persad to: Sa ro:to,
and ba:oad e:vri vaein m switf hku:r
Di hwitj vertiu endjendard is 6a flu:r,
5 The type of transcription given here is slightly archaic for Chaucer's pronunciation; e.g. long consonants were probably lost in later ME and such words as and, that would have had a weak vowel.
hwan zefirus e:k wi6 his swe:ta bre:0
inspired ha8 in e:vri halt and he:9
9a tender koppas, and 5a jugga sunna
ha6 in 5s ram his halva kurs irunna,
and sma:la fu:Ias ma:kan mebdi:a
6at sle:pan a:l 5a met wi9 a:pan i:a—
sa: prikaS hem na:tiur m hir kura:d3as—
6an b:rjgan folk to: ga:n an piIgTima:c>;as,
6.2.7 The Early Modern English Sound System
Vowels
i:,i    u:,u
e:    o:,ɤ
ɛ:,e    ə
æ    ɒ:,ɒ
/e:/ was probably /i:/ or /e:/ in certain types of pronunciation
[a] and [a:] occur as contextual variants of /æ/ and /ɒ:/
Diphthongs ai,3u,iu (or ju),eu,ou,ai,ui,ri.
Consonants
p,b,t,d,k,g,tʃ,dʒ
m,n,ŋ
l,r
f,v,θ,ð,s,z,ʃ,ʒ (later, in medial positions),h
j,w (allophone [ʍ] after /h/)
Text (Macbeth, Act II, Scene 1)
nau o:ar 5a wrn ha:f wrrld
ne:tar si:mz ded, and wikid dre:mz abju:z
5a kYrtrind sli:p: witfkraft selibre:ts
pc:l hcksts ofarinz: and wiöard mxrdar,
alaramd bai hiz sentmal, 5a wolf,
hu:z haulz hiz watf, öts wi6 hiz stelöi pe:s,
wi6 tarkwinz raevi/rrj straidz, tu:ardz hiz dizain
mu:vz laik a go:st. 5au sju:r6 and ferm-sct er©
he:r nDt mai steps, hwitf wd Sei wn:k, far fe:r
6ai vein sto:nz pre:t av mai hwe:rabaut,
and te:k 5a prezant hürar fram 5a taim,
hwrtf nau sju:ts6 wi5 it.
6.2.8 The Present English Sound System
Vowels
u:,u
3:,a a:
a:,D
6 Alternatively, [ʃ] or [ʃj] for [sj].
i
I
The Historical Background 75
Diphthongs eɪ,əʊ,aɪ,aʊ,ɔɪ,ɪə,eə,ʊə
Consonants p,b,t,d,k,g
m,n,ŋ
l,r
f,v,θ,ð,s,z,ʃ,ʒ,h
j,w
6.2.9 Modifications in the English Sound System
(1) Distribution of phonemes. The similarities of the systems given above may obscure the fact that the same sound, especially as far as the vowels are concerned, may occur in different categories of words according to the period. Thus [u:], now in food, occurred in OE in words such as town; [i:], now in team, occurred in OE in time. The following summary shows some of the most striking changes affecting the vowel quality used in particular types of word:
Word     OE    ME    eModE            PresE
time     i:    i:    əi               aɪ
sweet    e:    e:    i:               i:
clean    æ:    ɛ:    e: (or [i:])     i:
stone    ɑ:    ɔ:    o:               əʊ
name     a     a:    ɛ:               eɪ
moon     o:    o:    u:               u:
house    u:    u:    əʊ               aʊ
love     ʊ     ʊ     ɤ                ʌ
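The summary table lends itself to a small lookup structure. The following is a minimal sketch, not part of the original text: the names `PERIODS`, `VOWEL_HISTORY`, `stages`, and `changed_between` are hypothetical, and the symbols are one reading of the printed table (an assumption wherever the print is unclear). The alternative eModE [i:] for clean is left out for simplicity.

```python
# Periods in chronological order, as in the summary table.
PERIODS = ("OE", "ME", "eModE", "PresE")

# Vowel quality per period for each example word.
# Assumption: symbols are a reading of the printed table; ":" marks length.
VOWEL_HISTORY = {
    "time":  ("i:", "i:", "əi", "aɪ"),
    "sweet": ("e:", "e:", "i:", "i:"),
    "clean": ("æ:", "ɛ:", "e:", "i:"),   # eModE alternative [i:] omitted
    "stone": ("ɑ:", "ɔ:", "o:", "əʊ"),
    "name":  ("a",  "a:", "ɛ:", "eɪ"),
    "moon":  ("o:", "o:", "u:", "u:"),
    "house": ("u:", "u:", "əʊ", "aʊ"),
    "love":  ("ʊ",  "ʊ",  "ɤ",  "ʌ"),
}


def stages(word):
    """Return (period, vowel) pairs, keeping only periods where the vowel changed."""
    result = []
    for period, vowel in zip(PERIODS, VOWEL_HISTORY[word]):
        if not result or result[-1][1] != vowel:
            result.append((period, vowel))
    return result


def changed_between(earlier, later):
    """List the example words whose vowel differs between two periods."""
    i, j = PERIODS.index(earlier), PERIODS.index(later)
    return sorted(w for w, v in VOWEL_HISTORY.items() if v[i] != v[j])


if __name__ == "__main__":
    print(stages("time"))
    print(changed_between("OE", "ME"))
```

With these values, `stages("moon")` gives `[("OE", "o:"), ("eModE", "u:")]`, which makes visible that the vowel of moon was stable from OE to ME and again from eModE to the present.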
(2) Vowel changes. Several trends become apparent from a study of quality changes:
(a) OE long vowels have closed or diphthongized; on the other hand, PresE [au] and [ei] show signs of monophthongization.
(b) Certain phonemic qualitative oppositions have coalesced, e.g. OE /e:/ and /ae/; the originally separate diphthongs of day and way; the diphthong of know with the originally pure vowel of no; the diphthongs of day, way with the former pure vowel of name; OE /y:,y/ with /i:,i/.
(c) Short vowels, with the notable exceptions of the OE /a,ae/ (and the short diphthong /ea/) in open syllables, and ME /a/, have remained relatively stable.
(d) Rounded front vowels have been lost, e.g. OE /y:,y/ and earlier /ø:,ø/.
(e) The loss of post-vocalic [r] in the eighteenth century gave rise to the PresE centring diphthongs /ɪə,eə,ʊə/ and the pure vowel /ɜ:/, and introduced /ɑ:,ɔ:/ into new categories of words (cart, port).
(f) Vowels under weak accent have been increasingly obscured to [ə] or [ɪ], or have been elided.
(g) Changes of quantity have affected certain phonemes in particular contexts or sets of words, e.g. lengthening of OE /a,æ,ea/ in open syllables and of ME /a/ + /f,θ,s/; and shortening of ME /o:/ in words like good, book, blood, and of ME /e:/ in such words as breath, death, head.
(3) Consonant changes. Changes in the consonantal system are less striking, but the following may be noted:
(a) Double (or long) consonants within words were lost by late ME; certain other consonant clusters ceased to be tolerated, e.g. /hl,hr,hn/ by ME and /kn,gn,wr/ in the eModE period; post-vocalic /r/ was lost in much of the south-east of England in the eighteenth century.
(b) Allophones of certain phonemes have been lost, e.g. the [ɣ] allophone of /g/ in late OE and the [x,ç] allophones of /h/ in eModE.
(c) New phonemes have emerged, e.g. /tʃ,dʒ/ in OE, /v,ð,z/ in ME, and /ŋ,ʒ/ in eModE; in addition, /h/ is used initially in words of French origin where, originally, no [h] sound was pronounced (habit, herb, humble, etc).
Standard and Regional Accents
7.1 Standards of Pronunciation
The British are today particularly sensitive to variations in the pronunciation of their language. The 'wrong accent' may still be an impediment to social intercourse or to advancement or entry in certain professions. Such extreme sensitivity is apparently not paralleled in any other country or even in other parts of the English-speaking world. There are those who claim, from an elocution standpoint, that modern speech is becoming increasingly slovenly, full of 'mumbling and mangled vowels and missing consonants'. Alexander Gil and others made the same kind of complaint in the seventeenth century. There is, in fact, no evidence to suggest that the degree of obscuration and elision is markedly greater now than it has been for four centuries. Of more significance, social as well as linguistic, is the attitude which regards a certain set of sound values as more acceptable, even more 'beautiful' than another. Judgements of this kind suggest that there is a standard for comparison; and it is clear that such a standard pronunciation does exist, although it has never been explicitly imposed by any official body. A consideration of the origins and present nature of this unofficial standard goes some way towards explaining the controversies and emotions which it arouses at the present day.
7.2 The Emergence of a Standard
It is clear that the controversy does not centre around the written language: the spelling of English was largely fixed in the eighteenth century; the conventions of grammatical forms and constructions as well as of the greater part of our vocabulary have for a long time been accepted and adhered to by the majority of educated English speakers. Indeed, the standardization of the written form of English may be said to have begun in the ninth and tenth centuries. But there has always existed a great diversity in the spoken realizations of our language, in terms of the sounds used in different parts of the country and by different sections of the community. On the one hand, the sounds of the language always being in process of change, there have always been at any one time disparities between the speech sounds of the younger and older generations; the speech of the young is
traditionally characterized by the old as slovenly and debased. On the other hand, especially in those times when communications between regions were poor, it was natural that the speech of all communities should not develop either in the same direction or at the same rate; moreover, different parts of the country might be exposed to different external influences (e.g. foreign invasion) which might influence the phonetic structure of the language in a particular area. English has, therefore, always had its regional pronunciations in the same way that other languages have been pronounced in a variety of ways for basically geographical reasons. Yet, at the same time, especially for the last five centuries, there has existed in this country the notion that one kind of pronunciation of English was socially preferable to others; one regional accent began to acquire social prestige. For reasons of politics, commerce, and the presence of the Court, it was to the pronunciation of the south-east of England and, more particularly, to that of the London region that this prestige was attached. The early phonetician John Hart notes (1569) that it is in the Court and London that 'the flower of the English tongue is used . . . though some would say it were not so, reason would we should grant no less: for that unto these two places, do daily resort from all towns and countries, of the best of all professions, as well of the own landsmen, as of aliens and strangers . . .' Puttenham's celebrated advice in the Arte of English Poesie (1589) recommends 'the usual speech of the Court, and that of London and the shires lying about London within 60 miles and not much above . . . Northern men, whether they be noblemen or gentlemen, or of their best clerks, [use an English] which is not so courtly or so current as our Southern English is.' 
Nevertheless, many courtiers continued to use the pronunciation of their own region; we are told, for instance, that Sir Walter Raleigh kept his Devon accent. The speech of the Court, however, phonetically largely that of the London area, increasingly acquired a prestige value and, in time, lost some of the local characteristics of London speech. It may be said to have been finally fixed, as the speech of the ruling class, through the conformist influence of the public schools of the nineteenth century. Moreover, its dissemination as a class pronunciation throughout the country caused it to be recognized as characteristic not so much of a region as of a social stratum. With the spread of education, the situation arose in which an educated man might not belong to the upper classes and might retain his regional characteristics; on the other hand, those eager for social advancement felt obliged to modify their accent in the direction of the social standard. Pronunciation became, therefore, a marker of position in society.
7.3 The Present-Day Situation: RP
(1) Some prestige is still attached to this implicitly accepted social standard of pronunciation. Often called received pronunciation (RP), the term suggesting that it is the result of a social judgement rather than of an official decision as to what is 'correct' or 'wrong', it has become more widely known and accepted through the advent of radio and television. The BBC used to recommend this form of pronunciation for its announcers mainly because it was the type which was most widely understood and which excited least prejudice of a regional kind. Indeed, attempts to use announcers who had a mild regional accent used to provoke protests even from the region whose accent was used. Thus, RP often
became identified in the public mind with 'BBC English'. This special position occupied by RP, basically educated southern British English, has led to its being the form of pronunciation most commonly described in books on the phonetics of British English and traditionally taught to foreigners.
(2) Nevertheless, it cannot be said that RP is any longer the exclusive property of
a particular social stratum. This change is due partly to the influence of radio and television in constantly bringing the accent to the ears of the whole nation but also, in considerable measure, to the modifications which are taking place in the structure of English society. Just as the sharp divisions between classes have disappeared, so the more marked characteristics of regional speech and, in the London region, the popular forms of pronunciation are tending to be modified in the direction of RP, which is equated with the 'correct' pronunciation of English. This tendency does not mean that regional forms of pronunciation show signs of disappearing; but it has to be recognized that those who wish, for any reason, to modify their speech have models of RP always readily available to their ears while, at the same time, the social inhibitions concerning movement between classes, which were formerly so strongly operative, no longer exert the same pressure.
Moreover, it must be remarked that some members of the present younger generation reject RP because of its association with the 'Establishment' in the same way that they question the validity of other forms of traditional authority. For them, real or assumed regional or popular accent has a greater (and less committed) prestige. It is too early to predict whether such attitudes will have any lasting effect upon the future development of the pronunciation of English. But if this tendency were to become more widespread and permanent, the result could be that, within the next century, RP might be so diluted that it could lose its historic identity, and that a new standard with a wider popular and regional base would emerge. Such a change is made more likely through the recent more permissive attitude of the BBC (and of the commercial television companies) in their choice of announcers, many of whom now have markedly non-RP or non-British accents.
(3) Certain types of regional pronunciation are, indeed, firmly established. Some, especially Scottish English speech, are universally accepted; others, particularly the popular forms of pronunciation used in large towns such as London, Liverpool, or
Birmingham, are generally characterized as ugly by those (especially of the older generations) who do not use them. This rejection of certain sounds used in speech is not, of course, a matter of the sounds themselves: thus, [paint] may be acceptable if it means pint, but 'ugly' if it means paint. It is rather a reflection of the social connotations of speech which, though they have lost some of their force, have by no means disappeared. Indeed, RP itself can be a handicap if used in inappropriate social situations, since it may be taken as a mark of affectation or a desire to emphasize social superiority.1 It may be said, too, that if improved communications and radio have spread the availability of RP, these same influences have rendered other forms of pronunciation less remote and strange. An American pronunciation of English, for instance, is now completely accepted in Britain; this was not the case at the time when the first sound films were shown in this country, an American pronunciation then being considered strange and even difficult to understand. Speakers of RP are becoming increasingly aware of the fact that their type of
1 For a summary of experiments on the social evaluation of RP using the matched-guise technique, see Giles et al. (1990).
pronunciation is one which is used by only a very small part of the English-speaking world.
(4) Within RP, those habits of pronunciation that are most firmly established tend to be regarded as 'correct', whilst innovation tends to be stigmatized. Thus conservative forms tend to be most generally accepted, sometimes even by those who themselves use other pronunciations. Where the accentual patterns or the phonemic structure of words is concerned, this attitude may result in a speaker's use of the conservative variant in a formal situation and the use by the same speaker of a less well-established variant in more casual speech, e.g. the avoidance of /verɪˈfaɪəbl/ (verifiable) and /ˈdʒʊərɪŋ/ (during) in more formal speech and their replacement with the more conservative /ˈverɪfaɪəbl/ and /ˈdjʊərɪŋ/. It may be of interest that the pronunciation /ˈdʒʊərɪŋ/ with initial coalescent assimilation was acknowledged by Daniel Jones in the English Pronouncing Dictionary in the 1960s and noted as long ago as 1913 by Robert Bridges in his Tract on English Pronunciation. Nevertheless, there is still some resistance to accepting such coalescence word-initially in accented syllables.
Where realizational variation (below the level of the phoneme) is affected, most speakers are unaware of their own changing speech patterns. Objections to the use of the glottal stop are often made, its use being popularly associated with Cockney speech, and yet its occurrence as a realization of preconsonantal /t/ is increasingly frequent within the speech of the middle and younger generations of RP speakers (see §9.2.8).
(5) Even within RP there are some areas and many individual words where alternative pronunciations are possible. It is convenient to distinguish three main types of RP: General RP, Refined RP, and Regional RP.2 The last two types require some explanation. Refined RP is that type which is commonly considered to be upper-class, and it does indeed seem to be mainly associated in some way with upper-class families and with professions which have traditionally recruited from such families, e.g. officers in the navy and in some regiments. Where formerly it was very common, the number of speakers using Refined RP is steadily declining. This may be because for many other speakers (both of other types of RP and of regional dialects) a speaker of Refined RP has become a figure of fun, and the type of speech itself is often regarded as affected. (The adjective 'Refined' has been chosen deliberately as having positive overtones for some people and negative overtones for others.) Particular characteristics of Refined RP are the realization of /əʊ/ as [ɛʊ], and a very open word-final /ə/ (and where [ə] forms part of /ɪə,eə,ʊə/) and /ɪ/. The vowel /ɜ:/ is also pronounced very open, this time in all positions. The vowel /æ/ is often diphthongized as [ɛæ].
While Refined RP reflects a class distinction and describes a type of pronunciation which is relatively homogeneous, Regional RP reflects regional rather than class variation and will vary according to which region is involved in 'regional'. Some phoneticians, on the basis that part of the definition of RP is that it should not tell you where someone comes from, would regard the term 'Regional RP' as a contradiction in terms. Yet it is useful to have such a term as 'Regional RP' to describe the type of speech which is basically RP except for the presence of a few regional characteristics which go unnoticed even by other speakers of RP. For example, vocalization of dark [ɫ] to [ʊ] in words like held [heʊd] and ball [bɔʊ], a
2 cf. Wells (1982: 280-3, 297-301).
characteristic of Cockney (and some other regional accents), now passes virtually unnoticed in an otherwise fully RP accent (listen, for example, to umpires at Wimbledon saying all). Or, again, the use of /æ/ instead of /ɑ:/ before voiceless fricatives in words like after, bath, and past (part of the general Northern accent within England) may be likewise acceptable. But some other features of regional accents may be too stigmatized to be acceptable as RP, e.g. realization of /t/ by glottal stop word-medially between vowels, as in water (Cockney), the lack of a distinction between /ʌ/ and /ʊ/ (Northern), or the fronting of /u:/ to [y:] (Scottish).
The concept of Regional RP reflects the fact that there is nowadays a far greater tolerance of dialectal variation in all walks of life, although, where RP is the norm, only certain types of regional dilution of RP are acceptable. It remains true, however, that most manuals and dictionaries of the pronunciation of British English, like this book, are based almost entirely on RP.
RP, Refined RP, and Regional RP are not accents with precisely enumerable lists of features but rather represent clusterings of features, such clusterings varying from individual to individual. Thus there are not categorial boundaries between the three types of RP nor between RP and regional pronunciation; a speaker may, for example, generally be an RP speaker but have one noticeable feature of Refined RP.
(6) Finally, it has to be recognized that the role of RP in the English-speaking world has changed very considerably in the last century. Over 300 million people now speak English as a first language, and of this number native RP speakers form only a minute proportion; the majority of English speakers use some form of American pronunciation. However, despite the discrepancy in numbers, RP continues for historical reasons to serve as a model in many parts of the world; and, if a model is used at all, the choice is still effectively between RP and an American pronunciation. When it is a question of teaching English as a second language, there is clearly even greater adherence to one of the two main models. Most teaching textbooks describe either RP or General American, and allegiances to one or the other tend to be traditional or geographical: thus, for instance, European countries continue on the whole to teach RP, whereas some parts of Asia and Latin America follow the American model (see also Chapter 13).
7.4 Comparing Systems of Pronunciation
A comparison of two types of pronunciation will reveal differences of several kinds (as mentioned in §5.3.5):
(a) Realizational differences. The system, i.e. the number of distinctive (phonemic) terms operating, may be the same, but the phonetic realizations of the phonemes may be different: e.g. the RP opposition between the vowels of bet and bat may be maintained, but the realization of both vowels is much more open than in RP (as in Northern English) (see §§8.9.3-4), so that the sound of /æ/ may come near to that of one type of RP /ʌ/ (see §8.9.5); or when, as in Cockney, an allophone [ʔ] represents /t/ between vowels (see §9.2.8); or when the final allophone of /l/ is [l] rather than [ɫ] (see §9.7.1).
(b) Systemic differences (i.e. differences in phoneme inventory). The system may
be different, i.e. the number of oppositions may be smaller or greater, e.g. the RP /æ/-/ɑ:/ opposition may not be present in those Ulster or Scottish forms which do not distinguish Sam and psalm; or when RP /aɪ/ homophones, as in side and sighed, are differentiated qualitatively or quantitatively, as in some types of Scottish English; or when the presence of /g/ after [ŋ] in such a word as sing deprives [ŋ] of its phonemic status (see §9.6.3).
(c) Lexical differences (i.e. differences of lexical incidence). The system may be the same, but the incidence of phonemes in words is different, e.g. in those Northern forms which have the RP opposition /u:/-/ʊ/, but nevertheless use /u:/ in book, took, etc. (see §8.9.9-10); or when /ɒ/ is used instead of /ʌ/ in one, among, etc., though the opposition /ɒ/-/ʌ/ exists (see §8.9.5); or when the choice of phoneme is associated with the habits of different generations, e.g. /ɔ:/ for /ɒ/ in off, cloth, cross, etc. (see §8.9.7) or /eɪ/ for /ɪ/ in Monday, holiday, etc.
(d) Distributional differences. The system may be the same, but the phonetic context in which certain phonemes occur may be limited, e.g. in RP /r/ has a limited distribution, being restricted in its occurrence to prevocalic position as in red or horrid. Accents which display this limited distribution of /r/ are referred to as non-rhotic accents, whilst those in which /r/ has a full distribution (such as most American and Scottish accents) are termed rhotic. In the latter accents /r/ occurs pre-consonantally and pre-pausally as well as pre-vocalically; thus part and car will be pronounced /pɑ:rt/ and /kɑ:r/ whereas in non-rhotic accents the pronunciation will be /pɑ:t/ and /kɑ:/. See §§9.7.2 and 12.4.7.
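The distributional restriction on /r/ can be stated as a simple rule over segment sequences. The following is a minimal sketch under stated assumptions, not a description from the text: the function name `derhoticize`, the `VOWEL_SYMBOLS` set, and the representation of a word as a list of segment strings are all hypothetical, and the rule looks only inside a single word (so it ignores any /r/ that would be retained before a vowel in a following word).

```python
# Assumption: a segment counts as a vowel if its first symbol is in this set.
VOWEL_SYMBOLS = set("aeiouæɑɒɔəɜɪʊʌ")


def derhoticize(segments):
    """Derive a non-rhotic form: keep /r/ only when a vowel follows it.

    `segments` is a list of phoneme strings, e.g. ["p", "ɑ:", "r", "t"].
    Pre-consonantal and pre-pausal /r/ are removed; prevocalic /r/ stays.
    """
    result = []
    for index, segment in enumerate(segments):
        if segment == "r":
            following = segments[index + 1] if index + 1 < len(segments) else None
            if following is None or following[0] not in VOWEL_SYMBOLS:
                continue  # /r/ before a consonant or pause: not pronounced
        result.append(segment)
    return result


if __name__ == "__main__":
    for word in (["p", "ɑ:", "r", "t"], ["k", "ɑ:", "r"],
                 ["r", "e", "d"], ["h", "ɒ", "r", "ɪ", "d"]):
        print("".join(word), ">", "".join(derhoticize(word)))
```

Applied to the rhotic forms of part and car, the sketch yields /pɑ:t/ and /kɑ:/, while red and horrid pass through unchanged because their /r/ is prevocalic.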
7.5 Current Changes within RP3
(1) Realizational changes. RP /æ/ is frequently heard with a more open quality approaching C[a]. This continues a trend in which this RP vowel was typically around C[ɛ] early in this century. It appears to conflict with another trend whereby RP /ʌ/ was becoming more fronted and also approaching C[a]. There is no evidence suggesting that the two vowels are coalescing; indeed, it seems more likely that /ʌ/ is retreating to its central position.
Other developments among the vowels include /eə/ becoming monophthongal [ɛ:], and /aɪ/ and /aʊ/ having the same centrally open starting-point (as shown in the revision of the first symbol of /aʊ/ from previous editions).
The vowel represented by the spelling <y> at the ends of words like pity, cruelty, lengthy is more and more frequently heard with a closer and more forward pronunciation than the usual realization of /ɪ/, e.g. in sit. Indeed the two vowels in such a pronunciation of city are far less similar to one another than the two vowels in meaty. Thus it seems best to regard the newer, closer, pronunciation as involving unaccented /i:/ (although, of course, theoretically the distinction may be said to be neutralized in this position; see §5.3.4).4
As has been mentioned in §7.3 (4), the realization of preconsonantal /t/ as a glottal stop is increasingly common in present-day RP. (See §9.2.8.)
(2) Systemic changes. The one recent systemic change that is now more or less completed is the loss of /ɔə/ from the phoneme inventory.
3 See further in Ramsaran (1990a).
4 For fuller discussion, see Lewis (1990).
(3) Lexical changes. There is a strong trend towards selecting /ə/ instead of /ɪ/ in weak syllables, the choice of /ə/ being particularly favoured after /l/ and even more so after /r/, e.g. angrily /ˈæŋgrɪlɪ/ > /ˈæŋgrəlɪ/. For further detail and examples, see §8.9.2.
Another noticeable trend is the replacement of /ʊə/ by /ɔ:/ in many common words, e.g. poor /pɔ:/, sure /ʃɔ:/, though /ʊə/ still retains its phonemic status, its contrastive function being illustrated in the speech of most speakers by such sets as doer, dour, door /du:ə, dʊə, dɔ:/.
(4) Distributional changes. The most noteworthy trend concerning a regular change in the occurrence of a phoneme is the loss of /j/ after alveolar consonants in such words as allude /əˈlju:d/ > /əˈlu:d/, luminous /ˈlju:mɪnəs/ > /ˈlu:mɪnəs/, supersede /sju:pəˈsi:d/ > /su:pəˈsi:d/. /j/ is most commonly dropped after /l/ and /s/ (as, indeed, it was long ago after /r/). In sequences of /n/ + /j/, elision of the /j/ is increasingly common in British English. In the case of the alveolar plosives + /j/, coalescence whereby /tj,dj/ > /tʃ,dʒ/, rather than elision, is now increasingly common except initially in an accented syllable, where /t/ + /j/ or /d/ + /j/ tend to be retained. Thus educate /ˈedju:keɪt/ > /ˈedʒu:keɪt/, statuesque /stætju:ˈesk/ > /stætʃu:ˈesk/.
(5) Word accent changes. Certain patterns may be detected, especially in the change affecting adjectives in -able/-ible and -ary/-ory. In both classes of words, the accent tends now to fall later in the word, thus ˈapplicable > apˈplicable, ˈexplicable > exˈplicable, ˈjustifiable > justiˈfiable, ˈfragmentary > fragˈmentary, ˈmandatory > manˈdatory.
Similarly, the feminine suffix -ess increasingly attracts primary accent in words like counˈtess, lioˈness, prioˈress, stewarˈdess.
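The two treatments of /j/ described in (4), elision after /l/ and /s/ and coalescence with /t/ and /d/, can be sketched as a rewrite over a segment list. This is an illustrative sketch only: the function name `apply_yod_changes`, the list-of-segments representation, and the use of a separate "ˈ" token placed directly before the accented syllable are all assumptions made for the example. The check for "initially in an accented syllable" is simplified to "immediately preceded by the accent token", so onsets with more than one consonant are not handled, and the /n/ + /j/ case is left out.

```python
# /t/ and /d/ merge with a following /j/ into the corresponding affricate.
COALESCED = {"t": "tʃ", "d": "dʒ"}
# /j/ is dropped after these consonants.
DROPPED_AFTER = {"l", "s"}
ACCENT = "ˈ"  # assumption: accent token placed before the accented syllable


def apply_yod_changes(segments):
    """Apply /j/ elision and /tj, dj/ coalescence to a list of segments."""
    result = []
    index = 0
    while index < len(segments):
        segment = segments[index]
        following = segments[index + 1] if index + 1 < len(segments) else None
        if following == "j":
            accented_onset = index > 0 and segments[index - 1] == ACCENT
            if segment in DROPPED_AFTER:
                result.append(segment)          # keep /l/ or /s/, drop /j/
                index += 2
                continue
            if segment in COALESCED and not accented_onset:
                result.append(COALESCED[segment])  # /tj/ > /tʃ/, /dj/ > /dʒ/
                index += 2
                continue
        result.append(segment)
        index += 1
    return result


if __name__ == "__main__":
    educate = ["ˈ", "e", "d", "j", "u:", "k", "eɪ", "t"]
    during = ["ˈ", "d", "j", "ʊə", "r", "ɪ", "ŋ"]
    print("".join(apply_yod_changes(educate)))
    print("".join(apply_yod_changes(during)))
```

On this encoding, educate comes out with /dʒ/ while during keeps /dj/, because its /d/ begins the accented syllable; this mirrors the retention described in the text rather than the newer coalesced variant.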
Other current changes do not display such regular patterns, and it remains to be seen which of two variant pronunciations at present coexisting will prevail.
7.6 Systems and Standards other than RP
The remainder of this book is a description of English set within the basic framework of RP, with some reference to variation in other dialects in the discussion of each of the RP phonemes. But there are a number of reasons why such particular differences should be drawn together to show the major overall differences between the phonemic system of RP and that of other major dialects of English. In this section we survey briefly differences between RP and five other systems: General American, Scottish English, Northern (England) English, Cockney, and Australian English. We survey an American pronunciation because, as noted in §7.3, this is more frequently the standard model for learners of English as a second language in much of Asia and Latin America. We look at Scottish English because this is the type of pronunciation of English within the British Isles which is most frequently accepted as an alternative standard to RP. We survey Northern (England) English and Cockney because these are the areas (apart from Scottish) whose characteristic pronunciations are heard most widely within Britain and which often underlie regional forms of RP. We look at Australian English because this is typical of an English pronunciation of the southern hemisphere and may increasingly become the standard for a wider area rather than just Australia. Of
course, we could easily have made a case for the inclusion of other systems of pronunciation here (e.g. Caribbean English and Indian English); but since this is not primarily a book about varieties of English, a limit had to be set somewhere. Moreover there are now books which survey dialectal variation in English pronunciation in detail.5 Where reference is made in this book to non-standard varieties of English, the type of pronunciation being referred to is the basilectal variety of the area concerned, i.e. that used by lower socio-economic classes (and by middle socio-economic classes in informal situations).
7.6.1 General American
The traditional (although not undisputed) division of the United States for pronunciation purposes is into Eastern (including New England and New York City), Southern (stretching from Virginia to Texas and to all points southwards), and General (all the remaining area). General American (GA) can thus be regarded as that form of American which does not have marked regional characteristics (and is in this way comparable to RP). It is the standard model for the pronunciation of English as an L2 in parts of Asia (e.g. the Philippines) and parts of Latin America (e.g. Mexico).
There are two areas of systemic difference between RP and GA. First, GA has no /ɒ/. Most commonly, those words which have /ɒ/ in RP are pronounced with /ɑ:/ in GA, e.g. cod, spot, pocket, bottle. But a limited subset has /ɔ:/, e.g. across, gone, often, cough (as can be seen from the examples, these frequently involve a following voiceless fricative). Secondly, GA lacks the RP diphthongs /ɪə,eə,ʊə/ which correspond in GA to sequences of vowel plus /r/, e.g. beard, fare, dour, /bɪrd/, /fer/, /dʊr/. This reflects the allied distributional difference between RP and GA, namely that, unlike RP, where /r/ occurs only before vowels, GA /r/ can occur before consonants and before pause (GA is called a rhotic dialect and RP a non-rhotic dialect).
The main difference of lexical incidence concerns words which in RP have /ɑ:/ while in GA they have /æ/. Like the change from /ɒ/ to /ɔ:/, this change commonly involves the context before a voiceless fricative, or alternatively before a nasal followed by another consonant; thus RP /pɑ:st/-GA /pæst/, RP /ˈɑ:ftə/-GA /ˈæftɚ/, RP /pɑ:θ/-GA /pæθ/.
Differences of realization are always numerous between any two systems of English pronunciation, and only the most salient will be mentioned. Among the vowels this includes the realization of the diphthongs /eɪ/ and /əʊ/ as monophthongs [e:] and [o:], hence late [le:t] and load [lo:d]. Among the consonants, /r/ is either phonetically [ɻ], i.e. the tip of the tongue is curled further backwards than in RP, or else a similar auditory effect is achieved by bunching the body of the tongue upwards and backwards; /t/ intervocalically is usually a voiced tap in GA, e.g. better [ˈbeɾɚ]; and /l/ is generally a dark [ɫ] in all positions in GA, unlike RP, where it is a clear [l] before vowels and a dark [ɫ] in other positions (see §9.7.1).
7.6.2 Scottish English
The typical vowel system of Scottish English (SE) involves the loss of the RP distinctions between /ɑ:/ and /æ/, between /u:/ and /ʊ/, and between /ɔ:/ and /ɒ/. Thus SE pronounces the pairs ant and aunt, soot and suit, caught and cot similarly. SE also has no /ɪə,eə,ʊə/ because, like American English, it is rhotic, and beard, fare, and dour are pronounced as /bi:rd/, /feɪr/, and /du:r/. However, the vowel in /feɪr/, which we have transcribed with the RP diphthong /eɪ/, is typically monophthongal [e:] (and of course would be transcribed as such if we were devising a phonemic transcription independently for SE). The vowel /əʊ/ is also monophthongal [o:] as in coat [ko:t]; so the vowels in fare and coat are similar to those in American English. Moreover, the vowel in soot and suit is not like either of the RP vowels in these words, but is considerably fronted to something like [y], hence [syt].
5 In particular Wells (1982).
The chief differences from RP in the realization of the consonants lie in the use of a tap [ɾ], e.g. red [ɾed] and trip [tɾɪp], though there is variation between this and [ɹ] (the usual type in RP), the use of [ɹ] being generally more prestigious. The phoneme /l/ is most commonly a dark [ɫ] in all positions, little [ɫɪtɫ], and plough [pɫaʊ]. Finally, intervocalic /t/ is often realized as a glottal stop, e.g. butter [ˈbʌʔəɹ].
7.6.3 Cockney
We use the term Cockney, rather than London English, because, unlike General American and Scottish English, Cockney is as much a class dialect as a regional one. In its broadest form the dialect of Cockney includes a considerable vocabulary of its own, including rhyming slang. The characteristics of Cockney pronunciation are spread more widely through the working class of London than is its vocabulary. Moreover, some traces of Cockney pronunciation are often present in most middle-class speech of the area.
Unlike the previous two types of pronunciation, there are no differences in the inventory of vowel phonemes between RP and Cockney, and there are relatively few (compared with GA and SE) differences of lexical incidence. There are, however, a large number of differences of realization. The short front vowels tend to be uniformly closer than in RP, e.g. in sat, set, and sit, so much so that sat may sound like set and set itself like sit to speakers from other regions. Additionally the short vowel /ʌ/ moves forward to almost C[a]. Among the long vowels, most noticeable is the diphthongization of /iː/ (= [ɪi]), /uː/ (= [ʊu]), and /ɔː/, which varies between [ɔʊ] morpheme-medially and [ɔwə] morpheme-finally, thus bead [bɪid], boot [bʊut], sword [sɔʊd], saw [sɔwə]. Cockney also uses distinctive pronunciations of a number of diphthongs: /eɪ/ = [aɪ], /aɪ/ = [ɑɪ], /əʊ/ = [æʊ], and /aʊ/ = [aː], e.g. late [laɪt], light [lɑɪt], load [læʊd], loud [laːd]. The last two vowels are close enough to cause considerable confusion among non-Cockney listeners, although the distinction is never actually neutralized.
Among the consonants, most notable are the omission of /h/ and the replacement of /θ, ð/ by /f, v/, e.g. think /fɪŋk/, father /ˈfɑːvə/, hammer /ˈæmə/. Dark [ɫ], i.e. /l/ in positions not immediately before vowels, becomes vocalic [ʊ], e.g. milk [mɪʊk]; /t/ is realized as a glottal stop between vowels, e.g. butter [ˈbʌʔə], and there is glottal replacement of [p,t,k] before a following consonant, e.g. soapbox [ˈsæʊʔbɒks], statement [staɪʔmənt], technical [ˈteʔnɪkəl], as in some types of Scottish English; and /j/ is elided after alveolar plosives, e.g. student, during.
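The consonant substitutions just described can be read as a small set of ordered rewrite rules. The following Python sketch is an illustration only, not a description taken from the text: the function name cockney_consonants, the VOWELS inventory, and the representation of a word as a list of RP phoneme symbols are all assumptions made for the example. It covers the omission of /h/, the replacement of /θ, ð/ by /f, v/, the elision of /j/ after alveolar plosives, the glottal stop for /t/ between vowels, and the vocalization of /l/ when no vowel follows. Glottal replacement of [p,t,k] before a consonant is deliberately left out, since applying it correctly needs syllable-boundary information that this simple representation does not carry. Vowel realizations are not modelled at all.

```python
# Hypothetical RP vowel inventory used only to test "is this segment a vowel?"
VOWELS = {
    "ɪ", "e", "æ", "ʌ", "ɒ", "ʊ", "ə",
    "iː", "ɑː", "ɔː", "uː", "ɜː",
    "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə",
}


def cockney_consonants(phonemes):
    """Apply a simplified subset of the Cockney consonant changes.

    `phonemes` is a list of RP phoneme symbols, e.g. ["b", "ʌ", "t", "ə"].
    Returns a new list; the input is not modified.
    """
    # Step 1: omit /h/ and replace /θ, ð/ by /f, v/.
    fronting = {"θ": "f", "ð": "v"}
    segs = [fronting.get(s, s) for s in phonemes if s != "h"]

    # Step 2: elide /j/ immediately after an alveolar plosive (/t/ or /d/).
    no_yod = []
    for s in segs:
        if s == "j" and no_yod and no_yod[-1] in ("t", "d"):
            continue
        no_yod.append(s)

    # Step 3: context-dependent changes, judged on the output of step 2.
    out = []
    for i, s in enumerate(no_yod):
        prev = no_yod[i - 1] if i > 0 else None
        nxt = no_yod[i + 1] if i + 1 < len(no_yod) else None
        if s == "t" and prev in VOWELS and nxt in VOWELS:
            out.append("ʔ")  # /t/ between vowels becomes a glottal stop
        elif s == "l" and nxt not in VOWELS:
            out.append("ʊ")  # /l/ not immediately before a vowel is vocalized
        else:
            out.append(s)
    return out


if __name__ == "__main__":
    for word, rp in [
        ("think", ["θ", "ɪ", "ŋ", "k"]),
        ("hammer", ["h", "æ", "m", "ə"]),
        ("milk", ["m", "ɪ", "l", "k"]),
        ("butter", ["b", "ʌ", "t", "ə"]),
    ]:
        print(word, "".join(cockney_consonants(rp)))
```

The rules are applied in a fixed order so that /j/ elision is settled before the contexts for /t/ and /l/ are examined; a fuller treatment would need syllable structure and stress.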
Cockney has consistently had a major influence on the development of RP, and nowadays that type of Regional RP which is heavily influenced by Cockney is often referred to as Estuary English (i.e. a middle-class pronunciation typical of the Thames estuary). Particularly characteristic of this type of Regional RP are the replacement of [p,t,k] by [ʔ] before a consonant (see §9.2.8 (b) (ii) below) and the use of [ʊ] in place of [ɫ].
7.6.4 Northern English
While a broad Cockney accent is relatively homogeneous, and General American and Scottish English much less so, the range of accents covered by the label 'Northern English' is less homogeneous still. We use the label here simply to identify those things which the disparate pronunciation systems of the north of England have in common (and we will also mention a few characteristics which are typical only of certain areas). The area in question lies north of a line from the River Severn to the Wash, and includes Birmingham.
The major identifying feature of this area is the loss of the distinction between RP /ʊ/ and /ʌ/, the single phoneme doing duty here varying in quality from [ʊ] to [ə]. So Northern English has no distinction between put and putt, could and cud, and, for many speakers, between buck and book (although others may use /uː/ in the latter word). Hypercorrections are often made by those attempting Regional RP, producing, for example, sugar [ʃʌgə], pussy [pʌsi], put [pʌt]. Almost as identifying a characteristic is the changeover in lexical incidence from /ɑː/ to /æ/ in words with a following voiceless fricative (or a nasal followed by a further consonant, as in General American), e.g. past /pæst/, laugh /læf/, aunt /ænt/. Another type of lexical incidence concerns the occurrence of a full vowel in prefixes where RP has /ə/, e.g. advance /ædˈvæns/, consume /kɒnˈsjuːm/, observe /ɒbˈzɜːv/. The short vowels are generally realized with more open qualities than in RP, e.g. mad [mad], and the diphthongs /eɪ/ and /əʊ/ are commonly monophthongal [eː] and [oː] as in GA and SE (indeed sometimes, as in Newcastle, the direction of the diphthong is reversed to [eə] and [oə]). Other vowel changes (compared with RP) characteristic of particular areas include the loss of the /eə/-/ɜː/ distinction in Liverpool (where the local accent is called Scouse) and its common realization as [œː], e.g. both fare and fur are pronounced [fœː]; the realization of /aʊ/ as [uː] in Newcastle (where the broad local accent is called Geordie) while /uː/ itself becomes [ɪə], e.g. about [əbuːt], boot [bɪət]; and the use of a particularly close /ɪ/ in Birmingham, e.g. pit is almost [pit], where the distinction between pit and peat will depend on length alone.
Most notable among the consonants of Northern English is the realization of /r/ as [ɾ] in a number of conurbations including Leeds, Liverpool, and Newcastle, and the lack of the RP allophonic difference between clear [l] and dark [ɫ], clear [l] being used in all positions in many areas, e.g. Newcastle, and dark [ɫ] in others, e.g. Manchester. In a quite extensive area, from Birmingham to Manchester and Liverpool, the RP single consonant /ŋ/ becomes /ŋg/, e.g. singing [sɪŋgɪŋg]. Also in a number of urban areas, notably south-east Lancashire, /p,t,k/ in final position (i.e. before a pause) are realized as ejectives.
7.6.5 Australian English
There is little regional variation in Australian English (ANE), the variation which does occur being largely correlated with social class and ranging from a broad accent all the way up to Regional RP. The broad accent described here shares many features with Cockney, but has of course a particular combination of these and other features which identify it.
As in Cockney, there are no differences of phonemic inventory from RP and no extensive classes of word involved in differences of incidence. It is the realization of long /ɑː/ as [aː] which more than any other identifies ANE, e.g. father [faːðə], part [paːt]. As in Cockney, /iː/ and /uː/ are realized as [ɪi] and [ʊu] and the short front vowels are all closer than in RP, although [ɪ] does not occur in unaccented positions, being replaced by /iː/ word-finally and by /ə/ in other positions, e.g. city /sɪtiː/. In its diphthongs ANE is again like Cockney in having /eɪ/ = [aɪ] and /aɪ/ = [ɑɪ], and in having a convergence of quality of /əʊ/ and /aʊ/; however, diphthongs ending in /ə/ are monophthongized, so /ɪə/ = [ɪː], clear [klɪː] (leading to an accumulation of three vowels, /iː/, /ɪ/, and [ɪː], in the close front area), /eə/ = [ɛː], fare [fɛː], while /ʊə/ is either replaced by /ɔː/ as in sure or becomes disyllabic as in sewer /suːə/.
Although ANE does drop /h/, it does not use the glottal stop, nor does it vocalize /l/, having dark [ɫ] in all positions.
A particular development in Australian English (and in New Zealand) which has been the subject of much discussion recently, both in newspapers and in academic journals,6 is the increasing use of a high rising tone on declarative clauses (where a fall would normally have been expected). The meaning of this tone and the reasons behind its increased use have also been much discussed (see further under §11.6.3).
6 Guy et al. (1986); Britain (1992).