Worried about your foreign accent or just want to tune it up a bit? You’re in the right place! This is the first of a series on sounding good in your target language. I’m calling it Pronunciation Plus!
We all want to make sure that a thick foreign accent does not get in the way of good communication.
It’s really important to win the confidence of your conversation partner. If you sound too different from the norm, at worst they may simply not understand you at all.
Even if they do, the chances that they will try to switch to English are increased.
How you sound can be more important than how accurate your grammar is or how many words and expressions you actually know.
We want to sound right. That doesn’t mean sounding perfect, though.
Some elements of how you sound are more important than others.
That stubborn foreign accent has its roots in one or more elements that we’ll explore in this post.
What are the elements? In this mega overview post we’ll focus first on the level of individual sounds: phonetics and phonology. Then we’ll look at sounds strung into words and phrases, covering stress, syllables, rhythm, intonation and tone. Finally, we’ll see how individual sounds can change in connected speech, especially if we’re speaking rapidly. There’s a video at the bottom of the post, where I summarise the message for a group of fellow language learners.
Phonetics for language learners
Phonetics is the study of speech sounds from the perspective of biology (how we make them and how the ear picks them up) and physics (the properties of the sound waves). Here we’re focussing on how we make them.
Each speech sound is said to be made up of at least two (and sometimes three) “components”.
The first component is “initiation” or how a flow of air is initiated.
There are three organs that can initiate the flow.
First, the lungs. Air can be pushed out of the lungs (pulmonic pressure) or sucked in (pulmonic suction). All sounds in European languages involve pushing air out of the lungs. Pulmonic suction is very rarely found (if at all) except as a way of signalling agreement when listening to a conversation (the “inhaled affirmative” “ja” found in Norwegian or Icelandic). If you say a shocked “Huh!” in English, you’ll feel what it’s like.
Second, the larynx. The air can be thrust upwards (“glottalic pressure”, common in the languages of the Caucasus and in Native American and African languages). The air can also be sucked down (“glottalic suction”, widespread among the languages of Sub-Saharan Africa and Southeast Asia; it also occurs in a few languages of the Amazon Basin but is rare elsewhere).
Third, the tongue can be used (velaric suction). This is used in a small number of southern African “click” languages. The sounds are more widely used beyond speech (e.g. the “tut, tut” sound of disapproval made in English or the clicking you might make to “gee up” a horse).
“Articulation” is how and where the airflow is modified.
Let’s start with where. This can be in the nose, the mouth and nose, in the mouth, or in the area of the pharynx (the throat behind the nose and mouth) and, below it, the larynx (“voice box”). The lips, teeth, walls of the mouth, tongue and the uvula can all be involved in various ways.
Moving to how the airflow is modified, or “stricture type”, the articulatory organs just mentioned can modify the flow of air in various ways.
For example, “plosive” sounds are caused by the release of pressure from behind the lips, as in [p] or [b], or from behind the tongue, as in [t, d, k, g]. “Fricatives” involve a flow of air creating sound through friction with articulatory organs. “Affricates” combine plosive and fricative, as in the “ch” [ʧ] in choke or “j” [ʤ] in joke.
Among the other stricture types are the “trills”, in which one flexible organ taps repeatedly against another in a flow of air (as in the Scottish trilled “r”, where the tongue flaps against the “alveolar ridge”, the ridge just behind the upper front teeth).
Down in the larynx, the glottis is the gap formed between the vocal folds (or vocal cords/voice reeds) when they are open (we use this gap to breathe as well as to speak!). In speech, modification of the flow of air by the glottis is called “phonation”.
“Voice” (as a technical term) is the most common effect (a buzzing of the vocal cords). English has pairs of consonants which are made with and without this vibration: a “voiced” “b”, “voiceless” “p”, voiced “g”, voiceless “k”.
Other modifications that the glottis can produce include whispering or a “husky” or “creaky” voice.
All these bells and whistles mean that a great many speech sounds are possible. Each language only uses some of them. English does not use clicks for initiation, for example. Neither does it use the front, rounded “u” vowels found in French or German (“vue”, “ueber”). English does not use whispers to distinguish one sound from another.
The International Phonetic Alphabet has been developed to represent each physically possible sound. Individual sounds or words notated in the IPA are conventionally placed in square brackets (as we’ve done above).
If you get the initiation, articulation or phonation of individual sounds wrong, this could be causing you to mispronounce words and speak with an accent. An obvious challenge with a new language is learning to make the physical components you are not used to using. Another difficulty is when sounds are close to those found in your native language, but slightly different.
So far, so physical. Things are just about to get a whole lot weirder.
Phonemes: get these wrong and you’ll have a strong foreign accent
Each language builds words from its own system of basic, building block units of speech sound which are contrasted with other units in order to make up different words.
These basic building blocks are called “phonemes”.
Languages have been studied with as few as eleven phonemes or as many as 141. English is pretty typical with about 44 (remembering there is some variation for regional accent).
Here’s where it gets peculiar: phonemes are, in a sense, “made” in our brains. We’re moving from the physical production of sounds to how the brain classifies them.
Think of a unit or “segment” of speech sound as a bunch of acoustic features only some of which are decisive in classifying any actual sound you make (and each time you say the same sound it will be a bit different).
The other acoustic features are discounted by the brain because they are not crucial “distinctive features” in the system of sound contrasts in that language. The brain may not hear the unimportant difference at all. Or it may hear the difference and perceive just a variation of “the same” sound. That is to say, as “allophones” of the same phoneme.
For example, in English there is one /l/ building block phoneme but there are actually, broadly, three actual allophone “l”s heard in English. (Phonemes are written between // using the IPA symbol for one of the allophones which fall within the phoneme in the language).
In English, /l/ may be a “light” or “clear” “l” [l] like at the beginning of “leaf”. It might be a dark “l” [ɫ] like at the end of “cool”, which is actually a different sound. In some modern British dialects (in London, for example) the dark “l” is often pronounced nearer to yet another sound, a “w”.
To repeat, none of the differences between these “l” sounds matters for meaning in English. In English spoken with a Welsh accent, the light “l” is used in the end position as well, with no detrimental effect on comprehension.
In Polish, on the other hand, the difference between a light and dark “l” is a significant contrast. It’s “phonemic”. Change out the sound in a Polish word and it will sound wrong and you could be misunderstood.
For example, Polish “luk” with a clear “l” means “skylight” while “łuk” with a dark “l” means “bow” (as in bow and arrow). “Wola” (clear “l”) means the noun “will”, while “woła” means “is calling/calls”.
So, an English person learning Polish suddenly has to tune in a whole lot more to the various “l” sounds or risk misunderstandings. That risk is particularly high with “minimal pairs”, like the examples just cited. In a “minimal pair” the ONLY difference between the two words is the one sound.
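If you like to see ideas spelled out in code, here is a minimal Python sketch of the “minimal pair” test. It assumes each word has already been broken down into a sequence of sound symbols; the tiny lexicon and its simplified transcriptions are my own illustrative assumptions, not a real pronunciation dictionary.

```python
from itertools import combinations


def is_minimal_pair(a, b):
    """True if two sound sequences have the same length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def find_minimal_pairs(lexicon):
    """Return every pair of words whose transcriptions differ by a single sound."""
    return [
        (w1, w2)
        for (w1, s1), (w2, s2) in combinations(lexicon.items(), 2)
        if is_minimal_pair(s1, s2)
    ]


# Toy lexicon (assumed, simplified transcriptions): one symbol per sound
lexicon = {
    "bin": ("b", "ɪ", "n"),
    "been": ("b", "iː", "n"),
    "pin": ("p", "ɪ", "n"),
    "bowels": ("b", "aʊ", "ə", "l", "z"),
    "vowels": ("v", "aʊ", "ə", "l", "z"),
}

print(find_minimal_pairs(lexicon))
# [('bin', 'been'), ('bin', 'pin'), ('bowels', 'vowels')]
```

Note that “been” and “pin” are not reported: they differ in two sounds, so they are not a minimal pair.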
French speakers learning English have the same problem when they have to learn to distinguish between the short “i” /ɪ/ in “bin” and the long “i” /iː/ in “been”, as this distinction is not made between “i” sounds in French.
Spanish speakers have to learn to distinguish between English “b” and “v”, which are phonemic contrasts in English (compare “bowels” and “vowels”) but just allophonic variations of “v” in Spanish (i.e. they are “the same” sound in Spanish). “Okay, okay” in Spanish – “vale, vale” – can be pronounced with a [b] at the beginning of “vale” rather than a [v] with no change in meaning.
If enough features are present in the sound you make to put you within the right “phoneme”, you will still be understood even if you produce the wrong “allophone”. But because you’re not quite on the mark, you’ll be making the sound with “an accent”.
Up to this point, we’ve been focussed on individual sounds or segments.
These are important, but don’t stress over them. Focus in on the sound system of your target language, use some of the methods we’ll be looking at later in the series and it will come with time. After all, we’ve all got the same sound-making equipment inside.
It’s time to look beyond the segments to the so-called suprasegmental features of the language. These are also called the prosodic features or the prosodies.
Here’s the thing: the reason for your accent could come less from the individual sounds you’re making than from how you’re stringing ’em all together.
Prosodic features to reduce your accent in a foreign language
Stress is the relative prominence of a syllable. Differences in the stress of a syllable are created by a greater or lesser force with which the syllable is initiated (by the glottis, tongue or, usually, the lungs). The pitch is often higher on a stressed syllable, as well. For example, compare the word “increase” as a noun and as a verb: an INcrease; to inCREASE.
Many languages distinguish between a relatively strong and relatively weak stress. English also has an intermediate “secondary” stress. The Russian strong stress tends to be stronger than the English strong stress. Russian stress is also “mobile”: it can move around in different conjugations of the same verb, or fall on different syllables of a noun depending on whether it is singular or plural or which case it’s in. Some languages have a simpler, fixed stress system and getting that right is a quick win when you want to improve your accent.
Syllables are chunks of language. Each begins with a pulse of initiation (usually from the lungs). All syllables in all languages have to have a “nucleus”. This is usually a vowel. Besides the nucleus, a syllable can begin and end with a consonant or cluster of consonants.
However, languages vary in the combinations of consonants they will allow and where in the syllable.
Depending on the language, one or more consonants can be added before the nucleus (the “onset” of the syllable) or after (the “coda” of the syllable). In syllable analysis the vowel “nucleus” is often represented by “V” with a consonant as “C”.
CV is a common structure in English. It’s by far the dominant syllable shape in Japanese (To-yo-ta).
Languages which allow CVC, such as English or Mandarin, vary in how many consonants can come at the beginning and the end. Mandarin only allows one at the beginning and one at the end. In English you can have up to three consonants at the beginning but only in strict combinations: first /s/, then a “stop” (p, t, k), then a liquid consonant (r, l), as in “stroll”. Russian allows a wider range of consonant combinations at the beginning.
Arabic does not allow consonant clusters at all.
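For the curious, the C and V notation is easy to play with in a few lines of Python. This is only a rough sketch that works on spelling rather than on actual sounds (so the double “l” in “stroll” counts as two consonants even though it is one sound), and the vowel set and function names are my own assumptions for illustration.

```python
VOWEL_LETTERS = set("aeiou")  # assumption: these five letters stand in for vowel sounds


def cv_skeleton(syllable):
    """Map each letter to V (vowel letter) or C (any other letter)."""
    return "".join("V" if ch in VOWEL_LETTERS else "C" for ch in syllable.lower())


def split_syllable(syllable):
    """Split a written syllable into (onset, nucleus, coda) around its vowel letters."""
    skeleton = cv_skeleton(syllable)
    first_v = skeleton.find("V")
    last_v = skeleton.rfind("V")
    if first_v == -1:
        raise ValueError(f"no vowel letter found in {syllable!r}")
    return syllable[:first_v], syllable[first_v:last_v + 1], syllable[last_v + 1:]


print([cv_skeleton(s) for s in ["to", "yo", "ta"]])  # ['CV', 'CV', 'CV']
print(cv_skeleton("stroll"))                         # CCCVCC
print(split_syllable("stroll"))                      # ('str', 'o', 'll')
```

A serious version would work from a phonetic transcription instead of the spelling, but even this toy makes the three-consonant onset of “stroll” easy to see.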
“Epenthesis” is when learners insert vowels to make the pattern conform to their own language. Are you unwittingly doing this?
For example, because Spanish does not have initial sp/st consonant clusters, Spanish learners of English often find it hard not to add a vowel and say “Espain” for “Spain” or “estation” for “station”. Arabic speakers add vowels to break up consonant clusters: “children” becomes “childiren”.
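The Spanish pattern is regular enough to mimic with a toy rule. The sketch below, again in Python and again working on spelling only, adds an “e” in front of a word that begins with “sp” or “st”. The rule and the function name are my own simplification of what learners do, not a model of Spanish phonology.

```python
import re

# Assumed rule: a word-initial "s" directly followed by "p" or "t" triggers the extra vowel
SP_ST_ONSET = re.compile(r"^s(?=[pt])", re.IGNORECASE)


def add_initial_vowel(word):
    """Prefix "e" to words starting with sp/st, keeping the original capitalisation."""
    if not SP_ST_ONSET.match(word):
        return word
    vowel = "E" if word[0].isupper() else "e"
    return vowel + word[0].lower() + word[1:]


for word in ["Spain", "station", "sun"]:
    print(word, "->", add_initial_vowel(word))
# Spain -> Espain
# station -> estation
# sun -> sun
```

“Sun” is left alone because its “s” is followed by a vowel, so there is no cluster to break up.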
Syllable “simplification” can also take place. Learners leave out consonants because the combinations are not allowed in the native language. A Cantonese person might say “gir” for “girl” because Cantonese, like Mandarin, does not allow two consonants at the end of a syllable.
Some linguists think that the syllable, not the phoneme, is the main unit of speech. They argue that the rate of phonemes per second is simply too great for the brain to be able to process and that another, larger unit must be involved. So, syllable problems could be accentuating that foreign accent of yours.
Some languages create a rhythm on the basis of syllables which are given equal weight and equal length. In French, Welsh, Spanish, Cantonese, Mandarin, Brazilian Portuguese, initiator power is on each syllable – equal timing (“syllable timed” or a “machine-gun” effect). A sure-fire way of im-pro-ving your French ac-cent is to make sure that you give each syllable equal weight.
Other languages parcel out the speech flow into relatively equal chunks based not on syllables as such but on stress groups or “feet”. These can consist of a single syllable but often consist of several. They are marked out by giving the initial syllable of each group extra stress. English, German, Russian, European Portuguese and Persian are examples of the many such “stress-timed” (or “isochronous”) languages. A characteristic of stress-timed languages is that they tend to have “reduced” vowels in unstressed syllables, unlike the syllable-timed languages.
In some languages, the length of the syllable (long or short) is central to the rhythm (Japanese is the main example of such “mora-timing” – there is some debate around the actual definition of “mora” and to what extent this really is different from syllable timing).
Pitch can be varied either by stretching and tensing the vocal folds (the tenser, the higher) or by changing the pressure below the vocal folds (subglottal pressure) (the higher the pressure, the higher the pitch).
Pitch variations over relatively long utterances (potentially many syllables) are called “intonation”.
Variations in intonation will not change the meaning of the individual words involved but they may change the function of what is said. You need to be on the lookout for potential differences in intonation pattern and in how intonation (of whatever pattern) is used in your target language.
A special intonation can, of course, often be used in English and many other languages to make a question, though the intonation pattern may be different from English. In Hungarian, for example, when you ask a question that can be answered with “yes” or “no”, your voice starts on medium pitch and then drops a bit, before rising on the last or penultimate syllable (in the latter case, it has to fall again). Got that or is it still messing up your otherwise very passable Hungarian? 😉
English uses intonation to emphasise individual words. “What do YOU think?” (I don’t care what he thinks, I want to know your opinion). Other languages may use particles to add emphasis instead, as is often the case in Russian. Sometimes languages will change the sentence structure where English uses intonation. In Welsh, for example, you would say “laughing she was” (with a different form of “was”: “oedd” instead of “‘r oedd”) for the English “she was LAUGHING” (as in “She was laughing and not crying”, or “Would you believe it, she was laughing”).
Intonation can also be used to express emotion.
The emotions implied by different intonations can differ between languages, meaning that there can be a risk of miscommunication. However, this is to do with wider cultural factors rather than a change in the literal meaning of the phrase, “pragmatics” rather than “semantics”.
Some languages, such as Finnish, have very little intonation at all which can make them sound “cold” and “emotionless” to English or Russian speakers used to using intonation to express emotion.
If the pitch varies in one syllable it is called a “tone”.
Over half the world’s languages are “tonal languages”.
That is to say, the differences in syllable tone change the meaning of the syllable. In other words, in tonal languages, a tone (pitch level) works like a phoneme (it’s a “toneme”).
Zulu has two tones: “high” and “low”. Cantonese has six: “mid-level” and “low-level”, “high-falling” and “low-falling”, “high-rising” and “low-rising”. That makes Mandarin, with a mere four, at once look like a piece of cake 🙂
Differences in tone can even (though pretty uncommonly) signify different grammatical features (as in Twi (spoken in parts of Ghana), where a change in tone indicates tense).
In “pitch-accent” languages like Swedish or Japanese, some words are distinguished from others by making one syllable more prominent than the others by means of a difference in pitch rather than by stress.
Sound changes in connected speech
Another important aspect at the level of connected speech is how the pronunciation of individual segments can change as a result of the rhythm or even just the speed. You need to be alert to how this typically happens in your target language or risk sounding stilted or “foreign”.
Words can have “strong” and “weak” forms, depending on whether or not they are stressed (“I am”, “I’m”).
Sounds may be dropped or merged (“elision”) in the transition from one word to another. Compare “He stopped” with “He stopped speaking”, where the “-ed” is hardly heard. In “next day” the “t” is hardly heard: it elides with the “d”.
The opposite may also occur. “Liaison” (or “intrusion”) is where a sound appears in a word combination. So, in “four” the “r” is not usually heard in standard British English but in “four o’clock” it reappears. “I always” is pronounced “I yalways”. In the French “c’est difficile” the “t” is not pronounced but if it is followed by a vowel, it’s heard (“c’est incroyable!”).
In “catenation”, when a word that ends with a consonant is followed by a word that begins with a vowel, the two join together, so “pick it up” sounds more like “pi ki tup”.
Further, in “assimilation” adjacent sounds are modified so that they become more like each other: the “n” in “handbag” often sounds more like an “m”, and in “last year” the “t” sounds more like “ch” (/t/ becomes /tʃ/).
Phew! So those are the elements that make up how you’re sounding and that could be messing with your accent.
Yes, there are quite a lot, but don’t be intimidated. It’s all a lot more manageable than it seems and you don’t have to get good at everything at once (or even at all….as long as you’re understood!).
For now, be aware!
It’s not just about learning “new sounds” that don’t occur in the languages you already know.
It’s also about tuning your ear to pick up “phoneme” distinctions that are not important in your language. Phonemes are particularly important where miscommunication could result – the wrong phoneme or toneme or misunderstanding intonation. Don’t fixate on phonemes, though.
As we’ve seen, rhythm, intonation or tones are also often part of the picture and they may be more important than the individual sounds in helping you sound “right”.
I’ve spent so much time setting the elements out on the table so that we’ll be in a better position to build targeted practical work on reducing that foreign accent and getting better pronunciation into our language learning.
So, look out for the next in the Pronunciation Plus! series. We’ll be getting down to practical methods and activities that you can do. I’ll also be explaining which are most appropriate for the beginner, intermediate and advanced stages of your language learning journey.
“Tuning in” (1) – a video call I did with a group of fellow language learners covering the ground we’ve just discussed: