Spanish Phonetics


QUESTION: What is a diphthong in Spanish? How will it help my students to know this?

Normally, a syllable is based on one vowel sound. For example, the Spanish word comemos has three syllables based on the three vowels, and the word can be separated syllabically this way: co-me-mos with the consonant sounds helping to divide.

A Spanish diphthong is the combination of two vowel sounds to make one syllable. Students can improve their pronunciation, and in particular, their fluidity, by knowing where to stress the vowels in words instead of emphasizing every vowel equally.

Spanish vowels have been traditionally divided into two groups: a) the strong, and b) the weak. The strong vowels are a, e, o. When two strong vowels stand side by side in a word unseparated by consonants, they each constitute a syllable. For example, the Spanish word caos has two syllables, ca-os. The weak vowels are i and u (thus the popular saying to help remember the order, “Only U and I are weak”). This means that when a strong and a weak vowel are side by side, the strong vowel is emphasized more in the syllable. In Spanish, the word “tónica” refers to the stressed vowel and “átona” refers to the unstressed vowel.

A diphthong can be one of three possible combinations of vowels:

  1. a weak then a strong vowel (for example, siete)
  2. a strong then a weak vowel (for example, seis)
  3. two weak vowels (for example, fuimos and ciudad)

The Spanish word siete has only two syllables (sie-te), and the stressed vowel (la vocal tónica) is the e as it is the strong one in the diphthong.

The word seis has only one syllable because it contains a diphthong, of which the e is the stressed vowel.

When two weak vowels are together, it is typically the second vowel that is stressed, as in the verb fuimos, where the i is stressed in the first of the two syllables. However, depending on where the stress falls in the word, both of the weak vowels may be unstressed (“átonas”), as in the Spanish words ciudad (two syllables, ciu-dad) and cuidado (three syllables, cui-da-do). Because ciudad ends in the consonant d, its last syllable is stressed, leaving both weak vowels unstressed; because cuidado ends in the vowel o, the stress falls on the second-to-last syllable, -da-, again leaving the syllable with two weak vowels unstressed.

If the weak vowel is stressed in a weak + strong or strong + weak combination, the potential diphthong is broken, and each vowel constitutes a separate syllable, as in the word país (pa-ís). Compare paisano (pai-sa-no), in which the i is átona, so the word contains a diphthong.
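The rules above for two adjacent vowels inside a word can be sketched in a few lines of Python. This is only an illustration, not a full syllabifier: the function name classify_pair is hypothetical, and the sketch assumes that a stressed weak vowel is always written with an accent mark (í, ú), as in país.

```python
STRONG = set("aeoáéó")      # strong vowels, with or without a written accent
WEAK = set("iu")            # weak vowels without a written accent
WEAK_ACCENTED = set("íú")   # weak vowels marked as stressed (assumption: stress is written)


def classify_pair(first: str, second: str) -> str:
    """Return 'diphthong' or 'hiatus' for two adjacent written vowels."""
    first, second = first.lower(), second.lower()
    for vowel in (first, second):
        if vowel not in STRONG | WEAK | WEAK_ACCENTED:
            raise ValueError(f"not a Spanish vowel: {vowel!r}")
    # Two strong vowels each form their own syllable (ca-os).
    if first in STRONG and second in STRONG:
        return "hiatus"
    # A stressed weak vowel next to a strong vowel breaks the diphthong (pa-ís).
    if (first in WEAK_ACCENTED and second in STRONG) or (
        first in STRONG and second in WEAK_ACCENTED
    ):
        return "hiatus"
    # Weak + strong, strong + weak, and weak + weak share one syllable.
    return "diphthong"


for word, pair in [("siete", "ie"), ("seis", "ei"), ("fuimos", "ui"),
                   ("caos", "ao"), ("país", "aí"), ("paisano", "ai")]:
    print(word, classify_pair(pair[0], pair[1]))
```

Students could use a table like this to check their own syllable divisions before consulting the answer key.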

I. Ahora te toca a ti (Now it’s your turn). Try dividing the following words into syllables, and then check your answers against those on the last page.

1. siesta                       4. cuesten          7. reímos      10. triunfal

2. actuando               5. actúo               8. paisano    11. causa

3. homogeneidad    6. esquiabais     9. leí              12. abuelo

QUESTION: Can a diphthong occur between words in a sentence?

If the vowels are different, they may follow the diphthong rules. In authentic speech, the ending vowel of one word and the initial vowel of the following word can form a diphthong. Consider the word group la humilla, in which the non-stressed definite article and the first syllable form a diphthong, breaking this group into three syllables: lau-mi-lla, the silent h being omitted. This is similar to the diphthong in the single word causa, where the two syllables are cau-sa. Another example is lo histórico, where the most stressed syllable of the group is the accented -tó-, leaving the first syllable to form a diphthong. Therefore, the syllable breaks are lois-tó-ri-co. Encouraging students to think of pronouncing words in groups and not word by word will help improve their pronunciation and help them attain greater fluidity.

As in the case of diphthongs within words, if the weak vowel of the combinations weak + strong and strong + weak is stressed, then there is no diphthong, for example, la única (la-ú-ni-ca) and lo hice (lo-í-ce). In the case of two weak vowels in contact with each other between words, if the first is stressed, there is no diphthong, for example, tú hiciste (tú-i-cis-te). If the second is stressed, there is a diphthong, for example, tu hijo (tui-jo).
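The between-word rules can be sketched the same way. The function name boundary_diphthong and its stressed parameter are hypothetical. Because stress is not always written (the i of hice carries no accent mark), the caller states which vowel, if either, is stressed. The sketch also assumes that two unstressed weak vowels join, as they do inside ciudad, and it rejects identical vowels because the rules here cover only different ones.

```python
STRONG = set("aeo")
WEAK = set("iu")


def boundary_diphthong(last: str, first: str, stressed=None) -> bool:
    """Decide whether two vowels meeting across a word boundary form a diphthong.

    last     -- final vowel of the first word, written without an accent mark
    first    -- initial vowel sound of the next word (a silent h is skipped)
    stressed -- "last", "first", or None when neither vowel is stressed
    """
    last, first = last.lower(), first.lower()
    if last not in STRONG | WEAK or first not in STRONG | WEAK:
        raise ValueError("expected plain vowels a, e, i, o, u")
    if stressed not in (None, "last", "first"):
        raise ValueError("stressed must be 'last', 'first', or None")
    if last == first:
        raise ValueError("identical vowels are outside the rules sketched here")
    # Two strong vowels keep separate syllables, as inside a word.
    if last in STRONG and first in STRONG:
        return False
    # Two weak vowels: only a stressed first vowel blocks the diphthong.
    if last in WEAK and first in WEAK:
        return stressed != "last"
    # One weak and one strong vowel: a stressed weak vowel blocks the diphthong.
    weak_position = "last" if last in WEAK else "first"
    return stressed != weak_position


print(boundary_diphthong("a", "u"))            # la humilla
print(boundary_diphthong("a", "u", "first"))   # la única
print(boundary_diphthong("u", "i", "last"))    # tú hiciste
print(boundary_diphthong("u", "i", "first"))   # tu hijo
```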

II. Ahora te toca a ti. Try pronouncing these word groups with different vowels between words, using diphthongs. Count the number of syllables you hear, then check the syllable division at the end of this guide.

1. lo increíble      4. se unieron      7. te hiciste                  10. su ave

2. mi ojo                5. su arete            8. la inventó                11. tu abuelo

3. veinte y seis    6. de usted           9. la universidad       12. su hijita


QUESTION: Do the letters b, d, g, and v have different pronunciations in different words?

These alófonos (sounds) do indeed sound slightly different depending on the position in an utterance (word group or sentence). To understand the differences, each alófono is classified by three characteristics:
1) the mode of articulation (where and how the air is released);
2) the point of articulation (in the mouth, what touches what);
3) voicing (if the vocal cords vibrate or not).

1) For the mode of articulation, the three basic groups are stops (oclusivas), fricatives (fricativas), and nasals (nasales). Stops indicate that there is a break in the air flow, fricatives indicate that there is air passing through a narrow opening, and nasals indicate the air comes through the nose.

2) The point of articulation refers to the parts of the mouth. Here are some examples:
a) dental (tongue against the upper teeth);
b) interdental (between the teeth);
c) labiodental (upper teeth against lower lip);
d) bilabial (both lips);
e) velar (back of the tongue against the soft palate, or velum).

3) Voicing refers to the vocal cords vibrating, which produces a voiced sound (sonora). When the vocal cords do not vibrate, the sound is called unvoiced (sorda).

Now that the three descriptors have been introduced, it is time to look at individual words and sounds. Let’s start with the letters v/b, phonetically written as [b]. Though some regional differences exist, the two letters sound the same when they are “utterance initial.” In this position, and also when it follows the letters m or n, the [b] is oclusiva, bilabial, sonora. Think of the expression Ven acá or the word tambor, in which both lips touch and stop the air flow as the vocal cords vibrate to produce sound. In contrast, the [p] is oclusiva, bilabial, sorda because only air comes out and the vocal cords do not vibrate.

When the letters v/b are between vowels in a word, however, the pronunciation changes slightly to fricativa, bilabial, sonora. The lips do not completely shut and the air passes through a tight opening. Think of the word labio with the [b] between vowels. Another example is the verb tuve where the v is not labiodental as it is in English.

The [d] is oclusiva, dental, sonora when “utterance initial” or after the letters n and l. Think of the word donde in which the tongue presses against the upper teeth for each [d].

When the [d] is in any other position, it is fricativa, interdental, sonora. The tongue lightly touches between the teeth to form the fricative, like in the word cada.

The [g] is made with the tongue in the back of the mouth at the velum, or soft palate. Much like the [b] and [d], if the [g] is in “utterance initial” position or following an n, it is oclusiva, velar, sonora. The word garaje and the verb tengo are good examples of this sound.

If the [g] is in any other position, it is fricativa, velar, sonora. The tongue does not block the air for the stop, but rather allows some air to pass. Some examples are digo and hago, with the intervocalic consonant softening slightly. This is similar to the softened g that may be heard in a relaxed pronunciation of the English word “sugar”, which is not the same as the g in “go!”.
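The descriptions of [b], [d], and [g] all follow the same pattern, which a short Python sketch can make explicit. The function name describe is hypothetical. One assumption goes beyond the text: for b/v, only the position between vowels is described as fricative, and the sketch extends that to every position other than utterance-initial or after m/n, mirroring the "any other position" wording used for d and g.

```python
# Letters after which each consonant remains a stop (oclusiva).
STOP_AFTER = {"b": {"m", "n"}, "d": {"n", "l"}, "g": {"n"}}


def describe(letter: str, previous=None) -> str:
    """Return mode, point of articulation, and voicing for b, v, d, or g.

    previous is the preceding letter, or None when the consonant is
    utterance-initial.
    """
    key = letter.lower()
    if key == "v":
        key = "b"  # v and b share the same sound
    if key not in STOP_AFTER:
        raise ValueError(f"unsupported letter: {letter!r}")
    is_stop = previous is None or previous.lower() in STOP_AFTER[key]
    if key == "b":
        point = "bilabial"
    elif key == "g":
        point = "velar"
    else:
        # d is dental as a stop and interdental as a fricative
        point = "dental" if is_stop else "interdental"
    mode = "oclusiva" if is_stop else "fricativa"
    return f"{mode}, {point}, sonora"


print(describe("v"))        # Ven acá
print(describe("b", "a"))   # labio
print(describe("d", "n"))   # donde
print(describe("d", "a"))   # cada
print(describe("g", "n"))   # tengo
print(describe("g", "i"))   # digo
```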

III. Ahora te toca a ti. Say each of the following words out loud. Then, identify these consonants (b, d, g, v) using the points of articulation listed above. After, check the answers on the last page of this guide.

A. Letter: b
1. haber           3. hablar            5. Benito
2. Córdoba     4. obstinado    6. hombre

B. Letter: d
1. Dorotea       3. cada                5. cuando
2. alcalde         4. aduana           6. continuad

C. Letter: g
1. inglés            3. pongo             5. grasa
2. gorila            4. Olga                6. traigan

D. Letter: v
1. viento           3. cuevas           5. Vicente
2. olvidamos  4. invitación     6. curva

QUESTION: Why does the n in some Spanish words sound like the ng in the English word thing but not in all words?

There are actually several possible sounds for the letter n depending on what letters surround it. When reading through the following descriptions, keep in mind that only the point of articulation will change, while the mode of articulation (nasal) and voicing (sonora) remain constant.

The letter n sounds a bit different when it is followed by the hard c (co, ca, cu), as in the words banco or oncología. This also happens when the n is followed by a hard g, as in the words vengo or tengan. The g sound is made with the back of the tongue at the soft palate while the vocal cords vibrate (in Spanish called una consonante oclusiva, velar, sonora), and the preceding n approximates this consonant. In English, this happens as well, for example, in the words “ink” and “bank”. The resulting sound of the n is called nasal, velar, sonora.

When the n is followed by a d or a t, the pronunciation is dental, meaning that the tongue touches the back of the upper front teeth. Think of the word veinte for an example. When pronouncing a t in Spanish, the tongue should be behind the front teeth with no vibration of the vocal cords (in Spanish called una consonante oclusiva, dental, sorda). The articulation of the n moves forward in the mouth, against the teeth, in anticipation of the articulation of the next letter. This n is therefore nasal, dental, sonora.

If followed by an f, the n approximates the next sound again and becomes nasal, labiodental, sonora. This sound is really a combination of f and m, pronounced simultaneously. Words such as enfermo show how both the lower lip and the upper teeth are used to produce this sound.

Yet another sound is produced when the n precedes the letters b, v, or p, or consonantes bilabiales (using both lips). By approximating the next consonant, the n becomes nasal, bilabial, sonora, or in other words, just like the letter m, as in the expression un beso, phonetically [um beso].

Finally, when preceding a ch, the n moves back in the mouth to become nasal, palatalizada, sonora.

To hear these differences, try saying the following words, paying attention to where your tongue is: cantan, treinta, estudiante, presente, pongo, engaño, venganza, tengas.
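The assimilation of n to the following consonant amounts to a lookup on the letters that come next, as this Python sketch suggests. The function name n_point is hypothetical. It covers only the contexts described above, treats c and g as hard only before a, o, or u, and returns None for any context that is not described. The lips-and-teeth articulation before f is labeled labiodental.

```python
def n_point(following: str):
    """Return the point of articulation of n, given the letters that follow it.

    Returns None when the context is not one of those described in the text.
    """
    nxt = following.lower()
    if nxt.startswith("ch"):
        return "palatalizada"
    # Hard c and hard g (before a, o, u) pull the n back to the velum.
    if len(nxt) >= 2 and nxt[0] in "cg" and nxt[1] in "aou":
        return "velar"
    if nxt[:1] in ("d", "t"):
        return "dental"
    if nxt[:1] == "f":
        return "labiodental"
    if nxt[:1] in ("b", "v", "p"):
        return "bilabial"
    return None


for label, rest in [("banco", "co"), ("vengo", "go"), ("veinte", "te"),
                    ("enfermo", "fermo"), ("un beso", "beso"),
                    ("un cheque", "cheque")]:
    print(label, "->", "nasal,", n_point(rest), ", sonora")
```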

IV. Ahora, te toca a ti. Say the following words aloud. Then, identify the sound for each n, paying close attention to the point of articulation. After, check the answers on the last page of this guide.

1. estudiante      4. pongo         7. cantan
2. un poco           5. énfasis        8. un cheque
3. cuando            6. tanque        9. un vaso

QUESTION: Why do the indirect object pronouns le/les change to se when preceding the third-person direct object pronouns lo/la/los/las?

Contrary to popular belief, it is not because the sound is unacceptable to the Spanish-speaker’s ear. The evolution from Latin is the cause of this spelling change. Let’s consider the Latin expression įllī įllum dixī, meaning se lo dije. When used separately, the two pronouns changed over time to become the following:

įllī > lī > le
įllųm > lum > lo

When used together, the object pronouns evolved differently. The following changes occurred:

įllī įllųm > lī įllųm > lī ello > ljelo > yelo > ŷelo > želo > śelo > se lo

Therefore, the expression mentioned earlier, įllī įllum dixī, became se lo dije due to the evolution of the combined pronouns. Students may find this interesting to know, and hopefully it will help them remember one of the “irregularities” of modern Spanish.

It bears mentioning that in expressions such as “Voy a pedirle la mano” and “pienso contarle lo que pasó anoche”, there is no change to se because of the lack of two consecutive object pronouns.
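For students who like to see a rule stated mechanically, the modern outcome can be sketched in Python. The function name combine_pronouns is hypothetical, and the sketch models only the present-day substitution, not the historical sound changes.

```python
INDIRECT_THIRD = {"le", "les"}
DIRECT_THIRD = {"lo", "la", "los", "las"}


def combine_pronouns(indirect: str, direct=None) -> str:
    """Return the pronoun sequence, changing le/les to se before lo/la/los/las."""
    indirect = indirect.lower()
    if direct is None:
        # A single object pronoun never changes (pedirle, contarle).
        return indirect
    direct = direct.lower()
    if indirect in INDIRECT_THIRD and direct in DIRECT_THIRD:
        return f"se {direct}"
    return f"{indirect} {direct}"


print(combine_pronouns("le", "lo"))    # se lo
print(combine_pronouns("les", "las"))  # se las
print(combine_pronouns("le"))          # le
print(combine_pronouns("me", "lo"))    # me lo
```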