We are delighted that our initial call has received enthusiastic responses. To facilitate coordination and reduce duplication of effort, please check the lists at the end of this page to see which language families are especially in need of exploration and which languages have already been taken up. You are nevertheless welcome to contact us even if the language you are interested in is already listed there, as more than one person may work jointly on a language.

Call for international collaborations

Cross-Linguistic distribution of Post-focus Compression (PFC) and its historical origin


Yi Xu, University College London, UK

Bei Wang, Minzu University of China, China

Szu-wei Chen, National Chung Cheng University, Taiwan



There is increasing evidence that prosodic focus (also known as contrastive stress, nuclear accent, nuclear tone or sentence stress) is realized in many languages not only by increasing F0, duration, intensity and upper spectral energy on the focused component itself, but also by compressing the pitch range and intensity of the post-focus components (Chen et al., 2009; Cooper et al., 1985; Pell, 2001; Xu, 1999, 2005; Xu & Xu, 2005). There is also evidence that such post-focus compression (PFC) is a highly effective perceptual cue for focus (Xu et al., 2004). PFC has been reported, explicitly or implicitly, for English, German, Greek, Dutch, Swedish, Japanese, Korean, Finnish, Arabic, Uyghur, Tibetan, Hindi, Persian and, interestingly, Mandarin Chinese (cf. Xu, 2011, for a brief review). The case of Mandarin is especially notable because it is fully tonal and also has morpho-syntactic means of marking focus (clefting, as in English). Thus PFC seems to be independent of the tonal and syntactic characteristics of the language.

The independence of PFC from tone is more clearly seen in the surprising new finding that it is absent in Taiwanese, a tone language closely related to Mandarin (Chen et al., 2009; Pan, 2007; Xu, Chen & Wang, 2012). More unexpectedly, the same study also found that PFC is absent in Taiwan Mandarin, an official language spoken in Taiwan, which resembles Beijing Mandarin in many respects. It seems that Taiwan Mandarin has lost PFC through close contact with Taiwanese, as pervasive bilingualism has been a fact of life in Taiwan for several generations. More interestingly, there is evidence that PFC is also absent in Cantonese, another southern Chinese language (Gu & Lee, 2007; Wu & Xu, 2010). Furthermore, evidence is now emerging that many other languages probably also lack PFC; see the reviews by Zerbian et al. (2010) and Xu (2011).

Given the new evidence, a natural question is how PFC could have entered a language in the first place. There are at least three hypotheses: (a) independent genesis: emerging spontaneously within the language; (b) horizontal spreading: entering the language through contact with a PFC language; and (c) vertical inheritance: being passed down from an ancestral language with PFC.

To support the independent genesis hypothesis, it would be desirable to show that the emergence of PFC is a case of convergent evolution, just like the development of webbed feet, fins and a streamlined body shape in animals that have adopted an underwater lifestyle. Otherwise, it would be increasingly difficult to explain why so many seemingly unrelated languages all independently developed PFC, while many other languages somehow resisted the same pressure.

To support the horizontal spreading hypothesis, it would be necessary to show, first, that each PFC language, unless it is the direct descendant of the original PFC language, has been in contact with a PFC language in the past. In the case of Mandarin, this is quite likely, because historically it was in close contact with Altaic languages such as Mongolian and Manchu. We now know that at least some Altaic languages have PFC, including Japanese and Korean (Lee & Xu, 2010), although, to our knowledge, Mongolian and Manchu have not yet been studied for PFC. The second necessary support for the spreading hypothesis is evidence that PFC can be transmitted from one language to another during contact. So far, however, there has only been evidence of PFC being lost when two languages are in contact through bilingualism or second language learning (Wu & Chung, 2011; Chen et al., 2009; Wang et al., 2011).

Vertical inheritance seems to be the most extreme of the three hypotheses, as it implies that all PFC languages are descendants of an ancient proto-language in which PFC was first developed. But if we were to contemplate such a possibility, implausible as it may currently seem, what could have been this proto-language? Here is a speculation in Xu (2011):

From the distribution pattern that is currently emerging, the grouping of the PFC languages seems to be consistent with the hypothetical Nostratic superfamily, consisting of the Indo-European, Uralic, Altaic, Afroasiatic, Dravidian, Kartvelian and Eskimo-Aleut language families (Bomhard, 2008; Pedersen, 1931). Their common ancestor, proto-Nostratic, could be dated back to the end of the last Ice Age, i.e., 15,000-12,000 BC, and was probably spoken along the Fertile Crescent (Bomhard, 2008).

One of the strongest pieces of evidence for the Nostratic superfamily hypothesis is the expansion of farming and languages after the end of the Ice Age, as described by Diamond & Bellwood (2003).

The implications of cross-linguistic investigation of PFC distribution are many:

1.     The current view seems to favor the idea that every language has a unique but constantly changing prosodic system. Both the horizontal spreading and vertical inheritance hypotheses, however, would suggest that prosodic features are more stable than previously thought and may remain in a language for a long time.

2.     Most linguistic typological patterns, such as word order and tonality, are found across languages not closely related to each other (Haspelmath et al., 2005), suggesting multiple emergences of similar patterns. But a few typological features, such as phonemic clicks in southern and eastern Africa (Haspelmath et al., 2005), and lax prosody in question intonation (Rialland, 2009), seem to occur only in genetically or geographically related languages. Both the spreading and inheritance hypotheses, if supported, would group PFC with clicks and lax prosody as one of the hard-to-emerge features.

3.     Research on language typology has so far mostly relied on evidence from vocabulary, syntax and segmental phonology, and data collection has been mostly non-experimental. The inclusion of prosody and the use of systematic experimentation would bring unique contributions to the study of language typology.

4.     Typological research so far has mostly relied on properties that have to be measured in relative terms, e.g., number of shared words or phonemes, similarity of syntactic structure, etc. PFC, in contrast, is a singular feature (though with multiple acoustic cues), and the evidence it offers may be more clear-cut than most other types of evidence.

5.     Support for the vertical inheritance hypothesis would imply that Mandarin is a descendant of an Altaic language, and that the characteristics it shares with other Chinese languages, such as tone and much of the vocabulary, were probably acquired through language contact. This scenario would challenge the standard assumption that all Chinese languages are derived from a single proto-Chinese language (Bodman, 1980).

6.     The testing of the vertical inheritance hypothesis may help establish closer links between linguistics and population genetics. For example, current genetic evidence suggests that the southern and northern populations in China belong to very different branches (Chu et al., 1998), which is inconsistent with the widely accepted Sino-Tibetan language family. Separating Mandarin and Tibetan, both of which have been shown to have PFC, from the non-PFC southern Chinese languages would reduce the discrepancies between linguistic and genetic trees in China. Similar improvements could be made in other regions as well.

Call for collaborations

The nature of such cross-linguistic research entails that this must be a collaborative international effort. We therefore call for interested colleagues who, regardless of what they think of the hypotheses we have outlined, are keen to find out whether PFC exists in languages of their own interest, to join us in this collaborative effort. Please contact Yi Xu at yi.xu@ucl.ac.uk if you are interested.

To facilitate this effort, we have developed experimental protocols consisting of basic production and perception experimental designs, highly efficient procedures and software tools for taking accurate acoustic measurements, and effective analysis procedures for determining the presence/absence of PFC in a language. Please see the following for methodological highlights.

Methodological highlights

·      Production experiments:

1.     Simple sentences (1-3) of medium length: 3 key words (3-10 syllables)

2.     Whenever possible, use the same sentences for all focus conditions

3.     In a tone language, use H level tone whenever possible

4.     Use 4 leading questions to elicit focus: No focus, focus on initial, medial or final key word

5.     Subjects (≥8): All native speakers of the target language, 4 males and 4 females

6.     Acoustic measurements should be taken from all three key words; use our Praat script whenever possible, see http://www.phon.ucl.ac.uk/home/yi/tools.html

7.     The script allows users to label the intervals to be analyzed, and then automatically generates a long list of text files containing measurements such as time-normalized F0 contours, time values corresponding to the time-normalized F0 points, duration of labeled intervals, maxf0, minf0, meanf0, mean intensity, etc.
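
As a rough illustration only, the following Python sketch shows one way that per-word F0 measurements of the kind listed above might be compared across focus conditions. It is not part of the Praat script and does not describe the analysis procedures mentioned in this call. The function names, the data layout and the F0 values are all hypothetical. The sketch assumes one mean F0 value (in Hz) per key word and expresses each post-focus word in semitones relative to the same word in the no-focus condition.

```python
import math


def hz_to_semitones(f0_hz, ref_hz):
    """Express f0_hz in semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def post_focus_change(neutral, focused, focus_index):
    """Semitone change of each post-focus key word relative to the
    same word in the no-focus condition (hypothetical helper)."""
    return [
        hz_to_semitones(focused[i], neutral[i])
        for i in range(focus_index + 1, len(neutral))
    ]


# Hypothetical mean F0 (Hz) of the three key words for one speaker.
neutral = [220.0, 210.0, 200.0]        # no focus
initial_focus = [250.0, 180.0, 170.0]  # focus on the initial key word

changes = post_focus_change(neutral, initial_focus, focus_index=0)
for word_number, change in zip((2, 3), changes):
    print(f"key word {word_number}: {change:+.2f} st relative to no focus")
```

Consistently negative values on the post-focus words would be in line with PFC, while values near zero would not. A real analysis would of course draw on many repetitions and speakers, on intensity and pitch range as well as mean F0, and on appropriate statistical tests.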

·      Perception experiments:

1.     Stimuli: Taken from speakers with maximum, minimum and median standard deviation of all measured F0 samples of the same speaker

2.     Number of stimuli should allow listening subjects to finish within 1 hour

3.     Subjects (≥10): All native speakers of the target language, 5 males and 5 females

4.     Instruction to listeners: Try to determine which word the speaker emphasized.

5.     Pre-test training only on procedure, not on correctness of focus identification
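
The stimulus selection criterion in item 1 above could be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a prescribed procedure. The function name, the speaker labels and the F0 samples are hypothetical, F0 is assumed to be in Hz, and with an even number of speakers the "median" speaker is taken to be the upper of the two middle ones, which is an arbitrary choice.

```python
import statistics


def pick_stimulus_speakers(f0_by_speaker):
    """Return the speakers whose F0 samples have the minimum, median
    and maximum standard deviation (hypothetical helper)."""
    sd = {spk: statistics.stdev(vals) for spk, vals in f0_by_speaker.items()}
    ranked = sorted(sd, key=sd.get)  # speakers ordered by increasing SD
    return {
        "minimum": ranked[0],
        "median": ranked[len(ranked) // 2],
        "maximum": ranked[-1],
    }


# Hypothetical F0 samples (Hz) pooled over all recordings of each speaker.
f0_samples = {
    "S1": [200, 210, 190, 205],
    "S2": [180, 240, 150, 220],
    "S3": [120, 125, 118, 122],
    "S4": [100, 140, 90, 130],
    "S5": [210, 230, 200, 220],
}

selected = pick_stimulus_speakers(f0_samples)
print(selected)
```

Because male and female speakers differ in overall F0 level, it may be preferable to compute the standard deviation on a logarithmic (semitone) scale. Whether that matters for a given data set is left open here.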



Taheri Ardali, M. and Xu, Y. (2012). Phonetic Realization of Prosodic Focus in Persian. In Proceedings of Speech Prosody 2012, Shanghai. 326-329.

Bodman, N. C. (1980). Proto-Chinese and Sino-Tibetan. In Contributions to Historical Linguistics. F. V. Coetsem and L. Waugh. Leiden: E. J. Brill  pp.

Bomhard, A. R. (2008). Reconstructing Proto-Nostratic: Comparative Phonology, Morphology, and Vocabulary. Leiden: Brill.

Chen, S.-w., Wang, B. and Xu, Y. (2009). Closely related languages, different ways of realizing focus. In Proceedings of Interspeech 2009, Brighton, UK.

Chen, Y., Guion-Anderson, S. and Xu, Y. (2012). Post-Focus Compression in Second Language Mandarin. In Proceedings of Speech Prosody 2012, Shanghai. 410-413.

Chu, J. Y., Huang, W., Kuang, S. Q., Wang, J. M., Xu, J. J., Chu, Z. T., Yang, Z. Q., Lin, K. Q., Li, P., Wu, M., Geng, Z. C., Tan, C. C., Du, R. F. and Jin, L. (1998). Genetic relationship of populations in China. Proceedings of the National Academy of Sciences 95(20): 11763-11768.

Cooper, W. E., Eady, S. J. and Mueller, P. R. (1985). Acoustical aspects of contrastive stress in question-answer contexts. Journal of the Acoustical Society of America 77: 2142-2156.

Gu, W. and Lee, T. (2007). Effects of tonal context and focus on Cantonese F0. In Proceedings of The 16th International Congress of Phonetic Sciences, Saarbrücken.

Haspelmath, M., Dryer, M. S., Gil, D. and Comrie, B. (2005). The World Atlas of Language Structures. Oxford: Oxford University Press.

Ipek, C. (2011). Phonetic Realization of Focus with no On-Focus Pitch Range Expansion in Turkish. In Proceedings of The 17th International Congress of Phonetic Sciences, Hong Kong: 140-143.

Lee, A. and Xu, Y. (2012). Revisiting focus prosody in Japanese. In Proceedings of Speech Prosody 2012, Shanghai. 274-277.

Lee, Y.-c. and Xu, Y. (2010). Phonetic Realization of Contrastive Focus in Korean. In Proceedings of Speech Prosody 2010, Chicago: 100033:1-4.

Pell, M. D. (2001). Influence of emotion and focus on prosody in matched statements and questions. Journal of the Acoustical Society of America 109: 1668-1680.

Pan, H. (2007). Focus and Taiwanese unchecked tones. In Topic and Focus: Cross-linguistic Perspectives on Meaning and Intonation. C. Lee, M. Gordon and D. Büring. Springer, pp. 195-213.

Pedersen, H. (1931). The Discovery of Language: Linguistic Science in the Nineteenth Century. English translation by John Webster Spargo. Bloomington, IN: Indiana University Press.

Rialland, A. (2009). The African lax question prosody: Its realisation and geographical distribution. Lingua 119(6): 928-949.

Wang, B., Wang, L. and Kadir, T. (2011). Prosodic encoding of focus in six languages in China. In Proceedings of The 17th International Congress of Phonetic Sciences, Hong Kong.

Wang, L., Wang, B. and Xu, Y. (2012). Prosodic encoding and perception of focus in Tibetan (Anduo Dialect). In Proceedings of Speech Prosody 2012, Shanghai. 286-289.

Wu, W. L. and Chung, L. (2011). Post-focus compression in English-Cantonese bilingual speakers. In Proceedings of The 17th International Congress of Phonetic Sciences, Hong Kong.

Wu, W. L. and Xu, Y. (2010). Prosodic Focus in Hong Kong Cantonese without Post-focus Compression. In Proceedings of Speech Prosody 2010, Chicago.

Xu, Y. (1999). "Effects of tone and focus on the formation and alignment of F0 contours," Journal of Phonetics 27, 55-105.

Xu, Y. (2005). "Speech melody as articulatorily implemented communicative functions," Speech Communication 46, 220-251.

Xu, Y. (2011). Post-focus compression: Cross-linguistic distribution and historical origin. In Proceedings of The 17th International Congress of Phonetic Sciences, Hong Kong.

Xu, Y., Chen, S.-w. and Wang, B. (2012). "Prosodic focus with and without post-focus compression (PFC): A typological divide within the same language family?," The Linguistic Review 29, 131-147.

Xu, Y., and Xu, C. X. (2005). "Phonetic realization of focus in English declarative intonation," Journal of Phonetics 33, 159-197.

Xu, Y., Xu, C. X. and Sun, X. (2004). On the Temporal Domain of Focus. In Proceedings of International Conference on Speech Prosody 2004, Nara, Japan: 81-84.

Zerbian, S., Genzel, S. and Kügler, F. (2010). Experimental work on prosodically-marked information structure in selected African languages (Afroasiatic and Niger-Congo). In Proceedings of Speech Prosody 2010, Chicago: 100976:1-4.


Language families especially in need of exploration:

·      Eskimo-Aleut

·      Dravidian

·      Kartvelian

·      Hebrew

·      Amerindian

Languages already examined or under examination:


1.     Beijing Mandarin (Chinese, Sino-Tibetan) [+PFC]

2.     Nanchang (Gan, Chinese, Sino-Tibetan) [+PFC]

3.     Uyghur (Turkic, Altaic) [+PFC]

4.     Turkish (Turkic, Altaic) [+PFC]

5.     Persian (Indo-Iranian, Indo-European) [+PFC]

6.     Korean (Altaic) [+PFC]

7.     Japanese (Altaic) [+PFC]

8.     Tibetan (Tibeto-Burman) [+PFC]

9.     Taiwan Mandarin (Chinese, Sino-Tibetan) [–PFC]

10.  Southern Min (Chinese, Sino-Tibetan) [–PFC]

11.  Cantonese (Chinese, Sino-Tibetan) [–PFC]

12.  Yi (Nuosu, Tibeto-Burman) [–PFC]

13.  Wa (Mon-Khmer) [–PFC]

14.  Deang (Mon-Khmer) [–PFC]

15.  Hijazi Arabic (Semitic) [–PFC]


Last update: 16 June, 2014
