Components of intonation: what are linguistic, what are mechanical/physiological?

Yi Xu and Q. Emily Wang

Presented at the International Conference on Voice Physiology and Biomechanics, May 29 - June 1, 1997, Evanston, Illinois.

Despite decades of investigation, the nature of intonation continues to elude us today. Some aspects of intonation seem to be universal, as they have been reported in many languages, yet their underlying mechanisms remain unclear. Other aspects, meanwhile, vary so much from language to language that ad hoc rules are still commonly used to account for the variations. In the present paper, we argue that many of the surface variations of intonation can be accounted for by the interplay between two independent forces: linguistic demands and mechanical/physiological constraints. Linguistic demands specify various pitch targets to be achieved during speech production. These pitch targets all correspond to linguistic meaning of some sort, specified by different languages. They are produced intentionally, and hence are probably implemented with specific neural commands. These pitch targets include, among other things, lexical tones in tone languages, stress-related pitch targets in non-tone languages, and melody-related pitch targets, such as final lowering and final rising. The mechanical/physiological constraints are conditioned by physical laws and physiological properties of the articulators, such as speed of muscle contraction, mass of the articulators, and mechanical characteristics of the joints. F0 variations due to mechanical/physiological constraints carry no linguistic meaning. They are not intentionally produced, and hence are probably not implemented with any specific neural commands. Observable F0 variations due to these factors include phenomena such as tone spreading, downstep, and declination.

We demonstrate in this study that it is possible to experimentally identify intonation variations attributable to either linguistic demands or mechanical/physiological constraints. In an experiment in which both lexical tone and focus position in Mandarin were controlled, several independent observations were made. First, we found in Mandarin clear instances of tone spreading similar to those found in African tone languages: a rising tone under focus spreads to the beginning of the following syllable. This spreading could not be due to a simple assimilation effect, because similar spreading was not observed when the tone under focus was a high-level tone. Rather, it is more likely due to an overshoot of the pitch-raising movement, which is inherent in the rising tone but lacking in the high-level tone. When the rising tone is under focus, the rising movement is exaggerated, and, as a result, the highest peak occurs at the beginning of the next syllable. We believe this effect to be physiological because it is due to the inertia of an articulatory movement. Second, we found that the effect of downstep (i.e., lowering of a high pitch after a low pitch) was gradient rather than categorical, indicating that it was probably mechanical rather than linguistic. That is, a high pitch is lowered when following a low pitch because the pitch-controlling mechanism does not fully recover from the production of the low pitch. Third, in addition to the well-known effect of pitch-range increase on a word under focus, we observed a dramatic post-focus lowering effect. However, we observed no extensive pre-focus lowering. Moreover, when the focus was on the last syllable of the utterance, the expected pitch raising on that syllable was virtually absent. There seems to be a restriction on how much pitch can be lowered on pre-focus syllables and how much it can be raised on the last syllable of an utterance. This indicates that the restriction is probably linguistic rather than mechanical. Finally, after taking away the effects of downstep and post-focus lowering, we found very little downtrend remaining in the F0 contour that is attributable to a background declination. We thus conclude that what is commonly described as declination is probably largely due to a combination of downstep and post-focus lowering. Of the two effects, the former is probably mechanical, while the latter is probably linguistic.
