Combining Time Averaging and Ensemble Averaging in Analyzing Voiceless Fricatives in Mandarin

Journal of the Acoustical Society of America, 96, Pt 2, p.3230. 1994

Yi Xu and Lorin Wilde

Research Laboratory of Electronics
Massachusetts Institute of Technology

The random fluctuations and spurious peaks typically seen in fricative spectra can be reduced by time averaging, i.e., averaging spectra obtained with overlapping time windows over an interval of the frication noise. Furthermore, token-to-token as well as individual speaker variations in fricatives can be reduced by ensemble averaging, i.e., averaging over noise spectra of multiple tokens in the same relative time interval. However, for studying coarticulatory variation in the frication noise, neither of these two methods alone is adequate: time averaging does not handle token-to-token and individual speaker variations, and ensemble averaging requires a large number of tokens to produce smooth and consistent spectra. In the present study, time averaging and ensemble averaging were combined in the analysis of coarticulatory variation of fricatives in Mandarin. The size of the time-averaging interval was 20 ms, and the size of the individual FFT windows was 8 ms. The time-averaged spectra were further ensemble-averaged over ten repetitions of the same sentence by the same speaker. Results indicated that the spectra thus obtained were smooth, and that they revealed spectral changes over time more clearly than those obtained by either time averaging or ensemble averaging alone. Further ensemble averaging across different speakers was also explored and produced encouraging data. [Work supported by NIH.]
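
The two-stage averaging described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 20 ms averaging interval and 8 ms FFT window come from the abstract; the Hamming window, the 1 ms hop between overlapping windows, the 512-point FFT, averaging in the power domain, and all function and parameter names are assumptions made here for illustration only.

```python
import numpy as np


def time_averaged_spectrum(x, fs, start_s, interval_ms=20.0, win_ms=8.0,
                           hop_ms=1.0, nfft=512):
    """Average power spectra of overlapping windows inside one interval.

    interval_ms and win_ms follow the abstract (20 ms and 8 ms).
    hop_ms, nfft, and the Hamming window are illustrative assumptions.
    """
    win_len = int(round(win_ms * fs / 1000.0))
    hop = max(1, int(round(hop_ms * fs / 1000.0)))
    span = int(round(interval_ms * fs / 1000.0))
    first = int(round(start_s * fs))
    window = np.hamming(win_len)

    spectra = []
    # Slide the short window across the interval, keeping it fully inside.
    for begin in range(first, first + span - win_len + 1, hop):
        frame = x[begin:begin + win_len]
        if len(frame) < win_len:
            break  # ran off the end of the signal
        spectra.append(np.abs(np.fft.rfft(frame * window, n=nfft)) ** 2)

    if not spectra:
        raise ValueError("interval does not contain a complete window")
    return np.mean(spectra, axis=0)


def ensemble_averaged_spectrum(tokens, fs, starts_s, **kwargs):
    """Average the time-averaged spectra of several tokens.

    starts_s[i] marks the same relative time interval in tokens[i];
    how that alignment is found is outside the scope of this sketch.
    """
    per_token = [time_averaged_spectrum(x, fs, s, **kwargs)
                 for x, s in zip(tokens, starts_s)]
    return np.mean(per_token, axis=0)


if __name__ == "__main__":
    # Synthetic white noise stands in for ten repetitions of a fricative.
    rng = np.random.default_rng(0)
    fs = 16000
    tokens = [rng.standard_normal(1600) for _ in range(10)]
    starts = [0.02] * 10
    spectrum = ensemble_averaged_spectrum(tokens, fs, starts)
    print(spectrum.shape)
```

Repeating the call with successive values in `starts_s` would give a sequence of smoothed spectra across the frication noise, which is the kind of time course the abstract refers to when it mentions spectral changes over time.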
