Parameter estimation

This section is a step-by-step guide to estimating the parameters of communicative functions with PENTAtrainer2's Learn tool. See Xu and Prom-on (2014) for more detail about the model parameters and the parameter estimation process.

Estimate parameters of communicative functions

1. Open Praat and choose Praat -> PENTAtrainer2 -> Learn from the menu.

annotate_learn_1

2. Choose the working folder. Note that the folder shown below has already been annotated with PENTAtrainer2's Annotation tool, which generates the data files needed for parameter optimization.

learn_2

3. Enter the optimization parameters. Each parameter controls the behavior of the parameter estimation process, as described below. Click Start to begin.

learn_3

4. After you click Start, the optimization window appears with a progress bar showing how far the optimization has advanced.

learn_4

5. Once the optimization is complete, the window displays the error and correlation. Both are measured by comparing the synthetic F0 contours with the original ones. Click Close to finish the optimization process.

learn_5

6. If you select Inspect Manipulation, a Praat Manipulation object is generated for each sound file, incorporating the synthetic F0 contour (green dots). By inspecting this object, you can view the contour and listen to the synthesized sound.

learn_6

7. The optimized parameter values are stored in the “parameters.txt” file. You can open the file with either a text editor or a spreadsheet program.

FIG3_9

8. Using a spreadsheet program (e.g., Excel), you can sort the parameters to make them easier to interpret. In this example, sort the parameters by “Focus”, then “Tone”, and then “Sentence”. The figure shows part of the sorted parameters. Several observations can be made from these data. The F (falling) tone has a negative slope, while the R (rising) tone has a positive slope. H has an almost static (near-zero) slope with a relatively high target height, while L has a much lower target height. The pitch targets of LS (sandhi L tone) are similar to those of R.

FIG3_10

9. The utterance-specific RMSE and correlation obtained with the optimized parameters are stored in the “accuracy_learning.txt” file.

FIG3_11

10. The closeness of fit between the original and synthesized F0 contours can also be visually inspected by opening the file [filename].synf0.

FIG3_13

11. The change in learning error over iterations is recorded in the “total_error.txt” file. Using a spreadsheet program, you can plot how the error changes from one iteration to the next.

FIG3_12
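
The error and correlation values mentioned in steps 5 and 9 both come from comparing a synthetic F0 contour with the original one. The following Python sketch shows what such a comparison can look like, assuming the textbook definitions of root-mean-square error (RMSE) and Pearson correlation over two contours sampled at the same time points. The function names and the sample values are hypothetical, and PENTAtrainer2's own computation may differ in details such as sampling or F0 units.

```python
import math


def rmse(original, synthetic):
    """Root-mean-square error between two equal-length F0 contours."""
    if len(original) != len(synthetic) or not original:
        raise ValueError("contours must be non-empty and of equal length")
    squared = [(o - s) ** 2 for o, s in zip(original, synthetic)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(original, synthetic):
    """Pearson correlation between two equal-length F0 contours."""
    if len(original) != len(synthetic) or len(original) < 2:
        raise ValueError("contours must have equal length of at least 2")
    n = len(original)
    mean_o = sum(original) / n
    mean_s = sum(synthetic) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(original, synthetic))
    var_o = sum((o - mean_o) ** 2 for o in original)
    var_s = sum((s - mean_s) ** 2 for s in synthetic)
    if var_o == 0 or var_s == 0:
        raise ValueError("correlation is undefined for a flat contour")
    return cov / math.sqrt(var_o * var_s)


# Hypothetical F0 samples (same unit, same time points), for illustration only.
original_f0 = [100.0, 102.0, 104.0, 103.0, 99.0]
synthetic_f0 = [101.0, 102.5, 103.0, 102.0, 100.0]

print("RMSE:", rmse(original_f0, synthetic_f0))
print("Correlation:", pearson(original_f0, synthetic_f0))
```

A lower RMSE means the synthetic contour lies closer to the original in absolute terms, while a correlation near 1 means the two contours rise and fall together even if they are offset from each other. Reading the two measures side by side therefore says more about the fit than either one alone.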
