This post is prompted by a query from Nobuo Yuzawa about an utterance in my program Plato. First, I see I hadn’t posted a link here to that program, so I’ve put that right. The query ended up in the comments page for EP Tips and started an exchange of comments between Nobuo and Emilio Márquez. I’m going to remove the originals from that page and reproduce them here:
Nobuo Yuzawa says:
May 19, 2014 at 1:45 am
I have a question about the tonicity of Plato Set1 No10 “I can’t imagine what they want”. The intended answer is “what”, which means that its tone is a rise-fall – there is a rise in pitch in ‘they’. Is there any useful perceptual clue to distinguish this case from the case in which ‘they’ is the nucleus and is spoken with a fall?
Emilio Márquez says:
May 19, 2014 at 4:57 am
Nobuo, I think I can hear a mid-level strong “what” and then a high weak short fall during “they”. I suppose that if the nucleus was to fall on “they”, this word would be perceived to be stronger than “what”, irrespective of pitch.
Nobuo Yuzawa says:
May 19, 2014 at 3:55 pm
Emilio, thank you for your comment. Acoustically, what you perceive as stronger in “what” than “they” may be related to duration. There is a noticeable difference in duration between the two words, but almost no change is detected in intensity between them, and “they” is higher-pitched than “what”. As a Japanese speaker, I may be simply more sensitive to changes in pitch. Your perception of more strength on “what” may also have something to do with an overall stress pattern of this sentence, “what” being a content word and “they” being a function word. This may mean that if “they” were highlighted as spoken with a fall, it might require more change in prosodic elements, such as much higher pitch, more intensity, and/or more duration. It seems that a simple comparison of prosodic elements does not always explain satisfactorily perceptual differences of speech sounds by people, especially by native speakers. Would this explanation somehow make sense?
Emilio Márquez says:
May 19, 2014 at 5:19 pm
Nobuo, You’re absolutely right. I’m afraid I mixed up perception with articulation when I referred to strong “what” vs weak “they”.
I think Nobuo and Emilio have outlined part of the answer. I’ll add my thoughts. The first diagram below shows the broadband spectrogram and fundamental frequency contour of an utterance of the word No produced by me with a falling tone. Notice that there is a brief precursive rise in F0 and the peak coincides more or less with the beginning of the vowel.
Diagram 1
The second diagram shows a similar display for the same word said with a rise-fall tone. This time the peak is a lot later, approximately 225ms after the onset of the vowel.
Diagram 2
Diagram 3 shows the utterance No, you fool said with a falling tone on No. The F0 contour is similar to that of Diagram 1, with the F0 peak aligned with the vowel onset.
Diagram 3
Diagram 4 shows the same utterance, but this time with a rise-fall tone on the word No. Again there is a delay in the F0 peak, but this time it is in the syllable following the nucleus, and as you can see quite late in that syllable.
Diagram 4
The final diagram shows the utterance No, you do it, with a falling tone on the word you. Notice that the F0 peak is soon after the onset of the vowel, which is marked with the blue vertical line.
Diagram 5
So there is a distinct difference in alignment between a falling tone and the falling part of a rise-fall tone: in a fall the F0 peak comes at or soon after the onset of the nuclear vowel, whereas in a rise-fall it is delayed, sometimes into the following syllable. This difference is a further cue to the distinction between a fall and a rise-fall, and it can be added to the “strength” difference that Nobuo and Emilio talked of.
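For anyone who wants to try this on their own recordings, the alignment measure can be sketched in a few lines of Python. This is only an illustration, not the procedure used to make the diagrams: the function names, the made-up F0 values and above all the 100 ms cut-off are my assumptions. The only figures taken from the diagrams are a peak at about the vowel onset for the fall and about 225ms after it for the rise-fall. The delay is measured from the onset of the nuclear vowel, so a peak that lands in the following syllable (as in Diagram 4) simply gives a larger value.

```python
from typing import List, Optional

# Hypothetical cut-off between "early" and "delayed" peaks, in ms.
# It is an illustrative value only, not an established threshold.
DELAY_THRESHOLD_MS = 100.0


def peak_delay_ms(times_ms: List[float],
                  f0_hz: List[Optional[float]],
                  vowel_onset_ms: float) -> float:
    """Time of the F0 peak minus the nuclear vowel onset, in ms.

    Unvoiced frames are given as None and are ignored.
    """
    voiced = [(t, f) for t, f in zip(times_ms, f0_hz) if f is not None]
    if not voiced:
        raise ValueError("no voiced frames in the contour")
    # If several frames share the maximum, the earliest one is taken.
    peak_time, _ = max(voiced, key=lambda frame: frame[1])
    return peak_time - vowel_onset_ms


def label_tone(delay_ms: float,
               threshold_ms: float = DELAY_THRESHOLD_MS) -> str:
    """Label a contour that ends in a fall by how late its peak is."""
    return "rise-fall" if delay_ms >= threshold_ms else "fall"


# Made-up contours sampled every 25 ms; the vowel starts at 50 ms in both.
VOWEL_ONSET_MS = 50.0

FALL_TIMES = list(range(0, 200, 25))
FALL_F0 = [None, 120, 180, 170, 150, 130, 110, 100]

RISE_FALL_TIMES = list(range(0, 425, 25))
RISE_FALL_F0 = [None, 110, 115, 120, 128, 136, 145, 155, 165,
                175, 185, 195, 180, 160, 140, 120, 105]

for name, times, f0 in [("fall", FALL_TIMES, FALL_F0),
                        ("rise-fall", RISE_FALL_TIMES, RISE_FALL_F0)]:
    delay = peak_delay_ms(times, f0, VOWEL_ONSET_MS)
    print(f"{name}: peak {delay:.0f} ms after vowel onset -> {label_tone(delay)}")
```

In practice the F0 values and the vowel onset would come from a pitch tracker and a hand-placed boundary, and a single fixed cut-off is unlikely to suit every speaker or speaking rate.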
This idea of a delay in peak alignment has been taken up by quite a few intonation researchers. One account you might like to look at is by Gussenhoven C. (1984) On the grammar and semantics of sentence accents. Dordrecht: Foris. He proposes a number of what he calls ‘modifications’ which can be applied to an accented syllable, usually the nucleus. One of these is [±delay]. The positive value of this modification can be applied to tones of all kinds, not only falls, and has the semantic effect of making the information so marked “especially noteworthy”, “not routine”.