Audio-visual influences on speech perception: a comparison of sung and spoken conditions

Lena Quinto, William Thompson, Frank A. Russo

Research output: Contribution to conference, Abstract, Research, peer-review


Visual cues inform speech perception and may also influence the perception of lyrics in song. The importance of visual cues to speech perception is seen in the McGurk effect, in which pairing incongruent multimodal stimuli can produce intermediate syllables, e.g., visual ga with auditory ba leads to perceived da. Music and language share common links such as prosodic and rhythmic cues, but speech is specialized for verbal communication whereas music is not. Thus, we were unsure whether a McGurk effect would occur for sung materials. Twenty-nine participants heard sequences of syllables (la-la-la-ba, la-la-la-ga) that were spoken or sung to a steady beat. Sung stimuli were ascending triads that returned to the original tonic or a semitone above the tonic. Incongruent stimuli were created by mixing an auditory ba with a visual ga. Signal-to-noise ratio (SNR) was manipulated to assess a possible trade-off between auditory and visual signals. The signal level was held at 60 dB SPL while SNR was varied across conditions: 60 dB (easy), 0 dB (moderate), and -12 dB (difficult). Participants identified the final syllable they heard by choosing from the following syllables: ba, da, ga, la, tha and va. Effects of SNR indicated that syllable judgments relied more heavily on visual cues as SNR decreased. When data were analyzed for spoken and sung stimuli separately, a congruency (McGurk) effect was observed for both domains. This was qualified by a domain × congruency interaction that indicated differences in the nature of the McGurk effect for spoken and sung domains. In particular, the influence of visual cues was greater when syllables were sung than when they were spoken, especially if they ended on the (unexpected) raised tonic. The findings confirm that cross-modal integration of syllables occurs for sung materials, and that visual cues may be especially important for deciphering lyrics.
Original language: English
Publication status: Published - Dec 2007
Externally published: Yes
Event: International Conference on Music Communication Science - University of New South Wales, Sydney, Australia
Duration: 3 Dec 2007 to 7 Dec 2007


Conference: International Conference on Music Communication Science
Abbreviated title: ICoMCS

