eCommons@Cornell

Please use this identifier to cite or link to this item: http://hdl.handle.net/1813/13079
Title: Web Harvest of Minimal Intonational Pairs
Authors: Howell, Jonathan; Rooth, Mats
Keywords: intonation; focus; web as corpus; machine learning; prosody; comparatives; speech recognition; linguistics
Issue Date: 2-Jul-2009
Abstract: This paper describes experiments on gathering spoken-language data on the web that bears on issues of the phonetics-phonology and semantics-pragmatics of intonation. The target data are tokens of fixed word strings like "than I did", where intonation varies in a way that correlates with grammatical and pragmatic context. In a web harvest procedure, audio files were identified using a search engine based on speech-to-text, downloaded, and cut to a relevant segment under program control. In an application of such a database, an SVM classifier was trained to make a grammatically determined distinction in intonation based on purely acoustic cues. Sources of error in the retrieval are quantified.
Description: Preliminary version of paper to be presented at Web as Corpus 5, September 2009. Final version will be substituted on July 17, 2009.
URI: http://hdl.handle.net/1813/13079
Appears in Collections: Linguistics - Monographs, Papers and Research

Files in This Item:

File: HowellRooth2009WebHarvest.pdf
Size: 188.92 kB
Format: Adobe PDF

Items in eCommons are protected by copyright, with all rights reserved, unless otherwise indicated.

 
