Please use this identifier to cite or link to this item:
http://hdl.handle.net/1813/13079
| Title: | Web Harvest of Minimal Intonational Pairs |
| Authors: | Howell, Jonathan; Rooth, Mats |
| Keywords: | intonation; focus; web as corpus; machine learning; prosody; comparatives; speech recognition; linguistics |
| Issue Date: | 2-Jul-2009 |
| Abstract: | This paper describes experiments on gathering spoken-language data from the web that bear on issues in the phonetics-phonology and semantics-pragmatics of intonation. The target data are tokens of fixed word strings such as "than I did", whose intonation varies in a way that correlates with grammatical and pragmatic context. In a web harvest procedure, audio files were identified using a search engine based on speech-to-text, downloaded, and cut to the relevant segment under program control. As an application of such a database, an SVM classifier was trained to make a grammatically determined distinction in intonation based on purely acoustic cues. Sources of error in the retrieval are quantified. |
| Description: | Preliminary version of paper to be presented at Web as Corpus 5, September 2009. Final version will be substituted on July 17, 2009. |
| URI: | http://hdl.handle.net/1813/13079 |
| Appears in Collections: | Linguistics - Monographs, Papers and Research |
Items in eCommons are protected by copyright, with all rights reserved, unless otherwise indicated.