

eCommons@Cornell


Please use this identifier to cite or link to this item: http://hdl.handle.net/1813/6048
Title: A Theory of Term Importance in Automatic Text Analysis
Authors: Salton, Gerard
Yang, C. S.
Yu, C. T.
Keywords: computer science
technical report
Issue Date: Jul-1974
Publisher: Cornell University
Citation: http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR74-208
Abstract: Most existing automatic content analysis and indexing techniques are based on word frequency characteristics applied largely in an ad hoc manner. Contradictory requirements arise in this connection, in that terms exhibiting high occurrence frequencies in individual documents are often useful for high recall performance (to retrieve many relevant items), whereas terms with low frequency in the whole collection are useful for high precision (to reject nonrelevant items).
URI: http://hdl.handle.net/1813/6048
Appears in Collections: Computer Science Technical Reports

Files in This Item:

File         Description   Size        Format
74-208.pdf                 1.39 MB     Adobe PDF
74-208.ps                  801.51 kB   Postscript


Items in eCommons are protected by copyright, with all rights reserved, unless otherwise indicated.

 

© 2014 Cornell University Library