eCommons

 

Automatic Hypertext Construction

Abstract

The unprecedented growth of the World Wide Web illustrates the importance of hypertext as a method for organizing the rapidly expanding amount of on-line text. As document collections become larger and more dynamic, however, it is not feasible to construct more than an occasional hypertext manually. This thesis presents entirely automatic methods for gathering documents for a hypertext, linking them, and annotating those connections with a description of the type or nature of the link. Chapter 2 addresses the problem of automatically collecting related documents, applying robust Information Retrieval methods to form high-quality links between documents. A local context check identifies links where ambiguous vocabulary erroneously suggests a relationship. Dynamic part retrieval selects the portions of documents that are most related, allowing parts to be linked when it is more appropriate to link subtopics than entire documents. Chapter 3 presents a taxonomy of hypertext link types and defines three classes of links: "pattern-matching" links can be found using simple string-matching methods; "manual" links require substantial application of natural language understanding methods (which are currently beyond the state of the art); and "automatic" links are those that can be found using the methods of this thesis. Chapter 4 begins the work of automatic link typing by describing two novel graphical techniques for visualizing the relationship between two or more documents. "Uniform" visuals display the relationship between documents or document parts without regard to their relative sizes, whereas "varying" visuals include information about sizes and locations. Both techniques highlight relationships between documents and motivate the automatic methods of Chapter 5, which identify the relationships depicted in the visualizations. Using an approach based upon graph simplification, these methods automatically identify revision, summary, expansion, equivalence, comparison, contrast, tangential, and aggregate links. Chapter 6 discusses an informal evaluation of the link typing. Though somewhat inconclusive, the evaluation indicates that automatic document linking performs well, but also that much work remains to be done toward understanding automatic link typing.
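The abstract does not say which retrieval model the thesis uses to form links, so the following is only an illustrative sketch of one common possibility: representing each document as a TF-IDF term vector and proposing a link wherever the cosine similarity of two vectors reaches a threshold. The function names, the smoothed IDF formula, and the threshold value of 0.2 are all assumptions made for this example and are not taken from the report.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict of term -> weight) per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts for term in c)
    # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
    return [
        {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in c.items()}
        for c in counts
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def propose_links(docs, threshold=0.2):
    """Return (i, j, score) for each document pair at or above the threshold.

    The threshold is a hypothetical value chosen for illustration.
    """
    vectors = tfidf_vectors(docs)
    links = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            score = cosine(vectors[i], vectors[j])
            if score >= threshold:
                links.append((i, j, score))
    return links


if __name__ == "__main__":
    sample = [
        "hypertext links connect documents",
        "automatic hypertext links between documents",
        "bread recipe with flour and yeast",
    ]
    for i, j, score in propose_links(sample):
        print(f"link {i} <-> {j} (similarity {score:.2f})")
```

A whole-document score like this can be fooled by ambiguous vocabulary, which is the problem the abstract's local context check is meant to catch; that check and the dynamic part retrieval step are not modeled here.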

Date Issued

1995-02

Publisher

Cornell University

Keywords

computer science; technical report

Previously Published As

http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR95-1484

Types

technical report
