Semantics Search Engine
Abstract
We extract minimal information about a category's content based on website design and semantics. We decompose sentences, parse them against thesauri, dictionaries and encyclopedias, and match them to typical sentences, building a set of logic clauses that matches a particular content. We start from man-made directories, parsing their websites and categories at two sub-levels. We have tested different architectures for thread-based web bots, some of them extremely powerful and allowing control over the number of threads running in the background. We have built rule-based parsers for sentences and for specific thesaurus, encyclopedia and dictionary contents. We have not been able to complete the replacement of these sets of logic clauses by neural networks made of incremental knowledge mechanisms, nor the extraction of a website's design in order to categorize it. We aim at providing business-oriented information, where precise reports of the type of content or ad content, and of where it appears, would allow competitors to re-evaluate their objectives. Moreover, a parallel "ping" of Yahoo and Google queries would return the best entries and advertising links, and a similar approach could be used as such.

First, the main goal is to extract the minimal information set about a particular web category, based on the entire body of man-made websites, thesauri, dictionaries and encyclopedias available on the web. Ultimately, the algorithm would return search results based not on the content but on its meaning, breaking the barriers between domestic and foreign languages. We have not extracted the minimal informational content of a website from its HTML design, its image and video contents, its layout or specific keywords, only from its semantics.
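
The thread-based web bots mentioned above, with a controllable number of background threads, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the architecture used in this work: the names `crawl`, `fetch` and `max_threads` are hypothetical, and the page-fetching function is passed in by the caller so that the sketch does not depend on any particular HTTP library.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl(urls, fetch, max_threads=4):
    """Fetch every URL using at most `max_threads` worker threads.

    `fetch` is a caller-supplied function (hypothetical here) that takes
    a URL string and returns the page content.
    """
    urls = list(urls)
    # The executor caps how many fetches run at the same time.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map returns results in the same order as the input URLs.
        pages = list(pool.map(fetch, urls))
    return dict(zip(urls, pages))
```

Raising or lowering `max_threads` is the only knob in this sketch; a real bot would also need politeness delays, error handling and de-duplication of visited links.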