A hybrid approach to statistical and semantical analysis of web documents
Merabti, M. (Ed.). Proceedings of 5th European Conference on Internet and Multimedia Systems and Applications: July 13-15, 2009, Cambridge, UK. Anaheim, Calif. et al.: Acta Press 2009, pp. 115-120
Year of publication: 2009
ISBN/ISSN: 978-0-88986-801-4
Publication type: Book chapter (conference paper)
Language: English
Abstract
This paper describes a new approach to improving the analysis and categorization of web documents, combining statistical methods for template-based clustering with semantic analysis based on terminological ontologies. A domain-specific environment serves as a proof of concept. To demonstrate the broad practical benefit of our approach, we outline a combined mathematical and semantic framework for information retrieval on Internet resources.
Classification
DFG subject area:
Computer Science
DDC subject group:
Computer Science