SciPort RLP


Combinations of Content Extraction Algorithms

LWA'09: Workshop Information Retrieval. Darmstadt. 2009

Year of publication: 2009

Publication type: Miscellaneous (conference contribution)

Language: English

Abstract

Content extraction is the task of identifying the main text content in web documents, a topic of interest in the fields of information retrieval, web mining, and content analysis. We implemented an application framework that combines different algorithms in order to improve overall extraction performance. In this paper we present details of the framework and provide some first experimental results.

Authors

Weißig, Yves (author)

Gottron, Thomas (author)

Classification

DFG subject area:
Computer science

DDC subject group:
Computer science

Linked persons

Thomas Gottron
Research database administrator
(Department 4: Computer Science)

Participating institutions