J 2007

Text Segmentation Using Context Overlap

ŘEHŮŘEK, Radim

Basic information

Original name

Text Segmentation Using Context Overlap

Name in Czech

Segmentace textu s použitím překryvu kontextů

Authors

ŘEHŮŘEK, Radim (203 Czech Republic, guarantor)

Edition

Progress in Artificial Intelligence, Guimarães, Portugal, Springer Berlin / Heidelberg, 2007, 0302-9743

Other information

Language

English

Type of outcome

Article in a journal

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Portugal

Confidentiality degree

is not subject to a state or trade secret

RIV identification code

RIV/00216224:14330/07:00023050

Organization unit

Faculty of Informatics

UT WoS

000252074800054

Keywords in English

text segmentation; LSI; latent semantic indexing

Tags

latent semantic indexing, LSI, text segmentation

Tags

International impact, Reviewed
Changed: 29/3/2010 18:51, RNDr. Radim Řehůřek, Ph.D.

Abstract

In the original

In this paper we propose features desirable in linear text segmentation algorithms for the Information Retrieval domain, with emphasis on improving high-similarity search over heterogeneous texts. We then describe a robust, purely statistical method, based on exploiting context overlap, that exhibits these features. Experimental results are presented, along with a comparison to other existing algorithms.

Links

LC536, research and development project
Name: Centrum komputační lingvistiky
Investor: Ministry of Education, Youth and Sports of the CR, Centrum komputační lingvistiky