ŘEHŮŘEK, Radim. Text Segmentation Using Context Overlap. Progress in Artificial Intelligence. Guimarães, Portugal: Springer Berlin / Heidelberg, 2007, vol. 2007, No 4874, p. 647-658, 11 pp. ISSN 0302-9743.
Basic information
Original name Text Segmentation Using Context Overlap
Name in Czech Segmentace textu s použitím překryvu kontextů
Authors ŘEHŮŘEK, Radim (203 Czech Republic, guarantor).
Edition Progress in Artificial Intelligence, Guimarães, Portugal, Springer Berlin / Heidelberg, 2007, 0302-9743.
Other information
Original language English
Type of outcome Article in a journal
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Portugal
Confidentiality degree is not subject to a state or trade secret
RIV identification code RIV/00216224:14330/07:00023050
Organization unit Faculty of Informatics
UT WoS 000252074800054
Keywords in English text segmentation; LSI; latent semantic indexing
Tags latent semantic indexing, LSI, text segmentation
Tags International impact, Reviewed
Changed by RNDr. Radim Řehůřek, Ph.D., učo 39672. Changed: 29/3/2010 18:51.
Abstract
In this paper we propose features desirable in linear text segmentation algorithms for the Information Retrieval domain, with emphasis on improving high-similarity search of heterogeneous texts. We proceed to describe a robust, purely statistical method, based on context overlap exploitation, that exhibits these desired features. Experimental results are presented, along with a comparison to other existing algorithms.
Links
LC536, research and development project
Name: Centrum komputační lingvistiky
Investor: Ministry of Education, Youth and Sports of the CR, Centrum komputační lingvistiky