ŽIŽKA, Jan, Michal ŠRÉDL and Aleš BOUREK. Searching for Significant Word Associations in Text Documents Using Genetic Algorithms. In Computational Linguistics and Intelligent Text Processing. Berlin Heidelberg New York: Springer Verlag, 2003, p. 584-587. ISBN 3-540-00532-3.
Basic information
Original name Searching for Significant Word Associations in Text Documents Using Genetic Algorithms
Authors ŽIŽKA, Jan (203 Czech Republic, guarantor), Michal ŠRÉDL (203 Czech Republic) and Aleš BOUREK (203 Czech Republic).
Edition Berlin Heidelberg New York, Computational Linguistics and Intelligent Text Processing, p. 584-587, 4 pp. 2003.
Publisher Springer Verlag
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Mexico
Confidentiality degree is not subject to a state or trade secret
RIV identification code RIV/00216224:14330/03:00009148
Organization unit Faculty of Informatics
ISBN 3-540-00532-3
UT WoS 000182492300064
Keywords in English machine learning; text document processing; genetic algorithms; naive Bayes method
Tags Genetic algorithms, machine learning, naive Bayes method, text document processing
Changed by doc. Ing. Jan Žižka, CSc., učo 2431. Changed: 8/9/2004 16:37.
Abstract
The paper describes experiments that used genetic algorithms to search for significant word associations (phrases) in unstructured text documents obtained from the Internet in a specialized branch of medicine. Genetic algorithms can evolve sets of word associations with assigned significance weights from the document categorization point of view (relevant and irrelevant documents). The resulting categorization is about as reliable as naive Bayes classification based on individual words. In addition, the genetic algorithms provided phrases consisting of one, two, and three words. The phrases were quite meaningful from the human point of view.
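The idea summarized in the abstract can be illustrated with a small sketch. This record does not give the paper's actual encoding, fitness function, or genetic operators, so everything below is an assumption made for illustration: the toy corpus, the representation of an individual as a mapping from one- to three-word phrases to weights in [-1, 1], the use of training accuracy as fitness, and the helper names (`ngrams`, `random_individual`, `classify`, `fitness`, `crossover`, `mutate`, `evolve`).

```python
import random

# Toy labelled corpus, invented for illustration: 1 = relevant, 0 = irrelevant.
DOCS = [
    ("chronic back pain treatment with physical therapy", 1),
    ("patients reported back pain relief after therapy", 1),
    ("physical therapy reduces chronic pain in patients", 1),
    ("stock market prices fell after the report", 0),
    ("the football team reported a new coach", 0),
    ("market analysts expect prices to rise", 0),
]
SIZE = 8  # assumed number of weighted phrases per individual


def ngrams(text, max_n=3):
    """Return all contiguous phrases of 1..max_n words as tuples."""
    words = text.split()
    return {tuple(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)}


DOC_PHRASES = [(ngrams(text), label) for text, label in DOCS]
VOCAB = sorted(set().union(*(phrases for phrases, _ in DOC_PHRASES)))


def random_individual():
    """An individual maps SIZE distinct phrases to weights in [-1, 1]."""
    return {p: random.uniform(-1, 1) for p in random.sample(VOCAB, SIZE)}


def classify(individual, phrases):
    """Predict 'relevant' when the summed weight of matching phrases is positive."""
    score = sum(w for p, w in individual.items() if p in phrases)
    return 1 if score > 0 else 0


def fitness(individual):
    """Fraction of training documents that the individual classifies correctly."""
    hits = sum(classify(individual, ph) == label for ph, label in DOC_PHRASES)
    return hits / len(DOC_PHRASES)


def crossover(a, b):
    """Child takes SIZE phrases drawn at random from the union of both parents."""
    pool = list({**a, **b}.items())  # for shared phrases, b's weight is kept
    random.shuffle(pool)
    return dict(pool[:SIZE])


def mutate(individual, rate=0.2):
    """Jitter some weights and occasionally swap one phrase for an unused one."""
    child = dict(individual)
    for p in list(child):
        if random.random() < rate:
            child[p] = max(-1.0, min(1.0, child[p] + random.gauss(0, 0.3)))
    if random.random() < rate:
        unused = [p for p in VOCAB if p not in child]
        del child[random.choice(list(child))]
        child[random.choice(unused)] = random.uniform(-1, 1)
    return child


def evolve(generations=30, pop_size=20, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    random.seed(seed)
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[:pop_size // 2]
        children = [mutate(crossover(*random.sample(elite, 2)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print("training accuracy:", fitness(best))
    for phrase, weight in sorted(best.items(), key=lambda kv: -abs(kv[1])):
        print(f"{weight:+.2f}  {' '.join(phrase)}")
```

Because the better half of each generation is carried over unchanged, the best training accuracy never decreases from one generation to the next. A realistic experiment would need a held-out test set and a baseline such as naive Bayes on individual words, which this sketch omits.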
Links
MSM 143300003, plan (intention)
Name: Human-computer interaction, dialog systems and assistive technologies (Interakce člověka s počítačem, dialogové systémy a asistivní technologie)
Investor: Ministry of Education, Youth and Sports of the CR