NAVRÁTIL, Jaromír and Lubomír POPELÍNSKÝ. Rapid prototyping of a web categorization tool. Online. In IDEAS '14 Proceedings of the 18th International Database Engineering & Applications Symposium. NY, USA: ACM New York, 2014, p. 294-297. ISBN 978-1-4503-2627-8. Available from: https://dx.doi.org/10.1145/2628194.2628216.
Basic information
Original name Rapid prototyping of a web categorization tool
Authors NAVRÁTIL, Jaromír (203 Czech Republic, guarantor, belonging to the institution) and Lubomír POPELÍNSKÝ (203 Czech Republic, belonging to the institution).
Edition NY, USA, IDEAS '14 Proceedings of the 18th International Database Engineering & Applications Symposium, p. 294-297, 4 pp. 2014.
Publisher ACM New York
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher United States of America
Confidentiality degree is not subject to a state or trade secret
Publication form electronic version available online
RIV identification code RIV/00216224:14330/14:00076180
Organization unit Faculty of Informatics
ISBN 978-1-4503-2627-8
Doi http://dx.doi.org/10.1145/2628194.2628216
UT WoS 000471152000036
Keywords in English web mining;categorization of web pages;machine learning;landmarking
Changed by RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 5/3/2018 20:31.
Abstract
This paper introduces a new method for fast prototyping of a web page categorization tool based on Random Forests. The contribution is threefold. First, we describe a fast feature extraction method. Second, we introduce a system that lets a user run experiments manually and visualize the results through a visual analytics module. Third, we address how to perform experiments efficiently; the approach is partially inspired by landmarking, which makes it possible to limit the number of experiments. The method has been used to build a new commercial system for web categorization that significantly outperforms the system previously in use.