Detailed Information on Publication Record
2014
Rapid prototyping of a web categorization tool
NAVRÁTIL, Jaromír and Lubomír POPELÍNSKÝ
Basic information
Original name
Rapid prototyping of a web categorization tool
Authors
NAVRÁTIL, Jaromír (203 Czech Republic, guarantor, belonging to the institution) and Lubomír POPELÍNSKÝ (203 Czech Republic, belonging to the institution)
Edition
NY, USA, IDEAS '14 Proceedings of the 18th International Database Engineering & Applications Symposium, p. 294-297, 4 pp. 2014
Publisher
ACM New York
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
United States of America
Confidentiality degree
is not subject to a state or trade secret
Publication form
electronic version available online
References:
RIV identification code
RIV/00216224:14330/14:00076180
Organization unit
Faculty of Informatics
ISBN
978-1-4503-2627-8
UT WoS
000471152000036
Keywords in English
web mining;categorization of web pages;machine learning;landmarking
Changed: 5/3/2018 20:31, RNDr. Pavel Šmerk, Ph.D.
Abstract
In the original
This paper introduces a new method for fast prototyping of a web page categorization tool based on Random Forests. The contribution of this work is threefold. First, we describe a fast feature extraction method. Second, we introduce a system that enables a user to perform experiments manually and visualize the results via a visual analytics module. Third, we address how to perform experiments efficiently; the approach is partially inspired by landmarking, which allows limiting the number of experiments. This method has been used for building a new commercial system for web categorization that significantly outperforms the system already in use.