Other formats:
BibTeX
LaTeX
RIS
@inproceedings{1206027,
  author       = {Suchomel, Šimon and Brandejs, Michal},
  title        = {Heterogeneous Queries for Synoptic and Phrasal Search},
  booktitle    = {CLEF2014 Working Notes},
  publisher    = {CEUR, Aachen University},
  address      = {Sheffield, UK},
  location     = {Sheffield, UK},
  year         = {2014},
  pages        = {1017--1020},
  keywords     = {suspicious document; plagiarism detection; search engine; source retrieval; stop word; text alignment; snippet similarity},
  howpublished = {electronic version, online},
  language     = {eng},
  url          = {http://ceur-ws.org/Vol-1180/}
}
TY  - JOUR
ID  - 1206027
AU  - Suchomel, Šimon
AU  - Brandejs, Michal
PY  - 2014
TI  - Heterogeneous Queries for Synoptic and Phrasal Search
PB  - CEUR, Aachen University
CY  - Sheffield, UK
KW  - suspicious document
KW  - plagiarism detection
KW  - search engine
KW  - source retrieval
KW  - stop word
KW  - text alignment
KW  - snippet similarity
UR  - http://ceur-ws.org/Vol-1180/
L2  - http://ceur-ws.org/Vol-1180/
N2  - This paper describes our approaches for the Plagiarism Detection – Source Retrieval task of PAN 2014. We combined and improved methodology used at PAN 2012 and PAN 2013. Our system combines three types of queries: The keywords-based queries; the paragraph-based queries; and the headers-based queries. The queries are distinguished also by other properties such as the phrase query or the positional query. The queries are submitted to two search engines – Chatnoir and Indri – according to their properties. The query’s position serves for the search control, minimization of the total number of executed queries is the system’s priority. Downloaded documents are textually compared with the suspicious document and if a similarity is found, the downloaded document is reported.
ER  -
SUCHOMEL, Šimon and Michal BRANDEJS. Heterogeneous Queries for Synoptic and Phrasal Search. Online. In \textit{CLEF2014 Working Notes}. Sheffield, UK: CEUR, Aachen University, 2014, pp.~1017--1020. ISSN~1613-0073.