NEVĚŘILOVÁ, Zuzana and Matej KVAŠŠAY. Understanding Search Queries in Natural Language. In Horák, Aleš and Rychlý, Pavel and Rambousek, Adam. Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2018. Brno: Tribun EU, 2018, p. 85-93. ISBN 978-80-263-1517-9.
Basic information
Original name Understanding Search Queries in Natural Language
Authors NEVĚŘILOVÁ, Zuzana (203 Czech Republic, guarantor, belonging to the institution) and Matej KVAŠŠAY (703 Slovakia).
Edition Brno, Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2018, p. 85-93, 9 pp. 2018.
Publisher Tribun EU
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Czech Republic
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
RIV identification code RIV/00216224:14330/18:00109726
Organization unit Faculty of Informatics
ISBN 978-80-263-1517-9
ISSN 2336-4289
UT WoS 000612420300011
Keywords (in Czech) search intent; search query parsing
Keywords in English search intent; search query parsing
Changed by: Mgr. Michal Petr, učo 65024. Changed: 16/5/2022 15:43.
Abstract
This work is part of a project aiming to provide a single search endpoint for all company data. We present a search query parser that takes a speech-to-text output, i.e. a sentence. The output is a structured representation of the search query from which a SPARQL query is generated. The SPARQL query is then run against an ontology containing the company data. The parsing procedure consists of two steps: first, the search intent is detected; second, the query is parsed based on the search intent. For the intent classification, we use word embeddings with boosting of the top 5 words, and support vector machines. For the parsing, we use semantic role labeling, named entity recognition, and external resources such as ConceptNet and DBPedia. The final parsing step is rule-based and related to the ontology structure. The intent classifier accuracy is 94%. In the subsequent manual evaluation, the resulting structures were complete and correct in 51% of cases; in 34.57% of cases they were complete and correct but also contained irrelevant information.
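The last stage described in the abstract, generating a SPARQL query from the structured representation, can be illustrated with a minimal sketch. This is not the authors' implementation: the `ParsedQuery` structure, the intent names, the `ex:` namespace, and the property names are all hypothetical, chosen only to show how a detected intent plus parsed constraints might map to a query over an ontology.

```python
from dataclasses import dataclass, field


# Hypothetical structured representation of a parsed search query.
# The field names are illustrative assumptions, not the paper's schema.
@dataclass
class ParsedQuery:
    intent: str                                      # detected search intent
    constraints: dict = field(default_factory=dict)  # property name -> literal


# Hypothetical mapping from search intents to ontology classes.
INTENT_TO_CLASS = {
    "find_document": "ex:Document",
    "find_person": "ex:Person",
}

# Placeholder namespace; a real ontology would define its own IRI.
PREFIX = "PREFIX ex: <http://example.org/company#>"


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_sparql(query: ParsedQuery) -> str:
    """Build a SELECT query: one class pattern plus one pattern per constraint."""
    cls = INTENT_TO_CLASS[query.intent]
    lines = [PREFIX, "SELECT ?item WHERE {", f"  ?item a {cls} ."]
    # Sort constraints so the generated query text is deterministic.
    for prop, value in sorted(query.constraints.items()):
        lines.append(f'  ?item ex:{prop} "{escape_literal(value)}" .')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    parsed = ParsedQuery("find_document", {"author": "Jane Doe", "year": "2018"})
    print(to_sparql(parsed))
```

In this sketch the intent selects the ontology class and each parsed constraint becomes one triple pattern. The rule-based step described in the abstract is tied to the actual ontology structure, which this record does not detail.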
Links
EF16_013/0001781, research and development project. Name: LINDAT/CLARIN - Research Infrastructure for Language Technologies - Extension of the Repository and Computing Capacity