Understanding Search Queries in Natural Language

D 2018

NEVĚŘILOVÁ, Zuzana and Matej KVAŠŠAY

Basic information

Original name

Understanding Search Queries in Natural Language

Authors

NEVĚŘILOVÁ, Zuzana (203 Czech Republic, guarantor, belonging to the institution) and Matej KVAŠŠAY (703 Slovakia)

Edition

Brno, Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2018, p. 85-93, 9 pp. 2018

Publisher

Tribun EU

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Czech Republic

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

References:

https://nlp.fi.muni.cz/raslan/2018/paper07-Neverilova_Kvassay.pdf

RIV identification code

RIV/00216224:14330/18:00109726

Organization unit

Faculty of Informatics

ISBN

978-80-263-1517-9

UT WoS

000612420300011

Keywords (in Czech)

search intent; search query parsing

Keywords in English

search intent; search query parsing

Changed: 16/5/2022 15:43, Mgr. Michal Petr

Abstract

In the original

This work is part of a project that aims to provide a single search endpoint for all company data. We present a search query parser that takes speech-to-text output, i.e. a sentence, as its input. Its output is a structured representation of the search query, from which a SPARQL query is generated. The SPARQL query is then run against an ontology containing the company data. The parsing procedure consists of two steps: first, the search intent is detected; second, the query is parsed based on that intent. For intent classification, we use word embeddings with boosting of the top 5 words, and support vector machines. For parsing, we use semantic role labeling, named entity recognition, and external resources such as ConceptNet and DBPedia. The final parsing step is rule-based and tied to the ontology structure. The intent classifier accuracy is 94%. In the subsequent manual evaluation, the resulting structures were complete and correct in 51% of cases; in 34.57% of cases they were complete and correct but also contained irrelevant information.
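The two-step procedure in the abstract (detect the intent, then build a structured representation from which SPARQL is generated) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword-overlap intent detector is a deliberately simple stand-in for the embedding and support vector machine classifier, and the `ex:` namespace, the intent names, the class names, and the property names are all hypothetical, since the abstract does not describe the ontology.

```python
from dataclasses import dataclass, field

# Hypothetical namespace; the real ontology IRIs are not given in the abstract.
PREFIX = "PREFIX ex: <http://example.org/company#>"

# Stand-in for the intent classifier: keyword cues instead of embeddings + SVM.
INTENT_CUES = {
    "find_document": {"document", "report", "file"},
    "find_person": {"who", "person", "employee"},
}

# Hypothetical mapping from a search intent to the ontology class to query.
INTENT_CLASSES = {
    "find_document": "ex:Document",
    "find_person": "ex:Person",
}


@dataclass
class ParsedQuery:
    """Structured representation of a search query (assumed shape)."""
    intent: str
    constraints: dict = field(default_factory=dict)


def detect_intent(sentence: str) -> str:
    """Step 1: return the intent whose cue words overlap the sentence most."""
    tokens = set(sentence.lower().split())
    best = max(INTENT_CUES, key=lambda name: len(INTENT_CUES[name] & tokens))
    return best if INTENT_CUES[best] & tokens else "unknown"


def to_sparql(query: ParsedQuery) -> str:
    """Step 2 (final part): turn the structured representation into SPARQL."""
    if query.intent not in INTENT_CLASSES:
        raise ValueError(f"no ontology class for intent {query.intent!r}")
    lines = [PREFIX, "SELECT ?item WHERE {",
             f"  ?item a {INTENT_CLASSES[query.intent]} ."]
    for prop, value in sorted(query.constraints.items()):
        # Escape backslashes and quotes so the value is a valid string literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  ?item ex:{prop} "{escaped}" .')
    lines.append("}")
    return "\n".join(lines)


sentence = "show me the report written by Alice"
# The constraint is filled in by hand here; in the paper this comes from
# semantic role labeling, named entity recognition, and rule-based parsing.
parsed = ParsedQuery(detect_intent(sentence), {"author": "Alice"})
print(to_sparql(parsed))
```

Keeping the structured representation separate from the SPARQL string means the intent detector and the parser can be replaced independently, which matches the abstract's description of a rule-based final step tied to the ontology structure.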

Links

EF16_013/0001781, research and development project

Name: LINDAT/CLARIN - Research infrastructure for language technologies - extension of the repository and computational capacity

Cite

NEVĚŘILOVÁ, Zuzana and Matej KVAŠŠAY. Understanding Search Queries in Natural Language. In Horák, Aleš and Rychlý, Pavel and Rambousek, Adam. Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2018. Brno: Tribun EU, 2018, p. 85-93. ISBN 978-80-263-1517-9.

@inproceedings{1533900,
   author = {Nevěřilová, Zuzana and Kvaššay, Matej},
   address = {Brno},
   booktitle = {Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2018},
   editor = {Horák, Aleš and Rychlý, Pavel and Rambousek, Adam},
   keywords = {search intent; search query parsing},
   howpublished = {printed version "print"},
   language = {eng},
   location = {Brno},
   isbn = {978-80-263-1517-9},
   pages = {85-93},
   publisher = {Tribun EU},
   title = {Understanding Search Queries in Natural Language},
   url = {https://nlp.fi.muni.cz/raslan/2018/paper07-Neverilova_Kvassay.pdf},
   year = {2018}
}

TY  - CONF
ID  - 1533900
AU  - Nevěřilová, Zuzana
AU  - Kvaššay, Matej
PY  - 2018
TI  - Understanding Search Queries in Natural Language
PB  - Tribun EU
CY  - Brno
SN  - 9788026315179
KW  - search intent
KW  - search query parsing
UR  - https://nlp.fi.muni.cz/raslan/2018/paper07-Neverilova_Kvassay.pdf
L2  - https://nlp.fi.muni.cz/raslan/2018/paper07-Neverilova_Kvassay.pdf
N2  - This work is part of a project that aims to provide a single search endpoint for all company data. We present a search query parser that takes speech-to-text output, i.e. a sentence, as its input. Its output is a structured representation of the search query, from which a SPARQL query is generated. The SPARQL query is then run against an ontology containing the company data. The parsing procedure consists of two steps: first, the search intent is detected; second, the query is parsed based on that intent. For intent classification, we use word embeddings with boosting of the top 5 words, and support vector machines. For parsing, we use semantic role labeling, named entity recognition, and external resources such as ConceptNet and DBPedia. The final parsing step is rule-based and tied to the ontology structure. The intent classifier accuracy is 94%. In the subsequent manual evaluation, the resulting structures were complete and correct in 51% of cases; in 34.57% of cases they were complete and correct but also contained irrelevant information.
ER  -

NEVĚŘILOVÁ, Zuzana and Matej KVAŠŠAY. Understanding Search Queries in Natural Language. In Horák, Aleš and Rychlý, Pavel and Rambousek, Adam. \textit{Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2018}. Brno: Tribun EU, 2018, p.~85-93. ISBN~978-80-263-1517-9.
