Understanding Search Queries in Natural Language

NEVĚŘILOVÁ, Zuzana and Matej KVAŠŠAY. Understanding Search Queries in Natural Language. In Horák, Aleš and Rychlý, Pavel and Rambousek, Adam. Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2018. Brno: Tribun EU, 2018, pp. 85-93. ISBN 978-80-263-1517-9.


Basic information
Original title	Understanding Search Queries in Natural Language
Authors	NEVĚŘILOVÁ, Zuzana (203 Czech Republic, guarantor, belonging to the institution) and Matej KVAŠŠAY (703 Slovakia).
Edition	Brno, Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2018, pp. 85-93, 9 pp. 2018.
Publisher	Tribun EU

Other information
Original language	English
Type of outcome	Proceedings paper
Field of study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Czech Republic
Confidentiality	is not subject to a state or trade secret
Publication form	printed version "print"
RIV identification code	RIV/00216224:14330/18:00109726
Organization unit	Faculty of Informatics
ISBN	978-80-263-1517-9
ISSN	2336-4289
UT WoS	000612420300011
Keywords (in Czech)	search intent; search query parsing
Keywords in English	search intent; search query parsing
Changed by	Changed by: Mgr. Michal Petr, učo 65024. Changed: 16. 5. 2022 15:43.

Abstract

This work is part of a project that aims to provide a single search endpoint for all company data. We present a search query parser that takes speech-to-text output, i.e. a sentence, as its input. Its output is a structured representation of the search query, from which a SPARQL query is generated. The SPARQL query is then run against an ontology containing the company data. The parsing procedure consists of two steps: first, the search intent is detected; second, the query is parsed based on that intent. For intent classification, we use word embeddings with boosting of the top 5 words, and support vector machines. For parsing, we use semantic role labeling, named entity recognition, and external resources such as ConceptNet and DBpedia. The final parsing step is rule-based and tied to the ontology structure. The intent classifier accuracy is 94%. In the subsequent manual evaluation, the resulting structures were complete and correct in 51% of cases; in 34.57% of cases they were complete and correct but also contained irrelevant information.
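
As an illustration of the last stage described in the abstract (structured representation to SPARQL), the following is a minimal sketch only, not the authors' implementation. The abstract does not describe the ontology, so the `ex:` namespace, the `ParsedQuery` structure, the class name `Document` and the properties `author` and `year` are all hypothetical, and the assumption that the detected intent maps directly to an ontology class is ours.

```python
from dataclasses import dataclass, field

# Hypothetical namespace: the ontology used in the paper is not given in the abstract.
PREFIX = "PREFIX ex: <http://example.org/company#>"


@dataclass
class ParsedQuery:
    """Assumed structured representation of a search query."""
    intent: str  # assumed to name an ontology class, e.g. "Document"
    constraints: dict = field(default_factory=dict)  # property name -> literal value


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_sparql(query: ParsedQuery) -> str:
    """Build a SELECT query with one triple pattern per constraint."""
    lines = [PREFIX, "SELECT ?item WHERE {", f"  ?item a ex:{query.intent} ."]
    # Sort properties so the generated query text is deterministic.
    for prop, value in sorted(query.constraints.items()):
        lines.append(f'  ?item ex:{prop} "{escape_literal(value)}" .')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    parsed = ParsedQuery(intent="Document",
                         constraints={"author": "Jane Doe", "year": "2018"})
    print(to_sparql(parsed))
```

In this sketch the intent selects the class of the result and each parsed constraint becomes one triple pattern; a real system would also need to map surface words to ontology terms, which the abstract attributes to semantic role labeling, named entity recognition and external resources.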

Links
EF16_013/0001781, research and development project	Name: LINDAT/CLARIN - Research Infrastructure for Language Technologies - Extension of the Repository and Computing Capacity
