Other formats:
BibTeX
LaTeX
RIS
@inproceedings{1874319,
  author = {Medková, Helena and Horák, Aleš},
  title = {Distinguishing the Types of Coordinated Verbs with a Shared Argument by means of New ZeugBERT Language Model and ZeugmaDataset},
  booktitle = {Towards a Knowledge-Aware AI : SEMANTiCS 2022 — Proceedings of the 18th International Conference on Semantic Systems, 13-15 September 2022, Vienna, Austria},
  editor = {Dimou, Anastasia and Neumaier, Sebastian and Pellegrini, Tassilo and Vahdati, Sahar},
  publisher = {IOS Press},
  address = {Amsterdam},
  location = {Amsterdam},
  year = {2022},
  pages = {206--218},
  isbn = {978-1-64368-320-1},
  doi = {10.3233/SSW220022},
  url = {https://ebooks.iospress.nl/volumearticle/60724},
  keywords = {natural language understanding; coordinated verbs with shared argument; zeugma; BERT language model; dataset},
  howpublished = {printed version "print"},
  language = {eng}
}
TY  - CONF
ID  - 1874319
AU  - Medková, Helena
AU  - Horák, Aleš
PY  - 2022
TI  - Distinguishing the Types of Coordinated Verbs with a Shared Argument by means of New ZeugBERT Language Model and ZeugmaDataset
PB  - IOS Press
CY  - Amsterdam
SN  - 9781643683201
KW  - natural language understanding
KW  - coordinated verbs with shared argument
KW  - zeugma
KW  - BERT language model
KW  - dataset
UR  - https://ebooks.iospress.nl/volumearticle/60724
N2  - Sentences where two verbs share a single argument represent a complex and highly ambiguous syntactic phenomenon. The argument sharing relations must be considered during the detection process from both a syntactic and semantic perspective. Such expressions can represent ungrammatical constructions, denoted as zeugma, or idiomatic elliptical phrase combinations. Rule-based classification methods prove ineffective because of the necessity to reflect meaning relations of the analyzed sentence constituents. This paper presents the development and evaluation of ZeugBERT, a language model tuned for the sentence classification task using a pre-trained Czech transformer model for language representation. The model was trained with a newly prepared dataset, which is also published with this paper, of 7,849 Czech sentences to classify Czech syntactic structures containing coordinated verbs that share a valency argument (or an optional adjunct) in the context of coordination. ZeugBERT here reaches 88% of test set accuracy. The text describes the process of the new dataset creation and annotation, and it offers a detailed error analysis of the developed classification model.
ER  - 
MEDKOVÁ, Helena and Aleš HORÁK. Distinguishing the Types of Coordinated Verbs with a Shared Argument by means of New ZeugBERT Language Model and ZeugmaDataset. In Dimou, Anastasia; Neumaier, Sebastian; Pellegrini, Tassilo; Vahdati, Sahar. \textit{Towards a Knowledge-Aware AI : SEMANTiCS 2022 — Proceedings of the 18th International Conference on Semantic Systems, 13-15 September 2022, Vienna, Austria}. Amsterdam: IOS Press, 2022, pp.~206--218. ISBN~978-1-64368-320-1. Available from: https://dx.doi.org/10.3233/SSW220022.