KILGARRIFF, Adam, Vít BAISA, Miloš JAKUBÍČEK and Pavel RYCHLÝ. Longest-commonest Match. Online. In Kosem, I., Jakubíček, M., Kallas, J., Krek, S. Electronic lexicography in the 21st century: linking lexical data in the digital age. Proceedings of the eLex 2015 conference, 11-13 August 2015, Herstmonceux Castle, United Kingdom. Ljubljana: Trojina, Institute for Applied Slovene Studies, 2015, pp. 397-404. ISBN 978-961-93594-3-3.
Other formats:
BibTeX
LaTeX
RIS
@inproceedings{1308616,
  author = {Kilgarriff, Adam and Baisa, Vít and Jakubíček, Miloš and Rychlý, Pavel},
  address = {Ljubljana},
  booktitle = {Electronic lexicography in the 21st century: linking lexical data in the digital age. Proceedings of the eLex 2015 conference, 11-13 August 2015, Herstmonceux Castle, United Kingdom.},
  editor = {Kosem, I. and Jakubíček, M. and Kallas, J. and Krek, S.},
  keywords = {multiword expression; collocation; word sketch; Sketch Engine},
  howpublished = {electronic version "online"},
  language = {eng},
  location = {Ljubljana},
  isbn = {978-961-93594-3-3},
  pages = {397-404},
  publisher = {Trojina, Institute for Applied Slovene Studies},
  title = {Longest-commonest Match},
  url = {https://elex.link/elex2015/proceedings/eLex_2015_26_Kilgarriff+etal.pdf},
  year = {2015}
}
TY  - JOUR
ID  - 1308616
AU  - Kilgarriff, Adam
AU  - Baisa, Vít
AU  - Jakubíček, Miloš
AU  - Rychlý, Pavel
PY  - 2015
TI  - Longest-commonest Match
PB  - Trojina, Institute for Applied Slovene Studies
CY  - Ljubljana
SN  - 9789619359433
KW  - multiword expression
KW  - collocation
KW  - word sketch
KW  - Sketch Engine
UR  - https://elex.link/elex2015/proceedings/eLex_2015_26_Kilgarriff+etal.pdf
L2  - https://elex.link/elex2015/proceedings/eLex_2015_26_Kilgarriff+etal.pdf
N2  - Finding two-word collocations is a well-studied task within natural language processing. The result of this task for a given headword is usually a list of collocations sorted by a salience score. In the corpus manager Sketch Engine, these pairs are extracted from data using word sketch grammar relation rules and log-dice statistics, resulting in a sorted list of triples. The longest–commonest match is a straightforward extension of these two-word collocations into multiword expressions. The resulting expressions are also very useful for representing the most common realisation of the collocational pair and to facilitate the interpretation of the raw triplet because sometimes, for such a triple, it is not clear from what texts it comes. We present here an algorithm behind the longest–commonest match together with a simple evaluation. The longest–commonest match is already implemented in Sketch Engine.
ER  -
KILGARRIFF, Adam, Vít BAISA, Miloš JAKUBÍČEK and Pavel RYCHLÝ. Longest-commonest Match. Online. In Kosem, I., Jakubíček, M., Kallas, J., Krek, S. \textit{Electronic lexicography in the 21st century: linking lexical data in the digital age. Proceedings of the eLex 2015 conference, 11-13 August 2015, Herstmonceux Castle, United Kingdom.} Ljubljana: Trojina, Institute for Applied Slovene Studies, 2015, pp.~397--404. ISBN~978-961-93594-3-3.