RYCHLÝ, Pavel. A Lexicographer-Friendly Association Score. In RASLAN 2008. 2nd ed. Brno, RASLAN 2008. Brno: Masarykova Univerzita, 2008, p. 6-9. ISBN 978-80-210-4741-9.
Basic information
Original name A Lexicographer-Friendly Association Score
Authors RYCHLÝ, Pavel (203 Czech Republic, guarantor, belonging to the institution).
Edition 2nd ed. Brno, RASLAN 2008, p. 6-9, 4 pp. 2008.
Publisher Masarykova Univerzita
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 60200 6.2 Languages and Literature
Country of publisher Czech Republic
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
RIV identification code RIV/00216224:14330/08:00049430
Organization unit Faculty of Informatics
ISBN 978-80-210-4741-9
UT WoS 000302212600003
Keywords in English corpus linguistics tools; grammatical relations in the Sketch Engine; the logDice score
Changed by doc. Mgr. Pavel Rychlý, Ph.D., učo 3692. Changed: 7/6/2021 17:24.
Abstract
Finding collocation candidates is one of the most important and widely used features of corpus linguistics tools. Many statistical association measures are used to identify good collocations. Most of these measures define a formula for an association score that indicates the amount of statistical association between two words. The score is computed for all possible word pairs, and the word pairs with the highest scores are presented as collocation candidates. The same scores are used in many other algorithms in corpus linguistics. The score values are usually meaningless and corpus-specific; they cannot be used to compare words (or word pairs) across different corpora. But end users want such scores to be interpretable and stable. This paper presents a modification of a well-known association score that has a reasonable interpretation and other good features.
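The abstract describes scoring every word pair and presenting the highest-scoring pairs as collocation candidates, but this record does not reproduce the formula. The sketch below illustrates that workflow under the assumption that the score named in the keywords, logDice, is defined as 14 + log2(2·f_xy / (f_x + f_y)), where f_xy is the co-occurrence frequency and f_x, f_y are the frequencies of the two words. The function names `log_dice` and `rank_pairs`, and the way frequencies are counted from a toy list of pairs, are illustrative choices and not taken from the paper or from any tool.

```python
import math
from collections import Counter


def log_dice(f_xy, f_x, f_y):
    """Assumed logDice definition: 14 + log2(2 * f_xy / (f_x + f_y))."""
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def rank_pairs(pairs, top_n=10):
    """Score every observed (x, y) pair and return the top_n by score.

    Simplification for illustration: word frequencies are counted from
    the pair list itself rather than from a full corpus.
    """
    pairs = list(pairs)
    pair_freq = Counter(pairs)
    first_freq = Counter(x for x, _ in pairs)
    second_freq = Counter(y for _, y in pairs)
    scored = [
        ((x, y), log_dice(f, first_freq[x], second_freq[y]))
        for (x, y), f in pair_freq.items()
    ]
    # Highest score first; these are the collocation candidates.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    toy_pairs = [
        ("strong", "tea"),
        ("strong", "tea"),
        ("strong", "coffee"),
        ("green", "tea"),
    ]
    for pair, score in rank_pairs(toy_pairs):
        print(pair, round(score, 3))
```

Under this assumed definition the score depends only on the ratio of frequencies, not on corpus size, and its maximum is 14 (reached when the two words occur only together). That is one way to read the abstract's request for scores that are interpretable and comparable across corpora.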
Links
LC536, research and development project
Name: Center for Computational Linguistics (Centrum komputační lingvistiky)
Investor: Ministry of Education, Youth and Sports of the CR, Center for Computational Linguistics
1ET100300419, research and development project
Name: Intelligent Models, Algorithms, Methods and Tools for the Semantic Web (Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu)
Investor: Academy of Sciences of the Czech Republic, Intelligent Models, Algorithms, Methods and Tools for the Semantic Web (realization)
1ET200610406, research and development project
Name: Internet Language Consulting Service (Jazyková poradna na internetu)
Investor: Academy of Sciences of the Czech Republic, Internet Language Consulting Service
2C06009, research and development project
Name: Tools for Building a Comprehensive Knowledge Base for Natural-Language Communication with the Semantic Web (Prostředky tvorby komplexní báze znalostí pro komunikaci se sémantickým webem v přirozeném jazyce) (Acronym: COT-SEWing)
Investor: Ministry of Education, Youth and Sports of the CR