Detailed Information on Publication Record
2008
Computing Idioms Frequency in Text Corpora
BUŠTA, Jan
Basic information
Original name
Computing Idioms Frequency in Text Corpora
Name in Czech
Výpočet četnosti idiomů v korpusu
Authors
BUŠTA, Jan (203 Czech Republic, guarantor, belonging to the institution)
Edition
Brno, Proceedings of Recent Advances in Slavonic Natural Language Processing 2008, p. 0-0, 4 pp. 2008
Publisher
Masaryk University
Other information
Language
English
Type of outcome
Article in proceedings
Field of Study
60200 6.2 Languages and Literature
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
Publication form
printed version "print"
References:
RIV identification code
RIV/00216224:14330/08:00034421
Organization unit
Faculty of Informatics
ISBN
978-80-210-4741-9
UT WoS
000302212600012
Keywords in English
frequency of idioms; headwords; text corpora; Czech language
Changed: 1/6/2021 07:47, Mgr. Jan Bušta
In the original
Idioms are phrases whose meaning is not composed from the meanings of the individual words in the phrase. They are a natural example of a violation of the principle of compositionality, which makes them a problem for meaning extraction in natural language processing. Counting the frequency of idioms in corpora has one main aim: to find out which phrases are used often and which less so. This makes it possible to start deriving the meaning of whole phrases rather than just of each word, which improves natural language understanding.
In Czech
Idiomy jsou slovní spojení, jejichž význam se neskládá z významů jednotlivých slov. Idiomy jsou příkladem porušování principu kompozicionality a tím jsou problémem při strojovém zpracování jazyka. Výpočet četnosti idiomů v korpusu přinese informaci, které idiomy se používají častěji, které méně často. Seřazení idiomů dle jejich četnosti ukáže, na které idiomy je třeba se soustředit více, a tak lépe porozumět přirozenému jazyku.
Links
LC536, research and development project
2C06009, research and development project