Other formats:
BibTeX
LaTeX
RIS
@inproceedings{1809740,
  author = {Herman, Ondřej},
  address = {Brno},
  booktitle = {Recent Advances in Slavonic Natural Language Processing (RASLAN 2021)},
  editor = {Horák and Rychlý and Rambousek},
  keywords = {Word embeddings; Sketch Engine; Corpora},
  howpublished = {printed version "print"},
  language = {eng},
  location = {Brno},
  isbn = {978-80-263-1670-1},
  pages = {41--46},
  publisher = {Tribun EU},
  title = {Precomputed Word Embeddings for 15+ Languages},
  url = {https://raslan2021.nlp-consulting.net/},
  year = {2021}
}
TY  - JOUR
ID  - 1809740
AU  - Herman, Ondřej
PY  - 2021
TI  - Precomputed Word Embeddings for 15+ Languages
PB  - Tribun EU
CY  - Brno
SN  - 9788026316701
KW  - Word embeddings
KW  - Sketch Engine
KW  - Corpora
UR  - https://raslan2021.nlp-consulting.net/
N2  - Word embeddings serve as a useful resource for many downstream natural language processing tasks. The embeddings map, or embed, the lexicon of a language onto a vector space, in which various operations can be carried out easily using the established machinery of linear algebra. The unbounded nature of language can be problematic, and word embeddings provide a way of compressing the words into a manageable dense space. The position of a word in the vector space is given by the contexts the word appears in, or, as the distributional hypothesis postulates, a word is characterized by the company it keeps [2]. As similar words appear in similar contexts, their positions will also be close to each other in the embedding vector space. Because of this, many useful semantic properties of words are preserved in the embedding vector space.
ER  -
HERMAN, Ondřej. Precomputed Word Embeddings for 15+ Languages. In Horák, Rychlý, Rambousek. \textit{Recent Advances in Slavonic Natural Language Processing (RASLAN 2021)}. Brno: Tribun EU, 2021, pp.~41--46. ISBN~978-80-263-1670-1.