Information System of Masaryk University 

Gensim -- Statistical Semantics in Python


ŘEHŮŘEK, Radim and Petr SOJKA. Gensim -- Statistical Semantics in Python. In EuroScipy 2011, Paris. 2011.
Basic information
Original name Gensim -- Statistical Semantics in Python
Name in Czech Gensim -- statistická sémantika v Pythonu
Authors ŘEHŮŘEK, Radim (203 Czech Republic, guarantor, belonging to the institution) and Petr SOJKA (203 Czech Republic, belonging to the institution).
Edition EuroScipy 2011, Paris, 2011.
Other information
Original language English
Type of outcome Presentations at conferences
Field of Study Informatics
Country of publisher France
Confidentiality degree is not subject to a state or trade secret
WWW conference programme poster
RIV identification code RIV/00216224:14330/11:00053512
Organization unit Faculty of Informatics
Keywords (in Czech) statistická sémantika;gensim;Python;LDA;SVD
Keywords in English statistical semantics;gensim;Python;LDA;SVD
Tags International impact, Reviewed
Changed by: doc. RNDr. Petr Sojka, Ph.D., učo 2378. Changed: 17. 4. 2012 22:37.
Abstract
Gensim is a pure Python library that fights on two fronts: 1) digital document indexing and similarity search; and 2) fast, memory-efficient, scalable algorithms for Singular Value Decomposition (SVD) and Latent Dirichlet Allocation (LDA). The connection between the two is unsupervised semantic analysis of plain text in digital collections. Gensim was created for large digital libraries, but its underlying algorithms for large-scale, distributed, online SVD and LDA are like the Swiss Army knife of data analysis: they are also useful on their own, outside the domain of Natural Language Processing.
Abstract (translated from Czech)
Gensim is a library written in Python that is useful on two fronts: 1) for indexing electronic documents and for similarity search; and 2) for fast, memory-bounded, and efficiently scalable implementations of algorithms for Singular Value Decomposition and Latent Dirichlet Allocation. The link between the two uses is unsupervised semantic analysis of texts in large digital collections and libraries. Gensim was created for large digital libraries, but the algorithms it implements for large-scale, distributed, online SVD and LDA are a Swiss Army knife of data analysis and, as such, are also useful outside the domain of Natural Language Processing.
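The "document indexing and similarity search" front mentioned in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and plain bag-of-words cosine similarity; it is not Gensim's API and it omits the SVD/LDA dimensionality-reduction step that the abstract describes. The function names (bow, cosine, build_index, most_similar) and the sample documents are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector: lowercase whitespace tokens mapped to their counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(docs):
    """'Index' a collection by precomputing one bag-of-words vector per document."""
    return [bow(d) for d in docs]


def most_similar(query, index):
    """Return (document position, similarity) pairs, best match first."""
    q = bow(query)
    scores = [(i, cosine(q, vec)) for i, vec in enumerate(index)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection, for illustration only.
docs = [
    "singular value decomposition of a term document matrix",
    "latent dirichlet allocation is a topic model",
    "digital libraries index mathematical documents",
]
index = build_index(docs)
ranking = most_similar("topic model for documents", index)
```

Here ranking orders the three documents by their similarity to the query. Exact word matching is the main limitation of this sketch ("document" and "documents" do not match); a latent semantic model such as SVD or LDA is one way to relate documents that share meaning but not vocabulary.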
Links
LC536, research and development project
Name: Centrum komputační lingvistiky (Center for Computational Linguistics)
Investor: Ministry of Education, Youth and Sports of the CR, Basic Research Center
250503, internal MU code
Name: The European Digital Mathematics Library (Acronym: EuDML)
Investor: European Union, Competitiveness and Innovation Framework Programme
Type Name Uploaded/Created by Uploaded/Created Rights
953417 /1 Sojka, P. 14. 10. 2011

Properties

Name
953417
Application
refresh
Address within IS
https://is.muni.cz/auth/repo/953417/
Address for the users outside IS
https://is.muni.cz/repo/953417/
Address within Manager
https://is.muni.cz/auth/repo/953417/?info
Address within Manager for the users outside IS
https://is.muni.cz/repo/953417/?info
Uploaded/Created
Fri 14. 10. 2011 16:58, doc. RNDr. Petr Sojka, Ph.D.

Rights

Right to read:
  • anyone on the Internet
Right to upload:
 
Right to administer:
  • a concrete person doc. RNDr. Petr Sojka, Ph.D., učo 2378
  • a concrete person RNDr. Radim Řehůřek, Ph.D., učo 39672
Attributes
 
rehurek-sojka-scipy2011.pdf Licence Creative Commons  File version Sojka, P. 14. 10. 2011

Rights

Right to read:
 
Right to upload:
 
Right to administer:
  • a concrete person doc. RNDr. Petr Sojka, Ph.D., učo 2378
  • a concrete person RNDr. Radim Řehůřek, Ph.D., učo 39672
Attributes
 