@inproceedings{1705316,
  author = {Novotná, Tereza and Harašta, Jakub and Kól, Jakub},
  booktitle = {Cyberspace 2020},
  keywords = {topic modelling; latent Dirichlet allocation; non-negative matrix factorization; court decisions; coherence score},
  language = {eng},
  title = {Topic Modelling of the Czech Supreme Court Decisions},
  url = {https://cyberspace.muni.cz/},
  year = {2020}
}
TY  - CONF
ID  - 1705316
AU  - Novotná, Tereza
AU  - Harašta, Jakub
AU  - Kól, Jakub
PY  - 2020
TI  - Topic Modelling of the Czech Supreme Court Decisions
KW  - topic modelling
KW  - latent Dirichlet allocation
KW  - non-negative matrix factorization
KW  - court decisions
KW  - coherence score
UR  - https://cyberspace.muni.cz/
N2  - The Czech Supreme Court produces several thousand court decisions per year. Its decisions are published almost unprocessed, in full text, with only minimal metadata (date of the decision, docket number). This makes case law research very time-consuming. New automatic methods of processing court decisions therefore need to be developed so that relevant case law can be retrieved more efficiently. Topic modelling methods have the potential to cluster a large number of documents automatically or to provide new categories of relevant metadata for these documents. In this paper, two topic modelling methods, latent Dirichlet allocation and non-negative matrix factorization, are applied to the corpus of Czech Supreme Court decisions. Several models are trained for each method and compared by their coherence scores in order to find the best number of topics. The authors then perform a manual qualitative analysis of the most coherent models.
ER  -
NOVOTNÁ, Tereza, Jakub HARAŠTA and Jakub KÓL. Topic Modelling of the Czech Supreme Court Decisions. In \textit{Cyberspace 2020}. 2020.