HARAŠTA, Jakub, Tereza NOVOTNÁ and Jaromír ŠAVELKA. Citation Data of Czech Apex Courts (preprint). arXiv. arXiv:2002.02224, 2020, 7 pp.
Basic information
Original name Citation Data of Czech Apex Courts (preprint)
Authors HARAŠTA, Jakub, Tereza NOVOTNÁ and Jaromír ŠAVELKA.
Edition arXiv, arXiv:2002.02224, 2020.
Other information
Original language English
Type of outcome Article in a journal (not reviewed)
Field of Study 50500 5.5 Law
Country of publisher United States of America
Confidentiality degree is not subject to a state or trade secret
WWW Text (arXiv.org) Dataset (GitHub)
Organization unit Faculty of Law
Keywords in English reference recognition; reference extraction; document segmentation; NLP pipeline; citation data; Supreme Court; Supreme Administrative Court; Constitutional Court; Czech Republic
Tags International impact
Changed by: JUDr. Mgr. Jakub Harašta, Ph.D., university ID (učo) 323070. Changed: 18/12/2020 06:51.
Abstract
In this paper, we introduce citation data of the Czech apex courts (the Supreme Court, the Supreme Administrative Court and the Constitutional Court). The dataset was automatically extracted from CzCDC 1.0, a corpus of texts of Czech court decisions. We obtained the citation data by building a natural language processing pipeline for the extraction of court decision identifiers. The pipeline included (i) a document segmentation model and (ii) a reference recognition model. The dataset was then manually processed to achieve high-quality citation data as a base for subsequent qualitative and quantitative analyses. The dataset is available to the general public on GitHub.
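The two-stage design described in the abstract (segment the decision, then recognize references in each segment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract describes trained models, whereas the sketch substitutes a blank-line split for segmentation and simplified regular expressions for reference recognition. The function names, the pattern table and the sample identifiers are all hypothetical, and the patterns assume only a few common docket-number forms.

```python
import re

# Hypothetical, simplified patterns for docket numbers of the three apex courts.
# Assumption: only common forms such as "22 Cdo 1234/2015", "II. ÚS 123/05",
# "Pl. ÚS 19/08" and "2 As 34/2006" are covered; real identifiers vary more.
DOCKET_PATTERNS = {
    "Constitutional Court": re.compile(r"\b(?:Pl\.|[IVX]+\.)\s?ÚS\s\d+/\d{2,4}\b"),
    "Supreme Administrative Court": re.compile(
        r"\b\d+\s(?:As|Afs|Ads|Azs|Ans|Aps)\s\d+/\d{4}\b"
    ),
    "Supreme Court": re.compile(r"\b\d+\s(?:Cdo|Tdo|Odo|Nd)\s\d+/\d{4}\b"),
}


def segment(text):
    """Stand-in for the segmentation stage: split the text on blank lines."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def recognize_references(segment_text):
    """Stand-in for the recognition stage: return (court, identifier) pairs."""
    found = []
    for court, pattern in DOCKET_PATTERNS.items():
        for match in pattern.finditer(segment_text):
            found.append((court, match.group(0)))
    return found


def extract_citations(decision_id, text):
    """Run both stages and return (citing, cited court, cited identifier) triples."""
    edges = []
    for seg in segment(text):
        for court, ref in recognize_references(seg):
            if ref != decision_id:  # skip self-references
                edges.append((decision_id, court, ref))
    return edges


# Made-up sample text and identifiers, for illustration only.
sample = (
    "Soud odkázal na rozsudek sp. zn. 22 Cdo 1234/2015.\n\n"
    "Viz také nález II. ÚS 123/05 a rozsudek 2 As 34/2006."
)
citations = extract_citations("30 Cdo 1/2020", sample)
```

Each resulting triple links the citing decision to one cited identifier, which is the shape of record a citation dataset needs. A rule-based recognizer like this one would miss irregular or abbreviated references, which is one reason a learned model plus manual processing, as described in the abstract, is the more robust choice.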
Links
GA17-20645S, research and development project
Name: Exaktní hodnocení aplikační relevance judikatury
Investor: Czech Science Foundation