@article{1618976,
  author = {Harašta, Jakub and Novotná, Tereza and Šavelka, Jaromír},
  keywords = {reference recognition; reference extraction; document segmentation; NLP pipeline; citation data; Supreme Court; Supreme Administrative Court; Constitutional Court; Czech Republic},
  language = {eng},
  journal = {arXiv},
  title = {Citation Data of Czech Apex Courts (preprint)},
  url = {https://arxiv.org/abs/2002.02224},
  year = {2020}
}
TY  - JFULL
ID  - 1618976
AU  - Harašta, Jakub
AU  - Novotná, Tereza
AU  - Šavelka, Jaromír
PY  - 2020
TI  - Citation Data of Czech Apex Courts (preprint)
JF  - arXiv
PB  - arXiv:2002.02224
KW  - reference recognition
KW  - reference extraction
KW  - document segmentation
KW  - NLP pipeline
KW  - citation data
KW  - Supreme Court
KW  - Supreme Administrative Court
KW  - Constitutional Court
KW  - Czech Republic
UR  - https://arxiv.org/abs/2002.02224
N2  - In this paper, we introduce the citation data of the Czech apex courts (Supreme Court, Supreme Administrative Court and Constitutional Court). This dataset was automatically extracted from the corpus of texts of Czech court decisions - CzCDC 1.0. We obtained the citation data by building the natural language processing pipeline for extraction of the court decision identifiers. The pipeline included the (i) document segmentation model and the (ii) reference recognition model. Furthermore, the dataset was manually processed to achieve high-quality citation data as a base for subsequent qualitative and quantitative analyses. The dataset is available to the general public at GitHub.
ER  - 
HARAŠTA, Jakub, Tereza NOVOTNÁ and Jaromír ŠAVELKA. Citation Data of Czech Apex Courts (preprint). \textit{arXiv}. arXiv:2002.02224, 2020, 7 pp.