Automatic Identification of Legal Terms in Czech Law Texts

PALA, Karel, Pavel RYCHLÝ and Pavel ŠMERK. Automatic Identification of Legal Terms in Czech Law Texts. In Semantic Processing of Legal Texts. Berlin: Springer, 2010, p. 83-94. ISBN 978-3-642-12836-3. Available from: https://dx.doi.org/10.1007/978-3-642-12837-0_5.

Basic information
Original name	Automatic Identification of Legal Terms in Czech Law Texts
Name in Czech	Automatická identifikace právních termínů v českých právních textech
Authors	PALA, Karel (203 Czech Republic, guarantor, belonging to the institution), Pavel RYCHLÝ (203 Czech Republic, belonging to the institution) and Pavel ŠMERK (203 Czech Republic, belonging to the institution).
Edition	Berlin, Semantic Processing of Legal Texts, p. 83-94, 12 pp. 2010.
Publisher	Springer

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	60200 6.2 Languages and Literature
Country of publisher	Czech Republic
Confidentiality degree	is not subject to a state or trade secret
Publication form	printed version "print"
Impact factor	0.402 in 2005
RIV identification code	RIV/00216224:14330/10:00065871
Organization unit	Faculty of Informatics
ISBN	978-3-642-12836-3
ISSN	0302-9743
Doi	http://dx.doi.org/10.1007/978-3-642-12837-0_5
Keywords in English	terminology extraction; natural language processing; legal language
Tags	International impact, Reviewed
Changed by	RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 30/4/2014 04:24.

Abstract

Law texts, including the constitution, acts, public notices and court judgements, form a huge database of texts. As with many texts from narrow domains, the sublanguage they use is partially restricted and differs from the general language (Czech). As a starting collection of data, we chose the legal database Lexis, which contains approximately 50,000 Czech law documents. We concentrate mostly on noun groups, which are the main candidates for law terms, and were able to recognize 3992 distinct noun groups in the selected text samples. The paper also presents results of morphological analysis, lemmatization, tagging, disambiguation, and basic syntactic analysis of Czech law texts, as these tasks are crucial for any further sophisticated natural language processing. Verbs in legal texts have also been explored in a preliminary way. In this respect, we are trying to explore how linguistic analysis can help identify the semantic nature of law terms.

Links
GA407/07/0679, research and development project	Name: Právní e-slovník - PES
GA407/07/0679, research and development project	Investor: Czech Science Foundation, Legal e-dictionary - PES
LC536, research and development project	Name: Centrum komputační lingvistiky
LC536, research and development project	Investor: Ministry of Education, Youth and Sports of the CR, Centrum komputační lingvistiky
