2010
Automatic Identification of Legal Terms in Czech Law Texts
PALA, Karel; Pavel RYCHLÝ and Pavel ŠMERK
Basic information
Original name
Automatic Identification of Legal Terms in Czech Law Texts
Name in Czech
Automatická identifikace právních termínů v českých právních textech
Authors
PALA, Karel (203 Czech Republic, guarantor, belonging to the institution); Pavel RYCHLÝ (203 Czech Republic, belonging to the institution) and Pavel ŠMERK (203 Czech Republic, belonging to the institution)
Edition
Berlin, Semantic Processing of Legal Texts, p. 83-94, 12 pp. 2010
Publisher
Springer
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
60200 6.2 Languages and Literature
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
Publication form
printed version "print"
Impact factor
Impact factor: 0.402 in 2005
RIV identification code
RIV/00216224:14330/10:00065871
Organization unit
Faculty of Informatics
ISBN
978-3-642-12836-3
Keywords in English
terminology extraction; natural language processing; legal language
Tags
International impact, Reviewed
Changed: 30/4/2014 04:24, RNDr. Pavel Šmerk, Ph.D.
Abstract
In the original language
Law texts, including the constitution, acts, public notices and court judgements, form a huge database of texts. As with many texts from small domains, the sublanguage used is partially restricted and differs from general Czech. The legal database Lexis, containing approx. 50,000 Czech law documents, has been chosen as the starting collection of data. Our attention is concentrated mostly on noun groups, which are the main candidates for law terms. We were able to recognize 3992 different noun groups of this kind in the selected text samples. The paper also presents results of the morphological analysis, lemmatization, tagging, disambiguation, and basic syntactic analysis of Czech law texts, as these tasks are crucial for any more sophisticated natural language processing. The verbs in legal texts have also been explored in a preliminary way. In this respect, we try to explore how linguistic analysis can help identify the semantic nature of law terms.
Links
GA407/07/0679, research and development project
LC536, research and development project