CzeGPT-2 – Training New Model for Czech Generative Text Processing Evaluated with the Summarization Task

J 2024

HÁJEK, Adam and Aleš HORÁK

Basic information

Original title

CzeGPT-2 – Training New Model for Czech Generative Text Processing Evaluated with the Summarization Task

Authors

HÁJEK, Adam (203 Czech Republic, belonging to the institution) and Aleš HORÁK (203 Czech Republic, guarantor, belonging to the institution)

Edition

IEEE ACCESS, UNITED STATES, IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2024, 2169-3536

Other information

Language

English

Type of outcome

Article in a scholarly periodical

Field of study

10200 1.2 Computer and information sciences

Country of publisher

United States

Confidentiality

Not subject to a state or trade secret

Links

URL

Impact factor

Impact factor: 3.900 in 2022

Organization unit

Faculty of Informatics

DOI

http://dx.doi.org/10.1109/ACCESS.2024.3371689

UT WoS

001178339600001

Keywords in English

Task analysis;Training;Measurement;Transformers;Decoding;Computational modeling;Vocabulary;Czech;GPT-2;large language model;model evaluation;model training;summarization

Tags

International impact, Reviewed

Changed: 21 March 2024 17:56, doc. RNDr. Aleš Horák, Ph.D.

Abstract

In the original

Automatic text summarization (ATS), alongside neural machine translation or question answering, is one of the leading tasks in Natural Language Processing (NLP). In recent years, ATS has experienced significant development, especially in the English NLP world. Modern approaches are mainly based on the versatile Transformer architecture proposed by Vaswani et al. in 2017, which has revolutionized the field, and was later tuned and adjusted to various needs of different tasks. Non-mainstream languages, with Czech taken as a representative, on the other hand, are a little bit behind these efforts and tend to use lighter or heuristic methods. With the new CzeGPT-2 model and abstractive summarizer, we would like to take a step forward detailing the process of training a GPT-2 generative transformer model for a new language with a comprehensive evaluation of the task of Czech summarization and pointing out the benefits of this approach. We also present an in-depth analysis of the errors in generated summaries, allowing us to locate the model’s weak spots.

Related projects

LM2023062, research and development project

Name: Digital Research Infrastructure for Language Technologies, Arts and Humanities

Investor: Ministry of Education, Youth and Sports of the Czech Republic, LINDAT/CLARIAH-CZ - Digital Research Infrastructure for Language Technologies, Arts and Humanities

Cite

HÁJEK, Adam and Aleš HORÁK. CzeGPT-2 – Training New Model for Czech Generative Text Processing Evaluated with the Summarization Task. IEEE ACCESS. UNITED STATES: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2024, vol. 2024, No. 12, p. 34570-34581. ISSN 2169-3536. Available from: https://dx.doi.org/10.1109/ACCESS.2024.3371689.

@article{2380378,
   author = {Hájek, Adam and Horák, Aleš},
   article_location = {UNITED STATES},
   article_number = {12},
   doi = {http://dx.doi.org/10.1109/ACCESS.2024.3371689},
   keywords = {Task analysis;Training;Measurement;Transformers;Decoding;Computational modeling;Vocabulary;Czech;GPT-2;large language model;model evaluation;model training;summarization},
   language = {eng},
   issn = {2169-3536},
   journal = {IEEE ACCESS},
   title = {CzeGPT-2 – Training New Model for Czech Generative Text Processing Evaluated with the Summarization Task},
   url = {https://ieeexplore.ieee.org/document/10453575},
   volume = {2024},
   year = {2024}
}

TY  - JOUR
ID  - 2380378
AU  - Hájek, Adam
AU  - Horák, Aleš
PY  - 2024
TI  - CzeGPT-2 – Training New Model for Czech Generative Text Processing Evaluated with the Summarization Task
JF  - IEEE ACCESS
VL  - 2024
IS  - 12
SP  - 34570
EP  - 34581
PB  - IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
SN  - 21693536
KW  - Task analysis;Training;Measurement;Transformers;Decoding;Computational modeling;Vocabulary;Czech;GPT-2;large language model;model evaluation;model training;summarization
UR  - https://ieeexplore.ieee.org/document/10453575
N2  - Automatic text summarization (ATS), alongside neural machine translation or question answering, is one of the leading tasks in Natural Language Processing (NLP). In recent years, ATS has experienced significant development, especially in the English NLP world. Modern approaches are mainly based on the versatile Transformer architecture proposed by Vaswani et al. in 2017, which has revolutionized the field, and was later tuned and adjusted to various needs of different tasks. Non-mainstream languages, with Czech taken as a representative, on the other hand, are a little bit behind these efforts and tend to use lighter or heuristic methods. With the new CzeGPT-2 model and abstractive summarizer, we would like to take a step forward detailing the process of training a GPT-2 generative transformer model for a new language with a comprehensive evaluation of the task of Czech summarization and pointing out the benefits of this approach. We also present an in-depth analysis of the errors in generated summaries, allowing us to locate the model’s weak spots.
ER  -

HÁJEK, Adam and Aleš HORÁK. CzeGPT-2 – Training New Model for Czech Generative Text Processing Evaluated with the Summarization Task. \textit{IEEE ACCESS}. UNITED STATES: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2024, vol.~2024, No.~12, p.~34570-34581. ISSN~2169-3536. Available from: https://dx.doi.org/10.1109/ACCESS.2024.3371689.
