HATLAPATKA, Radim. JBIG2 Supported by OCR. In CEUR Workshop Proceedings, Volume 921. Aachen: Neuveden, 2012, p. 82-90. ISSN 1613-0073. |
Basic information | |
---|---|
Original name | JBIG2 Supported by OCR |
Name in Czech | JBIG2 s podporou OCR |
Authors | HATLAPATKA, Radim (203 Czech Republic, guarantor, belonging to the institution). |
Edition | Aachen, CEUR Workshop Proceedings, Volume 921, p. 82-90, 9 pp. 2012. |
Publisher | Not stated |
Other information | |
---|---|
Original language | English |
Type of outcome | Proceedings paper |
Field of Study | 10201 Computer sciences, information science, bioinformatics |
Country of publisher | Germany |
Confidentiality degree | is not subject to a state or trade secret |
Publication form | printed version "print" |
WWW | Full text |
RIV identification code | RIV/00216224:14330/12:00067428 |
Organization unit | Faculty of Informatics |
ISSN | 1613-0073 |
Keywords (in Czech) | jbig2enc; JBIG2; optimalizace PDF; komprese; DML; OCR; pdfJbIm; DML-CZ; EuDML |
Keywords in English | jbig2enc; JBIG2; PDF size optimization; compression; DML; OCR; pdfJbIm; DML-CZ; EuDML |
Tags | bitmap, compression, compression ratio, DML, DML-CZ, EuDML, JBIG2, jbig2enc, lossiness, OCR, PDF, PDF size optimization, pdfJbIm |
Tags | International impact, Reviewed |
Changed by | RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 30/4/2014 09:52. |
Abstract |
---|
Digital mathematical libraries contain a large volume of PDF documents consisting of scanned text. In this paper, we describe how these documents can be compressed and thus delivered to users more efficiently. We introduce the JBIG2 standard for compressing bitonal images such as scanned text, and we discuss the issues that arise when OCR is used to improve the compression ratio of the open-source jbig2enc encoder. For this purpose, we have designed an API for using OCR in jbig2enc, which we describe in this paper together with the results achieved so far. |
Abstract (in Czech) |
---|
Digitální matematické knihovny obsahují velké množství PDF dokumentů obsahujících skenovaný text. V tomto článku popisujeme, jakým způsobem mohou být takové dokumenty komprimovány, a tím pádem poskytovány uživateli efektivnější cestou. Za tímto účelem představujeme JBIG2 standard pro kompresi bitonálních obrázků (např. naskenovaný text) a diskutujeme přínosy a problémy použití OCR za účelem zvýšení komprese volně šiřitelného jbig2enc enkodéru. Za tímto účelem jsme navrhli a implementovali rozhraní pro používání OCR v jbig2enc enkodéru, které zde popisujeme spolu s předběžnými výsledky. |
Links | |
---|---|
LA09016, research and development project | Name: Účast ČR v European Research Consortium for Informatics and Mathematics (ERCIM) (Acronym: ERCIM) |
Investor: Ministry of Education, Youth and Sports of the CR, Czech Republic membership in the European Research Consortium for Informatics and Mathematics | |
250503, internal MU code | Name: The European Digital Mathematics Library (Acronym: EuDML) |
Investor: European Union |