D 2012

JBIG2 Supported by OCR

HATLAPATKA, Radim

Basic information

Original name

JBIG2 Supported by OCR

Name in Czech

JBIG2 s podporou OCR

Authors

HATLAPATKA, Radim

Edition

Brno, DML 2012: Towards a Digital Mathematics Library, 2012

Publisher

Masaryk University

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Germany

Confidentiality degree

is not subject to a state or trade secret

Organization unit

Faculty of Informatics

Keywords (in Czech)

jbig2enc; JBIG2; optimalizace PDF; komprese; DML; OCR; pdfJbIm; DML-CZ; EuDML

Keywords in English

jbig2enc; JBIG2; PDF size optimization; compression; DML; OCR; pdfJbIm; DML-CZ; EuDML

Tags

International impact, Reviewed
Changed: 8/12/2012 15:51, RNDr. Michal Růžička, Ph.D.

Abstract

In the original

Digital mathematical libraries contain a large volume of PDF documents consisting of scanned text. In this paper, we describe how these documents can be compressed and thus delivered to users more efficiently. We introduce the JBIG2 standard for compressing bitonal images such as scanned text, and we discuss the issues that arise when OCR is used to improve the compression ratio of the open-source jbig2enc encoder. For this purpose, we have designed an API for using OCR in jbig2enc, which we describe in this paper together with the results achieved so far.

Links

LA09016, research and development project
Name: Participation of the Czech Republic in the European Research Consortium for Informatics and Mathematics (ERCIM) (Acronym: ERCIM)
Investor: Ministry of Education, Youth and Sports of the CR, Czech Republic membership in the European Research Consortium for Informatics and Mathematics
250503, internal MU code
Name: The European Digital Mathematics Library (Acronym: EuDML)
Investor: European Union