[Leptonica Home Page]
Document Image Analysis
Updated: Aug 23, 2022
Source material for Chapter 18 in Mathematical morphology: from theory
to applications
This page describes how to run the applications and generate the
figures for the Document Image Analysis chapter in
Mathematical morphology: from theory to applications,
edited by Laurent Najman and Hugues Talbot, ISTE-Wiley, 2010.
The programs for doing this are in the open source
leptonica library.
For reference, here is a version
of the chapter that includes the figures. The figures are generated
by six programs:
- livre_makefigs.c This runs the other six programs to
generate all the figures.
- livre_seedgen.c This performs the first step in
an approach to page segmentation that identifies image
regions by growing a seed into a mask. This generates
the seed image for the image regions, which is Figure 1
in the chapter.
- livre_pageseg.c This performs page segmentation, showing
intermediate steps to identify the text and image regions.
It uses a fairly complicated page image as input. It generates
Figures 2 - 5.
- livre_orient.c This generates Figure 6, a visual
representation of the hit-miss Sels that are used for
identifying the orientation of roman text,
using a statistical count of ascenders and descenders.
- livre_hmt.c This generates Figures 7 and 8, which
are hit-miss Sels that are built automatically from
a 1 bpp (bit/pixel) image pattern. Figures 7 and 8 were
printed in grayscale. To see them in color:
Figure 7 and
Figure 8.
- livre_tophat.c This generates Figure 9, which shows
how the tophat operation can be used to normalize and
whiten the background of an image with uneven illumination.
Additionally, we give a program that generates a figure that was
cut from the original paper due to length restrictions.
The program, livre_adapt.c, like the tophat, compensates for a
nonuniform background, but in a more complicated way: it
first measures the background and then applies
a locally adaptive linear mapping in an attempt to make the
background uniform. The figure demonstrates a number of operations
for doing this.
The eight panels are as follows:
1. The input image.
2. The background-normalized color image, where the target background
value is 200.
3. The input image, converted to grayscale.
4. The grayscale image closed with a 25 x 25 Sel to remove
the dark text.
5. The background further smoothed by a convolution, using a 15 x 15
flat-topped block Sel.
6. The background-normalized grayscale image (again, with the
target value of 200), using (3) as the input. The result in
this case is very similar to (2).
7. Applying a linear TRC (tone reproduction curve) to (6), with
the dark point at 30 and the white point at 180.
8. Thresholding the result to 1 bpp.
The simplest way to build these programs and generate the figures is
as follows:
- Go to www.leptonica.org
and download the source code.
- In the src directory, type make to build the
leptonica library.
- All the programs are in the prog directory.
In the prog directory, first type make.
- Then, still in the prog directory, run livre_makefigs.
The figures will be placed in /tmp/, named dia_fig1.png,
dia_fig2.png, etc.
To begin to learn about the leptonica image processing library,
first read the
README,
and then read the very
high-level overview
of the files in the library.
This documentation is licensed by Dan Bloomberg under a
Creative Commons
Attribution 3.0 United States License.
© Copyright 2001-2023, Leptonica