Difference between revisions of "Pdfimages"

From Parallel Library Services
<code>pdfimages</code> is part of the [[Poppler]] utilities for working with PDFs. It saves images from a Portable Document Format (PDF) file as Portable Pixmap (PPM), Portable Bitmap (PBM), [[Portable Networks Graphics|Portable Network Graphics]] (PNG), Tagged Image File Format (TIFF), JPEG, JPEG2000, or JBIG2 files.

Manual: https://manpages.debian.org/testing/poppler-utils/pdfimages.1.en.html

== Syntax ==


<syntaxhighlight lang="bash">
# The second argument is an image root (filename prefix), not a directory
pdfimages [options] /path/to/file.pdf /path/to/output/image-root
</syntaxhighlight>
 
To extract every image from bar.pdf and save the results as /tmp/image-000.ppm, /tmp/image-001.ppm, and so on, enter:
 
<syntaxhighlight lang="bash" line>
pdfimages bar.pdf /tmp/image
ls /tmp/image*
</syntaxhighlight>
 
A sample output:
 
<syntaxhighlight lang="bash">
image-000.ppm  image-1025.ppm  image-1140.ppm  image-1256.ppm  image-247.ppm  image-374.ppm  image-501.ppm  image-628.ppm  image-755.ppm
image-001.ppm  image-1026.ppm  image-1141.ppm  image-1257.ppm  image-248.ppm  image-375.ppm  image-502.ppm  image-629.ppm  image-756.ppm
image-002.ppm  image-1027.ppm  image-1142.ppm  image-1258.ppm  image-249.ppm  image-376.ppm  image-503.ppm  image-630.ppm  image-757.ppm
</syntaxhighlight>
 
Normally, all images are written as PBM (for monochrome images) or PPM (for non-monochrome images) files. With the <code>-j</code> option, images in DCT format are saved as JPEG files. All non-DCT images are saved in PBM/PPM format as usual:
 
<syntaxhighlight lang="bash">
pdfimages -j bar.pdf /tmp/image
</syntaxhighlight>
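
The other output formats listed above are selected with further flags. The following is a minimal sketch, assuming the <code>-list</code> and <code>-all</code> options described in the manual; the variable names and the availability check are illustrative additions, and bar.pdf is the same example file as above.

```shell
pdf=bar.pdf          # example input file, as in the commands above
root=/tmp/image      # image root (filename prefix), not a directory

# Skip the extraction when the tool or the input file is missing
if command -v pdfimages >/dev/null 2>&1 && [ -f "$pdf" ]; then
    # List the embedded images without writing any files
    pdfimages -list "$pdf"
    # Write JPEG, JPEG2000, JBIG2 and CCITT images in their native format;
    # other images are written as PNG (or TIFF for CMYK)
    pdfimages -all "$pdf" "$root"
else
    echo "pdfimages or $pdf not found; nothing extracted" >&2
fi
```

Replacing <code>-all</code> with <code>-png</code> changes only the default output format from PBM/PPM to PNG.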
 
== Options ==
 
The <code>-f</code> option specifies the first page to scan. To start scanning at page 5 and continue to the end of the document, enter:
 
<syntaxhighlight lang="bash">
pdfimages -j -f 5 bar.pdf /tmp/image
</syntaxhighlight>
 
The <code>-l</code> option specifies the last page to scan. To scan only pages 1 through 5, enter:
 
<syntaxhighlight lang="bash">
pdfimages -j -l 5 bar.pdf /tmp/image
</syntaxhighlight>
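
The two options can be combined to scan a page range. The sketch below wraps this in a small shell function; <code>pdfimages_range</code> and the <code>DRY_RUN</code> variable are hypothetical names introduced here for illustration and are not part of Poppler.

```shell
# Hypothetical helper, not part of Poppler: extract images from pages FIRST..LAST.
# When DRY_RUN is set, the command is printed instead of executed.
pdfimages_range() {
    first=$1 last=$2 pdf=$3 root=$4
    # -f and -l are both inclusive; -j keeps DCT images as JPEG files
    set -- pdfimages -j -f "$first" -l "$last" "$pdf" "$root"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command for pages 3 through 5 of bar.pdf
DRY_RUN=1 pdfimages_range 3 5 bar.pdf /tmp/image
# prints: pdfimages -j -f 3 -l 5 bar.pdf /tmp/image
```

Calling the function without <code>DRY_RUN</code> runs the command directly.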


[[Category:Tools]]

Latest revision as of 12:09, 8 December 2021
