How can I convert TIFF files of various dimensions to uniformly sized PDF files with page size 8.5″×11″ without losing quality? - perl

Currently I'm trying to use Perl/ImageMagick and/or Ghostscript to convert scanned text documents stored as TIFFs into an 8.5″×11″ (ANSI A “Letter” size) PDF file.
I've tried many of the ImageMagick filters with resize and still find that some files that were perfectly legible before are now illegible. Often these images are at 72 dpi, and when converted to 8.5″×11″ they end up at something like 612×792 pixels. The original was 1700×2200; as you can see, quite a lot of pixels are lost in the resize.
Should I be using something else besides resize? Could it be something like ImageMagick is reporting the image is 72 dpi when it's really something like 200 dpi? Would re-sampling the image into the highest dpi that would fit in the 8.5″×11″ area help?
Does anyone have any other options to ultimately create a PDF file with all pages being 8.5″×11″?

(Mantra: 'Use the right tool for the job...')
You probably shouldn't use ImageMagick for the job, but rather LibTIFF's tiff2pdf command-line utility:
tiff2pdf \
-z \
-o output.pdf \
-p letter \
-F \
input.tiff
-z is for (lossless) Zip/Flate compression.
-o defines the output filename.
-p sets the media size.
-F fills the page.
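To make the pixel arithmetic from the question concrete, here is a minimal Python sketch. The function names are made up for illustration and are not part of tiff2pdf or ImageMagick. It computes how many pixels a Letter page holds at a given resolution, and the lowest resolution at which a scan fits on the page without discarding any pixels.

```python
# Illustrative helpers only; the names are hypothetical.
PAGE_W_IN, PAGE_H_IN = 8.5, 11.0  # US Letter, in inches


def page_pixels(dpi):
    """Pixel dimensions of a Letter page rendered at `dpi`."""
    return round(PAGE_W_IN * dpi), round(PAGE_H_IN * dpi)


def fit_dpi(width_px, height_px):
    """Lowest resolution at which the whole image fits on the page
    while every original pixel is kept."""
    return max(width_px / PAGE_W_IN, height_px / PAGE_H_IN)


print(page_pixels(72))      # (612, 792): what a resize to 72 dpi leaves
print(fit_dpi(1700, 2200))  # 200.0: the scan is a full Letter page at 200 dpi
```

In other words, the 1700×2200 scan already is a Letter page at 200 dpi, so placing it on the page at that resolution, rather than resizing it to 612×792, keeps all of the pixels.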

Related

JPEG to PNG conversion with 300 DPI

Unable to convert a JPEG image into a 300 DPI PNG image using ImageMagick.
After conversion the PNG image is 72 DPI only. I'm using ImageMagick 6.9.0-0 Q16 x86 and Ghostscript v9.15.
Below is the line I use in my Perl script:
system("\"$imagemagick\" -set units PixelsPerInch -density 300 \"$jpg\" \"$png\"");
Adjusting the units and density will not alter the underlying image data; it only updates the metadata used by rendering libraries. That is important for vector-to-raster conversion, but not very useful for raster-to-raster. To adjust the DPI of an image, use the -resample operation.
convert source.jpg -resample 300 out.png
You can verify the DPI resolution with the following:
identify -format "%[resolution.x] %[resolution.y]\n" out.png
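The difference between the two operations can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not ImageMagick's code; the function names and the 720×540 example image are assumptions, and ImageMagick's exact rounding may differ.

```python
def printed_inches(width_px, dpi):
    """Physical width implied by a pixel count and a DPI tag."""
    return width_px / dpi


def resampled_size(width_px, height_px, old_dpi, new_dpi):
    """Pixel dimensions after resampling so the physical size is unchanged."""
    return (round(width_px * new_dpi / old_dpi),
            round(height_px * new_dpi / old_dpi))


# Hypothetical 720x540 image tagged as 72 dpi (10 inches wide).
# Changing only the density tag keeps 720 pixels but shrinks the print:
print(printed_inches(720, 72), printed_inches(720, 300))
# Resampling to 300 dpi changes the pixel count and keeps the 10 inches:
print(resampled_size(720, 540, 72, 300))
```

So a density change alone leaves the pixels untouched, while a resample creates new pixels by interpolation; it cannot add detail that was not in the JPEG.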
I'm wondering where the 72dpi is coming from. Assuming you are using X and some kind of Unix, ImageMagick defaults to using the screen resolution (72 dpi). I'm not sure what it does under OSX/XQuartz but it's likely similar. Is your screen resolution set to 72dpi (!?).
I'm with @emcconville and @ikegami - just do this straight from ImageMagick on the command line, passing the right options to be sure.
There are also image manipulation modules that you can use from Perl without having to resort to system commands, such as Imager::Transformations, Image::Magick, and GD. Here's how to convert with GD:
perl -MGD -E 'my $imgjpg = GD::Image->newFromJpeg("img.jpg") or die "cannot read img.jpg";
open my $imgpng, ">", "img.png" or die "cannot write img.png: $!"; binmode $imgpng; print $imgpng $imgjpg->png();'
With most image manipulation packages the original resolution should be maintained during conversion, though some (including GD) will default to a lower color depth (8 bit) unless passed a truecolor flag.
e.g. GD::Image->newFromJpeg("img.jpg", 1);

How to compress the image using gifsicle tool or any other tool on ubuntu

I have some images in GIF format that I want to compress, but the output is either the same size or barely 2-5% smaller. I need a higher compression ratio so that the web pages load faster. Currently I am using the gifsicle tool, but I see hardly any difference in the size of the generated GIF images.
I picked this tool up from Yahoo Smush.it.
gifsicle -O3 gifimage1.gif -o new-gifimage1.gif
UPDATE:
The giflossy fork has been merged into gifsicle now, so you will be able to use the --lossy flag with gifsicle now (with the latest version), no need to install giflossy separately.
If you want to enable lossy compression, which reduces the size considerably, you can use the giflossy fork of gifsicle. Once you have installed it, you can use the lossy option as below:
gifsicle -O3 --lossy=80 gifimage1.gif -o new-gifimage1.gif
Installation:
On Mac : brew install giflossy
On Unix : Building Gifsicle on UNIX
You can use the command below to tweak your compression options:
gifsicle -O3 --colors=64 --use-col=web --lossy=100 --scale 0.8 gifimage1.gif -o new-gifimage1.gif
Gifsicle's --optimize option will only attempt lossless reduction of an image's file size. What you probably have* is an animated gif where each frame contains random dithering, so most of the pixels will change from one frame to the next.
If your original GIF image had used pattern dithering, you would be able to compress it a lot more. But if that's not an option, I suggest you try either reducing the dimensions of the image (e.g., --scale 0.5), or reducing the number of colours in it (e.g., --colors 16).
* (I'm only guessing, since you didn't bother to share your image)
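As a rough illustration of why fewer colours and smaller dimensions help, the sketch below estimates the size of the raw palette-index data before GIF's LZW compression. The helper names and the 400×300, 20-frame example are assumptions, and the real file size depends on how well LZW compresses the frames, so treat the numbers as an indication only.

```python
def index_bits(colors):
    """Bits needed to index a palette with `colors` entries (2..256)."""
    return max(1, (colors - 1).bit_length())


def raw_index_bytes(width, height, frames, colors):
    """Uncompressed size of the palette indices for all frames, in bytes."""
    bits = width * height * frames * index_bits(colors)
    return (bits + 7) // 8


# Hypothetical 400x300 animation with 20 frames.
print(raw_index_bytes(400, 300, 20, 256))  # 2400000 with a full palette
print(raw_index_bytes(400, 300, 20, 16))   # 1200000 after --colors 16
print(raw_index_bytes(200, 150, 20, 256))  # 600000 after --scale 0.5
```

Halving both dimensions quarters the pixel count, which is why scaling usually has a larger effect than palette reduction alone.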

Libreoffice Draw Export Resolution makes no sense

I am attempting to make a very simple label using LibreOffice Draw v4.0.2.2. The label has not much more to it than regularly spaced lines of centered text.
This image will be printed, and I have a fixed size/ppi requirement to ensure appropriate print quality.
I set the page size to my specs and lay out the text as I desire. The print shop takes several image formats, including .tiff and .png. When I export the image, a dialog pops up that asks for the image size and resolution. The given ppi is very low (~40) and I require a minimum of 180 ppi. When I enter this, the image size adjusts itself and results in an image that is far too small.
The only solution that appears to be viable is to explode the page size and the drawing text size so it gets shrunk upon export. This is a very imprecise and illogical feature (bug?) of the program that I really wish is a result of my ignorance.
I found a thread in the mailing list which describes this issue exactly. The only answer that is given is essentially "yes, this is ridiculous and doesn't help anybody".
Can anyone give some advice to this? Or at least shed some light on who might need this "feature"?
There is something off about the Export tool of LibreOffice in general. It has been broken for years. Taking a screenshot is an alternative, but obviously you cannot control the resolution.
So a better workaround is to export to SVG and then convert the SVG to PNG with Inkscape. Once Inkscape is installed, convert the file with the following command:
inkscape -z -e out.png -w 1024 in.svg
If you are in Windows (x64), you will need to indicate the full path:
"C:/Program Files/Inkscape/inkscape.exe" -z -e out.png -w 1024 in.svg
If you install the 32 bit version, this should work:
"C:/Program Files (x86)/Inkscape/inkscape.exe" -z -e out.png -w 1024 in.svg
This can be done from inside LibreOffice; there is no need to use any external tool. The Export dialog is very confusing, yes; you have to realize that size and resolution can be set independently.
Select File -> Export -> choose the desired format. The export dialog should appear.
TAKE NOTE of Width and Height. Set the desired resolution; notice how Width and Height change (?). Don't worry, restore Width and Height to your saved values. And that's it. You get a high resolution image with the desired size and DPI.
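The relationship the dialog is juggling is simply pixels = physical size × ppi. A minimal Python sketch, assuming a hypothetical 4″ × 2.5″ label (the question does not give the real size):

```python
def export_pixels(width_in, height_in, ppi):
    """Pixel dimensions needed to print a given physical size at `ppi`."""
    return round(width_in * ppi), round(height_in * ppi)


# Hypothetical 4 x 2.5 inch label.
print(export_pixels(4, 2.5, 40))   # roughly what the dialog offers by default
print(export_pixels(4, 2.5, 180))  # what the 180 ppi minimum requires
```

The same width in pixels could also be passed to Inkscape's -w option if you go the SVG route instead.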
Libre Draw (the one I'm using anyway) is a vector drawing app - have you asked the print shop if they can use vector formats like eps, pdf? Most should be able to in my experience. Then resolution becomes irrelevant.
-Terry

Imagemagick command line, combine two different sized images

I'd like to use "convert" (or whatever) from Imagemagick to combine two different sized images. I'd like them to be aligned at the bottom left corners. For example, I have two images:
trans_alpha.png (a transparent 42x37 blank image)
and shadow.png (a 68x23 image, which I want overlaid on trans_alpha.png aligned at the bottom left)
The result I'd like would be a 68x37 image. NOTE: these sizes are examples only. I don't want to put the size into the command line; I just want to use the sizes from the input images.
I have tried a lot of combinations without success:
Attempt no. 776 (close, but aligned to top left, not bottom left)..:
convert trans_alpha.png -background none shadow.png -gravity SouthWest -layers merge +repage result.png
Attempt no. 841 (aligned correctly, but result image isn't wide enough)...
convert trans_alpha.png shadow.png -gravity SouthWest -composite result.png
Hopefully that makes sense.
Thanks,
Paul
In answer to my own question (courtesy of the clever people on www.imagemagick.org)
convert \
trans_alpha.png shadow.png \
-flip \
-background none \
-mosaic \
-flip \
result.png
ImageMagick includes many useful transformations, but occasionally it still lacks the one you need. Since your original images are lossless PNG bitmaps, you can convert both to long-form PBM or a related format like long-form PPM. The advantage of these forms is that they represent the entire image, pixel by pixel, in plain text, so one can write a program (usually a fairly short one) to process it any way one likes. As storage formats, PBM and PPM are egregiously inefficient, but they are likewise egregiously easy to manipulate, and that's what you want.
The pbm(5) manpage (available for example on Debian/Ubuntu systems in the netpbm package) is well written and explains the process clearly.
I am unable to test at the moment, but you can use -page with layers, so something like this might work. Note that -page only applies to images read after it, so it goes before shadow.png, and you need to calculate the Y offset (37 - 23 = 14 for the example sizes):
convert \
trans_alpha.png \
-background none \
-page +0+14 \
shadow.png \
-layers merge \
+repage \
result.png
You may not need the -background none.
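The Y offset needed for -page follows directly from the image sizes. A minimal Python sketch of the arithmetic (the function name is made up; in a script the sizes would come from something like identify rather than being typed in):

```python
def bottom_left_layout(sizes):
    """Given (width, height) pairs, return the canvas size and the
    top-left (x, y) offset that puts each image in the bottom-left corner."""
    canvas_w = max(w for w, _ in sizes)
    canvas_h = max(h for _, h in sizes)
    offsets = [(0, canvas_h - h) for _, h in sizes]
    return (canvas_w, canvas_h), offsets


# trans_alpha.png is 42x37, shadow.png is 68x23.
canvas, offsets = bottom_left_layout([(42, 37), (68, 23)])
print(canvas)   # (68, 37)
print(offsets)  # [(0, 0), (0, 14)]
```

For the example sizes, shadow.png therefore needs an offset of +0+14, and the merged canvas comes out at the desired 68x37.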

Tesseract and tiff format - spp not in set {1,3}

While trying to run this command:
tesseract bond111.tif bond111 batch.nochop makebox
I get the following error:
Error in pixReadFromTiffStream: spp not in set {1,3}
Error in pixReadStreamTiff: pix not read
Error in pixReadTiff: pix not read
Assuming that spp not in set is the main error here, what does it mean?
At first it had trouble because the bpp was higher than 24 so I reduced it using Gimp but that did not resolve the issue.
It probably means your TIFF image has an alpha channel, which the underlying Leptonica library used by Tesseract doesn't support. If you're using ImageMagick, be aware that operations such as -draw can cause alpha channels to be added. If you're using convert in your workflow and want to remove the channel again immediately, flatten the image before writing by adding -background white -flatten +matte before the output filename, e.g.:
convert input.tiff -fill white -draw 'rectangle 10,10 20,20' -background white -flatten +matte output.tiff
Tesseract (well, Leptonica) accepts PNGs these days and is less picky about them, so it might be easier to migrate your workflow to PNG anyway.
Sources: magick-users mailing list posting; tesseract-ocr mailing list posting
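For reference, "spp" is samples per pixel, i.e. the number of channels stored for each pixel. The Python sketch below lists the usual values and checks them against the set from the error message. It is an illustration only: the table and function are assumptions for this example, not Leptonica's code.

```python
# Typical samples-per-pixel values for common image layouts.
SAMPLES_PER_PIXEL = {
    "bilevel": 1,
    "grayscale": 1,
    "grayscale+alpha": 2,
    "rgb": 3,
    "rgba": 4,
}


def accepted(layout, supported=frozenset({1, 3})):
    """True if a reader that supports only `supported` spp values
    would accept an image with this layout."""
    return SAMPLES_PER_PIXEL[layout] in supported


for layout in SAMPLES_PER_PIXEL:
    print(layout, accepted(layout))
```

With the set {1,3} from the error above, any layout that carries an alpha channel is rejected, which is why flattening the image or switching alpha off makes the file readable.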
Thanks for your post, ZakW; you pointed me in the right direction.
Anyhow, I also needed to set '-depth 8'. Even so, the quality was not good enough for OCR, whatever I tried.
What worked for me is this solution:
ghostscript -o document.tiff -sDEVICE=tiffgray -r720x720 -g6120x7920 -sCompression=lzw document.pdf
tesseract document.tiff document -l deu
vim document.txt
This way I got perfect text with umlauts in German.
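The -r and -g values in that Ghostscript command are linked: -g is the page size in pixels and -r the resolution, so dividing one by the other gives the page size in inches. A quick Python check (the function name is made up for illustration):

```python
def page_inches(geometry_px, resolution_dpi):
    """Physical page size implied by a pixel geometry and a resolution."""
    w_px, h_px = geometry_px
    x_dpi, y_dpi = resolution_dpi
    return w_px / x_dpi, h_px / y_dpi


print(page_inches((6120, 7920), (720, 720)))  # (8.5, 11.0), i.e. US Letter
```

If you change the resolution, scale -g by the same factor, otherwise the page is cropped or padded.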
Changing the conversion to the following line helped me:
convert -density 300 input.pdf -depth 8 -background white -alpha Off output.tiff
Note that the other answers did not work for me since they use the deprecated +matte flag instead of -alpha Off.
You can try using the command 'tiffinfo' provided by libtiff_tools to verify the TIFF format of your src image. A number of TIFF formats exist, with different values for Bits-per-pixel (bpp) and Samples-per-pixel (spp).
Error in pixReadFromTiffStream: spp not in set {1,3,4}
An 'spp' value of 2 (typically grayscale plus an alpha channel) is not in the set that Leptonica accepts, so the image is rejected.
I solved the problem by saving directly to TIFF format from Gimp, instead of converting from .png to .tif using ImageMagick's 'convert'.
See also: TIFF format