I'm trying to use Ghostscript and/or ImageMagick to convert each page of a PostScript document into a PNG image. The problem is that both produce images that are far too saturated (I think that's the right terminology).
Here are the commands I'm trying:
gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -dGraphicsAlphaBits=4 -sOutputFile=page_%02d.png brochure.ps
convert brochure.ps im_page_%02d.png
This is the input PostScript file (brochure.ps from above).
Here's a couple of the output images I'm getting:
Page 1
Page 6
As you can see (especially on the page with the big green map of New Hampshire), the colors of the output PNGs are too bright/saturated. How can I prevent the colors from being changed so much and get a more accurate conversion?
Preview in OS X 10.6 automatically does a very accurate conversion to PNG when you open a PostScript file in it. This leads me to believe there is just something screwy with the way Ghostscript converts ps->png (I'm fairly confident ImageMagick is just a wrapper for Ghostscript for this operation). Is there a tool besides Ghostscript I should be using instead?
Note: As pipitas points out below, the visible difference of colors varies by OS. It is very obvious in OS X 10.6, but apparently not very noticeable in Windows XP.
You are right that ImageMagick is just a wrapper for Ghostscript when converting from PostScript or PDF to an image format.
I think this problem can only be solved to anybody's satisfaction once the efforts to add support for ICC profile handling and color management (currently underway) are completed for Ghostscript (design document as PDF). That point in time is close, however. If I understand recent commits to http://svn.ghostscript.com/trunk/ correctly, the next release (which will be dubbed 9.00 and is hopefully out in August) will include support for color management via LittleCMS. Yay!
OS X 10.4 and up provide sips (scriptable image processing system), which works well with the PDF format. Perhaps it can be a temporary solution until Ghostscript supports color management.
We have many square EPS images, which we would like to export via script to PNG at very specific formats/sizes, namely
8192x8192, greyscale, no alpha, no anti-aliasing
2048x2048, greyscale, no alpha, anti-aliased.
We have had no luck scripting the "professional" tools Photoshop or Illustrator to do this. We can do it through the UI, but their weak scripting support gives no control over alpha or the precise export size, so we either always get alpha in the large images or sometimes get slightly inaccurate image sizes, which breaks subsequent algorithms.
Our first attempt at doing the high resolution version of this was:
gs -sDEVICE=pnggray -o cover.png -dDEVICEWIDTHPOINTS=8192 -dDEVICEHEIGHTPOINTS=8192 -dGraphicsAlphaBits=1 -dPDFFitPage=true cover.eps
However, this does not seem to resize the image to fill the box as expected.
Is there a way, given a square EPS, to get Ghostscript to do what we want?
Your problem with EPS files is that they do not request a media size. That's because EPS files are intended to be included in other PostScript programs, so they need to be resized by the application generating the PostScript.
To that end, EPS files include comments (which are ignored by PostScript interpreters) which define the BoundingBox of the EPS. An application which places EPS can quickly scan the EPS to find this information, then it sets the CTM appropriately in the final PostScript program it is creating and inserts the content of the EPS.
The FitPage switch in Ghostscript relies on having a known media size (and you should set -dFIXEDMEDIA when using this) and a requested media size, figuring out what scale factor to apply to the request in order to make it fit the actual size, and setting up the CTM to apply that scaling.
If you don't ever get a media size request (which you won't with an EPS) then no scaling will take place.
Now Ghostscript does have a different switch, EPSCrop, which picks up the comments from the EPS and uses them to set the media size (Ghostscript has mechanisms to permit processing of comments for this reason, amongst others). You could implement a similar mechanism to pick up the BoundingBox comments and scale the EPS so that it fits a desired target media size.
I could probably knock something up, but I'd have to mess around creating an example file to work from...
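As a rough illustration of the mechanism described above (not the answerer's code), here is a minimal Python sketch that reads the %%BoundingBox comment from an EPS and works out the uniform scale and translation needed to fit it, centered, into a target media size in points. The function names parse_bounding_box and fit_transform are made up for this example, and it assumes an integer %%BoundingBox rather than %%HiResBoundingBox.

```python
import re


def parse_bounding_box(eps_text):
    """Return (llx, lly, urx, ury) from the first numeric %%BoundingBox comment.

    A header line of the form "%%BoundingBox: (atend)" has no numbers, so it is
    skipped and the numeric comment in the trailer is picked up instead.
    """
    match = re.search(
        r"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)",
        eps_text,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no numeric %%BoundingBox comment found")
    return tuple(int(group) for group in match.groups())


def fit_transform(bbox, target_width, target_height):
    """Return (scale, tx, ty) mapping bbox into the target media, centered.

    A point p in EPS coordinates ends up at scale * p + (tx, ty), which is
    what "tx ty translate scale scale scale" would do in PostScript.
    """
    llx, lly, urx, ury = bbox
    width, height = urx - llx, ury - lly
    scale = min(target_width / width, target_height / height)
    tx = (target_width - width * scale) / 2 - llx * scale
    ty = (target_height - height * scale) / 2 - lly * scale
    return scale, tx, ty
```

For a square 512x512 point EPS and an 8192x8192 point target, this gives a scale of 16 with no translation.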
Do not accidentally specify PDFFitPage in the command line above. Specify EPSFitPage when dealing with EPS files. PDFFitPage will silently do nothing.
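Putting the advice in this thread together, a command line could be assembled roughly as in the Python sketch below. This is an untested assumption about how the pieces fit, not a verified recipe: it fixes the page at the requested pixel size with -g at 72 dpi, adds -dFIXEDMEDIA and -dEPSFitPage, and uses the AlphaBits switches to turn anti-aliasing on or off. The helper name gs_eps_to_png_args is made up for this example.

```python
def gs_eps_to_png_args(eps_path, png_path, pixels, antialias):
    """Build a Ghostscript argument list for a square greyscale PNG.

    Assumption: with -r72, one pixel equals one point, so -g<N>x<N> yields
    an N x N point page for -dEPSFitPage to scale the EPS into.
    """
    alpha_bits = 4 if antialias else 1  # 1 disables anti-aliasing
    return [
        "gs", "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=pnggray",              # 8-bit greyscale, no alpha channel
        "-r72",
        f"-g{pixels}x{pixels}",
        "-dFIXEDMEDIA",
        "-dEPSFitPage",                  # EPS counterpart of -dPDFFitPage
        f"-dGraphicsAlphaBits={alpha_bits}",
        f"-dTextAlphaBits={alpha_bits}",
        f"-sOutputFile={png_path}",
        eps_path,
    ]
```

The list can be handed to subprocess.run(args, check=True); the two sizes from the question would be gs_eps_to_png_args("cover.eps", "cover.png", 8192, False) and gs_eps_to_png_args("cover.eps", "cover_small.png", 2048, True), where cover_small.png is just a placeholder name.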
I am attempting to make a very simple label using LibreOffice Draw v4.0.2.2. The label has not much more to it than regularly spaced lines of centered text.
This image will be printed, and I have a fixed size/ppi requirement to ensure appropriate print quality.
I set the page size to my specs and lay out the text as I desire. The print shop takes several image formats, including .tiff and .png. When I export the image, a dialog pops up that asks for the image size/resolution. The given ppi is very low (~40) and I require a minimum of 180 ppi. When I enter this, the image size adjusts itself and results in an image that is far too small.
The only solution that appears to be viable is to explode the page size and the drawing text size so it gets shrunk upon export. This is a very imprecise and illogical feature (bug?) of the program that I really wish is a result of my ignorance.
I found a thread in the mailing list which describes this issue exactly. The only answer that is given is essentially "yes, this is ridiculous and doesn't help anybody".
Can anyone give some advice on this? Or at least shed some light on who might need this "feature"?
There is something off about the Export tool of LibreOffice in general. It has been broken for years. Taking a screenshot is an alternative, but obviously you cannot control the resolution.
So, a better workaround is to export to SVG and then convert the SVG to PNG with Inkscape. Once Inkscape is installed, convert the file with the following command:
inkscape -z -e out.png -w 1024 in.svg
If you are on Windows (x64), you will need to give the full path:
"C:/Program Files/Inkscape/inkscape.exe" -z -e out.png -w 1024 in.svg
If you installed the 32-bit version, this should work:
"C:/Program Files (x86)/Inkscape/inkscape.exe" -z -e out.png -w 1024 in.svg
This can be done from inside LibreOffice; there is no need to use any external tool. The Export dialog is very confusing, yes. You have to realize that size and resolution can both be set independently.
Select File -> Export -> choose the desired format. The export dialog should appear.
TAKE NOTE of Width and Height. Set the desired resolution; Width and Height will change when you do. Don't worry, just restore Width and Height to the values you noted. And that's it. You get a high resolution image with the desired size and DPI.
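To sanity-check the exported file, the expected pixel dimensions follow directly from the physical size and the resolution. A minimal Python sketch; the 4 x 6 inch label is a made-up example, not a value from the question:

```python
def export_pixels(width_in, height_in, ppi):
    """Pixel dimensions needed to print width_in x height_in inches at ppi."""
    return round(width_in * ppi), round(height_in * ppi)


def effective_ppi(pixels, inches):
    """Resolution you actually get when `pixels` are printed across `inches`."""
    return pixels / inches


# Hypothetical 4 x 6 inch label at the required minimum of 180 ppi.
print(export_pixels(4, 6, 180))
```

If the exported image has fewer pixels than this, the print resolution is below the requirement, whatever DPI value the dialog shows.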
LibreOffice Draw (the one I'm using anyway) is a vector drawing app - have you asked the print shop if they can use vector formats like EPS or PDF? Most should be able to, in my experience. Then resolution becomes irrelevant.
-Terry
According to the MATLAB manual, when you save a figure using print or by choosing file|save, if you choose the painters renderer and save to PDF or EPS vector formats, all fonts get substituted. Is there a way to get around this limitation?
Whenever I output a figure, whether I use print or export_fig, the fonts get substituted, and so they no longer match the fonts in the document that I plan on putting the figure into. I would prefer to keep them in a vector format, because I use LaTeX and so I want to be able to use the same figures in my documents as in my beamer presentations and have them scale nicely without bloating the file size.
If I'm reading that link correctly, not all fonts get substituted. From 'Choosing a Printer Driver':
The table below lists the fonts supported by the MATLAB PostScript and Ghostscript drivers when generated with the Painters renderer (fully vectorized output). This same set of fonts is supported on both Windows and UNIX:
AvantGarde
Helvetica-Narrow
Times-Roman
Bookman
NewCenturySchlbk
ZapfChancery
Courier
Palatino
ZapfDingbats
Helvetica
Symbol
So, if you use one of the above fonts, the output vector-format figure should maintain the correct font. See for example:
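% List the fonts available on this system
list_fonts = listfonts
% Use the Painters renderer for fully vectorized output
figure('renderer','painters')
plot(peaks)
xlabel('this font is Helvetica','fontname','Helvetica','fontsize',24)
% Keep the on-screen figure size when printing
set(gcf,'paperpositionmode','auto')
% Export as color EPS (Level 2)
print(gcf,'-depsc2','test1.eps')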
Which produces:
So, choose one of the fonts from the list above, and the font will be output correctly. Otherwise, change the font in your presentation to match one of the above fonts.
I have also run into this problem many times, and I have a simple but effective method that never fails me (on Windows; it needs GSview).
1) save fig as PDF
2) save PDF as ps
3) open ps with GSview, then click "File->PS to EPS", specify a file name and done.
Hope this helps.
While trying to run this command:
tesseract bond111.tif bond111 batch.nochop makebox
I get the following error:
Error in pixReadFromTiffStream: spp not in set {1,3}
Error in pixReadStreamTiff: pix not read
Error in pixReadTiff: pix not read
Assuming that spp not in set is the main error here, what does it mean?
At first it had trouble because the bpp was higher than 24, so I reduced it using GIMP, but that did not resolve the issue.
It probably means your TIFF image has an alpha channel, which the underlying Leptonica library used by Tesseract doesn't support. If you're using ImageMagick, be aware that operations such as -draw can cause alpha channels to be added. If you're using convert in your workflow and want to remove the channel again immediately, flatten the image before writing by adding -background white -flatten +matte before the output filename, e.g.:
convert input.tiff -fill white -draw 'rectangle 10,10 20,20' -background white -flatten +matte output.tiff
Tesseract (well, Leptonica) accepts PNGs these days and is less picky about them, so it might be easier to migrate your workflow to PNG anyway.
Sources: magick-users mailing list posting; tesseract-ocr mailing list posting
Thanks for your post, ZakW; you pointed me in the right direction.
Anyhow, I also needed to set '-depth 8'. Even then, the quality was not good enough for OCR, whatever I tried.
What worked for me is this solution:
ghostscript -o document.tiff -sDEVICE=tiffgray -r720x720 -g6120x7920 -sCompression=lzw document.pdf
tesseract document.tiff document -l deu
vim document.txt
This way I got perfect text, including umlauts, in German.
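The -g value in the Ghostscript command above is the page size in pixels, so it has to match the paper size multiplied by the -r resolution: 6120x7920 at 720 dpi corresponds to 8.5 x 11 inches. A small Python sketch of that arithmetic; the helper name gs_geometry is made up for this example:

```python
def gs_geometry(width_in, height_in, dpi):
    """Return matching -r and -g switches for a page size given in inches."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return f"-r{dpi}x{dpi}", f"-g{width_px}x{height_px}"


# US Letter at 720 dpi, as in the command above.
print(gs_geometry(8.5, 11, 720))
```

For a different resolution, both switches need to change together, otherwise the page is cropped or padded.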
Changing the conversion to the following line helped me.
convert -density 300 input.pdf -depth 8 -background white -alpha Off output.tiff
Note that the other answers did not work for me since they use the deprecated +matte flag instead of -alpha Off.
You can try using the command 'tiffinfo' provided by libtiff_tools to verify the TIFF format of your source image. A number of TIFF formats exist, with different values for bits per pixel (bpp) and samples per pixel (spp).
In my case the error was:
Error in pixReadFromTiffStream: spp not in set {1,3,4}
An 'spp' value of 2 (for example greyscale plus alpha) is not accepted by Leptonica's TIFF reader.
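If tiffinfo is not at hand, the samples-per-pixel value can also be read straight from the file header. The following is a minimal Python sketch using only the standard library. It assumes a classic (non-BigTIFF) file, looks only at the first image directory, and assumes the SamplesPerPixel tag (number 277) is stored as a single SHORT, which is the usual case. The function name tiff_samples_per_pixel is made up for this example.

```python
import struct

SAMPLES_PER_PIXEL_TAG = 277  # tag number defined by the TIFF 6.0 specification


def tiff_samples_per_pixel(data):
    """Return SamplesPerPixel from the first IFD of a classic TIFF.

    Returns 1, the TIFF default, when the tag is absent.
    """
    if data[:2] == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for index in range(entry_count):
        entry_offset = ifd_offset + 2 + 12 * index  # each IFD entry is 12 bytes
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry_offset)
        if tag == SAMPLES_PER_PIXEL_TAG:
            # A single SHORT is stored inline, left-justified in the value field.
            (value,) = struct.unpack_from(endian + "H", data, entry_offset + 8)
            return value
    return 1
```

Typical use would be tiff_samples_per_pixel(open("bond111.tif", "rb").read()); a result outside the set named in the error message means the image needs converting (for example, by dropping the alpha channel) before Tesseract will read it.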
I solved the problem by saving directly to TIFF format from GIMP, instead of converting from .png to .tif using ImageMagick's 'convert'.
See also: TIFF format
I have a bunch of diagrams created using a Java diagramming tool that I wrote - they are mostly black and white diagrams, with the blocks in aqua, and occasional other colours. They are currently being saved as JPG files, and I want to insert them into a book that I am preparing for Print On Demand.
The book is an OpenOffice ODT file, which will later be converted to a PDF.
Currently I use JPG files, but the print facility they use requires 300 DPI, so I modified my diagramming tool to set the xDensity and yDensity to 300, and resUnits to 1, using getAsTree(), and then expand the diagram by a factor of about 3 (300/96 is roughly 3.1). IMO the result looks pretty good!
Unfortunately, someone on another forum pointed out that line diagrams are "fuzzed" on JPG files, so suggested that I change over to PNG, or possibly BMP files, both of which ODT files allow to be inserted.
My problem is that BMPs don't seem to have a DPI, and PNGMetadata doesn't seem to support getAsTree(). Can someone point me in the right direction? Thanks.
I don't understand the getAsTree() part, but to answer the question in the title (setting the DPI for PNG files), you could use the ImageMagick convert tool:
convert -density 300 -units pixelsperinch infile.jpg outfile.png
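For background, a PNG does not store DPI directly: the resolution lives in an optional pHYs chunk, expressed in pixels per meter, so 300 DPI becomes 11811. The Python sketch below (not the asker's Java tool, and not what convert does internally) inserts such a chunk right after the IHDR header using only the standard library. It assumes the input PNG has no pHYs chunk yet, and the function names are made up for this example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def dpi_to_pixels_per_meter(dpi):
    """Convert dots per inch to the pixels-per-meter unit used by pHYs."""
    return round(dpi / 0.0254)


def make_chunk(chunk_type, payload):
    """Assemble a PNG chunk: length, type, payload, CRC over type + payload."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


def set_png_dpi(png, dpi):
    """Return a copy of `png` with a pHYs chunk for `dpi` placed after IHDR.

    Assumption: the input does not already contain a pHYs chunk.
    """
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # IHDR always comes first: 4-byte length, 4-byte type, 13-byte payload, 4-byte CRC.
    ihdr_end = 8 + 4 + 4 + 13 + 4
    ppm = dpi_to_pixels_per_meter(dpi)
    phys = make_chunk(b"pHYs", struct.pack(">IIB", ppm, ppm, 1))  # unit 1 = meter
    return png[:ihdr_end] + phys + png[ihdr_end:]
```

This changes only the metadata; the pixel data is untouched, so the diagram still has to be rendered at the larger pixel size for the 300 DPI value to be meaningful in print.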
PNG, BMP and dozens of other image formats don't apply lossy compression to your diagrams - lossy compression is probably what your commenter was getting at. JPEGs are great for photos but suck at diagrams.
You might want to look into SVG and other vector formats. Or, if your environment allows, export JPEGs at the lowest compression setting and convert them into another format for reproduction at 300 DPI.
Hope that helps!
I decided not to try to do this programmatically. Instead I create the original diagram in PNG, then convert to 300 DPI using IrfanView. IrfanView's batch capability lets me convert to 300 DPI, scale up to compensate, and set to grey scale, all in one operation - and on multiple files at a time. This seems to be the best solution - but thanks to everyone anyway!