iText modifies TIFF images when creating PDF

I know how to create PDF from TIFFs. My question is:
How can iText just embed the original TIFFs without modifying them?
I used document.add(img) (where img is the TIFF) to create a PDF. However, the TIFF was reduced in size: my original uncompressed b/w TIFF of 2.8 MB was recompressed as CCITT Group 4.
Does iText have a way to leave the TIFF unmodified?

Please consult ISO-32000-1. If you read this standard closely, you'll find references to TIFF in the context of LZW and Flate filters, but you'll discover that TIFF is not one of the available filters in PDF. Table 6 lists the options: ASCIIHexDecode, ASCII85Decode, LZWDecode, FlateDecode, RunLengthDecode, CCITTFaxDecode, JBIG2Decode, DCTDecode, JPXDecode and Crypt.
As TIFF is not supported in PDF, iText has no other option than to convert it into a format that is accepted, in your case CCITTFaxDecode.
If you really want to keep the TIFF as-is, you need to add it as an attachment. That's explained in my answer to this question: Attaching files to a PDF

Publication ready figures-Matlab [duplicate]

I want to create high-quality figures ready to be submitted for publication. What is the best font size, which file type should I save as (.fig, .eps, .png?), and generally what characteristics are required for a top-quality figure?
Depends on the paper editing software you use.
If you use LaTeX or Overleaf, exporting to PDF is the simplest option and gives the smallest files while keeping image quality as high as possible (i.e. vector format). My typical workflow is:
create the plot and set the font size to 18 or 20 using set(gca,'fontsize',..).
set(gcf,'paperpositionmode','auto'); this is important! It makes your figure what-you-see-is-what-you-get, so you can adjust figure sizes relative to the font size.
use Save As PDF or print -dpdf to export the figure.
call pdfcrop yourfigure.pdf to remove the margins. pdfcrop is available on Linux/Mac and also on Windows. If you can't install pdfcrop, you can also use Inkscape to fit the plot to the page.
I usually try to give the figure its final look using MATLAB commands (so it can be easily reproduced), but if there are changes that I can't make automatically, I open the PDF in Inkscape and edit it manually.
if the figure is a 3D surface/mesh rendering with transparency, I call print -dpng -r300 myfile.png to create a PNG bitmap directly. Exporting such figures to PDF generates huge files that render slowly.
once done, upload the PDF or PNG to Overleaf; the platform accepts these formats directly.
With older versions of MS Word, I used to export images to EPS and insert them to keep the highest quality possible. However, since 2017 this feature has been turned off:
https://support.microsoft.com/en-gb/office/support-for-eps-images-has-been-turned-off-in-office-a069d664-4bcf-415e-a1b5-cbb0c334a840
One can still modify a registry key to turn it back on.
Later versions of MS Office accept SVG, which you can export from MATLAB.

tesseract lower size of pdf output file

When processing scanned images, is there an option to output a PDF with low-resolution images and text?
The images in the PDF are so huge that the PDF grows to as much as 1 GB.
I am using a command like:
tesseract testing/eurotext.png testing/eurotext-eng -l eng pdf
Tesseract uses the provided image(s) to create the PDF without modifying them, so if your input image is big, the PDF will be big.
So you can:
Decrease the size of the input image (e.g. use TIFF with G4 compression, resize the image...)
Use tesseract to produce an hOCR file and create the PDF with some other tool (like hocr2pdf, hocr-pdf...)
Use a PDF compression tool (there are online tools and offline ones like pdfsizeopt)

How to load DICOM pixel data in browser preserving HU values?

I need to display DICOM images in a browser. This requires the DICOM to be converted to PNG (or another compatible format).
I also need to calculate some overlay pixels based on dynamic input from the user. On conversion to PNG, I get 4 values (Alpha, R, G, B), but I cannot use these values for my calculations. I need the original HU values from the DICOM images.
Is there any way that a PNG can contain the original DICOM values? I heard that this is possible using the monochrome 16-bit PNG format. How do we do that?
Alternatively, how to load DICOM pixel data in browser preserving HU values?
When you convert DICOM pixel data to a non-DICOM format like PNG, BMP, JPG, J2K etc., the data you are looking for is lost. You may research further whether the TIF format preserves the data and loads in a browser. I guess it will not.
I recommend avoiding this route. Instead, I suggest using the DICOM pixel data as-is in the browser. This can be achieved with a JavaScript DICOM toolkit for browsers, such as cornerstone. You may also look for another toolkit if one is available and suits you.
Note that this involves a learning curve. Explaining how it works would be too broad here.
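To make the loss concrete, here is a minimal Python sketch (not taken from cornerstone or any other toolkit). It assumes the usual DICOM relation HU = stored value * Rescale Slope + Rescale Intercept, and it assumes the conversion to an 8-bit image applies a linear display window. The slope, intercept, window settings and sample values are all hypothetical.

```python
def stored_to_hu(stored, slope=1.0, intercept=-1024.0):
    """Map a stored pixel value to Hounsfield units (HU = stored * slope + intercept)."""
    return stored * slope + intercept


def hu_to_display(hu, center=40.0, width=400.0):
    """Apply a simple linear window and quantize to an 8-bit grey level (0..255)."""
    low = center - width / 2.0
    high = center + width / 2.0
    clamped = min(max(hu, low), high)
    return int(round((clamped - low) / (high - low) * 255))


# Hypothetical stored values, not taken from any real scan.
stored_values = [0, 1024, 1064, 1065, 3000, 4000]
hu_values = [stored_to_hu(v) for v in stored_values]
grey_values = [hu_to_display(h) for h in hu_values]

for stored, hu, grey in zip(stored_values, hu_values, grey_values):
    print(stored, hu, grey)
```

Several distinct HU values end up on the same grey level (everything above the window saturates at 255), so the original values cannot be recovered from the 8-bit image alone. Keeping the raw stored values together with the slope and intercept avoids that.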

Artifacts appear using imread function from opencv

I use the imread function to read a JPEG file and save the RGB image in BMP format. Comparing the two files, I found artifacts, which I marked with green circles. The OpenCV version is 3.0. I compiled the libraries myself with SSE, SSE2 and SSE3 switched on (default setting). My OS is Windows 7 Professional. You can use the following image to check.
original jpeg image
saved bmp file
If I read the jpeg file in Matlab, the rgb image is correct. I save rgb image in png format in Matlab, read the png file using opencv and save the loaded image in bmp file. Everything is OK. It seems that there is a problem with jpeg decoder. The jpeg library used is libjpeg.lib.
Due to the size limit, I cut the patch from the second image.
You're always going to get some artifacts in JPEG. You can reduce the appearance of such artifacts by changing the quantization tables used (usually with loss of compression).
JPEG encoders often use a "quality" setting to change the quantization tables.
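As a rough illustration of how a quality setting maps to quantization tables, the following Python sketch mirrors the scaling rule used by libjpeg (quality below 50 gives a scale of 5000/quality, otherwise 200 - 2*quality, applied as a percentage to a base table). Only the first row of the standard luminance table is used, and the coefficient value 37 is an arbitrary example; other encoders may use different rules.

```python
# First row of the standard JPEG luminance quantization table.
BASE_ROW = [16, 11, 10, 16, 24, 40, 51, 61]


def quality_to_scale(quality):
    """Convert a 1..100 quality setting to a percentage scale factor."""
    quality = min(max(quality, 1), 100)
    if quality < 50:
        return 5000 // quality
    return 200 - 2 * quality


def scale_table(base, quality):
    """Scale the base quantization steps and clamp them to the baseline range 1..255."""
    scale = quality_to_scale(quality)
    return [min(max((q * scale + 50) // 100, 1), 255) for q in base]


def quantize_roundtrip(coeff, step):
    """Quantize a DCT coefficient with the given step and reconstruct it."""
    return int(round(coeff / step)) * step


for quality in (10, 50, 90):
    row = scale_table(BASE_ROW, quality)
    restored = quantize_roundtrip(37, row[0])
    print(quality, row, "37 ->", restored)
```

Larger steps (low quality) discard more of each coefficient, which is where visible artifacts come from; smaller steps (high quality) keep more detail at the cost of file size.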

Setting DPI for PNG files

I have a bunch of diagrams created using a Java diagramming tool that I wrote - they are mostly black and white diagrams, with the blocks in aqua, and occasional other colours. They are currently being saved as JPG files, and I want to insert them into a book that I am preparing for Print On Demand.
The book is an OpenOffice ODT file, which will later be converted to a PDF.
Currently I use JPG files, but the print facility they use requires 300 DPI, so I modified my diagramming tool to set the xDensity and yDensity to 300, and resUnits to 1, using getAsTree(), and then expand the diagram by a factor of about 3 (300/96 = 3.125). IMO the result looks pretty good!
Unfortunately, someone on another forum pointed out that line diagrams are "fuzzed" on JPG files, so suggested that I change over to PNG, or possibly BMP files, both of which ODT files allow to be inserted.
My problem is that BMPs don't seem to have a DPI, and PNGMetadata doesn't seem to support getAsTree(). Can someone point me in the right direction? Thanks.
I don't understand the getAsTree() part, but answering the question that appears in the title, setting dpi for PNG files, you could use the imagemagick convert tool:
convert -density 300 -units pixelsperinch infile.jpg outfile.png
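For a programmatic route, PNG stores its resolution in a pHYs chunk, expressed in pixels per metre rather than DPI (300 DPI is about 11811 pixels per metre). The Python sketch below is an illustration of that chunk layout, not the Java ImageIO approach the question asks about. The helper names (make_chunk, iter_chunks, set_png_dpi, tiny_png) are made up for this example, and the 1x1 test image is synthetic.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png_bytes):
    """Yield (type, data) for each chunk in a PNG byte string."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        data = png_bytes[pos + 8:pos + 8 + length]
        yield chunk_type, data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def set_png_dpi(png_bytes, dpi):
    """Return a copy of the PNG with a pHYs chunk declaring the given DPI."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pixels_per_metre = int(round(dpi / 0.0254))
    # Unit specifier 1 means the densities are in pixels per metre.
    phys = make_chunk(b"pHYs", struct.pack(">IIB", pixels_per_metre, pixels_per_metre, 1))
    out = [PNG_SIGNATURE]
    for chunk_type, data in iter_chunks(png_bytes):
        if chunk_type == b"pHYs":
            continue  # drop any existing density chunk
        out.append(make_chunk(chunk_type, data))
        if chunk_type == b"IHDR":
            out.append(phys)  # pHYs must come before the first IDAT chunk
    return b"".join(out)


def tiny_png():
    """Build a 1x1 8-bit greyscale PNG to exercise the function."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


stamped = set_png_dpi(tiny_png(), 300)
chunks = list(iter_chunks(stamped))
print([chunk_type for chunk_type, _ in chunks])
```

Setting the density only changes metadata. To keep the same physical size on paper, the pixel dimensions still have to be scaled separately, for example by 300/96 = 3.125 for a diagram drawn at 96 DPI.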
PNG, BMP and dozens of other image formats don't apply lossy compression to your diagrams; lossy compression is probably what your commenter was getting at. JPEGs are great for photos but suck at diagrams.
You might want to look into SVG and other vector formats. Or, if your environment allows it, export JPEGs with 0% compression and convert them into another format for lossless reproduction at 300 DPI.
Hope that helps!
I decided not to try to do this programmatically. Instead I create the original diagram in PNG, then convert to 300 DPI using Irfanview. Irfanview's batch capability lets me convert to 300 DPI, scale up to compensate, and set to grey scale, all in one operation - and on multiple files at a time. This seems to be the best solution - but thanks to everyone anyway!