PDF files are not compressing much in iTextSharp - iText

I am using iTextSharp to compress PDF files, but the output is barely smaller than the input (from 4.5 MB to 4.4 MB). Any suggestions on how the compression can be improved?
Here is my code:
string pdfFile = @"c:\Img_Comp\Bigfile.pdf";
PdfReader reader = new PdfReader(pdfFile);
PdfStamper stamper = new PdfStamper(reader, new FileStream(@"c:\Img_Comp\Small.pdf", FileMode.CreateNew), PdfWriter.VERSION_1_5);
stamper.FormFlattening = true;
stamper.Writer.CompressionLevel = PdfStream.BEST_COMPRESSION;
stamper.SetFullCompression();
stamper.Close();

Related

How to convert PowerPoint (.ppt) to PDF in Java

Only the last slide is converted; that is, the last slide overlaps every other slide. Can anyone suggest how to combine all the slides into one PDF?
I have tried different approaches, but they all create an image first and then the PDF.
FileInputStream inputStream = new FileInputStream(in);
SlideShow ppt = new SlideShow(inputStream);
inputStream.close();
Dimension pgsize = ppt.getPageSize();
Document document = new Document();
PdfWriter pdfWriter = PdfWriter.getInstance(document, new FileOutputStream(out));
document.setPageSize(new Rectangle(
        (float) pgsize.getWidth(), (float) pgsize.getHeight()));
document.open();
PdfGraphics2D graphics = null;
for (int i = 0; i < ppt.getSlides().length; i++) {
    Slide slide = ppt.getSlides()[i];
    graphics = (PdfGraphics2D) pdfWriter.getDirectContent()
            .createGraphics((float) pgsize.getWidth(), (float) pgsize.getHeight());
    slide.draw(graphics);
}
graphics.dispose();
document.close();

Issues converting certain TIF compressions to PDF using iTextSharp

I am using iTextSharp to convert and stitch single-page TIF files into a multi-page PDF file. The single-page TIF files have different bit depths and compressions.
Here is the code:
private void button1_Click(object sender, EventArgs e)
{
    List<string> TIFfiles = new List<string>();
    Document document;
    PdfWriter pdfwriter;
    Bitmap tifFile;
    pdfFilename = <file path>.PDF;
    TIFfiles = <load the path to each TIF file in this array>;
    //Create document
    document = new Document();
    // creation of the different writers
    pdfwriter = PdfWriter.GetInstance(document, new System.IO.FileStream(pdfFilename, FileMode.Create));
    document.Open();
    document.SetMargins(0, 0, 0, 0);
    foreach (string file in TIFfiles)
    {
        //load the tiff image
        tifFile = new Bitmap(file);
        //Total number of pages
        iTextSharp.text.Rectangle pgSize = new iTextSharp.text.Rectangle(tifFile.Width, tifFile.Height);
        document.SetPageSize(pgSize);
        document.NewPage();
        PdfContentByte cb = pdfwriter.DirectContent;
        tifFile.SelectActiveFrame(FrameDimension.Page, 0);
        iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(tifFile, ImageFormat.Tiff);
        // scale the image to fit in the page
        img.SetAbsolutePosition(0, 0);
        cb.AddImage(img);
    }
    document.Close();
}
This code works well: it stitches the TIFs and converts them to PDF. The issue is the processing time and the size of the PDF it creates for certain types of TIF.
For example:
Original TIF --> B&W/Bit depth 1/Compression CCITT T.6 --> Faster processing, PDF file size is ~1.1x the TIF file size.
Original TIF --> Color/Bit depth 8/Compression LZW --> Faster processing, PDF file size is ~1.1x the TIF file size.
Original TIF --> Color/Bit depth 24/Compression JPEG --> Slow processing, PDF file size is ~12.5x the TIF file size.
Why doesn't converting Color/Bit depth 24/Compression JPEG files give a similar result to the other TIF files?
Moreover, this issue occurs only with iTextSharp. I had a colleague test the same set of TIF samples using Java iText, and the resulting PDF was smaller (1.1x) and was produced faster.
Unfortunately, I need to use .NET for this TIF-to-PDF conversion, so I am stuck with iTextSharp.
Any ideas or suggestions on how to get those JPEG-compressed TIF files to produce PDFs as small as the other TIF compressions do?
Appreciate your help!
Regards,
AG
I was able to reproduce your problem with the code you supplied, but found that the problem went away once I loaded the file directly with Image.GetInstance instead of going through the Bitmap used in your sample. With the code below, the file size and run time were the same between Java and C#. If you have any questions about the sample, don't hesitate to ask.
List<string> TIFfiles = new List<string>();
Document document;
PdfWriter pdfwriter;
iTextSharp.text.Image tifFile;
String pdfFilename = pdfFile;
TIFfiles = new List<string>();
TIFfiles.Add(tifFile1);
TIFfiles.Add(tifFile2);
TIFfiles.Add(tifFile3);
TIFfiles.Add(tifFile4);
TIFfiles.Add(tifFile5);
TIFfiles.Add(tifFile6);
TIFfiles.Add(tifFile7);
//Create document
document = new Document();
// creation of the different writers
pdfwriter = PdfWriter.GetInstance(document, new System.IO.FileStream(pdfFilename, FileMode.Create));
document.Open();
document.SetMargins(0, 0, 0, 0);
int i = 0;
while (i < 50)
{
    foreach (string file in TIFfiles)
    {
        //load the tiff image
        tifFile = iTextSharp.text.Image.GetInstance(file);
        //Total number of pages
        iTextSharp.text.Rectangle pgSize = new iTextSharp.text.Rectangle(tifFile.Width, tifFile.Height);
        document.SetPageSize(pgSize);
        document.NewPage();
        PdfContentByte cb = pdfwriter.DirectContent;
        // scale the image to fit in the page
        tifFile.SetAbsolutePosition(0, 0);
        cb.AddImage(tifFile);
    }
    i++;
}
document.Close();

Remove unused image objects

I have PDF files that are created with a composition tool to produce financial statements.
Each file has around 5,000 to 10,000 pages and uses global image resources to maximise space efficiency.
These statements include many marketing images (about 3 MB worth), but not every statement uses all of them.
When I extract a blank page from the start of the PDF file, either with a tool that has been developed for this purpose or with Adobe Acrobat for testing, the resulting PDF is around 3 MB. Auditing the space usage shows that it consists of 3 MB of images.
Using iTextSharp (latest, 5.4.4), I have attempted to iterate through each page and copy it to a writer, calling reader.RemoveUnusedObjects, but this does not reduce the size.
I also found another example that uses a PdfStamper and tried the same thing, with the same result.
I've also tried setting maximum compression and SetFullCompression. Neither made any difference.
Can anyone give me any pointers on what I might do? I'm hoping I can do this as a simple exercise and not have to parse the objects in the PDF file and manually remove the unused ones.
Code below:
iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(inputFile);
iTextSharp.text.Document document = new iTextSharp.text.Document(reader.GetPageSizeWithRotation(1));
// step 2: we create a writer that listens to the document
// step 3: we open the document
iTextSharp.text.pdf.PdfCopy pdfCpy = new iTextSharp.text.pdf.PdfCopy(document, new System.IO.FileStream(outputFile, System.IO.FileMode.Create));
document.Open();
iTextSharp.text.pdf.PdfContentByte cb = pdfCpy.DirectContent;
//pdfCpy.NewPage();
int objects = reader.RemoveUnusedObjects();
reader.RemoveFields();
reader.RemoveAnnotations();
// we retrieve the total number of pages
int numberofPages = reader.NumberOfPages;
int i = 0;
while (i < numberofPages)
{
    i++;
    document.SetPageSize(reader.GetPageSizeWithRotation(i));
    document.NewPage();
    iTextSharp.text.pdf.PdfImportedPage page = pdfCpy.GetImportedPage(reader, i);
    pdfCpy.SetFullCompression();
    reader.RemoveUnusedObjects();
    reader.RemoveFields();
    reader.RemoveAnnotations();
    int rotation = reader.GetPageRotation(i);
    if (rotation == 90 || rotation == 270)
    {
        cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height);
    }
    else
    {
        cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
    }
    pdfCpy.AddPage(page);
}
pdfCpy.NewPage();
pdfCpy.Add(new iTextSharp.text.Paragraph("This is added text"));
document.Close();
pdfCpy.CompressionLevel = iTextSharp.text.pdf.PdfStream.BEST_COMPRESSION;
pdfCpy.Close();
reader.Close();
Stamper example:
iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(inputFile);
using (FileStream fs = new FileStream(outputFile + ".2", FileMode.Create))
{
    iTextSharp.text.pdf.PdfStamper stamper = new iTextSharp.text.pdf.PdfStamper(reader, fs, iTextSharp.text.pdf.PdfWriter.VERSION_1_5);
    iTextSharp.text.pdf.PdfWriter writer = stamper.Writer;
    writer.SetPdfVersion(iTextSharp.text.pdf.PdfWriter.PDF_VERSION_1_5);
    writer.CompressionLevel = iTextSharp.text.pdf.PdfStream.BEST_COMPRESSION;
    reader.RemoveFields();
    reader.RemoveUnusedObjects();
    stamper.Reader.RemoveUnusedObjects();
    stamper.SetFullCompression();
    stamper.Writer.SetFullCompression();
    stamper.Close();
}
reader.Close();
Try using iTextSharp.text.pdf.PdfSmartCopy instead of PdfCopy.
For me it reduced a PDF of ~43 MB to ~4 MB.
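A minimal sketch of that suggestion, assuming iTextSharp 5.x. The class and method names (SmartCopySketch, CopyPages) are hypothetical and the paths come from the caller. PdfSmartCopy works like PdfCopy but keeps track of streams it has already written, so a resource shared by many pages can be reused rather than written again. How much this saves depends on the file.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class SmartCopySketch
{
    // Hypothetical helper: copies every page of inputFile to outputFile
    // through PdfSmartCopy instead of PdfCopy.
    public static void CopyPages(string inputFile, string outputFile)
    {
        PdfReader reader = new PdfReader(inputFile);
        try
        {
            using (FileStream fs = new FileStream(outputFile, FileMode.Create))
            {
                Document document = new Document();
                PdfSmartCopy copy = new PdfSmartCopy(document, fs);
                document.Open();

                // Page numbers in iTextSharp are 1-based.
                for (int pageNumber = 1; pageNumber <= reader.NumberOfPages; pageNumber++)
                {
                    PdfImportedPage page = copy.GetImportedPage(reader, pageNumber);
                    copy.AddPage(page);
                }

                // Closing the document finishes the copy and flushes the output.
                document.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Called as SmartCopySketch.CopyPages(inputFile, outputFile), this replaces the whole PdfCopy loop above; the AddTemplate calls are not needed because AddPage already copies the page content.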

iText - Add text to existing landscape document

In my project, I open existing PDFs and add text to them at specific positions. This worked well until today, when I got a landscape file. This is my code:
outputStream = new FileOutputStream(file + "_out.pdf");
PdfReader reader = new PdfReader(file);
PdfStamper stamper = new PdfStamper(reader, outputStream);
stamper.setRotateContents(true);
PdfContentByte canvas = stamper.getOverContent(1);
ColumnText.showTextAligned(canvas,
Element.ALIGN_LEFT, new Phrase("Text"), xPosition, yPosition, 0);
stamper.close();
reader.close();
outputStream.close();
When I opened the newly created file, the content is shown in portrait mode.
How do I have to change the code in order to get the file in landscape mode, together with the stamped text?
The file in question: http://www.share-online.biz/dl/RO1IPFVM0JX

export to pdf using itextsharp

I have an .aspx page with a panel, some controls, and one GridView. I am trying to export that panel to PDF, but the downloaded PDF is empty. I don't know why; could someone explain?
Here's the code I was trying:
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment;filename=UserDetails.pdf");
Response.Cache.SetCacheability(HttpCacheability.NoCache);
StringWriter sw = new StringWriter();
HtmlTextWriter hw = new HtmlTextWriter(sw);
this.pnlprint.RenderControl(hw);
StringReader sr = new StringReader(sw.ToString());
Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 100f, 0f);
HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
pdfDoc.Open();
htmlparser.Parse(sr);
pdfDoc.Close();
Response.Write(pdfDoc);
Response.End();
Try changing your:
Response.Write(pdfDoc);
to:
Response.BinaryWrite(pdfDoc.ToArray());
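As far as I know, iTextSharp's Document class does not have a ToArray() method, so the line above may not compile as written. A common variant is to let PdfWriter write into a MemoryStream and send that stream's bytes instead. The sketch below assumes iTextSharp 5.x with HTMLWorker and classic ASP.NET (System.Web); the names PdfExportSketch, RenderHtmlToPdf and SendPdf are hypothetical.

```csharp
using System.IO;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

public static class PdfExportSketch
{
    // Hypothetical helper: renders an HTML fragment to PDF bytes in memory.
    public static byte[] RenderHtmlToPdf(string html)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 100f, 0f);
            // The writer must be attached before the document is opened.
            PdfWriter.GetInstance(pdfDoc, ms);
            pdfDoc.Open();

            HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
            using (StringReader sr = new StringReader(html))
            {
                htmlparser.Parse(sr);
            }

            // Closing the document finalizes the PDF. MemoryStream.ToArray()
            // still works after the stream has been closed.
            pdfDoc.Close();
            return ms.ToArray();
        }
    }

    // Hypothetical helper: sends the bytes as a file download.
    public static void SendPdf(HttpResponse response, byte[] pdfBytes)
    {
        response.ContentType = "application/pdf";
        response.AddHeader("content-disposition", "attachment;filename=UserDetails.pdf");
        response.BinaryWrite(pdfBytes);
        response.End();
    }
}
```

In the page, the HTML string would be the sw.ToString() produced by RenderControl, passed to RenderHtmlToPdf and then to SendPdf with the page's Response.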