Using itextsharp to remove text from pdf - itext

I'm working on a program to remove text from a specified area of a pdf.
It works well on most pdfs, but I've found it falls over with some pdfs which contain graphics using Indexed colorspace - it only works on CMYK or RGB. I'm afraid I'm really clueless on this subject so could really use some help.
Here's my code:
Dim source_file as String ="c:\test pdf\test.pdf"
Dim destination_file as String = "c:\test pdf\output.pdf"
Dim reader As PdfReader = New PdfReader(source_file)
Using outputPdfStream As Stream = New FileStream(destination_file, FileMode.Create, FileAccess.Write, FileShare.None)
Dim stamper = New PdfStamper(reader, outputPdfStream)
Dim Locs As New List(Of PdfCleanUpLocation)
Locs.Add(New PdfCleanUpLocation(1, New Rectangle(97.0F, 405.0F, 480.0F, 445.0F), BaseColor.WHITE))
Dim oCleaner As New PdfCleanUpProcessor(Locs, stamper)
oCleaner.CleanUp()
stamper.Close()
reader.Close()
End Using
The error I'm getting is:
iTextSharp.text.exceptions.UnsupportedPdfException: 'The color space [/Indexed, /DeviceCMYK, 73, 13 0R] is not supported'
This comes up at the oCleaner.CleanUp() line
For reference, I originally extracted the code from the below link where someone was trying to do something similar, but a lot more involved, a few years ago:
https://www.vbforums.com/showthread.php?831051-RESOLVED-Confusion-converting-C-code
If anyone can suggest a way of getting this to work with pdfs featuring Indexed colorspace graphics I'd be extremely grateful!
Thanks for reading!

Related

iText - add image to an existing PDF

I am doing the following using iText.
I have a PDF
I add an image to the PDF
I save the modified PDF.
To add image in PDF I am using this method itext-add.
This worked fine until I got a certain PDF. In this PDF, the adding-image-to-the-pdf method doesn't work. Moreover, it corrupts the PDF.
Points to note:
I'm getting PDFs from a third party and these are contractual PDFs. So it is possible that they have added some restrictions.
And one fun fact: when I add an annotation on the same page where I want to add the image, the image shows up!
I'm using iText 7.1.10
String srcFileName = "/Users/kalpit/Desktop/step1-stack.pdf";
String destFileName = "step1-test1-kd2.pdf";
File destFile = new File(destFileName);
PdfDocument pdf = new PdfDocument(new PdfReader(srcFileName), new PdfWriter(destFile),new StampingProperties().useAppendMode());
Document document = new Document(pdf);
String imFile = "/Users/kalpit/Desktop/sample-image.png";
ImageData data = ImageDataFactory.create(imFile);
Image image = new Image(data);
document.add(image);
document.close();
and here is the PDF : https://ipupload.com/tP6/step1.pdf
As @Nikita found out, the problem occurs only when working in append mode. The cause is a bug: under unfortunate conditions, the changed resources are not stored.
The unfortunate conditions here are that on page 1
the content stream array already is an indirect object in its own right, so only this indirect object (and not the page object) is marked as changed when the instructions for showing the image are added; and
the resources and resource type dictionaries are direct objects but only they (and not the page object, the indirect object holding them) are marked as changed when the image resource is added.
Thus, the changes to the page content streams are stored but the page object is not.
So there now is an instruction to draw an image from an image page resource which is not there.
One way to work around this is not to use append mode, as proposed by @Nikita.
Alternatively, if you are required to use append mode, you can explicitly mark your page as changed:
Document document = new Document(pdf);
String imFile = "/Users/kalpit/Desktop/sample-image.png";
ImageData data = ImageDataFactory.create(imFile);
Image image = new Image(data);
document.add(image);
pdf.getFirstPage().setModified(); // <-----
document.close();
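To see why the image resource goes missing, here is a toy model of an append-mode save in plain Java. It only illustrates the reasoning above and is not iText's actual implementation: the class AppendModeToy and everything in it are made up for this sketch. The assumption it encodes is that an incremental update can only rewrite whole indirect objects that carry a "modified" flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of an incremental (append-mode) save; NOT iText's real code. */
final class AppendModeToy {

    /** An indirect object: the smallest unit an incremental update can rewrite. */
    static final class IndirectObject {
        final String name;
        boolean modified;

        IndirectObject(String name) {
            this.name = name;
        }
    }

    // The page object holds the resources dictionary as a direct object.
    final IndirectObject page = new IndirectObject("page");
    // The content stream array is an indirect object in its own right.
    final IndirectObject contents = new IndirectObject("contents");

    /** Mimics the problematic path: only the content array gets flagged. */
    void addImage() {
        contents.modified = true;
        // The new image resource lives inside `page`, but `page` is not flagged.
    }

    /** Returns the names of the objects an append-mode save would write. */
    List<String> objectsToWrite() {
        List<String> written = new ArrayList<>();
        for (IndirectObject o : Arrays.asList(page, contents)) {
            if (o.modified) {
                written.add(o.name);
            }
        }
        return written;
    }
}
```

After addImage(), the toy writer emits only "contents", so the drawing instruction is saved while the page with its new resource entry is not. Setting page.modified = true by hand, which plays the role of pdf.getFirstPage().setModified() in the workaround, makes it emit the page as well.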
Well, something is wrong with append mode. Do you really need it? Try adding the image with the following code:
PdfDocument pdf = new PdfDocument(new PdfReader(srcFile), new PdfWriter(outFileName));
Document doc = new Document(pdf);
PdfImageXObject xObject = new PdfImageXObject(ImageDataFactory.createPng(UrlUtil.toURL("pathToYourimage.png")));
Image image = new Image(xObject, 100).setHorizontalAlignment(HorizontalAlignment.RIGHT);
doc.add(image);
doc.close();

How to change adobe readers zoom level?

I have a Jasper Report which creates a PDF in Java Spring. I have been trying to change the zoom level for hours and have not been successful. Whenever I open the PDFs using Adobe Reader, the zoom is 149% (and my coworker's is even worse). There was a similar question which did not help.
I have tried the following property names and none of them have worked
"zoom"
"net.sf.jasperreports.viewer.zoom"
"net.sf.jasperreports.viewer.zoom"
com.jaspersoft.studio.viewer.zoom
com.jaspersoft.studio.unit.viewer.zoom
The values I have tried are
0.5
1.1
2
I have checked my Adobe Reader properties and zoom is set to default, and accessibility is also off.
As Villat indicated in a comment, one way to set the zoom level is with the JavaScript "this.zoom=50;"
You can do this either by indicating it in jrxml
<property name="net.sf.jasperreports.export.pdf.javascript" value="this.zoom=50;"/>
or
by setting it to the SimplePdfExporterConfiguration if exporting from java
....
SimplePdfExporterConfiguration configuration = new SimplePdfExporterConfiguration();
configuration.setPdfJavaScript("this.zoom=50;");
exporter.setConfiguration(configuration);
However
It is up to the reader (application used to open pdf), to decide if it will/can execute the javascript.
For example in standard Adobe Acrobat Reader DC a user can manually turn this off under menu Edit>>Preferences
Furthermore, if the reader is already open, it does not always change the zoom level through JavaScript; my installed reader applies it reliably only when it is launched by opening the PDF.
Alternative solution
If you are exporting in Java you can post-process the PDF by adding an OpenAction, see Bruno Lowagie's answer https://stackoverflow.com/a/24095098/5292302
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
PdfDestination pdfDest = new PdfDestination(PdfDestination.XYZ, 0,
reader.getPageSize(1).getHeight(), 0.75f);
PdfAction action = PdfAction.gotoLocalPage(1, pdfDest, stamper.getWriter());
stamper.getWriter().setOpenAction(action);
stamper.close();
reader.close();
}
Hence, once the report is exported, you call a similar method. If memory allows, you can also do this in memory using a ByteArrayOutputStream or similar.
This solution is more reliable, but in the end it is always up to the reader the user is using whether it will be respected or not.
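As a rough sketch of the in-memory variant mentioned above, the plumbing could look like the following. It uses only the Java standard library; the class InMemoryPostProcess and the PdfStep interface are hypothetical names introduced for illustration, and the actual iText work is left behind that interface rather than shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class InMemoryPostProcess {

    /** Hypothetical hook: a stream-based variant of manipulatePdf would go here. */
    interface PdfStep {
        void apply(InputStream src, OutputStream dest) throws IOException;
    }

    /** Runs one post-processing step on exported PDF bytes without touching the disk. */
    static byte[] run(byte[] exportedPdf, PdfStep step) throws IOException {
        ByteArrayOutputStream dest = new ByteArrayOutputStream(exportedPdf.length);
        try (InputStream src = new ByteArrayInputStream(exportedPdf)) {
            step.apply(src, dest);
        }
        return dest.toByteArray();
    }
}
```

With iText 5, the step body would be a version of manipulatePdf that takes streams instead of file names, since PdfReader can read from an InputStream and PdfStamper can write to any OutputStream; a DocumentException would have to be caught and wrapped in an IOException to fit this interface.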

iTextSharp insert image PushbuttonField not working

I have searched and searched and cannot find the answer to my problem. I've tried many different approaches in my code, but I've hit a wall and I'm not sure where to go from here. I seem to be wanting to do the same thing as these two threads:
Trying to insert an image into a pdf in c#
Add image in an existing PDF with itextsharp
They are very similar and the answer is the same. However, when I use that exact code, the result is a PDF without an image. Here is my code:
using (var existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
using (var newFileStream = new FileStream(fileNameNew, FileMode.Create))
{
var pdfReader = new PdfReader(existingFileStream);
var stamper = new PdfStamper(pdfReader, newFileStream, '\0', true);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (var field in form.Fields)
{
if (field.Key == "form1[0].ec_Bldg_Photo_1[0].ImageField2[0]")
{
PushbuttonField imageField = form.GetNewPushbuttonFromField(field.Key);
imageField.Layout = PushbuttonField.LAYOUT_ICON_ONLY;
imageField.IconReference = null;
imageField.ProportionalIcon = true;
imageField.Image = Image.GetInstance(@"PATH_TO_IMAGE\front.jpg");
form.ReplacePushbuttonField(field.Key, imageField.Field);
}
}
stamper.FormFlattening = false;
stamper.Close();
pdfReader.Close();
}
I have tried to rule out all of the obvious things. My path to the image is correct, the field is indeed a PushbuttonField when I read the existing PDF field and get the field type. If I open the PDF in Adobe Reader and click on the placeholder for the image, it allows me to pick a file from my PC. When I place an image in the file, save, and then read in that PDF, I can then change my code to this:
imageField.ProportionalIcon = false;
And now all of a sudden the image is stretched on the saved copy. So I see that it is changing this part, but only when I have entered the image manually in Adobe Reader. When I read in the field after I set that image in Adobe Reader and it shows correctly, I see a couple of interesting things. The field.Image property IS NULL and the field.IconReference is NOT NULL. When I use the original code to try and insert the image, it is reversed: Image is NOT NULL but IconReference IS NULL.
Any help would be greatly appreciated, thank you!!
EDIT 1: Ok so I didn't see it the first time, but I went back and checked more thoroughly and I did find that key. Here it is:
Several things are at play here.
Usage Rights:
The PDF is digitally signed with a private key owned by Adobe.
You can see this using RUPS here (in your screen shot you didn't go deep enough):
This has two implications:
The signature unlocks special permissions in Adobe Reader, such as the permission to save a filled out form locally.
Making any changes to the original PDF breaks the signature and removes the special permissions leading to an ugly error message in Adobe Reader.
This functionality is deprecated in (and even removed from) PDF 2.0. It's old technology that became obsolete with the emergence of PDF viewers other than Adobe Reader.
My suggestion: remove the usage rights to avoid breaking the signature. See the FAQ entry "Why do I get an error saying that "use of extended features is no longer available"?" iText 7 / iText 5
This is the iText 7 code:
public void removeUsageRights(PdfDocument pdfDoc) {
PdfDictionary perms = pdfDoc.getCatalog().getPdfObject().getAsDictionary(PdfName.Perms);
if (perms == null) {
return;
}
perms.remove(new PdfName("UR"));
perms.remove(PdfName.UR3);
if (perms.size() == 0) {
pdfDoc.getCatalog().remove(PdfName.Perms);
}
}
This is the iText 5 code:
PdfReader reader = new PdfReader(old_file);
if (reader.hasUsageRights()) {
reader.removeUsageRights();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(new_file));
stamper.close();
}
reader.close();
This is the iText 5 answer.
Hybrid Form:
If you click on the /AcroForm entry, you see this:
There is a /Fields array with references to field dictionaries that are also widget annotations. That means that the document has an AcroForm form inside. However, there is also an /XFA entry with a series of XML snippets. That means that the document has an XFA form inside.
In other words: the same form description is added twice inside. You are changing a button in one form (the AcroForm part), but not in the other (the XFA form) and that leads to inconsistencies.
XFA has been deprecated in PDF 2.0 because there weren't many vendors supporting that technology. It's kind of frustrating to be confronted with forms that use deprecated technology.
My suggestion: I would remove the XFA part. See the FAQ entry "Is it safe to remove XFA?" iText 5 / iText 7
In iText 5, removing XFA is done like this:
AcroFields form = stamper.getAcroFields();
form.removeXfa();
Important: my suggestion is to remove all the deprecated functionality from the PDF, but if the government expects that functionality to be present, then you're out of luck. In that case, you will need to use Adobe software to process the form. If that's the case, you could complain to the government that their requirements lead to a de facto vendor lock-in. By the way: iText Software is also a vendor. It's an open source company that offers open source software under the AGPL license. The AGPL license allows free use under certain circumstances (see How do I make sure my software complies with AGPL: How can I use iText for free?) If you don't meet those requirements, you will have to purchase a commercial license for your use of iText.

XFA forms PDF is modified using itextsharp and resulting PDF Gives Extended feature error

I am trying to set field values programmatically by modifying an XFA form PDF containing a 2D barcode. I am having trouble opening the resulting PDF with regular Adobe Reader. Here is the error: "This document enabled extended features in Adobe Reader. The document has been changed since it was created and use of extended features is no longer available. Please contact the author for the original version of this document." (Note: the file opens fine with Adobe Acrobat.)
The following example C# code does this.
var reader = new PdfReader(@"c:\abc.pdf");
// System.IO.FileStream fs = new FileStream(reader, System.IO.FileMode.CreateNew, FileAccess.ReadWrite);
var output = new MemoryStream();
var stamper = new PdfStamper(reader, output, '\0', true);
stamper.ViewerPreferences = PdfWriter.AllowModifyContents;
stamper.AcroFields.SetField("form1[0].#subform[0].Line1a_FamilyName[0]", "Family Name");
stamper.FormFlattening = false;
stamper.Close();
reader.Close();
Response.AddHeader("Content-Disposition", "attachment; filename=YourPDF.pdf");
Response.ContentType = "application/pdf";
Response.BinaryWrite(output.ToArray());
Response.End();
See this post by iText's author:
Submitted by Bruno Lowagie on Fri, 12/31/2010 - 16:37
After filling out my form, my PDF shows the following message: This document enabled extended features in Adobe Reader. The document has been changed since it was created and use of extended features is no longer available. Please contact the author for the original version of this document. How do I avoid this message?
The creator of the form made the document Reader enabled. Reader enabling can only be done using Adobe software. You can avoid this message in two ways:
1. Remove the usage rights. This will result in a form that is no longer Reader enabled. For instance: if the creator of the document allowed that the filled out form could be saved locally, this will no longer be possible after removing the usage rights.
2. Fill out the form in append mode. This will result in a bigger file size, but Reader enabling will be preserved.
For more info, read section 8.7.1 of iText in Action.

error in generating pdf using iTextSharp means previous pdf file is displayed

I am working in asp .net mvc3.
I have these statements in controller class:
PdfWriter.GetInstance(doc, new FileStream((Request.PhysicalApplicationPath + "\\Receipt5.pdf"),
FileMode.Create));
doc.Open();
PdfPTable table = new PdfPTable(2);
table.AddCell("tt[0]");
table.AddCell("tt[1]");
doc.Close();
My values change every time, but the PDF sometimes shows the old result. Please tell me what I should do so that a new PDF document is generated whenever I press the Done button.
I am using iTextSharp to generate the PDF.
It seems that you're not able to replace the old file because it is locked.
Try to delete it and see what happens.
Anyway, consider that if more than one user tries to print the same document you can have a concurrency problem.
I would suggest using a generated file name:
var newFile = System.IO.Path.Combine(Request.PhysicalApplicationPath, Guid.NewGuid().ToString() + ".pdf");