iText failure with adding elements with 5.5.4 - itext

I have code that broke with the 5.5.4 update. I have confirmed internally that it works with the previous three versions.
Adding the element causes a NullPointerException:
reader = new PdfReader('Users/Me/Documents/a.pdf')
stamper = new PdfStamper(reader, new FileOutputStream('some_file'))
cb = stamper.getOverContent(1)
ct = new ColumnText(cb)
ct.setSimpleColumn(120f, 48f, 200f, 600f)
pz = new Paragraph ( new Phrase (20, 'Hello World!', f) )
ct.addElement(pz)
ct.go()
stamper.close()
reader.close()
john
great new book btw Bruno...
UPDATE
I did indeed miss out a bit of code before, and I was trying to isolate the issue in a longer piece of code
This version does exhibit the issue for me:
bf = BaseFont.createFont(BaseFont.HELVETICA_BOLD, 'Cp1252', BaseFont.EMBEDDED)
f = new Font(bf, 13)
reader = new PdfReader(src)
stamper = new PdfStamper(reader, new FileOutputStream(dest))
cb = stamper.getOverContent(1)
ct = new ColumnText(cb)
ct.setSimpleColumn(120f, 48f, 200f, 600f)
pz = new Paragraph ( 'Hello World!' )
ct.addElement(pz)
ct.go()
stamper.close()
reader.close()
In 5.5.4, ct.addText(chunk) works but ct.addElement() does not; addElement() does work in 5.5.1 through 5.5.3.

Your code sample is incomplete. For instance, it is missing the PDF to which you are trying to add extra words (I took a simple "Hello World" PDF), and it is missing some definitions, such as the declaration of the f variable (I took the default font, which is Helvetica, 12pt).
When I fill out the blanks you left in your question, I end up with the following code:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
PdfContentByte cb = stamper.getOverContent(1);
ColumnText ct = new ColumnText(cb);
ct.setSimpleColumn(120f, 48f, 200f, 600f);
Font f = new Font();
Paragraph pz = new Paragraph(new Phrase(20, "Hello World!", f));
ct.addElement(pz);
ct.go();
stamper.close();
reader.close();
}
This code results in the PDF colum_text_phrase.pdf, a document that now shows the words "Hello World!" twice, once because these words were already present on the original document, and once because we added these words using the adapted code from your question.
In other words: I cannot reproduce the problem you are reporting, hence your question is unanswerable.
Note that I don't understand this line:
Paragraph pz = new Paragraph(new Phrase(20, "Hello World!", f));
Why would you create a Phrase and then wrap it inside a Paragraph?
Why not either use a Phrase, or a Paragraph? For instance:
Paragraph pz = new Paragraph(20, "Hello World!", f);
Edit:
Based on your edit, I have adapted the ColumnTextPhrase example. I still wasn't able to reproduce the problem, neither by using:
pz = new Paragraph ( 'Hello World!' )
(which is what you wrote), nor by using:
pz = new Paragraph ( 'Hello World!', f )
(which is what you probably meant to write).
EDIT:
Using the file you shared in your comment, I was able to reproduce the problem. I also understand why the problem occurs: your file is a Tagged PDF, and you are adding content that isn't tagged. Older iText versions allowed such an inconsiderate operation; the latest iText version apparently prevents you from doing something that goes against "good taste". I'll pass this to development to find out whether this was intentional.

Related

iTextSharp does not show Unicode symbols on pdf document

I have a little problem in ASP.NET Web Forms. I am using iTextSharp to create a PDF document. Everything works great, except that Unicode characters are either not displayed or appear as question marks. Here is my code. If anyone knows what I am doing wrong, I would be very grateful!
BaseFont baseFont = BaseFont.CreateFont("C:\\Windows\\Fonts\\Wingdings.ttf", BaseFont.CP1252, BaseFont.EMBEDDED);
Font font = new Font(baseFont, 12, Font.NORMAL, BaseColor.BLACK);
using (MemoryStream ms = new MemoryStream())
{
using (var document = new Document())
{
HTMLWorker parser = new HTMLWorker(document);
document.NewPage();
string str = " \u2022 \u2706 ";
Phrase phrase = new Phrase(str, font);
document.Add(phrase);
//and after that I have some other strings which are parsed with parser and that works perfectly
}}
I have tried to parse the phrase with XMLWorkerHelper, but it did not work at all. I have also tried other fonts, such as Wingdings 2 and Wingdings 3, but that did not work either. Finally, I have tried to convert the string like this, and the symbols appeared as question marks:
byte[] bytestr = Encoding.Default.GetBytes(str);
string convertstr = Encoding.UTF8.GetString(bytestr);
Phrase phrase = new Phrase(convertstr, font);
document.Add(phrase);
I have also tried to enter them like this, but I got the same result with the question marks:
string str = "•✆";
Phrase phrase = new Phrase(str, font);
document.Add(phrase);
EDIT: I have tried to change the font to Segoe UI Symbol and to use escaped characters like \u260F, but it still does not work.
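One check that can be done independently of iTextSharp is whether the characters exist in CP1252 at all, since that is the encoding passed to BaseFont.CreateFont in the question. The sketch below is Java rather than C# and uses only the standard library. It assumes that Java's windows-1252 charset corresponds to the CP1252 code page, and that the system default encoding used in the byte conversion was also Windows-1252. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Cp1252Check {
    // Assumption: Java's "windows-1252" is the same code page iText calls CP1252.
    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Returns true if the single-byte code page has a slot for this character. */
    static boolean fitsInCp1252(char c) {
        return CP1252.newEncoder().canEncode(c);
    }

    /**
     * Mimics the conversion from the question: encode with the (assumed)
     * system default code page, then decode the same bytes as UTF-8.
     */
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(CP1252); // unmappable characters are replaced here
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(fitsInCp1252('\u2022')); // true: the bullet is byte 0x95 in windows-1252
        System.out.println(fitsInCp1252('\u2706')); // false: the telephone sign has no CP1252 byte

        String original = "\u2022\u2706";
        String converted = roundTrip(original);
        // The round trip is lossy: the converted text differs from the original.
        System.out.println(converted.equals(original));
    }
}
```

If a character does not fit in CP1252, the byte conversion has already lost it before the Phrase is created, so swapping the font file alone would not bring it back. That suggests the encoding argument is worth a look, although this check says nothing about how iTextSharp treats a symbol font such as Wingdings.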

iTextSharp and Hyphenation

In earlier versions of iTextSharp, I have incorporated hyphenation in the following way (example is for German hyphenation):
HyphenationAuto autoDE = new HyphenationAuto("de", "DR", 3, 3);
BaseFont.AddToResourceSearch(RuntimePath + "itext-hyph-xml.dll");
chunk = new Chunk(text).SetHyphenation(autoDE);
In recent versions of iText, this is no longer possible as the function
BaseFont.AddToResourceSearch()
has been removed from iText. Now how to replace this statement?
According to the 2nd edition of the iText in Action manual, the statement apparently does not need to be replaced at all. When I simply leave it out, however, no hyphenation takes place (and no errors occur). I have also taken a newer version of
itext-hyph-xml.dll
and re-referenced it. Same result, no hyphenation. This file resides on the same path as iTextSharp.dll, and I have included the path in the CLASSPATH environment variable. Nothing helps. I'm stuck, please help.
Calling iTextSharp.text.io.StreamUtil.AddToResourceSearch() works for me:
var content = @"
Allein ist besser als mit Schlechten im Verein: mit Guten im Verein, ist besser als allein.
";
var table = new PdfPTable(1);
// make sure .dll is in correct /bin directory
StreamUtil.AddToResourceSearch("itext-hyph-xml.dll");
using (var stream = new MemoryStream())
{
using (var document = new Document(PageSize.A8.Rotate()))
{
PdfWriter.GetInstance(document, stream);
document.Open();
var chunk = new Chunk(content)
.SetHyphenation(new HyphenationAuto("de", "DR", 3, 3));
table.AddCell(new Phrase(chunk));
document.Add(table);
}
File.WriteAllBytes(OUT_FILE, stream.ToArray());
}
Tested with iTextSharp 5.5.11 and itext-hyph-xml 2.0.0.0.

How to fill an existing PDF form with iText 7 and iTextSharp 5.5.9 without flattening it?

I am trying to fill out a USCIS form, but after filling, the form becomes read-only, as if it had been flattened. I am not sure why, since I don't have any code that flattens it.
I searched Stack Overflow and tried many different things (with iTextSharp 5.5.9 and iText 7), but it still doesn't work.
Here is the sample code I am using:
string src = @"https://www.uscis.gov/sites/default/files/files/form/i-90.pdf";
string dest = @"C:\temp\i-90Filled.pdf";
var reader = new PdfReader(src);
reader.SetUnethicalReading(true);
var writer = new PdfWriter(dest);
PdfDocument pdfDoc = new PdfDocument(reader, writer);
// add content
PdfAcroForm form = PdfAcroForm.GetAcroForm(pdfDoc, true);
IDictionary<String, PdfFormField> fields = form.GetFormFields();
PdfFormField toSet;
fields.TryGetValue("form1[0].#subform[0].P1_Line3b_GivenName[0]", out toSet);
toSet.SetValue("John");
pdfDoc.Close();
Forms are filled like this with iTextSharp 5:
string src = @"https://www.uscis.gov/sites/default/files/files/form/i-90.pdf";
string dest = @"C:\temp\i-90Filled.pdf";
PdfReader pdfReader = new PdfReader(src);
PdfStamper pdfStamper = new PdfStamper(pdfReader, new FileStream(
dest, FileMode.Create));
AcroFields form = pdfStamper.AcroFields;
form.SetField("form1[0].#subform[0].P1_Line3b_GivenName[0]", "John");
Note that the form you are trying to fill is a hybrid form. It contains the description of the form twice: once as an AcroForm; once as an XFA form. You may want to remove the XFA form by adding this line:
form.RemoveXfa();
For more info, read the documentation:
Is it safe to remove XFA?
How to change the text color of an AcroForm field?
Your code is using iText 7 for C#. If you want that code to work, you most certainly need to remove the XFA part as iText 7 doesn't support XFA. iText 7 was developed with PDF 2.0 in mind (this spec is to be released in 2017). XFA will be deprecated in PDF 2.0. A valid PDF 2.0 file will not be allowed to contain an XFA stream.

iText 5.5.3 + XFAWorker: Repeated section not rendering

We are using iText Java 5.5.3 with XFAWorker to inject XML into PDF templates and flatten those to simple PDFs.
To integrate with a system that has parts built in various technologies, we have wrapped it in an HTTP server using the Simple framework.
The injection stage works fine, but after flattening some sections are lost.
Detailed information:
The PDF template we inject into is: CLAIM.Plan.pdf.
Sample data we try to inject is: plan_v4.xml.
Without flattening we get: CLAIM_plan_unflattened.pdf.
With flattening the result is CLAIM_plan_flattened.pdf.
plan_v4.xml has multiple instances of //form/PlanItems/PlanItem and CLAIM_plan_v3.pdf is repeating the section to display all of them. This is the OK case. CLAIM_plan_v4.pdf is not showing the repeated sections for the individual PlanItem instances. This is the broken case.
The main bit of code that uses iText looks like this:
// load the XML document that contains the contents of the data element.
// NOT the <data> element itself or anything else.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document newdoc = db.parse(new InputSource(new InputStreamReader(xfaStream, "UTF-8")));
Node newdata = newdoc.getDocumentElement();
// Open the PDF and extract the complete xfa.
PdfReader reader = new PdfReader(pdfStream);
XfaForm xfa = new XfaForm(reader);
Document doc = xfa.getDomDocument();
// Remove all contents of the <data> element
NodeList list = doc.getElementsByTagNameNS("http://www.xfa.org/schema/xfa-data/1.0/", "data");
Node child;
do {
child = list.item(0).getFirstChild();
if (child!=null) {
list.item(0).removeChild(child);
}
} while (child!=null);
// Replace <data> with the new xml to inject.
list.item(0).appendChild(doc.importNode(newdata,true));
ByteArrayOutputStream os = new ByteArrayOutputStream();
PdfStamper stamper = new PdfStamper(reader, os, '\0', false);
xfa.setDomDocument(doc);
xfa.setChanged(true);
XfaForm.setXfa(xfa, stamper.getReader(), stamper.getWriter());
stamper.close();
If I now just finish quickly:
response.getOutputStream().write(os.toByteArray());
I get the good (but unflattened) result.
If I add the flattening stage like this:
com.itextpdf.text.Document document = new com.itextpdf.text.Document();
PdfWriter writer = PdfWriter.getInstance(document, response.getOutputStream());
XFAFlattener xfaf = new XFAFlattener(document, writer);
xfaf.flatten(new PdfReader(os.toByteArray()));
document.close();
The result is flattened but missing the repeated sections.
Can someone tell me what I am doing wrong so the PlanItems don't get rendered in the final PDF?
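The DOM step from the question can be exercised without iText, to confirm that every PlanItem survives the replacement of the data element before the flattener ever sees it. The following is a minimal sketch using only the JDK's XML classes. The class name, the helper names and the two inline XML strings are hypothetical stand-ins for the real XFA datasets and plan_v4.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DataSwap {
    static final String XFA_DATA_NS = "http://www.xfa.org/schema/xfa-data/1.0/";

    /** Parses an XML string with a namespace-aware parser, as in the question. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Empties the first data element and appends a deep copy of newRoot. */
    static void replaceData(Document xfaDoc, Node newRoot) {
        Node data = xfaDoc.getElementsByTagNameNS(XFA_DATA_NS, "data").item(0);
        if (data == null) {
            throw new IllegalArgumentException("no data element found");
        }
        while (data.hasChildNodes()) {
            data.removeChild(data.getFirstChild());
        }
        // importNode is needed because newRoot belongs to a different Document.
        data.appendChild(xfaDoc.importNode(newRoot, true));
    }

    /** Runs the replacement and reports how many PlanItem elements remain. */
    static int injectAndCount(String xfaXml, String newXml) {
        Document xfaDoc = parse(xfaXml);
        replaceData(xfaDoc, parse(newXml).getDocumentElement());
        return xfaDoc.getElementsByTagName("PlanItem").getLength();
    }

    public static void main(String[] args) {
        // Hypothetical miniature of the XFA datasets packet.
        String xfaXml = "<xfa:datasets xmlns:xfa=\"" + XFA_DATA_NS + "\">"
                + "<xfa:data><form><PlanItems><PlanItem/></PlanItems></form></xfa:data>"
                + "</xfa:datasets>";
        // Hypothetical miniature of plan_v4.xml with three repeated items.
        String newXml = "<form><PlanItems><PlanItem/><PlanItem/><PlanItem/></PlanItems></form>";
        System.out.println(injectAndCount(xfaXml, newXml)); // prints 3
    }
}
```

If the count matches the number of PlanItem elements in the input, the injected DOM is complete and the loss happens later, which would be consistent with the unflattened output being correct.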

What's the alternative to copyAcroForm?

We are currently porting our code base from iText 2.1.7 to iText 5.5.0 (yeah I know.. we had a little longer ;-).
Well.. "now" that copyAcroForm has gone the way of the Dodo, I'm struggling to find an alternative to this code:
File outputFile = new File...
Document document = new Document();
FileOutputStream fos = new FileOutputStream(outputFile);
PdfCopy subjobWriter = new PdfCopy(document, fos);
document.open();
PdfReader reader = new PdfReader(generationReader);
for (int i=1; i<=reader.getNumberOfPages(); i++) {
PdfImportedPage page = subjobWriter.getImportedPage(reader, i);
subjobWriter.addPage(page);
}
PRAcroForm form = reader.getAcroForm();
if (form != null)
subjobWriter.copyAcroForm(reader);
subjobWriter.freeReader(reader);
reader.close();
subjobWriter.close();
document.close();
fos.close();
but haven't really found anything. I read in the changelog of 4.34 or so that I apparently should use PdfCopy.addDocument(). I tried that and commented out the other code, such as this:
...
PdfReader reader = new PdfReader(generationReader);
reader.consolidateNamedDestinations();
subjobWriter.addDocument(reader);
subjobWriter.freeReader(reader);
subjobWriter.setOutlines(SimpleBookmark.getBookmark(reader));
...
but that didn't help either.
The problem is that everything from the original PDF is copied EXCEPT the form (and its fields and content), or rather, it looks like the whole form has been flattened instead.
All the samples I could find either use copyAcroForm(), which doesn't exist anymore, or the PdfCopyFields class, which is deprecated. The samples at itextpdf.com and in "iText in Action, 2nd edition" use copyAcroForm() as well, so I'm at a loss as to how to solve this. Any ideas, anyone?
Rog
Please take a look at the MergeForms example:
Document document = new Document();
PdfCopy copy = new PdfCopy(document, new FileOutputStream(filename));
copy.setMergeFields();
document.open();
for (PdfReader reader : readers) {
copy.addDocument(reader);
}
document.close();
for (PdfReader reader : readers) {
reader.close();
}
One line in particular is very important:
copy.setMergeFields();
Did you add that line?