How can I get rid of the PDF startxref error and AbstractRenderer warning messages when generating PDF documents and then merging them using iText 7? - itext

I am using iText version 7.1.6 to generate PDF documents, and at the end I am trying to merge them.
Below is the code used for merging, along with comments.
List<byte[]> pdfDocumentList= new ArrayList<byte[]>();
// pdfDocumentList has list of byte arrays generated from other ways
ByteArrayOutputStream mergeOutputStream = new ByteArrayOutputStream();
PdfDocument pdfMerged = new PdfDocument(new PdfWriter(mergeOutputStream));
PdfMerger merger = new PdfMerger(pdfMerged);
ByteArrayOutputStream finalOutputStream = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(finalOutputStream);
PdfDocument pdf = new PdfDocument(writer);
// sb is containing the concatenated HTML sources
HtmlConverter.convertToPdf(sb.toString(), pdf, properties);
pdf.close();
pdfDocumentList.add(finalOutputStream.toByteArray());
if (!pdfDocumentList.isEmpty()) {
    for (byte[] bytes : pdfDocumentList) {
        PdfDocument externalPdf = new PdfDocument(new PdfReader(new ByteArrayInputStream(bytes)));
        merger.merge(externalPdf, 1, externalPdf.getNumberOfPages());
    }
}
pdfMerged.close();
return mergeOutputStream.toByteArray();
When I am merging the list of PDF documents, I get the below error and warning. In addition, the warning keeps getting printed multiple times. How can I fix it?
Warning
WARNING: The background rectangle has negative or zero sizes. It will not be displayed.
Jul 18, 2019 2:24:24 PM com.itextpdf.layout.renderer.AbstractRenderer drawBackground
Error
<Jul 18, 2019, 2:27:19,964 PM AST> <Error> <com.itextpdf.kernel.pdf.PdfReader> <BEA-000000> <Error occurred while reading cross reference table. Cross reference table will be rebuilt.
com.itextpdf.io.IOException: PDF startxref not found.
at com.itextpdf.io.source.PdfTokenizer.getStartxref(PdfTokenizer.java:262)
at com.itextpdf.kernel.pdf.PdfReader.readXref(PdfReader.java:753)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:538)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
Truncated. see log file for complete stacktrace
>
2019-07-18 14:27:19 ERROR user: KALASINX ip: 127.0.0.1 (ServiceInterceptor.java:59) ~ ServiceInterceptor Error:
com.itextpdf.kernel.PdfException: Trailer not found.
at com.itextpdf.kernel.pdf.PdfReader.rebuildXref(PdfReader.java:1064) ~[kernel-7.1.6.jar:?]

After analyzing the HTML code and with repeated testing, I was able to get rid of the warning message. I had to remove the background-color style from the CSS rules that applied to the table, tr, and td tags.
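The "PDF startxref not found" and "Trailer not found" errors are raised while reading one of the byte arrays, so it may help to check each entry of pdfDocumentList before handing it to PdfReader. The following is a minimal sketch using only the JDK, under the assumption (not confirmed in the question) that one of the arrays is empty or truncated. The class name PdfBytesCheck, its method names, and the 1024-byte tail window are choices made for this sketch, not part of iText.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class PdfBytesCheck {

    // Hypothetical helper: a cheap plausibility check, not a PDF validator.
    // It only looks for the "%PDF-" header at the start and for the
    // "startxref" and "%%EOF" markers near the end of the byte array.
    static boolean looksLikeCompletePdf(byte[] bytes) {
        if (bytes == null || bytes.length < 8) {
            return false;
        }
        String head = new String(bytes, 0, 5, StandardCharsets.ISO_8859_1);
        int tailLength = Math.min(bytes.length, 1024); // window size is an assumption
        String tail = new String(bytes, bytes.length - tailLength, tailLength,
                StandardCharsets.ISO_8859_1);
        return head.equals("%PDF-") && tail.contains("startxref") && tail.contains("%%EOF");
    }

    // Returns the indexes of list entries that fail the check above,
    // so they can be logged or skipped before merging.
    static List<Integer> suspectIndexes(List<byte[]> documents) {
        List<Integer> suspects = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            if (!looksLikeCompletePdf(documents.get(i))) {
                suspects.add(i);
            }
        }
        return suspects;
    }
}
```

Calling suspectIndexes(pdfDocumentList) before the merge loop would show which inputs, if any, are not complete PDF files; a document that passes this check can still be malformed in other ways.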

Related

Error while parsing the PDF Document (Pg Entry of Structure Element with MCIDs not set.)

I am using iTextSharp to create/merge Tagged PDFs.
When I run PDF Accessibility Checker 2.0 on the generated PDF, I get the following error:
Error while parsing the PDF Document (Pg Entry of Structure Element with MCIDs not set.) under PDF syntax.
I could not find anything related to this issue online. I checked https://taggedpdf.com/508-pdf-help-center/
I need to fix this issue using the iTextSharp library, but I am not able to fix it manually either.
Please help if you have any idea how to fix this.
Thanks in advance.
I am adding the code below, I am using to create tagged PDF:
PdfReader reader = new PdfReader(pdfSourceFile);
iTextSharp.text.Document document = new iTextSharp.text.Document();
PdfCopy writer = new PdfSmartCopy(document, new FileStream(pdfDestinationFile, FileMode.Create));
writer.SetTagged();
document.Open();
for (int j = 1; j <= reader.NumberOfPages; j++)
{
    if (reader.GetPageContent(j).Length > 0)
    {
        var page = writer.GetImportedPage(reader, j, true);
        writer.AddPage(page);
    }
}
document.Close();
writer.Close();
reader.Close();
I have omitted some lines of logic here.

How to get values and the keys of all the controls in a dynamic pdf using iText?

I've tried to extract all the fields from a dynamic form, but the code worked for some forms and not for others. Worse, it behaved differently for two downloaded copies of the same form. After a lot of digging, I found that the forms that worked correctly were fresh copies in which not a single field had been filled in with PDF software (Adobe Reader). Also, once a form was filled and saved, its thumbnail in the file explorer would change. The code snippet is as follows:
PdfDocument pdfDoc;
pdfDoc = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
PdfDictionary perms = pdfDoc.getCatalog().getPdfObject().getAsDictionary(PdfName.Perms);
if (perms != null) {
    perms.remove(new PdfName("UR"));
    perms.remove(PdfName.UR3);
    if (perms.size() == 0) {
        pdfDoc.getCatalog().remove(PdfName.Perms);
    }
}
PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDoc, true);
List<String> result = new ArrayList<String>(form.getFormFields().keySet());
Map<String, PdfFormField> fields = form.getFormFields();
Below is the image for the same form, downloaded twice. The one with the colorful thumbnail is not filled. The other was filled using Adobe Reader and saved, and on saving the thumbnail vanished.
I suspect a flag might get set on saving the form. Any help is appreciated. Another peculiar observation: the number of entries in the PdfCatalog object differed between the two forms. An entry for the 'NeedsRendering' property was present in the faulty PDF but absent from the working PDF. I've attached screenshots of the working PDF from a debugging session.
hrsa_working_form:
Update 1
@Browno, apologies for the confusing question from a newbie. I've posted screenshots from iText RUPS for the key '/AcroForm'. From the answers about XfaForm, I've learned how to fill such forms, but flattening them causes an exception. I'm using the pdfxfa jar under the AGPL license. I don't know enough about XFAFlattener and the properties used in the XFAFlattenerProperties class. Below is the code:
public void fillData2(String src, String xml, String dest, String newDest)
        throws IOException, ParserConfigurationException, SAXException, InterruptedException {
    PdfReader reader = new PdfReader(src);
    reader.setUnethicalReading(true);
    PdfDocument pdfDoc = new PdfDocument(reader, new PdfWriter(dest), new StampingProperties().useAppendMode());
    PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDoc, true);
    List<String> result = new ArrayList<String>(form.getFormFields().keySet());
    System.out.println(result.size());
    // Fill the XFA stream with the XML data and write it back to the document.
    XfaForm xfa = form.getXfaForm();
    xfa.fillXfaForm(new FileInputStream(xml));
    xfa.write(pdfDoc);
    //form.flattenFields(); throws exception
    pdfDoc.close();
    // Flatten the filled form into a new file.
    FileInputStream fis = new FileInputStream(dest);
    FileOutputStream fos = new FileOutputStream(newDest);
    XFAFlattener xfaFlattener = new XFAFlattener();
    xfaFlattener.setFontSettings(new XFAFontSettings().setEmbedExternalFonts(true));
    xfaFlattener.flatten(fis, fos);
    fis.close();
    fos.close();
}
The encountered exception is:
Exception in thread "main" java.lang.NoSuchFieldError: FONTFAMILY
at com.itextpdf.tool.xml.xtra.xfa.font.XFAFontProvider.addFonts(XFAFontProvider.java:117)
at com.itextpdf.tool.xml.xtra.xfa.font.XFAFontProvider.<init>(XFAFontProvider.java:56)
at com.itextpdf.tool.xml.xtra.xfa.XFAFlattener.initFlattener(XFAFlattener.java:643)
at com.itextpdf.tool.xml.xtra.xfa.XFAFlattener.flatten(XFAFlattener.java:201)
at com.itextpdf.tool.xml.xtra.xfa.XFAFlattener.flatten(XFAFlattener.java:396)
at com.mycompany.kitext.kitext.fillData2(kitext.java:153)
at com.mycompany.kitext.kitext.main(kitext.java:81)
Also, as per @mkl's comment, I've attached the PDF forms:
https://drive.google.com/file/d/0B6w278NcMSCrZDZoZklmVTNuOWc/view?usp=sharing
//iText RUPS /AcroForm Snapshot
https://drive.google.com/file/d/0B6w278NcMSCrZ1Q1VHc5YzY4UG8/view?usp=sharing
//Form filled with fillXfaForm()
//running low on reputation
Form filled with XFA
I've also read the pdfXFA Release notes for developers. But couldn't find a similar example. Thanks for your help and the great work on iText.
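In Java, a NoSuchFieldError at run time generally means that a class was compiled against one version of a dependency and is running against a different one. One thing that may be worth checking, then, is which jar each of the classes in the stack trace is actually loaded from. The sketch below uses only JDK APIs; the class name JarLocator is hypothetical, and whether a version mismatch is really the cause here is an assumption rather than something the question confirms.

```java
import java.security.CodeSource;

public class JarLocator {

    // Returns the location (usually a jar or a classes directory) that the
    // given class was loaded from. Classes loaded by the bootstrap loader,
    // such as java.lang.String, have no code source.
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // In the fillData2 setting, one would pass the classes named in the
        // stack trace instead, for example XFAFlattener.class.
        System.out.println(locationOf(String.class));
        System.out.println(locationOf(JarLocator.class));
    }
}
```

If two classes that are meant to belong to matching releases turn out to come from jars with different version numbers, aligning those dependency versions would be the first thing to try.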

How to fill an existing pdf form with itext 7 and itextsharp 5.5.9 without flattening it?

I am trying to fill out a USCIS form, and after filling, the form becomes read-only (as if it had been flattened). I am not sure why it does that, even though I don’t have any code to flatten it.
I searched Stack Overflow and tried many different things (with iTextSharp 5.5.9 and iText 7), but it still doesn’t work.
Here is the sample code I am using:
string src = @"https://www.uscis.gov/sites/default/files/files/form/i-90.pdf";
string dest = @"C:\temp\i-90Filled.pdf";
var reader = new PdfReader(src);
reader.SetUnethicalReading(true);
var writer = new PdfWriter(dest);
PdfDocument pdfDoc = new PdfDocument(reader, writer);
// add content
PdfAcroForm form = PdfAcroForm.GetAcroForm(pdfDoc, true);
IDictionary<String, PdfFormField> fields = form.GetFormFields();
PdfFormField toSet;
fields.TryGetValue("form1[0].#subform[0].P1_Line3b_GivenName[0]", out toSet);
toSet.SetValue("John");
pdfDoc.Close();
Forms are filled like this with iTextSharp 5:
string src = @"https://www.uscis.gov/sites/default/files/files/form/i-90.pdf";
string dest = @"C:\temp\i-90Filled.pdf";
PdfReader pdfReader = new PdfReader(src);
PdfStamper pdfStamper = new PdfStamper(pdfReader, new FileStream(dest, FileMode.Create));
AcroFields form = pdfStamper.AcroFields;
form.SetField("form1[0].#subform[0].P1_Line3b_GivenName[0]", "John");
// Closing the stamper writes the filled form to dest.
pdfStamper.Close();
Note that the form you are trying to fill is a hybrid form. It contains the description of the form twice: once as an AcroForm; once as an XFA form. You may want to remove the XFA form by adding this line:
form.RemoveXfa();
For more info, read the documentation:
Is it safe to remove XFA?
How to change the text color of an AcroForm field?
Your code is using iText 7 for C#. If you want that code to work, you most certainly need to remove the XFA part as iText 7 doesn't support XFA. iText 7 was developed with PDF 2.0 in mind (this spec is to be released in 2017). XFA will be deprecated in PDF 2.0. A valid PDF 2.0 file will not be allowed to contain an XFA stream.

iText 5.5.3 + XFAWorker: Repeated section not rendering

We are using iText Java 5.5.3 with XFAWorker to inject XML into PDF templates and flatten those to simple PDFs.
To integrate with a system that has parts built in various technologies, we have wrapped it in an HTTP server using the Simple framework.
The injection stage works fine, but after flattening some sections are lost.
Detailed information:
The PDF template we inject into is: CLAIM.Plan.pdf.
Sample data we try to inject is: plan_v4.xml.
Without flattening we get: CLAIM_plan_unflattened.pdf.
With flattening the result is CLAIM_plan_flattened.pdf.
plan_v4.xml has multiple instances of //form/PlanItems/PlanItem, and CLAIM_plan_v3.pdf repeats the section to display all of them. This is the OK case. CLAIM_plan_v4.pdf does not show the repeated sections for the individual PlanItem instances. This is the broken case.
The main bit of code that uses iText looks like this:
// load the XML document that contains the contents of the data element.
// NOT the <data> element itself or anything else.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document newdoc = db.parse(new InputSource(new InputStreamReader(xfaStream, "UTF-8")));
Node newdata = newdoc.getDocumentElement();
// Open the PDF and extract the complete xfa.
PdfReader reader = new PdfReader(pdfStream);
XfaForm xfa = new XfaForm(reader);
Document doc = xfa.getDomDocument();
// Remove all contents of the <data> element
NodeList list = doc.getElementsByTagNameNS("http://www.xfa.org/schema/xfa-data/1.0/", "data");
Node child;
do {
    child = list.item(0).getFirstChild();
    if (child != null) {
        list.item(0).removeChild(child);
    }
} while (child != null);
// Replace <data> with the new xml to inject.
list.item(0).appendChild(doc.importNode(newdata,true));
ByteArrayOutputStream os = new ByteArrayOutputStream();
PdfStamper stamper = new PdfStamper(reader, os, '\0', false);
xfa.setDomDocument(doc);
xfa.setChanged(true);
XfaForm.setXfa(xfa, stamper.getReader(), stamper.getWriter());
stamper.close();
If I now just finish quickly:
response.getOutputStream().write(os.toByteArray());
I get the good (but unflattened) result.
If I add the flattening stage like this:
com.itextpdf.text.Document document = new com.itextpdf.text.Document();
PdfWriter writer = PdfWriter.getInstance(document, response.getOutputStream());
XFAFlattener xfaf = new XFAFlattener(document, writer);
xfaf.flatten(new PdfReader(os.toByteArray()));
document.close();
The result is flattened but missing the repeated sections.
Can someone tell me what I am doing wrong so the PlanItems don't get rendered in the final PDF?

How can I set XFA data in a static XFA form in iTextSharp and get it to save?

I'm having a very strange issue with XFA Forms in iText / iTextSharp (iTextSharp 5.3.3 via NuGet). I am trying to fill out a static XFA form; however, my changes are not taking effect.
I have both editions of iText in Action and have been consulting the second edition as well as the iTextSharp code sample conversions from the book.
Background: I have an XFA Form that I can fill out manually using Adobe Acrobat on my computer. Using iTextSharp I can read what the Xfa XML data is and see the structure of the data. I am essentially trying to mimic that with iText.
What the data looks like when I add data and save in Acrobat (note: this is only the specific section for datasets)
Here is the XML file I am trying to read in to replace the existing data (note: this is the entire contents of that file):
However, when I pass in the path to the replacement XML file and try to set the data, the new file (a copy of the original with the data replaced) is created without any errors being thrown, but the data is not updated. I can see that the new file is created and I can open it, but there is no data in the file.
Here is the code being utilized to replace the data or populate for the first time, which is a variation of http://sourceforge.net/p/itextsharp/code/HEAD/tree/trunk/book/iTextExamplesWeb/iTextExamplesWeb/iTextInAction2Ed/Chapter08/XfaMovie.cs
public void Generate(string sourceFilePath, string destinationtFilePath, string replacementXmlFilePath)
{
    PdfReader pdfReader = new PdfReader(sourceFilePath);
    using (MemoryStream ms = new MemoryStream())
    {
        using (PdfStamper stamper = new PdfStamper(pdfReader, ms))
        {
            XfaForm xfaForm = new XfaForm(pdfReader);
            XmlDocument doc = new XmlDocument();
            doc.Load(replacementXmlFilePath);
            xfaForm.DomDocument = doc;
            xfaForm.Changed = true;
            XfaForm.SetXfa(xfaForm, stamper.Reader, stamper.Writer);
        }
        var bytes = ms.ToArray();
        File.WriteAllBytes(destinationtFilePath, bytes);
    }
}
Any help would be very much appreciated.
I found the issue. The replacement DomDocument needs to be the entire merged XML of the new document, not just the data or datasets portion.
I upvoted your answer, because it's not incorrect (I'm happy my reference to the demo led you to have another look at your code), but now that I have a second look at your original code, I think it's better to use the book example:
public byte[] ManipulatePdf(String src, String xml) {
    PdfReader reader = new PdfReader(src);
    using (MemoryStream ms = new MemoryStream()) {
        using (PdfStamper stamper = new PdfStamper(reader, ms)) {
            AcroFields form = stamper.AcroFields;
            XfaForm xfa = form.Xfa;
            xfa.FillXfaForm(XmlReader.Create(new StringReader(xml)));
        }
        return ms.ToArray();
    }
}
As you can see, it's not necessary to replace the whole XFA XML. If you use the FillXfaForm method, the data is sufficient.
Note: for the C# version of the examples, see http://tinyurl.com/iiacsCH08 (change the 08 into a number from 01 to 16 for the examples of the other chapters).
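The difference between the two answers above (replacing the whole XFA XML versus supplying only the data) can be illustrated with plain JDK DOM code. This is a sketch, not iText's implementation of FillXfaForm: it swaps the children of the data element inside a complete XFA document and leaves the rest of the document intact, much like the manual replacement in the XFAWorker question. The class name XfaDataMerge and its method are hypothetical, and the sample XML in the usage note is invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XfaDataMerge {

    static final String XFA_DATA_NS = "http://www.xfa.org/schema/xfa-data/1.0/";

    private static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Replaces the children of the first xfa:data element in fullXfaXml with
    // the root element of newDataXml and returns the merged document as text.
    static String mergeData(String fullXfaXml, String newDataXml) {
        try {
            Document full = parse(fullXfaXml);
            Document newData = parse(newDataXml);
            NodeList list = full.getElementsByTagNameNS(XFA_DATA_NS, "data");
            if (list.getLength() == 0) {
                throw new IllegalArgumentException("No xfa:data element found");
            }
            Node data = list.item(0);
            while (data.getFirstChild() != null) {
                data.removeChild(data.getFirstChild());
            }
            data.appendChild(full.importNode(newData.getDocumentElement(), true));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(full), new StreamResult(out));
            return out.toString();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("Could not merge XFA data", e);
        }
    }
}
```

Given a full document such as an xdp root containing a template element and an xfa:datasets/xfa:data element, mergeData(full, "<form><name>John</name></form>") keeps the template and replaces only the old data. A merged document of this kind is what the DomDocument approach needs, whereas FillXfaForm accepts the data portion alone.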