After reading many stackoverflow and tried many solutions I'm stuck with this:
I am receiving a PDF that I cannot change and need to automatically process it.
The PDF is a PDF form with 2 fields and submit button.
The following code is the closes I came to what I need to do:
public static final String SRC = "C:\\Dev\\test.pdf";
public static final String DEST = "C:\\Dev\\test_result.pdf";
public static final String DATA = "C:\\Dev\\data.xml";
File file = new File(DEST);
PdfReader reader = new PdfReader(SRC);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(DEST));
AcroFields form = stamper.getAcroFields();
XfaForm xfa = form.getXfa();
xfa.fillXfaForm(new FileInputStream(DATA));
This gives a null pointer:
Exception in thread "main" java.lang.NullPointerException
at com.itextpdf.text.pdf.XfaForm.fillXfaForm(XfaForm.java:1168)
at com.itextpdf.text.pdf.XfaForm.fillXfaForm(XfaForm.java:1146)
at com.itextpdf.text.pdf.XfaForm.fillXfaForm(XfaForm.java:1134)
at com.itextpdf.text.pdf.XfaForm.fillXfaForm(XfaForm.java:1131)
I can get and set the fields on the form with this code:
AcroFields fields = reader.getAcroFields();
fields.setField("pdfForm.loginUser", "myemail#domain.com");
fields.setField("pdfForm.loginPass", "mypassword");
How do I convert the Acrofields to XfaForm?
Related
I have editable pdf with fields name, roll number , DOB and city, I want to validate the fields like, for example when I enter the value in roll number field, if name is empty then it should alert "name is empty" and similar to roll number and DOB and city.
DOB - Need to validate mm/dd/yyyy format.
I am able to get fields and values using below iTextSharp code, but how can validate and and show alert to user.
iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(filename);
using (MemoryStream outputStream = new MemoryStream())
{
//PdfStamper stamper = new PdfStamper(reader, );
using (PdfStamper stamper = new PdfStamper(reader, outputStream, '\0', true))
{
PdfWriter writer = stamper.Writer;
AcroFields acroFields = reader.AcroFields;
AcroFields.Item dateField = acroFields.GetFieldItem("D");
iTextSharp.text.pdf.PdfAction pdfAction = iTextSharp.text.pdf.PdfAction.JavaScript("app.alert('hello');", writer);
iTextSharp.text.pdf.PdfDictionary widgetRefDict = (iTextSharp.text.pdf.PdfDictionary)iTextSharp.text.pdf.PdfReader.GetPdfObject(dateField.GetWidgetRef(0));
iTextSharp.text.pdf.PdfDictionary actionDict = widgetRefDict.GetAsDict(iTextSharp.text.pdf.PdfName.AA);
if (actionDict == null)
{
actionDict = new iTextSharp.text.pdf.PdfDictionary();
// add the newly created action dict
widgetRefDict.Put(iTextSharp.text.pdf.PdfName.AA, actionDict);
}
actionDict.Put(iTextSharp.text.pdf.PdfName.V, pdfAction);
stamper.Close();
reader.Close();
}
byte[] content = outputStream.ToArray();
// Write out PDF from memory stream.
using (FileStream fs = File.Create("d:\\Page1-output.pdf"))
{
fs.Write(content, 0, (int)content.Length);
}
}
I try to generate a toc(table of content) for my pdf, and I want to get some strings which look like chapter title in xxx.pdf using ITextExtractionStrategy. But I got com.itextpdf.kernel.PdfException when I am running a test.
Here is my code:
#org.junit.Test
public void test() throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfDocument pdfDoc = new PdfDocument(new PdfReader("src/test/resources/template/xxx.pdf"),
new PdfWriter(baos));
pdfDoc.addNewPage(1);
Document document = new Document(pdfDoc);
// when add this code, throw com.itextpdf.kernel.PdfException: Dictionary doesn't have supported font data.
Paragraph title = new Paragraph(new Text("index"))
.setTextAlignment(TextAlignment.CENTER);
document.add(title);
SimpleTextExtractionStrategy extractionStrategy = new SimpleTextExtractionStrategy();
for (int i = 1; i < pdfDoc.getNumberOfPages(); i++) {
PdfPage page = pdfDoc.getPage(i);
PdfCanvasProcessor parser = new PdfCanvasProcessor(extractionStrategy);
parser.processPageContent(page);
}
...
document.close();
pdfDoc.close();
new FileOutputStream("./yyy.pdf").write(baos.toByteArray());
}
Here is the output:
com.itextpdf.kernel.PdfException: Dictionary doesn't have supported font data.
at com.itextpdf.kernel.font.PdfFontFactory.createFont(PdfFontFactory.java:123)
at com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor.getFont(PdfCanvasProcessor.java:490)
at com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor$SetTextFontOperator.invoke(PdfCanvasProcessor.java:811)
at com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor.invokeOperator(PdfCanvasProcessor.java:454)
at com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor.processContent(PdfCanvasProcessor.java:282)
at com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor.processPageContent(PdfCanvasProcessor.java:303)
at com.example.pdf.util.Test.test(Test.java:138)
Whenever you add content to a PdfDocument like you do here
Document document = new Document(pdfDoc);
Paragraph title = new Paragraph(new Text("index"))
.setTextAlignment(TextAlignment.CENTER);
document.add(title);
you have to be aware that this content is not already stored in its final form; for example fonts used are not yet properly subset'ed. The final form is generated when you're closing the document.
Text extraction on the other hand requires the content to extract to be in its final form.
Thus, you should not apply text extraction to a document you're working on. In particular, don't apply text extraction to a page you've changed the content of.
If you need to extract text from the documents you create yourself, close your document first, open a new document from the output, and extract from that new document.
I have multiple copies of a .pdf document that are commented by different users. I would like to merge all these comments into a new pdf "merged".
I wrote this sub inside a class called document with properties "path" and "directory".
Public Sub MergeComments(ByVal pdfDocuments As String())
Dim oSavePath As String = Directory & "\" & FileName & "_Merged.pdf"
Dim oPDFdocument As New iText.Kernel.Pdf.PdfDocument(New PdfReader(Path),
New PdfWriter(New IO.FileStream(oSavePath, IO.FileMode.Create)))
For Each oFile As String In pdfDocuments
Dim oSecundairyPDFdocument As New iText.Kernel.Pdf.PdfDocument(New PdfReader(oFile))
Dim oAnnotations As New PDFannotations
For i As Integer = 1 To oSecundairyPDFdocument.GetNumberOfPages
Dim pdfPage As PdfPage = oSecundairyPDFdocument.GetPage(i)
For Each oAnnotation As Annot.PdfAnnotation In pdfPage.GetAnnotations()
oPDFdocument.GetPage(i).AddAnnotation(oAnnotation)
Next
Next
Next
oPDFdocument.Close()
End Sub
This code results in an exception that I am failing to solve.
iText.Kernel.PdfException: 'Pdf indirect object belongs to other PDF document. Copy object to current pdf document.'
What do I need to change in order to perform this task? Or am I completely off with my code block?
You need to explicitly copy the underlying PDF object to the destination document. After that you will be easily able to add that object to the list of page annotations.
Instead of adding the annotation directly:
oPDFdocument.GetPage(i).AddAnnotation(oAnnotation)
Copy the object to the destination document first, wrap it into PdfAnnotation class with makeAnnotation method and then add it as usual. Code is in Java but you will easily be able to convert it into VB:
PdfObject annotObject = oAnnotation.getPdfObject().copyTo(pdfDocument);
pdfDocument.getPage(i).addAnnotation(PdfAnnotation.makeAnnotation(annotObject));
Here is a working Java code, with annotations copied from one document to other using the copyTo method.
PdfReader reader = new PdfReader(new
RandomAccessSourceFactory().createBestSource(sourceFileName), null);
PdfDocument document = new PdfDocument(reader);
PdfReader toMergeReader = new PdfReader(new RandomAccessSourceFactory().createBestSource(targetFileName), null);
PdfDocument toMergeDocument = new PdfDocument(toMergeReader);
PdfWriter writer = new PdfWriter(targetFileName + "_MergedVersion.pdf");
PdfDocument writeDocument = new PdfDocument(writer);
int pageCount = toMergeDocument.getNumberOfPages();
for (int i = 1; i <= pageCount; i++) {
PdfPage page = document.getPage(i);
writeDocument.addPage(page.copyTo(writeDocument));
PdfPage pdfPage = toMergeDocument.getPage(i);
List<PdfAnnotation> pageAnnots = pdfPage.getAnnotations();
if (pageAnnots != null) {
for (PdfAnnotation pdfAnnotation : pageAnnots) {
PdfObject annotObject = pdfAnnotation.getPdfObject().copyTo(writeDocument);
writeDocument.getPage(i).addAnnotation(PdfAnnotation.makeAnnotation(annotObject));
}
}
}
reader.close();
toMergeReader.close();
toMergeDocument.close();
document.close();
writeDocument.close();
writer.close();
I am trying to generate pdf from template using iText. I've set value in AcroFields. when I opening output file, there is nothing on it.Is there anyone who had same problem before? .
FileOutputStream fos = new FileOutputStream(destPath);
PdfReader reader = new PdfReader(templatePath);
PdfStamper stamp = new PdfStamper(reader, fos);
AcroFields form = stamp.getAcroFields();
form.setField("parttype", "1dafd");
stamp.setFormFlattening(true);
stamp.close();
fos.flush();
fos.close();
I used examples available on web to create an application that is able to get xml structure of XFA form and then set it back filled. Important code looks like this:
public void readData(String src, String dest)
throws IOException, ParserConfigurationException, SAXException,
TransformerFactoryConfigurationError, TransformerException {
FileOutputStream os = new FileOutputStream(dest);
PdfReader reader = new PdfReader(src);
XfaForm xfa = new XfaForm(reader);
Node node = xfa.getDatasetsNode();
NodeList list = node.getChildNodes();
for (int i = 0; i < list.getLength(); i++) {
if ("data".equals(list.item(i).getLocalName())) {
node = list.item(i);
break;
}
}
Transformer tf = TransformerFactory.newInstance().newTransformer();
tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
tf.setOutputProperty(OutputKeys.INDENT, "yes");
tf.transform(new DOMSource(node), new StreamResult(os));
reader.close();
}
public void fillPdfWithXmlData(String src, String xml, String dest)
throws IOException, DocumentException {
PdfReader.unethicalreading = true;
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest), '\0', true);
AcroFields form = stamper.getAcroFields();
XfaForm xfa = form.getXfa();
xfa.fillXfaForm(new FileInputStream(xml));
stamper.close();
reader.close();
}
When I use it to fill this form: http://www.vzp.cz/uploads/document/tiskopisy-pro-zamestnavatele-hromadne-oznameni-zamestnavatele-verze-2-pdf-56-kb.pdf it works fine and I can see the filled form in Acrobat Reader. However, if I open the document in Foxit Reader I see a blank form (tested in latest version and 5.x version).
I tried to play a little bit with it and got these XfaForm(...).getDomDocument() data:
Filled by Acrobat Reader: http://pastebin.com/kXKyh9EM
Filled by Foxit Reader: http://pastebin.com/tiZ7EmfE
Filled by iText: http://pastebin.com/tTKLMERC
Filled by Foxit Reader after a fill by iText: http://pastebin.com/Uuq0jS4b
The field which was filled is . Is it possible to use iText in a way that it works even with Foxit Reader (and the XFA signature stays?)
In your Filled by iText example there's a superfluous <xfa:data> element:
<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
<xfa:data>
<xfa:data>
<HOZ>
<!-- rest of your data here -->
</HOZ>
</xfa:data>
</xfa:data>
<!-- data description -->
</xfa:datasets>
This is because the fillXfaForm() method of XfaForm expects the XML data without <xfa:data> as the root element. So your XML data should just look like:
<HOZ>
<!-- rest of your data here -->
</HOZ>
I see that your readData() method that extract the existing form data including the <xfa:data> element:
<xfa:data>
<HOZ>
<!-- rest of your data here -->
</HOZ>
</xfa:data>
Stripping the outer element should fix your problem. For example:
XfaForm xfa = new XfaForm(reader);
Node node = xfa.getDatasetsNode();
NodeList list = node.getChildNodes();
for (int i = 0; i < list.getLength(); i++) {
if ("data".equals(list.item(i).getLocalName())) {
node = list.item(i);
break;
}
}
// strip <xfa:data>
node = node.getFirstChild();
// Transformer code here