PdfCopy breaks the string replacement of PDF using iTextSharp - itext

I have two pdfs, lets say content and header.
Now I want to add this header to the top of content.
So first, I have used Document, PdfWriter then took the PdfContentByte (canvas) to place the header.pdf inside content with desired location.
This addTemplate method breaks the format of PDF.
I already have string replacement method after this merging, which is not working now after this merge method.
After the suggestions from iText, I tried to use PdfCopy or PdfSmartCopy.
But how can we place this header in desired location using PdfCopy ?
If not, then how to merge without breaking the format of PDF using writer / pdfContentByte ?
Note:
I have tried to merge the Pdfs by using PdfCopy, but again its breaking.
PdfObject pdfObject = dict.GetDirectObject(PdfName.CONTENTS);
(pdfObject is PRStream)
Here the pdfObject is not a PRStream instead it becomes PdfArray.
Please help me to resolve this issue.

Related

itext7 pdf to image

I am using iText7(java) and am looking for a way to convert a pdf page to image.
In older iText versions you could do this :
PdfImportedPage page = writer.getImportedPage(reader, 1);
Image image = Image.getInstance(page);
But iText7 does not have PdfImportedPage .
My use case, I have a one page pdf file. I need to add a table and resize the pdf contents to fit a single page. In old iText I would create a page , add table, convert existing pdf page to image, resize image and add that resized image to new page. Is there a new way to do this in iText7.
Thanks to Bruno's answer I got this working with following code :
PdfPage origPage = readerDoc.getPage(1);
Rectangle rect = origPage.getPageSize();
Document document = new Document(writerDoc);
Table wrapperTable = new Table(1);
Table containerTable = new Table(new float[]{0.5f,0.5f});
containerTable.setWidthPercent(100);
containerTable.addCell( "col1");
containerTable.addCell("col2");
PdfFormXObject pageCopy = origPage.copyAsFormXObject(writerDoc);
Image image = new Image(pageCopy);
image.setBorder(Border.NO_BORDER);
image.setAutoScale(true);
image.setHeight(rect.getHeight()-250);
wrapperTable.addCell(new Cell().add(containerTable).setBorder(Border.NO_BORDER));
wrapperTable.addCell(new Cell().add(image).setBorder(Border.NO_BORDER));
document.add(wrapperTable);
document.close();
readerDoc.close();
Please read the official documentation for iText 7, more specifically Chapter 6: Reusing existing PDF documents
In PDF, there's the concept of Form XObjects. A Form XObject is a piece of PDF content that is stored outside the content stream of a page, hence XObject which stands for eXternal Object. The use of the word Form in Form XObject could be confusing, because people might be thinking of a form as in a fillable form with fields. To avoid that confusing, we introduced the term PdfTemplate in iText 5.
The class PdfImportedPage you refer to was a subclass of PdfTemplate: it was a piece of PDF syntax that could be reused in another page. Over the years, we noticed that people also got confused by the word PdfTemplate.
In iText 7, we returned to the basics. When talking about a Form XObject, we use the class PdfFormXObject. When talking about a page in a PDF file, we use the class PdfPage.
This is how we get a PdfPage from an existing document:
PdfDocument origPdf = new PdfDocument(new PdfReader(src));
PdfPage origPage = origPdf.getPage(1);
This is how we use that page in a new document:
PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
PdfFormXObject pageCopy = origPage.copyAsFormXObject(pdf);
If you want to use that pageCopy as an Image, just create it like this:
Image image = new Image(pageCopy);

Setting AcroField text color in itextSharp

I am using iTextSharp 5.5.3 i have a PDF with named fields i created with Adobe lifecycle I am able to fill the fields using iTextSharp but when i change the textcolor for a field it does not change. i really dont know why this is so. here is my code below
form.SetField("name", "Michael Okpara");
form.SetField("session", "2014/2015");
form.SetField("term", "1st Term");
form.SetFieldProperty("name", "textcolor", BaseColor.RED, null);
form.RegenerateField("name");
If your form is created using Adobe LifeCycle, then there are two options:
You have a pure XFA form. XFA stands for the XML Forms Architecture and your PDF is nothing more than a container of an XML stream. There is hardly any PDF syntax in the document and there are no AcroForm fields. I don't think this is the case, because you are still able to fill out the fields (which wouldn't work if you had a pure XFA form).
You have a hybrid form. In this case, the form is described twice inside the PDF file: once using an XML stream (XFA) and once using PDF syntax (AcroForm). iText will fill out the fields in both descriptions, but the XFA description gets preference when rendering the document. Changing the color of a field (or other properties) would require changing the XML and iText(Sharp) can not do that.
If I may make an educated guess, I would say that you have a hybrid form and that you are only changing the text color of the AcroForm field without changing the text color in the XFA field (which is really hard to achieve).
Please try adding this line:
form.RemoveXfa();
This will remove the XFA stream, resulting in a form that only keeps the AcroForm description.
I have written a small example named RemoveXFA using the form you shared to demonstrate this. This is the C#/iTextSharp version of that example:
public void ManipulatePdf(String src, String dest)
{
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileStream(dest, FileMode.Create));
AcroFields form = stamper.AcroFields;
form.RemoveXfa();
IDictionary<String, AcroFields.Item> fields = form.Fields;
foreach (String name in fields.Keys)
{
if (name.IndexOf("Total") > 0)
form.SetFieldProperty(name, "textcolor", BaseColor.RED, null);
form.SetField(name, "X");
}
stamper.Close();
reader.Close();
}
In this example, I remove the XFA stream and I look over all the remaining AcroFields. I change the textcolor of all the fields with the word "Total" in their name, and I fill out every field with an "X".
The result looks like this: reportcard.pdf
All the fields show the letter "X", but the fields in the TOTAL column are written in red.
I finally found a way, guess the problem was coming from using Adobe LC, so i switched to Open Office it all worked but when i flatten the form everything disappears. I found a solution to that here ITextSharp PDFTemplate FormFlattening removes filled data
Thanks Mr Lowagie for your help

Creating a PDF image in iText

I'm trying to add an image that I have on the filesystem as a pdf into a pdf that I'm creating on the fly.
I tried to use the Image class, but it seems that it does not work with PDFs (only JPEG, PNG or GIF). How can I create an element from an existing PDF, so that I can place it in my new PDF?
Please download chapter 6 of my book and read all about the class PdfImportedPage.
In the most basic example, you'd create a PdfReader instance and import the page into the PdfWriter instance, from which point on you can use the PdfImportedPage instance, either directly, or wrapped inside an Image object:
PdfReader reader = new PdfReader(existingPdf);
PdfImportedPage page = writer.getImportedPage(reader, i);
Image img = Image.getInstance(page);
reader.close();

Merging documents using OpenXml and section breaks causes empty paragraphs

I am stitching a couple of documents together with a requirement that each document should retain its header and footer information in the final document. Using AltChunk instead of raw OpenXml or DocumentBuilder saves a lot of effort with regards to styles, formatting, references, parts, etc.
Unfortunately, after a couple of days I can't seem to get a 100% working version due to a small and frustrating issue and I need some insight.
My code is loosly based on this article
I modify each sub document, prior to appending it (as an AltChunk) to a working document, by moving the last section properties into the last paragraph (in order to retain header and footer references), but Word seems to be adding a blank paragraph to each of these documents as it renders them in the final document. I end up with:
document 1 with correct header and footer
section properties/break
blank paragraph
document 2 with correct header and footer
section properties/break
blank paragraph
etc.
I cant remove the blank paragraphs afterwards, as I ideally don't want to use WAS to render the document first.
It seems as if you cannot have a next-page section break without a following paragraph?
After further investigation, it seems that will not be away around my usage scenario. I would need to place the last section properties in the body element, but due to my way of processing with nested AltChunk, it would not work.
I have changed my approach completely and went back to a more detailed append procedure using OpenXml Power Tools and some LINQ to Xml.
I'm using Document Builder and works perfectly for me!
var sources = new List<OpenXmlPowerTools.Source>();
sources.Add(new OpenXmlPowerTools.Source(new WmlDocument(#tempReportPart1)));
sources.Add(new OpenXmlPowerTools.Source(new WmlDocument(#tempReportPart2)));
var outputPath = #"C:\Users\xpto\Documents\TestFolder\myNewDocument.docx";
DocumentBuilder.BuildDocument(sources, outputPath);
I have the similar empty paragraph issue while importing HTML files.
My solution is,
After inserting HTML AltChunk, I add a GUID place holder. After processing the file, I will open the file again, locate the GUID and check if there is a empty paragraph before it, if so remove the empty paragraph and GUID. it seems work perfectly in my solution.
Hope it helps.

Rich Text Content Fields with iTextSharp

I'm using iTextSharp to create pdf files on the fly for my users to save on their computers. Currently the way it works is that I had several pdf templates that iTextSharp opens, sets fields from the database and save on user computer.
I'm having real problems with adding rich content into the body field which the users enter using a Rich Text editor (NiceEdit). It contains really simple options such as bullets, bold, italic, font colors, sizes etc. which is all what they need. I tried all possible options out there but nothing seem to help.
PdfReader reader = new PdfReader(originalReport.ToString());
PdfStamper stamper = new PdfStamper(reader, new FileStream(report, FileMode.Create));
AcroFields fields = stamper.AcroFields;
fields.SetFieldProperty("body", "setfflags", PdfFormField.FF_RICHTEXT, null);
fields.SetFieldRichValue("body", bodyValue.ToString());
fields.GenerateAppearances = false;
I tried flattening the form but it doesn't display the field at all. If I didn't flatten the form then the field displays the content but with no format at all (even new lines are removed) and if I used the SetField option instead of the SetFieldRichValue I get a string of the html content
I also tried allowing the pdf template field to "Allow Rich Text" from acrobat pro but this gives an exception and halts the pdf :)
Also please note that since I'm using a template that depends on the type of the user, the SetField option is used, I don't create the document from scratch hence I don't use paragraphs which I can get if I used the HTMLWorker and simply add to a document
Any help please?