"Content can not be added to a PdfImportedPage." error - itext

I am trying to download and merge multiple PDF files using iTextSharp.
It used to work, but now I get a "Content can not be added to a PdfImportedPage." error on this line:
importedPage = writer.GetImportedPage(reader, currentPageIndex);
The full code is below; any help would be much appreciated.
private string MergeDocuments(IList<string> fileUrls, string fileName)
{
var reportFolder = this.ReportFolder + "\\";
using (MemoryStream output = new MemoryStream())
{
Document document = new Document();
try
{
// Initialize pdf writer
PdfWriter writer = PdfWriter.GetInstance(document, output);
// Open document to write
document.Open();
PdfContentByte content = writer.DirectContent;
PdfImportedPage importedPage;
// Iterate through all pdf documents
foreach (var url in fileUrls)
{
// Create pdf reader
using (PdfReader reader = new PdfReader(new Uri(url)))
{
int numberOfPages = reader.NumberOfPages;
// Iterate through all pages
for (int currentPageIndex = 1; currentPageIndex <= numberOfPages; currentPageIndex++)
{
// Determine page size for the current page
document.SetPageSize( reader.GetPageSizeWithRotation(currentPageIndex) );
// Create page
document.NewPage();
importedPage = writer.GetImportedPage(reader, currentPageIndex);
content.AddTemplate(importedPage, 1f, 0, 0, 1f, 0, 0);
}
}
}
}
catch (Exception exception)
{
throw new Exception("Error occurred", exception);
}
File.WriteAllBytes(reportFolder + fileName + ".pdf", output.GetBuffer());
}
return "Reports/" + fileName + ".pdf";
}
When I try the following code instead, I get a null reference exception in the AddDocument() method:
using (MemoryStream output = new MemoryStream()) {
Document document = new Document();
document.Open();
PdfCopy copy = new PdfSmartCopy(document, output);
foreach (var url in fileUrls) {
using (WebClient client = new WebClient()) {
var byteArray = client.DownloadData(url);
PdfReader reader = new PdfReader(byteArray);
copy.AddDocument(reader);
reader.Close();
}
}
}

I found the problem: the document object must be closed before the memory stream is written to the file.
I added document.Close() as shown below.
document.Close();
File.WriteAllBytes(reportFolder + fileName + ".pdf", output.GetBuffer());
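
The reason closing matters is that iText writes the end of the file (cross-reference table and trailer) only when the document is closed, so bytes read earlier are incomplete. The sketch below is an analogy in plain Java, not iText code: `GZIPOutputStream` stands in for the PDF writer because it also emits its trailer on close. The class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: shows why the sink must be read only after the writer is closed.
class CloseBeforeRead {

    /** Reads the sink while the writer is still open: the data and trailer are missing. */
    static byte[] snapshotBeforeClose(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            GZIPOutputStream writer = new GZIPOutputStream(sink);
            writer.write(text.getBytes(StandardCharsets.UTF_8));
            byte[] tooEarly = sink.toByteArray(); // taken before close()
            writer.close();
            return tooEarly;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the writer first (try-with-resources), then reads the sink. */
    static byte[] snapshotAfterClose(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (GZIPOutputStream writer = new GZIPOutputStream(sink)) {
                writer.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return sink.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decompresses a complete gzip byte array back into text. */
    static String decode(byte[] gz) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The early snapshot is shorter than the finished one and cannot be decoded, which mirrors the unusable PDF produced when the stream is written out before document.Close().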

Related

How to merge two PDF files without losing layers using iTextSharp in C#?

How can I merge two PDF files that contain layers using iTextSharp in C#? I tried the code below, but the layers are lost.
// step 1: creation of a document-object
Document document = new Document();
// step 2: we create a writer that listens to the document
PdfCopy writer = new PdfCopy(document, new FileStream(outPutFilePath, FileMode.Create));
if(writer == null)
{
return;
}
// step 3: we open the document
document.Open();
foreach(string fileName in filesPath)
{
// we create a reader for a certain document
PdfReader reader = new PdfReader(fileName);
reader.ConsolidateNamedDestinations();
// step 4: we add content
for(int i = 1; i <= reader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(reader, i);
page.ContentTagged = true;
writer.AddPage(page);
}
PRAcroForm form = reader.AcroForm;
if(form != null)
{
// writer.CopyDocumentFields(reader);
}
reader.Close();
}
// step 5: we close the document and writer
writer.Close();
document.Close();

Empty content on the downloaded PDF using itextsharp in WebAPI 2 response

public IHttpActionResult DownloadPDF()
{
var stream = CreatePdf();
return ResponseMessage(new HttpResponseMessage
{
Content = new StreamContent(stream)
{
Headers =
{
ContentType = new MediaTypeHeaderValue("application/pdf"),
ContentDisposition = new ContentDispositionHeaderValue("attachment")
{
FileName = "myfile.pdf"
}
}
},
StatusCode = HttpStatusCode.OK
});
}
Here is the CreatePdf method:
private Stream CreatePdf()
{
using (var document = new Document(PageSize.A4, 50, 50, 25, 25))
{
var output = new MemoryStream();
var writer = PdfWriter.GetInstance(document, output);
writer.CloseStream = false;
document.Open();
document.Add(new Paragraph("Hello World"));
document.Close();
output.Seek(0, SeekOrigin.Begin);
return output;
}
}
I am able to download the PDF, but its content is empty. I am using a memory stream here; I also tried a file stream, which writes the file to the expected folder, but that file is empty too when I open it. Can anyone tell me what I'm missing?
Here is an approach that usually works for me when using Web API:
private byte[] CreatePdf() {
var buffer = new byte[0];
//stream to hold output data
var output = new MemoryStream();
//creation of a document-object
using (var document = new Document(PageSize.A4, 50, 50, 25, 25)) {
//create a writer that listens to the document
// and directs a PDF-stream to output stream
var writer = PdfWriter.GetInstance(document, output);
//open the document
document.Open();
// Create a page in the document
document.NewPage();
// Get the top layer to write some text
var pdfContentBytes = writer.DirectContent;
pdfContentBytes.BeginText();
//add content to page
document.Add(new Paragraph("Hello World"));
//done writing text
pdfContentBytes.EndText();
// make sure any data in the buffer is written to the output stream
writer.Flush();
document.Close();
}
// ToArray returns only the bytes written; GetBuffer would also return unused buffer capacity
buffer = output.ToArray();
return buffer;
}
And then in the action:
public IHttpActionResult DownloadPDF() {
var buffer = CreatePdf();
return ResponseMessage(new HttpResponseMessage {
Content = new StreamContent(new MemoryStream(buffer)) {
Headers = {
ContentType = new MediaTypeHeaderValue("application/pdf"),
ContentDisposition = new ContentDispositionHeaderValue("attachment") {
FileName = "myfile.pdf"
}
}
},
StatusCode = HttpStatusCode.OK
});
}

Issue in converting HTML to PDF containing <pre> tag with Flying Saucer and ITEXT

I am using the Flying Saucer library to convert HTML to PDF. It works fine for most HTML files.
But for some HTML files that contain tags inside a pre tag, the generated PDF displays those tags as text.
If I remove the pre tags, the formatting of the data is lost.
My code is:
org.w3c.dom.Document document = null;
try {
Document doc = Jsoup.parse(new File(htmlFile), "UTF-8", "");
Whitelist wl = new RelaxedPlusDataBase64Images();
Cleaner cleaner = new Cleaner(wl);
doc = cleaner.clean(doc);
Tidy tidy = new Tidy();
tidy.setShowWarnings(false);
tidy.setXmlTags(false);
tidy.setInputEncoding("UTF-8");
tidy.setOutputEncoding("UTF-8");
tidy.setPrintBodyOnly(true);
tidy.setXHTML(true);
tidy.setMakeClean(true);
tidy.setAsciiChars(true);
if (doc.select("pre").html().contains("</")) {
doc.select("pre").unwrap();
}
Reader reader = new StringReader(doc.html());
document = (tidy.parseDOM(reader, null));
Element element = (Element) document.getElementsByTagName("head").item(0);
element.getParentNode().removeChild(element);
NodeList elements = document.getElementsByTagName("img");
for (int i = 0; i < elements.getLength(); i++) {
String value = elements.item(i).getAttributes().getNamedItem("src").getNodeValue();
if (value != null && value.startsWith("cid:") && value.contains("#")) {
value = value.substring(value.indexOf("cid:") + 4, value.indexOf("#"));
elements.item(i).getAttributes().getNamedItem("src").setNodeValue(value);
System.out.println(value);
}
}
document.normalize();
System.out.println(getNiceLyFormattedXMLDocument(document));
} catch (Exception e) {
System.out.println(e);
}
The method that creates the PDF is:
try {
org.w3c.dom.Document doc = CleanHtml.cleanNTidyHTML("b.html");
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(doc, null);
renderer.setPDFVersion(new Character('7'));
String outputFile = "test.pdf";
OutputStream os = new FileOutputStream(outputFile);
renderer.layout();
renderer.createPDF(os);
os.flush();
os.close();
} catch (Exception e) {
e.printStackTrace();
}
Using the iText XMLWorker:
try {
org.w3c.dom.Document doc = CleanHtml.cleanNTidyHTML("a.html");
String k = CleanHtml.getNiceLyFormattedXMLDocument(doc);
OutputStream file = new FileOutputStream(new File("test.pdf"));
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, file);
document.open();
ByteArrayInputStream is = new ByteArrayInputStream(k.getBytes());
XMLWorkerHelper.getInstance().parseXHtml(writer, document, is);
document.close();
file.close();
} catch (Exception e) {
e.printStackTrace();
}
public static String getNiceLyFormattedXMLDocument(org.w3c.dom.Document doc) throws IOException, TransformerException {
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
// transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
Writer stringWriter = new StringWriter();
StreamResult streamResult = new StreamResult(stringWriter);
transformer.transform(new DOMSource(doc), streamResult);
String result = stringWriter.toString();
return result;
}

iTextSharp - Create new document as Byte[]

I have a small method that goes to the database, retrieves a PDF document from a varbinary column, and then adds data to it. I would like to add code so that if this document (company stationery) is not found, a new blank document is created and returned. The method could return either a Byte[] or a Stream.
The problem is that the variable "bytes" in the else clause is null.
Any ideas what's wrong?
private Byte[] GetBasePDF(Int32 AttachmentID)
{
Byte[] bytes = null;
DataTable dt = ServiceFactory
.GetService().Attachments_Get(AttachmentID, null, null);
if (dt != null && dt.Rows.Count > 0)
{
bytes = (Byte[])dt.Rows[0]["Data"];
}
else
{
// Create a new blank PDF document and return it as Byte[]
ITST.Document doc =
new ITST.Document(ITST.PageSize.A4, 50f, 50f, 25f, 25f);
MemoryStream ms = new MemoryStream();
PdfCopy copy = new PdfCopy(doc, ms);
ms.Position = 0;
bytes = ms.ToArray();
}
return bytes;
}
You are trying to use PdfCopy but that's intended for existing documents, not new ones. You just need to create a "blank" document using PdfWriter and Document. iText won't let you create a 100% empty document but the code below essentially does that by just adding a space.
private static Byte[] CreateEmptyDocument() {
using (var ms = new System.IO.MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
doc.Add(new Paragraph(" "));
doc.Close();
}
}
return ms.ToArray();
}
}
I think you may need to use
bytes = ms.GetBuffer();
not
bytes = ms.ToArray();
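
For reference, the two calls differ in what they return: in .NET, MemoryStream.GetBuffer() exposes the stream's whole internal array, which can be longer than the data written, while ToArray() copies only the written bytes. The sketch below is a rough Java analogy using `ByteBuffer` (an assumption for illustration, not iTextSharp or .NET code); the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of "whole internal buffer" versus "exact copy of written bytes".
class BufferVersusCopy {

    /** A 256-byte buffer into which only 4 bytes have been written. */
    static ByteBuffer sample() {
        ByteBuffer buf = ByteBuffer.allocate(256);
        buf.put("%PDF".getBytes(StandardCharsets.US_ASCII));
        return buf;
    }

    /** Comparable to GetBuffer(): the full backing array, unused capacity included. */
    static byte[] wholeBuffer(ByteBuffer buf) {
        return buf.array();
    }

    /** Comparable to ToArray(): a copy trimmed to the number of bytes written. */
    static byte[] writtenBytes(ByteBuffer buf) {
        return Arrays.copyOf(buf.array(), buf.position());
    }
}
```

Writing the whole buffer to a file appends the unused zero bytes after the real data, so the trimmed copy is normally the one that holds exactly the PDF.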

iText rotation creates pdf which displays out of memory exception

The following code snippet creates a PDF file in which pages can be rotated. This works fine for most PDF files. But in one particular PDF file (version 1.6), the page is already rotated by 180 degrees; applying a further rotation to it, e.g. 90 degrees, and saving the file corrupts it. In fact, even if I don't rotate the page and simply write it out to another file using iText, the resulting PDF is corrupted and displays an out of memory exception when opened in Adobe Reader.
Why would that happen? Am I missing some sort of compression in the file?
private String createPdfFileWithoutForms(final EditStateData[] editStateData, final String directory)
throws EditingException {
Long startTime = System.currentTimeMillis();
File pdfFileToReturn = new File(directory + File.separator + UidGenerator.generate() + ".pdf");
com.lowagie.text.Document document = null;
FileOutputStream outputStream = null;
PdfCopy pdfCopy = null;
PdfReader reader = null;
PdfDictionary pageDict = null;
int rotationAngle = 0;
Map<Integer, Integer> rotationQuadrants = null;
try {
document = new com.lowagie.text.Document();
outputStream = new FileOutputStream(pdfFileToReturn);
pdfCopy = new PdfCopy(document, outputStream);
pdfCopy.setFullCompression();
pdfCopy.setCompressionLevel(9);
document.open();
for (EditStateData state : editStateData) {
try {
reader = new PdfReader(state.getFileName());
reader.selectPages(state.getPages());
rotationQuadrants = state.getRotationQuadrants();
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
// Rotation quadrant key is the source page number
if (rotationQuadrants.containsKey(state.getPages().get(i - 1))) {
rotationAngle = reader.getPageRotation(i);
pageDict = reader.getPageN(i);
pageDict.put(PdfName.ROTATE,
new PdfNumber((rotationAngle
+ rotationQuadrants.get(state.getPages().get(i - 1))) % 360));
}
document.setPageSize(reader.getPageSizeWithRotation(i));
document.newPage();
// import the page from source pdf
PdfImportedPage page = pdfCopy.getImportedPage(reader, i);
// add the page to the destination pdf
pdfCopy.addPage(page);
}
} catch (final IOException e) {
LOGGER.error(e.getMessage(), e);
throw new EditingException(e.getMessage(), e);
} finally {
if (reader != null) {
reader.close();
}
}
}
} catch (final Exception e) {
LOGGER.error(e.getMessage(), e);
throw new EditingException(e.getMessage(), e);
} finally {
if (document != null) {
document.close();
}
if (pdfCopy != null) {
pdfCopy.close();
}
IoUtils.closeQuietly(outputStream);
}
LOGGER.debug("Combining " + editStateData.length + " pdf files took "
+ (System.currentTimeMillis() - startTime) + " msecs");
return pdfFileToReturn.getAbsolutePath();
}
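
The rotation arithmetic in the method above, (current rotation + requested rotation) % 360, can be looked at in isolation. The sketch below assumes the values in rotationQuadrants are degrees, since the code adds them directly to the result of getPageRotation; the class and method names are hypothetical. It uses Math.floorMod because Java's % operator returns a negative result for a negative sum, and it rejects values that are not multiples of 90, which is what the PDF /Rotate entry requires.

```java
// Hypothetical helper for combining page rotations; not part of iText.
class PageRotation {

    /**
     * Combines an existing page rotation with an extra rotation, both in degrees,
     * and normalises the result into the range 0..359.
     */
    static int combine(int currentDegrees, int extraDegrees) {
        if (currentDegrees % 90 != 0 || extraDegrees % 90 != 0) {
            throw new IllegalArgumentException("rotations must be multiples of 90 degrees");
        }
        // floorMod keeps the result non-negative even when the sum is negative
        return Math.floorMod(currentDegrees + extraDegrees, 360);
    }

    public static void main(String[] args) {
        System.out.println(combine(180, 90));
        System.out.println(combine(0, -90));
    }
}
```

For the file described in the question, a page already at 180 degrees with a further 90 degrees applied would get a /Rotate value of 270. This only checks the arithmetic; it does not explain the corruption seen when the file is copied without any rotation.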