Create Multi Page PDF Document from One Page Document - itext

I'm using iText7 for .NET c#. I'm trying to create a multi page PDF document (n-number of pages) in memory where the source pdf document consists of only one page.
I can create a new document with the one page, but I'm unable to create the additional pages as needed. I've tried:
MemoryStream ms = new MemoryStream();
Stream s1 = Assembly.GetExecutingAssembly().GetManifestResourceStream("doc.pdf");
s1.CopyTo(ms);
PdfReader readerSrc = new PdfReader(ms);
PdfDocument srcPdfDoc = new PdfDocument(readerSrc);
MemoryStream msDest = new ByteArrayOutputStream();
PdfDocument destPdfDoc = new PdfDocument(new PdfWriter(msDest));
srcPdfDoc.CopyPagesTo(1, 1, destPdfDoc);
destPdfDoc.AddNewPage(1, PageSize.A4);
srcPdfDoc.CopyPagesTo(1, 2, destPdfDoc);
But then I get an ArgumentOutOfRange exception...Index out of range...etc....
I've tried closing the destPdfDoc and then re-opening it, thinking that the second page hadn't been written until I close it. But when I open the destPdfDoc the second time, it has no pages. I can't figure out how to open the destPdfDoc in something like an "Append" mode, if that makes sense. Bottom line, I'm lost.
I've done this using iTextSharp, but then when using the new iText7, the library changed and my old code doesn't work anymore.

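I figured it out. As far as I can tell, the two integer arguments of the copy call (CopyPagesTo in iText 7) are a page range in the source document, so asking for pages 1 to 2 of a one-page source is out of range. Merging page 1 twice with a PdfMerger works: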
MemoryStream ms = new MemoryStream();
Stream s1 = Assembly.GetExecutingAssembly().GetManifestResourceStream("doc.pdf");
s1.CopyTo(ms);
// Rewind so the reader starts at the beginning of the copied data
ms.Position = 0;
PdfReader readerSrc = new PdfReader(ms);
PdfDocument srcPdfDoc = new PdfDocument(readerSrc);
MemoryStream msDest = new ByteArrayOutputStream();
PdfDocument destPdfDoc = new PdfDocument(new PdfWriter(msDest));
PdfMerger merger = new PdfMerger(destPdfDoc);
// Each call appends a copy of source page 1 to the destination
merger.Merge(srcPdfDoc, 1, 1);
merger.Merge(srcPdfDoc, 1, 1);
destPdfDoc.Close();
srcPdfDoc.Close();
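For n pages rather than two, the same Merge call can go in a loop. This is only a sketch of the idea above, assuming the iText 7 for .NET packages are referenced; the class name PageRepeater and the method Repeat are made up for illustration and are not part of iText.

```csharp
using System;
using System.IO;
using iText.Kernel.Pdf;
using iText.Kernel.Utils;

// Hypothetical helper, not part of iText.
public static class PageRepeater
{
    // Returns a PDF whose pages are `copies` copies of page 1 of `source`.
    // `source` must be positioned at the start of the PDF data.
    public static byte[] Repeat(Stream source, int copies)
    {
        if (copies < 1)
        {
            // A PdfDocument without pages cannot be closed, so require at least one copy.
            throw new ArgumentOutOfRangeException("copies");
        }

        MemoryStream output = new MemoryStream();
        PdfDocument srcPdfDoc = new PdfDocument(new PdfReader(source));
        PdfDocument destPdfDoc = new PdfDocument(new PdfWriter(output));

        PdfMerger merger = new PdfMerger(destPdfDoc);
        for (int i = 0; i < copies; i++)
        {
            // Append one more copy of source page 1.
            merger.Merge(srcPdfDoc, 1, 1);
        }

        // Closing the destination writes the PDF into `output`.
        destPdfDoc.Close();
        srcPdfDoc.Close();

        // ToArray() still works after the writer has closed the MemoryStream.
        return output.ToArray();
    }
}
```

Usage would be something like byte[] pdf = PageRepeater.Repeat(s1, 5); with s1 being the embedded resource stream from the question.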

Related

Set table using "setFixedPosition" on specified page in a reopened PDF document

This question is the follow up to another question on stackoverflow.
I open an existing PDF with this code snippet:
reader = New PdfReader(filenameSource)
writer = New PdfWriter(destFile)
pdf = New PdfDocument(reader, writer)
doc = New Document(pdf, pdf.GetDefaultPageSize, False)
I can add a paragraph now via doc.add(new Paragraph(...))
But when I try to place a table with table.setFixedPosition(...), the table doesn't show on the page.
Has anybody any hint for me?
Thanks and best regards
Benjamin
Based on your information I wrote this piece of code:
PdfReader reader = new PdfReader("LoremIpsum.pdf");
PdfWriter writer = new PdfWriter("LoremIpsum-with-positioned-table.pdf");
PdfDocument pdf = new PdfDocument(reader, writer);
Document doc = new Document(pdf, pdf.GetDefaultPageSize(), false);
Table table = new Table(new float[] { 200 });
table.AddCell(new Cell().Add(new Paragraph("test")).SetBackgroundColor(ColorConstants.CYAN));
table.SetFixedPosition(1, 100, 100, 200);
doc.Add(table);
doc.Close();
This didn't reproduce your issue:
when I try to place a table with table.setFixedPosition(...), the table doesn't show on the page.
In the resulting PDF (screenshot not preserved here), the table clearly shows.

HTML to PDF, text inside <pre> not rendered correctly in PDF

I am trying to convert html to pdf with iText API(itextpdf.5.5.7.jar, xmlworker 5.5.7.jar), everything is good except the text inside the pre tag.
In the HTML, the text inside the pre tag looks fine, but in the PDF the formatting is completely gone and the text comes out as normal lines. I checked the blogs but did not find a correct answer.
[Screenshots of the HTML and PDF output are not preserved.]
This is my code:
// Backslash must be escaped; "\t" in a Java string literal is a tab
String pdfFileName = "C:\\test.pdf";
com.itextpdf.text.Document document = new com.itextpdf.text.Document();
PdfWriter writer = PdfWriter.getInstance(document,
        new FileOutputStream(pdfFileName));
writer.setInitialLeading(12.5f);
document.open();
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
CSSResolver cssResolver = XMLWorkerHelper.getInstance()
        .getDefaultCssResolver(true);
Pipeline<?> pipeline = new CssResolverPipeline(cssResolver,
        new HtmlPipeline(htmlContext, new PdfWriterPipeline(document,
                writer)));
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser p = new XMLParser(worker);
File input = new File(completeHtmlFilePath);
p.parse(new InputStreamReader(new FileInputStream(input), "UTF-8"));
document.close();
return pdfFileNameWithPath;

How do I set an action for the root outline of a PDF

How do I set an action on the root Outline for a PDF?
I know I can do this on a kid of the root:
newOutline = new PdfOutline (rootOutline, PdfAction.GotoLocalPage ("1", false), rootNode.DivisionLabel, true);
But how do I do the same thing for the root?
I cannot set the root outline (it's read-only), and I cannot set an action for it either. I get started like this:
PdfReader inputPdf = new PdfReader (rs);
int pageCount = inputPdf.NumberOfPages;
PdfStamper stamper = new PdfStamper (inputPdf, ws);
PdfWriter writer = stamper.Writer;
writer.ViewerPreferences = PdfWriter.PageModeUseOutlines;
PdfContentByte cb = writer.DirectContent;
PdfOutline rootOutline = cb.RootOutline;
Thanks for the help...
I could never get the PdfWriter returned by stamper.Writer to work. I had to change my method so that it uses an independent PdfReader and PdfWriter pair, copying the pages from the input PDF to the output PDF while adding the needed local destinations and outlines. Grrr, very frustrating working with iTextSharp...
I'm not sure that you can set an action for the root Outline. When would it trigger? The root is just a container for any other outlines.
If you want to always Go To Page 1 when the document opens, then there are other ways of doing that.
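One of those other ways is an open action on the document. The snippet below is a rough sketch only, assuming iTextSharp 5.x; the class and method names are invented for illustration, and since the poster above reports trouble with the writer returned by the stamper, treat it as something to try rather than a confirmed fix.

```csharp
using System.IO;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
public static class OpenActionExample
{
    // Copies the input PDF to the output stream and adds an open action
    // that asks the viewer to show page 1, fitted to the window.
    public static void SetOpenOnFirstPage(Stream input, Stream output)
    {
        PdfReader reader = new PdfReader(input);
        using (PdfStamper stamper = new PdfStamper(reader, output))
        {
            PdfWriter writer = stamper.Writer;
            PdfDestination fit = new PdfDestination(PdfDestination.FIT);
            // The open action lives in the document catalog, not on an outline.
            writer.SetOpenAction(PdfAction.GotoLocalPage(1, fit, writer));
        }
        reader.Close();
    }
}
```

This leaves the outline tree alone, which matches the point that the root outline is only a container.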

create pdf from asp.net page view

I am trying to create a PDF file from an HTML page view with iTextSharp.
This code block works, but I also download an HTML string from a URL and I want to pass that string to the converter instead. That way I could convert any site view to PDF by giving a URL to the function. Thanks.
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment;filename=TestPage.pdf");
Response.Cache.SetCacheability(HttpCacheability.NoCache);
StringWriter sw = new StringWriter();
HtmlTextWriter hw = new HtmlTextWriter(sw);
WebClient wc = new WebClient();
string htmlText = wc.DownloadString("http://");
this.Page.RenderControl(hw);
StringReader sr = new StringReader(sw.ToString());
Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 100f, 0f);
HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
pdfDoc.Open();
htmlparser.Parse(sr);
pdfDoc.Close();
Response.Write(pdfDoc);
Response.End();
HTMLWorker is buggy and has been discontinued, in favor of XMLWorker.
The project page : http://sourceforge.net/projects/xmlworker/
The demo page : http://demo.itextsupport.com/xmlworker/
The documentation : http://demo.itextsupport.com/xmlworker/itextdoc/index.html
Even though the documentation refers to the Java version, adapting the examples to iTextSharp should be straightforward.
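As a starting point, here is a minimal sketch of the XMLWorker route in C#. It assumes the iTextSharp 5.x and itextsharp.xmlworker assemblies are referenced and that the input is well-formed XHTML, which arbitrary downloaded pages often are not; the class HtmlToPdf and its Convert method are illustrative names, not library API.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Hypothetical helper, not part of iTextSharp.
public static class HtmlToPdf
{
    // Converts a well-formed XHTML string into the bytes of a PDF.
    public static byte[] Convert(string xhtml)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            Document pdfDoc = new Document(PageSize.A4);
            PdfWriter writer = PdfWriter.GetInstance(pdfDoc, ms);
            pdfDoc.Open();
            using (StringReader sr = new StringReader(xhtml))
            {
                // XMLWorkerHelper replaces the HTMLWorker.Parse call.
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, pdfDoc, sr);
            }
            // Closing the document flushes the PDF into the stream.
            pdfDoc.Close();
            return ms.ToArray();
        }
    }
}
```

In the page code above, the string returned by wc.DownloadString(...) could be passed to Convert and the result sent with Response.BinaryWrite.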

creating a pdf from a template in itextsharp and outputting as content disposition.

I would like to open an existing PDF, add some text and then output it with a content-disposition header using iTextSharp. I have the following code. Where it falls down is that I want to output to a MemoryStream but need a FileStream to open the original file.
Here's what I have. Obviously defining PdfWriter twice won't work.
public static void Create(string path)
{
    var Response = HttpContext.Current.Response;
    Response.Clear();
    Response.ContentType = "application/pdf";
    System.IO.MemoryStream m = new System.IO.MemoryStream();
    Document document = new Document();
    PdfWriter wri = PdfWriter.GetInstance(document, new FileStream(path, FileMode.Create));
    PdfWriter.GetInstance(document, m);
    document.Open();
    document.Add(new Paragraph(DateTime.Now.ToString()));
    document.NewPage();
    document.Add(new Paragraph("Hello World"));
    document.Close();
    Response.OutputStream.Write(m.GetBuffer(), 0, m.GetBuffer().Length);
    Response.OutputStream.Flush();
    Response.OutputStream.Close();
    Response.End();
}
You've got a couple of problems that I'll try to walk you through.
First, the Document object is only for working with new PDFs, not modifying existing ones. Basically the Document object is a bunch of wrapper classes that abstract away the underlying parts of the PDF spec and allow you to work with higher level things like paragraphs and reflowable content. These abstractions turn what you think of as "paragraphs" into raw commands that write the paragraph one line at a time with no relationship between lines. When working with an existing document there's no safe way to say how to reflow text, so these abstractions aren't used.
Instead you want to use the PdfStamper object. When working with this object you have two choices for how to handle potentially overlapping content: either your new text gets written on top of existing content or it gets written below it. The two methods GetOverContent() and GetUnderContent() of an instantiated PdfStamper object each return a PdfContentByte object that you can then write text with.
There are two main ways to write text, either manually or through a ColumnText object. If you've done HTML you can think of the ColumnText object as a big fixed-position single row, single column <TABLE>. The advantage of the ColumnText is that you can use the higher level abstractions such as Paragraph.
Below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.2.0 that shows off the above. See the code comments for any questions. It should be pretty easy to convert this to ASP.Net.
using System;
using System.IO;
using System.Windows.Forms;
using iTextSharp.text;
using iTextSharp.text.pdf;

namespace WindowsFormsApplication1 {
    public partial class Form1 : Form {
        public Form1() {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e) {
            string existingFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "file1.pdf");
            string newFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "file2.pdf");

            //Create a simple one-page PDF to act as our "existing" file
            using (FileStream fs = new FileStream(existingFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (Document doc = new Document(PageSize.LETTER)) {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                        doc.Open();
                        doc.Add(new Paragraph("This is a test"));
                        doc.Close();
                    }
                }
            }

            //Bind a PdfReader to our first document
            PdfReader reader = new PdfReader(existingFile);
            //Create a new stream for our output file (this could be a MemoryStream, too)
            using (FileStream fs = new FileStream(newFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
                //Use a PdfStamper to bind our source file with our output file
                using (PdfStamper stamper = new PdfStamper(reader, fs)) {
                    //In case of conflict we want our new text to be written "on top" of any existing content
                    //Get the "Over" state for page 1
                    PdfContentByte cb = stamper.GetOverContent(1);

                    //Begin text command
                    cb.BeginText();
                    //Set the font information
                    cb.SetFontAndSize(BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1250, false), 16f);
                    //Position the cursor for drawing
                    cb.MoveText(50, 50);
                    //Write some text
                    cb.ShowText("This was added manually");
                    //End text command
                    cb.EndText();

                    //Create a new ColumnText object to write to
                    ColumnText ct = new ColumnText(cb);
                    //Create a single column whose lower left corner is at 100x100 and upper right is at 500x200
                    ct.SetSimpleColumn(100, 100, 500, 200);
                    //Add a higher level object
                    ct.AddElement(new Paragraph("This was added using ColumnText"));
                    //Flush the text buffer
                    ct.Go();
                }
            }

            this.Close();
        }
    }
}
As to your second problem about the FileStream vs MemoryStream, if you look at the method signature for almost every (actually all as far as I know) method within iTextSharp you'll see that they all take a Stream object and not just a FileStream object. Any time you see this, even outside of iTextSharp, this means that you can pass in any subclass of Stream which includes the MemoryStream object, everything else stays the same.
The code below is a slightly modified version of the one above. I've removed most of the comments to make it shorter. The main change is that we're using a MemoryStream instead of a FileStream. Also, when we're done with the PDF we need to close the PdfStamper object before accessing the raw binary data. (The using statement will do this for us automatically later, but it also closes the stream, so we need to do it manually here.)
One other thing: never, ever use the GetBuffer() method of the MemoryStream. It sounds like what you want (and I have mistakenly used it, too), but instead you want to use ToArray(). GetBuffer() returns the whole internal buffer, including unused bytes past the end of the data, which usually produces corrupt PDFs. Also, instead of writing to the HTTP Response stream I'm saving the bytes to an array first. From a debugging perspective this allows me to finish all of my iTextSharp and System.IO code and make sure that it is correct, then do whatever I want with the raw byte array. In my case I don't have a web server handy so I'm writing them to disk, but you could just as easily call Response.BinaryWrite(bytes).
string existingFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "file1.pdf");
string newFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "file2.pdf");
PdfReader reader = new PdfReader(existingFile);
byte[] bytes;
using (MemoryStream ms = new MemoryStream()) {
    using (PdfStamper stamper = new PdfStamper(reader, ms)) {
        PdfContentByte cb = stamper.GetOverContent(1);
        ColumnText ct = new ColumnText(cb);
        ct.SetSimpleColumn(100, 100, 500, 200);
        ct.AddElement(new Paragraph("This was added using ColumnText"));
        ct.Go();
        //Flush the PdfStamper's buffer
        stamper.Close();
        //Get the raw bytes of the PDF
        bytes = ms.ToArray();
    }
}
//Do whatever you want with the bytes
//Below I'm writing them to disk but you could also write them to the output buffer, too
using (FileStream fs = new FileStream(newFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
    fs.Write(bytes, 0, bytes.Length);
}
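The GetBuffer() warning above is easy to check without iTextSharp at all. The following is a small sketch using only System.IO; the class BufferDemo and the method Lengths are names made up for this illustration.

```csharp
using System;
using System.IO;

// Hypothetical demo class, only uses the .NET base library.
public static class BufferDemo
{
    // Writes `count` bytes to a MemoryStream and returns
    // { ToArray().Length, GetBuffer().Length } for comparison.
    public static int[] Lengths(int count)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            ms.Write(new byte[count], 0, count);
            return new[] { ms.ToArray().Length, ms.GetBuffer().Length };
        }
    }

    public static void Main()
    {
        int[] lengths = Lengths(10);
        // ToArray() is exactly the data written; GetBuffer() is the
        // internal buffer, which is typically larger.
        Console.WriteLine("ToArray: {0} bytes, GetBuffer: {1} bytes", lengths[0], lengths[1]);
    }
}
```

The first number always equals the number of bytes written, while the second is the buffer capacity, so anything past the first count bytes of GetBuffer() is padding that would end up appended to the PDF.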
The second part of your question title says:
"outputting as content disposition"
If that's what you really want you can do this:
Response.AddHeader("Content-Disposition", "attachment; filename=DESIRED-FILENAME.pdf");
Using a MemoryStream is unnecessary, since Response.OutputStream is available. Your example code is calling NewPage() and not trying to add the text to an existing page of your PDF, so here's one way to do what you asked:
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", "attachment; filename=itextTest.pdf");
PdfReader reader = new PdfReader(readerPath);
// store the extra text on the last (new) page
ColumnText ct = new ColumnText(null);
ct.AddElement(new Paragraph("Text on a new page"));
int numberOfPages = reader.NumberOfPages;
int newPage = numberOfPages + 1;
// get all pages from PDF "template" so we can copy them below
reader.SelectPages(string.Format("1-{0}", numberOfPages));
float marginOffset = 36f;
/*
 * we use the selected pages above with a PdfStamper to copy the original.
 * and no we don't need a MemoryStream...
 */
using (PdfStamper stamper = new PdfStamper(reader, Response.OutputStream)) {
    // use the same page size as the __last__ template page
    Rectangle rectangle = reader.GetPageSize(numberOfPages);
    // add a new __blank__ page
    stamper.InsertPage(newPage, rectangle);
    // allows us to write content to the (new/added) page
    ct.Canvas = stamper.GetOverContent(newPage);
    // add text at an __absolute__ position
    ct.SetSimpleColumn(
        marginOffset, marginOffset,
        rectangle.Right - marginOffset, rectangle.Top - marginOffset
    );
    ct.Go();
}
I think you've already figured out that the Document / PdfWriter combination doesn't work in this situation :) That's the standard method for creating a new PDF document.