How to get the header and footer of a page - itext

How do I get code in the header and footer of a PDF file using C # 2012 (asp.net) + iTextSharp (v5.4.5.0)?
At the moment I can extratir the pages but I need to separate the headers and footers of your content.
Below is my code to get the page:
PdfReader pdfreader = new PdfReader(nmfile);
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
for (int page = 1; page <= pdfreader.NumberOfPages; page++)
{
extractText = PdfTextExtractor.GetTextFromPage(pdfreader, page, strategy);
extractText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(extractText)));
//...
}
Thanks to all

Related

LayoutResult more one page in Itext7

i ask this: Remove the first and last lines properties in the paper Itext7
and if i do it:
PdfWriter pdfWriter = new PdfWriter(dest);
PdfDocument pdfDoc = new PdfDocument(pdfWriter);
Div div = new Div();
Document doc = new Document(pdfDoc, PageSize.A5);
doc.setMargins(0,0,0,36);
for (int i = 0; i <50 ; i++) {
ListItem listItem = new ListItem();
String s= "hello "+i;
Paragraph p = new Paragraph();
for (int j = 0; j <s.length() ; j++) {
p.add("HELLO " +I);
}
LayoutResult result = div.createRendererSubTree().setParent(doc.getRenderer()).layout(new LayoutContext(new LayoutArea(0,PageSize.A5)));
List<IRenderer> childRendererParagraph = result.getSplitRenderer().getChildRenderers();
childRendererParagraph contain Paragraphs only from first page.And i don't know how many pages well be in pdf
As I mentioned in the answer to your previous question,
split renderer represent the part of the content which iText can place on the area, overflow - the content which overflows.
So if you want to layout the rest of the content, you can perform the same operation (layout) on your overflowRenderer.
The code is as follows:
LayoutResult firstPageResult = div.createRendererSubTree().setParent(doc.getRenderer()).layout(new LayoutContext(new LayoutArea(0, PageSize.A5)));
LayoutResult secondPageResult = firstPageResult.getOverflowRenderer().layout(new LayoutContext(new LayoutArea(1, PageSize.A5)));
Once the content has been fully placed, the overflowRenderer will be null.
pdf.setDefaultPageSize(new PageSize(595, 14400));
Resize the page to display all content on one page

Watermark specific page in PDF iTextSharp

I create the PDF dynamically based on the custom object data and water mark all pages in the PDF using iTextSharp 5.5.10.0 as shown in the code below.
Question: How can I watermark one or more page based on the criteria in the for loop? Custom object data has a boolean flag which indicates whether the pdf page could be watermarked?
Any help is greatly appreciated.
using (MemoryStream workStream = new MemoryStream())
{
using (Document document = new Document())
{
using (PdfWriter writer = PdfWriter.GetInstance(document, workStream))
{
HeaderFooter headerfooter = new HeaderFooter();
headerfooter.watermarkpage = WaterMarkPage;
headerfooter.includefooter = IncludeFooter;
writer.PageEvent = headerfooter;
writer.CloseStream = false;
document.Open();
PdfContentByte cb = writer.DirectContent;
foreach (TestResult testresultdata in testresults)
{
document.Add(PopulateData(testresultdata, reporttitle, includesponsordata));
// Move to next page
document.NewPage();
} // end foreach Order
document.Close();
} // end writer
} // end document
}

Mixing text and images causes incorrect vertical positioning

Using iTextSharp version 5.5.8 (same bug existed in 5.5.7), there's an unpleasant bug when you add images to Chapters and Sections - the images and the section headings start out OK but quickly become offset relative to each other.
The PDF generated from the following code starts out correctly, it says "Section 1" and below it is the image. The next section ("Section 2") has a little of the image overlapping the section text, the next section is even worse, etc. I think it's the text that's mal-positioned, not the image.
Is this a known iTextSharp bug?
static Document m_doc = null;
static BaseFont m_helvetica = null;
static Font m_font = null;
static PdfWriter m_writer = null;
static Image m_image = null;
static void Main(string[] args)
{
m_doc = new Document(PageSize.LETTER);
m_helvetica = BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
m_font = new Font(m_helvetica, 10.0f);
m_writer = PdfWriter.GetInstance(m_doc, new FileStream("Output.pdf", FileMode.Create));
m_writer.StrictImageSequence = true;
m_doc.Open();
m_doc.Add(new Chunk("Created by iTextSharp version " + new iTextSharp.text.Version().GetVersion, m_font));
Chapter chapter = new Chapter("Chapter 1", 1);
chapter.TriggerNewPage = false;
if (m_image == null)
{
m_image = Image.GetInstance(new Uri("https://pbs.twimg.com/profile_images/2002307628/Captura_de_pantalla_2012-03-17_a_la_s__22.14.48.png"));
m_image.ScaleAbsolute(100, 100);
}
for (int i = 0; i < 5; i++)
{
Section section = chapter.AddSection(18, "Section " + (i + 1));
section.Add(new Chunk(" ", m_font));
section.Add(m_image);
}
m_doc.Add(chapter);
m_doc.Close();
}
From the documentation for the Java version:
A Section is a part of a Document containing other Sections, Paragraphs, List and/or Tables.
Further looking at the Add() method in the C# source we see:
Adds a Paragraph, List, Table or another Section
Basically, instead of a Chunk use a Paragraph. So instead of this
section.Add(new Chunk(" ", m_font));
Use this:
section.Add(new Paragraph(new Chunk(" ", m_font)));
Or even just this:
section.Add(new Paragraph(" ", m_font));

iTextSharp does not render header/footer when using element generated by XmlWorkerHelper

I am trying to add header/footer to a PDF whose content is otherwise generated by XMLWorkerHelper. Not sure if it's a placement issue but I can't see the header/footer.
This is in an ASP.NET MVC app using iTextSharp and XmlWorker packages ver 5.4.4 from Nuget.
Code to generate PDF is as follows:
private byte[] ParseXmlToPdf(string html)
{
XhtmlToListHelper xhtmlHelper = new XhtmlToListHelper();
Document document = new Document(PageSize.A4, 30, 30, 90, 90);
MemoryStream msOutput = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(document, msOutput);
writer.PageEvent = new TextSharpPageEventHelper();
document.Open();
var htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/themes/mytheme.css"), true);
var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, writer)));
var worker = new XMLWorker(pipeline, true);
var parser = new XMLParser();
parser.AddListener(worker);
using (TextReader sr = new StringReader(html))
{
parser.Parse(sr);
}
//string text = "Some Random Text";
//for (int k = 0; k < 8; ++k)
//{
// text += " " + text;
// Paragraph p = new Paragraph(text);
// p.SpacingBefore = 8f;
// document.Add(p);
//}
worker.Close();
document.Close();
return msOutput.ToArray();
}
Now instead of using these three lines
using (TextReader sr = new StringReader(html))
{
parser.Parse(sr);
}
if I comment them out and uncomment the code to add a random Paragraph of text (commented in above sample), I see the header/footer along with the random text.
What am I doing wrong?
The EventHandler is as follows:
public class TextSharpPageEventHelper : PdfPageEventHelper
{
public Image ImageHeader { get; set; }
public override void OnEndPage(PdfWriter writer, Document document)
{
float cellHeight = document.TopMargin;
Rectangle page = document.PageSize;
PdfPTable head = new PdfPTable(2);
head.TotalWidth = page.Width;
PdfPCell c = new PdfPCell(ImageHeader, true);
c.HorizontalAlignment = Element.ALIGN_RIGHT;
c.FixedHeight = cellHeight;
c.Border = PdfPCell.NO_BORDER;
head.AddCell(c);
c = new PdfPCell(new Phrase(
DateTime.UtcNow.ToString("yyyy-MM-dd HH:mm:ss") + " GMT",
new Font(Font.FontFamily.COURIER, 8)
));
c.Border = PdfPCell.TOP_BORDER | PdfPCell.RIGHT_BORDER | PdfPCell.BOTTOM_BORDER | PdfPCell.LEFT_BORDER;
c.VerticalAlignment = Element.ALIGN_BOTTOM;
c.FixedHeight = cellHeight;
head.AddCell(c);
head.WriteSelectedRows(
0, -1, // first/last row; -1 flags all write all rows
0, // left offset
// ** bottom** yPos of the table
page.Height - cellHeight + head.TotalHeight,
writer.DirectContent
);
}
}
Feeling angry and stupid for having lost more than two hours of my life to Windows 8.1! Apparently the built in PDF Reader app isn't competent enough to show 'Headers'/'Footers'. Installed Foxit Reader and opened the output and it shows the Headers allright!
I guess I should try with Adobe Reader too!!!
UPDATES:
I tried opening the generated PDF in Adobe Acrobat reader and finally saw some errors. This made me look at the HTML that I was sending to the XMLWorkerHelper more closely. The structure of the HTML had <div>s inside <table>s and that was tripping up the generated PDF. I cleaned up the HTML to ensure <table> inside <table> and thereafter the PDF came out clean in all readers.
Long story short, the code above is fine, but if you are testing for PDF's correctness, have Adobe Acrobat Reader handy at a minimum. Other readers behave unpredictably as in either not reporting errors or incorrectly rendering the PDF.

iText - Strange column- / page changing behaviour with ColumnText

I am quite new to iText and trying to accomplish the following:
read a list of text files from local hd
arrange the texts of the files in a 2-column layout pdf file
add a consecutively numbered index before each text
I started with the MovieColumns1 example (http://itextpdf.com/examples/iia.php?id=64) and ended up with the following code:
final float[][] COLUMNS_COORDS = { { 36, 36, 296, 806 }, { 299, 36, 559, 806 } };
Document document = new Document(PageSize.A4);
PdfWriter writer = PdfWriter.getInstance(document, resultFile);
document.open();
ColumnText ct = new ColumnText(writer.getDirectContent());
ct.setSimpleColumn(COLUMNS_COORDS[0][0], COLUMNS_COORDS[0][1],
COLUMNS_COORDS[0][2], COLUMNS_COORDS[0][3]);
File textDir = new File("c:/Users/raddatz/Desktop/123/texts/");
File[] files = textDir.listFiles();
int i = 1;
int column = 0;
for (File file : files) {
String text = FileUtils.readFileToString(file, "UTF-8");
float yLine = ct.getYLine();
System.out.println("adding '" + file.getName() + "'");
PdfPCell theText = new PdfPCell(new Phrase(text, new Font(Font.HELVETICA, 10)));
theText.setBorder(Rectangle.NO_BORDER);
theText.setPaddingBottom(10);
PdfPCell runningNumber = new PdfPCell(new Phrase(new DecimalFormat("00").format(i++), new Font(
Font.HELVETICA, 14, Font.BOLDITALIC,
new Color(0.7f, 0.7f, 0.7f))));
runningNumber.setBorder(Rectangle.NO_BORDER);
runningNumber.setPaddingBottom(10);
PdfPTable table = new PdfPTable(2);
table.setWidths(new int[] { 12, 100 });
table.addCell(runningNumber);
table.addCell(theText);
ct.addElement(table);
int status = ct.go(true);
if (ColumnText.hasMoreText(status)) {
column = Math.abs(column - 1);
if (column == 0) {
document.newPage();
System.out.println("inserting new page with size :" + document.getPageSize());
}
ct.setSimpleColumn(
COLUMNS_COORDS[column][0], COLUMNS_COORDS[column][1],
COLUMNS_COORDS[column][2], COLUMNS_COORDS[column][3]);
yLine = COLUMNS_COORDS[column][3];
System.out.println("correcting yLine to: " + yLine);
} else {
ct.addElement(table);
}
ct.setYLine(yLine);
System.out.println("before adding: " + ct.getYLine());
status = ct.go(false);
System.out.println("after adding: " + ct.getYLine());
System.out.println("--------------------------------");
}
document.close();
Here you can see the result:
http://d.pr/f/NEmx
Looking at the first page of the resulting PDF I assumed everything was working out fine.
But on second page you can see the problem(s):
text #31 is not displayed completely (first line + index is cut / not in visible area)
text #46 is not displayed completely (first three lines + index is cut / not in visible area)
On page 3 everything seems to be ok again. I am really lost here.
-- UPDATE (2013-03-14) --
I have analyzed the contents of the PDF now. The problem is not that content is show in non-visible areas but that the content is not present in the pdf at all. The missing part of the content is exactly the one which would have fit in the previous column / page. So it seems like ColumnText.go(true) is manipulating the object passed by addElement() before. Can someone confirm this? If so: what can I do about it?
-- end UPDATE (2013-03-14) --
Looking forward to your reply
regards,
sven
Solved! As soon as ColumnText indicates a table will not fit the current column I reinitialize ct with a new instance of ColumnText and add the table again.
In other words:
Each instance of ColumnText is exactly dealing one column of my document.
if (ColumnText.hasMoreText(status) || mediaPoolSwitch) {
column = Math.abs(column - 1);
if (mediaPoolSwitch) {
currentMediaPool = mediapool;
column = 0;
}
if (column == 0) {
document.newPage();
writeTitle(writer.getDirectContent(), mediapool.getName());
}
ct = new ColumnText(writer.getDirectContent());
ct.addElement(table);
ct.setSimpleColumn(
COLUMNS_COORDS[column][0], COLUMNS_COORDS[column][1],
COLUMNS_COORDS[column][2], COLUMNS_COORDS[column][3]);
yLine = COLUMNS_COORDS[column][3];
LOG.debug("correcting yLine to: " + yLine);
} else {
ct.addElement(table);
}
ct.setYLine(yLine);
ct.go();