ITEXT- Large PDF generation optimization

I have to generate large PDF files, each containing a table with about 30,000 rows. I need to generate around 40 such files, which currently takes 19 hours. Could anybody suggest a more efficient way to do this? Most of the time is spent in the document.add(table) method.
I am using iText 5.4.
I use the LargeElement interface; my tables have 40 to 96 columns.
I can post the full code later. Below is pseudocode.
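// rs is a java.sql.ResultSet; document is an open com.itextpdf.text.Document
public void createTable(ResultSet rs, Document document) throws SQLException, DocumentException {
    int columnCount = 96; // varies from 40 to 96 depending on the file
    PdfPTable table = new PdfPTable(columnCount);
    table.setComplete(false); // LargeElement: more rows will follow
    int k = 1;
    while (rs.next()) {
        for (int i = 1; i <= columnCount; i++) {
            PdfPCell cell = new PdfPCell();
            Chunk chunk = new Chunk(rs.getString(i));
            cell.addElement(chunk);
            table.addCell(cell);
        }
        if (k % 10000 == 0) {
            document.add(table); // write out the rows added so far
        }
        k++;
    }
    table.setComplete(true);
    document.add(table);
}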

Related

Nested Table issue with iText in .net

I use iText 7.0.4.0 in my .NET application to generate PDFs, but inner tables overflow when the text is long.
The outer table has 10 columns with a green border and renders correctly. Each outer table cell contains one table with a single cell inside it, but the inner table cell overflows when the paragraph text is long.
I use iText in a large form-building product, so I have recreated the issue in a simple scenario; the code is given below. Please note that the number of columns is not fixed in real usage.
Could anyone point me in the right direction?
Here is the C# code:
private Table OuterTable()
{
var columns = GetTableColumnWidth(10);
var outerTable = new Table(columns, true);
outerTable.SetWidthPercent(100);
for (int index = 0; index < columns.Length; index++)
{
Cell outerTableCell = new Cell();
Table innerTable = new Table(new float[] { 100 });
innerTable.SetWidthPercent(100);
Cell innerTableCell = new Cell();
Paragraph paragraph = new Paragraph("ABCDEFGHIJKL").AddStyle(_fieldValueStyle);
innerTableCell.Add(paragraph);
innerTable.AddCell(innerTableCell);
outerTableCell.Add(innerTable);
outerTable.AddCell(outerTableCell);
innerTableCell.SetBorder(new SolidBorder(Color.RED, 2));
innerTableCell.SetBorderRight(new SolidBorder(Color.BLUE, 2));
outerTableCell.SetBorder(new SolidBorder(Color.GREEN, 2));
}
return outerTable;
}

Thanks, mkl, for spending your valuable time. I solved my issue with your idea of 'no inner tables'. This does not fix the nested-table problem described in the question; it is another way of achieving the same result.
I used "\n" in the paragraph to get the layout I want. Here is the code:
private Table OuterTable()
{
var columns = GetTableColumnWidth(10);
var outerTable = new Table(columns, true);
outerTable.SetWidthPercent(100);
for (int index = 0; index < columns.Length; index++)
{
Cell outerTableCell = new Cell();
outerTableCell.Add(GetContent());
outerTable.AddCell(outerTableCell);
}
return outerTable;
}
private Paragraph GetContent()
{
int maxIndex = 3;
Paragraph paragraph = new Paragraph();
for (int index = 0; index < maxIndex; index++)
{
paragraph.Add(index + " - ABCDEFGHIJKL \n").AddStyle(_fieldValueStyle);
}
return paragraph;
}

Repeated lines in PdfPTable at the end of the page?

I'm using a PdfPTable (iText) to print a table populated from a list of values.
The problem is that when the PdfPTable spans more than one page, its last line is printed at the end of the first page and also at the beginning of the second one.
Here is the code:
protected static PdfPTable addUserList(PdfWriter writer, Document document, List<MyObject> objects) throws Exception {
PdfPTable headerTable = new PdfPTable(4);
headerTable.setWidthPercentage(100);
headerTable.setWidths(new int[] { 4, 7, 5, 3 });
PdfPCell headerCell = PDFUtils.makeDefaultCell(1);
headerCell.setBorderColor(Color.WHITE);
headerCell.setBorder(PdfPCell.RIGHT);
headerCell.setBorderWidth(1f);
Phrase phrase = new Phrase("Column1", Style.OPIFICIO_12_BOLD_WHITE);
headerCell.setHorizontalAlignment(Element.ALIGN_CENTER);
headerCell.setPhrase(phrase);
headerTable.addCell(headerCell);
phrase = new Phrase("Column2", Style.OPIFICIO_12_BOLD_WHITE);
headerCell.setPhrase(phrase);
headerCell.setHorizontalAlignment(Element.ALIGN_CENTER);
headerTable.addCell(headerCell);
phrase = new Phrase("Column3", Style.OPIFICIO_12_BOLD_WHITE);
headerCell.setPhrase(phrase);
headerCell.setHorizontalAlignment(Element.ALIGN_CENTER);
headerTable.addCell(headerCell);
phrase = new Phrase("Column4", Style.OPIFICIO_12_BOLD_WHITE);
Chunk chunk = new Chunk("(1)", Style.OPIFICIO_6_BOLD_WHITE);
chunk.setTextRise(7f);
phrase.add(chunk);
chunk = new Chunk("(XX)", Style.OPIFICIO_8_BOLD_WHITE);
chunk.setTextRise(1f);
phrase.add(chunk);
headerCell.setPhrase(phrase);
headerCell.setHorizontalAlignment(Element.ALIGN_CENTER);
headerCell.setBorder(PdfPCell.NO_BORDER);
headerTable.addCell(headerCell);
PdfPTable userTable = new PdfPTable(4);
userTable.setWidthPercentage(100);
userTable.setWidths(new int[] { 4, 7, 5, 3 });
PdfPCell cell = PDFUtils.makeDefaultCell(1);
cell.setBackgroundColor(null);
cell.setPaddingTop(2f);
cell.setPaddingLeft(6f);
cell.setPaddingRight(6f);
for (MyObject object : objects) {
if (object != null) {
cell.setHorizontalAlignment(Element.ALIGN_LEFT);
if (object.getAttribute1() != null) {
phrase = new Phrase(object.getAttribute1(), Style.FUTURASTD_10_NORMAL_BLACK);
} else {
phrase = new Phrase("", Style.FUTURASTD_10_NORMAL_BLACK);
}
cell.setBorderWidth(1f);
cell.setBorderColor(Color.WHITE);
cell.setBorder(PdfPCell.RIGHT);
cell.setPhrase(phrase);
userTable.addCell(cell);
phrase = new Phrase(object.getAttribute2(), Style.FUTURASTD_10_NORMAL_BLACK);
cell.setBorderWidth(1f);
cell.setBorderColor(Color.WHITE);
cell.setBorder(PdfPCell.RIGHT);
cell.setPhrase(phrase);
userTable.addCell(cell);
phrase = new Phrase(object.getAttribute3(), Style.FUTURASTD_10_NORMAL_BLACK);
cell.setBorderWidth(1f);
cell.setBorderColor(Color.WHITE);
cell.setBorder(PdfPCell.RIGHT);
cell.setPhrase(phrase);
userTable.addCell(cell);
phrase = new Phrase(object.getAttribute4(), Style.FUTURASTD_10_NORMAL_BLACK);
cell.setBorder(PdfPCell.NO_BORDER);
cell.setHorizontalAlignment(Element.ALIGN_RIGHT);
cell.setPhrase(phrase);
userTable.addCell(cell);
}
}
PdfPTable mainTable = new PdfPTable(1);
mainTable.setWidthPercentage(100);
mainTable.setSplitLate(false);
mainTable.setHeaderRows(1);
PdfPCell cellH = new PdfPCell();
cellH.addElement(headerTable);
cellH.setBorder(Rectangle.NO_BORDER);
cellH.setCellEvent(new PDFUtils.CellBackgroundRedRecap());
mainTable.addCell(cellH);
if (userTable.getRows().size() > 0) {
PdfPCell cellUser = PDFUtils.makeDefaultCell(1);
cellUser.setPaddingTop(7f);
cellUser.setCellEvent(new PDFUtils.CellBackgroundRecap());
cellUser.setBorder(PdfCell.NO_BORDER);
cellUser.addElement(userTable);
mainTable.addCell(cellUser);
}
return mainTable;
}

The problem was actually solved years ago (as @Bruno pointed out in a comment).
The solution, therefore, is to replace the old iText version with a more current one in which the problem is fixed.
Indeed, the OP had an old version declared in the pom.xml of another module of his project, and it conflicted with the one in the module he was modifying. He deleted it and now it works.
Updating from versions 2.x (including the unofficial 4.2.0) to 5.x requires at least updating import statements, because iText moved from the com.lowagie to the com.itextpdf package namespace. Further changes might be necessary to adapt to actual API changes; for example, the signing API was overhauled during the 5.3.x versions. The basic API structure, though, remained fairly stable throughout the 5.x versions.
Furthermore, updating from anything below 7 to 7.x requires greater changes as the whole iText API has been re-designed to get rid of sub-optimal aspects of the earlier API design.
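The package move mentioned above means that, at a minimum, every com.lowagie import has to change. Here is a rough sketch in plain Java of that mechanical rename applied to source text. The class name is hypothetical, and the mapping assumes only the com.lowagie.text to com.itextpdf.text prefix change; a real migration would also need manual fixes wherever the API itself changed.

```java
/** Hypothetical helper illustrating the 2.x -> 5.x package rename. */
final class ImportMigration {

    private static final String OLD_PREFIX = "import com.lowagie.text";
    private static final String NEW_PREFIX = "import com.itextpdf.text";

    /** Rewrites matching import lines only; all other lines are returned unchanged. */
    static String migrate(String source) {
        StringBuilder out = new StringBuilder();
        for (String line : source.split("\n", -1)) {
            if (line.trim().startsWith(OLD_PREFIX)) {
                line = line.replace(OLD_PREFIX, NEW_PREFIX);
            }
            out.append(line).append('\n');
        }
        // split(..., -1) keeps the trailing empty piece, so drop the one extra newline
        out.setLength(out.length() - 1);
        return out.toString();
    }

    public static void main(String[] args) {
        String before = "import com.lowagie.text.Document;\n"
                      + "import com.lowagie.text.pdf.PdfPTable;\n"
                      + "import java.util.List;\n";
        System.out.print(migrate(before));
    }
}
```

A text rewrite like this only gets the code to the point where the compiler can report the remaining API differences; it does not handle static imports or fully qualified class names inside method bodies.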

Jasper Reports: Exporting report to multiple files

I'm developing a jrxml template to generate job candidates' resumes. The candidates are stored in my database.
I need to generate one Word file (.docx) per record, that is, one per job candidate.
How can I make Jasper generate one file for each record of my SQL query and export these files to Word?
I saw that there is an exporter parameter called PAGE_INDEX, but I could not find out how to use it.
Can someone help me, please?
Note 1: My reports are not generated by JasperServer. I developed a Java program that generates them and sends the reports by email.
Note 2: The number of pages may differ from candidate to candidate.
Update
I managed to generate one record per file, but only for the first record.
I still need to generate the files for the remaining records.
I also still have the other problem: how do I split the output into separate files when the number of pages for each record (candidate entity) can vary?
final JRDocxExporter exporter = new JRDocxExporter();
exporter.setExporterInput(new SimpleExporterInput(jasperPrint));
exporter.setExporterOutput(new SimpleOutputStreamExporterOutput(new java.io.File("/home/admin/resume candidate.docx")));
SimpleDocxReportConfiguration configuration = new SimpleDocxReportConfiguration();
configuration.setPageIndex(0);
exporter.setConfiguration(configuration);
exporter.exportReport();

SOLUTION
I solved the problem by inserting a variable in the footer of each page with the expression $V{REPORT_COUNT}, which holds the count of the records in the Detail band.
The Java program then loops over the pages of the JasperPrint object.
On each page I locate that element, which tells me which candidate the page belongs to.
Using this information, I store each candidate index together with its pages in a HashMap<Integer, List<Integer>> (mapCandPage), so I can determine the first and last page for each candidate and export one document per candidate record.
public static void main(String args[]) throws Exception {
File relJasperArqFile = new File("Candidate Resume Template.jasper");
Connection conn = ConnectionFactory.getNewConnectionSQLDRIVER();
JasperReport jasperReport = (JasperReport) JRLoader.loadObject(relJasperArqFile);
JasperPrint jasperPrint
= JasperFillManager.fillReport(jasperReport,
null,
conn);
final JRDocxExporter exporter = new JRDocxExporter();
exporter.setExporterInput(new SimpleExporterInput(jasperPrint));
List<JRPrintPage> listPrintPage = jasperPrint.getPages();
int candIdx = 0;
int fileIdx = 0;
int lastCandIdx = 0;
HashMap<Integer, List<Integer>> mapCandPage = new HashMap<>();
for (int pageIdx = 0; pageIdx < listPrintPage.size(); pageIdx++) {
JRPrintPage page = listPrintPage.get(pageIdx);
candIdx = getCandIdx(page);
if (!mapCandPage.containsKey(candIdx)) {
mapCandPage.put(candIdx, (new ArrayList<>()));
}
mapCandPage.get(candIdx).add(pageIdx);
if (pageIdx > 0 && candIdx != lastCandIdx) {
fileIdx++;
exporter.setExporterOutput(new SimpleOutputStreamExporterOutput(new File(String.format("Candidate Resume %d.docx", fileIdx))));
SimpleDocxReportConfiguration configuration = new SimpleDocxReportConfiguration();
configuration.setStartPageIndex(mapCandPage.get(lastCandIdx).get(0));
configuration.setEndPageIndex(mapCandPage.get(lastCandIdx).get(mapCandPage.get(lastCandIdx).size() - 1));
exporter.setConfiguration(configuration);
exporter.exportReport();
}
lastCandIdx = candIdx;
}
fileIdx++;
exporter.setExporterOutput(new SimpleOutputStreamExporterOutput(new File(String.format("Candidate Resume %d.docx", fileIdx))));
SimpleDocxReportConfiguration configuration = new SimpleDocxReportConfiguration();
configuration.setStartPageIndex(mapCandPage.get(lastCandIdx).get(0));
configuration.setEndPageIndex(mapCandPage.get(lastCandIdx).get(mapCandPage.get(lastCandIdx).size() - 1));
exporter.setConfiguration(configuration);
exporter.exportReport();
}
public static Integer getCandIdx(JRPrintPage page) {
JRPrintElement lastRowNumber = page.getElements().get(page.getElements().size() - 1);
return Integer.parseInt(((JRTemplatePrintText) lastRowNumber).getFullText());
}
This is a test and my code is not optimized. If anyone has suggestions or a better idea, please post here. Thank you.
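The page-grouping step described above can be kept separate from the export calls. Here is a minimal sketch in plain Java, assuming each page has already been mapped to its candidate index (for instance by something like getCandIdx above) and that a candidate's pages are contiguous. The class and method names are hypothetical and not part of JasperReports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of JasperReports. */
final class PageRanges {

    /**
     * Groups consecutive pages that share a candidate index.
     * Returns one {candidateIndex, startPage, endPage} triple per run,
     * with zero-based, inclusive page indexes.
     */
    static int[][] group(int[] candidateOfPage) {
        List<int[]> ranges = new ArrayList<>();
        for (int page = 0; page < candidateOfPage.length; page++) {
            int cand = candidateOfPage[page];
            int[] last = ranges.isEmpty() ? null : ranges.get(ranges.size() - 1);
            if (last != null && last[0] == cand) {
                last[2] = page; // same candidate: extend the current range
            } else {
                ranges.add(new int[] {cand, page, page}); // new candidate: open a range
            }
        }
        return ranges.toArray(new int[0][]);
    }

    public static void main(String[] args) {
        // Pages 0-1 belong to candidate 1, page 2 to candidate 2, pages 3-5 to candidate 3.
        for (int[] range : group(new int[] {1, 1, 2, 3, 3, 3})) {
            System.out.println(Arrays.toString(range));
        }
    }
}
```

Each triple would then supply the values for setStartPageIndex and setEndPageIndex, so the export block would need to appear only once, inside a loop over the ranges, instead of being repeated after the page loop.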

Replace the text in pdf document using itextSharp

I want to replace a particular piece of text in a PDF document. I am currently using the iTextSharp library to work with PDF documents.
I extracted the bytes from the PDF document, replaced the bytes in question, and then wrote the document out again from those bytes, but it is not working. In the example below I am trying to replace the string 1234 with 5678.
Any advice on how to do this would be helpful.
PdfReader reader = new PdfReader(opf.FileNames[i]);
byte[] pdfbytes = reader.GetPageContent(1);
PdfString oldstring = new PdfString("1234");
PdfString newstring = new PdfString("5678");
byte[] byte1022 = oldstring.GetOriginalBytes();
byte[] byte1067 = newstring.GetOriginalBytes();
int position = 0;
for (int j = 0; j <pdfbytes.Length ; j++)
{
if (pdfbytes[j] == byte1022[0])
{
if (pdfbytes[j+1] == byte1022[1])
{
if (pdfbytes[j+2] == byte1022[2])
{
if (pdfbytes[j+3] == byte1022[3])
{
position = j;
break;
}
}
}
}
}
pdfbytes[position] = byte1067[0];
pdfbytes[position + 1] = byte1067[1];
pdfbytes[position + 2] = byte1067[2];
pdfbytes[position + 3] = byte1067[3];
File.WriteAllBytes(opf.FileNames[i].Replace(".pdf","j.pdf"), pdfbytes);

What makes you think 1234 is part of the page's content stream and not of a form XObject? Your code is never going to work in general if you don't parse all the resources of a page.
Also: I see GetPageContent(), but I don't see you using SetPageContent() anywhere. How are the changes ever going to be stored in the PdfReader object?
Moreover, I don't see you using PdfStamper to write the altered PdfReader contents to a file.
Finally: I'm too shy to quote the words of Leonard Rosenthol, Adobe's PDF Architect, but ask him, and he'll tell you personally that you shouldn't do what you're trying to do. PDF is NOT a format for editing. Read the intro of chapter 6 of the book I wrote on iText: http://www.manning.com/lowagie2/samplechapter6.pdf

Merge 2 pdf byte streams using Itextsharp

I have a method that returns a PDF byte stream (from a fillable PDF). Is there a straightforward way to merge two such streams into one stream and make a single PDF out of it? I need to run my method twice but combine the two PDFs into one PDF stream. Thanks.

You didn't say whether you're flattening the filled forms with the PdfStamper, so I'll just say you must flatten them before trying to merge them. Here's a working .ashx HTTP handler:
<%@ WebHandler Language="C#" Class="mergeByteForms" %>
using System;
using System.IO;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;
public class mergeByteForms : IHttpHandler {
HttpServerUtility Server;
public void ProcessRequest (HttpContext context) {
Server = context.Server;
HttpResponse Response = context.Response;
Response.ContentType = "application/pdf";
using (Document document = new Document()) {
using (PdfSmartCopy copy = new PdfSmartCopy(
document, Response.OutputStream) )
{
document.Open();
for (int i = 0; i < 2; ++i) {
PdfReader reader = new PdfReader(_getPdfBtyeStream(i.ToString()));
copy.AddPage(copy.GetImportedPage(reader, 1));
}
}
}
}
public bool IsReusable { get { return false; } }
// simulate your method to use __one__ byte stream for __one__ PDF
private byte[] _getPdfBtyeStream(string data) {
// replace with __your__ PDF template
string pdfTemplatePath = Server.MapPath(
"~/app_data/template.pdf"
);
PdfReader reader = new PdfReader(pdfTemplatePath);
using (MemoryStream ms = new MemoryStream()) {
using (PdfStamper stamper = new PdfStamper(reader, ms)) {
AcroFields form = stamper.AcroFields;
// replace this with your form field data
form.SetField("title", data);
// ...
// this is __VERY__ important; since you're using the same fillable
// PDF, if you don't set this property to true the second page will
// lose the filled fields.
stamper.FormFlattening = true;
}
return ms.ToArray();
}
}
}
Hopefully the inline comments make sense. The _getPdfBtyeStream() method above simulates your PDF byte streams. The reason you need to set FormFlattening to true is that when you fill PDF form fields, the field names are supposed to be unique. In your case the second page is the same fillable PDF form, so it has the same field names as the first page, and when you fill them they're ignored. Comment out this line from the example above:
stamper.FormFlattening = true;
to see what I mean.
In other words, a lot of the generic code for merging PDFs on the Internet, and even here on Stack Overflow, will not work for fillable forms because AcroFields are not being accounted for. In fact, if you look at the "SO FAQ & Popular" entry for merging PDFs in Stack Overflow's itextsharp tag info, this is mentioned in the third comment on the accepted answer, by @Ray Cheng.
Another way to merge fillable PDFs (without flattening the form) is to rename the form fields for the second and following pages, but that's more work.
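The renaming alternative amounts to giving every copy of the form its own set of field names. Here is a minimal sketch in plain Java of building such a rename map. The class name, the suffix scheme, and the sample field names are assumptions for illustration; applying the map to an actual PDF, using whatever field-renaming call the library version offers, is left out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of iText or iTextSharp. */
final class FieldRenamer {

    /**
     * Maps each original field name to a name unique to one copy of the form,
     * for example "title" -> "title_2" for copyIndex 2.
     */
    static Map<String, String> renameMap(List<String> fieldNames, int copyIndex) {
        Map<String, String> renamed = new LinkedHashMap<>();
        for (String name : fieldNames) {
            renamed.put(name, name + "_" + copyIndex);
        }
        return renamed;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("title", "author");
        for (int copy = 1; copy <= 2; copy++) {
            System.out.println(renameMap(fields, copy));
        }
    }
}
```

With names made unique this way, the fields of each copy no longer collide, which is the condition the flattening approach sidesteps by removing the fields altogether.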
