How to generate a PDF in Hebrew? Currently the PDF is generated empty - iText

I'm using iTextSharp 5.5.13, and when I try to generate a PDF with Hebrew text it comes out empty.
This is my code. Am I doing something wrong?
public byte[] GenerateIvhunPdf(FinalIvhunSolution ivhun)
{
byte[] pdfBytes;
using (var mem = new MemoryStream())
{
Document document = new Document(PageSize.A4);
PdfWriter writer = PdfWriter.GetInstance(document, mem);
writer.PageEvent = new MyHeaderNFooter();
document.Open();
var font = new Font(BaseFont.CreateFont("C:\\Downloads\\fonts\\Rubik-Light.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED), 14);
Paragraph p = new Paragraph("פסקת פתיחה")
{
Alignment = Element.ALIGN_RIGHT
};
PdfPTable table = new PdfPTable(2)
{
RunDirection = PdfWriter.RUN_DIRECTION_RTL
};
PdfPCell cell = new PdfPCell(new Phrase("מזהה", font));
cell.BackgroundColor = BaseColor.BLACK;
table.AddCell(cell);
document.Add(p);
document.Add(table);
document.Close();
pdfBytes = mem.ToArray();
}
return pdfBytes;
}
The PDF comes out blank

I changed a few details of your code, and now the Hebrew text shows up in the PDF.
My changes:
Embedding the font
As I don't have Rubik installed on my system, I have to embed the font into the PDF to have a chance to see anything. Thus, I replaced BaseFont.NOT_EMBEDDED by BaseFont.EMBEDDED when creating the var font:
var font = new Font(BaseFont.CreateFont("Rubik-Light.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED), 14);
Making the Paragraph kind of work
You create the Paragraph p without specifying a font. Thus, a default font with default encoding is used. The default encoding is WinAnsiEncoding which is Latin1-like, so no Hebrew characters can be represented. I added your Rubik font instance to the Paragraph p creation:
Paragraph p = new Paragraph("פסקת פתיחה", font)
{
Alignment = Element.ALIGN_RIGHT
};
Et voilà, the writing appears.
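The encoding limitation described above can be checked outside of iText. The following is a minimal Java sketch using only the standard library; it uses ISO-8859-1 as a stand-in for WinAnsiEncoding (an assumption made for illustration, since the two are similar but not identical), and the class and method names are hypothetical.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class Latin1EncodingCheck {
    // Returns true if every character of the text can be represented in ISO-8859-1,
    // used here as an approximation of a Latin1-like default font encoding.
    static boolean fitsLatin1(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // The Hebrew word from the table cell, written with Unicode escapes
        // so the source file encoding does not matter.
        String hebrew = "\u05DE\u05D6\u05D4\u05D4";
        System.out.println("Latin text fits:  " + fitsLatin1("Identifier"));
        System.out.println("Hebrew text fits: " + fitsLatin1(hebrew));
    }
}
```

If the check fails for a string, a font created with a Unicode-capable encoding such as IDENTITY_H is needed for that text.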
iText developers have often pointed out that in iText 5.x and earlier, right-to-left scripts are only supported properly in certain contexts, e.g. in tables, but not in others, such as paragraphs added directly to the document. As your Paragraph p is added directly to the Document document, its letters appear in the wrong order in the output.
Making the PdfPTable work
You defined the PdfPTable table to have two columns (new PdfPTable(2)) but then you added only one cell. Thus, the table does not contain even a single complete row. iText, therefore, draws nothing when it is added to the document.
I changed the definition of table to have a single column only:
PdfPTable table = new PdfPTable(1)
{
RunDirection = PdfWriter.RUN_DIRECTION_RTL
};
Furthermore, I commented out the line setting the cell background to black because usually it is difficult to read black on black:
PdfPCell cell = new PdfPCell(new Phrase("מזהה", font));
//cell.BackgroundColor = BaseColor.BLACK;
table.AddCell(cell);
And again the writing appears.
Properly downloading the font
Another possible obstacle: when downloading the font from the URL you gave (https://fonts.google.com/selection?selection.family=Rubik), the customization tab of the selection drawer shows that by default only Latin characters are included in the download, and in particular not Hebrew ones.
I tested with a font file I downloaded with all language options enabled.

iText 7 C#: how to clip an existing PDF

Let's say I have a bunch of PDF files that I want to migrate into a new PDF. But the new PDF file is table-structured, and the content of each source PDF should fit in the first cell of a two-column table.
I am not sure whether working with tables is the right approach, and I am open to any other solutions. All I want in the end is some custom text at the top, followed by the PDF content with a checkbox on the right side (one per PDF content).
What I have so far:
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(dest));
Document doc = new Document(pdfDoc, PageSize.A4);
doc.SetMargins(0f, 0f, 18f, 18f);
PdfReader reader = new PdfReader(src);
PdfDocument srcDoc = new PdfDocument(reader);
Table table = new Table(new float[] { 2f, 1f });
PdfFormXObject imagePage = srcDoc.GetFirstPage().CopyAsFormXObject(pdfDoc);
var image = new Image(imagePage);
Cell cell = new Cell().Add(image);
cell.SetHorizontalAlignment(HorizontalAlignment.LEFT);
cell.SetVerticalAlignment(VerticalAlignment.TOP);
table.AddCell(cell);
Table checkTable = new Table(2);
Cell cellCheck1 = new Cell();
cellCheck1.SetNextRenderer(new CheckboxCellRenderer(cellCheck1, "cb1", 0));
cellCheck1.SetHeight(50);
checkTable.AddCell(cellCheck1);
Cell cellCheck2 = new Cell();
cellCheck2.SetNextRenderer(new CheckboxCellRenderer(cellCheck2, "cb2", 1));
cellCheck2.SetHeight(50);
checkTable.AddCell(cellCheck2);
table.AddCell(checkTable);
doc.Add(table);
doc.Close();
My problem here is that the PDF content still has its margins, which completely spoils the design. It is so frustrating; I appreciate any help.
You say
My problem here is that the PDF content still has its margins, which completely spoils the design.
PDFs (usually) don't know anything about margins. Thus, you have to detect the margins of the page to import first. You can do this by parsing the page content into an event listener that keeps track of the bounding box of drawing instructions, like the TextMarginFinder. Then you can reduce the source page to those dimensions. This can be done by means of the following method:
PdfPage restrictToText(PdfPage page)
{
TextMarginFinder finder = new TextMarginFinder();
new PdfCanvasProcessor(finder).ProcessPageContent(page);
Rectangle textRect = finder.GetTextRectangle();
page.SetMediaBox(textRect);
page.SetCropBox(textRect);
return page;
}
You apply this method in your code right before you copy the page as form XObject, i.e. you replace
PdfFormXObject imagePage = srcDoc.GetFirstPage().CopyAsFormXObject(pdfDoc);
by
PdfFormXObject imagePage = restrictToText(srcDoc.GetFirstPage()).CopyAsFormXObject(pdfDoc);
This causes the Image this XObject will be embedded in to have the correct size. Unfortunately, it will be somewhat mispositioned, because the restricted page still has the same coordinate system as the original one; its crop box merely defines a smaller section than before. To fix this, one has to apply an offset: subtract the coordinates of the lower left corner of the page crop box, which has become the XObject bounding box. Thus, add the following code after instantiating the Image:
Rectangle bbox = imagePage.GetBBox().ToRectangle();
image.SetProperty(Property.LEFT, -bbox.GetLeft());
image.SetProperty(Property.BOTTOM, -bbox.GetBottom());
image.SetProperty(Property.POSITION, LayoutPosition.RELATIVE);
Now the restricted page is properly positioned in your table cell.
Beware: The TextMarginFinder (as its name indicates) determines the margins by text alone. Thus, if the page contains other contents, too, e.g. decorations like a logo, this logo is ignored and might eventually be cut out. If you want such decorations, too, in your overviews, you have to use a different margin finder class.
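To make the idea of "keeping track of the bounding box of drawing instructions" and the resulting offset more concrete, here is a minimal, library-independent Java sketch. It is not iText code: the class BoundingBoxTracker and its methods are hypothetical, and the rectangles fed into it are made-up values. A margin finder that should also respect non-text content would feed the rectangles of all drawing operations into such an accumulator, not only those of text.

```java
class BoundingBoxTracker {
    private float left = Float.POSITIVE_INFINITY;
    private float bottom = Float.POSITIVE_INFINITY;
    private float right = Float.NEGATIVE_INFINITY;
    private float top = Float.NEGATIVE_INFINITY;

    // Grow the tracked box so that it also covers the given rectangle
    // (lower-left corner plus width and height, in page coordinates).
    void include(float x, float y, float width, float height) {
        left = Math.min(left, x);
        bottom = Math.min(bottom, y);
        right = Math.max(right, x + width);
        top = Math.max(top, y + height);
    }

    float getLeft() { return left; }
    float getBottom() { return bottom; }
    float getWidth() { return right - left; }
    float getHeight() { return top - bottom; }

    // Offsets that move the lower-left corner of the box to the origin,
    // mirroring the -bbox.GetLeft() / -bbox.GetBottom() correction above.
    float offsetX() { return -left; }
    float offsetY() { return -bottom; }

    public static void main(String[] args) {
        BoundingBoxTracker tracker = new BoundingBoxTracker();
        tracker.include(72f, 700f, 200f, 12f); // a text line near the top (made-up values)
        tracker.include(72f, 100f, 300f, 12f); // a text line near the bottom
        tracker.include(40f, 760f, 60f, 40f);  // a logo, which a text-only finder would miss
        System.out.println("box: " + tracker.getLeft() + ", " + tracker.getBottom()
                + ", " + tracker.getWidth() + " x " + tracker.getHeight());
        System.out.println("offset: " + tracker.offsetX() + ", " + tracker.offsetY());
    }
}
```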

Embedding font using itext 5 for PDF/UA compliance

We are currently building a proof of concept to generate PDF/UA-compliant PDFs from a CSS and an HTML (XHTML) file using XSLT. We are able to tag the PDF and add the appropriate metadata information.
The last major issue we are unable to solve is embedding a standard PDF font, ZapfDingbats, which our accessibility assessment tools complain about; we use PAC 2.0 along with the Adobe DC built-in checker.
As you can see from the image below, the other fonts we are using seem to get embedded automatically by XML Worker from our CSS.
I have also tried finding the font as indicated and found one; however, it doesn't seem to be the correct one.
Here is a sample of our code:
private static ReturnValue CreateFromHtml(string html)
{
ReturnValue Result = new ReturnValue();
var stream = new MemoryStream();
using (var doc = new Document(PageSize.LETTER))
{
using (var ms = new MemoryStream())
{
using (var writer = PdfWriter.GetInstance(doc, ms))
{
writer.CloseStream = false;
writer.SetPdfVersion(PdfWriter.PDF_VERSION_1_7);
//TAGGED PDFVERSION_1_7
//Make document tagged
writer.SetTagged();
//===============
//PDF/UA
//Set document metadata
writer.ViewerPreferences = PdfWriter.DisplayDocTitle;
doc.AddLanguage("en-US");
doc.AddTitle("document title");
writer.CreateXmpMetadata();
doc.Open();
var embedfont = HttpContext.Current.Server.MapPath("~/scripts/ZapfDingbats.ttf");
var fontProv = new XMLWorkerFontProvider();
fontProv.DefaultEncoding = "UTF-8";
fontProv.Register(embedfont);
//Testing zapfDingbats font
Font font = FontFactory.GetFont(embedfont, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Paragraph p1 = new Paragraph("Testing of Fonts", font);
doc.Add(p1);
//end font processing
var tagProcessors = (DefaultTagProcessorFactory)Tags.GetHtmlTagProcessorFactory();
tagProcessors.RemoveProcessor(HTML.Tag.IMG);
tagProcessors.AddProcessor(HTML.Tag.IMG, new CustomImageTagProcessor());
var cssFiles = new CssFilesImpl();
cssFiles.Add(XMLWorkerHelper.GetInstance().GetDefaultCSS());
var cssResolver = new StyleAttrCSSResolver(cssFiles);
var charset = Encoding.UTF8;
var context = new HtmlPipelineContext(new CssAppliersImpl(new XMLWorkerFontProvider()));
context.SetAcceptUnknown(true).AutoBookmark(true).SetTagFactory(tagProcessors);
var htmlPipeline = new HtmlPipeline(context, new PdfWriterPipeline(doc, writer));
var cssPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);
var worker = new XMLWorker(cssPipeline, true);
var xmlParser = new XMLParser(true, worker, charset);
using (var sr = new StringReader(html))
{
xmlParser.Parse(sr);
doc.Close();
ms.Position = 0;
ms.CopyTo(stream);
stream.Position = 0;
}
}
}
}
// get bytes from stream
Result.Data = stream.ToArray();
// success
Result.Success = true;
return Result;
}
Maybe there is something in the CSS we need to do (our CSS is quite large f
iText only ships with the Adobe Font Metrics (AFM) file of Zapfdingbats. This means that you can't embed that font unless you provide the corresponding PostScript Font Binary (PFB) file. This PFB file can't be shipped with iText because iText doesn't have a license to do so.
The first step to solve this is to do one of the following:
- purchase a Zapfdingbats license so that you get the PFB (if I recall correctly, it's a font owned by Adobe), or
- use an alternative font when you want to insert special characters (check boxes, phone symbols, ...) into your text (e.g. purchase a license for the AdobePiStd font that was used as a substitution font and use that font instead of Zapfdingbats).
In your case, you provided a font ZapfDingbats.ttf which you register with the XMLWorkerFontProvider. When you register this font, it can be recognized through an alias. If ZapfDingbats.ttf isn't picked up by XML Worker, there is probably a mismatch between the name of the font used in the PDF and the alias that was used when ZapfDingbats.ttf was registered.
What is the font name used for ZapfDingbats in the CSS? You should register ZapfDingbats using that name as alias.

How does the iText document divide its pages and fit elements into a page

Hi, I recently posted a question here:
IText PDFImage seems to shrink or disappear during new pages after upgrade from 2.1.7 to 5.5.5 (Java .jars)
But I think the problem is not with the library but rather a missing setting. I am wondering if there is a way to control which elements get drawn on the current page versus pushed to a new page.
I want to do the following:
- create a document
- create a PdfPTable
- create an image element for each PdfPCell
- add the cells to the PdfPTable, then write it to the document
Result: some images seem to shrink near the end/beginning of a page, or are missing (it seems iText is trying to fit them onto the page).
Sample code again for visibility:
ByteArrayOutputStream baos = createTemporaryOutputStream();
Document doc = newDocument();
PdfWriter writer = newWriter(doc, baos);
writer.setViewerPreferences(PdfWriter.ALLOW_PRINTING | PdfWriter.PageLayoutSinglePage);
//create page rectangle landscape
Rectangle page = new Rectangle(PageSize.A4.rotate());
doc.setPageSize(page);
doc.setMargins((float)36.0, (float)36.0, (float)36.0, (float)36.0);
doc.open();
//create element pdf table.
PdfPTable table = new PdfPTable(new float[]{(float) 770.0});
table.setWidthPercentage(100);
table.setSplitRows(true);
table.setSplitLate(false);
table.setHeaderRows(0);
// in my case I used 5 800*600 images (same picture)
//then I loop through them and create pdfcell
//and then add it to table which then gets added to the document
List<Image> hi = (List<Image>) model.get("images");
for (Image image : hi) {
com.itextpdf.text.Image pdfImage = com.itextpdf.text.Image.getInstance(image.getBytes());
pdfImage.scalePercent((float) (0.8642384 * 100));
PdfPCell cell = new PdfPCell(pdfImage, false);
table.addCell(cell);
}
doc.add(table);
doc.close();
Thank you for your time. Any insight into what my problem is would be helpful.

How to move text written in Type 3 Font from one pdf to other pdf?

I have a PDF which includes text written in a Type 3 font.
I want to get some text from it and write it into another PDF with exactly the same shape.
I am using iText. Please give me a tip.
Edit: I attached my code.
DocumentFont f = renderInfo.getFont();
String str = renderInfo.getText();
x = renderInfo.getBaseline().getStartPoint().get(Vector.I1);
With this code, I want to write str at the position x.
Does this work with a Type 3 font?
You can copy parts of one page to a new one using code like this:
InputStream resourceStream = getClass().getResourceAsStream("from.pdf");
PdfReader reader = new PdfReader(resourceStream);
Rectangle pagesize = reader.getPageSizeWithRotation(1);
Document document = new Document(pagesize);
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("areaOfFrom.pdf"));
document.open();
PdfContentByte content = writer.getDirectContent();
PdfImportedPage page = writer.getImportedPage(reader, 1);
content.saveState();
content.rectangle(0, 350, 360, 475);
content.clip();
content.newPath();
content.addTemplate(page, 0, 0);
content.restoreState();
document.close();
reader.close();
This restricts the visible part of your page to the clipped rectangle.
Unfortunately, though, the hidden content is merely hidden; it is still there. In particular, you can still select the lines containing that hidden text and copy and paste them.
If you want to completely remove that hidden text (or start out by merely copying the desired text), you have to inspect the content of the imported page and filter it. I'm afraid iText does not yet explicitly support something like that. It can be done using the iText low-level API, but it is quite some work.
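The difference between clipping and filtering can be illustrated with a small, library-independent Java sketch. It is a conceptual model only, not iText code: the Chunk class, the positions, and the method names are hypothetical. The rectangle values mirror the call content.rectangle(0, 350, 360, 475) above, where the four arguments are the lower-left corner followed by width and height.

```java
import java.util.ArrayList;
import java.util.List;

class ClipVersusFilter {
    // Hypothetical stand-in for a piece of page content with a position.
    static final class Chunk {
        final String text;
        final float x;
        final float y;

        Chunk(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // True if the point lies inside the rectangle given as lower-left corner plus width and height.
    static boolean insideClip(float px, float py, float x, float y, float w, float h) {
        return px >= x && px <= x + w && py >= y && py <= y + h;
    }

    // Filtering builds new content that no longer contains anything outside the rectangle.
    static List<Chunk> filter(List<Chunk> content, float x, float y, float w, float h) {
        List<Chunk> kept = new ArrayList<>();
        for (Chunk c : content) {
            if (insideClip(c.x, c.y, x, y, w, h)) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Chunk> page = new ArrayList<>();
        page.add(new Chunk("inside", 100f, 400f));
        page.add(new Chunk("outside", 100f, 200f));

        // Clipping only changes visibility; both chunks remain part of the content.
        for (Chunk c : page) {
            System.out.println(c.text + " visible=" + insideClip(c.x, c.y, 0f, 350f, 360f, 475f));
        }
        System.out.println("chunks after clipping:  " + page.size());
        System.out.println("chunks after filtering: " + filter(page, 0f, 350f, 360f, 475f).size());
    }
}
```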

Add itextsharp barcode image to PdfPCell

I'm trying to generate barcodes from a text string and then write the results to a PDF using iTextSharp. Right now, I have the code below:
// Create a Document object
var pdfToCreate = new Document(PageSize.A4, 0, 0, 0, 0);
// Create a new PdfWriter object, writing the output to a MemoryStream
var outputStream = new MemoryStream();
var pdfWriter = PdfWriter.GetInstance(pdfToCreate, outputStream);
PdfContentByte cb = new PdfContentByte(pdfWriter);
// Open the Document for writing
pdfToCreate.Open();
PdfPTable BarCodeTable = new PdfPTable(3);
// Create barcode
Barcode128 code128 = new Barcode128();
code128.CodeType = Barcode.CODE128_UCC;
code128.Code = "00100370006756555316";
// Generate barcode image
iTextSharp.text.Image image128 = code128.CreateImageWithBarcode(cb, null, null);
// Add image to table cell
BarCodeTable.AddCell(image128);
// Add table to document
pdfToCreate.Add(BarCodeTable);
pdfToCreate.Close();
When I run this and try to open the PDF programmatically, I receive an error saying "The document has no pages."
However when I use:
pdfToCreate.Add(image128);
(instead of adding it to a table, I add it directly to the document), the barcode shows up (though it's floating off the document).
I'd like to send the barcodes to a table so that I can format the document easier. The final product will be 20-60 barcodes read from a database. Any idea on where I'm going wrong?
You are telling the PdfPTable constructor to create a three-column table but only providing a cell for one of the columns:
PdfPTable BarCodeTable = new PdfPTable(3);
iTextSharp by default ignores incomplete rows. You can either change the constructor to match your column count, add the extra cells or call BarCodeTable.CompleteRow() which will fill in the blanks using the default cell template.
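The arithmetic behind incomplete rows can be sketched as follows. This is a plain Java illustration, not iText code; the class and method names are hypothetical, and it assumes every cell spans exactly one column (no colspan).

```java
class RowCompletion {
    // Number of filler cells needed so that the last row becomes complete.
    static int missingCells(int columns, int cellsAdded) {
        if (columns <= 0 || cellsAdded < 0) {
            throw new IllegalArgumentException("columns must be positive, cellsAdded non-negative");
        }
        return (columns - cellsAdded % columns) % columns;
    }

    // Number of rows that are already complete and would therefore be drawn.
    static int completeRows(int columns, int cellsAdded) {
        if (columns <= 0 || cellsAdded < 0) {
            throw new IllegalArgumentException("columns must be positive, cellsAdded non-negative");
        }
        return cellsAdded / columns;
    }

    public static void main(String[] args) {
        // The situation from the question: three columns, one barcode cell.
        System.out.println("complete rows: " + completeRows(3, 1));
        System.out.println("missing cells: " + missingCells(3, 1));
    }
}
```

With three columns and one cell there is no complete row and two cells are missing, which matches the empty output described in the question.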