Embedding fonts when flattening a document with XFAFlattener - itext

I'm flattening an existing document with XFA_Worker. Is there a way to force fonts to be embedded in the flattened document?
using iText = iTextSharp.text;
using iTextPDF = iTextSharp.text.pdf;
using iTextImage = iTextSharp.text.Image;
using iTextReader = iTextSharp.text.pdf.PdfReader;
using iTextWriter = iTextSharp.text.pdf.PdfWriter;
reader = new iTextReader(tempFileDirectory + "\\" + "orig_" + tmpFileName);
iTextPDF.AcroFields form = reader.AcroFields;
iTextPDF.XfaForm xfa = form.Xfa;
if (xfa.XfaPresent)
{
iText.Document document = new iText.Document();
iTextWriter writer = iTextWriter.GetInstance(document,
new FileStream(tempFileDirectory + "\\" + tmpFileName, FileMode.Create));
XFAFlattener xfaf = new XFAFlattener(document, writer);
xfaf.Flatten(reader);
document.Close();
writer.Close();
}

For a specific project, I have HTML that requires a regular font and a bold font that are embedded. The requirements for the PDF are also very strict: I am not allowed to use any font other than OpenSans and OpenSans bold, and the text has to be stored as UNICODE (no simple fonts allowed).
To meet these requirements, I wrote the following implementation of the FontProvider interface:
class MyFontProvider implements FontProvider {
protected BaseFont regular;
protected BaseFont bold;
public MyFontProvider() throws DocumentException, IOException {
regular = BaseFont.createFont("resources/fonts/OpenSans-Regular.ttf", BaseFont.IDENTITY_H, true);
bold = BaseFont.createFont("resources/fonts/OpenSans-Bold.ttf", BaseFont.IDENTITY_H, true);
}
public boolean isRegistered(String fontname) {
return true;
}
public Font getFont(String fontname, String encoding, boolean embedded, float size, int style, BaseColor color) {
Font font;
switch (style) {
case Font.BOLD:
font = new Font(bold, size);
break;
default:
font = new Font(regular, size);
}
font.setColor(color);
return font;
}
}
I create two BaseFont instances in the constructor. In the getFont() method, I ignore the fontname, encoding and embedded parameters. I will always return embedded OpenSans using Identity-H as "encoding". I do apply the font size, font color, and the style (but only in case the style is set to bold; in all other cases, I use the regular font). You can adapt this FontProvider any way you like, but in my case, these were strict requirements for the project.
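One consequence of the switch in getFont() is that a combined style such as bold-italic does not equal Font.BOLD and therefore falls through to the regular font. If you would rather treat every style that contains the bold bit as bold, a bit test is one option. The sketch below is plain Java with no iText dependency; the class and method names are hypothetical, and the constants are assumed to mirror the style values of iText 5's Font class (NORMAL = 0, BOLD = 1, ITALIC = 2, BOLDITALIC = 3).

```java
final class FaceChooser {
    // Assumption: these mirror the style constants of com.itextpdf.text.Font in iText 5.
    static final int NORMAL = 0;
    static final int BOLD = 1;
    static final int ITALIC = 2;
    static final int BOLDITALIC = BOLD | ITALIC;

    /** Exact match, like the switch in getFont(): only a style equal to BOLD picks bold. */
    static String exactMatch(int style) {
        return style == BOLD ? "bold" : "regular";
    }

    /** Bit test: any style that contains the BOLD bit picks bold. */
    static String bitTest(int style) {
        return (style & BOLD) != 0 ? "bold" : "regular";
    }

    public static void main(String[] args) {
        // Compare both strategies for a bold-italic request.
        System.out.println("exact match: " + exactMatch(BOLDITALIC));
        System.out.println("bit test:    " + bitTest(BOLDITALIC));
    }
}
```

Whether bold-italic should map to the bold face is a project decision; the exact match is the behaviour described above.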
Obviously, I need the "long" version of the XML Worker code because I now need to declare MyFontProvider to the HTML pipeline:
// CSS
CSSResolver cssResolver = new StyleAttrCSSResolver();
CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream(CSS));
cssResolver.addCss(cssFile);
// HTML
CssAppliers cssAppliers = new CssAppliersImpl(new MyFontProvider());
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
// Pipelines
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new ByteArrayInputStream(invoice));
Every time XML Worker needs a font, it will call MyFontProvider and the getFont() method will return no other font than embedded OpenSans or OpenSans-Bold.
The same principle exists for XFA Flattener, where you can also work with CssAppliers and FontProvider implementations. XFA Worker is a closed source product, which means that you are either using the trial version or are a customer of iText Group. In both cases, you should contact iText Group directly if this answer doesn't fix your problem.

Related

Itext How to set both UTF encoding and Font to a Paragraph

I want to set a font for text that is in Cyrillic. I successfully convert the text to Cyrillic, but I cannot set a Font on the same text.
File fontFile = new File("arialuni.ttf");
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(RESULT));
document.open();
writer.getAcroForm().setNeedAppearances(true);
Font boldFont = new Font(Font.FontFamily.TIMES_ROMAN, 18, Font.BOLD);
Font normalFont = new Font(Font.FontFamily.TIMES_ROMAN, 10, Font.ITALIC);
BaseFont unicode = BaseFont.createFont(fontFile.getAbsolutePath(), BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
FontSelector fs = new FontSelector();
fs.addFont(new Font(unicode));
addContent(document,article.getTitle(),fs,boldFont);
private static void addContent(Document document,String paragraph,FontSelector fs,Font font) throws DocumentException {
Phrase phrase = fs.process(paragraph);
Paragraph p = new Paragraph(phrase.toString(),font);
document.add(p);
}
As @mkl indicates in the comments, you are mixing FontSelector functionality that gives you a Phrase that could use the appropriate unicode fonts (fonts with BaseFont.IDENTITY_H as encoding parameter), with creating a Paragraph with a simple font (Font.FontFamily.TIMES_ROMAN).
When you do fs.process(paragraph), you get a Phrase in which every Chunk has the correct font, but when you do phrase.toString(), you throw away all those fonts, and you replace them with Font.FontFamily.TIMES_ROMAN. That doesn't make any sense.
Why don't you replace this:
Phrase phrase = fs.process(paragraph);
Paragraph p = new Paragraph(phrase.toString(),font);
document.add(p);
with:
document.add(fs.process(paragraph));
Why does your addContent() method need a Font as parameter? Also, if you really need a Paragraph object you can also do this:
Paragraph p = new Paragraph();
p.add(fs.process(paragraph));
document.add(p);
Or even:
Paragraph p = new Paragraph(fs.process(paragraph));
document.add(p);
As long as you don't replace the correct fonts with the incorrect fonts by "flattening" the Phrase to a String, you're probably OK.
Note that you probably don't even need the FontSelector. There's nothing wrong with doing this:
BaseFont unicode = BaseFont.createFont(
fontFile.getAbsolutePath(), BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font font = new Font(unicode, 12);
Paragraph p = new Paragraph(paragraph, font);
It seems to me that you are making things unnecessarily complex.
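To make the point about fs.process() and phrase.toString() concrete, here is a rough model in plain Java with no iText dependency. It is only an analogy: each "font" is stood in for by a charset name, a character is assigned to the first charset that can encode it, and adjacent characters with the same choice are grouped into a run. All names (RunSplitter, Run, process, flatten) are hypothetical, and the real FontSelector checks glyph coverage in font files rather than charsets.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

final class RunSplitter {
    /** One run of text plus the label of the "font" picked for it. */
    static final class Run {
        final String font;
        final String text;
        Run(String font, String text) { this.font = font; this.text = text; }
    }

    /** Returns the first charset name that can encode c, or the last name as a fallback. */
    static String pick(char c, String... charsetNames) {
        for (String name : charsetNames) {
            if (Charset.forName(name).newEncoder().canEncode(c)) {
                return name;
            }
        }
        return charsetNames[charsetNames.length - 1];
    }

    /** Splits text into runs; requires at least one charset name. */
    static List<Run> process(String text, String... charsetNames) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentFont = null;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String font = pick(c, charsetNames);
            // Close the current run when the chosen "font" changes.
            if (!font.equals(currentFont) && current.length() > 0) {
                runs.add(new Run(currentFont, current.toString()));
                current.setLength(0);
            }
            currentFont = font;
            current.append(c);
        }
        if (current.length() > 0) {
            runs.add(new Run(currentFont, current.toString()));
        }
        return runs;
    }

    /** Joins the runs into a plain String; the per-run font labels are gone afterwards. */
    static String flatten(List<Run> runs) {
        StringBuilder sb = new StringBuilder();
        for (Run r : runs) {
            sb.append(r.text);
        }
        return sb.toString();
    }
}
```

In this model, process() keeps the per-run choice, while flatten() returns only the characters, which is the same loss that happens when a Phrase is turned into a String and wrapped in a Paragraph with a single font.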

How do you underline text with dashedline in ITEXT PDF

I have underlined "Some text" by
var par = new Paragraph();
par.Add(new Chunk("Some text", CreateFont(12, Font.UNDERLINE)));
document.Add(par);
Is it possible to underline just "Some text" with a dashed line (not the whole paragraph)?
Thanks
This answer tells you how to do it but unfortunately doesn't provide any code so I've provided it below.
To the best of my knowledge there isn't a direct way to achieve this in iTextSharp by just setting a simple property. Instead, you need to use a PageEvent and manually draw the line yourself.
First you need to subclass PdfPageEventHelper:
private class UnderlineMaker : PdfPageEventHelper {
}
You then want to override the OnGenericTag() method:
public override void OnGenericTag(PdfWriter writer, Document document, iTextSharp.text.Rectangle rect, string text) {
}
The text variable will hold an arbitrary value that you set later on. Here's a full working example with comments:
private class UnderlineMaker : PdfPageEventHelper {
public override void OnGenericTag(PdfWriter writer, Document document, iTextSharp.text.Rectangle rect, string text) {
switch (text) {
case "dashed":
//Grab the canvas underneath the content
var cb = writer.DirectContentUnder;
//Save the current state so that we don't affect future canvas drawings
cb.SaveState();
//Set a line color
cb.SetColorStroke(BaseColor.BLUE);
//Set a dash pattern
//See this for more details: http://api.itextpdf.com/itext/com/itextpdf/text/pdf/PdfContentByte.html#setLineDash(float)
cb.SetLineDash(3, 2);
//Move the cursor to the left edge with the "pen up"
cb.MoveTo(rect.Left, rect.Bottom);
//Move the cursor to the right edge with the "pen down"
cb.LineTo(rect.Right, rect.Bottom);
//Actually draw the lines (the "pen downs")
cb.Stroke();
//Reset the drawing states to what it was before we started messing with it
cb.RestoreState();
break;
}
//There isn't actually anything here but I always just make it a habit to call my base
base.OnGenericTag(writer, document, rect, text);
}
}
And below is an implementation of it:
//Create a test file on the desktop
var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
//Normal iTextSharp boilerplate, nothing special here
using (var fs = new FileStream(testFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
//Bind our helper class
writer.PageEvent = new UnderlineMaker();
//Create a new paragraph
var p = new Paragraph();
//Add a normal chunk
p.Add(new Chunk("This is a "));
//Create another chunk with an arbitrary tag associated with it and add to the paragraph
var c = new Chunk("dashed underline test");
c.SetGenericTag("dashed");
p.Add(c);
//Add the paragraph to the document
doc.Add(p);
doc.Close();
}
}
}
If you wanted to get fancy you could pass a delimited string to SetGenericTag() like dashed-black-2x4 or something and parse that out in the OnGenericTag event.
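As a rough illustration of the delimited-tag idea, the sketch below parses a tag such as dashed-black-2x4 into its parts. It is written in plain Java with no iText dependency, the TagSpec class is hypothetical, and the reading of "2x4" as "dash length 2, gap length 4" is an assumption, since the format is only hinted at above.

```java
final class TagSpec {
    final String style;  // e.g. "dashed"
    final String color;  // e.g. "black"
    final float dash;    // length of each dash (assumed meaning of the first number)
    final float gap;     // length of each gap (assumed meaning of the second number)

    TagSpec(String style, String color, float dash, float gap) {
        this.style = style;
        this.color = color;
        this.dash = dash;
        this.gap = gap;
    }

    /** Parses tags of the assumed form STYLE-COLOR-DASHxGAP, e.g. "dashed-black-2x4". */
    static TagSpec parse(String tag) {
        String[] parts = tag.split("-");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected STYLE-COLOR-DASHxGAP but got: " + tag);
        }
        String[] dashGap = parts[2].split("x");
        if (dashGap.length != 2) {
            throw new IllegalArgumentException("Expected DASHxGAP but got: " + parts[2]);
        }
        return new TagSpec(parts[0], parts[1],
                Float.parseFloat(dashGap[0]), Float.parseFloat(dashGap[1]));
    }

    public static void main(String[] args) {
        TagSpec spec = parse("dashed-black-2x4");
        System.out.println(spec.style + " " + spec.color + " " + spec.dash + " " + spec.gap);
    }
}
```

Inside the tag handler, the parsed values would then feed the calls that set the stroke color and the dash pattern; mapping a color name to an actual color object is left out here.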

How do I change the weight of a simulated bold font using itext

I am using the iText library to generate text. I am loading the Arial Unicode MS font which does not contain a bold style so iText is simulating the bold. This works fine, but the weight of the bold font appears too heavy compared with text generated using the Java API or even using Microsoft Word.
I tried to get the weight from the FontDescriptor, but the value returned is always 0.0
float weight = font.getBaseFont().getFontDescriptor(BaseFont.FONT_WEIGHT, fontSize);
Is there a way I can change the weight of a simulated bold font?
As an addendum to @Chris' answer: You do not need to construct those Object[]s as there is a Chunk convenience method:
BaseFont arialUnicodeMs = BaseFont.createFont("c:\\Windows\\Fonts\\ARIALUNI.TTF", BaseFont.WINANSI, BaseFont.EMBEDDED);
Font arial12 = new Font(arialUnicodeMs, 12);
Paragraph p = new Paragraph();
for (int i = 1; i < 100; i++)
{
Chunk chunk = new Chunk(String.valueOf(i) + " ", arial12);
chunk.setTextRenderMode(PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, i/100f, null);
p.add(chunk);
}
document.add(p);
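results in the numbers 1 to 99, each drawn with a slightly heavier stroke than the one before (from 0.01 up to 0.99).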
EDIT
Sorry, I just realized after posting this that you're using iText but my answer is for iTextSharp. You should, however, be able to use most of the code below. I've updated the source code link to reference the appropriate Java source.
Bold simulation (faux bold) is done by drawing the text with a stroke. When iText is asked to draw bold text with a non-bold font, it defaults to applying a stroke with a width of the font's size divided by 30. You can see this in the current source code here. The magic part is setting the chunk's text rendering mode to a stroke of your choice:
//.Net code
myChunk.Attributes[Chunk.TEXTRENDERMODE] = new Object[] { PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, MAGIC_NUMBER_HERE, null };
//Java code
myChunk.attributes.put(Chunk.TEXTRENDERMODE, new Object[]{Integer.valueOf(PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE), MAGIC_NUMBER_HERE, null});
Knowing that, you can apply the same logic with a weight of your own choosing. The sample below creates four chunks: the first normal, the second faux-bold, the third ultra-heavy faux-bold and the fourth ultra-lite faux-bold.
//.Net code below but should be fairly easy to convert to Java
//Path to our PDF
var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
//Path to our font
var ff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");
//Normal document setup, nothing special here
using (var fs = new FileStream(testFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
//Register our font
FontFactory.Register(ff, "Arial Unicode MS");
//Declare a size to use throughout the demo
var size = 20;
//Get a normal and a faux-bold version of the font
var f = FontFactory.GetFont("Arial Unicode MS", BaseFont.IDENTITY_H, BaseFont.EMBEDDED, size, iTextSharp.text.Font.NORMAL);
var fb = FontFactory.GetFont("Arial Unicode MS", BaseFont.IDENTITY_H, BaseFont.EMBEDDED, size, iTextSharp.text.Font.BOLD);
//Create a normal chunk
var cNormal = new Chunk("Hello ", f);
//Create a faux-bold chunk
var cFauxBold = new Chunk("Hello ", fb);
//Create an ultra heavy faux-bold
var cHeavy = new Chunk("Hello ", f);
cHeavy.Attributes = new Dictionary<string, object>();
cHeavy.Attributes[Chunk.TEXTRENDERMODE] = new Object[] { PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, size / 10f, null };
//Create a lite faux-bold
var cLite = new Chunk("Hello ", f);
cLite.Attributes = new Dictionary<string, object>();
cLite.Attributes[Chunk.TEXTRENDERMODE] = new Object[] { PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, size / 50f, null };
//Add to document
var p = new Paragraph();
p.Add(cNormal);
p.Add(cFauxBold);
p.Add(cHeavy);
p.Add(cLite);
doc.Add(p);
doc.Close();
}
}
}
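For reference, the stroke widths used in the sample can be worked out with a small helper. This is a plain Java sketch with no iText dependency; the class and method names are hypothetical, and the divisor of 30 for the default simulated bold is taken from the explanation above rather than checked against a specific iText version.

```java
final class FauxBoldWidths {
    /** Stroke width for a given font size and divisor; a larger divisor gives a lighter bold. */
    static float strokeWidth(float fontSize, float divisor) {
        if (divisor <= 0f) {
            throw new IllegalArgumentException("divisor must be positive");
        }
        return fontSize / divisor;
    }

    public static void main(String[] args) {
        float size = 20f; // same size as in the sample above
        System.out.println("default (size / 30): " + strokeWidth(size, 30f));
        System.out.println("heavy   (size / 10): " + strokeWidth(size, 10f));
        System.out.println("lite    (size / 50): " + strokeWidth(size, 50f));
    }
}
```

In Java, the resulting value would be the second argument to chunk.setTextRenderMode(PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, width, null), as in the addendum earlier in this thread.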

itextSharp - html to pdf some turkish characters are missing

When I try to generate a PDF from HTML, some Turkish characters like ĞÜŞİÖÇ ğüşıöç are missing in the PDF. I see a space in place of these characters, but I want the characters themselves to be printed.
My code is:
public virtual void PrintPdf(string html, int id)
{
String htmlText = html.ToString();
Document document = new Document();
string filePath = HostingEnvironment.MapPath("~/Content/Pdf/");
PdfWriter.GetInstance(document, new FileStream(filePath + "\\pdf-"+id+".pdf", FileMode.Create));
document.Open();
iTextSharp.text.html.simpleparser.HTMLWorker hw =
new iTextSharp.text.html.simpleparser.HTMLWorker(document);
hw.Parse(new StringReader(htmlText));
document.Close();
}
How to print all Turkish characters on PDF?
I finally found a solution for this problem; with it you can print all Turkish characters.
String htmlText = html.ToString();
Document document = new Document();
string filePath = HostingEnvironment.MapPath("~/Content/Pdf/");
PdfWriter.GetInstance(document, new FileStream(filePath + "\\pdf-"+Name+".pdf", FileMode.Create));
document.Open();
iTextSharp.text.html.simpleparser.HTMLWorker hw = new iTextSharp.text.html.simpleparser.HTMLWorker(document);
FontFactory.Register(Path.Combine(_webHelper.MapPath("~/App_Data/Pdf/arial.ttf")), "Garamond"); // just give a path of arial.ttf
StyleSheet css = new StyleSheet();
css.LoadTagStyle("body", "face", "Garamond");
css.LoadTagStyle("body", "encoding", "Identity-H");
css.LoadTagStyle("body", "size", "12pt");
hw.SetStyleSheet(css);
hw.Parse(new StringReader(htmlText));
I had the same problem, and this code solved it:
var pathUpload = Server.MapPath($"~/Test.pdf");
using (var fs = System.IO.File.Create(pathUpload))
{
using (var doc = new Document(PageSize.A4, 0f, 0f, 10f, 10f))
{
using (var writer = PdfWriter.GetInstance(doc, fs))
{
doc.Open();
BaseFont baseFont = BaseFont.CreateFont("C:\\Windows\\Fonts\\Arial.ttf", "windows-1254", true);
Font fontNormal = new Font(baseFont, 24, Font.NORMAL);
var p = new Paragraph("Test paragrapgh İÇşıĞğŞçöÖ", fontNormal);
doc.Add(p);
doc.Close();
}
}
}
I had the same problem. After a few days of research, this worked:
BaseFont myFont = BaseFont.CreateFont(@"C:\windows\fonts\arial.ttf", "windows-1254", BaseFont.EMBEDDED);
Font fontNormal = new Font(myFont);
Every time you need to write text that has special characters, do it this way:
doc.Add(new Paragraph("İıĞğŞşÜüÖöŞşÇç", fontNormal)); // a new paragraph
results.Add(new ListItem("İıĞğŞşÜüÖöŞşÇç", fontNormal)); // a new list item
Additionally, this alias may be needed so that Font resolves to the iTextSharp class:
using Font = iTextSharp.text.Font;
it works like a charm :)
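One way to see why the "windows-1254" encoding matters here: that code page contains all of the Turkish letters, while the Western European windows-1252 lacks six of them (Ğ, Ş, İ, ğ, ş, ı). The check below is a minimal Java sketch using only the JDK's charset support. The class and method names are hypothetical, it assumes the JDK in use ships both charsets, and it says nothing about which glyphs a particular font file contains.

```java
import java.nio.charset.Charset;

final class TurkishCoverage {
    // The letters from the question, written as Unicode escapes so that
    // the encoding of the source file does not matter.
    static final String TURKISH =
        "\u011E\u00DC\u015E\u0130\u00D6\u00C7\u011F\u00FC\u015F\u0131\u00F6\u00E7";

    /** True if every character of text can be encoded in the named charset. */
    static boolean covers(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println("windows-1252 covers all: " + covers("windows-1252", TURKISH));
        System.out.println("windows-1254 covers all: " + covers("windows-1254", TURKISH));
    }
}
```

The Identity-H approach from the earlier answer sidesteps the question of code pages entirely, because the text is stored as Unicode.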
I had a similar problem and I couldn't get the CP1254 encoding to work but I found another solution which worked for me.
In the css just add "font-family: Arial;" and put it on the outer div tag.
.className{
font-family: Arial;
}
<div class="className">
...
</div>
I found this answer here: How to generate a valid PDF/A file using iText and XMLWorker (HTML to PDF/A process)
It took a long time to find this solution but I found it searching for a font solution to display Turkish characters.

iTextSharp does not render header/footer when using element generated by XmlWorkerHelper

I am trying to add header/footer to a PDF whose content is otherwise generated by XMLWorkerHelper. Not sure if it's a placement issue but I can't see the header/footer.
This is in an ASP.NET MVC app using iTextSharp and XmlWorker packages ver 5.4.4 from Nuget.
Code to generate PDF is as follows:
private byte[] ParseXmlToPdf(string html)
{
XhtmlToListHelper xhtmlHelper = new XhtmlToListHelper();
Document document = new Document(PageSize.A4, 30, 30, 90, 90);
MemoryStream msOutput = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(document, msOutput);
writer.PageEvent = new TextSharpPageEventHelper();
document.Open();
var htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/themes/mytheme.css"), true);
var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, writer)));
var worker = new XMLWorker(pipeline, true);
var parser = new XMLParser();
parser.AddListener(worker);
using (TextReader sr = new StringReader(html))
{
parser.Parse(sr);
}
//string text = "Some Random Text";
//for (int k = 0; k < 8; ++k)
//{
// text += " " + text;
// Paragraph p = new Paragraph(text);
// p.SpacingBefore = 8f;
// document.Add(p);
//}
worker.Close();
document.Close();
return msOutput.ToArray();
}
Now instead of using these three lines
using (TextReader sr = new StringReader(html))
{
parser.Parse(sr);
}
if I comment them out and uncomment the code to add a random Paragraph of text (commented in above sample), I see the header/footer along with the random text.
What am I doing wrong?
The EventHandler is as follows:
public class TextSharpPageEventHelper : PdfPageEventHelper
{
public Image ImageHeader { get; set; }
public override void OnEndPage(PdfWriter writer, Document document)
{
float cellHeight = document.TopMargin;
Rectangle page = document.PageSize;
PdfPTable head = new PdfPTable(2);
head.TotalWidth = page.Width;
PdfPCell c = new PdfPCell(ImageHeader, true);
c.HorizontalAlignment = Element.ALIGN_RIGHT;
c.FixedHeight = cellHeight;
c.Border = PdfPCell.NO_BORDER;
head.AddCell(c);
c = new PdfPCell(new Phrase(
DateTime.UtcNow.ToString("yyyy-MM-dd HH:mm:ss") + " GMT",
new Font(Font.FontFamily.COURIER, 8)
));
c.Border = PdfPCell.TOP_BORDER | PdfPCell.RIGHT_BORDER | PdfPCell.BOTTOM_BORDER | PdfPCell.LEFT_BORDER;
c.VerticalAlignment = Element.ALIGN_BOTTOM;
c.FixedHeight = cellHeight;
head.AddCell(c);
head.WriteSelectedRows(
0, -1, // first/last row; -1 means write all rows
0, // left offset
// ** bottom** yPos of the table
page.Height - cellHeight + head.TotalHeight,
writer.DirectContent
);
}
}
Feeling angry and stupid for having lost more than two hours of my life to Windows 8.1! Apparently the built-in PDF Reader app isn't competent enough to show headers/footers. I installed Foxit Reader, opened the output, and it shows the headers all right!
I guess I should try with Adobe Reader too!!!
UPDATES:
I tried opening the generated PDF in Adobe Acrobat reader and finally saw some errors. This made me look at the HTML that I was sending to the XMLWorkerHelper more closely. The structure of the HTML had <div>s inside <table>s and that was tripping up the generated PDF. I cleaned up the HTML to ensure <table> inside <table> and thereafter the PDF came out clean in all readers.
Long story short, the code above is fine, but if you are testing for PDF's correctness, have Adobe Acrobat Reader handy at a minimum. Other readers behave unpredictably as in either not reporting errors or incorrectly rendering the PDF.