iText: How to set both UTF encoding and a Font on a Paragraph

I want to set a font on text that is in Cyrillic. I successfully convert the text to Cyrillic, but I cannot set a Font on that same text.
File fontFile = new File("arialuni.ttf");
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(RESULT));
document.open();
writer.getAcroForm().setNeedAppearances(true);
Font boldFont = new Font(Font.FontFamily.TIMES_ROMAN, 18, Font.BOLD);
Font normalFont = new Font(Font.FontFamily.TIMES_ROMAN, 10, Font.ITALIC);
BaseFont unicode = BaseFont.createFont(fontFile.getAbsolutePath(), BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
FontSelector fs = new FontSelector();
fs.addFont(new Font(unicode));
addContent(document,article.getTitle(),fs,boldFont);
private static void addContent(Document document,String paragraph,FontSelector fs,Font font) throws DocumentException {
Phrase phrase = fs.process(paragraph);
Paragraph p = new Paragraph(phrase.toString(),font);
document.add(p);
}

As @mkl indicates in the comments, you are mixing FontSelector functionality that gives you a Phrase that could use the appropriate unicode fonts (fonts with BaseFont.IDENTITY_H as encoding parameter), with creating a Paragraph with a simple font (Font.FontFamily.TIMES_ROMAN).
When you do fs.process(paragraph), you get a Phrase in which every Chunk has the correct font, but when you do phrase.toString(), you throw away all those fonts, and you replace them with Font.FontFamily.TIMES_ROMAN. That doesn't make any sense.
Why don't you replace this:
Phrase phrase = fs.process(paragraph);
Paragraph p = new Paragraph(phrase.toString(),font);
document.add(p);
with:
document.add(fs.process(paragraph));
Why does your addContent() method need a Font as parameter? Also, if you really need a Paragraph object you can also do this:
Paragraph p = new Paragraph();
p.add(fs.process(paragraph));
document.add(p);
Or even:
Paragraph p = new Paragraph(fs.process(paragraph));
document.add(p);
As long as you don't replace the correct fonts with the incorrect fonts by "flattening" the Phrase to a String, you're probably OK.
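The effect of "flattening" can be illustrated without iText at all. The sketch below is a deliberately simplified model: StyledChunk, StyledPhrase, flatten() and rewrap() are hypothetical stand-ins invented for this illustration, not iText classes, and the font names are only labels. It shows that once styled pieces are joined into a plain String, the per-piece font information is gone, and re-wrapping the String can only apply one font to everything.

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenDemo {
    // Hypothetical stand-in for an iText Chunk: a piece of text plus a font name.
    static final class StyledChunk {
        final String text;
        final String fontName;

        StyledChunk(String text, String fontName) {
            this.text = text;
            this.fontName = fontName;
        }
    }

    // Hypothetical stand-in for a Phrase: an ordered list of styled chunks.
    static final class StyledPhrase {
        final List<StyledChunk> chunks = new ArrayList<>();

        void add(String text, String fontName) {
            chunks.add(new StyledChunk(text, fontName));
        }

        // Flattening keeps the characters but drops every per-chunk font.
        String flatten() {
            StringBuilder sb = new StringBuilder();
            for (StyledChunk c : chunks) {
                sb.append(c.text);
            }
            return sb.toString();
        }
    }

    // Wrapping a flattened string in a single font yields one chunk with that font.
    static StyledPhrase rewrap(String text, String fontName) {
        StyledPhrase p = new StyledPhrase();
        p.add(text, fontName);
        return p;
    }

    public static void main(String[] args) {
        StyledPhrase selected = new StyledPhrase();
        selected.add("Report: ", "Times-Roman");
        // Cyrillic sample text, written as escapes so the source stays ASCII.
        selected.add("\u041f\u0440\u0438\u0432\u0435\u0442", "Arial Unicode MS");

        StyledPhrase rewrapped = rewrap(selected.flatten(), "Times-Roman");

        System.out.println("chunks before flattening: " + selected.chunks.size());
        System.out.println("chunks after re-wrapping: " + rewrapped.chunks.size());
        System.out.println("font after re-wrapping:   " + rewrapped.chunks.get(0).fontName);
    }
}
```

In the model, the Cyrillic piece ends up labelled with the simple font, which corresponds to what happens to the FontSelector result when it is passed through toString().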
Note that you probably even don't need the FontSelector. There's nothing wrong with doing this:
BaseFont unicode = BaseFont.createFont(
fontFile.getAbsolutePath(), BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font font = new Font(unicode, 12);
Paragraph p = new Paragraph(paragraph, font);
It seems to me that you are making things unnecessarily complex.

Related

Embedding fonts flattening a document with XFAFlattener

I'm flattening an existing document with XFA_Worker. Is there a way to force fonts to be embedded in the flattened document?
using iText = iTextSharp.text;
using iTextPDF = iTextSharp.text.pdf;
using iTextImage = iTextSharp.text.Image;
using iTextReader = iTextSharp.text.pdf.PdfReader;
using iTextWriter = iTextSharp.text.pdf.PdfWriter;
reader = new iTextReader(tempFileDirectory + "\\" + "orig_" + tmpFileName);
iTextPDF.AcroFields form = reader.AcroFields;
iTextPDF.XfaForm xfa = form.Xfa;
if (xfa.XfaPresent)
{
iText.Document document = new iText.Document();
iTextWriter writer = iTextWriter.GetInstance(document,
new FileStream(tempFileDirectory + "\\" + tmpFileName, FileMode.Create));
XFAFlattener xfaf = new XFAFlattener(document, writer);
xfaf.Flatten(reader);
document.Close();
writer.Close();
}
For a specific project, I have HTML that requires a regular font and a bold font that are embedded. The requirements for the PDF are also very strict: I am not allowed to use any font other than OpenSans and OpenSans bold, and the text has to be stored as UNICODE (no simple fonts allowed).
To meet these requirements, I wrote the following implementation of the FontProvider interface:
class MyFontProvider implements FontProvider {
protected BaseFont regular;
protected BaseFont bold;
public MyFontProvider() throws DocumentException, IOException {
regular = BaseFont.createFont("resources/fonts/OpenSans-Regular.ttf", BaseFont.IDENTITY_H, true);
bold = BaseFont.createFont("resources/fonts/OpenSans-Bold.ttf", BaseFont.IDENTITY_H, true);
}
public boolean isRegistered(String fontname) {
return true;
}
public Font getFont(String fontname, String encoding, boolean embedded, float size, int style, BaseColor color) {
Font font;
switch (style) {
case Font.BOLD:
font = new Font(bold, size);
break;
default:
font = new Font(regular, size);
}
font.setColor(color);
return font;
}
}
I create two BaseFont instances in the constructor. In the getFont() method, I ignore the fontname, encoding and embedded parameters. I will always return embedded OpenSans using Identity-H as "encoding". I do apply the font size, font color, and the style (but only in case the style is set to bold; in all other cases, I use the regular font). You can adapt this FontProvider any way you like, but in my case, these were strict requirements for the project.
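One consequence of matching the style with a switch is worth spelling out: a combined style such as bold-italic is not equal to Font.BOLD, so it falls through to the regular font. The sketch below uses plain integers that are assumed to mirror iText 5's Font style constants (UNDEFINED = -1, NORMAL = 0, BOLD = 1, ITALIC = 2); pickExact() and pickByBit() are hypothetical helper names, and the bit-test variant is an alternative for illustration, not what the project above required.

```java
public class StyleFlags {
    // Values assumed to mirror iText 5's Font style constants.
    static final int UNDEFINED = -1;
    static final int NORMAL = 0;
    static final int BOLD = 1;
    static final int ITALIC = 2;
    static final int BOLDITALIC = BOLD | ITALIC;

    // Mirrors the switch in getFont(): only an exact BOLD selects the bold face.
    static String pickExact(int style) {
        switch (style) {
            case BOLD:
                return "OpenSans-Bold";
            default:
                return "OpenSans-Regular";
        }
    }

    // Variation: any defined style with the bold bit set selects the bold face.
    // The style > 0 guard keeps UNDEFINED (-1, all bits set) from counting as bold.
    static String pickByBit(int style) {
        return (style > 0 && (style & BOLD) != 0) ? "OpenSans-Bold" : "OpenSans-Regular";
    }

    public static void main(String[] args) {
        int[] styles = {UNDEFINED, NORMAL, BOLD, ITALIC, BOLDITALIC};
        for (int style : styles) {
            System.out.println(style + ": " + pickExact(style) + " / " + pickByBit(style));
        }
    }
}
```

Which behaviour is right depends on the requirements: with only two permitted fonts, mapping bold-italic to the bold face may or may not be what you want.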
Obviously, I need the "long" version of the XML Worker code because I now need to declare MyFontProvider to the HTML pipeline:
// CSS
CSSResolver cssResolver = new StyleAttrCSSResolver();
CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream(CSS));
cssResolver.addCss(cssFile);
// HTML
CssAppliers cssAppliers = new CssAppliersImpl(new MyFontProvider());
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
// Pipelines
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new ByteArrayInputStream(invoice));
Every time XML Worker needs a font, it will call MyFontProvider and the getFont() method will return no other font than embedded OpenSans or OpenSans-Bold.
The same principle exists for XFA Flattener, where you can also work with CssAppliers and FontProvider implementations. XFA Worker is a closed source product, which means that you are either using the trial version or are a customer of iText Group. In both cases, you should contact iText Group directly if this answer doesn't fix your problem.

iText BaseFont with mingliu.ttc: italic does not work

I want to show text in the mingliu font in italic style using the following code, but it fails: the output is still in the regular style, not italic (I am using iText 2).
PdfContentByte cb = writer.getDirectContent();
..................
String ttfPath = null;
ttfPath = BaseSection.class.getResource("/WEB-INF/lib/mingliu.ttc").getPath();
try{
this.bfi = BaseFont.createFont(ttfPath+",0,Italic", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
cb.setFontAndSize(bfi, 8);
..........
cb.showText(companyText);
}
Is there any way to show the mingliu text in italic style using BaseFont.createFont?
Thanks.
I found that the following solves my problem:
PdfContentByte cb = writer.getDirectContent();
cb.saveState();
String ttfPath = BaseSection.class.getResource("/WEB-INF/lib/mingliu.ttc").getPath();
bf = BaseFont.createFont(ttfPath+",0", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
String companyText = "abc";
Font font = new Font(bf, 8, Font.ITALIC);
Chunk chunk = new Chunk(companyText, font);
Phrase phrase = new Phrase(chunk);
ColumnText.showTextAligned(cb, Element.ALIGN_RIGHT, phrase, document.right(), 1, 0);
cb.restoreState();
I hope this helps others with a similar problem.

How do I change the weight of a simulated bold font using itext

I am using the iText library to generate text. I am loading the Arial Unicode MS font which does not contain a bold style so iText is simulating the bold. This works fine, but the weight of the bold font appears too heavy compared with text generated using the Java API or even using Microsoft Word.
I tried to get the weight from the FontDescriptor, but the value returned is always 0.0:
float weight = font.getBaseFont().getFontDescriptor(BaseFont.FONT_WEIGHT, fontSize);
Is there a way I can change the weight of a simulated bold font?
As an addendum to @Chris' answer: You do not need to construct those Object[]s as there is a Chunk convenience method:
BaseFont arialUnicodeMs = BaseFont.createFont("c:\\Windows\\Fonts\\ARIALUNI.TTF", BaseFont.WINANSI, BaseFont.EMBEDDED);
Font arial12 = new Font(arialUnicodeMs, 12);
Paragraph p = new Paragraph();
for (int i = 1; i < 100; i++)
{
Chunk chunk = new Chunk(String.valueOf(i) + " ", arial12);
chunk.setTextRenderMode(PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, i/100f, null);
p.add(chunk);
}
document.add(p);
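This renders the numbers 1 to 99, each drawn with a slightly wider stroke than the one before, so the text grows steadily bolder.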
EDIT
Sorry, I just realized after posting this that you're using iText but my answer is for iTextSharp. You should, however, be able to use most of the code below. I've updated the source code link to reference the appropriate Java source.
Bold simulation (faux bold) is done by drawing the text with a stroke. When iText is asked to draw bold text with a non-bold font it defaults to applying a stroke with a width of the font's size divided by 30. You can see this in the current source code here. The magic part is setting the chunk's text rendering mode to a stroke of your choice:
//.Net code
myChunk.Attributes[Chunk.TEXTRENDERMODE] = new Object[] { PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, MAGIC_NUMBER_HERE, null };
//Java code
myChunk.attributes.put(Chunk.TEXTRENDERMODE, new Object[]{Integer.valueOf(PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE), MAGIC_NUMBER_HERE, null});
Knowing that, you can apply the same logic using your own weight preference. The sample below creates four chunks: the first normal, the second faux-bold, the third ultra-heavy faux-bold and the fourth ultra-lite faux-bold.
//.Net code below but should be fairly easy to convert to Java
//Path to our PDF
var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
//Path to our font
var ff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");
//Normal document setup, nothing special here
using (var fs = new FileStream(testFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
//Register our font
FontFactory.Register(ff, "Arial Unicode MS");
//Declare a size to use throughout the demo
var size = 20;
//Get a normal and a faux-bold version of the font
var f = FontFactory.GetFont("Arial Unicode MS", BaseFont.IDENTITY_H, BaseFont.EMBEDDED, size, iTextSharp.text.Font.NORMAL);
var fb = FontFactory.GetFont("Arial Unicode MS", BaseFont.IDENTITY_H, BaseFont.EMBEDDED, size, iTextSharp.text.Font.BOLD);
//Create a normal chunk
var cNormal = new Chunk("Hello ", f);
//Create a faux-bold chunk
var cFauxBold = new Chunk("Hello ", fb);
//Create an ultra heavy faux-bold
var cHeavy = new Chunk("Hello ", f);
cHeavy.Attributes = new Dictionary<string, object>();
cHeavy.Attributes[Chunk.TEXTRENDERMODE] = new Object[] { PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, size / 10f, null };
//Create a lite faux-bold
var cLite = new Chunk("Hello ", f);
cLite.Attributes = new Dictionary<string, object>();
cLite.Attributes[Chunk.TEXTRENDERMODE] = new Object[] { PdfContentByte.TEXT_RENDER_MODE_FILL_STROKE, size / 50f, null };
//Add to document
var p = new Paragraph();
p.Add(cNormal);
p.Add(cFauxBold);
p.Add(cHeavy);
p.Add(cLite);
doc.Add(p);
doc.Close();
}
}
}
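The stroke widths used in the sample above are simple arithmetic on the font size, which a few lines of plain Java can make concrete. This is only a sketch: defaultStroke() and customStroke() are hypothetical helper names, and the divisor of 30 is taken from the description above rather than read from the library.

```java
public class FauxBoldWidths {
    // Default faux-bold stroke width as described above: font size divided by 30.
    static float defaultStroke(float fontSize) {
        return fontSize / 30f;
    }

    // Stroke width for a custom divisor; a smaller divisor gives a heavier stroke.
    static float customStroke(float fontSize, float divisor) {
        if (divisor <= 0f) {
            throw new IllegalArgumentException("divisor must be positive");
        }
        return fontSize / divisor;
    }

    public static void main(String[] args) {
        float size = 20f; // same size as in the sample above
        System.out.println("default faux-bold: " + defaultStroke(size));
        System.out.println("ultra-heavy:       " + customStroke(size, 10f));
        System.out.println("ultra-lite:        " + customStroke(size, 50f));
    }
}
```

At size 20, the "lite" stroke is narrower than the default faux-bold stroke but still wider than no stroke at all, so it sits between normal and faux-bold text in apparent weight.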

iTextSharp - HTML to PDF: some Turkish characters are missing

When I try to generate a PDF from HTML, some Turkish characters such as ĞÜŞİÖÇ ğüşıöç are missing from the PDF. I see a space in place of each of these characters, but I want the characters themselves to be printed.
My code is:
public virtual void PrintPdf(string html, int id)
{
String htmlText = html.ToString();
Document document = new Document();
string filePath = HostingEnvironment.MapPath("~/Content/Pdf/");
PdfWriter.GetInstance(document, new FileStream(filePath + "\\pdf-"+id+".pdf", FileMode.Create));
document.Open();
iTextSharp.text.html.simpleparser.HTMLWorker hw =
new iTextSharp.text.html.simpleparser.HTMLWorker(document);
hw.Parse(new StringReader(htmlText));
document.Close();
}
How can I print all Turkish characters in the PDF?
I finally found a solution to this problem; with it you can print all Turkish characters.
String htmlText = html.ToString();
Document document = new Document();
string filePath = HostingEnvironment.MapPath("~/Content/Pdf/");
PdfWriter.GetInstance(document, new FileStream(filePath + "\\pdf-"+Name+".pdf", FileMode.Create));
document.Open();
iTextSharp.text.html.simpleparser.HTMLWorker hw = new iTextSharp.text.html.simpleparser.HTMLWorker(document);
FontFactory.Register(Path.Combine(_webHelper.MapPath("~/App_Data/Pdf/arial.ttf")), "Garamond"); // just give a path of arial.ttf
StyleSheet css = new StyleSheet();
css.LoadTagStyle("body", "face", "Garamond");
css.LoadTagStyle("body", "encoding", "Identity-H");
css.LoadTagStyle("body", "size", "12pt");
hw.SetStyleSheet(css);
hw.Parse(new StringReader(htmlText));
This code solved the same problem for me:
var pathUpload = Server.MapPath($"~/Test.pdf");
using (var fs = System.IO.File.Create(pathUpload))
{
using (var doc = new Document(PageSize.A4, 0f, 0f, 10f, 10f))
{
using (var writer = PdfWriter.GetInstance(doc, fs))
{
doc.Open();
BaseFont baseFont = BaseFont.CreateFont("C:\\Windows\\Fonts\\Arial.ttf", "windows-1254", true);
Font fontNormal = new Font(baseFont, 24, Font.NORMAL);
var p = new Paragraph("Test paragrapgh İÇşıĞğŞçöÖ", fontNormal);
doc.Add(p);
doc.Close();
}
}
}
I had the same problem; after a few days of research, this worked:
BaseFont myFont = BaseFont.CreateFont(@"C:\windows\fonts\arial.ttf", "windows-1254", BaseFont.EMBEDDED);
Font fontNormal = new Font(myFont);
Every time you need to write text containing special characters, do it this way:
doc.Add(new Paragraph("İıĞğŞşÜüÖöŞşÇç", fontNormal)); // a new paragraph
results.Add(new ListItem("İıĞğŞşÜüÖöŞşÇç", fontNormal)); // a new list item
Additionally, this using alias may be needed so that Font refers to the iTextSharp type:
using Font = iTextSharp.text.Font;
it works like a charm :)
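Why the "windows-1254" encoding helps can be checked independently of iTextSharp. The sketch below uses Java's standard charset API (Java to match the iText examples elsewhere on this page) to ask whether a code page has a slot for the six letters that are specific to Turkish. It assumes the runtime ships both charsets and that the PDF library's notion of "windows-1254" follows the same code page; canEncodeAll() is a hypothetical helper name.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class TurkishEncodingCheck {
    // Turkish-specific letters, written as escapes so the source stays ASCII:
    // G-breve, g-breve, S-cedilla, s-cedilla, dotted capital I, dotless i.
    static final String TURKISH_ONLY = "\u011E\u011F\u015E\u015F\u0130\u0131";

    // Returns true when every character of text has a mapping in the named charset.
    static boolean canEncodeAll(String charsetName, String text) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println("windows-1254: " + canEncodeAll("windows-1254", TURKISH_ONLY));
        System.out.println("windows-1252: " + canEncodeAll("windows-1252", TURKISH_ONLY));
    }
}
```

A single-byte code page can only represent the characters it has slots for, which is why the Western European code page is not enough here, while the Turkish code page or a Unicode encoding such as Identity-H is.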
I had a similar problem and I couldn't get the CP1254 encoding to work but I found another solution which worked for me.
In the css just add "font-family: Arial;" and put it on the outer div tag.
.className{
font-family: Arial;
}
<div class="className">
...
</div>
I found this answer here: How to generate a valid PDF/A file using iText and XMLWorker (HTML to PDF/A process)
It took a long time to find this solution but I found it searching for a font solution to display Turkish characters.

itextsharp's Arial ,courier new font always show italic in windows 8

When I run the code below on Windows 8, which sets the font to Arial with the regular style, the font style in the created PDF becomes italic. Why is Arial rendering in italics?
Courier New has the same issue; only these two fonts are affected.
The same code works fine on Windows 7, where the font style is regular.
string path = @"c:\test\";
iTextSharp.text.Rectangle r = new iTextSharp.text.Rectangle(400, 300);
Document doc = new Document(r);
PdfWriter.GetInstance(doc, new FileStream(path + "Blocks.pdf", FileMode.CreateNew));
doc.Open();
BaseFont baseFont = BaseFont.CreateFont("C:\\Windows\\Fonts\\ariali.ttf", BaseFont.CP1252, false);
//set font.style=0;
iTextSharp.text.Font newFont = new iTextSharp.text.Font(baseFont, 16f, 0, iTextSharp.text.BaseColor.BLACK);
Chunk c1 = new Chunk("A chunk represents an isolated string. ", newFont);
doc.Add(c1);
doc.Close();
Try this; it works for me.
Make sure that you use the right font file:
string fontType = "C:\\Windows\\Fonts\\ARIAL.ttf";
BaseFont baseFont = BaseFont.CreateFont(fontType, BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
Font font = new iTextSharp.text.Font(baseFont, fontSize, fontStyle);
Chunk chunk = new Chunk(str, font);
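On "the right font": on Windows, each Arial face lives in its own file, and by the usual naming convention ariali.ttf (the file loaded in the question's code) is the italic face, while arial.ttf is the regular one. The sketch below maps a style to the conventional file name. The style bits are assumed to mirror the iText/iTextSharp Font constants, fileFor() is a hypothetical helper, and the file names are the standard Windows ones, which may differ on a customised system.

```java
public class ArialFaceFiles {
    // Style bits, assumed to mirror the iText/iTextSharp Font constants.
    static final int NORMAL = 0;
    static final int BOLD = 1;
    static final int ITALIC = 2;

    // Maps a non-negative style to the conventional Windows file name of the Arial face.
    static String fileFor(int style) {
        boolean bold = (style & BOLD) != 0;
        boolean italic = (style & ITALIC) != 0;
        if (bold && italic) {
            return "arialbi.ttf";
        }
        if (bold) {
            return "arialbd.ttf";
        }
        if (italic) {
            return "ariali.ttf";
        }
        return "arial.ttf";
    }

    public static void main(String[] args) {
        System.out.println("regular:     " + fileFor(NORMAL));
        System.out.println("bold:        " + fileFor(BOLD));
        System.out.println("italic:      " + fileFor(ITALIC));
        System.out.println("bold-italic: " + fileFor(BOLD | ITALIC));
    }
}
```

Passing a style of 0 to the Font constructor does not undo the design of the file that was loaded: a BaseFont created from the italic file draws italic glyphs regardless.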