special characters in droidText - Android - special-characters

I need to use special characters like "ç", "á"... but nothing works.
Part of the code:
Paragraph title = new Paragraph();
com.lowagie.text.Font fontTitle= FontFactory.getFont("Arial", BaseFont.IDENTITY_H, BaseFont.EMBEDDED, 15, Font.BOLD);
title.add(new Paragraph("ç", fontTtile));
document.add(title);
Using IDENTITY_H and EMBEDDED the pdf doesn't work, but if I just use ("Arial", 15, Font.BOLD) it works (special characters don't).
How can I fix it?
Thanks.

Related

How to generate PDF in Hebrew? currently the PDF is generated empty

I'm using iTextSharp 5.5.13 and when i try to generate the PDF with Hebrew it comes out empty.
this is my code: I'm i doing something wrong?
public byte[] GenerateIvhunPdf(FinalIvhunSolution ivhun)
{
byte[] pdfBytes;
using (var mem = new MemoryStream())
{
Document document = new Document(PageSize.A4);
PdfWriter writer = PdfWriter.GetInstance(document, mem);
writer.PageEvent = new MyHeaderNFooter();
document.Open();
var font = new
Font(BaseFont.CreateFont("C:\\Downloads\\fonts\\Rubik-Light.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED), 14);
Paragraph p = new Paragraph("פסקת פתיחה")
{
Alignment = Element.ALIGN_RIGHT
};
PdfPTable table = new PdfPTable(2)
{
RunDirection = PdfWriter.RUN_DIRECTION_RTL
};
PdfPCell cell = new PdfPCell(new Phrase("מזהה", font));
cell.BackgroundColor = BaseColor.BLACK;
table.AddCell(cell);
document.Add(p);
document.Add(table);
document.Close();
pdfBytes = mem.ToArray();
}
return pdfBytes;
}
The PDF comes out blank
I changed a few details of your code, and now I get this:
My changes:
Embedding the font
As I don't have Rubik installed on my system, I have to embed the font into the PDF to have a chance to see anything. Thus, I replaced BaseFont.NOT_EMBEDDED by BaseFont.EMBEDDED when creating the var font:
var font = new Font(BaseFont.CreateFont("Rubik-Light.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED), 14);
Making the Paragraph kind of work
You create the Paragraph p without specifying a font. Thus, a default font with default encoding is used. The default encoding is WinAnsiEncoding which is Latin1-like, so no Hebrew characters can be represented. I added your Rubik font instance to the Paragraph p creation:
Paragraph p = new Paragraph("פסקת פתיחה", font)
{
Alignment = Element.ALIGN_RIGHT
};
Et voilà, the writing appears.
iText developers often have communicated that in iText 5.x and earlier right-to-left scripts are only supported properly in certain contexts, e.g. in tables, but not in others like paragraphs immediately added to the document. As your Paragraph p is added immediately to the Document document, its letters appear in the wrong order in the output.
Making the PdfPTable work
You defined the PdfPTable table to have two columns (new PdfPTable(2)) but then you added only one cell. Thus, table contains not even a single complete row. iText, therefore, draws nothing when it is added to the document.
I changed the definition of table to have a single column only:
PdfPTable table = new PdfPTable(1)
{
RunDirection = PdfWriter.RUN_DIRECTION_RTL
};
Furthermore, I commented out the line setting the cell background to black because usually it is difficult to read black on black:
PdfPCell cell = new PdfPCell(new Phrase("מזהה", font));
//cell.BackgroundColor = BaseColor.BLACK;
table.AddCell(cell);
And again the writing appears.
Properly downloading the font
Another possible obstacle is that when downloading the font from the URL you gave — https://fonts.google.com/selection?selection.family=Rubik — one can see in the customization tab of the selection drawer that by default only Latin characters are included in the download, in particular not Hebrew ones:
I tested with a font file I downloaded with all language options enabled:

How to print the chinese and japanese characters using I text xml worker [duplicate]

I'm facing a problem when trying to export a Vietnamese document as PDF using iText.
I put Vietnamese words in .xml file like this
<td fontfamily="Helvetica" fontstyle="0" fontsize="9" align="0" colspan="48" lineoccupied="1">T\u1ED5 ch\u1EE9c tham gia</td>
then having java to get the phrases from xml file and convert it into Unicode using this method:
public String convertToUnicode(String s) {
int i = 0, len = s.length();
char c;
StringBuffer sb = new StringBuffer(len);
try {
while (i < len) {
c = s.charAt(i++);
if (c == '\\') {
if (i < len) {
c = s.charAt(i++);
if (c == 'u') {
if (Character.digit(s.charAt(i), 16) != -1
&& Character.digit(s.charAt(i + 1), 16) != -1
&& Character.digit(s.charAt(i + 2), 16) != -1
&& Character.digit(s.charAt(i + 3), 16) != -1) {
if (s.substring(i).length() >= 4) {
c = (char) Integer.parseInt(s.substring(i, i + 4), 16);
i += 4;
} else {
sb.append('\\');
}
} else {
sb.append('\\');
}
} // add other cases here as desired...
}
} // fall through: \ escapes itself, quotes any character but u
sb.append(c);
}
} catch (Exception e) {
System.out.println("Error Generate PDF :: " + e.getStackTrace().toString());
return s;
}
return sb.toString();
}
After that, export String to PDF - encoding UTF-8.
But the program failed to display Vietnamese character '\u1ED5' and '\u1EE9'
The output becomes "T chc tham gia"
Could you please show me how to fix this issue?
Thanks :)
There are 3 XML Worker examples involving Asian languages on the official iText web site. They parse an XHTML file containing Chinese characters, but it should be easy to adapt them to Vietnamese examples.
You can find the HTML files were going to parse here:
hero.html
hero2.html
Both files contain the following text:
長空 (Broken Sword), 秦王殘劍 (Flying Snow), 飛雪 (Moon), 如月 (the King), and 秦王 (Sky).
In the first case, a font is defined using CSS:
<span style="font-size:12.0pt; font-family:MS Mincho">長空</span>
In the second case, no specific font is defined:
<body><p>長空 (Broken Sword), 秦王殘劍 (Flying Snow), 飛雪 (Moon), 如月 (the King), and 秦王 (Sky).</p></body>
These files contain UTF-8 characters, so we're going to parse them like this:
XMLWorkerHelper.getInstance().parseXHtml(writer, document,
new FileInputStream(HTML), Charset.forName("UTF-8"));
The first thing you need, is a font that supports Vietnamese characters. That's something iText can't help you with. In your HTML file, you've defined Helvetica, but that's a standard Type1 font that is never embedded when using iText and that doesn't know how to draw Vietnamese glyphs. That's never going to work.
The first example D07_ParseHtmlAsian will automatically search for a font named MS Mincho. If it finds that font (for instance because you have msmincho.ttc in your Windows fonts directory), the font will show up in your PDF. See hero.pdf. If it doesn't find a font with that name, then the glyphs won't be visible, because you didn't provide any font program for those glyphs.
The second example D07bis_ParseHtmlAsian offers a workaround in case you don't have MS Mincho anywhere. In that case, you have to use an XMLWorkerFontProvider and register a font that can be used instead of MS Mincho. For instance: we use a font stored in the file cfmingeb.ttf and assign the alias MS Mincho:
XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
fontProvider.register("resources/fonts/cfmingeb.ttf", "MS Mincho");
The resulting file asian.pdf is slightly different from what we expect, but now we can at least see the Chinese glyphs.
In the third example, the HTML file doesn't tell us anything about the font that needs to be used. We'll define the font using CSS like this:
CSSResolver cssResolver = new StyleAttrCSSResolver();
CssFile cssFile = XMLWorkerHelper.getCSS(new ByteArrayInputStream("body {font-family:tsc fming s tt}".getBytes()));
cssResolver.addCss(cssFile);
Now, all the text in the body will use the font TSC FMing S TT (stored in the file cfmingeb.ttf). You can see the difference in the resulting PDF asian2.pdf.
I think you need an encoding as UTF-8 for your HTML and use &#xUNUM; for hex or &#NUM; for regular code to embed your special characters. Not sure where but somewhere in your program since it is not display shown, but your final HTML should be:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML LEVEL 1//EN">
<HTML>
<HEAD>
<TITLE>Your Page Title</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
</HEAD>
<BODY>
<!-- YOUR CONTENT HERE -->
<td fontfamily="Helvetica" fontstyle="0" fontsize="9"
align="0" colspan="48"
lineoccupied="1">Tổ chức tham gia</td>
</BODY>
</HTML>
You can cut and paste the above into an HTML file and view the result. More reading pleasure is here Unicode and HTML

iText support for HIndi and Arabic

Our project needs dynamic PDF generation in 6 languages which consists of Hindi and Arabic. iText works brilliantly for other languages except these two. Can someone let me know if current version of iText(5.5.5) have ligature implementation for Hindi and Arabic?
about Arabic language:
I understand you face one of two problems:
the Arabic words don't appear.
the Arabic words appears like :
the solution is:
first: update your itext 5.5.8 to itext 7 implementation:
https://search.maven.org/artifact/com.itextpdf/itext7-core/7.2.2/pom
then edit your code:
String font = "your Arabic font";
PdfFontFactory.register(font);
FontProgram fontProgram = FontProgramFactory.createFont(font, true);
PdfFont f = PdfFontFactory.createFont(fontProgram, PdfEncodings.IDENTITY_H);
LanguageProcessor languageProcessor = new ArabicLigaturizer();
com.itextpdf.kernel.pdf.PdfDocument tempPdfDoc = new
com.itextpdf.kernel.pdf.PdfDocument(new PdfReader(pdfFile.getPath()), TempWriter);
com.itextpdf.layout.Document TempDoc = new
com.itextpdf.layout.Document(tempPdfDoc);
com.itextpdf.layout.element.Paragraph paragraph0 = new
com.itextpdf.layout.element.Paragraph(languageProcessor.process("الاستماره الالكترونية"))
.setFont(f).setBaseDirection(BaseDirection.RIGHT_TO_LEFT)
.setFontSize(15);
//and look how i useded setBaseDirection & and don't use TextAlignment ,it will work without it

How to move text written in Type 3 Font from one pdf to other pdf?

I have a pdf which include text written in Type 3 Font.
I want to get some text from it and write it into other pdf in exactly same shape.
I am using itext. Please give me a tip.
edit: I attached my code.
DocumentFont f = renderInfo.getFont();
String str = renderInfo.getText();
x = renderInfo.getBaseline().getStartPoint().get(Vector.I1);
In this code, I want to write str into x value position.
In Type 3 Font, is it work?
You can copy parts of one page to a new one using code like this:
InputStream resourceStream = getClass().getResourceAsStream("from.pdf");
PdfReader reader = new PdfReader(new FileOutputStream("from.pdf"));
Rectangle pagesize = reader.getPageSizeWithRotation(1);
Document document = new Document(pagesize);
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("areaOfFrom.pdf"));
document.open();
PdfContentByte content = writer.getDirectContent();
PdfImportedPage page = writer.getImportedPage(reader, 1);
content.saveState();
content.rectangle(0, 350, 360, 475);
content.clip();
content.newPath();
content.addTemplate(page, 0, 0);
content.restoreState();
document.close();
reader.close();
This turns your
into
Unfortunately, though, that hidden content is merely... hidden... but it is still there. You can especially mark the lines with that hidden text and try to copy&paste them.
If you want to completely remove that hidden text (or start out by merely copying the desired text), you have to inspect the content of the imported page and filter it. I'm afraid iText does not yet explicitly support something like that. It can be done using the iText lowlevel API but it is quite some work.

Zend_Pdf UTF-8 characters?

I have problems outputting UTF-8 characters into pdf file with Zend_Pdf. Here is my code:
// Load Zend_Pdf class
include 'Zend/Pdf.php';
// Create new PDF
$pdf = new Zend_Pdf();
// Set font
$page->setFont(Zend_Pdf_Font::fontWithPath('fonts/times.ttf'), 12);
// Draw text
$page->drawText('Janko Hraško', 200, 643, 'UTF-8');
The font I§m loading supports UTF-8 characters. But I am getting this error"
Notice: iconv() [function.iconv]: Detected an illegal character in input string in D:\data\o\Zend\Pdf\Resource\Font\Type0.php on line 241
With the Helvetica font your code works!
Solved:
$page->drawText('Janko Hraško', 200, 643, 'Windows-1250');
For some reason, Windows-1250 encoding works but UTF-8 doesn't. Weird but I will use Windows-1250 then.