I am trying to build a desktop application for Hindi PDFs in C#, but the Unicode encoding is not well supported. Any idea how to fix this?
using System.IO;
using iTextSharp.text.pdf;
// Load Arial Unicode MS from the system fonts folder and embed it with Identity-H encoding.
string ARIALUNI_TTF = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");
BaseFont bf = BaseFont.CreateFont(ARIALUNI_TTF, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
iTextSharp.text.Font font = new iTextSharp.text.Font(bf, 8, iTextSharp.text.Font.NORMAL);
Will IDENTITY_H support Hindi encoding?
Hindi is not supported yet. A font like mangal.ttf, which supports the Devanagari script, will render the individual glyphs in iTextSharp but not the ligatures. Work is being done on the Indic front, not only for Hindi support but also for Telugu, Gujarati and others.
You basically require support for Asian characters. A similar thread can be found here (stackoverflow). The implementation revolves around BaseFont (use the createFont method), which specifies the font and the appropriate encoding. You can find the example on the official iText site here. Note that the example is in Java, but the same implementation is available in .NET as well.
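To make the glyphs-versus-ligatures point concrete, here is a minimal Python sketch (not iTextSharp code) that lists the code points stored in the Devanagari word for "Hindi". The helper name needs_shaping is made up for illustration, and its rule is only a rough heuristic: it flags text containing combining marks (such as the virama), which is the kind of text a layout engine has to reorder and join. It does not perform any shaping itself.

```python
import unicodedata

def needs_shaping(text: str) -> bool:
    """Rough heuristic (an assumption, not a complete rule):
    combining marks (categories Mn/Mc) imply the renderer must shape the text."""
    return any(unicodedata.category(ch) in ("Mn", "Mc") for ch in text)

# The word "Hindi" in Devanagari, written with escapes so the code points are explicit.
hindi = "\u0939\u093f\u0928\u094d\u0926\u0940"

for ch in hindi:
    # The text stores individual code points; conjunct forms and reordering
    # are produced later by the renderer, not by the encoding.
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")

print(needs_shaping(hindi))
```

An encoding such as Identity-H only gets these code points to glyphs in the font; joining them into ligatures is a separate step that the PDF library has to implement.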
I've created a Kannada eBook using Sigil. The Kannada fonts work properly on Samsung, HTC, and Sony devices (tested on some models), but not on the iPad and iPhone.
Can anybody please suggest a solution?
I believe the font you mention is a so-called "ASCII" font, rather than a Unicode-encoded font. In that case, correct display is a matter of luck and is not guaranteed. Even if it works on today's Sony devices, it might stop working tomorrow. According to the spec, EPUBs must be Unicode.
Yes, I am well aware of the issues involved in re-coding legacy content in ASCII encodings for Kannada and other Indic languages into Unicode. However, from the standpoint of future-proofing your content and guaranteeing operability across platforms, it is an investment that will prove worthwhile.
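A quick way to tell which situation you are in is to look at the code points actually stored in the XHTML. The following Python sketch is only an illustration: the function name is made up, and the "legacy" sample string is an invented placeholder, not real output from any particular Kannada font package.

```python
KANNADA_BLOCK = range(0x0C80, 0x0D00)  # Unicode Kannada block, U+0C80..U+0CFF

def uses_kannada_unicode(text: str) -> bool:
    """Return True if the text contains at least one code point from the Kannada block."""
    return any(ord(ch) in KANNADA_BLOCK for ch in text)

# Real Unicode text: the word "Kannada" written with Kannada code points.
unicode_sample = "\u0c95\u0ca8\u0ccd\u0ca8\u0ca1"

# Invented stand-in for font-encoded text: Latin-range characters that only
# look like Kannada when one specific font is applied.
legacy_sample = "PA\u00a3A\u00dfq"

print(uses_kannada_unicode(unicode_sample))
print(uses_kannada_unicode(legacy_sample))
```

If the book's text fails this kind of check, the content is font-encoded and would need converting before it can be expected to display on every reader.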
It would also be helpful if you could create a one-page, one-line book with this problem and post the XHTML and the content.opf file.
UPDATE
It appears that version 5.0 of Nudi includes Unicode-conformant fonts, see this article, and also includes a converter.
This mainly happens because of a missing font. Try downloading the Sigil font onto the device that cannot display the Kannada fonts.
Alternatively, if you typed the text in MS Word using the Sigil font, then when saving, select Save Options and tick the "Embed Fonts" box. This way your document can be read on any device without the reader needing that particular font.
I am developing a website in the Georgian language. The Georgian alphabet has its own Unicode range, but there are also special fonts which have Georgian glyphs in place of English characters, a bit like the "Symbol" and "Dingbats" fonts.
For example the string "saqarTvelo" will be rendered as "საქართველო" with these fonts. So now I have two options and don't know what to do:
1. Using Georgian Unicode for my website, but the problem is that all fonts are created for English Unicode and don't work with Georgian Unicode.
2. Using Georgian fonts with English Unicode. But I don't know how search engines will react.
Please tell me what to do, I am stuck!
The short answer is that with the approach you describe in option 2, search engines will see the word “საქართველო” in your text as “saqarTvelo”, so normal searches will fail.
The question seems to refer to two different ways of using Georgian letters on web pages:
1. Using Unicode encoding, so that characters will be rendered using a Unicode-encoded font (which is what most fonts are, but most fonts don’t contain Georgian letters).
2. Using a nonstandard, “private” encoding, usually one that maps 256 different code positions (8-bit combinations) to whatever characters are needed for some purpose. This presumes that the text is rendered using a font encoded the same way.
Method 2 can be characterized as a wrong approach, but it has been used on the web since the early days, when CSS was not available and one had to resort to <font face=...> to set the font. It really does not work unless the user’s computer has the specific, “privately” encoded font (or some font encoded exactly the same way). Since search engines are font-agnostic, they only see the 8-bit codes and try to interpret them in the encoding declared or implied for the page, not in the “private” encoding (which cannot be declared since it has no published definition and no standard name, or any name for that matter).
Method 1 has the problem that for it to work, the user’s computer needs to have some Unicode-encoded font that supports the characters used. Nowadays, this can be solved reasonably well using a downloadable font (web font) via @font-face. Fonts that support Georgian letters include some useful free fonts like the DejaVu fonts, GNU FreeFont, and Quivira. For more info on this approach, see my Guide to using special characters in HTML.
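The difference between the two methods can be sketched in a few lines of Python. The mapping below is read off the single example from the question ("saqarTvelo" for "საქართველო") and covers only those nine letters; a real legacy font defines many more, and the function name is made up for illustration.

```python
# Letter mapping taken only from the example in the question;
# any other letters of the legacy font are unknown here.
LEGACY_TO_UNICODE = {
    "s": "ს", "a": "ა", "q": "ქ", "r": "რ", "T": "თ",
    "v": "ვ", "e": "ე", "l": "ლ", "o": "ო",
}

def legacy_to_unicode(text: str) -> str:
    """Replace legacy Latin stand-ins with Georgian letters; leave other characters untouched."""
    return "".join(LEGACY_TO_UNICODE.get(ch, ch) for ch in text)

stored = "saqarTvelo"

# With method 2 the page source holds only Latin letters,
# which is all a font-agnostic search engine can index.
print(stored.isascii())

# With method 1 the page source holds the Georgian letters themselves.
print(legacy_to_unicode(stored))
```

The font only changes how the stored letters are drawn; it never changes what is stored, and the stored text is what gets indexed.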
Using method 1, search engines will see the Georgian letters correctly, provided that the document’s encoding (normally UTF-8) has been properly declared or can be inferred by the search engine.
I'm using PDFTable from http://www.vanxuan.net/tool/pdftable/, which is based on the FPDF class. I managed to export an HTML table to PDF using PDFTable. However, I'm facing one issue: all non-English characters are displayed as gibberish. It doesn't seem to support Unicode. The languages I'm trying to display are Arabic and Russian.
I could, theoretically, create a class similar to PDFTable, inherited from FPDF, and develop it from scratch to add Unicode support. But it's a lot of work. Has anyone done something like that and could perhaps share it? Thank you!
For Unicode support, the best way is to use tFPDF from http://www.fpdf.org/en/script/script92.php. It's a fork of FPDF made specifically to support Unicode. The class is based on the latest FPDF version, 1.7.
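A likely cause of the gibberish (an assumption, since the question does not show the output) is that UTF-8 bytes are being interpreted one byte at a time as a single-byte encoding. This Python sketch reproduces that effect; the function name is made up, and latin-1 is chosen only because it can decode any byte, not because it is known to be what PDFTable uses.

```python
def misread_utf8_as_single_byte(text: str, codec: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode the bytes as if each byte were one character."""
    return text.encode("utf-8").decode(codec)

russian = "\u041f\u0440\u0438\u0432\u0435\u0442"  # a six-letter Russian greeting

garbled = misread_utf8_as_single_byte(russian)

# Each Cyrillic letter takes two bytes in UTF-8, so the misread text is twice as long.
print(len(russian), len(garbled))

# The damage is reversible as long as no bytes were lost along the way.
restored = garbled.encode("latin-1").decode("utf-8")
print(restored == russian)
```

A Unicode-aware library avoids this by treating the input as UTF-8 throughout and embedding a font that contains the needed glyphs.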
I'm using Unicode symbols on a web page as graphic components.
I need to be able to rely on how these Unicode characters are rendered.
Here is a simplified example of what I'm trying to build.
You can see that the Unicode characters render differently on different computers.
Chrome under OSX:
Chrome under Windows:
I only need to support modern browsers, so @font-face and Google Fonts are allowed.
Updated
I know the problem is that the chosen font lacks the special characters, and that the solution is to find a font that has them and can be used with @font-face or Google Fonts. The real problem is how to find a font with these characteristics.
The most likely answer is that your selected font has no glyphs defined for those Unicode code points (and from perusing the font, that seems to be the case), so you will need to switch to a font that does.
When a font has no defined glyph for a Unicode code point, it's up to the platform to figure out how to handle it. Windows used to simply show a square box for anything that wasn't defined, but since Windows Vista (or maybe Windows 7), it displays a glyph from the system default font, if one is available. What you are most likely seeing for your Unicode characters are the versions from the system default fonts, which, of course, are not the same on Windows and Mac.
You should try to find a font that a) contains all the characters you need, and b) can be legally used as a downloadable font via @font-face.
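Checking requirement a) comes down to comparing the characters you use against the set of code points the font covers. Here is a minimal Python sketch of that comparison. The function name and the sample repertoire (printable ASCII only) are assumptions for illustration; in practice you would fill the set from the font's character map, for instance with a font-inspection library.

```python
def missing_code_points(text: str, font_code_points: set[int]) -> list[str]:
    """Return the distinct characters of `text`, in order, that the font does not cover."""
    return [ch for ch in dict.fromkeys(text) if ord(ch) not in font_code_points]

# Hypothetical font repertoire: printable ASCII only.
ascii_only_font = set(range(0x20, 0x7F))

# Characters a page might use as graphic components.
wanted = "A#\u22d5\u2605"

for ch in missing_code_points(wanted, ascii_only_font):
    # Anything listed here would be drawn from a platform-chosen fallback font.
    print(f"U+{ord(ch):04X}")
```

Any character this reports is one whose appearance you do not control, because the platform picks the fallback font.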
You are now using the Fredoka One font, but it contains a very limited character repertoire. The first four characters that you are trying to show are not there (not even “⋕”, since it is quite distinct from the ASCII character “#” despite the visual similarity). Since the font-family rule next specifies fantasy, browsers will try whatever fancy font they have been set to use as a generic fantasy font, and it probably hasn’t got them either; fantasy fonts tend to have a limited repertoire. Browsers then go their own ways, possibly using various fonts.
Those four characters are rare in fonts, and the fonts containing them have no similarity in style to Fredoka One. So you may need to reconsider the approach.
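That “⋕” and “#” are different characters is easy to confirm by looking at their code points. A small Python sketch, with a made-up helper name:

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and Unicode character name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"

number_sign = "#"
lookalike = "\u22d5"  # the character used in the page, written as an escape

print(describe(number_sign))
print(describe(lookalike))

# A font that covers ASCII therefore says nothing about the second character.
print(number_sign.isascii(), lookalike.isascii())
```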
Some notes on using special characters in HTML: http://www.cs.tut.fi/~jkorpela/html/characters.html
I have a file with Chinese text that I want to use in my Xcode project (I'm planning to load it through a database, as it is a lot of text). The problem is that I don't know how to add the font to my project so that the text is viewable on an iPhone.
Thanks :)
I currently live in China and deal with this all the time. Usually the problem is not the font but the way the characters are encoded. Most Unix variants (and so most OSes) use UTF-8; Windows uses UTF-16 internally. The nice thing about UTF-8 is that it is backward-compatible with ASCII. Open your text in TextEdit or Firefox. In Firefox you can tell the browser to try different encodings, then save the result to a file. If the file is in the wrong encoding, TextEdit on the Mac can convert between UTF-8 and UTF-16. Once you have the string in UTF-8, you can display it in your text field.
When displaying text in a text field, make sure to pass a UTF-8 string, not an ASCII string.
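If you would rather convert the file in a script than in an editor, the conversion is short. This is a Python sketch rather than iPhone code, the function name is made up, and it assumes the input really is UTF-16 with a byte order mark.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 bytes (using the byte order mark) and re-encode them as UTF-8."""
    return data.decode("utf-16").encode("utf-8")

chinese = "\u4e2d\u6587"  # two Chinese characters

utf16_bytes = chinese.encode("utf-16")  # Python prepends a byte order mark here
utf8_bytes = utf16_to_utf8(utf16_bytes)

# Each of these characters takes three bytes in UTF-8.
print(len(utf8_bytes))

# Backward compatibility: pure ASCII text has identical bytes in ASCII and UTF-8.
print("iPhone".encode("ascii") == "iPhone".encode("utf-8"))
```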
If you are interested in the details of UTF-8, just say so and I will expand on the UTF-8 design.
The iPhone already has Chinese fonts installed by default.
I've had some success using the FontLabel library. It allows you to use arbitrary .ttf fonts in your app and it's Apache-licensed:
http://github.com/zynga/FontLabel
For the majority of cases this has worked perfectly for me.