How to use Unicode characters on a web page with reliable rendering - unicode

I'm using Unicode symbols on a web page as graphic components.
I need to be able to rely on the way these Unicode characters are rendered.
Here is a simplified example of what I'm trying to build.
You can see that the Unicode characters render differently on different computers.
Chrome under OSX:
Chrome under Windows:
I only need to support modern browsers, so @font-face and Google Fonts are allowed.
Updated
I know the problem is that the chosen font does not have the special characters, and that finding a font that has them and is compatible with @font-face or Google Fonts would be the solution. But this is the real problem: how to find a font with these characteristics.

The most likely answer is that your selected font has no glyphs defined for those unicode code-points (and from perusing the font, that seems to be the case) and you will need to switch to a font that has glyphs defined for those code-points.
When a font has no defined glyph for a Unicode code-point, it's up to the platform to figure out how to handle it. Windows used to simply show a square box for anything that wasn't defined, but since Windows Vista (or maybe Windows 7), it will now display a glyph from the system default font, if that's available. What you are most likely seeing for your unicode characters are the versions from the system default fonts - which, of course, are not the same on Windows and Mac.

You should try to find a font that a) contains all the characters you need, and b) can be legally used as a downloadable font via @font-face.
You are now using the Fredoka One font, but it contains a very limited character repertoire. The first four characters that you are trying to show are not there (not even “⋕”, since it is quite distinct from the ASCII character “#” despite visual similarity). Since the font-family rule next specifies fantasy, browsers will try whatever fancy font they have been set to use as a generic fantasy font, and it probably hasn’t got them either; fantasy fonts tend to have a limited repertoire. Browsers then go their own ways, possibly using various fonts.
Those four characters are rare in fonts, and the fonts containing them have no similarity with Fredoka One in style. So you may need to reconsider the approach.
Some notes on using special characters in HTML: http://www.cs.tut.fi/~jkorpela/html/characters.html
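As a rough illustration of the downloadable-font approach the answers above describe, the sketch below declares a web font and applies it only to the elements that hold the symbols. The family name "SymbolFont", the file paths, and the .symbol class are placeholders, not something taken from the question. Substitute a font that actually contains the needed glyphs and that you are licensed to serve.

```css
/* Hypothetical web font that is assumed to contain the symbol glyphs. */
@font-face {
  font-family: "SymbolFont";                          /* placeholder name */
  src: url("fonts/symbol-font.woff2") format("woff2"), /* placeholder paths */
       url("fonts/symbol-font.woff") format("woff");
}

/* Apply the font only to the elements that carry the symbols, so the
   rest of the page keeps its own typeface. */
.symbol {
  font-family: "SymbolFont", sans-serif;
}
```

Because every visitor downloads the same font file, the symbols no longer depend on whichever fallback font each platform happens to pick.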

Related

List of well supported unicode characters

Is there a list somewhere of which unicode characters are well supported? I.e. if I used these characters in an application or on a web page, there's a good chance that the user will see what I see, and not a question mark or a square.
http://apps.timwhitlock.info/emoji/tables/unicode This is a good start. This shows what a number of unicode characters look like on several common platforms. But this list is limited to emoji. I'm more interested in things like arrows and mathematical symbols.
Of course, I can always see which characters look good on my computer, phone, web browser, favorite font, etc. But I want to know what will work well for most other people, too.
If you need to know if a Unicode character is assigned you should check the official Unicode Chart.
Wikipedia has a good list of fonts which cover the most characters. See Unicode Font # List of Unicode fonts.
Arial is preinstalled on almost every machine and covers a lot of characters. Not to forget Noto from Google - a font collection which covers almost every character you will ever come in touch with.
For a fast lookup of Unicode characters I recommend Fileformat.info.
“But I want to know what will work well for most other people, too.”
I would go with Arial, or Times New Roman, or make your decision platform dependent.

Kannada font (Nudi 01 e b.ttf) in an eBook turns into gibberish English text on iPad and iPhone

I've created a Kannada eBook using Sigil. The Kannada fonts work properly on Samsung, HTC, and Sony devices (tested on some models), but not on iPad and iPhone.
Can anybody please suggest a solution for this?
I believe the font you mention is a so-called "ASCII" font, rather than a Unicode-encoded font. In that case, correct display is a matter of luck and is not guaranteed. Even if it works on today's Sony devices, it might stop working tomorrow. According to the spec, EPUBs must be Unicode.
Yes, I am well aware of the issues involved in re-coding legacy content in ASCII encodings for Kannada and other Indic languages into Unicode. However, from the standpoint of future-proofing your content and guaranteeing operability across platforms, it is an investment that will prove worthwhile.
It would also be helpful if you could create a one-page, one-line book with this problem and post the XHTML and the content.opf file.
UPDATE
It appears that version 5.0 of Nudi includes Unicode-conformant fonts, see this article, and also includes a converter.
This mainly happens because the font is missing. Try downloading the font you used in Sigil onto the device that cannot display the Kannada text.
Alternatively, if you typed the text in MS Word using that font, then while saving select the save options and tick the "Embed Fonts" box. That way the document can be read on any device without the reader having that particular font.

How will search engines react to different unicode?

I am developing a website in the Georgian language. The Georgian alphabet has its own Unicode range, but there are also special fonts which have Georgian glyphs in place of English characters, a bit like the "Symbol" and "Dingbats" fonts.
For example the string "saqarTvelo" will be rendered as "საქართველო" with these fonts. So now I have two options and don't know what to do:
1. Using Georgian Unicode for my website, but the problem is that all fonts are created for English Unicode, and don't work with Georgian Unicode.
2. Using Georgian fonts with English Unicode. But I don't know how search engines will react.
Please tell me what to do, I am stuck!
The short answer is that with the approach you describe in option 2 (Georgian fonts over Latin character codes), search engines will see the word “საქართველო” in your text as “saqarTvelo”, so normal searches will fail.
The question seems to refer to two different ways of using Georgian letters on web pages:
1. Using Unicode encoding, so that characters will be rendered using a Unicode-encoded font (which is what most fonts are, but most fonts don’t contain Georgian letters).
2. Using a nonstandard, “private” encoding, usually one that maps 256 different code positions (8-bit combinations) to whatever characters are needed for some purposes. This presumes that the text is rendered using a font encoded the same way.
Method 2 can be characterized as a wrong approach, but it has been used on the web since the early days (even when CSS was not available and one had to resort to <font face=...> for setting font), and especially in the early days. It really does not work unless the user’s computer has the specific, “privately” encoded font (or some font encoded exactly the same way). Since search engines are font-agnostic, they only see the 8-bit codes and try to interpret them in the encoding declared or implied for the page, not in the “private” encoding (which cannot be declared since it has no published definition and no standard name, or any name for that matter).
Method 1 has the problem that for it to work, the user’s computer needs to have some (Unicode-encoded) font that supports the characters used. Nowadays, this can be reasonably well solved using a downloadable font (web font) via @font-face. Fonts that support Georgian letters include some useful free fonts like DejaVu fonts, GNU FreeFont fonts, and Quivira. For more info on this approach, see my Guide to using special characters in HTML.
Using method 1, search engines will see the Georgian letters correctly, provided that the document’s encoding (normally UTF-8) has been properly declared or can be inferred by the search engine.
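A minimal sketch of method 1 with a downloadable font, assuming a WOFF build of DejaVu Sans is hosted alongside the site. The file path and the family name "GeorgianWeb" are hypothetical. The unicode-range descriptor restricts the font to the main Georgian block (U+10A0 to U+10FF), so browsers that support the descriptor should use it only for Georgian letters and leave other text to the next font in the list.

```css
/* Hypothetical self-hosted copy of a free font with Georgian glyphs. */
@font-face {
  font-family: "GeorgianWeb";                      /* placeholder name */
  src: url("fonts/DejaVuSans.woff") format("woff"); /* placeholder path */
  unicode-range: U+10A0-10FF;                      /* main Georgian block */
}

/* Georgian letters come from the web font; everything else falls
   through to Arial or the generic sans-serif font. */
body {
  font-family: "GeorgianWeb", Arial, sans-serif;
}
```

The text itself stays in real Georgian code points, which is what lets search engines index it.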

Any standard for Unicode font support expected of all browsers?

Is there a standard governing Unicode font support expected of all browsers?
The latest version of Unicode contains a repertoire of more than 110,000 characters covering 100 scripts. I don't expect the browsers to support all of them, but there should be minimum support for some characters such as letters from the Latin script, common punctuation, and symbols of type math, currency, and other.
I am currently having problems displaying U+060B AFGHANI SIGN (؋) and U+202F NARROW NO-BREAK SPACE in the Android browser. I wonder if there is a list of universally recognized Unicode characters so that developers can use them confidently without having to worry about browser display issues.
There is no standard on Unicode support in browsers. Besides, the ability to display a character mostly depends on fonts, though browsers differ in their abilities in scanning through fonts. Normally what you can do is to specify a suitable font-family list of fonts that each support all the characters you need. For generalities on this, see my Guide to using special characters in HTML.
On Android, the problem is that there is a very limited set of fonts. If you need any characters beyond what is supported by them, you need to use a downloadable font, via @font-face.
The currency symbol “؋” U+060B AFGHANI SIGN is present in about a dozen fonts, but the only free font among them (if we don’t count the bitmap font GNU Unifont) appears to be Scheherazade.
For U+202F NARROW NO-BREAK SPACE, font support is wider. But in general, it is often better to use other methods than such characters. Many fonts contain this character as almost as wide as a normal space, and its description in the Unicode standard as regards to its width is vague: “a narrow form of a no-break space, typically the width of a thin space or a mid space”. “Thin space” is described as “a fifth of an em (or sometimes a sixth)” in the Unicode standard, and in reality its width varies. And “mid space” is really an undefined concept.
For example, if the text is in a language that uses spaces as thousands separators, you could in principle write a number like 100 000 with U+202F NARROW NO-BREAK SPACE as the separator (100&#x202F;000 in HTML), but it’s better to write, say,
<span class="gr">100 000</span>
with CSS code like .gr { word-spacing: -0.15em }.
AFAIK, all browsers support @font-face for loading webfonts and can support any character within those fonts. As such, you should be able to display any character in any browser if you make sure you provide access to a webfont with support for those characters.
To avoid using giant fonts just to support a few special characters, you can create your own fonts with tools like the Icomoon App.
I used the Icomoon App to create the Emoji emoticon font as well as for creating custom icon fonts on a per project basis.
For more info on the use or creation of icon fonts (or other webfonts), see Create webfont with Unicode Supplementary Multilingual Plane symbols

UTF-8 special characters don't show up

I'm trying to figure out why characters like this : 👉 show up like empty boxes. They are unicode characters though and charset is utf-8.
Can it be a font problem which doesn't have a glyph for that? Any ideas?
Details: HTML page, Firefox 16.0.1, Windows 7. I don't see this glyph on this post's page either.
Thanks
The character you have there is the Unicode character 'WHITE RIGHT POINTING BACKHAND INDEX' (U+1F449). On that page, you can find a list of known fonts supporting the character behind the link Fonts that support U+1F449.
Font
LastResort
Segoe UI Emoji
Segoe UI Symbol
Symbola
None of those fonts is used here on stackoverflow.com, so you'll also see an empty box.
If this occurs on your own website, and you'd like to fix it, then you'd need to supply a supporting font along with the webapp via CSS @font-face, or in this particular case perhaps better, look for a CSS-based icon library such as Font Awesome. The <i class="fa fa-hand-o-right"> icon comes very close to this character.
The character U+1F449 was added to Unicode in version 6 in 2010, and it generally takes about ten years from the adoption of a character into Unicode before it is widely supported in fonts.
The few fonts that contain it now include Symbola and Segoe UI Symbol. If you have either of them installed, you’ll probably see it; otherwise not. Segoe UI Symbol is shipped with Windows 8 and apparently with (at least some variants of) Windows 7, though the Windows 7 version may be limited (an update is available from Microsoft). Symbola is a free font, so you could in principle use it as a downloadable font (via @font-face), but its file size is rather large.
Web browsers are supposed to use fallback fonts if the fonts specified for an element do not contain a glyph for some character in the content. Firefox generally implements this well; IE does not, especially in older versions. So if you use the character on a web page, it is best to wrap it in an element of its own (usually span is used for the purpose) and set the following on it in CSS:
font-family: Segoe UI Symbol, Symbola;
But this will as such (without @font-face) work only for people using computers that contain one of the fonts.
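One way to combine the installed-font list with a downloadable fallback is sketched below. It assumes a WOFF copy of Symbola is served from the site; the path, the family name "HandSymbol", and the .hand class are hypothetical. The local() entries let a browser reuse an installed copy before downloading anything.

```css
@font-face {
  font-family: "HandSymbol";                     /* placeholder name */
  /* Prefer an installed font; download only if neither is present. */
  src: local("Segoe UI Symbol"),
       local("Symbola"),
       url("fonts/Symbola.woff") format("woff"); /* placeholder path */
}

/* Class for the span that wraps the U+1F449 character. */
.hand {
  font-family: "HandSymbol", sans-serif;
}
```

Given the file size noted above, a subset of the font containing only the characters actually used may be preferable, provided the font's license permits it.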
Missing font characters will usually be substituted with other fonts, and UTF-8 should be able to display all unicode characters. I suspect that the encoding of your file (how it is saved by your editor), does not match the declaration in the meta tags of your HTML page.
You can check your page with this W3-checker, it can possibly give you hints about the problem of your page.
EDIT:
You are right, it's not an encoding problem. The character's code point is so high that the "normal" fonts do not support it. Maybe you can use one of these instead: ☛ ☞. Otherwise you would have to use a web font, and fonts with full Unicode support can be quite large.