Missing latin characters on Mono GDI+ - unicode

I have an issue that I have so far been unable to solve. I have a program written in Mono with GDI+ forms and controls. Something seems to be wrong with the following Unicode characters: ĉ, ċ, č (both upper and lowercase), U+0108 to U+010D. The problem occurs in all controls (TextBox, Button, Label, etc.) as well as when using the DrawString() function. If I write 'ĉĉĉĉĉĉ' in a TextBox, only the first character is displayed properly. If the string ends with any character other than ĉ, ċ or č, it is drawn correctly; otherwise the characters at the end are not displayed.
This does not seem to be a font issue: I tested with Arial and several other fonts.
Tested on:
Ubuntu 16.04.3 LTS + Mono 5.4.1.7
Raspbian GNU/Linux 9 (stretch) + Mono 5.10.0.160
Any idea how to solve this issue?

There is a bug in the libgdiplus library, in text-cairo.c.
I fixed it and submitted a PR.

Related

How to fix PowerShell 7 fonts not showing correctly | oh-my-posh

I've already installed Windows Terminal and set it up with "oh my posh", and everything is working as intended.
However, whenever I launch PowerShell 7 on its own (without the terminal), the font is messy, as you can see in the image below.
I have already tried changing the font to the same one I used in the terminal's .json, but some parts still do not render correctly, and I cannot use it that way with VSCode.
The problem is that the Windows Console doesn't fully support UTF-8:
Windows Console was created way back in the early days of Windows,
back before Unicode itself existed! Back then, a decision was made to
represent each text character as a fixed-length 16-bit value (UCS-2).
Thus, the Console’s text buffer contains 2-byte wchar_t values per
grid cell, x columns by y rows in size.
...
One problem, for example, is that because UCS-2 is a fixed-width
16-bit encoding, it is unable to represent all Unicode codepoints.
This means the Windows Console has only "partial" support for Unicode: a character works as long as it can be represented in UCS-2, but code points outside that range (above U+FFFF) are not supported.
When you see boxes, the character being used lies outside the UCS-2 range. You can also tell this because you get 2 boxes (i.e. 2 x 16-bit values). That is why you can't have happy faces 😀 in your Windows Console (which makes me sad ☹️).
For it to work everywhere, you will have to modify your oh-my-posh themes to use a different character, one that can be represented in UCS-2.
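As a rough way to check whether a candidate prompt symbol fits in a single 16-bit cell, the sketch below counts the UTF-16 code units a character needs. It is written in Python purely for illustration; the helper names `utf16_units` and `fits_in_ucs2` are hypothetical and are not part of oh-my-posh or PowerShell.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode one character in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


def fits_in_ucs2(ch: str) -> bool:
    """True if ch is a single code point at or below U+FFFF."""
    return len(ch) == 1 and ord(ch) <= 0xFFFF


if __name__ == "__main__":
    # U+2514 is the box-drawing character used in the theme examples;
    # U+1F600 is the grinning-face emoji.
    for symbol in ["\u2514", "\U0001F600"]:
        print(f"U+{ord(symbol):04X}", utf16_units(symbol), fits_in_ucs2(symbol))
```

A symbol that needs two code units is the kind that would show up as two boxes in the description above.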
For Version 2 of Oh My Posh, you make the font changes by editing the $ThemeSettings variable. Follow the instructions on the project's GitHub page for configuring Theme Settings, e.g.:
$ThemeSettings.GitSymbols.BranchSymbol = [char]::ConvertFromUtf32(0x2514)
For Version 3+ of Oh My Posh, you have to edit the JSON configuration file to make the changes, e.g.:
...
{
"type": "git",
"style": "powerline",
"powerline_symbol": "\u2514",
....

Unicode characters aren't combined properly

I am working with some Devanagari text data that I want to display in the browser. Unfortunately, there's one combination of nonspacing combining characters that doesn't get rendered as a properly combined character.
The problem occurs every time a base character is combined with the Devanagari Stress Sign Udatta ॑ (U+0951) and the Devanagari Sign Visarga ः (U+0903).
An example for this would be र॑ः, which is र (U+0930) + ॑ + ः and should be rendered as one character. But the stress sign and the other one don't seem to like each other (as you can see above!).
It's no problem to combine the base char with each of the other two signs alone, btw: र॑ / रः
I already tried to use several fonts which should be able to render Devanagari characters (some Noto fonts, Siddhanta, GentiumPlus) and tested it with different browsers, but the problem seems to be something else.
Does anyone have an idea? Is this not a valid combination of symbols?
EDIT: I just tried swapping the two marks to see what would happen. It renders as रः॑, so U+0951 and U+0903 don't seem to have the same function, as the stress sign gets rendered on top of the other mark.
It looks like I don't understand Unicode well enough yet.
This is NOT a solution for your problem, but might be useful information:
I am working with some Devanagari text data I want to display in the
browser.
Like you, I couldn't get this to work in any browser despite trying several fonts, including Arial Unicode MS:
The browser was simply rendering the text Devanagari Test: रः॑ from within the <body> of a JSP. The stress sign is clearly appearing above the Sign Visarga instead of the base character.
Is this not a valid combination of symbols?
It is a valid combination. I don't know Devanagari, so I don't know whether it is semantically "valid", but it is trivial to generate exactly the character you want from a Java application:
System.out.println("Devanagari test: \u0930\u0903\u0951");
This is the output from executing the println() call, showing the stress sign above the base character:
The screenshot above is from NetBeans 8.2 on Windows 10, but the rendering also worked fine using the latest releases of Eclipse and IntelliJ IDEA. The constraints are:
The three characters must be specified in that order in println() for the rendering to work.
The Sign Visarga and the Stress Sign Udatta must be presented in their Unicode form. Pasting their glyph representations into the source code won't work, although this can be done for the base character.
An appropriate font must be used for the display. I used Arial Unicode MS for the screen shot above, but other fonts such as Serif, SansSerif and Monospaced also worked.
Does anyone have an idea?
Unfortunately not, although it is clear that:
The grapheme you want to render exists, and is valid.
Although it won't render in a browser, it can be written to the console by a Java application.
The problem seems to be that all browsers apply the diacritic (Stress Sign Udatta) to the immediately preceding character rather than the base character.
See Why are some combining diacritics shifted to the right in some programs? for more information on this.
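One way to see why the two marks behave differently is to look at their Unicode properties. The sketch below uses Python's standard `unicodedata` module; `describe` is a hypothetical helper. It only reports what the Unicode database says and does not show what any particular browser does with that information.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return (code point, name, general category, canonical combining class)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


# Base letter RA, stress sign udatta, sign visarga (the order from the question).
SEQUENCE = "\u0930\u0951\u0903"

if __name__ == "__main__":
    for ch in SEQUENCE:
        print(describe(ch))
```

Visarga is a spacing mark (category Mc) that takes up its own space after the base, while udatta is a nonspacing mark (category Mn). That difference is consistent with the stress sign landing on the visarga when it is typed after it.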

What character is this:?

EDIT
While I was writing the question, the character I am asking about was displayed correctly, but after posting it no longer shows up. As it does not appear here, please look it up on the original site.
EDIT2
I looked for Unicode chars associated with "alien", and found no matching ones. Here is how they are compared side by side:
I found that some texts in my database contain a character like . I am not sure how it would be rendered with different fonts and environments, so here is an image of how I see it:
I tried to identify it in different ways. For example, when I paste it into Sublime Text, it is automatically shown as the control character <0x85>. When I tried to identify it with different Unicode detectors (http://www.babelstone.co.uk/Unicode/whatisit.html, https://unicode-table.com/en/, https://unicode-search.net/unicode-namesearch.pl), their conclusions were pretty much the same:
Unicode code point character U+0085
UTF-8 encoding c2 85 hexadecimal
194 133 decimal
0302 0205 octal
Unicode character name <control>
Unicode 1.0 character name (deprecated) NEXT LINE (NEL)
https://unicode-search.net/unicode-namesearch.pl also included this information:
HTML encoding … … hexadecimal
… … decimal
which gave me a vague hint as to how … could have become this character. But this is not the main problem here.
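The link between the ellipsis and U+0085 can be illustrated with a short Python sketch. It assumes, and this is only a guess about the data's history, that the text was originally stored in Windows-1252 and later read as Latin-1 or as raw code points.

```python
RAW = b"\x85"

# In Windows-1252, byte 0x85 is the horizontal ellipsis.
as_cp1252 = RAW.decode("cp1252")

# In Latin-1 (ISO 8859-1), the same byte is the C1 control U+0085 (NEL).
as_latin1 = RAW.decode("latin-1")

# Re-encoding the control character as UTF-8 gives the bytes the detectors reported.
utf8_bytes = as_latin1.encode("utf-8")

if __name__ == "__main__":
    print(f"cp1252:  U+{ord(as_cp1252):04X}")
    print(f"latin-1: U+{ord(as_latin1):04X}")
    print("utf-8:  ", utf8_bytes.hex())
```

If that assumption holds, re-decoding the affected bytes as Windows-1252 would recover the ellipsis.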
My question is: how is it possible that a control character is displayed like this, and what is the actual glyph used to represent it?
I tried sketching it at http://shapecatcher.com/ to identify it, but without success. I did not find such a glyph in any Unicode table.
The alien symbol is not a Unicode character, but it is in Microsoft's Webdings font, with character code 0x85. Running Start > Run > charmap and then selecting Webdings from the Font drop-down list opens this window:
If I click that alien character in the leftmost column, the message Character Code : 0x85 is shown at the bottom of the window.
I can even copy that character from the Character Map and paste it into Microsoft Wordpad:
The Webdings symbols were included in Unicode Release 7: Pictographic symbols (including many emoji), geometric symbols, arrows, and ornaments originating from the Wingdings and Webdings sets. Therefore you would expect the alien symbol to also be in Unicode. However, I don't think the version of Webdings that was used included that alien symbol, since Windows 10 also has a ttf file for Webdings (version 5.01), and it does not include the alien symbol either:
So presumably what originally caught your attention was some text being rendered with an older version of the Webdings font which included that alien symbol.
The glyph is 👽 U+1F47D EXTRATERRESTRIAL ALIEN. I don't know why your system misrenders a control character.

Freetype unicode on Windows

I'm using FreeType 2.5.3 in a portable OpenGL application.
My issue is that I can't get Unicode characters on my Windows machine, while I get them correctly on my other systems (Lubuntu, OS X, Android).
I'm using the well-known arialuni.ttf (23 MB), so I'm pretty sure it contains everything. In fact, I had this working in my previous Windows installation (Win7), then re-installed Win7 from another source, and now Unicode is not working right.
Specifically, when I draw a string, only the Latin characters are rendered, while the other Unicode characters are skipped. I dug deeper and found that the character codes in the wstring are not what they should be. For example, I'm using some Greek letters in the string, such as γ, which I know should have a code point of 947.
My engine just iterates over the wstring characters and uses each code point to look up, in another vector, the texture coordinates needed to draw the glyph.
The problem is that on my Windows 7 machine, the wstring does not give me 947 for a γ, but instead gives me 179. In addition, the character Ά comes back as 2 characters with code 206 (??) instead of one with code 902.
It simply iterates over a wstring, like this:
for(size_t c=0,sz=wtext.size();c<sz;c++) {
uint32_t ch = wtext[c]; // code point
...
}
This only happens on my newly installed Win7; it worked before on another Win7 system, and it works on all my Linux machines. Now it's broken on this machine, and also on my XP virtual machine.
I don't use any wide formatting functions for this, just a literal like:
wstring wtext = L"blΆh";
In addition, I can see my glyphs being rendered correctly in my OpenGL texture, so it's not a font issue either. My font generator uses the Greek range of ~900-950 code points to collect the glyphs.
I add the code points per language with this:
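FT_UInt gindex;                                        // glyph index; 0 means the charmap is exhausted
FT_ULong charcode = FT_Get_First_Char(face, &gindex);  // first character code in the charmap
while (gindex != 0) {                                  // a while loop also processes the first character
...
charcode = FT_Get_Next_Char(face, charcode, &gindex);
}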
I'm not sure why, but I fixed it by saving the source file as UTF-8 with BOM rather than plain UTF-8 (which was my default).
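A plausible explanation, assuming the Windows compiler was MSVC (the question does not say): without a BOM, MSVC reads the source file in the system's ANSI code page, so each UTF-8 byte of a non-ASCII character in a wide literal becomes its own wchar_t. The Python sketch below simulates that misreading, with Windows-1252 as the assumed code page; `misread_as` is a hypothetical helper.

```python
def misread_as(text: str, wrong_encoding: str = "cp1252") -> list:
    """Encode text as UTF-8, then decode the bytes with the wrong code page.

    Returns the resulting code points, one per UTF-8 byte.
    """
    return [ord(c) for c in text.encode("utf-8").decode(wrong_encoding)]


GAMMA = "\u03b3"  # Greek small letter gamma

if __name__ == "__main__":
    print(ord(GAMMA))         # the intended code point
    print(misread_as(GAMMA))  # what a byte-per-character reading produces
```

The values 206 and 179 match the numbers reported in the question. With a BOM, the compiler can identify the file as UTF-8; MSVC also has a /utf-8 option for the same purpose.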

How do I add a new Arabic vowel-sign in the PUA area of a font?

I am using Ubuntu 14.04, with FontForge compiled from the Git repo as of 31
July.
I'm trying to add a vowel-sign to an Arabic font, Graph, by Future Soft Egypt:
http://openfontlibrary.org/en/font/graph
I have added glyphs where the Unicode code-point already exists (eg peh,
U+067E), and that works fine. I am now trying to add a vowel sign where no
Unicode code-point exists - it is a "damma with tail", used by some writers in
Swahili to mean "o".
I decided to put it in the PUA at U+E909, and copied the font's damma (U+064F)
and added a tail:
http://kevindonnelly.org.uk/swahili/images/dammas.png
I generated the font, and set up the keyboard to emit that character.
The glyph comes up OK, but there are two problems, as can be seen here:
http://kevindonnelly.org.uk/swahili/images/output.png
showing at top "bubu", using the original damma, and at bottom "bobo", using
the new damma-with-tail.
(1) The damma-with-tail is too far to the left, even though the anchor points
in FF have not been moved.
(2) Worse, the damma-with-tail means that only the isolated versions of the
consonant glyphs get used - in the second line the two bs should be joined, as
in the first line.
I'm not sure whether this is a function of using the PUA, or whether it's due
to my missing some step I need to take in FF (eg the Encoding -> Add Encoding
Slots that needs to be done for the consonants), but if anyone could shed some
light on how to fix the two problems, I'd be very grateful.
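
One thing that may help narrow this down: text shaping engines generally decide joining behaviour and mark attachment from a character's Unicode properties rather than from the shape of its glyph. The Python sketch below compares the standard damma with the chosen PUA code point; `props` is a hypothetical helper. It does not confirm that this is the cause of either problem, only that the two code points are classified very differently.

```python
import unicodedata


def props(cp: int) -> tuple:
    """Return (general category, canonical combining class) for a code point."""
    ch = chr(cp)
    return (unicodedata.category(ch), unicodedata.combining(ch))


if __name__ == "__main__":
    for cp in (0x064F, 0xE909):  # ARABIC DAMMA, and the PUA slot from the question
        print(f"U+{cp:04X}", props(cp))
```

The damma is a nonspacing mark (Mn), whereas a PUA code point is "private use" (Co) with no mark properties, so a shaper has no built-in reason to treat it as a vowel sign sitting between two joining letters.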