Trademark superscript in NSI - unicode

I want to add a Trade Mark superscript in the NSI script.
I tried using unicode character for trademark - U+2122 , but it doesn't display the trade mark character correctly when the installer exe is run?
I have following questions:
How do I add the trademark symbol in the NSI
I am using NSI compiler version 2.46 . Do I need to upgrade?
How to create (enable) unicode support in a NSI file?

Source files in NSIS 2 are just a bunch of bytes and these bytes are stored directly in the .exe. At run-time Windows will (on NT based systems) convert these bytes to Unicode strings by using the current codepage/system locale (Language for non-unicode programs). This means that you have to use the correct codepage/encoding in your text editor. If your installer supports multiple languages you need to use LangString and basically edit those strings with the correct encoding set in your editor. Using a .nsh for each language might help.
NSIS 3 uses Unicode internally in the compiler and if you are creating a Unicode installer (Unicode True) then you can use any Unicode code point. You can save the .nsi as UTF-8 or UTF-16 (with BOM) or you can use the ${U+hexnumber} syntax:
Unicode True
Section
MessageBox mb_ok "Hello World${U+2122}"
SectionEnd
NSIS 3 can also generate Ansi installers and it knows about the ${U+hexnumber} syntax but it cannot guarantee that the codepoint will display correctly on the end-users system, it is still limited to simple bytes and will convert from Unicode to Ansi using the current codepage from the system you are compiling on.

You can try to use the character 0x99, the Windows 1252 equivalent of U+2122.
With a western configured windows, you can enter it directly via the keyboard with Alt0153 (keep the Alt key pressed while entering the digits on the numeric keypad, and it is Alt and not Alt Gr).

Related

Looking for c++ equivalent for _wfindfirst for char16_t

I have filenames using char16_t characters:
char16_t Text[2560] = u"ThisIsTheFileName.txt";
char16_t const* Filename = Text;
How can I check if the file exists already? I know that I can do so for wchar_t using _wfindfirst(). But I need char16_t here.
Is there an equivalent function to _wfindfirst() for char16_t?
Background for this is that I need to work with Unicode characters and want my code working on Linux (32-bit) as well as on other platforms (16-bit).
findfirst() is the counterpart to _wfindfirst().
However, both findfirst() and _wfindfirst() are specific to Windows. findfirst() accepts ANSI (outdated legacy stuff). _wfindfirst() accepts UTF-16 in the form of wchar_t (which is not exactly the same thing as char16_t).
ANSI and UTF-16 are generally not used on Linux. findfirst()/_wfindfirst() are not included in the gcc compiler.
Linux uses UTF-8 for its Unicode format. You can use access() to check for file permission, or use opendir()/readdir()/closedir() as the equivalent to findfirst().
If you have a UTF-16 filename from Windows, you can convert the name to UTF-8, and use the UTF-8 name in Linux. See How to convert UTF-8 std::string to UTF-16 std::wstring?
Consider using std::filesystem in C++17 or higher.
Note that a Windows or Linux executable is 32-bit or 64-bit, that doesn't have anything to do with the character set. Some very old systems are 16-bit, you probably don't come across them.

Save file with ANSI encoding in VS Code

I have a text file that needs to be in ANSI mode. It specifies only that: ANSI. Notepad++ has an option to convert to ANSI and that does the trick.
In VS Code I don't find this encoding option. So I read up on it and it looks like ANSI doesn't really exist and should actually be called Windows-1252.
However, there's a difference between Notepad++'s ANSI encoding and VS Code's Windows-1252. It picks different codepoints for characters such as an accented uppercase e (É), as is evident from the document.
When I let VS Code guess the encoding of the document converted to ANSI by Notepad++, however, it still guesses Windows-1252.
So the questions are:
Is there something like pure ANSI?
How can VS Code convert to it?
Check out the upcoming VSCode 1.48 (July 2020) browser support.
It also solves issue 33720, where you had to force encoding for the full project before with, for instance:
"files.encoding": "utf8",
"files.autoGuessEncoding": false
Now you can set the encoding file by file:
Text file encoding support
All of the text file encodings of the desktop version of VSCode are now supported for web as well.
As you can see, the encoding list includes ISO 8859-1, which is the closest norm of what "ANSI encoding" could represent.
The Windows codepage 1252 was created based on ISO 8859-1 but is not completely equal.
It differs from the IANA's ISO-8859-1 by using displayable characters rather than control characters in the 80 to 9F (hex) range
From Wikipedia:
Historically, the phrase "ANSI Code Page" was used in Windows to refer to non-DOS encodings; the intention was that most of these would be ANSI standards such as ISO-8859-1.
Even though Windows-1252 was the first and by far most popular code page named so in Microsoft Windows parlance, the code page has never been an ANSI standard.
Microsoft explains, "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community."

MFC multibyte application shows junk "????" on pasting Chinese characters, but typing works

Our MFC application uses Multi Byte Character Set (MBCS). OS is Windows 7.
We could type in Chinese Simplified characters by virtual keyboard, but copy pasting Chinese characters from Google Translate to an edit box in the application shows junk characters "????"
Is this a known issue with MBCS applications? Is there a workaround?
When copying and pasting into a multi-byte app the Unicode characters will be converted into the local code page. If they can't be converted you'll get ?. You really should be compiling and distributing your app in Unicode otherwise you'll be fighting these sorts of issues all the time.
If you can't re-compile in Unicode try catching the 'Paste' action and handle the clipboard yourself. Use GetClipboardData and read the value for CF_UNICODETEXT, which will be the valid text. You'll then need to do your own conversion to the correct multi-byte format.

Using unicode / utf-8 in programmers editors

There are a lot of programmers editors that claim to support unicode / utf-8. I've tried a number of them (UltraEdit, jedit, emedit) but none of them tell you how to actually enter unicode characters into a file. Some of them tell you how to change the default file encoding to utf-8 or how to select a font that has good support for utf-8, but not how to enter utf-8 into a file using their editor.
The Go language (and some others) support utf-8 and I like the idea of using the actual utf-8 symbols for variables instead of variables with names like omega. I haven't found a programmers editor yet that actually allows you to do this, though.
The only editor / word processor that I've found that lets you how to enter unicode is Microsoft Word. Type the unicode and Alt+X and Word converts it. To get the Greek letter omega type "03c9" followed by Alt+X. UltraEdit will let you copy utf-8 from a web page into it, but their docs don't say how to actually enter utf-8 in a file, and their tech. support people don't know either.
This should be simple, but seems to be completely undocumented. Is there some key combination convention the lets you enter unicode into these editors that supposedly support unicode the way that Ctrl-F is widely used for search?
Thanks.
The standard programmer’s editor vim(1) supports limited Unicode input even if your operating system should be too broken to do so (are there any such, still?).
Just enter ^VuXXXX, where XXXX represents exactly four hex digits.
That will allow you to enter the ~6% of Unicode allocated to the Basic Multilingual Plane. The rest are forbidden to you.
This may be fixed in a newer release.
Otherwise, just use your mouse.
A few techniques I use if an editor is lacking:
Use the Windows charmap.exe utility to select characters and paste into a document.
Install an input method editor (IME) to write in a particular language.
Windows ALT keycodes.
Better to set your keyboard to generate Unicode characters across all Windows applications than to rely on a single application's custom input feature IMO.
Use the EnableHexNumpad feature and you can type any character in the Basic Multilingual Plane using Alt+numbad-plus,hexcode. (May not be of much use on a laptop without a numpad though.)
Or if there are particular characters you want to type a lot, find a keyboard layout that allows you to type them directly. For example eurokb might cover it, or you can make your own with MSKLC.
Old question, but you can type a lot of unicode in GNU Emacs or Vim
GNU Emacs: M-x set-input-method RET tex (or C-x RET C-\ tex) will let you type \omega to generate ω
Vim: Vim digraphs can generate unicode; C-k w * in insert mode gives you ω.
deceze hit the nail on the head. (S)he just didn't elaborate. bobince gave a bit more.
And I'm hazarding a guess that you're a developer or tester working on L14N or I18N. I'm also guessing you need to do more than just a few characters here or there, or you'd be satisfied with pasting from another app. So, I'll share some advice. (note: here, "you" refers to the next person to look here. I'm sure the original poster doesn't care anymore by now. :-))
If you're on Windows 10, install an appropriate keyboard driver that lets you input the characters you want into any application. I'm sure Linux has support for the same sort of thing.
E.g. I'm teaching myself Hindi (हिंदी), so I installed Windows' Hindi (Devanangari) support. I typed "Hindi", in Hindi using that support, then I switched back to US English to do the rest of this post. If all you need are accented characters from Western European languages, you can install the INTL English support and type directly in español or français or whatever.
Don't look at entering Unicode characters as entering some sort of special data amidst your English text. It's just someone else's language. Use their keyboard. Type their language.
I'm writing a flashcard app to help my learning. I'm using the Hindi keyboard support to type characters into Word, WordPad, Excel, and the Visual Studio editor. And that Hindi keyboard support works exactly the same way in all of those apps, as I'd expect it to work in just about any text editor that supports Unicode. And as you saw above, it also works in a simple text edit control in Chrome. No copy and paste. No remembering special codes. It's as ubiquitous as ctrl-F.
It looks like the unicode support in programmers editors (except for some Microsoft products) is mostly read-only. They can open a file with unicode and display the characters, but typing unicode into a file is a different story. If you want to enter unicode in a programmers editor you can copy it from somewhere else (a web page or Microsoft Word or Notepad) and paste it into the editor, but the editors make typing unicode difficult or impossible.
UltraEdit tech support referred me to this web page which explains a lot. Unfortunately none of the solutions worked with UltraEdit.
Microsoft Word and Notepad support unicode entry. Type the unicode value followed by Alt+X and it converts the hexadecimal and displays it. You can then copy and paste it into UltraEdit or one of the other programmers editors. As others have mentioned unicode support depends on support within the operating system as well as the editor.
What got me interested in using unicode in source code files is Mark Summerfield's book Programming in Go. He includes an example .go file that uses unicode. It would be great to use unicode Greek characters for variable names instead of variables named "omega" or "theta".
Using unicode in source code is a bad idea, however. Support for unicode in programmers editors is lousy, and developers would have to save or convert their source code files to utf-8 instead of ASCII. Developer's tools are just not ready to write code in unicode no matter how neat the idea sounds.

Displaying Unicode characters above U+FFFF on Windows

the application I'm developing with EVC++ 4 runs on Windows CE 5 and should support unicode (AFAIK wchar_t uses UTF-16 on windows, so I'm using that), so I want to be able to test it with "more exotic" characters. Especially with characters that use 4 Byte in UTF-16 and not just 2. Therefore I'm trying to display such characters in a texteditor (atm on my desktop PC with Windows XP, not on the embedded device).
But I haven't managed it to do so yet. As an example I've chosen this character.
Like mentioned here "MPH 2B Damase" should support this character. So I downloaded the font and put it into Windows\Fonts. I created a textfile using a hexeditor (just to be sure) with following content:
FFFE D802 DC00
When I open it with notepad (which should be unicode-capable, right?) and use the downloaded font it doesn't display 1 char, as intended, but this 2:
˘Ü
What am I doing wrong? :)
Thanks!
hrniels
Edit:
Flipping the BOM, as suggested, doesn't work. Notepad (and all other editors I tried, too) displays two squares in this case. Interesting is that if I copy the two squares here (with firefox) I see the right character:
I've also tried it with Komodo Edit with the same result.
Using UTF-8 doesn't help notepad either.
What happens if you put the byte order mark the other way around?
FEFF D802 DC00
(At the moment the byte sequence is being interpreted as the two characters U+02D8 U+00DC, so hopefully flipping the BOM will cause the bytes to be read in the intended order)
Probably you forgot to read the _wfopen() documentation. There they specify the encoding parameter. BTW, I assumed you are already using Unicode (wchars).
I would recommend you to use UTF-8 in files with or without BOM but forcing your fopen to use UTF-8 flag. It looks _wfopen("newfile.txt", "r, ccs=UTF-8"); will work with UTF-8 with or without BOM and also with UTF-16. Do not make the mistake of using the ccs=Unicode, it is a common thing to have UTF-8 files without BOM.
You should really read a little bit about Unicode before trying to work. This about this as a very good investment - it will save you time if you understand how Unicode works.
Here is a start http://blog.i18n.ro/newbie-guide-to-unicode/ and do not forget to read the links from the end of the article.
If you really need a simple text editor that allows you to play with Unicode encodings, use Notepad++ and forget about Notepad.
Your text editor might not like UTF-16. It probably assumes ANSI or UTF-8.
Try typing in the UTF-8 equivalent instead:
0xF0 0x90 0xA0 0x80
This won't help your testing, but will make sure your font isn't at fault. A text editor that does support UTF-16 is Komodo Edit.