Is there a way to dynamically change the encoding for terminal input so that "a" is ऄ and "b" is ब (in unicode) and so on...? - unicode

I would like to write in arbitrary fonts in the terminal, such as Chinese, Devanagari, Mayan Hieroglyphs (a font that is not even part of Unicode yet), etc.
I would like to press "a" to get ऄ, etc., basically I say "enter encoding DEVANAGARI" and now "a" is ऄ, etc. Or I say "enter encoding MAYAN" and "a" is some "private unicode space" glyph, etc. How can I do this? Can I set it dynamically somehow, maybe using Swift (for Mac) if I had a Mac app running in the background?
For example, I would imagine like this:
$ change-script DEVANAGARI
$ a
# replaced with
$ ऄ
# etc
How can I do this in the Terminal app, or in any app for that matter?
This way I can use the ASCII keyboard to write in arbitrary fonts even if they aren't in unicode.

Answering the title question: yes. This can be accomplished by changing the active keyboard layout. Apple supplies a number of them, including many QWERTY-style layouts that roughly map the English/Latin letters to their nearest equivalents in the target script. As noted in a comment, there are tools available to create custom layouts if the supplied ones are not sufficient.
As to the question of doing it programmatically, that's trickier. The accepted answer to this (old) SO question suggests that you can programmatically switch keyboards (presumably only installed ones).
But from a close read of your question and follow-up, it seems like you basically want to create or modify the active keyboard dynamically (?). I'd be surprised if anything like that is supported, but I'm also not sure why it would be necessary if you have the ability to switch between installed layouts programmatically.
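To make the idea concrete, here is a minimal Python sketch of what such a layout amounts to: a per-script lookup table from ASCII letters to code points. The table names, the `remap` helper, and the private-use code points chosen for "MAYAN" are all hypothetical, and only the two pairs from the question are filled in. It transforms a string after the fact; it does not change what the Terminal receives from the keyboard.

```python
# Hypothetical per-script tables; only the pairs from the question are filled in.
SCRIPT_TABLES = {
    "DEVANAGARI": {"a": "\u0904", "b": "\u092c"},
    # Private-use code points are an assumption here; they only render
    # as something meaningful with a font that defines glyphs for them.
    "MAYAN": {"a": "\ue000", "b": "\ue001"},
}


def remap(text: str, script: str) -> str:
    """Replace each mapped ASCII letter; leave every other character unchanged."""
    table = str.maketrans(SCRIPT_TABLES[script])
    return text.translate(table)


print(remap("ab c", "DEVANAGARI"))
```

A real keyboard layout does the same lookup, but at the input-system level, which is why it works in every application rather than only in one program.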

Related

Why Julia returns "\uf8ff" when I use  (Apple logo) unicode?

I thought Julia supports raw unicode input, such as:
julia> test = "π£¢∞§"
"π£¢∞§"
julia> 😘 = 1 ;
julia> print(😘 )
1
However, it seems julia does not support  (Apple logo).
julia>  = 123
ERROR: syntax: invalid character ""
julia> test = ""
"\uf8ff"
I wonder what's the underlying reason for that, and whether there is a way I can use  character in Julia?
I believe this link more properly explains the case of the unicode character that you see as apple's logo.
The problem is that the unicode value used is one of several that is set aside for private use. That means that each operating system, or application, or implementation is free to use those unicode characters for anything they want. It just so happens that Apple has chosen to use unicode character U+F8FF (decimal value 63743, or on the web as either &#63743; or &#xF8FF;) as the Apple Logo. But some Windows fonts put in a Windows logo. And some other fonts put in a Klingon Mummification glyph. Or elven script. Or anything they want. And if it isn't defined in your local font, you'll just see a square.
My opinion is that Julia simply doesn't use this special value for anything. This also explains why your "π£¢∞§" characters work nicely - they are proper unicode characters, more largely supported by different platforms.
As a side note, I too see a simple square instead of the Apple logo here.
Edit
Here is a list of unicode characters supported by Julia.
To expand on Alex's answer...
Apple's logo () isn't an official Unicode symbol. I think there are very few commercial logos and symbols in the main Unicode tables.
However, Unicode provides some 'anything goes' areas (called PUAs - private use areas) that companies and individuals can fill with their own symbols, so that their users can access certain special glyphs. The main PUA is U+E000 to U+F8FF. Depending on which font you're using, you'll find all kinds of stuff assigned to these codes. On a Mac, I can usually get the Apple logo at "\uf8ff", with the right font selected, but not the Ubuntu symbol or the Windows logo, unless I choose another font. (There's also a fallback mechanism, whereby if you request a code point that the current font doesn't have, the OS will find a suitable substitute in another font and use that.)
In Julia, you can only use certain Unicode characters for variable names. Julia wouldn't allow anything from the private use area anyway, unless some fonts were distributed to every computer and everyone agreed on who had which Unicode point. (Mathematica makes extensive use of PUA symbols in their notebooks, because they can and do install their own fonts, and can then access various glyphs from the PUA in the notebook with guaranteed results.)
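The difference between a private-use code point and an ordinary symbol shows up in the Unicode general category. The sketch below uses Python's standard `unicodedata` module, not Julia, and the `is_private_use` helper is a name made up for illustration. It only shows how the characters are classified; Julia's actual identifier rules are described in its own documentation.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """True if the character's general category is Co (private use)."""
    return unicodedata.category(ch) == "Co"


# Apple logo code point, red apple emoji, pi, infinity sign
for ch in ["\uf8ff", "\U0001F34E", "\u03c0", "\u221e"]:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_private_use(ch))
```

Only U+F8FF is reported as private use; the emoji, the Greek letter, and the math sign all have assigned categories.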
You are allowed to use emoji characters as variable names, so you could try the Emoji apple, rather than the Apple apple.

Using unicode / utf-8 in programmers editors

There are a lot of programmers editors that claim to support unicode / utf-8. I've tried a number of them (UltraEdit, jedit, emedit) but none of them tell you how to actually enter unicode characters into a file. Some of them tell you how to change the default file encoding to utf-8 or how to select a font that has good support for utf-8, but not how to enter utf-8 into a file using their editor.
The Go language (and some others) support utf-8 and I like the idea of using the actual utf-8 symbols for variables instead of variables with names like omega. I haven't found a programmers editor yet that actually allows you to do this, though.
The only editor / word processor that I've found that tells you how to enter unicode is Microsoft Word. Type the hexadecimal code point followed by Alt+X and Word converts it. To get the Greek letter omega, type "03c9" followed by Alt+X. UltraEdit will let you copy utf-8 from a web page into it, but their docs don't say how to actually enter utf-8 in a file, and their tech support people don't know either.
This should be simple, but seems to be completely undocumented. Is there some key combination convention that lets you enter unicode into these editors that supposedly support unicode, the way that Ctrl-F is widely used for search?
Thanks.
The standard programmer’s editor vim(1) supports limited Unicode input even if your operating system should be too broken to do so (are there any such, still?).
Just enter ^VuXXXX, where XXXX represents exactly four hex digits.
That form only reaches the Basic Multilingual Plane, which is about 6% of the Unicode code space.
For code points beyond the BMP, use ^VUXXXXXXXX instead (capital U, followed by up to eight hex digits).
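The hex digits typed after ^Vu (or before Alt+X in Word, as mentioned in the question) are simply the code point written in hexadecimal. A small Python sketch shows the conversion and where the "~6%" figure comes from; `from_hex` is a helper name made up for this example.

```python
def from_hex(code: str) -> str:
    """Turn a hexadecimal code point such as '03c9' into the character."""
    return chr(int(code, 16))


print(from_hex("03c9"))  # Greek small letter omega

# The BMP is the first 0x10000 of the 0x110000 possible code points.
bmp_share = 0x10000 / 0x110000
print(f"{bmp_share:.1%}")
```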
Otherwise, just use your mouse.
A few techniques I use if an editor is lacking:
Use the Windows charmap.exe utility to select characters and paste into a document.
Install an input method editor (IME) to write in a particular language.
Windows ALT keycodes.
Better to set your keyboard to generate Unicode characters across all Windows applications than to rely on a single application's custom input feature IMO.
Use the EnableHexNumpad feature and you can type any character in the Basic Multilingual Plane using Alt+numpad-plus,hexcode. (May not be of much use on a laptop without a numpad though.)
Or if there are particular characters you want to type a lot, find a keyboard layout that allows you to type them directly. For example eurokb might cover it, or you can make your own with MSKLC.
Old question, but you can type a lot of unicode in GNU Emacs or Vim
GNU Emacs: M-x set-input-method RET tex (or C-x RET C-\ tex) will let you type \omega to generate ω
Vim: Vim digraphs can generate unicode; C-k w * in insert mode gives you ω.
deceze hit the nail on the head. (S)he just didn't elaborate. bobince gave a bit more.
And I'm hazarding a guess that you're a developer or tester working on L10N or I18N. I'm also guessing you need to do more than just a few characters here or there, or you'd be satisfied with pasting from another app. So, I'll share some advice. (Note: here, "you" refers to the next person to look here. I'm sure the original poster doesn't care anymore by now. :-))
If you're on Windows 10, install an appropriate keyboard driver that lets you input the characters you want into any application. I'm sure Linux has support for the same sort of thing.
E.g. I'm teaching myself Hindi (हिंदी), so I installed Windows' Hindi (Devanagari) support. I typed "Hindi" in Hindi using that support, then I switched back to US English to do the rest of this post. If all you need are accented characters from Western European languages, you can install the INTL English support and type directly in español or français or whatever.
Don't look at entering Unicode characters as entering some sort of special data amidst your English text. It's just someone else's language. Use their keyboard. Type their language.
I'm writing a flashcard app to help my learning. I'm using the Hindi keyboard support to type characters into Word, WordPad, Excel, and the Visual Studio editor. And that Hindi keyboard support works exactly the same way in all of those apps, as I'd expect it to work in just about any text editor that supports Unicode. And as you saw above, it also works in a simple text edit control in Chrome. No copy and paste. No remembering special codes. It's as ubiquitous as ctrl-F.
It looks like the unicode support in programmers editors (except for some Microsoft products) is mostly read-only. They can open a file with unicode and display the characters, but typing unicode into a file is a different story. If you want to enter unicode in a programmers editor you can copy it from somewhere else (a web page or Microsoft Word or Notepad) and paste it into the editor, but the editors make typing unicode difficult or impossible.
UltraEdit tech support referred me to this web page which explains a lot. Unfortunately none of the solutions worked with UltraEdit.
Microsoft Word and Notepad support unicode entry. Type the unicode value followed by Alt+X and it converts the hexadecimal and displays it. You can then copy and paste it into UltraEdit or one of the other programmers editors. As others have mentioned unicode support depends on support within the operating system as well as the editor.
What got me interested in using unicode in source code files is Mark Summerfield's book Programming in Go. He includes an example .go file that uses unicode. It would be great to use unicode Greek characters for variable names instead of variables named "omega" or "theta".
Using unicode in source code is a bad idea, however. Support for unicode in programmers editors is lousy, and developers would have to save or convert their source code files to utf-8 instead of ASCII. Developer's tools are just not ready to write code in unicode no matter how neat the idea sounds.

Is there a way I can add unicode text to a MBCS MFC menu

I have a MFC application compiled with the MBCS character set. I have a submenu off of my main menu that I would like to add unicode characters to. Can that be done?
You can force the use of Unicode strings even in MBCS apps by explicitly calling the Unicode form of an API and passing it a Unicode string.
In your case, ModifyMenuW() is the API that sets the menu item text (assuming the menu item already exists):
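// Pass the item's own ID as the new ID; passing 0 there would reset its command ID.
ModifyMenuW(GetMenu()->m_hMenu, ID_APP_ABOUT, MF_BYCOMMAND | MF_STRING, ID_APP_ABOUT, L"\u573F");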
This code displays a Chinese ideogram (I have no idea of its meaning) instead of the original text
The L in front of the string says it's a wide (Unicode) string. \u573F is the way you encode a Unicode char in your C++ ASCII source file. The W at the end of the API name stands for Wide and denotes the Unicode form of the API.
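To see why the wide form is needed, it helps to look at what the character becomes in each encoding. The following is a Python sketch rather than MFC code; `fits_code_page` is a hypothetical helper, and code page 1252 is only an assumed example of an MBCS/ANSI code page.

```python
def fits_code_page(ch: str, codec: str) -> bool:
    """Return True if the character can be encoded in the given code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


ch = "\u573f"
# The single 16-bit unit a wide string holds for this character (little-endian bytes).
print(ch.encode("utf-16-le").hex())
# Whether a Western European ANSI code page could carry the same character.
print(fits_code_page(ch, "cp1252"))
```

A character that has no slot in the active code page cannot be passed through the ANSI form of the API, which is why the call goes through the W form with a wide string.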
Note that if your goal is to translate the full UI of your app, this is a complete other story: The method I showed here is only suitable for one-shot calls. You can't create a full UI that way.
You can translate your MBCS app to Japanese, Russian, whatever,... without switching to Unicode (Although it would be a very good idea to do that switch. But that can be costly for legacy apps).
You have 2 friends to help you out there: appTranslator lets you very easily translate your app and manage your translations (disclaimer: this is my own ad ;-)), and Microsoft AppLocale helps you test MBCS apps in different codepages without actually changing the codepage of your computer (which requires a reboot).

Unicode Code Point for Command Key Combinations

Can someone please tell me how to determine the unicode character point of a multi-key combination that includes the "command" key? For example, if a user presses the "command" key and "1" key on the keyboard at the same time, what is the unicode character representation for that?
Maybe I'm searching on the wrong thing, but I am not able to locate this in the character maps, keyboard references, or unicode tables I find. I can sort out other key combinations (e.g. shift-1) as there is an obvious character output of "!" that I can look up and find that it is U+0021. When I go to character maps or applications the command key always seems to take an action rather than output a character result to screen.
My app is for iOS, which I would expect to be the same as Mac OS X in terms of the unicode code point. All of the iOS APIs that provide access to the keyboard see it as a source of Unicode characters. Thus the reason I am trying to detect keystrokes this way.
Thanks.
Keyboard codes are basically independent of character codes.
While (as you mention) many keys have standard mappings to standard ASCII codes, it is up to the application to decide what to do with them.
Some input APIs may be widely used on a particular OS, and some applications (e.g., terminal emulators) may be used as a common input method for a class of tasks, but there is no universal standard.
Obligatory wikipedia link for Unicode input.
You can't. There simply are no Unicode codepoints that correspond to Command + some-other-character.
The same is true of Shift, by the way. The fact that your computer happens to map certain combinations to certain Unicode codepoints does not imply that Unicode specifies such mappings, or that mappings exist for every combination of keys, or that those mappings are the same for everyone else. I use two keyboards every day; one of them maps Shift+3 to #, the other maps it to £. This is decided by the operating system, not by Unicode. If you tried to detect a Shift+3 keypress by listening for #, your program would seem to me to be broken half the time.
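Put as a sketch: the mapping from key combinations to characters is a per-layout table owned by the operating system, and some combinations have no entry at all. The `LAYOUTS` table and `char_for` function below are hypothetical and hold only the combinations mentioned in this thread.

```python
from typing import Optional

# Hypothetical layout tables: (modifier, key) -> character produced.
LAYOUTS = {
    "US": {("shift", "1"): "!", ("shift", "3"): "#"},
    "UK": {("shift", "1"): "!", ("shift", "3"): "\u00a3"},  # pound sign
}


def char_for(layout: str, modifier: str, key: str) -> Optional[str]:
    """Return the character a combination produces, or None if it produces none."""
    return LAYOUTS[layout].get((modifier, key))


print(char_for("US", "shift", "3"))
print(char_for("UK", "shift", "3"))
print(char_for("US", "command", "1"))  # no character is associated with it
```

The same physical key press gives different characters under different layouts, and Command+1 gives none, so detecting key presses has to happen at the key-event level rather than by inspecting characters.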
This is a perfect example of an XY question. You don't really care about Unicode -- what you really want to know is how to detect keypresses with the Command modifier on iOS. You should just have asked how to do that! There is probably an API that does exactly what you need that you have simply missed, because you were concentrating on your assumption that the solution would involve Unicode -- and there are probably numerous iOS experts who have not bothered to read this question at all, because they thought your problem related to Unicode rather than iOS.
Simple answer: no.
You haven't told us what sort of computer you are using. Mapping a key press to a Unicode code point is operating system specific, and then it depends on the locale that is active.

Built-in function for converting between unicode characters and virtual keycodes in Cocoa?

Is there a way to convert a unicode character to a Mac virtual keycode? (without building my own table?) It looks like on Windows there is VkKeyScanEx, but I'm not aware of a similar function for Cocoa on OS X.
I'm actually trying to do this for the iPad. I want to convert character taken from the keyboard and convert them into key codes, since the iPad keyboard won't supply keycodes.
The ShortcutRecorder project on GoogleCode has an NSValueTransformer subclass for converting strings to keycodes and vice versa, but I'm not sure if it'll work on iOS. It's a great place to start looking, though.
I'm interested in the reason why it needs to be tagged iPhone/iPad — surely you can do all the conversion in OS X? Also, the iPhone/iPad "keyboard" is fundamentally a text input method (see UITextInput) —it's not that it "won't supply keycodes"; there simply aren't any (and what keycode/modifiers should it supply when you tap "A", hold for a bit, and pick a random accented version?).
If you're going to do this, test it on a variety of (odd) input methods on both the iPad and OS X. If there's an API to insert a string, do so (but this might not work so well for games which read scan codes...). You could even write a custom input method extension which accepted Unicode strings.
It's debatable what should happen when a Dvorak VNC client types to a QWERTY VNC server...
I'll end with a tangential story:
A little over a year ago (before I got an iPhone), I got an N810. If anything, it makes a half-decent SSH/VNC client and has a decent keyboard.
Except it's not a standard keyboard. 1 is Fn-Q and ! is Fn-A, but when I type Fn-A to get "!", the VNC server ends up typing "1". Typing Shift-Fn-A gives me the "!" I was looking for (I think Shift-Fn-Q also works).
Something, somewhere, parses the character "!", decides that it has the same scan code as "1", and types the scan code for 1 with no modifier. It could automatically hold down Shift. It might even be able to insert a string. Instead, it just fails.