Input greek characters in eclipse? - eclipse

I used Emacs with haskell-mode, and now I am trying to use Eclipse with the EclipseFP plug-in. The problem is that Eclipse is unable to recognize (or input) Greek characters. How can I make Eclipse recognize and input Greek characters?

The workspace and each file have an encoding setting. Change it to UTF-8 (type "encoding" in the properties dialog).
That said, you should not put Greek characters into your code. Use English, and externalize internationalized values.

Related

Auto-replace unicode \uxxxx characters in a text file with their utf-8 equivalents

I've been given a huge properties file created in eclipse with ISO-8859-1 encoding, and all Greek characters in it are in Unicode format (i.e.: \u03bc\u03af\u03b1\u0020\u03bc\u03ad\u03c1\u03b1). It works fine, but I want the actual file to be human readable.
I converted the file to utf-8, but the characters remained as they were. Is there a way to automatically convert the contents of the file to utf-8 either from inside eclipse, or via an external tool?
This can be done with the AnyEdit Tools plugin:
When installed, in the editor hit Ctrl+A (to select all the text containing ASCII and Unicode formatted characters), right-click and choose Convert > From Unicode Notation.
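If no plugin is at hand, the same conversion can be sketched in a few lines of Java. This is only an illustration, not what AnyEdit does internally: the class name UnicodeUnescaper and its unescape method are hypothetical, and the sketch assumes every backslash-u sequence in the text is a real escape (it does not handle a doubled backslash in front of a u). To convert a whole file you would read it as ISO-8859-1, run the text through a function like this, and write the result back as UTF-8.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Eclipse or AnyEdit.
final class UnicodeUnescaper {
    // Matches a backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each four-digit escape with the character it denotes.
    static String unescape(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps a decoded backslash or dollar sign literal.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The sample text from the question, still in escaped form.
        String escaped = "\\u03bc\\u03af\\u03b1\\u0020\\u03bc\\u03ad\\u03c1\\u03b1";
        System.out.println(unescape(escaped));
    }
}
```

Whether the Greek text prints correctly depends on the console encoding; the returned String itself holds the decoded characters either way.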

Apache POI 3.9 generated Excel XSSF has ? in place of special characters like á (Spanish) in weblogic

I am working on an APP in which I have to generate EXCEL with XSSFCellStyle, etc. I am using Apache POI 3.9.
In some field I am doing this:
cell.setCellValue(myString);
myString may contain special characters like ñ and á, which are Spanish. These characters may come from i18n.properties, or be hardcoded as a plain String.
myString = "ññññññññ";
All is well on my local machine with Tomcat 8, but on the Weblogic server the generated Excel shows ? in place of these characters.
I read somewhere that in Weblogic servers the default charset is UTF-8. The local environment is Spanish (cp1252), and in Eclipse Luna the workspace charset is also cp1252, so that may be the reason, but I am not sure. Should I change it under Preferences > Workspace, or with the JVM parameter -Dfile.encoding=UTF-8?
I also read about Apache POI encoding handling, that the API handles it all, so I am not to worry about that. All I can do is set font charset, like this:
font.setCharSet(FontCharset.DEFAULT);
But, I cannot see UTF-8 here. In source code I see:
/**
* Charset represents the basic set of characters associated with a font (that it can display), and
* corresponds to the ANSI codepage (8-bit or DBCS) of that character set used by a given language.
*
* @author Gisella Bronzetti
*/
public enum FontCharset {
ANSI(0),
DEFAULT(1),
SYMBOL(2),
MAC(77),
SHIFTJIS(128),
HANGEUL(129),
JOHAB(130),
GB2312(134),
CHINESEBIG5(136),
GREEK(161),
TURKISH(162),
VIETNAMESE(163),
HEBREW(177),
ARABIC(178),
BALTIC(186),
RUSSIAN(204),
THAI(222),
EASTEUROPE(238),
OEM(255);
There is no WESTEUROPE either. So how can I set it?
Thanks @centic for the hint. I finally found my solution:
Change the JVM encoding by setting the parameter -Dfile.encoding=UTF-8, or
change the text file encoding to UTF-8 for my project in Eclipse.
After that change, the special characters hard-coded in Strings showed up as ?, so I fixed them manually, saved the .java files as UTF-8 (now the default), recompiled the project, and built my WAR. Now everything is fine in Weblogic.
I think the problem is that on my local machine the Java files are encoded as cp1252 according to the JVM and Eclipse settings, and Tomcat, running in the same environment, also uses cp1252 for decoding (both inherited from my Windows 7), so it works. Weblogic, however, only accepts UTF-8 as input and therefore decodes my WAR/class files using UTF-8, so characters encoded as cp1252 are not recognized.
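The general mechanism behind a cp1252/UTF-8 mismatch can be reproduced in isolation. The sketch below is not a reconstruction of what Weblogic or the compiler did in this case; it only shows that bytes written as cp1252 and then decoded as UTF-8 no longer yield ñ or á. The class name EncodingMismatchDemo and its methods are made up for the example, and the test string is written with Unicode escapes so that the example does not depend on the encoding of its own source file. When building outside Eclipse, javac's -encoding option plays the same role as the project's text file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it mimics an encoding mismatch, not Weblogic itself.
final class EncodingMismatchDemo {
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Encodes text as cp1252 bytes, as a cp1252 workspace would save it.
    static byte[] saveAsCp1252(String text) {
        return text.getBytes(CP1252);
    }

    // Decodes bytes with whatever charset the reader assumes.
    static String readAs(byte[] bytes, Charset assumed) {
        return new String(bytes, assumed);
    }

    public static void main(String[] args) {
        // n-tilde and a-acute, written as escapes to stay encoding-independent.
        String original = "\u00f1\u00e1";
        byte[] saved = saveAsCp1252(original);

        // Matching charsets round-trip cleanly.
        System.out.println(readAs(saved, CP1252).equals(original));

        // A UTF-8 reader sees malformed input and substitutes U+FFFD.
        String misread = readAs(saved, StandardCharsets.UTF_8);
        System.out.println(misread.equals(original));
    }
}
```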

Julia: How to deal with special unicode characters

I am working with the Distributions package which uses special unicode characters for many of the variables within types. The normal distribution, for instance, uses μ and σ. If I want to edit the standard deviation, I need to somehow type:
n.σ = 5.0
Is it possible to type these values into the repl (outside of using copy-paste)? How does one create these characters with one's keyboard?
Thank you
At the REPL, use LaTeX shortcuts, e.g. type \sigma and press tab to autocomplete. Note that you need to be using Julia 0.3 or higher for this to work.
Many text editors have add-ins to do something similar, e.g. https://github.com/mvoidex/UnicodeMath for SublimeText.
In Windows 10 and Linux under most modern desktops you can add a Greek keyboard map and then switch between Greek and English using Windows key+Space. Since many Latin / English letters are derived from ancient Greek, these have analogs, so S types a sigma (σΣ), D a delta (δΔ), etc. ετψ.
Figured it out by looking up "Entering Unicode in Linux" on Google.
One can press Ctrl+Shift+u, then type the hexadecimal Unicode code point of the character. For example, σ = u03c3 (and μ = u03bc).
It is difficult to remember all the Unicode code points or LaTeX shortcuts (at least for me) or to search for them on the web. When working with the REPL or a Jupyter notebook, Julia provides a simple way to look them up, as mentioned here:
You can also get information on how to type a symbol by entering it in the REPL help, i.e. by typing ? and then entering the symbol in the REPL (e.g., by copy-paste from somewhere you saw the symbol).
For example:

Using unicode / utf-8 in programmers editors

There are a lot of programmers editors that claim to support unicode / utf-8. I've tried a number of them (UltraEdit, jedit, emedit) but none of them tell you how to actually enter unicode characters into a file. Some of them tell you how to change the default file encoding to utf-8 or how to select a font that has good support for utf-8, but not how to enter utf-8 into a file using their editor.
The Go language (and some others) support utf-8 and I like the idea of using the actual utf-8 symbols for variables instead of variables with names like omega. I haven't found a programmers editor yet that actually allows you to do this, though.
The only editor / word processor that I've found that lets you enter unicode is Microsoft Word. Type the unicode value followed by Alt+X and Word converts it. To get the Greek letter omega, type "03c9" followed by Alt+X. UltraEdit will let you copy utf-8 from a web page into it, but their docs don't say how to actually enter utf-8 in a file, and their tech support people don't know either.
This should be simple, but seems to be completely undocumented. Is there some key combination convention that lets you enter unicode into these editors that supposedly support unicode, the way that Ctrl-F is widely used for search?
Thanks.
The standard programmer’s editor vim(1) supports limited Unicode input even if your operating system should be too broken to do so (are there any such, still?).
Just enter ^VuXXXX, where XXXX represents exactly four hex digits.
That will allow you to enter the ~6% of Unicode allocated to the Basic Multilingual Plane. The rest are forbidden to you.
This may be fixed in a newer release.
Otherwise, just use your mouse.
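The four-hex-digit limit mentioned above corresponds to the Basic Multilingual Plane, i.e. code points up to U+FFFF. As a small illustration (in Java, chosen only for the sketch; the class name CodePointDemo and its methods are hypothetical), the code below builds a string from a numeric code point and checks whether that code point fits in four hex digits. The musical G clef U+1D11E is used as an arbitrary example of a character outside the BMP.

```java
// Hypothetical demo class showing BMP versus supplementary code points.
final class CodePointDemo {
    // Builds a String from a single Unicode code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // True if the code point can be written with at most four hex digits.
    static boolean fitsInFourHexDigits(int codePoint) {
        return Character.isBmpCodePoint(codePoint);
    }

    public static void main(String[] args) {
        int omega = 0x03C9;   // Greek small letter omega, inside the BMP
        int clef = 0x1D11E;   // musical symbol G clef, outside the BMP

        System.out.println(fitsInFourHexDigits(omega));
        System.out.println(fitsInFourHexDigits(clef));

        // A supplementary character occupies two UTF-16 code units in Java,
        // but still counts as one code point.
        String s = fromCodePoint(clef);
        System.out.println(s.length());
        System.out.println(s.codePointCount(0, s.length()));
    }
}
```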
A few techniques I use if an editor is lacking:
Use the Windows charmap.exe utility to select characters and paste into a document.
Install an input method editor (IME) to write in a particular language.
Windows ALT keycodes.
Better to set your keyboard to generate Unicode characters across all Windows applications than to rely on a single application's custom input feature IMO.
Use the EnableHexNumpad feature and you can type any character in the Basic Multilingual Plane using Alt+numpad-plus,hexcode. (May not be of much use on a laptop without a numpad though.)
Or if there are particular characters you want to type a lot, find a keyboard layout that allows you to type them directly. For example eurokb might cover it, or you can make your own with MSKLC.
Old question, but you can type a lot of unicode in GNU Emacs or Vim
GNU Emacs: M-x set-input-method RET tex (or C-x RET C-\ tex) will let you type \omega to generate ω
Vim: Vim digraphs can generate unicode; C-k w * in insert mode gives you ω.
deceze hit the nail on the head. (S)he just didn't elaborate. bobince gave a bit more.
And I'm hazarding a guess that you're a developer or tester working on L10N or I18N. I'm also guessing you need to do more than just a few characters here or there, or you'd be satisfied with pasting from another app. So, I'll share some advice. (note: here, "you" refers to the next person to look here. I'm sure the original poster doesn't care anymore by now. :-))
If you're on Windows 10, install an appropriate keyboard driver that lets you input the characters you want into any application. I'm sure Linux has support for the same sort of thing.
E.g. I'm teaching myself Hindi (हिंदी), so I installed Windows' Hindi (Devanagari) support. I typed "Hindi" in Hindi using that support, then I switched back to US English to do the rest of this post. If all you need are accented characters from Western European languages, you can install the INTL English support and type directly in español or français or whatever.
Don't look at entering Unicode characters as entering some sort of special data amidst your English text. It's just someone else's language. Use their keyboard. Type their language.
I'm writing a flashcard app to help my learning. I'm using the Hindi keyboard support to type characters into Word, WordPad, Excel, and the Visual Studio editor. And that Hindi keyboard support works exactly the same way in all of those apps, as I'd expect it to work in just about any text editor that supports Unicode. And as you saw above, it also works in a simple text edit control in Chrome. No copy and paste. No remembering special codes. It's as ubiquitous as ctrl-F.
It looks like the unicode support in programmers editors (except for some Microsoft products) is mostly read-only. They can open a file with unicode and display the characters, but typing unicode into a file is a different story. If you want to enter unicode in a programmers editor you can copy it from somewhere else (a web page or Microsoft Word or Notepad) and paste it into the editor, but the editors make typing unicode difficult or impossible.
UltraEdit tech support referred me to this web page which explains a lot. Unfortunately none of the solutions worked with UltraEdit.
Microsoft Word and Notepad support unicode entry. Type the unicode value followed by Alt+X and it converts the hexadecimal and displays it. You can then copy and paste it into UltraEdit or one of the other programmers editors. As others have mentioned unicode support depends on support within the operating system as well as the editor.
What got me interested in using unicode in source code files is Mark Summerfield's book Programming in Go. He includes an example .go file that uses unicode. It would be great to use unicode Greek characters for variable names instead of variables named "omega" or "theta".
Using unicode in source code is a bad idea, however. Support for unicode in programmers editors is lousy, and developers would have to save or convert their source code files to utf-8 instead of ASCII. Developer's tools are just not ready to write code in unicode no matter how neat the idea sounds.

How do I make \uXXXX appear in Eclipse?

How do I let my Eclipse use \uXXXX symbols?
Should I change the font?
Eclipse will never use \u escapes for display in the console window. That's just not in its repertoire.
However, that's probably not what you want.
If you have coded some Java with a \u escape in the source, your first problem is to configure the run / debug configuration to use an appropriate encoding for the console window. UTF-8 is usually the right answer. Then, you need to select an appropriate font in the eclipse preferences for the particular character you've chosen. However, whatever you do, "\uxxxx" will never be what comes out. What you will get is the character specified by your unicode escape.
If you're just trying to see unicode output in the console, make sure the font you're using supports unicode and that the output encoding is set to UTF-8.
When running this in my pretty vanilla install of Eclipse:
System.out.println("\u0CA0_\u0CA0");
I get this as expected in the Eclipse console output:
ಠ_ಠ
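If the console shows ? or boxes instead, one thing to try, besides the run configuration encoding and font mentioned above, is to wrap System.out in a PrintStream with an explicit UTF-8 encoding. The sketch below assumes the console itself is set to UTF-8; the class name ConsoleUtf8Demo and the utf8Bytes helper are made up for the example. It also shows that the compiler has already turned each escape into a single character by the time the program runs.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; assumes the console is configured for UTF-8.
final class ConsoleUtf8Demo {
    // Two Kannada letters around an underscore, written as Unicode escapes.
    static final String FACE = "\u0CA0_\u0CA0";

    // Returns the bytes that a UTF-8 stream would send to the console.
    static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Encode output as UTF-8 regardless of the JVM default charset.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(FACE);

        // The string holds three characters, not the twelve typed in the escapes.
        out.println(FACE.length());
    }
}
```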