Chinese characters in emacs --no-window-system on windows - emacs

I recently started using emacs on Android in Termux. My projects feature some Chinese characters, which are displayed there just fine.
When later opening the same file in Emacs on Windows, I was disappointed to see them displayed as \xxxx.
I am not sure how to search for a solution, because I do not know what the problem is.
The only thing I found related to my problem is this:
Unicode characters in emacs term-mode
but it did not help me solve the issue.

You can tell what's going on by looking at the first few characters in the mode line. In Termux, it says UUU, but in Windows it says DDU. These three characters stand for:
the coding system for keyboard input
the coding system for terminal output
the coding system of the file in the buffer
U stands for UTF-8, while D stands for various DOS code pages. (You can find this using M-x list-coding-systems. This is all described in the Mode Line section of the Emacs manual.)
So this means that Emacs is reading the file correctly, but it thinks that the terminal is unable to display the Chinese characters, so it uses the \uxxxx notation as a fallback. I'm not sure how to get this to work properly in a Windows terminal, but try M-x set-terminal-coding-system with utf-8 - it might just work.
As an aside, if you run Emacs as a "normal" Windows application instead of in a terminal, the characters should display correctly automatically, so if there is a particular problem preventing you from doing so, it might be worth trying to fix that instead.
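The fallback described in this answer can be reproduced outside Emacs. The sketch below is not how Emacs implements it; it only uses Python's codecs as a stand-in for a terminal coding system, and it assumes cp437 as an example of a DOS code page. It shows why text that was read correctly can still come out as escapes when the output encoding cannot represent it.

```python
# Illustration only: Python's codecs stand in for a terminal coding system.
text = "你好"

# A DOS code page (cp437 is an assumed example) has no Chinese characters,
# so an escaping fallback is the only way to emit them.
dos_output = text.encode("cp437", errors="backslashreplace")

# UTF-8 can represent every Unicode character, so no fallback is needed.
utf8_output = text.encode("utf-8")

print(dos_output)   # escape sequences instead of characters
print(utf8_output)  # the UTF-8 bytes of the original characters
```

Switching the output side to UTF-8, which is what set-terminal-coding-system attempts for the terminal, removes the need for the escapes.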

Related

Japanese characters displaying as random symbols

I have a Japanese program, but when I try to run it all the Japanese characters turn into random symbols like in the link to the picture
http://puu.sh/avcAp/04de126564.png
I've tried searching for answers but I only see ways to fix it if it shows up as blank rectangles and not the random symbols I'm getting.
You can see little bits of English just fine though, like in the link, you can see the .exe ... (I'm trying to install it but there's an error and since it's all these random symbols I have no idea what it means.)
Edit:
I'm running Windows 7 Ultimate. This is a program I downloaded online, but I'm sure it's not broken. The symbols in the picture are a screenshot from an error window (The windows default one with a red X on the left), however all the text that's related to that program that is Japanese is also random symbols, and my picture is just an example.
Edit 2:
Here is a better image
http://puu.sh/aveXV/7d15ecf2d4.png
Where you can see some Japanese characters, but some are random characters. This is when I first run the .exe to install and it's asking for what directory I want to install to
Thanks for your help in advance
It sounds like a non-Unicode program. It's not ideal, but you can change your non-Unicode program language in Region and Language settings in Control Panel to Japanese (requires reboot).
It only affects non-Unicode programs, so most modern English programs will be unaffected. I've changed mine to Chinese (PRC) and the only program I noticed affected was "cmd.exe". It allowed me to type and display Chinese at the console.
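As a rough illustration of why a non-Unicode program shows random symbols, the Python sketch below simulates the mismatch. The message text is made up, and cp932 (Shift-JIS) and cp1252 are assumed to be the program's encoding and the English system's code page; the real program may differ.

```python
# Illustration only: simulate a non-Unicode program on a non-Japanese system.
message = "setup.exe をインストールできません"  # hypothetical error text

# The program stores its text as Shift-JIS bytes (cp932 on Windows, assumed).
raw = message.encode("cp932")

# An English system locale interprets those bytes as Windows-1252 instead.
garbled = raw.decode("cp1252", errors="replace")

# With the non-Unicode language set to Japanese, the same bytes decode correctly.
restored = raw.decode("cp932")

print(ascii(garbled))   # ASCII parts survive, the rest becomes symbols
print(ascii(restored))  # identical to the original message
```

The ASCII part of the message survives in both cases, which matches the bits of English that remain readable in the screenshots.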

Issue with coding Windows-1250 in Perl

I have a text file encoded in Windows-1250. I'm using Windows 7 EN.
I would like to iterate through this file line by line in Perl code with
print. In console I cannot see the diacritic signs.
Could you give me any solution?
It depends on what you are going to do with the text, but in many cases
it's possible to write code that is independent of the encoding. Anyway, if you redirect
the output to a file and the result is OK (read: it displays correctly when opened in a
text editor in Windows-1250 mode with a proper font), your code is not the
problem.
The other thing is that you want to see CE characters in your console.
For that to work you need to do:
set your console window to use font capable of displaying them (you
may need to install such font, I don't remember The Right Way in Win 7)
set your console to Windows-1250 mode using command chcp 1250
Note that this is basically the same you would need to do with your viewer
or editor to see the characters. Except that while many editors are able
to detect encoding themselves (sometimes even correctly) and pick the right
font, consoles typically need help from you.
Your problem might be similar to what has been solved here. I also
recommend reading the other post I'm referencing there.
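To separate "the code is wrong" from "the console is wrong", it can help to check that the file decodes cleanly when the encoding is named explicitly. The sketch below uses Python rather than Perl, writes its own sample file (the text and file name are assumptions, not from the question), and uses cp852 as an assumed example of a DOS console code page.

```python
import os
import tempfile

# Sample Central European text (an assumed example, not from the question).
lines = ["Příliš žluťoučký kůň", "plain ASCII line"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="cp1250", newline="\n") as f:
        f.write("\n".join(lines) + "\n")

    # Name the file's encoding explicitly instead of relying on a default.
    with open(path, encoding="cp1250") as f:
        decoded = [line.rstrip("\n") for line in f]

    # Reading the same bytes as a DOS code page (cp852, assumed) garbles them.
    with open(path, encoding="cp852", errors="replace") as f:
        misread = [line.rstrip("\n") for line in f]

print([ascii(s) for s in decoded])
print([ascii(s) for s in misread])
```

If the explicit decode is correct but the console still shows the wrong characters, the remaining problem is the console's code page or font, as the answer describes.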

Emacs (Multi)Term vs Xterm vs Console & TMUX

I'm an Emacs user trying to learn a software tool that is best run from a terminal. The default set-up to get the most out of that tool is to use xterm for interaction and call Vim for editing. One could simply replace Vim with Emacs in this setup, but then one would spend most of the time working outside of Emacs in an Xterm.
I figured out there is (Multi)Term-mode in Emacs, but it is really hard to find out about its pros and cons. So I have the following questions:
[Without X11]: Why or when would anybody use Emacs (Multi)Term instead of Console & TMUX (or GNU Screen)?
[With X11] How does Emacs (Multi)Term compare to Xterm?
Obviously speed is one criterion for comparison, but I'm sure there are others.
You'd use Emacs term over tmux/screen if you're more familiar with Emacs and already use it for many other things and/or if you spend more time in Emacs than in the terminal.
Emacs's Term is much less sophisticated and much less reliable than xterm. But it works within Emacs so if you live in Emacs, it might be a good option.
Note that you may also prefer to use Emacs's M-x shell functionality, which gives you a command line without giving you an actual terminal emulator. That means that the commands are edited in Emacs before being sent to the underlying command-line program, so all the usual Emacs editing can be used there (and the history manipulation as well as command completion is performed by Emacs as well, which can be great, or can be disappointing (e.g. if the completion needs info which Emacs does not have)).

How to Make Emacs Display Chinese Characters

I often use simple Chinese phrases like "你好" to test that my code can handle non-ascii characters. Whenever I enter Chinese characters directly into Emacs, they just come out as question marks.
Emacs can sometimes display characters properly if I open a premade text file but not always. For example, if I create a simple text file in Notepad with "你好" in it, the 好 displays fine but 你 just shows a box. Can Emacs handle Chinese characters? If so, how do I set it up?
I'm running Emacs 22.3.1 on Windows with the Courier New font, but I'm also curious about having this work on Linux. I have all the needed Eastern Language packages installed. I can edit in Chinese in Notepad with no problem.
The Emacs 23 release solves the problem I was having on Windows. Chinese characters work properly with no fussing or hacking. I can write Chinese directly in a buffer or open a file with no issues. Emacs's Unicode support wasn't fully implemented until version 23.
Emacs 23 Release Notes
Have you tried this (leim)?
http://www.khngai.com/emacs/chinese.php
Liberation Mono font, which I use under Emacs, can display these characters.

looking for a UTF-8 text editor

I am looking for a (simple) text editor that can handle text in different encodings in the same document.
I need to develop some sites with mixed Japanese and English text and the editors I have now (on an English Windows system) are unable to display the Japanese text.
Jedit files don't display the Japanese text I have inputted but when I look at the file in a browser it shows up correctly.
Gvim shows all Japanese text in the editor as question marks and also in the browser.
In Gvim inputting the kanji works (you input the pronunciation and then press the space bar to get the kanji) but when you confirm the kanji you want, it replaces that kanji with question marks (one question mark for every kanji).
Can someone recommend a text editor for editing HTML and PHP files that is able to display UTF-8 encoded text and also save as a UTF-8 file?
thank you.
After reading about emacs I installed it. see below.
Thanks everybody for the hints.
if you don't have a unicode font yet you have to find one online or buy one.
here are the instructions to install the font on a windows system http://support.microsoft.com/kb/314960
jEdit
I changed my font in Jedit to a UTF font and now the Japanese shows up normally.
inputting the Japanese is still problematic as you don't see what you are typing.
(To change the font used to edit files, go to Utilities -> Global Options -> Text Area,
select a Unicode font, and you'll be able to see the Japanese characters.)
gVim
I am still trying to figure out how to add a font in gvim. Once I know how to do that I'll update this.
Emacs
Emacs does not show the kanji correctly, they are displayed as ??? but at least I can see what I type in Japanese and select the right word.
so at this point I have to say that in jEdit I can see Japanese text but I can't input Japanese text. Gvim I can input Japanese text but inside the text area it is displayed as ??? and the same goes for Emacs.
adding a font in emacs and gvim is sadly enough not a trivial task.
At the moment I use notepad with the Arial unicode MS font and saving as UTF-8 file as my Japanese editor. Not ideal but at least it works.
Notepad++ is highly recommended.
Emacs correctly handles UTF-8 for me. (And of course, it can edit HTML and PHP files).
I would still recommend Vim. The problem you were seeing with question marks is probably an issue with the font you were using. When displaying text that contains characters not in the current language, applications typically display them as empty boxes or question marks. See here for UTF-8 support in Vim.
This section of the Vim manual is also helpful, especially for setting up UTF-8 in Windows.
There is an issue with most Unicode-aware text editors: when you select a font, they stick to it. If the font does not include a glyph for a character, then the default substitution character (I believe U+FFFD, REPLACEMENT CHARACTER) is used.
In contrast, web browsers typically try to find a glyph for the characters they have to display among all the fonts provided by the system.
So, what you need, if you don't have the font "Arial Unicode MS" or similar (including Japanese glyphs), is an editor that tries to match glyphs with other fonts except the selected one.
Until someone provides a link for such an editor, I'll suggest a (somewhat extreme :) editor:
Install the latest stable Python 2.x version for MS Windows (currently 2.6).
Include "idle" in the installation.
Start → Programs → Python 2.6 → Idle (Python GUI)
The "idle" editor is typically used to edit python code (and test it interactively in the Python shell). However, it can be used as a plain fully-Unicode-aware text editor, and when saving text including non-ASCII chars, it defaults to UTF-8 encoding.
Now, idle is based on Tkinter, which is an interface to Tk, which is a GUI library for Tcl. Like web browsers, Tcl/Tk searches other fonts too when asked to display a character for which no glyph is present in the widget font.
However far-fetched this may seem, I really believe it would help; if no other solution helps you, give it a try.
Vim works fine for me as a UTF-8 text editor.
Firstly, you need a font that has the characters you are using. Choosing another text editor won't help you with this (unless it searches for other fonts for the correct characters when the font you are using doesn't have them). If you are using gVim, you can set the font like:
set guifont=Consolas
(This is not to say that Consolas is the font you want.) You probably want to put this in the .vimrc file so that it is always used.
Secondly, Vim needs to interpret the file as UTF-8, which it doesn't always automatically do. To make it do this, do:
set encoding=utf8
You can also see what encoding it is using with:
set encoding?
EmEditor is written by a Japanese company for exactly this purpose. It is a fine text editor with good performance/simplicity but pretty much all the features expected of a capable editor; I use it as my default when on the Windows platform, as well as for editing Japanese web page templates. It deserves to be better-known IMO; it is at least as good as, say, TextPad, but with full Unicode support.
Unfortunately it is not free, however you can find a free version of the old EmEditor 6 at sites such as download.com.
You can use just Notepad.exe with the "Arial Unicode MS" font (if all of your text is left-to-right, given the English windows version). Just Save as, select UTF-8.
In general, use your favourite editor with a font like "Arial Unicode MS". I mention this one because it is the font with the greatest Unicode coverage I have seen.
Try BabelPad. Editing-wise, it's simple. Unicode-support-wise, it's awesome!
It sounds like maybe the problem with Jedit is the font - are you using a font that can display all the characters correctly?
To be more precise, Arial Unicode MS is a reasonable choice for a Unicode font that can display a wide range of characters across the range of languages. There are certain issues with it that can make it less than optimal for some languages used in isolation - this is why there are also language specific Unicode fonts included with Windows.
I've never had a problem with vim as long as I use a font that actually contains the characters I want. It needs to be a monospace font. :set enc=utf8 to get to utf8 mode. Then you can use :digraph command to get a display of available characters, and see how each is displayed.
To add a font, add it in Windows (Control Panel/Fonts/Add Font). If it's a monospace font, it will then show up in vim in /Edit/Font.
Just to add another one: I just checked that Programmer's Notepad 2 has some UTF-8 setting too.
(vim and emacs do just fine as well)
EditPlus seems to be a better option for UTF-8, in my experience of using it.
EditPad Lite and Pro fully support Unicode as of version 6. (Disclaimer: Those are my own products.)
If you get question marks, you're using an encoding that does not support Japanese characters. In EditPad, you can change the text encoding (Unicode, legacy code pages) via Convert, Text Encoding. You can set the defaults per file type in Options, Configure File Types, Encoding.
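The question-mark symptom can be reproduced with a short Python sketch. It is only an illustration of lossy encoding, not of what any particular editor does internally, and cp1252 is an assumed example of a legacy code page without Japanese characters.

```python
# Illustration only: encoding Japanese text into a code page that lacks it.
text = "日本語"

# cp1252 (assumed example) cannot represent kanji; with a replacing error
# handler, each unencodable character becomes a single question mark.
lossy = text.encode("cp1252", errors="replace")

# UTF-8 represents the same text without loss.
safe = text.encode("utf-8")

print(lossy)                # → b'???'
print(safe.decode("utf-8") == text)  # → True
```

Once the text has been saved this way the original characters are gone, so the encoding has to be changed before saving, not afterwards.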
If you see squares instead of Japanese characters, select a Japanese font or Unicode font. You can do this in EditPad via Options, Font.
To type Japanese, simply install a Japanese keyboard driver in the keyboard settings in the Windows Control Panel, if you haven't already.
EditPad Pro has preconfigured file types for PHP and HTML.
Kate, and by extension any other KDE program that uses Kate as an embedded KPart (KWrite, Quanta+, KDevelop). It handles lots of encodings, but I like to always use UTF-8. It also has a huge collection of syntax highlightings.
Try SciTE http://gisdeveloper.tripod.com/scite.html. It's just great ;)
For very basic UTF-8 multilingual text editing, I have had good luck with BabelPad (www.babelstone.co.uk): it's free, simple and robust and displays almost everything with no fuss. When the editing needs are more severe, I resort a lot to EditPad Pro, or occasionally Notepad++. For non-Unicode editing on Windows, I'm a TextPad user--my staff and I have probably spent about 200,000 hours in TextPad, with only occasional forays into NotePad2, MadEdit, jEdit, XML Copy Editor, and EPCedit. The latter two handle UTF-8 XML files well. All of the editors mentioned above are free except TextPad and EditPad Pro. Thanks to the person who suggested EmEditor. I'll try it out. --PFSchaffner
I like jEdit for its ability to indent wrapped lines. Really nice when editing XML files. A word of warning though: It's Java, so it's not lightning fast, like you would expect a text editor to be.
Text codecs are fully supported. It distinguishes between text files with and without the header identifying the file format (byte order mark), calling them UTF-8 and UTF-8Y. This is something that I'm missing in other text editors.
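The header mentioned here, the byte order mark, is just three fixed bytes at the start of the file, so its presence is easy to check. The Python sketch below shows one way to do that; the helper name `has_utf8_bom` is made up for this example.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


# Python's "utf-8" codec writes no BOM; "utf-8-sig" prepends one.
plain = "日本語".encode("utf-8")
with_bom = "日本語".encode("utf-8-sig")

print(has_utf8_bom(plain))     # → False
print(has_utf8_bom(with_bom))  # → True

# Decoding with "utf-8-sig" strips the BOM, so both give the same text.
same_text = with_bom.decode("utf-8-sig") == plain.decode("utf-8")
```

The text content is identical either way; only the three leading bytes differ.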
Try EditPlus. It has specific support for HTML, syntax highlighting and can also work as a simple IDE for any compiler.
On the Mac: SubEthaEdit has excellent support for character encodings.
TextPad is a good utility too. It's trialware, but does the job fine. See how to set char-encoding-setting-in-textpad.
For Japanese, Sakura Editor is exceptional. It can display UTF-8, EUC-JP, SJIS and so on.
http://www.ultraedit.com/ is a multiplatform editor that does UTF-8 and all kinds of conversions between formats
EditPad Pro ... is recommended for you
cheers ;)