Filename in Chinese shows as Unicode characters - Emacs

When I use C-x C-f, file names that include Chinese characters are shown as Unicode characters rather than readable Chinese text.
How can I configure Emacs to show the Chinese characters? Thank you.
=====updated=======
System: OS X 10.8.4
Emacs version: GNU Emacs 24.3.1 (x86_64-apple-darwin)

I think this may be caused by one of those annoying interactions between the operating system and Emacs. Emacs doesn't seem to know how to interpret the file names, so let's try to help it by inserting this in your .emacs file.
(setq default-buffer-file-coding-system 'utf-8-unix)
(setq default-file-name-coding-system 'gb2312)
(setq default-keyboard-coding-system 'utf-8-unix)
(setq default-process-coding-system '(utf-8-unix . utf-8-unix))
You may need to try a different coding system instead of gb2312, for example utf-8, since OS X normally stores file names as UTF-8.
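Before changing anything, it can help to see which coding system Emacs currently uses for file names. The following is only a minimal sketch. It assumes that the file-name coding system is the culprit and that UTF-8 is the right choice on this machine; neither is confirmed by the question.

```elisp
;; Show what Emacs currently uses to decode and encode file names.
;; `file-name-coding-system' is used when it is non-nil; otherwise
;; Emacs falls back to `default-file-name-coding-system'.
(message "file-name-coding-system: %S (default: %S)"
         file-name-coding-system
         default-file-name-coding-system)

;; Assumption: file names on this system are stored as UTF-8.
(setq file-name-coding-system 'utf-8)
```

You can evaluate both expressions in the *scratch* buffer with C-x C-e and then retry C-x C-f before putting anything in your .emacs file.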

Related

Emacs Windows org-mode encoding

I have an org-mode file for todos, and I have renamed the todo bullets to "Att göra" (TODO in Swedish). However, org-mode (on Windows, see exact version below) thinks it is "Att göra" (when pressing M-S-RET). I can see åäö correctly, but "Att göra" is not interpreted as a TODO item. I can also see in the configuration files that it is spelled "Att göra", yet org-mode still thinks it is "Att göra".
I have tried to save the configuration files and my org-mode file in UTF-8 (C-x RET f utf-8 RET). I have the following in my Emacs configuration:
;; Prefer utf-8
(prefer-coding-system 'utf-8)
(setq coding-system-for-read 'utf-8)
(setq coding-system-for-write 'utf-8)
(custom-set-variables
 ;; custom-set-variables was added by Custom.
 ;; If you edit it by hand, you could mess it up, so be careful.
 ;; Your init file should contain only one such instance.
 ;; If there is more than one, they won't work right.
 '(inhibit-startup-screen t)
 '(keyboard-coding-system (quote utf-8))
 '(selection-coding-system (quote utf-8)))
This happens only in Emacs for Windows (I think this is the version number):
GNU Emacs 24.3.1 (i386-mingw-nt6.1.7601) of 2013-03-17 on MARVIN
It works in Mac, Cygwin (however I have trouble with M-S-RET as the terminal doesn't transfer that sequence right, despite that I have disabled the Alt-Enter shortcut to go fullscreen), Linux etc. It is only in Emacs Windows standalone client.
If you have any idea why this is, I would be very grateful for your suggestions.
I want to leave this question answered.
Brady told me to add
(modify-coding-system-alist 'file "" 'utf-8-unix)
to my Emacs file and it works! No idea what it really does.
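As far as I understand it, modify-coding-system-alist with the 'file target adds or updates an entry in file-coding-system-alist, which maps file-name regexps to coding systems. The sketch below shows the call together with a way to inspect the result. How the new entry interacts with other entries depends on what is already in the list, so treat this as an illustration rather than a full explanation.

```elisp
;; Add (or update) the entry whose regexp is "".
;; An empty regexp matches every file name.
(modify-coding-system-alist 'file "" 'utf-8-unix)

;; Inspect the entry for "" to see which coding system it now names.
(assoc "" file-coding-system-alist)

;; Entries are tried in order, so looking at the whole list shows
;; whether more specific entries come before this catch-all one.
file-coding-system-alist
```

Evaluate the last two expressions with C-x C-e in the *scratch* buffer to see their values.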

Emacs: current buffer's coding system

I use two computers for developing Erlang programs: one runs Mac OS X 10.6, the other Mac OS X 10.7.
The .emacs file on both machines contains the following:
;;handle emacs utf-8 input
(set-terminal-coding-system 'utf-8)
(set-keyboard-coding-system 'utf-8)
(prefer-coding-system 'utf-8)
But when I enter comments that include Chinese characters on one machine, save the file, and download it to the other machine, the Chinese characters are not displayed correctly. The same happens in the other direction.
How can I check the current file's encoding? Is there a command that does that?
I believe you're looking for buffer-file-coding-system. M-x describe-variable will tell you more about it, and you can set it by M-x eval-expression and use (setq buffer-file-coding-system 'coding-system-i-want). That will set it for a single buffer; once you've got it working, you can add entries to file-coding-system-alist to permanently set the option as you'd like.
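A minimal sketch of checking and changing the coding system of the current buffer. The choice of utf-8-unix is an assumption for illustration; substitute whatever coding system you actually want.

```elisp
;; Show the coding system of the current buffer in the echo area.
(message "This buffer uses: %S" buffer-file-coding-system)

;; Change the coding system used the next time this buffer is saved.
;; Assumption: UTF-8 with Unix line endings is what you want.
(set-buffer-file-coding-system 'utf-8-unix)
```

Both expressions can be run with M-: (eval-expression) while the file's buffer is current.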
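You may also use coding-system-for-write and coding-system-for-read. They override the coding system chosen for a write or read operation and are meant to be bound temporarily with let, not set globally.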
To read and write all ‘.txt’ files using the coding system chinese-iso-8bit, you can evaluate this Lisp expression:
(modify-coding-system-alist 'file "\\.txt\\'" 'chinese-iso-8bit)
For more information, see Recognizing Coding Systems.

How to check which Emacs I am using?

I have two Emacs (Aquamacs and text-based Emacs) on my Mac.
In my .emacs file, I can check if I'm using Aquamacs with ...
(boundp 'aquamacs-version)
How can I check if I'm using text based emacs?
EDIT
Jürgen Hötzel's answer works, but for text based emacs, using
(unless (null window-system) ...)
is better as (window-system) is not defined.
M-x emacs-version
Sorry, from .emacs, just call
(emacs-version)
I know this question was answered a long time ago, but I found another answer by typing emacs --help. This gives a list of options in which you can find emacs --version.
In my case, emacs --version prints: GNU Emacs 24.3.1.
I have only tested this solution on Linux with my current version of Emacs. I do not know if the same solution applies to Windows, or to older versions of Emacs, but in theory it should.
(if (window-system)
    "window-based"
  "text-based")
Or, you could use this:
(if (or (eq window-system 'ns)
        (eq window-system 'mac))
    (message "hello, world!"))
It will only print "hello, world!" when you run a graphical Emacs in OS X.
Errr... (not (boundp 'aquamacs-version)), perhaps?
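The checks from the answers above can be combined into one function. This is a sketch only: the name my-emacs-flavor is hypothetical, and it uses display-graphic-p, which is available in Emacs 23 and later, instead of testing window-system directly.

```elisp
(defun my-emacs-flavor ()
  "Return a symbol describing the running Emacs.
Hypothetical helper: `aquamacs', `graphical', or `text'."
  (cond
   ;; Aquamacs defines this variable; other Emacsen do not.
   ((boundp 'aquamacs-version) 'aquamacs)
   ;; Any other Emacs running with a graphical display.
   ((display-graphic-p) 'graphical)
   ;; Emacs running inside a terminal.
   (t 'text)))

;; Example use in .emacs:
(when (eq (my-emacs-flavor) 'text)
  (message "Running text-based Emacs"))
```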

How to configure GNU Emacs to write UNIX or DOS formatted files by default?

I've had these functions in my .emacs.el file for years:
(defun dos2unix ()
  "Convert a DOS formatted text buffer to UNIX format"
  (interactive)
  (set-buffer-file-coding-system 'undecided-unix nil))
(defun unix2dos ()
  "Convert a UNIX formatted text buffer to DOS format"
  (interactive)
  (set-buffer-file-coding-system 'undecided-dos nil))
These functions allow me to easily switch between formats, but I'm not sure how to configure Emacs to write in one particular format by default regardless of which platform I'm using. As it is now, when I run on Windows, Emacs saves in Windows format; when I run in UNIX/Linux, Emacs saves in UNIX format.
I'd like to instruct Emacs to write in UNIX format regardless of the platform on which I'm running. How do I do this?
Should I perhaps add some text mode hook that calls one of these functions? For example, if I'm on Windows, then call dos2unix when I find a text file?
I've got a bunch of these in my .emacs:
(setq-default buffer-file-coding-system 'utf-8-unix)
(setq-default default-buffer-file-coding-system 'utf-8-unix)
(set-default-coding-systems 'utf-8-unix)
(prefer-coding-system 'utf-8-unix)
I don't know which is right, I am just superstitious.
I up-voted question and answer, but spent a couple minutes possibly improving on the info, so I'll add it.
First, I checked documentation on each variable and function in user181548's answer, by (first cutting and pasting into Emacs, then) putting cursor over each, and typing C-h v RET and C-h f RET respectively.
This suggested that I might only need
(prefer-coding-system 'utf-8-unix)
Experimenting with the other lines didn't seem to change pre-existing buffer encodings (typing C-h C RET RET to check (describe-coding-system) and g each time to refresh), so I omitted the other lines and made a key-binding to quickly change any old files that were still DOS, that is,
(defun set-bfr-to-8-unx ()
  (interactive)
  (set-buffer-file-coding-system
   'utf-8-unix)
  )
(global-set-key (kbd "C-c u")
                'set-bfr-to-8-unx
                )
For the curious, to discover the 3rd and 4th line of above function, (set-buffer-file-coding-system 'utf-8-unix), I used C-x RET f RET to manually change the current buffer's encoding, then M-x command-history RET to see how those keys translate to code.
Now maybe my git commits will stop whining about CRs.
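The hook idea raised in the question could be sketched as follows. The function name my-force-unix-eol is hypothetical. The sketch assumes you really want every DOS-format file switched to Unix line endings as soon as it is opened. Note that set-buffer-file-coding-system marks the buffer as modified, so such files will appear changed right after opening.

```elisp
;; Prefer UTF-8 with Unix line endings for new files.
(prefer-coding-system 'utf-8-unix)

(defun my-force-unix-eol ()
  "Switch a buffer visiting a DOS-format file to Unix line endings.
Hypothetical helper; keeps the text encoding and changes only the
end-of-line convention."
  (when (and buffer-file-name
             buffer-file-coding-system
             ;; `coding-system-eol-type' returns 1 for CRLF (DOS).
             (eq (coding-system-eol-type buffer-file-coding-system) 1))
    (set-buffer-file-coding-system
     (coding-system-change-eol-conversion buffer-file-coding-system
                                          'unix))))

;; Run the check whenever a file is opened.
(add-hook 'find-file-hook #'my-force-unix-eol)
```

The file on disk is only converted when the buffer is saved. If converting every file automatically is too aggressive, the key binding approach above remains the safer choice.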

Hiding ^M in Emacs

Sometimes I need to read log files that have ^M (control-M) in the line endings. I can do a global replace to get rid of them, but then something more is logged to the log file and, of course, they all come back.
Setting Unix-style or dos-style end-of-line encoding doesn't seem to make much difference (but Unix-style is my default). I'm using the undecided-(unix|dos) coding system.
I'm on Windows, reading log files created by log4net (although log4net obviously isn't the only source of this annoyance).
(defun remove-dos-eol ()
  "Do not show ^M in files containing mixed UNIX and DOS line endings."
  (interactive)
  (setq buffer-display-table (make-display-table))
  (aset buffer-display-table ?\^M []))
Solution by Johan Bockgård. I found it here.
Modern versions of emacs know how to handle both UNIX and DOS line endings, so when ^M shows up in the file, it means that there's a mixture of both in the file. When there is such a mixture, emacs defaults to UNIX mode, so the ^Ms are visible. The real fix is to fix the program creating the file so that it uses consistent line-endings.
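To check whether a buffer really contains such a mixture, you can count the lines that end in a carriage return. This is a small sketch; the function name my-count-cr-line-endings is hypothetical. It assumes the buffer was decoded with Unix line endings, which is the case described above where the ^Ms are visible.

```elisp
(defun my-count-cr-line-endings ()
  "Report how many lines in the buffer end with a carriage return.
Hypothetical helper, not part of Emacs."
  (interactive)
  (let ((cr-lines (how-many "\r$" (point-min) (point-max)))
        (all-lines (count-lines (point-min) (point-max))))
    (message "%d of %d lines end with ^M" cr-lines all-lines)))
```

If the first number is greater than zero but smaller than the second, the file mixes DOS and Unix line endings.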
What about?
C-x RET c dos RET C-x C-f FILENAME RET
I made a file that has two lines, with the second having a carriage return. Emacs would open the file in Unix coding, and switching coding system does nothing. However, the universal-coding-system-argument above works.
I believe you can change the line coding system the file is using to the Unix format with
C-x RET f UNIX RET
If you do that, the mode line should change to add the word "(Unix)", and all those ^M's should go away.
If you'd like to view the log files and simply hide the ^M's rather than actually replace them, you can use Drew Adams's highlight extension to do so.
You can either write elisp code or make a keyboard macro to do the following:
1. select the whole buffer
2. hlt-highlight-regexp-region
3. C-q C-M
4. hlt-hide-default-face
This will first highlight the ^M's and then hide them. If you want them back, use `hlt-show-default-face'.
Edric's answer should get more attention. Johan Bockgård's solution does address the poster's complaint, insofar as it makes the ^M's invisible, but that just masks the underlying problem, and encourages further mixing of Unix and DOS line-endings.
The proper solution would be to do a global M-x replace-regexp to turn all line endings to DOS ones (or Unix, as the case may be). Then close and reopen the file (not sure if M-x revert-buffer would be enough) and the ^M's will either all be invisible, or all be gone.
You can change the display-table entry of the Control-M (^M) character, to make it displayable as whitespace or even disappear totally (vacuous). See the code in library pp-c-l.el (Pretty Control-L) for inspiration. It displays ^L chars in an arbitrary way.
Edited: Oops, I just noticed that #binOr already mentioned this method.
Put this in your .emacs:
(defun dos2unix ()
  "Replace DOS eolns CR LF with Unix eolns LF"
  (interactive)
  (goto-char (point-min))
  (while (search-forward "\r" nil t) (replace-match "")))
Now you can simply call dos2unix and remove all the ^M characters.
If you encounter ^Ms in received mail in Gnus, you can use W c (wash CRs), or
(setq gnus-treat-strip-cr t)
What about using dos2unix and unix2dos (now tofrodos)?
sudeepdino008's answer did not work for me (I could not comment on his answer, so I had to add my own answer.).
I was able to fix it using this code:
(defun dos2unix ()
  "Replace DOS eolns CR LF with Unix eolns LF"
  (interactive)
  (goto-char (point-min))
  (while (search-forward (string ?\C-m) nil t) (replace-match "")))
As binOr said, add this to your %APPDATA%\.emacs.d\init.el on Windows, or wherever your config is.
;; Windows EOL
(defun hide-dos-eol ()
  "Hide ^M in files containing mixed UNIX and DOS line endings."
  (interactive)
  (setq buffer-display-table (make-display-table))
  ;; An empty vector means the character is displayed as nothing.
  (aset buffer-display-table ?\^M []))
(defun show-dos-eol ()
  "Show ^M in files containing mixed UNIX and DOS line endings."
  (interactive)
  (setq buffer-display-table (make-display-table))
  ;; nil restores the default display of the character.
  (aset buffer-display-table ?\^M nil))
(add-hook 'text-mode-hook 'hide-dos-eol)