Emacs automatic encoding conversion

When I open a buffer in Emacs containing a German Umlaut (the word "PrÃ¤sentation" occurs in a string), Emacs automatically converts it to a different encoding as soon as I save the file.
How can I tell Emacs to leave the encoding alone?

M-x set-buffer-file-coding-system is what you are looking for.
You might also have a look at http://www.delorie.com/gnu/docs/emacs/emacs_221.html.
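A minimal sketch of that suggestion, assuming GNU Emacs. The question does not say which encoding the file uses, so `iso-latin-1` below is only a placeholder. Evaluate each form with M-: in the buffer visiting the file, not in *scratch*:

```elisp
;; Show the coding system Emacs will use when this buffer is saved.
(message "Will save as: %s" buffer-file-coding-system)

;; Ask Emacs to save this buffer as Latin-1 (what C-x RET f does interactively).
;; `iso-latin-1' is an assumption; use the coding system your file really has.
(set-buffer-file-coding-system 'iso-latin-1)
```

The second form marks the buffer as modified, so the next save writes the file in the chosen coding system.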

Perhaps you want find-file-literally.

I don't know whether it's a bug in Stack Overflow, or whether you really see an A with a tilde (Ã) and the generic currency sign (¤) in Emacs. If the problem is Emacs displaying the wrong characters, then the following might help:
(prefer-coding-system 'utf-8)
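The "Ã¤" pair in the question is what the UTF-8 bytes of "ä" look like when they are decoded as Latin-1. A small sketch that reproduces it, assuming GNU Emacs (evaluate it in *scratch* with C-x C-e); the sample word is the one from the question:

```elisp
;; Encode the intended word as UTF-8, then decode those bytes as Latin-1.
;; The two bytes of "ä" (#xC3 #xA4) come back as the two characters "Ã¤".
(decode-coding-string
 (encode-coding-string "Präsentation" 'utf-8)
 'latin-1)
;; evaluates to "PrÃ¤sentation"
```

If this matches what you see, the file is probably UTF-8 that Emacs is reading as Latin-1, rather than a file that Emacs is corrupting on save.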

Related

recognising encodings in emacs

It is my understanding that txt files do not store encoding information, so text editors simply make an educated guess about the encoding of a given text file and then display the file using that guess. If the editor guesses right, you get your text on the screen; if it guesses wrong, you (sometimes) get gibberish. Am I getting this right so far?
Now on to my problem. I have my bank statements in a csv file. When I open it in MS Excel 14 (MS Office 2010), it recognises the encoding and displays the problematic word as "obračun". Great. When I open the file in Emacs 24.3.1, it fails to recognise the correct encoding and displays the problematic word as "obra鑾n". Not so great.
My question is: how do I tell Emacs which encoding the file is in?
Thanks.

From the Emacs Manual:
If Emacs recognizes the encoding of a file incorrectly, you can reread
the file using the correct coding system with C-x RET r
(revert-buffer-with-coding-system). This command prompts for the
coding system to use.
Give utf-16 a try.
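The same re-read can be scripted. The sketch below assumes GNU Emacs and uses `windows-1250` purely as a guess for a Central European file produced on Windows; the thread does not establish the file's real encoding, and the helper name is hypothetical:

```elisp
;; Hypothetical helper: re-read the visited file as windows-1250.
;; The coding system is a guess; substitute the one that matches your file.
(defun my/reread-as-windows-1250 ()
  "Re-read the current buffer's file, decoding it as windows-1250."
  (interactive)
  (revert-buffer-with-coding-system 'windows-1250))

;; Optionally make that the default for files whose names end in .csv.
(modify-coding-system-alist 'file "\\.csv\\'" 'windows-1250)
```

Call the helper with M-x my/reread-as-windows-1250 in the buffer that shows the garbled text. If the word still looks wrong, try another coding system from M-x list-coding-systems.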

Fast unicode input in Emacs with US layout

I would like to have a quick way to input unicode characters with multicharacter sequences. For example to input ä I would type \a. Searching for this, I found agda-input.
While I could adapt agda-input for my use, I don't really need the whole Emacs mode for my purpose. So I was wondering if such a thing already exists.
It is probably also not that difficult to code such an input mode. I would appreciate it if someone suggested how to do that.

As @legoscia mentioned, you can use the TeX input method for such things, which is probably more general than agda-input (which seems to be specific for a programming language) and is also built in.
(setq default-input-method "TeX")
Then switch to the input method with C-\ or M-x toggle-input-method. You can then type "ä" with \"a. The minibuffer has hints when you type \.
There are other input methods (M-x list-input-methods), but TeX is a good one if you're not concerned with a specific language, or if you know LaTeX.
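If the TeX sequences are not the ones you want, a small input method of your own can be defined with Quail, the library the built-in input methods use. This is a sketch under assumptions: the package name "my-umlauts" and the key sequences are made up for illustration, following the \a example from the question:

```elisp
(require 'quail)

;; Hypothetical package name "my-umlauts". "UTF-8" is the language
;; environment it is listed under, and "ä" is the mode-line indicator.
(quail-define-package
 "my-umlauts" "UTF-8" "ä" t
 "Backslash sequences for German umlauts (illustrative sketch).")

;; Each rule maps a key sequence to the character it inserts.
(quail-define-rules
 ("\\a" ?ä) ("\\o" ?ö) ("\\u" ?ü)
 ("\\A" ?Ä) ("\\O" ?Ö) ("\\U" ?Ü)
 ("\\s" ?ß))
```

After evaluating this, M-x set-input-method RET my-umlauts RET should activate it, and C-\ toggles it like any other input method.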

Emacs - how to avoid or replace wrong character encodings?

Assume that I receive a Spanish text written in MS word and saved as plain text (.txt). Unfortunately, all the Spanish accents show up like this:
Un \372ltimo an\341lisis
Can anybody tell me how I can avoid this, or at least how I can replace these characters? They are simply not found by the replace-regexp functions; otherwise I could write a little elisp function that replaces every occurrence of them with the associated Spanish accented character.

This looks like ISO 8859-1 (Latin-1) encoding.
Visit the file with that coding system instead. If Emacs does not automatically identify the coding system, you can revisit the file with an explicit coding system with revert-buffer-with-coding-system (C-x RET r).
For example, if you are looking at the garbled file you describe,
C-x RET r
latin-1 RET
yes RET
Then you can set the coding system you want for saving (C-x RET f) by specifying something like utf-8.
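The same two steps (read as Latin-1, write as UTF-8) can be done without visiting the file. A minimal sketch, assuming GNU Emacs; the function name and its two arguments are hypothetical:

```elisp
;; Hypothetical helper; the name and both arguments are illustrative.
(defun my/latin-1-file-to-utf-8 (in out)
  "Read file IN as Latin-1 and write its text to file OUT as UTF-8."
  (with-temp-buffer
    ;; Force the decoding instead of letting Emacs guess.
    (let ((coding-system-for-read 'latin-1))
      (insert-file-contents in))
    ;; Force the encoding used for the bytes written to OUT.
    (let ((coding-system-for-write 'utf-8))
      (write-region (point-min) (point-max) out))))

;; The escapes from the question decode as expected under Latin-1:
(decode-coding-string "Un \372ltimo an\341lisis" 'latin-1)
;; evaluates to "Un último análisis"
```

Writing to a separate output file keeps the original intact in case Latin-1 turns out to be the wrong guess.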

Emacs c-mode can't recognize utf-8?

I need to read a C++ header file that contains some Chinese text and is encoded in UTF-8.
Emacs should recognize this encoding, but in c-mode the Chinese text is not displayed correctly.
When I switch the buffer to text-mode, it works.
I also tested python-mode, lisp-mode, etc. They all work; only c-mode, c++-mode, and java-mode fail, so it seems there is something wrong with cc-mode, or with cc-vars?
Please help me if you know how to fix this weird problem.

That looks more like a missing font (rather than encoding) issue; i.e., your system lacks a properly configured Chinese italic font.

Actually, it is arguably a bug in Emacs: it should fall back to some other font (non-italic, if needed) rather than display blank squares. We have fixed a few such problems over the years, so try the latest Emacs-24 pretest to see if the bug is already fixed there, and otherwise use M-x report-emacs-bug.

xemacs: dotemacs config so that one can paste without getting "funny" chars

When I copy text from websites in a browser and paste it into an xemacs (21.4) buffer, tildes, quotes, etc. don't copy correctly.
Example: he’s a dummy -> he\222s a dummy.
Can YOU copy & paste it without problems? If so, please help: how do I configure my .emacs to solve this? Thanks.

Fire this in your .emacs:
(set-clipboard-coding-system 'utf-16le-dos)
That should do it. Don't forget to hit C-x C-e on that statement, or restart xemacs.

This isn’t a clipboard or cygwin problem. If you save a UTF-8 text file with curly quotes in notepad and open it in XEmacs 21.4, you’ll get junk. According to the XEmacs reference documentation, Unicode is not supported before version 21.5.6. Maybe try a later version?

You're attempting to copy+paste smart quotes into XEmacs. In this case, '\222' is the octal escape for the byte 0x92, which is how the code page Windows-1252 encodes the character RIGHT SINGLE QUOTATION MARK (U+2019).
XEmacs uses UTF-8 internally, so you'll have to configure the copy+paste to convert from Windows-1252 to UTF-8. I don't know how to do that.
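A small check of that byte-to-character mapping, assuming GNU Emacs 23 or later rather than XEmacs 21.4, since it relies on `unibyte-string` and the `windows-1252` coding system:

```elisp
;; Octal 222 is the single byte #x92; decode it as Windows-1252.
(decode-coding-string (unibyte-string #o222) 'windows-1252)
;; evaluates to "’" (U+2019, RIGHT SINGLE QUOTATION MARK)
```

This only illustrates the mapping; it does not configure the XEmacs clipboard.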

Simplest thing to do is write a quick function that translates those characters using replace-string.
You could also have xemacs set to accept that code page directly.

Switch to emacs, it works like a champ (GNU Emacs 23.0.91.1 (i386-mingw-nt6.0.6002) from Emacsw32 here). This may be the Emacsw32 patches in action.
