I had been cheerfully hacking away on a UTF-8 file in emacs (version 23.3.1) when I was called away from my computer. When I returned, my session (it was on a remote system) had closed, so I logged back in and re-opened my file. As might be expected, I was alerted that there was a more recent auto-saved version and asked whether I wanted to restore it. Sure.
When it pulled in the auto-saved version, all my high-bit characters, which had previously displayed fine on the screen, were converted into number strings that I assume were references to code points or something; when I attempted to save, Emacs then threw an error saying it couldn't save the file with the present encoding.
Particularly irritating was that when I opened the auto-saved version in emacs directly, the encoding was correct (well, apparently so -- my high-bit characters all appeared correctly).
Why, when I attempted to recover the file, even though apparently the autosaved file was in the right encoding, did emacs get confused about my encoding and choke? Or more to the point: how do I get emacs to not do that anymore, and to keep everything in UTF-8 where it belongs?
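One way to bias Emacs toward "keeping everything in UTF-8" is to declare UTF-8 as the preferred coding system in the init file. This is a minimal sketch, not a guaranteed fix for the recovery behavior described above; it only tells Emacs to prefer UTF-8 wherever the encoding is ambiguous:

```elisp
;; ~/.emacs or ~/.emacs.d/init.el
;; Prefer UTF-8 for new files, subprocess I/O, keyboard, and terminal.
;; This won't retroactively repair a mis-decoded auto-save recovery,
;; but it biases Emacs toward UTF-8 in ambiguous cases.
(prefer-coding-system 'utf-8)
(set-default-coding-systems 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-keyboard-coding-system 'utf-8)
```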
Related
I recently started using Emacs on Android in Termux. My projects contain some Chinese characters, which are displayed there just fine.
When I later opened the same file in Emacs on Windows, I was disappointed to see them displayed as \xxxx.
I am not sure how to search for a solution, because I do not know what the problem is.
The only thing I found related to my problem is this:
Unicode characters in emacs term-mode
but it did not help me solve the issue.
You can tell what's going on by looking at the first few characters in the mode line. In Termux, it says UUU, but in Windows it says DDU. These three characters stand for:
the coding system for keyboard input
the coding system for terminal output
the coding system of the file in the buffer
U stands for UTF-8, while D stands for various DOS code pages. (You can find this using M-x list-coding-systems. This is all described in the Mode Line section of the Emacs manual.)
So this means that Emacs is reading the file correctly, but it thinks that the terminal is unable to display the Chinese characters, so it uses the \uxxxx notation as a fallback. I'm not sure how to get this to work properly in a Windows terminal, but try M-x set-terminal-coding-system with utf-8 - it might just work.
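To make that setting persistent, the equivalent in an init file would be something like:

```elisp
;; ~/.emacs.d/init.el
;; Tell Emacs the terminal can display UTF-8, so it stops falling back
;; to the \uxxxx escape notation for characters it considers undisplayable.
(set-terminal-coding-system 'utf-8)
```

Note that this only tells Emacs what the terminal supports; the Windows terminal itself must also actually be in a UTF-8 code page (e.g. `chcp 65001`) for the characters to render.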
As an aside, if you run Emacs as a "normal" Windows application instead of in a terminal, the characters should display correctly automatically, so if there is a particular problem preventing you from doing so, it might be worth trying to fix that instead.
I recently started to use Visual Studio Code on server systems where I did not have the Visual Studio IDE installed. I like it very much, but I'm running into a problem.
When I open a file (I used Notepad++ before), that editor detects the encoding and sets it for me. I have many files on Windows servers that are still encoded in windows-1252, but vscode just uses UTF-8 by default.
I know I can reopen with encoding Western (Windows 1252), but I often forget, and I have sometimes destroyed content while saving.
I have not found any setting for this yet: is there a way to make vscode detect the encoding and set it automatically when I open a file?
To allow Visual Studio Code to automatically detect the encoding of a file, you can set "files.autoGuessEncoding":true (in the settings.json configuration file).
https://github.com/Microsoft/vscode/pull/21416
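For reference, VS Code's settings.json accepts comments, so the setting looks like this:

```jsonc
{
  // Let VS Code guess the encoding (e.g. windows-1252) instead of
  // assuming UTF-8 when a file without encoding metadata is opened.
  "files.autoGuessEncoding": true
}
```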
This obviously requires a more recent version of the application than when the question was originally asked.
Go to File-> Preferences -> User Settings
Add (or update) the entry "files.encoding": "windows1252" in the right-hand editor pane and save.
Now VSCode opens all text files using windows-1252 when there is no proper encoding information set.
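The resulting settings.json entry would be:

```jsonc
{
  // Default encoding for files that carry no encoding information.
  // Note the setting value has no hyphen: "windows1252", not "windows-1252".
  "files.encoding": "windows1252"
}
```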
EDIT:
In the June 2017 release the files.autoGuessEncoding setting was introduced. When enabled, it will guess the file's encoding as well as possible. Its default value is false.
Step-by-step guide:
File >> Preferences >> Settings
Enter autoGuessEncoding and make sure checkbox is checked
Beware: auto-guessing in vscode still does not work as expected. The guessing is VERY inaccurate, and vscode still opens the file with the guessed encoding even when the detection library reports a low confidence score — they use jschardet (https://www.npmjs.com/package/jschardet).
If the confidence of the guess is not close to 100%, vscode should simply open the file with "files.encoding" instead of the guessed encoding, but that does not happen; the vscode authors should make better use of the confidence score jschardet returns.
I open mostly UTF-8, which guesses OK, and the second encoding I use is windows-1250, which in 99% of cases is detected wrongly as some other windows-* encoding or even iso-8859-* and such. I cry daily at the computer having to experience this.
Tuning up the confidence check and falling back to the default encoding would do it; it needs someone skilled to check their source and offer them a fix.
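The fallback being asked for can be sketched in a few lines. This is a hypothetical helper, not VS Code's actual code: it takes a jschardet-style detection result ({ encoding, confidence }) and only trusts the guess when the confidence clears a threshold, otherwise returning the user's configured "files.encoding":

```javascript
// Hypothetical helper: trust the guessed encoding only when the detector
// (e.g. the result of jschardet.detect(buffer)) is confident enough;
// otherwise fall back to the configured "files.encoding" value.
function chooseEncoding(guess, fallback, threshold = 0.8) {
  if (guess && guess.encoding && guess.confidence >= threshold) {
    return guess.encoding;
  }
  return fallback;
}

// A low-confidence guess falls back to the configured default:
console.log(chooseEncoding({ encoding: "ISO-8859-2", confidence: 0.3 }, "windows-1250"));
// → windows-1250
// A high-confidence guess is trusted:
console.log(chooseEncoding({ encoding: "UTF-8", confidence: 0.99 }, "windows-1250"));
// → UTF-8
```

The threshold value (0.8 here) is arbitrary; the point is only that a low score should route to the fallback rather than to the guess.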
From Peminator's answer:
Beware: auto-guessing in VSCode still does not work as expected; the guessing is VERY inaccurate.
This should slightly improve with VSCode 1.67 (Apr. 2022), released in insiders:
Allow to set files.encoding as language specific setting for files on startup
We now detect changes to the editor language; there is almost always a transition from plain text in the beginning until languages are resolved to a target language. If we detect that:
the new language has a configured files.encoding override,
that encoding is different from the current encoding, and
the editor is not dirty or in save-conflict resolution,
then we reopen the file with the configured encoding.
Unfortunately I cannot tell this apart from the user changing the language from the editor status bar.
So if the user changes the language mode from bat to ps1 and ps1 has a configured encoding, the encoding will change.
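A language-specific files.encoding override of the kind described here is written in settings.json like this (the language IDs and encodings below are illustrative, not from the original answer):

```jsonc
{
  // Default for everything else.
  "files.encoding": "utf8",
  // Files resolved to the PowerShell language reopen with this
  // encoding once the language is detected (VS Code 1.67+).
  "[powershell]": {
    "files.encoding": "windows1252"
  }
}
```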
I'm working with a git repository where some of the files are encoded in latin-1 and some of them in utf-8. I'm using Eclipse CDT to work with them, and it's configured to use UTF-8 as default encoding.
The thing is, when I open latin-1 encoded files, some of the characters are not shown properly, and even though I've just tried the Luna version, which came out two days ago, the problem persists (latin-1 and latin-2 are supposedly supported now, according to the release information).
Furthermore, and here comes the real trouble, when I modify and save latin-1 encoded files, they are being saved as UTF-8 (as configured in Eclipse), so if I push these changes to the repository, quite a lot of conflicts will emerge, messing up the entire commit.
Is there some way of telling Eclipse to keep the original encoding for each file?
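For what it's worth, Eclipse can store a per-file encoding (set via the file's Properties > Resource > Text file encoding), which it records in the project's .settings/org.eclipse.core.resources.prefs. A sketch of what that file might contain — the paths here are examples, not from the question:

```properties
# .settings/org.eclipse.core.resources.prefs
eclipse.preferences.version=1
# Project-wide default encoding.
encoding/<project>=UTF-8
# Per-file override for a legacy latin-1 source file.
encoding//src/legacy.c=ISO-8859-1
```

Committing this file to the repository would let every checkout keep the per-file encodings.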
Thank you.
I've been using NB for a long time, with different versions, but today something strange happened. I installed NB 7.0.1 and tried to compile some old projects, but it couldn't open one file, saying: "The file cannot be safely opened with encoding GBK. Do you want to continue opening it?" I pressed "Yes" and it opened with errors: lots of empty rectangles in place of the "." characters, and some strange Chinese/Japanese characters. This was a normal, good Java file which I had worked on in NB 6.9 and NB 7.0 without any problems; now NB 7.0.1 somehow can't open it. So I uninstalled NB 7.0.1 and tried to open the file with other editors like Notepad, WordPad, and NB 6.9, and now they all display strange characters. It seems NB 7.0.1 changed its encoding, or the reading of it. Has anyone had a similar problem, and how do I fix it?
I faced a similar issue before.
When I closed and reopened NetBeans, the old project opened normally.
I was digging through the header files for SDL in Linux when I tried to open the file from the SDL library called "SDL_opengl.h" in Emacs. For some reason, it always causes it to crash. It opens just fine in Vim and in gedit.
Has anyone else had an issue with Emacs just plain refusing to open a particular file? What sort of things should I look for to find what is causing the problem? Mind you, I was able to open every other "SDL_*.h" file in that directory; just that one gives me trouble.
Much appreciated in advance!
I would be interested to see the exact error message, and stack trace if possible.
I suspect file encoding, special characters, file size, cc-mode parsing, or something like that is the culprit. (Emacs 22 and libsdl1.2 on Ubuntu 9 with a UTF-8 screen work fine for me.)
Converting my comment into an answer because the comments get cut off.
Try loading the file with
M-x find-file-literally
Since this (appears to) resolve the issue for giogadi, I think that points to perhaps the colorization of the buffer. cc-mode does its own colorization...
Oh goodness, I'm a dunce.
So I apparently underestimated both the size of the file AND the speed of Emacs in opening said large files.
I decided to sit and wait to see if it dies completely on its own (as opposed to me xkill-ing it), and after a whole minute, the file is loaded.
So that solves one problem - the file is being loaded. However, why would Emacs take so long to do it? I have no strange settings enabled that should cause it to lag more than usual.
Do you have hilit-mode on, with hilit-auto-highlight-maxout set to a large value?
I have had the same problem with header files, so reduce that value.
Or maybe it is hs-mode (hideshow-mode)?