How to set up Visual Studio Code to detect and set the correct encoding on file open

I recently started to use Visual Studio Code on Server Systems where I did not have Studio IDE installed. I like it very much but I'm running into a problem.
When I opened a file in Notepad++, which I used before, the editor detected the encoding and set it for me. I have many files on Windows servers that are still in windows-1252, but vscode just uses UTF-8 by default.
I know I can reopen with encoding Western (Windows 1252), but I often forget to, and I have sometimes destroyed content by saving the file.
I have not found a setting for this yet. Is there a way to make vscode detect the encoding and set it automatically when I open a file?

To allow Visual Studio Code to automatically detect the encoding of a file, you can set "files.autoGuessEncoding":true (in the settings.json configuration file).
https://github.com/Microsoft/vscode/pull/21416
This obviously requires a newer version of the application than the one available when the question was originally asked.

Go to File -> Preferences -> User Settings.
Add (or update) the entry "files.encoding": "windows1252" in the right editor window and save.
Now VSCode opens all text files using windows-1252 when there is no proper encoding information set.
EDIT:
In the June 2017 release the files.autoGuessEncoding setting was introduced. When enabled, it guesses the file's encoding as well as it can. Its default value is false.

The same setting can be enabled through the settings UI:
File >> Preferences >> Settings
Enter autoGuessEncoding in the search box and make sure the checkbox is checked.

Beware: auto guessing in vscode still does not work as expected. The guessing is very inaccurate, and the file is still opened in the guessed encoding even when the guessing library also reports a low confidence score. They use jschardet (https://www.npmjs.com/package/jschardet).
If the confidence of the guess is not close to 100%, vscode should open the file in "files.encoding" instead of the guessed encoding, but that does not happen. The vscode authors should make better use of the confidence score jschardet returns.
I open either UTF-8 files, which are guessed correctly, or windows-1250 files, which in 99% of cases are wrongly detected as some other windows-... encoding or even iso-8859-... and such. I cry daily at the computer having to experience this.
Tuning the confidence check and falling back to the default encoding would do. It needs someone skilled to check their source and offer them a fix.
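The fallback described above can be sketched in a few lines. This is only an illustration of the idea, not VS Code's actual code: the function names, the 0.9 threshold, and the cp1250 default are all assumptions, and the guess is modelled as a plain dictionary with "encoding" and "confidence" keys, which is the shape jschardet reports.

```python
from typing import Optional


def choose_encoding(guess: dict, default: str = "cp1250",
                    min_confidence: float = 0.9) -> str:
    """Return the guessed encoding only if the detector is confident enough.

    `guess` is a hypothetical detector result such as
    {"encoding": "windows-1252", "confidence": 0.35}.
    """
    encoding: Optional[str] = guess.get("encoding")
    confidence: float = guess.get("confidence") or 0.0
    if encoding and confidence >= min_confidence:
        return encoding
    # Low confidence or no guess at all: use the configured default instead.
    return default


def decode_bytes(data: bytes, guess: dict, default: str = "cp1250") -> str:
    """Decode file content, falling back to the default encoding."""
    encoding = choose_encoding(guess, default=default)
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown encoding name or undecodable bytes: fall back once more.
        return data.decode(default, errors="replace")
```

With this policy a low-confidence guess of some other windows-... encoding would be ignored and the file would be read with the configured default.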

From Peminator's answer:
Beware: auto guessing in VSCode still does not work as expected, the guessing, is VERY inaccurate,
This should improve slightly with VSCode 1.67 (Apr. 2022), already available in Insiders, which includes:
Allow to set files.encoding as language specific setting for files on startup
we now detect changes to the editor language and there is almost always a transition from plaintext in the beginning until languages are resolved to a target language
If we detect that the new language has a configured files.encoding override, that this encoding is different from the current encoding, and that the editor is not dirty or in save conflict resolution, we reopen the file with the configured encoding.
Unfortunately I cannot tell apart this from happening vs. the user changing the language from the editor status bar.
So if the user changes the language mode from bat to ps1 and ps1 has a configured encoding, the encoding will change.

Related

Disable competing language servers / code highlighters

I want to add a language server to handle completion / highlighting / etc. for a file.
As a basis for testing I am using an example from Microsoft (https://github.com/microsoft/vscode-extension-samples/tree/main/lsp-sample), which I changed to be active for any file. This language server highlights any word in all capital letters.
When opening a C++ file, I get the highlighting / completion of my language server and the default one for C++.
I would like to detect whether some other extension or built-in highlighter is active for a file and deactivate it for this workspace (or for the current file, if that is impossible for the whole workspace).
Is there a way to do this in a generic way where I do not have to know which extensions are highlighting code?
If no, is there a way to do this, if I know a set of extensions I want to deactivate?
I finally had enough time to experiment a bit more and found that providing your own language for the same file extension is enough.
In package.json I added an element in contributes.languages with extensions containing .cpp (since I am using cpp files for testing).
I also copied over some implementation code from an example.
This suppressed the default highlighter and code completion for cpp files. Since I am only implementing a semantic token provider, I can see the default highlighting before my provider takes over. (I think this could be solved by adding a syntax highlighter, but this is already sufficient for my preliminary testing.)
I am not sure how stable all of this would be as a plugin (I only tested in Extension Development Host mode).

How to prevent auto save when executing code in Visual Studio

When I run my code in Visual Studio, it auto saves my code -without prompting (!)- even though I have auto-save set to off. Is there a way to prevent this?
I've made sure auto-save is off in preferences. I haven't seen any setting that applies specifically to "on execute"
As an example of how this made me lose some hair, I opened a file that had some unknown characters (which opened just fine in Powershell ISE) and I didn't know until the code failed when I executed it. And since it auto-saved on execute, it wrote the bad characters to the file, corrupting it for all eternity. Not to mention, if I modify a few characters to test something, I don't want it to auto-save for me.
Search for autosave under File -> Preferences -> Settings and turn it off.

Netbeans file cannot be safely opened

I get files from a friend who doesn't use the NetBeans IDE. When I open a file that contains special characters like 'é', 'à', ... it shows me a popup message saying that the file cannot be safely opened.
If I say yes, it opens the file and changes those characters to '�'.
Any idea how to open the file safely?
The letters you are mentioning seem to be French. You need to open the file, specifying the original encoding, then save the file as UTF-8
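A small sketch may make it clearer why the characters turn into '�' and what "open with the original encoding, then save as UTF-8" does to the bytes. It assumes the files were written as windows-1252 (cp1252), which the question does not state; the function name is made up for illustration.

```python
def reencode_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes using the original encoding, then encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Bytes as an editor using cp1252 (an assumption) would have written them.
raw = "é, à".encode("cp1252")

# Reading those bytes as UTF-8 is what produces the replacement character.
misread = raw.decode("utf-8", errors="replace")

# Decoding with the original encoding first keeps the accented letters.
fixed = reencode_to_utf8(raw).decode("utf-8")
```

Once a file has been saved with the replacement characters in it, the original bytes are gone, so the conversion has to happen before the misread version is written back.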
I recently encountered a very similar problem (I have some javascript files in Chinese which translated into similar non-human readable text upon re-opening the file in NetBeans).
My OS: Linux Mint (version 17, Cinnamon; Notepad++ not available and gedit did not solve the problem).
Netbeans Version: 8.0.1
However, I was blessed to have found the history feature! I was able to get a former version of my file restored and backed it up immediately.
To access a file's history, click the History button on the left side of the toolbar, between the tabs of open files at the top of the IDE and the actual source code. (You can also right-click the file name and select History -> Show History.) Then double-click a timestamp representing a valid version of your file. Just below the table of timestamps, the old 'backup' file and the current 'corrupted' file should appear side by side. (You can preview several historical versions of the file until you find the one that works best for you; I suggest choosing one that is still usable and has the most recent timestamp.) Right-click the 'backup' version of your choice -> Revert from History. Then click the Source button, right next to the History button, to return to the code.
Finally, to change the default encoding, I applied the fix suggested by Sebas and Danny here:
How to change file encoding in NetBeans?
Please note that the path to the netbeans.conf file is different (at least with version 8.0.1 on my Linux machine). The path on my machine was : ~/netbeans-8.0.1/etc/netbeans.conf.
This saved the day for me and I hope it helps someone else out there! Bonne chance.

R character encodings across windows, mac and linux

I use OS X, and I am currently collaborating with a Windows user and deploying the scripts on a Linux server. We use git for version control, and I keep getting R scripts from his end in which latin1 and UTF-8 encodings are mixed. So I have a couple of questions.
Is there a simple-to-use editor for Windows that handles UTF-8 with more grace than WinEdt, which my coauthor currently uses? I use emacs, but I am having a hard time persuading him to switch.
How do I set up R on Windows so that it defaults to reading and writing UTF-8?
This is driving me crazy. Has anyone found a solution for it (be it in the workflow or in the software used) who cares to share?
Take a look at ?Encoding to set the encoding for specific objects.
You might have luck with options(encoding = ) see ?options, (disclaimer, I don't have a windows machine)
As for editors, I haven't heard complaints about encoding issues with Crimson editor which lists utf-8 support as a feature.
TextPad is a well featured editor supporting R syntax that allows you to specify the target platform for files (Win/UNIX/Mac/keep current encoding) when you save them. The only problem with it is that some of the keyboard shortcuts are nonstandard (e.g. 'Find' is F5, not F3).
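On the workflow side, one option is to check scripts for stray non-UTF-8 bytes before they are committed. The following is only a sketch of that idea, written in Python rather than R; the function name is hypothetical, and it reports line numbers so that a file mixing latin1 and UTF-8 can be fixed by hand.

```python
from typing import List


def non_utf8_lines(data: bytes) -> List[int]:
    """Return the 1-based numbers of lines that are not valid UTF-8."""
    bad = []
    # Splitting on b"\n" is safe: that byte never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad
```

Called on the bytes of each script (for example from a pre-commit hook), an empty result means the whole file decodes as UTF-8.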

Tool to convert code source from a codepage to UTF-8?

I'm working on an open source project. The original project contains comments in Russian and uses codepage 1251. I'm using codepage 1252, so the Russian comments aren't displayed correctly in Visual Studio Express 2008. That's not nice, but I can't read Russian anyway. Someone using codepage 950 (traditional Chinese) tried to compile the project and was unable to, because of the code page! Now it is really annoying.
I think that using Unicode (more precisely, UTF-8 with signature) as the file format for the source code is the way to go.
Problem: how to convert the whole source code easily?
I have already thought about:
Letting Visual Studio save the source code as UTF-8. But my computer is using codepage 1252, and I found no way to tell VS that the original source code is using codepage 1251, so the conversion won't be correct.
Edit: As pointed out by "LicenseQ", there is a way to open a single file in VS with another encoding: click the arrow near the Open button in the open dialog, choose "Open With" and then choose "Code Editor (with encoding)".
Of course I could change the codepage of my computer for the duration of the conversion. But it's a global setting in Windows and you need to reboot the computer, so I'm looking for a friendlier solution.
I've found a tool called CodePageConverter which does exactly what I need, but it can't run as a batch job.
Does anyone know another tool (a command line tool would be perfect) to convert from a codepage to UTF-8?
Edit: As suggested by tkotitan, iconv seems to be the solution I was looking for. There is a Windows version of iconv. And now that I know the name of this tool, I was able to find other posts on stackoverflow dealing with analogous issues.
In a unix world the utility is called iconv.
Not sure if there is a windows equivalent.
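If iconv is not at hand, the same batch conversion can be sketched with a short Python script. This is an illustration under assumptions, not a tested tool for this project: the function name and the file patterns are hypothetical, and "utf-8-sig" is Python's name for UTF-8 with a signature (BOM), which is what the question asks for.

```python
import codecs
from pathlib import Path


def convert_tree(root: Path, patterns=("*.cpp", "*.h"),
                 source_encoding: str = "cp1251",
                 target_encoding: str = "utf-8-sig") -> list:
    """Re-encode matching files under `root` in place and return their paths.

    The patterns are an assumption; adjust them to the project's file types.
    """
    converted = []
    for pattern in patterns:
        for path in sorted(root.rglob(pattern)):
            raw = path.read_bytes()
            # Skip files that already start with a UTF-8 signature, so that
            # running the script twice does not convert a file twice.
            if raw.startswith(codecs.BOM_UTF8):
                continue
            # Raises UnicodeDecodeError if a byte is undefined in cp1251.
            text = raw.decode(source_encoding)
            path.write_bytes(text.encode(target_encoding))
            converted.append(path)
    return converted
```

Working on bytes rather than text mode leaves line endings untouched. Since the files are rewritten in place, it is safer to run this on a clean checkout under version control.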
You can ask VS 2008 to open a file with a specific encoding (click the arrow near the Open button in the open dialog).
Or you can change the regional settings to make the Russian region the default ;)