The documentation for the PCRE NSIS plugin states that it has some support for Unicode regular expressions.
My regular expression works with a non-Unicode NSIS build but not with the Unicode NSIS installer.
Is there anything special that must be done in order to pass Unicode-encoded strings through PCRE on Unicode NSIS?
The plugin seems to support searching in UTF-8 text, but it is not a plugin that supports the Unicode version of NSIS.
Its source code would have to be upgraded/ported to Unicode NSIS.
My Python file:
print('Amanhã')
I am using the integrated terminal in VSCode 1.28.1, on Windows 10 Pro.
When I activate a Python 3.6-based virtual environment then run this script, it executes as expected and I see Amanhã in the terminal.
But when I activate a Python 3.5-based virtual environment then run this script, it fails with a UnicodeEncodeError:
UnicodeEncodeError: 'charmap' codec can't encode character '\xe3' in position 5: character maps to <undefined>
If I run set PYTHONIOENCODING=utf8 in the 3.5-based environment and then execute the script, the Unicode error is gone, but the output is still not what I expect: the ã comes out as mojibake instead of Amanhã.
How can I see Amanhã in the 3.5-based venv?
(I replicated this in the normal Windows terminal (cmd.exe), not inside VSCode -- exact same result. I will also note that sys.getdefaultencoding() returns utf-8 both before and after the set PYTHONIOENCODING=utf8 command.)
Based on the incorrect output, your terminal is using cp437, which doesn't support the character ã.
Pre-Python 3.6, Python encodes Unicode to the encoding of the terminal on Windows. As of Python 3.6, Python uses Unicode Win32 APIs when writing to the terminal and, as you have found, works much better.
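A quick way to see this from inside each venv is to compare sys.getdefaultencoding() with sys.stdout.encoding; the values in the comments below are what you would typically see in a cmd.exe session without PYTHONIOENCODING set, not guaranteed values:

import sys
print(sys.getdefaultencoding())  # 'utf-8' on both 3.5 and 3.6
print(sys.stdout.encoding)       # e.g. 'cp437' (the console code page) on 3.5, 'utf-8' on 3.6

The first value is why sys.getdefaultencoding() never changes; only the second one determines how print() encodes output for the console.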
If you must use Python 3.5, check out win-unicode-console.
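A minimal sketch of using it, assuming the package has been installed into the 3.5 venv with pip install win-unicode-console:

# Replace sys.stdout/stderr with streams that use the wide-character Win32
# console API, so 'ã' prints correctly regardless of the console code page.
import win_unicode_console
win_unicode_console.enable()

print('Amanhã')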
Windows 10, JetBrains CLion 2018.2.1
MinGW-w64 encoding
Input:
std::cout << "가나다라 abc" << std::endl;
Output:
媛?섎떎??abc
Settings > Editor > File Encodings
I can work around it with the per-path encoding setting (EUC-KR), but that applies only to the configured file; I have to set the encoding for every single project and every single file.
I found that chcp 65001 works when using CMD, but CLion's C++ output console cannot be handled that way.
VM options:
-Dconsole.encoding=EUC_KR
-Dconsole.encoding=EUC-KR
-Dconsole.encoding=UTF8
-Dconsole.encoding=UTF-8
-Dfile.encoding=EUC_KR
-Dfile.encoding=EUC-KR
-Dfile.encoding=UTF8
-Dfile.encoding=UTF-8
I tried them all, one by one, but none of them work.
Cygwin basically works well, but MinGW-w64 doesn't. I searched for hours, but I couldn't find any answer. :(
Thank you for reading.
Unfortunately, the Windows console only handles characters from its current code page reliably; if you try to print a character outside it, it will appear garbled.
Windows cmd uses the OEM code page (a small, mostly ASCII-derived table), while Windows GUI applications use the extended ANSI code page. Only the first 128 (ASCII) characters are guaranteed to be identical in both, so only those are safe to use everywhere.
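For what it's worth, the garbage in the question is consistent with exactly this kind of mismatch: the program writes UTF-8 bytes, and the console decodes them with a different (Korean) code page. A rough Python illustration of the effect; the specific code pages involved are assumptions:

# UTF-8 bytes for the Korean text, misread as CP949 (the Korean Windows code
# page): the Hangul turns into mojibake, while the ASCII "abc" survives
# because the first 128 characters are shared between the two.
utf8_bytes = "가나다라 abc".encode("utf-8")
print(utf8_bytes.decode("cp949", errors="replace"))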
I have a long-standing application that I'm normally able to compile equally well from Visual Studio or from a makefile using Gnu Make (but still using the Microsoft C++ compiler).
Recently I modified it by incorporating a third-party library. On trying to compile it for the first time from within Visual Studio, I obtained the common "C2664: ...cannot convert parameter 1 from 'const char *' to 'LPCWSTR'" error, which I resolved by going to the 'General' tab in the Project Properties dialog and selecting the "Use Unicode Character Set" option.
I'd now like to compile the application from my makefile, but naturally I get the same error. Is there a compiler switch that I can use to have an equivalent effect to "Use Unicode Character Set", or any other way of effecting this from within the makefile?
It's not a dedicated compiler switch; Unicode is selected through preprocessor macros: UNICODE for the Windows API headers and _UNICODE for the CRT and MFC. Use /D UNICODE /D _UNICODE.
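For example, assuming a GNU-make makefile that already collects compiler flags in a variable passed to cl.exe (the name CXXFLAGS below is an assumption about your makefile), the equivalent of the IDE option is:

# Select the wide-character Windows APIs and CRT, matching "Use Unicode Character Set"
CXXFLAGS += /D UNICODE /D _UNICODE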
During my CS studies, we have a good number of group assignments. We program in Java using Eclipse. We share code (or at least I try to get the others to) using Mercurial and BitBucket. I'm running Mac OS X 10.7 and the others are running Windows 7. We often have problems with the encoding when we share code: Danish characters such as æ, ø and å are often a mess.
What settings should we use across our Eclipse setups to ensure that the encoding is the same (and which encoding would be preferred)? On Windows, Eclipse defaults to Cp1252 and on Mac OS it defaults to MacRoman. I've been trying to get everyone to use UTF-8, but code they previously wrote (in Cp1252) won't show correctly, so they are forced to switch back and forth a lot, which usually ends with them defaulting back to Cp1252 and forgetting about it when they submit code to the shared repository.
For me it works to use the standard encoding (Cp1252) in Eclipse on Windows and to tell Eclipse on Mac to use the encoding ISO-8859-1. On Mac I configured this for my whole workspace in the settings (under General --> Workspace).
Encode the old CP-1250 texts into UTF-8 by hand and use only these versions.
Speaking from experience, I believe the best solution is for everybody to use UTF-8, which can represent any Unicode character.
The CP1252 & ISO-8859-1 workaround is not perfect: some characters are not compatible between the two. Moreover, most IDEs default to UTF-8, so if someone has to go to the trouble of changing Eclipse's encoding settings, I believe it should be the Windows users.
So after much headache using CP1252 & ISO-8859-1, I decided to change all my files to UTF-8. In case someone is interested, you can do that on Unix with a command like the one below, which converts all .java files in the current directory and its subdirectories:
find . -name "*.java" -exec sh -c "iconv -f ISO-8859-1 -t UTF-8 {} > {}.utf8" \; -exec mv "{}".utf8 "{}" \;
Since you tell iconv the original encoding, it can convert without mangling accents and special characters.
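If some machines don't have iconv available (for example the Windows side of the team), a short Python script does the equivalent conversion in place, under the same assumption about the source encoding:

import pathlib
# Convert every .java file under the current directory from ISO-8859-1 to UTF-8.
for path in pathlib.Path(".").rglob("*.java"):
    text = path.read_text(encoding="iso-8859-1")
    path.write_text(text, encoding="utf-8")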
Then ask everybody to create a new workspace, set all the encoding settings in Eclipse to UTF-8 (Windows users), and import the project again.
I have a problem with the charset in NetBeans on Windows when I open files that were edited in NetBeans on Linux by my coworkers.
I guess it should be Unicode in both.
What should I do to resolve this problem?
I can't find the proper option.
I use French and changed the NetBeans encoding to ISO-8859-1, and it worked for me.
I tried UTF-8 before; that didn't do it.
My symptoms were as follows:
- a website hosted on Linux and developed by another dev
- downloaded onto my Windows 8 machine, NetBeans 8.0 beta or 8.1
- when opening a file for the first time it said "cannot open safely..."; if I chose Yes, all my French special characters were messed up
- Hicham
right-click on Project -> Properties -> Sources -> Encoding
For a Maven project, set project.build.sourceEncoding in the POM under project -> properties.
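That property goes into pom.xml like this (a standard Maven property; NetBeans and the Maven compiler/resources plugins read it):

<properties>
    <!-- tells NetBeans and Maven which encoding the source files use -->
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>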
We had the same problem with Eclipse because of mixed Windows and Linux developers. If you use Java, you have three options:
change to a Unicode charset. Though we couldn't do that with Eclipse on Windows, maybe it works out for you; Linux should usually be on Unicode already.
change to ISO-8859-1 on Linux, which seems to be compatible with CP1252
use the tool native2ascii to change non-ASCII characters in strings to their explicit Unicode escape representation; in my opinion this is the most robust solution, though I guess it's Java-only (see the sketch below)
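For illustration, the transformation native2ascii performs is roughly the following (sketched here in Python; the real tool ships with the JDK):

# Rewrite every non-ASCII character as a \uXXXX escape, e.g. 'æ' -> '\u00e6',
# so the file becomes pure ASCII and immune to code-page mix-ups.
def to_ascii_escapes(text):
    return "".join(c if ord(c) < 128 else "\\u%04x" % ord(c) for c in text)

print(to_ascii_escapes("æ, ø and å"))  # \u00e6, \u00f8 and \u00e5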
The easiest way to solve this is with a terminal command:
$ sudo sh netbeans-8.0.2-linux.sh