The problem I have can be reproduced by pasting this code into PyCharm:
chinese = [u'这', u'是', u'一', u'些', u'中', u'文']
print chinese
When you set a breakpoint at the print line and start debugging, you can see that the variable chinese is displayed in the watch window as
[u'\u8fd9', u'\u662f', u'\u4e00', u'\u4e9b', u'\u4e2d', u'\u6587']
However, I expect it to be
[u'这', u'是', u'一', u'些', u'中', u'文']
Unless I double click this variable, it does not show the characters directly.
How can I solve this problem?
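For what it's worth, the escaped form is exactly what Python 2's repr() produces for a list of unicode strings, so the watch window is presumably showing the repr (that part is an assumption about the debugger, not something confirmed here). A minimal sketch in Python 3, where ascii() stands in for the Python 2 repr(), showing the escaped text and how it maps back to the readable one:

```python
chinese = ['这', '是', '一', '些', '中', '文']

# ascii() escapes every non-ASCII character, like repr() did in Python 2.
escaped = ascii(chinese)

# Decoding the escape sequences gives back the readable form.
readable = escaped.encode("ascii").decode("unicode_escape")

print(escaped)
print(readable)
```

This only illustrates where the \uXXXX text comes from; it does not change what the debugger displays.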
I searched everywhere for this; the problem is that the search terms are very similar to those of other questions.
The issue I have is that a file (a script, actually) is embedded in another file. So when I open the parent file, I see the script as one massive string with several \n and \r\n codes. I need a way to convert these codes to what they should be, so that the code is formatted correctly and I can read it and work on it.
Quick snippet:
\n\n\n\n\nlocal scriptingFunctions\n\n\n\n\nlocal measuringCircles = {}\r\nlocal isCurrentlyCheckingCoherency
Should convert to:
local scriptingFunctions
local measuringCircles = {}
local isCurrentlyCheckingCoherency
Perform a regex find-and-replace:
Find: (\\r)?\\n
Replace: \n
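If you'd rather do it outside the editor, the same find/replace can be sketched in Python. The function name and the sample string are just for illustration; the pattern is the one given above.

```python
import re

# Same pattern as the Find box: an optional literal "\r" followed by a literal "\n"
# (backslash + letter, not real control characters).
ESCAPED_NEWLINE = re.compile(r"(\\r)?\\n")

def unescape_newlines(text: str) -> str:
    """Replace literal backslash-r-backslash-n and backslash-n sequences with real newlines."""
    return ESCAPED_NEWLINE.sub("\n", text)

snippet = r"local scriptingFunctions\n\nlocal measuringCircles = {}\r\nlocal isCurrentlyCheckingCoherency"
print(unescape_newlines(snippet))
```

Note that this also rewrites any \n that appears inside a string literal of the embedded script, just like the editor replace does.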
If you don't need to reconvert from newlines to \n after you're done working on the code, you can accomplish the trick by simply pressing ctrl-f and substituting every occurrence of \n with a new line (you can type enter in the replace box by pressing ctrl-enter or shift-enter).
If after you're done working on the code you need to reconvert to \n, you can add an invisible char to the replace string (typing it like ctrl-enter invisibleChar), and after you're done you can re-replace it with \n.
There are plenty of invisible chars, but I'd personally suggest U+200B (zero-width space); another good one is U+2800 (⠀), as it renders as a normal whitespace, and thus is noticeable.
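To make the round trip concrete, here is a rough Python sketch of the same idea, assuming U+200B as the marker and that only \n (not \r\n) needs handling. The function names are made up for the example.

```python
MARKER = "\u200b"  # zero-width space, used to tag the newlines we created

def expand(text: str) -> str:
    # Turn each literal "\n" (backslash + n) into a real newline followed by the marker.
    return text.replace("\\n", "\n" + MARKER)

def collapse(text: str) -> str:
    # Turn only the tagged newlines back into literal "\n"; untagged newlines stay as they are.
    return text.replace("\n" + MARKER, "\\n")

original = "local a\\n\\nlocal b"
assert collapse(expand(original)) == original
```

The marker is what lets the second replace tell "newlines that used to be \n" apart from newlines that were already in the parent file.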
A thing to notice is that recent versions of vscode will show a highlight around invisible chars, but you can easily disable it by clicking on Adjust settings and then selecting Exclude from being highlighted.
If you need to reenable highlighting in the future, you'll have to look for "editor.unicodeHighlight.allowedCharacters" in the settings.
In vscode, when I open a file containing unicode characters,
I notice that tabs do not always advance to the next tab stop.
For example, the following might be part of an ASCII-flavor table
a<tab>b<tab>c<tab>d<tab>e
α<tab>β<tab>γ<tab>δ<tab>ε
𝔸<tab>𝔹<tab>ℂ<tab>𝔻<tab>𝔼
While Sublime Text renders it correctly (IMO),
vscode has a different idea.
As I understand it, vscode renders a tab by
replacing it with a proper number of space characters.
So if there are characters showing using proportional fonts,
no integer number of spaces will make it to the proper stop.
(See this related issue.)
So my question is, how can I fix this?
Is it possible to tell vscode that
"Fine, if you were to assume that those unicode characters
are 2 spaces wide when tabbing,
would you please render them as 2 spaces wide?"
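To illustrate the behaviour described above (this is only a sketch of the "replace tabs with spaces" idea, not vscode's actual code), Python's str.expandtabs does the same kind of counting: every character is treated as exactly one column, whatever width the font ends up giving it.

```python
def render_tabs(line: str, tab_size: int = 4) -> str:
    # expandtabs counts one column per character, regardless of rendered width.
    return line.expandtabs(tab_size)

for row in ("a\tb\tc", "α\tβ\tγ", "𝔸\t𝔹\tℂ"):
    print(repr(render_tabs(row)))
```

All three rows get the same number of spaces, so if 𝔸 is drawn wider than one cell (e.g. by a fallback font), the columns drift apart.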
If I copy and paste the four symbols from the character selection panel (I'm on macOS) they change to the following: ♠️ ♣️ ♥️ ♦️, whereas I'd like the heart and diamond to be red.
EDIT: Interestingly, I've noticed that if I type the sequence 👁🗨♥️, and then I hit backspace when the cursor is between those two characters, they both transform into 👁♥️! (the same happens with the other three)
Can someone explain what is happening?
I guess this is because your browser doesn't know about these special characters. But I think you can check this page, https://www.w3schools.com/charsets/ref_utf_symbols.asp,
and replace the special characters with the Unicode codes listed there
or on this page: http://graphemica.com/%E2%9D%A4
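Before replacing anything, it may help to check which code points a pasted symbol actually contains. A small Python sketch (the helper name is made up): a suit symbol can be followed by an invisible variation selector, U+FE0F asking for emoji presentation or U+FE0E asking for text presentation. How each form is coloured still depends on the font and renderer.

```python
import unicodedata

def describe(text: str) -> list:
    # List each code point with its official Unicode name.
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

emoji_heart = "\u2665\ufe0f"  # heart suit + variation selector-16
text_heart = "\u2665\ufe0e"   # heart suit + variation selector-15

print(describe(emoji_heart))
print(describe(text_heart))
```

Since a visible symbol can be several code points long, deleting one of them (for example with backspace) can change how the remaining ones are drawn.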
I am new to notepad++ and like it very much, since I can customize how my text documents look more easily than with wordpad. However, I would like to know if it’s possible to enter accented characters like in wordpad (I thought it was a windows thing, but perhaps it isn’t). In wordpad, I can type, for instance, ctrl-’ then i to get an accented í character. Similarly, I can type ctrl-shift-~ then n to get the accented ñ character. It makes it much easier to enter accented characters than copying and pasting from the character map application, or trying to remember code points. When I tried this method in notepad++ I just got the plain character without the accents. I should also mention that when I open documents with such accented characters already present they appear just as expected. Is there a way to enter accented characters like this in notepad++ using only the keyboard? I am using the latest notepad++ under Windows 7.
In Notepad++ you can go to “Edit” then select “Character Panel” near the bottom of the drop down menu. It will show you the ASCII set available, which includes most accented characters. Find the character you want and note the number shown for it. To use it, press and hold your ALT key, then, on the keypad on the right side of your keyboard, type zero followed by the number for that character. For “ñ”, for example, the code is 241, so you would hold down ALT, type 0241 on the keypad, and release ALT to get the character you need. That works in most Windows programs, even in here.
This only works for character codes in the range of 0 to 255 (strictly speaking, the Windows ANSI code page rather than ASCII, which stops at 127). I don't know of a method other than copying and pasting from the “Character Map” app available in Windows for Unicode. I did test Wordpad with the decimal number of the hex value you see for a Unicode character above 255, and ALT+#### works there and probably in other places, but sadly it doesn't work in Notepad or Notepad++ for some strange reason. Two I use a lot and have memorized are ALT+0147 and ALT+0148 for the quotation marks “like these”, so once you use the numbers enough you tend to get used to them, or you can jot down the ones you use the most.
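If you want to look up what a given ALT+0NNN code will produce, here is a small Python sketch. It assumes the system ANSI code page is Windows-1252 (typical for US/Western European setups, but an assumption nonetheless); the function name is made up.

```python
def alt_code_char(code: int, codepage: str = "cp1252") -> str:
    # ALT+0NNN is treated as byte NNN in the ANSI code page (cp1252 assumed here).
    return bytes([code]).decode(codepage)

for code in (241, 147, 148):
    print(code, alt_code_char(code))
```

A few byte values are unassigned in Windows-1252, and for those Python raises UnicodeDecodeError instead of returning a character.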
For anyone searching for a solution and coming across this page, try this (Windows): install and use the US International keyboard instead of the plain US keyboard. Search for "windows keyboard us international install" or something similar. I liked the techlanguage.com write-up on it and the teckangaroo.com step by step on how to install. Hope this helps someone in future looking around as I was earlier today for how to easily meet this need.
You can make your own keyboard layout to enter arbitrary characters anywhere in Windows, using MSKLC. Here's one I made earlier.
I think it is configured in the input method. With input method containing the characters you mentioned, you can press key combinations to get special letters.
You can add a keyboard layout preset in Windows. Under "Language and Regions" - "Language" - "Language settings" - "Input method" settings in Control Panel, you can add whatever you want.
Switch keyboard layout with Alt + Shift.
I have a bizarre problem: Somewhere in my HTML/PHP code there's a hidden, invisible character that I can't seem to get rid of. By copying it from Firebug and converting it I identified it as U+FEFF, or 'Zero width no-break space'. It shows up as a non-empty text node in my website and is causing a serious layout problem.
The problem is, I can't get rid of it. I can't see it in my files even when turning Invisibles on (duh). I can't seem to find it, no search tool seems to pick up on it. I rewrote my code around where it could be, but it seems to be somewhere deeper in one of the framework files.
How can I find characters by charcode across files or something like that? I'm open to different tools, but they have to work on Mac OS X.
You can't see the character because text editors don't display it. U+FEFF is the so-called byte-order mark (BOM); read with the wrong byte order it shows up as 0xFFFE. It is written at the start of a Unicode file, most notably by Microsoft tools, to tell in which order the bytes of multi-byte characters are stored.
To get rid of it, tell your editor to save the file either as ANSI/ISO-8859 or as Unicode without BOM. If your editor can't do so, you'll either have to switch editors (sadly) or use some kind of truncation tool like, e.g., a hex editor that allows you to see how the file really looks.
From some googling, it seems that TextWrangler has a "UTF-8, no BOM" mode. Otherwise, if you're comfortable with the terminal, you can use Vim:
:set nobomb
and save the file. Presto!
The characters are always the very first in a text file. Editors with support for the BOM will not, as I mentioned, show it to you at all.
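If you have many files to fix, the same thing can be scripted. A minimal Python sketch, assuming the files are UTF-8 and the BOM sits at the very start as described above (the function name is made up):

```python
import codecs

def strip_utf8_bom(path: str) -> bool:
    """Remove a leading UTF-8 BOM from the file at path; return True if one was removed."""
    with open(path, "rb") as fh:
        data = fh.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, file is left untouched
    with open(path, "wb") as fh:
        fh.write(data[len(codecs.BOM_UTF8):])
    return True
```

It works on raw bytes, so the rest of the file is written back exactly as it was.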
If you are using Textmate and the problem is in a UTF-8 file:
Open the file
File > Re-open with encoding > ISO-8859-1 (Latin1)
You should be able to see and remove the first character in file
File > Save
File > Re-open with encoding > UTF8
File > Save
It works for me every time.
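The reason this trick works, sketched in Python: in UTF-8 the BOM is three bytes, and Latin-1 maps every byte to its own visible character, so the invisible mark turns into three ordinary characters you can select and delete.

```python
bom_bytes = "\ufeff".encode("utf-8")

print(bom_bytes)                    # the three bytes EF BB BF
print(bom_bytes.decode("latin-1"))  # each byte shown as a separate Latin-1 character
```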
It's a byte-order mark. Under Mac OS X: open terminal window, go to your sources and type:
grep -rn $'\xEF\xBB\xBF' .
It will show you the line numbers and filenames containing a BOM (the three bytes above are U+FEFF as encoded in UTF-8).
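If grep is awkward to use (binary files, odd locales), a rough Python equivalent is below. It assumes UTF-8 files and looks for the encoded U+FEFF anywhere in the file, not just at the start; the function name is made up.

```python
import os

UTF8_BOM = b"\xef\xbb\xbf"  # U+FEFF encoded as UTF-8

def find_bom(root: str):
    """Yield (path, byte_offset) for every UTF-8 encoded U+FEFF under root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable file: skip it
            offset = data.find(UTF8_BOM)
            while offset != -1:
                yield path, offset
                offset = data.find(UTF8_BOM, offset + 1)
```

Calling list(find_bom(".")) from the source directory gives every hit; an offset of 0 means a real BOM at the start of the file, anything else is a stray U+FEFF in the middle.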
In Notepad++, there is an option to show all characters. From the top menu:
View -> Show Symbol -> Show All Characters
I'm not a Mac user, but my general advice would be: when all else fails, use a hex editor. Very useful in such cases.
See "Comparison of hex editors" on Wikipedia.
I know it is a little late to answer this question, but I am adding how to change the encoding in Visual Studio; I hope it will be helpful for someone reading this later:
Go to File -> Save (your filename) as...
In the File Explorer window, select the small arrow next to the Save button -> click Save with Encoding...
Click Yes (in the "Do you want to replace existing file" dialog)
And finally select e.g. Unicode (UTF-8 without signature), which removes the BOM