In vscode, when I open a file containing unicode characters,
I notice that tabs do not always advance to the next tab stop.
For example, the following might be part of an ASCII-flavor table
a<tab>b<tab>c<tab>d<tab>e
α<tab>β<tab>γ<tab>δ<tab>ε
𝔸<tab>𝔹<tab>ℂ<tab>𝔻<tab>𝔼
While Sublime Text renders it correctly (IMO),
vscode has a different idea.
As I understand it, vscode renders a tab by
replacing it with a proper number of space characters.
So if some characters are displayed in a proportional font,
no integer number of spaces will reach the proper tab stop.
(See this related issue.)
So my question is, how can I fix this?
Is it possible to tell vscode that
"Fine, if you were to assume that those unicode characters
are 2 spaces wide when tabbing,
would you please render them as 2 spaces wide?"
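To make the arithmetic concrete, here is a minimal Python sketch of tab expansion on a grid of whole character cells. It is not VS Code's actual implementation; the function names `cell_width` and `expand_tabs`, the tab size of 4, and the width heuristic based on `unicodedata.east_asian_width` are all assumptions for illustration.

```python
import unicodedata

def cell_width(ch: str) -> int:
    """Approximate width in monospace cells: 2 for wide/fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def expand_tabs(line: str, tab_size: int = 4) -> str:
    """Replace each tab with the spaces needed to reach the next tab stop."""
    out = []
    col = 0  # current column, counted in cells
    for ch in line:
        if ch == "\t":
            pad = tab_size - (col % tab_size)
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += cell_width(ch)
    return "".join(out)
```

The padding is only right if every glyph is drawn exactly as wide as `cell_width` assumes. If a font draws a symbol at, say, 1.3 cells, no whole number of spaces lands on the stop, which would be consistent with the misalignment described above.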
If I copy and paste the four symbols from the character selection panel (I'm on macOS) they change to the following: ♠️ ♣️ ♥️ ♦️, whereas I'd like the heart and diamond to be red.
EDIT: Interestingly, I've noticed that if I type the sequence 👁🗨♥️ and then hit backspace when the cursor is between those two characters, they both transform into 👁♥️! (The same happens with the other three.)
Can someone explain what is happening?
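One way to investigate is to list the code points that are actually in the string. The Python sketch below assumes that the pasted heart is U+2665 followed by the emoji variation selector U+FE0F, and that the eye-in-speech-bubble is U+1F441 and U+1F5E8 joined by a zero width joiner (U+200D). Those sequences are assumptions, so run `describe` on your own text to check. The helper name `describe` is made up for this example.

```python
import unicodedata

def describe(text: str):
    """Return (code point, Unicode name) for every code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# Assumed sequences; paste your own string instead to see what it contains.
heart = "\u2665\ufe0f"
eye_bubble = "\U0001F441\u200d\U0001F5E8"

for code, name in describe(eye_bubble + heart):
    print(code, name)
```

If the output shows invisible code points such as a joiner or a variation selector, then a backspace that removes only one of them would change how the remaining ones are rendered, which may be what you are seeing.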
I guess this is because your browser doesn't recognize these special characters. You could look the symbols up on this page, https://www.w3schools.com/charsets/ref_utf_symbols.asp,
and replace the special characters with the Unicode codes listed there,
or use this page: http://graphemica.com/%E2%9D%A4
I have searched all over the Internet for an answer. I have achieved this once before, but I can't remember how I did this...
I have a long text file with a lot of encoded characters, for example:
\u0119,\u015b\u0107
How do I change characters like \u0119 to ę, etc?
This question is not off-topic. In the past I also used Notepad++ for programming; today I use Atom. You can find a lot of questions about Notepad++ on Stack Overflow, for instance: Removing duplicate rows in Notepad++ or Convert tabs to spaces in Notepad++ (and many more). So please do not downvote this question.
Answer: I assume that when you go to menu>Encoding you will see 'Encode in UTF-8'.
I used this site to create part of my answer: https://superuser.com/questions/576431/notepad-inserting-special-unicode-characters-in-utf-8
If you see character codes like \u0119,\u015b\u0107 in your file, this probably means that they were never decoded: the codes are stored explicitly as raw text.
So to change these codes into UTF-8 characters:
1) Go to menu>run>run, type: charmap, and click run.
2) The Windows charmap will show up. Check 'advanced view' and enter your character code (without the \u prefix, so for instance only 0119) in the field 'go to Unicode'. Then click 'select' and 'copy' and close the window.
3) Go to menu>search>replace. In the field 'replace with' paste your character, and in the field 'find what' put its code (with the prefix, for instance \u0119). Then click 'Replace All'.
Repeat steps 1-3 for each character code. You can check that you are done by going to menu>find and typing '\u' in "find what"; if no code is found, the job is finished.
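If the file contains many different codes, repeating the replacement by hand gets tedious. As an alternative to the charmap route, here is a minimal Python sketch that replaces every four-digit \uXXXX escape in one pass. The function name `decode_u_escapes` is made up for this example, and the sketch assumes plain four-digit escapes only (it does not combine surrogate pairs).

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def decode_u_escapes(text: str) -> str:
    """Replace each \\uXXXX escape with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

print(decode_u_escapes(r"\u0119,\u015b\u0107"))  # ę,ść
```

To convert a whole file, read it with `open(path, encoding="utf-8")`, pass the contents through `decode_u_escapes`, and write the result back as UTF-8.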
I would like to handle floral formulae in my DSL written in Groovy, so I need some special symbols such as the female sign and superscripts and subscripts.
Thanks to the great answers I found on Stack Overflow questions like this one, I'm now able to
insert special Unicode symbols in source code in Vim (MacVim) this way:
CTRL+V. U 2 6 4 0.
However, I would like to be able to do the same in Eclipse IDE (I'm trying to use Groovy/Grails Tool Suite Version: 3.1.0.RELEASE to develop a grails project)
Question: How can I insert a 4-digit Unicode symbol in the Eclipse editor when I know its code point (without cut & paste from another source)?
There appear to be a few ways to get Unicode characters on a Mac. The first few don't appear to be exactly what you want, but are included for completeness.
1) Make sure System Preferences->Keyboard "show keyboard & character viewers in menu bar" is selected. Then you can click on that (normally accessible via option+cmd+T, but not in eclipse) to get the Character Viewer. You can then double-click a special character you want and it should insert at cursor.
2) Under the default setup, you should be able to click Option + key to get an alternate character. Use the keyboard viewer from #1 to see what maps to what. Note you can switch to some more mappings using Shift at the same time. This will only get you a subset of unicode characters.
3) From here: Under System Preferences->Languages & Text, go to Input Sources tab. Select the Unicode Hex Input source. You may need to assign switching input sources (under System Prefs->Keyboard->Keyboard Shortcuts->Keyboard) to a hotkey combo (default probably conflicts with spotlight, so change to something else). After that, you should be able to use said hotkey combo to switch to the Unicode Input Source - in that mode, you can hold Option down and enter a hex 4-digit key code, which will result in the character being placed at cursor.
I can write Arabic/Urdu/Persian in MS Word or Notepad just fine, but whenever I insert an English word or number, the sequence is disturbed and it looks as if all the words in the sentence have been shuffled.
Look at the example below:
یہ ایک مثال ہے اردو کی ...
Now I inserted an English word and it became:
یہ ایک مثال ہےword اردو کی ...
So you can see that almost all of the words have been jumbled. What is the solution for that?
For example:
باللغة العربية “keyboard” انا أريد أن أعرف الكلمة
Finish typing the Arabic word and add a space after it (this space separates the embedded text from the Arabic text to its right).
Insert special character U+200F (to render the preceding space an Arabic character). The character name is "Right to Left Mark".
Insert special character U+202A (to begin the left-to-right embedding). The character name is "Left to Right Embedding".
Insert another space (to separate the embedded text from the Arabic text that will continue to its left).
Change the keyboard to e.g. English and type the left-to-right word.
Insert special character U+202C (to restore the bidirectional state to what it was before the left-to-right embedding). The character name is "Pop Directional Formatting".
Change the keyboard back and continue writing in Arabic.
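For anyone producing such text programmatically rather than typing it, the steps above can be sketched in Python. The function name `embed_ltr` is made up for this example; the order of the characters simply mirrors the steps as listed, and how the result displays still depends on the application supporting the Unicode bidirectional algorithm.

```python
RLM = "\u200F"  # RIGHT-TO-LEFT MARK
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

def embed_ltr(before: str, ltr_word: str, after: str) -> str:
    """Insert a left-to-right word between two pieces of right-to-left text."""
    return (
        before
        + " "       # space after the Arabic word
        + RLM       # makes the preceding space behave as right-to-left
        + LRE       # begin the left-to-right embedding
        + " "       # space that separates the word from the following Arabic text
        + ltr_word
        + PDF       # end the embedding
        + after
    )

sentence = embed_ltr("انا أريد أن أعرف الكلمة", "keyboard", "باللغة العربية")
# repr-style escapes make the invisible characters visible for checking
print(sentence.encode("unicode_escape").decode("ascii"))
```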
If you're working in Microsoft Office or Open Office, the "special characters" can be found under "insert" [Insert -> symbols -> other symbols -> special characters in MS 2013]. Scroll through until you find the character with the appropriate Unicode number, and if the Unicode number does not appear in your version of MS Word, select it by its name [as indicated above].
You can also add the character by typing its Unicode code, then selecting it and pressing Alt+X, but that can be confusing because it requires switching constantly between Arabic and English.
All of the special characters involved in this little manoeuvre are invisible characters (their job is simply to change the direction of the text) so don't be surprised if it looks like you're not inserting anything.
Pay attention to select the RTL option from the ribbon when the majority of your paragraph is RTL and keep it selected [as shown in the picture in this answer https://stackoverflow.com/a/46050171/8558867 ].
Before you start typing in Arabic/Persian, make sure you have selected the "Right-to-Left Direction" button. This button can be found on the Paragraph tab, just to the left of the AZ sorting button. Also select the "Align Text Right" button, which can be found on the Paragraph tab to the left of the Justify button.
Start typing in your language.
Before typing an English word, insert a space, then press left ALT + SHIFT and type your English word.
Once you have finished the English words, press right ALT + SHIFT, insert a space, and continue typing in your language.
Hope this helps.
This is OK; they're not shuffled: you're seeing them in LTR rendering mode.
You just need to make them right-to-left. In Notepad or Word, press right Ctrl+Shift to make their direction right-to-left and it will be okay. (It's like having <p dir="rtl">...</p> in HTML).
The control characters LRE and RLE (0x202A and 0x202B) and also LRM and RLM (0x200E and 0x200F) need to be applied to the whole paragraph, i.e. they should come at the beginning of the sequence. Some text display widgets on some platforms may discard these control characters, though, particularly older (pre-2000) platforms or those that do not support the Unicode bidirectional algorithm correctly. Newer OSes and programs should be fine; try with Windows Notepad, for example.
I personally recommend using the platform's means to make the text RTL, and avoiding special control characters because they're invisible and may cause surprising results if they go out of control. So you'd better use Word's API to make the text RTL, or, if your output is HTML, put the text in <div dir="rtl">...</div> tags. For a plain text file, the user has to press the Ctrl+Shift keys manually.
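For the HTML case, a minimal Python sketch of wrapping text in a right-to-left block could look like this. The function name `rtl_block` is made up for this example; only `html.escape` from the standard library is used.

```python
from html import escape

def rtl_block(text: str) -> str:
    """Wrap text in a div whose base direction is right-to-left."""
    return f'<div dir="rtl">{escape(text)}</div>'

print(rtl_block("یہ ایک مثال ہے word اردو کی"))
```

This leaves the mixed-direction handling to the browser's own bidirectional support, so no invisible control characters end up in the text.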
Edit: this was written as a clarification answer to the first answer here, I later edited the first answer and added the important notes I wrote here [the edit still needs approval though].
I was able to fix my text by following the steps in the first answer here.
In case anyone faces troubles while following the steps, let me clarify some things:
If you are entering an English word in an Arabic text, make sure that RTL option in the ribbon is selected [circled in red in the following figure]:
Keep it selected throughout the paragraph irrespective of the language you are using [as long as the majority of the paragraph is written in an RTL language like Arabic or Hebrew].
Where to find the special characters and how to insert them:
You can write the unicode of the character and then select it and press "Alt + X". However, this can be a bit confusing because of the need to change back and forth between English and Arabic to write the codes, so the best thing to do is enter them 'manually' by inserting their names.
You can do that by going to Insert -> Symbol -> More Symbols -> Special characters [scroll down]. Then select the name of the characters you need to use instead of its unicode.
The names of the characters you'll need to use [as specified in the first answer here] are:
"Right to Left Mark" : U+200F.
"Left to Right Embedding": U+202A.
"Pop Directional Formatting": U+202C.
As the first answer says, nothing will appear on the screen because these are non-printing characters, so it's normal if it feels like nothing happened when you insert them.
If you need to do it the other way around, that is, insert a Hebrew or Arabic word in an English text, just reverse the roles of the Unicode characters, or follow the steps in the following link: https://superuser.com/a/1247476/767967
If you want to know more about what the special characters do and what it means to make your paragraph LTR or RTL, visit the following link: http://dotancohen.com/howto/rtl_right_to_left.html#Directionality
Select the paragraph (e.g. using triple click) and use the button for right-to-left direction (¶◀) in the Paragraph section of the Start pane.
As Hossein’s answer explains, the issue is the directionality in the paragraph. It changes to left to right when you insert a Latin letter, and you need to fix this manually.
You need to add an invisible RLE Unicode character at the start of the line.
It is U+202B (0x202B hex = 8235 decimal), RIGHT-TO-LEFT EMBEDDING (RLE).
It is necessary for Notepad, but MS Word is able to handle it; there you need to right-align your text correctly.
How to enter RLE: http://www.fileformat.info/tip/microsoft/enter_unicode.htm
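If the text file is generated by a script, the same character can be prepended in code. This is a small Python sketch; the function name `force_rtl` is made up for this example, and it only adds the RLE at the start of each line as described above, without a closing character.

```python
import unicodedata

RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING

def force_rtl(text: str) -> str:
    """Prefix every line of text with an invisible RLE character."""
    return "\n".join(RLE + line for line in text.split("\n"))

print(hex(ord(RLE)), ord(RLE), unicodedata.name(RLE))
```

The print line is just a check that 0x202B and 8235 are the same code point and that its official name is the one given above.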
In word processing, you have a main text direction which is either left-to-right or right-to-left (or top to bottom, but let's ignore that :-), and you have a text direction for individual characters, which will also be left to right or right to left.
The word processor splits the text into chunks of strings with the same character ordering, then displays these chunks according to the main text ordering.
It seems that your main text ordering was left to right. As long as all your text is Arabic, there is just one chunk with Arabic text. You can already see that it is displayed left-aligned rather than right-aligned, because the text ordering is left to right. The characters are displayed right to left because that is how Arabic is displayed.
When you inserted Latin text, you had three chunks: Arabic, Latin, Arabic. These three chunks are displayed left to right because that is the main text ordering. That would be fine for text that is mostly Latin (like "The Arabic words for dog and cow are ... and ..."). For text that is mostly Arabic with the occasional Latin word, you need to change the main text ordering to "right to left".
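The chunking described above can be illustrated with a rough Python sketch. It is a heavy simplification of the real Unicode bidirectional algorithm, not what any word processor actually runs: it classifies characters with `unicodedata.bidirectional`, glues spaces and other neutral characters onto the current chunk, and then orders the chunks by the main direction. The function names `direction`, `split_runs` and `visual_order` are made up for this example.

```python
import unicodedata

def direction(ch: str):
    """Return 'ltr', 'rtl', or None for neutral/weak characters."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "ltr"
    if cls in ("R", "AL"):
        return "rtl"
    return None

def split_runs(text: str, base: str = "ltr"):
    """Split text into (direction, chunk) pairs in logical (typing) order."""
    runs = []
    for ch in text:
        d = direction(ch)
        if runs and (d is None or d == runs[-1][0]):
            runs[-1][1].append(ch)          # same direction, or neutral: extend
        else:
            runs.append((d or base, [ch]))  # direction changed: new chunk
    return [(d, "".join(chars)) for d, chars in runs]

def visual_order(text: str, base: str):
    """Chunks listed from the left edge of the screen to the right edge."""
    runs = split_runs(text, base)
    return runs if base == "ltr" else runs[::-1]

sample = "یہ word اردو"
print([d for d, _ in split_runs(sample)])  # ['rtl', 'ltr', 'rtl']
```

With `base="ltr"` the first Arabic chunk ends up at the left edge, which is the jumbled result from the question. With `base="rtl"` the same chunks are laid out starting from the right, so the sentence reads in the intended order.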
Just follow this:
Copy and paste the Arabic text from the Word or text document into Adobe Illustrator.
Save the Illustrator document in .EPS format.
Open InDesign and place the .EPS document where you want it.
Since InDesign can't handle Arabic text by itself, this method will help many designers.