I am trying to simulate the way Word displays non-printing characters. All of them work except anchors, and I couldn't find any information about them. Is an anchor a special character placed in the text, or is it a parameter of the floating object that is merely displayed as a special character?
Thank you for any answers.
The anchor, unlike most non-printing characters, can never print. It's merely a visual aid to inform the user with which paragraph or character a graphic with text flow formatting is associated. It's not possible to detect an anchor directly in the document text using Word's API (object model). It's bound to the graphic and would require analyzing the properties of the Shape object.
It could be determined by analyzing the document's WordOpenXML, although the term "anchor" is not used. The information could be deduced from the location and settings of the nodes that define where and how the graphic appears.
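To make the "deduce it from the nodes" idea a bit more concrete, here is a minimal sketch in Python. It assumes the drawing is stored as DrawingML, where (as far as I know) a floating object sits in a `wp:anchor` element inside the paragraph it is attached to; older VML shapes are encoded differently and are not covered. The `SAMPLE` markup and the function name `anchored_paragraphs` are made up for illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
WP = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"

# Hypothetical, heavily trimmed document body used only for this sketch.
SAMPLE = f"""<w:document xmlns:w="{W}" xmlns:wp="{WP}">
  <w:body>
    <w:p><w:r><w:t>plain paragraph</w:t></w:r></w:p>
    <w:p>
      <w:r><w:drawing><wp:anchor/></w:drawing></w:r>
      <w:r><w:t>paragraph holding a floating object</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def anchored_paragraphs(xml_text):
    """Return the indexes of paragraphs that contain a floating drawing."""
    root = ET.fromstring(xml_text)
    hits = []
    for index, para in enumerate(root.iter(f"{{{W}}}p")):
        # A floating (non-inline) drawing is assumed to be a wp:anchor node.
        if para.find(f".//{{{WP}}}anchor") is not None:
            hits.append(index)
    return hits


print(anchored_paragraphs(SAMPLE))
```

With a real file you would feed in the text of `word/document.xml` from the .docx package instead of `SAMPLE`.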
Is an anchor a special character placed in the text, or is it a parameter of the floating object that is merely displayed as a special character?
I'm going to try to answer the "is it in text" question.
If, while debugging, you try to get a textual character for an anchor from a range's text, it won't be there. There won't even be a 0-width non-visible character there, like when you move a text cursor to the right past a non-printable character, but it doesn't actually move because there's something there (this may be editor-dependent, I have Notepad++ in mind).
So no, it's not in text.
But, at the same time, it will interfere with searches. E.g., if you put the word "text" on a line, put a text box on that line to create an anchor, and then search for "^13text" (with wildcards enabled; ^13 means the end-of-paragraph mark), it won't find it.
So yes, it must be in text because it interferes with searches.
So this might be a contradiction, but let's keep going. If it's in text, where is it? If you place the text cursor on the previous line, hold shift, and move it once to the right, the text box will be highlighted.
So it must be at the start.
But, there is also evidence that contradicts this. If you have a field at the start of the line with text on it, you can move it once to the right as before, and then once to the left, and though you have part of the line highlighted, the text box won't be part of the selection.
So, I really have no idea whether it's text or not, or where it is if so, but hopefully this helps someone else.
If I copy and paste the four symbols from the character selection panel (I'm on macOS) they change to the following: ♠️ ♣️ ♥️ ♦️, whereas I'd like the heart and diamond to be red.
EDIT: Interestingly, I've noticed that if I type the sequence 👁🗨♥️ and then hit backspace when the cursor is between those two characters, they both transform into 👁♥️! (The same happens with the other three.)
Can someone explain what is happening?
I guess this is because your browser doesn't know about these special characters. You could check this page: https://www.w3schools.com/charsets/ref_utf_symbols.asp
and replace the special characters with the Unicode codes listed there,
or with the ones from this page: http://graphemica.com/%E2%9D%A4
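If you want to see which code points actually ended up in the pasted text, a small Python sketch like the one below can help. It relies on the fact that U+FE0F (VARIATION SELECTOR-16) requests emoji presentation and U+FE0E (VARIATION SELECTOR-15) requests text presentation; whether the text form then shows up in red still depends on the font and on the colour you apply, so treat that part as an assumption. The function names are made up for this example.

```python
import unicodedata


def describe(text):
    """List every code point in the string with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def to_text_presentation(text):
    """Swap the emoji selector (U+FE0F) for the text selector (U+FE0E)."""
    return text.replace("\uFE0F", "\uFE0E")


heart_emoji = "\u2665\uFE0F"  # heart suit followed by the emoji selector
for line in describe(heart_emoji):
    print(line)
for line in describe(to_text_presentation(heart_emoji)):
    print(line)
```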
I edited a document from a client with some highlights then later decided to remove the highlights for comments instead.
For whatever reason, the document highlighted a number of bullet point and numbered list sections which I could not revert when I attempted to select the entire document and change the highlighting to 'No Fill'.
The highlighted bullet point/number lists did not allow me to select them to revert.
Searches on Google seemed to result in a ton of convoluted "[Solved]" responses on their forum which didn't fix the issue for me (or resulted in a TLDR response from my brain...):
Google Search: open office remove highlight bullet lists
[Solved] Yellow highlighting won't go away
[Solved] Bullet highlighting will not go away.
[Solved] Surprise Yellow Highlighting on Bullets & Numbers
Permanently highlighted bullets.
[Solved] Oddities Involving Bullets/Outlines & Font Styles
[Solved] Bullet color
Seriously... what the heck!? How can this be so hard? So I decided this issue needed some serious StackOverflow help...
Version info:
Apache OpenOffice-4.1.4
AOO414m5(Build:9788) - Rev. 1811857
2017-10-11 20:12
So after all that...
I figured it out. But it's still crazy that it's not answered very clearly in the resources above... I hope this helps someone spend less time on this in the future.
If you double-click the first bullet/number of the list... it appears to select the first word of the first item of the list, BUT you'll see that it also selects the list bullets/numbers with a dark gray highlight.
Now selected, you can remove the highlight from the list.
Selecting all of the document doesn't select the numbered/bulleted lists.
Well, most of these solutions didn't help me.
But I found a simple way to fix it:
Select Highlight option.
Position to the left of the bullet until the cursor converts to a white arrow.
One click to highlight entire text line. One click again to un-highlight the entire text line (including the bullet).
Select the highlighted area (or rather, "highlight" the highlighted area) and press CTRL+Q. It is a paragraph formatting issue, and this should remove all formatting from the selected area.
The answers above didn't work. Try this (mouse-select means left-click and drag the selection of words, aka highlighting but wanted to avoid confusion):
1. Turn on paragraph marks ¶ in Word.
2. Add a clean paragraph before the highlighted-bullet sentence. (Clean means it's unbulleted, without colour highlight, unformatted.)
3. Mouse-select the entire bulleted sentence containing the highlighted bullet. Make sure the selection also goes left before the highlighted bullet to include the clean paragraph above it, i.e. your selection should include the ¶ mark of the clean paragraph you created in step 2.
4. Apply white/clear/no-colour highlighting.
It's actually pretty simple though I was having trouble with it myself. Just select all the items of that particular bulleted/numbered list and highlight them. Then select the items again and remove the highlight. Doing that removed the highlights from the bullets too for me.
Super frustrating but here's the fix that's always worked for me (even with .doc or .docx file):
Double click the bulleted/numbered list item so they all highlight
Ctrl + Spacebar (resets character formatting)
Apply any needed formatting (font type, bold, etc.)
This will keep the formatting on the paragraph (indents, header type, etc.) but will just allow you to change the format of the actual text that is highlighted - which is likely all you want.
Hope that works for you!
The highlight that extends past the period at the end of the paragraph is causing this issue. If you want to keep the last sentence highlighted but remove the highlighted bullet, just remove the highlight on the period at the end of the paragraph and the highlighted bullet goes away :)
(If you can) Start from above, add in a new clean bullet point, copy/paste the desired text from the problematic highlighted bullet point, then delete the problematic highlighted bullet point altogether.
I want to view a Word document along with the Unicode representation of the special characters.
For example, I want to view a Word doc containing:
Hi,
How are you ?
as:
Hi \r\n How are you ?
Is there any way to do this?
Not programmatically. Any software or software mode would suffice.
In Word, select the character and press "alt-x".
This appears to be unavailable in Word for Mac version 2016 (according to Microsoft Answers), or in Office 365's version 16.
If you want to see format control characters as visible symbols, which is what your example is about, then there does not seem to be any direct way. But if you click on the “¶” button (in the Start pane, Paragraph group in new versions of Word), Word adds symbols at ends of visible lines to indicate presence of such controls, e.g.
Hi,·¶
How·are·you?·¶
Here “¶” indicates the presence of CR (U+000D CARRIAGE RETURN, “\r”), whereas a symbol resembling “⤶” would indicate LF (U+000A LINE FEED, “\n”), which indicates a forced line break without paragraph break in Word. And “·” indicates a normal space (U+0020 SPACE), whereas “°” would indicate a no-break space (U+00A0 NO-BREAK SPACE).
For visible characters, the Alt+X method described by @JasonPlutext works well. You don’t even need to select the character. You can just click between it and the next character, to place the cursor there, and then press Alt+X.
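If the text can be copied out of Word, the same kind of view can be approximated outside it. The following Python sketch uses the symbols mentioned above (¶ for CR, ⤶ for LF, · for a space, ° for a no-break space) and writes any other control or format character as a `<U+XXXX>` tag. The mapping and the function name `show_controls` are my own choices for illustration, not anything Word exposes.

```python
import unicodedata

# Symbols borrowed from the description of Word's show/hide view.
MARKS = {"\r": "¶", "\n": "⤶", " ": "·", "\u00A0": "°"}


def show_controls(text):
    """Return the text with invisible characters made visible."""
    out = []
    for ch in text:
        if ch in MARKS:
            out.append(MARKS[ch])
        elif unicodedata.category(ch) in ("Cc", "Cf"):
            # Remaining control (Cc) and format (Cf) characters
            # are spelled out by code point.
            out.append(f"<U+{ord(ch):04X}>")
        else:
            out.append(ch)
    return "".join(out)


print(show_controls("Hi, \rHow are you? \r"))
```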
I have a Word document with fields of the reference variety, which occur in the form "[field].[field]"--in other words, there's a period between the two fields. I want to globally replace this with a space.
Word offers the ^d special character to search for fields, but for some reason the query "^d.^d" does not find anything. However, ".^d" does. Now comes the problem, however--what do I specify as the replacement text in order to retain the field code? If using regular expressions, I could use a "Find What Expression" such as \1, but with regexp ("wild card") mode the ^d is not permitted.
I guess I could write a macro...
I would like to add to Bibadia's solution.
An example of an index entry field; we want to change a name we misspelled.
Make sure hidden formatting is displayed (toggle with SHIFT+CTRL+F8).
Make sure the wildcards option is not selected. To search for fields, use the opening and closing field brace codes (optionally use ^w for spaces, as Bibadia suggested): ^19 XE "Deo, John" ^21
Replace won't recognize the field brace characters, but it will let you insert the clipboard's content ;). To do that, insert the correct entry in the text: press CTRL+F9 to insert a field and type: XE "Doe, John"
Select the field above and copy
Use ^c in the replace box
Hit Replace All
Ta-da!
It's usually better to go the macro route when finding fields because, as you say, the find algorithm that Word uses doesn't work the way you might hope with fields.
But if you know exactly what the fields contain, you can specify a search pattern that will probably work (however not in wildcard mode).
For example, if you want to look for figure number field pairs such as
{ STYLEREF 1 \s }.{ SEQ Figure \* ARABIC \s 1 }
(which would typically be the same set of fields everywhere in the document)
If you only really need to look for the following:
{ STYLEREF 1 \s }.<any field>
you could ensure that field codes are displayed and search for
^d STYLEREF 1 \s ^21.^d
or
^19 STYLEREF 1 \s ^21.^19
If you need to be more precise, you can spell out the second field as well.
"^d" only works for finding the field beginning, not the field end.
It's a shame that ^w wants to find at least 1 whitespace character because otherwise it would be more robust to look for
^19^wSTYLEREF^w1^w\s^w^21.^19
Perhaps someone else knows how to work around that without using wildcards?
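For comparison, this is roughly what the backreference-style replacement from the question would look like if the field codes were available as plain text outside Word. It is only a sketch in Python: it works on a text dump in which fields are written with literal braces, it does not handle nested fields, and the names `PAIR` and `period_to_space` are invented for the example.

```python
import re

# One field written out as text: an opening brace, anything but braces, a closing brace.
FIELD = r"\{[^{}]*\}"
# Two fields separated by a single period; each field is captured as a group.
PAIR = re.compile(rf"({FIELD})\.({FIELD})")


def period_to_space(text):
    """Replace the period between two adjacent fields with a space."""
    return PAIR.sub(r"\1 \2", text)


sample = r"{ STYLEREF 1 \s }.{ SEQ Figure \* ARABIC \s 1 }"
print(period_to_space(sample))
```

Periods elsewhere in the text are left alone because the pattern requires a field on both sides.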
Torzaburo,
I suggest that you do this using a macro. You can start by recording the macro, and later refining your processing steps within the macro.
First turn on the hidden characters by navigating to Home > Paragraph > toggle the show/hide Paragraph symbol. Also, select all and toggle the field codes on (right-click and select "Toggle Field Codes").
Open a new blank Word doc in addition to the one you have open. You will use this later. Start the macro recording and find the field using the "^d" (field code) as you said.
When the field is found, copy only the field text within the brackets, and not the full field reference. While the macro is still recording, ALT + TAB to the new blank document and paste the field code in as plain text.
At this point, do the necessary find & replace processing to the field codes. Highlight the processed field codes, copy, ALT + TAB back to the original document, and paste back between the { } brackets.
Stop the macro recording. Add any further custom processing to the macro VBA.
Select-All and re-toggle the field codes. Update the field codes.
You don't need a macro. Just toggle all field codes on by using Alt+F9. Then do a find and replace for what you want to change. Once the replacement is complete, use Alt+F9 again to toggle the field codes back off.
Disclaimer: I didn't originate this solution, but it's clean and elegant and I thought it should be included here:
(Adapted from Search & Replace Field Codes in Word):
1. Create or find a single instance of the field you want to convert text to.
2. Toggle Field Codes visible (Alt+F9).
3. Copy the code for the field you want to use to the Clipboard (highlight and Ctrl+C).
4. Open the Replace dialog box (Ctrl+H), insert the text you want to replace in the Find What box and then enter ^c in the Replace With box.
This will replace your text with the contents of the Clipboard, turning it into the field code you copied in step 3. It also copies formatting information (font, color, etc.), to control how the field will appear when hidden. (Caveat: I've tested this with Word 2003 under Windows 7 only.)
Coming in late on this, probably way too late for Beth (sorry Beth). And this may not be quite what Beth was looking for. But for anyone interested ...
It sounds like Beth may have created captions throughout the document using INSERT CAPTION (hence the presence of field codes). This means these captions will have been (automatically) created in CAPTION style.
To globally replace the separator "." with " " (space) in such captions, take two steps:
[1] Go to REFERENCES | INSERT CAPTION, then click on NUMBERING and replace the SEPARATOR "." with "EM-DASH". This will replace all separators in captions for the selected label in the CAPTION Window. If you have other labels in use in the document (e.g. FIGURE), select the other labels one by one and repeat this process.
[2] Do a find/replace searching for special character "em-dash" (^+) in style CAPTION, replacing with " ". Click REPLACE ALL.
Voila!
NOTE: This presumes that em-dash does not appear in the caption text anywhere. If it does, then you'll need to do a pre- and post- "fiddle" to ensure these em-dashes are not touched by the global replace above.
The "pre-fiddle" is to do a global find/replace across captions, replacing the em-dash ("^+") with some other string (e.g. "EM-DASH") that doesn't ever occur in any caption's text. Then you do the separator change as described above. Finally, the "post-fiddle" is to restore the em-dashes that were in the captions, by doing a global replace of the string "EM-DASH" with the actual em-dash character "^+".
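The pre-fiddle / post-fiddle idea is the usual placeholder trick, and it may be easier to follow on plain strings. The Python sketch below only simulates it: `change_separator` is a stand-in for what Word's caption numbering dialog does, and it assumes captions of the form "Figure 1.2 ...". All names here are hypothetical.

```python
import re

EM_DASH = "\u2014"
PLACEHOLDER = "EM-DASH"  # must not occur in any caption text


def change_separator(caption, new_sep):
    """Stand-in for the numbering dialog: 'Figure 1.2' -> 'Figure 1<sep>2'."""
    return re.sub(
        r"^(\w+ \d+)\.(\d+)",
        lambda m: f"{m.group(1)}{new_sep}{m.group(2)}",
        caption,
    )


def separator_to_space(caption):
    if PLACEHOLDER in caption:
        raise ValueError("placeholder already occurs in the caption")
    text = caption.replace(EM_DASH, PLACEHOLDER)  # pre-fiddle: protect real em-dashes
    text = change_separator(text, EM_DASH)        # step 1: separator becomes an em-dash
    text = text.replace(EM_DASH, " ")             # step 2: em-dash becomes a space
    return text.replace(PLACEHOLDER, EM_DASH)     # post-fiddle: restore real em-dashes


print(separator_to_space("Figure 1.2 Before\u2014after"))
```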
I can write Arabic/Urdu/Persian in MS Word or Notepad just fine, but whenever I insert any English word or number, the sequence is disturbed and it seems like all the words in the sentence have been shuffled.
Look at the example below:
یہ ایک مثال ہے اردو کی ...
Now I inserted an English word and it became:
یہ ایک مثال ہےword اردو کی ...
So you can see almost all of the words have been jumbled ... what is the solution for that ?
For example:
باللغة العربية “keyboard” انا أريد أن أعرف الكلمة
Finish typing the Arabic word and add a space after it (this space separates the embedded text from the Arabic text to its right).
Insert special character U+200F (to render the preceding space an Arabic character). The character name is "Right to Left Mark".
Insert special character U+202A (to begin the left-to-right embedding). The character name is "Left to Right Embedding".
Insert another space (to separate the embedded text from the Arabic text that will continue to its left).
Change the keyboard to e.g. English and type the left-to-right word.
Insert special character U+202C (to restore the bidirectional state to what it was before the left-to-right embedding). The character name is "Pop Directional Formatting".
Change the keyboard back and continue writing in Arabic.
If you're working in Microsoft Office or Open Office, the "special characters" can be found under "insert" [Insert -> symbols -> other symbols -> special characters in MS 2013]. Scroll through until you find the character with the appropriate Unicode number, and if the Unicode number does not appear in your version of MS Word, select it by its name [as indicated above].
You can also add the character by typing its Unicode code point, selecting it, and pressing Alt+X, but that can be confusing because it requires constantly switching between Arabic and English.
All of the special characters involved in this little manoeuvre are invisible characters (their job is simply to change the direction of the text) so don't be surprised if it looks like you're not inserting anything.
Pay attention to select the RTL option from the ribbon when the majority of your paragraph is RTL and keep it selected [as shown in the picture in this answer https://stackoverflow.com/a/46050171/8558867 ].
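If you are generating the text programmatically rather than typing it, the same sequence of characters can be assembled in code. Here is a minimal Python sketch that follows the order of the steps above (space, U+200F, U+202A, space, the left-to-right word, U+202C); the function name `embed_ltr` and its parameters are made up for the example, and how the result renders still depends on the program displaying it.

```python
RLM = "\u200F"  # RIGHT-TO-LEFT MARK
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embed_ltr(rtl_before, ltr_word, rtl_after):
    """Insert a left-to-right word into right-to-left text."""
    return "".join([
        rtl_before,
        " ",       # space after the Arabic word
        RLM,       # makes that space behave as right-to-left
        LRE,       # begin the left-to-right embedding
        " ",       # space before the embedded word
        ltr_word,
        PDF,       # end the embedding
        rtl_after,
    ])


result = embed_ltr("\u0627\u0644\u0643\u0644\u0645\u0629", "keyboard", " \u0628\u0627\u0644\u0644\u063A\u0629")
print([f"U+{ord(ch):04X}" for ch in result])
```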
Before you start typing in Arabic/Persian, make sure you have chosen the "Right-to-Left-Direction" button. This button can be found on the Paragraph tab, just to the left of the AZ sorting button. Also select the "Align Text Right" button, which can be found on the Paragraph tab to the left of the Justify button.
Start typing in your language.
Before typing an English word, type a space, then press left ALT + SHIFT and type your English word.
Once you have finished your English words, press right ALT + SHIFT, type a space, and continue typing in your language.
Hope this helps
This is OK; they're not shuffled: you're seeing them in LTR rendering mode.
You just need to make them right-to-left. In Notepad or Word, press right Ctrl+Shift to make their direction right-to-left and it will be okay. (It's like having <p dir="rtl">...</p> in HTML).
The control characters LRE and RLE (0x202A and 0x202B) and also LRM and RLM (0x200E and 0x200F) need to be applied to the whole paragraph, i.e. they should come at the beginning of the sequence. Some text display widgets on some platforms may discard these control characters, though, particularly older (pre-2000) platforms or those that do not support the Unicode bidirectional algorithm correctly. Newer operating systems and programs should be fine; try Windows Notepad, for example.
I personally recommend using the platform's own means to make the text RTL and avoiding special control characters, because they're invisible and may cause surprising results if they get out of control. So you'd better use Word's API to make the text RTL, or, if your output is HTML, put the text in <div dir="rtl">...</div> tags. For a plain text file, the user has to press the Ctrl+Shift keys manually.
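For the HTML case, a tiny Python sketch of what "put them in a div" can look like when the output is generated by a script. The helper name `wrap_rtl` is made up; the text is escaped so that characters such as < and & in the content do not break the markup.

```python
import html


def wrap_rtl(text):
    """Wrap text in a right-to-left block, escaping HTML special characters."""
    return f'<div dir="rtl">{html.escape(text)}</div>'


print(wrap_rtl("\u064A\u06C1 \u0627\u06CC\u06A9 word \u0627\u0631\u062F\u0648"))
```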
Edit: this was written as a clarification answer to the first answer here, I later edited the first answer and added the important notes I wrote here [the edit still needs approval though].
I was able to fix my text by following the steps in the first answer here.
In case anyone faces troubles while following the steps, let me clarify some things:
If you are entering an English word in an Arabic text, make sure that RTL option in the ribbon is selected [circled in red in the following figure]:
Keep it selected throughout the paragraph irrespective of the language you are using [as long as the majority of the paragraph is written in an RTL language like Arabic or Hebrew].
Where to find the special characters and how to insert them:
You can write the unicode of the character and then select it and press "Alt + X". However, this can be a bit confusing because of the need to change back and forth between English and Arabic to write the codes, so the best thing to do is enter them 'manually' by inserting their names.
You can do that by going to Insert -> Symbol -> More Symbols -> Special characters [scroll down]. Then select the name of the characters you need to use instead of its unicode.
The names of the characters you'll need to use [as specified in the first answer here] are:
"Right to Left Mark" : U+200F.
"Left to Right Embedding": U+202A.
"Pop Directional Formatting": U+202C.
As the first answer says, nothing will appear on the screen because these are non-printing characters, so it's normal if it feels like nothing happened when you insert them.
If you need to do it the other way around, that is, insert a Hebrew or Arabic word in an English text, just reverse the use of unicodes -- Or follow the steps in the following link: https://superuser.com/a/1247476/767967
If you want to know more about what the special characters do and what it means to make your paragraph LTR or RTL, visit the following link: http://dotancohen.com/howto/rtl_right_to_left.html#Directionality
Select the paragraph (e.g. using triple click) and use the button for right-to-left direction (¶◀) in the Paragraph section of the Start pane.
As Hossein’s answer explains, the issue is the directionality in the paragraph. It changes to left to right when you insert a Latin letter, and you need to fix this manually.
You need to add an invisible RLE Unicode character at the start of the line.
It is RIGHT-TO-LEFT EMBEDDING (RLE): 0x202B hex = 8235 decimal.
This is necessary for Notepad, whereas MS Word is able to handle the direction itself; there you just need to right-align your text correctly.
How to enter RLE: http://www.fileformat.info/tip/microsoft/enter_unicode.htm
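If the plain-text file is produced by a script, the RLE character can be added to each line in code instead of typing it. A minimal Python sketch, with a made-up function name `prefix_rle`; it assumes lines are separated by "\n" and that the viewer honours the character.

```python
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING, 0x202B hex = 8235 decimal


def prefix_rle(text):
    """Put an RLE character at the start of every line."""
    return "\n".join(RLE + line for line in text.split("\n"))


print(ord(RLE))
print(repr(prefix_rle("first line\nsecond line")))
```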
In word processing, you have a main text direction which is either left-to-right or right-to-left (or top to bottom, but let's ignore that :-), and you have a text direction for individual characters, which will also be left to right or right to left.
The word processor splits the text into chunks of strings with the same character ordering, then displays these chunks according to the main text ordering.
It seems that your main text ordering was left to right. As long as all your text is Arabic, there is just one chunk with Arabic text. You can already see that it is displayed left-aligned and not right-aligned, because the text ordering is left to right. The characters are displayed right to left because that is how Arabic is displayed.
When you inserted Latin text, you had three chunks: Arabic, Latin, Arabic. These three chunks are displayed left to right because that is the main text ordering. That would be fine for text that is mostly Latin (like "The Arabic words for dog and cow are ... and ..."). For text that is mostly Arabic with the occasional Latin word, you need to change the main text ordering to "right to left".
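The chunking described above can be imitated with a few lines of Python. This is a deliberately simplified sketch, not the full Unicode bidirectional algorithm: it only looks at strongly directional characters, attaches spaces, digits and punctuation to the chunk before them, and the function names are invented for the example.

```python
import unicodedata


def strong_direction(ch):
    """Return 'rtl', 'ltr', or None for neutral and weak characters."""
    cls = unicodedata.bidirectional(ch)
    if cls in ("R", "AL"):
        return "rtl"
    if cls == "L":
        return "ltr"
    return None


def direction_runs(text):
    """Split text into chunks of characters sharing one direction."""
    runs = []  # each entry is [direction, chunk]
    for ch in text:
        d = strong_direction(ch)
        if runs and (d is None or runs[-1][0] in (None, d)):
            runs[-1][1] += ch
            if runs[-1][0] is None:
                runs[-1][0] = d  # a leading neutral chunk adopts the first strong direction
        else:
            runs.append([d, ch])
    return [tuple(run) for run in runs]


def screen_order(runs, base):
    """Order chunks from left to right on screen for the main text direction."""
    return list(runs) if base == "ltr" else list(reversed(runs))


text = "\u0645\u062B\u0627\u0644 word \u0627\u0631\u062F\u0648"  # Arabic, Latin, Arabic
runs = direction_runs(text)
print([direction for direction, _ in runs])
print([direction for direction, _ in screen_order(runs, "rtl")])
```

With the main direction set to "ltr" the first Arabic chunk stays on the left, which is the jumbled look from the question; with "rtl" the chunks are laid out starting from the right.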
Just follow this:
Copy and paste the Arabic text from the Word or text document into Adobe Illustrator.
Save the Illustrator document in .EPS format.
Open InDesign and place the .EPS document where you want it.
Since InDesign can't handle Arabic text properly by itself, this method will help many designers.