GitHub heading text to URL fragment identifier - github

In GitHub's markdown we can create an anchor to any heading, like the example below:
# Heading 1
Click [here](#heading-2) to go to the second heading
# Heading 2
This is the second heading :)
The problem is that I couldn't find the rules to convert the text in the heading (in this case "Heading 2") to the fragment identifier (in this case heading-2). There are some non-intuitive behaviors, like some characters getting omitted and accented letters not changing to their non-accented versions ("Á!á?É!é?" would be turned into ááéé, for example).
I want to automate this process so it would be really useful to see the rules for this conversion with all possible cases and exceptions.
Where can I find this information?
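As a starting point for automating this, here is a minimal sketch of the conversion as it is commonly observed, not GitHub's official algorithm. It assumes three rules: lowercase the text, drop every character that is not a letter, digit, underscore, hyphen, or space, then turn each space into a hyphen. The function name heading_to_fragment is hypothetical. The sketch ignores duplicate headings on the same page (which get a numeric suffix) as well as inline markup and emoji, so check it against real rendered pages before relying on it.

```python
import re


def heading_to_fragment(heading: str) -> str:
    """Approximate the fragment generated for a heading (assumed rules, see above)."""
    text = heading.strip().lower()
    # Keep word characters (Unicode letters, digits, underscore), hyphens and
    # spaces; drop everything else. Accented letters are kept as they are,
    # they are not transliterated to their non-accented versions.
    text = re.sub(r"[^\w\- ]", "", text)
    # Each remaining space becomes a hyphen.
    return text.replace(" ", "-")


print(heading_to_fragment("Heading 2"))   # heading-2
print(heading_to_fragment("Á!á?É!é?"))    # ááéé
```

Both outputs agree with the examples in the question, but that is only evidence that the assumed rules cover these two cases.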

Related

Microsoft Word: is anchor character or not?

I am trying to simulate how Word displays non-printing characters. All of them work except anchors, and I didn't find any information about them. Is an anchor a special character placed in the text, or is it a parameter of a floating object that is merely displayed as a special character?
Thank you for your answer.
The anchor, unlike most non-printing characters, can never print. It's merely a visual aid to inform the user with which paragraph or character a graphic with text flow formatting is associated. It's not possible to detect an anchor directly in the document text using Word's API (object model). It's bound to the graphic and would require analyzing the properties of the Shape object.
It could be determined by analyzing the document's WordOpenXML, although the term "anchor" is not used. The information could be deduced from the location and settings of the nodes that define where and how the graphic appears.
Is an anchor a special character placed in the text, or is it a parameter of a floating object that is merely displayed as a special character?
I'm going to try to answer the "is it in text" question.
If, while debugging, you try to get a textual character for an anchor from a range's text, it won't be there. There won't even be a 0-width non-visible character there, like when you move a text cursor to the right past a non-printable character, but it doesn't actually move because there's something there (this may be editor-dependent, I have Notepad++ in mind).
So no, it's not in text.
But, at the same time, it will interfere with searches. For example, if you put the word "text" on a line, put a text box on that line to create an anchor, and then search for "^13text" (with wildcards enabled, ^13 means the end-of-paragraph mark), the search won't find it.
So yes, it must be in text because it interferes with searches.
So this might be a contradiction, but let's keep going. If it's in text, where is it? If you place the text cursor on the previous line, hold shift, and move it once to the right, the text box will be highlighted.
So it must be at the start.
But, there is also evidence that contradicts this. If you have a field at the start of the line with text on it, you can move it once to the right as before, and then once to the left, and though you have part of the line highlighted, the text box won't be part of the selection.
So, I really have no idea whether it's text or not, or where it is if so, but hopefully this helps someone else.

bullet point indenting not working - word 2007

Adjust List Indents function in Microsoft Word 2007 not working once the list goes past 10.
For heading 09 I open the Adjust List Indents function (by right-clicking) and set the "Text Indent at" value to .05. This works. However, for every heading after 10, following the exact same steps does not work.
This is not an indent issue; it's the alignment in your numbering style. Take some time to study this:
http://wordfaqs.mvps.org/NumberAlignment2007.htm#NumberedLists
A related MS Word skill which will leverage your efforts by an order of magnitude is to learn how to define custom List Styles and assign them to custom Paragraph Styles. There is a very good tutorial here:
http://shaunakelly.com/word/numbering/numbering20072010.html
and the analogous bullet list version:
http://shaunakelly.com/word/bullets/controlbullets20072010.html
I would add one level of efficiency to those tutorials:
You don't need to create a Paragraph Style for each list/bullet level. You only need to assign a Paragraph Style to the first level. When the Paragraph Style is applied to text, the List Style will be applied correctly to all levels based on the indentation of the list items.

How to increase visual length of form text field in Word?

When a form text field is inserted in a Word document, the grey shaded length is about 5 characters long. How can this length be increased?
Although it is a rather crude measure (and I don't recommend it), you can set "Properties -> Default Text" to as many blanks as you want the field to be wide. But this comes at a price: when you move into the field by pressing TAB, all the blanks are selected and get typed over, but when you use the mouse, the cursor lands wherever you click inside the field and you start typing there. Your entry may then be preceded and followed by a number of blanks that you have to trim away, e.g. in an exit macro.
I recommend old form fields only as a last resort (i.e. there must be a good reason to use them) and would prefer, in this order:
native Word 2010/2007 fields (text or Rich text - perhaps not backwards compatible)
legacy ActiveX fields (compatible with W2003)
Legacy (old) form fields

ignore differences in syntax in beyondcompare

In a branch of code I have changed all of the code from obj.varname to obj("varname") and when I compare the code I would like to ignore these differences since varname is the same.
I have a regular expression that I think I need, but unfortunately I can't get the differences to be ignored using Beyond Compare from Scooter Software:
^obj\("\w*"\)|obj\.\w*$
I am following this tutorial http://www.scootersoftware.com/support.php?zz=kb_unimportantv3
So my question: is this even possible with beyond compare? If yes, please share a solution including either instructions or post your screenshots.
Beyond Compare 3's Professional edition supports this through its Text Replacements feature. If you've already purchased a Standard edition license you need to revert to trial mode to test it: http://www.scootersoftware.com/suppo...?zz=kb_evalpro
Load your two files in the Text Compare.
Open the Session Settings dialog from the Session menu, and on the Replacements tab click New to create a new replacement.
In the Text to find edit, use (\w+)\.(\w+)
In the Replace with edit, use $1("$2")
Check the Regular expression checkbox.
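To see what this replacement does to a line before setting it up, the same pattern can be tried in a short Python sketch. This only mirrors the idea: Python's re module writes backreferences as \1 rather than $1, Beyond Compare's regex engine may differ in details, and the function name normalize is made up for illustration.

```python
import re

# Same pattern as in the "Text to find" edit above.
PROPERTY_ACCESS = re.compile(r'(\w+)\.(\w+)')


def normalize(line: str) -> str:
    """Rewrite name.member as name("member") so both spellings compare equal."""
    return PROPERTY_ACCESS.sub(r'\1("\2")', line)


old_line = 'total = obj.varname + 1'
new_line = 'total = obj("varname") + 1'

print(normalize(old_line))                         # total = obj("varname") + 1
print(normalize(old_line) == normalize(new_line))  # True
```

Note that \w+ also matches digits, so in this sketch a numeric literal such as 3.14 would be rewritten as well. If that matters for your files, a narrower pattern may be the safer choice.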
The alternative would be to mark any instance of obj.varname and obj("varname") as unimportant. The basic steps would be this:
Load your two files in the Text Compare.
Open the Session Settings dialog from the Session menu, and on the Importance tab click the Edit Grammar... button.
In the next dialog click the New... button below the top listbox.
Change the Element name field to something useful (say, "PropertyAccess").
Change the Category to List.
In the Text in list edit, add these two lines:
obj.varname
obj("varname")
Click OK to close the Grammar Item dialog and then click OK again to close the Text Format grammar item.
Uncheck "PropertyAccess" (or whatever you named it) in the Grammar elements listbox in the Session Settings dialog, then click OK to close it.
This approach isn't as flexible or clean. The steps above match specific, hardcoded object and variable names, so obj.varname is unimportant but obj.othervar isn't, even when it's aligned against obj("othervar"). A difference is unimportant only if the text on both sides is unimportant; if either side is important, the difference is important. So although obj.varname and obj("varname") are marked unimportant everywhere, the result is still correct: each is either aligned with text that also matches those definitions (an unimportant difference) or with text that doesn't (an important difference).
You can use regular expressions to match more general text categories, but you probably don't want to. For example, if you wanted to match all text that followed that pattern you could use these two lines instead:
\w+\.\w+
\w+\("\w+"\)
And then check the Regular expressions checkbox in the Grammar Item dialog so they're matched that way.
The upside/downside to that is that any text that matches those patterns is then unimportant. abc.newvar vs. def.varname would be considered an unimportant difference because both sides match the unimportant definition. That's good for things like comments or whitespace changes, but probably isn't what you want to do here.
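The over-matching described above is easy to reproduce outside Beyond Compare. The sketch below is only an illustration of the rule "a difference is unimportant when both sides are unimportant", not how Beyond Compare is implemented. The helpers is_unimportant and difference_is_important are hypothetical, and each side is treated as a single token.

```python
import re

# The two regular-expression list entries from above.
UNIMPORTANT_PATTERNS = [
    re.compile(r'\w+\.\w+'),
    re.compile(r'\w+\("\w+"\)'),
]


def is_unimportant(token: str) -> bool:
    """A token is unimportant if it fully matches one of the patterns."""
    return any(p.fullmatch(token) for p in UNIMPORTANT_PATTERNS)


def difference_is_important(left: str, right: str) -> bool:
    """Unimportant only when both sides are unimportant; otherwise important."""
    return not (is_unimportant(left) and is_unimportant(right))


print(difference_is_important('obj.varname', 'obj("varname")'))  # False (intended)
print(difference_is_important('abc.newvar', 'def.varname'))      # False (probably not intended)
print(difference_is_important('obj.varname', 'total'))           # True
```

The second call shows the downside: two unrelated property accesses are treated as an unimportant difference simply because both sides match a pattern.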

How to do search and replace involving fields in Microsoft Word?

I have a Word document with fields of the reference variety, which occur in the form "[field].[field]"--in other words, there's a period between the two fields. I want to globally replace this with a space.
Word offers the ^d special character to search for fields, but for some reason the query "^d.^d" does not find anything. However, ".^d" does. Now comes the problem, however--what do I specify as the replacement text in order to retain the field code? If using regular expressions, I could use a "Find What Expression" such as \1, but with regexp ("wild card") mode the ^d is not permitted.
I guess I could write a macro...
I would like to add to Bibadia's solution.
An example of an index entry field; we want to change a name we misspelled.
Make sure hidden formatting is displayed (toggle with CTRL+SHIFT+8).
Make sure the wildcards option is not selected. To search for fields, use the codes for the opening and closing field braces (optionally use ^w for spaces, as Bibadia suggested): ^19 XE "Deo, John" ^21
Replace won't recognize the field brace characters, but it does allow inserting the clipboard's contents. To use that, type the correct entry somewhere in the text: press CTRL+F9 to insert a field and type: XE "Doe, John"
Select the field above and copy
Use ^c in the replace box
Hit Replace All
Ta-da!
It's usually better to go the macro route when finding fields because, as you say, the find algorithm that Word uses doesn't work the way you might hope with fields.
But if you know exactly what the fields contain, you can specify a search pattern that will probably work (however not in wildcard mode).
For example, if you want to look for figure number field pairs such as
{ STYLEREF 1 \s }.{ SEQ Figure \* ARABIC \s 1 }
(which would typically be the same set of fields everywhere in the document)
and you only really need to look for the following:
{ STYLEREF 1 \s }.<any field>
you could ensure that field codes are displayed and search for
^d STYLEREF 1 \s ^21.^d
or
^19 STYLEREF 1 \s ^21.^19
If you need to be more precise, you can spell out the second field as well.
"^d" only works for finding the field beginning, not the field end.
It's a shame that ^w wants to find at least 1 whitespace character because otherwise it would be more robust to look for
^19^wSTYLEREF^w1^w\s^w^21.^19
Perhaps someone else knows how to work around that without using wildcards?
Torzaburo,
I suggest that you do this using a macro. You can start by recording the macro, and later refining your processing steps within the macro.
First turn on the hidden characters: navigate to Home > Paragraph and toggle the show/hide paragraph symbol. Then select all and toggle the field codes on (right-click and select "Toggle Field Codes").
Open a new blank Word doc in addition to the one you have open. You will use this later. Start the macro recording and find the field using the "^d" (field code) as you said.
When the field is found, copy only the field text within the brackets, and not the full field reference. While the macro is still recording, ALT + TAB to the new blank document and paste the field code in as plain text.
At this point, do the necessary find & replace processing to the field codes. Highlight the processed field codes, copy, ALT + TAB back to the original document, and paste back between the { } brackets.
Stop the macro recording. Add any further custom processing to the macro VBA.
Select-All and re-toggle the field codes. Update the field codes.
You don't need a macro. Just toggle all field codes on by using Alt+F9. Then do a find and replace for what you want to change. Once the replacement is complete, use Alt+F9 again to toggle the field codes back off.
Disclaimer: I didn't originate this solution, but it's clean and elegant and I thought it should be included here:
(Adapted from Search & Replace Field Codes in Word):
1. Create or find a single instance of the field you want to convert text to
2. Toggle Field Codes visible (Alt+F9)
3. Copy the code for the field you want to use to the Clipboard (highlight and Ctrl+C)
4. Open the Replace dialog box (Ctrl+H), insert the text you want to replace in the Find What box and then enter ^c in the Replace With box.
This will replace your text with the contents of the Clipboard, turning it into the field code you copied in step 3. It also copies formatting information (font, color, etc.), to control how the field will appear when hidden. (Caveat: I've tested this with Word 2003 under Windows 7 only.)
Coming in late on this, probably way too late for Beth (sorry Beth). And this may not be quite what Beth was looking for. But for anyone interested ...
It sounds like Beth may have created captions throughout the document using INSERT CAPTION (hence the presence of field codes). This means these captions will have been (automatically) created in CAPTION style.
To globally replace the separator "." with " " (space) in such captions, take two steps:
[1] Go to REFERENCES | INSERT CAPTION, then click on NUMBERING and replace the SEPARATOR "." with "EM-DASH". This will replace all separators in captions for the selected label in the CAPTION Window. If you have other labels in use in the document (e.g. FIGURE), select the other labels one by one and repeat this process.
[2] Do a find/replace searching for special character "em-dash" (^+) in style CAPTION, replacing with " ". Click REPLACE ALL.
Voila!
NOTE: This presumes that em-dash does not appear in the caption text anywhere. If it does, then you'll need to do a pre- and post- "fiddle" to ensure these em-dashes are not touched by the global replace above.
The "pre-fiddle" is to do a global find/replace across captions, replacing the em-dash ("^+") with some other string (e.g. "EM-DASH") that doesn't ever occur in any caption's text. Then you do the separator change as described above. Finally, the "post-fiddle" is to restore the em-dashes that were in the captions, by doing a global replace of the string "EM-DASH" with the actual em-dash character "^+".
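The pre- and post-"fiddle" is a general placeholder-swap technique, and its order of operations can be sketched on a plain string. This is only a model of the steps, not something that operates on a Word document. The sample caption is invented, and the line that changes "1.2" stands in for what Word's caption numbering dialog does in step [1].

```python
EM_DASH = "\u2014"
PLACEHOLDER = "EM-DASH"  # must not occur in any caption text


def protect(caption: str) -> str:
    """Pre-fiddle: hide the caption's own em-dashes behind a placeholder."""
    if PLACEHOLDER in caption:
        raise ValueError("placeholder already occurs in the caption text")
    return caption.replace(EM_DASH, PLACEHOLDER)


def restore(caption: str) -> str:
    """Post-fiddle: turn the placeholder back into real em-dashes."""
    return caption.replace(PLACEHOLDER, EM_DASH)


caption = "Figure 1.2 Pump" + EM_DASH + "side view"  # invented sample caption

protected = protect(caption)
# Stand-in for step [1]: the separator "." becomes an em-dash.
changed = protected.replace("1.2", "1" + EM_DASH + "2", 1)
# Step [2]: replace every remaining em-dash (now only separators) with a space.
spaced = changed.replace(EM_DASH, " ")
final = restore(spaced)

print(final)
```

The em-dash inside the caption text survives because it was not an em-dash at the moment the global replace ran.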