What is the difference between \ and \\ - ms-word

I have an embedded Word document in my worksheet, named "Rec1".
The field code is as follows:
{LINK Excel.SheetMacroEnabled.12 "C:\\Documents and Settings\\user\\Desktop\\Salaries\\StaffSalaries.xlsm" مالي!R2C13 \a \f 4 \r \* MERGEFORMAT}
What is the difference between using "\\" (double backslash) and "\" (single backslash)?

Word field codes originate in the C programming language. In that language, the backslash is used to indicate what in Office are called "switches" (like parameters). You see this a lot in command-lines, as well.
So in the LINK field you show us, \a, \f 4, \r and \* MERGEFORMAT are telling Word how to manage the field code (more info at https://support.office.com/en-us/article/field-codes-link-field-09422d50-cde0-4b77-bca7-6a8b8e2cddbd?ui=en-US&rs=en-US&ad=US).
\a tells the field it should update automatically
\f 4 tells Word to maintain Excel's original formatting
\r instructs Word to use RTF conversion for displaying the content
\* introduces a formatting switch; in this case (MERGEFORMAT), manually applied formatting should be retained when the field is updated
Because a single backslash denotes a switch, when you want to pass a literal backslash you need to double it up. This is the case for a file path, for example.
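As a rough illustration of the doubling, here is a minimal Python sketch that turns an ordinary Windows path into the form the LINK field expects. The function name escape_field_path is made up for this example and is not part of Word or any Office API.

```python
def escape_field_path(path: str) -> str:
    # Double every backslash so the field code reads each one as a
    # literal backslash instead of the start of a switch.
    return path.replace("\\", "\\\\")


plain = r"C:\Documents and Settings\user\Desktop\Salaries\StaffSalaries.xlsm"
print(escape_field_path(plain))
```

The printed path is the one that appears inside the quotes of the field code in the question.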

A backslash \ is often used to escape characters in many applications and programming languages. But since it's an escape character, it also needs to escape itself, if you literally mean \.
So in an environment where \ is an escape character, you need a double backslash \\ to mean \.

Related

Perl regex presumably removing non ASCII characters

I found some code with a regex that is claimed to strip the text of any non-ASCII characters.
The code is written in Perl and the part of code that does it is:
$sentence =~ tr/\000-\011\013-\014\016-\037\041-\055\173-\377//d;
I want to understand how this regex works and in order to do this I have used regexr. I found out that \000, \011, \013, \014, \016, \037, \041, \055, \173, \377 denote individual characters such as NULL, TAB, VERTICAL TAB ... But I still do not get why the "-" symbols are used in the regex. Do they really mean "dash symbol" as shown in regexr or something else? Is this regex really suited for deleting non-ASCII characters?
This isn't really a regex. The dash indicates a character range, like inside a regex character class [a-z].
The expression deletes a good many ASCII characters, too (control characters such as tab, the punctuation from ! through -, and { through DEL), and it leaves any character above \377 untouched, none of which is ASCII; the full ASCII range would simply be \000-\177.
To be explicit, the d flag says to delete every character found in the first list that has no counterpart in the second list, which is empty here. See further the documentation.
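To see which characters the search list actually covers, the octal ranges can be expanded by hand. The following is a minimal Python sketch that models the tr///d call, not Perl itself; the names OCTAL_RANGES, DELETED, KEPT_ASCII and strip_like_tr are invented for the illustration.

```python
# Octal ranges copied from the tr/// search list in the question.
OCTAL_RANGES = [("000", "011"), ("013", "014"), ("016", "037"),
                ("041", "055"), ("173", "377")]

# Every character that falls inside one of the ranges gets deleted.
DELETED = {
    chr(code)
    for low, high in OCTAL_RANGES
    for code in range(int(low, 8), int(high, 8) + 1)
}


def strip_like_tr(text: str) -> str:
    # Model of tr/.../d with an empty replacement list: drop listed characters.
    return "".join(ch for ch in text if ch not in DELETED)


# ASCII characters that survive the deletion.
KEPT_ASCII = "".join(chr(c) for c in range(128) if chr(c) not in DELETED)

print(repr(KEPT_ASCII))
print(strip_like_tr("Hello, wörld!\t{ok}"))
```

In this model the survivors are line feed, carriage return, space, and everything from "." through "z", so the expression removes far more than non-ASCII characters.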

Why do Atom snippets need four backslashes at once in their body in order to print a single backslash

I just realized this while setting up a snippet.
'.source':
  'shrug':
    'prefix': 'shrug'
    'body': '¯\\\\_(ツ)_/¯'
In order to print the typical ¯\_(ツ)_/¯ shrug, you need 4 backslashes. Using 2 backslashes doesn't cause any errors, but the backslash won't be printed. I would understand why you'd need 2, but why 4?
The four backslashes in Atom snippets are due to snippets using the generic CSON notation (CoffeeScript-style JSON).
It's well described in this comment on an issue from the atom-snippets repo:
I think that four backslashes makes sense, however notationally
inconvenient.
It has to do with the levels of interpretation a snippet goes through
before it ends up in your text buffer:
The snippet is declared in a CSON file, the parsing of string elements
in this format is "backslash sensitive" i.e. \n represents the newline
character and a \ is represented as \\.
The snippet then has to be
parsed by the snippet body parser. The parser uses one \ to escape the
following character, e.g. \\ becomes \. So the process goes as follows:
\\\\ --CSON--> \\ --BodyParser--> \
The reason two backslashes used to work, was because the snippet body
parser never really handled escaped characters (the escape cases were
handled explicitly rather than in a generic way) this was why we had
bug #60.
The process could be made more notationally friendly if the snippets
were stored in a custom format. Then we would have more control over
how it is parsed, such as not interpreting backslashes before they are
being fed to the body parser.
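The two levels of interpretation described in the comment can be imitated with a small Python sketch. It is a simplified model, not Atom's code: unescape_once is a made-up helper that only strips one level of backslash escaping and ignores special sequences such as \n.

```python
import re


def unescape_once(text: str) -> str:
    # One parsing pass: a backslash followed by any character
    # is replaced by that character alone.
    return re.sub(r"\\(.)", lambda m: m.group(1), text, flags=re.DOTALL)


def through_both_parsers(source: str) -> str:
    # Pass 1 stands in for the CSON parser, pass 2 for the snippet body parser.
    return unescape_once(unescape_once(source))


four = "¯" + "\\" * 4 + "_(ツ)_/¯"   # what the snippet file contains
two = "¯" + "\\" * 2 + "_(ツ)_/¯"    # the version that loses its backslash

print(through_both_parsers(four))
print(through_both_parsers(two))
```

With four backslashes one survives both passes; with two, the second pass treats the remaining backslash as an escape for the underscore and drops it, which matches the behaviour described in the question.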

When are double quotes required to create a KDB/q symbol?

Normally, for simple character strings, a leading backtick does the trick.
Example: `abc
However, if the string has some special characters, such as space, this will not work.
Example: `$"abc def"
Example: `$"BAT-3Kn.BK"
What are the rules when $"" is required?
Simple syntax for symbols can be used when the symbol consists of alphanumeric characters, dots (.), colons (:), and (non-leading) underscores (_). In addition, slashes (/) are allowed when there is a colon before it. Everything else requires the `$"" syntax.
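The rule above can be written down as a small check. This is a Python sketch of the rule exactly as stated in this answer, not something verified against the q parser; needs_cast is a hypothetical name, and "a colon before it" is read here as "a colon anywhere earlier in the text".

```python
def needs_cast(text: str) -> bool:
    # True if the text would need the `$"..." form under the rule above.
    seen_colon = False
    for index, ch in enumerate(text):
        if (ch.isascii() and ch.isalnum()) or ch == ".":
            continue
        if ch == ":":
            seen_colon = True
            continue
        if ch == "_" and index > 0:      # underscores are fine unless leading
            continue
        if ch == "/" and seen_colon:     # slashes only after a colon
            continue
        return True
    return False


for candidate in ["abc", "abc def", "BAT-3Kn.BK", ":path/to/file", "_abc"]:
    print(candidate, needs_cast(candidate))
```

Under this reading, abc and :path/to/file can use the plain backtick form, while the other three need the cast.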
The book 'Q for mortals', which is available online, has a section discussing datatypes. For symbols it states:
A symbol can include arbitrary text, including text that cannot be
directly entered from the console – e.g., embedded blanks and special
characters such as back-tick. You can manufacture a symbol from any
text by casting the corresponding list of char to a symbol. (You will
need to escape special characters into the string.) See §6.1.5 for
more on casting.
q)`$"A symbol with blanks and `"
`A symbol with blanks and `
The essential takeaway here is that converting a string to a symbol is required when special characters are involved. In the examples you have given both space " " and hyphen "-" are characters that cannot be directly placed into a symbol type.

How do I check if a character is a Unicode new-line character (not only ASCII) in Rust?

Every programming language has its own interpretation of \n and \r.
Unicode supports multiple characters that can represent a new line.
From the Rust reference:
A whitespace escape is one of the characters U+006E (n), U+0072 (r),
or U+0074 (t), denoting the Unicode values U+000A (LF), U+000D (CR) or
U+0009 (HT) respectively.
Based on that statement, I'd say a Rust character is a new-line character if it is either \n or \r. On Windows it might be the combination of \r and \n. I'm not sure though.
What about the following?
Next line character (U+0085)
Line separator character (U+2028)
Paragraph separator character (U+2029)
In my opinion, we are missing something like a char.is_new_line().
I looked through the Unicode Character Categories but couldn't find a definition for new-lines.
Do I have to come up with my own definition of what a Unicode new-line character is?
There is considerable practical disagreement between languages like Java, Python, Go and JavaScript as to what constitutes a newline-character and how that translates to "new lines". The disagreement is demonstrated by how the batteries-included regex engines treat patterns like $ against a string like \r\r\n\n in multi-line-mode: Are there two lines (\r\r\n, \n), three lines (\r, \r\n, \n, like Unicode says) or four (\r, \r, \n, \n, like JS sees it)? Go and Python do not treat \r\n as a single $ and neither does Rust's regex crate; Java's does however. I don't know of any language whose batteries extend newline-handling to any more Unicode characters.
So the takeaway here is
It is agreed upon that \n is a newline
\r\n may be a single newline
unless \r\n is treated as two newlines
unless \r\n is "some character followed by a newline"
You shall not have any more newlines beside that.
If you really need more Unicode characters to be treated as newlines, you'll have to define a function that does that for you. Don't expect real-world input to rely on them. After all, we had the ASCII Record separator for a gazillion years and everybody uses \t instead as well.
Update: See http://www.unicode.org/reports/tr14/tr14-32.html#BreakingRules section LB5 for why \r\r\n should be treated as two line breaks. You could read the whole page to get a grip on how your original question would have to be implemented. My guess is by the point you reach "South East Asian: line breaks require morphological analysis" you'll close the tab :-)
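The "define a function" advice can be sketched as follows. The example is in Python for brevity and the same membership test carries over to a Rust match on char. The set of characters is an assumption taken from the question (LF, CR, U+0085, U+2028, U+2029), not an official Unicode property, and is_newline is a made-up name.

```python
# Assumed set: LF and CR plus the three code points listed in the question.
NEWLINE_CHARS = frozenset("\n\r\u0085\u2028\u2029")


def is_newline(ch: str) -> bool:
    # True only for a single character from the assumed newline set.
    return len(ch) == 1 and ch in NEWLINE_CHARS


sample = "a\nb\u2028c\td"
print([is_newline(ch) for ch in sample])
```

Note that this classifies single characters only; deciding whether \r\n counts as one line break or two is a separate choice, as discussed above.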
According to this documentation, the newline character is 0xA (U+000A, LF).
Sample: Rust Playground
fn main() {
    let c = '\n'; // c is our `char`
    // 0xA is the code point of LF, so this is the same as `c == '\n'`
    if c == 0xA as char {
        println!("got a newline character");
    }
}

Allowed characters in CSS 'content' property?

I've read that we must use Unicode values inside the content CSS property i.e. \ followed by the special character's hexadecimal number.
But what characters, other than alphanumerics, are actually allowed to be placed as is in the value of content property? (Google has no clue, hence the question.)
The rules for “escaping” characters are in the CSS 2.1 specification, clause 4.1.3 Characters and case. The special rules for quoted strings, as in content property value, are in clause 4.3.7 Strings. Within a quoted string, any character may appear as such, except for the character used to quote the string (" or '), a newline character, or a backslash character \.
The information that you must use \ escapes is thus wrong. You may use them, and may even need to use them if the character encoding of the document containing the style sheet does not let you enter all characters directly. But if the encoding is UTF-8, and is properly declared, then you can write content: '☺ Я Ω ⁴ ®'.
As far as I know, you can insert any Unicode character. (Here's a useful list of Unicode characters and their codes.)
To utilize these codes, you must escape them, like so:
U+27BA becomes \27BA
Or, alternatively, I think you may just be able to escape the character itself:
content: '\➺';
Source: http://mathiasbynens.be/notes/css-escapes
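For the hexadecimal form, the escape can be derived from the character's code point. The following Python sketch shows the idea; css_escape is a hypothetical helper, and the trailing space is added because CSS lets one whitespace character terminate a hex escape, so a following hex digit is not swallowed.

```python
def css_escape(ch: str) -> str:
    # Backslash, the code point in hexadecimal, then a terminating space.
    return "\\" + format(ord(ch), "X") + " "


arrow = "\u27BA"  # the character from the example above
print("content: '" + css_escape(arrow) + "';")
```

As the first answer points out, this is only needed when the character cannot be typed directly, for example when the style sheet is not served as UTF-8.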