I discovered that the '+' character in a custom node key is converted silently to a space character. I obviously need to escape these special characters, but I could not find documentation about which characters are not allowed in keys.
Thanks!
There should be no conversion, except for casting non-strings to a string.
When the generateIds option is used, the key is added as an id="KEY" attribute to the generated HTML element, so the standard restrictions apply.
The key is also used internally as a JavaScript hash key.
I'd recommend plain ASCII keys, but '{', '.', '~', ... should be no problem as well.
As far as I know, + is interpreted as space by browsers, when part of a URL, so maybe you see the conversion there.
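To illustrate the URL behaviour mentioned above, here is a minimal Python sketch. It assumes the key travels through form-style (query string) decoding somewhere along the way, which is only a guess about where the conversion happens; the variable names are made up for the example.

```python
from urllib.parse import quote, unquote_plus

key = "a+b"  # hypothetical node key containing a '+'

# Form-style decoding treats a literal '+' as an encoded space.
decoded_raw = unquote_plus(key)        # 'a b'

# Percent-encoding the key before putting it into a URL protects the '+'.
encoded = quote(key, safe="")          # 'a%2Bb'
decoded_safe = unquote_plus(encoded)   # 'a+b'
```

If the key is percent-encoded before it goes into the URL, the '+' should survive the round trip.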
I opened ./helix/config.toml to set some keymaps, and since I am a Spanish speaker I wanted to bind a command to the ñ key.
I added the following configuration:
[keys.normal]
ñ = "move_char_right"
When I tried to reopen Helix it said the following:
Bad config: unexpected character found: `\u{f1}` at line 13 column 1
Press <ENTER> to continue with default config
I tried to reference it with both its UTF and hex codes, and neither seemed to work (probably because the keys are defined as an enum or a struct). So here I am asking.
This is part of the TOML syntax: To use characters other than ASCII letters, digits, underscores, or dashes in keys, the key itself needs to be in quotes.
[keys.normal]
"ñ" = "move_char_right"
See "bare keys" and "quoted keys" in https://toml.io/en/v1.0.0#keys. Quoting from there, but omitting examples:
Bare keys may only contain ASCII letters, ASCII digits, underscores, and dashes (A-Za-z0-9_-). Note that bare keys are allowed to be composed of only ASCII digits, e.g. 1234, but are always interpreted as strings.
Quoted keys follow the exact same rules as either basic strings or literal strings and allow you to use a much broader set of key names. Best practice is to use bare keys except when absolutely necessary.
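The bare-key rule quoted above can be sketched as a small Python helper. This is only an illustration: toml_key is a made-up name, not part of Helix or any TOML library, and it assumes the key contains no control characters (those would need further escaping in a basic string).

```python
import re

# Bare keys: one or more of A-Z, a-z, 0-9, underscore, dash.
BARE_KEY = re.compile(r"[A-Za-z0-9_-]+")

def toml_key(key: str) -> str:
    """Return the key bare if TOML allows it, otherwise as a quoted key."""
    if BARE_KEY.fullmatch(key):
        return key
    # Basic-string quoting: escape backslashes and double quotes.
    escaped = key.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

line = f'{toml_key("ñ")} = "move_char_right"'
```

Here line comes out as "ñ" = "move_char_right" (with the key in quotes), while a key such as move_char-1 or 1234 is left bare.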
I need a temporary delimiter, inserted server-side, that cannot possibly exist in content created by the user.
The purpose is to prepare content for CSV export with a configurable value delimiter: this untypeable character will be replaced by the chosen delimiter client-side, right before the export.
Does such a character even exist?
There is no character that cannot possibly exist; however there are many characters (in particular control codes - those lower than decimal 32, excluding cr/lf/tab) that are extremely unlikely to exist in any reasonable text content. This is why escaping is often required in text-based protocols. There is no reserved space of characters that will be escaped in CSV, other than those already used in CSV itself.
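As a sketch of the escaping approach, here is how Python's standard csv module copes with a configurable delimiter without needing a reserved character. The helper names to_csv and from_csv and the sample rows are made up for this example; the real export happens client-side, so treat this as an illustration of the principle only.

```python
import csv
import io

def to_csv(rows, delimiter=","):
    """Serialize rows; fields containing the delimiter, quotes or newlines get quoted."""
    buf = io.StringIO(newline="")
    csv.writer(buf, delimiter=delimiter, quoting=csv.QUOTE_MINIMAL).writerows(rows)
    return buf.getvalue()

def from_csv(text, delimiter=","):
    """Parse the text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))

# User content that contains the delimiter, a quote character and a newline.
rows = [["name", "note"], ["O'Reilly; Sons", 'said "hi"\nthen left']]
text = to_csv(rows, delimiter=";")
restored = from_csv(text, delimiter=";")
```

Because the writer quotes any field that contains the delimiter, the data round-trips whichever delimiter is configured.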
The zero-width joiner is an invisible Unicode character: it is present in the text but renders as nothing. You can use that! :)
I've read that we must use Unicode values inside the content CSS property i.e. \ followed by the special character's hexadecimal number.
But what characters, other than alphanumerics, are actually allowed to be placed as is in the value of content property? (Google has no clue, hence the question.)
The rules for “escaping” characters are in the CSS 2.1 specification, clause 4.1.3 Characters and case. The special rules for quoted strings, as in content property value, are in clause 4.3.7 Strings. Within a quoted string, any character may appear as such, except for the character used to quote the string (" or '), a newline character, or a backslash character \.
The information that you must use \ escapes is thus wrong. You may use them, and may even need to use them if the character encoding of the document containing the style sheet does not let you enter all characters directly. But if the encoding is UTF-8, and is properly declared, then you can write content: '☺ Я Ω ⁴ ®'.
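The quoting rule above (only the quote character, newlines and the backslash need special treatment) can be sketched in Python as follows. The function name css_string is made up for this example, and it assumes the style sheet is served as UTF-8 so that all other characters can appear as they are.

```python
def css_string(text: str) -> str:
    """Wrap text in single quotes for use as a CSS string value."""
    out = []
    for ch in text:
        if ch in ("\\", "'"):
            # Backslash-escape the backslash and the quote character in use.
            out.append("\\" + ch)
        elif ch in "\n\r\f":
            # Newline characters become hex escapes; the trailing space ends the escape.
            out.append(f"\\{ord(ch):X} ")
        else:
            # Everything else, including non-ASCII, is written directly.
            out.append(ch)
    return "'" + "".join(out) + "'"

declaration = "content: " + css_string("☺ Я Ω ⁴ ®") + ";"
```

With this, declaration is content: '☺ Я Ω ⁴ ®'; with nothing escaped, matching the example in the answer.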
As far as I know, you can insert any Unicode character. (Here's a useful list of Unicode characters and their codes.)
To utilize these codes, you must escape them, like so:
U+27BA Becomes \27BA
Or, alternatively, I think you may just be able to escape the character itself:
content: '\➺';
Source: http://mathiasbynens.be/notes/css-escapes
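If you prefer to generate the hexadecimal form rather than look it up, a short Python sketch could be used (css_hex_escape is a made-up helper name). It appends a space after the hex digits so that a following character such as a digit or the letters a to f is not read as part of the escape.

```python
def css_hex_escape(ch: str) -> str:
    """Return the CSS hex escape for a single character, e.g. U+27BA -> \\27BA."""
    return f"\\{ord(ch):X} "

arrow = css_hex_escape("➺")
declaration = f"content: '{arrow}';"
```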
From Programming in Scala section 6.10 (Page 151):
Identifiers in user programs should not contain '$' character, even though it will compile; if they do this might lead to name clashes with identifiers generated by Scala compiler.
I am sure there's a reason for this, but why not prevent use of the '$' character in alphanumeric identifiers?
Some of the identifiers generated internally by the Scala compiler contain '$' characters. If you create new identifiers with '$' characters, you might clash with the internally generated identifiers, and chaos ensues. OTOH, you sometimes need to use '$' characters, either on those (now very rare) occasions when access to the internally generated Scala identifiers is necessary, or because someone used such an identifier in Java code you wish to call (where it's legal, if also discouraged).
I am interested, in theory, in whether encoding is the same as escaping. According to Wikipedia:
"an escape character is a character which invokes an alternative interpretation on subsequent characters in a character sequence."
My current thought is that they are different. Escaping is when you place an escape character in front of one or more metacharacters to mark them as behaving differently than they normally would.
Encoding, on the other hand, is all about transforming data into another form, and upon wanting to read the original content it is decoded back to its original form.
Escaping is a subset of encoding: you encode only certain characters, by prefixing them with a special character, instead of transforming (typically all or many) characters into another representation.
Escaping examples:
In an SQL statement: ... WHERE name='O\' Reilly'
In the shell: ls Thirty\ Seconds\ *
Many programming languages: "\"Test\" string" (or """Test""")
Encoding examples:
Replacing < with &lt; when outputting user input in HTML
The character encoding, like UTF-8
Using sequences that do not include the desired character, like \u0061 for a
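A few of the examples above can be tried in Python with standard-library calls. This is just an illustration; the variable names are made up, and each language or shell has its own equivalents.

```python
import html
import shlex

# Escaping: only the problematic characters are marked or quoted.
shell_arg = shlex.quote("Thirty Seconds")   # quoted so the shell sees a single argument

# Encoding: the data is transformed into another representation.
html_text = html.escape("<b>")              # markup characters become entities
utf8_bytes = "ñ".encode("utf-8")            # a character becomes a byte sequence

# An escape sequence that avoids typing the character itself.
same_char = "\u0061" == "a"
```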
They're different, and I think you're getting the distinction correctly.
Encoding is when you transform a logical representation of a text ("logical string", e.g. Unicode) into a well-defined sequence of binary digits ("physical string", e.g. ASCII, UTF-8, UTF-16). An escape is a special character (typically the backslash: '\') which initiates a different interpretation of the character(s) following it; escaping is necessary when you need to encode a larger number of symbols with a smaller number of distinct (and finite) bit sequences.
They are indeed different.
You pretty much got it right.