Is there a "n/a" symbol in unicode? - unicode

Is there a Unicode symbol for "n/a"? There are some fractions like ½, but an n/a symbol seems to be missing.
If there is none, what would be the most appropriate Unicode symbol to use for n/a on a website (one that is contained in common fonts, to avoid needing a webfont)?

Looking at the Unicode code charts, I do not see a single N/A symbol. I do, however, see ⁿ (U+207F) and ₐ (U+2090), which you could separate with / (U+002F) eg: ⁿ/ₐ, or ̷ (U+0337), eg: ⁿ̷ₐ, or ̸ (U+0338), eg: ⁿ̸ₐ. Probably not what you are hoping for, though. And I don't know if "common" fonts implement them, either.

For future reference, the fastest way I know to answer questions like the OP's when I have them myself is to go to unicodelookup.com, because of the way it works: there's a search bar at the top, and you just type a string and it will return any and all unicode characters containing that string (this is also a great way to discover new and useful symbols). So in the OP's case, he could proceed like this:
first try entering "not" (without the quotes) in the search field
visually scan through the results... doing so would not reveal a "not applicable" character in this case
try again, but this time entering "applic" in the search field
again, doing so would not turn up anything along the lines of what he's looking for
At that point he would be reasonably confident the current Unicode standard does not have a "n/a" symbol.
If you use Firefox you can define a keyword like "uni" to search that site from the URL bar, meaning any time the browser is open and regardless of what page or site is currently showing, you could do this:
hit [F6]... this moves the cursor to the URL bar at the top
type something like "uni applic" and hit [Enter]... this brings up the unicodelookup.com website with the search results for "applic" already showing
For the above to work you would need to define your keyword ("uni" or whatever you prefer) to point to the location http://unicodelookup.com/#%s.
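The same kind of name search can also be run offline. Below is a minimal sketch, assuming Python's standard unicodedata module (its character database version depends on the Python release, so results can lag behind the current standard). The helper name search_names is made up for illustration.

```python
import sys
import unicodedata


def search_names(fragment):
    """Return (code point, name) pairs whose Unicode name contains fragment."""
    needle = fragment.upper()  # official character names are uppercase
    hits = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points (controls, surrogates, unassigned) yield "".
        name = unicodedata.name(chr(cp), "")
        if needle in name:
            hits.append((cp, name))
    return hits


if __name__ == "__main__":
    for cp, name in search_names("applic"):
        print(f"U+{cp:04X} {name}")
```

Like the website, this only matches substrings of the official character names, so it is worth trying a few different fragments before concluding that a symbol does not exist.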

There's a Negative Acknowledge icon...
␕ SYMBOL FOR NEGATIVE ACKNOWLEDGE: octal 022025, decimal 9237, hex 0x2415 (U+2415)
Found by searching negative on the Unicode Lookup site.
I'm not a fan, and for my purposes have just gone with __N/A__ (Markdown..)

I see lots of answers going head-on at the "Not Applicable" abbreviation, without exploring what a symbol is. A quick search for the equivalent phrase "out of scope" brings up a couple of variations on the No symbol: ⃠ – this seems to fit the bill (and since I was looking for a way to represent inapplicability, I'll be using it in my technical document).
Per the Wikipedia article, the Unicode codepoint U+20E0 is a combining character, so it is superimposed on the preceding character; e.g. ! ⃠ overlays an exclamation point. To get it to appear isolated, use a non-breaking space as the base character.
If you don't want to bother with the combining symbol, the article mentions there's also an emoji U+1F6AB 🚫 but it's typically going to be colored red, or won't render!
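To make the combining behaviour concrete, here is a small sketch in Python. The helper name prohibit is hypothetical, and how the result actually looks depends entirely on the font and renderer.

```python
import unicodedata

COMBINING_NO = "\u20e0"          # COMBINING ENCLOSING CIRCLE BACKSLASH
NBSP = "\u00a0"                  # NO-BREAK SPACE, used as a neutral base
NO_ENTRY_EMOJI = "\U0001F6AB"    # the standalone emoji alternative


def prohibit(base=NBSP):
    """Return base followed by U+20E0, so the mark is drawn over base."""
    return base + COMBINING_NO


if __name__ == "__main__":
    print(prohibit("!"))   # mark over an exclamation point
    print(prohibit())      # mark on its own, carried by a non-breaking space
    print(unicodedata.name(COMBINING_NO), unicodedata.category(COMBINING_NO))
```

The combining mark always follows its base character in the string, which is why the non-breaking space comes first.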

There's actually a single character that could be repurposed for this: the "Square Na" character ㎁ (U+3381), which is used to represent the nanoampere in fullwidth (CJK) scripts.

What about the "SYMBOL FOR NULL" ␀ (U+2400)?

Related

Multiple regex in one command

Disclaimer: I have no engineering background whatsoever - please don't hold it against me ;)
What I'm trying to do:
Scan a bunch of text strings and find the ones that
are more than one word
contain title case (at least one capitalized word after the first one)
but exclude specific proper nouns that don't get checked for title case
and disregard any parameters in curly brackets
Example: Today, a Man walked his dogs named {FIDO} and {Fifi} down the Street.
Expectation: Flag the string for title capitalization because of Man and Street, not because of Today, {FIDO} or {Fifi}
Example: Don't post that video on TikTok.
Expectation: No flag because TikTok is a proper noun
I have bits and pieces, none of them error-free from what https://www.regextester.com/ keeps telling me so I'm really hoping for help from this community.
What I've tried (piecemeal, not all together):
(?=([A-Z][a-z]+\s+[A-Z][a-z]+))
^(?!(WordA|WordB)$)
^((?!{*}))
I think your problem is not really solvable solely with regex...
My recommendation would be to split the input on [\s\W]+ (e.g. with Python's re.split; if you really need strings with more than one word, you can check the length of the result), keep each resulting word whose first character is uppercase (e.g. with Python's str.isupper), and finally filter against a dictionary.
[\s\W]+ matches all whitespace and non-word characters, yielding words...
The reasoning behind this different approach: compiling all "proper nouns" into a regex is more or less impossible, and using "isupper" also works with non-Latin letters (e.g. when your strings are Unicode, [A-Z] won't be sufficient to detect uppercase). Filtering against a dictionary is a far more straightforward approach and much easier to maintain (I would recommend using a set or another data type suited for fast lookups).
Maybe if you can define your use case more clearly, we can work out a pure regex solution...
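The split-and-filter approach could be sketched roughly as follows. This is only an illustration under a few assumptions: the PROPER_NOUNS set is a hypothetical stand-in for a real dictionary, the function name is made up, and placeholders in curly brackets are removed before splitting (a step the answer above does not spell out).

```python
import re

# Hypothetical allow-list; in practice this would be loaded from a dictionary.
PROPER_NOUNS = {"TikTok"}

PLACEHOLDER = re.compile(r"\{[^{}]*\}")   # parameters such as {FIDO}
SEPARATORS = re.compile(r"[\s\W]+")       # whitespace and non-word characters


def title_case_words(text, proper_nouns=PROPER_NOUNS):
    """Return capitalized words after the first word, minus known proper nouns."""
    without_params = PLACEHOLDER.sub(" ", text)
    words = [w for w in SEPARATORS.split(without_params) if w]
    if len(words) < 2:
        return []  # single-word strings are never flagged
    return [w for w in words[1:]
            if w[0].isupper() and w not in proper_nouns]


if __name__ == "__main__":
    print(title_case_words(
        "Today, a Man walked his dogs named {FIDO} and {Fifi} down the Street."))
    print(title_case_words("Don't post that video on TikTok."))
```

A string is flagged when the returned list is non-empty. Note that this sketch treats the whole string as one sentence, so a word that starts a second sentence would also be reported.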

diff text documents but ignore single character differences? Set a minimum edit distance filter?

I have two versions of a large book in txt format and I'd like to compare them to find significant changes between the versions, ignoring small single character differences.
There are lots of diffing tools that can ignore whitespace differences, but I also want to ignore small typos and single or couple character differences. For example, one version of the book has a repeated misspelling of leige hundreds of times and this is corrected in the next version to liege. Some proper nouns have also changed their spelling. (I could make custom workarounds for each misspelling, but would like something more general purpose)
Since I only care about more significant multi-word differences, what I really want is to set a filter that ignores changes to a line unless the Levenshtein edit distance is above some threshold.
Looking around, all the diff/comparison tools I find seem to have code in mind, so they lack any feature for ignoring small text changes. Google's diff_match_patch library is great for diffing plaintext and ignoring whitespace changes (demo here) but doesn't seem to have an out-of-the-box way to ignore single-character non-whitespace differences.
tl;dr; Are there any diff tools that can compare text documents but filter out minor single character non-whitespace differences?
In Beyond Compare you can define "replacements".
An example:
Differences are marked red:
Then you can go to Session->Session Settings and set a replacement:
Or even easier: Mark the text and define the replacement immediately:
Now the difference is unimportant and marked blue:
With one click you can ignore the unimportant differences (red arrow in the screenshot).
Technical remark: I use BC4 with the pro edition.
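For a more general-purpose filter than per-word replacements, the threshold idea from the question could be sketched in Python. This is not a Beyond Compare feature; it is a rough illustration that uses the standard difflib module to align lines and a plain dynamic-programming Levenshtein distance. The function names and the default threshold of 3 are assumptions chosen for the example.

```python
import difflib
from itertools import zip_longest


def levenshtein(a, b):
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def significant_changes(old_lines, new_lines, threshold=3):
    """Return (old, new) line pairs whose edit distance exceeds threshold."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Pair up lines inside a changed block; a missing partner counts as "".
        for old, new in zip_longest(old_lines[i1:i2], new_lines[j1:j2],
                                    fillvalue=""):
            if levenshtein(old, new) > threshold:
                changes.append((old, new))
    return changes


if __name__ == "__main__":
    old = ["my leige lord", "unchanged line", "the sun rose"]
    new = ["my liege lord", "unchanged line", "the moon fell slowly"]
    for before, after in significant_changes(old, new):
        print(f"- {before}\n+ {after}")
```

Pairing lines positionally inside a changed block is a simplification; when lines are inserted or deleted in the middle of a block, the pairs shift and more lines may be reported than strictly necessary.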

Correct syntax for newline in Github Bio

Here is an example on my github profile - https://github.com/jack17529
I want to change this -
Silver Bullet in Issue KILLING.____
Master Mind to create Issues.______
My strongest language is Python not English.
I want to have newline instead of blanks.
like this -
Silver Bullet in Issue KILLING.
Master Mind to create Issues.
My strongest language is Python not English.
I have checked: the Bitbucket bio is not related to the GitHub bio in any way.
Maybe they don't allow us to do it the normal way, but it is of course possible. We can exploit the automatic line-wrapping rule, which moves a word that is too long for the current line onto the next line. All we need to do is put other Unicode spaces in place of normal spaces within a line, and a normal space between lines, so that the wrapping rule works around the ban on newlines.
And if you want a free line, because of the character limitations, you can use the longer one;
" " instead of " "      (Try selecting spaces between quotes with your mouse)
Also, this trick allows me to create unnecessary spaces on Stack Overflow too, like above, in the quote box.
Here is the result: github.com/cosmicog:
I have tried the other answers and the HTML approaches, but no, they filter out HTML tricks of course.
Note: This causes a bad look in the list view and the profile overview tooltip:
Maybe that is why it is not allowed but I hope they will fix this in the future.
As GitHub support told me, there is no way! See here -
According to Github Support
I just did it by simply copying and pasting the character corresponding to this codepoint | unicode-table.com | as many time as needed in order to align the text the way I wanted.
This is the procedure I followed: at the end of each line I pressed Enter, then I filled the new line with 7 instances of the character mentioned above; then I pressed Enter again and started the new line with its text.
This question is a little stale, but I found it before I solved this myself, so I thought I'd drop my solution.
The bio doesn't appear to honor markdown, but neither does it accept HTML entities or elements. I worked around this with non-breaking characters to create long "words" similar to how you've used "_".
You can see in my bio that I needed a " " and a "‑" to format mine. The long word will pop to the next line. If you have a real short line, you can extend it with a lot of non-breaking spaces, but this probably isn't necessary. Since you cannot enter " " you need to use copy/paste or ALT codes (not looked up, but someone might add these for you). Those are the real characters above, so you can take them from this answer.
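The non-breaking trick can be prepared with a few lines of Python instead of copy/paste. This is only a sketch: the function name is made up, and whether the bio actually wraps where intended depends on the width GitHub renders it at, which may change.

```python
NBSP = "\u00a0"        # NO-BREAK SPACE
NB_HYPHEN = "\u2011"   # NON-BREAKING HYPHEN


def bio_with_breaks(lines):
    """Glue each intended line into one unbreakable "word".

    Ordinary spaces and hyphens inside a line are swapped for their
    non-breaking counterparts; the lines are then joined with a single
    ordinary space, the only place where wrapping is allowed.
    """
    glued = [line.replace(" ", NBSP).replace("-", NB_HYPHEN) for line in lines]
    return " ".join(glued)


if __name__ == "__main__":
    print(bio_with_breaks([
        "Silver Bullet in Issue KILLING.",
        "Master Mind to create Issues.",
        "My strongest language is Python not English.",
    ]))
```

Paste the printed text into the bio field; the characters look like normal spaces but are not.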
Refer: How to create newline in Github Bio
Just use   in HTML editor mode to new line is OK, This is my GitHub Bio

What are the unicode ranges for Hindi accented characters?

I'm trying to gather a Unicode list of all the 'o'-like shapes in the Hindi character set. In fact, a list of any characters (in any language) that make use of separate characters to indicate an accent would be better.
I intend to use this Unicode list in a RegExp.
I've been trying to edit a list of character ranges by outputting them in an Input TextField, but editing this text causes weird issues (the keyboard cursor isn't placed on the correct character, selections suddenly disappear / incorrectly warp... in other words... HINDI HELL!)
I've tried this with Notepad++ too, but although it was more responsive, it eventually crapped out on me like it did in the Flash Player textfield. This seems to occur especially while removing the [] block (nulls?) characters. Some of them trigger odd behaviors.
Anyways, all I want is a list of the accents.
An example of a few are in the image below (but I would need ALL accents):
Thanks!
You can find PDFs containing lists of Unicode ranges, grouped by script, here: http://unicode.org/charts/
For Hindi, you probably want Devanagari or Devanagari Extended.
Here is the character class for Devanagari combining marks:
[\u0901\u0902\u0903\u093c\u093e\u093f\u0940\u0941\u0942\u0943
\u0944\u0945\u0946\u0947\u0948\u0949\u094a\u094b\u094c\u094d
\u0951\u0952\u0953\u0954\u0962\u0963]
This is only the basic Devanagari block (not Devanagari Extended).
If you want the complete set (for all languages), you can do it programmatically.
You start from the Unicode data file at ftp://ftp.unicode.org/Public/6.1.0/ucd/UnicodeData.txt, described by TR-44 (http://unicode.org/reports/tr44/#Property_Definitions)
You can use the Canonical_Combining_Class field (see http://unicode.org/reports/tr44/#Canonical_Combining_Class_Values) to filter the exact characters you want.
Can't be more precise, because "accent" is a bit vague :-)
You might even have to also look at General_Category to get the filter right (and exclude certain marks, or symbols, or punctuation).
And a script doing this would definitely be better than trying to mess with text editors.
One of the characteristics of combining characters is that they combine :-)
So you might get all kind of puzzling results (like this: http://www.siao2.com/2006/02/17/533929.aspx :-)
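A script along those lines might look like the sketch below. It assumes Python's standard unicodedata module rather than parsing UnicodeData.txt directly, and it filters on General_Category (Mn, Mc, Me) instead of Canonical_Combining_Class, because spacing vowel signs such as U+093E have a combining class of 0. The function names are made up for illustration.

```python
import unicodedata

MARK_CATEGORIES = {"Mn", "Mc", "Me"}  # nonspacing, spacing, enclosing marks


def combining_marks(start, end):
    """Return all mark characters in the inclusive code point range."""
    return [chr(cp) for cp in range(start, end + 1)
            if unicodedata.category(chr(cp)) in MARK_CATEGORIES]


def to_char_class(chars):
    """Build a regex character class using \\uXXXX escapes (BMP only)."""
    return "[" + "".join(f"\\u{ord(c):04x}" for c in chars) + "]"


if __name__ == "__main__":
    devanagari = combining_marks(0x0900, 0x097F)  # basic Devanagari block
    print(to_char_class(devanagari))
```

Because the output depends on the Unicode version bundled with the Python release, the generated class can contain more marks than a hand-written list based on an older chart.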

Where to get a reference image for any unicode code point?

I am looking for an online service (or collection of images) that can return an image for any unicode code point.
Unicode.org does not have an image for each one; consider for example
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=31cf
EDIT: I need to use these images programmatically, so the code chart PDFs provided at unicode.org are not useful.
The images in the PDF are copyrighted, so there are legal issues around extracting them. (I am not a lawyer.) I suspect that those legal issues prevent a simple solution from being provided, unless someone wants to go to the trouble of drawing all of those images. It might happen, but seems unlikely.
Your best bet is to download a selection of fonts that collectively cover the entire range of characters, and display the characters using those fonts. There are two difficulties with this approach: combining characters and invisible characters.
The combining characters can easily be detected from the Unicode database, and you can supply a base character (such as NBSP) to use for displaying them. (There is a special code point intended for this purpose, but I can't find it at the moment.)
Invisible characters could be displayed with a dotted square box containing the abbreviation for the character. Those you may have to locate manually and construct the necessary abbreviations. I am not aware of any shortcuts for that.
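The detection step described above can be sketched with Python's standard unicodedata module. The sketch only prepares the text to hand to a font renderer; the function names and the exact set of categories treated as "invisible" are assumptions, and the actual drawing (fonts, image output) is left out.

```python
import unicodedata

NBSP = "\u00a0"  # base character used to carry combining marks

# Assumed definition of "invisible": controls, format characters, separators.
INVISIBLE_CATEGORIES = {"Cc", "Cf", "Zs", "Zl", "Zp"}


def is_combining(ch):
    """True for any mark (General_Category Mn, Mc or Me)."""
    return unicodedata.category(ch).startswith("M")


def is_invisible(ch):
    return unicodedata.category(ch) in INVISIBLE_CATEGORIES


def renderable_text(cp):
    """Return the string to draw for code point cp, or None if it needs a
    hand-made placeholder (e.g. a dotted box with an abbreviation)."""
    ch = chr(cp)
    if is_combining(ch):
        return NBSP + ch
    if is_invisible(ch):
        return None
    return ch


if __name__ == "__main__":
    for cp in (0x0041, 0x0301, 0x200B):
        print(f"U+{cp:04X}", repr(renderable_text(cp)))
```

Code points that return None are the ones for which abbreviations would still have to be constructed by hand.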