I have to display $CₓH\subscript{y}$.
Is there any chance to display a subscripted 'y' in Unicode?
\u2093 represents the subscripted 'x'
Usually you do this with formatting. Unicode's selection of superscript and subscript characters doesn't stem from the need or desire to cover whole alphabets but rather to enable specific use cases, e.g. writing IPA. Furthermore, if you're using a good OpenType font it can also support proper subscripts for arbitrary characters at the font level (where a glyph isn't simply scaled down by the layout engine, but rather a specifically-designed subscript glyph from the font is used).
In fact, since you're already using TeX or something vaguely similar to it, just let one of the many implementations render it. There are lots of things you simply cannot do in plain text without formatting, and this is one of them.
The subscript and superscript characters in Unicode do not cover the whole alphabet.
See the Wiki article on this topic or this answer on SO.
In Sublime Text this subscripted y works: ᵧ. Copied from here: https://lingojam.com/SubscriptGenerator
EDIT This is actually the greek letter gamma
Related
On a computer screen, are the characters made up of pixels? If so, it means that characters are images!
And if the characters are made up of pixels, then why are there ASCII, UNICODE and other standards that associate binary digits to different character formats, but there are no standards that associate image formats with binary digits? Because if both are made up of pixels (characters and images), what is the difference between them?
No, #1: Characters are not "on a computer screen". What goes on the screen is the result of all kinds of rendering and painting and combining onto a 2-D grid of pixels.
No, #2: Unicode characters are independent of the specific fonts used to present them graphically. So, with one font, a character will end up producing certain pixels, and with another font - other pixels altogether.
No, #3: Character strings are held in your computer's memory as sequences of bytes, i.e. numeric values (with each character typically occupying one byte, or two, or a variable number of bytes).
On a computer screen, are the characters made up of pixels? if so, it means that characters are images!
On a typical modern screen, yes the graphical representation of a character is a group of pixels. No, computers don't always have a screen
For example in the past people used to interact with computers via multiple types of terminals like a mechanical terminal where the texts are printed directly to paper. Or sometimes a vector screen or a 16-segment/14-segment display is also used where the text representation has no pixel at all. Many computers don't even have a screen or a way to display characters and interact with humans via switches, LEDs, punched cards, network or serial port...
So the premise of the question is already wrong. Characters has nothing to do with pixels. Even when displaying characters on the screen then the pixels representing a character also vary depending on the font face and font size
Character traditionally means a symbol or a glyph representing something. In computing character means a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language. None of them says anything about pixels
Each language has a known set of symbols, so logically they're grouped together and each assigned a number. The whole set of those numbers and their mappings is called a character set. You can see that it makes sense to associate numbers with characters but doing the same for images make no sense. What are the common thing in images that we can map?
In the past there were no need to cooperate with people using other languages so each group of people chose a small set that works for their own language. However with the advent of portable devices and the internet, that doesn't work anymore. It'll be extremely awkward to receive a message that you can't read, or send an email that the customer sees as a bunch of garbage. That's why a bigger character set called Unicode was invented
However character set is just a way to map numbers to glyphs in computers. To deal with characters we also need a way to encode those numbers which is called character encoding. For example in a variable length encoding a long number may be encoded using more bytes. Unicode has multiple encodings like UTF-1, UTF-7, UTF-8, UTF-16 or UTF-32
I totally understand the necessity of integral, and brackets by pieces (2320 2321 239B-23AE) Since it helps building large notations.
But the for the large summation 23B3 23B4, if one stretch these two, they will still lose their shapes. I do not understand what is the logic behind separating this character, or why not a corresponding one for the product symbol 220F. Furthermore, I wonder in which case these two symbols should be used.
This doesn't answer the question fully, but the following paragraph from Unicode Technical Report #28: Unicode 3.2 is probably the most authoritative explanation you can find:
Symbol Pieces. [to follow “APL Functional Symbols”] The characters in the range U+239B..U+23B3, plus U+23B7, comprise a set of bracket and other symbol fragments for use in mathematical typesetting. These pieces originated in older font standards, but have been used in past mathematical processing as characters in their own right to make up extra-tall glyphs for enclosing multi-line mathematical formulae. Mathematical fences are ordinarily sized to the content that they enclose. However, in creating a large fence, the glyph is not scaled proportionally; in particular the displayed stem weights must remain compatible with the accompanying smaller characters. Thus, simple scaling of font outlines cannot be used to create tall brackets. Instead, a common technique is to build up the symbol from pieces. In particular, the characters U+239B LEFT PARENTHESIS UPPER HOOK through U+23B3 SUMMATION BOTTOM represent a set of glyph pieces for building up large versions of the fences (, ), [, ], {, and }, and of the large operators ∑ and ∫. These brace and operator pieces are compatibility characters. They should not be used in stored mathematical text, but are often used in the data stream created by display and print drivers.
This is followed by a table showing how pieces are intended to be used together to create specific symbols.
Unicode mostly standardises existing character repertoires, and keeps their peculiarities so that conversions round-trip properly. A corresponding product symbol is not part of Unicode because the originating character repertoire did not have one. Ask on https://www.unicode.org/consortium/distlist-unicode.html about the provenance of the summation top/bottom.
I am looking for large symbols in unicode like these:
∏ ∐ ∑ ∫
⨀ ⨁ ⨂
⊕ ⊖ ⊗ ⊘ ⊙
⎲
⎳
⌠
⌡
The only one I found is by combining two unicode symbols ⎲and ⎳. Not sure why that exists, but not a large product symbol. That's all I am really looking for (∏ over multiple lines like the sigma). If any of the other ones exist over 2 lines that would be great to know as well. Perhaps there is some way to manually make the large ∏ symbol out of smaller primitives.
⎲and ⎳. Not sure why that exists
When a collection of existing glyphs is added to Unicode, it is desirable to make encoding between character sets round-trip safe. So glyphs that are duplicates or variants of each other are kept anyway.
As of Unicode 10, these are the greek letter pi (and its compat decompositions) available: ∏Ππϖᴨℼℿ There are no top and bottom halves like for integral and summation.
You should not attempt to build a glyph piecewise from other glyphs shifted into position. (You said "primitives", but Unicode does not work that way.) The result is not accessible and somewhat likely to break in rendering on systems other than yours.
The correct solution is to use the ∏ glyph and simply scale up its font size. Look into MathML if you are using only ad-hoc notation so far.
Using an option, I can specify whether even or odd days are valid. For the user interface I search now for suitable symbols to display these options. In mathematics there is no symbol for even and odd numbers, as far as I know.
Does anyone know whether there is perhaps something corresponding in Unicode?
The answer is no.
Given that odd and even numbers are a mathematical concept and mathematics has no symbol for odd and even numbers, maybe except for 2N and 2N+1, you'll find it hard to find a non-existent symbols in Unicode.
You'd have to think of your own characters, or find some in Unicode and just redefine their meaning.
ASCII has versions of the whole Roman alphabet. I was surprised recently to learn that Unicode contains other version/s of those same characters. One example is "U+1D5C4: MATHEMATICAL SANS-SERIF SMALL K", or "𝗄".
Can't LaTeX math mode, or MS Word equation editor, or whatever other program just use a sans-serif font if it wants the letters in a mathematical formula to be sans-serif?
These characters exist so that the semantic distinction between them can be encoded in plain text, or where the specific font shape can't be controlled.
The block you mention is only intended for use in mathematical and technical contexts, where the distinction between, say, 𝑑 as a variable vs. d as a differential operator vs. 𝖽 as an object (in category theory) is important. TR #25 gives another example where losing the distinction between ℋ and H can completely change the meaning of an equation. Being able to encode this formatting into the text itself is also important for ISO 31-11.
All of these characters maintain compatibility mappings with their "normal" Latin and Greek counterparts, so the distinction between them should not affect searching and sorting.
You are confusing the display mode with the encoding for texts.
The idea is that unicode has ALL the symbols used to write known to mankind grouped by usage. That's why you will find many code-points that look alike.
So a formula with a k is different is supposed to be different then a word written with a k. The sans-serif part is just a description of the kind of k best used to display. Tomorrow somebody might want to add a serif k and then how would you describe the difference?