Gnuplot pngcairo and postscript terminals not rendering some special characters?


I'm having trouble with the rendering of some characters in the pngcairo and postscript eps (both enhanced) terminals. The characters in question are the simple pipe, |, and the less/greater than characters, <>. These render in a completely broken way, with different characters altogether. To be specific, the following line:
set ylabel "<|S_{dy}(t)-S_{mc}(t)|/{/Symbol s}_{mc}(t)>"
produces the following result:
So, yes, basically it replaces the character with other random ones. Am I doing something wrong? Can this be fixed? This is gnuplot 5.2.2 I'm working with.

So I ended up solving this by resorting to using different fonts. In particular, for the <> I actually went with two slightly different glyphs that better suited my needs from Symbol, and for the | I used Times New Roman. The final line looked like:
set ylabel "{/Symbol \341}{/TimesNewRoman \174}S_{dy}(t)-S_{mc}(t){/TimesNewRoman \174}/{/Symbol s}_{mc}(t){/Symbol \361}"
and rendered correctly.
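For anyone puzzled by the backslash numbers in that label: they are octal character codes. Below is a minimal Python sketch (not gnuplot code) that decodes them. The helper name `octal_escape_to_int` is made up for illustration. My understanding is that codes 225 and 241 select the left and right angle brackets in the standard Adobe Symbol encoding, but treat that as an assumption and check it against your font.

```python
def octal_escape_to_int(escape: str) -> int:
    """Turn an escape such as r'\174' into its integer character code."""
    return int(escape.lstrip("\\"), 8)


# Show each escape from the ylabel line as decimal and hex.
for esc in (r"\341", r"\174", r"\361"):
    code = octal_escape_to_int(esc)
    print(esc, "-> decimal", code, "hex", format(code, "02X"))

# \174 is 124 decimal, which is the ASCII vertical bar.
pipe = chr(octal_escape_to_int(r"\174"))
print(repr(pipe))
```

The same conversion works in reverse (`format(ord("|"), "o")`) if you need the octal code for some other character.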

Related

Are characters made up of pixels?

On a computer screen, are the characters made up of pixels? If so, it means that characters are images!
And if the characters are made up of pixels, then why are there ASCII, UNICODE and other standards that associate binary digits to different character formats, but there are no standards that associate image formats with binary digits? Because if both are made up of pixels (characters and images), what is the difference between them?

No, #1: Characters are not "on a computer screen". What goes on the screen is the result of all kinds of rendering and painting and combining onto a 2-D grid of pixels.
No, #2: Unicode characters are independent of the specific fonts used to present them graphically. So, with one font, a character will end up producing certain pixels, and with another font - other pixels altogether.
No, #3: Character strings are held in your computer's memory as sequences of bytes, i.e. numeric values (with each character typically occupying one byte, or two, or a variable number of bytes).
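To make point #3 concrete, here is a small Python sketch that shows a three-character string as the bytes it occupies in UTF-8. The sample string is my own choice, and UTF-8 is only one of several possible in-memory or on-disk encodings.

```python
# "A", "é" and "€", written with escapes so the source file encoding does not matter.
text = "A\u00e9\u20ac"

# The string as a sequence of byte values in UTF-8.
utf8_bytes = list(text.encode("utf-8"))

# Number of bytes each character needs in UTF-8 (it varies per character).
utf8_lengths = [len(ch.encode("utf-8")) for ch in text]

print(utf8_bytes)
print(utf8_lengths)

# Decoding the bytes with the same encoding gives the original string back.
round_trip = bytes(utf8_bytes).decode("utf-8")
```

Nothing in those bytes says how the characters look; that is decided later by whichever font renders them.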

On a computer screen, are the characters made up of pixels? if so, it means that characters are images!
On a typical modern screen, yes, the graphical representation of a character is a group of pixels. But computers don't always have a screen.
For example, in the past people interacted with computers through various kinds of terminals, such as mechanical terminals that print text directly onto paper. Sometimes a vector screen or a 14-segment/16-segment display is used, where the text representation has no pixels at all. Many computers don't even have a screen or any way to display characters, and interact with humans via switches, LEDs, punched cards, a network or a serial port.
So the premise of the question is already wrong. Characters have nothing to do with pixels. Even when characters are displayed on a screen, the pixels representing a character vary with the font face and font size.
Character traditionally means a symbol or a glyph representing something. In computing, a character is a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language. Neither definition says anything about pixels.
Each language has a known set of symbols, so logically they're grouped together and each is assigned a number. The whole set of those numbers and their mappings is called a character set. You can see that it makes sense to associate numbers with characters, but doing the same for images makes no sense. What is the common thing in images that we could map?
In the past there was no need to cooperate with people using other languages, so each group of people chose a small set that worked for their own language. However, with the advent of portable devices and the internet, that doesn't work anymore. It would be extremely awkward to receive a message that you can't read, or to send an email that the customer sees as a bunch of garbage. That's why a bigger character set called Unicode was invented.
However, a character set is just a way to map numbers to characters in computers. To deal with characters we also need a way to encode those numbers as bytes, which is called a character encoding. For example, in a variable-length encoding a larger number may be encoded using more bytes. Unicode has multiple encodings, such as UTF-1, UTF-7, UTF-8, UTF-16 and UTF-32.
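As a rough illustration of the difference between a character set and an encoding, the Python sketch below takes one code point and encodes it several ways, then decodes it with the wrong encoding to reproduce the "bunch of garbage" effect. The choice of character and of the four encodings is mine.

```python
ch = "\u00e9"  # é, code point U+00E9

# One code point, several encodings: the number stays the same, the bytes differ.
encoded = {
    name: ch.encode(name)
    for name in ("utf-8", "utf-16-le", "utf-32-le", "latin-1")
}
for name, data in encoded.items():
    print(name, data.hex(), len(data), "bytes")

# Decoding UTF-8 bytes as Latin-1 turns one character into two wrong ones.
garbled = ch.encode("utf-8").decode("latin-1")
print(garbled)
```

Sender and receiver have to agree on the encoding, not just on the character set, for the text to survive the trip.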

Large product ∏ symbol in unicode

I am looking for large symbols in unicode like these:
∏ ∐ ∑ ∫
⨀ ⨁ ⨂
⊕ ⊖ ⊗ ⊘ ⊙
⎲
⎳
⌠
⌡
The only one I found is made by combining two Unicode symbols, ⎲ and ⎳. I'm not sure why that pair exists while there is no large product symbol, which is all I am really looking for (∏ spanning multiple lines like the sigma). If any of the other ones exist over 2 lines, that would be great to know as well. Perhaps there is some way to manually make the large ∏ symbol out of smaller primitives.

⎲ and ⎳. Not sure why that exists
When a collection of existing glyphs is added to Unicode, it is desirable to make encoding between character sets round-trip safe. So glyphs that are duplicates or variants of each other are kept anyway.
As of Unicode 10, these are the available forms of the Greek letter pi (and its compatibility decompositions): ∏ Π π ϖ ᴨ ℼ ℿ. There are no top and bottom halves like those for the integral and summation signs.
You should not attempt to build a glyph piecewise from other glyphs shifted into position. (You said "primitives", but Unicode does not work that way.) The result is not accessible and somewhat likely to break in rendering on systems other than yours.
The correct solution is to use the ∏ glyph and simply scale up its font size. Look into MathML if you are using only ad-hoc notation so far.
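If you want to check which of these pieces exist, Python's `unicodedata` module can look up the official character names. This is only a sketch: the helper `names_starting_with` is made up here, and the results depend on the Unicode version bundled with your Python.

```python
import sys
import unicodedata


def names_starting_with(prefix: str) -> list:
    """Collect the names of all assigned characters that start with `prefix`."""
    found = []
    for cp in range(sys.maxunicode + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unnamed code points
        if name.startswith(prefix):
            found.append(name)
    return found


# Names of the product sign and of the summation and integral halves.
for ch in "\u220f\u23b2\u23b3\u2320\u2321":
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# The summation sign has separate top and bottom pieces ...
summation_parts = names_starting_with("SUMMATION ")
print(summation_parts)

# ... so print whatever comparable pieces exist for the product sign.
print(names_starting_with("PRODUCT "))
```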

Subscripted 'y' in unicode

I have to display $CₓH\subscript{y}$.
Is there any chance to display a subscripted 'y' in Unicode?
\u2093 represents the subscripted 'x'

Usually you do this with formatting. Unicode's selection of superscript and subscript characters doesn't stem from the need or desire to cover whole alphabets but rather to enable specific use cases, e.g. writing IPA. Furthermore, if you're using a good OpenType font it can also support proper subscripts for arbitrary characters at the font level (where a glyph isn't simply scaled down by the layout engine, but rather a specifically-designed subscript glyph from the font is used).
In fact, since you're already using TeX or something vaguely similar to it, just let one of the many implementations render it. There are lots of things you simply cannot do in plain text without formatting, and this is one of them.

The subscript and superscript characters in Unicode do not cover the whole alphabet.
See the Wiki article on this topic or this answer on SO.

In Sublime Text this subscripted y works: ᵧ. Copied from here: https://lingojam.com/SubscriptGenerator
EDIT: This is actually the Greek letter gamma.
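To see which Latin letters do have a dedicated subscript character, you can scan the Unicode names with Python. This is a sketch only: `latin_subscript_letters` is a made-up helper, and the result depends on the Unicode version that ships with your Python. On the versions I know of, x is present and y is not.

```python
import sys
import unicodedata

PREFIX = "LATIN SUBSCRIPT SMALL LETTER "


def latin_subscript_letters() -> dict:
    """Map letter name (e.g. 'X') to its subscript character, found by name."""
    result = {}
    for cp in range(sys.maxunicode + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unnamed code points
        if name.startswith(PREFIX):
            result[name[len(PREFIX):]] = chr(cp)
    return result


subs = latin_subscript_letters()
print(sorted(subs))
print("has x:", "X" in subs, "| has y:", "Y" in subs)

# The look-alike character from the answer above is a Greek letter.
print(unicodedata.name("\u1d67"))
```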

Why does Unicode have multiple characters representing the same letter?

ASCII has versions of the whole Roman alphabet. I was surprised recently to learn that Unicode contains other version/s of those same characters. One example is "U+1D5C4: MATHEMATICAL SANS-SERIF SMALL K", or "𝗄".
Can't LaTeX math mode, or MS Word equation editor, or whatever other program just use a sans-serif font if it wants the letters in a mathematical formula to be sans-serif?

These characters exist so that the semantic distinction between them can be encoded in plain text, or where the specific font shape can't be controlled.
The block you mention is only intended for use in mathematical and technical contexts, where the distinction between, say, 𝑑 as a variable vs. d as a differential operator vs. 𝖽 as an object (in category theory) is important. TR #25 gives another example where losing the distinction between ℋ and H can completely change the meaning of an equation. Being able to encode this formatting into the text itself is also important for ISO 31-11.
All of these characters maintain compatibility mappings with their "normal" Latin and Greek counterparts, so the distinction between them should not affect searching and sorting.
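The compatibility mappings can be inspected with Python's `unicodedata` module. The sketch below uses the sans-serif k from the question and the script H mentioned above; folding with NFKC is one common way to make search ignore the distinction, not the only one.

```python
import unicodedata

sans_k = "\U0001d5c4"   # MATHEMATICAL SANS-SERIF SMALL K
script_h = "\u210b"     # SCRIPT CAPITAL H

# The <font> tag marks a compatibility mapping to the plain letter.
print(unicodedata.name(sans_k), "->", unicodedata.decomposition(sans_k))
print(unicodedata.name(script_h), "->", unicodedata.decomposition(script_h))

# NFKC normalization applies those mappings, giving ordinary letters.
folded_k = unicodedata.normalize("NFKC", sans_k)
folded_h = unicodedata.normalize("NFKC", script_h)
print(folded_k, folded_h)
```

Note that this folding throws away exactly the semantic distinction the characters were added to preserve, so it suits search indexes rather than stored text.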

You are confusing the display mode with the encoding for texts.
The idea is that Unicode contains ALL the symbols used for writing known to mankind, grouped by usage. That's why you will find many code points that look alike.
So a k in a formula is supposed to be different from a k in a written word. The sans-serif part is just a description of the kind of k best used to display it. Tomorrow somebody might want to add a serif k, and then how would you describe the difference?

How to set Matlab not to use or use always regional number formatting?

My Matlab sometimes uses regional number formatting, and sometimes does not. May I ask it to do this always or may be not to do this at all?
As you see, it uses comma as decimal separator in one line of property editor and uses period as separator in neighboring lines.

As mentioned in the comments, this is not a feature but a bug.
I have tried to reproduce it in version 2012b, but was unable to input non-integer numbers for x, y and z. If this was not possible before it would explain why it has gone unnoticed until now.
So, please post a bug report, and it will probably be fixed in one of the next versions to be published.
