Formatting tabular data using Unicode characters

I need to produce a calculation trace file containing tabular data showing intermediate results. I am currently using a combination of the standard ascii pipe symbols (|) and dashes (-) to draw the table lines:
E.g.
Numerator | Denominator | Result
----------|-------------|-------
6         | 2           | 3
10        | 5           | 2
Are there any unicode characters that could be used to produce a more professional looking table?
(The file must be a raw text format and cannot use HTML or any other markup)
Edit: I've added an example of what the table now looks like having taken the suggestion on board and used the unicode box drawing characters:
Numerator │ Denominator │ Result
──────────┼─────────────┼───────
6         │ 2           │ 3
10        │ 5           │ 2

There are Unicode box drawing characters (look for Box Drawing under Geometrical Symbols - the chart itself is a PDF). I don't have any idea how widely supported those characters are, though.
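As a rough illustration of how such a table could be generated rather than typed by hand, here is a minimal Python sketch. The function name `render_table` is made up for this example, and it assumes every character occupies one column (so it will misalign wide characters such as CJK text).

```python
def render_table(headers, rows):
    """Render rows as a plain-text table using Unicode box-drawing characters."""
    # Convert every cell to text so its width can be measured.
    table = [[str(cell) for cell in row] for row in [headers] + rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]

    def format_row(cells):
        # U+2502 BOX DRAWINGS LIGHT VERTICAL separates the columns.
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return " \u2502 ".join(padded).rstrip()

    # U+2500 (light horizontal) joined by U+253C (cross) forms the header rule.
    rule = "\u2500\u253c\u2500".join("\u2500" * width for width in widths)
    return "\n".join([format_row(table[0]), rule] + [format_row(r) for r in table[1:]])


print(render_table(["Numerator", "Denominator", "Result"], [[6, 2, 3], [10, 5, 2]]))
```

Write the result to the trace file with an explicit encoding such as UTF-8, since the box-drawing characters are outside ASCII.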

Your table is getting help from the monospaced font triggered by the code tags here. Proportional fonts can break the tabular alignment of the digits. Unicode has digits in the Mathematical Alphanumeric Symbols block (U+1D7CE to U+1D7FF), including a monospace set at U+1D7F6 to U+1D7FF that is meant to keep a fixed width, provided the font in use has glyphs for them: 𝟶𝟷𝟸𝟹𝟺𝟻𝟼𝟽𝟾𝟿
U+1D7F6 U+1D7F7 U+1D7F8 U+1D7F9 U+1D7FA U+1D7FB U+1D7FC U+1D7FD U+1D7FE U+1D7FF
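If you want to try these digits, the substitution can be sketched in a few lines of Python. The helper name `to_monospace_digits` is invented for this example; whether the result actually lines up still depends on the font that displays the file.

```python
# Map ASCII digits 0-9 onto U+1D7F6..U+1D7FF
# (MATHEMATICAL MONOSPACE DIGIT ZERO..NINE).
MONO_DIGITS = {ord("0") + i: 0x1D7F6 + i for i in range(10)}


def to_monospace_digits(text):
    """Replace ASCII digits with monospace digits; leave other characters alone."""
    return text.translate(MONO_DIGITS)


print(to_monospace_digits("10 / 5 = 2"))
```

Note that these code points lie outside the BMP, so any tool that reads the file must handle supplementary characters correctly.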

You should look at this Javascript Box Drawing Demo. It is a JavaScript tool that makes it easy to draw Unicode box art in HTML textareas using the arrow keys.
First, select a style other than "Off".
Then move around with the arrow keys and you will see the box being drawn as you go.
Once you are satisfied with the look of your drawing, simply copy it from the box and paste it into your HTML code.

Related

Is there a Unicode Character for a cylinder database?

I am looking for a single symbol/letter to represent a data source in text of an HTML page (or Markdown).
I would like to use the cylinder shape, as is usual for a database.
Maybe I am not searching for the right word, but I cannot find any Unicode character for a cylinder or something similar.
Is there a Unicode Character for a cylinder?
There’s U+26C1 WHITE DRAUGHTS KING ⛁ and U+26C3 BLACK DRAUGHTS KING ⛃, which are similar visually if not semantically.
I tend to (mis)use 🛢 (OIL DRUM, U+1F6E2), since I have more databases than oil drums.
There's a file cabinet, U+1F5C4: 🗄
And a card file box (U+1F5C3) and card index dividers (U+1F5C2): 🗃🗂
You can also use a square shape to represent a database (looks like a server farm):
▤ Square with Horizontal Fill (U+25A4)
⌸ Apl Functional Symbol Quad Equal (U+2338)
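To double-check the code points and official names of the candidates suggested above, Python's standard `unicodedata` module can be used. This is only a convenience sketch; the `describe` helper and the `CANDIDATES` list are made up for the example, and the names returned depend on the Unicode version bundled with your Python.

```python
import unicodedata

# Code points mentioned in the answers above.
CANDIDATES = [0x26C1, 0x26C3, 0x1F6E2, 0x1F5C4, 0x1F5C3, 0x1F5C2, 0x25A4, 0x2338]


def describe(code_point):
    """Return 'U+XXXX <char> <official name>' for one code point."""
    ch = chr(code_point)
    return f"U+{code_point:04X} {ch} {unicodedata.name(ch, '<unnamed>')}"


for cp in CANDIDATES:
    print(describe(cp))
```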

Excel::Writer::XLSX separating column with bold border

I've worked with the Excel::Writer::XLSX module to build a lot of spreadsheets, but I was wondering if there is a way to basically separate certain columns with a bold border. In my script I use a merge range of columns, so for example, C-E is merged, F-H, etc... So I would like to put a bold divider along the E column, the H column, etc... I think it would be some kind of add_format, however I'm already using an add_format to get the date centered, the headers rotated 90 degrees, etc. Here is a quick snapshot of what I'm talking about (this is what I'm trying to copy via perl).

What the character codes are in the cmap table in TrueType fonts

Wondering what the "character codes" are for the cmap table in TrueType fonts. Microsoft talks about the Character to Glyph Index Mapping Table, but I don't see what the character or glyph index mean.
Wondering if somewhere in the font file you specify the encoding, such as Unicode 11.0, and then the character codes are equal to the Unicode code points such as U+0061 for a. Or if the character codes are instead the "browser" character codes (decimal codes I guess), such as 97 for a.
Basically wondering how you map keyboard characters to font glyphs, and what that really means. I think you not so much want to map keyboard codes to the font glyphs, but unicode codes like U+0061 to the font glyphs, so if in JavaScript (for example) you can do \u03A9 and it will give you Ω if your font supports that.
Trying to understand the anatomy of a font file in terms of how it maps the mathematical glyphs as vectors/paths, to characters or codes of some sort.
The short, but perhaps not desired, answer is of course "read the OpenType spec. It takes a while", so a slightly longer, but easier and less detailed answer would be http://pomax.github.io/CFF-glyphlet-fonts, although that skips over TTF so let's look at that here:
Your input code gets run through whatever CMAP subtable is applicable in the context where the font is being used, which maps the computer's code (ASCII code, Unicode code point, ISO-2022-JP, what have you) to a glyph id. For TTF specifically, that id is then used as an array index into the "loca" table, which is the "glyph index to data location" table and specifies the byte offset in the "glyf" table for each glyph that the font contains. You then consult the glyf table at that byte offset, and start parsing the bytes as specified by https://learn.microsoft.com/en-us/typography/opentype/spec/glyf
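The chain of lookups can be sketched with toy data in Python. Everything here is a made-up stand-in: `CMAP`, `LOCA` and `GLYF` are simplified in-memory versions of the real binary tables, and the glyph ids and offsets are invented for illustration.

```python
# Toy stand-ins for three TrueType tables; a real font stores these in binary form.
CMAP = {0x0061: 1, 0x03A9: 2}   # character code (here: Unicode code point) -> glyph id
LOCA = [0, 4, 10, 18]           # glyph id -> byte offset into glyf, plus one end entry
GLYF = bytes(range(18))         # placeholder bytes standing in for outline data


def glyph_bytes(char):
    """Follow cmap -> loca -> glyf for one character.

    Characters missing from the cmap fall back to glyph 0 (the "missing glyph").
    """
    glyph_id = CMAP.get(ord(char), 0)
    # A glyph's data runs from its own offset up to the next glyph's offset.
    start, end = LOCA[glyph_id], LOCA[glyph_id + 1]
    return glyph_id, GLYF[start:end]


print(glyph_bytes("a"))        # glyph 1 and its 6 placeholder bytes
print(glyph_bytes("\u03a9"))   # glyph 2 and its 8 placeholder bytes
```

So with a Unicode cmap, `\u03A9` in JavaScript and U+03A9 in the font are the same number; the font only needs an entry for it.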

How can I detect any unicode characters which have descenders, using .NET

I am trying to minimize the vertical distance between controls on a programmatically constructed Windows Form (using C#). This involves setting the Height property appropriately.
I have found that if the text of the control does not contain any letters with descenders in them (i.e. does not have any of the characters j, g, p, q or y) then the control Height can be smaller than when it does contain such letters (if it does contain letters with descenders then the descenders are chopped off if the Height isn't enough).
It will work fine to test for any of the above 5 characters as long as the language is English, or English - like, but I need to be able to cater for (just about) any language.
Is there a way, given some arbitrary Unicode character (and perhaps a font) to determine if that Unicode character has a descender or not?
There is no property defined for Unicode characters to indicate the presence of a descender, and it's really a feature of glyph design rather than of characters. For example, "Q" has a descender in many fonts, and "J" has one in some. Besides, given the context, you should also consider diacritic marks placed below a letter, not just descenders of base letters. And probably diacritics above letters, too.
So you would need to read the font information (when available) about character dimensions, or tentatively draw characters in your software and measure their dimensions.
As a rule of thumb, any line height below 1.1 times the font size will cause problems with some characters and fonts. Using 1 (“setting solid”) is not enough, because characters may in fact extend outside the font size.
In Windows, you can call the GDI function GetPath() to get an array containing the X/Y coordinates of the points that define the outline of the string of glyphs. Search the array for the minimum and maximum values to get the rectangle that exactly encloses the string, right up to the edges of the letters.

How is transformation of code point to final character implemented in Unicode?

Characters in the BMP are specified by 4 hex digits,
and characters outside the BMP need 5 or 6 digits.
But my question is:
how is the final character drawn from the value of the code point?
Are pictures of each character stored on each computer, so that displaying a character just shows the matching picture?
Or is the final glyph computed from the code point itself?
Each Unicode character has a code. The software displaying the character obtains a glyph for that character code - usually from a font installed onto the hosting computer. It then uses the obtained glyph to display the character.
If it can't find a glyph for that character (many fonts for Latin characters completely omit the glyphs used for East Asian languages), it simply can't display it. It will then either indicate an error or use a substitute glyph meaning that the actual glyph can't be displayed (it can be a question mark or a square or whatever).
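A minimal Python sketch of that lookup follows. The font names and glyph tables are invented, and trying a second font before giving up is an assumption about how many (not all) systems behave; glyph 0 plays the role of the substitute "missing" glyph.

```python
# Hypothetical installed fonts: name -> {code point: glyph id}.
FONTS = [
    ("LatinFont", {0x0041: 1, 0x0061: 2}),
    ("CJKFont", {0x4E2D: 1}),
]
MISSING_GLYPH = 0  # glyph shown when no font covers the character


def pick_glyph(char):
    """Return (font name, glyph id) for a character, or the missing glyph."""
    code_point = ord(char)
    for name, glyphs in FONTS:
        if code_point in glyphs:
            return name, glyphs[code_point]
    # No font has the character: fall back to the first font's missing glyph.
    return FONTS[0][0], MISSING_GLYPH


def code_point_label(char):
    """Format a code point with at least 4 hex digits, as in U+0061 or U+1F6E2."""
    return f"U+{ord(char):04X}"


for ch in ["a", "\u4e2d", "\U0001F6E2"]:
    print(code_point_label(ch), pick_glyph(ch))
```

The code point itself is only a lookup key; the shape comes entirely from whichever font supplies the glyph.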