Are there Unicode code points that enable geometric transformations such as rotation and mirroring? - unicode

Creating symbols with Unicode's already large set of combining characters and other modifiers goes a long way.
However, some arrows are only available in a single direction, and some diacritics are available only above the base letter but not, for example, below it on the left side.
So are there modifiers or combining characters that can request such a composition?
For example, the combining rectangle makes it possible to write something like a̻. At least on my current terminal, it is rendered with a rectangle positioned above and to the right of the a glyph it combines with, with its longest side oriented horizontally. Now, what if:
the goal is to place the rectangle at the top left, top middle, etc.?
the goal is to rotate the rectangle before it's combined with the main glyph?
the goal is to mirror the rectangle before it's combined with the main glyph?
Obviously the last point doesn't make much difference for a rectangle, but for asymmetric glyphs it would.

No, there is no such mechanism in Unicode. Different positional variants of the same diacritic are encoded as separate characters. For example, U+0307 COMBINING DOT ABOVE, U+0358 COMBINING DOT ABOVE RIGHT, and U+1DF8 COMBINING DOT ABOVE LEFT are all different codepoints. There is currently no way to represent, say, a generic combining dot below right in Unicode.
Similarly, arbitrary Unicode characters cannot be mirrored or rotated. Where such transformations make a meaningful distinction (for example the pair “E” and “Ǝ”), they have once again been encoded atomically.
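To see that these positional and mirrored variants really are independent code points, one can look up their names. This is only a small sketch using Python's standard unicodedata module; the helper name describe is made up for illustration, and the output depends on the Unicode version bundled with your Python.

```python
import unicodedata

def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"

# The three dot diacritics mentioned above: one code point per position.
for ch in ("\u0307", "\u0358", "\u1DF8"):
    print(describe(ch))

# The mirrored capital E is its own letter, not "E plus a mirroring modifier".
for ch in ("E", "\u018E"):
    print(describe(ch))
```

Nothing in the character data links the variants to each other by a transformation; the relationship exists only in the names.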
There are some very specific circumstances where such modifiers can be applied. In Sutton SignWriting, rotation is a productive feature. Rotating glyphs is necessary to display text correctly, so a number of rotation modifiers have been defined. For example, U+1D800 SIGNWRITING HAND-FIST INDEX points upwards in its base orientation (𝠀), but by appending U+1DAA1 SIGNWRITING ROTATION MODIFIER-2 you can make it point north-west instead (𝠀𝪡).
For emoji only, Unicode also specifies a mechanism for defining whether a given glyph is supposed to face left or right. For example, “🚗‍⬅️” would be an automobile going to the left and “🚗‍➡️” would be an automobile going to the right. No commercially available fonts presently support this mechanism, however.
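Both special cases are ordinary character sequences, so they can be built by plain string concatenation. The sketch below just assembles the sequences quoted above and lists their code points; the helper names code_points and facing are invented for this example, and whether anything renders as intended depends entirely on the font.

```python
import unicodedata

def code_points(text):
    """List the code points of a string as 'U+XXXX' labels (hypothetical helper)."""
    return [f"U+{ord(c):04X}" for c in text]

# Sutton SignWriting: base sign followed by a rotation modifier.
HAND_FIST_INDEX = "\U0001D800"
ROTATION_MODIFIER_2 = "\U0001DAA1"
rotated_sign = HAND_FIST_INDEX + ROTATION_MODIFIER_2
print([unicodedata.name(c) for c in rotated_sign])

# Emoji direction: base emoji + ZERO WIDTH JOINER + arrow with emoji presentation.
ZWJ = "\u200D"
LEFT_ARROW = "\u2B05\uFE0F"
RIGHT_ARROW = "\u27A1\uFE0F"

def facing(base, arrow):
    """Join a base emoji and a direction arrow with a ZWJ (hypothetical helper)."""
    return base + ZWJ + arrow

car = "\U0001F697"
print(code_points(facing(car, LEFT_ARROW)))
print(code_points(facing(car, RIGHT_ARROW)))
```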

Related

Unicode value for right arrow with two strokes

I want to use the → character with two // strokes through the arrow but cannot find the unicode value for it anywhere. Does this exist in unicode? If not, is there a way to recreate it?
There are six Unicode characters whose name matches a right arrow with a double stroke, making use of the regular expression: /right.*arrow.*double.*stroke/.
Only two characters appear to be relevant candidates:
⇻ U+21FB RIGHTWARDS ARROW WITH DOUBLE VERTICAL STROKE
⭼ U+2B7C RIGHTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE HORIZONTAL STROKE
(* RIGHTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE VERTICAL STROKE)
Notes:
The official Unicode name of U+2B7C was initially wrong, but a corrected name was later added as an alias.
U+2B7C appears to be quite uncommon; no suitable font was available in the OS used for the screenshot. Still, it is possible to see what it should look like in the Miscellaneous Symbols and Arrows - Range: 2B00–2BFF PDF document.
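The name search described above can be reproduced roughly as follows. This is a sketch using Python's unicodedata module; the function name find_by_name is made up, and the number of matches depends on the Unicode version shipped with your Python, so it may not be exactly six.

```python
import re
import sys
import unicodedata

def find_by_name(pattern):
    """Return (code point, name) pairs whose character name matches the pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for cp in range(sys.maxunicode + 1):
        # Unassigned code points and surrogates have no name; skip them.
        name = unicodedata.name(chr(cp), "")
        if name and regex.search(name):
            hits.append((cp, name))
    return hits

matches = find_by_name(r"right.*arrow.*double.*stroke")
for cp, name in matches:
    print(f"U+{cp:04X} {name}")
```

Note that unicodedata.name returns only the primary name, so the corrected alias of U+2B7C is not searched by this sketch.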
I was not successful in finding what you were looking for (negative result). U+219B is "Rightwards Arrow with Stroke" and U+21FB is "Rightwards Arrow with Double Vertical Stroke". If it existed, it would probably be called "Rightwards Arrow with Double Stroke". https://en.wikipedia.org/wiki/Arrow_(symbol)
The following Unicode sequences should describe your character, but unfortunately fonts are not helping.
→⃫ : \u2192\u20EB
⟶⃫ : \u27F6\u20EB
They are the normal and the long arrow, each followed by the combining character U+20EB COMBINING LONG DOUBLE SOLIDUS OVERLAY (long double slash overlay). You may find a technical font that displays both in the expected way.
You may also get something acceptable with:
⎯⎯⎯⃫⟶ \u23AF\u23AF\u23AF\u20EB\u27F6 (using arrow extension line)
⎯⎯⃫⟶ \u23AF\u23AF\u20EB\u27F6
Depending on the environment, one of the two seems much better (on my computers).
So: you can express it (semantically) with Unicode, but standard fonts are not helping us. You should experiment with many symbol and mathematical fonts to get an acceptable result.
As an alternative, you can easily build such an image with SVG (and use the SVG as a character image).
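For experimenting with fonts, the candidate sequences listed above can be generated programmatically. The following is a minimal sketch; the dictionary keys are labels invented for this example, and how each string renders is entirely up to the terminal or font in use.

```python
import unicodedata

DOUBLE_SLASH = "\u20EB"    # the combining overlay named in the answer
ARROW = "\u2192"
LONG_ARROW = "\u27F6"
LINE_EXTENSION = "\u23AF"

# The four sequences from the answer; the overlay always follows its base character.
candidates = {
    "arrow": ARROW + DOUBLE_SLASH,
    "long arrow": LONG_ARROW + DOUBLE_SLASH,
    "3 extensions": LINE_EXTENSION * 3 + DOUBLE_SLASH + LONG_ARROW,
    "2 extensions": LINE_EXTENSION * 2 + DOUBLE_SLASH + LONG_ARROW,
}

for label, text in candidates.items():
    names = ", ".join(unicodedata.name(c) for c in text)
    print(f"{label}: {text}  ({names})")
```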

Drawing text using PdfTextArray in iTextSharp - how?

I am drawing text in a PDF page using iTextSharp, and I have two requirements:
1) the text needs to be searchable by Adobe Reader and such
2) I need character-level control over where the text is drawn.
I can draw the text word-by-word using PdfContentByte.ShowText(), but I don't have control over where each character is drawn.
I can draw the text character-by-character using PdfContentByte.ShowText() but then it isn't searchable.
I'm now trying to create a PdfTextArray, which would seem to satisfy both of my requirements, but I'm having trouble calculating the correct offsets.
So my first question is: do you agree that PdfTextArray is what I need to use, in order to satisfy both of my original requirements?
If so, I have the PdfTextArray working correctly (in that it's outputting text) but I can't figure out how to accurately calculate the positioning offset that needs to get put between each pair of characters (right now I'm just using the fixed value -200 just to prove that the function works).
I believe the positioning offset is the distance from the right edge of the previous character to the left edge of the new character, expressed in "thousandths of a unit of text space". That leaves me two problems:
1) How wide is the previous character (in points), as drawn in the specified font & height? (I know where its left edge is, since I drew it there)
2) How do I convert from points to "units of text space"?
I'm not doing any fancy scaling or rotating, so my transformation matrices should all be identity matrices, which should simplify the calculations ...
Thanks,
Chris
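As a sketch of the arithmetic only (not a confirmed iTextSharp recipe): in the PDF text model, a number inside a text array is expressed in thousandths of a unit of text space, is scaled by the font size, and is subtracted from the current position. Assuming identity matrices, no character or word spacing, 100% horizontal scaling, and a glyph width taken from the font metrics in 1/1000 units, the offset could be computed as below. The function names and sample numbers are hypothetical.

```python
def glyph_advance_pts(width_1000, font_size):
    """Advance of a glyph in points, given its font-metric width in 1/1000 units."""
    return width_1000 / 1000.0 * font_size

def text_array_offset(prev_x, prev_width_1000, next_x, font_size):
    """Number to place in the text array so the next glyph starts at next_x.

    Assumes identity text/CTM matrices and no extra character or word spacing.
    A negative result moves the next glyph further right (adds space).
    """
    natural_x = prev_x + glyph_advance_pts(prev_width_1000, font_size)
    shift_pts = next_x - natural_x
    # Convert points to thousandths of text space and flip the sign,
    # because the viewer subtracts the number from the current position.
    return -shift_pts * 1000.0 / font_size

# Hypothetical numbers: a 500-unit-wide glyph at x=100 in a 10 pt font,
# with the next glyph wanted at x=107 (2 pt further than its natural position).
print(text_array_offset(100.0, 500, 107.0, 10.0))
```

Under these assumptions, 1 point corresponds to 1000 / font size thousandths of text space, which answers the second question; the first question comes down to reading the glyph width from the font's metrics.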

Is there a downwards double arrow with stroke unicode character?

I want the character ⇓ with stroke, just like ⇏ but downwards, but I can't find it. Does it exist?
Edit:
If you don't see the arrows (e.g. you use IE),
I want the character [downwards double arrow] with stroke, just like [rightwards double arrow with stroke] but downwards, but I can't find it. Does it exist?
There is no such character as a precomposed character (i.e., as a single encoded character, a code point assigned to a character), but you can in principle represent it using an arrow character followed by a combining overlay character.
The character “⇏” U+21CF RIGHTWARDS DOUBLE ARROW WITH STROKE has been defined as having the canonical decomposition RIGHTWARDS DOUBLE ARROW (U+21D2) COMBINING LONG SOLIDUS OVERLAY (U+0338). In principle, a character should be expected to be rendered the same way as its canonical decomposition. In practice, things don’t always go that way.
Along the same lines, a downwards double arrow with stroke could be written as the two-character sequence DOWNWARDS DOUBLE ARROW (U+21D3) COMBINING LONG SOLIDUS OVERLAY (U+0338) or, in HTML, as &#x21D3;&#x0338;. In practice, few fonts contain these characters, and browsers may fail to implement the combination properly. Moreover, in many fonts, the result is awkward. In Arial Unicode MS and in DejaVu Serif, the result might be acceptable, but only the latter is free (can be legally used as a downloadable font via @font-face). Here’s the combination as rendered by your browser with the SO stylesheets in effect: ⇓̸.
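The decomposition argument can be checked with Python's unicodedata module. This is a small sketch; it relies on the Unicode data bundled with your Python, and the variable names are invented for the example.

```python
import unicodedata

RIGHT_WITH_STROKE = "\u21CF"     # precomposed rightwards double arrow with stroke
DOWN_DOUBLE_ARROW = "\u21D3"
LONG_SOLIDUS_OVERLAY = "\u0338"

# The precomposed rightwards arrow decomposes canonically into arrow + overlay.
print(unicodedata.decomposition(RIGHT_WITH_STROKE))
decomposed = unicodedata.normalize("NFD", RIGHT_WITH_STROKE)
print([f"U+{ord(c):04X}" for c in decomposed])

# The downwards analogue has no precomposed form, so composing (NFC)
# leaves the two-character sequence as it is.
down_with_stroke = DOWN_DOUBLE_ARROW + LONG_SOLIDUS_OVERLAY
composed = unicodedata.normalize("NFC", down_with_stroke)
print(len(composed), [f"U+{ord(c):04X}" for c in composed])
```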
It doesn't seem to exist, according to this page (compared to this).

How can I detect any unicode characters which have descenders, using .NET

I am trying to minimize the vertical distance between controls on a programmatically constructed Windows Form (using C#). This involves setting the Height property appropriately.
I have found that if the text of the control does not contain any letters with descenders in them (i.e. does not have any of the characters j, g, p, q or y) then the control Height can be smaller than when it does contain such letters (if it does contain letters with descenders then the descenders are chopped off if the Height isn't enough).
Testing for any of the above 5 characters works fine as long as the language is English, or English-like, but I need to be able to cater for (just about) any language.
Is there a way, given some arbitrary Unicode character (and perhaps a font) to determine if that Unicode character has a descender or not?
There is no property defined for Unicode characters to indicate the presence of a descender, and it’s really a feature of glyph design rather than characters. For example, “Q” has a descender in many fonts, and “J” has one in some. Besides, given the context, you should also consider diacritic marks placed below a letter, not just descenders of base letters. And probably diacritics above letters, too.
So you would need to read the font information (when available) about character dimensions, or tentatively draw characters in your software and measure their dimensions.
As a rule of thumb, any line height below 1.1 times the font size will cause problems with some characters and fonts. Using 1 (“setting solid”) is not enough, because characters may in fact extend outside the font size.
In Windows, you can call GetPath() to get an array containing the X/Y coordinates of every point making up the perimeter or outline of the string of glyphs. Search the array for the minimum and maximum values, which gives you the rectangle that exactly encloses the string, right to the edge of the letters.
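The min/max search over outline points can be sketched independently of any graphics API. The example below is written in Python for brevity rather than C#; it assumes screen coordinates (y grows downwards), a known baseline position, and an arbitrary tolerance for round letters that overshoot the baseline slightly. The sample points are made up, not real glyph outlines.

```python
def bounding_box(points):
    """Return (left, top, right, bottom) of a list of (x, y) outline points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

def has_descender(points, baseline_y, tolerance=0.5):
    """True if the outline extends below the baseline by more than the tolerance.

    Assumes screen coordinates, where larger y values are lower on the screen.
    """
    _, _, _, bottom = bounding_box(points)
    return bottom > baseline_y + tolerance

# Hypothetical outlines with the baseline at y = 20.
outline_x = [(2.0, 10.0), (8.0, 10.0), (8.0, 20.0), (2.0, 20.0)]  # sits on the baseline
outline_p = [(2.0, 10.0), (8.0, 10.0), (8.0, 20.0), (2.0, 25.0)]  # reaches below it

print(bounding_box(outline_p))
print(has_descender(outline_x, baseline_y=20.0))
print(has_descender(outline_p, baseline_y=20.0))
```

Measuring the actual outline also catches diacritics placed below a letter, which a per-character descender list would miss.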

iOS japanese handwriting input code help please

I have a series of questions about writing code for iOS and including handwriting recognition of Japanese. I am a beginner, so be gentle and assume I am stupid ...
I'd like to present a Japanese word in hiragana (Japanese phonetic alphabet), then have the user handwrite the appropriate kanji (Chinese character). This is then internally compared to the correct character, and the user gets feedback (whether they were correct or not).
My questions here revolve around the handwritten input.
I know that this type of input is normally possible if one uses the Chinese keyboard.
How can I implement something similar, without using the keyboard itself? Are there already library functions for this (I feel there must be, since that input is available on the Chinese keyboard)?
Also, kanji aren't exactly the same as Chinese characters. There are unique characters that Japanese people invented themselves. How would I be able to include these in my handwriting recognition?
We worked on a similar exercise back at University.
The order of the strokes is well defined for kanji, and there are only 8 (?) different strokes, so basically each kanji is a well-ordered sequence of strokes. For example, te (hand) is the sequence "The short falling backward stroke" and then twice the "left to right stroke" and finally "The long downward stroke with the little tip at the bottom". There are databases that give you this information.
Now the problem is almost reduced to identifying the correct stroke. You will still run into some ambiguities where you have to take into account the spatial relation of some strokes to others.
EDIT: For stroke recognition we snapped the freehand writing to 45° angles, thus converting it into a sequence of vectors along one of these directions. Let's assume that direction zero is from bottom to top, direction 1 from bottom right to top left, 2 from right to left, and so on counterclockwise.
Then the first stroke of te (手) would be [23]+ (as some write it falling and some horizontal)
The second and third stroke would be 6+
and the last would be 4+[123] (as with the little tip, every writer uses a different direction)
This coarse snapping was actually enough for us to recognize kanji. Maybe there are more sophisticated ways, but this simple solution managed to recognize about 90% of kanji. The only handwriting it couldn't grasp was that of one professor, but the problem was that no human except himself could read his handwriting either.
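The snapping and matching described above can be sketched roughly as follows. This is an illustration in Python rather than the original implementation: it assumes each stroke arrives as a list of (x, y) screen points (y grows downwards), uses the direction numbering from the answer, and matches each stroke against the regular expressions given for 手. The function names and sample coordinates are hypothetical.

```python
import math
import re

def snap_direction(dx, dy_screen):
    """Snap a vector to one of 8 directions: 0 = up, then counterclockwise."""
    angle = math.degrees(math.atan2(-dy_screen, dx))  # flip y: screen to math axes
    return round((angle - 90.0) / 45.0) % 8

def encode_stroke(points):
    """Turn a stroke (list of points) into a string of direction digits."""
    digits = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) != (x1, y1):  # ignore zero-length segments
            digits.append(str(snap_direction(x1 - x0, y1 - y0)))
    return "".join(digits)

# Stroke patterns for te (手) as given in the answer, one regex per stroke.
TE_PATTERNS = [r"[23]+", r"6+", r"6+", r"4+[123]"]

def matches_kanji(strokes, patterns):
    """True if every stroke, in order, fully matches its direction pattern."""
    if len(strokes) != len(patterns):
        return False
    return all(re.fullmatch(p, encode_stroke(s)) for p, s in zip(patterns, strokes))

# Hypothetical input: four strokes drawn in the standard order.
te_strokes = [
    [(60, 10), (40, 20), (20, 30)],           # short stroke falling to the left
    [(10, 40), (40, 40), (70, 40)],           # left to right
    [(5, 60), (75, 60)],                      # left to right
    [(40, 5), (40, 50), (40, 90), (30, 80)],  # downward with a small hook
]
print([encode_stroke(s) for s in te_strokes])
print(matches_kanji(te_strokes, TE_PATTERNS))
```

This sketch ignores the spatial relations between strokes, so it cannot resolve the ambiguities mentioned earlier.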
EDIT2: It is important that your user "prints" the Kanji and doesn't write in calligraphy, since in calligraphy many strokes are merged into one. Like when writing a kanji with the radical of "rice field" in calligraphy, this radical morphs into something completely different. Or radicals with a lot of horizontal dashes (like the radical of "speech" iu) just become one long wriggly line.