OpenType properties in OOXML - openxml

Is there a way to request specific OpenType properties in OOXML (Office Open XML)?
For example, I would like some digits to be in lining format and some digits to be in old-style format. How can this be specified in the OOXML code?
Here is an example of oldstyle and lining figures (the font is Garamond Premier Pro):
(It is the same font, only on the upper line the OpenType feature of oldstyle figures is activated.)

Related

Font for "math bold script" unicode charset

I can't believe I have been stuck on this for an hour, but it seems that fonts for extended Unicode characters are not easily available as TTF / OTF for use on computers, especially with graphics software where Unicode fallback doesn't work.
Specifically, I am looking for the so-called Math bold script,
something like: 𝓓𝓮𝓶𝓸 𝓯𝓸𝓷𝓽 𝓐𝓑𝓒𝓖𝓟 𝓮𝓻𝓽𝓷𝓭 (<- those are extended chars)
as in https://textfancy.com/font-converter/
and as an image at: https://snipboard.io/fNYd7w.jpg
(because I am not sure we all see the same glyphs).
Note: what I am looking for is a standard TTF font whose normal glyphs match those extended glyphs, meaning that A looks like 𝓐, B like 𝓑, and so on, so that I could use it as a normal font in any software.
The STIX math fonts support the Unicode Mathematical Alphanumeric Symbols block.
https://www.stixfonts.org/
https://github.com/stipub/stixfonts
(Note: the variable fonts don't include support for that block of characters; only the static fonts do.)
Please note the intended use of those Unicode characters, as pointed out in the STIX project:
The sans serif, fraktur, script, etc., alphabets in Plane 1 (U+1D400-U+1D4FF) are intended to be used only as technical symbols.
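To check whether a candidate font covers the bold script letters, it can help to list the exact codepoints involved. The following is a minimal Python sketch and not something from the STIX project; the function name to_bold_script is made up for illustration. It relies on the Unicode assignments U+1D4D0 (MATHEMATICAL BOLD SCRIPT CAPITAL A) and U+1D4EA (MATHEMATICAL BOLD SCRIPT SMALL A), each of which starts a contiguous run of 26 letters.

```python
import unicodedata

# First codepoints of the two bold script runs in the
# Mathematical Alphanumeric Symbols block.
BOLD_SCRIPT_CAPITAL_A = 0x1D4D0
BOLD_SCRIPT_SMALL_A = 0x1D4EA


def to_bold_script(text: str) -> str:
    """Map ASCII letters to their bold script counterparts.

    Characters other than A-Z and a-z are passed through unchanged.
    (Hypothetical helper, named for this example only.)
    """
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(BOLD_SCRIPT_CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(BOLD_SCRIPT_SMALL_A + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Print the codepoint and official name of each mapped character.
    for ch in to_bold_script("Demo"):
        print(f"U+{ord(ch):05X} {unicodedata.name(ch)}")
```

A font that draws script glyphs at the ordinary A-Z positions would make this mapping unnecessary; the sketch only shows which codepoints the extended characters occupy.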

Where can I get a Font-family to language pair map for Microsoft Word

I am programmatically generating an MS Word 2011 bilingual file (it contains text in two languages) using docx4j. My plan is to set the font family of the text based on its language. For example, when the text is in a Latin-script language and an Indian language, all English text will use 'Times New Roman' and all Hindi text will use 'Devanagari' as the font.
The MS Word documentation doesn't have any information on this. Any help finding a list of the prominent languages MS Word supports and their corresponding font families would be appreciated.
The starting point is the rFonts element.
As it says:
This element specifies the fonts which shall be used to display the
text contents of this run. Within a single run, there may be up to
four types of content present which shall each be allowed to use a
unique font:
• ASCII
• High ANSI
• Complex Script
• East Asian
The use of each of these fonts shall be determined by the Unicode
character values of the run content, unless manually overridden via
use of the cs element
For further commentary and the actual algorithm used by docx4j (in its PDF output), which aims to mimic Word, see RunFontSelector
To simplify a bit, you need to work out which of the 4 attributes Word would use for your Hindi (from its Unicode character values), then set that attribute to the font you want.
You can set the attribute to an actual font name, or use a theme reference (see the RunFontSelector code for how that works).
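As a rough illustration of setting those attributes, here is a minimal Python sketch that builds the XML for a run with an rFonts element using only the standard library. It is not docx4j code. It assumes that Word treats Devanagari as Complex Script (so the cs attribute is the one that matters for Hindi), and the font name "Mangal" and the helper name make_run are placeholders chosen for this example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag: str) -> str:
    """Return a namespace-qualified name for ElementTree."""
    return f"{{{W_NS}}}{tag}"


def make_run(text: str, latin_font: str, cs_font: str) -> ET.Element:
    """Build a w:r element whose rFonts sets the ASCII, High ANSI
    and Complex Script fonts. (Hypothetical helper for this sketch.)"""
    run = ET.Element(w("r"))
    rpr = ET.SubElement(run, w("rPr"))
    ET.SubElement(rpr, w("rFonts"), {
        w("ascii"): latin_font,   # used for ASCII characters
        w("hAnsi"): latin_font,   # used for High ANSI characters
        w("cs"): cs_font,         # used for Complex Script characters
    })
    t = ET.SubElement(run, w("t"))
    t.text = text
    return run


if __name__ == "__main__":
    # "Mangal" is only an assumed example of a Devanagari-capable font.
    run = make_run("नमस्ते", "Times New Roman", "Mangal")
    print(ET.tostring(run, encoding="unicode"))
```

Comparing the printed XML with what Word itself writes for a hand-made document is a quick way to confirm which attribute Word actually uses for your text.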
If I were you, I'd create a docx in Word which is set up as you like, then look at its underlying XML. If it uses theme references in the font attributes, you can either use the docx you created as a template for your docx4j work, or you can manually 'resolve' the references and replace them with the actual font names.
If you want to programmatically reproduce what Word has created for you, you can upload your docx to the docx4j webapp to generate suitable code.
Finally, note that the fonts need to be available on the computer opening the docx, unless they are embedded in the docx. If they aren't, another font may be substituted.

How can I substitute one glyph for another in an OpenType PostScript OTF font file?

I'm trying to use fonts from the Nitti Basic family for programming. These fonts are packaged as OpenType PostScript OTF files.
Its U+002D (HYPHEN-MINUS) glyph works well as a hyphen, but not so well as a minus.
For example, it doesn't line up with the horizontal bar of the plus sign.
On the other hand, Nitti's glyph for U+2212 (MINUS) is perfect as a minus (of course), and this is what I need when programming. It's not feasible for me to actually use codepoint U+2212; after all, U+002D is what you get when you press the minus sign on the keyboard and it's what programming languages use for subtraction.
So instead I'd like to steal the glyph from U+2212 and use it for U+002D, so that that character looks like a minus sign.
How can I do it?
Update: Yes, it is possible to use U+002D as a hyphen in source code.
As mentioned above, a minus sign is what I need.
I agree with Jukka, there are tools to do this.
However, please don't forget that a font is usually protected by contracts very similar to those for software. In this case, for example, the link you provided points to a legal document that reads (among much else):
"Except as permitted herein, you may not rename, modify, adapt,
translate, reverse engineer, decompile, disassemble, alter or
otherwise copy the Bold Monday Font Software."
Notice the fact that you're not permitted legally to change this font. If you read the rest of the agreement you'll see a lot of restrictions on the actual use of the font as well. Make sure you're not breaking your license by what you are doing...
For posterity, here's how to do it:
1. Obtain Adobe's AFDKO font tools and install them.
2. Put the OTF files into an empty directory.
3. Run ttx *.otf to convert the OTF files to TTX (XML).
4. Edit each TTX file in a text editor:
   • In the cmap section, change occurrences of hyphen to minus. This table maps characters to glyphs. Character U+002D was originally mapped to the hyphen glyph; this change maps it to the minus glyph.
   • Over the whole file, change occurrences of NittiBasic to NittiBasicM and Nitti Basic to Nitti Basic M. This will distinguish the modified version of the font from the original once it's installed.
5. Rename the TTX files, replacing Nitti Basic with Nitti Basic M.
6. Run ttx -b *.ttx to convert the TTX files back to OTF.
7. Finally, install the newly created OTF files.
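The cmap edit in the steps above can also be scripted rather than done by hand. Below is a minimal Python sketch using only the standard library. It assumes the TTX file has the usual layout, with map elements carrying code and name attributes inside each cmap subtable; the function name remap_hyphen_to_minus is made up for this example. Instead of relying on the glyph names hyphen and minus, it copies whatever glyph name is mapped to U+2212 onto U+002D. It does not perform the renaming steps.

```python
import xml.etree.ElementTree as ET

HYPHEN_MINUS = 0x002D
MINUS_SIGN = 0x2212


def remap_hyphen_to_minus(ttx_xml: str) -> str:
    """Point U+002D at the glyph used for U+2212 in every cmap
    subtable that contains both codes. (Hypothetical helper.)"""
    root = ET.fromstring(ttx_xml)
    for subtable in root.findall("./cmap/*"):
        # Index the <map> entries of this subtable by character code.
        maps = {int(m.get("code"), 16): m for m in subtable.findall("map")}
        if HYPHEN_MINUS in maps and MINUS_SIGN in maps:
            maps[HYPHEN_MINUS].set("name", maps[MINUS_SIGN].get("name"))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A tiny stand-in for a real TTX dump, for demonstration only.
    sample = (
        '<ttFont><cmap><tableVersion version="0"/>'
        '<cmap_format_4 platformID="3" platEncID="1" language="0">'
        '<map code="0x2d" name="hyphen"/>'
        '<map code="0x2212" name="minus"/>'
        '</cmap_format_4></cmap></ttFont>'
    )
    print(remap_hyphen_to_minus(sample))
```

On a real file you would read the TTX text from disk, pass it through the function, and write the result back before running ttx again. The licensing caveat above applies to scripted edits just as much as manual ones.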
Tools like FontForge can be used to edit a font in a simple manner.
Note that in programming, too, HYPHEN-MINUS has multiple uses: as a minus sign, but also (in some languages) as allowed in identifiers, as well as in comments, where it usually appears in the role of hyphen. In some uses, a HYPHEN glyph will look odd.

Non unicode to Unicode conversion, for any font!

I have an HTML file with text encoded in a non-Unicode font. I need to convert that file to Unicode. I searched for a converter, but most converters work only for a fixed list of fonts, not for all fonts.
My font is very specific; the text is in Devanagari script.
I have the file and I have the font. Please suggest a tool or technique. Thanks.
Unicode is not about fonts, it is about encoding. You need to find a converter that can convert your text to Unicode. What is the encoding of your text?
Apache Tika has the ability to pull text from PDF files via knowledge of font behavior. So if the file is in fact a PDF you have a chance. If you have a text file full of font indices in no particular encoding, you have a big programming job ahead of you.
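If it does come down to writing the conversion yourself, the core is usually a lookup table from the legacy font's character codes to Unicode. The following is a minimal Python sketch of that idea. The table entries are invented purely for illustration and do not describe any real font; the real table has to be worked out by inspecting which glyph your font draws for each code.

```python
def convert_legacy_text(text: str, table: dict) -> str:
    """Replace legacy-font character sequences with Unicode strings.

    Longer keys are tried first, so a multi-character sequence wins
    over a shorter key that is its prefix. Anything not found in the
    table is copied through unchanged.
    """
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No table entry starts here; keep the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # HYPOTHETICAL mapping, for demonstration only.
    demo_table = {
        "d": "\u0915",   # DEVANAGARI LETTER KA
        "[k": "\u0916",  # DEVANAGARI LETTER KHA
    }
    print(convert_legacy_text("d[k", demo_table))
```

A plain substitution like this is only the first step. Legacy Devanagari fonts often store some signs in visual order (for example the short i vowel sign before its consonant), whereas Unicode stores them in logical order, so a reordering pass is typically needed as well.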

how to generate Chinese Characters using Postscript?

Does anyone know how to generate Chinese characters using PostScript or related tools? I'd like to use Unicode to represent Chinese characters, but it seems that PostScript doesn't support Unicode yet. In addition, I'd like to specify several fonts to generate the same character.
Thus, I have two questions:
1. How can I use Unicode in PostScript? Or how can I enumerate the Chinese character set the PostScript way?
2. How can I specify the font configuration using PostScript?
Lastly, in case PostScript cannot do this job, what tools should I turn to for my purpose?
Thank you very much!
-Jin
In Adobe's official PostScript language specification there is no specific support for Unicode fonts. (And this is the final version of the spec for PS Level 3, valid since its publication in 1999 -- PostScript as a language is no longer developed...)
However, PostScript supports (since Level 2) multi-byte fonts (2-, 3- and 4-bytes) in a generic way (see 'CID'). All PostScript fonts need an "encoding": an encoding basically is a table telling at which index position of a font which glyph description for a given character can be found. So while there are no Unicode fonts as such, there are multi-byte CID fonts which provide ranged subsets of Unicode.
Also, there are no freely re-distributable CMaps. (A CMap maps character codes to the CIDs that select glyphs in a CID-keyed font.) If you need a CMap, you have to derive it from the Windows codepage and the matching Adobe CMap.
If you are just looking for a "super-simple" method to use Unicode text strings with no need to check for ranges, language, etc.: sorry to disappoint you. There is no way. That would be a pipe dream.
Have a look at CID-keyed fonts instead. These are designed to include a large number of glyphs. (Page 364ff in PLRM)
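To make the CID-keyed approach a little more concrete, here is a minimal Python sketch that writes a small PostScript program. It uses the Level 3 composefont operator to combine a CMap with a CIDFont, and passes the text as a hex string of 2-byte codes. The CIDFont name STSong-Light and the CMap name UniGB-UCS2-H are assumptions: both resources must actually be installed in your interpreter for the output to render, and a UCS-2 CMap only covers characters in the Basic Multilingual Plane. The function name ps_show_text is made up for this example.

```python
def ps_show_text(text: str,
                 cid_font: str = "STSong-Light",   # assumed CIDFont name
                 cmap: str = "UniGB-UCS2-H",       # assumed CMap name
                 size: int = 24) -> str:
    """Return a PostScript program that shows `text` with a
    CID-keyed font built by composefont. (Hypothetical helper.)"""
    # Encode each character as a 2-byte big-endian code for the hex string.
    hex_codes = text.encode("utf-16-be").hex().upper()
    lines = [
        "%!PS",
        # composefont: key cmap-name [cidfont-names] -> font
        f"/DemoFont /{cmap} [/{cid_font}] composefont",
        f"{size} scalefont setfont",
        "72 700 moveto",
        f"<{hex_codes}> show",
        "showpage",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(ps_show_text("中文"))
```

To try the same text with several fonts, call the function once per CIDFont name, provided each of those fonts is available to the interpreter.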
Update: Linked to the correct page with CID font description.