Zebra Programming Language (ZPL) II using ^FB or ^TB truncates text at specific lengths - truncate

I am writing code to print labels for botanic gardens. Each label is printed individually but with different information on each label. Each label contains a scientific name which can vary greatly in size and thus can go over 2 lines (our label size is 10cm wide by 2.5cm high).
Our problem occurs mainly with the name when we go over 24 characters (See line with **).
If we choose a name that has 24 characters or less then it prints fine.
Anything longer will not print.
If we take all the other "items" off the label and just leave the "name" element then it prints only the first 24 characters and truncates the rest (we did this to test whether a possible overlap between our ^FB block and another element could be causing this problem).
We tried this with other elements that use a ^FB and we found that they displayed the same behaviour but varied in the length at which this issue occurred: for example "cc" (short for country code) had a limit of 21 characters.
For added information: we compile this code within a BASIC environment and use variables such as ":name:" and ":accdt:", as seen below. Our database provides this information, and we have checked for any internal routines that would have truncated long names etc. Our code was working fine in ZPL, but we recently had to move to ZPL II (we purchased a newer model, the GX430t) and had to modify our ZPL code, at which point this problem started to occur.
Here is our code:
^XA
^LH40,40
^MMT
^PW1200
^LL1200
^FO16,05^A0N,35,^FDAcc. num.^FS
^FO170,05^A0,35,^FV":accnum:"^FS
^FO360,05^A0,35,^FV":qual:"^FS
^FO350,35^A0N,30,^FDAcc.dt.^FS
^FO450,35^A0N,30,^FB790,3,0,L,
^FH\^FV":accdt:"^FS
^FO430,70^^A0N,25,^FB790,3,0,L,
^FH\^FDProv. type^FS
^FO560,70^A0N,25,^FV":provtype:"^FS
^FO800,225^A0N,30,^FB790,3,0,L,
^FV":cc:"^FS
**^FO10,100^A0N,40,^FB790,3,0,L,
^FV":name:"^FS**
^FO1000,05^A0,35,^FV":proptype:"^FS
^FO5,225^A0,25^FVColl.^FS
^FO55,225^A0,25^FV":coll:"^FS
^FO375,225^A0,25,^FV":consstat:"^FS
^FO1000,70^A0,25,^FV":reqby:"^FS
^FO535,180^BCN,55,N,N,N^FV":qual:"^FS
^FO60,45^BCN,35,N,N,N^FV":accnum:"^FS
^PQ1,0,1,Y
^XZ
Here is what we have tried to fix this (apologies if some seem like wild cards):
Changing font type, size, and location on label;
Changing ^FO to ^FT;
Looked at our internal database logic;
Taking away ^FH\;
Changing the values within the ^FB line (we tried nearly all possible permutations);
Manually typed in a name longer than 24 characters (using notepad - no database/compiler) - same issue.
Any thoughts on this would be greatly appreciated.
Kerry

I've had this issue before, and across printer manufacturers, firmwares and languages.
First, some paraphrased explanations straight out of the 2014 ZPL II Programming Guide (P1012728-009 Rev. A).
"The ^TB command prints a text block with defined width and height. The text block has an automatic word-wrap function. If the text exceeds the block height, the text is truncated."
"The ^FB (Field Block) command allows you to print text into a defined block type format. It can format a ^FD (Field Data) string into a block of text using the origin, font, and
rotation specified for the text string, and it contains an automatic word-wrap function."
Technically, the difference between a text block and a field block is that height is in dots for the former and in lines for the latter.
Also note that, although the guide does not mention it, the ^FB command also truncates text that does not fit in the number of lines specified. This is where the font size of the ^A0 command and the line spacing of the ^FB command play an important role in determining whether the second or third line is shown or truncated.
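To make the truncation concrete, here is a rough Python sketch that mimics a field block: it word-wraps the text and then separates the lines that fit from the lines that would be silently dropped. It assumes a fixed number of characters per line, which is only an approximation, because font 0 is proportional and a real printer wraps by dot width. The names simulate_field_block and chars_per_line are hypothetical and are not part of ZPL.

```python
import textwrap


def simulate_field_block(text, chars_per_line, max_lines):
    """Word-wrap text, then split it into printed lines and dropped lines."""
    # Assumption: every line holds the same number of characters. A real
    # printer wraps by dot width using the glyph metrics of the font.
    lines = textwrap.wrap(text, width=chars_per_line)
    return lines[:max_lines], lines[max_lines:]


kept, dropped = simulate_field_block(
    "aaaa bbbb cccc dddd", chars_per_line=10, max_lines=1
)
print(kept)     # lines that fit in the block
print(dropped)  # lines a truncating printer would discard
```

If dropped is not empty for the longest name you expect, the block needs more lines, more width, or a smaller font.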
Incidentally, in other languages such as TSPL there is no truncation of text blocks--if you tell the block to be 3 lines in height but there's enough text for 4 lines, line 4 overlaps line 3 to indicate this--which may seem awful, but it is better than the data loss of truncation, which is not obvious.
For both commands:
"Using ^FT (Field Typeset) for your data takes the baseline origin of
the last possible line of text, meaning that the field block will be
filled from bottom to top."
"Using ^FO (Field Origin) means that the field block will be filled from top to bottom."
In reality, I have only been able to make the ^FB command work as expected, but that may be because ^TB is not implemented in the firmware I've worked with (ZPL II "compliant" Bluetooth printers).
You can test the following snippet for a 2x2 label in the Labelary Viewer:
^XA
~TA0
^MTD
^MNW
^MMT
^MFN
~SD15
^PR6
^PON
^PMN
^PW406
^LS0
^LRN
^LL406
^LT0
^LH0,0
^CI0
^XZ
^XA
^FO324,10,0^FB386,2,0,C,0^A0R,36,28.8^FH^FD"The King" Cupcake^FS
^FO278,10,0^FB386,1,0,C,0^A0R,28,22.4^FH^FDUse By 11/24/2015 02:45 PM^FS
^FO152,10,0^FB386,1,0,C,0^A0R,24,19.2^FH^FD11/24/2015 02:45 PM^FS
^FO62,140,0^FB250,1,0,R,0^A0R,24,19.2^FH^FDSL: 4 hours^FS
^FO38,10,0^FB386,1,0,L,0^A0R,18,14.4^FH^FDPREP DATE:^FS
^FO8,10,0^FB386,1,0,L,0^A0R,28,22.4^FH^FD11/24/2015 10:45 AM^FS
^FO62,10,0^FB50,1,0,L,0^A0R,24,19.2^FH^FDEMP:^FS
^FO92,10,0^FB376,3,0,J,0^A0R,18,14.4^FH^FDIngredients: 1 1/2 cups all-purpose flour, 1 teaspoon baking powder, 1/2 teaspoon salt, 8 tablespoons (1 stick) unsalted butter, room temperature, 1 cup sugar, 3 large eggs, 1 1/2 teaspoons pure vanilla extract, 3/4 cup milk.^FS
^PQ3,,,Y
^XZ
In particular, I've preceded the ^A0 and ^FD commands with ^FB. Using the viewer, you can quickly test the effects of switching between ^FT and ^FO in the ingredients line, of changing the ^A0 font sizes, and of changing the ^FB number of lines from, say, 3 to 2 (the viewer does not truncate text, by the way).
Of course, there is no substitute for actually printing a label, since your ZPL II "compliant" printer may or may not truncate text depending on its manufacturer and firmware version.
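If the label is generated from a program rather than typed by hand, the same command order can be produced by a small helper. The following Python sketch is only an illustration: field_block and its parameter names are hypothetical, the field is emitted on a single line, and no escaping is done, so it assumes the text contains no ^ or ~ characters.

```python
def field_block(x, y, width_dots, max_lines, font_height, text, justify="L"):
    """Build one text field in the order ^FO, ^FB, ^A0, ^FD ... ^FS."""
    # Assumption: text contains no ^ or ~ characters; nothing is escaped here.
    return (
        f"^FO{x},{y}"
        f"^FB{width_dots},{max_lines},0,{justify},0"
        f"^A0N,{font_height}"
        f"^FD{text}^FS"
    )


print(field_block(10, 100, 790, 3, 40, "Example scientific name"))
```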
I hope that helps!

Related

iTextSharp: why is GetSingleSpaceWidth() returning 0 when a space is visually obvious?

Hi All,
This is a question related to iTextSharp version 5.5.13.1. I am using a custom LocationTextExtractionStrategy implementation to extract sensible words from a PDF document. I am calling the GetSingleSpaceWidth method of TextRenderInfo to determine when to join two adjacent blocks of characters into a single word, as per the SFO link "itext java pdf to text creation".
This approach has generally worked well. However, if you look at the attached document, the words "Credit" and "Extended" are giving me some problems.
Why do all the characters encircled in the screen capture return a zero value for GetSingleSpaceWidth? This causes a problem: instead of two separate words, my logic returns a single word, "CreditExtended".
I understand that iTextSharp 5 is not supported any more. Any suggestions would be highly appreciated.
Sample document
https://drive.google.com/open?id=1pPyNRXvnUyIA2CeRrv05-H9q0sTUN97d
As already conjectured in a comment, the cause is that the font in question does not contain a regular space glyph, or even more exactly, does not map any of its glyphs to the Unicode value U+0020 in its ToUnicode map.
If a font has a ToUnicode map, iText uses only the information from that map. Thus, iText does not identify a space glyph in that font, so it cannot provide the actual SingleSpaceWidth value and returns 0 instead.
The font in question is named F5 and has this ToUnicode map:
/CIDInit /ProcSet findresource begin
14 dict begin
begincmap
/CIDSystemInfo
<< /Registry (Adobe)
/Ordering (UCS)
/Supplement 0
>> def
/CMapName /Adobe-Identity-UCS def
/CMapType 2 def
1 begincodespacerange
<0000> <FFFF>
endcodespacerange
4 beginbfchar
<0004> <0041>
<0012> <0043>
<001C> <0045>
<002F> <0049>
endbfchar
1 beginbfrange
<0044> <0045> <004D>
endbfrange
13 beginbfchar
<0102> <0061>
<0110> <0063>
<011A> <0064>
<011E> <0065>
<0150> <0067>
<015D> <0069>
<016F> <006C>
<0176> <006E>
<017D> <006F>
<0189> <0070>
<018C> <0072>
<0190> <0073>
<019A> <0074>
endbfchar
5 beginbfrange
<01C0> <01C1> <0076>
<01C6> <01C7> <0078>
<0359> <0359> [<2026>]
<035A> <035B> <2018>
<035E> <035F> <201C>
endbfrange
1 beginbfchar
<0374> <2013>
endbfchar
endcmap
CMapName currentdict /CMap defineresource pop
end
end
As you can see, there is no mapping to <0020>.
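A check like this can also be automated. The following Python sketch is not iText code; it is a minimal, hypothetical parser (parse_tounicode and maps_space are made-up names) that reads the bfchar and bfrange sections of a ToUnicode CMap given as text and reports whether any code maps to U+0020. It assumes a well-formed CMap whose destinations are UTF-16BE values written in hex.

```python
import re

HEX = r"<([0-9A-Fa-f]+)>"


def _decode(hex_digits):
    # ToUnicode destinations are UTF-16BE code units written in hex.
    return bytes.fromhex(hex_digits).decode("utf-16-be")


def parse_tounicode(cmap_text):
    """Return {character code: Unicode string} from bfchar/bfrange sections."""
    mapping = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(HEX + r"\s*" + HEX, block):
            mapping[int(src, 16)] = _decode(dst)
    range_entry = HEX + r"\s*" + HEX + r"\s*(\[[^\]]*\]|<[0-9A-Fa-f]+>)"
    for block in re.findall(r"beginbfrange(.*?)endbfrange", cmap_text, re.S):
        for lo, hi, dst in re.findall(range_entry, block):
            codes = range(int(lo, 16), int(hi, 16) + 1)
            if dst.startswith("["):
                # Array form: one destination per source code.
                for code, d in zip(codes, re.findall(HEX, dst)):
                    mapping[code] = _decode(d)
            else:
                # Single start value: the last code unit is incremented.
                first = _decode(dst.strip("<>"))
                for i, code in enumerate(codes):
                    mapping[code] = first[:-1] + chr(ord(first[-1]) + i)
    return mapping


def maps_space(cmap_text):
    """True if some character code maps to U+0020."""
    return " " in parse_tounicode(cmap_text).values()
```

Applied to the map quoted above, maps_space would be expected to return False, which is the situation in which iText reports a SingleSpaceWidth of 0.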
The use of fonts in this PDF page is quite funny, by the way:
Its body is (mostly) drawn using Calibri, but it uses two distinct PDF font objects for this, F4 which uses WinAnsiEncoding from character 32 through 122, i.e. including the space glyph, and F5 which uses Identity-H and provides the above quoted ToUnicode map without a space glyph. Each maximal sequence of glyphs without gap is drawn separately; if that whole sequence can be drawn using F4, that font is used, otherwise F5 is used.
Thus, "CMI", "(Credit", and "sub-indexes" are drawn using F4, while "I’ve", "“Credit", and "Extended”" are drawn using F5.
In your problem string “Credit Extended”, therefore, we see two consecutive sequences drawn using F5. Thus, you'll get a 0 SingleSpaceWidth both for the "t" of "“Credit" and for the "E" of "Extended”".
At first glance these are the only two consecutive sequences using F5, so you have that issue only there.
As a consequence you should develop a fallback strategy for the case of two consecutive characters both coming with a 0 SingleSpaceWidth, e.g. using something like a third of the font size.
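Such a fallback could look roughly like the following Python sketch. It works on plain numbers rather than on iText objects, so effective_space_width and is_word_boundary are hypothetical helpers, and the rule that a gap wider than half a space starts a new word is an assumption made for illustration.

```python
def effective_space_width(prev_width, next_width, font_size):
    """Pick a space width for the gap between two adjacent characters."""
    # Both neighbours report 0: fall back to a third of the font size.
    if prev_width == 0 and next_width == 0:
        return font_size / 3.0
    # Otherwise use whichever neighbour reports a usable value.
    return prev_width or next_width


def is_word_boundary(gap, prev_width, next_width, font_size):
    # Assumption: a gap wider than half a space separates two words.
    return gap > effective_space_width(prev_width, next_width, font_size) / 2.0
```

With a 12 pt font and two zero widths, the fallback width is 4, so a gap of 3 units between the two characters would count as a word boundary.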

Tk text widget index expressions and Unicode

(this question is based on that)
Let us consider the following code:
package require Tk 8.6
pack [text .t]
.t insert end "abcdefgh\nабвгґдеє\n一伊依医咿噫欹泆"
puts "[.t index 1.4+1l] [.t index 1.4+2l]"
puts "[.t index 3.4-1l] [.t index 3.4-2l]"
exit 0
Output:
2.2 3.2
2.6 1.8
I would rather expect +1l and -1l to preserve the column if the line is long enough, that is, to print 2.4 3.4 and 2.4 1.4. It looks like the result depends on the number of bytes needed to encode each character.
Should it be this way? Is it documented somewhere?
What font are you using? What exact patch-version of Tk are you using? (It should be reported by doing puts [package require Tk].)
I think the text widget currently uses character widths when working out the actual motions when doing index movement by lines. This has changed between past versions. The problem is that different bits of code want different things: sometimes you want visible motions (e.g., when handling users' cursor motion, especially with tabs set) and sometimes you want character-space motions (which is what you appear to be expecting).
Tk shouldn't ever be doing anything (you can see) with the byte widths of unicode characters. It's really supposed to handle that transparently (at least for any character in the Basic Multilingual Plane; you might find bugs outside that).

Unifont & UnicodeData.txt: how do I deduce whether a character is full or half width (x-advance)?

Is there a reliable way to determine whether a glyph in Unifont is half width, like the Latin characters (i.e. everything in chart 0002), which take up only the left half of the cell, or full width, like character 0x06E9 (from chart 0006)?
Pixel analysis is not a good solution for me, as it would fail on many characters, such as spaces.
I'd prefer to use information from UnicodeData.txt:
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
Unfortunately, I'm not able to find a good match between Unifont and any field in that data.
Chart 0002: http://unifoundry.com/png/plane00/uni0002.png
Chart 0006: http://unifoundry.com/png/plane00/uni0006.png
Looks like you'll need the source code '.hex' for the version of unifont you're using and the appropriate versions of the Unicode Utilities from [1]. 'unigenwidth' [2] seems to generate code related to the width of characters in Unifont; perhaps you'll need to write a parser to look through that code and give you what you want?
[1] http://unifoundry.com/unicode-utilities.html
[2] http://manpages.ubuntu.com/manpages/trusty/man1/unigenwidth.1.html
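If the '.hex' source is available, the width can also be read directly from the length of each entry. The Python sketch below assumes the usual .hex layout of one glyph per line in the form codepoint:bitmap, with 16 pixel rows per glyph, so 32 hex digits mean an 8-pixel-wide glyph and 64 hex digits a 16-pixel-wide one. The function name parse_hex_line is hypothetical and the all-zero bitmap data is made up for illustration.

```python
def parse_hex_line(line):
    """Return (code point, glyph width in pixels) for one .hex line."""
    code_hex, bitmap_hex = line.strip().split(":")
    # Assumption: 16 pixel rows per glyph, 4 bits per hex digit.
    width = len(bitmap_hex) * 4 // 16
    return int(code_hex, 16), width


# Made-up blank bitmaps, used only to show the two entry lengths.
print(parse_hex_line("0041:" + "00" * 16))  # half-width entry
print(parse_hex_line("06E9:" + "00" * 32))  # full-width entry
```

This avoids pixel analysis, so blank glyphs such as spaces are classified the same way as any other glyph.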

How do I add a new Arabic vowel-sign in the PUA area of a font?

I am using Ubuntu 14.04, with FontForge compiled from the Git repo as of 31
July.
I'm trying to add a vowel-sign to an Arabic font, Graph, by Future Soft Egypt:
http://openfontlibrary.org/en/font/graph
I have added glyphs where the Unicode code-point already exists (eg peh,
U+067E), and that works fine. I am now trying to add a vowel sign where no
Unicode code-point exists - it is a "damma with tail", used by some writers in
Swahili to mean "o".
I decided to put it in the PUA at U+E909, and copied the font's damma (U+064F)
and added a tail:
http://kevindonnelly.org.uk/swahili/images/dammas.png
I generated the font, and set up the keyboard to emit that character.
The glyph comes up OK, but there are two problems, as can be seen here:
http://kevindonnelly.org.uk/swahili/images/output.png
showing at top "bubu", using the original damma, and at bottom "bobo", using
the new damma-with-tail.
(1) The damma-with-tail is too far to the left, even though the anchor points
in FF have not been moved.
(2) Worse, the damma-with-tail means that only the isolated versions of the
consonant glyphs get used - in the second line the two bs should be joined, as
in the first line.
I'm not sure whether this is a function of using the PUA, or whether it's due
to my missing some step I need to take in FF (eg the Encoding -> Add Encoding
Slots that needs to be done for the consonants), but if anyone could shed some
light on how to fix the two problems, I'd be very grateful.

How do I use length() for unicode characters?

When working in the Moovweb SDK, length("çãêá") is expected to return 4, but instead returns 8. How can I ensure that the length function works correctly when using Unicode characters?
This is a common issue: length() is counting with the wrong character set. To fix it, set the charset_determined variable so that the correct character set is in use before you call length(), like so in your Tritium code:
$charset_determined = "utf-8"
# your call to length() here
In Unicode, there is no such thing as a length of a string or "number of characters". All this comes from ASCII thinking.
You can choose from one of the following, depending what you exactly need:
For cursor movement, text selection and the like, grapheme clusters should be used.
For limiting the length of a string in input fields, file formats, protocols, or databases, the length is measured in code units of some predetermined encoding. The reason is that any length limit is derived from the fixed amount of memory allocated for the string at a lower level, be it in memory, disk or in a particular data structure.
The size of the string as it appears on the screen is unrelated to the number of code points in the string. One has to communicate with the rendering engine for this. Code points do not occupy one column even in monospace fonts and terminals. POSIX takes this into account.
There is more info in http://utf8everywhere.org
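The different notions of length are easy to see side by side. The sketch below uses Python rather than Tritium, purely as an illustration; measure is a hypothetical helper. For the precomposed string from the question, the UTF-8 byte count is 8, which is presumably where the unexpected result comes from.

```python
import unicodedata


def measure(s):
    """Count one string in several units."""
    return {
        "code_points": len(s),
        "utf8_bytes": len(s.encode("utf-8")),
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        # Canonical decomposition splits each letter into base + accent.
        "nfd_code_points": len(unicodedata.normalize("NFD", s)),
    }


s = "\u00e7\u00e3\u00ea\u00e1"  # the string çãêá in precomposed form
print(measure(s))
```

Grapheme clusters are not covered here, because the Python standard library has no built-in segmentation for them.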