How to interpret iTextSharp TJ/TF result - itext

I'm implementing an automation process in PowerShell, using the iTextSharp library to extract information from several PDF documents.
Based on this PDF content portion:
It returns this result:
[(1)-1688.21(1)-492.975(0)-493.019(0)]TJ
[(5)-493.019(0)-17728.1(2)]TJ
I can extract the literal values with some regex manipulation, but with this method alone the result is:
$line -replace "^\[\(|\)\]TJ$", "" -split "\)\-?\d+\.?\d*\(" -join ""
1000
502
Of course, these results are incomplete, and I need more information on how to read and parse the operands.
I suspect that the numbers between the literal characters (e.g. -1688.21, -492.975, ...) may be useful, but I didn't find an explanation of such parameters.
What do they represent?

When you are wondering about details of the PDF format, you should have a look into the PDF specification ISO 32000.
Operands: array
Operator: TJ
Description: Show one or more text strings, allowing individual glyph positioning. Each element of array shall be either a string or a number. If the element is a string, this operator shall show the string. If it is a number, the operator shall adjust the text position by that amount; that is, it shall translate the text matrix, Tm. The number shall be expressed in thousandths of a unit of text space (see 9.4.4, "Text Space Details"). This amount shall be subtracted from the current horizontal or vertical coordinate, depending on the writing mode. In the default coordinate system, a positive adjustment has the effect of moving the next glyph painted either to the left or down by the given amount. Figure 46 shows an example of the effect of passing offsets to TJ.
(ISO 32000-1, Table 109 – Text-showing operators)
Thus,
I suspect that the numbers between the literal characters (e.g. -1688.21, -492.975, ...) may be useful, but I didn't find an explanation of such parameters.
What do they represent?
For each such number, the operator adjusts the text position by that amount. The number is expressed in thousandths of a unit of text space. This amount is subtracted from the current horizontal or vertical coordinate, depending on the writing mode.
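As a rough illustration of how these numbers can be used, here is a Python sketch (Python rather than PowerShell, purely for illustration) that splits a TJ array into strings and adjustments and treats a large negative adjustment as a visible gap. The names parse_tj and tj_to_text and the gap_threshold value of 1000 are assumptions made for this example, not part of iTextSharp or the PDF specification. A sensible threshold depends on the font and font size, and the sketch ignores hex strings and does not decode escape sequences.

```python
import re

# One token is either a literal string "(...)" or a number.
TOKEN = re.compile(r"\(((?:\\.|[^\\)])*)\)|(-?\d*\.?\d+)")


def parse_tj(line):
    """Split the array operand of a TJ operator into strings and floats."""
    body = line[line.index("[") + 1 : line.rindex("]")]
    items = []
    for match in TOKEN.finditer(body):
        if match.group(1) is not None:
            items.append(match.group(1))         # text to show
        else:
            items.append(float(match.group(2)))  # position adjustment
    return items


def tj_to_text(line, gap_threshold=1000.0):
    """Join the strings, inserting a space for each large rightward shift.

    The adjustment is subtracted from the horizontal position, so in
    horizontal writing mode a negative number moves the next glyph right.
    gap_threshold (thousandths of a text-space unit) is a guess.
    """
    parts = []
    for item in parse_tj(line):
        if isinstance(item, str):
            parts.append(item)
        elif -item > gap_threshold:
            parts.append(" ")
    return "".join(parts)


if __name__ == "__main__":
    print(tj_to_text("[(1)-1688.21(1)-492.975(0)-493.019(0)]TJ"))
    print(tj_to_text("[(5)-493.019(0)-17728.1(2)]TJ"))
```

With the assumed threshold, the two lines from the question come out as "1 100" and "50 2", which hints that the digits belong to separate columns rather than to one number.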

Related

How to truncate UILabel and leave specifically x trailing characters?

Examples (x=1):
ABC -> ABC
ABCDEFGHIJKL -> ABCDE...L
ABCDEFGHIJKLMNOPQRS -> ABCDE...S
(truncating after 5 characters is an arbitrary choice for this question - it will depend on the width of the label of course)
I'm basically looking for the same functionality as truncating in the middle, but where I can specify how many characters to leave on the trailing end. Is this available in Swift or is there a reasonable workaround?
You can measure any string with the size(withAttributes:) method, where the attributes include your font.
Here is the idea:
Measure the width of the last x characters plus the ellipsis.
Iteratively measure the string from the beginning, adding one character at each step.
If the width of the leading string plus the tail exceeds the given width, take the leading string from the previous step.
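The steps above can be sketched language-neutrally in Python. This is only an outline of the idea, not UIKit code: the function name truncate_keep_tail and the measure parameter are hypothetical, and measure stands in for a real text-width function such as size(withAttributes:). By default it simply counts characters, so the width is expressed in characters. Measuring the head and the tail separately also ignores kerning between them, so treat the result as an approximation.

```python
def truncate_keep_tail(text, max_width, tail_len, measure=len, ellipsis="..."):
    """Truncate text to max_width, keeping the last tail_len characters."""
    if measure(text) <= max_width:
        return text  # already fits, nothing to do

    # Step 1: width of the ellipsis plus the trailing characters.
    split = len(text) - tail_len
    tail = ellipsis + text[split:]
    tail_width = measure(tail)

    # Steps 2 and 3: grow the head one character at a time and stop
    # as soon as head + tail would no longer fit.
    head = ""
    for end in range(1, split + 1):
        candidate = text[:end]
        if measure(candidate) + tail_width > max_width:
            break
        head = candidate
    return head + tail


if __name__ == "__main__":
    for sample in ("ABC", "ABCDEFGHIJKL", "ABCDEFGHIJKLMNOPQRS"):
        print(sample, "->", truncate_keep_tail(sample, max_width=9, tail_len=1))
```

With a width of 9 characters and x=1 this reproduces the examples from the question.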

Remove commas and decimal places from number field

I am trying to add two zero placeholders in front of a field without changing the actual values involved. The field is an order number that is being pulled from MOMs. Right now that field's formula is {cms.ORDERNO}.
When I try '00'+{cms.ORDERNO} the field displays 001,254.00. How can I remove the decimals and comma so it displays 001254?
The usual trick is to pad with plenty of extra digits on the left and then only take the six you really want from the right. This would handle any order number ranging from 1 to 999999.
right("000000" + totext({cms.ORDERNO}, "0"), 6)
When you don't specify a format string, as you tried, it uses default settings which usually come from Windows. By the way, if I recall correctly cstr() and totext() are equivalent for the most part but totext() has more options.
You should also be able to specify "000000" as the format string to produce the left-padded zeroes. Sadly I don't have Crystal Reports installed or I'd check it out for you to be sure. If this is the case then you probably don't need a formula if you just want to use the formatting options for the field on the canvas. If you do use a formula it's still simple.
totext({cms.ORDERNO}, "000000")
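For readers who want to see the pad-and-take-right trick outside Crystal, here is a small Python analogy. It is not Crystal syntax, and the function name pad_right_slice is made up for this example. Like the right(...) formula, it silently drops leading digits when the number is wider than the field.

```python
def pad_right_slice(order_no, width=6):
    """Left-pad with zeroes, then keep only the rightmost `width` digits."""
    digits = str(int(order_no))  # drop the decimals, no thousands separator
    return ("0" * width + digits)[-width:]


if __name__ == "__main__":
    print(pad_right_slice(1254.00))
    # Python's format mini-language gives the same padding directly.
    print(format(int(1254.00), "06d"))
```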
You definitely want to use the Replace formula a few times for this. The formula below converts ORDERNO into string, removes any commas and trailing decimal places, then adds the two zeroes at the beginning:
"00" + REPLACE(REPLACE(CSTR({cms.ORDERNO}),".00",""),",","")
So for example, if cms.ORDERNO is 1,254.00 the output from this formula would be 001254
I know this is an older question, but better solutions exist, and I ran across this same issue. ToText has what you need built right in.
"00" + ToText({cms.ORDERNO}, 0, "")
From the Crystal Documentation:
ToText (x, y, z)
x is a Number or Currency value to be converted into a text string; it can be a whole or fractional value.
y is a whole number indicating the number of decimal places to carry the value in x to (This argument is optional.).
z is a single character text string indicating the character to be used to separate thousands in x. Default is the character specified in your International or Regional settings control panel. (This argument is optional.)

Format Matlab data to only have digits after the decimal place

I used dlmwrite to output some data in the following form:
-1.7693255974E+00,-9.7742420654E-04, 2.1528647648E-04,-1.4866241234E+00
What I really want is the following format:
-.1769325597E+00, -.9774242065E-04, .2152864764E-04, -.1486624123E+00
A space is required before each number, followed by a sign, if the number is negative, and the number format is comma delimited, in exponential form to 10 significant digits.
Just in case Matlab is not able to write to this format (-.1769325597E+00), what is it called specifically so that I can research other means of solving my problem?
Although this feels morally wrong, one can use regular expressions to move the decimal point. This is what the function
myFormat = @(x) regexprep(sprintf('%.9e', 10*x), '(\d)\.', '\.$1');
does. The input value is multiplied by 10 prior to formatting, to account for the point being moved. Example: myFormat(-pi^7) returns -.3020293228e+04.
The above works for individual numbers. The following version is also able to format arrays, providing comma separators. The second regexprep removes the trailing comma.
myArrayFormat = @(x) regexprep(regexprep(sprintf('%.9e, ', 10*x), '(\d)\.', '\.$1'), ', $', '');
Example: myArrayFormat(1000*rand(1,5)-500) returned
-.2239749230e+03, .1797026769e+03, .1550980040e+03, -.3373882648e+03, -.3810023184e+03
For individual numbers, myArrayFormat works identically to myFormat.
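For anyone who ends up post-processing the file outside Matlab, the same trick carries over directly. Below is a minimal Python sketch of it. The names my_format and my_array_format merely mirror the Matlab handles above, and the output keeps the lowercase e of the Matlab version.

```python
import re


def my_format(x):
    """Format x with the decimal point moved in front of the first digit.

    The value is multiplied by 10 before formatting so that moving the
    point one place to the left leaves the number unchanged.
    """
    return re.sub(r"(\d)\.", r".\1", f"{10 * x:.9e}")


def my_array_format(values):
    """Format several numbers, separated by a comma and a space."""
    return ", ".join(my_format(v) for v in values)


if __name__ == "__main__":
    print(my_format(1.5))
    print(my_array_format([1.5, -0.25]))
```

Joining with ", " avoids the trailing comma that the second regexprep has to strip in the Matlab version.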

Maximum length of a string after performing unicode casefolding

I need to perform casefolding on a set of strings, and must ensure beforehand that they will not exceed a given length after this is done (to hard-code the needed buffer size). The problem is that a string length (in code points) may change after casefolding is applied. See, e.g., in Python3:
>>> "süß".casefold()
'süss'
Now, the maximum number of code points a string may contain after performing casefolding can be computed easily:
>>> max(len(chr(s).casefold()) for s in range(0x10FFFF + 1))
3
But is it valid in all cases? I mean, is it possible that the sequence of code points (the order in which they appear) might affect the final length of the string, due to some arcane property of Unicode? Or can I assume that the final string will always be at most 3 times longer than the original?
The Unicode standard defines casefolding as follows:
toCasefold(X): Map each character C in X to Case_Folding(C).
So every character in a string is casefolded regardless of context and the results are concatenated. This means that your assumption is correct: A casefolded string is guaranteed to have at most three times the number of code points of the original.
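A quick way to convince yourself of this in Python is to compare casefold() on the whole string with the concatenation of the per-character results. The helper names below are made up for this check, and the sample strings are arbitrary picks that include characters known to expand.

```python
def casefold_per_char(text):
    """Casefold each code point on its own and concatenate the results."""
    return "".join(ch.casefold() for ch in text)


def max_fold_expansion():
    """Largest number of code points any single code point folds to."""
    return max(len(chr(cp).casefold()) for cp in range(0x110000))


if __name__ == "__main__":
    samples = ["süß", "\ufb03", "\u0390", "ΣΑΣ"]
    for s in samples:
        folded = s.casefold()
        assert folded == casefold_per_char(s)  # no context dependence
        assert len(folded) <= 3 * len(s)       # the 3x bound holds
        print(repr(s), "->", repr(folded))
```

The check on samples is only a spot test; the guarantee itself comes from the per-character definition quoted above.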

Ab Initio - Formatting a number in Left alignment

I have a requirement in Ab Initio to format a number in left alignment. I shouldn't be using String conversion (as Strings are left aligned by default), as it might cause compatibility problems in the other end.
For example, if my Field has 7 bytes length, and I'm getting only two digits as my input, then these two digits should go into the first two bytes of my field (left aligned), instead of the last two bytes.
So, is there any in-built function in Ab Initio, that can format a number as left aligned?
You can convert it to string and let it ride. Ab Initio will automatically convert between string and decimal. Also, the physical representation will be the same for these two types.
If you are trying to use a non-ASCII-based format (int, float, etc.), I don't think there is a built-in function for this. You will probably have to do something rough, like casting it to a void type and then to a string type using hex_to_string() to preserve the exact bits, and then right-padding with spaces.