Get the exact position of text in an image with Tesseract (iPhone)

Using the GetHOCRText(0) method in Tesseract I'm able to retrieve the text as HTML, and when I present that HTML in a web view I can see the text, but the position of the text in the image is different from the output. Any ideas would be very helpful.
tesseract->SetInputName("word");
tesseract->SetOutputName("xyz");
tesseract->Recognize(NULL);
char *utf8Text=tesseract->GetHOCRText(0);

If you have the hOCR output, you should have a tag for each word. These tags should have class="ocrx_word" and title="bbox x1 y1 x2 y2", where (x1, y1) is the top-left and (x2, y2) the bottom-right corner of the bounding box around the word, in pixels of the input image. I don't think it's possible to automatically use this information to format a text document, since that would require translating pixel differences into a number of tabs/spaces. But you should be able to render text in the given location.
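As a rough illustration of reading those bounding boxes, here is a minimal Python sketch. The sample markup, the function names hocr_words and to_css, and the assumption that class appears before title inside each span are mine, not something the Tesseract API guarantees. It only shows the idea of pulling the bbox values out of the title attribute and turning them into absolute positions.

```python
import re

# Hypothetical hOCR fragment; real GetHOCRText() output carries more markup.
HOCR_SAMPLE = """
<span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
<span class='ocrx_word' id='word_1_2' title='bbox 109 92 185 116; x_wconf 93'>World</span>
"""

# Assumes the class attribute precedes the title attribute within the span.
WORD_RE = re.compile(
    r"<span[^>]*class=['\"]ocrx_word['\"][^>]*"
    r"title=['\"][^'\"]*?bbox (\d+) (\d+) (\d+) (\d+)[^'\"]*['\"][^>]*>"
    r"(.*?)</span>",
    re.S,
)


def hocr_words(hocr):
    """Return [(text, (x1, y1, x2, y2)), ...] for every ocrx_word span."""
    words = []
    for x1, y1, x2, y2, inner in WORD_RE.findall(hocr):
        text = re.sub(r"<[^>]+>", "", inner).strip()  # drop nested tags
        words.append((text, (int(x1), int(y1), int(x2), int(y2))))
    return words


def to_css(box):
    """Turn a top-left-origin pixel box into an absolute-positioning style."""
    x1, y1, x2, y2 = box
    return (f"position:absolute; left:{x1}px; top:{y1}px; "
            f"width:{x2 - x1}px; height:{y2 - y1}px")
```

If the web view shows the image at a different size than the one passed to Tesseract, the coordinates would also need to be multiplied by the display scale before use.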

The GetBoxText() method returns the position of each character, one character per line, in box-file format.
char *boxtext = _tesseract->GetBoxText(0);
NSString* aBoxText = [NSString stringWithUTF8String:boxtext];
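One thing to check if the positions look wrong: as far as I know, each box-file line has the form "symbol left bottom right top page", with y measured upward from the bottom edge of the image, whereas UIKit and HTML measure downward from the top. The Python sketch below parses such text and flips the y axis. The function name parse_box_text and the sample values are made up for illustration, and the coordinate convention should be verified against your Tesseract version.

```python
def parse_box_text(box_text, image_height):
    """Parse box-file text into [(symbol, (x1, y1, x2, y2)), ...].

    Assumes each line is "symbol left bottom right top page" with a
    bottom-left origin; the returned boxes use a top-left origin.
    """
    boxes = []
    for line in box_text.splitlines():
        parts = line.rsplit(" ", 5)  # split off the five trailing numbers
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        symbol, left, bottom, right, top, _page = parts
        left, bottom, right, top = int(left), int(bottom), int(right), int(top)
        # Flip y: the box's upper edge becomes the smaller y value.
        boxes.append((symbol, (left, image_height - top,
                               right, image_height - bottom)))
    return boxes
```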

Related

Showing an image at an AcroForm text field position

I have a PDF document with AcroForm text fields that has already been shared with the client. Now the client wants to insert a signature image into one of the text fields, and my manager asked me to find a way to do it. My idea is to place an image on top of the text field's position and resize the text field to the image's size.
To replace the AcroForm text field with an image, I am trying the following:
1. Finding the text field by its field ID
String position = null;
List<FieldPosition> fieldPositons = form.getFieldPositions("50106");
for (FieldPosition position :fieldPositons) {
this.position = position.position;
}
2. Placing the image at the position of the text field
Image image = Image.getInstance("ImageFileName");
Float dimensions = position.split("x");
image.setAbsolutePosition(dimensions[0], dimensions[1]);
content.addImage(image);
Based on the given image's width and height, I need to change the width and height of the AcroForm text field.
Has anyone tried something like this? Will my logic work with the iText PDF library? Let me know if you have any ideas on how to replace an AcroForm text field with an image.
It is never a good idea to change the dimensions of an AcroField as all the content in a PDF is added at absolute positions. When you change the dimension of a field, you risk that it will overlap with other content.
Hence, your best option is to adapt the size of the image to the size of the AcroField. As you already indicated, you can get the field position like this:
AcroFields.FieldPosition f = form.GetFieldPositions("50106")[0];
Note that it's not a good idea to make this a string. You can use the FieldPosition object like this:
int page = f.page;
Rectangle rect = f.position;
You can scale and position the image like this:
Image image = Image.GetInstance("ImageFileName");
image.ScaleToFit(rect.Width, rect.Height);
image.SetAbsolutePosition(rect.Left, rect.Bottom);
The ScaleToFit() method will scale the image in such a way that it will fit the dimensions of the form field, respecting the original aspect ratio of the image (you don't want to add a stretched image).
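To make the aspect-ratio point concrete, here is a small Python sketch of the arithmetic that a fit-inside scaling performs. It is not iText's source code. The function names scale_to_fit and centered_origin are hypothetical, and the centering step is an optional extra that the answer itself does not use (it places the image at the field's lower-left corner).

```python
def scale_to_fit(img_w, img_h, box_w, box_h):
    """Scale (img_w, img_h) to fit inside (box_w, box_h), keeping the ratio."""
    scale = min(box_w / img_w, box_h / img_h)  # the tighter side decides
    return img_w * scale, img_h * scale


def centered_origin(left, bottom, box_w, box_h, w, h):
    """Lower-left corner that centers a w x h image inside the field box."""
    return left + (box_w - w) / 2, bottom + (box_h - h) / 2
```

For example, a 400 x 200 image in a 100 x 100 field comes out as 100 x 50, so half of the field's height stays empty unless you center the image vertically.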
You need the page variable to add the image to the correct page:
PdfContentByte canvas = stamper.GetOverContent(page);
canvas.AddImage(image);
Important:
If you add the image as described above, you need to remove the field (this happens automatically if you flatten the form).
If the form field is a button, then you shouldn't use the above code. Instead you should replace the icon of the button. This is described in my answer to the question How to change a Button Icon of a PDF Formular with itextsharp? This answer explains the standard way to do what you're trying to do, but it requires that the field where you want to add the image is a button field (and based on your question, it seems that it's a text field).

Text region extraction by finding co-ordinates of text from an image

I am developing image processing software that extracts/crops and enhances a single-page form from an image taken with a cellphone camera. The form has no rectangular boundary that would simplify extraction. It is black text on a white background, but nothing apart from that is fixed. Some text will be present that verifies that the image is of the required form. So my questions are these:
1) Can I search for a specific regular expression using the Leptonica library itself, or do I have to shift focus to other libraries such as the Tesseract API to do this? So far I have not found anything of this sort.
2) Suppose I know the text at the top-left corner and the bottom-right corner and I find it successfully. Can I get the coordinates of the particular text that I am searching for and then crop the image accordingly?
Leptonica doesn't do anything with text, it's an image processing library.
To get the position of the text, add tessedit_create_hocr 1 to your Tesseract config file (or set this option in whichever way you configure Tesseract if you're using it as a library).
The result is no longer a text file, but a UTF-8-encoded HTML file (note: it's not valid XML). Its format is self-explanatory. It will contain positions and dimensions of all words on all pages in pixels, as found on the input image. You need to parse that HTML, find the words you're looking for, and then get the bounding boxes of those words.
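The parse-then-crop step could look roughly like the Python sketch below, which uses only the standard library. The class WordBoxParser, the helpers find_word and crop_box, and the sample markup are all invented for illustration. Real hOCR output has more attributes and may need more careful handling.

```python
import re
from html.parser import HTMLParser

# Hypothetical hOCR page with three recognized words.
SAMPLE = """
<div class='ocr_page' title='bbox 0 0 800 1000'>
<span class='ocr_line' title='bbox 40 30 300 60'>
<span class='ocrx_word' title='bbox 40 30 150 60'>FORM</span>
<span class='ocrx_word' title='bbox 160 30 300 60'>A-12</span>
</span>
<span class='ocr_line' title='bbox 500 900 760 940'>
<span class='ocrx_word' title='bbox 500 900 760 940'>Signature</span>
</span>
</div>
"""


class WordBoxParser(HTMLParser):
    """Collects (text, (x1, y1, x2, y2)) for each ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._box = None   # bbox of the word currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "ocrx_word" in (attrs.get("class") or "").split():
            m = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", attrs.get("title") or "")
            if m:
                self._box = tuple(int(v) for v in m.groups())
                self._text = []

    def handle_data(self, data):
        if self._box is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._box is not None:
            self.words.append(("".join(self._text).strip(), self._box))
            self._box = None


def find_word(words, pattern):
    """Return the box of the first word fully matching the regex, or None."""
    for text, box in words:
        if re.fullmatch(pattern, text):
            return box
    return None


def crop_box(words, top_left_pattern, bottom_right_pattern):
    """Smallest rectangle covering both anchor words, or None if one is missing."""
    a = find_word(words, top_left_pattern)
    b = find_word(words, bottom_right_pattern)
    if a is None or b is None:
        return None
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
```

The resulting rectangle is in input-image pixels with a top-left origin, so it can be handed to whatever cropping routine you already use.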

Can I tell iText how to clip text to fit in a cell

When I call setFixedHeight() on a PdfPCell, and add more text than fits in the given height, iText seems to print the prefix of the string which fits.
Can I control this clipping algorithm? For example:
Print a suffix of the string rather than a prefix.
Mark a substring of the string as not to be removed. This is with footnote references. If I add text saying "Hello World [1]", the [1] is a reference to a footnote and should not be removed. It's okay to remove the other characters of the string, like "World".
When there are multiple words in the string, iText seems to eliminate a word that doesn't fit, while I would like it partially printed. That is, if the string is "Hello World", and the cell has room only for "Hello Wo...", I would like that to be printed, rather than just "Hello", as iText prints.
Rather than printing characters in their entirety, print only part of them. Imagine printing the text to a PNG and chopping off the top and/or bottom part of the PNG to fit it in the space available. For example, notice that the top line and the bottom line are partially clipped here:
Are any of these possible? Does iText give me any control over how text is clipped? Thanks.
This is with reference to iText 2.1.6.
I have written a proof of concept, ClipCenterCellContent, where we try to fit the text "D2 is a cell with more content than we can fit into the cell." in a cell that is too small.
Just like in your other question ( iText -- How do I get the rendered dimensions of text? ), we add the content using a cell event, but we now add it twice: once in simulation mode (to find out how much space is needed vertically) and once for real (using an offset).
This adds the content in simulation mode (we use the width of the cell and an arbitrary height):
PdfContentByte canvas = canvases[PdfPTable.TEXTCANVAS];
ColumnText ct = new ColumnText(canvas);
ct.setSimpleColumn(new Rectangle(0, 0, position.getWidth(), -1000));
ct.addElement(content);
ct.go(true);
float spaceneeded = 0 - ct.getYLine();
System.out.println(String.format("The content requires %s pt whereas the height is %s pt.", spaceneeded, position.getHeight()));
We now know the needed height and we can add the content for real using an offset:
float offset = (position.getHeight() - spaceneeded) / 2;
System.out.println(String.format("The difference is %s pt; we'll need an offset of %s pt.", -2f * offset, offset));
PdfTemplate tmp = canvas.createTemplate(position.getWidth(), position.getHeight());
ct = new ColumnText(tmp);
ct.setSimpleColumn(0, offset, position.getWidth(), offset + spaceneeded);
ct.addElement(content);
ct.go();
canvas.addTemplate(tmp, position.getLeft(), position.getBottom());
In this case, I used a PdfTemplate to clip the content.
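The offset arithmetic is easy to check with plain numbers. This Python sketch only mirrors the two formulas used above. The function name centered_column and the sample sizes are made up.

```python
def centered_column(cell_height, space_needed):
    """Return (lower_y, upper_y) of the column inside a template of cell_height.

    When the content is taller than the cell, lower_y is negative and
    upper_y exceeds cell_height, so the template clips both ends equally.
    """
    offset = (cell_height - space_needed) / 2
    return offset, offset + space_needed


# A 30 pt cell holding 50 pt of content: the column spans -10 .. 40,
# so 10 pt is cut off at the top and 10 pt at the bottom.
lower, upper = centered_column(30, 50)
```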
I also have answers to your other questions, but I don't have the time to answer them right now.
For straight Text box clipping, I adapted the C# code given here
http://itextsharp.10939.n7.nabble.com/Limiting-Text-Width-using-PdfContentByte-td2481.html
to the Java code below. The clipping area ends up outside this rectangle, so you can still draw a rectangle on the same exact coordinates.
cb.saveState();
cb.rectangle(left,top,width,height);
cb.clip();
cb.newPath();
// perform clipped output here
cb.restoreState();
I used a try/finally to ensure restoreState() was called.

iOS: Pdf scanner get coordinates of text

I am using CGPDFScanner to scan a PDF. Should I use the Td operator to find the positions of text? Could I have an example of how to use this operator to get the positions of the text? Currently I use the Tj and TJ operators to find the text. Now I would like to know the position of each word on a single page of the PDF. How can I do that?
Thanks
Take a look at this library:
https://github.com/KurtCode/PDFKitten/
search and highlight text
To get the coordinates of the text you need to keep track of the text transformation matrix. See section 5.3.1, "Text Positioning Operators" of the PDF 1.4 Reference. (I'm not sure if later versions of the reference number things the same or not.) While the Td operator will set the current translation in the text matrix, there are other operators that affect the text matrix and other text state, as well. You need to keep track of the text matrix as the file is processed. The Tm operator will directly set the text matrix. The TD operator moves to the next line and offsets by the x and y parameters. T* just moves to the next line.
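To show what keeping track of the text matrix means in practice, here is a minimal Python sketch of the bookkeeping for BT, Tm, Td, TD, TL and T*. The class and method names are mine. It deliberately ignores the page's current transformation matrix, font size, and the horizontal advance caused by Tj/TJ, all of which a real extractor must handle too. In a CGPDFScanner setup you would call the matching method from each operator callback.

```python
IDENTITY = (1, 0, 0, 1, 0, 0)  # PDF matrix as (a, b, c, d, e, f)


class TextState:
    """Tracks the text matrix (tm) and text line matrix (tlm)."""

    def __init__(self):
        self.leading = 0  # set by TL or TD, used by T*
        self.tm = self.tlm = IDENTITY

    def begin_text(self):
        """BT: both matrices reset to identity."""
        self.tm = self.tlm = IDENTITY

    def set_matrix(self, a, b, c, d, e, f):
        """Tm: replace the text matrix and the line matrix."""
        self.tm = self.tlm = (a, b, c, d, e, f)

    def move_text(self, tx, ty):
        """Td: translate by (tx, ty) in text space, relative to the line start."""
        a, b, c, d, e, f = self.tlm
        self.tlm = (a, b, c, d, tx * a + ty * c + e, tx * b + ty * d + f)
        self.tm = self.tlm

    def move_text_set_leading(self, tx, ty):
        """TD: same as Td, but also sets the leading to -ty."""
        self.leading = -ty
        self.move_text(tx, ty)

    def set_leading(self, leading):
        """TL: set the leading used by T*."""
        self.leading = leading

    def next_line(self):
        """T*: move down by the current leading."""
        self.move_text(0, -self.leading)

    def origin(self):
        """Current text origin, before applying the page's CTM."""
        return self.tm[4], self.tm[5]
```

Calling origin() at each Tj/TJ gives the start position of that string in text space. Multiplying by the CTM is still needed to get page coordinates.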

Core Text - select text in iPhone?

I need to render rich text using Core Text in my view (simple formatting, multiple fonts in one line of text, etc.). I am wondering whether text rendered this way can be selected by the user with the standard copy/paste function?
I implemented text selection in Core Text. It is really hard work, but it's doable.
Basically you have to save all CTLine rects and origins using CTFrameGetLineOrigins(1), CTLineGetTypographicBounds(2), CTLineGetStringRange(3) and CTLineGetOffsetForStringIndex(4).
The line rect can be calculated using the origin(1), ascent(2), descent(2) and offset(3)(4) as shown below.
lineRect = CGRectMake(origin.x + offset,
                      origin.y - descent,
                      offset,
                      ascent + descent);
After doing that, you can test which line contains the touched point by looping over the lines (always remember that Core Text uses an inverted Y axis).
Knowing the line that has the touched point, you can know the letter that is located at that point (or the nearest letter) using CTLineGetStringIndexForPosition.
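The line lookup itself is simple once the rects are stored. The Python sketch below shows only the logic, not Core Text calls. The function name line_at_point is hypothetical, and it assumes the saved rects use Core Text's bottom-left origin while the touch point uses UIKit's top-left origin, which may differ if your context is already flipped.

```python
def line_at_point(line_rects, touch_x, touch_y, view_height):
    """Return the index of the line rect containing the touch, or None.

    line_rects holds (x, bottom, width, height) tuples in bottom-left-origin
    coordinates; (touch_x, touch_y) is in top-left-origin view coordinates.
    """
    y = view_height - touch_y  # flip into the rects' coordinate system
    for index, (x, bottom, width, height) in enumerate(line_rects):
        if x <= touch_x <= x + width and bottom <= y <= bottom + height:
            return index
    return None
```

With the line index in hand, the character lookup would then go through CTLineGetStringIndexForPosition as described above.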
Here's one screenshot.
For that loupe, I used the code shown in this post.
Edit:
To draw the blue background selection, you have to paint the rect using CGContextFillRect. Unfortunately, there's no background color in NSAttributedString.