How to get coordinates of recognized characters - tesseract

I have a very simple OCR app based on Tesseract. After the recognition
step, I also provide a user verification step that allows correction
in case OCR is wrong. To improve the user interface, I plan to draw a
rectangle on top of the OCR-ed character on the original input image,
and put it side by side with the OCR output. To get to that, I need
the coordinates of the recognized characters.
I tried something like this but it seems to give me gibberish:
ETEXT_DESC output;
tess->Recognize(&output);
text = tess->GetUTF8Text();
Now if I access output.count, it gives me some value above 10,000,
which is obviously wrong because the whole image only has 20 or so characters.
Am I on the right track? Can I have some direction please?

It may help to get the coordinates of the character boxes.
Try the tesseract executable with the command
"tesseract.exe [image] [output] makebox"
Afterwards you get the coordinates of each character, one per row, which you can then compare against the image.
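As a rough illustration of using that output, here is a minimal Python sketch that parses one row of a box file. It assumes the usual box file layout of "char left bottom right top page" with the y axis measured from the bottom edge of the image; the function name parse_box_line and the returned dictionary keys are made up for this example.

```python
def parse_box_line(line, image_height):
    """Parse one 'char left bottom right top page' row of a Tesseract box file."""
    # Split from the right so the glyph itself stays intact as the first field.
    parts = line.strip().rsplit(" ", 5)
    glyph = parts[0]
    left, bottom, right, top, page = (int(p) for p in parts[1:])
    # Assumption: box files measure y from the bottom edge of the image,
    # so flip to a top-left origin before drawing a rectangle.
    return {
        "char": glyph,
        "x": left,
        "y": image_height - top,
        "width": right - left,
        "height": top - bottom,
        "page": page,
    }
```

The flipped rectangle can then be drawn directly on the original image next to the recognized text.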

The tesseract executable has an hocr option that outputs the recognized characters and their coordinates in HTML format. To get this programmatically, the FAQ says to refer to baseapi.h.

Related

Way of formatting Word so that a line occur at a given distance from page border?

I don't know how to properly specify this question, but basically I would like to format a document like specified here : http://etd.lib.hku.hk/thesis-form/Theses%20Binding%20specification(ed%20January%202018)c.pdf
It's on the 3rd page of the PDF document. So I need to input a line at the exact distance from the top border, while another line occurs at some exact distance from the bottom.
Does MS Word give the flexibility to do this?
Thanks!
Someone pointed out that this question is off-topic for this forum, but in case someone lands here anyway: using multiple text boxes can be a solution.
You can adjust each text box's position relative to the page and thus achieve the formatting needed.

Text region extraction by finding co-ordinates of text from an image

I am developing image processing software that extracts/crops and enhances a single-page form from an image taken with a cellphone camera. The form has no rectangular boundaries to simplify the extraction. It is black text on a white background, but nothing apart from that is fixed. Some text will be present that verifies the image is of the required form. So my questions are these.
1) Can I search for a specific regular expression using the Leptonica library itself, or do I have to shift focus to other libraries like the Tesseract API to do this? So far I have not found anything of this sort.
2) Suppose I know the text at the top left corner and the bottom right corner and I find it successfully. Can I get the coordinates of the particular text that I am searching for and then crop the image accordingly?
Leptonica doesn't do anything with text, it's an image processing library.
To get the position of the text, add tessedit_create_hocr 1 to your Tesseract config file (or set this option whichever way you configure Tesseract if you're using it as a library).
The result is no longer a text file, but a UTF-8-encoded HTML file (note: it's not valid XML). Its format is self-explanatory. It will contain the positions and dimensions of all words on all pages in pixels, as found on the input image. You need to parse that HTML, find the words you're looking for, and then get the bounding boxes of those words.
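The parsing step could look roughly like the following Python sketch. It assumes each word is a span with class ocrx_word whose title attribute contains "bbox x0 y0 x1 y1", which is how hOCR output is commonly laid out; the names HocrWords and find_word_boxes are invented for this example, and spans nested inside a word span are not handled.

```python
import re
from html.parser import HTMLParser

BBOX = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")

class HocrWords(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for every word span in hOCR output."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bounding box of the word span currently open
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: word elements carry the class "ocrx_word".
        if "ocrx_word" in (attrs.get("class") or "").split():
            match = BBOX.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None

def find_word_boxes(hocr_text, target):
    """Return the bounding boxes of every word equal to `target`."""
    parser = HocrWords()
    parser.feed(hocr_text)
    return [box for text, box in parser.words if text == target]
```

With the boxes of the top-left and bottom-right marker words in hand, the crop rectangle is just the span from the first box's (x0, y0) to the second box's (x1, y1).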

Matlab imwrite() quality

I'm very new to Matlab, though I know a few other programming languages, so please forgive me if this is something simple. I have not been able to find any answers to this, either on StackOverflow or elsewhere.
I produce a figure using the following code:
figure(6),imageplot(P); drawnow;
Which looks like this:
I then save this image to my computer using the following commands:
imwrite(P, 'images/plot.png');
And the resulting image is tiny, and missing some of the color information:
If, however, I utilize the save function in the open figure (image #1) and save it manually, I get exactly what I want, which is that exact image stored on my computer.
How would I program that? I assumed that imwrite() would just write the image directly, but apparently I'm doing something wrong. Any advice? Perhaps it has something to do with the imageplot command? I cannot seem to get that to work in imwrite.
Update: Based on the comments below, I have begun using "imresize" with the "nearest" option. This scales the image properly, but the resulting image is still curiously darker (and therefore has less information) than if I hit the "save" button in the figure.
Image saved from figure:
Image using "imresize" with "nearest" option:
The MATLAB imwrite command saves exactly the number of pixels specified in your image matrix. This is the actual result of your computation; the output is "tiny" because it is supposed to be. To make it larger, simply scale/zoom it as required.
The save figure option, however, does something quite different: it rasterizes the output shown in the figure window and lets you save that as an image. This is evident from the fact that the saved file has a white background in addition to your result, which is really just the grey background you see before you save it; this can be adjusted by resizing the figure window before using the save option.
If you're looking to simply make the output figure larger, I would recommend using something along the lines of the imresize command.
Say, if you want the default size to be twice the size of the real result, simply use:
imresize(P, 2.0);
For more options, try help imresize.
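To make the idea of scaling the matrix itself concrete, here is a small Python sketch (not MATLAB, and not how imresize is implemented) of nearest-neighbour upscaling by an integer factor. The function name upscale_nearest is hypothetical and the image is assumed to be a plain list of rows.

```python
def upscale_nearest(pixels, factor):
    """Repeat every pixel `factor` times horizontally and vertically."""
    out = []
    for row in pixels:
        # Stretch the row horizontally by repeating each value.
        wide = [value for value in row for _ in range(factor)]
        # Stretch vertically by emitting the widened row `factor` times.
        out.extend(list(wide) for _ in range(factor))
    return out
```

A 2x2 matrix scaled by 2 becomes 4x4 with the same pixel values, which is why the saved file grows while the information content stays the same.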
The command you need for the "Save As..." functionality of figures is called "print". I often use
print(gcf, '-dpng', 'some_filename.png')
or
print(gcf, '-depsc', 'some_filename.eps','-r0')
to save a figure as it is shown on screen. The png format offers a small file size and excellent quality, and it is understood by most image viewers and browsers. The eps format is a vector format, which I use for printing. The '-r0' option specifies "use the same size as given by the screen resolution" for the vector format properties.

Is it possible to determine the (pixel) width of text strings before SVGs are created with scripts?

I am about to create a bunch of SVG graphics with (probably) a Perl script. These SVG graphics will contain text blocks. Since I want to "connect" such text blocks (of varying widths) with lines, I'd like to know how wide a text will be so that I can set the length of the connecting lines accordingly.
I have seen in SVG get text element width that it could be possible with java script. But that's probably not what I am after since I don't intend to host the SVG in a browser.
So I thought that maybe there's a way to find out the desired width at the script's runtime. If someone can point me to a solution (also outside the realm of Perl, but on Windows), I'd be very grateful.
I did exactly that about a year ago using PDF::API2 and its advancewidth function: https://metacpan.org/module/PDF::API2::Content#width-txt-advancewidth-string-text_state-
Note that you need to reconcile the DPI of PDF and SVG, since they may differ (I simply divided the values by 1.25; you can probably do better).
PDF::API2 gives very accurate values that worked well with Inkscape in my case.
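The unit conversion can be sketched as follows in Python. This assumes the measured width is in PDF points (72 per inch) and that the SVG renderer uses 90 user units per inch, which is where a ratio of 1.25 would come from; both values are assumptions, so check them against your renderer and decide which direction the conversion needs to go. The function name points_to_pixels is made up for this example.

```python
POINTS_PER_INCH = 72.0  # assumption: widths are measured in PDF points

def points_to_pixels(width_pt, dpi=90.0):
    """Convert a width in points to pixels at the given resolution."""
    # 90 / 72 = 1.25, the ratio mentioned above; pass dpi=96.0 for
    # renderers that follow the CSS convention of 96 pixels per inch.
    return width_pt * dpi / POINTS_PER_INCH
```

Going the other way (pixels to points) is the same formula inverted, that is, a division by 1.25 at 90 DPI.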

Perl+Image::Magick usage: how to assemble several areas in one image into a new image?

I'm new to ImageMagick and haven't figured out how to assemble several areas into a new image.
E.g., I know the "geometry" of the words "hello" and "world" in an image, and what I need to do
is to retrieve the word images and put them into one line image while keeping their relative positions.
Question1: Suppose I use the perl API, how should I use Composite() or other correct methods to do this?
my $geom = sprintf('%dx%d+%d+%d', $word->{width}, $word->{height}, $offsetx, $offsety);
$x = $lineimg->Composite($wordimg, $geom);
warn "$x" if "$x";
Suppose $lineimg's size is big enough to hold all word images and the geometry has been computed.
This code produces a complaint from ImageMagick:
Exception 410: composite image required `Image::Magick' # Magick.xs/XS_Image__Magick_Mogrify/7790 ...
Question2: Currently I only know how to crop a word image out of the original one and then use the Clone() method
to restore the original image. Is there a way to copy, instead of crop, a specific area from an image? That
would save the time spent copying the whole image back and forth several times.
Does anybody know how to write this kind of processing? I appreciate all your help and suggestions!
-Jin
From the sounds of things, Image::Magick is not the right tool for your task. Image::Magick is generally for manipulating entire images: filtering, scaling, converting between formats, etc.
Consider the GD module, which can do just about any of the drawing operations you will need.
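Whichever library does the copying, the geometry arithmetic for keeping the words' relative positions is the same. Here is a minimal Python sketch of it, assuming each word is given as (x, y, width, height) in the source image; the function name layout_words is hypothetical, and the strings follow the usual WxH+X+Y geometry notation.

```python
def layout_words(boxes):
    """Compute the line image size and per-word crop and paste geometry.

    boxes: list of (x, y, width, height) tuples in source image coordinates.
    """
    left = min(x for x, y, w, h in boxes)
    top = min(y for x, y, w, h in boxes)
    right = max(x + w for x, y, w, h in boxes)
    bottom = max(y + h for x, y, w, h in boxes)
    placements = []
    for x, y, w, h in boxes:
        crop = "%dx%d+%d+%d" % (w, h, x, y)        # region to copy from the source
        offset = "+%d+%d" % (x - left, y - top)    # where to paste it in the line image
        placements.append((crop, offset))
    return (right - left, bottom - top), placements
```

The first return value is the smallest line image that holds all words; each offset keeps a word at the same distance from its neighbours as in the original.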