I am now using Tesseract to recognize characters, and I see that Tesseract has the outline of each recognized character. My question is: can I convert that outline to a polyline? That way, if a character cannot be recognized, I could still use the polyline to draw something shaped like the character.
Tesseract's ResultIterator class (see the API examples) can produce the coordinates of the bounding boxes of the recognized characters; however, the boxes are rectangles, not contours of the character shapes.
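As a sketch of one way to get from those rectangles to an actual outline: read the per-symbol boxes through a Tesseract wrapper, crop each box from the page image, and trace the glyph contour as a polyline with OpenCV. This is a minimal Python example, not part of Tesseract itself; the tesserocr wrapper, the file name, and the approxPolyDP tolerance are assumptions.

import cv2
from tesserocr import PyTessBaseAPI, RIL, iterate_level  # assumed wrapper

img = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)       # hypothetical input

with PyTessBaseAPI() as api:
    api.SetImageFile('page.png')
    api.Recognize()
    it = api.GetIterator()
    for sym in iterate_level(it, RIL.SYMBOL):            # one symbol at a time
        box = sym.BoundingBox(RIL.SYMBOL)                # rectangle only
        if box is None:
            continue
        x1, y1, x2, y2 = box
        glyph = img[y1:y2, x1:x2]
        # Binarize the crop and trace its outer contour as a polyline.
        _, bw = cv2.threshold(glyph, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            poly = cv2.approxPolyDP(contours[0], 1.0, True)  # polyline vertices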
Please suggest how to connect the dotted pixels in an image like the one below:
Original Image
I want to apply OCR to this image. I have tried some morphological operations, such as thickening and bridging, but I am not obtaining the expected output (NH5343320).
The original image is also uploaded. Applying horizontal edge detection to the original image gave me the dotted image above. Are there any other methods for applying OCR to this kind of image?
I would crop out and fill in a template for each of the available characters, presumably the letters [A-Z] and the digits [0-9], like this:
0.png
3.png
Now I would do a sub-image search for each of them in your original image. I am doing this at the command line with ImageMagick, but you could use Matlab, OpenCV, or CImg, or the Python, Perl, PHP, C, or C++ bindings of ImageMagick.
So, I look for the 3 first:
compare -metric rmse -dissimilarity-threshold 1 -subimage-search plate.png 3.png result.png
25607.9 (0.390752) # 498,46
So, the 3 is found at coordinates 498,46. There will be two output files: result-0.png, which looks like this:
and result-1.png, in which the brightest areas show where the match is best:
Likewise with the 0:
compare -metric rmse -dissimilarity-threshold 1 -subimage-search plate.png 0.png result.png
31452.6 (0.479936) # 664,44
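If you would rather do the same sub-image search programmatically, here is a rough Python/OpenCV sketch of the idea. The file names match the example above; TM_CCOEFF_NORMED is just one reasonable similarity metric, not the RMSE metric that compare uses.

import cv2

plate = cv2.imread('plate.png', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('3.png', cv2.IMREAD_GRAYSCALE)

# Slide the template over the plate; higher score = better match
# (note that compare's RMSE works the other way round: lower = better).
scores = cv2.matchTemplate(plate, template, cv2.TM_CCOEFF_NORMED)
_, best, _, top_left = cv2.minMaxLoc(scores)
print(best, top_left)   # top_left is the (x, y) of the best match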
I am developing image processing software that extracts/crops and enhances a single-page form from an image taken with a cellphone camera. The form has no rectangular boundary that would simplify the extraction. It is black text on a white background, but nothing apart from that is fixed. Some text will be present that verifies the image is of the required form. So my questions are these:
1) Can I search for a specific regular expression using the Leptonica library itself, or do I have to shift focus to other libraries like the Tesseract API to do this? So far I have not found anything of this sort.
2) Now suppose I know the text at the top-left corner and the bottom-right corner, and I search for it successfully. Can I get the coordinates of that particular text and then crop the image accordingly?
Leptonica doesn't do anything with text; it's an image processing library.
To enable acquiring the position of the text, add tessedit_create_hocr 1 to your Tesseract config file (or set this option in whichever way you configure Tesseract if you're using it as a library).
The result is no longer a plain-text file but a UTF-8-encoded HTML file (note: it's not valid XML). Its format is self-explanatory: it contains the positions and dimensions, in pixels, of all words on all pages, as found on the input image. You need to parse that HTML, find the words you're looking for, and then read the bounding boxes of those words.
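As an illustration of that parsing step, here is a minimal Python sketch assuming BeautifulSoup is available; the file name and the anchor word are hypothetical placeholders.

import re
from bs4 import BeautifulSoup

with open('output.hocr', encoding='utf-8') as f:
    soup = BeautifulSoup(f, 'html.parser')

# Each recognized word is a span of class ocrx_word whose title attribute
# carries its pixel bounding box, e.g. title="bbox 772 309 911 339; x_wconf 91".
for word in soup.find_all('span', class_='ocrx_word'):
    if word.get_text(strip=True) == 'FORMTITLE':          # hypothetical anchor text
        m = re.search(r'bbox (\d+) (\d+) (\d+) (\d+)', word.get('title', ''))
        if m:
            x1, y1, x2, y2 = map(int, m.groups())         # crop the image with these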
I'm doing character recognition for a regional language. While extracting the characters from the image, the dots are being identified as separate characters.
%% Plot bounding boxes
% Assumed context, per the linked script: imagen is the binary page image,
% [L, Ne] = bwlabel(imagen) labels its connected components, and
% propied = regionprops(L, 'BoundingBox') holds their bounding boxes.
for n = 1:size(propied, 1)
    rectangle('Position', propied(n).BoundingBox, 'EdgeColor', 'g', 'LineWidth', 2)
end
hold off
%% Characters being extracted
figure
for n = 1:Ne
    [r, c] = find(L == n);                       % pixels of component n
    n1 = imagen(min(r):max(r), min(c):max(c));   % crop its bounding box
    imshow(~n1);                                 % show the extracted character
end
Original code: http://www.mathworks.com/matlabcentral/fileexchange/22922-image-segmentation-extraction
Since you are doing character/text recognition, you are more likely to want collections of words or lines of text rather than individual characters. And if you really want the latter, it is more robust to extract characters after you have identified the individual words.
So, the simplest approach here is the standard morphological opening operator (assuming the text is black; otherwise use closing). Start with a large horizontal structuring element (SE). Applying an opening with this SE will divide your image into lines of text. In each line, use a shorter horizontal SE to obtain the individual words. Then, for each word, apply an opening with a vertical SE so that it joins accents and other typographical details.
For example, here is an input image, its opening with a horizontal SE of radius 35, the opening with a horizontal SE of radius 7, and an opening with a vertical SE of radius 7.
I didn't apply the third operation to isolated components, but you should do so to avoid the risk of joining two lines of text. And this all assumes straight horizontal lines of text, of course. Drawing the bounding boxes on this final image gives the result you are after:
Note that some letters ("ty" and "ny") were connected to begin with, so they appear as a single letter in this output. That is a separate problem to handle, which may or may not be a concern for you.
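Here is a rough Python/OpenCV sketch of the word-segmentation steps described above. The SE sizes only loosely correspond to the radii mentioned, closing the inverted (text-white) image plays the role of opening the original black-on-white one, and the per-line safeguard from the previous paragraph is skipped for brevity.

import cv2

img = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)
_, bw = cv2.threshold(img, 0, 255,
                      cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)  # text -> white

line_se = cv2.getStructuringElement(cv2.MORPH_RECT, (71, 1))  # wide horizontal SE
word_se = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 1))  # short horizontal SE
acc_se = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 15))   # vertical SE

lines = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, line_se)   # blobs = lines of text;
                                                         # use this to treat each
                                                         # line separately, as above
words = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, word_se)   # blobs = words
words = cv2.morphologyEx(words, cv2.MORPH_CLOSE, acc_se) # attach accents/dots

contours, _ = cv2.findContours(words, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(img, (x, y), (x + w, y + h), 0, 2)     # draw the word boxes
cv2.imwrite('boxes.png', img)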
Given an image consisting of black lines (a few pixels wide) on a white background, what is a good way to find the coordinates along the lines, say for every 10th pixel or so? I am considering using PIL for the task, but other Python- or Java-based libraries would also be OK.
Ideally the coordinates would point to the middle of the line, but as the lines are narrow, it's enough that they point somewhere inside the line.
A very short line or a point should be identified with at least one coordinate.
Usually, the Hough transform is used to find lines. It gives you the parameters describing each line (which can easily be converted between different representations), and you can sample that function to get your sample points. See http://en.wikipedia.org/wiki/Hough_transform and https://stackoverflow.com/questions/tagged/hough-transform+python
I only found this Python implementation, http://coding-experiments.blogspot.co.at/2011/05/ellipse-detection-in-image-by-using.html, which actually searches for ellipses.
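For illustration, a small Python/OpenCV sketch along those lines, using the probabilistic Hough transform and then sampling each detected segment roughly every 10 pixels; the file name and thresholds are placeholders to tune for your images.

import cv2
import numpy as np

img = cv2.imread('lines.png', cv2.IMREAD_GRAYSCALE)
_, bw = cv2.threshold(img, 0, 255,
                      cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)  # lines -> white

# Probabilistic Hough returns endpoints (x1, y1, x2, y2) of detected segments.
segments = cv2.HoughLinesP(bw, 1, np.pi / 180, threshold=30,
                           minLineLength=5, maxLineGap=3)
points = []
if segments is not None:
    for x1, y1, x2, y2 in segments[:, 0]:
        n = max(1, int(np.hypot(x2 - x1, y2 - y1) // 10))   # one sample per ~10 px
        for t in np.linspace(0.0, 1.0, n + 1):
            points.append((round(x1 + t * (x2 - x1)),
                           round(y1 + t * (y2 - y1))))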
I have a very simple OCR app based on Tesseract. After the recognition step, I also provide a user verification step that allows correction in case OCR is wrong. To improve the user interface, I plan to draw a rectangle on top of each OCR-ed character on the original input image, and put it side by side with the OCR output. To get there, I need the coordinates of the recognized characters.
I tried something like this but it seems to give me gibberish:
ETEXT_DESC output;                 // note: ETEXT_DESC is a progress monitor,
                                   // not the recognition result
tess->Recognize(&output);
char* text = tess->GetUTF8Text();  // the recognized text itself
Now if I access output->count, it gives me some value above 10,000, which is obviously wrong because the whole image has only about 20 characters.
Am I on the right track? Could someone point me in the right direction, please?
Maybe it's helpful to get the coordinates of the boxes.
Try the Tesseract executable. Use the command
"tesseract.exe [image] [output] makebox"
Afterwards you get the coordinates of each character, one per row. Then you are able to compare.
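Each row of the resulting .box file has the form "symbol left bottom right top page", with coordinates measured from the bottom-left corner of the image. A tiny Python sketch to read it (file name assumed):

boxes = []
with open('output.box', encoding='utf-8') as f:
    for row in f:
        # e.g. "W 52 33 70 50 0" -> character plus its box on page 0
        ch, left, bottom, right, top, page = row.split()
        boxes.append((ch, int(left), int(bottom), int(right), int(top)))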
The tesseract executable also has an hocr option to output the recognized characters and their coordinates in HTML format. To get this programmatically, the FAQ says to refer to baseapi.h.