How to define regions on an image and pass to tesseract-ocr? - command-line

How can I define regions on an image, then pass this data to Tesseract-OCR's command-line, so that only text within the defined regions would be extracted?
I'm guessing this may be similar to the use of an image-map in HTML.
Thanks in advance for your responses.

I found out how to pass in regions on an image to Tesseract.
Although it cannot be done through the command line, the Tesseract 3.02 API supports the function SetRectangle(int left, int top, int width, int height) that allows you to restrict the text extraction to the region specified.
It must be called after the SetImage() function.
Thanks again.

Related

Psychtoolbox - Filloval

I am new to Matlab and Psychtoolbox. I need to change color saturation. When creating a circle with Screen('FillOval', window, ...), is there a way to get a handle to the oval object, and is it rendered as an image? Thanks in advance.
Unfortunately, (as far as I know) the FillOval function doesn't create a handle like you would be used to with matlab figures / patches. The best way to change the color is simply with the RGB index argument.
If you forget the arguments that belong in Psychtoolbox functions, type the name with a question mark to see the help file. In this case, type this in the command line:
Screen('FillOval?')
The arguments are:
Screen('FillOval', windowPtr [,color] [,rect] [,perfectUpToMaxDiameter]);
If I wanted to change the saturation, I would just redraw the oval and change the RGB values passed to the FillOval function, e.g. put in [255,0,0] on the first flip and [255,50,50] on the second.
It sounds like you may want to opt for the "MakeTexture" and "DrawTexture" functions. With "MakeTexture" you can take any image matrix and turn it into a texture handle. With "DrawTexture" you can then draw the image into the Psychtoolbox window. DrawTexture is nice because it allows you to easily change the opacity of the texture.
I recommend exploring the help functions to learn more about this option.

CTLine or NSAttributedString -- get image bounds without a graphics context?

Is this possible? Basically, I have a bunch of NSAttributedString objects and corresponding CTLine objects. I want to get the image bounds before the drawRect stage. So at this point, there is nothing to draw into. I will then use these image bounds to decide exactly what I need to create for drawing.
EDIT: Another measurement of the size would probably work just fine. But calling the deceptively named CTLineGetTypographicBounds function only returns the width. If I pass in addresses of ascent and descent floats, they come back as zero.
EDIT: The given answer works great in MacOS. Can anyone do it in iOS?
If you are developing for iOS 6+, you can use the following function:
CTLineRef line;
// Create the line...
CGRect bounds = CTLineGetBoundsWithOptions(line, kCTLineBoundsUseGlyphPathBounds);
// use bounds...
This gives the same bounds as CTLineGetImageBounds(), assuming you have no transforms applied in your context, but does not require the context. For iOS 5 and below, you would need to use the method described by Иван.
CTLineGetTypographicBounds() gives me a different width than this function or image bounds. I am not sure why. And the ascent and descent returned are those of the font and not the characters displayed in the CTLineRef.
Yes, you can, but it is not so easy.
Generally, you should use CTLineGetTypographicBounds() which, for me, does return ascent, descent and leading, but a bit messed up: 'ascent' equals the total height (i.e. what should be ascent + descent) and 'descent' is always the maximum descent of the font, no matter whether you have descending characters or not.
Another way is to retrieve the CTRun(s) from the line (CTLineGetGlyphRuns), then get the glyphs array (CTRunGetGlyphs or CTRunGetGlyphsPtr), and then use CTFontGetBoundingRectsForGlyphs and CTFontGetAdvancesForGlyphs to build up the information you need.
EDIT:
I've just found this method: "- (NSRect) boundingRectWithSize:(NSSize)size options:(NSStringDrawingOptions)options" of NSAttributedString which seems to do exactly what is needed.
Hope this is helpful...
Bounds returned by CTLineGetTypographicBounds() are not the same as image bounds. As the name (and Иван's answer) suggests, ascent etc. are defined for the font and won't change based on the string. For example, you would use it to find the correct line height for multiline text, as line height normally should not depend on the exact characters you use.
CTLineGetImageBounds(), on the other hand, returns the bounds that exactly fit the image. For example, if you want to draw a box around a single line, this is what you need.
CTLineGetImageBounds() needs a context because there may be text transforms and things like that. If you don't want to worry about that, just use a dummy context. For example:
CTLineRef line;
// create the line...
UIGraphicsBeginImageContext(CGSizeMake(1, 1));
CGContextRef context = UIGraphicsGetCurrentContext();
CGContextSetTextPosition(context, 0, 0);
CGRect bounds = CTLineGetImageBounds(line, context);
UIGraphicsEndImageContext();
// use bounds...
Another method is to convert the string to glyphs using CTFontGetGlyphsForCharacters() and then call CTFontGetBoundingRectsForGlyphs() with the glyph array you get from the first function. The latter function returns "the overall bounding rectangle for the glyph run", so you don't need to do any processing on the individual bounding rects. I've used both these functions successfully in iOS.
If you do this, remember that the mapping between glyphs and characters is not always one to one, especially when the string has non-English characters.

How to get coordinates of recognized characters

I have a very simple OCR app based on Tesseract. After the recognition
step, I also provide a user verification step that allows correction
in case OCR is wrong. To improve the user interface, I plan to draw a
rectangle on top of the OCR-ed character on the original input image,
and put it side by side with the OCR output. To get to that, I need
the coordinate of the recognized characters.
I tried something like this but it seems to give me gibberish:
ETEXT_DESC output;
tess->Recognize(&output);
text = tess->GetUTF8Text();
Now if I access output.count, it gives me some value above 10,000,
which is obviously wrong because the whole image only has 20 or so characters.
Am I on the right track? Can I have some direction please?
Maybe it's helpful to get the coordinates of the boxes.
Try the executable of tesseract. Use the command
"tesseract.exe [image] [output] makebox"
Afterwards you get the coordinates of each character, one per row. Then you are able to compare.
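To use those coordinates for drawing rectangles, the box file has to be read back in. The following is a minimal Java sketch, assuming each row has the form "<glyph> <left> <bottom> <right> <top> <page>" with y measured from the bottom edge of the image (check this against the file your Tesseract version writes). BoxFileParser and CharBox are hypothetical names, not part of Tesseract.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Tesseract): parses box-file rows of the
// assumed form "<glyph> <left> <bottom> <right> <top> <page>".
class BoxFileParser {

    /** One character box, converted to a top-left origin for drawing. */
    static final class CharBox {
        final String glyph;
        final int x, y, width, height;

        CharBox(String glyph, int x, int y, int width, int height) {
            this.glyph = glyph;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Parses a single row; imageHeight is needed to flip the y axis. */
    static CharBox parseLine(String line, int imageHeight) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 5) {
            throw new IllegalArgumentException("Malformed box line: " + line);
        }
        int left = Integer.parseInt(parts[1]);
        int bottom = Integer.parseInt(parts[2]);
        int right = Integer.parseInt(parts[3]);
        int top = Integer.parseInt(parts[4]);
        // Assumption: box files measure y from the bottom edge, so flip it
        // for drawing APIs whose origin is the top-left corner.
        return new CharBox(parts[0], left, imageHeight - top, right - left, top - bottom);
    }

    /** Parses all non-empty rows of a box file. */
    static List<CharBox> parse(List<String> lines, int imageHeight) {
        List<CharBox> boxes = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                boxes.add(parseLine(line, imageHeight));
            }
        }
        return boxes;
    }
}
```

Each CharBox can then be drawn over the original image next to the recognized text for the verification step.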
The tesseract executable has an option hocr to output recognized characters and their coordinates in HTML format. To get this programmatically, the FAQ says to refer to baseapi.h.

Get image width and height in pixels

I have looked at a couple of other questions like this, and from what I saw none of the answers addressed my question. I created a program that creates ASCII art, which is basically a picture made of text instead of colors. The way the program is set up at the moment, you have to manually set the width and height in pixels. If the width and height are too large, it simply won't work. So basically what I want is a function that automatically sets the width and height to the size of the picture. http://www.mediafire.com/?3nb8jfb8bhj8d is the link to the program now. I looked into PixelGrabber, but the constructors all needed a range of pixels. I also have another folder for the classes, http://www.mediafire.com/?2u7qt21xhbwtp
On another note, this program is incredibly inefficient. I know that the inefficiency is in the grayscaleValue() method, but I don't know if there is a better way to do this. Any suggestions on this program would be awesome too. Thanks in advance! (This program was all done in Eclipse.)
After you read the image into your BufferedImage, you can call getWidth() and getHeight() on it to get this information dynamically. See the JavaDocs. Also, use a constructor for GetPixelColor to create the BufferedImage once and for all. This will avoid reading the entire file from disk for each channel of each pixel.
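A minimal sketch of that idea follows. PixelReader is a hypothetical stand-in for the asker's GetPixelColor class, and the plain channel average is only an assumed grayscale formula, since the original grayscaleValue() is not shown. The image is read once, kept in an instance field, and the dimensions come from the BufferedImage itself.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical stand-in for GetPixelColor: holds one image per instance.
class PixelReader {
    private final BufferedImage image; // instance state, deliberately not static

    PixelReader(BufferedImage image) {
        if (image == null) {
            throw new IllegalArgumentException("image must not be null");
        }
        this.image = image;
    }

    /** Reads the file from disk exactly once. */
    static PixelReader fromFile(File file) throws IOException {
        BufferedImage img = ImageIO.read(file); // null if no reader supports the format
        if (img == null) {
            throw new IOException("Unsupported image format: " + file);
        }
        return new PixelReader(img);
    }

    int width() {
        return image.getWidth();
    }

    int height() {
        return image.getHeight();
    }

    /** Assumed grayscale formula: plain average of the three channels. */
    int gray(int x, int y) {
        int rgb = image.getRGB(x, y);
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (r + g + b) / 3;
    }
}
```

Looping x from 0 to width() - 1 and y from 0 to height() - 1 then covers the whole picture without any manually entered size.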
For further code clean up, change series of if statements to a switch construct, or an index into an array, whichever is more natural. See this for an explanation of the switch construct.
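As an illustration of the "index into an array" option, here is a small sketch that maps a gray value to a character with one computed index instead of a chain of if statements. The ramp string and the AsciiRamp name are assumptions made for the example; the characters used by the original program are not shown.

```java
// Hypothetical lookup: replaces a chain of if statements with one index.
class AsciiRamp {
    // Assumed ramp, ordered from darkest to lightest.
    private static final String RAMP = "@#*+:. ";

    /** Maps a gray value in 0..255 to a character from the ramp. */
    static char charFor(int gray) {
        if (gray < 0 || gray > 255) {
            throw new IllegalArgumentException("gray out of range: " + gray);
        }
        // Scale 0..255 onto the valid indices 0..RAMP.length() - 1.
        int index = gray * (RAMP.length() - 1) / 255;
        return RAMP.charAt(index);
    }
}
```

Changing the look of the output then only means editing the ramp string, not the branching logic.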
One last comment: anything inside a class that logically represents the state of an object should be declared non-static. If, say, you wanted to render two images side by side, you would need to create two instances of GetPixelColor, and each one should have its own height and width attributes. Since they're currently declared static, every instance would share the same data, which is clearly not desirable behavior.

Matlab: Adding symbols to figure

Below is the user interface I have created to simulate LDPC coding and decoding.
The code sequence is decoded iteratively by passing values between the left and right nodes through the connections.
The first thing I would like to add to improve the visualization is arrows on the connections, pointing in the direction in which values are passed. The alternative is to draw a bigger arrow at the top of the connections showing the direction.
Another thing I would like to do is display the current mathematical operation below the connections (in this example c * H'). What I don't know is how to display special characters, mathematical symbols, and other kinds of text such as subscripts and superscripts in the figure (for example a sum sign, and a subscript "T" instead of the ' sign to indicate a transposed matrix).
I would be very thankful if anyone could point to any useful resources for the questions above or show the solution.
Thank you.
To add arrows, you can either use the built-in QUIVER, or, for more options, ARROW from the file exchange. Both of these have to be plotted into axes, so if you want a big arrow on the top, you have to create an additional set of axes above the main axes.
As far as I know, you cannot use TeX or LaTeX symbols in text uicontrols. However, you can use them in axes labels. Thus, I suggest that you add an XLabel to the axes, for example
xlabel('\Sigma c*H_T')
or (note the $-signs required for LaTeX)
xlabel('$\sum c*H_T$','interpreter','latex')
EDIT
I hadn't mentioned the use of text (as suggested by @gnovice and @YYC) because I thought it wasn't possible to place text outside of the axes. It turns out that I was wrong. text(0.5,-0.2,'\Sigma etc.') should work fine as well. I guess the only advantage of using 'xlabel' would be that you can easily add and position the axes label during GUI creation.
In regards to the 1st question, annotation (http://www.mathworks.com/access/helpdesk/help/techdoc/ref/annotation.html) might be an alternative solution.
In regards to the 2nd question, try text property in Matlab Help.
Search "Character Sequence" for the special characters; search "Specifying Subscript and Superscript Characters" for the subscript and superscript.
For drawing the arrow, I would go with Jonas' suggestion: arrow.m by Erik Johnson on the MathWorks File Exchange. It's the easiest way I've found to create arrows in figures.
For creating text with symbols, you can use the function TEXT. It lets you place text at a given point in an axes, and you can use the 'tex' (default) or 'latex' options for the 'Interpreter' property to get access to different symbols. For example, this places the text you want at the point (0,0) using 'latex' as the interpreter:
hText = text(0,0,'$\sum c*H_T$','Interpreter','latex');
The variable hText is a handle to the text object created, which you can then use with the SET command to change the properties of the object (string, position, etc.).