I understand that I can ask Tesseract to return text at the word, text-line, paragraph, or block level.
I need to form my own cluster of words, which may be a portion of a text line or include multiple lines. Once I have this cluster of words, I'd like to organize them from left-to-right, top-to-bottom for readability.
I assume Tesseract has this ability, since it returns the words of a text line or paragraph in the right order. Can I access this method from the tess4j API?
Or can someone point me to the algorithm so I can implement it on my own?
Thanks
Edit
Here's an example. Suppose my image has this block of text
John Doe Adam Paul Sara Johnson
Vice President Director of IT Head of Human Resources
jdoe@xyz.com apaul@xyz.com sjohnson@xyz.com
If I ask tess4j for textline level words, then I get 3 lines:
John Doe Adam Paul Sara Johnson
and
Vice President Director of IT Head of Human Resources
and
jdoe@xyz.com apaul@xyz.com sjohnson@xyz.com
Instead what I want is
John Doe
Vice President
jdoe@xyz.com
and
Adam Paul
Director of IT
apaul@xyz.com
and
Sara Johnson
Head of Human Resources
sjohnson@xyz.com
I wrote my own algorithm which sorts the words. The basic idea is a Comparator that orders words from top-to-bottom and left-to-right (English-language specific, of course).
I use the bottom edge (i.e. minY) of the word for comparing because it should be about the same for words of different sizes, while the top edge (i.e. maxY) may be higher for bigger words.
I also allow for some margin of error in the y-axis comparison because the image could be tilted slightly, or the OCR may draw the bounding box slightly offset, i.e. words may be higher or lower than other words on the same line.
new Comparator<Word>() {
    @Override
    public int compare(Word w1, Word w2) {
        Rectangle b1 = w1.getBoundingBox(), b2 = w2.getBoundingBox();
        // Words whose y positions differ by less than half a word height count as the same line
        double yDiff = Math.abs(b1.getMinY() - b2.getMinY());
        double marginDiff = b1.getHeight() / 2.0;
        if (yDiff < marginDiff) {
            // Same line: order left to right
            return Double.compare(b1.getMinX(), b2.getMinX());
        } else {
            // Different lines: order top to bottom
            return Double.compare(b1.getMinY(), b2.getMinY());
        }
    }
}
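One caveat with a tolerance-based comparator like this: "close enough in y" is not a transitive relation (a can tie with b and b with c while a and c do not), so the result can depend on input order, and Java's sort may reject such a comparator on some inputs. A minimal sketch of an alternative, written in Python for brevity: bucket the words into lines first, then sort each line by x. The `Word` tuple and its field names are hypothetical stand-ins for the tess4j word and its bounding box, and the half-height margin is carried over from the comparator above.

```python
from collections import namedtuple

# Hypothetical word record: text plus bounding box (top-left corner and size)
Word = namedtuple("Word", "text x y w h")


def reading_order(words):
    """Group words into lines by y position, then sort each line left to right."""
    lines = []  # each line is a list of words; its first word acts as the anchor
    for word in sorted(words, key=lambda w: w.y):
        for line in lines:
            anchor = line[0]
            # Same margin as the comparator: half the anchor word's height
            if abs(word.y - anchor.y) < anchor.h / 2.0:
                line.append(word)
                break
        else:
            # No existing line is close enough, so this word starts a new one
            lines.append([word])
    return [w for line in lines for w in sorted(line, key=lambda w: w.x)]
```

Because every word is assigned to exactly one line before any left-to-right comparison happens, the final order does not depend on the order in which the words arrive.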
I would like to quantify the shape of a line on the wings of butterflies. The line can vary from quite straight to squiggly, similar to the horizon in a landscape or to a graph (for each x value there is only one y value), although the overall orientation varies. My idea is to use the freehand tool to trace the line of interest and then let an ImageJ macro quantify it (automating this may be tricky because there are many line-like structures). Two traits seem useful to me:
the proportion between the length of the drawn line and the straight line between the end points.
'Dispersion' of the line such as calculated in the Directionality plugin.
Other traits, such as what proportion of the line lies above or below the straight line that connects the end points, may also be useful.
How can this be coded? I am building an interactive macro that prompts the measuring of various traits for an open image.
Hopefully the below (non-functional) code will convey what I am trying to do.
//line shape analysis
run("Select None");
setTool("free hand");
waitForUser("Trace the line between point A and B");
length= measure();
String command = "Directionality";
new PlugInFilterRunner(da, command, "nbins=60, start=-90, method=gradient");
get data...
//to get distance between points A and B
run("Select None");
setTool("multipoint");
waitForUser("Distances","Click on points A and B \nAfter they appear, you can click and drag them if you need to readjust.");
getSelectionCoordinates(xCoordinates, yCoordinates);
xcoordsp = xCoordinates;
ycoordsp = yCoordinates;
makeLine(xcoordsp[0], ycoordsp[0], xcoordsp[1], ycoordsp[1]);
List.setMeasurements;
StrLength = List.getValue("Length");
I have looked online for solutions but found surprisingly little about this relatively simple issue.
warm regards,
Freerk
Here is a simple solution to determine to what extent the line deviates from a straight line between points A and B. The 'straightness' is the ratio of the two measures.
// Measure the length of the traced line, to compare with the straight-line (Euclidean) distance
run("Select None");
setTool("polyline");
waitForUser("Trace the line between points V5 and V3 by clicking at each corner; finish with a double click"); // V5 and V3 correspond to points A and B; adjust as needed
run("Measure");
List.setMeasurements; // getStatistics() returns the area first, so read the length from the measurement list instead
FLR = List.getValue("Length"); // FLR: forewing length, real (traced)
// Get the Euclidean distance between points A and B
run("Select None");
setTool("multipoint");
waitForUser("Distances","Click on points A and B \nAfter they appear, you can click and drag them if you need to readjust.");
getSelectionCoordinates(xCoordinates, yCoordinates);
xcoordsp = xCoordinates;
ycoordsp = yCoordinates;
makeLine(xcoordsp[0], ycoordsp[0], xcoordsp[1], ycoordsp[1]);
List.setMeasurements;
FLS = List.getValue("Length"); // FLS: forewing length, straight
I would still be grateful for more sophisticated line parameters.
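As a starting point for further line parameters, here is a minimal sketch in Python (not ImageJ macro code) that takes the traced vertices, such as those returned by getSelectionCoordinates, and computes three numbers: the straightness ratio (chord length divided by traced length), the fraction of interior vertices on one side of the chord A-B, and the RMS perpendicular deviation from the chord. The function name and the returned keys are made up for illustration. It assumes A and B are distinct points, and the side fraction counts vertices rather than weighting by length, which is only reasonable if the vertices are roughly evenly spaced.

```python
import math


def line_shape(points):
    """Summarise how a traced polyline deviates from the chord between its ends."""
    (ax, ay), (bx, by) = points[0], points[-1]
    chord = math.hypot(bx - ax, by - ay)  # straight-line distance A-B
    path = sum(math.dist(p, q) for p, q in zip(points, points[1:]))  # traced length
    # Signed perpendicular distance of each interior vertex from the chord.
    # Which physical side counts as positive depends on the axis orientation.
    signed = [
        ((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / chord
        for px, py in points[1:-1]
    ]
    n = len(signed)
    return {
        "straightness": chord / path,
        "fraction_positive_side": sum(d > 0 for d in signed) / n if n else 0.0,
        "rms_deviation": math.sqrt(sum(d * d for d in signed) / n) if n else 0.0,
    }
```

A perfectly straight trace gives a straightness of 1 and an RMS deviation of 0; the more the line wanders, the lower the first and the higher the second.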
I would like to know if it is possible to find the marker for the beginning of a village, city or town.
I mean, given two points, one 100 meters before the mark and the other 100 meters after it, is there some difference between them that tells me which one is in the city (even if there is an inaccuracy of one "way" segment)?
I tried to find a difference in the tags. In some cases the mark lies precisely between two ways, but in other cases it lies in the middle of a way, and even then I could not identify any change between the segment outside the city and the one inside it (the "place" tag does not define the city boundary in terms of these marks, but rather the cadastral boundary).
Thanks for any help.
I am already aware that this question, more specifically, should have been directed to GIS StackExchange. I already asked the question here but sadly, got no answers or any suggestions. I used ST_ClosestPoint() to project a point to the nearest line (point on line interpolation) using this approach.
Select ST_ClosestPoint(line, pt) As closest_pt_line
From
(Select ad.geom As pt,
st.geom As line,
ST_Distance(ad.geom, st.geom) As d
From ad, st
Where ST_DWithin(st.geom, ad.geom, 10.0)
Order By d
) As foo;
Based on this projected point, I need to calculate the length of the street in both directions, and the width of the street by making use of a building polygons layer. The scenario can be visualized like this:
Visualization of the scenario
I am aware of the ST_Length function, but it returns the length of the whole linestring rather than length 1 and length 2. Any advice on calculating the lengths and the width of the street would be highly appreciated.
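One way to get the two lengths is to find how far along the street the projected point lies and subtract that from the total. In PostGIS, ST_LineLocatePoint(line, pt) returns that position as a fraction between 0 and 1, which can be multiplied by ST_Length(line). The sketch below shows the same idea in plain Python on planar coordinates; `split_lengths` is a hypothetical helper, coordinates are assumed to be in a projected (metric) system, and it does not address the street width.

```python
import math


def split_lengths(line, pt):
    """Return (length_before, length_after) along polyline `line`,
    split at the point on the line closest to `pt`."""
    best = None  # (distance from pt, length walked up to the projected point)
    walked = 0.0
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        if seg == 0:
            continue  # skip duplicate vertices
        # Parameter of the orthogonal projection, clamped to the segment
        t = ((pt[0] - x1) * (x2 - x1) + (pt[1] - y1) * (y2 - y1)) / (seg * seg)
        t = max(0.0, min(1.0, t))
        cx, cy = x1 + t * (x2 - x1), y1 + t * (y2 - y1)
        d = math.hypot(pt[0] - cx, pt[1] - cy)
        if best is None or d < best[0]:
            best = (d, walked + t * seg)
        walked += seg
    return best[1], walked - best[1]
```

For an L-shaped street `[(0, 0), (10, 0), (10, 10)]` and an address point at `(4, 3)`, the projection falls on the first segment and the two parts are 4 and 16 units long.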
Can't for the life of me figure out why when I extract text using iTextSharp, some of the text comes in backwards.
using (iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(@"C:\Temp\pdftest\sample.pdf"))
{
    string sText = iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(reader, 1, new iTextSharp.text.pdf.parser.LocationTextExtractionStrategy());
}
*The reason I am using a LocationTextExtractionStrategy is that I will be using coordinates to pull the text from this position. I've just included a crop of the full PDF for my example. If I use a SimpleTextExtractionStrategy, the "B uy 5 egt 5" and " eerf" don't show up.
Output (from sample code):
B uy 5 egt 5
eerf
4x6 PRINTS Download free
CVS Mobile App.
Promo code O H m OBILe PICS
sed items available in all stores We reserve the right to
There's definitely something weird going on with the "eerf". In the PDF, the cursor goes horizontal when you try to select it (the big red FREE).
If I use Acrobat Professional (Advanced -> PDF Optimizer, select Transparency, then save the file), the text is extracted correctly and the red "FREE" is selectable.
So two questions, how can I emulate the PDF Optimizer in iTextSharp?
Or, how can iTextSharp read this text correctly?
As you can see this is my first post so don't beat me up too bad.
Additional Test:
I even extended the LocationTextExtractionStrategy and RegionTextRenderFilter so I could return the coordinates of each text chunk. The weird thing about the big red "FREE" is that the F's start and end points were exactly the same. The same was true of the R and the two E's. I would have expected the end point to equal the start point plus the width of the text.
I have a series of questions about writing code for iOS that includes handwriting recognition of Japanese. I am a beginner, so be gentle and assume I am stupid ...
I'd like to present a Japanese word in hiragana (the Japanese phonetic script), then have the user handwrite the appropriate kanji (Chinese character). The input is then compared internally to the correct character, and the user gets feedback on whether they were correct.
My questions here revolve around the handwritten input.
I know that this type of input is normally possible with the Chinese keyboard.
How can I implement something similar without using the keyboard itself? Are there already library functions for this (I feel there must be, since that input is available on the Chinese keyboard)?
Also, kanji aren't exactly the same as Chinese characters. There are unique characters that Japanese people invented themselves. How would I be able to include these in my handwriting recognition?
We worked on a similar exercise back at University.
The stroke order of kanji is well defined, and there are only 8 (?) different strokes, so basically each kanji is a well-ordered sequence of strokes. For example, te (hand) is the sequence "the short falling backward stroke", then twice the "left to right stroke", and finally "the long downward stroke with the little tip at the bottom". There are databases that give you this information.
Now the problem is almost reduced to identifying the correct stroke. You will still run into some ambiguities where you have to take into account the spatial relation between strokes.
EDIT: For stroke recognition we snapped the freehand writing to 45° angles, thus converting it into a sequence of vectors along one of these directions. Let's assume that direction 0 is from bottom to top, direction 1 from bottom right to top left, direction 2 from right to left, and so on counterclockwise.
Then the first stroke of te (手) would be [23]+ (as some write it falling and some horizontal)
The second and third stroke would be 6+
and the last would be 4+[123] (as with the little tip, every writer uses a different direction)
This coarse snapping was actually enough for us to recognize kanji. Maybe there are more sophisticated ways, but this simple solution managed to recognize about 90% of kanji. The only handwriting it couldn't grasp was that of one professor, but then no human except himself could read his handwriting either.
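The snapping and matching described above can be sketched roughly as follows. This is a reconstruction under assumptions, not the original university code: points are screen coordinates with y growing downward, each stroke is a list of (x, y) samples, and the function names and the `min_step` jitter threshold are invented for illustration. The direction codes and the patterns for 手 are the ones given above.

```python
import math
import re


def direction_code(p, q):
    """Snap the move p -> q to one of 8 directions.
    0 = bottom to top, then counterclockwise: 2 = right to left,
    4 = top to bottom, 6 = left to right."""
    dx = q[0] - p[0]
    dy_up = p[1] - q[1]  # flip the sign because screen y grows downward
    angle = math.degrees(math.atan2(dy_up, dx))  # 0 deg = left to right, 90 deg = up
    return round((angle - 90.0) / 45.0) % 8


def stroke_to_codes(points, min_step=2.0):
    """Convert one stroke (list of (x, y) samples) to a string of direction codes."""
    codes = []
    for p, q in zip(points, points[1:]):
        if math.dist(p, q) < min_step:
            continue  # ignore jitter between nearly identical samples
        codes.append(str(direction_code(p, q)))
    return "".join(codes)


# Stroke patterns for te (手), as listed above
TE_PATTERNS = [r"[23]+", r"6+", r"6+", r"4+[123]"]


def matches_kanji(strokes, patterns):
    """True if every stroke, in order, matches its direction pattern."""
    if len(strokes) != len(patterns):
        return False
    return all(
        re.fullmatch(pattern, stroke_to_codes(stroke))
        for stroke, pattern in zip(strokes, patterns)
    )
```

Matching on stroke count and per-stroke patterns alone ignores the spatial relations between strokes, so distinguishing kanji that share the same stroke sequence would need an extra check.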
EDIT2: It is important that your user "prints" the Kanji and doesn't write in calligraphy, since in calligraphy many strokes are merged into one. Like when writing a kanji with the radical of "rice field" in calligraphy, this radical morphs into something completely different. Or radicals with a lot of horizontal dashes (like the radical of "speech" iu) just become one long wriggly line.