I want to make a program that takes an image as input and outputs text. I know that I can use a neural network to turn an image of a single character into that character. The difficult part is: given an image containing text, how would I produce the bounding rectangles around each individual character? What method could I use to do it?
A basic approach is to make histograms of black pixels. First, project all pixels onto the vertical axis: the deep valleys in the histogram indicate the separation between lines of text (try different angles if the paper might be tilted). Then, per line (or per page, if you know the font is monospaced), project the pixels onto a horizontal histogram. This gives you a strong indication of inter-character spaces. As a minimum it gives you values for the average character height and width that will help you in the next steps.
After that, you need to take care of kerning (where characters overlap). Find the connected pixels, possibly after first applying dilation or erosion to the image to compensate for scanning artifacts.
Depending on the quality of the scanned image you may have to use more advanced techniques, but this will get you going.
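A minimal sketch of the line-projection step in numpy (`segment_lines` is an illustrative name; the toy boolean array stands in for a real binarized scan, with True meaning ink):

```python
import numpy as np

def segment_lines(binary):
    """Split a binarized page (True = ink) into text lines using a
    projection profile: rows containing no ink separate the lines."""
    row_ink = binary.sum(axis=1)            # ink pixels per row
    in_line, start, lines = False, 0, []
    for y, count in enumerate(row_ink):
        if count > 0 and not in_line:       # valley -> peak: line starts
            in_line, start = True, y
        elif count == 0 and in_line:        # peak -> valley: line ends
            in_line = False
            lines.append((start, y))
    if in_line:
        lines.append((start, len(row_ink)))
    return lines

# Toy page: two "lines" of ink separated by blank rows.
page = np.zeros((10, 20), dtype=bool)
page[1:3, 2:18] = True     # first text line
page[6:9, 2:18] = True     # second text line
print(segment_lines(page))  # [(1, 3), (6, 9)]
```

The per-line character segmentation is the same idea with `axis=0`, applied to each detected row band.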
This doesn't sound like artificial intelligence; it sounds like you're talking about OCR:
http://en.wikipedia.org/wiki/Optical_character_recognition
See Google's Tesseract:
http://code.google.com/p/tesseract-ocr/
EDIT: The unedited question was asking about artificial intelligence.
To me the question as posed is not entirely clear.
Since it talks about OCR, I will leave a couple of articles here that may help (they helped me, at least):
Improve OCR Accuracy
How to use image preprocessing to improve the accuracy of Tesseract
Also, as mentioned above, Tesseract is a good open-source OCR engine with Python bindings (the one I personally use as well). Another approach you could take is through scikit-learn.
You may also want to check this Stack Overflow post.
I am also fairly sure you can use ResearchGate to look for relevant papers (I found some, though I am not sure they are exactly what you need).
I think this generic answer suits the generic question.
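As a small illustration of the scikit-learn route mentioned above, you can train an ordinary classifier on the 8x8 digit images bundled with the library; this handles the per-character recognition step (the split ratio and SVM parameters here are just common tutorial choices):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# 1797 labeled 8x8 grayscale digit images shipped with scikit-learn.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# An SVM works well on this small, clean dataset.
clf = SVC(gamma=0.001).fit(X_train, y_train)
print(round(clf.score(X_test, y_test), 3))
```

You would still need your own segmentation step to cut a page into character images before feeding them to such a classifier.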
I want to segment gray matter from a T1-weighted brain MRI scan, but I could not find a good tutorial to follow. There are several tools to segment gray matter in MATLAB, but I need the algorithm itself. Can you suggest an algorithm that works well and accurately to segment only the gray matter from a T1-weighted MRI image?
Why reinvent the wheel? SPM does a good job of segmentation, and its MATLAB source code is freely available: http://www.fil.ion.ucl.ac.uk/spm/
You can examine the algorithm it uses and customize it for your own purposes if you wish. It produces probabilistic maps of gray matter, white matter, and CSF that you can use in subsequent analyses. There are also a variety of options to complete the segmentation in both normalized and native space. I highly recommend it as a place to get started; you can then branch off from there depending on your needs.
I am planning to develop an app like Word Lens. Can anyone suggest a good library I could use, or explain the technology behind the Word Lens app? Is it real-time image matching or OCR? I know some image processing libraries like OpenCV and Tesseract. Any help is greatly appreciated.
I'm one of the creators of Word Lens. Although there are some OCR libraries out there (like tesseract), we decided to make our own in order to get better results and performance. Our general algorithm goes like this:
copy the image from the camera and get its grayscale component
level out the image so the text stands out clearly against the background
draw boxes around things that look like characters & sentences
do OCR: match the pixels in each box against a database of characters -- this is actually pretty hard!
collect the characters into words, look up in a dictionary (this is hard too, because there will be mistakes in the OCR)
draw the results back onto the image
Image matching by itself is not good enough, because of the huge variety of fonts, words, and languages out there.
OpenCV is a great library to get up and running with, and to learn more about computer vision in general. I would recommend building off their examples, and playing around there. Have fun!
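The box-finding step in the algorithm above (find connected blobs of dark pixels and box them) can be sketched with scipy's connected-component labeling; `character_boxes` is an illustrative name, and the toy array stands in for a leveled grayscale frame:

```python
import numpy as np
from scipy import ndimage

def character_boxes(gray):
    """gray: 2D array, 0 = black .. 255 = white. Returns
    (top, bottom, left, right) boxes around each blob of dark pixels."""
    ink = gray < 128                         # crude leveling / threshold
    labels, n = ndimage.label(ink)           # connected components
    return [(sl[0].start, sl[0].stop, sl[1].start, sl[1].stop)
            for sl in ndimage.find_objects(labels)]

# Toy frame with two separate dark blobs standing in for characters.
img = np.full((8, 12), 255)
img[1:4, 1:3] = 0
img[2:6, 6:10] = 0
print(character_boxes(img))  # [(1, 4, 1, 3), (2, 6, 6, 10)]
```

A real pipeline would then group neighboring boxes into words and lines before the matching step, and kerned or touching characters would need extra splitting logic.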
I was looking at CamScanner, Genius Scan, and JotNot and trying to figure out how they work.
They are known as "mobile pocket document scanners." Each of them takes a picture of a document through the iPhone camera, finds the angle/position of the document (because it is nearly impossible to shoot it straight on), straightens the photo, readjusts the brightness, and then turns it into a PDF. The end result looks like a scanned document.
Take a look at one of these apps, Genius Scan, in action:
http://www.youtube.com/watch?v=DEJ-u19mulI
It looks pretty difficult to implement, but I'm hoping someone smart on Stack Overflow can point me in the right direction!
Does anyone know how one would go about developing something like that? What sort of libraries or image processing techniques do you think they're using? Is there anything open source available?
I found an open source library that does the trick:
http://code.google.com/p/simple-iphone-image-processing
It probably is pretty difficult. You will likely need to find at least some algorithms or libraries capable of detecting distorted text within bitmaps; analyze the likely 2D and 3D geometric distortion of the text image; apply the inverse of that distortion to correct it; and adaptively adjust the image contrast with DSP filtering. On top of that, you'll need the iOS APIs to take the photos in the first place.
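The geometric-correction step can be sketched in plain numpy: given the four detected corners of a skewed page, solve for the 3x3 homography that maps them to an upright rectangle (the same math behind OpenCV's getPerspectiveTransform; the corner coordinates below are made up for illustration):

```python
import numpy as np

def homography(src, dst):
    """src, dst: four (x, y) point pairs. Returns H with dst ~ H @ src,
    found by solving the standard 8x8 linear system with H[2,2] = 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)   # perspective divide

# Hypothetical detected corners of a tilted page, mapped to 200x300.
corners = [(30, 40), (220, 60), (240, 330), (20, 310)]
target  = [(0, 0), (200, 0), (200, 300), (0, 300)]
H = homography(corners, target)
print([tuple(round(c) for c in warp_point(H, p)) for p in corners])
# → [(0, 0), (200, 0), (200, 300), (0, 300)]
```

In a real app you would apply the inverse mapping per output pixel (with interpolation) to resample the whole photo, not just the corners.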
I am looking for information on how to draw flame fractals. From googling around I could not find much: pages either explain how to use third-party tools or are way too complicated for me to grasp. Does anyone know how and why they work, or can point me toward an implementation that isn't overly complicated?
I have written a Beamer presentation that covers the basics of flame fractals:
https://www.math.upenn.edu/~peal/files/Fractals[2009]Beamer[Eng]-PAXINUM.pdf
All images were produced by my Java implementation of the flame algorithm.
The source code can be found here:
http://sourceforge.net/projects/flamethyst/
I believe the PDF http://flam3.com/flame_draves.pdf together with the Java implementation above should get you a long way.
You could read the original paper by Scott Draves, which details precisely how and why they work, as well as a guide to an implementation in pseudocode.
As long as you have some basic knowledge of maths, it should be relatively straightforward to understand (though it is rather long!). To be honest, you can probably ignore much of it and just read about the code, since much of the text is background info.
Fractal flames are basically a variant of an iterated function system (IFS). You have a series of functions through which you pass a single point over and over again. Each function is a combination of an affine transformation and one or more variations.
Each iteration, only one function is chosen (at random), and the resulting point is accumulated into a buffer and used as the starting point of the next iteration.
The buffer is then saved as an image, after having been post-processed and filtered, as described in the flame paper.
The best reference is still the original implementation, flam3.
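A minimal chaos-game sketch of the iteration described above, using a plain IFS (the three Sierpinski-triangle affine maps) in place of a full flame's nonlinear variations and log-density coloring:

```python
import random

# Three affine maps of the Sierpinski triangle. A fractal flame would
# compose each affine map with one or more nonlinear "variations"
# and track a color coordinate alongside the point.
maps = [lambda x, y: (0.5 * x, 0.5 * y),
        lambda x, y: (0.5 * x + 0.5, 0.5 * y),
        lambda x, y: (0.5 * x + 0.25, 0.5 * y + 0.5)]

random.seed(1)
grid = [[0] * 32 for _ in range(32)]     # coarse accumulation buffer
x, y = random.random(), random.random()
for i in range(20000):
    x, y = random.choice(maps)(x, y)     # pick one function at random
    if i > 20:                           # skip the transient before plotting
        grid[int(y * 31.99)][int(x * 31.99)] += 1

# The visited cells trace out the attractor, not the whole grid.
covered = sum(cell > 0 for row in grid for cell in row)
print(covered)
```

In a real flame renderer the buffer is a high-resolution histogram, and the post-processing step applies log-density brightness mapping, gamma, and filtering as described in the flame paper.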
I think fractals would be too computationally expensive to do in real time.
If I Google "simulating fire in computer graphics" I get a number of interesting things that suggest that it's not a trivial problem (surprise). SIGGRAPH is a conference whose proceedings you'll want to check out. But be warned - this is very mathematically challenging.
Have a look at http://formulas.ultrafractal.com/
There you can download the "Completed Formula Pack"
The enr.ucl file should contain the formula for the flame fractal.
For more info:
http://www.ultrafractal.com/kb/flamefractals.html
I'm preparing to give a talk about a topic of my choice from the artificial intelligence area (neural networks). I'm looking for something interesting, used in real life, and preferably not too complicated (the simpler it is, the easier it is for students to understand, and the more interested they will be). I thought this would be a good place to look for advice ;)
Code applying neural networks to text recognition.
I think the concept of text recognition is interesting and understandable.
Toby Segaran's interesting book "Programming Collective Intelligence" contains a simple neural net example for learning search results relevancy. He offers the code from the book free on his site.
The neural net is in the chapter 4 code. I'm not sure you could figure out the code without the text; if you don't mind spending a little money, the book certainly wouldn't hurt.
Teach your neural net the sine wave. It's simple: you only need 4 neurons, and the weights clearly show how it works. It was the example that made it click for me.
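A sketch of that example: a 1-4-1 tanh network trained by plain gradient descent to fit one period of the sine (the seed, learning rate, and iteration count are arbitrary choices, not part of the original suggestion):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(-np.pi, np.pi, 100).reshape(-1, 1)
Y = np.sin(X)

# 1 input -> 4 tanh hidden units -> 1 linear output.
W1 = rng.normal(size=(1, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)) * 0.5; b2 = np.zeros(1)

lr = 0.05
for _ in range(30000):
    H = np.tanh(X @ W1 + b1)             # hidden activations
    P = H @ W2 + b2                      # network output
    err = (P - Y) / len(X)               # scaled error for mean-square loss
    # Backpropagation through the two layers.
    gW2 = H.T @ err; gb2 = err.sum(0)
    dH = err @ W2.T * (1 - H ** 2)       # tanh' = 1 - tanh^2
    gW1 = X.T @ dH; gb1 = dH.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = float(((np.tanh(X @ W1 + b1) @ W2 + b2 - Y) ** 2).mean())
print(round(mse, 4))
```

After training, printing `W1` and `W2` shows how each of the 4 hidden units covers part of the curve, which is exactly what makes this a good classroom example.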
The "Real life applications" section of the (English) Wikipedia article on artificial neural networks lists some (quite general) applications of neural networks.