I'm thinking of developing a mobile OCR app to detect words in pictures taken with a phone.
The purpose is only to detect which words are in the picture; the layout is not important.
Also, it would be used on very short texts.
I'm currently thinking of adapting Tesseract to iPhone and Android.
Has anyone had any related experience? What are the limits, etc.?
Thanks!
Google Goggles does that. It takes a snapshot, reduces the color depth, scans through various contrast ratios, and exposes letters/words. It then performs a Google search on what it found.
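A rough sketch of the "reduce the color depth, then try several contrast levels" idea, in plain Python. This is only an illustration of the general technique, not how Goggles is actually implemented. The function names and threshold values are made up, and the image is assumed to be a list of rows of (r, g, b) tuples.

```python
def to_grayscale(rgb_image):
    """Reduce color depth: turn rows of (r, g, b) tuples into 0-255 gray values."""
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in rgb_image]


def binarize(gray_image, threshold):
    """Mark a pixel as 1 (ink) if it is darker than the threshold, else 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray_image]


def candidate_masks(rgb_image, thresholds=(64, 128, 192)):
    """Build one black/white mask per threshold (thresholds are assumed values).

    An OCR engine would then be run on each mask and the most plausible
    result kept; that step is left out here.
    """
    gray = to_grayscale(rgb_image)
    return {t: binarize(gray, t) for t in thresholds}
```

Trying several thresholds matters because a single fixed cut-off tends to fail on photos with uneven lighting.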
Check out this article: http://www.itwizard.ro/interfacing-cc-libraries-via-jni-example-tesseract-163.html and this example: http://code.google.com/p/mezzofanti/
I run it on my G1 and it's OK for single words, but it is very slow if you have 2-3 lines.
Maybe with newer (dual-core) phones you could get several lines in a few seconds.
As the title says, I am trying to find a solution for an AR app.
I want my app to recognize a number hand-written by the user.
The app will tell the user to write down, for example, the number 24 on paper and move the camera over the written number to see the 3D object.
This might be used for saving a birthday, a wedding date, etc.
For accuracy, the app instructions will show the user a preview along the lines of "please write the number 24 similar to this".
Each person's handwriting will differ, but at least we would not get curly "2"s or a "4" with an open edge, etc.
So here we need AR to recognize the number, or to be able to read the number by approximation.
The first question is: is such behavior doable, or is anyone familiar with a similar concept?
After searching for similar apps, I found the "Ink Hunter" apps for tattoo previews. These apps use symbols, not numbers, but we can think of a number as a symbol as well.
Also, as this video shows (https://www.youtube.com/watch?v=9rXJcIE2Fcs), each user draws the symbol in a different way and it still works.
I am using Unity3d and Vuforia.
Vuforia offers free samples (Unity3d packages) on its website, and there is one named "Text Recognition". Here's the tutorial link: https://www.youtube.com/watch?v=W3MK6nC5FWE
Unfortunately, I couldn't make it work.
If someone has developed such functionality using these sample projects from Vuforia, or has an alternative method, I would appreciate your help :)
thanks in advance moghes
Here's a tutorial our team created on text recognition using the Hololens and Vuforia with Unity: https://www.youtube.com/watch?v=WdMeHgD4fMY. In the first portion of the video, we show how to get text recognition working with just Vuforia and Unity - no Hololens required. For your application, just change the text to numbers.
I believe the biggest challenge you will have is the "hand-written" component. From our research, Vuforia prefers computer-generated, predefined font types.
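Since the app already knows which number it asked the user to write, one way to tolerate messy handwriting is to normalize typical letter/digit confusions in whatever the recognizer returns and compare the result with the expected value. A minimal sketch in Python follows; the confusion table and function names are my own assumptions and are not part of Vuforia.

```python
# Characters a text recognizer may return in place of digits
# (illustrative table, not exhaustive).
LOOKALIKES = {"O": "0", "o": "0", "I": "1", "l": "1", "|": "1",
              "Z": "2", "z": "2", "S": "5", "s": "5", "B": "8"}


def normalize_digits(recognized):
    """Map look-alike letters to digits and drop everything that is not a digit."""
    mapped = (LOOKALIKES.get(ch, ch) for ch in recognized)
    return "".join(ch for ch in mapped if ch.isdigit())


def matches_expected(recognized, expected_number):
    """True if the recognized text reads as the number the user was asked to write."""
    return normalize_digits(recognized) == str(expected_number)
```

For example, a recognizer output of "Z4" would still count as the requested 24, while "29" would not.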
I am working on an app that needs to translate the text in an image in real time using the iPhone camera. Is there any way to implement this? Any SDK or tutorial would be helpful.
My suggestion would be a combination of the following:
The open-source Tesseract OCR engine for getting the text from the image (a fairly recent iOS wrapper is here: https://github.com/ldiqual/tesseract-ios)
One of the translation services discussed in this question for translations: https://stackoverflow.com/questions/6151668/alternative-to-google-translate-api
A tutorial like this one on how to get a real-time camera view with overlays: http://www.musicalgeometry.com/?p=1273
Please note that these are just ideas on how to get this working as quickly as possible.
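To show how the three pieces fit together, here is a language-neutral sketch written in Python. The `ocr` and `translate` parameters are hypothetical stand-ins for the OCR engine and the translation service you pick; nothing here is the API of Tesseract or of any particular translation service.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]      # x, y, width, height in frame coordinates


@dataclass
class Overlay:
    text: str                        # translated text to draw over the camera view
    box: Box                         # where the original text was found


def translate_frame(frame: object,
                    ocr: Callable[[object], List[Tuple[str, Box]]],
                    translate: Callable[[str], str]) -> List[Overlay]:
    """Run OCR on one camera frame, translate each text block, return overlays."""
    overlays = []
    for text, box in ocr(frame):
        text = text.strip()
        if text:                     # skip blocks where OCR found only whitespace
            overlays.append(Overlay(translate(text), box))
    return overlays
```

In a real app you would probably not run this on every frame, since OCR and a network round trip are both slow; processing a frame every second or so and reusing the last overlays in between is a common compromise.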
Some apps that offer real-time translations even try to find a suitable font and display the translated text at the exact same position as the original text was. I am afraid that this is not possible without investing lots of work and developing your own OCR engine.
Hope this helps.
Will using libraries like OpenEars drastically enlarge my app size? Or can I just extract the text-to-speech parts and get away with it, probably by removing all those languages? I don't have any idea.
I checked, and the OpenEars sample app is 33 MB, which is big!
So my question is: can I implement text to speech in my app without increasing the size so much? I mean, I can live with 2-3 MB, but 30...
Thank you!
OpenEars developer here. Just follow the instructions here to reduce your final app size; there's no need to ship all the voices or any features of the framework that you aren't making use of. Depending on which voices and which features you're using, you might see an app size increase of between 6 MB and ~20 MB, unless you're using a large number of the available voices. The sample app uses all of the framework features and two voices, and supports a few iOS versions, so its binary size reflects that.
My guess is you can't; audio will take up a lot of space.
Removing unneeded languages will free some space, but I guess 2-3 MB for all the audio isn't really possible.
I've designed the user interface of an iPhone app and I'd like to show an online demo of it, consisting for the moment of a series of static images representing the main steps of the app.
In your opinion, what is the best way to do this simulation?
I mean something like a series of single web pages, optimized for mobile, each containing a single image linking to the next step. But I was wondering whether there is a more elegant and sophisticated solution, with transition effects, for example, or other features.
I hope I was clear enough :)
Any help will be sincerely appreciated.
Thanks in advance for your attention.
This sounds like a good use for Briefs (Briefs App Website). It pretty much allows you to create an interface and step through it as if it were an application. I believe you'll need a developer account to run the app that reads the brief on your phone (since it could not be released in the App Store).
An alternative to static images would be to make a video. I use the iShowU video screen capture tool and set it to record the iPhone/iPad simulator window. I then run through the screens, type inputs, etc. In addition to recording the video, the program records my voice as I narrate the app's features.
As to transition effects, the video will capture whatever transition animations are in your program.
In the end you have a video that you could give your user, put on YouTube, or whatever.
You can do this easily and for free on AppDemoStore. You just have to upload the app screenshots and then add hotspots which are used for the navigation through the demo.
AppDemoStore also offers the sophisticated features you are asking for:
iPhone specific transition effects such as slide up/down/left/right, fade and flip
gesture icons for the hotspots
text boxes and callouts
multiple hotspots on a screen in order to create a simulation of the app (and not just a linear demo)
Here's a sample demo: http://www.appdemostore.com/demo?id=1699008
Moreover, the demos created on AppDemoStore run in any browser and on any mobile device, and can be embedded in your webpage or blog (as you would a YouTube video). With the FREE account, you can create up to 10 demos with unlimited screenshots and all the features specified above.
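For illustration only, the screenshot-plus-hotspot idea boils down to a small data structure: each screen is an image with clickable rectangles, and each rectangle names the screen to show next. The Python sketch below uses made-up class and field names and is not AppDemoStore's actual format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Hotspot:
    x: int
    y: int
    width: int
    height: int
    target: str                      # name of the screen to show when tapped


@dataclass
class Screen:
    image: str                       # screenshot file name
    hotspots: List[Hotspot] = field(default_factory=list)


def tap(screens: Dict[str, Screen], current: str, x: int, y: int) -> str:
    """Return the screen to show after a tap at (x, y); stay put if nothing is hit."""
    for h in screens[current].hotspots:
        if h.x <= x < h.x + h.width and h.y <= y < h.y + h.height:
            return h.target
    return current
```

With several hotspots per screen, the demo becomes a small graph of screens rather than a linear slide show.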
Regards,
Daniel
I want to write an Android and/or iPhone app that involves taking a picture of something (right now, I just want to limit it to text), after which the app parses the text to make use of it. For example, a picture of a sentence (or maybe just fragments) from a book would be parsed by the app to bring up more information about the book: title, author, ISBN, etc., and maybe even information about other books that are similar in content.
Is it possible to do something like this? Is there an existing API that parses the content of an image? How is an image stored on Android and iPhone? Is it possible to implement the app on one platform and not the other?
I'd appreciate any input or advice that you guys have to offer. Thank you!
You're looking for this, possibly.
It's called OCR, or Optical Character Recognition.
Also check out ZXing, a great library for decoding one- and two-dimensional barcodes. There are both iPhone and Android versions.
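Whichever route you take (OCR on the printed text or a barcode scan), it helps to validate the ISBN before looking the book up, because recognition errors are common. Here is a small Python sketch of the standard ISBN-13 check-digit rule (weights alternate 1 and 3, and the weighted sum must be a multiple of 10). The function names are mine, and the regular expression is a simple assumption about how the number appears in the recognized text.

```python
import re


def is_valid_isbn13(code):
    """Check the ISBN-13 check digit, ignoring hyphens and spaces."""
    digits = [int(c) for c in code if c.isdigit()]
    if len(digits) != 13:
        return False
    # Weights alternate 1, 3, 1, 3, ... starting from the first digit.
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def find_isbn13(text):
    """Return the first valid ISBN-13 found in recognized text, or None."""
    for match in re.finditer(r"97[89][0-9\- ]{10,14}", text):
        candidate = re.sub(r"[^0-9]", "", match.group())[:13]
        if is_valid_isbn13(candidate):
            return candidate
    return None
```

A number that passes the check can then be sent to whatever book lookup service you choose; one that fails is a hint to ask the user to retake the picture.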