How do CamScanner, Genius Scan, and JotNot work? [closed]

I was looking at CamScanner, Genius Scan, and JotNot and trying to figure out how they work.
They are known as 'Mobile Pocket Document Scanners.' Each of them takes a picture of a document through the iPhone camera, finds the angle/position of the document (because it is nearly impossible to shoot it straight on), straightens the photo, readjusts the brightness, and then turns it into a PDF. The end result is what looks like a scanned document.
Take a look at one of the apps, Genius Scan, in action:
http://www.youtube.com/watch?v=DEJ-u19mulI
It looks pretty difficult to implement, but I'm thinking someone smart on Stack Overflow can point me in the right direction!
Does anyone know how one would go about developing something like that? What sort of libraries or image processing technologies do you think they're using? Does anyone know if there is something open source available?

I found an open source library that does the trick:
http://code.google.com/p/simple-iphone-image-processing

It probably is pretty difficult. You will likely need to find at least some algorithms or libraries capable of detecting distorted text within bitmaps, analyzing the likely 2D and 3D geometric distortion within a text image, correcting that distortion with its inverse, and adaptively adjusting the image contrast with DSP filtering... plus the iOS APIs to take photos in the first place.
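None of these apps publish their pipelines, but the core idea (find the page quadrilateral, undo the perspective, then boost the contrast) can be sketched with OpenCV's Python bindings. Everything below is my own guess at a minimal pipeline, not any app's actual code:

```python
import cv2
import numpy as np

def scan_document(path):
    # Hypothetical pipeline, not any particular app's real implementation.
    img = cv2.imread(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Find edges, then the largest 4-sided contour -- assume it's the page.
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 75, 200)
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    page = None
    for c in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        if len(approx) == 4:
            page = approx.reshape(4, 2).astype(np.float32)
            break
    if page is None:
        return gray  # no quadrilateral found; give back the plain photo

    # Order the corners: top-left, top-right, bottom-right, bottom-left.
    s, d = page.sum(axis=1), np.diff(page, axis=1).ravel()
    quad = np.float32([page[np.argmin(s)], page[np.argmin(d)],
                       page[np.argmax(s)], page[np.argmax(d)]])

    # Warp the tilted page to a straight-on rectangle (the inverse of the
    # perspective distortion).
    w = int(max(np.linalg.norm(quad[0] - quad[1]), np.linalg.norm(quad[3] - quad[2])))
    h = int(max(np.linalg.norm(quad[0] - quad[3]), np.linalg.norm(quad[1] - quad[2])))
    dst = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])
    flat = cv2.warpPerspective(gray, cv2.getPerspectiveTransform(quad, dst), (w, h))

    # Adaptive thresholding gives the high-contrast "scanned" look.
    return cv2.adaptiveThreshold(flat, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 21, 10)
```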

Related

Best way to handle multiple screen aspects on mobile [closed]

So, I have been looking a lot into this topic, and the internet seems rather vague and divided. I have found that a lot of people handle different screen sizes and aspect ratios by using scripts to scale and anchor game objects.
Some people say that you should have assets of different sizes, and enable/disable them based on the screen size. While this method (to me) seems more efficient, it feels like this is suggested less than the other method.
So I would like to ask what the best method is. (Or if there is such a thing as the "best")
The best way is to use a Canvas Scaler, make good use of the anchor points of the UI elements in your scene, and use Layout Groups; this way they will fit nicely into almost all aspect ratios.
A few videos covering these topics:
Jimmy Vegas
Unity 3D With Scott
Cat Trap Studios

Improving my UML class diagram for a media library [closed]

I'm making a class diagram for a media library, like iTunes or Windows Media Player. My library contains audio, video and images.
I'm fairly new to this, so I'm not sure if I'm heading in the right direction. This is what I got so far:
I feel like there should be a few more classes. Does anyone have some tips/suggestions on how to improve/expand this class diagram?
EDIT!
I've tried to make the playlists a bit clearer. I've also added an interface:
It seems fine to me in its main lines:
The Media specialization seems correct
The Person specialization seems correct
The Directs and Composes relationships seem right
Nothing seems wrong here. The Playlist composition, however, is not very clear. I have no obvious alternative, but here is the point...
As it is drawn, your playlist might be composed of images, videos, or audio records. The question is the relationship between these compositions.
If you want a playlist composed of images OR videos OR audio records non-exclusively, the playlist should be composed of Media in general.
If you want a playlist composed of images OR videos OR audio records exclusively, things become quite subtle. In your representation this is not obvious at all; at the very least, a note would be welcome to specify the exclusive composition relationship. A solution would be to specialize the playlists, with the specialized version instantiated on the insertion of the first element. It is up to what you really want to show; in any case, an explanatory note would be very useful.
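To make the two alternatives concrete, here is a minimal sketch in Python (class names follow the diagram discussion; the language is only for illustration, since the diagram itself is language-neutral) of the non-exclusive variant, where a playlist is composed of Media in general:

```python
from abc import ABC

class Media(ABC):
    """Base class; Audio, Video and Image are its specializations."""
    def __init__(self, title):
        self.title = title

class Audio(Media): pass
class Video(Media): pass
class Image(Media): pass

class Playlist:
    """Non-exclusive variant: composed of Media in general, so audio,
    video and images can be mixed freely in one playlist."""
    def __init__(self, name):
        self.name = name
        self.items = []          # list of Media

    def add(self, item):
        assert isinstance(item, Media)
        self.items.append(item)
```

The exclusive variant would instead specialize Playlist (for example AudioPlaylist, VideoPlaylist, ImagePlaylist) and constrain the element type in each subclass, as suggested above.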

Real Time Image Processing (OCR) [closed]

I am planning to develop an app like Word Lens. Can anyone suggest a good library that I can use, or explain the technology behind the Word Lens app? Is it real-time image matching or OCR? I know some image processing libraries like OpenCV and Tesseract... Any help is greatly appreciated.
I'm one of the creators of Word Lens. Although there are some OCR libraries out there (like tesseract), we decided to make our own in order to get better results and performance. Our general algorithm goes like this:
copy the image from the camera and get its grayscale component
level out the image so the text stands out clearly against the background
draw boxes around things that look like characters & sentences
do OCR: match the pixels in each box against a database of characters -- this is actually pretty hard!
collect the characters into words, look up in a dictionary (this is hard too, because there will be mistakes in the OCR)
draw the results back onto the image
Image matching by itself is not good enough, because of the huge variety of fonts, words, and languages out there.
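Word Lens's actual engine is proprietary, so the sketch below is only a rough approximation of steps 1-3 above, written with OpenCV (my assumption for illustration, not the authors' code):

```python
import cv2

def find_character_boxes(frame):
    # Step 1: grayscale component of the camera frame.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Step 2: adaptive thresholding "levels" the image so text stands
    # out against an unevenly lit background (text becomes white).
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 25, 15)

    # Step 3: connected components that look like characters.
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    boxes = []
    for i in range(1, n):                  # label 0 is the background
        x, y, w, h, area = stats[i]
        if 8 < h < 100 and area > 20:      # crude glyph-size filter
            boxes.append((x, y, w, h))
    return boxes
```

The hard parts the answer mentions (matching each box against a character database, then correcting the results with a dictionary) sit on top of this and are where a custom engine earns its keep.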
OpenCV is a great library to get up and running with, and to learn more about computer vision in general. I would recommend building off their examples, and playing around there. Have fun!

Streaming webcam to my webpage [closed]

I am trying to build a webpage that shows live video from a webcam.
Right now, I have no webcam, so I'll have to buy one. I don't mind how much it costs; I just want it working at a good resolution (at least 720p).
I don't know which kind of camera I should buy or which programming language is best for this (if possible, I would prefer not to use Flash).
Can you help me?
Sorry for my bad English, I'm trying to improve ;)
Alex
To show it in a webpage, you can use an IP camera. They cost a little more, but they can serve their images as an independent network node. They also support audio and live compression (H.264 and MPEG-4).
The best brand is Axis, but there are lots of options.
For Axis camera models, adding the view to a page can be as easy as adding this element:
<img src='http://192.168.1.20/axis-cgi/mjpg/video.cgi'>
This works in most browsers, but not in IE; for IE, they offer support as well.

How do I make an OCR Program? [closed]

I want to make a program that takes an image as input and outputs text. I know that I can use a neural network to turn an image of a single character into that character. The difficult part is: given an image with text in it, how would I produce all the rectangles around the individual characters? What method could I use to do that?
A basic approach is to make a histogram of black pixels (see the sketch below). First, project all pixels onto the vertical axis: the deep valleys in the histogram indicate the separation between text lines (try different angles if the paper might be tilted). Then, per line (or per page, if you know the font is monospaced), project the pixels onto a horizontal histogram. This gives you a strong indication of inter-character spaces, and as a minimum it gives you values for the average character height and width that will help you in the next steps.
After that, you need to take care of kerning (where characters overlap). Find the connected pixels, possibly after first applying dilation or erosion to the image to compensate for scanning artifacts.
Depending on the quality of the scan image you may have to use more advanced techniques, but this will get you going.
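A minimal numpy sketch of this projection idea (my own illustration; it assumes the page is already binarized, with 1 = ink and 0 = background, and not tilted):

```python
import numpy as np

def _runs(profile):
    # Return (start, end) pairs of the non-zero stretches in a 1-D profile.
    spans, start = [], None
    for i, v in enumerate(profile):
        if v > 0 and start is None:
            start = i
        elif v == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def character_boxes(binary):
    # binary: 2-D array, 1 = ink, 0 = background.
    boxes = []
    for y0, y1 in _runs(binary.sum(axis=1)):     # valleys split text lines
        line = binary[y0:y1]
        for x0, x1 in _runs(line.sum(axis=0)):   # empty columns split chars
            boxes.append((x0, y0, x1 - x0, y1 - y0))
    return boxes

# Example: two fake "characters" on one line
page = np.zeros((10, 10), dtype=int)
page[2:6, 1:3] = 1
page[2:6, 5:8] = 1
print(character_boxes(page))   # -> [(1, 2, 2, 4), (5, 2, 3, 4)]
```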
This doesn't sound like artificial intelligence; it sounds like you're talking about OCR:
http://en.wikipedia.org/wiki/Optical_character_recognition
See Google's Tesseract:
http://code.google.com/p/tesseract-ocr/
EDIT: The unedited question was asking about artificial intelligence.
To me the question per se does not seem clear.
Since it talks about OCR, I'll leave a couple of articles here that may help (they helped me, at least):
Improve OCR Accuracy
How to use image preprocessing to improve the accuracy of Tesseract
Also, as mentioned above, Tesseract is a good open-source OCR engine (the one I personally use as well, through the pytesseract Python wrapper). Another approach you could take is through sklearn.
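As a minimal example of the pytesseract route (the file name is just a placeholder):

```python
from PIL import Image
import pytesseract   # pip install pytesseract; also needs the tesseract binary installed

# 'page.png' stands in for your own image
print(pytesseract.image_to_string(Image.open('page.png')))
```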
You may also want to check this Stack Overflow post.
I am also pretty sure that you can use ResearchGate to check for papers out there (I found some; I'm just not sure if that's what you need).
I think that the above generic answer suits the generic question.