I would like to start on a Chinese handwriting recognition program for iPhone, but I couldn't find any library or API that can help me do so. It's hard for me to write the algorithm myself because my time is limited.
Some suggestions recommended that I use a back-end server to do the recognition work, but I don't know how to set up that kind of server.
Are there any suggestions or basic steps that could help me complete this personal project?
You might want to check out Zinnia. Tegaki relies on other APIs (Zinnia is one of them) to do the actual character recognition.
I haven't looked at the code, but I gather it's written in C or C++, so it should suit your needs better than Tegaki.
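The back-end server approach mentioned in the question could look roughly like the following. This is only a minimal sketch, assuming the app posts the pen strokes as JSON to a hypothetical `/recognize` endpoint. The JSON field names are invented for illustration, and `recognize_strokes` is a stub marking where a real engine such as Zinnia would be called.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def recognize_strokes(strokes):
    """Placeholder recognizer (hypothetical): a real server would hand the
    strokes to a recognition engine and return its ranked candidates."""
    return []


def handle_request_body(body, recognizer=recognize_strokes):
    """Turn a JSON request body into (status_code, JSON response bytes).

    Assumed request shape: {"strokes": [[[x, y], ...], ...]}
    """
    try:
        payload = json.loads(body.decode("utf-8"))
        strokes = payload["strokes"]
        if not isinstance(strokes, list):
            raise ValueError("strokes must be a list")
    except (ValueError, KeyError, TypeError) as exc:
        return 400, json.dumps({"error": str(exc)}).encode("utf-8")
    candidates = recognizer(strokes)
    return 200, json.dumps({"candidates": candidates}).encode("utf-8")


class RecognitionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/recognize":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_request_body(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Run the server until interrupted."""
    HTTPServer(("", port), RecognitionHandler).serve_forever()
```

Calling `serve()` starts the server. The iPhone app would then only need to collect touch points and send them in an HTTP POST, which keeps the recognition code off the device.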
I want to create an app for OS X which would work as an add-on (displaying some overlaying information) to another app. Something like Poker Tracker, for example: it shows extra information for poker games while playing on tables.
Just wondering is it possible using Swift? Can you point me to some direction what to look for? Are there libraries that help with such a case? I have never developed anything for OS X but am keen to learn.
Thanks in advance.
Just wondering is it possible using Swift?
Yes, you can use Swift to create macOS applications. It's not magic, though -- your Swift code can only do things that are actually possible.
Can you point me to some direction what to look for?
Look for an API that lets other apps interact with the host application. That API will define what your "add on" application can reasonably do.
Without some sort of API or scripting interface, it's going to be very difficult to write a program that interacts with the host application. The best option is probably the Accessibility API in macOS. Accessibility is meant as an assistive technology, but it's often repurposed for tasks like automated testing. You might be able to use it to gain some level of control over the host app.
As far as I know, the host app doesn't expose any API, so the information would need to be scraped from the screen image.
This is really a tall order, and doubly so if you're asking basic questions about language capabilities. I think you'd have much better luck creating an efficient user interface so that the user could enter the relevant information directly, e.g. what cards the other users are showing, bet sizes, etc.
I want to start doing some of my coding by voice recognition software (maybe 10-20% of the work I do).
I've seen that some people have had success with Dragon NaturallySpeaking (DNS) software, but I use a Mac, and unfortunately, Dragon only works on Windows.
Has anyone used Carnegie Mellon's open-source Sphinx (http://cmusphinx.sourceforge.net/) for programming?
Are there other options that I could implement on a Mac? I don't mind dropping a little bit of cash to make this a reality. Ideally it would be a system where I could add in my own commands. (Check out the awesome stuff this guy did, with DNS: https://www.youtube.com/watch?v=8SkdfdXWYaI)
There is a prototype plugin for IDEA written by JetBrains developers. The work was done during one of their hackathons.
If you are not tied to Sphinx, I would recommend Kaldi as an adaptable, compatible open-source speech recognizer. With Kaldi you can adapt your own grammar and commands and retrain the underlying models. In addition, there is a Python wrapper that makes using Kaldi easy and convenient.
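The "add in my own commands" part is largely independent of which recognizer is used. A minimal sketch, assuming the recognizer (Sphinx, Kaldi, or anything else) delivers each utterance as a plain text string. The phrases and snippets in the table are invented examples, not part of either toolkit.

```python
# Hypothetical command table: spoken phrase -> text to insert in the editor.
COMMANDS = {
    "define function": "def ():",
    "new line": "\n",
    "print statement": "print()",
}


def expand_utterance(text, commands=COMMANDS):
    """Return the snippet for a recognized phrase, or the text itself
    (treated as plain dictation) when no command matches."""
    key = " ".join(text.lower().split())  # normalize case and whitespace
    return commands.get(key, text)
```

Keeping the table as plain data means new commands can be added without touching the recognizer setup.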
We need to build an image processing application for smartphones (mainly for iPhone). The operations consist of:
image filtering
composite geometrical transformation
region growing
Also, the user is required to specify (touch) some parts of the image. Those parts would serve as inputs for the app. Example: the eyes and lips in a face.
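For reference, region growing from a touched point can be sketched as follows. This is a minimal sketch, assuming a grayscale image stored as a list of rows and a fixed intensity tolerance relative to the seed pixel. The function name and the 4-neighbour rule are illustrative choices, not taken from our desktop version.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Return the set of (row, col) pixels connected to `seed` whose
    intensity is within `tolerance` of the seed pixel's intensity."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```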
We have built a desktop version of this app. The processing part is quite heavy and we used the BufferedImage class extensively.
Should we use Codename One for building this app? If not, what alternatives do you suggest?
Please consider the following factors:
Performance
Ease of writing the code (for image processing)
I gave an answer for this in our discussion forum, but I think it's a worthwhile question for a duplicate post:
Generally for most platforms you should be fine in terms of performance except for iOS & arguably Windows Phone.
Codename One is optimized for common use cases. Since iOS doesn't allow JITs, it can never be as fast as Java on the desktop, because some optimizations, e.g. array bounds check elimination, are really hard to do. So every access to an array will contain a branch check, which can be pretty expensive for image processing.
Add to that the fact that we don't have any image processing APIs other than basic ARGB and you can get the "picture": it just won't be efficient or easy.
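To illustrate what working with nothing but basic ARGB involves, here is a minimal sketch, assuming the pixels arrive as a flat array of packed 32-bit 0xAARRGGBB integers. It is written in Python for brevity rather than in Java, and the integer luma weights (an approximation of the Rec. 601 coefficients) are an assumption chosen for the example.

```python
def to_grayscale(argb_pixels):
    """Convert packed 0xAARRGGBB pixels to gray, preserving alpha."""
    out = [0] * len(argb_pixels)
    for i, p in enumerate(argb_pixels):
        a = (p >> 24) & 0xFF
        r = (p >> 16) & 0xFF
        g = (p >> 8) & 0xFF
        b = p & 0xFF
        # Integer approximation of 0.299 R + 0.587 G + 0.114 B
        y = (77 * r + 150 * g + 29 * b) >> 8
        out[i] = (a << 24) | (y << 16) | (y << 8) | y
    return out
```

Even this simple filter reads and writes an array element for every pixel, so a bounds check on each access is paid millions of times per image.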
The problem is that this is a very specific field; I highly doubt you will find any solution that will help you with this sort of code. So your only approach AFAIK is to build native code to do the actual image processing heavy lifting.
You can do this with Codename One by using the NativeInterface API, which allows you to implement critical code natively, and use cn1libs to wrap it as libraries. You would then get native performance for that portion of the code, but this only makes sense for critical sections. If you write a lot of native code, the benefits of Codename One start to dissipate and you might as well go fully native.
How would you go about comparing a spoken word to an audio file and determining if they match? For example, if I say "apple" to my iPhone application, I would like for it to record the audio and compare it with a prerecorded audio file of someone saying "apple". It should be able to determine that the two spoken words match.
What kind of algorithm or library could I use to perform this kind of voice-based audio file matching?
You should look up acoustic fingerprinting; see the Wikipedia link below. Shazam is basically doing it for music.
http://en.wikipedia.org/wiki/Acoustic_fingerprint
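Fingerprinting is aimed at matching the same recording, whereas two utterances of "apple" differ in speed and timing. A technique often used for comparing two utterances of different length is dynamic time warping (DTW), named here as an alternative rather than something the answer above proposes. A minimal sketch, assuming each recording has already been converted into a sequence of numeric feature frames (for example MFCC vectors, whose extraction is not shown). The threshold for calling two words a match would have to be tuned on real data.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of
    equal-dimension feature vectors (lists of floats)."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of seq_b is reused for the next seq_a frame
                cost[i][j - 1],      # frame of seq_a is reused for the next seq_b frame
                cost[i - 1][j - 1],  # both sequences advance by one frame
            )
    return cost[n][m]
```

A slowed-down copy of a sequence yields a distance of zero, which is the property that makes DTW tolerant of speaking rate.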
I know this question is old, but I discovered this library today:
http://www.ispikit.com/
Sphinx does voice recognition, and PocketSphinx has been ported to the iPhone by Brian King.
Check https://github.com/KingOfBrian/VocalKit
He has provided excellent details and made it easy to implement for yourself. I've run his example and modified my own rendition of it.
You can use a neural network library and teach it to recognize different speech patterns. This will require some know-how about the general theory of neural networks and how they can be used to create systems that behave a particular way. If you know nothing about the subject, you can get started on just the basics and then use a library rather than implementing something yourself. Hope that helps.
I am pretty dissatisfied with all the available media players, and I was also looking for a major project to really get into programming, so I am thinking of writing my own media player. Or, to be more accurate, a GUI front-end for mplayer (something similar to SMPlayer). How hard would this be? I have plenty of time (months) and am willing to learn anything.
I have practically no knowledge of any Windows/GUI libraries. My programming experience: I have tried lots of different languages, written a couple of websites in PHP, and had lots of practice in Java (although nothing major). That's all.
Can someone provide some guidance about where to get started, what to read, and which language to use? Is C#/.NET a good choice for this? Since I am no expert in any language and have dabbled in plenty of them, I think I can pick up any language. My main concern is my lack of practical knowledge, so please guide me.
Lastly, my preference is Windows (haha, whatever), so that's my target and that's where I'll be doing my coding.
To sum it up, I want to create a GUI front-end for mplayer that would work on Windows.
Thanks
Edit: by mplayer I mean MPlayer (the Linux one), not Windows Media Player.
One good place to start could be looking at how the code for gmplayer works - gmplayer is the graphic frontend for mplayer on Linux. It could be that all you really need to do is port the gmplayer code to Windows, then you get a fully integrated GUI instead of just a frontend.
Also, feature request: a nice friendly UI for putting video / audio effects on the output stream (it is so hard to use in the CLI version that most mplayer users probably don't even know it is in there).
I know what I'm going to recommend is not what you're looking for, BUT:
I'd create a front-end for VLC, which uses Qt, a GUI framework which is extremely usable and easy to start with, in C++.
From my experience as a user, VLC is also more stable and has more features.
Start by copying a working implementation. As you mentioned, SMPlayer exists as a working example of what you want. I'd recommend starting by either hacking it to work better (the playlist really needs more intuitive controls, and multiple monitor support in Windows was nonexistent last time I tried it) or trying to duplicate it in your language of choice.
The benefits of hacking on an existing project include: the existing codebase works, the amount of work required to make a noticeable change is much smaller, and the existing developers are able to help you come up to speed with the internals. Also, learning the project's language (C++) would be useful, though it may not be worth the effort if it's more interesting to copy its features in your favorite language.
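Whichever language you pick, front-ends such as SMPlayer typically drive MPlayer as a child process in slave mode and send it text commands over standard input. A minimal sketch, assuming an `mplayer` executable is on the PATH. The helper names are invented for illustration, while `-slave`, `-quiet`, `-idle`, `-wid` and the `pause`, `seek` and `quit` commands are MPlayer's own.

```python
import subprocess


def mplayer_args(media_path, window_id=None, executable="mplayer"):
    """Build the command line for running MPlayer under a GUI's control."""
    args = [executable, "-slave", "-quiet", "-idle"]
    if window_id is not None:
        # Ask MPlayer to render into an existing window owned by the GUI.
        args += ["-wid", str(window_id)]
    args.append(media_path)
    return args


def slave_command(name, *params):
    """Format one newline-terminated slave-mode command."""
    return " ".join([name, *map(str, params)]) + "\n"


def start_player(media_path, window_id=None):
    """Launch MPlayer; write slave_command(...) strings to .stdin to control it.

    Output is discarded here for simplicity; a real front-end would read
    stdout to track playback state.
    """
    return subprocess.Popen(
        mplayer_args(media_path, window_id),
        stdin=subprocess.PIPE,
        stdout=subprocess.DEVNULL,
        text=True,
    )
```

A GUI button handler would then do something like `player.stdin.write(slave_command("pause"))` followed by `player.stdin.flush()`, so the GUI toolkit only has to provide the window and the controls.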
C# is great for creating any desktop GUI quickly. The best way to start with the GUI design is to play a bit with the drag-and-drop components available in Visual Studio. For the functionality you can use this: http://msdn.microsoft.com/en-us/library/dd564585%28VS.85%29.aspx .