Best open source speech recognition API and engine - neural-network

I am new to speech recognition and am building a speech recognition project for the PC (running Windows 8). The project should provide basic functionality such as accurate dictation in email, Notepad, and similar applications, and should respond to local PC commands.
I am using Sphinx4 for the project. Is there a better open source API than CMU Sphinx? By better I mean higher accuracy and support for a large vocabulary.
Is Kaldi (deep neural network based) better than CMU Sphinx (HMM based)? Which one is better suited to which task?
What is the difference between a speech API and a speech engine, and which will I need as a developer to build my software?
Please help me get a clear picture of the questions above and, if possible, point me to a community of speech recognition developers or researchers.

Related

Speech recognition using Openears framework?

OpenEars is a speech recognition (speech-to-text) framework for the iPhone (iOS devices). I have installed the OpenEars demo app on my iPhone. It works well, but only for a short list of words such as GO, CHANGE, MODEL. Can speech recognition be made more generic for real-time use, that is, not limited to a few words?
OpenEars:
http://www.politepix.com/openears/
You have to use a new language model instead of the default one.
The language model is the vocabulary that you want OpenEars to understand, in a format that its speech recognition engine can understand.
The smaller the language model is, and the better adapted it is to your users' real usage, the better the accuracy.
An ideal language model for PocketsphinxController has fewer than 200 words.
You can create a new language model dynamically through the LanguageModelGenerator class.
See the details about LanguageModelGenerator and the OpenEars basic concepts here
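As a rough illustration of keeping the vocabulary small, the sketch below collects the distinct words used by a set of command phrases and checks them against the 200-word guideline quoted above. It is plain Python, not OpenEars code: the names `build_vocabulary`, `fits_guideline`, and the sample phrases are hypothetical, and the real language model would still be generated by LanguageModelGenerator.

```python
import re

MAX_WORDS = 200  # size guideline quoted in the answer above


def build_vocabulary(phrases):
    """Return the sorted list of distinct upper-case words used in the phrases."""
    words = set()
    for phrase in phrases:
        # Keep letters and apostrophes only; everything else separates words.
        words.update(re.findall(r"[A-Z']+", phrase.upper()))
    return sorted(words)


def fits_guideline(vocabulary, limit=MAX_WORDS):
    """True when the vocabulary has fewer words than the guideline."""
    return len(vocabulary) < limit


# Hypothetical command phrases for an app.
commands = ["go", "change model", "open mail", "go back"]
vocabulary = build_vocabulary(commands)
print(vocabulary)
print(fits_guideline(vocabulary))
```

Counting words this way before generating a model is only a sanity check; it says nothing about how well the phrases match what users actually say.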
Note:
Please post queries about OpenEars only in their forum.
You can see more speech-to-text SDKs here

Quick way to perform speech recognition of very short vocabulary in iPhone

I am programming an app for research purposes. I need a quick way to perform speech recognition on a very small vocabulary (as small as 5 words in the entire dictionary). I know of many speech recognition frameworks such as OpenEars, the AT&T Watson Speech API, and Dragon, but they require a lot of time spent reading documentation.
Since the focus of our app is not speech recognition, we want to do it quickly. I know that with only 5 words I could replace them with 5 choices instead, but that is not appropriate here.
Any ideas on this? Thank you.
OpenEars developer here. OpenEars has a quickstart tutorial that can get you started recognizing a small vocabulary in about 5-10 minutes: http://www.politepix.com/openears/tutorial
If you are not restricted to iOS, VoxForge can be a good starting point.
http://www.voxforge.org/home/downloads
I also got a lot of help from this site with HTK when I wrote my thesis.
The site contains a step-by-step training procedure, which should be useful for you.
I hope this helps with small-vocabulary speech recognition.
(HTK itself has a sample training procedure for 10 digits.)

How to recognize the human voice by code in iphone?

I want to integrate voice detection into my iPhone app. The app allows the user to search for a word by voice, but I don't know anything about voice recognition on the iPhone. Can you please suggest any ideas, tutorials, or sample code for this?
You can also use the Google Chrome API to integrate voice recognition into your application, but there is a big problem: the API works only with FLAC-encoded files, and this encoding isn't supported natively on iOS... :/
See these two links for more information:
http://www.albertopasca.it/whiletrue/2011/09/objective-c-use-google-speech-iphone/
http://8byte8.com/blog/2012/07/voice-recognition-ios/
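Regarding the FLAC requirement mentioned in this answer, a quick check before uploading a recording is to look at its first four bytes, since a native FLAC stream begins with the marker `fLaC`. The sketch below is a minimal Python illustration with hypothetical helper names; it does not encode audio and does not call any Google API, and a file with extra metadata prepended before the marker would fail this check.

```python
FLAC_MARKER = b"fLaC"  # first four bytes of a native FLAC stream


def looks_like_flac(data):
    """Return True if the bytes begin with the FLAC stream marker."""
    return data[:4] == FLAC_MARKER


def file_looks_like_flac(path):
    """Read only the first four bytes of a file and test them."""
    with open(path, "rb") as handle:
        return looks_like_flac(handle.read(4))


# Made-up byte strings standing in for the start of a FLAC and a WAV file.
print(looks_like_flac(b"fLaC\x00\x00\x00\x22"))
print(looks_like_flac(b"RIFF\x24\x00\x00\x00WAVE"))
```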
EDIT:
I built an application with voice recognition using the Nuance SDK, but it is not free to use. You can register for free and get a developer key that lets you test your application for 90 days. An example application is included; you can read the code, and it is very easy to implement.
Good luck :)
The best approach will probably be to:
Record the voice on the phone
Send the recording to a server that runs the speech recognition software
Then return something to the phone to indicate what it should do
This approach is favorable because a number of open source speech-to-text packages are available and you are not limited by computing power on the backend.
Having said that, iOS has OpenEars, which is based on PocketSphinx. It looks promising...
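The record, upload, and respond flow described in this answer could look roughly like the following on the server side. This is a minimal sketch using only the Python standard library, under several assumptions: `recognize` is a placeholder where a real speech recognition engine would be called, and the `ACTIONS` table, `handle_upload`, `UploadHandler`, and `run` names are hypothetical. The phone would POST the recorded audio as the request body and read back a small JSON reply.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from recognized text to an action name for the phone.
ACTIONS = {"search": "open_search", "go back": "navigate_back"}


def recognize(audio_bytes):
    """Placeholder: a real server would call a speech recognition engine here."""
    raise NotImplementedError("plug in a speech recognition engine")


def handle_upload(audio_bytes, recognizer=recognize):
    """Turn an uploaded recording into the reply sent back to the phone."""
    text = recognizer(audio_bytes).strip().lower()
    return {"text": text, "action": ACTIONS.get(text, "unknown")}


class UploadHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Receive the recording that the phone sent as the request body.
        length = int(self.headers.get("Content-Length", 0))
        audio = self.rfile.read(length)
        try:
            status, reply = 200, handle_upload(audio)
        except NotImplementedError as exc:
            status, reply = 501, {"error": str(exc)}
        # Return something that tells the phone what it should do.
        body = json.dumps(reply).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("", port), UploadHandler).serve_forever()


# Demo with a fake recognizer so the sketch runs without any engine installed.
print(handle_upload(b"\x00\x01", recognizer=lambda audio: "Go back"))
```

Calling `run()` would start the server on port 8000. Keeping `handle_upload` separate from the HTTP handler makes it possible to swap in a real recognizer later without touching the networking code.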
Well, voice recognition is not specific to the iPhone. All you can do on the iPhone is record the voice. Once that is done, you can either write your own voice recognition module or find a third-party API and reuse it.
You can do a Google search for that.

free speech recognition engines for iOS?

I am looking for some free speech recognition engines to use in my iPhone application. Can you suggest any?
Nuance just opened the Dragon Mobile SDK to developers (they are industry leaders). Have a look at NDEV Mobile.
There are a couple of wrappers of the Sphinx speech recognition engine (http://cmusphinx.sourceforge.net/) available for iOS:
https://github.com/KingOfBrian/VocalKit
From reading his summary, you can see that he actually points people towards http://www.politepix.com/openears

How to do Map Navigation for iPhone support?

Here is Google Maps Navigation for Android phones:
http://www.youtube.com/watch?v=lwggXqMZZ8w
Can we achieve this functionality on the iPhone? Is there a map API that provides directions and details based on GPS as well as on text (source to destination)?
If yes, please provide some links.
It is against Google's terms of service (relating to MapKit) to do turn-by-turn navigation; however, you can work around this by using the Google Maps web API and writing custom code.
For speech recognition, I used OpenEars, which was quite easy to use and works really well for text-to-speech and recognition.
OpenEars is an open source framework for performing continuous speech
recognition, text-to-speech, and language model generation in iOS. It
uses the CMU Pocketsphinx, CMU Flite and MITLM libraries.