What is the current best speech recognition API for iOS to match a few keywords? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
I am looking for an API for iOS (ideally free) that can do some speech recognition. I have seen a few posts on this (iPhone speech recognition API? and free speech recognition engines for iOS?), and after a bit of research I have gathered these SDKs that look quite interesting:
http://dragonmobile.nuancemobiledeveloper.com/public/index.php?task=home
http://www.politepix.com/openears
http://www.creaceed.com/ceedvocalsdk/ (not free :-\ )
http://www.ispeech.org/
Do any of these really stand out from the crowd, and are they reasonably recent? How do they differ from each other?

If you want to track just a few keywords, you should not look for a speech recognition API or service. This task is called keyword spotting, and it uses different algorithms than speech recognition. Speech recognition tries to find all the words that have been said, and because of that it consumes far more resources than keyword spotting. A keyword spotter only tries to find a few selected keywords or keyphrases. It is much simpler and much less resource-consuming.
The only possible solution to achieve this functionality is to use an open source package like OpenEars, powered by Pocketsphinx:
http://www.politepix.com/openears
OpenEars has the Rejecto plugin, which implements something similar.
Pocketsphinx itself has recently implemented effective open source keyword spotting too, but it has not made it into OpenEars yet. It is only available through the Pocketsphinx API: you need to create a kws search and set the target word to look for. I hope this functionality will reach OpenEars soon.

Nuance gives developers free access (but not for high volume) - See http://www.masshightech.com/stories/2011/09/26/daily13-Nuance-tweaks-mobile-dev-program-with-free-access-to-Dragon.html or http://dragonmobile.nuancemobiledeveloper.com/public/index.php?task=home
Nuance services are typically offered commercially and require up front fees and transaction fees. The interesting news above is that they now make low volume use of their services available to developers for free. So, for development, testing, and demonstration you can probably use the free Nuance services. However, unlike the Google services that come free in Android, if your app has thousands of users you will likely have to pay for Nuance services.

We have been developing CeedVocal SDK since 2008. It is based on the Julius and FLite open source projects.
Here's some context: we wanted to make our app (Vocalia) for speech recognition back in 2008, so we picked Julius (we hesitated over Pocket Sphinx, which appears to be good as well) and optimized its file format so that it would boot in 1-2 sec instead of 20 sec on the original iPhone. Then we dutifully trained our own acoustic models in 6 languages. We designed the API, and eventually decided to offer it to other developers as an SDK.
CeedVocal basically supports 2 modes of operation:
matching of words (or small phrases)
keyword spotting
In the first mode of operation, it tries to align the input speech to a word (or phrase) in its list of acceptable input. This forces the input to a pre-known word, even if the speech is something else. Accuracy is good. In the second mode of operation, it tries to spot one of its keywords within a stream of speech. This is a difficult case, and it can be less accurate.
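The difference between the two modes can be sketched with a toy analogy in Python. This is only an illustration on text tokens, not on audio, and it has nothing to do with CeedVocal's actual API: the vocabulary, the function names, and the 0.8 threshold are all assumptions, and string similarity from `difflib` stands in for an acoustic score.

```python
import difflib

VOCABULARY = ["play", "pause", "next", "previous"]  # hypothetical command list


def similarity(a, b):
    """Stand-in for an acoustic score: plain string similarity in [0, 1]."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def match_word(heard, vocabulary=VOCABULARY):
    """Mode 1: force the input onto the closest vocabulary entry."""
    return max(vocabulary, key=lambda word: similarity(heard, word))


def spot_keywords(stream, vocabulary=VOCABULARY, threshold=0.8):
    """Mode 2: scan a stream of words and report only confident hits."""
    hits = []
    for position, token in enumerate(stream):
        best = match_word(token, vocabulary)
        if similarity(token, best) >= threshold:
            hits.append((position, best))
    return hits


sentence = "please pause the music and play the nxt song".split()
forced = match_word("banana")     # always returns *some* vocabulary word
spotted = spot_keywords(sentence)  # ignores tokens below the threshold
```

Mode 1 always answers with a vocabulary entry, even for unrelated input, while mode 2 has to decide for every token whether anything was said at all, which is why it is the harder case.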


Create VST plug-ins [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 1 year ago.
I want to create a third-party plug-in for Serato (software for DJs).
I searched their site and saw that Serato supports VST (VST2) plug-ins. So my question now is: what should I read in order to create a VST plug-in?
Thank you in advance.
A good starting point would be the Wikipedia page on VST, just to get the basics if you are not familiar with the technology. First, you need to know the creator of VST: Steinberg.
VST SDK is a set of C++ classes based around an underlying C API. The
SDK can be downloaded from their website.
Therefore I would recommend starting with something simple. Let’s review a few options:
JUCE
This technology is trending for a few reasons, like their homepage says:
With support for PC, Mac and Linux, JUCE is the perfect tool for
building powerful and complex applications. JUCE also supports the
development of plug-ins: VST, AU and AAX. Run your desktop
applications on mobile! One-click deployment to Android and iOS
(requires Android Studio and XCode) Adjust the user interface of your
application with the Projucer live coding engine Use the best audio
performance available on iOS and Android.
So the pros of this technology are the big community, multi-platform support, and that it is free, at least for non-commercial development (if you want to sell your product you have to pay). The con is that you need a little more than the basics of C++ to get started. Fortunately there are a lot of tutorials on their page, on YouTube and elsewhere on the internet, and the community is growing, so if you have issues you can always ask.
SynthEdit and FL SynthMaker
If you don’t want to get into the code that fast you can start practicing with these, as they don’t require programming expertise, or only a few basics.
SynthEdit is a framework and a visual circuit design that allows you
to create your own synths with only drag & drop without programming.
Therefore giving you the flexibility of using your DSP algorithms
inside the modules.
This is handy if you want to get going quickly. It currently has a cost, which you can check on the official website.
FL SynthMaker, also known as Flowstone, comes free with FL Studio. It has a straightforward drag-and-drop graphical interface and a wide range of components. You can use it to code modules and DSP in Ruby, and it comes with loads of examples to get started quickly. Its capacity to help you build a prototype in a short time is a plus.
FLowstone is a programming application that is used to create virtual
instruments effects and computer control of external hardware without
the need to write basic code. The instruments and effects you create
in SynthMaker can be used in FL Studio as 'native' plugins and shared
with other FLowstone users.
MAX MSP
Max, also known as Max/MSP/Jitter, is a visual programming language for music and multimedia developed and maintained by San Francisco-based software company Cycling '74. Over its more than thirty-year history, composers, performers, software designers, researchers, and artists have used it to create recordings, performances, and installations.
The Max program is modular, with most routines existing as shared
libraries. An application programming interface (API) allows
third-party development of new routines (named external objects).
Thus, Max has a large user base of programmers unaffiliated with
Cycling '74 who enhance the software with commercial and
non-commercial extensions to the program. Because of this extensible
design, which simultaneously represents both the program's structure
and its graphical user interface (GUI), Max has been described as the
lingua franca for developing interactive music performance software.
SOUL
The SOUL project is creating a new language and infrastructure for
writing and deploying audio code. It aims to unlock improvements in
latency, performance, portability and ease-of-development that aren't
possible with the current mainstream techniques that are being used.
SOUL unlocks native-level speed, even when hosted from slower, safer
languages. The SOUL language makes audio coding more accessible and
less error-prone, enhancing productivity for both beginners and expert
professionals.
Maximilian
Maximilian is a cross-platform and multi-target audio synthesis and signal processing library. It is written in C++ and provides bindings to JavaScript. It's compatible with native implementations for MacOS, Windows, Linux and iOS systems, and client-side browser-based applications. The main features are:
sample playback, recording and looping
support for WAV and OGG files
a selection of oscillators and filters
enveloping
multichannel mixing for 1, 2, 4 and 8 channel setups
controller mapping functions
effects including delay, distortion, chorus, flanging
granular synthesis, including time and pitch stretching
atom synthesis
real-time music information retrieval functions: spectrum analysis, spectral features, octave analysis, Bark scale analysis, and MFCCs
example projects for Windows and MacOS, using command line and OpenFrameworks environments
example projects for Firefox and Chromium-based browsers using the Web Audio API ScriptProcessorNode (deprecated!)
example projects for Chromium-based browsers using the Web Audio API AudioWorklet (e.g. Chrome, Brave, Edge, Opera, Vivaldi)
Extras
A few months ago I found a community focused on audio programming, called The Audio Programmer. They have a YouTube channel with hundreds of tutorials and a Discord server where you can ask questions, show your projects, or even find a job, if you are interested.
Hope this helps you get started. I know there are a lot of options out there and this might confuse you at the beginning, but I hope this little guide helps you choose a good starting point depending on your needs and goals, since every technology offers different things.

iPhone graphic design advice for developers [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
I'm a developer who is making an app without a graphic designer for the first time. I am competent at making user interfaces that fit conventions and the Apple Human Interface Guidelines, but when it comes to adding that extra layer of decoration to make the app sexy, I'm totally inexperienced.
Does anyone have any pointers or resources for helping developers such as myself act like graphic designers, in particular for iPhone apps?
I have a technical knowledge of Photoshop, without having any artistic ability with it. I like to believe that I have a good eye for judging aesthetics, but I've never been good at creating something aesthetically pleasing from scratch.
"Acting as" requires being one, so learn the basics of graphic design. One popular book for beginners is The Non-Designer's Design Book. It's not about Photoshop, it's about recognizing why a design works to improve your judgement. There is more logic behind it than you may think. Usually being pleasing is the same as conveying useful information, "design is how it works as much as how it looks".
Review screenshots of existing iOS apps: Pttrns, Well Placed Pixels, Beautiful Pixels, or keep your own collection using LittleSnapper and CandyBar.
Unfortunately, most tutorials are step-by-step instructions to reach a goal, and they don't say much about why or how combining certain effects works. Then there are a lot of subtleties that you will have to dig out of blog posts. Erik Tjernlund posted a good link (flyosity.com); here is another (bjango.com). These details create immediate trust from the user. There are plenty of tutorial sites on Google, but learning PS is a long-term goal.
An (off-topic) option now is to buy professional services. For example, Articles by Sophia Teutschler got help from the Iconfactory. It's cost effective to invest your time in what you do best so that you can pay for what they do best.
I really like Mike Rundle's (#flyosity) blog post – "Crafting Subtle & Realistic User Interfaces" – as a good, hands-on introduction on how to think about creating beautiful user interfaces. Follow some of his advice and your apps will automatically look much better.
To get inspiration, I highly recommend the Pttrns site. Look at how different apps solve common tasks.
My last advice is to practice a lot. My experience is that using the most commonly used tools (Photoshop and Illustrator) doesn't come naturally for us developers. Seeing a professional using these tools can sometimes be a real eye-opener. Especially workflow and how they use the tools to guide them in the creative process.
I frequently visit this website: http://app.itize.us/wp/
Not to directly copy others' designs or functionality, but I always get ideas there on how to design GUI elements, often by mixing many of the different styles. I would also recommend that you just play with all of the different layer options you get when you double-click a layer in Photoshop. I learned a lot by doing that!
The Web Designers Guide to iOS Apps is excellent but it does focus on NimbleKit. If you're not using NK the design discussions are still valuable.
You can follow tutorials here. I am not very familiar with Photoshop/Illustrator, but maybe these tutorials will be helpful.
Having a "good eye" and knowing what looks nice is good, but if you don't have that initial "vision" then you will be spending a lot of time playing around until you stumble on the design that looks good and even then you may never reach that point.
As developers, we are very good at following the guidelines put down by Apple and making sure that we follow those - after all it's a nice logical set of rules to follow and that's exactly what we do when we write code - follow logical rules.
Unfortunately the design side of things doesn't have rules that we can follow. Yes, we may be technical at using Photoshop or some other drawing application, but when it comes to actually having that spark of inspiration, that's not something we can just click a button for.
Looking at other applications is one way to go. But then you may end up having an app that looks like another app or a collection of a number of apps and then you may have problems with a fluid user interaction.
My own approach to this problem was to go out and find someone who is really good at doing that art stuff and working with them. I struggled for a long time designing my own stuff, but looking back, it was obvious it was a developer (me) doing the design. I'm not sure what it is, but there's an extra something that these graphic artists seem to be able to do that I just can't get and that makes all the difference.
But the flip side to this is that he can't code. Sometimes it's best to just stick to what you're best at.

What are good resources to learn about web services for an iOS developer? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I am a C# developer specializing in WinForms and web applications. I work in the financial field, and most of my experience is in connecting to Oracle/SQL/Sybase, getting data, and displaying it on screen.
Recently I taught myself how to develop iPhone / iPad applications. It went very well. Now I want to learn how to connect to my own databases using web services and get or upload data.
So I need to learn Web services, SOAP, WSDL and whatever else I need. I don't have any experience with it, but if someone can direct me to the right books I will buy and read them. I want to start writing in my office, connect to my databases, and be able to do a proof of concept. Any ideas?
Particularly because you're starting out, I would suggest looking at RESTful services. The API is essentially a URL using HTTP GET, PUT, POST, or DELETE. The output can be XML, JSON, whatever you want. Very simple to construct and test. And because the API is so simple, you don't necessarily need to add another library to your project and increase the code size.
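To make the "API is essentially a URL plus an HTTP verb" point concrete, here is a minimal sketch in Python using only the standard library. The endpoint `https://api.example.com/scores`, the payload fields, and the helper name `build_request` are hypothetical; the requests are only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/scores"  # hypothetical endpoint


def build_request(method, resource_id=None, payload=None):
    """Build (but do not send) an HTTP request for one REST operation."""
    url = BASE_URL if resource_id is None else f"{BASE_URL}/{resource_id}"
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        # The body is JSON here; XML would work the same way.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


requests_by_action = {
    "list": build_request("GET"),
    "create": build_request("POST", payload={"player": "ann", "score": 42}),
    "update": build_request("PUT", resource_id=7, payload={"score": 50}),
    "delete": build_request("DELETE", resource_id=7),
}
```

Passing any of these objects to `urllib.request.urlopen` would perform the call against a real server. The same four-verb shape maps directly onto whatever HTTP client you use on iOS.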
The second chapter of the book iPhone Games Projects talks about how to use a RESTful rankings system to record game scores to a server from an iPhone game.
I don't think you need books for that subject. Try using some APIs to make your life easier, and check some code samples. For example:
For JSON, SBJSON is a standard: https://github.com/stig/json-framework/
For XML, have a look at this Apple example: https://developer.apple.com/library/ios/#samplecode/SeismicXML/Introduction/Intro.html#//apple_ref/doc/uid/DTS40007323
An easy way to retrieve info from an online database is to use PHP. You simply call a PHP file located on your server from your iPhone application. The PHP file takes in the info you sent it via GET/POST (if required), retrieves info from your database, and echoes it as XML/JSON.
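The flow described above (read request parameters, query the database, echo JSON) looks roughly like this. The sketch is in Python rather than PHP, and uses an in-memory SQLite database so it runs on its own; the `quotes` table, its columns, and the function names are made up for illustration.

```python
import json
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical 'quotes' table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE quotes (symbol TEXT, price REAL)")
    db.executemany("INSERT INTO quotes VALUES (?, ?)",
                   [("AAPL", 101.5), ("MSFT", 55.25)])
    return db


def handle_request(db, params):
    """Read request parameters, query the database, return a JSON string."""
    symbol = params.get("symbol")
    if symbol is None:
        rows = db.execute("SELECT symbol, price FROM quotes ORDER BY symbol")
    else:
        # Parameterized query: never paste client input into the SQL text.
        rows = db.execute(
            "SELECT symbol, price FROM quotes WHERE symbol = ?", (symbol,))
    return json.dumps([{"symbol": s, "price": p} for s, p in rows])


db = make_demo_db()
body = handle_request(db, {"symbol": "AAPL"})  # what the app would receive
```

On the device you would then parse `body` with a JSON library. Whatever the server language, keep the query parameterized as shown, since the parameters come straight from the client.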

How do you collaboratively write specs? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
I am working with a small team (2 others) of developers that are geographically dispersed, and I'm looking for good ways for us to collaborate on specs... We're thinking we might use Google Docs to write the spec in so we can all have access to modify it in a central location.
What have you done? What good ideas do you have?
If you have an intranet or VPN, I would actually consider installing and using a small Wiki for these specs.
Compared to Google docs you get:
Much better versioning and change tracking (IMHO)
Much easier to start new documents for subsections
An actual markup rather than WYSIWYG (a matter of taste, I prefer LaTeX to Word).
Possible to attach variety of other file types
Very easy to backup
Very easy to create an offline version
You don't have to worry about storing sensitive materials elsewhere.
The disadvantage is that it is not WYSIWYG, which may or may not be an issue to you.
Of course, you can pick a Wiki implementation that supports a better editor, and possibly even a synchronous collaboration one.
Google Wave - exactly what it's meant for - collaboration
IMHO, a word processor is the wrong tool for a programmer. A spec should be written in a plain text editor, and utilize lightweight markup such as reStructuredText, AsciiDoc etc.
The benefits of such an approach are:
There are excellent tools to manage plain text, that are already in the hands of programmers (VCS, automated build systems, diff, patch, programming editors, grep, etc.)
A markup language allows for expressing intent rather than formatting.
With that in mind, a wiki seems to be the obvious choice.
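As a small illustration of the plain-text tooling argument, here is a sketch in Python that diffs two revisions of a spec with the standard `difflib` module. The spec text and the revision labels `spec.rst@r1` / `spec.rst@r2` are invented for the example; in practice your VCS produces this for you.

```python
import difflib

old_spec = """Login
=====
Users sign in with an email address.
Sessions expire after 30 minutes.
""".splitlines(keepends=True)

new_spec = """Login
=====
Users sign in with an email address or a username.
Sessions expire after 30 minutes.
""".splitlines(keepends=True)

# A unified diff shows exactly which requirement changed between revisions.
diff = list(difflib.unified_diff(old_spec, new_spec,
                                 fromfile="spec.rst@r1", tofile="spec.rst@r2"))
print("".join(diff), end="")
```

A reviewer sees one removed line and one added line, which is much harder to get out of a word processor file.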
Personally my tool chain of choice is:
reStructuredText as the markup language.
Trac as a Wiki
Firefox + the It's All Text! extension
Emacs + rst-mode
The choice of technology is one issue and Google docs is a good choice IMHO. But the real challenge is how to manage the process e.g. divide the tasks.
My suggestion is to first make sure that the platform and all related technologies are decided upon as best as feasible. Then, compose a thorough table of contents. A well-designed TOC will allow you to divide tasks properly and not "step" on each other's work. From then on you each "flesh out" your assigned sections as well as review each other's work.
In effect, each TOC subsection becomes an atomic unit of work that can be assigned and maintained by an individual who is also accountable for said section(s).
Good luck!
I think it depends on
How heavily into writing the specs you all are
If you're likely writing at the same time
Whether you intend to publish the specs.
Google Docs is nice and easy to get started with. It's also great that you can now export folders all at once. Still, for something that's going to be published to the web, a wiki or general cms is a better presentation vehicle. A wiki will also integrate with your existing site.
If you've got small specs, primarily written by one person then use whatever tool is available where you're hosting the project code or website. If you're not likely to be editing at the same time then a wiki is good.
I've done the wiki thing, the passed document thing and the Google Docs thing.
The wiki thing has a low starting effort and lasts a pretty long time. At a certain size it does get to be a pain.
The passed document thing (write, email, edit, email, etc.) only works while one person is starting everything up. As soon as there are even minor edits, it sucks.
The Google Docs thing is fine until you have several docs and several editors or want to publish it online.
hth
This isn't programming related, but I've personally used Google Docs to write shared documents and found it easy to use.
I would suggest enabling Google Gears however, in the event that the Google servers go down momentarily or an internet connection isn't available.
For writing specs collaboratively, you could try Gingko.
It's a card-tree editor, which means it's a mix between index cards and an outliner, with real-time collaboration and full Markdown support (as well as basic LaTeX).
We are still missing several features (version history, comments, etc), but for some the benefits of having everything in a tree structure outweigh these drawbacks.
Writing specs with it is great, because you can create a card for each user story, and drill into it as much as you like (and organize them into categories if you'd like).
http://gingkoapp.com

Which Workflow Engine do you recommend? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I am kicking around the idea of using a workflow engine on this upcoming project. We know that there are a lot of caveats with using a workflow engine, and we have a lot of development experience on many platforms, so we would be willing to let the choice of workflow engine take precedence over our favorite toolset or developer IDE.
We are more interested in internal workflow (i.e. Petri nets for easily changeable ERP purposes without involving additional coder time) than external workflow (i.e. aggregating SOAP calls into a transaction-aware, higher-level SOA). Which workflow engine would you recommend? We have superficially looked at offerings by Oracle, Microsoft, and some open source stuff too. It's all very overwhelming, so please respond only if you have real-life experience with implementing internal workflow.
If you can use a state machine, then I'd recommend an open source project called Stateless by Nicholas Blumhardt (the creator of Autofac). His approach avoids the issue of long-running workflows being held by a runtime engine, as the state is defined by a simple variable such as a string or int.
Here is a sample state machine:
var phoneCall = new StateMachine<State, Trigger>(State.OffHook);

phoneCall.Configure(State.OffHook)
    .Permit(Trigger.CallDialed, State.Ringing);

phoneCall.Configure(State.Ringing)
    .Permit(Trigger.HungUp, State.OffHook)
    .Permit(Trigger.CallConnected, State.Connected);

phoneCall.Configure(State.Connected)
    .OnEntry(() => StartCallTimer())
    .OnExit(() => StopCallTimer())
    .Permit(Trigger.LeftMessage, State.OffHook)
    .Permit(Trigger.HungUp, State.OffHook)
    .Permit(Trigger.PlacedOnHold, State.OnHold);

// ...

// Firing a permitted trigger moves the machine to the configured state.
phoneCall.Fire(Trigger.CallDialed);
Assert.AreEqual(State.Ringing, phoneCall.State);
Your state can be an integer which will allow you to feed it the current state from a database. This can be set on the constructor of the state machine as follows:
var stateMachine = new StateMachine<State, Trigger>(
    () => myState.Value,
    s => myState.Value = s);
You can implement this in just one assembly, compared to the multiple projects you need to run Windows Workflow. Maintenance is extremely low, there is no "designer" that generates code for you, etc. Again, it is simple, and therein lies the beauty.
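To show why externally stored state removes the need for a long-running engine, here is a minimal sketch of the same idea in Python. It is not the Stateless API: the class, the `permit`/`fire` method names, and the `record` dictionary (standing in for a database row) are assumptions made for illustration.

```python
class StateMachine:
    """Tiny state machine whose current state lives outside the object."""

    def __init__(self, get_state, set_state):
        self._get_state = get_state   # reads the state, e.g. from a DB row
        self._set_state = set_state   # writes the new state back
        self._transitions = {}        # (state, trigger) -> next state

    def permit(self, state, trigger, next_state):
        self._transitions[(state, trigger)] = next_state
        return self  # allow chained configuration

    def fire(self, trigger):
        current = self._get_state()
        if (current, trigger) not in self._transitions:
            raise ValueError(f"{trigger!r} is not allowed in state {current!r}")
        self._set_state(self._transitions[(current, trigger)])


record = {"state": "OffHook"}  # stand-in for a persisted row

call = StateMachine(lambda: record["state"],
                    lambda s: record.__setitem__("state", s))
(call.permit("OffHook", "CallDialed", "Ringing")
     .permit("Ringing", "CallConnected", "Connected")
     .permit("Ringing", "HungUp", "OffHook"))

call.fire("CallDialed")  # record["state"] is now "Ringing"
```

Because the machine itself holds no state, it can be rebuilt from configuration on every request and pointed at whatever record is being processed; nothing has to stay alive between steps of the workflow.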
I've deployed both K2 and WF systems in the past. K2 is quite strong, but spendy. WF is an underdog, but improving quickly. Both integrate with the .NET stack (MOSS specifically) quite well and both have very good tool integration. Both are relatively easy to develop for once you understand the workflow model.
You can get solutions support from many different MS partners for both, although my guess is WF is a bit easier to get solutions support for (i.e. more partners have more consultants who know WF than K2).
Unfortunately, I don't have any experience with the Oracle product or the open source alternatives you mentioned, so I can't comment on those.
If you are overwhelmed, I recommend you take a look at the WF Virtual Labs (bottom of the page). They will let you get your hands on the technology, get the lingo down, and go through a few scenarios. Once you have that, understanding how WF can fit into what you are trying to do should be substantially easier. Also, I can recommend Essential Windows Workflow, a very good book. Here's a good intro on WF 4.0 from PDC.