Does choice of sound card effect timidity's performance? - midi

I'm working on web application which will be using timidity++ to converting midi file into wav file. I know that it will not work without sound card, so I must have server with it. My question is: Is the used sound card will impact to the speed of generating wav files? And if the answer to my question is yes, what should I pay attention when I buy sound card. What sound cards what can you recomend?

The sound card should definitely not impact the speed of synthesis, especially if you are doing it offline. In fact, software such as MrsWatson can use a VST instrument plugin to synthesize MIDI events to an audio file and do not require any sound hardware at all.
Disclaimer: I am the author of MrsWatson

Related

How to encode live broadcast of the Local FM radio stations

We are in the midst of research stage for our coming web project. We would like to make a website that streams (all) the Local FM Radio Stations.
In research for the right tools to set up the said website, several questions arises.
What software do we need to encode the live broadcast of (all) the Local FM Radio Stations? How can we connect to FM Radio Stations?
Do we need Virtual Private Server to run the software from question number One, 24/7? Can VPS do that, run a software 24/7?
If we manage to encode the live broadcast of (all) the local FM Radio stations, how do we send this thing to our website? Can we use a simple audio player such as quicktime/flash or html5 audio player and embed it to our website?
I hope someone will help us on this matter. You help is greatly appreciated. :)
Audio Capture
The first thing you need to do is set up an encoder source for your streams. I highly recommend putting the encoder at each radio station. The quality of FM radio isn't the greatest. You will get much better audio quality at the station. In addition, at least here in the US, many radio stations have all of their studios in the same place. It isn't uncommon to find 8 stations all coming from the same set of offices. Therefore, you may only have to install equipment in 3 or 4 buildings to cover all the stations in your market.
Most stations these days are using digital mixing. Buy a sound card that has a compatible digital input. AES/EBU and S/PDIF are common, and sound cards that support these are affordable.
If you must capture audio over the air, make sure you are using high quality receivers (digital where available), with a high quality outdoor antenna. There are a variety of receivers you can purchase, many of which mount directly in a rack.
Encoding
Now for the actual encoding, you need software. I've always had good luck with EdCast (if you can find the version prior to "EdCast Reborn"). SAM is a good choice for stations that have their own music library they need to manage, but I don't suggest it in your case. You can even use VLC for this part.
You will need to pick a good codec. If you want compatibility with HTML5, you will want to encode in MP3 and Ogg Vorbis. aacPlus is a good choice for saving bandwidth while still providing a decent audio quality. Most stations these days use aacPlus when possible, but not all browsers can play it, which is why you also need the other two. You can (and should) use multiple codecs per station.
Server Software
I highly recommend Icecast or SHOUTcast. They take your encoded audio and distribute it to listeners. They serve up an HTTP-like stream, which is generally compatible. If you are interested, I also do Icecast/SHOUTcast compatible hosting, with the goal of being compatible with more devices, particularly mobile.
Playback
Many stations these days use a player that tries HTML5, and falls back to Flash if necessary. jPlayer is a common choice, but there are many others. It is also good to provide a link to a playlist file containing the URL of your stream, so that users can listen in their own audio player if they choose.

Most efficient way to format a large amount of audio files

I am currently developing an app that will contain large amounts of audio, around 60-120 minutes. Most voice audio files. My question is really what is the best way to go about storing them. For example, one single large file, separate audio files, download-as-needed cache files.
Any suggestions on file format?
These are the audio formats decoded by iPhone hardware that should take the least power to play.
Other iPhone OS audio formats employ a hardware codec for playback.
These formats are:
AAC
ALAC (Apple Lossless)
MP3
Whether to have the audio distributed with the app or separately would depend on the use. If you could reasonably expect the user to go through the material sequentially, you may want to allow the user to download part by part or stream the audio to let them conserve space on their device, while if the audio is more random access, you'd probably want it all on the device.
Several apps, including Apple's own, appear to use the open source speex codec for compressed voice-quality audio, even though this seems not to be supported by the hardware or any public API.
As Joachim suggested you can choose from AAC/ALAC/MP3 audio formats. What I'd propose now is to also consider the issue from user experience point of view:
Convert all your audio to chosen format with quality options that
satisfy you and your potential users.
Next, calculate the size of all your files and ask yourself a questions:
"are X megabytes too much to bundle for my kind of app?" and
"will that big/small app bundle encourage users to download my app?".
Optionally play a bit with quality options to shrink files (iterate).
In the next step, decide (based on you app characteristics) whether to bundle all files. For example a game is expected to have all files in place and can be big (users accept that). If your app has e.g. podcasts only, then select the best one and bundle it - once user is hooked he can download the rest (let user trigger that), so files are stored on device. Also provide user the info how much data they are about to download and warn them if file is reasonably big and they're not on Wi-Fi; or introduce the option to download only on Wi-Fi.
I hope that sounds reasonable.
For music, the following approach would be much different.
Since it's just voice, you can reduce the sample rate significantly in the majority of cases. Try [8kHz…20kHz].
In case they are multichannel - Mono should be fine for voice.
Once that's been done, I'd recommend AAC for size and quality balance.
Do some listening tests on your devices. Tweak settings if needed. Then batch process/convert them all. That can reduce your sizes by ten or more if the sources are 16/44.1.
Unless they files are very small (e.g. seconds each) or you have to open and read many of them quickly, I wouldn't bother with the huge file. A few MB is a good size for many cases.

How to listen to mic input and analyse in real time?

Hi unfortunately I've not been able to figure out audio on the iPhone. The best I've come close to are the AVAudioRecorder/Player classes and I know that they are no good fo audio processing.
So i'm wondering if someone would be able to explain to me how to "listen" to the iPhone's mic input in chunks of say 1024 samples, analyse the samples and do stuff. And just keep going like that until my app terminates or tells it to stop. I'm not looking to save any data, all I want is to analyse the data in real time and do stuff in real time with it.
I've attempted to try and understand apples "aurioTouch" example but it's just way too complicated for me to understand.
So can someone explain to me how I should go about this?
If you want to analyze audio input in real-time, it doesn't get a lot simpler than Apple's aurioTouch iOS sample app with source code (there is also a mirror site). You can google a bit more info on using the Audio Unit RemoteIO API for recording, but you'll still have to figure out the real-time analysis DSP portion.
The Audio Queue API is a slight bit simpler for getting input buffers of raw PCM audio data from the mic, but not much simpler, and it has a higher latency.
Added later: There's also a version of aurioTouch converted to Swift here: https://github.com/ooper-shlab/aurioTouch2.0-Swift
AVAudioPlayer/Recorder class won't take you there if you wanna do any real time audio processing. The Audio Toolbox and Audio Unit frameworks are the way to go. Check here for apple's audio programming guide to see which framework suits your need. And believe me, these low level stuff is not easy and is poorly documented. CocoaDev has some tutorials where you can find sample codes. Also, there is an audio DSP library DIRAC I recently discovered for tempo and pitch manipulation. I haven't looked into it much but you might find it useful.
If all you want is samples with a minimum amount of processing by the OS, you probably want the Audio Queue API; see Audio Queue Services Programming Guide.
AVAudioRecorder is designed for recording to a file, and AudioUnit is more for "pluggable" audio processing (and on the Mac side of things, AU Lab is actually pretty cool).

Real-time Pitch Shifting on the iPhone

I have a children's iPhone application that I am writing and I need to be able to shift the pitch of a sound sample using Core Audio. Does anyone have any example code I could look at where this is done. There are many music and game apps in the app store that do this so I know I am not the first one. However, I cannot find any examples of it being done.
you can use dirac-2 from dsp dimension for pitch shifting on the iphone. quote: -
"DIRAC2 is available as both a commercial object library offering unlimited sample rates and phase locked multichannel support and as a free single channel, 44.1/48kHz LE version."
use the soundtouch open source project to change pitch
Here is the link : http://www.surina.net/soundtouch/
Once you add soundtouch to your project, you have to give the input sound file path, output sound file path and pitch change as the input.
Since it takes more time to process your sound its better to modify soundtouch so that when you record the voice, directly give the data for processing. It will make your application better.
I know it's too late for the person who asked but it is really a valuable link (As I found) for any one else who is looking for the solution of the same problem.
So Here we have latest DIRAC3 with it's own audio player classes which will take care of run time pitch and speed(explore for god knows what more) shifting. Run the sample and have huge round of applause for that.
Try Dirac - it's the best technology out there and it's available on Win, Linux, MacOS X and iOS. We're using it in all our products (and a couple of others do as well, search for "Capo" on the App Store). They're at version 3 now which has seen a huge increase in performance since previous versions. Hope this helps.
See: Related question
How much control over pitch do you need... could you precalculate all the different sounds?
If the answer is yes, then you can just pick the right sounds and play them.
You could also use Audio Converter Services in conjunction with AVAudioPlayer, which will allow you to resample the audio (which will effectively repitch them, though they'll change duration).
Alternatively, as the related question points out, you could use OpenAL and AL_PITCH

Streaming Audio Clips from iPhone to server

I'm wondering if there are any examples atomic examples out there for streaming audio FROM the iPhone to a server. I'm not interested in telephony or SIP style solutions, just a simple socket stream to send an audio clip, in .wav format, as it is being recorded. I haven't had much luck with the google or other obvious avenues, although there seem to be many examples of doing this the other way around.
i cant figure out how to register the unregistered account i initially posted with.
anyway, I'm not really interested in the audio format at present, just the streaming aspect. i want to take the microphone input, and stream it from the iphone to a server. i dont presently care about the transfer rate as ill initially just test from a wifi connection, not the 3g setup. the reason i cant cache it is because im interested in trying out some open source speech recognition stuffs for my undergraduate thesis. caching and then sending the recording is possible but then it takes considerably longer to get the voice data to the server. if i can start sending the data as soon as i start recording, then the response time is considerably improved because most of the data will have already reached the server by the time i let go of the record button. furthermore, if i can get this streaming functionality to work from the iphone then on the server side of things i can also start the speech recognizer as soon as the first bit of audio comes through. again this should considerably speech up the final amount of time that the transaction takes from the user perspective.
colin barrett mentions the phones and phone networks, but these are actually a pretty suboptimal solution for asr, mainly because they provide no good way to recover from errors - doing so over a voip dialogue is a horrible experience. however, the iphone and in particular the touch screen provide a great way to do that, through use of an ime or nbest lists for the other recognition candidates.
if i can figure out the basic architecture for streaming the audio, then i can start thinking about doing flac encoding or something to reduce the required transfer rate. maybe even feature extraction, although that limits the later ability to retrain the system with the recordings.