I am using web audio api buffers to play music and sound effect .WAV files in my video game that runs on html5 and javascript. It is being programmed and tested on Safari for Mojave on a Macbook Pro. I use this to get the instant sound playback, in preference to the delayed sound that results from using the regular audio node.
One issue, when playing back buffered sounds, is that when I have a GainNode between the Source and the Destination, I expect the volume to be full when the gain value is 1 and silent when it is zero. But this isn't the case. For some reason the volume is quieter at 0, but still quite audible. But at -1, it is silent. Shouldn't 0 be silent, whether in linear or in logarithmic scale? Are there circumstances where the gain value range would be something other then 0 to 1? I my particular setting, the range seems to be -1 to +1.
Related
I have been trying, for the past weeks, to find a simple way to control audio frequency (I.E. change the pitch of an audio file) from within swift in realtime.
I have tried with AVAudioPlayer.rate but that only changes the speed.
I even tried connecting an AVAudioUnitTimePitch to an audio engine with no success.
AVAudioUnitTimePitch just gives me an error, and rate changes the playback speed which is not what I need.
What I would like to do is, make a sound higher or lower pitch, say from -1.0 to 2.0 (audio source.duration/2 so it would play twice as fast.
Do you guys know of any way to do this even if I have to use external libraries or classes?
Thank you, I am stumped as to how to proceed.
If you use the Audio Unit API, the iOS built-in kAudioUnitSubType_NewTimePitch Audio Unit component can be used to change pitch while playing audio at the same speed. e.g. the kNewTimePitchParam_Rate and kNewTimePitchParam_Pitch parameters are independent.
The Audio Unit framework has a (currently non-deprecated) C API, but one can call C API functions from Swift code.
I am attempting to stream live audio from an iOS device to a web browser. The iOS device sends small, mono wav files (as they are recorded) through a web socket. Once the client receives the wav files, I have the Web Audio API decode and schedule them accordingly.
This gets me about 99% of the way there, except I can hear clicks between each audio chunk. After some reading around, I have realized the likely source of my problem: the audio is being recorded at a sample rate of only 4k and this cannot be changed. It appears that the Web Audio API's decodeAudioData() function does not handle sample rates other than 44.1k with exact precision resulting in gaps between chunks.
I have tried literally everything I could find about this problem (scriptProcessorNodes, adjusting the timing, creating new buffers, even manually upsampling) and none of them have worked. At this point I am about to abandon the Web Audio API.
Is the Web Audio API appropriate for this?
Is there a better alternative for what I am trying to accomplish?
Any help/suggestions are appreciated, thanks!
Alas! AudioFeeder.js works wonders. I just specify the sampling rate of 4k, feed it raw 32 bit pcm data and it outputs a consistent stream of seamless audio! Even has built in buffer handling events, so no need to set any loops or timeouts to schedule chunk playback. I did have to tweak it a bit, though, to connect it to the rest of my web audio nodes and not just context.destination.
Note: AudioFeeder does automatically upsample to the audio context sampling rate. Going from 4k to 44.1k did introduce some pretty gnarly sounding artifacts in the highend, but a 48db lowpass filter (4 x 12db's) at 2khz got rid of them. I chose 2khz because, thanks to Harry Nyquist, I know that a sampling rate of 4k couldn't have possibly produced frequencies above 2khz in the original file.
All hail Brion Vibbers
I am using several instances of AVAudioPlayer to play overlapping sounds, and getting harsh distortion as a result. Here is my situation... I have an app with several piano keys. Upon touching a key, it plays a note. If I touch 6-7 keys in rapid succession, my app plays a 2 second .mp3 clip for each key. Since I am using separate audio streams, they sounds overlap (which they should), but the result is lots of distortion, pops, or buzzing sounds!
How can I make the overlapping audio crisp and clean? I recorded the piano sounds myself and they are very nice, clean, noise-free recordings, and I don't understand why the overlapping streams sound so bad. Even at low volume or through headphones, the quality is just very degraded.
Any suggestions are appreciated!
Couple of things:
Clipping
The "buzzing" you describe is almost assuredly clipping—the result of adding two or more waveforms together and the resulting, combined waveform having its peaks cut off—clipped—at unity.
When you're designing virtual synthesizers with polyphony, you have to take into consideration how many voices will likely play at once and provide headroom, typically by attenuating each voice.
In practice, you can achieve this with AVAudioPlayer by setting each instances volume property to 0.316 for 10 dB of headroom. (Enough for 8 simultaneous voices)
The obvious problem here that when the user plays a single voice, it may seem too quiet—you'll want to experiment with various headroom values and typical user behavior and adjust to taste (it's also signal-dependent. Your piano samples may clip more/less easily than other waveforms depending on their recorded amplitude.)
Depending on your app's intended user, you might consider making this headroom parameter available to them.
Discontinuities/Performance
The pops and clicks you're hearing may not be a result of clipping, but rather a side effect of the fact you're using mp3 as your audio file format. This is a Bad Idea™. iOS devices only have one hardware stereo mp3 decoder, so as soon as you spin up a second, third, etc. voice, iOS has to decode the mp3 audio data on the cpu. Depending on the device, you can only decode a couple audio streams this way before suffering from underflow discontinuities (cut that in half for stereo files, obviously)... the CPU simply can't decode enough samples for the output audio stream in time, so you hear nasty pops and clicks.
For sample playback, you want to use an LPCM audio encoding (like wav or aiff) or something extremely efficient to decode, like ima4. One strategy that I've used in every app I've shipped that has these types of audio samples is to ship samples in mp3 or aac format, but decode them once to an LPCM file in the app's sandbox the first time the app is launched. This way you get the benefit of a smaller app bundle and low CPU utilization/higher polyphony at runtime when decoding the samples. (With a small hit to the first-time user experience while the user waits for the samples to be decoded.)
My understanding is that AVAudioPlayer isn't meant to be used like that. In general, when combining lots of sounds into a single output like that, you want to open a single stream and mix the sounds yourself.
What you are encountering is clipping — it's occurring because the combined volumes of the sounds you're playing are exceeding the maximum possible volume. You need to decrease the volume of these sounds when there's more than one playing at a time.
I'm currently developing an app which plays an audio file (mp3, but can change to WAV to reduce decoding time), and records audio at the same time.
For synchronization purposes, I want to estimate the exact time when audio started playing.
Using AudioQueue to control each buffer, I can estimate the time when the first buffer was drained. My questions are:
What is the hardware delay between AudioQueue buffers being drained and them actually being played?
Is there a lower level API (specifically, AudioUnit), that has better performance (in hardware latency measures)?
Is it possible to place an upper limit on hardware latency using AudioQueue, w or w/o decoding the buffer? 5ms seems something that I can work with, more that that will require a different approach.
Thanks!
The Audio Queue API runs on top of Audio Units, so the RemoteIO Audio Unit using raw uncompressed audio will allow a lower and more deterministic latency. The minimum RemoteIO buffer duration that can be set on some iOS devices (using the Audio Session API) is about 6 to 24 milliseconds, depending on application state. That may set a lower limit on both play and record latency, depending on what events you are using for your latency measurement points.
Decoding compressed audio can add around an order of magnitude or two more latency from decode start.
What I need to do is be able to mix 4 channels of audio (not from a live source, just prerecorded audio files in the app bundle), and change their volumes individually, in real time, preferably with MP3s. What's the best/correct road for me to take, regarding all the various sound APIs for the iPhone?
Thanks!
Storm Sim does this with AVAudioPlayer, which is certainly the simplest methdod. You can call prepareToPlay on each of the player objects then kick them off with play later so there won't be any delay. I also use a blank 1-second audio player on eternal loop to keep the deviceTime counting down, so you can use playAfter to give a specific deviceTime in the future to make all the samples play in-sync or offset relative to each other (deviceTime only ticks if there is some sort of audio playing). The AVAudioPlayerDelegate has interrupted/resumed events and finishedPlaying so you can get notification of what is happening.
However there is only one hardware MP3/AAC decoder, so the other three will use up CPU (and thus battery) doing the decoding. If you want to maximize battery life, use CAF files in IMA4#44100. It is about 1/4 the size of the raw WAV files so it isn't as good as MP3 but the performance is much better, especially if you are using a lot of small audio tracks. If you are using voice you can get away with much less fidelity and smash the files even more. afconvert in terminal can help you getting your source files in the CAF format (you should use CAF files no matter what the encoding).