Can Dirac LE change pitch, or just do time stretching? - iPhone

I am using Dirac LE. I have tried to change the pitch of a sound but got the following error in the console.
Can Dirac LE change the pitch of a sound? I know I can only use one channel.

Yes, you can change the pitch.
Please be aware that you need to create and destroy the Dirac object whenever you change something, so:
create the Dirac instance with DiracCreate (or the interleaved version)
set the pitch/tempo and other settings
do the processing with DiracProcess
destroy the Dirac instance with DiracDestroy
If you want to change a setting, you need to destroy and re-create the Dirac instance in the LE version (calling DiracReset is not enough).
Oh, and you can process multiple channels by using multiple Dirac instances; however, these are not linked/synced, which can cause small variations between the left and right channels, resulting in an unstable stereo image.
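For reference, a rough sketch of that create/set/process/destroy sequence in C. The constants and signatures follow the Dirac3 LE header as I remember it (DiracSetProperty and the kDirac... selectors in particular), so check Dirac.h in your copy of the SDK; fillFromMySource is a hypothetical helper that supplies decoded mono input samples.

#include "Dirac.h"   /* Dirac3 LE header from the DSP Dimension SDK */

long fillFromMySource(float *dest, long maxFrames, void *userData);  /* hypothetical */

/* Callback Dirac uses to pull input samples; LE is limited to 1 channel. */
static long myReadData(float **chdata, long numFrames, void *userData)
{
    /* fill chdata[0] with up to numFrames mono samples, return how many */
    return fillFromMySource(chdata[0], numFrames, userData);
}

void pitchShiftOnce(float *out, long numOutFrames, void *userData)
{
    /* 1. create the instance (LE: preview lambda/quality, mono only) */
    void *dirac = DiracCreate(kDiracLambdaPreview, kDiracQualityPreview,
                              1 /* channel */, 44100.0f, &myReadData, userData);

    /* 2. set pitch/time before processing; to change them later in the LE
       version you must destroy and re-create the instance                  */
    DiracSetProperty(kDiracPropertyPitchFactor, 1.5, dirac);   /* pitch up      */
    DiracSetProperty(kDiracPropertyTimeFactor,  1.0, dirac);   /* keep duration */

    /* 3. pull processed audio */
    float *chOut[1] = { out };
    DiracProcess(chOut, numOutFrames, dirac);

    /* 4. tear down */
    DiracDestroy(dirac);
}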

Related

How to stretch a waveform using AudioKit

I'm tracking the amplitude while outputting it as a waveform in real time. I'm using AudioKitUI's RollingViewData class to do this, with the results as seen below.
My issue is that the waveform is very small and insensitive to input; it looks like the node needs to go through a booster before being plotted? How would I achieve this, and are there any methods that do so in AudioKit v5? Thanks

How to measure the distance and centroid of a moving object with Matlab stereo computer vision?

Which Matlab functions or examples should be used to (1) track the distance from a moving object to stereo (binocular) cameras, and (2) track the centroid (X,Y,Z) of moving objects, ideally in the range of 0.6 m to 6 m from the cameras?
I've used the Matlab example that uses the PeopleDetector function, but it becomes inaccurate when a person is within 2 m because it begins clipping heads and legs.
The first thing you need to deal with is how to detect the object of interest (I suppose you have already resolved this). There are a lot of approaches to detecting moving objects. If your cameras are in a fixed position, you can work with only one camera and use background subtraction to get the objects that appear in the scene (some info here). If your cameras are moving, I think the best approach is to work with the optical flow of the two cameras (instead of using a previous frame to get the flow map, the stereo pair images are used to get the optical flow map in each frame).
In Matlab there is an option called disparity computation; this can help you detect the objects in the scene. After this you need to add a stage to extract the objects of your interest, for which you can use some thresholds. Once you have the desired objects, put them in a binary mask. From this mask you can use an image-moment extractor (check this and this) to calculate the centroids. If the images in the binary mask look noisy, you can use some morphological operations to improve the results (watch this).

Encoding an image in to the fourier domain of a sound

I'm trying to convert an image to a sound such that you can see the image if you view the spectrogram of that sound. Kind of like what Aphex Twin did in Windowlicker.
So far I have written an iPhone app that takes a photograph and then converts it to grayscale. I then use this grayscale image as a magnitude which I'd like to plug back through an inverse FFT.
The problem I have, though, is how to go from the magnitude back to the imaginary and real parts.
mag = sqrtf( (imag * imag) + (real * real));
Obviously I can't solve for two unknowns, and furthermore I can't tell whether those real and imaginary parts should be negative or not.
So I'm at a bit of a loss. It must be possible. Can anyone point me in the direction of some useful information?
A spectrogram contains no phase information, so you can just set the imaginary parts to 0 and set the real parts equal to the magnitude. Remember that you need to maintain complex conjugate symmetry if you want to end up with a purely real time domain signal after you have applied the inverse FFT.
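A minimal sketch of that idea in C, assuming one image column of grayscale pixels (0..255) is used as the magnitude spectrum of one audio frame: real part = pixel value, imaginary part = 0, with each bin mirrored for conjugate symmetry. The inverse DFT is written out naively so the example is self-contained; in an iPhone app you would use the real inverse FFT in Accelerate/vDSP instead.

#include <math.h>

#define TWO_PI 6.283185307179586f

/* Convert one image column into one frame of audio samples.
   pixels: numBins grayscale values treated as FFT magnitudes (zero phase)
   out:    fftSize real samples, where fftSize >= 2*numBins                */
void columnToFrame(const unsigned char *pixels, int numBins,
                   float *out, int fftSize)
{
    for (int t = 0; t < fftSize; t++) {
        float sum = (float)pixels[0];               /* DC bin, no mirror image */
        for (int k = 1; k < numBins; k++)
            sum += 2.0f * (float)pixels[k] *        /* bin k plus its mirror   */
                   cosf(TWO_PI * k * t / (float)fftSize);
        out[t] = sum / (float)fftSize;              /* purely real result      */
    }
}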
The math wonks are right about regenerating from greyscale, but why limit yourself thus? Have you considered keeping a portion of the phase information in the color channels?
Specifically, why not process the LEFT channel into BLUE, the RIGHT channel into RED, and for the GREEN color element, run the transform again on (LEFT-RIGHT), so that you have three spectra.
In one version of "Surround Sound", L-R encodes the rear channel - there is good stuff there.
When regenerating your sound, assign the "real" values to the corresponding channels.
Try the following (formulas - but this editor insists on calling them code..)
LEFT.real=+BLUE
RIGHT.real=+RED
LEFT.imag=+GREEN
RIGHT.imag=-GREEN
Experiment with variations on this while listening through some sort of surround-sound setup, to see which provides the most pleasing results. Make sure not to drive the signal into clipping: since the phases change, regenerating a saturated complex signal is likely to produce clipping.
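A small sketch of the bin-by-bin mapping described above, assuming one column of RGB pixel values per frame (names are placeholders; the two spectra still need conjugate-symmetric mirroring and an inverse FFT afterwards):

#include <stdint.h>

typedef struct { float re, im; } Complex;

/* Map one column of an RGB image onto left/right spectra for one frame. */
void columnToStereoSpectra(const uint8_t *red, const uint8_t *green,
                           const uint8_t *blue, int numBins,
                           Complex *left, Complex *right)
{
    for (int k = 0; k < numBins; k++) {
        left[k].re  =  (float)blue[k];    /* LEFT.real  = +BLUE  */
        right[k].re =  (float)red[k];     /* RIGHT.real = +RED   */
        left[k].im  =  (float)green[k];   /* LEFT.imag  = +GREEN */
        right[k].im = -(float)green[k];   /* RIGHT.imag = -GREEN */
    }
}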

How to generate and play white noise on the fly with OpenAL?

I'm using OpenAL in my app to play sounds based on *.caf audio files.
There's a tutorial which describes how to generate white noise in OpenAL:
amplitude - rand(2*amplitude)
But they're creating a buffer with 1000 samples and then just looping that buffer with
alSourcei(source, AL_LOOPING, AL_TRUE);
The problem with this approach: Looping white noise just doesn't work like this because of DC offset. There will be a noticeable wobble in the sound. I know because I tried looping dozens of white noise regions generated in different applications and all of them had the same problem. Even after trying to crossfade and making sure the regions are cut to zero crossings.
Since (from my understanding) OpenAL is more low-level than Audio Units or Audio Queues, there must be a way to generate white noise on the fly in a continuous manner such that no looping is required.
Maybe someone can point out some helpful resources on that topic.
The solution with the least change might just be to create a much longer OpenAL noise buffer (several seconds), so that the wobble repeats at too low a rate to hear easily. Any waveform hidden in a 44 Hz repeat (1000 samples at a 44.1 kHz sample rate) is within normal human hearing range.
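As a sketch of that approach, assuming an OpenAL context is already current: fill a buffer several seconds long with 16-bit white noise and loop it, so the repeat rate drops far below the audible range (error checking omitted).

#include <OpenAL/al.h>   /* iOS; use <AL/al.h> on other platforms */
#include <stdlib.h>

ALuint makeLoopingNoiseSource(void)
{
    const int sampleRate = 44100;
    const int seconds    = 5;                       /* ~0.2 Hz repeat rate   */
    const int numSamples = sampleRate * seconds;
    short *samples = (short *)malloc(numSamples * sizeof(short));

    for (int i = 0; i < numSamples; i++)            /* uniform white noise   */
        samples[i] = (short)(((float)rand() / RAND_MAX) * 65535.0f - 32768.0f);

    ALuint buffer, source;
    alGenBuffers(1, &buffer);
    alBufferData(buffer, AL_FORMAT_MONO16, samples,
                 numSamples * (int)sizeof(short), sampleRate);
    free(samples);                                  /* OpenAL copies the data */

    alGenSources(1, &source);
    alSourcei(source, AL_BUFFER, (ALint)buffer);
    alSourcei(source, AL_LOOPING, AL_TRUE);
    alSourcePlay(source);
    return source;
}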

iPhone - recognizing a waveform/frequency

I capture a sound using my app. Suppose this sound is a sinusoidal 1 kHz tone and there is a background sound present. How do I identify that this 1 kHz tone is present in the sound?
I mean, I can imagine how elements can be found in images: for example, if you are looking for a yellow square in an image, all you have to do is specify the color you want, give it a certain tolerance, and look for a group of pixels of that color that form a square shape. But what about sounds? How do you identify a waveform and its frequency when all you get is a stream of amplitude values, each representing 1/44,000th of a second of the waveform?
I don't want the code, as that is too complex for this post, but I'd appreciate it if you could point me in the right direction: how this is done, free source code that exemplifies the techniques, or the math behind it. Thanks
You just need to do an FFT (Fast Fourier Transform) of the wave to get the frequencies it is composed of. This is a classic task in signal processing (Fourier transforms switch between the time and frequency domains), so you should find a lot of resources on the subject.
Maybe look at the CoreAudio framework and sample codes too.
(See also CoreAudio overview and the Audio and Video topics in Apple's doc)
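As a small illustration of the idea (not a full FFT): the sketch below computes the magnitude of the single DFT bin nearest the target frequency over one block of mono float samples and compares it to the block's total energy. The function name and threshold are placeholders; a real detector would window the block and average over several blocks.

#include <math.h>

#define TWO_PI 6.283185307179586f

/* Return 1 if the bin nearest targetHz carries more than `threshold`
   of the block's total energy (a pure tone gives roughly 0.5). */
int containsTone(const float *samples, int n, float sampleRate,
                 float targetHz, float threshold)
{
    int k = (int)(targetHz * n / sampleRate + 0.5f);   /* nearest bin index  */
    float re = 0.0f, im = 0.0f, energy = 0.0f;

    for (int t = 0; t < n; t++) {
        float ph = TWO_PI * k * t / (float)n;
        re     += samples[t] * cosf(ph);
        im     -= samples[t] * sinf(ph);
        energy += samples[t] * samples[t];
    }
    float binPower = (re * re + im * im) / (float)n;   /* Parseval scaling   */
    return energy > 0.0f && (binPower / energy) > threshold;
}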