How do I use vDSP functions for Short Time Fourier Transform? - iphone

I am trying to understand how to use the vDSP functions for an STFT. I use the FFT code from Apple's examples and can get the FFT of the first 1024 frames, but how do I get the FFT of the next frames (1024-2047, and so on) until the end of the file? (In this case, imagine the size of the file is int f = 10000.)
//vDSP variables
DOUBLE_COMPLEX_SPLIT A;
FFTSetupD setupReal;
uint32_t log2n;
uint32_t n, nOver2;
int32_t stride;
double *obtainedReal;
double scale;
log2n = N;
n = 1 << log2n;
stride = 1;
nOver2 = n/2;
int f = 10000;
buffer = malloc(f *sizeof(double));
obtainedReal = malloc(f *sizeof(double));
A.realp = malloc(f *sizeof(double));
A.imagp = malloc(f *sizeof(double));
vDSP_ctozD((DOUBLE_COMPLEX*) buffer, 2, &A, 1, nOver2);
setupReal = vDSP_create_fftsetupD(log2n, FFT_RADIX2);
if (setupReal == NULL) {
NSLog(@"fft_setup failed to allocate enough memory for real FFT\n");
return 0 ;
}
vDSP_fft_zripD(setupReal, &A, stride, log2n, FFT_FORWARD);
scale = (double) 1.0 / (2 * n);
vDSP_vsmulD(A.realp, 1, &scale, A.realp, 1, nOver2);
vDSP_vsmulD(A.imagp, 1, &scale, A.imagp, 1, nOver2);
vDSP_ztocD(&A, 1, (DOUBLE_COMPLEX *) obtainedReal, 2, nOver2);

If you simply want the FFT of the next 1024 elements, add nOver2 to A.realp and to A.imagp, then perform another vDSP_fft_zripD and another vDSP_ztocD. You will probably want to advance obtainedReal too, or the new results will overwrite the old results.
Note that changing A.realp and A.imagp loses the starting addresses, so you will not be able to free this memory unless you recalculate the starting addresses or save them elsewhere before changing A.realp and A.imagp.
Also, 10,000 is not an integer multiple of 1024, so your last portion will not have 1024 elements; you will need an alternative, such as getting more data or padding the data with zeroes.
You are allocating too much memory for A.realp and A.imagp. Each of them receives half of the elements in buffer, so each of them only needs half as much memory.
Even that much memory is not needed. You can use vDSP_ctozD to move just 1024 elements into A.realp and A.imagp (512 each), then perform an FFT, then move the data to obtainedReal using vDSP_ztocD, then move on to the next group by using vDSP_ctozD to move 1024 new elements into the same space in A.realp and A.imagp that was used before.
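As a rough sketch (not part of the original answer), the whole file could be processed in a loop like the one below. It reuses the names from the code above, assumes A.realp and A.imagp each hold at least nOver2 doubles, and assumes obtainedReal is allocated with f rounded up to a multiple of 1024 doubles so the zero-padded final frame fits:

scale = 1.0 / (2.0 * n);
int blockSize = (int)n;                  // 1024 samples per STFT frame
double *framePad = malloc(blockSize * sizeof(double));
for (int start = 0; start < f; start += blockSize) {
    double *src = buffer + start;
    if (start + blockSize > f) {
        // Final partial block: copy what is left and zero-pad the rest.
        int remaining = f - start;
        memset(framePad, 0, blockSize * sizeof(double));
        memcpy(framePad, src, remaining * sizeof(double));
        src = framePad;
    }
    // Pack 1024 real samples into 512 split-complex elements.
    vDSP_ctozD((DOUBLE_COMPLEX *)src, 2, &A, 1, nOver2);
    vDSP_fft_zripD(setupReal, &A, stride, log2n, FFT_FORWARD);
    vDSP_vsmulD(A.realp, 1, &scale, A.realp, 1, nOver2);
    vDSP_vsmulD(A.imagp, 1, &scale, A.imagp, 1, nOver2);
    // Write this frame's packed result right after the previous frame's.
    vDSP_ztocD(&A, 1, (DOUBLE_COMPLEX *)(obtainedReal + start), 2, nOver2);
}
free(framePad);

For an actual STFT you would normally also apply a window to each block and overlap consecutive blocks, but that is beyond what the question asks.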

Related

Procedural structure generation

I have a voxel-based game in development right now, and so far I generate my world using Simplex noise. Now I want to generate some other structures like rivers, cities and other things, which can't easily be generated, because I split my world (which is practically infinite) into chunks of 64x128x64. I already generate trees (whose leaves can grow into neighbouring chunks) by generating the trees for a chunk plus the trees for the 8 chunks surrounding it, so no leaves are missing. But for larger structures this gets difficult, since to build one chunk I would have to consider chunks within a radius of 16 chunks.
Is there a better way to do this?
Depending on the desired complexity of the generated structure, you may find it useful to first generate it in a separate array, perhaps even a map (a location-to-contents dictionary, useful when the structure is very sparse), and then transfer the structure into the world.
As for natural land features, you may want to google how fractals are used in landscape generation.
I know this thread is old and I suck at explaining, but I'll share my approach.
So, for example, 5x5x5 trees. What you want is for your noise function to return the same value for a 5x5 area of blocks, so that even outside of the chunk you can still check whether a tree should be generated there.
// Here the returned value is different for every block
float value = simplexNoise(x * frequency, z * frequency) * amplitude;
// Here it will return the same value for a 5x5 area of blocks (use floorDiv instead of plain division, or negative coordinates come out wrong: -3 / 5 should be -1, not 0 as with normal truncating division)
float value = simplexNoise(Math.floorDiv(x, 5) * frequency, Math.floorDiv(z, 5) * frequency) * amplitude;
And now we'll plant a tree. For this we need to check what x, y, z position the current block has relative to the tree's starting position, so we know which part of the tree this block is.
if (value > 0.8) { // A certain threshold (checking if a tree should be generated in this area)
    int startX = Math.floorDiv(x, 5) * 5; // floor the x value to every 5 units to get the start position
    int startZ = Math.floorDiv(z, 5) * 5; // floor the z value to every 5 units to get the start position
    // Starting height of the trunk (middle of the tree, hence the +2 on startX and startZ),
    // which is 1 block above the grass surface
    int startY = height(startX + 2, startZ + 2) + 1;
    int relx = x - startX; // block position relative to the starting position
    int relz = z - startZ;
    for (int j = startY; j < startY + 5; j++) {
        int rely = j - startY;
        byte tile = tree[relx][rely][relz]; // Get the block needed at this part of the tree
        tiles[i][j][k] = tile;
    }
}
The tree 3D array here is almost like a "prefab" of the tree, which you can use to know which block to set at the position relative to the starting point. (God, I don't know how to explain this, and having English as my fifth language doesn't help either ;-; feel free to improve my answer or create a new one.) I've implemented this in my engine, and it's working. The structures can be as big as you want, with no chunk pre-loading needed. The one problem with this method is that the trees or structures will be spawned almost on a grid, but this can easily be solved with multiple octaves with different offsets.
So, to recap:
for (int i = 0; i < 64; i++) {
    for (int k = 0; k < 64; k++) {
        int x = chunkPosToWorldPosX(i); // Get world position
        int z = chunkPosToWorldPosZ(k);
        // Here the returned value would be different for every block:
        // float value = simplexNoise(x * frequency, z * frequency) * amplitude;
        // Here it returns the same value for a 5x5 area of blocks (use floorDiv instead of plain
        // division, or negative coordinates come out wrong: -3 / 5 should be -1, not 0)
        float value = simplexNoise(Math.floorDiv(x, 5) * frequency, Math.floorDiv(z, 5) * frequency) * amplitude;
        if (value > 0.8) { // A certain threshold (checking if a tree should be generated in this area)
            int startX = Math.floorDiv(x, 5) * 5; // floor the x value to every 5 units to get the start position
            int startZ = Math.floorDiv(z, 5) * 5; // floor the z value to every 5 units to get the start position
            // Starting height of the trunk (middle of the tree, hence the +2 on startX and startZ),
            // which is 1 block above the grass surface
            int startY = height(startX + 2, startZ + 2) + 1;
            int relx = x - startX; // block position relative to the starting position
            int relz = z - startZ;
            for (int j = startY; j < startY + 5; j++) {
                int rely = j - startY;
                byte tile = tree[relx][rely][relz]; // Get the block needed at this part of the tree
                tiles[i][j][k] = tile;
            }
        }
    }
}
So 'i' and 'k' loop within the chunk, and 'j' loops inside the structure. This is pretty much how it should work.
And about the rivers: I personally haven't done them yet, and I'm not sure why you need to set the blocks around the chunk when generating them (you could just use Perlin worms and that would solve the problem), but it's pretty much the same idea, and the same goes for your cities.
I read something about this in a book, and what they did in these cases was to make a finer division of chunks depending on the application, i.e. if you are going to grow very big objects, it may be useful to have another, separate logical division of, for example, 128x128x128, just for this specific application.
In essence, the data resides in the same place; you just use different logical divisions.
To be honest, I've never done any voxel work, so don't take my answer too seriously, I'm just throwing out ideas. By the way, the book is Game Engine Gems 1; it has a gem on voxel engines.
About rivers: can't you just set a level for water and let rivers auto-generate on mountain-side ladders? To avoid placing water inside mountain caves, you could perform a raycast upwards to check whether the next N blocks are free.

Apple Accelerate framework produce cepstrum unexpected results

I'm following an article on how to produce a cepstrum for use in detecting speech formants, and coding it using the iPhone Accelerate framework. However, the results are not quite what the article leads me to expect. For unvoiced sections (figure 3 in the article), it shows smaller values in the first few bins, but when my code runs, the unvoiced sections have large values (towards 1.0), which looks more like a voiced section.
Here is my code:
// copy buffer data into a separate array and apply hamming window
// don't use leadlength because we copied to beginning of buffer
int offset = (int)(s * stepSize);
float *hamBuffer = malloc(n*sizeof(float));
for (int i=0; i < n; i++)
hamBuffer[i] = hpBuffer[offset+i] * ((1.0f-0.46f) - 0.46f*cos(TWOPI*i/((float)n-1.0f)));
// configure float array into acceptable input array format (interleaved)
vDSP_ctoz((COMPLEX*)hamBuffer, 2, &complexArray, 1, halfN);
// free ham buffer
free(hamBuffer);
// run FFT
vDSP_fft_zrip(setupReal, &complexArray, stride, log2n, FFT_FORWARD);
// Absolute square (equivalent to mag^2)
complexArray.imagp[0] = 0.0f;
vDSP_zvmags(&complexArray, 1, complexArray.realp, 1, halfN);
bzero(complexArray.imagp, (halfN) * sizeof(float));
// scale
float scale = 1.0f / (2.0f*(float)n);
vDSP_vsmul(complexArray.realp, 1, &scale, complexArray.realp, 1, halfN);
// get log of absolute values for passing to inverse FFT for cepstrum
float *logmag = malloc(sizeof(float)*halfN);
for (int i=0; i < halfN; i++)
logmag[i] = log10f(fabsf(complexArray.realp[i]));
// configure float array into acceptable input array format (interleaved)
vDSP_ctoz((COMPLEX*)logmag, 2, &complexArray, 1, halfN/2);
// create cepstrum
vDSP_fft_zrip(setupReal, &complexArray, stride, log2n-1, FFT_INVERSE);
// scale again
scale = (float) 1.0 / (2 * n);
vDSP_vsmul(complexArray.realp, 1, &scale, complexArray.realp, 1, halfN);
vDSP_vsmul(complexArray.imagp, 1, &scale, complexArray.imagp, 1, halfN);
//convert interleaved to real
float *displayData = malloc(sizeof(float)*n);
vDSP_ztoc(&complexArray, 1, (COMPLEX*)displayData, 2, halfN);
// print cepstrum to debug window
for (int i=0; i < halfN; i++)
printf("%f\r\n", displayData[i]);
Here are the results of the first several bins:
-1.036735
0.807992
-0.030310
0.201064
-0.048442
0.071084
-0.050529
0.108412
-0.037282
0.080372
-0.003775
0.102596
-0.027706
0.044470
0.010319
0.041597
-0.050533
0.012725
-0.003895
-0.016887
-0.010547
They do 'settle down' towards zero, but the first few numbers are way larger than I was expecting for an unvoiced section. Does my code look incorrect? I think I followed the article quite closely. Why am I getting such large values in the first few bins for an unvoiced section?

Passing AudioQueueBufferRef data to FFT function!

I am trying to compute the frequency of a sound captured through the microphone on the iPhone.
I've read all the posts about FFT (including all of Apple's code examples, e.g. aurioTouch and SpeakHere), but found no solution to this problem.
I'm using AudioQueue, but how do I pass the raw data (inBuffer->mAudioData) from the AudioQueue callback function (MyInputBufferHandler) to the actual FFT DSPSplitComplex datatype, so I can compute it? All of this using the Accelerate framework.
// AudioQueue callback function, called when an input buffer has been filled.
void AQRecorder::MyInputBufferHandler( void * inUserData,
AudioQueueRef inAQ,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp * inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription* inPacketDesc)
{
for(int i=0; i<inNumPackets; i++) {
printf("%d ",((int*)inBuffer->mAudioData)[i]);
}
}
The FFT function.
RealFFTUsageAndTiming(){
COMPLEX_SPLIT A; //DSPSplitComplex datatype
FFTSetup setupReal;
uint32_t log2n;
uint32_t n, nOver2;
int32_t stride;
uint32_t i;
float *originalReal, *obtainedReal;
float scale;
/* Set the size of FFT. */
log2n = N;
n = 1 << log2n;
stride = 1;
nOver2 = n / 2;
/* Allocate memory for the input operands and check its availability,
* use the vector version to get 16-byte alignment. */
A.realp = (float *) malloc(nOver2 * sizeof(float));
A.imagp = (float *) malloc(nOver2 * sizeof(float));
originalReal = (float *) malloc(n * sizeof(float));
obtainedReal = (float *) malloc(n * sizeof(float));
//How do I pass the data from AudioQueue callback to function?
vDSP_fft_zrip(setupReal, &A, stride, log2n, FFT_FORWARD);
vDSP_fft_zrip(setupReal, &A, stride, log2n, FFT_INVERSE);
}
I haven't found anywhere how to do this. Please help!
You have to know the C data type of the data in the audio buffer and the data types that the FFT supports. If they are not the same (commonly 16-bit signed integers versus single-precision floats), then you will have to convert while unpacking and copying the arrays of PCM data (in a for loop). Given real data, you can zero out the imaginary array of the input to the FFT.
Also, the length of the Audio Queue buffer may not be the same as the FFT length, so you may have to save the data from the Audio Queue callback to another queue internal to your app, and have another worker thread pass that data to your analysis/FFT routines as the queue fills.
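As a rough sketch of that unpack-and-convert step (assuming 16-bit mono LPCM in the queue buffer, that n = 1 << log2n samples are available, that setupReal has been created with vDSP_create_fftsetup(log2n, FFT_RADIX2), and that A and nOver2 are set up as in the code above; none of this is tied to the questioner's exact project):

SInt16 *pcm = (SInt16 *)inBuffer->mAudioData;
float *monoFloat = (float *)malloc(n * sizeof(float));

// Convert 16-bit integer samples to float and scale to roughly -1..1.
vDSP_vflt16(pcm, 1, monoFloat, 1, n);
float norm = 1.0f / 32768.0f;
vDSP_vsmul(monoFloat, 1, &norm, monoFloat, 1, n);

// Split the interleaved "even/odd" real samples into A.realp / A.imagp,
// which is the packed input layout vDSP_fft_zrip expects for real FFTs.
vDSP_ctoz((COMPLEX *)monoFloat, 2, &A, 1, nOver2);

vDSP_fft_zrip(setupReal, &A, 1, log2n, FFT_FORWARD);
free(monoFloat);

If the queue buffer doesn't yet contain a full n samples, accumulate the callback data into an intermediate buffer as described above and only run the FFT once n samples have arrived.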
Amplitude values are (printed after calling vDSP_fft_zrip):
for (i = 0; i < nOver2; i++) {
    // Note: in vDSP's packed format, A.realp[0] holds the DC term and A.imagp[0] the Nyquist term.
    float magnitude = sqrtf(A.realp[i] * A.realp[i] + A.imagp[i] * A.imagp[i]);
    printf("%f\n", log10f(magnitude));
}

Help with IIR Comb Filter

Reverb.m
#define D 1000
OSStatus MusicPlayerCallback(
    void *inRefCon,
    AudioUnitRenderActionFlags *ioActionFlags,
    const AudioTimeStamp *inTimeStamp,
    UInt32 inBusNumber,
    UInt32 inNumberFrames,
    AudioBufferList *ioData) {
    MusicPlaybackState *musicPlaybackState = (MusicPlaybackState *)inRefCon;
    // Sample rate 44.1 kHz
    float a0, a1;
    double y0, sampleinp;
    // Delay gain
    a0 = 1;
    a1 = 0.5;
    for (int i = 0; i < ioData->mNumberBuffers; i++) {
        AudioBuffer buffer = ioData->mBuffers[i];
        SInt16 *outSampleBuffer = buffer.mData;
        for (int j = 0; j < inNumberFrames * 2; j++) {
            // Delay left channel
            sampleinp = *musicPlaybackState->samplePtr++;
            /* IIR equation of the comb filter:
               y[n] = (a0 * x[n]) + (a1 * x[n - D])
            */
            y0 = (a0 * sampleinp) + (a1 * sampleinp - D);
            outSampleBuffer[j] = fmax(fmin(y0, 32767.0), -32768.0);
            j++;
            // Delay right channel
            sampleinp = *musicPlaybackState->samplePtr++;
            y0 = (a0 * sampleinp) + (a1 * sampleinp - D);
            outSampleBuffer[j] = fmax(fmin(y0, 32767.0), -32768.0);
        }
    }
    return noErr;
}
OK, I got a lot of info, but I'm having trouble implementing it. Can someone help? It's probably something really easy I'm forgetting. It's just playing back as normal with a little boost, but no delays.
Your treatment of the x0[] variables doesn't look right -- the way you have it, the left and right channels will be intermingled. You assign to x0[j] for the left channel, then overwrite x0[j] with the right channel data. So the delayed signal x0[j-D] will always correspond to the right channel, with the delayed left channel data being lost.
You didn't say what your sample rate is, but for a typical audio application, a three-sample delay might not have much of an audible effect. At 44.1 ksamp/sec, with a 3-sample delay the peaks and troughs of the filter response will be at multiples of 14,700 Hz. All you'll get is a single peak in the audio frequency range, in a part of the spectrum where there's hardly any power (assuming the signal is speech or music).
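To make the delay concrete, here is a minimal sketch of a D-sample comb filter using a per-channel circular delay buffer; the CombState struct and the comb_process name are made up for illustration and are not from the original callback:

// Per-channel comb filter state for y[n] = a0*x[n] + a1*x[n-D]
// (D is the delay length in samples, e.g. the #define D 1000 above).
typedef struct {
    double delay[D];   // circular buffer holding the last D input samples
    int    pos;        // current write position
} CombState;

static inline double comb_process(CombState *s, double x, double a0, double a1)
{
    double delayed = s->delay[s->pos];   // this is x[n - D]
    double y = a0 * x + a1 * delayed;
    s->delay[s->pos] = x;                // remember x[n]; it comes back out D samples later
    s->pos = (s->pos + 1) % D;
    return y;
}

In the render callback you would keep one zero-initialised CombState per channel, run each interleaved sample through comb_process, and clamp the result to the 16-bit range as before.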

How do I set up a buffer when doing an FFT using the Accelerate framework?

I'm using the Accelerate framework to perform a Fast Fourier Transform (FFT), and am trying to find a way to create a buffer for use with it that has a length of 1024. I have access to the average peak and peak of a signal on which I want to do the FFT.
Can somebody help me or give me some hints to do this?
Apple has some examples of how to set up FFTs in their vDSP Programming Guide. You should also check out the vDSP Examples sample application. While written for the Mac, that code should translate directly to iOS as well.
I recently needed to do a simple FFT of a 64-integer input waveform, for which I used the following code:
static FFTSetupD fft_weights;
static DSPDoubleSplitComplex input;
static double *magnitudes;
+ (void)initialize
{
/* Setup weights (twiddle factors) */
fft_weights = vDSP_create_fftsetupD(6, kFFTRadix2);
/* Allocate memory to store split-complex input and output data */
input.realp = (double *)malloc(64 * sizeof(double));
input.imagp = (double *)malloc(64 * sizeof(double));
magnitudes = (double *)malloc(64 * sizeof(double));
}
- (CGFloat)performAcceleratedFastFourierTransformAndReturnMaximumAmplitudeForArray:(NSUInteger *)waveformArray;
{
for (NSUInteger currentInputSampleIndex = 0; currentInputSampleIndex < 64; currentInputSampleIndex++)
{
input.realp[currentInputSampleIndex] = (double)waveformArray[currentInputSampleIndex];
input.imagp[currentInputSampleIndex] = 0.0f;
}
/* 1D in-place complex FFT */
vDSP_fft_zipD(fft_weights, &input, 1, 6, FFT_FORWARD);
input.realp[0] = 0.0;
input.imagp[0] = 0.0;
// Get magnitudes
vDSP_zvmagsD(&input, 1, magnitudes, 1, 64);
// Extract the maximum value and its index
double fftMax = 0.0;
vDSP_maxmgvD(magnitudes, 1, &fftMax, 64);
return sqrt(fftMax);
}
As you can see, I only used the real values in this FFT to set up the input buffers, performed the FFT, and then read out the magnitudes.
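As a hypothetical call site (the class name FFTAnalyzer and the test waveform are made up for illustration), using the method above might look like this:

FFTAnalyzer *analyzer = [[FFTAnalyzer alloc] init];

// A crude 8-samples-per-cycle square wave, just to have something periodic to analyse.
NSUInteger waveform[64];
for (NSUInteger i = 0; i < 64; i++) {
    waveform[i] = (i % 8 < 4) ? 100 : 0;
}

CGFloat maximumAmplitude = [analyzer performAcceleratedFastFourierTransformAndReturnMaximumAmplitudeForArray:waveform];
NSLog(@"Maximum FFT amplitude: %f", maximumAmplitude);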