I am trying to manipulate gain on individual buffers in an OfflineAudioContext.
ac (the audio context) and data (the decoded AudioBuffer) are set up earlier, after the audio has been loaded:
var source = ac.createBufferSource();
source.buffer = data;
var splitter = ac.createChannelSplitter(2);
source.connect(splitter);
var merger = ac.createChannelMerger(2);
var gainNode = ac.createGain();
gainNode.gain.value = 0.5;
splitter.connect(gainNode, 0);
splitter.connect(gainNode, 1);
gainNode.connect(merger, 0, 1);
//error occurs here
gainNode.connect(merger, 1, 0);
var dest = ac.createMediaStreamDestination();
merger.connect(dest);
Error: Failed to execute 'connect' on 'AudioNode': output index (1) exceeds number of outputs (1)
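I was not assigning the indices correctly. connect() takes (destination, outputIndex, inputIndex), and a GainNode has only one output, so the output index must be 0 and only the merger's input index changes: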
splitter.connect(gainNode, 0);
splitter.connect(gainNode, 1);
gainNode.connect(merger, 0, 0);
gainNode.connect(merger, 0, 1);
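Note that connecting both splitter outputs to the same GainNode sums them, so both merger inputs receive the same mixed signal. If the goal is an independent gain per channel, something along these lines may be closer. This is a minimal sketch assuming a stereo buffer; buildPerChannelGain and its gains argument are hypothetical names, and it connects to ac.destination because that is where an OfflineAudioContext renders from:

```javascript
// Hypothetical helper: route each channel of a stereo source through its own GainNode.
// `ac` is expected to be an AudioContext or OfflineAudioContext.
function buildPerChannelGain(ac, buffer, gains) {
  const source = ac.createBufferSource();
  source.buffer = buffer;

  const splitter = ac.createChannelSplitter(2);
  const merger = ac.createChannelMerger(2);
  source.connect(splitter);

  const gainNodes = gains.map((value, channel) => {
    const gainNode = ac.createGain();
    gainNode.gain.value = value;
    // connect(destination, outputIndex, inputIndex)
    splitter.connect(gainNode, channel);  // splitter output `channel` -> gain input 0
    gainNode.connect(merger, 0, channel); // gain output 0 -> merger input `channel`
    return gainNode;
  });

  merger.connect(ac.destination);
  return { source, gainNodes, merger };
}
```

With the ac and data from the question, this would be called as buildPerChannelGain(ac, data, [0.5, 1.0]), followed by start() on the returned source.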
I'm trying to re-implement PyTorch's STFT code in Swift with Accelerate / vDSP, to produce a log-Mel spectrogram by post-processing the STFT, so that I can use the Mel spectrogram as input for a CoreML port of OpenAI's Whisper.
PyTorch's native STFT / Mel code produces this spectrogram (it's clipped due to importing raw float 32s into Photoshop lol)
and mine:
Obviously the two things to notice are the values, and the lifted frequency components.
The STFT docs are here: https://pytorch.org/docs/stable/generated/torch.stft.html
$$X[\omega, m] = \sum_{k=0}^{\text{win\_length}-1} \text{window}[k]\ \text{input}[m \times \text{hop\_length} + k]\ \exp\left(-j\,\frac{2\pi \cdot \omega k}{\text{win\_length}}\right)$$
I believe I'm properly handling window[k] input[m×hop_length+k], but I'm a bit lost as to how to calculate the exponential, what -j refers to in the documentation, and how to express the final exponential in vDSP. Also, if it's a sum, how do I get the 200 elements I need!?
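For reference, j in that formula is the imaginary unit (often written i), and the sum is evaluated once per frequency index ω, so each frame yields one complex value per bin rather than a single number. A naive, unoptimized reading of the formula for one frame might look like the sketch below. It is written in JavaScript purely for illustration, dftFrame is a hypothetical name, and an FFT computes the same values much faster. It relies on Euler's formula, exp(−jθ) = cos θ − j sin θ, and returns floor(win_length / 2) + 1 bins, which matches the one-sided output PyTorch produces by default for real input:

```javascript
// Naive evaluation of one STFT frame (frame index m), straight from the formula.
// Returns one complex value (re[w], im[w]) per frequency bin w = 0 .. winLength / 2.
function dftFrame(input, window, m, hopLength) {
  const winLength = window.length;
  const numBins = Math.floor(winLength / 2) + 1; // one-sided spectrum of a real signal
  const re = new Float64Array(numBins);
  const im = new Float64Array(numBins);
  for (let w = 0; w < numBins; w++) {
    for (let k = 0; k < winLength; k++) {
      const x = window[k] * input[m * hopLength + k];
      const theta = (2 * Math.PI * w * k) / winLength;
      // exp(-j * theta) = cos(theta) - j * sin(theta)
      re[w] += x * Math.cos(theta);
      im[w] -= x * Math.sin(theta);
    }
  }
  return { re, im };
}
```

The power spectrum that a Mel filter bank is usually applied to is then re[w] * re[w] + im[w] * im[w] for each bin.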
My Log Mel Spectrogram
My code follows:
func processData(audio: [Int16]) -> [Float]
{
assert(self.sampleCount == audio.count)
var audioFloat:[Float] = [Float](repeating: 0, count: audio.count)
vDSP.convertElements(of: audio, to: &audioFloat)
vDSP.divide(audioFloat, 32768.0, result: &audioFloat)
// Up to this point, Python and swift are numerically identical
// insert numFFT/2 samples before and numFFT/2 after so we have an extra numFFT amount to process
// TODO: Is this strictly necessary?
audioFloat.insert(contentsOf: [Float](repeating: 0, count: self.numFFT/2), at: 0)
audioFloat.append(contentsOf: [Float](repeating: 0, count: self.numFFT/2))
// Split Complex arrays holding the FFT results
var allSampleReal = [[Float]](repeating: [Float](repeating: 0, count: self.numFFT/2), count: self.melSampleCount)
var allSampleImaginary = [[Float]](repeating: [Float](repeating: 0, count: self.numFFT/2), count: self.melSampleCount)
// Step 2 - we need to create 200 x 3000 matrix of STFTs - note we appear to want to output complex numbers (?)
for (m) in 0 ..< self.melSampleCount
{
// Slice numFFTs every hop count (barf) and make a mel spectrum out of it
// audioFrame ends up holding split complex numbers
var audioFrame = Array<Float>( audioFloat[ (m * self.hopCount) ..< ( (m * self.hopCount) + self.numFFT) ] )
// Copy of audioFrame original samples
let audioFrameOriginal = audioFrame
assert(audioFrame.count == self.numFFT)
// Split Complex arrays holding a single FFT result of our Audio Frame, which gets appended to the allSample Split Complex arrays
var sampleReal:[Float] = [Float](repeating: 0, count: self.numFFT/2)
var sampleImaginary:[Float] = [Float](repeating: 0, count: self.numFFT/2)
sampleReal.withUnsafeMutableBytes { unsafeReal in
sampleImaginary.withUnsafeMutableBytes { unsafeImaginary in
vDSP.multiply(audioFrame,
hanningWindow,
result: &audioFrame)
var complexSignal = DSPSplitComplex(realp: unsafeReal.bindMemory(to: Float.self).baseAddress!,
imagp: unsafeImaginary.bindMemory(to: Float.self).baseAddress!)
audioFrame.withUnsafeBytes { unsafeAudioBytes in
vDSP.convert(interleavedComplexVector: [DSPComplex](unsafeAudioBytes.bindMemory(to: DSPComplex.self)),
toSplitComplexVector: &complexSignal)
}
// Step 3 - creating the FFT
self.fft.forward(input: complexSignal, output: &complexSignal)
}
}
// We need to match: https://pytorch.org/docs/stable/generated/torch.stft.html
// At this point, I'm unsure how to continue?
// let twoπ = Float.pi * 2
// let freqstep:Float = Float(16000 / (self.numFFT/2))
//
// var w:Float = 0.0
// for (k) in 0 ..< self.numFFT/2
// {
// let j:Float = sampleImaginary[k]
// let sample = audioFrame[k]
//
// let exponent = -j * ( (twoπ * freqstep * Float(k) ) / Float((self.numFFT/2)))
//
// w += powf(sample, exponent)
// }
allSampleReal[m] = sampleReal
allSampleImaginary[m] = sampleImaginary
}
// We now have allSample Split Complex holding 3000 200 dimensional real and imaginary FFT results
// We create flattened 3000 x 200 array of DSPSplitComplex values
var flattnedReal:[Float] = allSampleReal.flatMap { $0 }
var flattnedImaginary:[Float] = allSampleImaginary.flatMap { $0 }
Sorry if this is very obvious, I'm very new to SuperCollider. I've followed a few suggestions from other threads but this application appears unique as I'm using OSC data from Max 8. I've run out of time so I'd hugely appreciate any suggestions.
I'm trying to change the amplitude of my Synth using AmpCompA. I can change the frequency in realtime using OSC messages from Max 8, however, I can't apply AmpCompA in realtime using the same trigger/message. Is this possible another way?
Here is the code:
// setup synth
(
SynthDef.new("sine", {arg out = 0, freq = 200, amp = 1.0;
var sin;
sin = SinOsc.ar(freq);
Out.ar(out, sin * amp);
}).send(s);
)
x = Synth.new("sine", [\freq, 200, \amp, 1.0]);
//test parameter setting
x.set(\freq, 400);
x.set(\amp, 0.1);
x.free;
//read from OSC message
(
f = { |msg|
if(msg[0] != '/status.reply') {
b = 1.0 * AmpCompA.kr(msg[1]);
x.set(\freq, msg[1]); //this does work when I send an OSC message from Max 8
x.set(\amp, b); //this doesn't work? Can't set a control to UGen error
}
};
thisProcess.addOSCRecvFunc(f);
)
s.sendMsg("/n_free", x);
Max 8 Screenshot
Error:
ERROR: can't set a control to a UGen
CALL STACK:
Exception:reportError
arg this =
Nil:handleError
arg this = nil
arg error =
Thread:handleError
arg this =
arg error =
Object:throw
arg this =
UGen:asControlInput
arg this =
Object:asOSCArgEmbeddedArray
arg this =
arg array = [*1]
< FunctionDef in Method SequenceableCollection:asOSCArgArray >
arg e =
ArrayedCollection:do
arg this = [*2]
arg function =
var i = 1
SequenceableCollection:asOSCArgArray
arg this = [*2]
var array = [*1]
Node:set
arg this =
arg args = [*2]
< FunctionDef in Method Collection:collectInPlace >
arg item =
arg i = 1
ArrayedCollection:do
arg this = [*2]
arg function =
var i = 1
Collection:collectInPlace
arg this = [*2]
arg function =
FunctionList:value
arg this =
arg args = [*4]
var res = nil
Main:recvOSCmessage
arg this =
arg time = 626.8060463
arg replyAddr =
arg recvPort = 57120
arg msg = [*2]
^^ The preceding error dump is for ERROR: can't set a control to a UGen
You can use x.set to send numbers to a synth, but you can't use it to send functions or UGens. AmpCompA is a UGen, and therefore can only be used inside a SynthDef.
You can modify your synthdef to use it:
(
SynthDef.new("sine", {arg out = 0, freq = 200, amp = 1.0;
var sin;
sin = SinOsc.ar(freq);
Out.ar(out, sin * amp * AmpCompA.kr(freq));
}).send(s);
)
Your OSC message function then only ever needs to send the frequency:
x.set(\freq, msg[1]);
The SynthDef uses the frequency to compute the psychoacoustic amplitude change.
I am trying to port an application to an embedded system that I am designing. The embedded system is based on a Raspberry Pi Zero W and uses a custom Yocto build.
The application to be ported is written with SDL / OpenGLES to my understanding. I have a hard time understanding how to make a connection similar to the following depiction:
SDL APP -----> XServer ($DISPLAY) -------> Framebuffer /dev/fb1 ($FRAMEBUFFER)
System has two displays: One HDMI on /dev/fb0 and One TFT on /dev/fb1. I am trying to run the SDL application on TFT. The following are the steps I do:
First, start an XServer on DISPLAY=:1 that is connected to /dev/fb1:
FRAMEBUFFER=/dev/fb1 xinit /etc/X11/Xsession -- /usr/bin/Xorg :1 -br -pn -nolisten tcp -dpi 100
The first step seems like it's working. I can see LXDE booting up on my TFT screen. Checking the display, I get the correct display resolution:
~/projects# DISPLAY=:1 xrandr -q
xrandr: Failed to get size of gamma for output default
Screen 0: minimum 320 x 240, current 320 x 240, maximum 320 x 240
default connected 320x240+0+0 0mm x 0mm
320x240 0.00*
Second, I would like to start SDL-written application using x11. I am thinking that should work in seeing the application on the TFT. In order to do so, I try:
SDL_VIDEODRIVER=x11 SDL_WINDOWID=1 DISPLAY=:1 ./SDL_App
No matter which display number I choose, it starts on my HDMI display and not on the TFT. So now I am thinking the person who wrote the application hardcoded some things in the application code:
void init_ogl(void)
{
int32_t success = 0;
EGLBoolean result;
EGLint num_config;
static EGL_DISPMANX_WINDOW_T nativewindow;
DISPMANX_ELEMENT_HANDLE_T dispman_element;
DISPMANX_DISPLAY_HANDLE_T dispman_display;
DISPMANX_UPDATE_HANDLE_T dispman_update;
VC_DISPMANX_ALPHA_T alpha;
VC_RECT_T dst_rect;
VC_RECT_T src_rect;
static const EGLint attribute_list[] =
{
EGL_RED_SIZE, 8,
EGL_GREEN_SIZE, 8,
EGL_BLUE_SIZE, 8,
EGL_ALPHA_SIZE, 8,
EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
EGL_NONE
};
EGLConfig config;
// Get an EGL display connection
display = eglGetDisplay(EGL_DEFAULT_DISPLAY);
assert(display!=EGL_NO_DISPLAY);
// Initialize the EGL display connection
result = eglInitialize(display, NULL, NULL);
assert(EGL_FALSE != result);
// Get an appropriate EGL frame buffer configuration
result = eglChooseConfig(display, attribute_list, &config, 1, &num_config);
assert(EGL_FALSE != result);
// Create an EGL rendering context
context = eglCreateContext(display, config, EGL_NO_CONTEXT, NULL);
assert(context!=EGL_NO_CONTEXT);
// Create an EGL window surface
success = graphics_get_display_size( 0 /* LCD */ , &screen_width, &screen_height);
printf ("Screen width= %d\n", screen_width);
printf ("Screen height= %d\n", screen_height);
assert( success >= 0 );
int32_t zoom = screen_width / GAMEBOY_WIDTH;
int32_t zoom2 = screen_height / GAMEBOY_HEIGHT;
if (zoom2 < zoom)
zoom = zoom2;
int32_t display_width = GAMEBOY_WIDTH * zoom;
int32_t display_height = GAMEBOY_HEIGHT * zoom;
int32_t display_offset_x = (screen_width / 2) - (display_width / 2);
int32_t display_offset_y = (screen_height / 2) - (display_height / 2);
dst_rect.x = 0;
dst_rect.y = 0;
dst_rect.width = screen_width;
dst_rect.height = screen_height;
src_rect.x = 0;
src_rect.y = 0;
src_rect.width = screen_width << 16;
src_rect.height = screen_height << 16;
dispman_display = vc_dispmanx_display_open( 0 /* LCD */ );
dispman_update = vc_dispmanx_update_start( 0 );
alpha.flags = DISPMANX_FLAGS_ALPHA_FIXED_ALL_PIXELS;
alpha.opacity = 255;
alpha.mask = 0;
dispman_element = vc_dispmanx_element_add ( dispman_update, dispman_display,
0/*layer*/, &dst_rect, 0/*src*/,
&src_rect, DISPMANX_PROTECTION_NONE, &alpha, 0/*clamp*/, DISPMANX_NO_ROTATE/*transform*/);
nativewindow.element = dispman_element;
nativewindow.width = screen_width;
nativewindow.height = screen_height;
vc_dispmanx_update_submit_sync( dispman_update );
surface = eglCreateWindowSurface( display, config, &nativewindow, NULL );
assert(surface != EGL_NO_SURFACE);
// Connect the context to the surface
result = eglMakeCurrent(display, surface, surface, context);
assert(EGL_FALSE != result);
eglSwapInterval(display, 1);
glGenTextures(1, &theGBTexture);
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
glEnable(GL_TEXTURE_2D);
glBindTexture(GL_TEXTURE_2D, theGBTexture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 256, 256, 0, GL_RGBA, GL_UNSIGNED_BYTE, (GLvoid*) NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrthof(0.0f, screen_width, screen_height, 0.0f, -1.0f, 1.0f);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glViewport(0.0f, 0.0f, screen_width, screen_height);
quadVerts[0] = display_offset_x;
quadVerts[1] = display_offset_y;
quadVerts[2] = display_offset_x + display_width;
quadVerts[3] = display_offset_y;
quadVerts[4] = display_offset_x + display_width;
quadVerts[5] = display_offset_y + display_height;
quadVerts[6] = display_offset_x;
quadVerts[7] = display_offset_y + display_height;
glVertexPointer(2, GL_SHORT, 0, quadVerts);
glEnableClientState(GL_VERTEX_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 0, kQuadTex);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glClear(GL_COLOR_BUFFER_BIT);
}
void init_sdl(void)
{
if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_GAMECONTROLLER) < 0)
{
Log("SDL Error Init: %s", SDL_GetError());
}
theWindow = SDL_CreateWindow("Gearboy", 0, 0, 0, 0, 0);
if (theWindow == NULL)
{
Log("SDL Error Video: %s", SDL_GetError());
}
...
}
At first glance, I discovered two lines: vc_dispmanx_display_open( 0 /* LCD */ ); and graphics_get_display_size( 0 /* LCD */ , &screen_width, &screen_height);. I tried changing the display parameter to 1, thinking that it refers to DISPLAY=:1, but it did not do anything. I added logs for screen resolution, and I get 1920x1080, which is the resolution of the HDMI display. I think there must be something with the EGL portion of the code that I'm missing. What should I do right now? Is my logic fair enough or am I missing something?
If you need any more information, please let me know. Any guidance regarding the issue is much appreciated.
EDIT: I saw that some people use the following, but the Raspberry Pi Zero cannot find EGL/eglvivante.h for the fb functions, so I am unable to compile it:
int fbnum = 1; // framebuffer index: 1 selects /dev/fb1
EGLNativeDisplayType native_display = fbGetDisplayByIndex(fbnum);
EGLNativeWindowType native_window = fbCreateWindow(native_display, 0, 0, 0, 0);
display = eglGetDisplay(native_display);
To query a MatrixMixer AudioUnit you do the following:
// code from MatrixMixerTest sample project in c++
UInt32 dims[2];
UInt32 theSize = sizeof(UInt32) * 2;
Float32 *theVols = NULL;
OSStatus result;
ca_require_noerr (result = AudioUnitGetProperty (au, kAudioUnitProperty_MatrixDimensions,
kAudioUnitScope_Global, 0, dims, &theSize), home);
theSize = ((dims[0] + 1) * (dims[1] + 1)) * sizeof(Float32);
theVols = static_cast<Float32*> (malloc (theSize));
ca_require_noerr (result = AudioUnitGetProperty (au, kAudioUnitProperty_MatrixLevels,
kAudioUnitScope_Global, 0, theVols, &theSize), home);
The value returned by AudioUnitGetProperty for kAudioUnitProperty_MatrixLevels is (as defined in the documentation and in the sample code) an array of Float32.
I am trying to find the matrix levels in Swift and can get the matrix dimensions without an issue. But I am not sure how to create an empty array of Float32 elements that can be passed as an UnsafeMutablePointer<Void>. Here's what I have tried without success:
var size = ((dims[0] + 1) * (dims[1] + 1)) * UInt32(sizeof(Float32))
var vols = UnsafeMutablePointer<Float32>.alloc(Int(size))
In the MatrixMixerTest the array is used like: theVols[0]
This may need to be modified depending on how you converted the other parts, but the last part of your C++ code can be written in Swift like this:
theSize = ((dims[0] + 1) * (dims[1] + 1)) * UInt32(sizeof(Float32))
var theVols: [Float32] = Array(count: Int(theSize)/sizeof(Float32), repeatedValue: 0)
result = AudioUnitGetProperty(au, kAudioUnitProperty_MatrixLevels,
kAudioUnitScope_Global, 0, &theVols, &theSize)
guard result == noErr else {
//...
fatalError()
}
When a C-function-based API takes an UnsafeMutablePointer<Void>, you just need an Array variable of any element type, passed as an inout argument (with &).
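Once the levels are in a flat array, they are indexed as an (inputs + 1) × (outputs + 1) row-major matrix. As far as I understand the header comments for kAudioUnitProperty_MatrixLevels, the extra last column holds the per-input volumes, the extra last row holds the per-output volumes, and the final element is the global volume; treat that layout as an assumption to verify against your SDK. The index arithmetic, sketched in JavaScript purely for illustration (matrixLevelIndex is a hypothetical helper, not part of any API), would be:

```javascript
// Hypothetical helper: flat index of the crosspoint (input, output) in a
// row-major (numInputs + 1) x (numOutputs + 1) level matrix.
function matrixLevelIndex(input, output, numOutputs) {
  return input * (numOutputs + 1) + output;
}

// Example with dims = [2, 2]: a 3 x 3 matrix stored in 9 floats.
const numInputs = 2;
const numOutputs = 2;
// Crosspoint from input 0 to output 1.
const crosspoint01 = matrixLevelIndex(0, 1, numOutputs);
// Assumed position of the global volume: the last element.
const globalVolume = matrixLevelIndex(numInputs, numOutputs, numOutputs);
```

The same arithmetic carries over directly to the Swift array, e.g. theVols[input * (outputs + 1) + output].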
I want to create a grid which displays an error message if the file does not exist:
/* size */
s = get_file_size(recording->filename);
if (s > 0) {
size = g_format_size_full(s, G_FORMAT_SIZE_LONG_FORMAT);
gtk_label_set_text(GTK_LABEL(size_lbl), size);
gtk_widget_hide(error);
g_free(size);
} else {
size = g_strdup(_("Import Errors"));
gtk_widget_show(error);
}
In the GtkGrid, I don't know what widget type to use for the "error" element so that it displays the message as in the screenshot:
grid = gtk_grid_new();
gtk_grid_set_row_spacing(GTK_GRID(grid), 6);
gtk_grid_set_column_spacing(GTK_GRID(grid), 15);
gtk_container_set_border_width(GTK_CONTAINER(grid), 6);
s_lbl = gtk_label_new("Size:");
size_lbl = gtk_label_new("");
error = ?
error_pixmap = gtk_image_new_from_stock(GTK_STOCK_DIALOG_ERROR, GTK_ICON_SIZE_SMALL_TOOLBAR);
gtk_container_add(GTK_CONTAINER(error), error_pixmap);
gtk_grid_attach(GTK_GRID(grid), s_lbl, 0, 0, 1, 1);
gtk_grid_attach(GTK_GRID(grid), error, 1, 0, 1, 1);
gtk_grid_attach(GTK_GRID(grid), size_lbl, 2, 0, 1, 1);
Thanks for any help.
Screen shot:
[Size]:
Use a GtkBox with GTK_ORIENTATION_HORIZONTAL, or an adjoining cell in the GtkGrid.