Audioworklets and pitch - web-audio-api

I've recently started working with audioworklets and am trying to figure out how to determine the pitch(s) from the input. I found a simple algorithm to use for a script processor, but the input values are different than a script processor and doesn't work. Plus each input array is only 128 units. So, how can I determine pitch using an audioworklet? As a bonus question, how do the values relate to the actual audio going in?

If it worked with a ScriptProcessorNode, it will work in an AudioWorklet, but you'll have to buffer the data in the worklet because, as you noted, you only get 128 frames per call. The ScriptProcessor gets anywhere from 256 to 16384.
The values going to the worklet are the actual values that are produced from the graph connected to the input. These are exactly the same values that would go to the script processor, except you get them in chunks of 128.

Related

How can I mix multiple stereo signals to one with WebAudio?

I'm writing a web app which needs to combine a number of stereo sounds into one stereo output, so I want an equivalent of gstreamer's audiomixer element, but there doesn't seem to be one in WebAudio. ChannelMerger doesn't do quite the same thing - it combines multiple mono signals into one multi-channel signal.
The documentation for AudioNode.connect says that you can connect an output to multiple inputs of other nodes and that attempts to connect the same output to the same input more than once are ignored. But it doesn't say what will happen if you try to connect multiple different outputs to the same input. Would that act as a simple mixer like I want? I suspect not, because what splitting/merging functionality WebAudio does provide (see ChannelMerger above) seems to mostly be based on converting between multiple mono signals and one multi-channel signal with a one channel to one mono signal mapping.
I could take an arbitrary node (I guess a GainNode would work, and I could take advantage of its gain functionality) and set its channelInterpretation mode to "speakers" to actually mix channels, but it only works for 1, 2, 4 or 6 inputs. I'm unlikely to need more than 6, but I will definitely need to be able to handle 3, and possibly 5. That could be done by using more than one mixer (eg for three channels mix inputs 1 and 2 in one mixer, then mix its output with input 3 in a second mixer), but I think I would have to add more GainNodes to balance the mix correctly. A mixer presumably has to attenuate each input to prevent coincident peaks from clipping out of range, so with chained mixers without compensation I'd end up with 1/4,1/4,1/2 instead of 1/3,1/3,1/3?
You almost got it right. Use a single GainNode and connect each source to the single input to the GainNode. This will sum up all of the different connections and produce a single output. If you know all of the individual sources are stereo, you don't need to change anything about the channelInterpretation, channelCountMode, or channelCount to get what you want.
You will probably have to adjust the gain value of the GainNode to reduce the output volume so that you don't overdrive the output device.
Other than, this should all work.

How to write "Big Data" to a text file using Matlab

I am getting some readings off an accelerometer connected to an Arduino which is in turn connected to MATLAB through serial communication. I would like to write the readings into a text file. A 10 second reading will write around 1000 entries that make the text file size around 1 kbyte.
I will be using the following code:
%%%%%// Communication %%%%%
arduino=serial('COM6','BaudRate',9600);
fopen(arduino);
fileID = fopen('Readings.txt','w');
%%%%%// Reading from Serial %%%%%
for i=1:Samples
scan = fscanf(arduino,'%f');
if isfloat(scan),
vib = [vib;scan];
fprintf(fileID,'%0.3f\r\n',scan);
end
end
Any suggestions on improving this code ? Will this have a time or Size limit? This code is to be run for 3 days.
Do not use text files, use binary files. 42718123229.123123 is 18 bytes in ASCII, 4 bytes in a binary file. Don't waste space unnecessarily. If your data is going to be used later in MATLAB, then I just suggest you save in .mat files
Do not use a single file! Choose a reasonable file size (e.g. 100Mb) and make sure that when you get to that many amount of data you switch to another file. You could do this by e.g. saving a file per hour. This way you minimize the possible errors that may happen if the software crashes 2 minutes before finishing.
Now knowing the real dimensions of your problem, writing a text file is totally fine, nothing special is required to process such small data. But there is a problem with your code. You are writing a variable vid which increases over time. That may cause bad performance because you are not using preallocation and it may consume a lot of memory. I strongly recommend not to keep this variable, and if you need the dater read it afterwards.
Another thing you should consider is verification of your data. What do you do when you receive less samples than you expect? Include timestamps! Be aware that these timestamps are not precise because you add them afterwards, but it allows you to identify if just some random samples are missing (may be interpolated afterwards) or some consecutive series of maybe 100 samples is missing.

Peekdata returns only one channel

I have a matlab stereo audio input ala:
aud=analoginput('winsound',0);
addchannel(aud,1:2)
When I'm done running this and ask for
aud_data=getdata(aud);
I get an array showing all the data from both channels. I have verified that this data is, in fact, the valid stereo signal that I want.
However, if I run
aud_peek=peekdata(aud,some_number_of_samples);
whilst data is being collected, I only seem to get one channel's worth of data, though MathWorks says I should get two.
Any thoughts as to why this is happening?
I am using Matlab 7.
Did you try
aud_peek=peekdata(aud,some_number_of_samples,'native');
or
aud_peek=peekdata(aud,some_number_of_samples,'double');

Mixing sound files of different size

I want to mix audio files of different size into a one single .wav file without clipping any file.,i.e. The resulting file size should be equal to the largest sized file of all.
There is a sample through which we can mix files of same size
[(http://www.modejong.com/iOS/#ex4 )(Example 4)].
I modified the code to get the mixed file as a .wav file.
But I am not able to understand that how to modify this code for unequal sized files.
If someone can help me out with some code snippet,i'll be really thankful.
It should be as easy as sending all the files to the mixer simultaneously. When any single file gets to the end, just treat it as if the remainder is filled with zeroes. When all files get to the end, you are done.
Note that the example code says it returns an error if there would be clipping (the sum of the waves is greater than the max representable value.). This condition is more likely if you are mixing multiple inputs. The best way around it is to create some "headroom" in the input waves. You can do either do this in preprocessing, by ensuring that each wave's volume is no more than X% of maximum. (~80-90%, depending on number of inputs.). The other way is to do it dynamically in the mixer code by multiplying each sample by some value <1.0 as you add it to the mix.
If you are selecting the waves to mix at runtime and failure due to clipping is unacceptable, you will need to modify the sample code to pin the values at max/min instead of returning an error. Don't just let them overflow or you will get noisy artifacts.
(Clipping creates artifacts as well, but when you haven't created enough headroom before mixing, it is definitely preferrable to overflow. It is a more familiar-sounding type of distortion, similar to what you get when you overdrive your speakers. See this wikipedia article on clipping:
Clipping is preferable to the alternative in digital systems—wrapping—which occurs if the digital hardware is allowed to "overflow", ignoring the most significant bits of the magnitude, and sometimes even the sign of the sample value, resulting in gross distortion of the signal.
How I'd do it:
Much like the mix_buffers function that you linked to, but pass in 2 parameters for mixbufferNumSamples. Iterate over the whole of the longer of the two buffers. When the index has gone beyond the end of the shorter buffer, simply set the sample from that buffer to 0 for the rest of the function.
If you must avoid clipping and do it in real-time and you know nothing else about the two sounds, you must provide enough headroom. The simplest method is by halving each of the samples before mixing:
mixed = s1/2 + s2/2;
This ensures that the resultant mixed sample won't overflow an int16_t. It will have the side effect of making everything quieter though.
If you can run it offline, you can calculate a scale factor to apply to both waveforms which will keep the peaks when summed below the maximum allowed value.
Or you could mix them all at full volume to an int32_t buffer, keeping track of the largest (magnitude) mixed sample and then go back through the buffer multiplying each sample by a scale factor which will make that extreme sample just reach the +32767/-32768 limits.

iPhone audio and AFSK

Here is a question for all you iPhone experts:
If you guys remember the sounds that modems used to make, or when one was trying to load a program from a cassette tape – I am trying to replicate this in an iPhone for a ham radio application. I have a stream of data (ASCII) and I need to encode it as AFSK at 1200 baud. So basically everything in the stream is converted to a series of 1200 and 2200 Hz tones. It needs to sound something like this: http://upload.wikimedia.org/wikipedia/commons/2/27/AFSK_1200_baud.ogg
I successfully built a bit array out of the string, but when I try to assign tones to each bit I get gaps in the sound, therefore it doesn’t demodulate correctly.
Any thought of how one should tackle this problem? Thank you.
The mobilesynth project is open-source. You might be able to scan that for code that generates the tones you need.
How are you assigning tones to the bits? Remember, a digital audio signal is just a stream of samples with values between -1 and 1. Perhaps there is a clipping issue between tone assignments. This can happen if the signal dives below -1 or above 1. If it stays above or below this range at a constant value, there will be no sound. Maybe you could output your stream of samples to check if this is the case. Or plug the output into an oscilloscope...
Also note that clicking can occur between "uneven" transitions of signals. For example if i output a sample with value 1 followed immediately by a sample with value -1, a click or pop will be produced.