Playing sound in Matlab at +30dB - matlab

As far as I know, when I load a WAV file into MATLAB with the command:
song = wavread('file.wav');
the array song has elements with values from -1 to 1. This file (and the hardware) is set up to be played at 80 dB. I need to add +30 dB to reach 110 dB.
I get +10 dB by multiplying by sqrt(10), so to get +30 dB I do:
song = song*10*sqrt(10); which is the same as
song = song*sqrt(10)*sqrt(10)*sqrt(10);
Now the values in song fall far outside the range -1 to 1, and I hear distorted sound.
Is that because the values fall outside [-1, 1], or because of the quality of my speakers/headphones?

The distortion is because your values exceed +/-1. The float values are converted to DAC counts, which span +/-32768 (for a 16-bit DAC), +/-8388608 (for a right-justified 24-bit DAC), or +/-2147483648 (for a left-justified 24-bit DAC). For a 16-bit DAC, this is usually accomplished by an operation like adcSample = (short int)(32768.0*floatSample); in C. If floatSample is > +1 or < -1, the scaled value no longer fits in a short int; depending on the implementation it either wraps around in the cast or is clipped to full scale, and either way that is the distortion you hear. The cast is necessary because the DAC expects 16-bit digital samples.
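To make the arithmetic concrete, here is a minimal Python sketch (not MATLAB, and not the actual driver code). The helper names db_to_gain and to_int16 are made up for illustration, and the conversion is assumed to clip rather than wrap. It shows that +30 dB is an amplitude factor of about 31.6, so even a half-scale sample ends up pinned at full scale:

```python
import math

def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def to_int16(sample):
    """Convert a float sample to a 16-bit count, clipping to [-1, 1] first.

    Hypothetical helper: real audio stacks may scale by 32768 or wrap instead.
    """
    clipped = max(-1.0, min(1.0, sample))
    return int(round(clipped * 32767.0))

gain = db_to_gain(30.0)             # about 31.62, i.e. 10 * sqrt(10)
boosted = [s * gain for s in (0.0, 0.5, -0.5)]
counts = [to_int16(s) for s in boosted]
print(gain, counts)
```

Any sample whose magnitude is above roughly 1/31.6 of full scale is flattened after the boost, which is why the gain has to come from the analog side of the chain.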
You will need to adjust your amplifier/speaker settings to get the sound level you desire.
Alternatively, you could create a copy of your file, lower it by 30 dB, adjust your amplifier/speakers to play the new file at 80 dB, then play the original file at the same amp/speaker settings. This will cause the original file to be played at 110 dB.
As Paul R noted in his comment, I am guessing here that you are using dB as shorthand for dB SPL when referring to the actual analog sound level produced by the full signal chain.

Related

Sine LUT VHDL won't simulate below 800 Hz

I made a sine LUT for VHDL, using 256 elements.
I'm using MIDI input, so values range from 8.17 Hz (note #0) to 12543.85 Hz (note #127).
I have another LUT that gives how many values must be sent to my 48 kHz codec to play one period of the sound (the 8.17 Hz frequency needs 48000/8.17 ≈ 5875 values).
I have another LUT that contains an index factor, 100*256/num_values, which is used to step through the sine table (ex: 100*256/5875 = 4, with integer rounding).
I send this index factor to another VHDL file, which uses it to calculate which value should be sent back (ex: index = index_factor*step_counter).
When I get this index, I divide it by 100 and call sineLUT[index] to get the value I need to generate a sine wave at the desired frequency.
The problem is that only the last 51 notes seem to work, and I do not know why. Anything below that frequency (<650 Hz) seems to get stuck on a constant note and just decreases in volume every time I try to lower the note.
If you need parts of my code, let me know.
Just guessing, I suspect your step_counter isn't going through enough cycles, so your index (into the sine lut) doesn't go through a full 360 degrees for the lower frequencies.
For anything more helpful, you'll probably have to post code.
As an aside, why aren't you using something more like a conventional DDS? Analog Devices has a nice write-up on the basics: DDS Tutorial
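For readers unfamiliar with the DDS approach, here is a rough software model in Python (not VHDL). The 32-bit accumulator width and the names tuning_word, dds_indices and dds_samples are assumptions chosen for illustration, not anything from the question's design. A phase accumulator is advanced by a fixed tuning word every sample, and its top 8 bits index the 256-entry sine table, so no per-note "number of values" or divide-by-100 step is needed:

```python
import math

ACC_BITS = 32                      # assumed accumulator width
TABLE_BITS = 8                     # 256-entry table, as in the question
SINE_LUT = [math.sin(2.0 * math.pi * i / 256) for i in range(256)]

def tuning_word(freq_hz, sample_rate=48000):
    """Phase increment per sample for the requested output frequency."""
    return round(freq_hz * (1 << ACC_BITS) / sample_rate)

def dds_indices(freq_hz, num_samples, sample_rate=48000):
    """Table indices produced by the phase accumulator, one per sample."""
    step = tuning_word(freq_hz, sample_rate)
    mask = (1 << ACC_BITS) - 1
    acc = 0
    indices = []
    for _ in range(num_samples):
        indices.append(acc >> (ACC_BITS - TABLE_BITS))  # top 8 bits
        acc = (acc + step) & mask                       # wraps every period
    return indices

def dds_samples(freq_hz, num_samples, sample_rate=48000):
    """Sine samples for the requested frequency."""
    return [SINE_LUT[i] for i in dds_indices(freq_hz, num_samples, sample_rate)]

low_note = dds_indices(8.17, 6000)   # slightly more than one period of MIDI note 0
```

Because the fractional phase lives in the low accumulator bits, even the lowest note sweeps through the whole table instead of stalling on a rounded-down step.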

Aligning two wav files precisely

I have a tool which compares two audio wav files frame by frame and returns a grade which gives the level of similarity between the two files.
I have an original wav file and a recording of that wav file. Since the two files are almost identical, I should get a high similarity score, yet I get a poor one, mainly because of a very slight delay in the recorded file, which leads to frame mismatch.
My question is: how do I align the two audio files exactly using MATLAB, so that a valid frame-to-frame comparison can be done?
You should run a series of comparisons, shifting one of the signals in time and calculating the correlation between the two. The shift with the highest correlation value gives you the time offset between the waves.
I think you can use xcorr to achieve this.
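As a rough illustration of the shift-and-correlate idea, here is a Python/NumPy sketch rather than MATLAB. The function name estimate_lag and the synthetic test signal are made up for the example; np.correlate in "full" mode plays the role that xcorr would play in MATLAB:

```python
import numpy as np

def estimate_lag(reference, recording):
    """Number of samples by which `recording` trails `reference`."""
    corr = np.correlate(recording, reference, mode="full")
    # In "full" mode, index len(reference) - 1 corresponds to zero lag.
    return int(np.argmax(corr)) - (len(reference) - 1)

# Synthetic check: a noise burst and a copy delayed by 37 samples.
rng = np.random.default_rng(0)
reference = rng.standard_normal(1000)
delay = 37
recording = np.concatenate([np.zeros(delay), reference])[:1000]

lag = estimate_lag(reference, recording)
aligned = recording[lag:]            # drop the leading delay
print(lag)
```

For long recordings the direct correlation is slow; an FFT-based correlation gives the same peak location much faster.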
Having had the same problem and without success to find a simple tool to sync the start of video/audio recordings automatically,
I decided to make syncstart (github).
It is a python-based command line tool that calculates the cut needed to bring the recordings into sync.
It uses an fft-based correlation of the start.
The basic code should be easily convertible to matlab:
corr = fft.ifft(fft.fft(s1pad)*np.conj(fft.fft(s2pad)))
ca = np.absolute(corr)
xmax = np.argmax(ca)
if xmax > padsize // 2:
    offset = (padsize-xmax)/fs
    # second signal (s2) to cut
else:
    offset = xmax/fs
    # first signal (s1) to cut

How to export sound from timeline of sounds on iOS with OpenAL

I'm not sure if it's possible to achieve what I want, but basically I have a NSDictionary which represents a recording. It's a timeline of what sound id was played at what point in time.
I have it so that you can play back this timeline/recording, and it works perfectly.
I'm wondering if there is any way to take this timeline and export it as a single sound that could be saved to a computer when the device is synced with iTunes.
So basically I'm asking if I can take a timeline of sounds, play it back and have these sounds stitched together as a single sound, that can then be exported.
I'm using OpenAL as my sound framework and the sound files are all CAFs.
Any help or guidance is appreciated.
Thanks!
You will need:
A good understanding of linear PCM audio format (See Wikipedia's Linear PCM page).
A good understanding of audio sample-rates and some basic maths to convert your timings into sample-offsets.
An awareness of how two's-complement binary numbers (signed/unsigned, 16-bit, 32-bit, etc.) are stored in computers, and how the endian-ness of a processor affects this.
Patience, interest in learning, and a strong desire to get this working.
Here's what to do:
1. Enable file sharing in your app (UIFileSharingEnabled=YES in Info.plist, and write files to the /Documents directory).
2. Render the sounds you use into memory buffers containing linear PCM audio data (if they are not already, i.e. if they are compressed). You can do this using the offline rendering functionality of Audio Queues (see Apple audio queue docs). It will make things a lot easier if you render them all to the same PCM format and sample rate (for example 16-bit signed samples @ 44,100 Hz; I'll use this format for all examples), and use the same format for your output. I recommend starting off with a mono format, then adding stereo once you get it working.
3. Choose an uncompressed output format and mix your sounds into a single stream:
3.1. Allocate a buffer large enough, or open a file stream to write to.
3.2. Write out any headers (for example if using WAV format output instead of raw PCM) and write zeros (or the mid-point of your sample range if not using a signed sample format) for any initial silence before your first sound starts. For example, if you want 0.1 seconds of silence before your first sound, write 4410 (0.1 * 44100) zero-samples, i.e. write 4410 shorts (16-bit), all zero.
3.3. Now keep track of all 'currently playing' sounds and mix them together. Start with an empty list of 'currently playing' sounds and keep track of the 'current time' of the sample you are mixing; for each sample you write out, increment the 'current time' by 1.0/sample_rate. When it is time for another sound to start, add it to the 'currently playing' list with a sample offset of 0. To do the mixing, iterate through all of the 'currently playing' sounds, add together their current samples, then increment the sample offset for each of them. Write the summed value into the output buffer. For example, if soundA starts at 0.1 seconds (after the silence) and soundB starts at 0.2 seconds, you will be doing the equivalent of output[8820] = soundA[4410] + soundB[0]; for sample 8820 and then output[8821] = soundA[4411] + soundB[1]; for sample 8821, etc. As a sound ends (you get to the end of its samples), simply remove it from the 'currently playing' list and keep going until the end of your audio data.
3.4. The simple mixing (sum of samples) described above does have some problems. For example, if two samples add up to a number larger than 32767 (or smaller than -32768), the sum cannot be stored in a signed 16-bit number; this is called clipping. For now, just clamp the value to the range -32768 to 32767 and get it working... later on, come back and implement a simple limiter (see description at end).
4. Now that you have a mixed version of your track in an uncompressed linear PCM format, that might be enough, so write it to /Documents. If you want to write it in a compressed format, you will need to get the source for an audio encoder and run your linear PCM output through that.
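The mixing step can be sketched in a few lines of Python (not the Objective-C/C you would use on the device). The function name mix and the (start_seconds, samples) timeline layout are assumptions for illustration. Instead of tracking a 'currently playing' list, this version simply adds each sound into the output at its start offset, which produces the same sums, and then clamps to the signed 16-bit range:

```python
def mix(timeline, sample_rate=44100):
    """Mix mono 16-bit sounds into one list of samples.

    timeline: list of (start_seconds, samples) pairs, where samples is a
    list of ints in the signed 16-bit range.
    """
    placed = [(round(start * sample_rate), samples) for start, samples in timeline]
    length = max(offset + len(samples) for offset, samples in placed)
    out = [0] * length                      # leading silence stays zero
    for offset, samples in placed:
        for i, value in enumerate(samples):
            out[offset + i] += value        # plain sum of overlapping sounds
    # Clamp anything that no longer fits in a signed 16-bit sample.
    return [max(-32768, min(32767, v)) for v in out]

sound_a = [1000] * 8820                     # starts at 0.1 s
sound_b = [2000] * 10                       # starts at 0.2 s
output = mix([(0.1, sound_a), (0.2, sound_b)])
```

With these inputs, output[8820] is soundA[4410] + soundB[0], matching the example in step 3.3.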
Simple limiter:
Let's choose to limit the top 10% of the sample range, so if the absolute value is greater than 29490 (int limitBegin = (int)(32767 * 0.9f);) we will scale down the value. The maximum possible peak would be int maxSampleValue = 32767 * numPlayingSounds; and we want to scale values above limitBegin to peak at 32767. So do the summation into sampleValue as per the very simple mixer described above, then:
if (sampleValue > limitBegin)
{
    float overLimit = (sampleValue - limitBegin) / (float)(maxSampleValue - limitBegin);
    sampleValue = limitBegin + (int)(overLimit * (32767 - limitBegin));
}
If you're paying attention, you will have noticed that when numPlayingSounds changes (for example when a new sound starts), the limiter becomes more (or less) harsh and this may result in abrupt volume changes (within the limited range) to accommodate the extra sound. You can use the maximum number of playing sounds instead, or devise some clever way to ramp up the limiter over a few milliseconds.
Remember that this is operating on the absolute value of sampleValue (which may be negative in signed formats), so the code here is just to demonstrate the idea. You'll need to write it properly to handle limiting at both ends (peak and trough) of your sample range. Also, there are some tricks you can do to optimize all of the above during the mixing - you will probably spot these while you're writing the mixer, be careful and get it working first, then go back and refactor/optimize if needed.
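Here is one possible way to handle both ends of the range, sketched in Python. The function name limit and its parameters are illustrative assumptions, not part of the answer. It applies the same scaling as the snippet above to the absolute value and then restores the sign:

```python
def limit(sample_value, num_playing_sounds, full_scale=32767, knee=0.9):
    """Soft-limit a summed sample so its magnitude never exceeds full_scale."""
    limit_begin = int(full_scale * knee)             # 29490 for 16-bit audio
    max_sample_value = full_scale * num_playing_sounds
    magnitude = abs(sample_value)
    if magnitude <= limit_begin or max_sample_value <= limit_begin:
        return sample_value                          # below the knee: untouched
    # Map (limit_begin, max_sample_value] onto (limit_begin, full_scale].
    over_limit = (magnitude - limit_begin) / (max_sample_value - limit_begin)
    limited = limit_begin + int(over_limit * (full_scale - limit_begin))
    return limited if sample_value >= 0 else -limited

peak = limit(65534, 2)       # two full-scale sounds summed
trough = limit(-65534, 2)
```

Passing the maximum number of simultaneous sounds as num_playing_sounds, rather than the current count, avoids the abrupt changes mentioned above.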
Also remember to consider the endian-ness of the platform you are using and the file-format you are writing to, as you may need to do some byte-swapping.
One approach which isn't too hard if your files are stored in a simple format is just to combine them together manually. That is, create a new file with the caf format and manually put together the pieces you want.
This will be really easy if the sounds are uncompressed (linear PCM). But, read the documents on the caf file format here:
http://developer.apple.com/library/mac/#documentation/MusicAudio/Reference/CAFSpec/CAF_spec/CAF_spec.html#//apple_ref/doc/uid/TP40001862-CH210-SW1

iPhone audio and AFSK

Here is a question for all you iPhone experts:
If you guys remember the sounds that modems used to make, or when one was trying to load a program from a cassette tape – I am trying to replicate this in an iPhone for a ham radio application. I have a stream of data (ASCII) and I need to encode it as AFSK at 1200 baud. So basically everything in the stream is converted to a series of 1200 and 2200 Hz tones. It needs to sound something like this: http://upload.wikimedia.org/wikipedia/commons/2/27/AFSK_1200_baud.ogg
I successfully built a bit array out of the string, but when I try to assign tones to each bit I get gaps in the sound, therefore it doesn’t demodulate correctly.
Any thought of how one should tackle this problem? Thank you.
The mobilesynth project is open-source. You might be able to scan that for code that generates the tones you need.
How are you assigning tones to the bits? Remember, a digital audio signal is just a stream of samples with values between -1 and 1. Perhaps there is a clipping issue between tone assignments. This can happen if the signal dives below -1 or above 1. If it stays above or below this range at a constant value, there will be no sound. Maybe you could output your stream of samples to check if this is the case. Or plug the output into an oscilloscope...
Also note that clicking can occur at "uneven" transitions between signals. For example, if I output a sample with value 1 followed immediately by a sample with value -1, a click or pop will be produced.
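One common way to avoid clicks and gaps at bit boundaries is to keep a single running phase and change only the phase increment when the bit changes. The Python sketch below illustrates that idea; it is not iPhone code. The function name afsk_samples, the 48 kHz sample rate, and the mapping of 1 to 1200 Hz and 0 to 2200 Hz are assumptions, so check the convention your protocol expects:

```python
import math

def afsk_samples(bits, sample_rate=48000, baud=1200,
                 mark_hz=1200.0, space_hz=2200.0):
    """Continuous-phase AFSK: one float sample list for a sequence of bits."""
    samples = []
    phase = 0.0
    for n, bit in enumerate(bits):
        freq = mark_hz if bit else space_hz        # assumed bit-to-tone mapping
        step = 2.0 * math.pi * freq / sample_rate
        # Bit boundaries are derived from the bit index so that rounding
        # never accumulates, even when sample_rate / baud is not an integer.
        start = round(n * sample_rate / baud)
        end = round((n + 1) * sample_rate / baud)
        for _ in range(end - start):
            samples.append(math.sin(phase))
            phase = (phase + step) % (2.0 * math.pi)   # phase carries over
    return samples

tone = afsk_samples([1, 0, 1, 1])
```

Since the phase is never reset, consecutive samples never jump by more than one phase step, so the waveform stays smooth across tone changes.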

How to lower sound on the iphone's sdk Audioqueue?

I'm using Aran Mulholland's RemoteIOPlayer, using audioqueues in the iPhone SDK.
Without problems, I can:
- add two signals to mix sounds
- increase the sound volume by multiplying the UInt32 I get from the wav files
BUT every other operation gives me warped and distorted sound, and in particular I can't divide the signal. I can't seem to figure out what I'm doing wrong, the actual result of the division seems fine; some aspect of sound / signal processing must obviously be eluding me :)
Any help appreciated !
Have you tried something like this?
- (void)setQueue:(AudioQueueRef)ref toVolume:(float)newValue {
    // kAudioQueueParam_Volume takes a linear gain from 0.0 (silence) to 1.0 (full volume).
    OSStatus rc = AudioQueueSetParameter(ref, kAudioQueueParam_Volume, newValue);
    if (rc) {
        NSLog(@"AudioQueueSetParameter returned %d when setting the volume.\n", (int)rc);
    }
}
First of all, the code you mention does not use AudioQueues; it uses AudioUnits. The best way to mix audio on the iPhone is to use the built-in mixer units; there is some code on the site you downloaded your original example from here. Other than that, what I would check in your code is that you have the correct data type. Are you performing your operations on unsigned ints when you should be using signed ones? That often produces warped results (understandably).
The iPhone handles audio as 16-bit integer. Most audio files are already normalized so that the peak sample values are the maximum that fit in a 16-bit signed integer. That means if you add two such samples together, you get overflow, or in this case, audio clipping. If you want to mix two audio sources together and ensure there's no clipping, you must average the samples: add them together and divide by two. Or you set the volume to half. If you're using decibels, that would be about a -6 dB change.
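To illustrate the signed-versus-unsigned point, here is a small Python sketch. It assumes that each UInt32 holds one stereo frame, i.e. two little-endian signed 16-bit samples; that layout is a guess about the question's data, and the function name scale_stereo_frame is made up. Dividing the packed 32-bit value directly mangles both channels, whereas unpacking, scaling each sample as a signed number, and repacking gives the expected result:

```python
import struct

def scale_stereo_frame(frame_u32, gain):
    """Scale both 16-bit samples packed in one 32-bit stereo frame."""
    left, right = struct.unpack("<hh", struct.pack("<I", frame_u32))

    def scale(value):
        # Work on the signed sample, then clamp to the 16-bit range.
        return max(-32768, min(32767, int(value * gain)))

    packed = struct.pack("<hh", scale(left), scale(right))
    return struct.unpack("<I", packed)[0]

# One frame holding left = -1000, right = 2000.
frame = struct.unpack("<I", struct.pack("<hh", -1000, 2000))[0]

halved = scale_stereo_frame(frame, 0.5)      # about -6 dB
naive = frame // 2                           # divides the packed bits instead

halved_samples = struct.unpack("<hh", struct.pack("<I", halved))
naive_samples = struct.unpack("<hh", struct.pack("<I", naive))
print(halved_samples, naive_samples)
```

Multiplying the packed value can appear to work for small gains on quiet material, which may be why only division seemed broken.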