Bandwidth from headphone/microphone jack - iphone

I got interested in this after I saw Square use the headphone jack on the iPhone to send credit card data.
What's the average bandwidth of the headphone jack on the iPhone, average notebook, and average mobile device?
Can it be doubled by sending different data streams on the different channels (left/right)?

One issue is the bandwidth of audio cables, which I won't go into here. As for audio ports, assume a sound card with a maximum sample rate of 44,100 or 48,000 samples/s at 16 bits/sample/channel. That gives a maximum bandwidth of 22.05 or 24 kHz (basically a result of the Nyquist-Shannon sampling theorem, though strictly the sampled signal would also have to be continuous-amplitude for the theorem to apply) and a transfer rate of 176.4 or 192 kBps for stereo.
According to Studio Six Digital, the line-in on the iPhone supports a max sample rate of 48 kHz. The mic on the 3G version also runs at 48 kHz, while the 1st gen iPhone's mic sampled at 8 kHz. I haven't been able to find bit depth specs for the iPhone, but I believe it uses 16-bit samples. 24-bit samples are the other possibility.
According to Fortuny over at the Apple forums, who was quoting an Apple Audio Developer Note, the line-in on a MacBook supports up to 24-bit samples at a 96 kHz sample rate, for a data rate of 576 kBps. Apple's MacBook External Ports and Connectors page lists the max sample rate as 192 kHz, but they may have confused that with the max sample rate for digital audio over the optical port.
For a rate comparison, phone systems had a sample rate of 8 kHz at 8 bits/sample mono, resulting in a max data rate of 8 kBps (64 kbps). FM has a sample rate of 22.05 kHz at 16 bits/sample/channel and is stereo, resulting in a data rate of 88.2 kBps.
Of course, the above calculations ignore the problem of synchronizing the data stream and error detection and correction, all of which will consume a portion of the signal.
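The rate arithmetic in this answer can be checked with a few lines of Python. This is only a sketch of the raw PCM calculation (sample rate × bytes per sample × channels); the function names are made up for illustration, and it ignores synchronization and error-correction overhead just as the figures above do.

```python
def pcm_data_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM data rate in bytes per second (no framing or error correction)."""
    return sample_rate_hz * bits_per_sample * channels / 8


def nyquist_bandwidth(sample_rate_hz):
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2


# Configurations mentioned in the answer: (label, sample rate, bits, channels)
configs = [
    ("CD-rate stereo", 44100, 16, 2),
    ("48 kHz stereo", 48000, 16, 2),
    ("MacBook line-in", 96000, 24, 2),
    ("Telephone", 8000, 8, 1),
]

for label, rate, bits, channels in configs:
    kbytes = pcm_data_rate(rate, bits, channels) / 1000
    print(f"{label}: {kbytes} kBps, bandwidth {nyquist_bandwidth(rate) / 1000} kHz")
```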

The typical audio device maximum is 48 kHz stereo, and lots of devices can handle 96 kHz.
But of course what comes out of the headphone jack is analog, not digital, and it runs through some filters on the way out as well, so some sort of tone modulation is the way to go. There may be some crosstalk between the stereo channels; how much will be very device dependent.
Old-style telephone modems could send 9600 baud over standard analog lines that aren't even as clean as your typical headphone jack. And that's MONO. I would think you could get 2400 baud per channel without working too hard.
You might be able to go as high as 100K baud if you were very clever at signal processing.
Credit card validation systems were designed to run at 2400 baud mono last time I looked at them. It wouldn't surprise me if they still were, given how much inertia there is in point-of-purchase systems.

I'm not sure if this is correct for all systems, but almost all (if not all) sampling systems use a 1-bit delta modulation scheme that is most likely embedded in the DSP chipset on most portable units. The decimation (changing 1 bit to 16, 20 or 24 bit) is done in software, and so are the anti-aliasing filters. Mind you, these DSP chips are being optimized in hardware to reduce energy consumption, so there may be a limit to what they can produce via software.
As for Nyquist limitations, these don't really come into play when transferring digital information over well-controlled data paths. Modems use a lot of DSP to achieve a higher data rate through phase-shift keying, which measures the phase shift relative to the carrier timing and can resolve much finer increments than the factor of two implied by the Nyquist limit (sampling at 44 kHz while producing data at 20 kHz). The DSP can detect a 10 or 20 degree shift in the carrier phase rather than only a 180 degree shift, because it has a reference signal to compare against.
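To make the phase-shift keying idea concrete, here is a minimal 8-PSK sketch in plain Python. Everything in it (sample rate, carrier frequency, symbol length, function names) is an assumption chosen for illustration, not how any particular modem works. Each symbol shifts the carrier phase in 45 degree steps, and the receiver recovers the phase by correlating against reference cosine and sine waves, which is the "reference signal to compare with" mentioned above.

```python
import math

SAMPLE_RATE = 44100        # samples per second (assumed)
CARRIER_HZ = 4410          # carrier: exactly 10 samples per cycle (assumed)
SAMPLES_PER_SYMBOL = 100   # 10 whole carrier cycles per symbol (assumed)
PHASES = 8                 # 8-PSK: 3 bits per symbol, 45 degrees apart


def modulate(symbols):
    """Turn symbols (0..PHASES-1) into carrier samples with shifted phase."""
    samples = []
    for k, sym in enumerate(symbols):
        phase = 2 * math.pi * sym / PHASES
        for n in range(SAMPLES_PER_SYMBOL):
            t = (k * SAMPLES_PER_SYMBOL + n) / SAMPLE_RATE
            samples.append(math.cos(2 * math.pi * CARRIER_HZ * t + phase))
    return samples


def demodulate(samples):
    """Recover symbols by correlating each block with the reference carrier."""
    symbols = []
    last_start = len(samples) - SAMPLES_PER_SYMBOL
    for start in range(0, last_start + 1, SAMPLES_PER_SYMBOL):
        i_sum = 0.0  # in-phase component (correlation with cosine)
        q_sum = 0.0  # quadrature component (correlation with -sine)
        for n in range(SAMPLES_PER_SYMBOL):
            t = (start + n) / SAMPLE_RATE
            s = samples[start + n]
            i_sum += s * math.cos(2 * math.pi * CARRIER_HZ * t)
            q_sum -= s * math.sin(2 * math.pi * CARRIER_HZ * t)
        phase = math.atan2(q_sum, i_sum)
        # Snap the measured phase to the nearest of the PHASES steps
        symbols.append(round(phase / (2 * math.pi / PHASES)) % PHASES)
    return symbols


data = [0, 1, 2, 3, 4, 5, 6, 7]
print(demodulate(modulate(data)))
```

A real link would also need carrier and symbol timing recovery, since the receiver does not share a clock with the sender; this sketch assumes perfect synchronization and a noiseless channel.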
Also, the data flow is all broadband spread-spectrum encoded, which increases density a whole bunch (look up Jesse Russell for broadband and Hedy Lamarr for spread spectrum).
My laptop does 192 kHz at 24 bit (Dell xrs/14z), or so they say. I usually transfer my audio via network connection to my main studio PC, which has an ADAT optical link to a remote unit, so I get superior noise and crosstalk levels. Laptops and smartphones are full of digital noise and are physically too small to reduce these issues. Until they get digital headphones (not likely soon), one has to use discrete systems like they do in professional recording studios.

I've put together a library to answer this question for myself. The iPhone has a pretty typical cutoff of around 20 kHz, so the data rate you can achieve just depends on how good your SNR is. The relevant theory is the Shannon-Hartley theorem. I've managed to hit roughly 64 kbps with this library, and I think more is possible with better tuning.
If you'd like to see the library, it's https://github.com/quiet/quiet
Live demo: https://quiet.github.io/quiet-js/lab.html
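The capacity bound this answer alludes to is usually written as the Shannon-Hartley formula, C = B · log2(1 + SNR). The sketch below just evaluates that formula for a 20 kHz channel; the function names are made up, and the SNR values are example inputs rather than measurements of any device.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_db):
    """Upper bound on error-free bit rate for a channel with Gaussian noise."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


def required_snr_db(bandwidth_hz, target_bps):
    """Minimum SNR (in dB) at which the bound allows the target bit rate."""
    snr_linear = 2 ** (target_bps / bandwidth_hz) - 1
    return 10 * math.log10(snr_linear)


bandwidth = 20000  # Hz, the approximate cutoff quoted above
for snr_db in (10, 20, 30, 40):
    kbps = shannon_capacity_bps(bandwidth, snr_db) / 1000
    print(f"SNR {snr_db} dB: at most {kbps:.1f} kbps")

print(f"64 kbps needs at least {required_snr_db(bandwidth, 64000):.1f} dB SNR")
```

This is a theoretical ceiling; practical modems land below it because of imperfect coding, filtering, and synchronization overhead.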

20 kHz is pretty much the max on any circuit intended to carry audio, because it's pretty much the top of the human ear's frequency response. Given the Nyquist limit, you're probably looking at 10 Kb/sec tops. Of course, Back In The Day(TM), we thought 9600 b/s was high speed, so it might be good enough. And yes, you could double it using stereo output.

Related

Advice on converting ultrasonic rat call signal into human audible range with matlab

I'm doing a project studying rats that squeak in the ultrasonic range (20 kHz to 100 kHz) using Matlab software and sound files.
I have (or will be getting) a couple of .wav audio signals of these rats speaking, and alongside general analysis of these waveforms, I also want to convert these ultrasonic signals (outside of our hearing) into the human audible range (20 Hz to 20 kHz).
Could I get some advice on how to do this conversion (via Matlab programming and not by using equipment)?
Looking into this, I've found names such as:
-frequency division
-heterodyning
-envelope detection
-time expansion
But looking into these, it seems they are either explained in terms of what the equipment (bat detectors) does, or they sound incredibly similar to each other, e.g. frequency division and time expansion both involve dividing the incoming signal by 10.
Since I am looking into what seems to be unfamiliar turf, it would be great to find multiple ways to convert the signal (to my knowledge, the names above each have their own associated positive and negative traits).
Your question is a signal processing question more than a Matlab question, which isn't really what Stack Overflow is about, so you might get some negative votes.
There are indeed a number of methods of changing the frequency of audio (or any signals):
1) Slow it Down: The least disruptive to the signal is simply to slow down the audio. If you are looking to have rat signals up to 100 kHz, you'll need to sample the audio at 200 kHz or greater. Once you have your recording, simply re-save the wav file telling it that the sample rate is 44.1 kHz (or whatever). This will play it more slowly, but all the frequencies will now be audible (unlike the single side band demodulation discussed below). This is definitely the place you should start...it's the easiest and will sound the best.
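fs_orig = 200e3; %your original sample rate (for reference only)
S = load('myFile.mat'); %load returns a struct holding the saved variables
myAudio = S.myAudio; %assumes the audio was saved under the variable name myAudio
fs = 44.1e3; %simply declare that you want a lower sample rate
audiowrite('myFile_44kHz.wav',myAudio,fs,'BitsPerSample',16); %save it out at the new rate (use wavwrite on older releases)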
2) Single-Side Band: Use the demod command to "demodulate" the signal to lower its frequency. There are a number of demodulation methods available with this command. I'd use "single side band (suppressed carrier)" because that is how the rat itself (and humans) create sound. To do the demodulation, you'll have to assume a "carrier frequency", as if it were a radio signal. If the lowest frequency of a rat squeak is 20 kHz, you can assume a carrier of 20 kHz. This operation will shift all of your audio down by 20 kHz. As a result, the squeak that was originally 20-100 kHz will now be 0-80 kHz. So you won't hear the whole thing, but you'll hear part of it.
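fs = 200e3; %your original sample rate
S = load('myFile.mat'); %load returns a struct holding the saved variables
myAudio = S.myAudio; %assumes the audio was saved under the variable name myAudio
[b,a] = butter(2,20e3/(fs/2),'high'); %define a 2nd-order highpass filter at 20 kHz
myAudio = filtfilt(b,a,myAudio); %remove the low frequencies (zero-phase filtering)
myAudio = demod(myAudio,20e3,fs,'amssb'); %shift it down 20 kHz
audiowrite('myWave_shifted.wav',myAudio,fs,'BitsPerSample',16); %save it out (use wavwrite on older releases)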
3) Phase Vocoder (or other Pitch Shifting): To hear the whole 20-100 kHz range (an 80 kHz bandwidth, which is 4x bigger than the 20 kHz bandwidth of human hearing), you've got to go to more extreme methods. These methods will make the audio sound bizarre, but you can give it a try. There are several algorithms. Look up "phase vocoder". Or use one of the audio processing software packages like Audacity, Raven, etc.

iPhone4s, iPhone 5 max FFTs per second using vDSP

My team and I are planning to build an external accessory for iOS that will sample ultrasonic sound at 256 kHz. That's a lot, and I am wondering whether iOS vDSP can do the conversion from time domain to frequency domain for 256,000 samples/sec, or whether we need a hardware-based solution for the FFT.
Sample projects from Apple such as aurioTouch are very helpful, but I couldn't find one that deals with sampling rates higher than professional audio sampling frequencies. I need help figuring out the following:
Can vDSP FFTs process 256,000 samples/second? If not, are there any other creative ways to do the same, aside from doing the conversion in hardware?
The closest discussion I found related to this is
How many FFTs per second can I do on my smartphone? (for performing voice recognition)
A 256 kHz data rate is less than 6 times faster than normal 44100 audio. And float FFTs of real-time audio data using the vDSP/Accelerate framework use only in the neighborhood of 1% or less of 1 CPU on recent iOS devices.
The FFT computation time will be a tiny portion of the time available.
Source: I wrote the vDSP FFTs.
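The back-of-the-envelope reasoning in this answer can be written out as a short calculation. This is only a sketch: the frame size and overlap are assumed example values, the function names are hypothetical, and nothing here measures vDSP itself.

```python
def rate_ratio(new_rate_hz, baseline_hz=44100):
    """How many times faster the new sample rate is than the baseline."""
    return new_rate_hz / baseline_hz


def ffts_per_second(sample_rate_hz, fft_size, overlap=0.0):
    """Number of FFT frames needed per second to keep up with the input."""
    hop = fft_size * (1 - overlap)  # new samples consumed per frame
    return sample_rate_hz / hop


def time_budget_ms(sample_rate_hz, fft_size, overlap=0.0):
    """Time available per FFT frame before the next one is due."""
    return 1000 / ffts_per_second(sample_rate_hz, fft_size, overlap)


sample_rate = 256000
fft_size = 1024  # assumed frame size

print(f"Rate ratio vs 44.1 kHz: {rate_ratio(sample_rate):.2f}x")
print(f"FFTs per second: {ffts_per_second(sample_rate, fft_size):.0f}")
print(f"Budget per FFT: {time_budget_ms(sample_rate, fft_size):.2f} ms")
```

If the roughly 1% CPU figure quoted above scales linearly with sample rate, the load at 256 kHz would still be in the single-digit percent range, but that is an extrapolation worth measuring on the target device.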
Why not see how the devices handle upsampled signals, starting with aurioTouch?
If you need it faster, you should measure the speed of an integer-based FFT implementation.

Ultrasound iphone (Shopkick signal technology)

I think Shopkick is detecting a very high frequency signal which is not audible to the human ear. But the real question is how they can detect a signal of more than 22 kHz on the iPhone. I have checked the frequency response of the iPhone mic, and it seems to run from 20 Hz to 22 kHz, within the human audible range.
http://blog.faberacoustical.com/2009/iphone/iphone-microphone-frequency-response-comparison/ http://www.businessinsider.com/shopkick-crate-barrel-2010-12?op=1
Can you guide me on this? If it is possible with the iPhone mic, then we should be able to do some signal processing, specifically an FFT, in order to get the frequency.
Well, I am currently working on a similar system for transmitting data using these high frequencies, and this is what I found out. Keep in mind, though, that I am doing this with Android phones, mostly the Galaxy S line.
First of all, the spectrum from 20 kHz to 22 kHz seems quite promising because it can be detected by all the phones we tested and even reproduced by some of them. These frequencies are inaudible to humans of any age, and even dogs and cats seem not to notice them. If your goal is to avoid detection by humans, you could even go as low as 18 kHz since most people wouldn't hear that. This gives you a bandwidth of 4000 Hz into which you can frequency-modulate data. Of course, don't expect to transmit 8 MP images, but some small data can be transmitted. You are right that you could then use an FFT to transform into the frequency domain and analyse those frequencies; this can be done even on older phones in Java (I think doing it in Objective-C would be even faster).
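As a rough illustration of detecting data tones in the 18-22 kHz band, here is a sketch that uses the Goertzel algorithm instead of a full FFT. That is a substitution on my part: Goertzel computes the power at a single frequency, which is enough when you only care about a couple of known tones. The two tone frequencies, the window length, and the function names are all assumptions for the example.

```python
import math


def goertzel_power(samples, sample_rate_hz, target_hz):
    """Signal power at the DFT bin nearest target_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate_hz)  # nearest DFT bin index
    coeff = 2 * math.cos(2 * math.pi * k / n)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_bit(samples, sample_rate_hz, f0_hz=19000, f1_hz=21000):
    """Binary FSK decision: 1 if the f1 tone is stronger than the f0 tone."""
    p0 = goertzel_power(samples, sample_rate_hz, f0_hz)
    p1 = goertzel_power(samples, sample_rate_hz, f1_hz)
    return 1 if p1 > p0 else 0


def tone(freq_hz, sample_rate_hz, count):
    """Generate a test sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(count)]


fs = 44100
window = 441  # 10 ms window, so bins are 100 Hz apart
print(detect_bit(tone(19000, fs, window), fs))
print(detect_bit(tone(21000, fs, window), fs))
```

On a real phone the microphone response near 20 kHz is uneven, so the decision threshold would likely need calibration per device.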
Also, if you have a few iPhones at your disposal, you could install any frequency analyser and play the frequencies you want on another iPhone or some speaker to test what they can detect. Just keep in mind that standard desktop speakers would probably be able to play the given frequencies but will introduce lower-frequency noise. Piezo tweeters are probably best for these types of sounds, although I must say I am using an iPhone 4 to play these frequencies for testing quite effectively.
I read somewhere that Shopkick now even plays their sound codes over stores' PA systems, and since those speakers are not really optimised for response above 20 kHz, I too am starting to suspect they are using frequencies below that. Take a look at this website for different store codes that some people are using to cheat the system: http://www.ceploitips.com/2011/03/shopkick-walk-in-files.html
Keep in mind that using these might get your account banned, since they have improved their misuse detection algorithms.
I too would like to read more about the Shopkick implementation, so if anyone viewing this has a link, please share.
First, human hearing pretty much tops out at 20 kHz, and even that requires a very young human and a very low and erratic shift along those upper frequencies. For example, I can produce a tone as low as 18 kHz at full iPad volume at a sample rate of 48 kHz that even my dog doesn't notice. Read up on psychoacoustics and you will see that humans filter out echoes even at very low frequencies; they are there, but we don't notice them.
But in the case of Shopkick, I don't think they are going above even 21 kHz. I have created several digital audio modulations on the iPhone, and 21 kHz seems to be the upper limit for any distance at all.
It would help if you gave more input on what you are doing. I assume from the question you want to modulate a digital signal between two devices.
My best guess is that they are using maximal length sequences. These are almost like a weak background hiss that covers a large range of the audio spectrum. The key to detection is that the pattern repeats exactly and the phone has a key that detects the sound by correlating the key and the incoming audio.
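The correlation idea in this guess can be sketched in plain Python. This is not Shopkick's actual scheme, which is unknown here. It only shows how a maximal length sequence (generated by a 7-bit linear feedback shift register, with taps chosen for the example) can be located inside noisy audio by sliding the known key across the recording and looking for the correlation peak. All names and parameters are illustrative.

```python
import random


def mls(n_bits=7, taps=(7, 6)):
    """Maximal length sequence of 2**n_bits - 1 chips, mapped to +1/-1."""
    state = [1] * n_bits  # any non-zero starting state works
    bits = []
    for _ in range(2 ** n_bits - 1):
        bits.append(state[-1])
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]  # shift the register by one
    return [1.0 if b else -1.0 for b in bits]


def find_offset(audio, key):
    """Slide the key over the audio; return (best offset, correlation score)."""
    best_offset = 0
    best_score = float("-inf")
    for offset in range(len(audio) - len(key) + 1):
        score = sum(a * k for a, k in zip(audio[offset:offset + len(key)], key))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


key = mls()
rng = random.Random(0)                           # fixed seed for repeatability
audio = [rng.gauss(0, 1) for _ in range(500)]    # background noise
true_offset = 200
for i, chip in enumerate(key):
    audio[true_offset + i] += chip               # bury the sequence in the noise

offset, score = find_offset(audio, key)
print(offset, round(score, 1))
```

The sequence here is at the same level as the noise, yet the correlation peak stands well clear of the background because all 127 chips add up coherently at the correct alignment.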

Accelerometer to relative position

Before I reinvent the wheel I wanted to see if anyone can share code or tips for the following:
In order to get the relative position of the iPhone, one needs to:
1) Set the accelerometer read rate
2) Noise filter the accelerometer response
3) Convert it to a vector
4) Low pass filter the vector to find gravity
5) Subtract gravity from the raw reading to find the user-caused acceleration
6) Filter the user-caused acceleration to get the frequencies you are interested in (probably bandpass, depending on the application)
7) Integrate to find relative speed
8) Integrate to find position
So what I'm hoping is that people have already written some or all of the above and can provide tips, or better yet code.
A few questions I haven't found the answer to:
What is the frequency response of the iPhone accelerometer? What hardware filters exist between the accelerometer and the analog to digital converter?
What is the fastest reading rate the accelerometer delegate can be called without duplicating reading values?
Differences in the above for the various phones?
Any good tips for designing the filters, such as cutoff frequency for separating gravity and user motion?
Any code or tips for the integration steps? Any reason to integrate in the Cartesian coordinate system rather than as a vector, or vice versa?
Any other experiences, tips, or information that one should know prior to implementing this?
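For reference, the core of the pipeline listed in the question (low-pass to estimate gravity, subtract, integrate twice) can be sketched as below. This is a minimal illustration in Python rather than iOS code. The smoothing factor `alpha`, the function name, and the sample data are assumptions, and the noise filtering and bandpass steps are left out. Double integration accumulates error quickly, so expect the position estimate to drift on real sensor data.

```python
def estimate_position(samples, dt, alpha=0.9):
    """Integrate accelerometer samples into a relative position track.

    samples: list of (ax, ay, az) readings in m/s^2, gravity included
    dt:      time between samples in seconds
    alpha:   smoothing factor of the gravity low-pass filter (assumed value)
    """
    gravity = list(samples[0])   # start the gravity estimate at the first reading
    velocity = [0.0, 0.0, 0.0]
    position = [0.0, 0.0, 0.0]
    track = []
    for raw in samples:
        for i in range(3):
            # Single-pole low-pass filter tracks the slowly changing gravity
            gravity[i] += (1 - alpha) * (raw[i] - gravity[i])
            # What is left after removing gravity is the user-caused acceleration
            user_accel = raw[i] - gravity[i]
            # Simple rectangular integration: acceleration -> speed -> position
            velocity[i] += user_accel * dt
            position[i] += velocity[i] * dt
        track.append(tuple(position))
    return track


dt = 0.01  # 100 Hz sampling
resting = [(0.0, 0.0, 9.81)] * 50
pushed = [(1.0, 0.0, 9.81)] * 20   # brief push along the x axis
track = estimate_position(resting + pushed, dt)
print(track[-1])
```

Note that the same low-pass filter that finds gravity also slowly absorbs any sustained user acceleration, which is one reason the cutoff choice asked about above matters.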
As I find information out, I'll be collecting it in this answer.
Hardware
The 3GS uses an ST LIS331DL 3-axis ±2g/±8g digital accelerometer.
The iPhone 4 and iPad use an ST LIS331DLH 3-axis ±2g/±4g/±8g digital accelerometer.
They are both capable of being read at 100Hz and 400Hz, although on the iPhone 3G (under iOS 4.1) the accelerometer delegate is not called more frequently than 100Hz even if setUpdateInterval is set for faster updates. I do not know if the API permits faster updates on the iPhone 4, and Apple's documentation merely states that the maximum is determined by the hardware of the iPhone. (TBD)
The A/D converter is on the same silicon as the MEMS sensor, which is good for noise immunity.
The DL version is 8 bits (3GS) while the DLH version is 12 bits (iPhone 4). The maximum bias (offset) in the DL version is twice the bias of the DLH (0.04g vs 0.02g) version.
The data sheet for the DLH reports acceleration noise density, but that value is not reported on the DL datasheet. Noise density is reasonably low at 218 μg/√Hz for the DLH.
Both sensors give either 100Hz sampling or 400Hz sampling speeds, with no custom rate. The sensor discards values if the iPhone doesn't read the output register at the set sampling rate.
The "typical" full scale value for the DL sensor is ±2.3g, but ST only guarantees that it's at least ±2g.
Temperature effects on the sensor are present and measurable, but not very significant.
TBD:
Is the hardware filter turned on, and what are the filtering characteristics?
How noisy is the power supply to the accelerometer? (Anybody just happen to have the iPhone schematic lying around?)
The accelerometer uses an internal clock to provide timing for the sample rate and A/D conversion. The datasheet does not indicate the accuracy, precision, or temperature sensitivity of this clock. For accurate time analysis the iPhone must use an interrupt to sense when a sample is done and record the time in the interrupt. (whether this is done or not is unknown, but it's the only way to get accurate timing information)
API
Requesting lower than 100Hz sampling rates results in getting selected samples, while discarding the rest. If a sampling rate that is not a factor of 100Hz is requested in software, the time intervals between real sensor readings cannot be even. Apple does not guarantee even sampling rates even if a factor of 100 is used.
It appears that the API provides no software filtering.
The API does scale the raw accelerometer value into a double representing Gs. The scaling factor used is unknown, as is whether it differs for each phone (i.e., is calibrated) and whether calibration occurs on an ongoing basis to account for sensor drift. Online reports seem to suggest that the iPhone does re-calibrate itself on occasion when it's lying flat on a surface.
Results from simple testing suggest that the API sets the sensor to ±2g for the 3GS, which is generally fine for handheld movements.
TBD:
Does Apple calibrate each unit so that the UIAccelerometer reports 1G as 1G? Apple's documentation specifically warns against using the device for sensitive measurement applications.
Does the reported NSTimeInterval represent when the values were read from the accelerometer, or when the accelerometer interrupt indicated that new values were ready?
I'm just dealing with the same problem. The only difference to your approach is that I don't want to rely on the low pass filter to find gravity. (TBH I don't see how I can reliably tell the gravity vector from the accelerometer readings)
Am trying it with the gyros right now.

How to differentiate between silence pattern and a beep pattern in sound signals in iPhone OS

I am doing a sound latency test. My device will be receiving either a beep signal or a silence signal. How can I differentiate between these signals? Please help me. Thanks in advance.
Look at around 10 ms worth of samples (e.g. 441 samples at 44.1 kHz) and measure the energy in that buffer. If it's above some threshold it's a signal and if it's below the threshold then it's silence.
To measure energy just sum the squared value of each sample in the buffer and divide by the number of samples.
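A minimal sketch of that energy test in Python is shown below. The threshold value and the function names are placeholders, and a real threshold would have to be tuned against the actual background noise level.

```python
import math


def mean_energy(buffer):
    """Average of the squared sample values in the buffer."""
    return sum(x * x for x in buffer) / len(buffer)


def classify(samples, sample_rate_hz=44100, window_ms=10, threshold=0.01):
    """Label each window of the signal as 'beep' or 'silence'."""
    window = int(sample_rate_hz * window_ms / 1000)  # 441 samples at 44.1 kHz
    labels = []
    for start in range(0, len(samples) - window + 1, window):
        energy = mean_energy(samples[start:start + window])
        labels.append("beep" if energy > threshold else "silence")
    return labels


fs = 44100
silence = [0.0] * 441
beep = [0.5 * math.sin(2 * math.pi * 1000 * i / fs) for i in range(441)]
print(classify(silence + beep + silence, fs))
```

The samples are assumed to be floats in the range -1 to 1. With 16-bit integer samples the threshold would need to be scaled accordingly.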
It depends. If the digital audio was generated synthetically (like by another function) and you can thus rely on the fact that, in one case, you'll get true digital silence (zeroed samples), then the solution is simply to test for the zeroed samples over the measurement window. Anything other than zero is not silence.
I would guess, though, that you're dealing with real-world audio recorded from, say, a microphone. If this is the case, then measuring the energy in a time window and comparing it to a threshold indeed makes sense. The two parameters that you'll have to determine are:
Threshold energy level
Length of the time window
If the threshold is too low, your false positive rate will be too high; background noise that is not a beep may be interpreted as a beep. Conversely, if your threshold is too high, your system could categorize a beep as noise. Luckily, if you're doing audio with a reasonably low background noise, your performance won't be very sensitive to this threshold.
Longer window lengths will decrease these false positive/negative rates, thus making your system more robust, but system usability may suffer with overly long windows. For instance, automated phone systems classify keypresses to aid menu navigation. If they required the user to hold each key for three seconds at a time, the accuracy would improve but at the expense of almost all usability.
I encourage you to NOT make a decision based solely on the one maximal sample as Paul suggested. Doing this completely undermines the resistance to false positives provided by the length of the sampling window.
What if they use the loopback method; is noise taken into account? For example, if they send a beep to the second device, loop it back to the sender, then send a silence packet and do the same, couldn't they measure the latency at the sender (provided they know the actual network latency)?