My team and I are planning to build an external accessory for iOS that will sample ultrasonic sound at 256KHZ. It's a lot and I am wondering whether iOS vDSP can do the conversion from time domain to frequency domain for 256,000 samples/sec, or we need to have a hardware based solution for the FFT.
Sample projects from Apple such as aurioTouch are very helpful but I couldn't find that deals with sampling rate more than the professional audio sampling frequency. I need help figuring out the following:
Can vDSP FFTs process 256,000 samples/second? If not, any other creative ways to do the same aside from doing the conversion in the hardware?
The closest discussion I found related to this is
How many FFTs per second can I do on my smartphone? (for performing voice recognition)
A 256 kHz data rate is less than 6 times faster than normal 44100 audio. And float FFTs of real-time audio data using the vDSP/Accelerate framework use only in the neighborhood of 1% or less of 1 CPU on recent iOS devices.
The FFT computation time will be a tiny portion of the time available.
Source: I wrote the vDSP FFTs.
Why not see how the devices handle upsampled signals, starting with aurioTouch.
If you need it faster, you should measure the speeds of an integer based FFT implementation.
Related
Does the Android/iOS accelerometer/gyro API use a different ADC sampling rate OR mean/filtering to perform resampling?
I am trying to find out which sampling rate is best for my application and my approach is to collect data at 100 Hz and resample at different frequencies and determine which frequency is good enough for my use case.
However, the performance of this on the real application could be closer to the one on the development environment if I knew how this sampling was implemented on the Android API/ core-motion API. Is the API simply changing the ADC sampling frequency or is it implementing a interpolation or mean filtering after sampling at a higher frequency? (Note: The answers for the Android / iOS might be different, and I do not have this level of detailed knowledge on either system.)
Any links/sources are appreciated.
I think shopkick is detecting very high frequency signal which is not audible to human ear.But the real question is how they can detect signal of more than 22khz in iphone. I have checked frequency response of iphone mic,it seems to be from 20 hz to 22 khz within the human audible range.
http://blog.faberacoustical.com/2009/iphone/iphone-microphone-frequency-response-comparison/ http://www.businessinsider.com/shopkick-crate-barrel-2010-12?op=1
Can you guide me on this. If it is possible with iphone mic,then we can able do some signal processing specifically FFT in order to get frequency.
Well I am currently working on a similar system of transmitting data using these high frequencies and this is what I found out. Al-thou keep in mind that I am doing this with Android phones, mostly Galaxy S line.
First of all spectrum of 20khz to 22khz seems quite promising because it can be detected by all phones we tested and even reproduced by some of them. These frequencies are inaudible to humans of any age and even the dogs and cats seem to not notice them. If you are targeting (actually avoiding) detection by humans you could even go to as low as 18khz since most people wouldn't hear that. This gives you a bandwidth of 4000hz which you can Frequency modulate a data into. Of course don't expect to transmit 8mp images but some small data can be transmitted. You are right in the part that you could than use FFT to transit into frequency domain and analyse those frequencies, this can be done even on older phones in Java (I think doing it in objective c would be even faster).
Also if you have few iPhones on your disposal you could install any frequency analyser and play the frequencies you want on another iPhone or some speaker to test what they can detect. Just keep in mind that standard desktop speakers would probably be able to play the given frequencies but will introduce noise of lower frequency. Piezo tweeters are probably best for these type of sounds al-thou I must say I am using iPhone 4 to play these frequencies for testing quete efficiently.
I read somewhere that Shopkick now even plays there sound codes over stores PA-s and since those speakers are not really optimised for above 20khz response I too am starting to suspect they are using frequencies below that. Take a look at this website for different store codes that some people are using to cheat the system http://www.ceploitips.com/2011/03/shopkick-walk-in-files.html
Keep in mind that using these might ban your account since they improved there misuse detection algorithms.
Also I too would like to read more about the Shopkick implementation so if anyone viewing this has some link please share.
First, human hearing pretty much tops out at 20 KHz and even that requires a very young human and a very low and erratic shift along those upper frequencies. For example, I can produce a tone as low as 18 KHz at full iPad volume at a sample rate of 48 KHz that even my dog doesn't notice. Read up on PsychoAcoustics and you will see that humans filter echoes at even very low frequencies that are there but we don't notice them.
But in the case of ShopKick, I don't think they are going above even 21 KHz. I have created several digital audio modulations on the iPhone and 21 KHz seems to be the upper limit for any distance at all.
It would help if you gave more input on what you are doing. I assume from the question you want to modulate a digital signal between two devices.
My best guess is that they are using maximal length sequences. These are almost like a weak background hiss that covers a large range of the audio spectrum. The key to detection is that the pattern repeats exactly and the phone has a key that detects the sound by correlating the key and the incoming audio.
Before I reinvent the wheel I wanted to see if anyone can share code or tips for the following:
In order to get relative position of the iPhone, one needs to
Set the accelerometer read rate
Noise filter the accelerometer response
Convert it to a vector
Low pass filter the vector to find gravity
Subtract gravity from the raw reading to find the user caused acceleration
Filter the user caused acceleration to get the frequencies you are interested in ( probably bandpass depending on the application)
Integrate to find relative speed
Integrate to find position
So what I'm hoping is that people have already written some or all of the above and can provide tips, or better yet code.
A few questions I haven't found the answer to:
What is the frequency response of the iPhone accelerometer? What hardware filters exist between the accelerometer and the analog to digital converter?
What is the fastest reading rate the accelerometer delegate can be called without duplicating reading values?
Differences in the above for the various phones?
Any good tips for designing the filters, such as cutoff frequency for separating gravity and user motion?
Any code or tips for the integration steps? Any reason to integrate in the cartesion coordinate system rather than as vector, or vise versa?
Any other experiences, tips, or information that one should know prior to implementing this?
As I find information out, I'll be collecting it in this answer.
Hardware
The 3GS uses an ST LIS331DL 3-axis ±2g/±8g digital accelerometer.
The iPhone 4 and iPad use an ST LIS331DLH 3-axis ±2g/±4g/±8g digital accelerometer.
They are both capable of being read at 100Hz and 400Hz, although on the iPhone 3G (under iOS 4.1) the accelerometer delegate is not called more frequently than 100Hz even if setUpdateInterval is set for faster updates. I do not know if the API permits faster updates on the iPhone 4, and Apple's documentation merely states that the maximum is determined by the hardware of the iPhone. (TBD)
The A/D converter is on the same silicon as the MEM sensor, which is good for noise immunity.
The DL version is 8 bits (3GS) while the DLH version is 12 bits (iPhone 4). The maximum bias (offset) in the DL version is twice the bias of the DLH (0.04g vs 0.02g) version.
The data sheet for the DLH reports acceleration noise density, but that value is not reported on the DL datasheet. Noise density is reasonably low at 218 μg/√Hz for the DLH.
Both sensors give either 100Hz sampling or 400Hz sampling speeds, with no custom rate. The sensor discards values if the iPhone doesn't read the output register at the set sampling rate.
The "typical" full scale value for the DL sensor is ±2.3g, but ST only guarantees that it's at least ±2g.
Temperature effects on the sensor are present and measurable, but not very significant.
TBD:
Is the hardware filter turned on, and what are the filtering characteristics?
How noisy is the power supply to the accelerometer? (Anybody just happen to have the iPhone schematic laying around?)
The accelerometer uses an internal clock to provide timing for the sample rate and A/D conversion. The datasheet does not indicate the accuracy, precision, or temperature sensitivity of this clock. For accurate time analysis the iPhone must use an interrupt to sense when a sample is done and record the time in the interrupt. (whether this is done or not is unknown, but it's the only way to get accurate timing information)
API
Requesting lower than 100Hz sampling rates results in getting selected samples, while discarding the rest. If a sampling rate that is not a factor of 100Hz is requested in software, the time intervals between real sensor readings cannot be even. Apple does not guarantee even sampling rates even if a factor of 100 is used.
It appears that the API provides no software filtering.
The API does scale the raw accelerometer value into a double representing Gs. The scaling factor used is unknown, and whether this is different for each phone (ie, calibrated) and whether the calibration occurs on an ongoing basis to account fo sensor drift is unknown. Online reports seem to suggest that the iPhone does re-calibrate itself on occasion when it's lying flat on a surface.
Results from simple testing suggest that the API sets the sensor to ±2g for the 3GS, which is generally fine for handheld movements.
TBD:
Does Apple calibrate each unit so that the UIAccelerometer reports 1G as 1G? Apple's documentation specifically warns against using the device for sensitive measurement applications.
Does the reported NSTimeInterval represent when the values were read from the accelerometer, or when the accelerometer interrupt indicated that new values were ready?
I'm just dealing with the same problem. The only difference to your approach is that I don't want to rely on the low pass filter to find gravity. (TBH I don't see how I can reliably tell the gravity vector from the accelerometer readings)
Am trying it with the gyros right now.
I am new to iphone development.I am doing research on voice recording in iphone .I have downloaded the "speak here " sample program from Apple.I want to determine the frequency of my voice that is recorded in iphone.Please guide me .Please help me out.Thanks.
In the context of processing human speech, there's really no such thing as "the" frequency.
The signal will be a mix of many different frequencies, so it might be more fruitful to think in terms of a spectrum, rather than a single frequency. Even if you're talking about
a sustained musical note with a fixed pitch, there will be plenty of overtones and harmonics present, in addition to the fundamental frequency of the note. And for actual speech,
the frequency spectrum will change drastically even within a short clip, due to the different tonal characteristics of vowels and consonants.
With that said, it does make some sense to consider the peak frequency of a voice recording.
You could calculate the Fast Fourier Transform of your voice clip, then find the frequency
bin with the largest response. You may also be interested in the concept of a spectrogram, which represents how the audio spectrum of a signal varies over time.
Use Audacity. Take a small recording of typical speech, and cut it down to one wavelength, from one peak to another peak. Subtract the two times, and divide 1 by that number and you'll get the frequency of your wave in Hz.
Example:
In my audio clip, my waveform runs from 0.0760 to 0.0803 seconds.
0.0803-0.0760 = 0.0043
1/0.0043 = 232.558 Hz, my typical speech frequency
This might give you a good basis to create an analyzer. You'd need to detect the peaks, and time between the peaks of the wave and do an average calculation of the result.
You'll need to use Apple's Accelerate framework to take an FFT of the relevant audio. The FFT will convert the audio in the time domain to the frequency domain. The Accelerate framework supports the FFT and will allow you to do frequency analysis in real-time.
I got interested in this after I saw Square use the headphone jack on the iPhone to send credit card data.
What's the average bandwidth of the headphone jack on the iPhone, average notebook, and average mobile device?
Can it be doubled by sending different data streams on the different channels (left/right)?
One issue is the bandwidth of audio cables, which I won't go into here. As for audio ports, assume a soundcards with a maximum sample rate of 44,100 or 48,000 samples/s at 16 bits/sample/channel, resulting in a maximum bandwidth of 22.05 or 24 kHz (basically a result of the Nyquist-Shannon sampling theorem, though for sound sampling, the sampled signal would also have to be continuous-amplitude for this theorem to apply) and a transfer rate of 176.4 or 192 kBps for stereo.
According to Studio Six Digital, the line-in on the iPhone supports a max sample rate of 48 kHz. The mic on the 3G version also runs at 48 kHz, while the 1st gen iPhone's mic sampled at 8kHz. I haven't been able to find bit depth specs for the iPhone, but I believe it uses 16 bit samples. 24 bit samples is the other possibility.
According to Fortuny over at the Apple forums, who was quoting an Apple Audio Developer Note, the line-in on a MacBook support up to 24 bit samples with a 96 kHz sample rate, for a data rate of 576 kBps. Apple's MacBook External Ports and Connector's page lists the max sample rate as 192 kHz, but they may have switched that with the max sample rate for digital audio using the optical port.
For a rate comparison, phone systems had a sample rate of 8 kHz at 8 bits/sample mono, resulting in a max data rate of 8 kbps. FM has a sample rate of 22.05 kHz at 16 bits/sample/channel and is stereo, resulting in a data rate of 88.2 kBps.
Of course, the above calculations ignore the problem of synchronizing the data stream and error detection and correction, all of which will consume a portion of the signal.
Typical audio device maximum is 48Khz stereo, lots of devices can handle 96 Khz.
But course what comes out of the headphone jack is analog, not digital, and it runs through some filters as well on the way out, so some sort of tone modulation is the way to go. There may be some crosstalk between the stereo channels - how much crosstalk will be very device dependent.
0ld style telephone modems could send 9600 baud over standard analog lines that aren't even as clean as your typical headphone jack. And that's MONO. I would think you could get 2400 baud per channel without working too hard.
You might be able to go as high as 100K baud if you were very clever at signal processing.
Credit card validation systems were designed to run at 2400 baud mono last time I looked at them, It wouldn't surprise me if they still were given how much inertia there is in point of purchase systems.
I'm not sure if this is correct for all systems but almost all if not all sampling systems use a 1 bit delta modulation system that most likely embedded into the dsp chip set on most portable units. The decimation (changing 1 bit to 16,20 or 24 bit) is done in software and so is the anti aliasing filters. Mind you these dfp chips are being optimized via hardware so as to reduce energy consumption, so there may be a limit to what they could produce via software.
As far as nyquist limitations - these don't really come into context when transferring digital information over well controlled data paths. If you look at modems and the way they transmit information - they use a lot of DSP to send a higher band width by using phase shift keying - which looks at the relative phase shift to the carrier signal timing and can differentiate much smaller increments than the normal doubling of the nyquist limit.(sampling at 44khz while producing at data at 20 khz) so the dsp can see a 10 or 20 degree shift in the carrier frequency compared to the 180 degree shift. this is because you have a reference signal to compare with.
Also the data flow is all broadband spread-spectrum encoded which increases density a whole bunch (lookup jesse russell for broadband and Hedy Lamarr in spread-spectrum)
My laptop does 192khz at 24 bit (dell xrs/14z) or so they say. I usually transfer my audio via network connection to my main studio pc which has a ADAT optical to a remote unit so I get superior noise and cross talk levels. laptops and mobile smart phones are full of digital noise and are physically too small to reduce these issues. Until they get digital headphones (not likely soon) then one has to use discrete systems like they do in a professional recording studios.
I've put together a library to answer this question for myself. The iPhone has a pretty typical cutoff of around 20kHz, so the data rate you can achieve just depends on how good your SNR is. The relevant theory is the Shannon-Nyquist limit. I've managed to hit roughly 64kbps with this library, and I think more is possible with better tuning
If you'd like to see the library, it's https://github.com/quiet/quiet
Live demo: https://quiet.github.io/quiet-js/lab.html
20Khz is pretty much the max on any circuit intended to carry audio, because it's pretty much the top of the human ear's frequency response. Given the Nyquist limit, you're probably looking at 10Kb/sec tops. Of course, Back In The Day(TM), we though 9600b/s was high speed, so it might be good enough. And yes, you could double it using stereo output.