Zero-value data in createMediaStreamSource input buffer when recording using Web Audio - web-audio-api

I am attempting to record live audio via USB microphone to be converted to WAV and uploaded to a server. I am using Chrome Canary (latest build) on Windows XP. I have based my development on the example at http://webaudiodemos.appspot.com/AudioRecorder/index.html
I see that when I activate the recording, the onaudioprocess event's input buffers (e.inputBuffer.getChannelData(0), for example) contain only zero-value data. Naturally, no sound is output or recorded when this is the case. I have verified the rest of the code by replacing the input buffer data with data that produces a tone, which shows up in the output WAV file. When I use approaches other than createMediaStreamSource, things work correctly. For example, I can use createObjectURL, set an src to it, and successfully hear my live audio played back in real time. I can also load an audio file and, using createBufferSource, see that during playback (which I hear) the input buffer contains non-zero data, as expected.
Since most of the web-audio recording demos I have seen on the web rely on createMediaStreamSource, I am guessing this has been inadvertently broken in some subsequent release of Chrome. Can anyone confirm this or suggest how to overcome the problem?
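For reference, the createMediaStreamSource path in question boils down to roughly the sketch below (written against the modern promise-based getUserMedia; the AudioRecorder demo itself used the prefixed webkitGetUserMedia of the time):

// minimal sketch of the createMediaStreamSource -> ScriptProcessor path,
// logging whether the incoming buffers are all zeros
navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
  const context = new AudioContext();
  const source = context.createMediaStreamSource(stream);
  const processor = context.createScriptProcessor(4096, 1, 1);

  processor.onaudioprocess = (e) => {
    const data = e.inputBuffer.getChannelData(0);
    const silent = data.every((sample) => sample === 0);
    console.log(silent ? 'all zero-value data' : 'live samples arriving');
  };

  source.connect(processor);
  processor.connect(context.destination); // the ScriptProcessor must be connected for onaudioprocess to fire
});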

It's probably not the version of Chrome. Live input still has some fairly strict requirements right now:
1) The input and output sample rates need to be the same on Windows.
2) Windows 7+ only - I don't believe it will work on Windows XP, which is likely what is breaking things for you.
3) The input device must be stereo (or have more than two channels) - many, if not most, USB microphones show up as mono devices, and Web Audio isn't working with them yet.
I'm presuming, of course, that my AudioRecorder demo isn't working for you either.
These limitations will be removed over time.
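A quick way to sanity-check constraints 1 and 3 today is to compare the capture track's settings with the AudioContext (a sketch using the modern MediaStreamTrack.getSettings(), which post-dates this answer; sampleRate and channelCount may be undefined in some browsers):

// compare the capture settings with the AudioContext's (output) sample rate
// and look at the reported channel count
navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
  const track = stream.getAudioTracks()[0];
  const settings = track.getSettings(); // sampleRate/channelCount are browser-dependent
  const ctx = new AudioContext();
  console.log('input sampleRate:', settings.sampleRate,
              'output sampleRate:', ctx.sampleRate,
              'input channelCount:', settings.channelCount);
});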

Related

Chrome speech recognition WebKitSpeechRecognition() not accepting input from fake audio device (--use-file-for-fake-audio-capture) or audio file

I would like to use Chrome's speech recognition, WebKitSpeechRecognition(), with an audio file as input for testing purposes. I could use a virtual microphone, but that is really hacky and hard to automate; when I tested it, everything worked fine and the speech recognition converted my audio file to text. Now I want to use the following Chrome arguments:
--use-file-for-fake-audio-capture="C:/url/to/audio.wav"
--use-fake-device-for-media-stream
--use-fake-ui-for-media-stream
This worked fine on voice-recorder sites, for example, and I could hear the audio file play when I replayed the recording. But for some reason, when I try to use this with Chrome's WebKitSpeechRecognition, it doesn't use the fake audio device but instead my actual microphone. Is there any way I can fix this or otherwise test my audio files against the website? I am using C#, and I couldn't find any useful information on automatically adding, managing, and configuring virtual audio devices. What approaches could I take?
Thanks in advance.
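For reference, the recognition side is just standard webkitSpeechRecognition usage, roughly like the sketch below (the Chrome flags above are passed on the command line when launching the browser):

// minimal webkitSpeechRecognition harness; with the fake-capture flags Chrome
// would be expected to feed audio.wav into this instead of the real microphone
const recognition = new webkitSpeechRecognition();
recognition.lang = 'en-US';
recognition.interimResults = false;

recognition.onresult = (event) => {
  console.log('transcript:', event.results[0][0].transcript);
};
recognition.onerror = (event) => {
  console.log('recognition error:', event.error);
};

recognition.start();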
Well, it turns out this is not possible because Chrome and Google check whether you are using a fake mic, etc. They do this specifically to prevent this kind of behavior, so people cannot get free speech-to-text. There is a paid API available from Google (the first 60 minutes per month are free).

Streaming Live audio to the browser - Alternatives to the Web Audio API?

I am attempting to stream live audio from an iOS device to a web browser. The iOS device sends small, mono WAV files (as they are recorded) through a WebSocket. Once the client receives the WAV files, I have the Web Audio API decode and schedule them accordingly.
This gets me about 99% of the way there, except I can hear clicks between each audio chunk. After some reading around, I have realized the likely source of my problem: the audio is being recorded at a sample rate of only 4 kHz, and this cannot be changed. It appears that the Web Audio API's decodeAudioData() function does not handle sample rates other than 44.1 kHz with exact precision, resulting in gaps between chunks.
I have tried literally everything I could find about this problem (ScriptProcessorNodes, adjusting the timing, creating new buffers, even manually upsampling), and none of it has worked. At this point I am about to abandon the Web Audio API.
Is the Web Audio API appropriate for this?
Is there a better alternative for what I am trying to accomplish?
Any help/suggestions are appreciated, thanks!
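For context, the decode-and-schedule path described above boils down to roughly this (a sketch; ws stands in for the WebSocket delivering each WAV chunk as an ArrayBuffer):

const ctx = new AudioContext(); // runs at the hardware rate, typically 44.1 or 48 kHz
let nextStartTime = 0;

// ws is assumed to be the WebSocket delivering each small WAV chunk
ws.binaryType = 'arraybuffer';
ws.onmessage = (event) => {
  ctx.decodeAudioData(event.data, (buffer) => {
    // decodeAudioData resamples the 4 kHz chunk to ctx.sampleRate;
    // buffer.duration is only as precise as that conversion, so
    // back-to-back scheduling accumulates tiny gaps (audible clicks)
    const source = ctx.createBufferSource();
    source.buffer = buffer;
    source.connect(ctx.destination);
    if (nextStartTime < ctx.currentTime) {
      nextStartTime = ctx.currentTime;
    }
    source.start(nextStartTime);
    nextStartTime += buffer.duration;
  });
};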
At last! AudioFeeder.js works wonders. I just specify the sampling rate of 4 kHz, feed it raw 32-bit PCM data, and it outputs a consistent stream of seamless audio! It even has built-in buffer-handling events, so there is no need to set up loops or timeouts to schedule chunk playback. I did have to tweak it a bit, though, to connect it to the rest of my Web Audio nodes and not just context.destination.
Note: AudioFeeder does automatically upsample to the audio context's sampling rate. Going from 4 kHz to 44.1 kHz did introduce some pretty gnarly-sounding artifacts in the high end, but a 48 dB lowpass filter (four 12 dB stages) at 2 kHz got rid of them. I chose 2 kHz because, thanks to Harry Nyquist, I know that a sampling rate of 4 kHz couldn't possibly have captured frequencies above 2 kHz in the original file.
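For anyone wanting to replicate the filter, the chain can be built from standard BiquadFilterNodes, each rolling off at roughly 12 dB/octave (a sketch; sourceNode, e.g. AudioFeeder's output after the tweak mentioned above, is left hypothetical):

// build a cascade of lowpass biquads; four stages at 2 kHz gives the
// steep rolloff described above
function makeLowpassChain(ctx, cutoffHz, stages) {
  const filters = [];
  for (let i = 0; i < stages; i++) {
    const f = ctx.createBiquadFilter();
    f.type = 'lowpass';            // each biquad rolls off at roughly 12 dB/octave
    f.frequency.value = cutoffHz;
    if (i > 0) filters[i - 1].connect(f);
    filters.push(f);
  }
  return { input: filters[0], output: filters[filters.length - 1] };
}

const ctx = new AudioContext();
const chain = makeLowpassChain(ctx, 2000, 4);
// sourceNode.connect(chain.input);   // sourceNode is hypothetical here
chain.output.connect(ctx.destination);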
All hail Brion Vibbers

HLS H264/AAC stream functions perfectly except on OS X and iOS?

We are developing an open-source streaming server and are running into some trouble with our implementation of HLS.
We've been able to successfully convert any stream we want to TS and segment it into HLS segments, and it plays back beautifully on almost every player... except the Apple players (iPad, iPhone, Safari, QuickTime). On those, the H.264 encoding settings need to be picked very carefully, and even when sticking to Baseline/3.0 some visible glitching can be seen.
The AAC audio sounds choppy as well, no matter how we encode it (we tried both ffmpeg's aac and the libfdk_aac encoders in nearly all possible configurations). (Again, all of these versions play back just fine on non-Apple players.) Changing the encoding settings does sometimes yield better results, but we've not been able to find any combination that works for every video we've been testing with.
This leads us to conclude that perhaps the Apple players require something in the TS stream itself that we're not doing correctly. Is there anything that could cause this kind of behavior? For reference, an HLS test stream output by our packager/segmenter can be found here: link
We appreciate any feedback!

iPhone HE-AAC Streaming over Mobile Network (3G)

We developed an internet radio streamer using jPlayer, which utilizes the HTML5 audio tag with jQuery and has a Flash fallback for unsupported browsers. While testing the player on the iPhone (iOS 5.0.1), we ran into a very peculiar issue.
When the iPhone is connected to WiFi, it streams perfectly using the HE-AAC V2 stream at 64 kbps / 44.1 kHz (the preferred codec for Apple products). However, when the iPhone is connected to the 3G mobile network, it "stutters" or stops streaming for 1-2 seconds every 1-2 minutes (it does not stop streaming completely). The troubling thing is that when the iPhone is forced to use a separate MP3 stream at the same bit rate, it does not have this issue and works very well on 3G.
UPDATE 5
We recently acquired a 3G/4G Sprint mobile hotspot device and tested this issue with it. When the iPhone is connected to the mobile hotspot, it shows as being connected to a WiFi device and the issue does not appear, even though the actual connection is via 3G/4G. This might point back to the issue being with the iPhone not handling HE-AAC via HTTP live streaming properly when directly connected to the mobile network.
UPDATE 4
Updated the iPhone to iOS 5.1, yet the issue persists.
UPDATE 3
I have read here on SO about various issues with scripts not rendering correctly when connected to mobile networks. The finger seems to point to the mobile network carriers, which may be inserting a proxy to serve web pages, e.g. for downsizing images; the proxy might also inject some JavaScript into pages. The test page can be found HERE. Note: this page uses HE-AAC, so it will only work on an iPhone...
UPDATE
According to Apple's HTTP Live Streaming doc for iOS devices, "Audio-only content can be either MPEG-2 transport or MPEG elementary audio streams, either in AAC format with ADTS headers or in MP3 format." Our music server is using the Oddcast V3 encoder to send out three streams (MP3, HE-AAC V2, and Ogg Vorbis) to the Icecast V2 server. I'm not sure if the encoder is inserting ADTS headers for the HE-AAC V2 stream. Is there a way to check for this?
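One rough way to check is to pull a chunk of the HE-AAC mount and scan it for the ADTS sync word (a sketch; the stream URL is hypothetical, cross-origin fetches may need CORS, and the sync-word scan is only a heuristic):

// ADTS frames start with a 12-bit sync word (0xFFF) followed by layer bits 00;
// scan the whole first chunk because joining an Icecast mount mid-stream may
// not land exactly on a frame boundary
function looksLikeAdts(bytes) {
  for (let i = 0; i + 1 < bytes.length; i++) {
    if (bytes[i] === 0xff && (bytes[i + 1] & 0xf6) === 0xf0) return true;
  }
  return false;
}

const STREAM_URL = 'http://example.com:8000/radio.aac'; // hypothetical mount point

fetch(STREAM_URL)
  .then((response) => response.body.getReader().read())
  .then(({ value }) => {
    console.log(looksLikeAdts(value || new Uint8Array(0))
      ? 'ADTS sync word found near stream start'
      : 'No ADTS sync word found in the first chunk');
  });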
Coming from a radio planning point of view, here are my two cents:
What you are describing sounds like bandwidth shaping, which is both a common and often necessary design of radio networks (like 3G networks). At most 3G operators I worked at, you would typically optimize your network to give high-speed bursts (think downloading an image, sending one email, or fetching one HTML page) priority over "long-running" high-bandwidth services.
This is due to the simple fact that this is what most users want/need.
On a typical 3GPP (GSM 3G) network, this shaping can result in you first getting a RAB (radio access bearer) supporting 384 kbit/s, which is then downgraded for as long as your device accepts it.
This means that typically you will get switched from 384 -> 256 -> 128, then 64 kbit/s, where perhaps your device starts receiving data too slowly; the network then upgrades the bearer and downgrades it again after a while.
So why isn't the MP3 stream stuttering, then? My guess is that the total bit rate might differ, so you are fine in the 64 kbit/s RAB. This is a common phenomenon.
We have managed to get the exact same thing working: 64 kbit/s AAC v2 on mobile devices. We are streaming files and not a steady stream. I think Magnus is right when he explains how the network prioritizes traffic into bursts; in our case that means we have large parts of the file right away and the player can continue playing until the next burst comes in. In your case it means the stream pauses until the next burst arrives.
Could you either switch to larger chunks in your streaming (a larger buffer) or stream whole files instead?
We had a very strange phenomenon with iOS: we had to rename all files from .m4a to .aac to get them streaming on iOS. If we didn't rename them, iOS wouldn't play them.
Good luck.

Realtime Audio/Video Streaming FROM iPhone to another device (Browser, or iPhone)

I'd like to get real-time video from the iPhone to another device (either desktop browser or another iPhone, e.g. point-to-point).
NOTE: It's not one-to-many, just one-to-one at the moment. Audio can be part of the stream or go via a telephone call on the iPhone.
There are four ways I can think of...
1) Capture frames on the iPhone, send the frames to a media server, and have the media server publish realtime video using a host webserver.
2) Capture frames on the iPhone, convert them to images, send them to an HTTP server, and have JavaScript/AJAX in the browser reload the images from the server as fast as possible.
3) Run an HTTP server on the iPhone, capture 1-second-duration movies on the iPhone, create M3U8 files on the iPhone, and have the other user connect directly to the HTTP server on the iPhone for live streaming.
4) Capture 1-second-duration movies on the iPhone, create M3U8 files on the iPhone, send them to an HTTP server, and have the other user connect to the HTTP server for live streaming. This is a good answer; has anyone gotten it to work?
Is there a better, more efficient option?
What's the fastest way to get data off the iPhone? Is it ASIHTTPRequest?
Thanks, everyone.
Sending raw frames or individual images will never work well enough for you (because of the amount of data and number of frames). Nor can you reasonably serve anything from the phone (WWAN networks have all sorts of firewalls). You'll need to encode the video, and stream it to a server, most likely over a standard streaming format (RTSP, RTMP). There is an H.264 encoder chip on the iPhone >= 3GS. The problem is that it is not stream oriented. That is, it outputs the metadata required to parse the video last. This leaves you with a few options.
1) Get the raw data and use FFmpeg to encode on the phone (will use a ton of CPU and battery).
2) Write your own parser for the H.264/AAC output (very hard).
3) Record and process in chunks (will add latency equal to the length of the chunks, and drop around 1/4 second of video between each chunk as you start and stop the sessions).
"Record and process in chunks (will add latency equal to the length of the chunks, and drop around 1/4 second of video between each chunk as you start and stop the sessions)."
I have just written such code, and it is quite possible to eliminate that gap by overlapping two AVAssetWriters. Since this approach uses the hardware encoder, I strongly recommend it.
We have similar needs; to be more specific, we want to implement streaming video & audio between an iOS device and a web UI. The goal is to enable high-quality video discussions between participants using these platforms. We did some research on how to implement this:
We decided to use OpenTok and managed to pretty quickly implement a proof-of-concept style video chat between an iPad and a website using the OpenTok getting started guide. There's also a PhoneGap plugin for OpenTok, which is handy for us as we are not doing native iOS.
Liblinphone also seemed to be a potential solution, but we didn't investigate further.
iDoubs also came up, but again, we felt OpenTok was the most promising one for our needs and thus didn't look at iDoubs in more detail.