I am getting a remote media stream that contains both audio and video. The audio in this stream is mono, but I need stereo.
How can I convert mono audio to stereo audio using the Web Audio API?
August 2016 update:
Apparently the behavior of the .connect() call changed; in the past the input/output indices were automatically incremented with each call, but now they just default to 0 - so when unspecified the calls will always connect output 0 to input 0.
If the input stream has two channels but only one of them actually carries audio, you'll have to manually route that channel to both the left and right speakers. You can break the two channels of a stereo connection (the connection from the MediaElementSource) into two separate mono connections by using a ChannelSplitter. The left (or right, depending on your usage) channel connection can then be routed to both the left and right inputs of the ChannelMerger (thanks to fan-out support, a single output can be connected to multiple different inputs), which will combine all its mono input connections back into one stereo output connection. The Gain node shown in the old answer is unnecessary.
These connections can be made by specifying the correct indices in the connect(AudioNode destination, optional unsigned long output = 0, optional unsigned long input = 0); call, as mentioned above.
//create the audio context and a source node to capture the audio from your video element
var context = new AudioContext();
var source = context.createMediaElementSource(document.querySelector('video'));
//create the splitter and the merger
var splitter = context.createChannelSplitter();
var merger = context.createChannelMerger();
//route the source audio to the splitter. This is a stereo connection.
source.connect(splitter);
//route output 0 (left) of the splitter to input 0 (left) of the merger. This is a mono connection, carrying the left output signal to the left input of the merger.
splitter.connect(merger, 0, 0);
//route output 0 (left) of the splitter to input 1 (right) of the merger. This is a mono connection as well, carrying the left output signal to the right input of the merger.
splitter.connect(merger, 0, 1);
//finally, connect the merger to the destination. This is a stereo connection.
merger.connect(context.destination);
And here is what it looks like in a diagram. Remember that the connections between the input and the splitter, and between the merger and the destination, are stereo connections (or more, depending on the configuration: with a 2.1, 5.1 or 7.1 setup the connection between the merger and the destination can carry 3, 6 or 8 channels respectively), whilst the two connections between the splitter and the merger are mono connections.
+--------------+     +------------+      +-------------------+     +-----------+
| Stereo input |====>|  Splitter  |      |      Merger       |====>|destination|
+--------------+     |  channel0  |--+-->|  channel0 (left)  |     +-----------+
                     |  channel1  |  |   |                   |
                     |  etc       |  +-->|  channel1 (right) |
                     +------------+      +-------------------+
I am not a hundred percent sure, but this might work with the ChannelMerger node. You just connect a gain node to both inputs 1 and 2 (call .connect twice).
Edit (I have time now, so here is a more complete answer):
Do you really receive one-channel audio? The Web Audio API should mix that automatically, according to this document, which states: "For example, if a mono audio stream is connected to a stereo input it should just mix to left and right channels appropriately." If you receive a stereo stream where only one channel contains data, you will need to split it into two channels and then connect the channel that carries the audio to both left and right (working example here):
var gain = context.createGain();
var splitter = context.createChannelSplitter();
var merger = context.createChannelMerger();
//connect the merger to the speakers
merger.connect(context.destination);
//feed the source into the splitter
source.connect(splitter);
//route the first splitter output into the gain node
splitter.connect(gain);
//connect the gain node to the merger twice; with the old .connect() behaviour
//each call advanced to the next merger input (left, then right)
gain.connect(merger);
gain.connect(merger);
What happens at the merger and splitter is that each time you call .connect, the next channel is taken, but you only want the first channel, duplicated. So we route that channel to a gain node and split it out from there:
                 +------------+     +------+      +-------------------+
+------------+   |  Splitter  |     | gain |      |      Merger       |
| mono input |-->|  channel0  |---->|      |----->|  channel0 (left)  |     +-----------+
+------------+   |  channel1  |     |      |      |                   |---->|destination|
                 |  etc       |     |      |----->|  channel1 (right) |     +-----------+
                 +------------+     +------+      +-------------------+
Background Information
I am trying to make sure I will be able to run two ADXL345 accelerometers on the same I2C bus.
To my understanding, the bus can transmit up to 400k bits/s in fast mode.
In order to send 1 byte of data, there are 20 extra bits of overhead.
There are 6 bytes per accelerometer reading (XLow, XHigh, YLow, YHigh, ZLow, ZHigh)
I need to do 1000 readings per second with both accelerometers
Thus,
My total data used per second is 336k bits/s, which is within my limit of 400k bits/s.
I am not sure if I am doing these calculations correctly.
Question:
How much data am I transmitting per second with two accelerometers reading 1000 times per second over I2C?
Your math seems to be a bit off. For this accelerometer (from the datasheet: https://www.sparkfun.com/datasheets/Sensors/Accelerometer/ADXL345.pdf), in order to read the 6 bytes of XYZ sample data you need to perform a 6-byte burst read of the data registers. In terms of data transfer, that means writing the register address (0x32, DATAX0) to the accelerometer and then burst-reading 6 bytes continuously. Each of these two transfers requires first sending the I2C device address and the R/W bit, plus an ACK/NAK per byte (including the address bytes), as well as the START/REPEATED START/STOP conditions. So, overall, an individual transfer to get a single sample (i.e. a single XYZ acceleration vector) looks like this:
Start (*) | Device Address: 0x1D (7) | Write: 0 (1) | ACK (1) | Register Address: 0x32 (8) | ACK (1) | Repeat Start (*) | Device Address: 0x1D (7) | Read: 1 (1) | ACK (1) | DATA0 (8) | ACK (1) | DATA1 (8) | ACK (1) | ... | DATA5 (8) | NAK (1) | Stop (*)
If we add all that up, we get 81 + 3 bits that need to be transmitted. Note first that the START, REPEATED START and STOP might not actually take a full bit time each, but for simplicity we can assume they do. Note also that while the device address is only 7 bits, you always need to append the READ/WRITE bit, so each byte on the wire is 8 bits + ACK/NAK, i.e. 9 bits in total. Note also that the I2C maximum transfer rate really defines the maximum SCK speed the device can handle, so in fast mode the SCK is at most 400 kHz (thus 400 kbit/s at most, though because of the protocol overhead you'll get less real data). Thus, at 84 bits per sample and 400 kHz, we can transfer a sample in 0.21 ms, or roughly 4700 samples/sec, assuming no gaps or breaks in transmission.
Since you need to read 2 samples every 1 ms (2 accelerometers, so 84 bits * 2 = 168 bits per sampling period, or 168 kbit/s at a 1 kHz sampling rate), this should at least be possible in fast-mode I2C. However, you will need to be careful to make full use of the I2C controller. Depending on the software layer you are working with, it might be difficult to issue the I2C burst reads fast enough (i.e. 2 burst-read transactions within 1 ms). Using the FIFO on the accelerometer would significantly relax the latency requirement: instead of having 1 ms to issue two burst reads, you can wait up to 32 ms to issue 64 burst reads (since you have 2 accelerometers). But since you need to issue a new burst read to fetch each subsequent sample, you'll have to be careful about the delay introduced by software between calls to whatever API you're using to perform the I2C transactions.
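As a quick sanity check, here is a small C sketch of that arithmetic (the per-byte and framing bit costs are the assumptions stated above, not exact electrical timings):

#include <stdio.h>

int main(void)
{
    const int bits_per_byte_on_wire = 9;   /* 8 data bits + ACK/NAK         */
    const int framing_bits          = 3;   /* START, REPEATED START, STOP   */
    const int address_phases        = 3;   /* addr+W, register addr, addr+R */
    const int data_bytes            = 6;   /* X, Y, Z low/high              */

    int bits_per_sample = (address_phases + data_bytes) * bits_per_byte_on_wire
                          + framing_bits;                     /* 84 bits           */
    int required_bps    = bits_per_sample * 2 * 1000;         /* 2 sensors @ 1 kHz */

    printf("bits per sample read: %d\n", bits_per_sample);    /* 84     */
    printf("required throughput : %d bit/s\n", required_bps); /* 168000 */
    printf("fast-mode headroom  : %d bit/s\n", 400000 - required_bps);
    return 0;
}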
Are there registers involved or is it cache memory related?
An illustrative example for my question, which is perhaps simple enough: I move my mouse across the screen I am currently typing on. I don't click on anything; I just move the pointer left to right and up and down. How does the CPU handle the position changes of my mouse in relation to the monitor's display, which seems instantaneous?
Edit: I understand that this is mostly handled by the operating system, as the mouse is an external device and the CPU just calculates values and executes logic. The mouse moves and, on every clock signal, the operating system gets an interrupt and handles it appropriately.
When you move or click your mouse, it generates an interrupt. An interrupt is basically a way to tell the CPU that an event has happened that needs to be processed. The kernel will then run its interrupt handler to process the mouse events.
For example, the PS/2 mouse communicates by means of a 3-byte packet:
       -----------------------------------------
Byte 1 | YV | XV | YS | XS |  1 | MB | RB | LB |
       -----------------------------------------
Byte 2 |               X movement              |
       -----------------------------------------
Byte 3 |               Y movement              |
       -----------------------------------------
The MB, RB and LB flags represent the middle, right and left button clicks; XS and YS are the sign bits, and XV and YV the overflow bits, for the X and Y movement values.
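For illustration only (this is not code from any particular kernel driver), a handler might decode such a packet roughly like this, following the bit layout in the table above:

#include <stdbool.h>
#include <stdint.h>

struct mouse_event {
    bool left, right, middle;
    int  dx, dy;
};

/* p points to the 3 raw bytes of one PS/2 packet. */
static struct mouse_event decode_ps2_packet(const uint8_t p[3])
{
    struct mouse_event ev;

    ev.left   = p[0] & 0x01;                   /* LB */
    ev.right  = p[0] & 0x02;                   /* RB */
    ev.middle = p[0] & 0x04;                   /* MB */

    /* XS/YS are the sign bits of the 9-bit movement deltas. */
    ev.dx = p[1] - ((p[0] & 0x10) ? 256 : 0);
    ev.dy = p[2] - ((p[0] & 0x20) ? 256 : 0);

    return ev;
}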
The kernel will then eventually pass these events onto the application that is running.
For example, in Linux, the X Window Server is the process that handles mouse events. Individual graphical applications are informed of them through a generic X event protocol.
Registers and cache memory are always involved when running code. The kernel's interrupt handlers are optimized to process interrupts quickly and pass them on. The change appears near-instantaneous because CPUs are extremely fast: processors work with nanosecond resolution, and there are a billion nanoseconds in every second.
I have a buffer in SRAM of size 4096 bytes which gets updated with new raw audio data periodically:
---------------------------------------------
| 2048 bytes of left  | 2048 bytes of right |
---------------------------------------------
^                     ^
|A                    |B
NOTE: A and B are pointers to the start addresses.
As shown, the data is non-interleaved stereo (16-bit samples, 44100 Hz sampling rate), and since it is already in memory, I prefer to use mmap'ed access instead of read/write access (and as far as my understanding of ALSA goes, it should then not need a separate buffer to copy the data into from this one).
The starting address of this buffer is fixed (say 0x3f000000 physical address) and I am mmapping this buffer to get a virtual address pointer.
Now, how do I send the data to ALSA for playback, and what should my configuration be?
My current (unsuccessful) approach is:
Resample ON
Rate 44100
SND_PCM_ACCESS_MMAP_NONINTERLEAVED
channels 2
format SND_PCM_FORMAT_S16_LE
period near 1024 frames
buffer near 2*1024 frames
void* ptr[2];
ptr[0] = A;  // points to the mmapped virtual address of A (left half)
ptr[1] = B;  // points to the mmapped virtual address of B (right half)
while (1)
{
    wait_for_new_data_in_buffer();
    // snd_pcm_mmap_writen() expects a void** of per-channel pointers,
    // so pass ptr (which decays to void**), not &ptr
    snd_pcm_mmap_writen(handle, ptr, period_size);
}
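For reference, here is a minimal sketch of how the configuration listed above might be requested through ALSA's hw_params API (the helper name configure_pcm and the sparse error handling are my own simplifications, not taken from the question):

#include <alsa/asoundlib.h>

static int configure_pcm(snd_pcm_t *handle)
{
    snd_pcm_hw_params_t *hw;
    unsigned int rate = 44100;
    snd_pcm_uframes_t period = 1024, buffer = 2 * 1024;
    int err;

    snd_pcm_hw_params_alloca(&hw);
    if ((err = snd_pcm_hw_params_any(handle, hw)) < 0)
        return err;

    snd_pcm_hw_params_set_rate_resample(handle, hw, 1);              // Resample ON
    snd_pcm_hw_params_set_access(handle, hw, SND_PCM_ACCESS_MMAP_NONINTERLEAVED);
    snd_pcm_hw_params_set_format(handle, hw, SND_PCM_FORMAT_S16_LE);
    snd_pcm_hw_params_set_channels(handle, hw, 2);
    snd_pcm_hw_params_set_rate_near(handle, hw, &rate, 0);           // 44100 Hz
    snd_pcm_hw_params_set_period_size_near(handle, hw, &period, 0);  // ~1024 frames
    snd_pcm_hw_params_set_buffer_size_near(handle, hw, &buffer);     // ~2048 frames

    return snd_pcm_hw_params(handle, hw);                            // commit
}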
Extra info:
1. I am using an embedded board with ARM cores, running basic Linux.
2. This is a proprietary, work-related project, hence the vagueness of this question.
3. I already know that directly mmapping a physical address is not recommended, so please do not waste your time commenting about it.
Thanks in advance.
In iPhone SDK 4.3, I would like to record what is being played out through the speaker via Remote IO and also record the mic input. I was wondering if the best approach is to record each separately to a different channel in an audio file. If so, which APIs allow me to do this, and what audio format should I use? I am planning on using ExtAudioFileWrite to do the actual writing to the file.
Thanks
If both tracks that you have are mono, 16-bit integer, with the same sample rate:
format->mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
format->mBitsPerChannel = 16;
you can combine those tracks into 2-channel PCM by simply alternating a sample from one track with a sample from the other:
[short1_track1][short1_track2][short2_track1][short2_track2] and so on.
After that you can write these samples to the output file using ExtAudioFileWrite. That file should of course be 2-channel kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked as well.
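For illustration, a small C sketch of that interleaving (the function and buffer names are mine, not an Apple API):

#include <stddef.h>
#include <stdint.h>

/* Weave two mono 16-bit tracks of equal length into one L/R-interleaved
   stereo buffer that can then be handed to ExtAudioFileWrite. */
static void interleave_stereo(const int16_t *track1, const int16_t *track2,
                              int16_t *stereo_out, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        stereo_out[2 * i]     = track1[i];   /* left  = sample i of track 1 */
        stereo_out[2 * i + 1] = track2[i];   /* right = sample i of track 2 */
    }
}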
If one of the tracks is stereo (I don't think it is reasonable to record stereo from the iPhone mic), you can convert it to mono by taking the average of the 2 channels or by skipping every second sample.
You can separately save PCM data from the play and record callback buffers of the RemoteIO Audio Unit, then mix them using your own mixer code (DSP code) before writing the mixed result to a file.
You may or may not need to do your own echo cancellation (advanced DSP code) as well.
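As a rough sketch of what "your own mixer code" could boil down to for two mono 16-bit buffers, summing with clipping (all names here are illustrative):

#include <stddef.h>
#include <stdint.h>

/* Sum the playback and mic buffers sample by sample, clipping to the 16-bit range. */
static void mix_mono16(const int16_t *play, const int16_t *mic,
                       int16_t *out, size_t samples)
{
    for (size_t i = 0; i < samples; i++) {
        int32_t s = (int32_t)play[i] + (int32_t)mic[i];
        if (s >  32767) s =  32767;
        if (s < -32768) s = -32768;
        out[i] = (int16_t)s;
    }
}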
Using iPhone SDK 4.3, I am trying to connect the Remote IO mic output to the inputs of 2 mixer units in an AUGraph. However, with the following code only the first connection works; the second fails with error code -10862 (Audio processing graphs can only contain one output unit).
result = AUGraphConnectNodeInput (
             processingGraph,
             iONode,        // source node
             1,             // source node output bus number
             mixerNode1,    // destination node
             1              // destination node input bus number
         );
result = AUGraphConnectNodeInput (
             processingGraph,
             iONode,        // source node
             1,             // source node output bus number
             mixerNode2,    // destination node
             1              // destination node input bus number
         );

So how can I feed the mic input into the inputs of 2 mixers?
You cannot connect the same output to two separate inputs. The Core Audio model is a pull model, with each node requesting samples from the previous node it is connected to. If two mixers were requesting samples from one node, you would get samples 0..255 in one mixer and samples 256..511 in the other (if the buffer size were 256 samples). If you want a scenario like this to work, buffer the samples from the mic input and then give both mixers' callbacks access to that buffer.
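For illustration, here is a framework-agnostic C sketch of that buffering idea: the mic callback writes into one shared ring buffer, and each mixer's render callback keeps its own read cursor. The names, the fixed size and the missing thread-safety/underrun handling are simplifications, not part of any Apple API:

#include <stdint.h>

#define RING_FRAMES 4096

static int16_t  mic_ring[RING_FRAMES];
static uint32_t write_pos;      /* advanced only by the mic input callback */
static uint32_t read_pos[2];    /* one independent cursor per mixer input  */

/* Called from the mic (input) callback with freshly captured samples. */
static void ring_write(const int16_t *src, uint32_t frames)
{
    for (uint32_t i = 0; i < frames; i++)
        mic_ring[(write_pos + i) % RING_FRAMES] = src[i];
    write_pos += frames;
}

/* Called from mixer 0's and mixer 1's render callbacks; both see the same
   audio because each uses its own cursor. */
static void ring_read(int mixer, int16_t *dst, uint32_t frames)
{
    for (uint32_t i = 0; i < frames; i++)
        dst[i] = mic_ring[(read_pos[mixer] + i) % RING_FRAMES];
    read_pos[mixer] += frames;
}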
I know the question is really old, but I needed a solution for this as well, so this is what I came up with:
You can use kAudioUnitSubType_Splitter.
An audio unit with one input bus and two output buses. The audio unit duplicates the input signal to each of its two output buses.
Have a look at Apple's documentation