Play encoded audio stream with the Web Audio API

I'm sending an encoded live audio stream (MP3 or Ogg) over WebSockets and I want to play it with the Web Audio API.
I've read and tried several things, but nothing has worked so far.
I always tried to do it with the decodeAudioData method, but that method can't handle a continuous stream.
So my last approach was this:
ws.onmessage = function (evt) {
    context.decodeAudioData(evt.data, function (decodedData) {
        source = context.createBufferSource();
        source.buffer = decodedData;
        source.connect(context.destination);
        source.start(startTime);              // schedule the chunks back to back
        startTime += decodedData.duration;
    },
    function (e) {
        console.error('decodeAudioData failed:', e);
    });
};
This works, at least with MP3s, but not very well: between the received chunks there is a very small gap, so playback of the stream isn't fluid. I don't know the reason for that; maybe the decodedData.duration property is not accurate enough, or there is some kind of delay somewhere.
In any case, it doesn't work at all with Ogg files: I can hear the first received chunk, but the rest is ignored. Maybe this has something to do with missing headers?
Is there any other method in the Web Audio API for playing an encoded live stream besides decodeAudioData? I could not find anything.
Thanks for your help!

Don't do this over WebSockets if you can help it. Let the browser do its job and play the stream over HTTP. Otherwise you must reinvent everything.
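With plain HTTP the entire player can be as small as this (the stream URL is a hypothetical placeholder):
var audio = new Audio('http://example.com/live-stream.mp3'); // hypothetical stream URL
audio.play();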
If you insist on reinventing everything for some reason, you must:
Buffer incoming data
Decode that data
Buffer decoded data
Play back your decoded PCM buffers with a script node
Handle the times when you have buffer underruns/overruns (likely by playing back silence or dropping PCM samples)
How you do each of these items depends on your specific needs and implementation, so I would recommend splitting up the question if you get stuck on any of that.
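As a minimal sketch of steps 1-5, assuming, to keep it short, that the server sends raw 16-bit PCM rather than MP3/Ogg (decoding in JavaScript is a separate problem), and with illustrative names like pcmQueue:
var context = new AudioContext();
var pcmQueue = [];   // step 3: decoded samples awaiting playback
var node = context.createScriptProcessor(4096, 1, 1);

node.onaudioprocess = function (e) {
    var out = e.outputBuffer.getChannelData(0);
    for (var i = 0; i < out.length; i++) {
        // Step 5: on underrun, emit silence instead of stalling.
        out[i] = pcmQueue.length ? pcmQueue.shift() : 0;
    }
};
node.connect(context.destination);   // step 4: play back via the script node

ws.binaryType = 'arraybuffer';
ws.onmessage = function (evt) {
    // Steps 1-2: with raw PCM, "decoding" is just int16 -> float conversion.
    var samples = new Int16Array(evt.data);
    for (var i = 0; i < samples.length; i++) {
        pcmQueue.push(samples[i] / 32768);
    }
};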

Related

Improve avatar upload time

I'm using Smack to upload an avatar. It takes a long time, and most of the time it times out (sometimes even 2 min is not enough). Is there a way I can improve on that? Is there any other way to quickly upload an avatar?
I know I could run my own HTTP service to serve avatars, but I'm not willing to go that route right now. Fetching the VCard avatar is very quick.
I use Smack 4.3.0, and the Smack logs are here: https://pastebin.com/dQbSEpmJ
Here is the code I use:
fun setPhoto(path: String) = viewModelScope.launch(Dispatchers.IO) {
    try {
        val file = File(path)
        val vCardMgr = VCardManager.getInstanceFor(connection)
        val vCard = vCardMgr.loadVCard()
        vCard.setAvatar(
            Base64.encodeToString(file.readBytes(), Base64.DEFAULT),
            FileUtils.getMimeType(path)
        )
        vCardMgr.saveVCard(vCard)
    } catch (e: Exception) {
        launch(Dispatchers.Main) {
            Toast.makeText(chatApp.applicationContext, e.message, Toast.LENGTH_LONG).show()
        }
    }
}
Testing with Openfire, I found that the file was so big that the resulting stanza was huge enough to crash the server. That was confirmed by Guus (of Ignite Realtime), whom I quote:
Openfire has a (configurable) maximum stanza size limit. I think it’s
on 2MB. Note that when you base64 encode binary data, the encoded
result will be a lot larger than the unencoded original. I suggest
that you reduce the image size in your vcard, or use another mechanism
to exchange the data.
So compressing the image solved the issue.
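For reference, here is a minimal sketch of the fix on Android; the 256 px bound and the JPEG quality of 80 are illustrative values, not Smack requirements:
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import java.io.ByteArrayOutputStream

fun compressedAvatarBytes(path: String): ByteArray {
    val source = BitmapFactory.decodeFile(path)
    // Scale the longer edge down to at most 256 px to keep the stanza small.
    val scale = (256f / maxOf(source.width, source.height)).coerceAtMost(1f)
    val scaled = Bitmap.createScaledBitmap(
        source,
        (source.width * scale).toInt(),
        (source.height * scale).toInt(),
        true
    )
    return ByteArrayOutputStream().use { out ->
        scaled.compress(Bitmap.CompressFormat.JPEG, 80, out)
        out.toByteArray()
    }
}
The resulting bytes then replace file.readBytes() in the code above, with "image/jpeg" as the MIME type.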

Don't receive results other than those from the first audio chunk

I want some level of real-time speech-to-text conversion. I am using the WebSocket interface with interim_results=true. However, I am receiving results for the first audio chunk only; the second, third, ... audio chunks that I am sending are not getting transcribed. I do know that my receiver is not blocked, since I do receive the inactivity message:
{"error": "Session timed out due to inactivity after 30 seconds."}
Please let me know if I am missing something or if I need to provide more contextual information.
Just for reference, this is my init JSON:
{
    "action": "start",
    "content-type": "audio/wav",
    "interim_results": true,
    "continuous": true,
    "inactivity_timeout": 10
}
In the result that I get for the first audio chunk, the final JSON field is always false.
Also, I am using Go, but that should not really matter.
EDIT:
Consider the following pseudo log:
localhost-server receives first 4 seconds of binary data  # let's say Binary 1
Binary 1 is sent to Watson
{interim_result_1 for first chunk}
{interim_result_2 for first chunk}
localhost-server receives last 4 seconds of binary data  # let's say Binary 2
Binary 2 is sent to Watson
Send {"action": "stop"} to Watson
{interim_result_3 for first chunk}
final result for the first chunk
I am not receiving any transcription for the second chunk.
Link to code
You are getting the time-out message because the service waits for you to either send more audio or send a message signalling the end of the audio submission. Are you sending that message? It's very easy, either:
By sending a JSON text message with the action key set to the value stop: {"action": "stop"}
By sending an empty binary message
https://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/speech-to-text/websockets.shtml
Please let me know if this does not resolve your problem.
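In Go, that could look roughly like this (using the gorilla/websocket package as an assumption; any WebSocket client will do):
import "github.com/gorilla/websocket"

// sendStop tells the service that the audio submission is complete, so it
// returns the final results instead of timing out.
func sendStop(conn *websocket.Conn) error {
    // An empty binary message would work as well:
    //     conn.WriteMessage(websocket.BinaryMessage, []byte{})
    return conn.WriteMessage(websocket.TextMessage, []byte(`{"action": "stop"}`))
}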
This is a bit late, but I've open-sourced a Go SDK for Watson services here:
https://github.com/liviosoares/go-watson-sdk
There is some documentation about speech-to-text binding here:
https://godoc.org/github.com/liviosoares/go-watson-sdk/watson/speech_to_text
There is also an example of streaming data to the API in the _test.go file:
https://github.com/liviosoares/go-watson-sdk/blob/master/watson/speech_to_text/speech_to_text_test.go
Perhaps this can help you.
The solution to this question turned out to be setting the size header of the WAV file to 0.
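For illustration, a minimal sketch of that fix, assuming a canonical 44-byte PCM WAV header (real files can carry extra chunks, which would shift the data-chunk offset):
// zeroWavSizes clears the RIFF chunk size (bytes 4-7) and the data chunk
// size (bytes 40-43) so the service treats the input as an open-ended stream.
func zeroWavSizes(header []byte) {
    for _, off := range []int{4, 40} {
        header[off], header[off+1], header[off+2], header[off+3] = 0, 0, 0, 0
    }
}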

Sending a video from Perl to a client over HTTP

I am currently writing a Perl script that converts a file to WebM/Ogg/MP4 format and then sends it back to the user as an embedded video. It all works, except that I cannot signal the end of the file, so the HTML5 video player doesn't know where the end is and can't use the file properly (seeking to a specific time, knowing when the video has ended; right now it just stops and you can't do anything with the video anymore).
Start-code:
elsif ($path =~ /^\/((\w|\d){11})\.webm$/ig) {
    print "HTTP/1.0 200 OK\r\n";
    $handler = \&resp_youtubemovie;
    $handler->($cgi, $1);
Function to send the webm file:
sub resp_youtubemovie {
    my $cgi = shift;
    my $youtubeID = shift;
    return if !ref $cgi;
    open(movie, "<$youtubeID.webm");
    local($/);
    $movie = <movie>;
    close(movie);
    print "Content-type: movie/webm\n";
    print $movie;
}
I've already tried a while loop with a buffer, but that doesn't work either. I've also tried changing the HTTP status code to 206 Partial Content, because Wireshark showed some other video streaming websites using it, but it didn't matter. So, does anyone have an idea how to open a movie file and stream it correctly?
Rather than doing this by hand, a framework like Dancer can take care of it. That will save you many, many, many headaches. It also allows you to take advantage of the Plack/PSGI superglue, which figures out how to talk to web servers for you.
use Dancer;

get qr{/(\w{11}\.webm)$}i => sub {
    my ($video_file) = splat;
    return send_file(
        $video_file,
        streaming => 1,
    );
};
Using Dancer routes, you should be able to adapt your existing code pretty easily, especially if it's a big if/elsif chain matching against various paths. Dancer does a very good job of making simple things simple, and it also gives you a huge amount of control over the exact HTTP response if you need it.
A few notes...
The content type for WebM is video/webm, which may be the source of your problems. Dancer should just get it right; if not, you can tell send_file the content type explicitly.
(\w|\d){11} is better written as \w{11}, since \w includes \d.
You must use the 206 Partial Content HTTP status, and you must also send:
The Accept-Ranges: bytes header.
A Content-Range: bytes 0-2048/123456 header, giving the starting and ending byte index of the content followed by its total byte length. The client sends the byte ranges it wants in the request header. The client may send multiple byte ranges in a single request, in which case you'd also need to send the content with multipart boundaries.
Finally, to get back to your question: if the client requests a byte range that isn't satisfiable, you send a 416 HTTP status and close the connection.
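A minimal single-range sketch in the style of the question's handler (function and variable names are illustrative; a production server also needs multi-range and HEAD support):
sub send_range {
    my ($path, $range_header) = @_;          # e.g. "bytes=0-2047"
    my $size = -s $path;
    my ($start, $end) = $range_header =~ /bytes=(\d*)-(\d*)/;
    $start = 0         if $start eq '';
    $end   = $size - 1 if $end eq '' || $end >= $size;
    if ($start >= $size) {                   # unsatisfiable range
        print "HTTP/1.1 416 Range Not Satisfiable\r\n\r\n";
        return;
    }
    my $len = $end - $start + 1;
    open my $movie, '<:raw', $path or die $!;
    seek $movie, $start, 0;
    read $movie, my $chunk, $len;
    close $movie;
    print "HTTP/1.1 206 Partial Content\r\n";
    print "Content-Type: video/webm\r\n";
    print "Accept-Ranges: bytes\r\n";
    print "Content-Range: bytes $start-$end/$size\r\n";
    print "Content-Length: $len\r\n\r\n";
    print $chunk;
}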

iPhone: extracting the song name from an Icecast streaming radio

I am looking to extract the song name from an Icecast streaming radio. I am getting icy-genre, icy-name and such, but not the song name. Can it be extracted from the stream?
From your question I presume you already added the Icy-Metadata: 1 header to your request.
You'll have to read the icy-metaint response header; it tells you how many bytes to read between each metadata update in the stream.
The following is pseudocode:
byteinterval = int(response.getHeader("icy-metaint"))
stream = response.getBodyStream()
stream.read(byteinterval)                  # skip the audio payload
metadata_len = byte(stream.read(1)) * 16   # length prefix counts 16-byte blocks
metadata = stream.read(metadata_len)
The metadata will look something like this:
StreamTitle='Some Song Name Stream Client Sent';StreamUrl='http://someurl.com/';
Unfortunately there's no absolute standard for either the encoding of the metadata buffer or the contents of StreamTitle.
The song name may or may not contain the album name, the artist name, the complete song title, or other fields.
The metadata buffer itself may or may not be UTF-8; it's up to the streaming client to decide what to inject. Most decent clients will use UTF-8 when forced to send non-ASCII data, but not all (some may send a native encoding such as Windows-1252, or UTF-16).
If you want to keep getting metadata updates, keep consuming byteinterval-sized blocks of bytes in the same way, or disconnect and poll the stream later.
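On the iPhone side, a minimal Swift sketch of that loop, assuming data holds the accumulated response body and metaInt came from the icy-metaint header (all names are illustrative):
import Foundation

func extractTitles(from data: Data, metaInt: Int) -> [String] {
    var titles: [String] = []
    var i = 0
    while i + metaInt < data.count {
        i += metaInt                      // skip the audio payload
        let metaLen = Int(data[i]) * 16   // length prefix counts 16-byte blocks
        i += 1
        guard i + metaLen <= data.count else { break }
        if metaLen > 0,
           let meta = String(data: data[i..<(i + metaLen)], encoding: .utf8),
           let start = meta.range(of: "StreamTitle='") {
            let rest = meta[start.upperBound...]
            if let end = rest.range(of: "';") {
                titles.append(String(rest[..<end.lowerBound]))
            }
        }
        i += metaLen
    }
    return titles
}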

NSURLRequest with HTTPBody input stream: Stream sends event before being opened

I want to send a large amount of data to a server using NSURLConnection (and NSURLRequest). For this I create a bound pair of NSStreams (using CFStreamCreateBoundPair(...)). Then I pass the input stream to the NSURLRequest (-setHTTPBodyStream:) and schedule the output stream on the current run loop. When the run loop continues, I get the events to send data, and the input stream delivers this data to the server.
My problem is that this only works when the data fits into the buffer between the paired streams. If the data is bigger, then somehow the input stream gets an event (I assume "bytes available") before NSURLConnection has opened it. This results in an error message being printed, and the data is not sent.
I tried to catch this in my -stream:handleEvent: method by simply returning if the input stream is not yet open, but then my output stream gets a stream-closed event (probably because I never sent data when I could have).
So my question is: How to use a bound pair of streams with NSURLConnection correctly?
(If this matters: I'm developing on the iOS platform)
Any help is appreciated!
Cheers, Markus
OK, I kind of fixed this by starting the upload delayed, so that it begins after NSURLConnection has had time to set up its input stream.
It's not what I'd call a clean solution, though, since relying on -performSelector:withObject:afterDelay: seems a bit hacky.
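For illustration, the workaround boils down to something like this (the selector name is made up):
[connection start];
// Give NSURLConnection a run loop pass to open its end of the bound pair
// before we begin writing; the 0.1 s delay is an arbitrary guess.
[self performSelector:@selector(startWritingToOutputStream)
           withObject:nil
           afterDelay:0.1];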
So if anyone else has a solution to this, I'm still open for any suggestions.