Azure Bing Cognitive Services speech to text in JavaScript via REST fails

In JavaScript we use recorder.js to capture microphone input, downsample it to 16 kHz, encode it as a WAV file, and get a blob.
Next, we obtain the raw blob bytes via a FileReader onload() callback and then use an XMLHttpRequest to send() the raw bytes to Bing.
The XMLHttpRequest includes the headers:
'Ocp-Apim-Subscription-Key' : 'xxxxxx'
'Content-Type' : 'audio/wav; codec=audio/pcm; samplerate=16000'
A sample blob size is 62456 bytes.
Firefox network tracing shows 2 interactions. The first is
Request URL: https://speech.platform.bing.com/speech/recognition/interactive/cognitiveservices/v1?language=en-US&format=simple
Request Method: OPTIONS
and the second
Request URL:https://speech.platform.bing.com/speech/recognition/interactive/cognitiveservices/v1?language=en-US&format=simple
Request Method: POST
content-length: 94476
However, I keep getting the following reply
{"RecognitionStatus":"InitialSilenceTimeout","Offset":29000000,"Duration":0}
FWIW, any idea why the source blob size of 62456 would result in content-length: 94476?
The same raw blob bytes are processed by Amazon Lex properly.
Is there any JavaScript RESTful example?
Many thanks.
/--------------------------------------------------------------
After putting together the test case below I also tried the following without success.
console.log("Send to BING blob");
var self = this;
console.log(blob);
var msUrl = 'https://speech.platform.bing.com/speech/recognition/interactive/cognitiveservices/v1';
msUrl += '?language=en-US';
msUrl += '&format=simple';
console.log(msUrl);
var xhr = new XMLHttpRequest();
xhr.onload = function(evt) { console.log('onload', xhr, evt);};
xhr.open('POST', msUrl, true);
xhr.setRequestHeader('Accept', 'application/json;text/xml');
xhr.setRequestHeader('Ocp-Apim-Subscription-Key', 'xxx');
var bingContentType = 'audio/wav; codec=audio/pcm; samplerate=16000';
xhr.setRequestHeader('Content-Type', bingContentType);
xhr.send(blob);

The shorter version of the code for sending to Bing was fine. The problem was that the recorder worker's encodeWAV(samples) function did not take the downsampling to 16000 into account: it was incorrectly writing the captured sampling rate into the header. The lines to be tweaked are:
view.setUint32(24, downSampleRate, true);
view.setUint32(28, downSampleRate * 2, true); /*MONO*/
Apparently AWS Lex ignores the header values as it only expects 16kHz mono whereas the Bing service has to look at the header information to determine which of the audio formats supported is being sent.
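The header fix described above can be sketched as a standalone header writer. This is only a minimal sketch, not the actual encodeWAV from recorder.js: the function name buildWavHeader and its parameters are hypothetical, and it assumes 16-bit mono PCM. The point is that offsets 24 (sample rate) and 28 (byte rate) must describe the downsampled audio, not the rate the microphone was captured at.

```javascript
// Build the 44-byte header of a PCM WAV file (assumes 16-bit mono audio).
// buildWavHeader is a hypothetical helper, not part of recorder.js.
function buildWavHeader(numSamples, sampleRate) {
  const numChannels = 1;    // mono
  const bytesPerSample = 2; // 16-bit PCM
  const dataSize = numSamples * numChannels * bytesPerSample;
  const view = new DataView(new ArrayBuffer(44));
  const writeString = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  writeString(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);  // remaining file size
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);            // size of the fmt chunk
  view.setUint16(20, 1, true);             // audio format 1 = PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);    // must be the downsampled rate
  view.setUint32(28, sampleRate * numChannels * bytesPerSample, true); // byte rate
  view.setUint16(32, numChannels * bytesPerSample, true);              // block align
  view.setUint16(34, 8 * bytesPerSample, true);                        // bits per sample
  writeString(36, 'data');
  view.setUint32(40, dataSize, true);
  return view;
}

// One second of audio downsampled to 16 kHz.
const header = buildWavHeader(16000, 16000);
console.log(header.getUint32(24, true), header.getUint32(28, true));
```

A service that reads the header, as Bing appears to, would see 16000 Hz and 32000 bytes per second here; writing the capture rate (for example 44100) in those fields would misdescribe the same samples.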

I came across this problem today and, after spending half an hour on it, found the real cause of my issue. Let me go through the steps mentioned in this link.
Verified my Bing speech API is in running status.
Verified my key by running the below code in
$FetchTokenHeader = @{
  'Content-type'='application/x-www-form-urlencoded';
  'Content-Length'= '0';
  'Ocp-Apim-Subscription-Key' = ''
}
$OAuthToken = Invoke-RestMethod -Method POST -Uri https://api.cognitive.microsoft.com/sts/v1.0/issueToken -Headers $FetchTokenHeader
# show the token received
$OAuthToken
As mentioned in the last point in that link, InitialSilenceTimeout may be the result of an unformatted or invalid WAV file. So I downloaded a new WAV file from the internet and tested with it.
Bingo, that worked. Finally, I was able to get my speech in text format.

Related

MATLAB Not properly sending HTTP POST requests

I'm currently trying to build a MATLAB based system to interface with the API of my stock broker. I'm however running into quite some issues with sending the http post requests to the server.
I already have it working perfectly when testing with Postman, but for some reason the server keeps refusing the requests sent from MATLAB. I am now testing the actual requests through Pipedream, which lets me view the HTTP request.
Image of the good and bad requests:
The Left is an image of my postman requests which it perfectly processes as JSON strings. However my MATLAB requests are not processed properly and also are 10 characters longer than the actual string value.
The (trimmed) code to send the requests can be seen here.
% http request classes
import matlab.net.*
import matlab.net.http.*
% prepare payload
username = "usr";
password = "XXXXXXXXXXXXX";
login_payload = struct("username", username, "password", password);
request = RequestMessage('POST', [ bunchOfHeaders ], jsonencode(login_payload));
% Send request to login api
[login_resp, c, h] = request.send("https://trading.somebroker.com/login/secure/login");
Does anyone have any clue what could be happening here? If I set the Content-Length to the "correct" length (same as length(login_payload)) it says my length is wrong, even though my Postman requests do not seem to struggle with this.
Found the answer... MATLAB's HTTP stuff is absolutely braindead.
I had a closer look at the raw intercepted messages (Pipedream just sends your request back to you, and you can view it with string(login_resp)).
For some god darn reason MATLAB encases the JSON string in "s, which makes the receiver treat the whole body as a string. This is caused by setting the Content-Type to "application/json". Changing the Content-Type to "text/plain" stopped it from encasing the body in "s and completely solved my issue.
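The effect described here looks like a JSON string being JSON-encoded a second time. The following JavaScript sketch (not MATLAB, and only an illustration under that assumption) uses the placeholder credentials from the question to show what double encoding does to the body: the 8 inner quotes each gain a backslash and the whole string gains 2 enclosing quotes, which would account for a body that is 10 characters longer.

```javascript
// Placeholder payload taken from the question; not real credentials.
const payload = { username: 'usr', password: 'XXXXXXXXXXXXX' };

// Encoding once gives the JSON object the server expects.
const encodedOnce = JSON.stringify(payload);

// Encoding the already-encoded string again gives a JSON *string* literal:
// wrapped in quotes, with every inner quote escaped.
const encodedTwice = JSON.stringify(encodedOnce);

console.log(encodedOnce);
console.log(encodedTwice);
console.log(encodedTwice.length - encodedOnce.length); // 10
```

A receiver that parses the second form gets a string rather than an object, which matches the behaviour seen in the intercepted request.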

POST image to web service with Flutter

I am currently using Flutter and Dart to try to send an image to a web service, after which I wait for a JSON response. The below links show the Postman request that I am trying to mimic.
An image of Postman with the headers required
An image of Postman with the body
I think the issue I am having is either not setting the headers of my request correctly, or not encoding the image correctly.
Currently I am getting a HTTP 400 error.
I have tried following these suggested solutions on StackOverflow but have been getting the HTTP 400 error.
Any help would be much appreciated!
Try this. I'd suggest creating yourself a plain Dart project, if you haven't already. This way you can test things without the need of the phone emulator, etc.
import 'package:http/http.dart' as http;
import 'package:http_parser/http_parser.dart'; // provides MediaType

// url (String) and bytes (List<int>) are assumed to be defined elsewhere
main() async {
  http.MultipartRequest request =
      new http.MultipartRequest('POST', Uri.parse(url));
  request.headers['Prediction-Key'] = '3f4a......'; // todo - insert real value
  request.files.add(
    new http.MultipartFile.fromBytes(
      'image',
      bytes,
      filename: 'somefile', // optional
      contentType: new MediaType('image', 'jpeg'),
    ),
  );
  http.StreamedResponse r = await request.send();
  print(r.statusCode);
}
If the file is on disk, not in memory, then use the fromPath named constructor instead. Experiment with different media types. I've used image/jpeg, but you could try application/octet-stream.
As an aside, in your first screenshot you show a content type, but Postman ignores this as the overall content type is overridden by multipart-form. Uncheck that row in Postman to prove this.
There was another question recently on SO, where the server was incorrectly expecting headers to be case sensitive. In postman, try again with lowercase prediction-key to prove that the server doesn't mind lowercase headers (which is what Dart uses).
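HTTP header field names are case-insensitive, so a well-behaved server should find Prediction-Key however the client spells it. As a rough illustration of what such a lookup does (in JavaScript rather than Dart; getHeader is a hypothetical helper, not part of any library mentioned here):

```javascript
// Look up a header by name without regard to case.
// headers is a plain object of name/value pairs.
function getHeader(headers, name) {
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === wanted) {
      return value;
    }
  }
  return undefined;
}

// The lowercase spelling sent by the client is still found.
const received = { 'prediction-key': '3f4a......', 'content-type': 'multipart/form-data' };
console.log(getHeader(received, 'Prediction-Key'));
```

A server that instead compares names exactly would miss the lowercase header, which is the failure mode the Postman experiment above is meant to expose.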

MediaElementAudioSource outputs zeroes due to CORS access restrictions for

I am trying to get frequency data from an audio element whose src is a URL:
var aud = document.getElementById("audio-player");
var canvas, ctx, source, context, analyser, fbc_array;
function initMp3Player(){
try {
context = new (window.AudioContext || window.webkitAudioContext)();
} catch(e) {
throw new Error('The Web Audio API is unavailable');
}
analyser = context.createAnalyser(); // AnalyserNode method
analyser.smoothingTimeConstant = 0.6;
analyser.fftSize = 512;
canvas = document.getElementById('canvas_up');
ctx = canvas.getContext('2d');
source = context.createMediaElementSource(aud);
source.crossOrigin = 'anonymous';
source.connect(analyser);
analyser.connect(context.destination);
frameLooper();
}
function frameLooper(){
window.requestAnimationFrame(frameLooper);
fbc_array = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(fbc_array);
console.log(fbc_array);
var gradient = ctx.createLinearGradient(0,0,0,300);
gradient.addColorStop(1,'#000000');
gradient.addColorStop(0.65,'#000000');
gradient.addColorStop(0.55,'#FF0000');
gradient.addColorStop(0.25,'#FFCC00');
gradient.addColorStop(0,'#ffffff');
if(fbc_array != null){
ctx.clearRect(0, 0, canvas.width, canvas.height);
}
ctx.fillStyle = gradient; // Color of the bars
for (var i = 0; i < (fbc_array.length); i++ ){
var value = -(fbc_array[i]/4);
ctx.fillRect(i*5,canvas.height,4,value*2);
}
}
window.addEventListener("load", initMp3Player, false);
and HTML:
<audio id="audio-player"><source src="" type="audio/mpeg"></audio>
but I receive error:
MediaElementAudioSource outputs zeroes due to CORS access restrictions for ...
I have searched a lot but have not found a good, detailed answer. My English is not very good, so it would be great if answers included a demo. Thanks.
I just ran into this problem and was driven mad by the message: MediaElementAudioSource outputs zeroes due to CORS access restrictions for. But it's just a message; I can still hear the audio.
I googled this a lot, and I think this link will be helpful: http://www.codingforums.com/javascript-programming/342454-audio-api-js.html
The createMediaElementSource method should create an object that uses the MediaElementAudioSourceNode interface. Such objects are subject to Cross-Origin Resource Sharing (CORS) restrictions based on the latest draft of the Web Audio API spec. (Note that this restriction doesn't appear to be in the outdated W3C version of the spec.) According to the spec, silence should be played when CORS restrictions block access to a resource, which would explain the "outputs zeroes" message; presumably, zero is equivalent to no sound.
To lift the restriction, the owner of the page at
http://morebassradio.no-ip.org:8214/;stream/1 would need to configure
their server to output an Access-Control-Allow-Origin header with
either a list of domains (including yours) or the * value to lift it
for all domains. Given that this stream appears to already be
unrestricted, public-facing content, maybe you can convince the owners
to output that header. You can test whether the header is being sent
by pressing Ctrl+Shift+Q in Firefox to open the Network panel, loading
the stream through the address bar, and then inspecting the headers
associated with that HTTP request in the Network panel.
Note that they can't use a meta element here since the audio stream
is, obviously, not an HTML document; that technique only works for
HTML and XHTML documents.
(While you're messing with Firefox panels, you may want to make sure
Security errors and warnings are enabled (by clicking the Security
button or its arrow) in the Console panel (Ctrl+Shift+K). I'm not sure
if there's a corresponding CORS message in Firefox like in Chrome, but
there might be. I wasted a bunch of time wondering why a page wasn't
working one day while troubleshooting a similar technology, Content
Security Policy (CSP), only to find that I had the relevant Firefox
messages hidden.)
You shouldn't need to mess with the crossorigin property/attribute
unless you set crossorigin = "use-credentials" (JavaScript) or
crossorigin="use-credentials" (HTML) somewhere, but you probably
didn't do that because that part of the HTML spec isn't finalized yet,
and it would almost certainly cause your content to "break" after
doing so since credentials would be required at that point.
I'm not familiar with the Web Audio API, so I wasn't able to figure
out how to output a MediaElementAudioSourceNode and trigger an error
message for my own troubleshooting. If I use createMediaElementSource
with an HTMLMediaElement (HTMLAudioElement), the result doesn't seem
to be a MediaElementAudioSourceNode based on testing using the
instanceof operator even though the spec says it should be if I'm
reading it right.
In my situation, I get the following HTTP response headers:
HTTP/1.1 206 Partial Content
Date: Thu, 02 Jun 2016 06:50:43 GMT
Content-Type: audio/mpeg
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: X-Log, X-Reqid
Access-Control-Max-Age: 2592000
Content-Disposition: inline; filename="653ab5685893b4bf.mp3"
Content-Transfer-Encoding: binary
Last-Modified: Mon, 16 May 2016 02:00:05 GMT
Server: nginx
Cache-Control: public, max-age=31536000
ETag: "FpGQqtcf_s2Ce8W_4Mv6ZqSVkVTK"
X-Log: mc.g;IO:2/304
X-Reqid: 71cAAFQgUBiJMVQU
X-Qiniu-Zone: 0
Content-Range: bytes 0-1219327/1219328
Content-Length: 1219328
Age: 1
X-Via: 1.1 xinxiazai211:88 (Cdn Cache Server V2.0), 1.1 hn13:8 (Cdn Cache Server V2.0)
Connection: keep-alive
Note the "Access-Control-Allow-Origin: *" header. I think this is exactly what is needed, but I still get the message. Hope this helps you.
This is correct. You can't access media from a different domain in Web Audio without CORS enabled on the media server (and making the appropriate CORS request.) This is to prevent cross-domain information attacks.
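"Making the appropriate CORS request" means, in practice, setting crossOrigin on the audio element itself before its src is assigned; a server that sends Access-Control-Allow-Origin is not enough if the browser never made a CORS request. Setting crossOrigin on the source node, as the question's code does, has no effect because that node has no such property. A minimal sketch, assuming the media server does send the header; connectAnalyser and its parameters are hypothetical names:

```javascript
// Wire an <audio> element into an AnalyserNode with a CORS-enabled fetch.
// audioEl: an HTMLAudioElement, context: an AudioContext, url: the media URL.
function connectAnalyser(audioEl, context, url) {
  // Must be set on the element, and before src, so the media is
  // requested with CORS and its samples are readable by Web Audio.
  audioEl.crossOrigin = 'anonymous';
  audioEl.src = url;

  const source = context.createMediaElementSource(audioEl);
  const analyser = context.createAnalyser();
  source.connect(analyser);
  analyser.connect(context.destination);
  return analyser;
}
```

In a page this would be called as connectAnalyser(document.getElementById('audio-player'), new AudioContext(), someUrl), where someUrl is whatever stream is being analysed.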
I was running into this problem when I would develop my application by opening the index.html file in my browser. A server was required in order to use the audio files I needed.
I installed the Live Server extension on Visual Studio Code - one of many ways to solve this.

Uploading blob to azure - http header not in correct format

I am trying to upload videos to Azure Media Services via the REST API. I have reached the step of uploading the video, but I am getting an error. I use the following code to upload the video.
var client = new HttpClient();
client.DefaultRequestHeaders.TryAddWithoutValidation("Authorization", "Bearer " + token);
client.DefaultRequestHeaders.Add("x-ms-version", "2.8");
client.DefaultRequestHeaders.Add("x-ms-date", "2015-02-5");
client.DefaultRequestHeaders.Add("DataServiceVersion", "3.0");
client.DefaultRequestHeaders.Add("MaxDataServiceVersion", "3.0");
client.DefaultRequestHeaders.Add("x-ms-blob-type", "BlockBlob");
var formcontent = new MultipartFormDataContent();
FileStream stream = File.OpenRead(@"C:\AzureMediaUploadTest\MediaUploadTest\VideoFiles\tom.mp4");
byte[] fileBytes = new byte[stream.Length];
stream.Read(fileBytes, 0, fileBytes.Length);
stream.Close();
var streamcontent = new StreamContent(new MemoryStream(fileBytes));
formcontent.Add(streamcontent);
formcontent.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
result = await client.PutAsync(uploadurl, formcontent);
However, the result is a 400: "A http header is not in the correct format". I am not sure which header it refers to, or whether I am missing something.
Any help is appreciated.
UPDATE: I have marked the question as answered however I am now having issues with the authentication header - the new issue is asked here - Uploading blob to azure - Create authentication header
According to this documentation here:
All authenticated requests must include the Coordinated Universal Time
(UTC) timestamp for the request. You can specify the timestamp either
in the x-ms-date header, or in the standard HTTP/HTTPS Date header. If
both headers are specified on the request, the value of x-ms-date is
used as the request's time of creation. The storage services ensure
that a request is no older than 15 minutes by the time it reaches the
service. This guards against certain security attacks, including
replay attacks. When this check fails, the server returns response
code 403 (Forbidden).
Your 2015-02-5 is far from a valid UTC date format.
And according to this documentation here, and the sample PUT request, the Date header is represented as x-ms-date: Wed, 23 Oct 2013 22:41:55 GMT
There is not a single place in the Azure Blob REST API documentation where the date is given in yyyy-mm-dd format.
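The format in that sample is the standard HTTP date format. As an illustration in JavaScript (the question's code is C#, so this is only a sketch of the format; toMsDate is a hypothetical helper name), Date.prototype.toUTCString produces the same layout:

```javascript
// Format a Date the way the sample x-ms-date header is written.
function toMsDate(date) {
  return date.toUTCString();
}

// The timestamp from the documentation's sample request (months are 0-based).
const sample = toMsDate(new Date(Date.UTC(2013, 9, 23, 22, 41, 55)));
console.log(sample); // Wed, 23 Oct 2013 22:41:55 GMT

// For a real request the current time would be used instead.
console.log(toMsDate(new Date()));
```

Because the service rejects requests older than 15 minutes, the header has to be generated at send time rather than hard-coded.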

Manually generating x-gwt-rpc from Python

I want to access a GWT service from a Python script, so I want to generate a x-gwt-rpc request manually. Can't seem to find any info on the format of a GWT RPC call, since everybody does it from Java (so the call is generated by the framework). Where can I find some detailed documentation about this format?
I don't think that is a trivial task, but because GWT is open source, I would say the source code is pretty good documentation of how it works, if you know Java that is.
Gwt source
I stumbled on the same problem as you and I think I solved it rather easily.
Though I haven't figured out how to catch the response properly, I managed to get the response and successfully send the request. Here is what I did:
import requests
url = 'your url'
header = {'Accept': '*/*',
          'Accept-Encoding': 'gzip, deflate',
          # etc...
          }
cookie = {}  # cookies if needed
# This would be the request payload you can see in the browser's F12 tools;
# just copy and paste it as a string (UTF-8 chars)
data_g = 'this would be request payload u can see in F12 of browser '
t = requests.post(url, headers=header, data=data_g, cookies=cookie)
# print all attributes of the response object t
print(vars(t).keys())
print(t)
Also these are some good links you should check out:
https://github.com/GDSSecurity/GWT-Penetration-Testing-Toolset
https://docs.google.com/document/d/1eG0YocsYYbNAtivkLtcaiEE5IOF5u4LUol8-LL0TIKU/edit?hl=de&forcehl=1