I want to use IBM Watson Speech to Text from the browser with "Live Detection", i.e. I don't want to record the audio and send files; instead I want to use the WebSockets option for continuous speech to text.
I found this git repo
https://github.com/watson-developer-cloud/speech-javascript-sdk
But I could not find any actual example of how to use it. Can someone show me a code example of how to use it (or some other alternative)?
I believe WatsonSpeech.SpeechToText.recognizeMicrophone({token}) is probably what I need, and an example of how to use it would be enough.
Here's an example of it in action
You can find the source code for this on github here
I think the bit you're looking for is in /src/socket.js
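For the recognizeMicrophone({token}) call mentioned in the question, a minimal sketch might look like the following. It assumes the SDK's browser bundle exposes a global WatsonSpeech object and that the returned stream supports setEncoding, on('data'), on('error') and stop(), as the project's README describes. The wrapper function and the onText/onError callbacks are hypothetical names introduced for illustration, so check them against the SDK version you load.

```javascript
// Sketch only: `sdk` is expected to be the global `WatsonSpeech` object from
// the watson-speech browser bundle. `startLiveTranscription`, `onText` and
// `onError` are hypothetical names introduced for this example.
function startLiveTranscription(sdk, token, onText, onError) {
  // Opens the microphone and streams the audio to the service.
  const stream = sdk.SpeechToText.recognizeMicrophone({
    token: token // auth token that your own server obtained for the page
  });

  // Assumption: with a text encoding set, 'data' events carry transcript text.
  stream.setEncoding('utf8');
  stream.on('data', (text) => onText(String(text)));
  stream.on('error', onError);

  // Hand back a function the page can call to stop listening.
  return () => stream.stop();
}
```

In a page you would call startLiveTranscription(window.WatsonSpeech, token, ...) after fetching the token from your own backend, and wire the returned function to a "stop" button. How the token is issued depends on your server setup and is not shown here.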
I'm designing a web application that will enable users to upload docx documents and will then show the diff between revisions.
I don't know how to approach the problem. Is it a bitmap? How do I decode the doc to show changes? Does Microsoft have an API I can use to simply send two Word docs and get back the changes between them?
I also have the same question about Google Docs. I think with Google Drive it's simpler. Saw this API
If anyone has done something similar or has an example to some similar app, I would be grateful.
Have you looked at Word's Compare tool? See under Review|Compare. Word's API also supports automating the Compare tool (e.g. via VBA).
To use Word's API, you'll need to automate Word. For the details of the method, see: https://msdn.microsoft.com/en-us/library/office/hh128820(v=office.14).aspx
See also: https://code.msdn.microsoft.com/windowsdesktop/Compare-Two-Word-Documents-043b2e1d
I know you can make custom Google Assistant triggers that will invoke IFTTT. But I want to make a custom trigger that will do something but /also/ keep the default Google Assistant behavior. Is there a way to do this?
Description of my actual goal: I speak German as much as possible at home with my daughter. But there are times where I don't know a word, so I can say "OK Google, what is $word in German?" and it will speak it to me. This is very useful.
Then I manually add that word to my vocabulary list to study it.
I would like to write my own Python/Node microservice that will receive the word and generate flashcards (do a lookup on Linguee for sample sentences, for example) in my study program automatically.
But I would also like to keep the Google Assistant behavior that reads the translation back to me on my phone.
So is there a way to accomplish this? Basically instead of having a trigger invoke Google Assistant, I'd like it to do that and also do a second behavior (issue a POST request to a custom URL).
Thank you.
The obvious way, described in "Working with the HTML preview", is to use a link. But how can I send data from the preview back to the running extension without clicking links? I want to build a seamless integration between the editor and the previewer.
A few possible ways:
Open a local communications channel between your extension and the page. For example, the extension could set up a simple server that the webview hits. This is best if you have lots of data to send or need to support more complex scenarios.
Inside the webview, you can instead post a message simulating a click with a command. Here's what VSCode's built-in markdown extension does for example:
window.parent.postMessage({
  command: 'did-click-link',
  data: `command:_markdown.revealLine?${encodeURIComponent(JSON.stringify(args))}`
}, 'file://');
The second approach is pretty hacky but it works well to just trigger events every so often.
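To make the shape of that message clearer, here is a small sketch of how the data string in the snippet above is put together. The helper names buildCommandUri and buildClickMessage are made up for this example, and the [12] argument is just a placeholder, not the documented argument list of _markdown.revealLine.

```javascript
// Build a `command:` URI in the same form as the snippet above:
// the arguments are JSON-encoded and then URI-encoded after the `?`.
function buildCommandUri(command, args) {
  return `command:${command}?${encodeURIComponent(JSON.stringify(args))}`;
}

// Build the message object that the webview would pass to postMessage.
function buildClickMessage(command, args) {
  return { command: 'did-click-link', data: buildCommandUri(command, args) };
}

// Placeholder arguments, for illustration only.
const message = buildClickMessage('_markdown.revealLine', [12]);
console.log(message.data);
```

Inside the webview you would then pass such an object to window.parent.postMessage in the same way as the snippet above.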
We are also considering a better API for this. Please let us know if you have any thoughts or suggestions for this
I am going to integrate broadcast channels with IPTV channels into one menu on my TVs. The problem is that switching between different sources is a pain. So basically I need to create a menu on the TV to select the channel I want to watch and then switch the TV to that very channel. I know how to create the menu.
The other part of the solution is to push the IPTV channel from the mediaserver to the TV screen. This is the hard part. I ended up installing gupnp and playing with it. It works and I'll be able to write the application.
Maybe you have an idea for a better way to push the content via DLNA? Is there a command line utility or a mediaserver that can be controlled from the command line? That would be the ideal option.
The very basic question is how would you programmatically play a resource from a mediaserver on a renderer?
Thanks.
This shows how you can instruct your renderer to play media from a mediaServer, using curl from the command line. You can easily make a similar http request from within a program.
http://www.accella.net/knowledgebase/sending-a-video-content-to-a-dlnaupnp-softwaredevice-using-curl/
and this too:
http://djoepnpoep.blogspot.co.za/2015/07/command-line-dlnaupnp-av-with-curl.html
The very basic question is how would you programmatically play a resource from a mediaserver on a renderer
The very basic answer is, you can't. A UPnP MediaServer in itself is not designed to start playing content on a renderer, in exactly the same way that an HTTP server can't start displaying HTML in a particular browser window without the browser making at least one request first. So you have two options:
your implementation of "the menu in TV" (whatever that is) can perform UPnP discovery and browse the mediaserver for the wanted content (perhaps at a hardcoded URL to simplify).
introduce a UPnP Control Point into your network, which knows how to discover and browse the mediaserver and push the content to a selected Renderer. I don't see any reason why that shouldn't be possible from the command line; gUPnP seems to provide the source for a sufficiently powerful Control Point which you can tweak and tailor to your needs.
Mind that both options effectively result in your TV making a request on MediaServer and actively downloading the stream data. There is no hidden wizardry in the second option, "push" practically means that the Control Point tells the renderer "here is the URL which you start downloading".
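To illustrate what "telling the renderer the URL" looks like on the wire, here is a rough sketch of the two SOAP requests a Control Point would send to the renderer's AVTransport service: SetAVTransportURI followed by Play. The service type and action names come from the UPnP AVTransport:1 service; the media URL is a made-up placeholder, and the helper names are hypothetical. The control URL you POST to has to be read from the renderer's device description and is not shown.

```javascript
const AVTRANSPORT = 'urn:schemas-upnp-org:service:AVTransport:1';

// Escape characters that are special in XML text content.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the headers and body for one AVTransport action.
// Argument order follows the insertion order of `args`.
function buildSoapRequest(action, args) {
  const argXml = Object.entries(args)
    .map(([name, value]) => `<${name}>${escapeXml(value)}</${name}>`)
    .join('');
  const body =
    '<?xml version="1.0" encoding="utf-8"?>' +
    '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" ' +
    's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">' +
    `<s:Body><u:${action} xmlns:u="${AVTRANSPORT}">${argXml}</u:${action}></s:Body>` +
    '</s:Envelope>';
  return {
    headers: {
      'Content-Type': 'text/xml; charset="utf-8"',
      'SOAPACTION': `"${AVTRANSPORT}#${action}"`
    },
    body
  };
}

// Step 1: hand the renderer the stream URL (placeholder address).
const setUri = buildSoapRequest('SetAVTransportURI', {
  InstanceID: 0,
  CurrentURI: 'http://192.168.1.10:8200/MediaItems/22.ts',
  CurrentURIMetaData: ''
});

// Step 2: ask the renderer to start playing it.
const play = buildSoapRequest('Play', { InstanceID: 0, Speed: 1 });
```

Each request is sent as an HTTP POST with those headers and that body to the renderer's AVTransport control URL, whether from curl or from your own program. Some renderers insist on DIDL-Lite metadata in CurrentURIMetaData, so an empty string may not work everywhere.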
I am building an app that requires a business card reader. I googled a lot, but BBY is the only solution I was able to find. Can anybody point me to an open-source library that can be tweaked or used directly as a business card reader?
Please enlighten me on this.
You can look into the open-source Tesseract engine. It's pretty good for image processing: it will extract the text from the image, but you will then have to process that text yourself to extract the name, phone numbers and other details.
This post explains how to use it on iOS: http://tinsuke.wordpress.com/2011/11/01/how-to-compile-and-use-tesseract-3-01-on-ios-sdk-5/
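As a rough idea of that post-processing step, here is a sketch that pulls an email address, a phone number and a likely name out of the raw OCR text. It is not tied to Tesseract in any way; the regular expressions and the "first letters-only line is the name" rule are simplistic assumptions, and real cards will need more careful heuristics.

```javascript
// Very rough heuristics for picking fields out of OCR'd business card text.
function parseCardText(text) {
  const emailRe = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;
  const phoneRe = /\+?\d[\d\s().-]{6,}\d/;   // at least 8 digits/separators
  const nameRe = /^[A-Za-z][A-Za-z .'-]+$/;  // letters and name punctuation only

  const result = { name: null, email: null, phone: null };
  const lines = text.split(/\r?\n/).map((l) => l.trim()).filter(Boolean);

  for (const line of lines) {
    const email = line.match(emailRe);
    const phone = line.match(phoneRe);
    if (email) {
      result.email = result.email || email[0];
    } else if (phone) {
      result.phone = result.phone || phone[0];
    } else if (!result.name && nameRe.test(line)) {
      // Assumption: the first letters-only line is the person's name.
      result.name = line;
    }
  }
  return result;
}

// Made-up sample text standing in for OCR output.
const sample = [
  'Jane Doe',
  'Senior Engineer',
  'Tel: +1 (555) 010-9999',
  'jane.doe@example.com'
].join('\n');
console.log(parseCardText(sample));
```

The same function can be fed whatever string the OCR step returns, one card at a time.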
We started an open source project to build a JavaScript library (based on the OCR engine tesseract.js for the OCR part) that extracts the relevant data from a business card based on heuristic criteria.
The library (BCR Library, available on GitHub) is usable in any HTML project (including mobile Cordova, PhoneGap or Ionic projects) just by including it via a script tag.
The library doesn't make any external API calls and works fully offline.
I think you should give the Covve Business Card Scan API a try. The quality of the results is great in various languages. You can check a comparison analysis of similar services here.
[Disclosure] I'm part of the team developing the service.