"Okay Google, show pictures of [PARAMETER PHRASE]" - actions-on-google

I'm creating a setup of a Google Assistant/Home that should IDEALLY respond to the phrase "Okay Google, show pictures of [PARAMETER PHRASE]" by giving me the parameter phrase. It also HAS to be able to function like a regular Home ("Hey Google, how far away is the moon", "... tell me a joke", etc.) without my having to reimplement all of that functionality (unmatched phrases should fall back to the Google Home).
If I use the Home, I'm afraid I won't be able to avoid "... tell [MY APP NAME] to ...", but it has a great mic and speaker built in.
I am alternatively looking into a Raspberry Pi solution for the added layer of control, but the Home has a fantastic mic and speaker already. And importantly, I absolutely don't want to recreate the core Google Home features (is it possible to pass uncaught phrases off to the Google Home backend?).
I can mask some non-parameterized commands with the Assistant Shortcuts ("Okay Google, cat time!", "Hey Google, show me cats") in order to simplify the call phrase, but that does not solve the problem because shortcuts can't take parameters.
TLDR: I have a setup that needs to 1. work like a normal Google Home, but must 2. have additional functionality that I implement. I would like to 3. avoid having to state "... tell MY TARGET APP to [...]", but I need 4. parameters to be passed to my code, even if completely unparsed.
What are my options?

There are a bunch of possible approaches here, depending on the angle from which you want to tackle this. None of them is perfect at this time, but since everything is evolving, we'll see what develops.
It sounds like you're making an IoT picture frame or something like that? And you want to be able to talk to it? If so, you may want to look into the Assistant SDK, which lets you embed the Assistant into your IoT device. This would let you implement some voice commands yourself, but pass other things off to the Assistant to handle.
But this isn't a perfect solution, since it splits where the voice recognition works, where it is applied, and may not get you the hotword triggering.
It is also still in an early Developer Preview, so things might change, and it may evolve to be something closer to what you want... but it is difficult to tell right now.
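The split described above (handle a few commands yourself, pass everything else on) can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes you already have the recognized speech as a text transcript, and `show_pictures` and `fallback` are hypothetical callables standing in for your own code and for whatever hands the request to the Assistant. No real Assistant SDK calls are shown.

```python
import re
from typing import Callable, Optional

# Hypothetical pattern for the one command that is handled locally.
SHOW_PICTURES = re.compile(
    r"^\s*show (?:me )?pictures of (?P<subject>.+?)\s*$", re.IGNORECASE
)


def extract_subject(transcript: str) -> Optional[str]:
    """Return the free-form parameter phrase, or None if the command does not match."""
    match = SHOW_PICTURES.match(transcript)
    return match.group("subject") if match else None


def dispatch(
    transcript: str,
    show_pictures: Callable[[str], str],
    fallback: Callable[[str], str],
) -> str:
    """Handle the custom command locally; hand everything else to the fallback."""
    subject = extract_subject(transcript)
    if subject is not None:
        return show_pictures(subject)
    return fallback(transcript)
```

With this shape, "show pictures of cute cats" reaches your own handler with the unparsed phrase "cute cats", while "tell me a joke" goes to the fallback untouched.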
Depending on the IoT appliance you're working on, you may be able to leverage the built-in commands by building a Smart Home Action. However, at the moment, these have a fairly limited set of appliance types they can work with. It also sounds like you're trying to deal with media control - which isn't something that Smart Home directly works with, and is (hopefully) a future Action API (there were some hints about this at I/O, with Cast compatibility promised... but no details).
If you really want to build for the Home and Assistant, you'll need to work within the limitations of Actions on Google. And that does include some issues with the triggering name.
However... one good strategy is to pick a name that works well with the prefix phrases that are used. Since "Ask" is a legitimate prefix that Home handles, you could plan for a triggering name such as "awesome photo frame", and make the command "Ask awesome photo frame to show pictures of something".
More risky, since it isn't clearly documented, but it seems that some triggering names work without a prefix at all. So if your application is named "fly to the moon", it seems like you can say "Hey Google, fly to the moon" and the action will be triggered. If you can get a name like this registered, it will feel very natural for the user.
Finally, you can pick a reasonable name, but have your users set an alias or shortcut that makes sense to them. I'm not sure how this would fit in with solution (1), but being able to predefine shortcuts yourself would make it pretty powerful.

You can't use your app without first connecting to it with "Ok Google, talk to my app". If you skip that step, you are talking to the core Assistant, not your app.
Google doesn't allow talking to an app without invoking it first.

Related

Send transcribed, spoken text from smart speakers/displays to server after Conversational Action shutdown?

If I need to explain how insanely important and useful this functionality is, please let me know. However, I suspect this is obvious to everyone except Google.
Please, please tell me there is another way to accomplish this.
I need to do all speech parsing, processing, and responses on my own. And from a smart speaker/display. Conversational Actions allowed for this. As far as I have been able to tell, there is no alternative way to accomplish this. I'm shocked and severely disappointed. You're literally crippling your smart speakers and displays. I have one in every room right now and will be selling them after the shutdown unless something changes. I sure hope you reverse course on this.
We noted your ask here and will continue to monitor for other similar requests involving the Conversational Actions in our support channels. We do collect these requests, share them with the teams involved in the planning process, and try to get them into our feature development timeline.
Unfortunately there are no features readily available to replace the capability you mentioned above, but our teams are constantly working towards providing a better Google Smart Home Ecosystem. When we have any updates on these features, we will update our public documentation.

Create custom Google Smart Home Action

I have a Google Nest Hub Max and I want to increase its capabilities for a custom need:
1. "Hey Google, add xyz to my work planning"
2. Then I want to make an HTTP call to my private server
3. The private server returns a text
4. The text is displayed on the Google Nest Hub Max screen and spoken aloud.
How can that be achieved?
Originally I thought that this would not be difficult. I imagined a Node.js, Java, Python, or whatever framework where Google gives me the xyz text, I do my thing, and I return a simple text. And obviously, Google would handle the intent matching and only call my custom code when users say the precise phrase.
I've tried to search for how to do it online, but there is a lot of documentation spread everywhere. This post summarizes the situation quite well, but I've never found a tutorial or hello world example of such a thing.
Does anyone know how to do it?
For steps 2. and 3., I don't necessarily need to use a private server, if I can achieve what the private server does inside the Smart Home Action code, mostly some basic Python code.
First - you're on the right track! There are a few assumptions and terminology issues in your question that we need to clear up first, but your idea is fundamentally sound:
Google uses the term "Smart Home Actions" to describe controlling IoT/smart home devices such as lights, appliances, outlets, etc. Making something that you control through the Assistant, including Smart Speakers and Smart Hubs, means building a Conversational Action.
Most Conversational Actions need to be invoked by name. So you would start your action with something like "Talk to Work Planning" or "Ask Work Planning to add XYZ". There are a limited, but growing, number of built-in intents (BIIs) to cover other verticals - but don't count on them right now.
All Actions are public. They all share an invocation name namespace and anyone can access them. You can add Account Linking or other ways to ensure a limited audience, and there are ways to have more private alpha and beta testing, but there are issues with both. (Consider this an opportunity!)
You're correct that Google will help you with parsing the Intent and getting the parameter values (the XYZ in your example) and then handing this over to your server. However, the server must be at a publicly accessible address with an HTTPS endpoint. (Google refers to this as a webhook.)
There are a number of resources available, via Google, StackOverflow, and elsewhere:
On StackOverflow, look for the actions-on-google tag. Conversational actions are frequently built with either dialogflow-es or, more recently, actions-builder, each of which has its own tag. (And when you post your own questions, don't forget to provide code, errors, screenshots, and as much other information as you can to help us help you overcome the issues.)
Google's documentation about how to design and build conversational actions.
Google also has codelabs and sample code illustrating how to build conversational actions. The codelabs include the "hello world" examples you are probably looking for.
Most sample code uses JavaScript with Node.js, since Google provides a library for it. If you want to use Python, you'll need to work directly with the JSON format that the Assistant sends to your webhook and the format it expects back in response.
There are articles and videos written about it. For example, this series of blog posts discussing designing and developing actions outlines the steps and shows the code. And this YouTube playlist takes you through the process step-by-step (and there are other videos covering other details if you want more).
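For the Python route, a webhook could look roughly like the sketch below, using only the standard library. Treat the JSON field names (`intent.params.<name>.resolved` on the way in, `prompt.firstSimple` on the way out) as my assumption about the Actions Builder webhook format and check them against Google's documentation; Dialogflow ES uses a different shape. The parameter name `item`, the helper `add_to_planning`, and port 8080 are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def add_to_planning(item: str) -> str:
    """Placeholder for the real work (for example, a call to a private server)."""
    return f"Added {item} to your work planning."


def build_response(request: dict) -> dict:
    """Build a webhook reply from a parsed webhook request body.

    Assumed request shape: intent.params.item.resolved holds the parameter value.
    Assumed reply shape: prompt.firstSimple carries the speech and display text.
    """
    params = request.get("intent", {}).get("params", {})
    item = params.get("item", {}).get("resolved", "")
    text = add_to_planning(item) if item else "What should I add?"
    return {
        "session": {"id": request.get("session", {}).get("id", ""), "params": {}},
        "prompt": {"firstSimple": {"speech": text, "text": text}},
    }


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read the JSON body, build the reply, and send it back as JSON.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        payload = json.dumps(build_response(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Local testing only: plain HTTP on localhost, not the public HTTPS endpoint."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the handler locally. As noted above, the deployed endpoint has to be reachable at a public HTTPS address, so in practice this would sit behind a TLS-terminating proxy or hosting platform.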

How to move from one level to another in Google Actions Trivia game?

I've made a Google Action using the Trivia template. Try it by saying:
Ok Google, Talk to LCDP Trivia Challenge
In this, once you finish playing one level, the template asks whether you want to play the same level again or not. Instead, I want the user to try a new level. (Suppose a user is done playing the easy level; then instead of playing the easy level again, I would like to ask whether they want to play the Medium or Hard level.)
At this moment, the template only allows the user to play the same level again and again. But I think if the user has scored well in one quiz they would like to try the next level.
So, how can I suggest a different difficulty once the game is completed instead of offering Yes/No for playing the same level again? Can I customize this trivia template?
The resolution of the screenshot isn't very good, I can't read what's written there.
There is no way to further customize the template. If you want more flexibility, you will need to implement your own action.
All customizations are listed in the list of configuration parameters in the documentation.
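If you do implement your own action, the end-of-game logic asked about in the question is small. A minimal sketch, assuming three levels named easy, medium, and hard (the names, function names, and prompt wording are all hypothetical and not part of the Trivia template):

```python
from typing import List

# Hypothetical difficulty levels, in increasing order.
LEVELS = ["easy", "medium", "hard"]


def suggest_levels(completed: str) -> List[str]:
    """Return the levels to offer next, excluding the one just played."""
    if completed not in LEVELS:
        raise ValueError(f"unknown level: {completed}")
    return [level for level in LEVELS if level != completed]


def end_of_game_prompt(completed: str) -> str:
    """Build the question to ask instead of a Yes/No replay prompt."""
    options = " or ".join(suggest_levels(completed))
    return f"Nice work on the {completed} level. Would you like to try {options}?"
```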

Why is there no support for fans in the Smart Home API?

How come there is no fan in the device types? It seems kind of odd considering there is support for vacuums, washers, and dishwashers. One of the main things that smart home users want to control is their ceiling fan(s). I can tell Google Assistant that it is a light, and I can set the speed, but I can't say things like "Set fan to low". I have to say "dim the fan" or "set fan to 50%".
Also, will we ever be able to bind lights to individual Google Home devices? For example, if I'm in the living room it would be much nicer to say "Turn off lights" instead of "Turn off living room lights". If I say "turn off lights", it turns off all the lights in my whole house, which is pretty annoying. It would be nice if the Google team could use the term "all lights" for that, or let us group items like on Amazon's platform. This is a VERY big reason why a lot of home automation people choose the Amazon product over the Google product. However, I feel the Google Home is a far superior product except for those few annoying things.
Thanks for the feedback on the Fan device type. To be sure, we're continually adding more devices based on priority. However, as you mention we don't have a Trait to control the speed. Before we can encourage developers to use a fan, we would need to go through the process of adding a new trait. This process takes some time, so we've been prioritizing. I do like your clever workaround.
With respect to binding lights to rooms, that's also good feedback. One of the benefits of the HomeGraph is giving the Google Assistant a good understanding of the different devices one has in different rooms. You can already group devices into rooms for better control.
Since the question was asked, there is now a Fan device type with a FanSpeed trait.
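For anyone still using the light-as-fan workaround from the question, the translation between a brightness percentage and a named speed can live in your own device code. A minimal sketch, assuming three named speeds with hypothetical thresholds; this is plain Python and does not use the Smart Home API or the FanSpeed trait:

```python
# Hypothetical named speeds and the highest brightness percentage mapped to each.
SPEED_CEILINGS = {"off": 0, "low": 33, "medium": 66, "high": 100}


def percent_to_speed(percent: int) -> str:
    """Map a 0-100 brightness value to the first named speed whose ceiling covers it."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Dicts keep insertion order, so the ceilings are checked from lowest to highest.
    for name, ceiling in SPEED_CEILINGS.items():
        if percent <= ceiling:
            return name
    return "high"
```

With these thresholds, "set fan to 50%" would come out as the medium speed.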

Real time web page

I want to build a simple web-based app where users could, for example, press the spacebar and then do something further, like answer a question, while the other users at the same time only see that this question is no longer available to answer. When the user submits an answer, everyone sees it.
All right, here is an example. I have seen TV shows where four players have a button each. If one or two of them know the answer, they hit their button, a lamp turns on, and the first one is allowed to answer while the others keep their mouths shut. I want to build the same idea, but on the web.
But the problem is that I don't know where to start or what keywords I should search for on Google. I see that it might work with HTML5, maybe JavaScript, and so on.
I had the idea of using Ajax, but sending a request every second to get the latest actions seems rubbish. I also found a service called Pusher, but it limits the number of simultaneous users, which doesn't fit my needs.
I need just ideas. Thanks.
Before you read the rest, a disclaimer: I work for Realtime.co but I do believe I can help here so I'm not trying to "pitch a sale".
You can check out Realtime (www.realtime.co). It's basically a set of tools for developers to use real time technologies on their projects. It uses websockets but does fallback to whatever the user's browser supports (such as long polling, for example).
Behind Realtime you have a one-to-one/one-to-many/many-to-many messaging system that will transport your messages to and from your users.
There's also a plus, which is the fact that the Realtime framework is actually cross-platform. This means that you can even have your web users communicate with iPhone users, Android users, Windows Phone, desktop applications, server applications, etc.
You can learn about the JavaScript API here: http://docs.xrtml.org/getting_started/hello_message.html#javascript.
You only need to register at Realtime.co as a developer and start using the free license.
I really hope that helps.
Okay, I think I will go with node.js.
Writing out this whole post made me think about it the right way :)
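Whichever transport is used, the server-side rule from the question (the first player to buzz locks the question for everyone else) is small. A minimal sketch of that rule only, with the real-time delivery (WebSockets, long polling, and so on) left out; the class and method names are hypothetical:

```python
import threading
from typing import Optional


class BuzzerRound:
    """One question: the first player to buzz gets the exclusive right to answer."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.winner: Optional[str] = None
        self.answer: Optional[str] = None

    def buzz(self, player: str) -> bool:
        """Return True only for the first player to buzz in this round."""
        with self._lock:
            if self.winner is None:
                self.winner = player
                return True
            return False

    def submit(self, player: str, answer: str) -> bool:
        """Accept a single answer, and only from the player who buzzed first."""
        with self._lock:
            if player != self.winner or self.answer is not None:
                return False
            self.answer = answer
            return True

    def state(self) -> dict:
        """Snapshot that a server would push to every connected client."""
        with self._lock:
            return {
                "locked": self.winner is not None,
                "winner": self.winner,
                "answer": self.answer,
            }
```

The server would push `state()` to all clients after every successful `buzz` or `submit`, so everyone sees the question lock and then the answer without polling.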