Is there a maximum length or amount of time that a user can respond for (in response to an Actions on Google intent)? - actions-on-google

I see there is a limit that a user has to respond by before the conversation ends:
"Your response must occur within about 5 seconds or the Assistant assumes your fulfillment has timed out and ends your conversation."
(See: "How long does it take for the app to time out and exit the conversation")
But is there a maximum that a user can respond for (type voice)? We want to allow for longer responses (and then access the response text).
Ideally, we would like an unlimited response time and the ability to access the raw input (type voice) once it is received.
It would be excellent if we could access the audio from the user's response, but as I understand that is not possible.

As I explained here, you can't have access to the raw audio recordings of interactions with your actions as of now. You only get access to the transcription of the user's utterance.
The quote you've supplied:
"Your response must occur within about 5 seconds or the Assistant assumes your fulfillment has timed out and ends your conversation."
isn't about the user's response. Your webhook fulfillment must complete within about 5 seconds; otherwise the Assistant treats the fulfillment as timed out and ends the conversation.
As for the length of the user's reply: if the user doesn't say anything, the Assistant triggers a no-input prompt (on smart speakers) or just closes the mic (on smartphones) after around 8 seconds. (I don't know of an official resource that confirms the 8 seconds; this is just my experience.)
But once the user starts speaking, it keeps listening until the user stops talking, so in theory the user can give a long reply. However, I wouldn't recommend relying on this, since it would be a terrible user experience from a conversation design standpoint.
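To make the 5-second fulfillment limit concrete, here is a minimal sketch of a webhook-side guard that falls back to a canned reply when the real work is too slow. The helper name withDeadline and the 4500 ms default are assumptions for illustration, not part of any Actions on Google library.

```javascript
// Hypothetical helper: resolves with `fallback` if `work` has not settled
// within `ms` milliseconds, or if `work` rejects.
function withDeadline(work, ms = 4500, fallback = 'Sorry, that took too long.') {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Turn a rejection into the fallback so the webhook always has a reply.
  const guarded = Promise.resolve(work).catch(() => fallback);
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([guarded, timeout]).finally(() => clearTimeout(timer));
}
```

In a webhook handler you would await withDeadline(slowLookup(), 4500, someFallbackText) and send whatever comes back; slowLookup is a placeholder for your own code.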

Related

Watson Assistant API v2 & session expiration

We're building an app that uses the API v2 to interact with Watson Assistant. We're aware that the "state" of the conversation (among others: the position in the dialog tree) is now kept on the service side using the session_id key.
The problem: the session expires (5 to 60 minutes depending on the pricing plan).
Is there a way to either resurrect an expired session or save the conversation state so that it can be restored?
We've tried to save and restore the global & skills contexts but they don't hold the conversation state.
Thanks for your help.
The current inactivity timeout period is plan-specific:
- Lite and Standard: 5 minutes
- Plus and Premium: 1 hour
In the coming days, you will be able to change that value for Plus and Premium up to 24 hours. For Lite and Standard, you will only be able to decrease it to a lower value if you want to close sessions faster.
You can always save context at the application level but currently there is not a way under the V2 API to save where the user is in the dialog so that you can pass it back after exceeding the allowed session inactivity timeout period.
Complementing what @oscar.ny mentioned: the timeout is plan-specific, and you may be able to change it under Settings -> Timeout limit. Change the value and close; it saves automatically.
Something I've done in the past is sending an empty message when the 5-minute inactivity event fired. That event called a function that hit the API's message method to send "Are you still here? I was talking about xyz", where xyz was the latest message sent to the user, in order to maintain the session.
Ref:
change Timeout limit
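The keep-alive idea above could be sketched roughly as follows. This is only an illustration: sendKeepAlive is a hypothetical callback that would wrap your own call to the assistant's message method, and the maxPings cap is an added assumption so that an abandoned session is eventually allowed to expire.

```javascript
// Minimal inactivity timer. All names here are illustrative.
function createKeepAlive({ sendKeepAlive, idleMs, maxPings = 3 }) {
  let timer = null;
  let pings = 0;

  function schedule() {
    clearTimeout(timer);
    if (pings >= maxPings) return; // give up and let the session expire
    timer = setTimeout(() => {
      pings += 1;
      sendKeepAlive(); // e.g. "Are you still here? I was talking about xyz"
      schedule();
    }, idleMs);
  }

  return {
    // Call whenever the user sends a real message.
    touch() {
      pings = 0;
      schedule();
    },
    // Call when the conversation is over.
    stop() {
      clearTimeout(timer);
      timer = null;
    },
  };
}
```

You would pick idleMs a little below your plan's timeout and call touch() after every real user message.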

actions on google - how to handle long running operations

I have an AoG action that logs in to an external backend and, once logged in, can control a specific appliance via that backend's API. The action basically controls a home alarm via commands like "arm section XY", "disarm section garage", etc. Before it can control the alarm it has to log in, and this takes considerable time (approx. 20-30 seconds). This is much longer than AoG allows, resulting in timeouts.
I initiate the login as an asynchronous operation in the actions.intent.MAIN handler (i.e. without waiting for the result of the login within the handler) and just ask the user to give the command (arm/disarm garage, etc.) in a couple of seconds. I have also implemented a push notification, which works fine. The problem with the push notification is that it just pops up on the mobile phone without any sound, and the user has to open the notifications and tap it. Only then does it trigger the intent and perform the requested action.
This is not a really good user experience (typically I would like to use my action in the car when coming home, and be able to disarm the home alarm without needing to touch the phone, tap the notification, etc.).
Any idea how to implement this in a more proper way? What I would really appreciate is if Google Assistant could re-initiate the conversation and tell me something like: 'Hey, I am now logged in to the alarm service provider, what do you want me to do?'.
I would be grateful for any advice on dealing with a similar problem.
I am using the Actions SDK for Node.js to build my action.
You've already looked at the ways that the Assistant can initiate (or re-initiate) a conversation. Actions are really designed for something that is conversational, and a 30 second pause in the conversation would be awkward.
One other option you have is to use a Media Response as part of your reply to the user logging in (or as part of your welcome intent? Not entirely clear, but the approach would be the same). This would let you play some "hold music" for several seconds. At the end of the music playing, an actions.intent.MEDIA_STATUS would be sent to your Action, which you can use to make sure the login has completed and, if so, respond to the user appropriately.
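As a rough sketch of the decision the MEDIA_STATUS handler has to make, the function below picks the next step once a hold-music clip has finished. The state names, the returned object shape, and the maxClips cap are all assumptions for illustration; mapping the result onto real Media Response objects is left to your Actions SDK code.

```javascript
// loginState is assumed to be tracked by your app: 'pending', 'ready' or 'failed'.
// clipsPlayed counts how many hold-music clips have already finished.
function nextStepAfterHoldMusic(loginState, clipsPlayed, maxClips = 3) {
  if (loginState === 'ready') {
    return { type: 'ask', speech: 'You are logged in. What would you like to do?' };
  }
  if (loginState === 'failed') {
    return { type: 'close', speech: 'Sorry, I could not log in to the alarm service.' };
  }
  if (clipsPlayed >= maxClips) {
    // Stop looping so the user is not left on hold indefinitely.
    return { type: 'close', speech: 'Login is taking longer than expected. Please try again shortly.' };
  }
  // Still pending: play another clip and check again when it ends.
  return { type: 'media', speech: 'Still logging in.' };
}
```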
The only way for AoG to "take the initiative of starting a conversation" is through push notifications. There is no way for the Assistant to strike up a conversation after a period of time or when an event occurs.
Perhaps another approach would be to send a push notification only if your action fails to execute the long sequence of events; the notification could then invoke an intent to try again. The assumption would be that everything is fine unless the user is told otherwise.
You could also tell the user that it will take a couple of seconds to complete the action once it's initiated, and implement follow-up intents that handle the user asking "Is it done?" or "How's it going?". That makes checking on progress part of the flow, but with the assumption that it should be successful.
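A follow-up intent such as "Is it done?" needs some record of how the background work is going. A minimal sketch, assuming the task is a Promise and the state can live in memory (on a serverless host you would likely need to persist it instead); trackTask and progressReply are hypothetical names:

```javascript
// Record the outcome of a long-running promise without awaiting it.
function trackTask(promise) {
  const task = { state: 'running', error: null };
  promise.then(
    () => { task.state = 'done'; },
    (err) => { task.state = 'failed'; task.error = err; }
  );
  return task;
}

// Turn the recorded state into something the agent can say.
function progressReply(task) {
  switch (task.state) {
    case 'done':
      return 'Yes, it is done.';
    case 'failed':
      return 'Sorry, something went wrong. Do you want me to try again?';
    default:
      return 'Not yet, I am still working on it.';
  }
}
```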
You can easily offload the long-running background process by implementing a task queue in Firebase, where your intent handler creates a child similar to this:
firebase.database().ref("tasks").push({action: "disarm_garage"});
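And then you create a Cloud Functions trigger to handle it:
const functions = require('firebase-functions');

// The export name processTask is illustrative; any name works.
exports.processTask = functions.database.ref('tasks/{id}').onCreate((snap) => {
  const action = snap.val().action;
  switch (action) {
    case 'disarm_garage':
      // ...
      break;
  }
  // Remove the task after processing
  return snap.ref.remove();
});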
That would ensure that you have enough time to complete the task in the background without blocking the conversation.

DialogFlow google Home Assistant keeps listening, does not pause

I have created a chatbot which responds to requests. This is the flow currently happening:
I say "Talk to My Test App".
My app starts and says the welcome message.
I request something and my intent is fulfilled.
After this, the Google Home does not pause but keeps on listening.
If I stop it, then I have to say "Talk to My Test App" again, which I also don't want.
I want the Google Home to sleep after fulfilment and wake up in the same app when I say "Ok Google".
More details:
In my use case, the user will talk to the app frequently, for example every 30 seconds to 2 minutes. I don't want them to have to say "Hey Google" to wake it up, then "Talk to My App", and then the command every time. I also don't want them to have to say a long sentence after waking up the Google Home, like "Talk to My App to Do this". So I thought it would be better if my app didn't end the conversation but paused it instead, so that the user can just wake up the Google Home and pass the command directly.
Currently the Google Home does not pause after the first command; it keeps listening to surrounding sounds and responds to the noise, so the user has to stop it.
I needed a pause so I could narrate my thoughts during a client demo without exiting my conversation, so I added a very long break at the end of each text response in Dialogflow. I can then interrupt the pause with "Okay Google" and stay within my conversation.
<speak>This is a sentence with a <break time="600s"/> pause</speak>
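If the text is generated in a webhook, the SSML above could be built with a small helper like this sketch. The function names are made up for illustration, and whether the platform honours a break of a given length is not something this thread confirms.

```javascript
// Escape characters that are special in XML so arbitrary text is safe in SSML.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Wrap text in <speak> and append a trailing pause of pauseSeconds seconds.
function speakWithPause(text, pauseSeconds) {
  if (!Number.isFinite(pauseSeconds) || pauseSeconds < 0) {
    throw new RangeError('pauseSeconds must be a non-negative number');
  }
  return `<speak>${escapeXml(text)} <break time="${pauseSeconds}s"/></speak>`;
}
```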
As the name suggests, a conversational VUI assumes that you're going to have a conversation with the agent, not something with long pauses in between. The assumption is that if there is no reply, the user isn't actively engaging in the conversation. There is no direct feature that does what you want, although there are a couple of interesting workarounds that might work for you.
First, as you suggest, deep linking with a phrase such as "Hey Google, ask My App to Do This" is certainly a possible approach, and one that you should support. In production, and as a user uses it more, the introduction and hand-off from Google gets shorter and shorter. Even the launch phrase can be shortened if the user creates a Shortcut, but this is the user's choice, not yours.
Although there is no way to "pause" a conversation, there is a way to have the reply include streaming audio that the user can interrupt. Using a Media Response begins playing that media.
When the URL pointed to by the media ends, your Action gets a callback (via an Event in Dialogflow or an Intent with actions.json) indicating the media has ended, and you can do something like play another Media Response, and continue to do this as long as appropriate.
At any time, your user can interrupt the audio by saying "Hey Google" and some command. This will trigger any matching Intent, as if they had said a command as usual.
There are some caveats with this scheme - some commands don't actually work (anything with "next" in it, for example, since that sounds more like a media command that isn't implemented), and you need an audio of reasonable length that won't be distracting in your environment, but this might be a reasonable solution to your scenario.
If you want to exit the conversation you may do the following:
Go to the dialogflow console.
Create an intent (say, goodbye).
In the Event section, enter 'actions_intent_CANCEL' in the add event field.
Enter your Training Phrases. (Say exit, stop, pause etc)
Enter your greetings (say goodbye) in text response. You may skip this if you don't want to reply.
Enable 'Set this intent as end of conversation'.
Save

How to make the agent say something before leaving the mic open?

Google rejected my app and gave the following feedback:
During testing we noticed that when the Action is not able to get data
it opens the mic and leaves it open without prompting. Make sure that
your agent always says something before leaving the mic open for the
user, so that the user knows what they can say. This is particularly
important when your agent is first triggered.
I've built my app using API AI tool and webhooks (connects to a web service running on Heroku). Heroku sleeps after 30 minutes of inactivity. I think this error occurs when Heroku takes a long time to respond. Any idea how can I make the agent say something before leaving the mic open?
I am not sure why I got this feedback because in case the web service request times out, Google Home speaks the following response.
It might answer with the text response you added on API.ai, but at the bottom of your intent's page (under the text response), click "Actions on Google" and then check "End conversation".
When you use assistant.ask in your fulfillment logic, you should be asking the user a question. It should be clear to the user what they are expected to answer.
If your fulfillment instance goes to sleep or doesn't respond quickly, then typically the assistant will play a message that indicates your action isn't responding.
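One way to follow the reviewer's feedback is to decide explicitly, for every reply, whether the mic should stay open. The sketch below assumes a plain reply object with speech and expectUserResponse fields and a data.summary field; these shapes are illustrative and would need mapping onto whatever your webhook actually returns.

```javascript
// Build a reply that either asks a clear question (mic stays open)
// or says what went wrong and ends the conversation (mic closes).
function buildReply(data) {
  if (!data) {
    return {
      speech: 'Sorry, I could not reach the service right now. Please try again later.',
      expectUserResponse: false,
    };
  }
  return {
    speech: `${data.summary} What would you like to do next?`,
    expectUserResponse: true,
  };
}
```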

How do I constantly check if something on a server (Parse) has changed without thousands of requests?

I am creating an application which has a follow mechanism where the followed user has to accept the follow request (similar to private accounts on Instagram).
I then want the following user to find out when the other user has accepted, without checking a million times (every time the following user opens the screen, if I did the query in viewDidLoad). The problem with checking that often is that there will be a lot of requests, which will be expensive for me because I have to pay for the requests to Parse, so I want to minimise these queries.
Currently, the best thing I can think of is to check once a day at midnight for example but this doesn't seem very seamless.
Is there a better way of doing this?
For starters, consider how stale you are willing to allow the app's view of the world to be, and cache the response for that long. If a user views that screen every 30 seconds, you might only want to actually check with the server 5 minutes after the last successful response (or the last response which had 0 follow requests).
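The caching idea can be sketched independently of Parse. In this minimal example, fetchFn stands in for whatever query your app performs and now is injectable so the logic can be tested; both names are assumptions, and the example uses JavaScript only to match the other code on this page.

```javascript
// Wrap an async fetch so it hits the server at most once per maxAgeMs.
function createCachedFetcher(fetchFn, maxAgeMs, now = Date.now) {
  let cached = null; // { value, fetchedAt }
  return async function get() {
    if (cached && now() - cached.fetchedAt < maxAgeMs) {
      return cached.value; // still fresh enough, no request made
    }
    const value = await fetchFn();
    cached = { value, fetchedAt: now() };
    return value;
  };
}
```

With maxAgeMs set to 5 minutes, opening the screen every 30 seconds would trigger roughly one request per 5 minutes instead of ten.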
You might consider switching from this sort of "pull" polling, where the client decides when to ask the server if anything has changed, to a "push" model, where the server informs the client when a change occurs. For example, you can send a silent background push notification to a user's devices when they have a follow request; the app can then respond by performing your existing query.
You might still want polling or user-triggered requests (like a "pull to refresh" gesture) as a fallback for missed notifications or devices with notifications disabled, but you should be able to drastically reduce request volume.