How can I create a multi-part response with DialogFlow? - actions-on-google

So far, I have a conversational app that works with webhooks to my backend PHP server, which sends JSON responses back to the Dialogflow API. It's working rather well.
The next step in development would be to have the Google Assistant respond to the user with multi-part responses. I've seen the "Lucky Trivia" game do something similar (screenshot attached).
It is not clear to me how I can have the Assistant App generate multiple bubbles.
Some solutions I've tried:
Using rich responses with multiple parts
Generating SSML responses and using several <speak> or <p> tags
Using message objects
Using a followupEvent object
None of these have gotten me to the point I'd like.
Rich responses work for a maximum of two separate bubbles, no more.
SSML seems promising and is a great way to add prosody and sound bites, but everything I've tried will not deliver multi-part speech bubbles.
I can't find a syntax for message objects that works with "platform":"google". Indeed, specific support for platform=google isn't listed on that page, but I have seen it in some request/response JSON objects.
The followupEvent response seemed most promising, but as far as I can tell, the intent that triggers from the named event completely replaces the current response, it doesn't just add onto it.
So, my question is: What's the best strategy for getting similar multi-part messages on Google Assistant using DialogFlow?
Ideally, I'd like to fire new requests to my webhook sequentially, but building one large response containing all the parts is a viable option if necessary.
How does Lucky Trivia do this?

I suspect that Lucky Trivia is able to get around the rules because it was made by Google and doesn't use the same library that we do. But let's look at each of your attempts and then some possible other approaches.
What doesn't work
As you note, RichResponses are limited to only two SimpleResponses which translate to two text bubbles. You could make larger responses, but there is still a suggested limit of 300 characters per bubble, and a hard limit of 640 characters.
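For reference, a rough sketch of a webhook payload with the maximum of two simple responses, using the "data.google.richResponse" structure mentioned in the question (field names may differ slightly depending on whether you're on Dialogflow v1 or v2):

$response = [
    'speech' => 'First bubble. Second bubble.',
    'data' => [
        'google' => [
            'expectUserResponse' => true,
            'richResponse' => [
                'items' => [
                    // bubble 1
                    ['simpleResponse' => [
                        'textToSpeech' => 'First bubble.',
                        'displayText'  => 'First bubble.',
                    ]],
                    // bubble 2 - the most a single rich response will render
                    ['simpleResponse' => [
                        'textToSpeech' => 'Second bubble.',
                        'displayText'  => 'Second bubble.',
                    ]],
                ],
            ],
        ],
    ],
];

header('Content-Type: application/json');
echo json_encode($response);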
The SSML responses, as the name suggests, are about what you hear - not so much what you see.
Message objects are turned into native platform objects anyway, so unless there were some way to support this on the Google platform (and there isn't), you can't do it.
Follow-up events are specifically documented to ignore the text that is returned from the original event. Their entire point is to delegate processing to the other intent.
What might work: Cards
This doesn't look exactly the same as what you want, but one way to get additional text included that is separate from the two bubbles is through a Basic card as one of the rich response items. You can even do some basic formatting in the card and include graphics.
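For example, a sketch of the rich response items with a basic card added (the card's title, formattedText, and image below are purely illustrative):

$items = [
    ['simpleResponse' => [
        'textToSpeech' => 'Here is some more detail.',
    ]],
    ['basicCard' => [
        'title'         => 'More details',
        'formattedText' => 'Additional text that sits outside the two speech bubbles.',
        'image'         => [
            'url'               => 'https://example.com/picture.png',
            'accessibilityText' => 'An illustrative picture',
        ],
    ]],
];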
More complicated: Media Response
Including a Media response object with the rich response items is a way you can send multiple responses to the user without having to wait for them to say something. In this way, you can get multiple text bubbles in a row without the user having to reply.
The trick is that you'll send the two simple responses in the rich response, and then include a Media response with a very short, and possibly silent, audio file.
After the audio file finishes playing, you'll get an intent indicating that the media has finished. You can then send another reply with one or two more simple responses. If necessary, you can repeat this.
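A sketch of what one such turn might look like, again assuming the "richResponse" structure shown earlier (the audio URL is a placeholder for your short or silent clip, and Actions on Google generally expects suggestion chips alongside a media response that keeps the conversation open):

$richResponse = [
    'items' => [
        ['simpleResponse' => ['textToSpeech' => 'Bubble one.']],
        ['simpleResponse' => ['textToSpeech' => 'Bubble two.']],
        // short, possibly silent, audio clip; when it finishes playing, your
        // webhook is called again and you reply with the next one or two bubbles
        ['mediaResponse' => [
            'mediaType'    => 'AUDIO',
            'mediaObjects' => [[
                'name'       => 'pause',
                'contentUrl' => 'https://example.com/silence-1s.mp3',
            ]],
        ]],
    ],
    'suggestions' => [['title' => 'Continue']],
];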
There are some downsides: the media player will be shown while the audio is playing, which will interrupt the bubbles, but it should clear once playback is done. There will also be a pause between some of the bubbles. But playing audio might also enhance your reply.

Related

How to send custom dimensions, medium, source or referer with an event via Measurement Protocol V2?

With v1 of the Measurement Protocol, you could use these parameters to add custom dimensions or change the medium, source, or referrer for a page view:
https://ssl.google-analytics.com/collect?v=1&tid=UA-xxxxxxxx&cid=[custom-id]&t=pageview&dp=[Url of pageview]&dh=[hostname of pageview]&cm=[new-medium]&cs=[new-source]&dr=[new-referer]&cd1=[custom-dimension-1]&cd2=[custom-dimension-2]
How is it done in measurement protocol v2?
I couldn't find any documentation about the page-view event in V2 (for example, it's just not mentioned here:
https://developers.google.com/analytics/devguides/collection/protocol/ga4/reference/events), and even the event builder (https://ga-dev-tools.web.app/ga4/event-builder/) doesn't support a simple page view.
So, all I got so far is this:
$data = '
{
    "client_id": "'.[custom-id].'",
    "events": [
        {
            "name": "page_view",
            "params": {
                "page_location": "'.[Url of pageview].'"
            }
        }
    ]
}
';
So, what are possible parameters for a page-view-event?
Ok, a few things here right away that you should know if you're playing with MP:
Measurement Protocol is a poor name. It implies there's more than one protocol for data gathering, but there isn't; there is just one protocol for tracking.
MP2 is still largely MP1. Google tries to pose GA4 as a new product, but it's just our good old GA (Universal Analytics) with a simplified backend and an over-engineered front-end that tries to deliver the level of quality Site Catalyst/Omniture/Adobe Analytics has been delivering for a decade. MP is largely the same: dr, cm, cs and a lot of other fields are still there. CDs aren't there anymore because they've been replaced with eps and ups, but more about that a bit later.
GA4 uses this big marketing claim that the new analytics is so wonderfully event-based, unlike the old one. When I dug into why they keep claiming it everywhere, I realized that the only difference is that pageviews are now events. Not much difference really. But yes, a pageview is just an event named page_view. We'll talk about it a bit more later.
Custom dimensions are no more. Now they're called event properties and user properties. The same thing really; Google just tries to make it less obvious that there are no more session-level custom dimensions, or product-level CDs, though the product level is seemingly on their roadmap. (There's a sketch of how eps and ups look in an MP2 hit right after these points.)
Make sure you're using the correct measurement id. They made it a lot harder to find it in GA4. It's no longer just the property id visible in the property list, unfortunately.
GA's real-time reports don't include all dimensions, especially if those dimensions are involved in advanced metric/dimension calculations. Do not use real-time reports for inspecting the content of your events; they're not meant for debugging. They're a vanity report. Still, they're helpful for checking the volume of events when you're sending a bunch and expect to see them in GA. Google even has a warning here:
Like the DebugView report, the Realtime report performs limited attribution analysis to ensure responsive reporting. We recommend that you refer to the Acquisition reports for the most accurate attribution information.
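To make the page_view and event/user property points above concrete, here is a sketch of a minimal MP2 hit sent from PHP. The measurement_id, api_secret, and client_id are placeholders, and article_category and membership_level are made-up example properties:

$url = 'https://www.google-analytics.com/mp/collect'
     . '?measurement_id=G-XXXXXXXX&api_secret=YOUR_API_SECRET';

$payload = [
    'client_id' => '123456.7654321',
    'events' => [[
        'name'   => 'page_view',
        'params' => [
            'page_location' => 'https://example.com/some/page',
            'page_referrer' => 'https://referrer.example.com/',
            'page_title'    => 'Some page',
            // an event property ("ep") - the GA4 stand-in for a hit-level CD
            'article_category' => 'news',
        ],
    ]],
    // user properties ("up") - the stand-in for user-level CDs
    'user_properties' => [
        'membership_level' => ['value' => 'gold'],
    ],
];

$ch = curl_init($url);
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
    CURLOPT_POSTFIELDS     => json_encode($payload),
    CURLOPT_RETURNTRANSFER => true,
]);
curl_exec($ch);
curl_close($ch);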
Finally, what I often do instead of reading the still-unfinished and not-really-helpful documentation on MP2 is either use a library like this one.
Or, since point 1 above is the case, I would just implement the tracking in my test GTM container, then see what it sends, how, and where in the Network debugger, and simply reimplement it on my side exactly the way GTM does it. No magic involved. Here is how my GTM tag would look:
With a trigger on any click or any page load. After all is done, I publish the container. Then I would inject this GTM container's code into a local site, or into my test site, or however else you want to test it, and trigger the tag that you need to mimic with MP.
I use this wonderful extension to show all events that fire and their details right in my console.
Now this is how the above tag looks on my test site through the extension:
It's pretty useful.
How do I know that page_referrer is used as dr instead of ep in GTM? Here is the list of the fields that will never be seen as ep. But Google doesn't care enough to map them properly to what these fields are called in MP, so you either have to test, or know, or google it elsewhere.
Finally, here is what the network request looks like:
I published the tag to prod (I keep a test site in prod), so you can go and look at it. Or just find a site that uses GA4 and look at its network requests. How does Google know that this is a pageview? By the event name: en=page_view
Of course, you do the same with medium and source. Judging from the documentation I've linked to above, medium and source look like campaign_source and campaign_medium in GTM. GTM maps them accordingly to the cs and cm fields, and that's how you know these are the correct MP fields. Give GA time to process these and check on them in a few days.
Now, this is applicable to enhanced ecommerce hits too; it's just that they typically have more variables and data structures in them.
Finally, if you want to simulate batch events, you can just make a few tags fire in rapid succession, and GTM will neatly pack them into one network request if they fit. You can then study how the packing is done through the same methods as we used here and simulate it.
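To sketch what such a batched MP2 request body might look like, the events array simply carries several events in one payload (the event names and params below are standard GA4 examples, not something you have to use):

$payload = [
    'client_id' => '123456.7654321',
    'events' => [
        ['name' => 'page_view',       'params' => ['page_location' => 'https://example.com/']],
        ['name' => 'scroll',          'params' => ['percent_scrolled' => 90]],
        ['name' => 'user_engagement', 'params' => ['engagement_time_msec' => 1500]],
    ],
];
// POST json_encode($payload) to /mp/collect exactly as in the earlier sketch.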

Realtime meta-data/ captioning for live streamed audio

How might I add a track of accurately aligned, real-time "additional" data to live-streamed audio? I'm primarily interested in the browser here, but ideally the solution would be possible on any platform.
The idea is: if I have a live recording from my computer being sent into Icecast via something like DarkIce, I want a listener (who could join the stream at any time) to be able to place some kind of annotation over a few of the samples and send only the annotation back (for example, using a regular HTTP request). However, this needs a mechanism to align the annotation with the dumped streamed audio on the server side, and in a live stream, AFAIK the user can't actually get the timestamp within the "whole" stream, just from the point they joined. But if there were some kind of simultaneously aligned metadata, then perhaps this would be possible.
The problem is, most systems seem to assume you "pre-caption" or multiplex your data streams beforehand. However, this wouldn't make sense for something being recorded and live-streamed in real time. Google's examples seem to be mostly about their ability to do "live captioning", which is more about processing audio in real time and then adding slightly delayed captions using speech recognition. This isn't what I'm after. I've looked into the various ways data is put into OGG containers, as well as current captioning formats like WebVTT, and I am struggling to find examples of this.
I found what might be a hint here: https://github.com/w3c/webvtt/issues/320, and I've been recommended to look for examples by Apple and Google using WebVTT for something along these lines, but I cannot find these demos. There's older tech as well (Kate, CMML, Annodex, etc.), but none of these are in use anymore, having been completely replaced by WebVTT. Perhaps I can achieve something like this with WebRTC, but I'm not sure it gives any guarantees on alignment, and it's a slightly different technology stack from the one I am looking at in this scenario.

What is event stream in rest api and why do we need it?

I'm trying to develop a REST API for the first time, and I'm looking at LoopBack references that use a change stream for resources, like /resources/change-stream with GET and POST methods.
I have visited this post, which explains the differences between a REST API and a streaming API.
LoopBack seems to provide it as part of its REST API, I think. What is it, and what does it do? Can you please explain it to me in a way that is clear (as if to a six-year-old child)? I am developing a REST API on my own for the first time, so I would like to understand it step by step if possible, e.g. what I should have in Postman. Should I use a URL like /api/resources/change-stream?_format=event-stream along with an application/json content type, or would just /api/resources/change-stream be fine?
It would be great if you could provide some real examples so that I can try developing it in my own application.
PS: Whichever language (Node.js, Python, Ruby, PHP) you choose for the examples is perfectly fine with me.
If I had to guess, it sounds like one-way long polling, where you leave a long-running, open request to a server that will fulfill the request when an event happens. If the request times out, don't worry about it; send another and leave it open. When the request is fulfilled with an event, immediately fire another request so that you can receive the next event.
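To illustrate that pattern, here is a rough client-side sketch in PHP (the URL and timeout are placeholders, and this is not LoopBack's actual change-stream implementation):

$url = 'https://example.com/api/resources/change-stream';

while (true) {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 60,  // let the request hang open until an event arrives or it times out
        CURLOPT_HTTPHEADER     => ['Accept: application/json'],
    ]);
    $body = curl_exec($ch);
    curl_close($ch);

    if ($body === false) {
        continue;  // timed out or failed - just open another request
    }

    $event = json_decode($body, true);
    // ... handle the event, then loop around to wait for the next one
}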
Since the document on the other end of the API is still (probably) a JSON document, you should keep that MIME type. However, you aren't limited in what you can send back as an event type; if you want to send back XML or YAML, do so and set that MIME type. The "stream" is just a convention mechanism.
As far as your application is concerned, and from a REST perspective, it just takes a while for the event that you are trying to get to be provided to you, and it has a high chance of failure. But I wouldn't look at this from a REST perspective; REST is just convention, so don't let it tie you down.
Alternatively, long polling should probably be replaced by something like a WebSocket, as it provides a much easier API (in my opinion) and doesn't seem as hacky as long polling.
If you're trying to ask "how do I tell a RESTful consumer that my API is a 'stream' API", there is no point. Again, as far as REST is concerned, the https://example.com/api/events/ endpoint refers to a JSON-type document that changes a lot, takes a long time to receive, and "fails" often (if the events you generate don't fire a lot).

constructing xml within ios app

I am sending a request back to the server I am communicating with; the gist of this request is that it will have a bunch of different parameters, like user ID, request number, etc.
One of the more important parts of the request is a segment of XML that I'm hoping to create based on a few user selections in my interface.
Then at the end I will wrap this all up and send it off to the server...
However, at the moment I have no idea how to form a segment of XML. I have been reading this, but I'm not sure how it relates to what I would like to do.
Any help, example code, example tutorials, or anything else would be really helpful.
A plist is just XML with a strict DTD, and you can use NSPropertyListSerialization to create one from an NSArray/NSDictionary to send back to the server, very easily.

What do you wish Fiddler could do that it can't... or that you can't figure out how to do?

In this question's comment, EricLaw (the author of Fiddler) wrote:
Fiddler has lots of interesting features, but not all of them are super well-documented. A related question would be: "What do you wish Fiddler could do that it can't... or that you can't figure out how to do?" – EricLaw -MSFT- Nov 2 at 2:54
Following the lead - what do you want from Fiddler that it doesn't have now (or you don't know whether it has)?
I'd like it to be able to format and pretty-print XML and JSON request/response bodies, e.g.
so a raw:
<SomeElement><Nested><MoreNested>X</MoreNested></Nested></SomeElement>
Could be displayed as:
<SomeElement>
    <Nested>
        <MoreNested>X</MoreNested>
    </Nested>
</SomeElement>
This would be really useful when looking at our API calls.
I know it can do the XML tree view, but I'm more comfortable looking at raw markup so I can see exactly what's going on. I'd just like to look at the raw markup in a nicely formatted and coloured way!
The Inspectors tab has a WebForms button, which is very nice for checking x-www-form-urlencoded POST data, but as soon as the form is multipart/form-data (e.g. forms with file uploads), this button can't display it. Or can it?
I'm not sure this is really a programming question... But, one feature I'd like is to be able to configure a standard set of screens on the right side. I always use raw mode, for example, and have to reset it every time I start the app.
Another is that I would love to have it work in a mode where it snoops on the TCP conversation, rather than acting as a proxy, along the lines of tools such as IBM Page Detailer. The problem is that browsers interact with proxies differently than they do when they're talking directly to hosts.
I hope I can select which process to watch, not just a browser / non-browser option.
I'd really like to increase the latency between the request header of an HTTP POST and the request body. I can see how you can increase the latency of the entire request as a whole, but not the individual packets.