Adobe Analytics overwriting visitorID - tags

I need to overwrite the default visitorID, that's automatically set by Adobe Analytics s_code, with a custom value.
As explained here, I can set the s.visitorID variable for this purpose, but it's not clear to me how, and above all when, to do so.
I guess this variable should be set in the s_code itself, but I fear that the automatic visitorID would still be used for the first s.t() call in place of the custom value. I want the custom visitorID to be used from the very first automatic request. In addition, assuming the custom value is passed in a GET parameter, I'd like to know whether the "Query string parameter on the image request" (the "vid" param) could be used for this scenario, and how (it is the second method listed in the link above). Thank you.

Some thoughts and words of warning about setting the visitor id yourself:
There are some benefits to setting the visitor id yourself. The main benefit is that you are 100% in control of how visit(or)s are tracked. Another good reason to do this is if you already have a visitor id infrastructure in place on your site, setting AA's visitor id to what you already have can potentially make it easier to tie data to or cross-reference data between AA and other places that utilize that visitor id.
Sidenote for that.. It is possible to get the visitor ids from Adobe from various places (e.g. Data Warehouse, Data Workbench, using the Adobe API), but currently there is no report in Adobe Analytics itself to see the visitor id, even if you set it yourself. To get around this, you can also assign it to a prop and/or eVar, which is visible in AA reports. But this is not possible with the 3rd party cookie tracking (because javascript cannot read 3rd party cookies), so that is another benefit to setting it yourself.
Beyond that, IMO there are basically no benefits to setting it yourself. On the other hand.. now for some warnings...
Setting a value in visitorID or vid or equivalent does NOT cause AA to update the visitor id in its cookie. AA continues to generate/output the visitor id in the cookie and then on AA server, and it simply overwrites that value with your visitor id for that hit (on the backend, on the collection server). But it does NOT update the cookie with your new visitor id.
The implication here is that if you want to set the visitor id yourself, you are essentially putting the responsibility on yourself to keep track of the visitor. So, you must have your own infrastructure in place that not only generates the id, but preserves it and ties it to the visitor, so that it can be output on every page the visitor views - including if the visitor navigates between multiple domains your tracking code is implemented on (if applicable).
If you do not do this, the visit(or) will break and count as a new visit(or) when going from page to page, or when the visitor hops from domainA to domainB. How often the break actually occurs is directly tied to how often you output the visitor id yourself (with a correct value). For example, if you set it only on the first hit and never again, the visit(or) will break once, because from the 2nd hit onward AA will default to using its own generated visitor id.
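The bookkeeping described above can be sketched roughly as follows, in Python purely for illustration. The cookie name `my_vid` and the helper `get_or_create_visitor_id` are hypothetical, not part of any Adobe library; the only point is that the same id has to come back on every hit, not just the first one.

```python
import uuid

COOKIE_NAME = "my_vid"  # hypothetical first-party cookie name


def get_or_create_visitor_id(cookies: dict) -> tuple[str, bool]:
    """Return (visitor_id, is_new) for the current request.

    `cookies` is a plain dict of the request's cookies. If our own id is
    already present it is reused; otherwise a new one is generated and the
    caller is responsible for sending it back in a Set-Cookie header.
    """
    existing = cookies.get(COOKIE_NAME)
    if existing:
        return existing, False
    return uuid.uuid4().hex, True


# First hit: no cookie yet, so an id is generated.
first_id, is_new = get_or_create_visitor_id({})
# Later hits: the stored id is reused, so the visitor does not "break".
second_id, is_new_again = get_or_create_visitor_id({COOKIE_NAME: first_id})
```

Whatever value this returns is what you would output into s.visitorID on each page.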
To put this into perspective.. I've had my fingers in the web analytics pie for almost 10 years now, the last 6 of them working full time for a high profile web analytics agency, so this is basically all I do for a living. I've worked with I don't even know how many clients by now (over 100 for sure), and there has only been one client who actually went through with setting the visitor id themselves. I'm just mentioning this so that you understand that (from my experience, anyways), this is not something most clients embark upon. So, make sure you are absolutely confident about your methods for generating and keeping track of visitor id values before doing this.
Having said that, if you still want to do this...
Firstly, to be clear about the link you posted, those are examples of how the visitor id can be set depending on your implementation. Adobe offers several ways to record data, and the javascript method is just one way. The examples on that page show what you'd set for some of the other ways (including the js way).
The point here is that not all of those methods may be relevant to you, depending on your implementation. For example, if you are only implementing AA with javascript (whether it be the core s_code.js or through DTM or w/e), the only one relevant to you is s.visitorID.
So for example, yes, you can use s.visitorID to override AA's default visitor id generation, for javascript implementation. If you set that variable, you will see the vid param show up in the request to AA (look at the request sent to the AA collection server with a packet sniffer, or with browser addon or dev tools net/traffic tab).
The reason vid was mentioned on that link is because that is what you'd use if you are manually building the request URL to AA. For example, if you don't want to use the javascript implementation, and instead want to use server-side logic to build and output an image tag yourself, or send data to Omniture directly from your server (e.g. cURL), the vid param is what you'd set for the visitor id.
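As a rough illustration of building such a request URL by hand, here is a small Python sketch. The host name, report suite and path in `BASE_URL` are placeholders and assumptions on my part, and `pageName` is only there as an illustrative second parameter; check Adobe's documentation for the exact URL your account should use. The only part taken from the discussion above is that the visitor id travels in the `vid` query parameter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder collection endpoint: substitute your own tracking server
# and report suite. This exact URL is an assumption, not a real endpoint.
BASE_URL = "https://metrics.example.com/b/ss/myreportsuite/0"


def build_image_request(visitor_id: str, page_name: str) -> str:
    """Build a hard-coded image request URL carrying a custom visitor id."""
    params = {
        "vid": visitor_id,      # custom visitor id, as discussed above
        "pageName": page_name,  # illustrative extra parameter
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_image_request("abc123", "home page")
# Parse the URL back to confirm that vid is present with our value.
query = parse_qs(urlsplit(url).query)
```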
Sidenote: hardcoding your own image request is kind of a throwback for mobile device tracking in earlier times when mobile devices did not consistently or fully support javascript. Pretty much all modern mobile devices these days fully support javascript, and on top of that, Adobe has done a lot of work over time streamlining the core AA library to be leaner and more efficient whether viewing on desktop or mobile browser (make sure you are using the latest version of AppMeasurement library).
So IOW if you are using the javascript library, you don't need to worry about the vid param, because the js library already does it. Although for QA purposes, you can check that it is there with your value on a given request.
As far as "how" to set it.. assuming you're implementing it with javascript: you set it like any other AA variable. Somewhere between the AA library loading and the s.t or s.tl trigger, you assign a value to it, e.g.
s.visitorID="[my value here]";
Where specifically you set it, depends on how you have implemented AA. For example, if you have implemented AA through DTM, and you have your custom visitor id exposed on your page before the DTM library is loaded (e.g. some data layer property, or in a cookie), you can create a data element that grabs that value, and then in the AA > Tool Config > Cookie > Visitor ID field, you can specify your data element, and DTM will set it for you (but you are still responsible for making sure whatever source the data element draws from is there)
And again, note that even if you set it, you will still see AA's default generated id in their cookie and request urls. The override happens on AA's collection server, which you do not have visibility into. To verify that AA is actually using your custom value, you will need to export it from AA (e.g. with a data warehouse export).


How to send custom dimensions, medium, source or referer with an event via Measurement Protocol V2?

With v1 of the measurement protocol, you could use these parameters to add custom dimensions or change medium, source or refer for a page view:
https://ssl.google-analytics.com/collect?v=1&tid=UA-xxxxxxxx&cid=[custom-id]&t=pageview&dp=[Url of pageview]&dh=[hostname of pageview]&cm=[new-medium]&cs=[new-source]&dr=[new-referer]&cd1=[custom-dimension-1]&cd2=[custom-dimension-2]
How is it done in measurement protocol v2?
I couldn't find any documentation about the page-view-event in V2 (for example it's just not mentioned here
https://developers.google.com/analytics/devguides/collection/protocol/ga4/reference/events), even the event-builder (https://ga-dev-tools.web.app/ga4/event-builder/) doesn't support a simple page-view.
So, all I got so far is this:
$data = '
{ "client_id": "'.[custom-id].'",
  "events": [
    {
      "name": "page_view",
      "params": {
        "page_location": "'.[Url of pageview].'"
      }
    }
  ]
}
';
So, what are possible parameters for a page-view-event?
Ok, a few things here right away that you should know if you're playing with MP:
Measurement Protocol is a poor name. It implies there is more than one protocol for data gathering. There isn't: there is only one protocol for tracking.
MP2 is still largely MP1. Google tries to pose GA4 as a new product, but it's just our good old GA UA with a simplified backend and an overengineered front-end that tries to deliver the level of quality Site Catalyst/Omniture/Adobe Analytics have been delivering for a decade. MP is largely the same. dr, cm, cs and a lot of other fields are still there. cds aren't there anymore because they've been replaced with eps and ups, but more about that a bit later.
GA4 uses this big marketing claim that the new analytics is so wonderfully event-based, unlike the old one. When I dug into why they keep claiming it everywhere, I realized that the only difference is that pageviews are now events. Not much difference really. But yes, a pageview is just an event named page_view. We'll talk about it a bit more later.
Custom dimensions are no more. Now they're called event properties and user properties. The same thing really, Google just tries to make it less obvious that there are no more session level custom dimensions. Or product-level CDs. Though the product level is seemingly on their roadmap.
Make sure you're using the correct measurement id. They made it a lot harder to find it in GA4. It's no longer just the property id visible in the property list, unfortunately.
GA's real-time reports don't include all dimensions, especially if those dimensions are involved in advanced metrics/dimensions calculations. Do not use real time reports for inspecting the content of your events. It's not meant for debugging. It's a vanity report. Still helpful to check the volume of events when you're sending a bunch and expect to see them in GA. Google even has a warning here:
Like the DebugView report, the Realtime report performs limited attribution analysis to ensure responsive reporting. We recommend that you refer to the Acquisition reports for the most accurate attribution information.
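To make the "a pageview is just an event named page_view" point concrete, here is a hedged Python sketch that only builds the request URL and JSON body for the GA4 Measurement Protocol and does not send anything. The measurement id, API secret, the `customer_tier` user property and the `article_category` event parameter are all placeholders I made up. Whether GA4 uses `page_referrer` sent this way for attribution is not something the discussion above confirms, so treat that as something to test.

```python
import json
from urllib.parse import urlencode

# GA4 Measurement Protocol collection endpoint; the measurement id and
# API secret below are placeholders to be replaced with your own.
ENDPOINT = "https://www.google-analytics.com/mp/collect"
QUERY = {"measurement_id": "G-XXXXXXXXXX", "api_secret": "YOUR_API_SECRET"}


def build_page_view(client_id: str, url: str, referrer: str) -> str:
    """Return the JSON body for a single page_view event."""
    payload = {
        "client_id": client_id,
        # User properties take the place of user-scoped custom dimensions.
        "user_properties": {"customer_tier": {"value": "gold"}},
        "events": [{
            "name": "page_view",
            "params": {
                "page_location": url,
                "page_referrer": referrer,
                # Event parameters take the place of hit-scoped custom
                # dimensions; this name is hypothetical.
                "article_category": "news",
            },
        }],
    }
    return json.dumps(payload)


request_url = f"{ENDPOINT}?{urlencode(QUERY)}"
body = build_page_view("555.123", "https://example.com/pricing", "https://example.org/")
```

The body would then be POSTed to `request_url` with whatever HTTP client you already use.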
Finally, what I often do instead of reading the so-still-unfinished-and-not-really-helpful documentation on MP2, is either use a library like this.
Or, since point 1 above is the case, I would just implement a moniker tracking in my test GTM, see what it sends, how, and where in the Network debugger, and simply reimplement it on my side exactly how GTM does it. No magic involved. Here is what my GTM tag would look like:
With a trigger on any click or any page load. After all is done, I publish the lib. Then I would inject this GTM's code in a local site, or in my test site, or however else you want to test it. And trigger the tag that you need to mimic with MP.
I use this wonderful extension to show all events that fire and their details right in my console.
Now this is how the above tag looks on my test site through the extension:
It's pretty useful.
How do I know that page_referrer is used as dr instead of ep in GTM? Here is the list of the fields that will never be seen as ep. But Google doesn't care enough to map them properly to what these fields are called in MP, so you either have to test, or know, or google it elsewhere.
Finally, here is what the network request looks like:
I published the tag to prod (I keep a test site in prod), so you can go and look at it. Or just find a site that uses GA4 and see its network requests. How does Google know that this is a pageview? By the event name: en=page_view
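If you copy such a request URL out of the Network tab, a few lines of Python are enough to pull it apart and see which values travel as dedicated fields and which as event parameters. The URL below is made up in the general shape of a GA4 browser hit, not a captured request, and `article_category` is a hypothetical parameter name.

```python
from urllib.parse import urlsplit, parse_qs

# A made-up request URL in the shape described above; the ids and
# values are placeholders, not a captured hit.
captured = (
    "https://www.google-analytics.com/g/collect?v=2&tid=G-XXXXXXXXXX"
    "&en=page_view&dl=https%3A%2F%2Fexample.com%2Fpricing"
    "&dr=https%3A%2F%2Fexample.org%2F&ep.article_category=news"
)

# Flatten the query string into a simple name -> value mapping.
fields = {k: v[0] for k, v in parse_qs(urlsplit(captured).query).items()}

event_name = fields["en"]  # the event name identifies a pageview
referrer = fields["dr"]    # the referrer has its own field, it is not an ep
# Everything prefixed with "ep." is a custom event parameter.
event_params = {k[3:]: v for k, v in fields.items() if k.startswith("ep.")}
```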
Of course, you do the same with medium and source. Judging from the documentation I've linked to above, the medium and source look like campaign_source and campaign_medium in GTM. GTM maps them accordingly to cs and cm fields. And that's how you know these are the correct mp fields. Give GA time to process these and check on them in a few days.
Good, now this is applicable to the enhanced ecommerce hits too, it's just that they have more variables and data structures in them typically.
Finally, if you want to simulate batch events, you can just make a few tags fire in rapid succession and GTM will neatly pack them in one network request if they fit. You can then digest how the packing is done through the same methods as we do here and simulate.

How exactly does backend work from a developer perspective?

There are a ton of videos and websites trying to explain backend vs. frontend, but unfortunately none of them explains it in a way that teaches you how to develop a backend-driven website (at least I haven't found anything good).
So, I wanted to ensure that I understood it and kindly ask you to confirm or correct me on this topic.
Example:
I wanted to build Mini - Google. I have a Database containing 1000 stored websites.
Assumption #1:
Every time I type something into the search bar, the autofill suggestions change. This means that every time I type, another website / API gets called, returning the current autofill suggestions. On the developer side, this means the website is, e.g., a Python script which gets called with the currently typed word as a parameter and returns all suggestions as, e.g., JSON:
// Client Side Script
function ontype(input):
    suggestions = get("https://api.googlemini.com/suggestions?q=" + str(input))
    show(suggestions)
Assumption #2:
This also means I could manually call the website containing the Python script, providing a random word and it would always return a JSON containing the autofill suggestions for that word.
Question #1:
If A#1 turns out true but A#2 turns out false, how could I prevent a user from randomly accessing the "API" while still returning results when called by a script?
Assumption #3:
After pressing enter, my website googlemini.com/search?... would be called. As google.com/search reloads every time I search for a new query (or go to page 2 etc.), I assume that, instead of calling an API, when the server gets the client request it first searches through its database, sorts the results and then returns a whole HTML document as a static webpage:
// Server Side Script
#app.route("/search")
function oncall():
    query = getparam("q")
    results = searchdatabase(query)
    html = buildhtml(results)
    return html
Question #2:
Often, I hear (or at least understand it this way) that the database and the webserver are two separate servers. How would that work? Wouldn't that mean the database server needs to be accessible to the web too (of course it would have security layers etc., but technically it would)? How could I access the database server from the webserver?
Question #3:
Are there, on a technical basis, any other ways to build backend services?
That's it. I would also appreciate any recommendations like videos, websites or others to learn how to technically setup and / or secure backend servers.
Thanks in advance.
For your first question: yes, there is a way to prevent misuse.
What you can do is add an identifier to the API, like an auth token, to identify a user. Every time a user accesses the API, you save the count on the server, and whenever the count exceeds a limit within a time span, you reject the call. The limit can be set so that it doesn't trouble the honest user and punishes the abusive one. There are more complex and effective methods, but this is the basic idea.
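The counting idea can be sketched as a fixed-window limiter keyed by token. This is a minimal in-memory illustration in Python, assuming a single server process; the class name `FixedWindowLimiter` and the injectable `clock` argument are my own choices, and a real deployment would usually keep the counters in shared storage.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` calls per token within each time window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the example can fake time
        self._state = {}    # token -> (window_start, count)

    def allow(self, token: str) -> bool:
        now = self.clock()
        start, count = self._state.get(token, (now, 0))
        if now - start >= self.window:
            # The previous window has expired: start counting again.
            start, count = now, 0
        if count >= self.limit:
            self._state[token] = (start, count)
            return False
        self._state[token] = (start, count + 1)
        return True


# Demonstration with a fake clock: 3 calls allowed per 60 seconds.
fake_now = [0.0]
limiter = FixedWindowLimiter(limit=3, window_seconds=60, clock=lambda: fake_now[0])
results = [limiter.allow("token-a") for _ in range(4)]
fake_now[0] = 61.0  # move past the end of the window
after_reset = limiter.allow("token-a")
```

The API handler would call `allow(token)` first and reject the request when it returns False.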
For question number two, let me explain a simple concept: a database is a very efficient, resourceful and expensive data storage solution, and we never want it to be used in a general sense as a variable store or the like. We always want to access the database in a call: get the data, process the data, update the data. So we do it that way, and it is not necessary to set up a separate server for the database. The thing is, we mostly make the database accessible to various platforms (Android, iOS, Windows), so it's better to add some abstraction and keep the database as a separate entity.
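One way to picture that abstraction is a small data-access function that is the only code allowed to touch the database; web, Android or iOS clients all go through the backend that calls it. The sketch below uses Python's built-in sqlite3 with an in-memory database purely so it runs anywhere. The table, the sample rows and the function names are made up, and with a separate database server only `open_database` would change.

```python
import sqlite3


def open_database() -> sqlite3.Connection:
    # In-memory database for the sketch; a real deployment would connect
    # to a database server reachable only from the web server.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sites (url TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO sites VALUES (?, ?)",
        [("https://a.example", "Python tips"), ("https://b.example", "Cooking")],
    )
    return conn


def search_sites(conn: sqlite3.Connection, query: str) -> list[str]:
    """The only place that issues SQL: get the data and hand it back."""
    rows = conn.execute(
        "SELECT url FROM sites WHERE title LIKE ? ORDER BY url",
        (f"%{query}%",),  # parameterized to avoid SQL injection
    )
    return [row[0] for row in rows]


conn = open_database()
hits = search_sites(conn, "python")
```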
For the last one, I am not sure what you meant by "other", but I am listing some backend technologies; some of these might be used in isolation, some not, and there are other tools as well.
Django
Flask
Django REST
GraphQL
SQL
PHP
Node
Deno

Restful business logic on property update

I'm building a REST API and I'm trying to keep it as RESTful as possible, but some things are still not quite clear to me. I saw a lot of topics about similar questions, but they all center on the "simple" problem of updating data; my issue is more about the business logic around that.
My main issue is with business logic triggered by a partial update of a model. I see a lot of different opinions online about PATCH methods, creating new sub-resources or adding actions, but these often seem counterproductive to the REST approach of keeping URIs simple and structured.
I have some records that need to be processed (refused, validated, partially validated, etc.); each change triggers additional actions.
If it's refused, an email with the reason should be sent.
If it's partially validated, the link to fill in the missing data is sent.
If it's validated, some other resources must be created.
There are a few other changes that can be made to the status, but this is enough for the example.
What would be a RESTful way to do that ?
My first idea would be to create actions :
POST /record/:id/refuse
POST /record/:id/validate ..etc
It seems RESTful to me but too complicated, and moreover, this approach means having multiple routes performing essentially the same thing: updating one field in the record object.
I also see the possibility of a PATCH method like :
PATCH /record/:id in which I check if the field to update is status, and the new value to know which action to perform.
But I feel it can become too complex when I need to perform similar actions for other properties of the record.
My last option, and I think maybe the best but I'm not sure if it's RESTful, would be to use a sub-resource status and to use PUT to update it:
PUT /record/:id/status, with a switch on the new value.
No matter what the previous value was, switching to accepted will always trigger the creation, switching to refused will always trigger the email ...etc
Are those ways of achieving that RESTful, and which one makes the most sense? Are there other alternatives I didn't think about?
Thanks
What would be a RESTful way to do that ?
In HTTP, your "uniform interface" is that of a document store. Your Rest API is a facade, that takes messages with remote authoring semantics (PUT/POST/PATCH), and your implementation produces useful work as a side effect of its handling of those messages.
See Jim Webber 2011.
I have some records that need to be processed (refused, validated, partially validated, etc.); each change triggers additional actions.
So think about how we might do this on the web. We GET some resource, and what is returned is an html representation of the information of the record and a bunch of forms that describe actions we can do. So there's a refused form, and a validated form, and so on. The user chooses the correct form to use in the browser, fills in any supplementary information, and submits the form. The browser, using the HTML form processing rules, converts the form information into an HTTP request.
For unsafe operations, the form is configured to use POST, and the browsers therefore know that the form data should be part of the message-body of the request.
The target-uri of the request is just whatever was used as the form action -- which is to say, the representation of the form includes in it the information that describes where the form should be submitted.
As far as the browser and the user are concerned, the target-uri can be anything. So you could have separate resources to handle validate messages and refused messages and so on.
Caching is an important idea, both in REST and in HTTP; HTTP has specific rules baked into it for cache invalidation. Therefore, it is often the case that you will want to use a target-uri that identifies the document you want the client to reload if the command is successful.
So it might go something like this: we GET /record/123, and that gives us a bunch of information, and also some forms describing how we can change the record. So fill one out, submit it successfully, and now we expect the forms to be gone - or a new set of forms to be available. Therefore, it's the record document itself that we would expect to be reloading, and the target-uri of the forms should be /record/123.
(So the API implementation would be responsible for looking at the HTTP request, and figuring out the meaning of the message. They might all go to a single /record/:id POST handler, and that code looks through the message-body to figure out which internal function should do the work).
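A minimal sketch of that single POST handler, written in Python without any web framework. The `action` field in the message body, the handler names and the returned status codes are assumptions chosen for illustration; the side effects are returned as plain strings instead of actually sending mail.

```python
def refuse(record: dict, body: dict) -> list[str]:
    record["status"] = "refused"
    return [f"email: refused because {body['reason']}"]


def validate(record: dict, body: dict) -> list[str]:
    record["status"] = "validated"
    return ["create follow-up resources"]


def partially_validate(record: dict, body: dict) -> list[str]:
    record["status"] = "partially_validated"
    return ["email: link to complete missing data"]


# Maps the message found in the request body to an internal function.
HANDLERS = {
    "refuse": refuse,
    "validate": validate,
    "partially_validate": partially_validate,
}


def handle_post(record: dict, body: dict) -> tuple[int, list[str]]:
    """POST /record/:id -> (HTTP status, side effects that were triggered)."""
    handler = HANDLERS.get(body.get("action"))
    if handler is None:
        return 400, []  # unknown or missing action in the message
    return 200, handler(record, body)


record = {"id": 123, "status": "pending"}
status, effects = handle_post(record, {"action": "refuse", "reason": "missing signature"})
```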
PUT/PATCH are the same sort of idea, except that instead of submitting forms, we send edited representations of the resource itself. We GET /record/123, change the status (for example, to Rejected), and then send a copy of our new representation of the record to the server for processing. It would therefore be the responsibility of the server to examine the differences between its representation of the resource and the new provided copy, and calculate from them any necessary side effects.
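The "examine the differences" step could look roughly like this in Python. The in-memory `store`, the status names and the effect strings are illustrative assumptions; the point is that side effects are derived from what changed between the stored and the submitted representation, so repeating the same PUT triggers nothing new.

```python
def side_effects_for(stored: dict, submitted: dict) -> list[str]:
    """Compare the two representations and derive the work to do."""
    effects = []
    new_status = submitted.get("status")
    if stored.get("status") != new_status:
        if new_status == "refused":
            effects.append("send refusal email")
        elif new_status == "partially_validated":
            effects.append("send link to complete missing data")
        elif new_status == "validated":
            effects.append("create follow-up resources")
    return effects


def handle_put(store: dict, record_id: int, submitted: dict) -> list[str]:
    """PUT /record/:id -> replace the stored copy, return triggered effects."""
    stored = store.get(record_id, {})
    effects = side_effects_for(stored, submitted)
    store[record_id] = dict(submitted)
    return effects


store = {123: {"id": 123, "status": "pending"}}
first = handle_put(store, 123, {"id": 123, "status": "refused"})
# Sending the identical representation again changes nothing.
repeat = handle_put(store, 123, {"id": 123, "status": "refused"})
```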
My last option, and I think maybe the best but I'm not sure if it's RESTful, would be to use a sub-resource status and to use PUT to update it
It's fine -- think of any web page you have ever seen where the source has a link to an image, or a link to JavaScript. The result is two resources instead of one, with separate cache entries for each -- which is great, when you want fine grained control over the caching of the resources.
But there's a trade - you also need to fetch more resources. (Server-push mitigates some of this problem).
Making things easier on the server may make things harder on the client - you're really trying to find the design with the best balance.

Is it correct performing GET requests and checks inside a POST handler?

I'm designing a ticket booking API. Right now booking a ticket resolves into POST /users/{id}/tickets but each /events/{id} has a maximum of available tickets. How do I properly design a check?
I've come up with two ways:
1) having an availibleTickets: field into the /events/{id} that gets checked and possibly updated each time I POST a new ticket.
2) having a maxTickets: field into /events/{id} and check the length of GET /events/{id}/tickets array, compare it to maxTickets
Anyway I have to perform a GET request inside the POST handler but it doesn't look right to me, do you have any suggestions?
How would you design a ticketing system for a Web page? The same steps you apply to a Web page also apply to REST, as it is just a generalization of the same interaction flow used on the Web.
Usually, on the Web you have a link that takes you to an event you can order tickets for. On this page you have a link to order tickets for that particular show. Depending on the system you use, you might see a layout of the event venue in the form of buttons or images to click, if there is a seating plan, where available seats are marked green and ones that are already booked red, or whatever color scheme you use. A click on a seat will trigger some reservation logic on the server, which returns almost the same page as before but this time with the seat marked orange to indicate a reservation. Next you click the available seat next to it to reserve a further seat. This continues until you either have enough seats marked as reserved, or no seats remain available and your only options are to cancel the reservation, proceed to the order step, or unreserve seats you marked as reserved beforehand. Once you are satisfied with your choice, you will find an order or submit button or link where you turn your reservation into a booking. This might involve some further steps like entering your contact and/or billing information. This is, in principle, how I'd design such a system for the Web.
As you might see, this turns into a kind of state machine where the server tells you all of the options you have available at the current state of the process. This is exactly what Asbjørn Ulsberg mentions when talking about affordance and state machines. From the blueprint of the venue and the respective seats on that blueprint, which are actually buttons or images you might click, you know what these widgets are for and you somehow know what will happen when you click on one of the seats. This is what affordance is all about: by seeing it, you know what you can do with it.
The interaction concept outlined above should be taken and translated to REST. As a client you don't need to know the structure of the URI; all you need to know is what seats are available and what happens when you click certain links. This is usually done in REST through link relation names that give a link some semantic context relative to the current state of the resource the client just fetched. Such link relations may seem like a-priori knowledge needed by the client, which is a bit anti-REST, as REST tries to decouple clients from servers to allow the latter to evolve freely without risking breaking clients. However, link relations should be standardized, or should be based on extensions such as Dublin Core or other microformats. Building on standards will lead either to broad acceptance and support by different clients or to mechanisms for plugging such knowledge into a client later on. This in general avoids so-called out-of-band information or process flows that force you to look up the manual on how to use that system.
The approach outlined above would utilize an own reservation resource that is uniquely created on "entering" the reservation, which is kept till the order ticket step is invoked. This reservation resource keeps track of the reserved seats the user has chosen so far. Whether the system considers reserved seats by other users as taken or not is an implementation detail. It is ok to either use a first-come system or a more polite one that guarantees the reserver his seats until some grace-period has passed and the user didn't order them. This gives you a good impression that such resources can be volatile and just be part of a certain process.
Regarding whether to use GET, POST or other HTTP methods: a Web page that sends you to a reservation page will show you a form containing all of the seats of the venue. As HTML forms only support GET and POST, the latter is the most appropriate choice. In a REST or HTTP API you might use PUT, though. A server might already have assigned you a certain, unique "reservation" link that you can just invoke with PUT. If the reservation resource does not exist yet, it will be created for you; if it does, its whole content will just be updated. Especially when you are dealing with reservations and money flows, you want to use idempotent methods such as PUT.
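The create-or-replace behaviour of PUT on a server-assigned reservation link can be sketched in a few lines of Python. The in-memory `reservations` dict, the id `res-42` and the seat labels are hypothetical; the status codes follow the usual HTTP convention of 201 for a newly created resource and 200 for a replaced one.

```python
def put_reservation(reservations: dict, reservation_id: str, seats: list[str]) -> int:
    """PUT /reservations/{id}: create the reservation or replace it entirely.

    Returns the HTTP status code: 201 if created, 200 if replaced.
    """
    created = reservation_id not in reservations
    # The whole content is overwritten, so repeating the same request
    # leaves the resource in exactly the same state (idempotent).
    reservations[reservation_id] = {"seats": sorted(set(seats))}
    return 201 if created else 200


reservations = {}
first = put_reservation(reservations, "res-42", ["A1", "A2"])
# A client that never saw the first response can safely retry.
retry = put_reservation(reservations, "res-42", ["A1", "A2"])
```

A retried POST, by contrast, could end up creating a second reservation, which is why idempotency matters when money is involved.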
I hope I could give you some ideas on how you might design your reservation system by letting a server teach a client everything it needs to know to proceed through its task.
It's inside the POST handler (server-side) that you must check whether tickets are available before booking the event.
You can create a specific route to report how many tickets are available if needed; the client could call it before booking an event. Or expose availibleTickets in GET /events/{id}.
Imagine 10 clients trying to buy the last ticket at the same time: if the check is not in the POST handler, you'll book 9 imaginary tickets.
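The ten-clients scenario can be simulated in Python with threads. This is an in-process sketch with a made-up `Event` class; the lock stands in for whatever makes the check-and-book step atomic in a real system, typically a database transaction or a conditional update.

```python
import threading


class Event:
    def __init__(self, max_tickets: int):
        self.max_tickets = max_tickets
        self.sold = 0
        self._lock = threading.Lock()

    def book(self) -> bool:
        """Check availability and book in one atomic step."""
        with self._lock:
            if self.sold >= self.max_tickets:
                return False  # sold out: reject inside the POST handler
            self.sold += 1
            return True


event = Event(max_tickets=1)  # only the last ticket is left
outcomes = []


def client() -> None:
    outcomes.append(event.book())


# Ten clients try to buy the last ticket at the same time.
threads = [threading.Thread(target=client) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one booking succeeds; the other nine are refused instead of becoming imaginary tickets.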

Zend Framework Dynamic Routing

I have a website that is purely database driven. I'm new to Zend Framework and to the concept of routing, though I have been doing a lot of reading. My brain is pretty much a sponge, with some stuff I'm still trying to comprehend. I am using ZF mainly for the routing, though I plan to implement other aspects of it when I can. For the most part it is a learning process, so there are some things I will want to write myself without the framework.
Here’s how the site should work:
URLs could be anything from “/” - a root index, to
“/contact/” - a root file, to
“/deposits/” - a sub-directory, to
“/deposits/ira/” - a file in the sub-directory.
When a user clicks on a link code will need to parse the REQUEST_URI in order to look into the “pages” table of the database. The sole purpose for this is to get the ID of the record matching the REQUEST_URI. That ID is the key to everything for the page, and other tables are then checked to see if there is any data for other aspects of the page that need to be gotten. The immediate need is for the template name. The site will have a few different pages that are used depending on if it is a home page, section landing page, or content of a section. This information is decided upon when a page is saved to the DB.
I want to be able to take this data and then decide how to route it so that it uses the correct template and can collect the rest of the data from that point to complete the page.
Since sections and pages can be created at any time, there must be controllers that can handle what to do based on the returned template data. This pretty much means the controllers and such will need some standardized names that are non-specific to what values passed in the REQUEST_URI.
How would I accomplish this in Zend so that all this happens before the controller is selected, and only having template names to go by for selecting the correct controller?
Thanks,
Cy
If it's truly the case that all your routes (at least for the frontend) are dynamic, then it sounds like you could do this:
1. On bootstrap, remove all default routes.
2. In a plugin on routeStartup, grab the REQUEST_URI and query your db. Presumably that record contains enough info for you to figure out the required controller, action, layout, etc.
2.1 Add a single route (matching the REQUEST_URI) mapping to the controller and action.
2.2 Set the layout to be the required layout.
Then during normal dispatch, the route will match, the controller/action will be invoked and you'll have the layout set correctly.
If the request doesn't match any page stored in your db, you'll have to invoke the error controller/action yourself to give a 404 response.
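Independent of Zend's actual plugin API, the lookup logic behind these steps can be sketched in plain Python. The `PAGES` dict stands in for the "pages" table, and the template and controller names are invented for the example; in ZF the equivalent code would live in the routeStartup plugin.

```python
# Stand-in for the "pages" table: request path -> record.
PAGES = {
    "/": {"id": 1, "template": "home"},
    "/deposits/": {"id": 2, "template": "section"},
    "/deposits/ira/": {"id": 3, "template": "content"},
}

# Standardized controllers chosen by template name, not by URL.
TEMPLATE_CONTROLLERS = {
    "home": "HomeController",
    "section": "SectionController",
    "content": "ContentController",
}


def resolve(request_uri: str) -> dict:
    """Map a REQUEST_URI to a controller via the page's template."""
    path = request_uri.split("?", 1)[0]  # ignore any query string
    if not path.endswith("/"):
        path += "/"                      # normalize the trailing slash
    page = PAGES.get(path)
    if page is None:
        # No matching record: hand over to the error controller (404).
        return {"controller": "ErrorController", "status": 404, "page_id": None}
    return {
        "controller": TEMPLATE_CONTROLLERS[page["template"]],
        "status": 200,
        "page_id": page["id"],
    }


found = resolve("/deposits/ira?utm=1")
missing = resolve("/no-such-page/")
```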
However, if you eventually have some static (that is, non-db-stored) routes in your app (which I imagine must be the case), then you'll want to match against them before you hit the db to search for the requested one. That matching sounds like a pain (though perhaps there is a way to ask the router itself if the requested route matches, just like the standard dispatcher does).
In that case, perhaps an alternative approach would be to add all those (non-db-stored) routes to the router in the standard way at bootstrap, and put all this REQUEST_URI inspection, db-searching, and controller/action/layout handling in the 404 handler. If the requested url matches something in your db, then _forward() (not redirect) to that controller/action and set the layout, as above.
It's probably not the most performant solution, since _forward() triggers another iteration in the dispatch loop, but it seems like it could work.
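The alternative ordering (static routes first, database lookup only as a fallback) looks roughly like this as a plain Python sketch. The route tables and controller names are hypothetical, and the second element of the return value merely counts dispatch passes to mirror the extra iteration that _forward() causes.

```python
# Routes registered in the standard way at bootstrap.
STATIC_ROUTES = {"/login/": "AuthController", "/admin/": "AdminController"}
# Stand-in for the db lookup: request path -> template name.
DB_PAGES = {"/contact/": "content", "/deposits/": "section"}


def dispatch(path: str) -> tuple[str, int]:
    """Return (controller, dispatch_passes) for a request path."""
    if path in STATIC_ROUTES:
        return STATIC_ROUTES[path], 1
    # No static route matched, so we are in the 404 handler: try the db.
    template = DB_PAGES.get(path)
    if template is not None:
        # Forwarding to the real controller costs a second pass.
        return f"{template.capitalize()}Controller", 2
    return "ErrorController", 1


static_hit = dispatch("/login/")
db_hit = dispatch("/contact/")
not_found = dispatch("/missing/")
```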