How to read Vertex AI model monitoring statistics

How to read Vertex AI model monitoring statistics - google-cloud-storage

I've found in Google's VertexAI monitoring guide collab information (near the end of the notebook), that all model monitoring related metrics are stored in auto generated bucket called "cloud-ai-platform-". I'd like to review those and see for example data drift over time.
The problem is, I cannot find any documentation mention about it anywhere else.
I've identified the bucket related to that and tried to read them, but they are in binary format, most likely protobuf. Location looks like this: gs://cloud-ai-platform-<id>/model_monitoring/job-<id>/serving/<date>/stats_and_anomalies/<id>/stats/
I've expected to find schemas there as well somewhere so that the reader can be generated, but it's not there.

Related

Firebase analytics - Unity - time spent on a level

is there any possibility to get exact time spent on a certain level in a game via firebase analytics? Thank you so much 🙏
I tried to use logEvents.

The best way to do so would be measuring the time on the level within your codebase, then have a very dedicated event for level completion, in which you would pass the time spent on the level.
Let's get to details. I will use Kotlin as an example, but it should be obvious what I'm doing here and you can see more language examples here.
firebaseAnalytics.setUserProperty("user_id", userId)
firebaseAnalytics.logEvent("level_completed") {
param("name", levelName)
param("difficulty", difficulty)
param("subscription_status", subscriptionStatus)
param("minutes", minutesSpentOnLevel)
param("score", score)
}
Now see how I have a bunch of parameters with the event? These parameters are important since they will allow you to conduct a more thorough and robust analysis later on, answer more questions. Like, Hey, what is the most difficult level? Do people still have troubles on it when the game difficulty is lower? How many times has this level been rage-quit or lost (for that you'd likely need a level_started event). What about our paid players, are they having similar troubles on this level as well? How many people have ragequit the game on this level and never played again? That would likely be easier answer with sql at this point, taking the latest value of the level name for the level_started, grouped by the user_id. Or, you could also have levelName as a UserProperty as well as the EventProperty, then it would be somewhat trivial to answer in the default analytics interface.
Note that you're limited in the number of event parameters you can send per event. The total number of unique parameter names is limited too. As well as the number of unique event names you're allowed to have. In our case, the event name would be level_completed. See the limits here.
Because of those limitations, it's important to name your event properties in somewhat generic way so that you would be able to efficiently reuse them elsewhere. For this reason, I named minutes and not something like minutes_spent_on_the_level. You could then reuse this property to send the minutes the player spent actively playing, minutes the player spent idling, minutes the player spent on any info page, minutes they spent choosing their upgrades, etc. Same idea about having name property rather than level_name. Could as well be id.
You need to carefully and thoughtfully stuff your event with event properties. I normally have a wrapper around the firebase sdk, in which I would enrich events with dimensions that I always want to be there, like the user_id or subscription_status to not have to add them manually every time I send an event. I also usually have some more adequate logging there Firebase Analytics default logging is completely awful. I also have some sanitizing there, lowercasing all values unless I'm passing something case-sensitive like base64 values, making sure I don't have double spaces (so replacing \s+ with " " (space)), maybe also adding the user's local timestamp as another parameter. The latter is very helpful to indicate time-cheating users, especially if your game is an idler.
Good. We're halfway there :) Bear with me.
Now You need to go to firebase and register your eps (event parameters) into cds (custom dimensions and metrics). If you don't register your eps, they won't be counted towards the global cd limit count (it's about 50 custom dimensions and 50 custom metrics). You register the cds in the Custom Definitions section of FB.
Now you need to know whether this is a dimension or a metric, as well as the scope of your dimension. It's much easier than it sounds. The rule of thumb is: if you want to be able to run mathematical aggregation functions on your dimension, then it's a metric. Otherwise - it's a dimension. So:
firebaseAnalytics.setUserProperty("user_id", userId) <-- dimension
param("name", levelName) <-- dimension
param("difficulty", difficulty) <-- dimension (or can be a metric, depends)
param("subscription_status", subscriptionStatus) <-- dimension (can be a metric too, but even less likely)
param("minutes", minutesSpentOnLevel) <-- metric
param("score", score) <-- metric
Now another important thing to understand is the scope. Because Firebase and GA4 are still, essentially just in Beta being actively worked on, you only have user or hit scope for the dimensions and only hit for the metrics. The scope basically just indicates how the value persists. In my example, we only need the user_id as a user-scoped cd. Because user_id is the user-level dimension, it is set separately form the logEvent function. Although I suspect you can do it there too. Haven't tried tho.
Now, we're almost there.
Finally, you don't want to use Firebase to look at your data. It's horrible at data presentation. It's good at debugging though. Cuz that's what it was intended for initially. Because of how horrible it is, it's always advised to link it to GA4. Now GA4 will allow you to look at the Firebase values much more efficiently. Note that you will likely need to re-register your custom dimensions from Firebase in GA4. Because GA4 is capable of getting multiple data streams, of which firebase would be just one data source. But GA4's CDs limits are very close to Firebase's. Ok, let's be frank. GA4's data model is almost exactly copied from that of Firebase's. But GA4 has a much better analytics capabilities.
Good, you've moved to GA4. Now, GA4 is a very raw not-officially-beta product as well as Firebase Analytics. Because of that, it's advised to first change your data retention to 12 months and only use the explorer for analysis, pretty much ignoring the pre-generated reports. They are just not very reliable at this point.
Finally, you may find it easier to just use SQL to get your analysis done. For that, you can easily copy your data from GA4 to a sandbox instance of BQ. It's very easy to do.This is the best, most reliable known method of using GA4 at this moment. I mean, advanced analysts do the export into BQ, then ETL the data from BQ into a proper storage like Snowflake or even s3, or Aurora, or whatever you prefer and then on top of that, use a proper BI tool like Looker, PowerBI, Tableau, etc. A lot of people just stay in BQ though, it's fine. Lots of BI tools have BQ connectors, it's just BQ gets expensive quickly if you do a lot of analysis.
Whew, I hope you'll enjoy analyzing your game's data. Data-driven decisions rock in games. Well... They rock everywhere, to be honest.

Want to know what allth terms in windbg output of "!analyze -v" indicates?

What the key value indicates.......and which is the term help me to undersatnd how the windbg bucket the crashes means how it braodly classify the crashes into?

help me to understand the windbg bucket
IMHO, the idea of buckets was introduced for WER (Windows Error Reporting). WER was used by Microsoft but was also available for companies. WER included a service where you could log in on a Microsoft website and then get an overview of your application crashes.
Of course, people were not interested in a flat list of crashes, but they wanted to know how many crashes of the same type occured. Thus Microsoft and other company could focus on fixing those bugs first which affected most of the users.
The bucket, as the name suggests, is a container where similar problems grouped. The bucket ID is generated in 2 phases: a labeling process which was done on the client side and a classifying process which was done on server side.
What you get from !analyze is the classification, so basically you have access to the functionality via WinDbg that Microsoft used on the server side for providing the WER services.
These WER services are not available any more. They hae been replaced by something else, but I have forgotten the name.
how it braodly classify the crashes into?
An ideal bucketing algorithm would create a new bucket for each bug. So the number of buckets is just limited by the amount of bugs you can code into your application.
The command !analyze has implemented more than 500 different heuristics. The combination of these can create more than 25.000.000 different buckets.
Buckets can differ because of
stack
modules
function name
function offset
corruptions (heap corruption, image corruption)
detected malware
known outdated programs or libraries
known defective hardware
exception codes
exception subcodes
...
The result of that bucketing process is this line of output:
FAILURE_BUCKET_ID: BREAKPOINT_80000003_ntdll.dll!LdrpDoDebuggerBreak
which is probably somehow equivalent to this hash:
FAILURE_ID_HASH: {06f54d4d-201f-7f5c-0224-0b1f2e1e15a5}
I have read some of your previous questions in the windbg tag and I get the impression that you want to use the bucket ID to display some meaningful information to humans.
Actually, the WER system provided such a feature. It worked like this: a developer analyzes the crashes in a bucket and finds out what to do (e.g. update a driver, install a newer version of the application etc). He then assigns that bucket ID a text. Any customers that experience the same crash again were redirected to a website at Microsoft that contained the text written by the developer.
However, note that there is no magic involved that would transfer a crash into something human readable. That's the developer doing hard work and then creating a mapping from the bucket ID to some text that is displayed.
IMHO, the latter can easily be achieved. However, any new bug will require an analysis first. But, who knows, maybe we can train an AI that does better at this.
For more on buckets etc. please read the Microsoft paper Debugging in the (Very) Large:Ten Years of Implementation and Experience

Need some help understanding Vocabulary of Interlinked Dataset (VoID) in Linked Open Data

I have been trying to understand VoID in Linked Open Data. It would be great if anyone could help clarify some of my confusions.
Does it need to be stored in a separate file or it can be included in the RDF dataset itself? If so, how do I query it? (A sample query would be really helpful)
How is the information in VoID used in real life?

Does it need to be stored in a separate file or it can be included in the RDF dataset itself? If so, how do I query it? (A sample query would be really helpful)
In theory not, but for practical purposes yes. In the end the information is encoded in triples, so it doesn't really matter in what file you put them and you could argue that it's best to actually put the VoID info into the data files and serve these triples with your data as meta-info. It's query-able as all other forms of RDF, either load it into some SPARQL endpoint or use a library that can directly load RDF files. This however also shows the reason why a separate file makes sense: instead of having to load potentially large data files just to get some dataset meta info, it makes sense to offer the meta-data in its own file.
How is the information in VoID used in real life?
VoID is actually used in several scenarios already, but mostly a recommendation and a good idea. The most prominent use-cases i know of is to get your dataset shown in the LOD Cloud. You currently have to register it with datahub.io and add a VoID file (example from my associations dataset).
Other examples (sadly many defunct nowadays) can be found here: http://semanticweb.org/wiki/VoID.html

EventStore basics - what's the difference between Event Meta Data/MetaData and Event Data?

I'm very much at the beginning of using / understanding EventStore or get-event-store as it may be known here.
I've consumed the documentation regarding clients, projections and subscriptions and feel ready to start using on some internal projects.
One thing I can't quite get past - is there a guide / set of recommendations to describe the difference between event metadata and data ? I'm aware of the notional differences; Event data is 'Core' to the domain, Meta data for describing, but it is becoming quite philisophical.
I wonder if there are hard rules regarding implementation (querying etc).
Any guidance at all gratefully received!

Shamelessly copying (and paraphrasing) parts from Szymon Kulec's blog post "Enriching your events with important metadata" (emphases mine):
But what information can be useful to store in the metadata, which info is worth to store despite the fact that it was not captured in
the creation of the model?
1. Audit data
who? – simply store the user id of the action invoker
when? – the timestamp of the action and the event(s)
why? – the serialized intent/action of the actor
2. Event versioning
The event sourcing deals with the effect of the actions. An action
executed on a state results in an action according to the current
implementation. Wait. The current implementation? Yes, the
implementation of your aggregate can change and it will either because
of bug fixing or introducing new features. Wouldn’t it be nice if
the version, like a commit id (SHA1 for gitters) or a semantic version
could be stored with the event as well? Imagine that you published a
broken version and your business sold 100 tickets before fixing a bug.
It’d be nice to be able which events were created on the basis of the
broken implementation. Having this knowledge you can easily compensate
transactions performed by the broken implementation.
3. Document implementation details
It’s quite common to introduce canary releases, feature toggling and
A/B tests for users. With automated deployment and small code
enhancement all of the mentioned approaches are feasible to have on a
project board. If you consider the toggles or different implementation
coexisting in the very same moment, storing the version only may be
not enough. How about adding information which features were applied
for the action? Just create a simple set of features enabled, or map
feature-status and add it to the event as well. Having this and the
command, it’s easy to repeat the process. Additionally, it’s easy to
result in your A/B experiments. Just run the scan for events with A
enabled and another for the B ones.
4. Optimized combination of 2. and 3.
If you think that this is too much, create a lookup for sets of
versions x features. It’s not that big and is repeatable across many
users, hence you can easily optimize storing the set elsewhere, under
a reference key. You can serialize this map and calculate SHA1, put
the values in a map (a table will do as well) and use identifiers to
put them in the event. There’s plenty of options to shift the load
either to the query (lookups) or to the storage (store everything as
named metadata).
Summing up
If you create an event sourced architecture, consider adding the
temporal dimension (version) and a bit of configuration to the
metadata. Once you have it, it’s much easier to reason about the
sources of your events and introduce tooling like compensation.
There’s no such thing like too much data, is there?

I will share my experiences with you which may help. I have been playing with akka-persistence, akka-persistence-eventstore and eventstore. akka-persistence stores it's event wrapper, a PersistentRepr, in binary format. I wanted this data in JSON so that I could:
use projections
make these events easily available to any other technologies
You can implement your own serialization for akka-persistence-eventstore to do this, but it still ended up just storing the wrapper which had my event embedded in a payload attribute. The other attributes were all akka-persistence specific. The author of akka-persistence-eventstore gave me some good advice, get the serializer to store the payload as the Data, and the rest as MetaData. That way my event is now just the business data, and the metadata aids the technology that put it there in the first place. My projections now don't need to parse out the metadata to get at the payload.

Documentation or specification for .step and .stp files

I am looking for some kind of specification, documentation, explanation, etc. for .stp/.step files.
It's more about what information each line contains instead of a general information.
I can't seem to figure out what each value means all by myself.
Does anyone know some good readings about STEP files?
I already searched google but all I got were information about the general structure instead of each particular value.

The structure of STEP-File, i.e. the grammar and the logic behind how the file is organized is described in the standard ISO 10303-21.
ISO 10303 or STEP is divided into Application Protocols (AP). Each AP defines a schema written in EXPRESS. The schemas are available on the Internet: the CAX-IF provides some, STEPtools has some good HTML documentations.
The reference of the AP schemas is hosted on stepmod.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse