We are trying to see if Graphite will fit our use case. We have a number of public parameters, essentially key-value pairs.
Say:
Data:
Caller: abc
Site: xyz
HTTP status: 400
...plus 6-7 more similar fields (key-value pairs).
This data is continuously posted to us in a data report. What we want is to draw visualisations over this data.
We want graphs that show things like how many 400s occurred per site, or which are the top sites or callers producing 400s.
Now we are wondering if this can be done with Graphite.
But we have questions. Graphite stores numerical values, so how will we represent this data in Graphite?
Something like this?
Clicks.metric.status.400 1 currTime
Clicks.metric.site.xyz 1 currTime
Clicks.metric.caller.abc 1 currTime
Adding 1 as the numerical value to record the event.
Also, how will we group a set of values together?
For example, this HTTP status belongs to this site because both come from the same record.
In that case we would need something like:
Clicks.metric.status.{uuid1}.400 1 currTime
Clicks.metric.site.{uuid1}.xyz 1 currTime
Our aim is then to use Grafana to build graphs over this data, e.g. which are the top sites showing a 400 status?
Will this approach work?
regards
Graphite accepts three types of data: plaintext, pickled, and AMQP.
The plaintext protocol is the most straightforward protocol supported
by Carbon.
The data sent must be in the following format: <metric path> <metric
value> <metric timestamp>. Carbon will then help translate this line
of text into a metric that the web interface and Whisper understand.
If you're new to Graphite (which it sounds like you are), plaintext is definitely the easiest to get going with.
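As an illustration, here is a minimal Python sketch of sending one event over the plaintext protocol; the Carbon host and the metric path are placeholders rather than anything from the question, and 2003 is Carbon's default plaintext port.

import socket
import time

CARBON_HOST, CARBON_PORT = "graphite.example.com", 2003  # hypothetical Carbon host

def send_metric(path, value, timestamp=None):
    # One line per metric: "<metric path> <metric value> <metric timestamp>\n"
    line = "%s %s %d\n" % (path, value, int(timestamp or time.time()))
    with socket.create_connection((CARBON_HOST, CARBON_PORT)) as sock:
        sock.sendall(line.encode("utf-8"))

# Record one click event that returned a 400 for site xyz and caller abc.
send_metric("clicks.site.xyz.caller.abc.status.400", 1)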
As to how you'll be able to group metrics and perform operations on them, you have to remember that Graphite doesn't natively store any of this grouping for you. It stores timeseries metrics and provides functions that manipulate that data for visual / reporting purposes. So when you send a metric, prod.host-abc.application-xyz.grpc.GetStatus.return-codes.400 1 1522353885, all you're doing is storing the value 1 for that specific metric at timestamp 1522353885. You can then use Graphite functions to display that data, e.g. sumSeries(prod.*.application-xyz.grpc.GetStatus.return-codes.400) will produce a sum of all 400 error codes from all hosts.
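And a hedged sketch of pulling that aggregated series back out through Graphite's render API; the host name is a placeholder, and the target is the sumSeries() expression from above.

import requests

resp = requests.get("http://graphite.example.com/render", params={
    "target": "sumSeries(prod.*.application-xyz.grpc.GetStatus.return-codes.400)",
    "from": "-24h",
    "format": "json",
})
# Each series comes back as {"target": ..., "datapoints": [[value, timestamp], ...]}
datapoints = resp.json()[0]["datapoints"]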
Related
I have added GTM and GA4 to some website apps to produce tables of detailed stats on click-throughs of ads per advertiser for a date range. I now have suitable reports working successfully using Data Studio, but my attempts to do the same using the PHP implementation of the Analytics Data API V1 Beta (in order to do batch runs covering many date ranges) repeatedly hit a brick wall: the methods needed to analyse the response from instantiating BetaAnalyticsDataClient and then invoking runPivotReport or batchRunReports or batchRunPivotReports (and so on) appear not to be specified.
The only example that I could work from is the ‘quickstart’ one that does a basic dimension and metric retrieval, and even this employs:
getRows()
getDimensionValues()
getValue()
getMetricValues()
methods that do not appear in the API documentation, at least none that I can find.
The JSON response format for each report is of course documented: for example the output from running runPivotReport is documented as an instantiation of runPivotReportResponse.
But nowhere can I find a specification of the methods to be used to traverse the JSON tree (vide getDimensionValues() above) and extract some output data.
Guesswork has taken me part way, but purely for example, when retrieving pivot data, should a
getPivotDimensionHeaders()[0]
be followed by a
getDimensionValues()
or a
getPivotDimensionValues()
I am obviously approaching this all wrong, but what should I do, please?
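For reference, this is roughly how the response tree is walked with the Python client (google-analytics-data), which mirrors the documented JSON structure; the property ID, dimension, and metric names below are placeholders, and the PHP getters should map onto the same fields.

from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
    DateRange, Dimension, Metric, RunReportRequest,
)

client = BetaAnalyticsDataClient()
request = RunReportRequest(
    property="properties/123456789",
    dimensions=[Dimension(name="pagePath")],
    metrics=[Metric(name="screenPageViews")],
    date_ranges=[DateRange(start_date="2023-01-01", end_date="2023-01-31")],
)
response = client.run_report(request)

# Each row carries parallel lists of dimension values and metric values.
for row in response.rows:
    dims = [dv.value for dv in row.dimension_values]
    mets = [mv.value for mv in row.metric_values]
    print(dims, mets)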
I use the Google Analytics (GA) Reporting API to retrieve custom data from my GA views. Often my queries include a mixture of standard and custom dimensions regarding different URLs and their respective page views, e.g. something like ga:pagePath, ga:pageTitle, ga:dimensionX, where dimensionX is set on a hit level and sent with every page view (like publishing date or some CMS identifier).
The returned data regularly includes rows that represent some kind of dimension combination with 0 page views. How can that be? Why would GA report a data point with 0 hits?
PS: I don't use 360, so sampling applies.
Looks like you have set includeEmptyRows: true in the request.
includeEmptyRows boolean
If set to false, the response does not include rows if all the
retrieved metrics are equal to zero. The default is false which will
exclude these rows.
from: https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/reports/batchGet#ReportRequest
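A minimal sketch of where that flag sits in a v4 batchGet request, assuming a service object built with googleapiclient; the view ID and dimension names are placeholders.

from googleapiclient.discovery import build

analytics = build("analyticsreporting", "v4")  # credentials omitted in this sketch

body = {
    "reportRequests": [{
        "viewId": "123456789",
        "dateRanges": [{"startDate": "2018-01-01", "endDate": "2018-01-31"}],
        "metrics": [{"expression": "ga:pageviews"}],
        "dimensions": [
            {"name": "ga:pagePath"},
            {"name": "ga:pageTitle"},
            {"name": "ga:dimension1"},
        ],
        # Omit this or leave it at the default (False) so that rows whose
        # metrics are all zero are dropped from the response.
        "includeEmptyRows": False,
    }]
}
response = analytics.reports().batchGet(body=body).execute()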
I'm pretty confused concerning this hip thing called NoSQL, especially CloudantDB by Bluemix. As you know, this DB doesn't store the values chronologically. It's the programmer's task to sort the entries in case he wants the data to.. well.. be sorted.
What I try to achieve is to simply get the last, let's say, 100 values a sensor has sent to Watson IoT (which saves everything in the connected CloudantDB) in an ORDERED way. In the end it would be nice to show them in a D3.js-style graph, but that's another task. I first need the values in an ordered array.
What I tried so far: I used curl to get the data via PHP from https://averylongID-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_all_docs?limit=20&include_docs=true
What I get is an unsorted array of 20 row entries with random timestamps. The last 20 entries in the DB. But not in terms of timestamps.
My question is now: Do you know of a way to get the "last" 20 entries? Sorted by timestamp? I did a POST request with a JSON string where I wanted the data to be sorted by the timestamp, but that doesn't work, maybe because of the ISO timestamp string.
Do I really have to write a javascript or PHP script to get ALL the database entries and then look for the 20 or 100 last entries by parsing the timestamp, sorting the array again and then get the (now really) last entries? I can't believe that.
Many thanks in advance!
I finally found out how to get the data in a nice ordered way. The key is to use the _design api together with the _view api.
So a curl request with the following URL / attributes and a query string did the job:
https://alphanumerical_something-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_design/iotp/_view/by-date?limit=120&q=name:%27timestamp%27
The curl result gets me the first (in terms of time) 120 entries. I just have to find out how to get the last entries, but that's already a pretty good result. I can now pass the data on to a nice JS chart and display it.
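A hedged note on that last point: CouchDB/Cloudant views accept a descending=true parameter, so combined with limit it should return the newest rows first. The database, design document, and credentials below are placeholders copied from the URL above.

import requests

VIEW = ("https://alphanumerical_something-bluemix.cloudant.com/"
        "iotp_orgID_iotdb_2018-01-25/_design/iotp/_view/by-date")
resp = requests.get(VIEW,
                    params={"limit": 20, "descending": "true"},
                    auth=("username", "password"))  # placeholder credentials
rows = resp.json()["rows"]  # newest-first when the view is keyed by timestamp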
One option may be to include the timestamp as part of the ID. The _all_docs query returns documents in order by id.
If that approach does not work for you, you could look at creating a secondary index based on the timestamp field. One type of index is Cloudant Query:
https://console.bluemix.net/docs/services/Cloudant/api/cloudant_query.html#query
Cloudant query allows you to specify a sort argument:
https://console.bluemix.net/docs/services/Cloudant/api/cloudant_query.html#sort-syntax
Another approach that may be useful for you is the _changes api:
https://console.bluemix.net/docs/services/Cloudant/api/database.html#get-changes
The changes API allows you to receive a continuous feed of changes in your database. You could feed these changes into a D3 chart for example.
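For the Cloudant Query route, a minimal sketch might look like this, assuming the documents carry a timestamp field; the field name, database URL, and credentials are placeholders.

import requests

DB = "https://averylongID-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25"
AUTH = ("username", "password")  # placeholder credentials

# One-time setup: a JSON index on the timestamp field so it can be sorted on.
requests.post(DB + "/_index", auth=AUTH, json={
    "index": {"fields": ["timestamp"]},
    "name": "timestamp-index",
    "type": "json",
})

# Fetch the 20 newest documents, sorted by timestamp descending.
resp = requests.post(DB + "/_find", auth=AUTH, json={
    "selector": {"timestamp": {"$gt": None}},
    "sort": [{"timestamp": "desc"}],
    "limit": 20,
})
docs = resp.json()["docs"]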
My understanding of REST is that anything that does not change the state of the underlying system (e.g. a query) should be a GET request. This also means that query parameters have to be put into the URI like so:
api/SomeMethod/Parameter1/{P1:double}/Parameter2/{P2:double}
or as query strings as discussed here:
REST API Best practice: How to accept list of parameter values as input
Sometimes the query may require a lengthy vector (a large number of x/y points). How do I overcome the URI length problem here? Should I just use a POST? Thanks.
If the vector really is big enough to start worrying about, you should consider moving it out of the query params and representing it as a RESTful resource.
For example, create a collection at:
api/Vector
Then your API clients can POST their large vectors and then in another request refer to it by a single id number.
This reduces the query length drastically, abides by REST, and allows these vectors to be easily reused. If you are worried about storage, you can expire vectors after 30 minutes or longer.
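A hedged sketch of that two-step pattern from the client's side; the host, paths, and field names are placeholders.

import requests

BASE = "https://api.example.com"

# 1) Create the vector as its own resource.
points = [{"x": 0.0, "y": 1.2}, {"x": 0.5, "y": 1.9}]  # possibly thousands of points
created = requests.post(BASE + "/api/Vector", json={"points": points})
vector_id = created.json()["id"]

# 2) Reference it by id in the actual query, keeping the URI short.
result = requests.get(BASE + "/api/SomeMethod", params={"vectorId": vector_id})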
Another option is to go down the JSON-LD road, which is similar except you don't host the vectors. You just provide an @context object, and API clients host the vector on their own server and reference it in requests to your API by URL in a query parameter.
Explanation:
I am able to query the Google Core Reporting API v3 using the client library to get data on pageviews for specific URLs of a website I am working on. I want to get data (pageviews) for each day within a specified range. So far I am simply looping through the range, sending individual requests to the API. In each request I set the same value for the start date and the end date.
Problem:
Obviously this gets the job done, BUT it is certainly not the best way to go about it, because, assuming I want to get data for the past 3 months for each of about 2000 URIs, I would need 360,000 requests, which is well over the quota limit defined by Google.
Potential solution: one way I thought of solving this issue is to send a request with start-date and end-date set a week apart, but then the API returns a sum of the values rather than the individual daily values.
Main question: is there a way to insist that these values are not added up and returned as a sum, but rather returned separately (as an associative array or something like that) for each day?
I hope the question is clear and that there is a solution! Thank you!
Very straightforward:
Metric: ga:pageviews, Dimension: ga:date. Set a filter for your page path, and set a start-date and end-date.
Example:
https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3Axxyyzz&dimensions=ga%3Adate&metrics=ga%3Apageviews&filters=ga%3Apagepath%3D%3D%2Ffaq.html&start-date=2013-06-27&end-date=2013-07-11&max-results=50
This will return the pageviews for the faq.html page for each day in the time frame.
You should check out the QueryExplorer. Great tool to find out how to structure queries.
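For completeness, the same query issued through the Python client library might look like this; the service object and view ID are placeholders, and build() would normally also be given credentials.

from googleapiclient.discovery import build

service = build("analytics", "v3")  # credentials omitted in this sketch

result = service.data().ga().get(
    ids="ga:123456789",
    start_date="2013-06-27",
    end_date="2013-07-11",
    metrics="ga:pageviews",
    dimensions="ga:date",
    filters="ga:pagePath==/faq.html",
    max_results=50,
).execute()

# One row per day: [date, pageviews]
for date, pageviews in result.get("rows", []):
    print(date, pageviews)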