Why is subscribersGained - subscribersLost different than my current sub count? - youtube-analytics-api

I don't see an option in documentation for just "subscribers" but I can take subscribersGained and subtract subscribersLost. However, the number calculated is a bit lower than the actual result. Is there a reason for this or is there a way to actually get raw sub count?

As of writing (June 2021), there is no metric or dimension in the YouTube Analytics API that returns the total number of subscribers, although, as you have already noted, it is possible to collect the number of people who have subscribed and unsubscribed.
A workaround is to collect the subscribers gained and lost (using the subscribersGained and subscribersLost metrics) over the whole period since the channel was created, and subtract one from the other to get the total, e.g.:
Total number of subscribers = subscribersGained - subscribersLost
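The subtraction can be sketched in a few lines of Python. This is only an illustration: the row layout (day, subscribersGained, subscribersLost) and the sample numbers are assumptions, not the actual shape of an API response. The result only approximates the channel total if the rows cover the full period since the channel was created.

```python
def total_subscribers(rows):
    """Sum gained minus lost over every reported day.

    rows: iterable of (day, subscribers_gained, subscribers_lost) tuples.
    This layout is hypothetical; adapt it to however you parse the report.
    """
    gained = sum(row[1] for row in rows)
    lost = sum(row[2] for row in rows)
    return gained - lost


# Made-up sample data for illustration only.
sample_rows = [
    ("2021-06-01", 12, 3),
    ("2021-06-02", 7, 5),
]
print(total_subscribers(sample_rows))
```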

Related

PromQL Requests per minute

I'm trying to graph the total number of POST requests per minute, but there's a "ramp up" pattern that leads me to believe I'm getting a cumulative value rather than the actual total of requests per minute.
Here is my query:
sum_over_time(django_http_responses_total_by_status_view_method_total{job="django-prod-app", method="POST", view="twitch_webhooks"}[1m])
Here are the "ramp up" patterns over 7 days (drop-offs indicating a reboot):
I suspect my understanding of sum_over_time() is incorrect, because the existing webhooks should always exist. At the time of the most recent reboot we had 72k webhook subscriptions, so it doesn't make sense for the value to climb over time; it would make more sense to see a large spike at the start, catching up on webhooks that were not captured during downtime.
Is this query correct for what I'm trying to achieve?
I am using django-prometheus for exporting.
You want increase rather than sum_over_time, as this is a counter.
If the django_http_responses_total_by_status_view_method_total metric is a counter, then the increase() function must be used to return the number of requests during the last minute:
increase(django_http_responses_total_by_status_view_method_total[1m])
Note that the increase() function in Prometheus can return fractional results even if the django_http_responses_total_by_status_view_method_total metric contains only integer values. This is due to implementation details - see this comment and this article for details.
If the django_http_responses_total_by_status_view_method_total metric is a gauge that shows the number of requests since the previous sample, then the sum_over_time() function must be used to return the number of requests during the last minute:
sum_over_time(django_http_responses_total_by_status_view_method_total[1m])

How can I instantly get the result of my produced event in kafka and kafka-streams?

I am simplifying my problem with the following scenario:
3 friends share a loyalty card. The card has two restrictions:
1) It can be used at most 10 times (it does not matter who uses the card, i.e. friend_a can use it 10 times).
2) The max money on the card is 200. So with 1 "event" of value = 200 the card is "completed".
I am using a Kafka producer that sends events to the Kafka cluster like this:
{ "name": "friend_1", "value": 10 }
{ "name": "friend_3", "value": 20 }
The events are posted to a topic that is connected to a Kafka stream, which groups by key and aggregates to sum the money spent. That seems to work; however, I am facing a "concurrency issue".
Let's imagine the card has been used 9 times, so only 1 use remains, and the total money spent is 190, which means there are 10 units left to spend.
So friend_2 wants to buy something that costs 11 units (which should not be allowed) and friend_3 wants to buy something that costs 9 units, which should be allowed. Friend_3 will modify the state by using the card for the 10th time. All other future attempts should not modify anything.
So it seems reasonable for the card user to know whether the event he sent modified the number of uses and the total count. How can I do it in Kafka? Using the streams aggregation I can always increase the values, but how do I know if my action "modified the state" of the card?
UPDATE: the card user should immediately get negative feedback if the transaction violates a rule.
From how I understand your question there are a few options.
One option is to derive a new stream after the aggregation that you filter() for data that would modify 'the state' of the card, like filtering all events that have > 200 units spent or > 10 uses. This stream can then be used to inform the card user(s) that the card has been spent, e.g. by sending an email. This approach can be implemented solely with the DSL.
When more flexibility or tighter control is needed, another option is to use the Processor API (which you can integrate with the DSL so most of your code could keep using the DSL), where you implement the aggregation step yourself (working with state stores that you attach to a Transformer or Processor). During the aggregation, you can implement the logic that checks whether the incoming event is valid (in your example: friend_3 with 9 units is valid, friend_2 with 11 units is not). If it is valid, the aggregation increases the card's counters (units and uses), and that's it. If it is invalid, the event is discarded and will not modify the counters, and the Transformer/Processor can emit a new event to another stream that tells the card user(s) that something didn't work. You can similarly implement the functionality to inform users that a card has been fully used, that a card is no longer usable, or any other 'state change' of that card.
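The validation step described above could look roughly like the following. This is plain Python, not Kafka Streams code: CardState, apply_event and the two constants are hypothetical names, and in a real topology the state would live in a state store attached to the Transformer/Processor, with rejected events forwarded to a separate stream.

```python
from dataclasses import dataclass

MAX_USES = 10    # restriction 1 from the question
MAX_TOTAL = 200  # restriction 2 from the question


@dataclass(frozen=True)
class CardState:
    uses: int = 0
    total: int = 0


def apply_event(state, event):
    """Return (new_state, accepted) for one incoming event.

    An event that would break either limit is rejected and leaves the
    state untouched; the caller can then emit a "rejected" event.
    """
    value = event["value"]
    if state.uses + 1 > MAX_USES or state.total + value > MAX_TOTAL:
        return state, False
    return CardState(state.uses + 1, state.total + value), True


# The scenario from the question: 9 uses and 190 units already spent.
state = CardState(uses=9, total=190)
state, ok_friend_2 = apply_event(state, {"name": "friend_2", "value": 11})
state, ok_friend_3 = apply_event(state, {"name": "friend_3", "value": 9})
state, ok_later = apply_event(state, {"name": "friend_1", "value": 1})
print(ok_friend_2, ok_friend_3, ok_later, state)
```

Returning the accepted flag together with the new state is what lets the caller tell each card user whether their event "modified the state".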
Also, depending on what you want to do, take a look at the interactive queries feature of Kafka Streams. Sometimes other applications may want to do quick point lookups (queries) of the latest state of something, like the card's state, which can be done with interactive queries via e.g. a REST API.
Hope this helps!

Last value corresponding to each key sent on a Kafka topic

We have a Kafka topic configured on which we publish accumulated reports for each stock we traded throughout the day.
For example Stock A - Buy-50, Sell-60, Stock B - Buy-44, Sell-34 etc. The key while publishing is RIC code of the stock.
The next day I want all consumers to get the last published positions for each stock individually. I want to understand how to configure Kafka producer/consumer to achieve this behavior.
One thing that comes to mind is creating a partition for each stock; this will result in individual offsets for each stock, and all consumers can point to the HIGHEST offset and get the latest position.
Is this the correct approach or am I missing something obvious?
Your approach will work, but only if you don't care about the time boundaries too much - for example, you do not need to get the counts for each day separately, with a strict requirement that only events that happened between say, [01/25/2017 00:00 - 01/26/2017 00:00] must be counted.
If you do need to get counts per day in a strict manner, you could try using Kafka Streams, with the RIC as the key and the window set to 24 hours based on the event timestamp.
This is just one other way to do that - I'm sure there are more approaches available!
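As a rough illustration of the windowing idea (not Kafka Streams code), the sketch below groups records by key and by the UTC day of their event timestamp, and keeps the last report in each 24-hour window. The function names, the record layout (key, epoch seconds, payload), the key "RIC_A"/"RIC_B" and some of the payload values are assumptions made for the example.

```python
from datetime import datetime, timezone


def utc_ts(year, month, day, hour):
    """Helper for the example: epoch seconds for a UTC wall-clock time."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp()


def last_report_per_key_per_day(records):
    """Keep the latest payload per (key, UTC day of the event timestamp).

    records: iterable of (key, event_timestamp_seconds, payload).
    """
    latest = {}
    for key, ts, payload in records:
        window = (key, datetime.fromtimestamp(ts, tz=timezone.utc).date())
        if window not in latest or ts >= latest[window][0]:
            latest[window] = (ts, payload)
    return {window: payload for window, (_, payload) in latest.items()}


# Illustrative records only.
records = [
    ("RIC_A", utc_ts(2017, 1, 25, 9), "Buy-50, Sell-60"),
    ("RIC_A", utc_ts(2017, 1, 25, 17), "Buy-70, Sell-65"),
    ("RIC_B", utc_ts(2017, 1, 25, 12), "Buy-44, Sell-34"),
    ("RIC_A", utc_ts(2017, 1, 26, 8), "Buy-5, Sell-0"),
]
result = last_report_per_key_per_day(records)
for window, payload in sorted(result.items()):
    print(window, payload)
```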

rate limit policy on queries to Azure Insights REST API for Events (Audit Logs)

I have some questions regarding Azure Insights REST Api for Events.
When I make an HTTP request to the Insights API for events, I receive the header "x-ms-ratelimit-remaining-subscription-reads" with the value "14999".
But the next query, 1s later, returns the same value of remaining reads.
I see there is some throttling policy there, but I would like to understand how it works and what is the correct way to deal with that.
In particular,
1) How many reads am I able to do per second?
2) If I use up all of the remaining reads, how long should I wait before the value is back at its maximum?
3) Is it decreased on every query attempt, regardless of the $top parameter set and how many results have been returned?
Thank you!
This article seems to have the responses you need.
To answer the questions based on it:
1) There is no limit on the number of requests per second, but you have 15k requests per hour, per subscription, per instance of an ARM region. In the worst case you will get throttled after 15k requests, but you'd have to be extremely unlucky for that.
2) If you exceed the limit, you are told how long you have to wait, and you can integrate that logic by looking at the Retry-After header. Happily, it's a matter of seconds.
3) I believe the $top parameter doesn't affect the count, since no matter how many results are brought back, a paging request is still just one request.
As for the fact that you get 14999 requests remaining multiple times: as they say in their documentation, this is expected, since an ARM region has multiple instances and each instance has a limit of 15k requests per subscription per hour. If you send requests at the same time and get the same number remaining, it just means that you were lucky enough to hit different instances within the same ARM region.
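A client could turn the two headers mentioned above into a simple back-off decision along these lines. This is a minimal sketch with assumptions: the function names and the fallback delay are made up, a throttled response is assumed to arrive with HTTP status 429, and Retry-After is assumed to carry a number of seconds.

```python
DEFAULT_BACKOFF_SECONDS = 5.0  # hypothetical fallback if no header is present


def _lowercased(headers):
    """HTTP header names are case-insensitive, so normalise them first."""
    return {name.lower(): value for name, value in headers.items()}


def remaining_reads(headers):
    """Remaining reads reported by the instance that served the request."""
    value = _lowercased(headers).get("x-ms-ratelimit-remaining-subscription-reads")
    return int(value) if value is not None else None


def seconds_to_wait(status_code, headers):
    """How long to pause before retrying; 0.0 if the call was not throttled."""
    if status_code != 429:
        return 0.0
    return float(_lowercased(headers).get("retry-after", DEFAULT_BACKOFF_SECONDS))


ok_headers = {"x-ms-ratelimit-remaining-subscription-reads": "14999"}
print(remaining_reads(ok_headers), seconds_to_wait(200, ok_headers))
print(seconds_to_wait(429, {"Retry-After": "7"}))
```

Because each instance keeps its own count, the remaining-reads value is best treated as a hint rather than an exact budget.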
1) how many reads I am able to do per second?
Based on the rate limits published here - https://azure.microsoft.com/en-in/documentation/articles/azure-subscription-service-limits/#subscription-limits, you can perform 15000 reads / hour (not sure it would translate to 4 reads / second).
2) if I exceed the whole remaining reads parameter, how much time should I wait before it will again be maximum?
Given the rates are defined per hour, my guess would be to wait till next hour if you exhaust 15000 read request limit.
3) is it decreased on every query attempt, despite of the $top parameter setted and how many results has been returned?
This is based on the number of API calls and not the amount of data returned. So I would say defining $top parameter should not have any impact on this.
When I make HTTP request to Insights API for events, I receive the header "x-ms-ratelimit-remaining-subscription-reads", with value "14999". But next query in 1s returns me the same value of remaining reads.
I would assume there's some caching in play here. Is it the same request you're repeating, or a different request altogether?

Google Measurement Protocol offline apps and event dates

I want to use Google Measurement Protocol to record offline events, i.e. take data from an EPOS system and track them in Google Analytics. This would be a batch process once a day. How do I tell Google what the date of the event is? If the console app went offline for a few days I wouldn't want three days worth of events to be associated with one day.
Your best bet currently is to use the Queue Time Measurement Protocol parameter.
v=1&tid=UA-123456-1&cid=5555&t=pageview&dp=%2FpageA&qt=343
Queue Time is used to collect offline / latent hits. The value represents the time delta (in milliseconds) between when the hit being reported occurred and the time the hit was sent. The value must be greater than or equal to 0. Values greater than four hours may lead to hits not being processed.
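For a batch job, the qt value for each hit can be derived from the stored event time and the time of sending. The sketch below is only an illustration: build_hit and may_be_dropped are hypothetical helper names, and the tid and cid values are the placeholders from the example payload above.

```python
from urllib.parse import urlencode

FOUR_HOURS_MS = 4 * 60 * 60 * 1000  # threshold mentioned in the description of qt


def build_hit(event_time_ms, send_time_ms, page):
    """Build a pageview payload whose qt is the delay since the event."""
    queue_time_ms = max(0, send_time_ms - event_time_ms)  # qt must be >= 0
    return urlencode({
        "v": 1,
        "tid": "UA-123456-1",  # placeholder property ID
        "cid": 5555,           # placeholder client ID
        "t": "pageview",
        "dp": page,
        "qt": queue_time_ms,
    })


def may_be_dropped(event_time_ms, send_time_ms):
    """True if the delay exceeds four hours, so the hit may not be processed."""
    return send_time_ms - event_time_ms > FOUR_HOURS_MS


payload = build_hit(event_time_ms=1000, send_time_ms=1343, page="/pageA")
print(payload)
print(may_be_dropped(0, 3 * 24 * 60 * 60 * 1000))  # an event from three days ago
```

Given the four-hour caveat, events that are several days old fall well outside the range where qt can be relied on.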