Google Analytics Reporting API (UA) for GA360 account limited to 10,000 rows?

I'm using the Google Analytics Reporting API (UA v4) for a GA 360 client and getting at most 10,000 rows per API call.
I use pagination (with pageToken), but it doesn't get around this limit: the total number of rows retrieved is always 10,000.
The email address used to access the API is added as a Viewer (not as an Analyst).
I do not encounter this problem with other websites.
How can I overcome this limit? Is it a GA 360 setting?
The pagination part of the code:
import pandas as pd

# first request, then follow nextPageToken until the API stops returning one
response = get_report(analytics, date_start, date_end)
pageToken = response['reports'][0].get('nextPageToken')
df = df_response(response)
while pageToken is not None:
    response = get_report(analytics, date_start, date_end, pageToken)
    pageToken = response['reports'][0].get('nextPageToken')
    df_temp = df_response(response)
    df = pd.concat([df, df_temp], axis=0, ignore_index=True)
The output, generated with pageSize set to 50,000:
Strangely, the pageToken already clips at 10,000.
If I do this for another website, the expected pageToken of 50,000 shows up and it continues to retrieve all the data:
The problem occurs both with and without clientID. Without clientID this is the output:
The same code and the same setup with clientID clips at 10,000 rows:

When you send a request to Google Analytics for your report, you can set the pageSize to anything up to 100k. You probably have it set to 10k currently.
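For reference, a minimal sketch of what a get_report with a larger pageSize might look like, assuming the v4 batchGet endpoint of the google-api-python-client; VIEW_ID and the metric/dimension choices are placeholders, not taken from the question:

def get_report(analytics, date_start, date_end, page_token=None):
    # pageSize defaults to 1,000 in the v4 API and is capped at 100,000
    request = {
        'viewId': 'VIEW_ID',
        'dateRanges': [{'startDate': date_start, 'endDate': date_end}],
        'metrics': [{'expression': 'ga:sessions'}],
        'dimensions': [{'name': 'ga:pagePath'}],
        'pageSize': 100000,
    }
    if page_token:
        request['pageToken'] = page_token  # resume where the previous page ended
    return analytics.reports().batchGet(
        body={'reportRequests': [request]}
    ).execute()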

Related

Qlik REST Connector - How to retrieve all paginated data from an API and load it into a Qlik Sense server

I am trying to load all paginated data from an API (as shown below) into a Qlik Sense server, but it only gives me the first page of data, whereas my requirement is to load all the paginated data to the server.
I have to use an XML/JSON script in the request body to achieve this.
I tried the Offset pagination type, but it looks like the Total records path is not specified by the API data provider.
What about not relying on Qlik to perform the pagination?
Instead, manually loop through all possible pages and break the loop when a page comes back empty. You will have to define the total number of pages (set it to some absurdly large number).
Pseudo code:
// set the total possible pages
let customTotalPages = 100000;
// loop through each page
for a = 1 to $(customTotalPages)
// set the pagenumber in the body
// to be the current iteration number
let body = "<body>
<method>GET_N08_LIST</method>
<params>
<fil></fil>
<pagenumber>$(a)</pagenumber>
</params>
</body>
";
// load the data
TempTable:
Load * From ApiEndpoint WITH CONNECTION (
BODY "$(body)"
);
// check if the api response has any rows
let tempTableRows = NoOfRows('TempTable');
// break the loop if there are no more pages to process
if $(tempTableRows) = 0 then
exit for
end if
// if the api response returned data
// then concatenate this data to the main table (which is outside the loop somewhere)
Concatenate (MainTable) Load * Resident TempTable;
// drop the temp table in each iteration
Drop Table TempTable;
next
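For comparison, here is the same stop-on-empty-page pattern sketched in Python; the endpoint and the body layout are hypothetical, mirroring the Qlik script above:

import requests

def fetch_all_pages(url, max_pages=100000):
    rows = []
    for page in range(1, max_pages + 1):
        # put the current page number in the request body
        body = {'method': 'GET_N08_LIST',
                'params': {'fil': '', 'pagenumber': page}}
        data = requests.post(url, json=body).json()
        if not data:  # empty page: no more data to fetch
            break
        rows.extend(data)
    return rows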

How to filter YouTube Analytics API request for embedded video stats only

I'm trying to get data from the YouTube Analytics API for embedded videos only.
When I use the "insightPlaybackLocationType==EMBEDDED" filter, I get a response that the query is not supported. Without this filter, the query returns a response without any errors.
response = self.executeAPIRequest(
    yt_instance.reports().query,
    ids="channel==" + c_id,
    startDate=startdate,
    endDate=enddate,
    metrics="views,likes,dislikes,comments,shares,estimatedMinutesWatched,averageViewDuration,averageViewPercentage",
    sort='-views',
    filters="video==VIDEO_ID_HERE;insightPlaybackLocationType==EMBEDDED",
    maxResults=200,
)
Here's the error I get:
googleapiclient.errors.HttpError: https://youtubeanalytics.googleapis.com/v2/reports?ids=channel%3D%CHANNEL_ID_HERE&startDate=2017-02-28&endDate=2019-08-11&metrics=views%2Clikes%2Cdislikes%2Ccomments%2Cshares%2CestimatedMinutesWatched%2CaverageViewDuration%2CaverageViewPercentage&sort=-views&filters=video%3D%VIDEO_ID_HERE%3BinsightPlaybackLocationType%3D%3DEMBEDDED&maxResults=200&alt=json returned "The query is not supported. Check the documentation at https://developers.google.com/youtube/analytics/v2/available_reports for a list of supported queries.">
That filter can only be used with the insightPlaybackLocationDetail dimension.
Bear in mind that the dimension only supports views and estimatedMinutesWatched metrics.
Documentation (Playback location detail):
https://developers.google.com/youtube/analytics/channel_reports#playback-location-reports
Be sure to set the sort and maxResults parameters correctly for this dimension.
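Along those lines, a corrected version of the request above might look like this; it keeps the question's variable names, trims the metrics to the two this dimension supports, and assumes the documented cap of 25 results for detail reports:

response = self.executeAPIRequest(
    yt_instance.reports().query,
    ids="channel==" + c_id,
    startDate=startdate,
    endDate=enddate,
    metrics="views,estimatedMinutesWatched",  # the only metrics this dimension supports
    dimensions="insightPlaybackLocationDetail",
    filters="video==VIDEO_ID_HERE;insightPlaybackLocationType==EMBEDDED",
    sort="-views",
    maxResults=25,  # documented maximum for detail reports
)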

How to get more than 100 query results with Azure DocumentDB REST API

I am following the Azure DocumentDB sample below. In the sample, C# code queries for documents in DocumentDB.
https://github.com/Azure/azure-documentdb-dotnet/blob/master/samples/rest-from-.net/Program.cs
Line 182:
var qry = new SqlQuerySpec { query = "SELECT * FROM root" };
var r = client.PostWithNoCharSetAsync(new Uri(baseUri, resourceLink), qry).Result;
The problem is that the result 'r' only contains the first 100 documents. If I use the client SDK, I can get more than 100. I tried using a stream, but have had no luck so far. Any help would be appreciated!
For a SQL query, the results are returned in segments if the result set is too large. By default the results are returned in chunks of 100 items or 1 MB, whichever limit is hit first.
You can either use continuation tokens to fetch each segment after another, or set the x-ms-max-item-count custom header in a request to raise the limit to an appropriate value.
You can have a look at the REST API documentation for further details.
For the sample program you have to add the line
client.DefaultRequestHeaders.Add("x-ms-max-item-count", "1000");
in order to get 1000 documents instead of 100.
I'm just guessing here, but it might be worth a shot. Here's the documentation from MSDN that describes the List action:
https://learn.microsoft.com/en-us/rest/api/documentdb/list-documents
In the "Headers" section under "Response" it is mentioned that you might get an optional token in the header "x-ms-continuation". Based on the description you have to issue another GET request with this token specified to get the other elements of the result set.
Can you check whether you get a header like this in the response? If so, you can issue another get request with this token specified (see the same documentation page under "Request").
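As a rough Python sketch of that continuation loop (the build_headers helper is assumed, standing in for the master-key signature and content-type headers that the linked C# sample constructs):

import requests

def query_all(base_uri, resource_link, query):
    results, continuation = [], None
    while True:
        headers = build_headers(resource_link)    # assumed: auth + query content-type headers
        headers['x-ms-max-item-count'] = '1000'   # larger pages, fewer round trips
        if continuation:
            headers['x-ms-continuation'] = continuation   # resume after the last segment
        r = requests.post(base_uri + resource_link, json=query, headers=headers)
        results.extend(r.json()['Documents'])
        continuation = r.headers.get('x-ms-continuation')
        if not continuation:                      # no token means this was the last page
            return results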

How to implement cursors for pagination in an API

This is similar to this question, which doesn't have any answers. I've read all about how to use cursors with the Twitter, Facebook, and Disqus APIs, and also this article about how Disqus generally built their cursors, but I still cannot seem to grok how they work and how to implement a similar solution in my own projects. Can someone explain the specific techniques and the concepts behind them?
Let's first understand, with an example, why offset pagination fails for large data sets.
Clients provide two parameters: limit, for the number of results, and offset, for the page offset.
For example, with offset = 40 and limit = 20, we can tell the database to return the next 20 items, skipping the first 40.
Drawbacks:
- Using LIMIT OFFSET doesn't scale well for large datasets. As the offset increases the farther you go within the dataset, the database still has to read up to offset + count rows from disk before discarding the offset and only returning count rows.
- If items are being written to the dataset at a high frequency, the page window becomes unreliable, potentially skipping or returning duplicate results.
How do cursors solve this?
Cursor-based pagination works by returning a pointer to a specific item in the dataset. On subsequent requests, the server returns results after the given pointer.
In this case the client provides the parameters next_cursor along with limit.
Let's assume we want to paginate from the most recent user to the oldest user. When the client requests the first page, suppose we select it through the query:
SELECT * FROM users
WHERE team_id = %team_id
ORDER BY id DESC
LIMIT %limit
Here the LIMIT in the query is the client-supplied limit plus one, so we fetch one more result than the count the client asked for. The extra result isn't returned in the result set, but we use its ID as the next_cursor.
The response from the server would be:
{
"users": [...],
"next_cursor": "1234", # the user id of the extra result
}
The client would then provide next_cursor as cursor in the second request.
SELECT * FROM users
WHERE team_id = %team_id
AND id <= %cursor
ORDER BY id DESC
LIMIT %limit
With this, we’ve addressed the drawbacks of offset based pagination:
Instead of the window being calculated from scratch on each request based on the total number of items, we’re always fetching the next count rows after a specific reference point. If items are being written to the dataset at a high frequency, the overall position of the cursor in the set might change, but the pagination window adjusts accordingly.
This will scale well for large datasets. We’re using a WHERE clause to fetch rows with id values less than the last id from the previous page. This lets us leverage the index on the column and the database doesn’t have to read any rows that we’ve already seen.
For a detailed explanation, you can read this wonderful engineering article from Slack!
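As a minimal Python sketch of the two queries above (table and column names follow the example; sqlite3 is just for illustration, and id is assumed to be the first selected column):

import sqlite3

def get_users_page(conn, team_id, limit, cursor=None):
    params = [team_id]
    sql = "SELECT * FROM users WHERE team_id = ?"
    if cursor is not None:
        sql += " AND id <= ?"   # start at the first row the previous page left out
        params.append(cursor)
    sql += " ORDER BY id DESC LIMIT ?"
    params.append(limit + 1)    # fetch one extra row to derive next_cursor
    rows = conn.execute(sql, params).fetchall()
    next_cursor = rows[limit][0] if len(rows) > limit else None
    return {"users": rows[:limit], "next_cursor": next_cursor}

# usage: conn = sqlite3.connect("users.db"); page = get_users_page(conn, 1, 20)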
Here is an article about pagination: paginating-real-time-data-cursor-based-pagination
Cursors – we need to have at least one column with unique, sequential values to implement cursor-based pagination. This can be similar to Twitter's max_id parameter or Facebook's after parameter.
In general you should pass the current item or page number in the request as a param. Another usual param is the batch size of the page. Then on the server-side backend you select and return the proper dataset, with a SQL query for example.
Here's what I ended up with: the cursor works as a pointer to an index, and limit picks that many rows starting from that pointer. Say we are given id 10 and limit 5; it goes to id 10 and picks 5 rows from there.
Some Graph API connections use cursors by default. You can use the 'limit' and 'before'/'after' parameters in your call. If it is still not clear, you can post your code here and I can explain with it.
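For illustration, cursor paging against the Graph API usually amounts to following paging.next (a ready-made URL with the after cursor baked in) until it disappears; a rough sketch with the requests library, with a generic edge URL:

import requests

def fetch_all(edge_url, params):
    items = []
    url, qs = edge_url, dict(params)
    while url:
        payload = requests.get(url, params=qs).json()
        items.extend(payload.get('data', []))
        # 'next' is a complete URL including the cursor, so drop our own params
        url, qs = payload.get('paging', {}).get('next'), None
    return items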

Facebook Graph API comment list sort, like 'orderby=desc'?

I use the Graph API to get a picture's comments, but I want the results sorted by creation time so that the latest data is returned first, similar to the SQL clause 'ORDER BY create_time DESC'. I don't know if there is such a parameter.
Currently I use offset and limit to access the latest data, since I also know the total number of comments:
pagesize = 25;
offset = comments.count - pagesize;
limit = 25;
url = "https://graph.facebook.com/" + object_id + "/comments?access_token=" + access_token + "&limit=" + limit + "&offset=" + offset;
For the next page:
offset -= 25
But comments.count is sometimes not accurate, and the results the request URL returns sometimes don't match.
Is there a good solution, or am I using the wrong approach (the 'limit' and 'offset' parameters)?
Thank you for your answer.
"Graphics API" the existence of the cache?
I posted a message that has 46 comments. Requesting the URL with the parameters:
offset=0&limit=1
should return the last (latest) comment, but it actually returns a comment from the middle. I tested a few times with different offset and limit settings; according to the returned results, the middle one is the latest comment.
If I set the limit value greater than comment.count, all the data is returned, consistent with the official Facebook site.
Is this because of a cache?
Thanks again~
#dbau - You are still better off using FQL. In my experience, unless you are making a very simple call, you have very little control over what you get via a Graph API call.
Why don't you want to use FQL? FQL is an endpoint of the Graph API. There is still some data that can only be returned via FQL.
This will get you the result you're looking for. The query needs to be URL encoded. I left it in plain text for clarity.
https://graph.facebook.com/fql?access_token=[TOKEN]&q=
SELECT id, fromid, text, time, likes, user_likes FROM comment
WHERE object_id = [OBJECT_ID] ORDER BY time DESC LIMIT 0,[N]
You may find you don't get [N] comments returned each time, because Facebook filters out items that are not visible to the access_token owner after the query is run. You could either up the LIMIT and filter out any excess results returned or if you are using a user access_token, you could add AND can_like = TRUE to the WHERE clause to be guaranteed that, if they exist, [N] posts visible to the current user are returned.
Graph API returns latest objects first.
Facebook provides 2 keywords to filter the fetched data:
Limit: returns the "limit" number of latest records
Offset: returns the "limit" number of records starting from the offset position
So to retrieve the latest "X" comments posted on an object:
https://graph.facebook.com/[OBJECTID]?limit=[X]&offset=0
To retrieve the next "X" comments (page-wise):
https://graph.facebook.com/[OBJECTID]?limit=[X]&offset=[X*PAGENo]
Hope the answer is clear enough for you.
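As a sketch of that limit/offset scheme on the comments edge from the question (OBJECT_ID and TOKEN are placeholders; note this offset paging is exactly what proves unreliable when comments change between calls, which is why cursors are preferred):

import requests

def get_comments_page(object_id, token, x, page_no):
    # page_no = 0 returns the latest x comments, page_no = 1 the next x, etc.
    url = "https://graph.facebook.com/%s/comments" % object_id
    params = {"access_token": token, "limit": x, "offset": x * page_no}
    return requests.get(url, params=params).json().get("data", [])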