BigQuery: How to access all the distinct event_params.key of an event_name - google-cloud-firestore

I am interested in knowing the distinct event_params.key values available for a particular event_name.
I understand that in order to access parameter values, each element of the array needs to be unnested and accessed individually. For instance, to access a user's session_id and button_name per event:
select user_id, event_name,
(select value.int_value from unnest(event_params) where key='ga_session_id') as session_id,
(select value.string_value from unnest(event_params) where key='button_name') as button_name
from `analytics_xx.events_*`
where user_id = 'abc'
But this assumes I know the key. What if I don't? How can I return all the available keys for that event_name?
The following didn't work:
select event_name,
(select * from unnest(event_params)) as available_keys
from `analytics_xx.events_*`
where event_name = "button_click"

One approach is to unnest the event parameters and select the distinct key attribute. Aliasing the unnested element as ep (rather than reusing the name event_params) avoids shadowing the original column:
select distinct ep.key
from `analytics_xxx.events_*`, unnest(event_params) as ep
where event_name = "button_click"
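If you also want to know how often each key occurs and which typed value field it actually populates, a query along these lines can help (a sketch against the standard GA4 export schema; the dataset name is a placeholder):
select
ep.key,
count(*) as occurrences,
-- check which of the typed value fields is populated for each key
countif(ep.value.string_value is not null) as string_values,
countif(ep.value.int_value is not null) as int_values,
countif(ep.value.double_value is not null) as double_values
from `analytics_xxx.events_*`, unnest(event_params) as ep
where event_name = "button_click"
group by ep.key
order by occurrences desc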

Related

Postgresql - select query with aggregated decisions column as json

I have a table with the following columns:
id - bigint
decisions - varchar(80)
type - varchar(258)
I want to write a select query that returns something like this (id, decision values with counts as JSON, type):
id decisions type
1 {"firstDecisionsValue":countOfThisValue, "secondDecisionsValue": countOfThisValue} entryType
I heard that I can try json_agg, but it does not allow an aggregate like COUNT inside it. I tried json_agg with this query:
SELECT ac.id,
json_agg(ac.decision),
ac.type
FROM myTable ac
GROUP BY ac.id, ac.type;
but it ends up with this (for the entry with id 1 there are two occurrences of firstDecisionsValue and one occurrence of secondDecisionsValue):
id decisions type
1 {"firstDecisionsValue", "firstDecisionsValue", "secondDecisionsValue"} entryType
Minimal reproducible example:
CREATE TABLE myTable
(
id bigint,
decisions varchar(80),
type varchar(258)
);
INSERT INTO myTable
VALUES (1, 'firstDecisionsValue', 'myType');
INSERT INTO myTable
VALUES (1, 'firstDecisionsValue', 'myType');
INSERT INTO myTable
VALUES (1, 'secondDecisionsValue', 'myType');
Can you give me any tips on how to get the expected result?
1, {"firstDecisionsValue":2, "secondDecisionsValue":1}, myType
You can try this:
SELECT a.id, jsonb_object_agg(a.decisions, a.count), a.type
FROM
( SELECT id, type, decisions, count(*) AS count
FROM myTable
GROUP BY id, type, decisions
) AS a
GROUP BY a.id, a.type
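With the sample data from the minimal reproducible example, this should produce:
id | decisions | type
1 | {"firstDecisionsValue": 2, "secondDecisionsValue": 1} | myType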
First, calculate the count for each (id, type, decisions) group; after that, use json_object_agg to build the JSON.
Demo
with data as (
select
ac.id,
ac.type,
ac.decisions,
count(*) as count
from
myTable ac
group by
ac.id,
ac.type,
ac.decisions
)
select
d.id,
d.type,
json_object_agg(d.decisions, d.count)
from
data d
group by
d.id,
d.type
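Note: json_object_agg returns json and preserves input order (including duplicate keys), while jsonb_object_agg returns jsonb with duplicate keys collapsed; for this grouped input, where each key appears once per group, either works.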

GROUP BY one column, then by another column

SELECT lkey, max(votecount) FROM VOTES
WHERE ekey = (SELECT ekey FROM Elections where electionid='NR2019')
GROUP BY lkey
ORDER BY lkey ASC
Is there an easy way to get the pkey in this statement?
Use DISTINCT ON:
SELECT DISTINCT ON (v.ikey) v.*
FROM VOTES v
INNER JOIN Elections e ON e.ekey = v.ekey
WHERE e.electionid = 'NR2019'
ORDER BY v.ikey, v.votecount DESC;
In plain English, the above query says to return the single record for each ikey value having the highest vote count.
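DISTINCT ON is PostgreSQL-specific. The same result can be written portably with a window function (a sketch against the same schema):
SELECT *
FROM (
SELECT v.*,
ROW_NUMBER() OVER (PARTITION BY v.ikey ORDER BY v.votecount DESC) AS rn
FROM VOTES v
INNER JOIN Elections e ON e.ekey = v.ekey
WHERE e.electionid = 'NR2019'
) ranked
WHERE rn = 1;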

Use coalesce function in SQL to return zero counts of records without showing any column value as null

I am trying to return zero counts as results for a query, but when I run it, one particular column's values come back as null.
select tab1.source_type,
coalesce(tab2.numberofrecords,0) as numberofrecords,
coalesce(dt,current_date-40) as dt,
coalesce(client_id,client_id) as client_id
from (select distinct source_type from integration_customers )
as tab1
left join
(select count(id) as Numberofrecords,
source_type, Date(created_at) as dt,
client_id
from integration_customers ic
where Date(created_at)= current_date-39
and
source_type in
(select distinct "source" from integration_integrationconfig ii where status ='complete')
group by source_type ,dt,client_id
order by dt desc) as tab2
on tab1.source_type = tab2.source_type
But in the results of this query, client_id (the column selected from tab2) comes back null for the zero-count rows.
I want to remove these null values and show the client id for each zero-count record as well.
The table integration_customers has the columns client_id, created_at, and source_type.
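For what it's worth, client_id comes back null because it is selected from tab2, the right-hand side of the left join, so unmatched rows have nothing to coalesce; coalesce(client_id, client_id) is a no-op. One way around this (a sketch, assuming each distinct (source_type, client_id) pair from integration_customers should get a row; the status filter from the original query is omitted for brevity) is to put the identifying columns on the left side:
select base.source_type,
base.client_id,
coalesce(tab2.numberofrecords, 0) as numberofrecords,
coalesce(tab2.dt, current_date - 40) as dt
from (
-- identifying columns come from the left side, so they are never null
select distinct source_type, client_id
from integration_customers
) as base
left join (
select source_type, client_id, Date(created_at) as dt,
count(id) as numberofrecords
from integration_customers
where Date(created_at) = current_date - 39
group by source_type, client_id, Date(created_at)
) as tab2
on tab2.source_type = base.source_type
and tab2.client_id = base.client_id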

How to compare two values with SQL in Google Big Query?

I am trying to get from the Google BigQuery database all records that have the same value in different columns. Let's say, when sending some event from the phone, I set the variable machine_name in the Firebase user_properties, and then I send the event event_notification_send. When querying the table, I want to fetch all events named event_notification_send that have a parameter machine_name with some value X1, where the same record also has a user_properties entry with key Last_notification and the same value X1.
How can I do that SQL query?
Thanks.
Here is sample of my code:
#standardSQL
SELECT *
FROM
`myProject.analytics_159820162.events_*`
WHERE
_TABLE_SUFFIX BETWEEN '20180725' AND '20180727'
AND event_name in ("event_notification_received", "event_notification_dissmissed")
AND platform = "ANDROID"
AND
(SELECT COUNTIF((key = "machine_name"))
FROM UNNEST(event_params)
) > 0 -- to see if specified event has such key
AND
(SELECT COUNTIF((key = "Last_notification"))
FROM UNNEST(user_properties)
) > 0 -- to see if specified event has such key
ORDER BY event_timestamp ASC
To check whether a row/event has the parameters "machine_name" and "Last_notification" with the same value, you can use the statement below:
SELECT COUNT(DISTINCT key) cnt
FROM UNNEST(event_params)
WHERE key IN ("machine_name", "Last_notification")
GROUP BY value
ORDER BY cnt DESC
LIMIT 1
Assuming that the rest of the query in your question is correct, the version below adds this criterion to it:
#standardSQL
SELECT *
FROM `myProject.analytics_159820162.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20180725' AND '20180727'
AND event_name IN ("event_notification_received", "event_notification_dissmissed")
AND platform = "ANDROID"
AND (
SELECT COUNT(DISTINCT key) cnt
FROM UNNEST(event_params)
WHERE key IN ("machine_name", "Last_notification")
GROUP BY value
ORDER BY cnt DESC
LIMIT 1
) = 2
ORDER BY event_timestamp ASC
Note: the clause below is just to be on the safe side, in case an event has multiple parameters with the same keys but different values:
ORDER BY cnt DESC
LIMIT 1
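Note that the subquery above looks for both keys inside event_params, whereas the question stores Last_notification in user_properties. A variant that compares the two values across the two arrays directly (a sketch, assuming each key occurs at most once per event and both values are stored as string_value):
#standardSQL
SELECT *
FROM `myProject.analytics_159820162.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20180725' AND '20180727'
AND event_name IN ("event_notification_received", "event_notification_dissmissed")
AND platform = "ANDROID"
AND (SELECT value.string_value FROM UNNEST(event_params) WHERE key = "machine_name")
= (SELECT value.string_value FROM UNNEST(user_properties) WHERE key = "Last_notification")
ORDER BY event_timestamp ASC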

How to get fields not added in GROUP BY in PostgreSQL 8.4?

I am selecting a column used in GROUP BY along with a count, and the query looks something like:
SELECT s.country, count(*) AS posts_ct
FROM store s
JOIN store_post_map sp ON sp.store_id = s.id
GROUP BY 1;
However, I want to select some more fields, like store name or store address from the store table where the count is highest, but I don't want to include them in the GROUP BY clause.
For instance, to get the stores with the highest post-count per country:
SELECT DISTINCT ON (s.country)
s.country, s.id AS store_id, s.name, sp.post_ct
FROM store s
JOIN (
SELECT store_id, count(*) AS post_ct
FROM store_post_map
GROUP BY store_id
) sp ON sp.store_id = s.id
ORDER BY s.country, sp.post_ct DESC;
Add any number of columns from store to the SELECT list.
Details about this query style in this related answer:
Select first row in each GROUP BY group?
Reply to comment
This produces the count per country and picks (one of) the store(s) with the highest post-count:
SELECT DISTINCT ON (s.country)
s.country, s.id AS store_id, s.name
,sum(post_ct) OVER (PARTITION BY s.country) AS post_ct_for_country
FROM store s
JOIN (
SELECT store_id, count(*) AS post_ct
FROM store_post_map
GROUP BY store_id
) sp ON sp.store_id = s.id
ORDER BY s.country, sp.post_ct DESC;
This works because the window function sum() is, by definition, applied before DISTINCT ON.
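A minimal setup to try this on (hypothetical table definitions and sample data, matching the column names used above):
CREATE TABLE store (id int, country text, name text);
CREATE TABLE store_post_map (store_id int, post_id int);
INSERT INTO store VALUES (1, 'AT', 'Store A'), (2, 'AT', 'Store B'), (3, 'DE', 'Store C');
INSERT INTO store_post_map VALUES (1, 101), (1, 102), (2, 103), (3, 104);
With this data the second query returns ('AT', 1, 'Store A', 3) and ('DE', 3, 'Store C', 1): Store A has the most posts in AT, and the country-wide sum of posts is 3.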