I am interested in knowing the distinct event_params.key values available for a particular event_name.
I understand that in order to access parameter values, each element of the array has to be unnested and inspected individually. For instance, to access a user's session_id and button_name per event:
select user_id, event_name,
(select value.int_value from unnest(event_params) where key='ga_session_id') as session_id,
(select value.string_value from unnest(event_params) where key='button_name') as button_name
from `analytics_xx.events_*`
where user_id = 'abc'
But this assumes I know the key. What if I don't? How can I return all the available keys for that event_name?
The following didn't work:
select event_name,
(select * from unnest(event_params)) as available_keys
from `analytics_xx.events_*`
where event_name = "button_click"
One approach is to unnest the event parameters and select the distinct key attribute:
select distinct ep.key
from `analytics_xxx.events_*`, unnest(event_params) as ep
where event_name = "button_click"
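A related follow-up is working out which value field (string_value, int_value, etc.) each key actually populates, since that determines how to extract it. A hedged extension of the same unnest idea counts non-null values per field; the table name and event name are just the placeholders from above:

```sql
-- Sketch: for each key, count which value fields are populated
-- (analytics_xxx and button_click are placeholders from the question)
select
  ep.key,
  countif(ep.value.string_value is not null) as string_vals,
  countif(ep.value.int_value is not null) as int_vals,
  countif(ep.value.float_value is not null) as float_vals,
  countif(ep.value.double_value is not null) as double_vals
from `analytics_xxx.events_*`, unnest(event_params) as ep
where event_name = "button_click"
group by ep.key
order by ep.key
```

A key whose int_vals count is non-zero, for example, should be read via value.int_value in the per-event subqueries shown earlier.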
Related
I have a table with the following columns:
id - bigint
decisions - varchar(80)
type - varchar(258)
I want to make a select query that returns something like this (id, decision values with their counts as JSON, type):
id decisions type
1 {"firstDecisionsValue":countOfThisValue, "secondDecisionsValue": countOfThisValue} entryType
I heard I could try json_agg, but it does not allow counting inside the aggregate. I tried json_agg with this query:
SELECT ac.id,
json_agg(ac.decisions),
ac.type
FROM myTable ac
GROUP BY ac.id, ac.type;
but it ends up like this (for the entry with id 1 there are two occurrences of firstDecisionsValue and one occurrence of secondDecisionsValue):
id decisions type
1 {"firstDecisionsValue", "firstDecisionsValue", "secondDecisionsValue"} entryType
Minimal reproducible example:
CREATE TABLE myTable
(
id bigint,
decisions varchar(80),
type varchar(258)
);
INSERT INTO myTable
VALUES (1, 'firstDecisionsValue', 'myType');
INSERT INTO myTable
VALUES (1, 'firstDecisionsValue', 'myType');
INSERT INTO myTable
VALUES (1, 'secondDecisionsValue', 'myType');
Can you give me any tips on how to get the expected result?
1, {"firstDecisionsValue":2, "secondDecisionsValue":1}, entryType
You can try this:
SELECT a.id, jsonb_object_agg(a.decisions, a.count), a.type
FROM
( SELECT id, type, decisions, count(*) AS count
FROM myTable
GROUP BY id, type, decisions
) AS a
GROUP BY a.id, a.type
see the result in dbfiddle.
First, calculate the count per (id, type, decisions) group; after that, use jsonb_object_agg to build the JSON.
Demo
with data as (
select
ac.id,
ac.type,
ac.decisions,
count(*)
from
myTable ac
group by
ac.id,
ac.type,
ac.decisions
)
select
d.id,
d.type,
json_object_agg(d.decisions, d.count)
from
data d
group by
d.id,
d.type
SELECT lkey, max(votecount) FROM VOTES
WHERE ekey = (SELECT ekey FROM Elections where electionid='NR2019')
GROUP BY lkey
ORDER BY lkey ASC
Is there an easy way to get the pkey in this statement?
Use DISTINCT ON:
SELECT DISTINCT ON (v.ikey) v.*
FROM VOTES v
INNER JOIN Elections e ON e.ekey = v.ekey
WHERE e.electionid = 'NR2019'
ORDER BY v.ikey, v.votecount DESC;
In plain English, the above query says to return the single record for each ikey value having the highest vote count.
I am trying to return zero counts as results for a query. But when I run this query, one particular column's values come back as null.
select tab1.source_type,
coalesce(tab2.numberofrecords,0) as numberofrecords,
coalesce(dt,current_date-40) as dt,
coalesce(client_id,client_id) as client_id
from (select distinct source_type from integration_customers )
as tab1
left join
(select count(id) as Numberofrecords,
source_type, Date(created_at) as dt,
client_id
from integration_customers ic
where Date(created_at)= current_date-39
and
source_type in
(select distinct "source" from integration_integrationconfig ii where status ='complete')
group by source_type ,dt,client_id
order by dt desc) as tab2
on tab1.source_type = tab2.source_type
But in the results of this query, client_id comes back null for the zero-count rows. I want to remove these null values and show the client_id for each zero-count record as well.
The table integration_customers has the columns client_id, created_at, and source_type.
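The null client_id appears because the left join has no matching tab2 row to supply it for zero-count combinations. One sketch of a fix, assuming every distinct client_id should appear for every source_type (the source_type filter from the original query can be added back inside the counting subquery):

```sql
-- Sketch: build the full source_type x client_id grid first,
-- then left join the per-day counts so missing combinations get 0
select st.source_type,
       cl.client_id,
       coalesce(tab2.numberofrecords, 0) as numberofrecords,
       coalesce(tab2.dt, current_date - 39) as dt
from (select distinct source_type from integration_customers) st
cross join (select distinct client_id from integration_customers) cl
left join (
    select source_type,
           client_id,
           date(created_at) as dt,
           count(id) as numberofrecords
    from integration_customers
    where date(created_at) = current_date - 39
    group by source_type, client_id, dt
) tab2
  on tab2.source_type = st.source_type
 and tab2.client_id = cl.client_id
```

Because client_id now comes from the grid rather than from tab2, it is never null, and the coalesce calls fill in the zero count and the fallback date.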
I am trying to get from the Google BigQuery database all records that have the same value in different columns. Say that when sending an event from the phone, I set the variable machine_name in the Firebase user_properties and then send the event event_notification_send. When I query the table, I want to fetch all events named event_notification_send that have a parameter machine_name with some value X1, where the same record also has a user_properties entry with key Last_notification and the same value X1.
How can I do that SQL query?
Thanks.
Here is sample of my code:
#standardSQL
SELECT *
FROM
`myProject.analytics_159820162.events_*`
WHERE
_TABLE_SUFFIX BETWEEN '20180725' AND '20180727'
AND event_name in ("event_notification_received", "event_notification_dissmissed")
AND platform = "ANDROID"
AND
(SELECT COUNTIF((key = "machine_name"))
FROM UNNEST(event_params)
) > 0 -- to see if specified event has such key
AND
(SELECT COUNTIF((key = "Last_notification"))
FROM UNNEST(user_properties)
) > 0 -- to see if specified event has such key
ORDER BY event_timestamp ASC
To check whether a row/event has parameters "machine_name" and "Last_notification" with the same value, you can use the statement below:
SELECT COUNT(DISTINCT key) cnt
FROM UNNEST(event_params)
WHERE key IN ("machine_name", "Last_notification")
GROUP BY value
ORDER BY cnt DESC
LIMIT 1
Assuming the rest of your query is correct, the following adds your criteria to it:
#standardSQL
SELECT *
FROM `myProject.analytics_159820162.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20180725' AND '20180727'
AND event_name IN ("event_notification_received", "event_notification_dissmissed")
AND platform = "ANDROID"
AND (
SELECT COUNT(DISTINCT key) cnt
FROM UNNEST(event_params)
WHERE key IN ("machine_name", "Last_notification")
GROUP BY value
ORDER BY cnt DESC
LIMIT 1
) = 2
ORDER BY event_timestamp ASC
Note: the clause below is just to be on the safe side, in case an event has multiple parameters with the same key but different values:
ORDER BY cnt DESC
LIMIT 1
I am selecting a column used in GROUP BY along with a count, and the query looks something like this:
SELECT s.country, count(*) AS posts_ct
FROM store s
JOIN store_post_map sp ON sp.store_id = s.id
GROUP BY 1;
However, I want to select some more fields, like the store name or store address from the store table where the count is highest, but I don't want to include them in the GROUP BY clause.
For instance, to get the stores with the highest post-count per country:
SELECT DISTINCT ON (s.country)
s.country, s.id AS store_id, s.name, sp.post_ct
FROM store s
JOIN (
SELECT store_id, count(*) AS post_ct
FROM store_post_map
GROUP BY store_id
) sp ON sp.store_id = s.id
ORDER BY s.country, sp.post_ct DESC
Add any number of columns from store to the SELECT list.
Details about this query style in this related answer:
Select first row in each GROUP BY group?
Reply to comment
This produces the count per country and picks (one of) the store(s) with the highest post-count:
SELECT DISTINCT ON (s.country)
s.country, s.id AS store_id, s.name,
sum(sp.post_ct) OVER (PARTITION BY s.country) AS post_ct_for_country
FROM store s
JOIN (
SELECT store_id, count(*) AS post_ct
FROM store_post_map
GROUP BY store_id
) sp ON sp.store_id = s.id
ORDER BY s.country, sp.post_ct DESC;
This works because the window function sum() is, by definition, applied before DISTINCT ON.
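For databases without DISTINCT ON, the same per-country pick can be sketched with ROW_NUMBER(); this is an equivalent formulation under the same assumed schema, not the answer's own method:

```sql
-- Sketch: one row per country with the highest post count,
-- using ROW_NUMBER() instead of DISTINCT ON
SELECT country, store_id, name, post_ct
FROM (
   SELECT s.country, s.id AS store_id, s.name, sp.post_ct,
          row_number() OVER (PARTITION BY s.country
                             ORDER BY sp.post_ct DESC) AS rn
   FROM   store s
   JOIN  (
      SELECT store_id, count(*) AS post_ct
      FROM   store_post_map
      GROUP  BY store_id
      ) sp ON sp.store_id = s.id
   ) ranked
WHERE rn = 1;
```

Like DISTINCT ON with a matching ORDER BY, this picks one arbitrary winner per country when counts tie; add a tiebreaker column to the OVER clause's ORDER BY for deterministic results.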