How to compare two values with SQL in Google BigQuery?

I am trying to fetch from a Google BigQuery database all records that have the same value in two different columns. For example, when sending an event from the phone, I set the variable machine_name in the Firebase user_properties, and then I send the event event_notification_send. When querying the table, I want to fetch all events named event_notification_send that have the parameter machine_name with some value X1 and that, at the same time, have the key Last_notification in user_properties with the same value X1.
How can I write that SQL query?
Thanks.
Here is a sample of my code:
#standardSQL
SELECT *
FROM
`myProject.analytics_159820162.events_*`
WHERE
_TABLE_SUFFIX BETWEEN '20180725' AND '20180727'
AND event_name in ("event_notification_received", "event_notification_dissmissed")
AND platform = "ANDROID"
AND
(SELECT COUNTIF((key = "machine_name"))
FROM UNNEST(event_params)
) > 0 -- to see if specified event has such key
AND
(SELECT COUNTIF((key = "Last_notification"))
FROM UNNEST(user_properties)
) > 0 -- to see if specified event has such key
ORDER BY event_timestamp ASC

To check whether a row/event has the parameters "machine_name" and "Last_notification" with the same value, you can use the statement below:
SELECT COUNT(DISTINCT key) cnt
FROM UNNEST(event_params)
WHERE key IN ("machine_name", "Last_notification")
GROUP BY value
ORDER BY cnt DESC
LIMIT 1
Assuming that the rest of the query in your question is correct, the query below adds your criteria to it:
#standardSQL
SELECT *
FROM `myProject.analytics_159820162.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20180725' AND '20180727'
AND event_name IN ("event_notification_received", "event_notification_dissmissed")
AND platform = "ANDROID"
AND (
SELECT COUNT(DISTINCT key) cnt
FROM UNNEST(event_params)
WHERE key IN ("machine_name", "Last_notification")
GROUP BY value
ORDER BY cnt DESC
LIMIT 1
) = 2
ORDER BY event_timestamp ASC
Note: the lines below are only there to be on the safe side, in case an event has multiple parameters with the same keys but different values:
ORDER BY cnt DESC
LIMIT 1

Related

Use the coalesce function in SQL to return zero counts of records without showing any column value as null

I am trying to return zero counts as results for a query. But when I run this query, the values of one particular column come back as null.
select tab1.source_type,
coalesce(tab2.numberofrecords,0) as numberofrecords,
coalesce(dt,current_date-40) as dt,
coalesce(client_id,client_id) as client_id
from (select distinct source_type from integration_customers )
as tab1
left join
(select count(id) as Numberofrecords,
source_type, Date(created_at) as dt,
client_id
from integration_customers ic
where Date(created_at)= current_date-39
and
source_type in
(select distinct "source" from integration_integrationconfig ii where status ='complete')
group by source_type ,dt,client_id
order by dt desc) as tab2
on tab1.source_type = tab2.source_type
But the results of this query look something like this:
I want to remove these null values and show the client ID for each zero-count record as well.
The table integration_customers has the columns client_id, created_at and source_type.

How to select multiple columns from a table using GROUP BY (based on one column), HAVING and COUNT in a Hive query

Requirement:
Group by A and get the records having count > 1
e.g.:
SELECT count(sk), id, sk
FROM table x
GROUP BY id
HAVING COUNT(sk) > 1
But I am not able to select sk in the SELECT statement. Is there any other way to do this? How can I use PARTITION BY on this input to get the output set attached here?
You can do something like this:
select * from (
SELECT count(sk)over(partition by id) as cnt, id, sk
FROM table x) a
where a.cnt >1
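For anyone who wants to try the windowed count without a Hive cluster, here is a minimal sketch of the same pattern. It assumes SQLite (via Python's built-in sqlite3 module, with a SQLite build that supports window functions) as a stand-in for Hive, and the table contents are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the Hive table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (id INTEGER, sk TEXT)")
conn.executemany(
    "INSERT INTO x (id, sk) VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e"), (3, "f")],
)

# COUNT(...) OVER (PARTITION BY id) attaches the per-id count to every row,
# so sk stays selectable; the outer query then filters on that count.
rows = conn.execute(
    """
    SELECT id, sk, cnt
    FROM (
        SELECT id, sk, COUNT(sk) OVER (PARTITION BY id) AS cnt
        FROM x
    ) AS a
    WHERE a.cnt > 1
    ORDER BY id, sk
    """
).fetchall()

for row in rows:
    print(row)
```

The row with id 2 is dropped because its group has only one sk, while every row of the other groups is kept together with its group count.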

How to use a newly created column in the WHERE clause in SQL?

Hi, I have a query which looks like the following:
SELECT device_id, tag_id, at, _deleted, data,
row_number() OVER (PARTITION BY device_id ORDER BY at DESC) AS row_num
FROM mdb_history.devices_tags_mapping_history
WHERE at <= '2019-04-01'
AND _deleted = False
AND (tag_id = '275674' or tag_id = '275673')
AND row_num = 1
However, when I run the query above, I get the following error:
ERROR: column "row_num" does not exist
Is there any way around this? One approach I tried was to wrap it in a subquery:
SELECT * from (SELECT device_id, tag_id, at, _deleted, data,
row_number() OVER (PARTITION BY device_id ORDER BY at DESC) AS row_num
FROM mdb_history.devices_tags_mapping_history
WHERE at <= '2019-04-01'
AND _deleted = False
AND (tag_id = '275674' or tag_id = '275673')) tag_deleted
WHERE tag_deleted.row_num = 1
But this becomes too complicated when I do the same with other queries, since I have a number of joins and have to select the columns again at each level, which leads to a lot of SELECT statements. Is there a smarter, simpler way of doing this? Thanks.
You can't refer to the row_num alias which you defined in the same level of the select in your query. So, your main option here would be to subquery, where row_num would be available. But, Postgres actually has an option to get what you want in another way. You could use DISTINCT ON here:
SELECT DISTINCT ON (device_id) device_id, tag_id, at, _deleted, data
FROM mdb_history.devices_tags_mapping_history
WHERE
at <= '2019-04-01' AND
_deleted = false AND
tag_id IN ('275674', '275673')
ORDER BY
device_id,
at DESC;
Too long and too formatted for a comment. There is a reason behind @TimBiegeleisen's statement "alias which you defined in the same level of the select": all SQL statements follow the same sequence of evaluation. Unfortunately, that sequence does NOT follow the order in which the clauses appear in the statement. The sequence, in order, is:
from
where
group by
having
select
limits
You will notice that what actually gets selected falls well after evaluation of the where clause. Since your alias is defined within the select phase, it does not exist during the where phase.
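A small runnable sketch of this, assuming SQLite (through Python's sqlite3 module) as a stand-in for Postgres; the table name `history` and its rows are invented for the example. SQLite reports a different error message than Postgres, but it also refuses to filter on a window-function alias at the same query level, and accepts it once the ranked query is wrapped in a CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified version of the mapping-history table.
conn.execute("CREATE TABLE history (device_id INTEGER, tag_id TEXT, at TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [
        (1, "275674", "2019-03-01"),
        (1, "275673", "2019-03-15"),
        (2, "275674", "2019-02-10"),
    ],
)

ranked = """
    SELECT device_id, tag_id, at,
           ROW_NUMBER() OVER (PARTITION BY device_id ORDER BY at DESC) AS row_num
    FROM history
"""

# Filtering on the window alias at the same level: WHERE is evaluated
# before the select list, so the statement is rejected.
try:
    conn.execute(ranked + " WHERE row_num = 1").fetchall()
    same_level_ok = True
except sqlite3.OperationalError:
    same_level_ok = False

# Inside a CTE the alias becomes an ordinary column of the derived table.
latest = conn.execute(
    "WITH ranked AS (" + ranked + ") "
    "SELECT device_id, tag_id, at FROM ranked "
    "WHERE row_num = 1 ORDER BY device_id"
).fetchall()

print(same_level_ok)
print(latest)
```

Writing the ranked query once as a CTE keeps the column list in a single place, which may help when the same pattern has to be repeated across queries with several joins.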

db2 - How to get the min date and the next from the same table

I have a table with a date attribute, and I need a query that gets the MIN date and the next date after the MIN.
I tried this:
select min(SC.TIMESTAMP) as minDate, result.TIMESTAMP
from Event SC
INNER JOIN
(SELECT TIMESTAMP from Event
HAVING TIMESTAMP > min(SC.TIMESTAMP)
) as result on result.BUSINESSID1 = SC.BUSINESSID1
where SC.BUSINESSSTEP = 'CONTAINER_PLACING_EVENT'
and SC.LOCATIONCODE = '1';
Could you please advise how to do that?
Thanks in advance.
Perhaps you can rearrange your query into this form:
select
min(TS), min(TS2)
from
event,
(select TS as TS2 from event where TS > (select min(TS) from event))
Add extra criteria as desired. I would try to rewrite yours, but it isn't entirely clear what the criteria for the count are supposed to be. If you are expecting more than one row (for example, the min and min2 of each LOCATIONCODE) then you will probably want a GROUP BY in there.
Also, I wouldn't call a column TIMESTAMP as it is a reserved word.
You can use the ROW_NUMBER() OLAP Function:
SELECT *
FROM (
SELECT
TIMESTAMP
,ROW_NUMBER() OVER (
PARTITION BY BUSINESSSTEP, LOCATIONCODE
ORDER BY TIMESTAMP ASC
) AS RN
FROM EVENT
WHERE BUSINESSSTEP = 'CONTAINER_PLACING_EVENT'
AND LOCATIONCODE = '1'
) A
WHERE RN < 3
This returns the two dates as rows instead of columns, but it should get you what you want. If you think your original query would have returned multiple rows (for multiple entities), you can change the PARTITION BY clause to include the column that makes them distinct.
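Here is a minimal sketch of the ROW_NUMBER() approach that can be run locally. It assumes SQLite (via Python's sqlite3 module) instead of DB2, uses a column named `ts` rather than TIMESTAMP, and the sample rows are invented. The second query is an optional extra that pivots the two rows into columns with conditional aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (businessstep TEXT, locationcode TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?)",
    [
        ("CONTAINER_PLACING_EVENT", "1", "2020-01-03 10:00:00"),
        ("CONTAINER_PLACING_EVENT", "1", "2020-01-01 08:00:00"),
        ("CONTAINER_PLACING_EVENT", "1", "2020-01-02 09:00:00"),
        ("CONTAINER_PLACING_EVENT", "2", "2019-12-31 07:00:00"),
        ("OTHER_EVENT", "1", "2019-12-30 06:00:00"),
    ],
)

# Number the matching rows from the earliest timestamp upwards.
ranked = """
    SELECT ts,
           ROW_NUMBER() OVER (
               PARTITION BY businessstep, locationcode
               ORDER BY ts ASC
           ) AS rn
    FROM event
    WHERE businessstep = 'CONTAINER_PLACING_EVENT'
      AND locationcode = '1'
"""

# Two rows: the minimum date and the one right after it.
as_rows = conn.execute(
    "SELECT ts, rn FROM (" + ranked + ") AS a WHERE rn < 3 ORDER BY rn"
).fetchall()

# Same data as one row with two columns.
as_columns = conn.execute(
    "SELECT MIN(CASE WHEN rn = 1 THEN ts END) AS min_date, "
    "       MIN(CASE WHEN rn = 2 THEN ts END) AS next_date "
    "FROM (" + ranked + ") AS a WHERE rn < 3"
).fetchone()

print(as_rows)
print(as_columns)
```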

Sorting in CTE expression

I am retrieving all users from the DB, ordered by the number of followers of each user, descending:
with TH_Users as
(
SELECT [ID]
,[FullName]
,[UserName]
,[ImageName]
,dbo.GetUserFollowers(ID) AS Followers
, ROW_NUMBER() OVER (order by ID ) AS 'RowNumber'
from dbo.TH_Users
Where CultureID = @cultureID
)
Select ID,[FullName]
,[UserName]
,[ImageName], Followers from TH_Users
Where RowNumber BETWEEN @startIdx AND @endIdx
Order BY Followers DESC
I am using a function to get the number of followers for each user. Now, if I use the Followers column as the ordering column, as in ROW_NUMBER() OVER (order by Followers) AS 'RowNumber', I get a compilation error.
Putting Order BY Followers DESC at the end of the query will not give the intended result.
Any suggestions ?
Thanks
When you use AS to give an alias to a column, that alias is not available within the query - logically, applying aliases to columns is (almost) the very last part of evaluating a query.
So if you want your ROW_NUMBER with the CTE to be OVER what you alias as Followers, you must express it in the same terms as the column itself:
;with TH_Users as
(
SELECT [ID]
,[FullName]
,[UserName]
,[ImageName]
,dbo.GetUserFollowers(ID) AS Followers
, ROW_NUMBER() OVER (order by dbo.GetUserFollowers(ID) ) AS 'RowNumber'
from dbo.TH_Users
Where CultureID = @cultureID
)
Note that this will not cause the function to be evaluated any more times than it is currently.
(I have not tested this)
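For a version that can actually be run, here is a rough sketch using SQLite through Python's sqlite3 module rather than SQL Server. The `get_user_followers` function, the `FOLLOWERS` data and the simplified `th_users` table are hypothetical stand-ins for dbo.GetUserFollowers and dbo.TH_Users. The window is ordered DESC here because the question asks for most-followed users first.

```python
import sqlite3

# Hypothetical follower counts per user ID, replacing dbo.GetUserFollowers.
FOLLOWERS = {1: 5, 2: 42, 3: 17, 4: 0}

conn = sqlite3.connect(":memory:")
conn.create_function("get_user_followers", 1, lambda uid: FOLLOWERS.get(uid, 0))
conn.execute(
    "CREATE TABLE th_users (id INTEGER PRIMARY KEY, username TEXT, culture_id INTEGER)"
)
conn.executemany(
    "INSERT INTO th_users VALUES (?, ?, ?)",
    [(1, "ann", 1), (2, "bob", 1), (3, "cy", 1), (4, "dee", 1)],
)

# The window's ORDER BY repeats the function call instead of using the
# followers alias, so the row numbers follow the follower ranking.
page = conn.execute(
    """
    WITH ranked AS (
        SELECT id, username,
               get_user_followers(id) AS followers,
               ROW_NUMBER() OVER (ORDER BY get_user_followers(id) DESC) AS row_num
        FROM th_users
        WHERE culture_id = :culture_id
    )
    SELECT id, username, followers
    FROM ranked
    WHERE row_num BETWEEN :start_idx AND :end_idx
    ORDER BY row_num
    """,
    {"culture_id": 1, "start_idx": 1, "end_idx": 2},
).fetchall()

print(page)
```

Because the row numbers already follow the follower ranking, paging with BETWEEN returns the top users first, and the outer ORDER BY only fixes the order within the page.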