Ignore results less than X time apart

The data I currently have looks something like the table I have below:
UserId VisitDate
1 2012-01-01 00:15:00.000
1 2012-01-01 00:16:00.000
1 2012-01-12 00:15:00.000
1 2012-01-12 00:16:00.000
1 2012-01-24 00:15:00.000
1 2012-01-24 00:16:00.000
I would like to return only results that are 10 or more days apart, so that the output looks like this:
UserId VisitDate
1 2012-01-01 00:15:00.000
1 2012-01-12 00:15:00.000
1 2012-01-24 00:15:00.000
Simpler than I'm making it perhaps, but how would one go about doing so in transact-sql?

You can use the new lag() function in that case: http://msdn.microsoft.com/en-us/library/hh231256.aspx
Something like:
SELECT UserId, VisitDate
FROM (
    SELECT UserId,
           VisitDate,
           LAG(VisitDate, 1, '1900-01-01') OVER (PARTITION BY UserId ORDER BY VisitDate) AS PrevVisitDate
    FROM dbo.YourTable
) X
WHERE DATEDIFF(day, PrevVisitDate, VisitDate) >= 10;
You can see it in action here: http://sqlfiddle.com/#!6/a0f02/1
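To experiment with the LAG() approach without a SQL Server instance, here is a minimal sketch that runs an equivalent query in SQLite through Python's sqlite3 module. The table name visits, the function name, and the use of julianday(date(...)) in place of DATEDIFF(day, ...) are assumptions made for this sketch and are not part of the answer above; window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE_VISITS = [
    (1, "2012-01-01 00:15:00.000"),
    (1, "2012-01-01 00:16:00.000"),
    (1, "2012-01-12 00:15:00.000"),
    (1, "2012-01-12 00:16:00.000"),
    (1, "2012-01-24 00:15:00.000"),
    (1, "2012-01-24 00:16:00.000"),
]


def visits_at_least_days_apart(rows, min_days=10):
    """Keep rows whose calendar-day gap from the previous row (per user) is >= min_days."""
    conn = sqlite3.connect(":memory:")
    # "visits" is a placeholder table name for this sketch.
    conn.execute("CREATE TABLE visits (UserId INTEGER, VisitDate TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    query = """
        SELECT UserId, VisitDate
        FROM (
            SELECT UserId,
                   VisitDate,
                   LAG(VisitDate, 1, '1900-01-01')
                       OVER (PARTITION BY UserId ORDER BY VisitDate) AS PrevVisitDate
            FROM visits
        ) AS x
        -- Subtracting julianday(date(...)) values counts calendar-day boundaries,
        -- which approximates DATEDIFF(day, ...) in T-SQL.
        WHERE julianday(date(VisitDate)) - julianday(date(PrevVisitDate)) >= ?
        ORDER BY UserId, VisitDate
    """
    result = conn.execute(query, (min_days,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in visits_at_least_days_apart(SAMPLE_VISITS, 10):
        print(row)
```

With the sample data this returns the three 00:15 rows. Keep in mind that each row is compared with the immediately preceding row, not with the last row that was kept, so a user who visits every 6 days would only get the first visit back.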

Related

Generating a table with dates from a table without dates in Postgresql

I would like to create a table that has a list of dates from a table that doesn't have dates.
So table A would be (without any dates):
product_name
1
2
3
4
5
6
7
8
9
10
table B would be:
dates
01/01/2022
02/01/2022
03/01/2022
04/01/2022
05/01/2022
...
etc to 31/12/2022
My final outcome table would be:
product_name dates
1 01/01/2022
1 02/01/2022
1 03/01/2022
1 04/01/2022
1 05/01/2022
1 06/01/2022
etc all the way to 31/12/2022
but also further down
product_name dates
2 01/01/2022
2 02/01/2022
etc..
Is there a way to do this simply? I have tried
select date_trunc('day', dd) date
from pg_catalog.generate_series
( '2022-01-01' :: timestamp
,'2022-12-31' :: timestamp
, '1 day' :: interval) dd
;
But I couldn't join as there are no dates in table A.
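One possible way to pair every product with every date is a cross join, which needs no join condition. In PostgreSQL that would presumably mean cross joining table A with the generate_series call above. The sketch below shows the same idea in SQLite through Python's sqlite3 module so that it can be run anywhere; the table name table_a, the function name, and the recursive CTE that stands in for generate_series are assumptions made for this illustration.

```python
import sqlite3


def products_by_date(product_names, first_day, last_day):
    """Return one (product_name, date) row per product per day in the range."""
    conn = sqlite3.connect(":memory:")
    # "table_a" is a placeholder for the product table in the question.
    conn.execute("CREATE TABLE table_a (product_name INTEGER)")
    conn.executemany(
        "INSERT INTO table_a VALUES (?)", [(name,) for name in product_names]
    )
    query = """
        WITH RECURSIVE dates(d) AS (
            -- Stand-in for generate_series: one row per day, inclusive.
            SELECT date(:first_day)
            UNION ALL
            SELECT date(d, '+1 day') FROM dates WHERE d < date(:last_day)
        )
        SELECT a.product_name, dates.d
        FROM table_a AS a
        CROSS JOIN dates
        ORDER BY a.product_name, dates.d
    """
    rows = conn.execute(
        query, {"first_day": first_day, "last_day": last_day}
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    rows = products_by_date(range(1, 11), "2022-01-01", "2022-12-31")
    print(len(rows), rows[0], rows[-1])
```

The result has one row per product per day, so 10 products over the 365 days of 2022 give 3650 rows.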

Find Minimum Timestamp From 2 Users POSTGRES

This is my table_gamers:
game_id user1 user2 timestamp
1 890 123 2022-01-01
2 123 768 2022-02-09
I need to find for each user:
The first user they played.
Their first game ID.
Their MIN timestamp (timestamp from their first game).
This is what I need:
User | User They Played | Game ID | timestamp
890 | 123 | 1 | 2022-01-01
123 | 890 | 1 | 2022-01-01
768 | 123 | 2 | 2022-02-09
This is my query:
SELECT user1 FROM table_gamers WHERE MIN(timestamp)
UNION ALL
SELECT user1 FROM table_gamers WHERE MIN(timestamp)
How do I query each User's First Opponent? I am confused.

You can do this step by step with a few WITH clauses (CTEs):
first, list every match from both sides (user1-user2 and user2-user1);
second, number the rows, overall and per user, ordered by timestamp;
third, keep only the first row for each user:
with base_data as (
    select game_id, user1, user2, timestamp from table_gamers
    union all
    select game_id, user2, user1, timestamp from table_gamers
),
base_id as (
    select
        row_number() over (order by base_data.timestamp) as id,
        row_number() over (partition by base_data.user1 order by base_data.timestamp) as id_2,
        *
    from base_data
)
select * from base_id
where id_2 = 1
order by timestamp
which results in
id id_2 game_id user1 user2 timestamp
2 1 1 123 890 2022-01-01T00:00:00.000Z
1 1 1 890 123 2022-01-01T00:00:00.000Z
4 1 2 768 123 2022-02-09T00:00:00.000Z
I hope that gives you the right idea.
https://www.db-fiddle.com/f/9PrxioFeVaTmtVcYdteovj/0
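For a quick local check of the approach above, here is a minimal sketch that runs a similar query in SQLite through Python's sqlite3 module. The column aliases player and opponent, the function name, and the extra game_id tie-breaker in the ORDER BY are assumptions added for this sketch; they are not part of the answer.

```python
import sqlite3

# Rows from the question: (game_id, user1, user2, timestamp).
GAMES = [
    (1, 890, 123, "2022-01-01"),
    (2, 123, 768, "2022-02-09"),
]


def first_opponents(games):
    """Return (player, opponent, game_id, timestamp) for each player's first game."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_gamers "
        "(game_id INTEGER, user1 INTEGER, user2 INTEGER, timestamp TEXT)"
    )
    conn.executemany("INSERT INTO table_gamers VALUES (?, ?, ?, ?)", games)
    query = """
        WITH base_data AS (
            -- Each game appears twice, once from each player's point of view.
            SELECT game_id, user1 AS player, user2 AS opponent, timestamp
            FROM table_gamers
            UNION ALL
            SELECT game_id, user2, user1, timestamp
            FROM table_gamers
        ),
        ranked AS (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY player
                       ORDER BY timestamp, game_id  -- game_id breaks ties (assumption)
                   ) AS rn
            FROM base_data
        )
        SELECT player, opponent, game_id, timestamp
        FROM ranked
        WHERE rn = 1
        ORDER BY timestamp, player
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in first_opponents(GAMES):
        print(row)
```

Selecting only the four needed columns keeps the output in the shape the question asks for, without the helper row numbers.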

Add date for each ID in PostgreSQL

I have a table "Users" that looks like this
id name
1 Johny
2 Michael
3 Jony
I want to add a new column called date,
date
2021-01-01
2021-02-01
but I want every date repeated for each id:
id name date
1 Johny 2021-01-01
1 Johny 2021-02-01
2 Michael 2021-01-01
2 Michael 2021-02-01
3 Jony 2021-01-01
3 Jony 2021-02-01
How can I do this?

It sounds like you want a cross join between an existing table, users, and either another table or a pseudo table.
Here's how to do it with a pseudo table (which I've aliased as d):
select u.id, u.name, d.date
from users u
cross join
(
    select TO_DATE('2021-01-01', 'YYYY-MM-DD') as date
    union
    select TO_DATE('2021-02-01', 'YYYY-MM-DD')
) as d;
Sql Fiddle example here
This will produce every combination (#users x #dates) of rows from the two tables.

Optimized querying in PostgreSQL

Assume you have a table named tracker with following records.
issue_id | ingest_date | verb,status
10 2015-01-24 00:00:00 1,1
10 2015-01-25 00:00:00 2,2
10 2015-01-26 00:00:00 2,3
10 2015-01-27 00:00:00 3,4
11 2015-01-10 00:00:00 1,3
11 2015-01-11 00:00:00 2,4
I need the following results
10 2015-01-26 00:00:00 2,3
11 2015-01-11 00:00:00 2,4
I am trying out this query
select *
from etl_change_fact
where ingest_date = (select max(ingest_date)
                     from etl_change_fact);
However, this gives me only this record:
10 2015-01-26 00:00:00 2,3
But I want all unique records (change_id) with
(a) max(ingest_date) AND
(b) the verb column prioritized as follows: 2 is preferred first, 1 second, 3 last.
Hence, I need the following results
10 2015-01-26 00:00:00 2,3
11 2015-01-11 00:00:00 2,4
Please help me to efficiently query it.
P.S :
I am not going to index ingest_date because I am going to set it as the "distribution key" in a distributed computing setup.
I am a newbie to data warehousing and querying.
Hence, please help me with optimized way to hit my TB sized DB.

This is a typical "greatest-n-per-group" problem. If you search for this tag here, you'll get plenty of solutions - including MySQL.
For Postgres the quickest way to do it is using distinct on (which is a Postgres proprietary extension to the SQL language)
select distinct on (issue_id) issue_id, ingest_date, verb, status
from etl_change_fact
order by issue_id,
         case verb
            when 2 then 1
            when 1 then 2
            else 3
         end, ingest_date desc;
You can also enhance your original query with a correlated subquery to get the latest row per issue_id (note that this version does not apply the verb priority):
select f1.*
from etl_change_fact f1
where f1.ingest_date = (select max(f2.ingest_date)
                        from etl_change_fact f2
                        where f1.issue_id = f2.issue_id);
Edit
For an outdated and unsupported Postgres version, you can probably get away using something like this:
select f1.*
from etl_change_fact f1
where f1.ingest_date = (select f2.ingest_date
                        from etl_change_fact f2
                        where f1.issue_id = f2.issue_id
                        order by case verb
                                    when 2 then 1
                                    when 1 then 2
                                    else 3
                                 end, ingest_date desc
                        limit 1);
SQLFiddle example: http://sqlfiddle.com/#!15/3bb05/1
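Since distinct on is specific to Postgres, a more portable variant of the same greatest-n-per-group idea can be sketched with ROW_NUMBER(). The example below runs it in SQLite through Python's sqlite3 module purely so that it is easy to try; the function name and the split of "verb,status" into two integer columns are assumptions made for this sketch, and nothing here has been measured against a TB-sized table.

```python
import sqlite3

# Rows from the question: (issue_id, ingest_date, verb, status).
TRACKER_ROWS = [
    (10, "2015-01-24 00:00:00", 1, 1),
    (10, "2015-01-25 00:00:00", 2, 2),
    (10, "2015-01-26 00:00:00", 2, 3),
    (10, "2015-01-27 00:00:00", 3, 4),
    (11, "2015-01-10 00:00:00", 1, 3),
    (11, "2015-01-11 00:00:00", 2, 4),
]


def best_row_per_issue(rows):
    """Pick one row per issue_id: best verb priority first, then latest ingest_date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE etl_change_fact "
        "(issue_id INTEGER, ingest_date TEXT, verb INTEGER, status INTEGER)"
    )
    conn.executemany("INSERT INTO etl_change_fact VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT issue_id, ingest_date, verb, status
        FROM (
            SELECT f.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY issue_id
                       ORDER BY CASE verb
                                    WHEN 2 THEN 1   -- verb 2 is preferred first
                                    WHEN 1 THEN 2   -- verb 1 second
                                    ELSE 3          -- everything else last
                                END,
                                ingest_date DESC
                   ) AS rn
            FROM etl_change_fact AS f
        ) AS ranked
        WHERE rn = 1
        ORDER BY issue_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in best_row_per_issue(TRACKER_ROWS):
        print(row)
```

On the sample data this picks the 2015-01-26 row for issue 10 and the 2015-01-11 row for issue 11, matching the result requested in the question.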

SELECT record based upon dates

Assuming data such as the following:
ID EffDate Rate
1 12/12/2011 100
1 01/01/2012 110
1 02/01/2012 120
2 01/01/2012 40
2 02/01/2012 50
3 01/01/2012 25
3 03/01/2012 30
3 05/01/2012 35
How would I find the rate for ID 2 as of 1/15/2012?
Or, the rate for ID 1 for 1/15/2012?
In other words, how do I do a query that finds the correct rate when the date falls between the EffDate for two records? (Rate should be for the date prior to the selected date).
Thanks,
John

How about this:
SELECT Rate
FROM Table1
WHERE ID = 1
  AND EffDate = (
      SELECT MAX(EffDate)
      FROM Table1
      WHERE ID = 1 AND EffDate <= '2012-01-15');
Here's an SQL Fiddle to play with. I assume here that each (ID, EffDate) pair is unique across the whole table (anything else wouldn't make sense).

SELECT TOP 1 Rate FROM the_table
WHERE ID=whatever AND EffDate <='whatever'
ORDER BY EffDate DESC
if I read you right.
(edited to suit my idea of ms-sql which I have no idea about).
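Both answers boil down to "take the latest EffDate that is not after the requested date". Here is a minimal sketch of that lookup, run in SQLite through Python's sqlite3 module so it can be tried locally. The function name is hypothetical, LIMIT 1 stands in for TOP 1, and the sample dates are assumed to be mm/dd/yyyy and are rewritten as ISO yyyy-mm-dd strings so that text comparison orders them correctly.

```python
import sqlite3

# Rows from the question: (ID, EffDate, Rate), with EffDate converted to ISO format.
RATES = [
    (1, "2011-12-12", 100),
    (1, "2012-01-01", 110),
    (1, "2012-02-01", 120),
    (2, "2012-01-01", 40),
    (2, "2012-02-01", 50),
    (3, "2012-01-01", 25),
    (3, "2012-03-01", 30),
    (3, "2012-05-01", 35),
]


def rate_as_of(rows, rate_id, as_of_date):
    """Return the rate in effect for rate_id on as_of_date, or None if there is none."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE the_table (ID INTEGER, EffDate TEXT, Rate INTEGER)")
    conn.executemany("INSERT INTO the_table VALUES (?, ?, ?)", rows)
    row = conn.execute(
        """
        SELECT Rate
        FROM the_table
        WHERE ID = ? AND EffDate <= ?
        ORDER BY EffDate DESC   -- latest effective date first
        LIMIT 1                 -- SQLite counterpart of TOP 1
        """,
        (rate_id, as_of_date),
    ).fetchone()
    conn.close()
    return row[0] if row else None


if __name__ == "__main__":
    print(rate_as_of(RATES, 2, "2012-01-15"))
    print(rate_as_of(RATES, 1, "2012-01-15"))
```

For the two cases in the question this gives 40 for ID 2 and 110 for ID 1 as of 2012-01-15. A date earlier than the first EffDate for an ID returns None.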