How to select last timestamp by distinct columns?


Suppose there is a table like this:
| user_id | location_id | datetime | other_field |
| ------- | ----------- | ------------------- | ----------- |
| 12 | 1 | 2020-02-01 10:00:00 | asdqwe |
| 12 | 1 | 2020-02-01 10:30:00 | asdqwe |
| 12 | 2 | 2020-02-01 10:40:00 | asdqwe |
| 12 | 2 | 2020-02-01 10:50:00 | asdqwe |
| 13 | 1 | 2020-02-01 10:10:00 | asdqwe |
| 13 | 1 | 2020-02-01 10:20:00 | asdqwe |
| 14 | 3 | 2020-02-01 09:00:00 | asdqwe |
I want to select the row with the latest datetime for each distinct (user_id, location_id) pair. This is the result I am looking for:
| user_id | location_id | datetime | other_field |
| ------- | ----------- | ------------------- | ----------- |
| 12 | 1 | 2020-02-01 10:30:00 | asdqwe |
| 12 | 2 | 2020-02-01 10:50:00 | asdqwe |
| 13 | 1 | 2020-02-01 10:20:00 | asdqwe |
| 14 | 3 | 2020-02-01 09:00:00 | asdqwe |
Here is the table description:
CREATE TABLE mykeyspace.mytable (
user_id int,
location_id int,
datetime timestamp,
other_field text,
PRIMARY KEY ((user_id, location_id, other_field), datetime)
) WITH CLUSTERING ORDER BY (datetime ASC)
AND read_repair_chance = 0.0
AND dclocal_read_repair_chance = 0.1
AND gc_grace_seconds = 864000
AND bloom_filter_fp_chance = 0.01
AND caching = { 'keys' : 'ALL', 'rows_per_partition' : 'NONE' }
AND comment = ''
AND compaction = { 'class' : 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold' : 32, 'min_threshold' : 4 }
AND compression = { 'chunk_length_in_kb' : 64, 'class' : 'org.apache.cassandra.io.compress.LZ4Compressor' }
AND default_time_to_live = 0
AND speculative_retry = '99PERCENTILE'
AND min_index_interval = 128
AND max_index_interval = 2048
AND crc_check_chance = 1.0
AND cdc = false;

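For this kind of query, CQL has the PER PARTITION LIMIT clause (available in Cassandra 3.6+, IIRC). To use it on your table, the clustering order has to be CLUSTERING ORDER BY (datetime DESC); the clustering order of an existing table can't be altered, so that means recreating the table. Then you can write:
select * from mykeyspace.mytable per partition limit 1;
and get the row with the latest timestamp for every partition key you have. Note that the partition key of this table is (user_id, location_id, other_field), so you get one row per distinct combination of all three columns.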
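
To make the semantics concrete, here is a minimal sketch in plain Python (not Cassandra driver code) that emulates what PER PARTITION LIMIT 1 returns when rows are clustered by datetime in descending order. The function name latest_per_partition is made up for illustration, and the rows are the sample data from the question.

```python
from datetime import datetime

# Sample rows from the question: (user_id, location_id, datetime, other_field)
rows = [
    (12, 1, "2020-02-01 10:00:00", "asdqwe"),
    (12, 1, "2020-02-01 10:30:00", "asdqwe"),
    (12, 2, "2020-02-01 10:40:00", "asdqwe"),
    (12, 2, "2020-02-01 10:50:00", "asdqwe"),
    (13, 1, "2020-02-01 10:10:00", "asdqwe"),
    (13, 1, "2020-02-01 10:20:00", "asdqwe"),
    (14, 3, "2020-02-01 09:00:00", "asdqwe"),
]


def latest_per_partition(rows):
    """Hypothetical helper: keep the newest row of every partition.

    The partition key mirrors the table definition:
    (user_id, location_id, other_field).
    """
    latest = {}
    for user_id, location_id, ts, other_field in rows:
        key = (user_id, location_id, other_field)
        parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        # Replace the stored row only when this one is newer.
        if key not in latest or parsed > latest[key][0]:
            latest[key] = (parsed, (user_id, location_id, ts, other_field))
    # Sort only to make the output order predictable for the example.
    return sorted(row for _, row in latest.values())


result = latest_per_partition(rows)
for row in result:
    print(row)
```

Because other_field is part of the partition key, two rows with the same user_id and location_id but different other_field values would each produce their own result row.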

Related

Return unique grouped rows with the latest timestamp [duplicate]

This question already has answers here:
Select first row in each GROUP BY group?
At the moment I'm struggling with a problem that looks very easy.
Table content:
Primary keys: Timestamp, COL_A, COL_B, COL_C, COL_D
+------------------+-------+-------+-------+-------+--------+--------+
| Timestamp | COL_A | COL_B | COL_C | COL_D | Data_A | Data_B |
+------------------+-------+-------+-------+-------+--------+--------+
| 31.07.2019 15:12 | - | - | - | - | 1 | 2 |
| 31.07.2019 15:32 | 1 | 1 | 100 | 1 | 5000 | 20 |
| 10.08.2019 09:33 | - | - | - | - | 1000 | 7 |
| 31.07.2019 15:38 | 1 | 1 | 100 | 1 | 33 | 5 |
| 06.08.2019 08:53 | - | - | - | - | 0 | 7 |
| 06.08.2019 09:08 | - | - | - | - | 0 | 7 |
| 06.08.2019 16:06 | 3 | 3 | 3 | 3 | 0 | 23 |
| 07.08.2019 10:43 | - | - | - | - | 0 | 42 |
| 07.08.2019 13:10 | - | - | - | - | 0 | 24 |
| 08.08.2019 07:19 | 11 | 111 | 111 | 12 | 0 | 2 |
| 08.08.2019 10:54 | 2334 | 65464 | 565 | 76 | 1000 | 19 |
| 08.08.2019 11:15 | 232 | 343 | 343 | 43 | 0 | 2 |
| 08.08.2019 11:30 | 2323 | rtttt | 3434 | 34 | 0 | 2 |
| 10.08.2019 14:47 | - | - | - | - | 123 | 23 |
+------------------+-------+-------+-------+-------+--------+--------+
Needed query output:
+------------------+-------+-------+-------+-------+--------+--------+
| Timestamp | COL_A | COL_B | COL_C | COL_D | Data_A | Data_B |
+------------------+-------+-------+-------+-------+--------+--------+
| 31.07.2019 15:38 | 1 | 1 | 100 | 1 | 33 | 5 |
| 06.08.2019 16:06 | 3 | 3 | 3 | 3 | 0 | 23 |
| 08.08.2019 07:19 | 11 | 111 | 111 | 12 | 0 | 2 |
| 08.08.2019 10:54 | 2334 | 65464 | 565 | 76 | 1000 | 19 |
| 08.08.2019 11:15 | 232 | 343 | 343 | 43 | 0 | 2 |
| 08.08.2019 11:30 | 2323 | rtttt | 3434 | 34 | 0 | 2 |
| 10.08.2019 14:47 | - | - | - | - | 123 | 23 |
+------------------+-------+-------+-------+-------+--------+--------+
As you can see, I'm trying to get single rows for my primary keys, using the latest timestamp, which is also a primary key.
Currently, I tried a query like:
SELECT Timestamp, COL_A, COL_B, COL_C, COL_D, Data_A, Data_B
FROM XY AS op
WHERE Timestamp = (
    SELECT MAX(Timestamp) FROM XY AS tsRow
    WHERE op.COL_A = tsRow.COL_A
    AND op.COL_B = tsRow.COL_B
    AND op.COL_C = tsRow.COL_C
    AND op.COL_D = tsRow.COL_D
);
which gives me a result that looks fine at first glance.
Is there a better or safer way to get my preferred result?

You can use PostgreSQL's DISTINCT ON clause, which returns the first record of each ordered group. Here the group is (COL_A, COL_B, COL_C, COL_D). Within each group the rows are ordered by the Timestamp column in descending order, so the most recent record comes first.
SELECT DISTINCT ON ("COL_A", "COL_B", "COL_C", "COL_D")
*
FROM
mytable
ORDER BY "COL_A", "COL_B", "COL_C", "COL_D", "Timestamp" DESC
If you want to get your expected order, you need a second ORDER BY after this operation:
SELECT
*
FROM (
SELECT DISTINCT ON ("COL_A", "COL_B", "COL_C", "COL_D")
*
FROM
mytable
ORDER BY "COL_A", "COL_B", "COL_C", "COL_D", "Timestamp" DESC
) s
ORDER BY "Timestamp"
Note: If the Timestamp column is part of the PK, are you sure you really need the four other columns in the PK as well? It seems that the Timestamp column is already unique.
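
DISTINCT ON is specific to PostgreSQL. As a rough sketch of the same idea for engines without it, the ROW_NUMBER() window function can be used instead: number the rows of each group from newest to oldest and keep number 1. The example below runs against an in-memory SQLite database (assuming SQLite 3.25 or later for window function support). The table name xy, the lowercase column names, and the ISO-formatted timestamps are assumptions made for the example; only a few of the question's rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xy (ts TEXT, col_a TEXT, col_b TEXT, col_c TEXT, col_d TEXT,"
    " data_a INTEGER, data_b INTEGER)"
)
# A subset of the question's rows; timestamps rewritten as ISO strings
# so that text comparison matches chronological order.
conn.executemany(
    "INSERT INTO xy VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("2019-07-31 15:12", "-", "-", "-", "-", 1, 2),
        ("2019-07-31 15:32", "1", "1", "100", "1", 5000, 20),
        ("2019-07-31 15:38", "1", "1", "100", "1", 33, 5),
        ("2019-08-06 16:06", "3", "3", "3", "3", 0, 23),
        ("2019-08-10 14:47", "-", "-", "-", "-", 123, 23),
    ],
)

query = """
SELECT ts, col_a, col_b, col_c, col_d, data_a, data_b
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY col_a, col_b, col_c, col_d
               ORDER BY ts DESC          -- newest row gets number 1
           ) AS rn
    FROM xy
) s
WHERE rn = 1
ORDER BY ts
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Unlike the correlated MAX(Timestamp) subquery from the question, this form returns exactly one row per group even if two rows of a group share the same timestamp.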

DB2 Query multiple select and sum by date

I have 3 tables: ITEMS, ODETAILS, OHIST.
ITEMS - a list of products, ID is the key field
ODETAILS - line items of every order, no key field
OHIST - a view showing last years order totals by month
ITEMS
+----+----------+
| ID | NAME     |
+----+----------+
| 10 | Widget10 |
| 11 | Widget11 |
| 12 | Widget12 |
| 13 | Widget13 |
+----+----------+
ODETAILS
+-----+---------+---------+----------+
| OID | ODUE    | ITEM_ID | ITEM_QTY |
+-----+---------+---------+----------+
| A33 | 1180503 | 10      | 100      |
| A33 | 1180504 | 11      | 215      |
| A34 | 1180505 | 10      | 500      |
| A34 | 1180504 | 11      | 320      |
| A34 | 1180504 | 12      | 450      |
| A34 | 1180505 | 13      | 125      |
+-----+---------+---------+----------+
OHIST
+---------+-------+
| ITEM_ID | M5QTY |
+---------+-------+
| 10      | 1000  |
| 11      | 1500  |
| 12      | 2251  |
| 13      | 4334  |
+---------+-------+
Assuming today is May 2, 2018 (1180502), I want my results to show ID, NAME, M5QTY, and SUM(ITEM_QTY) grouped by day over the next 3 days (D1, D2, D3).
Desired Result
+----+----------+--------+------+------+------+
| ID | NAME | M5QTY | D1 | D2 | D3 |
+----+----------+--------+------+------+------+
| 10 | Widget10 | 1000 | 100 | | 500 |
+----+----------+--------+------+------+------+
| 11 | Widget11 | 1500 | | 535 | |
+----+----------+--------+------+------+------+
| 12 | Widget12 | 2251 | | 450 | |
+----+----------+--------+------+------+------+
| 13 | Widget13 | 4334 | | | 125 |
+----+----------+--------+------+------+------+
This is how I convert ODUE to a date
DATE(concat(concat(concat(substr(char((ODETAILS.ODUE-1000000)+20000000),1,4),'-'), concat(substr(char((ODETAILS.ODUE-1000000)+20000000),5,2), '-')), substr(char((ODETAILS.ODUE-1000000)+20000000),7,2)))

Try this (you can add the joins you need). The day offset is added to the date before it is converted to an integer, so the comparison stays correct across month ends:
SELECT ITEM_ID
, SUM(CASE WHEN ODUE = INT(CURRENT DATE + 1 DAY) - 19000000 THEN ITEM_QTY ELSE 0 END) AS D1
, SUM(CASE WHEN ODUE = INT(CURRENT DATE + 2 DAYS) - 19000000 THEN ITEM_QTY ELSE 0 END) AS D2
, SUM(CASE WHEN ODUE = INT(CURRENT DATE + 3 DAYS) - 19000000 THEN ITEM_QTY ELSE 0 END) AS D3
FROM
ODETAILS
GROUP BY
ITEM_ID
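
The conditional-sum pivot can be checked outside the database. The sketch below, in plain Python, rebuilds it for the sample ODETAILS rows with "today" fixed at May 2, 2018, as in the question. The helper name to_cyymmdd is made up for illustration; it encodes a date the way ODUE appears to be stored (years since 1900, then month and day). Each offset is computed from a real date, which keeps the keys valid across month boundaries.

```python
from collections import defaultdict
from datetime import date, timedelta


def to_cyymmdd(d):
    """Hypothetical helper: encode a date as ODUE does, e.g. 2018-05-02 -> 1180502."""
    return (d.year - 1900) * 10000 + d.month * 100 + d.day


today = date(2018, 5, 2)  # fixed to match the question's example

# ODETAILS sample rows: (OID, ODUE, ITEM_ID, ITEM_QTY)
odetails = [
    ("A33", 1180503, 10, 100),
    ("A33", 1180504, 11, 215),
    ("A34", 1180505, 10, 500),
    ("A34", 1180504, 11, 320),
    ("A34", 1180504, 12, 450),
    ("A34", 1180505, 13, 125),
]

# Map each of the next three due dates to its output column.
due_to_column = {to_cyymmdd(today + timedelta(days=n)): f"D{n}" for n in (1, 2, 3)}

# Equivalent of SUM(CASE WHEN ODUE = ... THEN ITEM_QTY ELSE 0 END) ... GROUP BY ITEM_ID
totals = defaultdict(lambda: {"D1": 0, "D2": 0, "D3": 0})
for _oid, odue, item_id, qty in odetails:
    column = due_to_column.get(odue)
    if column is not None:
        totals[item_id][column] += qty

for item_id in sorted(totals):
    print(item_id, totals[item_id])
```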

Rank based on row number SQL Server 2008 R2

I want to rank my table data in groups by row count. For each ProductID, the first 12 rows, ordered by date, would get value = 1, the next 12 rows would get value = 2, and so on.
The table structure looks like this:
For ProductID = 1267, the associated dates are:
02-01-2016
03-01-2016
.
. (skipping months..table has one date per month)
.
12-01-2016
02-01-2017
.
.
.
02-01-2018

Use row_number() over() with some integer arithmetic to calculate groups of 12 ordered by date (per productid). Change the sort to ASCending or DESCending to suit your need.
select *
, (11 + row_number() over(partition by productid order by somedate DESC)) / 12 as rnk
from mytable
GO
myTableID | productid | somedate | rnk
--------: | :------------- | :------------------ | :--
9 | 123456 | 2018-11-12 08:24:25 | 1
8 | 123456 | 2018-10-02 12:29:04 | 1
7 | 123456 | 2018-09-09 02:39:30 | 1
2 | 123456 | 2018-09-02 08:49:37 | 1
1 | 123456 | 2018-07-04 12:25:06 | 1
5 | 123456 | 2018-06-06 11:38:50 | 1
12 | 123456 | 2018-05-23 21:12:03 | 1
18 | 123456 | 2018-04-02 03:59:16 | 1
3 | 123456 | 2018-01-02 03:42:24 | 1
17 | 123456 | 2017-11-29 03:19:32 | 1
10 | 123456 | 2017-11-10 00:45:41 | 1
13 | 123456 | 2017-11-05 09:53:38 | 1
16 | 123456 | 2017-10-20 15:39:42 | 2
4 | 123456 | 2017-10-14 19:25:30 | 2
20 | 123456 | 2017-09-21 21:31:06 | 2
6 | 123456 | 2017-04-06 22:10:58 | 2
14 | 123456 | 2017-03-24 23:35:52 | 2
19 | 123456 | 2017-01-22 05:07:23 | 2
11 | 123456 | 2016-12-13 19:17:08 | 2
15 | 123456 | 2016-12-02 03:22:32 | 2
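
The arithmetic relies on integer division: adding 11 before dividing by 12 maps row numbers 1 to 12 onto 1, 13 to 24 onto 2, and so on. A small Python sketch of just that mapping (the function name group_of_12 is made up for illustration):

```python
def group_of_12(row_number):
    """Map a 1-based row number to its group of 12 using integer division."""
    return (11 + row_number) // 12


# Row numbers 1..25 cover two full groups and the start of a third.
ranks = [group_of_12(rn) for rn in range(1, 26)]
print(ranks)
```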

Postgresql - increment counter in rows where a column has duplicate value

I have added a column (seq) to a table used for scheduling so the front end can manage the order in which each item can be displayed. Is it possible to craft a SQL query to populate this column with an incremental counter based on the common duplicate values in the date column?
Before
------------------------------------
| name | date_time | seq |
------------------------------------
| ABC1 | 15-01-2017 11:00:00 | |
| ABC2 | 16-01-2017 11:30:00 | |
| ABC1 | 16-01-2017 11:30:00 | |
| ABC3 | 17-01-2017 10:00:00 | |
| ABC3 | 18-01-2017 12:30:00 | |
| ABC4 | 18-01-2017 12:30:00 | |
| ABC1 | 18-01-2017 12:30:00 | |
------------------------------------
After
------------------------------------
| name | date_time | seq |
------------------------------------
| ABC1 | 15-01-2017 11:00:00 | 0 |
| ABC2 | 16-01-2017 11:30:00 | 0 |
| ABC1 | 16-01-2017 11:30:00 | 1 |
| ABC3 | 17-01-2017 10:00:00 | 0 |
| ABC3 | 18-01-2017 12:30:00 | 0 |
| ABC4 | 18-01-2017 12:30:00 | 1 |
| ABC1 | 18-01-2017 12:30:00 | 2 |
------------------------------------
Solved, thanks to both answers.
To make it easier for anybody who finds this, the working code is:
UPDATE my_table f
SET seq = seq2
FROM (
SELECT ctid, ROW_NUMBER() OVER (PARTITION BY date_time ORDER BY ctid) -1 AS seq2
FROM my_table
) s
WHERE f.ctid = s.ctid;
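
As a quick way to check the numbering logic without a PostgreSQL instance, here is a sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). SQLite has no ctid, so rowid is used in its place as a stand-in for physical insertion order; that substitution is an assumption of this example, not part of the original solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, date_time TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("ABC1", "15-01-2017 11:00:00"),
        ("ABC2", "16-01-2017 11:30:00"),
        ("ABC1", "16-01-2017 11:30:00"),
        ("ABC3", "17-01-2017 10:00:00"),
        ("ABC3", "18-01-2017 12:30:00"),
        ("ABC4", "18-01-2017 12:30:00"),
        ("ABC1", "18-01-2017 12:30:00"),
    ],
)

# Number the rows that share a date_time, starting at 0, in insertion order.
query = """
SELECT name,
       date_time,
       ROW_NUMBER() OVER (PARTITION BY date_time ORDER BY rowid) - 1 AS seq
FROM my_table
ORDER BY rowid
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The date strings are only compared for equality here, so their day-first format does not affect the result.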

Use the window function row_number():
with my_table (name, date_time) as (
values
('ABC1', '15-01-2017 11:00:00'),
('ABC2', '16-01-2017 11:30:00'),
('ABC1', '16-01-2017 11:30:00'),
('ABC3', '17-01-2017 10:00:00'),
('ABC3', '18-01-2017 12:30:00'),
('ABC4', '18-01-2017 12:30:00'),
('ABC1', '18-01-2017 12:30:00')
)
select *,
row_number() over (partition by name order by date_time)- 1 as seq
from my_table
order by date_time;
name | date_time | seq
------+---------------------+-----
ABC1 | 15-01-2017 11:00:00 | 0
ABC1 | 16-01-2017 11:30:00 | 1
ABC2 | 16-01-2017 11:30:00 | 0
ABC3 | 17-01-2017 10:00:00 | 0
ABC1 | 18-01-2017 12:30:00 | 2
ABC3 | 18-01-2017 12:30:00 | 1
ABC4 | 18-01-2017 12:30:00 | 0
(7 rows)
Read this answer for a similar question about updating existing records with a unique integer.

Check out ROW_NUMBER().
SELECT name, date_time, ROW_NUMBER() OVER (PARTITION BY date_time ORDER BY name) - 1 AS seq FROM my_table;

Crosstab function and Dates PostgreSQL

I need to create a crosstab table from a query in which the order dates become column names. The number of date columns can grow or shrink depending on the date range passed to the query. The order date is stored as a Unix timestamp, which the query converts to a regular date.
Query is following:
Select cd.cust_id
, od.order_id
, od.order_size
, (TIMESTAMP 'epoch' + od.order_date * INTERVAL '1 second')::Date As order_date
From consumer_details cd,
consumer_order od
Where cd.cust_id = od.cust_id
And od.order_date Between 1469212200 And 1469212600
Order By od.order_id, od.order_date
Table as follows:
cust_id | order_id | order_size | order_date
-----------|----------------|---------------|--------------
210721008 | 0437756 | 4323 | 2016-07-22
210721008 | 0437756 | 4586 | 2016-09-24
210721019 | 10749881 | 0 | 2016-07-28
210721019 | 10749881 | 0 | 2016-07-28
210721033 | 13639 | 2286145 | 2016-09-06
210721033 | 13639 | 2300040 | 2016-10-03
The result should be:
cust_id | order_id | 2016-07-22 | 2016-09-24 | 2016-07-28 | 2016-09-06 | 2016-10-03
-----------|----------------|---------------|---------------|---------------|---------------|---------------
210721008 | 0437756 | 4323 | 4586 | | |
210721019 | 10749881 | | | 0 | |
210721033 | 13639 | | | | 2286145 | 2300040
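
To illustrate the reshaping being asked for, here is a minimal sketch in plain Python rather than with PostgreSQL's crosstab function. It pivots the sample rows so that each distinct order date becomes a column, in first-seen order. The function name crosstab_rows is made up for illustration, and letting a later duplicate (same customer, order, and date) overwrite an earlier one is an assumption, since the question does not say how duplicates should be combined.

```python
# Sample query output from the question: (cust_id, order_id, order_size, order_date)
rows = [
    ("210721008", "0437756", 4323, "2016-07-22"),
    ("210721008", "0437756", 4586, "2016-09-24"),
    ("210721019", "10749881", 0, "2016-07-28"),
    ("210721019", "10749881", 0, "2016-07-28"),
    ("210721033", "13639", 2286145, "2016-09-06"),
    ("210721033", "13639", 2300040, "2016-10-03"),
]


def crosstab_rows(rows):
    """Hypothetical helper: pivot order dates into columns."""
    dates = []   # distinct dates, in the order they first appear
    cells = {}   # (cust_id, order_id) -> {order_date: order_size}
    for cust_id, order_id, order_size, order_date in rows:
        if order_date not in dates:
            dates.append(order_date)
        # Assumption: a later duplicate overwrites the earlier value.
        cells.setdefault((cust_id, order_id), {})[order_date] = order_size
    header = ["cust_id", "order_id"] + dates
    body = [
        [cust_id, order_id] + [by_date.get(d) for d in dates]
        for (cust_id, order_id), by_date in cells.items()
    ]
    return header, body


header, body = crosstab_rows(rows)
print(header)
for line in body:
    print(line)
```

Because the set of columns depends on the data, a pure SQL version would need the column list to be generated dynamically before the query runs.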
