Substitute Dense_Rank() value with Newid() value in TSQL? - tsql

Is it possible to substitute the value generated by Dense_Rank() with a Newid() value in TSQL? I use Dense_Rank() for grouping, but I need a uniqueidentifier generated instead of an integer. Thanks in advance.

There's no direct way to do this, but as I mentioned in my comment, you can compute dense_rank() for each record, generate one NEWID() for each distinct dense_rank() value, and then join that back to the ranked rows.
CREATE TABLE test(f1 int, f2 char(1));
INSERT INTO test
VALUES (1, 'a'),
(1, 'b'),
(1, 'c'),
(2, 'a'),
(2, 'b'),
(3, 'a'),
(3, 'd'),
(3, 'g');
With dr AS (SELECT f1, f2, dense_rank() OVER (PARTITION BY f1 ORDER BY f2) as dr FROM test)
,dr_newid AS (SELECT dr, newid() as nid FROM (SELECT dr FROM dr GROUP BY dr) as drsub)
SELECT dr.f1, dr.f2, dr.dr, dr_newid.nid
FROM dr LEFT OUTER JOIN dr_newid ON dr.dr = dr_newid.dr
ORDER BY f1, f2;
+----+----+----+--------------------------------------+
| f1 | f2 | dr | nid |
+----+----+----+--------------------------------------+
| 1 | a | 1 | 966389AF-4C70-4AA8-A5C9-6F9537B8A1B8 |
| 1 | b | 2 | 73BE2978-B7D7-46B8-8B04-3103C8410575 |
| 1 | c | 3 | CB935CCA-AFE5-4D13-9583-0440DF1BEFE2 |
| 2 | a | 1 | 966389AF-4C70-4AA8-A5C9-6F9537B8A1B8 |
| 2 | b | 2 | 73BE2978-B7D7-46B8-8B04-3103C8410575 |
| 3 | a | 1 | 966389AF-4C70-4AA8-A5C9-6F9537B8A1B8 |
| 3 | d | 2 | 73BE2978-B7D7-46B8-8B04-3103C8410575 |
| 3 | g | 3 | CB935CCA-AFE5-4D13-9583-0440DF1BEFE2 |
+----+----+----+--------------------------------------+
One caveat here, though: depending on how your server executes the join from dr to dr_newid, it may call NEWID() once per joined row and so produce a different value for every row instead of one per distinct dense_rank value. Using a LEFT JOIN should nudge the optimizer into building the dr_newid intermediate result set once before joining it back; an INNER JOIN may not.
If you get incorrect results, you can dump dr_newid into a temp table and then join back to it. That forces the server to derive NEWID() exactly once for each distinct dense_rank() value, with no reliance on tricks to steer the optimizer.
sqlfiddle here
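The temp-table fallback described above could look roughly like the following T-SQL. This is a sketch, not the answerer's tested code: the temp table names #dr and #dr_newid are made up for illustration, and it assumes the test table created earlier.

```sql
-- Step 1: compute the dense rank once and materialize it.
SELECT f1, f2,
       DENSE_RANK() OVER (PARTITION BY f1 ORDER BY f2) AS dr
INTO #dr
FROM test;

-- Step 2: one NEWID() per distinct rank value, stored in a temp table
-- so the values are fixed before any join happens.
SELECT d.dr, NEWID() AS nid
INTO #dr_newid
FROM (SELECT DISTINCT dr FROM #dr) AS d;

-- Step 3: join the stored identifiers back to the ranked rows.
SELECT r.f1, r.f2, r.dr, n.nid
FROM #dr AS r
JOIN #dr_newid AS n ON n.dr = r.dr
ORDER BY r.f1, r.f2;

-- Clean up the illustrative temp tables.
DROP TABLE #dr_newid;
DROP TABLE #dr;
```

Because the identifiers are read from a stored table rather than recomputed inside the query plan, the join type no longer matters here.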

Related

I don't understand how to add the grouped values on SQL

Data table:
| WINNER | FOOT CLUB|
| -------- | -------- |
| 1 | Beşiktaş |
| 2 | Beşiktaş |
| 3 |Galatasaray |
| 4 |Galatasaray |
| 5 | Beşiktaş |
| 6 | Istanbul |
| 7 | Istanbul |
| 8 | Istanbul |
| 9 |Galatasaray |
| 10 |Galatasaray |
| 11 |Fenerbahçe |
| 12 |Fenerbahçe |
| 13 |Fenerbahçe |
| 14 | Istanbul |
Help, please. I need to count each run of consecutive identical values, in the order the runs appear. Any SQL dialect is fine. I need this result:
Beşiktaş 2
Galatasaray 2
Beşiktaş 1
Istanbul 3
Galatasaray 2
Fenerbahçe 3
Istanbul 1
CREATE TABLE football (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL
);
INSERT INTO football VALUES (1, 'Beşiktaş');
INSERT INTO football VALUES (2, 'Beşiktaş');
INSERT INTO football VALUES (3, 'Galatasaray');
INSERT INTO football VALUES (4, 'Galatasaray');
INSERT INTO football VALUES (5, 'Beşiktaş');
INSERT INTO football VALUES (6, 'Istanbul');
INSERT INTO football VALUES (7, 'Istanbul');
INSERT INTO football VALUES (8, 'Istanbul');
INSERT INTO football VALUES (9, 'Galatasaray');
INSERT INTO football VALUES (10, 'Galatasaray');
INSERT INTO football VALUES (11, 'Fenerbahçe');
INSERT INTO football VALUES (12, 'Fenerbahçe');
INSERT INTO football VALUES (13, 'Fenerbahçe');
INSERT INTO football VALUES (14, 'Istanbul');
SELECT name,
RANK() OVER()
FROM football
It turned out like this:
Beşiktaş|1
Beşiktaş|1
Galatasaray|1
Galatasaray|1
Beşiktaş|1
Istanbul|1
Istanbul|1
Istanbul|1
Galatasaray|1
Galatasaray|1
Fenerbahçe|1
Fenerbahçe|1
Fenerbahçe|1
Istanbul|1
The below was adapted from this solution.
Dbfiddle for your solution if desired
select name, count(*) as cnt
from (select t.*,
(row_number() over (order by id) - row_number() over (partition by name order by id)
) as grp
from football t
) t
group by name, grp
order by min(id) asc
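To see why the difference of the two row numbers identifies each run, it may help to look at the intermediate values. The query below is only an inspection aid against the football table above; the aliases rn_all and rn_name are illustrative.

```sql
SELECT id,
       name,
       ROW_NUMBER() OVER (ORDER BY id)                    AS rn_all,
       ROW_NUMBER() OVER (PARTITION BY name ORDER BY id)  AS rn_name,
       ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS grp
FROM football
ORDER BY id;
-- First rows of the result:
--  id | name        | rn_all | rn_name | grp
--   1 | Beşiktaş    |      1 |       1 |   0
--   2 | Beşiktaş    |      2 |       2 |   0
--   3 | Galatasaray |      3 |       1 |   2
--   4 | Galatasaray |      4 |       2 |   2
--   5 | Beşiktaş    |      5 |       3 |   2
```

Within one run both counters advance together, so grp stays constant. Note that ids 3 to 4 and id 5 share grp = 2 even though the names differ, which is why the outer query groups by name and grp together rather than by grp alone.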

PostgreSQL : Change column values based on another column value using some condition in same table

I have a table and want to replace a column value with the value from another row, based on a condition.
+---------------------+
| Cntry | Code | Value |
+---------------------+
| US | C11 | A |
| US | C12 | B |
| US | C13 | C |
| US | C14 | D |
| US | C15 | E |
| UK | C11 | G |
| UK | C12 | B |
| UK | C13 | C |
| UK | C14 | D |
| UK | C15 | E |
+---------------------+
I want to replace the value of C14 with the value of C11 for the same Cntry.
So my output should look like this.
+---------------------+
| Cntry | Code | Value |
+---------------------+
| US | C11 | A |
| US | C12 | B |
| US | C13 | C |
| US | C14 | A |<====Replace with C11 for US
| US | C15 | E |
| UK | C11 | G |
| UK | C12 | B |
| UK | C13 | C |
| UK | C14 | G |<====Replace with C11 for UK
| UK | C15 | E |
+---------------------+
Is there any way to do this in PostgreSQL?
Thanks
Create sample data:
CREATE TABLE table1 (
cntry varchar NULL,
code varchar NULL,
value varchar NULL
);
INSERT INTO table1 (cntry, code, value) VALUES('US', 'C11', 'A');
INSERT INTO table1 (cntry, code, value) VALUES('US', 'C12', 'B');
INSERT INTO table1 (cntry, code, value) VALUES('US', 'C13', 'C');
INSERT INTO table1 (cntry, code, value) VALUES('US', 'C14', 'D');
INSERT INTO table1 (cntry, code, value) VALUES('US', 'C15', 'E');
INSERT INTO table1 (cntry, code, value) VALUES('UK', 'C11', 'G');
INSERT INTO table1 (cntry, code, value) VALUES('UK', 'C12', 'B');
INSERT INTO table1 (cntry, code, value) VALUES('UK', 'C13', 'C');
INSERT INTO table1 (cntry, code, value) VALUES('UK', 'C14', 'D');
INSERT INTO table1 (cntry, code, value) VALUES('UK', 'C15', 'E');
Sample query:
select
t1.cntry,
t1.code,
case when t2.value is not null then t2.value else t1.value end as "value"
from table1 t1
left join (
select
cntry,
'C14' as code,
value
from table1
where code = 'C11'
) t2 on t1.cntry = t2.cntry and t1.code = t2.code
-- Result:
cntry code value
US C11 A
US C12 B
US C13 C
US C14 A
US C15 E
UK C11 G
UK C12 B
UK C13 C
UK C14 G
UK C15 E
If you want to actually change the contents of your table, then an UPDATE query will do the trick.
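-- Copy each country's C11 value into that country's C14 row
UPDATE mytable AS t
SET value = src.value
FROM mytable AS src
WHERE t.code = 'C14'
AND src.code = 'C11'
AND src.cntry = t.cntry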
For obvious reasons, you should be super careful with UPDATE queries. There are a couple of ways to avoid mistakes that I sometimes use:
Try a SELECT statement first to get the rows you think you want to change. If the result looks good, edit the query to change the SELECT into an UPDATE.
Make a copy of the table and try your update on the copy. If you're happy with the results, run the query on the original table. Use SELECT INTO to create the copy (SELECT * INTO tablecopy FROM mytable) and DROP TABLE to remove it afterwards (DROP TABLE tablecopy).
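Both safeguards can be sketched against the table1 sample data from the first answer. The copy name table1_copy is hypothetical, and the UPDATE shown is one possible PostgreSQL formulation of "copy C11 into C14 per country", not necessarily what the answerer had in mind.

```sql
-- 1. Preview: list the rows the update would touch, old value next to new.
SELECT t.cntry, t.code, t.value AS old_value, src.value AS new_value
FROM table1 AS t
JOIN table1 AS src
  ON src.cntry = t.cntry AND src.code = 'C11'
WHERE t.code = 'C14';

-- 2. Rehearse on a copy before touching the real table.
CREATE TABLE table1_copy AS
SELECT * FROM table1;

UPDATE table1_copy AS t
SET value = src.value
FROM table1_copy AS src
WHERE t.code = 'C14'
  AND src.code = 'C11'
  AND src.cntry = t.cntry;

-- Check the outcome, then discard the copy.
SELECT cntry, code, value
FROM table1_copy
ORDER BY cntry, code;

DROP TABLE table1_copy;
```

With the sample data, the preview should report D becoming A for US and D becoming G for UK.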

Join and combine tables to get common rows in a specific column together in Postgres

I have a couple of tables in a Postgres database. I have joined and merged the tables. However, I would like common values in a specific column to appear together in the final table (in the end, I would like to perform a GROUP BY and a maximum-value calculation on the table).
The schema of the test tables looks like this:
Schema (PostgreSQL v11)
CREATE TABLE table1 (
id CHARACTER VARYING NOT NULL,
seq CHARACTER VARYING NOT NULL
);
INSERT INTO table1 (id, seq) VALUES
('UA502', 'abcdef'), ('UA503', 'ghijk'),('UA504', 'lmnop')
;
CREATE TABLE table2 (
id CHARACTER VARYING NOT NULL,
score FLOAT
);
INSERT INTO table2 (id, score) VALUES
('UA502', 2.2), ('UA503', 2.6),('UA504', 2.8)
;
CREATE TABLE table3 (
id CHARACTER VARYING NOT NULL,
seq CHARACTER VARYING NOT NULL
);
INSERT INTO table3 (id, seq) VALUES
('UA502', 'qrst'), ('UA503', 'uvwx'),('UA504', 'yzab')
;
CREATE TABLE table4 (
id CHARACTER VARYING NOT NULL,
score FLOAT
);
INSERT INTO table4 (id, score) VALUES
('UA502', 8.2), ('UA503', 8.6),('UA504', 8.8)
;
I performed join and union operations on the tables to get the desired columns.
Query #1
SELECT table1.id, table1.seq, table2.score
FROM table1 INNER JOIN table2 ON table1.id = table2.id
UNION
SELECT table3.id, table3.seq, table4.score
FROM table3 INNER JOIN table4 ON table3.id = table4.id
;
The output looks like this:
| id | seq | score |
| ----- | ------ | ----- |
| UA502 | qrst | 8.2 |
| UA502 | abcdef | 2.2 |
| UA504 | yzab | 8.8 |
| UA503 | uvwx | 8.6 |
| UA504 | lmnop | 2.8 |
| UA503 | ghijk | 2.6 |
However, the desired output should be:
| id | seq | score |
| ----- | ------ | ----- |
| UA502 | qrst | 8.2 |
| UA502 | abcdef | 2.2 |
| UA504 | yzab | 8.8 |
| UA504 | lmnop | 2.8 |
| UA503 | uvwx | 8.6 |
| UA503 | ghijk | 2.6 |
View on DB Fiddle
How should I modify my query to get the desired output?
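No answer is recorded for this question here. One possible approach, assuming the only requirement is that rows with the same id sit next to each other (the exact order of the ids themselves is treated as unimportant), is to add an ORDER BY to the whole UNION:

```sql
SELECT table1.id, table1.seq, table2.score
FROM table1 INNER JOIN table2 ON table1.id = table2.id
UNION
SELECT table3.id, table3.seq, table4.score
FROM table3 INNER JOIN table4 ON table3.id = table4.id
-- ORDER BY after a UNION sorts the combined result,
-- so equal ids end up adjacent, highest score first.
ORDER BY id, score DESC
;
```

Without an ORDER BY the row order of a UNION is not guaranteed, which is why the ids appear interleaved in the output above. For the later GROUP BY and maximum-value step, row order does not matter at all; the grouping works on the unsorted result too.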

How can I get the sum(value) on the latest gather_time per group(name,col1) in PostgreSQL?

Actually, I got a good answer to a similar issue in the thread below, but I need one more solution for a different data set.
How to get the latest 2 rows ( PostgreSQL )
The data set has historical data, and I just want to get sum(value) for each group at its latest gather_time.
The final result should be as follows:
name | col1 | gather_time | sum
-------+------+---------------------+-----
first | 100 | 2016-01-01 23:12:49 | 6
first | 200 | 2016-01-01 23:11:13 | 4
However, with the query below I only see data for one group (first-100); there is no data for the second group (first-200).
The thing is that I need exactly one row per group.
The number of groups can vary.
select name,col1,gather_time,sum(value)
from testtable
group by name,col1,gather_time
order by gather_time desc
limit 2;
name | col1 | gather_time | sum
-------+------+---------------------+-----
first | 100 | 2016-01-01 23:12:49 | 6
first | 100 | 2016-01-01 23:11:19 | 6
(2 rows)
Can you advise me on how to accomplish this?
Data set
create table testtable
(
name varchar(30),
col1 varchar(30),
col2 varchar(30),
gather_time timestamp,
value integer
);
insert into testtable values('first','100','q1','2016-01-01 23:11:19',2);
insert into testtable values('first','100','q2','2016-01-01 23:11:19',2);
insert into testtable values('first','100','q3','2016-01-01 23:11:19',2);
insert into testtable values('first','200','t1','2016-01-01 23:11:13',2);
insert into testtable values('first','200','t2','2016-01-01 23:11:13',2);
insert into testtable values('first','100','q1','2016-01-01 23:11:11',2);
insert into testtable values('first','100','q1','2016-01-01 23:12:49',2);
insert into testtable values('first','100','q2','2016-01-01 23:12:49',2);
insert into testtable values('first','100','q3','2016-01-01 23:12:49',2);
select *
from testtable
order by name,col1,gather_time;
name | col1 | col2 | gather_time | value
-------+------+------+---------------------+-------
first | 100 | q1 | 2016-01-01 23:11:11 | 2
first | 100 | q2 | 2016-01-01 23:11:19 | 2
first | 100 | q3 | 2016-01-01 23:11:19 | 2
first | 100 | q1 | 2016-01-01 23:11:19 | 2
first | 100 | q3 | 2016-01-01 23:12:49 | 2
first | 100 | q1 | 2016-01-01 23:12:49 | 2
first | 100 | q2 | 2016-01-01 23:12:49 | 2
first | 200 | t2 | 2016-01-01 23:11:13 | 2
first | 200 | t1 | 2016-01-01 23:11:13 | 2
One option is to join your original table to a table containing only the records with the latest gather_time for each name, col1 group. Then you can take the sum of the value column for each group to get the result set you want.
SELECT t1.name, t1.col1, MAX(t1.gather_time) AS gather_time, SUM(t1.value) AS sum
FROM testtable t1 INNER JOIN
(
SELECT name, col1, MAX(gather_time) AS maxTime
FROM testtable
GROUP BY name, col1
) t2
ON t1.name = t2.name AND t1.col1 = t2.col1 AND
t1.gather_time = t2.maxTime
GROUP BY t1.name, t1.col1
If you wanted to use a subquery in the WHERE clause, as you attempted in your OP, to restrict to only records with the latest gather_time then you could try the following:
SELECT name, col1, gather_time, SUM(value) AS sum
FROM testtable t1
WHERE gather_time =
(
SELECT MAX(gather_time)
FROM testtable t2
WHERE t1.name = t2.name AND t1.col1 = t2.col1
)
GROUP BY name, col1, gather_time
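A third variant, not from the answer above, swaps the join or correlated subquery for a window function. It is a sketch against the testtable data set from the question; the alias rnk is illustrative.

```sql
SELECT name, col1, gather_time, SUM(value) AS sum
FROM (
    SELECT t.*,
           -- rank 1 marks every row at the latest gather_time of its group
           RANK() OVER (PARTITION BY name, col1
                        ORDER BY gather_time DESC) AS rnk
    FROM testtable t
) ranked
WHERE rnk = 1
GROUP BY name, col1, gather_time
ORDER BY name, col1;
-- With the sample data:
--  first | 100 | 2016-01-01 23:12:49 | 6
--  first | 200 | 2016-01-01 23:11:13 | 4
```

RANK() is used rather than ROW_NUMBER() because several rows share the latest gather_time and all of them have to be summed.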

SQL - group by - limit clause - postgresql

I have a table which has two columns C1 and C2.
C1 has an integer data type and C2 has text.
Table looks like this.
---C1--- ---C2---
1 | a |
1 | b |
1 | c |
1 | d |
1 | e |
1 | f |
1 | g |
2 | h |
2 | i |
2 | j |
2 | k |
2 | l |
2 | m |
2 | n |
------------------
My question: I want a SQL query that groups by column C1 but with a group size of at most 3.
The result should look like this.
------------------
1 | a,b,c |
1 | d,e,f |
1 | g |
2 | h,i,j |
2 | k,l,m |
2 | n |
------------------
Is this possible with plain SQL?
Note: I do not want to write stored procedure or function...
You can use a common table expression to split each C1 group into chunks of three rows, and then use STRING_AGG to join each chunk into a comma-separated list:
WITH cte AS (
SELECT *, (ROW_NUMBER() OVER (PARTITION BY C1 ORDER BY C2)-1)/3 rn
FROM mytable
)
SELECT C1, STRING_AGG(C2, ',' ORDER BY C2) ALL_C2
FROM cte
GROUP BY C1,rn
ORDER BY C1,rn
An SQLfiddle to test with.
A short explanation of the common table expression:
ROW_NUMBER() OVER (...) numbers the rows from 1 to n for each value of C1. We then subtract 1 and divide by 3 (integer division) to get the sequence 0,0,0,1,1,1,2,2,2... and group by that value in the outer query to get up to 3 values per output row.
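The bucket arithmetic can be checked on its own. The sketch below inlines seven sample rows for C1 = 1 with a VALUES list instead of reading the real table, and the alias row_num is illustrative.

```sql
WITH mytable (C1, C2) AS (
    VALUES (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'),
           (1, 'e'), (1, 'f'), (1, 'g')
)
SELECT C1, C2,
       ROW_NUMBER() OVER (PARTITION BY C1 ORDER BY C2)           AS row_num,
       -- integer division: rows 1-3 -> 0, rows 4-6 -> 1, row 7 -> 2
       (ROW_NUMBER() OVER (PARTITION BY C1 ORDER BY C2) - 1) / 3 AS rn
FROM mytable
ORDER BY C1, C2;
-- row_num: 1 2 3 4 5 6 7
-- rn:      0 0 0 1 1 1 2
```

Subtracting 1 first is what makes the buckets start at row 1; dividing row_num by 3 directly would put only rows 1 and 2 in the first bucket.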
Apart from Joachim Isaksson's answer, you can also try this method:
SELECT C1, string_agg(C2, ',') as c2
FROM (
SELECT *, (ROW_NUMBER() OVER (PARTITION BY C1 ORDER BY C2)-1)/3 as row_num
FROM atable) t
GROUP BY C1,row_num
ORDER BY C1, row_num