PostgreSQL: use string_agg results in an IN statement

Does anyone know how string_agg results need to be "massaged" so they can be used in an IN statement?
The following is some sample code. Thanks for your time.
P.S. Before you scratch your head: I'm only using this code to demonstrate the string_agg problem; as you can see, the query is otherwise a bit pointless.
Henry
WITH TEMP AS
(
SELECT 'John' AS col1
UNION ALL
SELECT 'Peter' AS col1
UNION ALL
SELECT 'Henry' AS col1
UNION ALL
SELECT 'Mo' AS col1
)
-- results that are being used in the IN statement
--SELECT string_agg('''' || col1::TEXT || '''',',') AS col1 FROM TEMP
SELECT col1 FROM TEMP
WHERE col1 IN
(
SELECT string_agg('''' || col1::TEXT || '''',',') AS col1
FROM TEMP
)

You can't mix dynamic code with static code. Your example does not make it clear what exactly you want to do. Your sample could be written as:
WITH TEMP(col1) AS (values ('John'), ('Peter'), ('Henry'), ('Mo'))
SELECT col1 FROM TEMP
WHERE col1 IN (SELECT col1 FROM TEMP)
or using an array:
WITH TEMP(col1) AS (values ('John'), ('Peter'), ('Henry'), ('Mo'))
SELECT col1 FROM TEMP
WHERE col1 = ANY (ARRAY(SELECT col1 FROM TEMP))
or simply (in this case, since the main FROM and the subselect use the same table without any filters):
WITH TEMP(col1) AS (values ('John'), ('Peter'), ('Henry'), ('Mo'))
SELECT col1 FROM TEMP
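To see why the original query returns nothing, it helps to look at what the aggregate hands to IN. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL, with group_concat playing the role of string_agg and the CTE renamed to the hypothetical name "names". The aggregate produces one row holding one long string, so IN compares each name against that single string and never matches.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; group_concat replaces string_agg.
# The CTE name "names" is hypothetical (chosen to avoid the TEMP keyword).
CTE = "WITH names(col1) AS (VALUES ('John'), ('Peter'), ('Henry'), ('Mo')) "

conn = sqlite3.connect(":memory:")

# The aggregate collapses all rows into ONE row containing ONE string.
aggregated = conn.execute(
    CTE + "SELECT group_concat('''' || col1 || '''', ',') FROM names"
).fetchall()

# IN compares col1 against that single string, so no row matches.
via_string = conn.execute(
    CTE + "SELECT col1 FROM names WHERE col1 IN "
    "(SELECT group_concat('''' || col1 || '''', ',') FROM names)"
).fetchall()

# A subquery that returns one row per value behaves as intended.
via_rows = conn.execute(
    CTE + "SELECT col1 FROM names WHERE col1 IN (SELECT col1 FROM names)"
).fetchall()

print(len(aggregated), len(via_string), len(via_rows))
conn.close()
```

The quoted, comma-separated string would only act as a list if it were spliced into the SQL text before parsing, which is the "dynamic code" the answer refers to.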

Related

select distinct values in multiple column and save in common column with column tags

I have a table in postgres with two columns:
col1 col2
a a
b c
d e
f f
I would like to take the distinct values across the two columns, put them into a single column, and tag each value with the name of the column it came from. The desired output is:
col source
a col1, col2
b col1
c col2
d col1
e col2
f col1, col2
I am able to find the distinct values in the individual columns, but I am not able to combine them into a single column and add the source label.
Below is the query I am using:
select distinct on (col1, col2) col1, col2 from table
Any suggestions would be really helpful.
You can un-pivot the columns and then aggregate them back:
select u.value, string_agg(distinct u.source, ',' order by u.source)
from data
cross join lateral (
values('col1', col1), ('col2', col2)
)as u(source,value)
group by u.value
order by u.value;
Alternatively, if you don't want to list each column, you can convert the row to a JSON value and then un-pivot that:
select x.value, string_agg(distinct x.source, ',' order by x.source)
from data d
cross join lateral jsonb_each_text(to_jsonb(d)) as x(source, value)
group by x.value
order by x.value;
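The un-pivot-then-aggregate idea can be tried without a PostgreSQL server. This is a rough sketch, assuming SQLite through Python's sqlite3 module: SQLite has no lateral VALUES or string_agg, so a UNION ALL does the un-pivot and group_concat does the aggregation. The table name data and the sample rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [("a", "a"), ("b", "c"), ("d", "e"), ("f", "f")],
)

# Step 1 (inner query): un-pivot into one (source, value) row per cell.
# Step 2 (outer query): group by value and collect the distinct column names.
rows = conn.execute(
    """
    SELECT value, group_concat(DISTINCT source) AS sources
    FROM (
        SELECT 'col1' AS source, col1 AS value FROM data
        UNION ALL
        SELECT 'col2' AS source, col2 AS value FROM data
    ) AS u
    GROUP BY value
    ORDER BY value
    """
).fetchall()

# group_concat does not promise an order, so the names are sorted in Python.
result = {value: sorted(sources.split(",")) for value, sources in rows}
print(result)
conn.close()
```

In PostgreSQL the ordering is handled inside the aggregate itself (the order by clause within string_agg), so no post-processing is needed there.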

Postgres EXECUTE with CTE

Is it possible to EXECUTE a prepared statement using parameters obtained from a CTE?
The samples below are simplified versions of my code, but they reproduce exactly the problem I have.
Here's how far I've been able to get without a CTE:
BEGIN;
CREATE TEMPORARY TABLE testTable
(
col1 NUMERIC,
col2 TEXT
) ON COMMIT DROP;
INSERT INTO testTable
VALUES (1, 'foo'), (2, 'bar');
PREPARE myStatement AS
WITH cteTable AS
(
SELECT col1, col2
FROM testTable
WHERE col1 = $1
)
SELECT col2 FROM cteTable;
EXECUTE myStatement(2);
DEALLOCATE myStatement;
COMMIT;
Here's the result:
col2
bar
And now, here's what I am trying to achieve :
BEGIN;
CREATE TEMPORARY TABLE testTable
(
col1 NUMERIC,
col2 TEXT
) ON COMMIT DROP;
INSERT INTO testTable
VALUES (1, 'foo'), (2, 'bar');
PREPARE myStatement AS
WITH cteTable AS
(
SELECT col1, col2
FROM testTable
WHERE col1 = $1
)
SELECT col2 FROM cteTable;
-- Using a CTE here to get the parameters for the prepared statement
WITH parameters AS
(
SELECT 2 val
)
EXECUTE myStatement(SELECT val FROM parameters);
DEALLOCATE myStatement;
COMMIT;
The error message I get is:
Syntax error at or near EXECUTE
Even if I try to run the EXECUTE part without using the CTE values, I still get the same error message.
As I haven't been able to find anyone else with the same issue despite my research, I guess I could be doing it wrong. If so, could someone please point me in the right direction?
Thanks
As Vao Tsun commented, this is not possible.
I have converted my prepared statement to a function which will return a table, then used a CTE to SELECT UNION my function with multiple parameters.

Sort two csv fields by removing duplicates and without row-by-row processing

I am trying to combine two csv fields, eliminate duplicates, sort and store it in a new field.
I was able to achieve this. However, I encountered a scenario where the values are like abc and abc*. I need to keep the one with abc* and remove the other.
Could this be achieved without row by row processing?
Here is what I have.
CREATE TABLE csv_test
(
Col1 VARCHAR(100),
Col2 VARCHAR(100),
Col3 VARCHAR(500)
);
INSERT dbo.csv_test (Col1, Col2)
VALUES ('xyz,def,abc', 'abc*,tuv,def,xyz*,abc'), ('qwe,bca,a23', 'qwe,bca,a23*,abc')
--It is assumed that there are no spaces around commas
SELECT Col1, Col2, Col1 + ',' + Col2 AS Combined_NonUnique_Unsorted,
STUFF((
SELECT ',' + Item
FROM (SELECT DISTINCT Item FROM dbo.DelimitedSplit8K(Col1 + ',' + Col2,',')) t
ORDER BY Item
FOR XML PATH('')
),1,1,'') Combined_Unique_Sorted
, ExpectedResult = 'Keep the one with * and make it unique'
FROM dbo.csv_test;
--Expected Results; if there are values like abc and abc* ; I need to keep abc* and remove abc ;
--How can I achieve this without looping or using temp tables?
abc,abc*,def,tuv,xyz,xyz* -> abc*,def,tuv,xyz*
a23,a23*,abc,bca,qwe -> a23*,abc,bca,qwe
Well, since you agree that normalizing the database is the correct thing to do, I decided to try to come up with a solution for you.
I ended up with quite a cumbersome solution involving 4(!) common table expressions - cumbersome, but it works.
The first cte is to add a row identifier missing from your table - I've used ROW_NUMBER() OVER(ORDER BY Col1, Col2) for that.
The second cte is to get a unique set of values from combining both csv columns. Note that this does not handle the * part yet.
The third cte is handling the * issue.
And finally, the fourth cte puts all the unique items back into a single csv. (I could do it in the third cte, but I wanted each cte to be responsible for a single part of the solution, which is much more readable.)
Now all that's left is to update the first cte's Col3 with the fourth cte's Combined_Unique_Sorted:
;WITH cte1 as
(
SELECT Col1,
Col2,
Col3,
ROW_NUMBER() OVER(ORDER BY Col1, Col2) As rn
FROM dbo.csv_test
), cte2 as
(
SELECT rn, Item
FROM cte1
CROSS APPLY
(
SELECT DISTINCT Item
FROM dbo.DelimitedSplit8K(Col1 +','+ Col2, ',')
) x
), cte3 AS
(
SELECT rn, Item
FROM cte2 t0
WHERE NOT EXISTS
(
SELECT 1
FROM cte2 t1
WHERE t0.Item + '*' = t1.Item
AND t0.rn = t1.rn
)
), cte4 AS
(
SELECT rn,
STUFF
((
SELECT ',' + Item
FROM cte3 t1
WHERE t1.rn = t0.rn
ORDER BY Item
FOR XML PATH('')
), 1, 1, '') Combined_Unique_Sorted
FROM cte3 t0
)
UPDATE t0
SET Col3 = Combined_Unique_Sorted
FROM cte1 t0
INNER JOIN cte4 t1 ON t0.rn = t1.rn
To verify the results:
SELECT *
FROM csv_test
ORDER BY Col1, Col2
Results:
Col1 Col2 Col3
qwe,bca,a23 qwe,bca,a23*,abc a23*,abc,bca,qwe
xyz,def,abc abc*,tuv,def,xyz*,abc abc*,def,tuv,xyz*
You can see a live demo on rextester.
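As a cross-check on the rule the third cte applies, here is a small Python sketch of the same logic on a single pair of fields. It is only an illustration of the rule, not a replacement for the set-based query; the function name combine_unique_sorted is hypothetical, and Python's default string ordering is assumed to be close enough to the database collation for this sample data.

```python
def combine_unique_sorted(col1: str, col2: str) -> str:
    """Merge two comma-separated fields, keeping 'x*' in preference to plain 'x'."""
    # Split the combined csv and de-duplicate (what cte2 does per row).
    items = set((col1 + "," + col2).split(","))
    # Drop an item when its starred twin is present (cte3's NOT EXISTS test).
    kept = [item for item in items if item + "*" not in items]
    # Sort and glue back together (what cte4 does with STUFF / FOR XML PATH).
    return ",".join(sorted(kept))


print(combine_unique_sorted("xyz,def,abc", "abc*,tuv,def,xyz*,abc"))
print(combine_unique_sorted("qwe,bca,a23", "qwe,bca,a23*,abc"))
```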

How to filter sql duplicates?

My question: how can I get records without duplicates, both within a single table and across multiple tables? How can I do this in SQL?
Let me explain what I have tried:
Select distinct Col1, col2
from Table
where order id = 143
Output
VolumeAnswer1 AreaAnswer1 heightAnswer1
VolumeAnswer2 AreaAnswer1 heightAnswer2
VolumeAnswer3 AreaAnswer1 heightAnswer2
Expected Output
It shows the duplicate for the second table, but I need the output to be like:
VolumeAnswer1 AreaAnswer1 heightAnswer1
VolumeAnswer2 heightAnswer2
VolumeAnswer3
I need the same behaviour for multiple tables; I see the same duplicates with joins as well. If this cannot be handled in SQL Server, how can we handle it in .NET? I used multiple SELECT statements, but I was asked to change them into a single SELECT. Every column should be bound to a dropdownlist...
Something like this might be a good place to start:
;with cte1 as (
Select col1, cnt1
From (
Select
col1
,row_number() over(Partition by col1 Order by col1) as cnt1
From tbltest) as tbl_sub1
Where cnt1 = 1
), cte2 as (
Select col2, cnt2
From (
Select
col2
,row_number() over(Partition by col2 Order by col2) as cnt2
From tbltest) as tbl_sub2
Where cnt2 = 1
), cte3 as (
Select col3, cnt3
From (
Select
col3
,row_number() over(Partition by col3 Order by col3) as cnt3
From tbltest) as tbl_sub3
Where cnt3 = 1
)
Select
col1, col2, col3
From cte1
full join cte2 on col1 = col2
full join cte3 on col1 = col3
Sql Fiddle showing example: http://sqlfiddle.com/#!3/c9127/1

Aggregate GREATEST in T-SQL

My SQL is rusty -- I have a simple requirement to calculate the sum of the greater of two column values:
CREATE TABLE [dbo].[Test]
(
column1 int NOT NULL,
column2 int NOT NULL
);
insert into Test (column1, column2) values (2,3)
insert into Test (column1, column2) values (6,3)
insert into Test (column1, column2) values (4,6)
insert into Test (column1, column2) values (9,1)
insert into Test (column1, column2) values (5,8)
In the absence of the GREATEST function in SQL Server, I can get the larger of the two columns with this:
select column1, column2, (select max(c)
from (select column1 as c
union all
select column2) as cs) Greatest
from test
And I was hoping that I could simply sum them thus:
select sum((select max(c)
from (select column1 as c
union all
select column2) as cs))
from test
But no dice:
Msg 130, Level 15, State 1, Line 7
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
Is this possible in T-SQL without resorting to a procedure/temp table?
UPDATE: Eran, thanks - I used this approach. My final expression is a little more complicated, however, and I'm wondering about performance in this case:
SUM(CASE WHEN ABS(column1 * column2) > ABS(column3 * column4)
THEN column5 * ABS(column1 * column2) * column6
ELSE column5 * ABS(column3 * column4) * column6 END)
Try this:
SELECT SUM(CASE WHEN column1 > column2
THEN column1
ELSE column2 END)
FROM test
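The CASE expression can be checked against the sample rows from the question. This is a minimal sketch, assuming SQLite through Python's sqlite3 module in place of SQL Server; the CASE syntax is the same in both. It also computes the two plain column totals, to show that comparing totals is a different calculation from summing the per-row maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (column1 INTEGER NOT NULL, column2 INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO Test (column1, column2) VALUES (?, ?)",
    [(2, 3), (6, 3), (4, 6), (9, 1), (5, 8)],
)

# Sum of the larger value in each row: 3 + 6 + 6 + 9 + 8.
(sum_of_greatest,) = conn.execute(
    "SELECT SUM(CASE WHEN column1 > column2 THEN column1 ELSE column2 END) FROM Test"
).fetchone()

# Plain column totals, for comparison only.
total1, total2 = conn.execute(
    "SELECT SUM(column1), SUM(column2) FROM Test"
).fetchone()

print(sum_of_greatest, total1, total2)
conn.close()
```

The larger of the two totals is not the same number as the sum of the per-row maxima, because the larger value switches columns from row to row.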
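Try this... It's not the best-performing option, and note that it compares the two column totals rather than picking the larger value in each row, so it only matches the requested sum when the same column is larger in every row.
SELECT
'LargerValue' = CASE
WHEN SUM(column1) >= SUM(column2) THEN SUM(column1)
ELSE SUM(column2)
END
FROM Test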
SELECT
SUM(MaximumValue)
FROM (
SELECT
CASE WHEN column1 > column2
THEN
column1
ELSE
column2
END AS MaximumValue
FROM
Test
) A
FYI, the more complicated case should be fine, so long as all of those columns are part of the same table. It's still looking up the same number of rows, so performance should be very similar to the simpler case (as SQL Server performance is usually IO bound).
How to find the max from single-row data:
-- e.g. columns (emplid, date1, date2, date3)
select tmp.emplid, max(tmp.a)
from
(select emplid, date1 as a from table
union
select emplid, date2 from table
union
select emplid, date3 from table
) tmp
group by tmp.emplid
select sum(id) from (
select (select max(c)
from (select column1 as c
union all
select column2) as cs) id
from test
) as t -- T-SQL requires an alias on a derived table
The best answer to this, simply put:
;With Greatest_CTE As
(
Select ( Select Max(ValueField) From ( Values (column1), (column2) ) ValueTable(ValueField) ) Greatest
From Test
)
Select Sum(Greatest)
From Greatest_CTE
It scales a lot better than the other answers with more than two value columns.