How to turn groups of rows into separate columns? - postgresql

I have a postgresql table that looks like this:
a|b|c|result
0|3|6|50
0|3|7|51
0|4|6|52
0|4|7|53
1|3|6|54
1|3|7|55
1|4|6|56
1|4|7|57
Is there an easy way to SELECT something like:
a|result for b=3|result for b=4
0|sum(50,51) |sum(52,53)
1|sum(54,55) |sum(56,57)
In other words, how to convert the groups of values of b into columns of aggregate functions like sum(), avg(), or others?
Thanks for your comments.

Not sure I understand your question completely, but I think you are looking for case.
-- drop table if exists sample;
create table sample
(a int,
b int,
c int,
result int);
insert into sample values
(0,3,6,50),
(0,3,7,51),
(0,4,6,52),
(0,4,7,53),
(1,3,6,54),
(1,3,7,55),
(1,4,6,56),
(1,4,7,57)
;
select
a,
sum(case when b = 3 then result end) as result_for_b3,
sum(case when b = 4 then result end) as result_for_b4
from
sample
group by
a
Result:
a;result_for_b3;result_for_b4
1;109;113
0;101;105
And if you (but I hope you don't) need the output exactly as in your question, then you need to use the string_agg function:
select
a,
'aggreg(' || string_agg(case when b = 3 then result end::varchar, ',') || ')' as result_for_b3,
'aggreg(' || string_agg(case when b = 4 then result end::varchar, ',') || ')' as result_for_b4
from
sample
group by
a
Result:
a;result_for_b3;result_for_b4
0;aggreg(50,51);aggreg(52,53)
1;aggreg(54,55);aggreg(56,57)
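When the distinct values of b are not known in advance, the CASE columns of the first query can be generated from the data. The following is only a minimal sketch: it uses Python's standard-library sqlite3 module as a stand-in for PostgreSQL so that it is self-contained, the helper name build_pivot_query is hypothetical, and it assumes b holds non-negative integers (so the generated column aliases stay valid).

```python
import sqlite3

def build_pivot_query(conn):
    """Build one sum(case ...) column per distinct value of b (hypothetical helper)."""
    b_values = [row[0] for row in conn.execute("select distinct b from sample order by b")]
    # int() guards the interpolated values; assumes b is a non-negative integer.
    columns = ", ".join(
        f"sum(case when b = {int(b)} then result end) as result_for_b{int(b)}"
        for b in b_values
    )
    return f"select a, {columns} from sample group by a order by a"

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
conn.execute("create table sample (a int, b int, c int, result int)")
conn.executemany(
    "insert into sample values (?, ?, ?, ?)",
    [(0, 3, 6, 50), (0, 3, 7, 51), (0, 4, 6, 52), (0, 4, 7, 53),
     (1, 3, 6, 54), (1, 3, 7, 55), (1, 4, 6, 56), (1, 4, 7, 57)],
)
pivot_rows = conn.execute(build_pivot_query(conn)).fetchall()
print(pivot_rows)
```

The generated statement has the same shape as the hand-written query above, so swapping sum for avg or another aggregate only changes the f-string.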

Related

"Subquery returns more than 1 value" when trying to INSERT

I'm trying to INSERT into a table a column that is part of another column in another table using TSQL, but I get the error stating that there is more than one value returned when I used that subquery as an expression. I understand what causes the error, but I can't seem to think of a way to make it produce what I want.
I'm trying to do something similar to:
A.Base B.Reference C.Wanted
--- ---- ----
abcdaa aa abcdaa
bcdeab bb cdefbb
cdefbb cc efghcc
defgbc ddd fghddd
efghcc
fghddd
So I'm using the code:
INSERT INTO C ( [Some other column], Wanted )
SELECT
A.[Some other column]
, CASE
WHEN LEN( B.Reference ) = 2 THEN
( SELECT A.Base FROM A WHERE RIGHT( A.Base, 2 ) =
( SELECT B.Reference FROM B WHERE LEN( B.Reference ) = 2 )
)
WHEN LEN( B.Reference ) = 3 THEN
( SELECT A.Base FROM A WHERE RIGHT( A.Base, 3 ) =
( SELECT B.Reference FROM B WHERE LEN( B.Reference ) = 3 )
)
END
FROM
A
, B
Which will return me the "more than 1 value" error. Honestly, I'm probably making this way more convoluted than it needs to be, but I've been staring at these tables for a while now.
I hope I'm getting the idea across as to what I'm trying to do.
If you know the records aren't duplicate, and you are sure your JOIN between A and B works (as Martin mentioned) can't you just select distinct to return just the unique records?
I'd try it like this:
--Create a mockup with declared table variables and test data
DECLARE @tblA TABLE(someColumnInA VARCHAR(100));
DECLARE @tblB TABLE(someColumnInB VARCHAR(100));
DECLARE @tblC TABLE(someColumnInC VARCHAR(100));
INSERT INTO @tblA VALUES
('abcdaa')
,('bcdeab')
,('cdefbb')
,('defgbc')
,('efghcc')
,('fghddd');
INSERT INTO @tblB VALUES
('aa')
,('bb')
,('cc')
,('ddd');
--The query
INSERT INTO @tblC(someColumnInC)
SELECT someColumnInA
FROM @tblA a
WHERE EXISTS(SELECT 1 FROM @tblB b WHERE a.someColumnInA LIKE '%' + b.someColumnInB + '%');
SELECT * FROM @tblC;
The idea in short:
After creating a mockup (please do this yourself in advance next time), we insert every value from @tblA into @tblC for which @tblB contains at least one value that is part of it.
How about doing something like this?
select *
from A
where RIGHT(A.Base,2) IN (select B.Reference FROM B WHERE LEN(B.Reference) = 2)
UNION ALL
select *
from A
where RIGHT(A.Base,3) IN (select B.Reference FROM B WHERE LEN(B.Reference) = 3)
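The two branches above differ only in the suffix length, so they can be folded into one condition that compares the last LEN(Reference) characters of Base. Here is a rough sketch of that idea, run against SQLite through Python's sqlite3 module purely so it is self-contained; SQLite's substr with a negative start plays the role of T-SQL's RIGHT, and the table layout is assumed from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.executescript("""
    create table A (Base text);
    create table B (Reference text);
    create table C (Wanted text);
    insert into A values ('abcdaa'), ('bcdeab'), ('cdefbb'),
                         ('defgbc'), ('efghcc'), ('fghddd');
    insert into B values ('aa'), ('bb'), ('cc'), ('ddd');
""")

# Keep a Base value when some Reference equals its suffix of the same length.
# T-SQL equivalent of the comparison: RIGHT(A.Base, LEN(B.Reference)) = B.Reference
conn.execute("""
    insert into C (Wanted)
    select A.Base
    from A
    where exists (
        select 1
        from B
        where substr(A.Base, -length(B.Reference)) = B.Reference
    )
""")

wanted = [row[0] for row in conn.execute("select Wanted from C order by Wanted")]
print(wanted)
```

Because EXISTS yields each row of A at most once, the subquery can never return more than one value to the INSERT.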

Does String Value Exists in a List of Strings | Redshift Query

I have some interesting data I'm trying to query, but I cannot get the syntax correct. I have a temporary table (temp_id), which I've filled with the id values I care about. In this example it holds only two ids.
CREATE TEMPORARY TABLE temp_id (id bigint PRIMARY KEY);
INSERT INTO temp_id (id) VALUES ( 1 ), ( 2 );
I have another table in production (let's call it foo) which holds multiple ids in a single cell. The ids column looks like this (below), with the ids stored as a single string separated by "|":
ids
-----------
1|9|3|4|5
6|5|6|9|7
NULL
2|5|6|9|7
9|11|12|99
I want to evaluate each cell in foo.ids and see if any of the ids in it match the ones in my temp_id table.
Expected output
ids |does_match
-----------------------
1|9|3|4|5 |true
6|5|6|9|7 |false
NULL |false
2|5|6|9|7 |true
9|11|12|99 |false
So far I've come up with this, but I can't seem to return anything. Instead of trying to create a new column does_match, I tried to filter in the WHERE clause. However, I cannot figure out how to compare all the id values in my temp table against the string blob of ids in foo.
SELECT
ids,
FROM foo
WHERE ids = ANY(SELECT LISTAGG(id, ' | ') FROM temp_ids)
Any suggestions would be helpful.
Cheers,
This would work, though I'm not sure about performance:
SELECT
ids
FROM foo
JOIN temp_id
ON '|'||foo.ids||'|' LIKE '%|'||temp_id.id::varchar||'|%'
You wrap the list of ids in a pair of additional separators, so you can always search for |id|, including for the first and the last number.
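To get the does_match column from the question rather than a filtered join, the same separator trick can sit inside an EXISTS. This is a hedged sketch only: it runs on SQLite via Python's sqlite3 module instead of Redshift, and it assumes the table and column names shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for Redshift here
conn.executescript("""
    create table temp_id (id integer primary key);
    insert into temp_id (id) values (1), (2);
    create table foo (ids text);
    insert into foo (ids) values
        ('1|9|3|4|5'), ('6|5|6|9|7'), (NULL), ('2|5|6|9|7'), ('9|11|12|99');
""")

# Wrap both sides in '|' so that id 1 cannot match inside 11 or 12.
# A NULL ids value makes the LIKE evaluate to NULL, so EXISTS yields false.
rows = conn.execute("""
    select f.ids,
           exists (
               select 1
               from temp_id t
               where '|' || f.ids || '|' like '%|' || t.id || '|%'
           ) as does_match
    from foo f
    order by f.rowid
""").fetchall()

matches = [(ids, bool(flag)) for ids, flag in rows]
for ids, flag in matches:
    print(ids, flag)
```

Unlike the plain join, EXISTS returns each foo row exactly once even when several ids match.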
The following SQL (I know it's a bit of a hack) returns exactly the output you expect. I tested it with your sample data but don't know how it would behave on your real data, so try it and let me know.
with seq AS ( -- create a sequence CTE to emulate postgres' unnest
select 1 as i union all -- assuming you have max 10 ids in the ids field,
-- feel free to modify this part
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10)
select distinct k.ids,
case -- since I can't do a max on a boolean field, used two cases
-- for 1s and 0s and converted them to boolean
when max(case
when t.id in (
select split_part(ids,'|',seq.i) as tt
from seq
join foo f on seq.i <= REGEXP_COUNT(ids, '|') + 1
where tt != '' and k.ids = f.ids)
then 1
else 0
end) = 1
then true
else false
end as does_match
from temp_id t, foo k
group by 1
Please let me know if this works for you!

Sql Server - Running Totals Based on Conditions

I have been banging my head trying to come up with the correct logic (SQL Server 2012) for something I would imagine is fairly routine, but I have been unable to find any examples of it anywhere. Basically, I have 3 columns in a table: product, flag, value. A product can be listed multiple times within the table, but only once per flag (i.e. product1 can have flag1 or flag2 with different or identical values, but there will never be 2 records with product1 and flag1).
The flag represents a pre-defined value (1,2,3,4) and the intention behind this field is to be able to assign a unique mathematical equation based on the value of the flag. The end result would yield a single product, the unique flag, and a new cumulative total based on the mathematical equation output. For instance, let's say product1 was listed 4 times with flag values of flag1, flag2, flag3, flag4 (see below):
Product-----Flag-----Value
Product1----Flag1----1.00
Product1----Flag2----3.00
Product1----Flag3----5.00
Product1----Flag4----7.00
The desired output would be:
Product-----Flag-----Value
Product1----Flag1----1.00 (flag1 value)
Product1----Flag2----4.00 (flag1+flag2 value)
Product1----Flag3----6.00 (flag1+flag3 value)
Product1----Flag4----10.00 (flag2+flag4 value)
Flag1 is defined as add flag1 only. Flag2 is defined as add flag1 and flag2. Flag3 is defined as add flag1 and flag3. Flag4 is defined as add flag2 and flag4. The new output would be product1 listed four times with flag values of flag1, flag2, flag3, flag4 but new values of flag1, flag1+flag2, flag1+flag3, flag2+flag4.
I have tried to apply the logic via a case statement, but I can't figure out how to traverse all the products for each condition. I have also tried a running totals solution, but I am not sure how to incorporate the flag condition so that it only adds up values when those conditions are true. Any assistance and/or article to help get me going down the right path would be greatly appreciated.
While I'm not sure I fully understand your question I think this might be what you want. For this to work it assumes flag1 is always present when flags 1 through 3 are and that flag2 is present when flag4 is.
;with cte as (
select
product,
max(case when flag = 'Flag1' then Value end) as f1Value,
max(case when flag = 'Flag2' then Value end) as f2Value,
max(case when flag = 'Flag3' then Value end) as f3Value,
max(case when flag = 'Flag4' then Value end) as f4Value
from flags group by Product
)
select
flags.Product,
flags.Flag,
flags.Value as "Org. value",
case flag
when 'Flag1' then f1Value
when 'Flag2' then f1Value + f2Value
when 'Flag3' then f1Value + f3Value
when 'Flag4' then f2Value + f4Value
else flags.Value -- take the present value when flag is not Flag1-4
end as "New value"
from flags
inner join cte on flags.Product = cte.Product
Take a look at this Sample SQL Fiddle to see it in action.
You can join a table to itself, and pick the conditions appropriately:
SELECT p1.product,p1.Flag,p1.Value + COALESCE(p2.Value,0)
FROM
Products p1
left join
Products p2
on
p1.Product = p2.Product and
p2.Flag = CASE p1.Flag
--1 doesn't need a previous value
WHEN 2 THEN 1
WHEN 3 THEN 1
WHEN 4 THEN 2
END
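The CASE inside the join condition is effectively a small lookup from each flag to the flag whose value gets added. A sketch of the same self-join with that lookup moved into a table follows; flag_rule is a hypothetical name, the flag labels are assumed to be the strings 'Flag1' to 'Flag4' from the question, and SQLite (via Python's sqlite3) is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.executescript("""
    create table flags (Product text, Flag text, Value real);
    insert into flags values
        ('Product1', 'Flag1', 1.0), ('Product1', 'Flag2', 3.0),
        ('Product1', 'Flag3', 5.0), ('Product1', 'Flag4', 7.0);

    -- hypothetical lookup table: which other flag is added to each flag
    create table flag_rule (Flag text, AddFlag text);
    insert into flag_rule values
        ('Flag2', 'Flag1'), ('Flag3', 'Flag1'), ('Flag4', 'Flag2');
""")

# Flag1 has no rule, so both left joins find nothing and coalesce adds 0.
new_values = conn.execute("""
    select p1.Product, p1.Flag, p1.Value + coalesce(p2.Value, 0) as NewValue
    from flags p1
    left join flag_rule r on r.Flag = p1.Flag
    left join flags p2 on p2.Product = p1.Product and p2.Flag = r.AddFlag
    order by p1.Product, p1.Flag
""").fetchall()

for row in new_values:
    print(row)
```

Keeping the rules in a table means a new flag definition is an INSERT rather than a change to the query text.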
I assumed the rules can be expressed as ranges of row numbers and tried this:
CREATE TABLE #tmp (Product VARCHAR(10), flag VARCHAR(10),value numeric(13,2))
GO
INSERT INTO #tmp
SELECT 'Product1' , 'Flag1',1
UNION
SELECT 'Product1' , 'Flag2',3
UNION
SELECT 'Product1' , 'Flag3',5
UNION
SELECT 'Product1' , 'Flag4',7
GO
;WITH cte
AS
(
SELECT row_number () OVER(
ORDER BY flag) 'row',*
FROM #tmp
)
SELECT *,value 'RT'
FROM cte
WHERE row = 1
UNION
SELECT * ,(
SELECT cte.value
FROM cte
WHERE row = 1
) + value 'RT'
FROM cte
WHERE row BETWEEN 2
AND 3
UNION
SELECT * ,(
SELECT cte.value
FROM cte
WHERE row =2
) + value 'RT'
FROM cte
WHERE row >3
GO
DROP TABLE #tmp

query for a range of records in result

I am wondering if there is some easy way, a function, or other method to return data from a query with the following results.
I have a SQL Express DB 2008 R2, a table that contains numerical data in a given column, say col T.
I am given a value X in code and would like to return up to three records. The record where col T equals my value X, and the record before and after, and nothing else. The sort is done on col T. The record before may be beginning of file and therefore not exist, likewise, if X equals the last record then the record after would be non existent, end of file/table.
The value of X may not exist in the table.
This I think is similar to get a range of results in numerical order.
Any help or direction in solving this would be greatly appreciated.
Thanks again,
It might not be the most optimal solution, but:
SELECT T
FROM theTable
WHERE T = X
UNION ALL
SELECT *
FROM
(
SELECT TOP 1 T
FROM theTable
WHERE T > X
ORDER BY T
) blah
UNION ALL
SELECT *
FROM
(
SELECT TOP 1 T
FROM theTable
WHERE T < X
ORDER BY T DESC
) blah2
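The three-part UNION ALL above can be tried out quickly against an in-memory table. The sketch below is an illustration under assumptions, not the answerer's code: it uses SQLite through Python's sqlite3 module, so TOP 1 becomes LIMIT 1, the function name neighbours is hypothetical, and the sample values are made up.

```python
import sqlite3

def neighbours(conn, x):
    """Return the row equal to x plus the closest row on each side, if present."""
    query = """
        select T from (
            select T from (select T from theTable where T < :x
                           order by T desc limit 1)
            union all
            select T from theTable where T = :x
            union all
            select T from (select T from theTable where T > :x
                           order by T limit 1)
        )
        order by T
    """
    return [row[0] for row in conn.execute(query, {"x": x})]

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("create table theTable (T integer)")
conn.executemany("insert into theTable values (?)", [(10,), (20,), (30,), (40,)])

print(neighbours(conn, 20))  # value in the middle of the table
print(neighbours(conn, 10))  # first row: nothing before it
print(neighbours(conn, 25))  # value that is not in the table
```

Each branch returns at most one row, which covers the start-of-table, end-of-table, and missing-value cases from the question without special handling.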
DECLARE @x int = 100
;WITH t as
(
select ROW_NUMBER() OVER (ORDER BY T ASC) AS row_nm,*
from YourTable
)
, t1 as
(
select *
from t
WHERE T = @x
)
)
select *
from t
CROSS APPLY t1
WHERE t.row_nm BETWEEN t1.row_nm -1 and t1.row_nm + 1

Getting the minimum of two values in SQL

I have two variables, one is called PaidThisMonth, and the other is called OwedPast. They are both results of some subqueries in SQL. How can I select the smaller of the two and return it as a value titled PaidForPast?
The MIN function works on columns, not variables.
SQL Server 2012 and later support the IIF(condition, true_value, false_value) function, so you can select the minimum like this:
SELECT IIF(first>second, second, first) the_minimal FROM table
While IIF is just a shorthand for writing CASE...WHEN...ELSE, it's easier to write.
The solutions using CASE, IIF, and UDF are adequate, but impractical when extending the problem to the general case using more than 2 comparison values. The generalized solution in SQL Server 2008+ utilizes a strange application of the VALUES clause:
SELECT
PaidForPast=(SELECT MIN(x) FROM (VALUES (PaidThisMonth),(OwedPast)) AS value(x))
Credit due to this website:
http://sqlblog.com/blogs/jamie_thomson/archive/2012/01/20/use-values-clause-to-get-the-maximum-value-from-some-columns-sql-server-t-sql.aspx
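As a quick check that the VALUES approach and a plain CASE agree, here is a small sketch. It runs on SQLite through Python's sqlite3 module rather than SQL Server, so the VALUES list is placed in a CTE with a named column; the function name smaller_of and the sample amounts are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here

def smaller_of(paid_this_month, owed_past):
    """Compute PaidForPast twice: with CASE, and with MIN over a VALUES list."""
    return conn.execute(
        """
        with v(x) as (values (:paid), (:owed))
        select case when :paid < :owed then :paid else :owed end as via_case,
               (select min(x) from v) as via_values
        """,
        {"paid": paid_this_month, "owed": owed_past},
    ).fetchone()

print(smaller_of(120, 75))
print(smaller_of(40, 75))
```

The VALUES form extends to more than two candidates by adding rows to the list, whereas the CASE form needs a new branch for each comparison.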
Use Case:
Select Case When @PaidThisMonth < @OwedPast
Then @PaidThisMonth Else @OwedPast End PaidForPast
As an inline table-valued UDF:
CREATE FUNCTION Minimum
(@Param1 Integer, @Param2 Integer)
Returns Table As
Return(Select Case When @Param1 < @Param2
Then @Param1 Else @Param2 End MinValue)
Usage:
Select MinValue as PaidforPast
From dbo.Minimum(@PaidThisMonth, @OwedPast)
ADDENDUM:
This is probably best for when addressing only two possible values, if there are more than two, consider Craig's answer using Values clause.
For SQL Server 2022+ (or MySQL or PostgreSQL 9.3+), a better way is to use the LEAST and GREATEST functions.
SELECT GREATEST(A.date0, B.date0) AS date0,
LEAST(A.date1, B.date1, B.date2) AS date1
FROM A, B
WHERE B.x = A.x
With:
GREATEST(value [, ...]) : Returns the largest (maximum-valued) argument from values provided
LEAST(value [, ...]) : Returns the smallest (minimum-valued) argument from values provided
Documentation links :
MySQL http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html
Postgres https://www.postgresql.org/docs/current/functions-conditional.html
SQL Server https://learn.microsoft.com/en-us/sql/t-sql/functions/logical-functions-least-transact-sql
I just had a situation where I had to find the max of 4 complex selects within an update.
With this approach you can have as many as you like!
You can also replace the numbers with additional selects:
select max(x)
from (
select 1 as 'x' union
select 4 as 'x' union
select 3 as 'x' union
select 2 as 'x'
) a
More complex usage
select @answer = Max(x)
from (
select @NumberA as 'x' union
select @NumberB as 'x' union
select @NumberC as 'x' union
select (
Select Max(score) from TopScores
) as 'x'
) a
I'm sure a UDF has better performance.
Here is a trick if you want to calculate maximum(field, 0):
SELECT (ABS(field) + field)/2 FROM Table
This returns 0 if field is negative; otherwise it returns field.
Use a CASE statement.
Example B in this page should be close to what you're trying to do:
http://msdn.microsoft.com/en-us/library/ms181765.aspx
Here's the code from the page:
USE AdventureWorks;
GO
SELECT ProductNumber, Name, 'Price Range' =
CASE
WHEN ListPrice = 0 THEN 'Mfg item - not for resale'
WHEN ListPrice < 50 THEN 'Under $50'
WHEN ListPrice >= 50 and ListPrice < 250 THEN 'Under $250'
WHEN ListPrice >= 250 and ListPrice < 1000 THEN 'Under $1000'
ELSE 'Over $1000'
END
FROM Production.Product
ORDER BY ProductNumber ;
GO
This works for up to 5 dates and handles nulls. Just couldn't get it to work as an Inline function.
CREATE FUNCTION dbo.MinDate(@Date1 datetime = Null,
@Date2 datetime = Null,
@Date3 datetime = Null,
@Date4 datetime = Null,
@Date5 datetime = Null)
RETURNS Datetime AS
BEGIN
--USAGE select dbo.MinDate('20120405',null,null,'20110305',null)
DECLARE @Output datetime;
WITH Datelist_CTE(DT)
AS (
SELECT @Date1 AS DT WHERE @Date1 is not NULL UNION
SELECT @Date2 AS DT WHERE @Date2 is not NULL UNION
SELECT @Date3 AS DT WHERE @Date3 is not NULL UNION
SELECT @Date4 AS DT WHERE @Date4 is not NULL UNION
SELECT @Date5 AS DT WHERE @Date5 is not NULL
)
Select @Output=Min(DT) FROM Datelist_CTE;
RETURN @Output;
END;
Building on the brilliant logic / code from mathematix and scottyc, I submit:
DECLARE @a INT, @b INT, @c INT = 0;
WHILE @c < 100
BEGIN
SET @c += 1;
SET @a = ROUND(RAND()*100,0)-50;
SET @b = ROUND(RAND()*100,0)-50;
SELECT @a AS a, @b AS b,
@a - ( ABS(@a-@b) + (@a-@b) ) / 2 AS MINab,
@a + ( ABS(@b-@a) + (@b-@a) ) / 2 AS MAXab,
CASE WHEN (@a <= @b AND @a = @a - ( ABS(@a-@b) + (@a-@b) ) / 2)
OR (@a >= @b AND @a = @a + ( ABS(@b-@a) + (@b-@a) ) / 2)
THEN 'Success' ELSE 'Failure' END AS Status;
END;
Although the jump from scottyc's MIN function to the MAX function should have been obvious to me, it wasn't, so I've solved for it and included it here: SELECT @a + ( ABS(@b-@a) + (@b-@a) ) / 2. The randomly generated numbers, while not proof, should at least convince skeptics that both formulae are correct.
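The reason the formulas work is that ABS(d) + d equals 2*d when d is positive and 0 otherwise, so halving it adds or removes exactly the positive part of the difference. A short exhaustive check over the same -50 to 50 range, written in Python purely as an illustration (the function names are made up here), looks like this:

```python
def min_ab(a, b):
    # abs(a - b) + (a - b) is 2*(a - b) when a > b and 0 otherwise,
    # so the subtraction removes the excess of a over b only when there is one.
    return a - (abs(a - b) + (a - b)) // 2

def max_ab(a, b):
    # Mirror image: add the excess of b over a only when b is larger.
    return a + (abs(b - a) + (b - a)) // 2

def max_with_zero(x):
    # Special case of max_ab with one operand fixed at 0.
    return (abs(x) + x) // 2

checked = 0
for a in range(-50, 51):
    assert max_with_zero(a) == max(a, 0)
    for b in range(-50, 51):
        assert min_ab(a, b) == min(a, b)
        assert max_ab(a, b) == max(a, b)
        checked += 1

print(checked, "pairs checked")
```

The bracketed sum is always even, so integer division by 2 loses nothing; with very large integers in SQL the intermediate sum could overflow, which CASE or LEAST would avoid.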
Use a temp table to insert the range of values, then select the min/max of the temp table from within a stored procedure or UDF. This is a basic construct, so feel free to revise as needed.
For example:
CREATE PROCEDURE GetMinSpeed AS
BEGIN
CREATE TABLE #speed (Driver NVARCHAR(10), SPEED INT);
--
-- Insert any number of rows you need to sort and pull from
--
INSERT INTO #speed VALUES (N'Petty', 165);
INSERT INTO #speed VALUES (N'Earnhardt', 172);
INSERT INTO #speed VALUES (N'Patrick', 174);
SELECT MIN(SPEED) FROM #speed;
DROP TABLE #speed;
END
Select MIN(T.V) FROM (Select 1 as V UNION Select 2 as V) T
SELECT (CASE WHEN first > second THEN second ELSE first END) the_minimal FROM table