postgres case statement with subquery - postgresql

I have a subquery like this
with subquery as (select host from table_A where << some condition >>)
and in my main query, I am querying data from another table called table_B, and one of the columns is called destination_host. Now I need to check if the destination_host is in the list returned from my subquery, then I want to output TypeA in my select statement or else TypeB. My select statement looks something like
select name, place, destination_host
from table_B
where <<some condition>>
I want to output a fourth column that is based on a condition check, let's say we call this host_category and if the destination_host value exists in the subquery then I want to add value typeA or else typeB. Please can you help me understand how to write this. I understand that it is hard to provide guidance if you don't have actual data to work with.
I tried using case statements such as this one:
when (destination_host in (select host from subquery)) THEN 'typeA'
when (destination_host not in (select host from subquery)) THEN 'typeB'
end as host_category
but I don't think this is the way to solve this problem.

I would use EXISTS:
WITH subquery AS (...)
SELECT CASE WHEN EXISTS (SELECT 1 FROM subquery
WHERE subquery.host = table_b.destination_host)
THEN 'typeA'
ELSE 'typeB'
END
FROM table_b;
With queries like that, you have to take care of NULL values. If table_b.destination_host is NULL, the row will always show up as typeB, because NULL = NULL is not TRUE in SQL.

Related

Use postgresql query results to form another query

I am trying to select from one table using the select result from another table. I can run this in two queries but would like to optimize it into just one.
First query.. Select ids where matching other id
select id from lookuptable where paid = '547'
This results in something like this
6316352
6316353
6318409
6318410
6320468
6320469
6320470
6322526
6322527
6324586
6324587
6326648
I would like to then use this result to make another selection. I can do it manually like below. Note, there could be many rows with these values so I've been using a IN statement
select * from "othertable" where id in (6316352,6316353,6318409,6318410,6320468,6320469,6320470,6322526,6322527,6324586,6324587,6326648);
select
ot.*
from
"othertable" as ot
join
lookuptable as lt
on
ot.id = lt.id
where
lt.paid = '547'
The IN operator supports not just value lists but also subqueries, so you can literally write
select * from "othertable" where id in (select id from lookuptable where paid = '547');

Update a column with row_number with duplicate records without PK

I have a table with duplicate records but without primary key. Data looks like this:
I want to update one empty column with one of column concatenate with row_number. After update, I want to achieve this:
Since the table does not have a unique column, which means I would join back to a CTE or subquery. I know in sql server, it can be done like this:
UPDATE X
SET X.NEW_KEY = X.PERSONNUMBER + '-' + X.NEW_CODE_DEST
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY PERSONNUMBER) AS NEW_CODE_DEST,PERSONNUMBER,NEW_KEY
FROM EMPLOYEE
) as X;
I tried same logic in postgresql but it didn't work. It threw an error:
SQL Error [42P01]: ERROR: relation "x" does not exist
I also tried this in
UPDATE EMPLOYEE
SET NEW_KEY = X.PERSONNUMBER || '-' || X.NEW_CODE_DEST
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY PERSONNUMBER) AS NEW_CODE_DEST,PERSONNUMBER
FROM EMPLOYEE
) as X;
The NEW_KEY is updated with duplicate value, not surprise there.
So is there a equivalent method in to achieve the same update result? Or I have my query wrong?
Really appreciate the help!

oracle: grouping on merged columns

I have a 2 tables FIRST
id,rl_no,adm_date,fees
1,123456,14-11-10,100
2,987654,10-11-12,30
3,4343,14-11-17,20
and SECOND
id,rollno,fare,type
1,123456,20,bs
5,634452,1000,bs
3,123456,900,bs
4,123456,700,bs
My requirement is twofold,
1, i first need to get all columns from both tables with common rl_no. So i used:
SELECT a.ID,a.rl_no,a.adm_date,a.fees,b.rollno,b.fare,b.type FROM FIRST a
INNER JOIN
SECOND b ON a.rl_no = b.rollno
The output is like this:
id,rl_no,adm_date,fees,rollno,fare,type
1,123456,14-11-10,100,123456,20,bs
1,123456,10-11-12,100,123456,900,bs
1,123456,14-11-17,100,123456,700,bs
2,Next i wanted to get the sum(fare) of those rollno that were common between the 2 tables and also whose fare >= fees from FIRST table group by rollno and id.
My query is:
SELECT x.ID,x.rl_no,,x.adm_date,x.fees,x.rollno,x.type,sum(x.fare) as "fare" from (SELECT a.ID,a.rl_no,a.adm_date,a.fees,b.rollno,b.fare,b.type FROM FIRST a
INNER JOIN
SECOND b ON a.rl_no = b.rollno) x, FIRST y
WHERE x.rollno = y.rl_no AND x.fare >= y.fees AND x.type IS NOT NULL GROUP BY x.rollno,x.ID ;
But this is throwing in exceptions.
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
The expected output will be like this:
id,rollno,adm_date,fare,type
1,123456,14-11-10,1620,bs
So could someone care to show an oracle newbie what i'm doing wrong here?
It looks like there's a couple different problems here;
Firstly, you're trying to group by an x.ID column which doesn't exist; it looks like you'll want to add ID to the selected columns in your sub-query.
Secondly, when aggregating with GROUP BY, all selected columns need to be either listed in the GROUP BY statement or aggregated. If you're grouping by rollno and ID, what do you want to have happen to all the extra values for adm_date, fees, and type? Are those always going to be the same for each distinct rollno and ID pair?
If so, simply add them to the GROUP BY statement, ie,
GROUP BY adm_date, fees, type, rollno, ID
If not, you'll need to work out exactly how you want to select which one to be output; If you've got output like your example (adding in an ID column here)
ID,adm_date,fees,rollno,fare,type
1,14-11-10,100,123456,20,bs
1,10-11-12,100,123456,900,bs
1,14-11-17,100,123456,700,bs
Call that result set 'a'. If I run;
SELECT a.ID, a.rollno, SUM(a.fare) as total_fare
FROM a
GROUP BY a.ID, a.rollno
Then the result will be a single row;
ID,rollno,total_fare
1,123456,1620
So, if you also select the adm_date, fees, and type columns, oracle has no idea what you mean to do with them. You're not using them for grouping, and you're not telling oracle how you want to pick which one to use.
You could do something like
SELECT a.ID,
FIRST(a.adm_date) as first_adm_date,
FIRST(a.fees) as first_fees,
a.rollno,
SUM(a.fare) as total_fare,
FIRST(a.type) as first_type
FROM a
GROUP BY a.ID, a.rollno
Which would give the result;
ID,first_adm_date,first_fees,rollno,total_fare,first_type
1,14-11-10,100,123456,1620,bs
I'm not sure if that's what you mean to do though.

EXISTS in filter returning too many values

I need to write a query that uses EXISTS, rather than IN, so that it will run fast. The filter is being fed so many parameter values that EXISTS seems like the only option. The difference is between a 20+ minute query and a 5 second query.
This is the query I have:
SELECT DISTINCT d.GROUP_NAME
FROM [EMPLOYEE] e JOIN [DATA_FACT] d ON (e.KEY = d.KEY)
WHERE d.DATE BETWEEN #Start and #End
AND EXISTS
(
select '1234567' -- #ID
)
AND e.Location IN (#Location)
ORDER BY d.GROUP_NAME ASC
The problem is that it is returning too many records. Based on the values I'm passing to filter on, I should get 1 row back but instead I am getting 28.
If I remove the EXISTS and add the following then I get the 1 record I need:
AND e.ID IN ('1234567')
Is there a way to fix the query to work with EXISTS so that I get the correct results?
This is essentially what you want if you are going to try to use exists to filter your data_fact table by parameters in your employee table. Not sure how much it's going to improve your performance though when you throw a massive number of employee IDs at it.
SELECT
d.GROUP_NAME
FROM [DATA_FACT] AS d
WHERE d.DATE BETWEEN #Start and #End
AND EXISTS
(
select 1
from EMPLOYEE AS e
WHERE d.[KEY] = e.[KEY]
AND e.[Location] IN (#Location)
AND e.ID IN ('1234567')
)
ORDER BY d.GROUP_NAME ASC

How to refactor this sql query

I have a lengthy query here, and wondering whether it could be refactor?
Declare #A1 as int
Declare #A2 as int
...
Declare #A50 as int
SET #A1 =(Select id from table where code='ABC1')
SET #A2 =(Select id from table where code='ABC2')
...
SET #A50 =(Select id from table where code='ABC50')
Insert into tableB
Select
Case when #A1='somevalue' Then 'x' else 'y' End,
Case when #A2='somevalue' Then 'x' else 'y' End,
..
Case when #A50='somevalue' Then 'x' else 'y' End
From tableC inner join ......
So as you can see from above, there is quite some redundant code. But I can not think of a way to make it simpler.
Any help is appreciated.
If you need the variables assigned, you could pivot your table...
SELECT *
FROM
(
SELECT Code, Id
FROM Table
) t
PIVOT
(MAX(Id) FOR Code IN ([ABC1],[ABC2],[ABC3],[ABC50])) p /* List them all here */
;
...and then assign them accordingly.
SELECT #A1 = [ABC1], #A2 = [ABC2]
FROM
(
SELECT Code, Id
FROM Table
) t
PIVOT
(MAX(Id) FOR Code IN ([ABC1],[ABC2],[ABC3],[ABC50])) p /* List them all here */
;
But I doubt you actually need to assign them at all. I just can't really picture what you're trying to achieve.
Pivotting may help you, as you can still use the CASE statements.
Rob
Without taking the time to develop a full answer, I would start by trying:
select id from table where code in ('ABC1', ... ,'ABC50')
then pivot that, to get one row result set of columns ABC1 through ABC50 with ID values.
Join that row in the FROM.
If 'somevalue', 'x' and 'y' are constant for all fifty expressions. Then start from:
select case id when 'somevalue' then 'x' else 'y' end as XY
from table
where code in ('ABC1', ... ,'ABC50')
I am not entirely sure from your example, but it looks like you should be able to do one of a few things.
Create a nice look up table that will tell you for a given value of the select statement what should be placed there. This would be much shorter and should be insanely fast.
Create a simple for loop in your code and generate a list of 50 small queries.
Use sub-selects or generate a list of selects with one round trip to retrieve your #a1-#A50 values and then generate the query with them already in place.
Jacob