Create new columns using a column value and fill values from another column - postgresql

I have the following table in PostgreSQL
id type name
146 INN Ofloxacin
146 TRADE_NAME Ocuflox
146 TRADE_NAME Ofloxacin
146 TRADE_NAME Tarivid i.v.
146 TRADE_NAME Tarivid 400
147 TRADE_NAME Mictral
147 TRADE_NAME Neggram
543 INN Amphetamine
543 INN Amfetamine
543 TRADE_NAME Adzenys xr-odt
543 TRADE_NAME Adzenys er
543 TRADE_NAME Dyanavel xr
I would like to create two new columns, trade_name and inn, and fill them from the 'name' column: trade names are copied over, and multiple INN values for the same id are concatenated. I am expecting the following output:
id trade_name inn
146 Ocuflox Ofloxacin
146 Ofloxacin Ofloxacin
146 Tarivid i.v. Ofloxacin
146 Tarivid 400 Ofloxacin
147 Mictral Ofloxacin
147 Neggram Ofloxacin
543 Adzenys xr-odt Amphetamine | Amfetamine
543 Adzenys er Amphetamine | Amfetamine
543 Dyanavel xr Amphetamine | Amfetamine
Any help is highly appreciated.

You can get a result set of distinct ids and then join it back to the same table twice: once to get the trade names and once to get the INN records:
SELECT ids.id,
       tradenames.name AS trade_name,
       inns.name AS inn
FROM
    (SELECT DISTINCT id FROM yourtable) AS ids
    LEFT OUTER JOIN yourtable AS tradenames
        ON ids.id = tradenames.id
        AND tradenames.type = 'TRADE_NAME'
    LEFT OUTER JOIN yourtable AS inns
        ON ids.id = inns.id
        AND inns.type = 'INN';
You might also be able to pull this off with a pivot, but I think that would be overkill for the two output columns you are after.
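Note that the double join returns one row per trade-name/INN pair, so an id with two INN values gets each trade name twice instead of a single "Amphetamine | Amfetamine" cell. If the concatenated form is what you need, one possible sketch is to aggregate the INN rows first with string_agg() and join the result to the trade names. The CTE name drug_names and its sample rows are assumptions standing in for your table:

```sql
-- Hypothetical sample data mirroring the question; "drug_names" is an assumed name.
WITH drug_names(id, type, name) AS (
    VALUES
        (146, 'INN',        'Ofloxacin'),
        (146, 'TRADE_NAME', 'Ocuflox'),
        (147, 'TRADE_NAME', 'Mictral'),
        (543, 'INN',        'Amphetamine'),
        (543, 'INN',        'Amfetamine'),
        (543, 'TRADE_NAME', 'Adzenys er')
),
-- One row per id, with all INN values joined by ' | '.
inns AS (
    SELECT id, string_agg(name, ' | ') AS inn
    FROM drug_names
    WHERE type = 'INN'
    GROUP BY id
)
-- One row per trade name; ids without any INN row get NULL in the inn column.
SELECT t.id, t.name AS trade_name, i.inn
FROM drug_names AS t
LEFT JOIN inns AS i ON i.id = t.id
WHERE t.type = 'TRADE_NAME'
ORDER BY t.id, t.name;
```

The order of the values inside the concatenated string is not guaranteed unless you add an ORDER BY inside string_agg(), and the table has no column that records the original insertion order.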

Related

Complex logic to create time series in Postgres

I have a sample dataset like below and I would like to create a report in such a format that the Value is updated for all the dates between the Start and End date.
Input Dataset
ID Start End Value
232 "2022-06-08 18:49:00" "2022-11-18 08:06:00" 55
456 "2022-10-17 10:24:00" "2022-12-16 12:52:00" 100
From the above Dataset I would like to create another dataset as below.
I need to generate a date series from the Start and End dates of the input dataset and fill in the same Value for each of those dates.
Any ideas or suggestions will be helpful.
Expected Output
ID Date Value
232 "2022-06-08" 55
232 "2022-06-09" 55
232 "2022-06-10" 55
232 "2022-06-11" 55
232 "2022-06-12" 55
.
.
232 "2022-11-17" 55
232 "2022-11-18" 55
456 "2022-10-17" 100
456 "2022-10-18" 100
456 "2022-10-19" 100
.
.
456 "2022-12-15" 100
456 "2022-12-16" 100
Database : Postgres 12
You can use generate_series()
select t.id,
g.dt::date as date,
t.value
from the_table t
cross join generate_series(t."Start"::date, t."End"::date, interval '1 day') as g(dt)
order by t.id, g.dt

How to delete duplicate rows without unique ID

Id SleepDay TotalMinutesAsleep TotalTimeInBed
8378563200 4/20/2016 381 409
8378563200 4/21/2016 396 417
8378563200 4/22/2016 441 469
8378563200 4/23/2016 565 591
8378563200 4/24/2016 458 492
8378563200 4/25/2016 388 402 ---> this is the duplicate
8378563200 4/25/2016 388 402
8378563200 4/26/2016 550 584
8378563200 4/27/2016 531 600
This is part of my table. How can I delete the duplicate row? I used a CTE, but it deleted all records of id #8378563200 on 4/25/2016.
Use:
DELETE
FROM table1
WHERE ctid IN (SELECT ctid
               FROM (SELECT ctid,
                            ROW_NUMBER() OVER (
                                PARTITION BY Id, SleepDay, TotalMinutesAsleep, TotalTimeInBed) AS rn
                     FROM table1) t
               WHERE rn > 1);
Replace table1 with your own table name.
Without column(s) to identify a unique row?
Then you could use ctid.
ctid
The physical location of the row version within its table. Note
that although the ctid can be used to locate the row version very
quickly, a row's ctid will change if it is updated or moved by VACUUM
FULL. Therefore ctid is useless as a long-term row identifier. A
primary key should be used to identify logical rows
For example:
delete
from SleepLogs log1
using SleepLogs log2
where log2.Id = log1.Id
and log2.SleepDay = log1.SleepDay
and log2.TotalMinutesAsleep = log1.TotalMinutesAsleep
and log2.TotalTimeInBed = log1.TotalTimeInBed
and log2.ctid < log1.ctid;
1 rows affected
select * from SleepLogs
id sleepday totalminutesasleep totaltimeinbed
8378563200 2016-04-20 381 409
8378563200 2016-04-21 396 417
8378563200 2016-04-22 441 469
8378563200 2016-04-23 565 591
8378563200 2016-04-24 458 492
8378563200 2016-04-25 388 402
8378563200 2016-04-26 550 584
8378563200 2016-04-27 531 600

How to get those rows where all the values are same against unique id

I have below mentioned table:
ID State City Pincode Code Date
U-1 AAB CCV 141414 121 2018-04-04 18:08:17
U-1 AAB CCV 141414 121 2018-04-04 18:08:17
U-2 BTB ERV 150454 145 2018-05-05 19:11:25
U-2 BTB ERV 150454 145 2018-05-05 19:11:25
U-3 FFT ERT 160707 150 2018-05-22 21:37:45
U-4 FFT RTT 160707 150 2018-05-28 14:23:48
I want to fetch only those rows where all the values are same in the particular unique ID.
Output:
ID State City Pincode Code Date
U-1 AAB CCV 141414 121 2018-04-04 18:08:17
U-1 AAB CCV 141414 121 2018-04-04 18:08:17
U-2 BTB ERV 150454 145 2018-05-05 19:11:25
U-2 BTB ERV 150454 145 2018-05-05 19:11:25
Get the duplicate rows and join the result to the original table.
select a.*
from your_table a
join ( select id, state, city, pincode, code, date
       from your_table
       group by id, state, city, pincode, code, date
       having count(*) > 1 ) b
  on a.id = b.id
 and a.state = b.state
 and a.city = b.city
 and a.pincode = b.pincode
 and a.code = b.code
 and a.date = b.date
You can try this:
SELECT * FROM your_table WHERE ID IN (
    SELECT ID FROM your_table
    GROUP BY ID
    HAVING count(*) > 1
)
This returns all rows whose ID occurs more than once (at least two rows with that ID). your_table stands for your own table name.

SELECT FROM VALUES used a bit like a CASE statement - but possibly more powerful

I just found myself writing the code below - which works.
Interesting, but is it necessarily the best method?
The syntax allows the TRY_CAST to be performed only once.
Note "Atextfield" can contain valid numbers and invalid numbers.
SELECT *
FROM call
WHERE
EXISTS ( SELECT 1
FROM ( VALUES( TRY_CAST(call.[Atextfield] AS int) )
) AS Table1(num)
WHERE
(Table1.num BETWEEN 124 AND 140 )
OR (Table1.num BETWEEN 143 AND 146 )
OR (Table1.num BETWEEN 148 AND 149 )
OR (Table1.num BETWEEN 160 AND 169 )
OR (Table1.num BETWEEN 181 AND 189 )
)
;
2. Could this be rewritten as follows?
SELECT *
FROM [call]
WHERE TRY_CAST([call].AtextField AS TINYINT) BETWEEN 124 AND 189
AND TRY_CAST([call].AtextField AS TINYINT) NOT IN (141,142,147)
AND TRY_CAST([call].AtextField AS TINYINT) NOT BETWEEN 150 AND 159
AND TRY_CAST([call].AtextField AS TINYINT) NOT BETWEEN 170 AND 180
Note I'm new to CASE in t-sql...
2A. Is the TRY_CAST(...) evaluated more than once?
Which of the above will be quicker?
Is there a better way to write this?
Is the first method useful when the criteria get more involved and complex?
Is this an acceptable approach?
Harvey
There's no need to use exists or 1 = CASE...
Just put your logic in the where clause directly. I'd probably do something like this:
SELECT *
FROM [call]
WHERE TRY_CAST([call].AtextField AS TINYINT) BETWEEN 124 AND 189
AND TRY_CAST([call].AtextField AS TINYINT) NOT IN (141,142,147)
AND TRY_CAST([call].AtextField AS TINYINT) NOT BETWEEN 150 AND 159
AND TRY_CAST([call].AtextField AS TINYINT) NOT BETWEEN 170 AND 180
Cross Apply Method:
SELECT *
FROM [call]
CROSS APPLY (SELECT TRY_CAST([call].AtextField AS TINYINT)) CA(intField)
WHERE intField BETWEEN 124 AND 189
  AND intField NOT IN (141,142,147)
  AND intField NOT BETWEEN 150 AND 159
  AND intField NOT BETWEEN 170 AND 180
My guess is that your query and mine will perform about the same. If you want to check performance, run the following first, then run each query and record the logical reads and times.
SET STATISTICS IO ON
SET STATISTICS TIME ON

Recursive CTE with multiple valid same parent child relationships

I have an equipment inventory application I am working on. The piece of equipment is my top level, and it contains assemblies, sub-assemblies and parts. I am trying to use a recursive CTE to display the parent/child relationships. The issue I am having is that some assemblies can have multiple sub-assemblies that are the same, meaning there is no difference in the part numbers. This is causing my query to not show the correct relationship based on my ORDER BY clause. This is the first time I have used a CTE, so I have been relying a lot on what I learned on the web.
PartNumberID 174 is used twice in this assembly.
Sample Table
equipmentID parentPartNumberID partNumberID
17 1 281
17 281 156
17 156 161
17 161 224
17 281 174
17 174 192
17 192 56
17 174 193
17 281 174
17 174 192
17 192 56
17 174 193
17 281 283
17 283 183
17 283 277
17 283 173
Results of Query
PARENT CHILD PARTLEVEL HIERARCHY
1 281 0 281
281 156 1 281.156
156 161 2 281.156.161
161 224 3 281.156.161.224
281 174 1 281.174
281 174 1 281.174
174 192 2 281.174.192
174 192 2 281.174.192
192 56 3 281.174.192.56
192 56 3 281.174.192.56
174 193 2 281.174.193
174 193 2 281.174.193
281 283 1 281.283
283 173 2 281.283.173
283 183 2 281.283.183
283 277 2 281.283.277
As you can see, the hierarchy is created correctly, but it is not being returned in the correct order because there is nothing unique about these 2 assemblies for the ORDER BY clause to use.
The Code:
with parts(PARENT,CHILD,PARTLEVEL,HIERARCHY) as (
    select parentPartNumberID,
           --- Used to get rid of duplicates
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY partNumberID ORDER BY partNumberID) > 1
                THEN NULL
                ELSE partNumberID END AS partNumberID,
           0,
           CAST(partNumberID as nvarchar) as HIERARCHY
    FROM db.tbl_ELEMENTS
    WHERE parentPartNumberID=1 and equiptmentID=17
    UNION ALL
    SELECT parts1.parentPartNumberID,
           --- Used to get rid of duplicates
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY parts1.partNumberID ORDER BY parts1.partNumberID) > 1
                THEN 10000 + parts1.partNumberID
                ELSE parts1.partNumberID END,
           PARTLEVEL+1,
           cast(parts.hierarchy + '.' + CAST(parts1.partNumberID as nvarchar) as nvarchar)
    from dbo.tbl_BOM_Elements as parts1
    inner join parts on parts1.parentPartNumberID=parts.CHILD
    where id =17)
select CASE WHEN PARENT > 10000
            THEN PARENT - 10000
            ELSE PARENT END AS PARENT,
       CASE WHEN CHILD > 10000
            THEN CHILD - 10000
            ELSE CHILD END AS CHILD,
       PARTLEVEL,HIERARCHY
from parts
order by hierarchy
I tried to create a unique ID to order but was not successful. Any suggestions would be greatly appreciated.
I'll start by just answering the part about getting a sequential id.
If you have control over the schema, you could just add a unique Id to your source table. Having a surrogate primary key would be pretty typical here.
You could instead use a second CTE before the recursive one and add the row numbers there using ROW_NUMBER() OVER (ORDER BY equipmentID, parentPartNumberID, partNumberID). Then build your recursive CTE off of that rather than the source table directly.
Better might be to use the first CTE to instead GROUP BY equipmentID, parentPartNumberID, partNumberID and add a COUNT(1) field. This would let you use the count in your hierarchy rather than getting the duplicates. Something like 281.283.277x2 or whatever.
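As a rough T-SQL sketch of the GROUP BY idea, assuming the table variable @elements (a stand-in for your real table, with only some of the sample rows) and a new QTY column that is not in your original query:

```sql
-- Sketch only: @elements is a hypothetical stand-in for the real BOM table.
DECLARE @elements TABLE (equipmentID int, parentPartNumberID int, partNumberID int);
INSERT INTO @elements VALUES
    (17, 1, 281), (17, 281, 156), (17, 156, 161),
    (17, 281, 174), (17, 174, 192), (17, 174, 193),
    (17, 281, 174), (17, 174, 192), (17, 174, 193);

WITH grouped AS (
    -- Collapse identical parent/child rows and remember how many there were.
    SELECT equipmentID, parentPartNumberID, partNumberID, COUNT(1) AS qty
    FROM @elements
    GROUP BY equipmentID, parentPartNumberID, partNumberID
),
parts (PARENT, CHILD, PARTLEVEL, QTY, HIERARCHY) AS (
    -- Anchor: the top-level assembly of equipment 17.
    SELECT parentPartNumberID, partNumberID, 0, qty,
           CAST(partNumberID AS nvarchar(4000))
    FROM grouped
    WHERE parentPartNumberID = 1 AND equipmentID = 17
    UNION ALL
    -- Recursive step: attach each child to the path of its parent.
    SELECT g.parentPartNumberID, g.partNumberID, p.PARTLEVEL + 1, g.qty,
           CAST(p.HIERARCHY + N'.' + CAST(g.partNumberID AS nvarchar(10)) AS nvarchar(4000))
    FROM grouped AS g
    JOIN parts AS p ON g.parentPartNumberID = p.CHILD
    WHERE g.equipmentID = 17
)
SELECT PARENT, CHILD, PARTLEVEL, QTY, HIERARCHY
FROM parts
ORDER BY HIERARCHY;
```

Because the duplicates are collapsed before the recursion starts, every HIERARCHY path appears once and the ORDER BY is unambiguous. Keep in mind that QTY here counts raw rows in the source table, so the count on a child of a repeated sub-assembly is the total across all copies of its parent, not the quantity per copy.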