Valid data in a group - SQL grouping - T-SQL

Can someone please help me achieve the scenario below?
There are two tables: one called driver and one called electronic.
-- Driver table
DECLARE @DRIVER TABLE
(
    Parenttype varchar(50),
    childtype  varchar(50)  -- stored as a LIKE pattern, e.g. 'Ceramic %'
)

INSERT @DRIVER
SELECT 'Carbon Composite Resistor', 'Ceramic %'

-- Electronic table
DECLARE @ELECTRONIC TABLE
(
    PARENTSKU varchar(50),
    ROLLOVER  varchar(50),
    CHILDSKU  varchar(50),
    TYPE      varchar(50)
)

INSERT @ELECTRONIC
SELECT 'BIN19-1405','LEAD','19-1405','Carbon Composite Resistor' UNION ALL
SELECT 'SAM92-140','MERCURY','92-140','Carbon Composite Resistor' UNION ALL
SELECT 'SAB45-155','LEAD','45-155','Carbon Composite Resistor' UNION ALL
SELECT 'NIP69-153','SULPHUR','69-153','Carbon Composite Resistor' UNION ALL
SELECT 'DIP19-1508','LEAD','19-1508','Carbon Composite Resistor' UNION ALL
SELECT 'ZQC140012','ROHS','140012','Carbon Composite Resistor' UNION ALL
SELECT 'LHH543012','ROHS','543012','Carbon Composite Resistor' UNION ALL
SELECT 'JWC592013','ROHS','592013','Carbon Composite Resistor' UNION ALL
SELECT 'GHY846013','ROHS','846013','Carbon Composite Resistor' UNION ALL
SELECT 'ZQC140012','ROHS','140012','Ceramic capacitors LARGE' UNION ALL
SELECT 'LHH543012','ROHS','543012','Ceramic capacitors SMALL' UNION ALL
SELECT 'JWC592013','ROHS','592013','Ceramic capacitors MEDIUM' UNION ALL
SELECT 'GHY846013','ROHS','846013','Ceramic capacitors' UNION ALL
SELECT 'MCN8LTC8K','ROHS','8LTC8K','Double-layer capacitors' UNION ALL
SELECT 'PRM81150','ROHS','81150','Tantalum capacitors' UNION ALL
SELECT 'PRM846013','ROHS','846013','Hybrid capacitors'
I am looking for output that meets the two conditions below:
1st: all PARENTSKU rows whose TYPE equals a Parenttype in the driver table and whose ROLLOVER is anything other than ROHS.
2nd: all PARENTSKU rows whose ROLLOVER is ROHS but which are present only with a parent type, not with a child type.
Expected Output
╔════════════╦══════════╦══════════╦═══════════════════════════╗
║ PARENTSKU  ║ ROLLOVER ║ CHILDSKU ║ TYPE                      ║
╠════════════╬══════════╬══════════╬═══════════════════════════╣
║ BIN19-1405 ║ LEAD     ║ 19-1405  ║ Carbon Composite Resistor ║
║ SAM92-140  ║ MERCURY  ║ 92-140   ║ Carbon Composite Resistor ║
║ SAB45-155  ║ LEAD     ║ 45-155   ║ Carbon Composite Resistor ║
║ NIP69-153  ║ SULPHUR  ║ 69-153   ║ Carbon Composite Resistor ║
║ DIP19-1508 ║ LEAD     ║ 19-1508  ║ Carbon Composite Resistor ║
║ MCN8LTC8K  ║ ROHS     ║ 8LTC8K   ║ Double-layer capacitors   ║
║ PRM81150   ║ ROHS     ║ 81150    ║ Tantalum capacitors       ║
║ PRM846013  ║ ROHS     ║ 846013   ║ Hybrid capacitors         ║
╚════════════╩══════════╩══════════╩═══════════════════════════╝
Thanks a lot.

Try this:
SELECT  E.PARENTSKU,
        E.ROLLOVER,
        E.CHILDSKU,
        E.TYPE
FROM    @ELECTRONIC AS E
WHERE   (   E.TYPE IN (SELECT D.Parenttype FROM @DRIVER AS D)
        AND E.ROLLOVER != 'ROHS' )
   OR   NOT EXISTS (SELECT NULL
                    FROM   @DRIVER AS D
                    WHERE  D.Parenttype = E.TYPE
                       OR  E.TYPE LIKE D.childtype)
This returns the expected output: the first branch keeps the non-ROHS rows whose TYPE matches a Parenttype, and the NOT EXISTS branch keeps the rows whose TYPE matches neither a Parenttype nor a childtype pattern.

Related

How to concatenate strings of a string field in a PostgreSQL 'WITH RECURSIVE' query?

As a follow-up to this question, How to concatenate strings of a string field in a PostgreSQL 'group by' query?,
I am looking for a way to concatenate the strings of a field within a WITH RECURSIVE query (and NOT using GROUP BY). So for example, I have a table:
ID | COMPANY_ID | EMPLOYEE
 1 | 1          | Anna
 2 | 1          | Bill
 3 | 2          | Carol
 4 | 2          | Dave
 5 | 3          | Tom
and I wanted to group by company_id, ordered by the count of EMPLOYEE, to get something like:
COMPANY_ID | EMPLOYEE
3          | Tom
1          | Anna, Bill
2          | Carol, Dave
It's simple with GROUP BY:
SELECT company_id, string_agg(employee, ', ' ORDER BY employee) AS employees
FROM tbl
GROUP BY company_id
ORDER BY count(*), company_id;
Sorting in a subquery is typically faster:
SELECT company_id, string_agg(employee, ', ') AS employees
FROM (SELECT company_id, employee FROM tbl ORDER BY 1, 2) t
GROUP BY company_id
ORDER BY count(*), company_id;
As academic proof of concept: an rCTE solution without using any aggregate or window functions:
WITH RECURSIVE rcte AS (
   (
   SELECT DISTINCT ON (1)
          company_id, employee, ARRAY[employee] AS employees
   FROM   tbl
   ORDER  BY 1, 2
   )
   UNION ALL
   SELECT r.company_id, e.employee, r.employees || e.employee
   FROM   rcte r
   CROSS  JOIN LATERAL (
      SELECT t.employee
      FROM   tbl t
      WHERE  t.company_id = r.company_id
      AND    t.employee > r.employee
      ORDER  BY t.employee
      LIMIT  1
      ) e
   )
SELECT company_id, array_to_string(employees, ', ') AS employees
FROM  (
   SELECT DISTINCT ON (1)
          company_id, cardinality(employees) AS emp_ct, employees
   FROM   rcte
   ORDER  BY 1, 2 DESC
   ) sub
ORDER BY emp_ct, company_id;
db<>fiddle here
Related:
Select first row in each GROUP BY group?
Optimize GROUP BY query to retrieve latest row per user
Concatenate multiple result rows of one column into one, group by another column
No group by here:
select * from tarded;
┌────┬────────────┬──────────┐
│ id │ company_id │ employee │
├────┼────────────┼──────────┤
│  1 │          1 │ Anna     │
│  2 │          1 │ Bill     │
│  3 │          2 │ Carol    │
│  4 │          2 │ Dave     │
│  5 │          3 │ Tom      │
└────┴────────────┴──────────┘
(5 rows)
with recursive firsts as (
    select id, company_id,
           first_value(id) over w as first_id,
           row_number() over w as rn,
           count(1) over (partition by company_id) as ncompany,
           employee
    from tarded
    window w as (partition by company_id
                 order by id)
), names as (
    select company_id, id, employee, rn, ncompany
    from firsts
    where id = first_id
    union all
    select p.company_id, c.id, concat(p.employee, ', ', c.employee), c.rn, p.ncompany
    from names p
    join firsts c
      on c.company_id = p.company_id
     and c.rn = p.rn + 1
)
select company_id, employee
from names
where rn = ncompany
order by ncompany, company_id;
┌────────────┬─────────────┐
│ company_id │ employee    │
├────────────┼─────────────┤
│          3 │ Tom         │
│          1 │ Anna, Bill  │
│          2 │ Carol, Dave │
└────────────┴─────────────┘
(3 rows)

Postgres array comparison - find missing elements

I have the table below.
╔════╦═══════════╦═════════╗
║ id ║ arr1      ║ arr2    ║
╠════╬═══════════╬═════════╣
║ 1  ║ {1,2,3,4} ║ {2,1,7} ║
║ 2  ║ {0}       ║ {3,4,5} ║
╚════╩═══════════╩═════════╝
I want to find out the elements which are in arr1 and not in arr2.
Expected output
╔════╦═══════╗
║ id ║ diff  ║
╠════╬═══════╣
║ 1  ║ {3,4} ║
║ 2  ║ {0}   ║
╚════╩═══════╝
If I have 2 individual arrays, I can do as follows:
select array_agg(elements)
from (
    select unnest(array[0])
    except
    select unnest(array[3,4,5])
) t (elements)
But I am unable to integrate this code to work by selecting from my table.
Any help would be highly appreciated. Thank you!!
I would write a function for this:
create function array_diff(p_one int[], p_other int[])
    returns int[]
as
$$
    select array_agg(item)
    from (
        select *
        from unnest(p_one) item
        except
        select *
        from unnest(p_other)
    ) t
$$
language sql
stable;
Then you can use it like this:
select id, array_diff(arr1, arr2)
from the_table
A much faster alternative is to install the intarray module and use
select id, arr1 - arr2
from the_table
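For reference, a minimal sketch of the intarray route, assuming the same the_table as above (CREATE EXTENSION has to be run once per database by a sufficiently privileged role; intarray works on integer arrays without NULLs):

-- Enable the bundled intarray extension (once per database).
CREATE EXTENSION IF NOT EXISTS intarray;

-- intarray's "-" operator removes every element of arr2 from arr1.
SELECT id, arr1 - arr2 AS diff
FROM the_table;
-- For the sample rows above this should yield (1, {3,4}) and (2, {0}).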
You can use EXCEPT per id, and after that GROUP BY to rebuild each group:
Demo
with diff_data as (
    select id, unnest(arr1) as data
    from test_table
    except
    select id, unnest(arr2) as data
    from test_table
)
select id, array_agg(data order by data) as diff
from diff_data
group by id

Trying to get start and end times from a set of 30-minute time intervals; some results aren't returning correctly

I'm trying to get a report of the hours worked out of a PostgreSQL database. I'm going to use Python and Pandas to format and run additional calculations before outputting to reports, and I'm using the pd.read_sql_query() method to pull the data into Python using raw SQL.
The information is spread over multiple tables: users, intervals, and claimed. claimed is a many-to-many mapping between intervals and users. I'm expecting to get multiple users back, so I'm using the PARTITION BY username clause to group them. Please let me know if the layout might be causing the problem, as my example below has been simplified some.
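For context, here is a rough sketch of how the three real tables might feed the simplified timerange rows used in the queries below; every column name here (users.id, users.username, claimed.user_id, claimed.interval_id, intervals.id, intervals.starttime, intervals.endtime) is an assumption, not taken from the actual schema:

-- Hypothetical join producing one row per claimed 30-minute interval;
-- all column names are assumptions.
SELECT u.username,
       i.starttime,
       i.endtime
FROM claimed c
JOIN users u     ON u.id = c.user_id       -- claimed maps users ...
JOIN intervals i ON i.id = c.interval_id;  -- ... to intervals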
I've recently discovered various resources talking about gaps-and-islands problems and found one that seems to fit my use case, which I've adapted to work; Ref: Gaps and islands. It appears to be MSSQL, though I don't believe that's mentioned in there.
The problem is that some of the results aren't returning what I expect. I've created a minimal reproduction in a SQL Fiddle: sqlfiddle
This is one of the segments found by the islands. I'm taking the MAX(endtime) and the MIN(starttime), but in some cases I am missing the final interval.
Ex: The following table has one segment, and I would expect it to show starttime as 2020-03-08T02:00:00 and endtime as 2020-03-08T04:00:00, but I'm actually getting the endtime as 2020-03-08T03:30:00.
╔═════════════╦═════════════════════╦═════════════════════╗
║ Username    ║ Start Time          ║ End Time            ║
╠═════════════╬═════════════════════╬═════════════════════╣
║ Test User 1 ║ 2020-03-08T02:00:00 ║ 2020-03-08T02:30:00 ║
║ Test User 1 ║ 2020-03-08T02:30:00 ║ 2020-03-08T03:00:00 ║
║ Test User 1 ║ 2020-03-08T03:00:00 ║ 2020-03-08T03:30:00 ║
║ Test User 1 ║ 2020-03-08T03:30:00 ║ 2020-03-08T04:00:00 ║
╚═════════════╩═════════════════════╩═════════════════════╝
This is what I have in the SQLFiddle for the example and there's more data but all for one user.
SELECT username,
       islandId,
       MIN(starttime) AS IslandStartDate,
       MAX(endtime)   AS IslandEndDate
FROM (
    SELECT *,
           CASE
               WHEN Groups.PreviousEndDate >= starttime THEN 0
               ELSE 1
           END AS IslandStartInd,
           SUM(CASE
                   WHEN Groups.PreviousEndDate >= starttime THEN 0
                   ELSE 1
               END) OVER (PARTITION BY Groups.username
                          ORDER BY Groups.RN) AS IslandId
    FROM (
        SELECT ROW_NUMBER() OVER (PARTITION BY tr.username
                                  ORDER BY tr.starttime, tr.endtime) AS rn,
               tr.username,
               tr.starttime,
               tr.endtime,
               LAG(tr.endtime, 1) OVER (PARTITION BY tr.username
                                        ORDER BY tr.starttime, tr.endtime) AS PreviousEndDate
        FROM timerange tr
        WHERE tr.starttime BETWEEN '2020-03-01' AND '2020-03-20'
        ORDER BY tr.username
    ) Groups
) Islands
GROUP BY username, islandid
ORDER BY username, IslandStartDate
I have restructured the gaps-and-islands approach using window functions and common table expressions to make it easier to follow.
You can uncomment the commented queries at the bottom (one at a time) to see how the strategy works step by step.
The sqlfiddle.
with gaps as (
    select *,
           case
               when starttime = lag(endtime) over (partition by username
                                                   order by starttime) then 0
               else 1
           end as gap_begin_row_marker
    from timerange
), grp_numbers as (
    select username, starttime, endtime,
           sum(gap_begin_row_marker) over (partition by username
                                           order by starttime) as grp_num
    from gaps
), collapsed_intervals as (
    select grp_num, username, min(starttime) as starttime, max(endtime) as endtime
    from grp_numbers
    group by grp_num, username
), summed_time as (
    select username, sum(endtime - starttime) as time_claimed
    from collapsed_intervals
    group by username
)
/* select * from gaps; */
/* select * from grp_numbers; */
/* select * from collapsed_intervals; */
select * from summed_time;
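If the report needs the per-island start and end times rather than the summed hours, the collapsed_intervals step already carries them; swapping the final select for the following (inside the same WITH) returns one row per island:

select username, starttime as island_start, endtime as island_end
from collapsed_intervals
order by username, starttime;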

Synchronize a set of table records via triggers

I have a table from an existing product (so no schema changes are possible) that has a schema similar to the following:
objectId, typeId, value
What I need to do is essentially make several typeIds linked and in sync with each other. See the example below. There will be around 35 sets of 3-4 linked typeIds each.
I cannot figure out a reasonable way to do this that is performant, handles multi-row inserts, and prevents trigger recursion. Any suggestions?
Example
typeId 1, 2 and 3 should be linked.
INSERT INTO foo (objectId, typeId, value) VALUES (1, 1, 'bar')
Should result in the table containing the following
╔══════════╦════════╦═══════╗
║ objectId ║ typeId ║ value ║
╠══════════╬════════╬═══════╣
║ 1        ║ 1      ║ bar   ║
║ 1        ║ 2      ║ bar   ║
║ 1        ║ 3      ║ bar   ║
╚══════════╩════════╩═══════╝
Any update to the value of any of these records should change the value of all of them.
In the end, I solved my problem by modifying the "insert" and "update" stored procedures used by the application in question, instead of solving it with triggers.
The insert procedure ended up something like this:
IF @typeId IN (1, 2, 3)
    INSERT INTO foo (objectId, typeId, value)
    VALUES (@objectId, 1, @value), (@objectId, 2, @value), (@objectId, 3, @value)
ELSE IF
    ...
ELSE
    INSERT INTO foo (objectId, typeId, value) VALUES (@objectId, @typeId, @value)
The update procedure ended up something like this:
IF @typeId IN (1, 2, 3)
    UPDATE foo SET value = @value WHERE objectId = @objectId AND typeId IN (1, 2, 3)  -- typeId here is the column, not the parameter
ELSE IF
    ...
ELSE
    UPDATE foo SET value = @value WHERE rowId = @rowId
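For comparison with the trigger route the question originally asked about, here is a rough, untested sketch (not the author's solution) of an AFTER UPDATE trigger that propagates a change across a linked set, using TRIGGER_NESTLEVEL to stop recursion. The dbo.typeLinks lookup table (setId, typeId) is hypothetical:

CREATE TRIGGER trg_foo_sync ON foo
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Bail out when this firing was caused by the UPDATE below.
    IF TRIGGER_NESTLEVEL(@@PROCID) > 1
        RETURN;

    -- Copy the new value to every row with the same objectId whose
    -- typeId belongs to the same linked set (typeLinks is hypothetical).
    UPDATE f
    SET    f.value = i.value
    FROM   foo f
    JOIN   dbo.typeLinks l2 ON l2.typeId = f.typeId
    JOIN   dbo.typeLinks l1 ON l1.setId  = l2.setId
    JOIN   inserted i ON i.objectId = f.objectId
                     AND i.typeId   = l1.typeId;
END;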

How do I split text into multiple fields using Postgresql?

I have a table with a column that needs to be split and inserted into a new table. The column's name is location, and it has data that could look like Detroit, MI, USA;Chicago, IL, USA or as simple as USA.
Ultimately, I want to insert the data into a new dimension table that looks like:
City    | State | Country
Detroit | MI    | USA
Chicago | IL    | USA
NULL    | NULL  | USA
I came across the string_to_array function and am able to split the larger example (Detroit, MI, USA;Chicago, IL, USA) into the two strings Detroit, MI, USA and Chicago, IL, USA.
Now I'm stumped on how to split those strings again and then insert them. Since the parts of each string are separated by commas, does using string_to_array again work? It doesn't seem to work in SQL Fiddle.
Note: I'm using SQL Fiddle right now since I don't have access to my Redshift table at the moment.
This is for Redshift, which unfortunately is still using PostgreSQL 8.0.2 and thus does not have the unnest function.
postgres=# select v[1] as city, v[2] as state, v[3] as country
           from (select string_to_array(unnest(string_to_array(
                 'Detroit, MI, USA;Chicago, IL, USA',';')),',')) s(v);
┌─────────┬───────┬─────────┐
│  city   │ state │ country │
├─────────┼───────┼─────────┤
│ Detroit │ MI    │ USA     │
│ Chicago │ IL    │ USA     │
└─────────┴───────┴─────────┘
(2 rows)
Tested on Postgres; not sure if it will work on Redshift too.
The next query should work on any Postgres version:
select v[1] as city, v[2] as state, v[3] as country
from (select string_to_array(v, ',') v
      from unnest(string_to_array(
           'Detroit, MI, USA;Chicago, IL, USA', ';')) g(v)) s;
It uses an old PostgreSQL trick: a derived table.
SELECT v[1], v[2] FROM (SELECT string_to_array('1,2', ',')) g(v)
If unnest itself is missing, a replacement function:
CREATE OR REPLACE FUNCTION _unnest(anyarray)
RETURNS SETOF anyelement AS '
BEGIN
    FOR i IN array_lower($1,1) .. array_upper($1,1) LOOP
        RETURN NEXT $1[i];
    END LOOP;
    RETURN;
END;
' LANGUAGE plpgsql;
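Assuming the function can be created on the target system, the earlier query should then work on old versions by swapping _unnest in for unnest:

-- Same split-twice query as above, using the replacement function.
select v[1] as city, v[2] as state, v[3] as country
from (select string_to_array(v, ',') v
      from _unnest(string_to_array(
           'Detroit, MI, USA;Chicago, IL, USA', ';')) g(v)) s;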