Trying to compare items in the same column (T-SQL)

I’m trying to find the count of LookUpIDs per customer:
LookUpID Customer
1302 01
1303 01
1337 01
Each customer can have multiple LookUpIDs, but if they have 1337 selected and not 1302 or 1303, then you can’t add that customer to the count.
Count(case when LookUpID = 1337 and LookUpID not in (1302, 1303) then 0
           when LookUpID = 1337 and LookUpID in (1302, 1303) then 1
           else 0 end)
I can’t figure out how to exclude customers with LookUpID = 1337 when the other two aren’t selected. Any input would help, thanks.

Assuming the selected records are in the SelectedRecords table, we can do this:
SELECT lt.Customer
FROM LookupTable lt
LEFT JOIN SelectedRecords sr ON sr.LookUpID=lt.LookUpID
GROUP BY lt.Customer
HAVING COUNT(*)=COUNT(DISTINCT sr.LookUpID)
This way, we count the number of LookUpIDs for each customer and check that the number of selected LookUpIDs is the same.
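To make the check concrete, here is a small hypothetical setup (the SelectedRecords contents and the 1400 row are invented for illustration); the query above returns customer 01, whose every LookUpID is selected, but not customer 02:
CREATE TABLE LookupTable (LookUpID int, Customer varchar(2));
CREATE TABLE SelectedRecords (LookUpID int);
INSERT INTO LookupTable VALUES
    (1302, '01'), (1303, '01'), (1337, '01'), -- all three selected -> returned
    (1337, '02'), (1400, '02');               -- 1400 not selected  -> excluded
INSERT INTO SelectedRecords VALUES (1302), (1303), (1337);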

Maybe some variant of this:
SELECT [Customer]
,CASE WHEN MAX([LookUpID]) = 1337 AND MIN([LookUpID]) = 1337 THEN 0 ELSE COUNT([LookUpID]) END
FROM [table]
GROUP BY [Customer];
If the min and the max value for a particular customer are both 1337, then 1337 is the only LookUpID present and the count is 0; in all other cases, we just count.
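If customers can also have LookUpIDs other than these three, the min/max trick only catches the case where 1337 is the customer's sole row. Here is a sketch of a more general variant using conditional aggregation (same [table] placeholder as above):
SELECT [Customer]
    ,CASE WHEN SUM(CASE WHEN [LookUpID] IN (1302, 1303) THEN 1 ELSE 0 END) = 0
           AND SUM(CASE WHEN [LookUpID] = 1337 THEN 1 ELSE 0 END) > 0
          THEN 0                 -- has 1337 but neither 1302 nor 1303
          ELSE COUNT([LookUpID]) -- otherwise just count
     END
FROM [table]
GROUP BY [Customer];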

This code may help you.
If you need to count customers, count on t.customer; it is non-NULL only for customers that have 1337 together with 1302 or 1303.
CREATE TABLE #test(
[lookupid] [int] NULL,
[customer] [varchar](2) NULL
) ON [PRIMARY]
GO
INSERT #test ([lookupid], [customer]) VALUES (1302, N'01')
INSERT #test ([lookupid], [customer]) VALUES (1303, N'01')
INSERT #test ([lookupid], [customer]) VALUES (1337, N'01')
INSERT #test ([lookupid], [customer]) VALUES (1337, N'02')
INSERT #test ([lookupid], [customer]) VALUES (1337, N'03')
INSERT #test ([lookupid], [customer]) VALUES (1303, N'03')
-- select * from #test
select r.customer, t.customer, r.lookupid
from #test r
left join (select *
           from #test
           where lookupid = 1337
             and customer in (select customer
                              from #test
                              where lookupid in (1302, 1303))) t
       on t.customer = r.customer
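To turn that into the customer count the question asks for, here is a minimal sketch against the same #test table, assuming a customer qualifies when 1337 appears together with 1302 or 1303:
-- With the sample data this returns 2 (customers 01 and 03);
-- customer 02 has 1337 without 1302/1303 and is excluded.
select count(distinct r.customer)
from #test r
where r.lookupid = 1337
  and r.customer in (select customer
                     from #test
                     where lookupid in (1302, 1303));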


How to update duplicate rows in a table in PostgreSQL

I have created synthetic data for a typical call center.
Below is a screenshot of the table I have created.
Table 1: (screenshot omitted)
Problem statement: Since this is completely random data, I noticed that there are some customers who are being assigned to the same agents whenever they call again.
So, using the query below, I was able to test such a case and count the number of times an agent is repeated for each customer:
select agentid, customerid, count(customerid) from aa_dev.calls group by agentid, customerid having count(customerid) > 1;
Table 2: (screenshot of the result omitted)
I have a separate agents table called aa_dev.agents in which the agent IDs are stored.
Now I want to replace the agentid in such cases, so that if an agentid is repeated 6 times for a single customer, then 5 of those times the agentid is updated with some other agentid from the table, but the call times shouldn't overlap. That means the agent we are replacing with should not be busy at the time the call is going on.
I have assigned row numbers to the repeated ones:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY agentid, customerid ORDER BY random()) rn,
COUNT(*) OVER (PARTITION BY agentid, customerid) cnt
FROM aa_dev.calls
)
SELECT agentid, customerid, rn
FROM cte
WHERE cnt > 1;
This way I could visualize the repetition clearly.
So I don't want to update row 1, but the rest.
Is there any way I can achieve this? Can I use the row numbers and write a query to update rows from rownum 2 onwards, one by one, with each row getting a unique agent?
If you don't want duplicates in your artificial data, it's probably better not to generate them.
But if you already have a table with duplicates and want to work on them, either updating or deleting, here is the easy way.
You need a unique ID for each updated row; if you don't have one, add it temporarily. Then you can use the pattern below to update all duplicates except the first one.
To add an artificial id column to a preexisting table, use:
ALTER TABLE calls ADD id serial;
In my case I generated a test table with 100 random rows:
CREATE TEMP TABLE calls (id serial, agentid int, customerid int);
INSERT INTO calls (agentid, customerid)
SELECT (random()*10)::int, (random()*10)::int
FROM generate_series(1, 100) n;
Define what constitutes a duplicate and find duplicates in data:
SELECT agentid, customerid, count(*), array_agg(id) id
FROM calls
GROUP BY 1,2 HAVING count(*)>1
ORDER BY 1,2;
Update all the duplicate rows except the first one (substitute whatever_needed with the value you want):
UPDATE calls SET agentid = whatever_needed
FROM (
SELECT array_agg(id) id, min(id) idmin FROM calls
GROUP BY agentid, customerid HAVING count(*)>1
) AS dup
WHERE calls.id = ANY(dup.id) AND calls.id <> dup.idmin;
Alternatively, remove all duplicates except the first one:
DELETE FROM calls
USING (
SELECT array_agg(id) id, min(id) idmin FROM calls
GROUP BY agentid, customerid HAVING count(*)>1
) AS dup
WHERE calls.id = ANY(dup.id) AND calls.id <> dup.idmin;
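If you also need a concrete value for whatever_needed, here is a rough sketch that assigns a random different agent to each duplicate row. The aa_dev.agents column name id is an assumption, and this does not yet enforce the no-overlap requirement on call times (those columns aren't shown in the question):
UPDATE aa_dev.calls c
SET agentid = (
    SELECT a.id               -- column name assumed
    FROM aa_dev.agents a
    WHERE a.id <> c.agentid   -- any agent other than the current one
    ORDER BY random()
    LIMIT 1
)
FROM (
    SELECT array_agg(id) AS ids, min(id) AS idmin
    FROM aa_dev.calls
    GROUP BY agentid, customerid
    HAVING count(*) > 1
) AS dup
WHERE c.id = ANY(dup.ids) AND c.id <> dup.idmin;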

Update Multiple Columns in One Statement Based On a Field with the Same Value as the Column Name

Not sure if this is possible without some sort of Dynamic SQL or a Pivot (which I want to stay away from)... I have a report that displays total counts for various types/ various status combinations... These types and statuses are always going to be the same and present on the report, so returning no data for a specific combination yields a zero. As of right now there are only three caseTypes (Vegetation, BOA, and Zoning) and 8 statusTypes (see below).
I am first setting up the skeleton of the report using a temp table. I have been careful to name the temp table columns the same as what the "statusType" column will contain in my second table "#ReportData". Is there a way to update the different columns in "#FormattedReport" based on the value of the "statusType" column in my second table?
Creation of Formatted Table (for report):
CREATE TABLE #FormattedReport (
caseType VARCHAR(50)
, underInvestigation INT NOT NULL DEFAULT 0
, closed INT NOT NULL DEFAULT 0
, closedDPW INT NOT NULL DEFAULT 0
, unsubtantiated INT NOT NULL DEFAULT 0
, currentlyMonitored INT NOT NULL DEFAULT 0
, judicialProceedings INT NOT NULL DEFAULT 0
, pendingCourtAction INT NOT NULL DEFAULT 0
, other INT NOT NULL DEFAULT 0
)
INSERT INTO #FormattedReport (caseType) VALUES ('Vegetation')
INSERT INTO #FormattedReport (caseType) VALUES ('BOA')
INSERT INTO #FormattedReport (caseType) VALUES ('Zoning')
Creation of Data Table (to populate #FormattedReport):
SELECT B.Name AS caseType, C.Name AS StatusType, COUNT(*) AS Amount
INTO #ReportData
FROM table1 A
INNER JOIN table2 B ...
INNER JOIN table3 C ...
WHERE ...
GROUP BY B.Name, C.Name
CURRENT Update Statement (currently one UPDATE per column in #FormattedReport):
UPDATE A SET underInvestigation = Amount FROM #ReportData B
INNER JOIN #FormattedReport A ON B.CaseType LIKE CONCAT('%', A.caseType, '%')
WHERE B.StatusType = 'Under Investigation'
UPDATE A SET closed = Amount FROM #ReportData B
INNER JOIN #FormattedReport A ON B.CaseType LIKE CONCAT('%', A.caseType, '%')
WHERE B.StatusType = 'Closed'
...
REQUESTED Update Statement: I would like to have ONE update statement that knows which column to update when "#ReportData.statusType" matches a "#FormattedReport" column's name. For my "other" column, I'll just do that one manually using a NOT IN.
Assuming I understand the question, I think you can use conditional aggregation for this:
;WITH CTE AS
(
SELECT CaseType
,SUM(CASE WHEN StatusType = 'Under Investigation' THEN Amount ELSE 0 END) As underInvestigation
,SUM(CASE WHEN StatusType = 'Closed' THEN Amount ELSE 0 END) As closed
-- ... More of the same
FROM #ReportData
GROUP BY CaseType
)
UPDATE A
SET underInvestigation = B.underInvestigation
,closed = b.closed
-- more of the same
FROM #FormattedReport A
INNER JOIN CTE B
ON B.CaseType LIKE CONCAT('%', A.caseType, '%')
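For the "other" column mentioned in the question, the NOT IN bucket can live in the same CTE rather than a separate manual update. The exact status strings besides 'Under Investigation' and 'Closed' aren't shown, so the list below is a placeholder to fill in with the real StatusType values:
,SUM(CASE WHEN StatusType NOT IN ('Under Investigation', 'Closed' /* , ...the remaining named statuses */)
          THEN Amount ELSE 0 END) As other -- add inside the CTE's SELECT list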

Postgresql: insert the same data a few times

I have a table a; after a SQL query runs, this table contains the same records a few times.
Here is my query:
for server_id in (select bs.id
                  from status.servers bs
                  join settings.config blc on bs.id = blc.server_id
                  where blc.lane_number = (dataitem->>'No')::SMALLINT
                    and blc.min_length <= (dataitem->>'len')::real)
LOOP
    insert into a(measurement_id, server_id, status)
    values (measurement_id, server_id, false);
END LOOP;
And as a result, table a has records like:
id  meas_id  serv_id  status
1   12       1        f
2   12       1        f
3   12       1        f
(I've changed the code a little; the working code has no syntax mistakes.)
Answering "why do I have the same records with different ids?":
Table a probably has a default value for the id column, so the values are taken from a sequence; most probably you created the column with the serial data type... Those results are expected then. If you want to supply your own value, you should not skip the column in the column list, so
insert into a(measurement_id, server_id, status)
must become
insert into a(id, measurement_id, server_id, status)
and the value passed accordingly...
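For example (the literal 42 here is just an illustrative id; in practice it would come from a variable or from nextval on the sequence):
insert into a(id, measurement_id, server_id, status)
values (42, measurement_id, server_id, false);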
If you expected one result (assuming that from the same value of server_id), you need to add distinct to the loop query:
for server_id in (select distinct bs.id from status.servers bs
because currently your select returns three rows with the same bs.id, as a result of the join matching three rows on the join key...

Exclude rows that return NULL for a column when using a Case statement

SELECT ir.objectid, ir.objecttype, ir.name, ir.email, ir.createdate,
       CASE objecttype
           WHEN 1 THEN (select friendlyurl
                        from locations
                        where id = ir.objectid)
       END as objecturl
FROM inforequests ir
WHERE createdate > '1/1/2014'
ORDER BY CreateDate asc
This query returns 10 rows for me, but 1 row shows NULL for the objecturl column, which happens when no record is found in the [locations] table.
How can I alter my query to make sure that when objecturl IS NULL, that row is not returned, so that in my case the query returns only 9 rows?
Add it to the WHERE clause:
where createdate > '1/1/2014' and objecttype = 1
Since your CASE does not handle any other values, it will result in a NULL when objecttype <> 1.
Alternatively, you could nest SELECTs:
select *
from ( SELECT ir.objectid,ir.objecttype,ir.name,ir.email,ir.createdate,
CASE objecttype
WHEN 1 THEN (select friendlyurl
from locations
where id = ir.objectid)
END as objecturl
FROM inforequests ir
WHERE createdate > '1/1/2014' ) as Temp
where objecturl is not NULL
order by CreateDate asc
Note that this is somewhat different as it will also exclude rows for which the correlated subquery returns NULL.
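If that stricter behavior is what you want, one more sketch: assuming objecturl is only ever wanted for objecttype 1 (as in the original CASE), an INNER JOIN to locations drops the non-matching rows automatically, with no nesting:
SELECT ir.objectid, ir.objecttype, ir.name, ir.email, ir.createdate,
       loc.friendlyurl AS objecturl
FROM inforequests ir
INNER JOIN locations loc ON loc.id = ir.objectid   -- no match, no row
WHERE ir.createdate > '1/1/2014'
  AND ir.objecttype = 1
ORDER BY ir.createdate asc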

TSQL - Mapping one table to another without using a cursor

I have tables with the following structure:
create table Doc(
id int identity(1, 1) primary key,
DocumentStartValue varchar(100)
)
create table Metadata (
DocumentValue varchar(100),
StartDesignation char(1),
PageNumber int
)
GO
Doc contains
id DocumentStartValue
1000 ID-1
1100 ID-5
2000 ID-8
3000 ID-9
Metadata contains
Documentvalue StartDesignation PageNumber
ID-1 D 0
ID-2 NULL 1
ID-3 NULL 2
ID-4 NULL 3
ID-5 D 0
ID-6 NULL 1
ID-7 NULL 2
ID-8 D 0
ID-9 D 0
What I need to do is map Metadata.DocumentValues to Doc.id.
So the result I need is something like
id DocumentValue PageNumber
1000 ID-1 0
1000 ID-2 1
1000 ID-3 2
1000 ID-4 3
1100 ID-5 0
1100 ID-6 1
1100 ID-7 2
2000 ID-8 0
3000 ID-9 0
Can it be achieved without the use of a cursor?
Something like this (sorry, I can't test):
;WITH RowList AS
( --assign RowNums to each row...
SELECT
ROW_NUMBER() OVER (ORDER BY id) AS RowNum,
id, DocumentStartValue
FROM
doc
), RowPairs AS
( --pair each row with the next one to create ranges
  --(LEFT JOIN keeps the last row, whose range is open-ended)
SELECT
R.DocumentStartValue AS Start, R.id,
R1.DocumentStartValue AS [End] --End is a reserved word, so bracket it
FROM
RowList R LEFT JOIN RowList R1 ON R.RowNum + 1 = R1.RowNum
)
--use ranges to join back and get the data
SELECT
RP.id, M.DocumentValue, M.PageNumber
FROM
RowPairs RP
JOIN
Metadata M ON RP.Start <= M.DocumentValue
          AND (M.DocumentValue < RP.[End] OR RP.[End] IS NULL)
Edit: This assumes that you can rely on the ID-x values matching and being ascending. If so, StartDesignation is superfluous/redundant and may conflict with the Doc table's DocumentStartValue.
with rm as
(
select DocumentValue
,PageNumber
,case when StartDesignation = 'D' then 1 else 0 end as IsStart
,row_number() over (order by DocumentValue) as RowNumber
from Metadata
)
,gm as
(
select
DocumentValue as DocumentGroup
,DocumentValue
,PageNumber
,RowNumber
from rm
where RowNumber = 1
union all
select
case when rm.IsStart = 1 then rm.DocumentValue else gm.DocumentGroup end
,rm.DocumentValue
,rm.PageNumber
,rm.RowNumber
from gm
inner join rm on rm.RowNumber = (gm.RowNumber + 1)
)
select d.id, gm.DocumentValue, gm.PageNumber
from Doc d
inner join gm on d.DocumentStartValue = gm.DocumentGroup
Try the query above (you may also need to add option (maxrecursion ...)) and add an index on DocumentValue to the Metadata table. Also, if possible, it would be better to store the appropriate group when inserting the Metadata rows.
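For reference, the option goes at the end of the final statement; for example (0 removes the default limit of 100 recursion levels):
select d.id, gm.DocumentValue, gm.PageNumber
from Doc d
inner join gm on d.DocumentStartValue = gm.DocumentGroup
option (maxrecursion 0);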
UPD: I've tested it and fixed the errors in my query; now it works and gives the result shown in the initial question.
UPD2: And recommended indexes:
create clustered index IX_Metadata on Metadata (DocumentValue)
create nonclustered index IX_Doc_StartValue on Doc (DocumentStartValue)