Update Variable based on Group - tsql

I need to update a field in a table with a variable, but I need the variable to change when the group changes. It is just an INT. In the example below, I want to update the Texas records with 1 and the Florida records with the next number, 2:
UPDATE table
set StateNum = @Count
FROM table
where xxxxx
GROUP BY state
Group Update Variable
Texas 1
Texas 1
Florida 2
Florida 2
Florida 2

I think you should use a lookup table with the state and its number (StateNum). Then you should store this number instead of the name in your table.
You might use DENSE_RANK within an updateable CTE:
--mockup data
DECLARE @tbl TABLE([state] VARCHAR(100),StateNum INT);
INSERT INTO @tbl([state]) VALUES
('Texas'),('Florida'),('Texas'),('Nevada');
--your update-statement
WITH updateableCTE AS
(
SELECT StateNum
,DENSE_RANK() OVER(ORDER BY [state]) AS NewValue
FROM @tbl
)
UPDATE updateableCTE SET StateNum=NewValue;
--check the result
SELECT * FROM @tbl;
And then you should use this to get the data for your lookup table:
SELECT StateNum,[state] FROM @tbl GROUP BY StateNum,[state];
Then drop the state-column from your original table and let the StateNum be a foreign key.
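For readers without a SQL Server instance, the numbering idea can be sketched portably. This is only a minimal illustration, assuming Python's built-in sqlite3 module and a hypothetical table named states; a correlated COUNT(DISTINCT ...) stands in for DENSE_RANK, so it is not the T-SQL above, just the same result under those assumptions:

```python
import sqlite3

def number_states(rows):
    """Give every distinct state one number, ordered alphabetically."""
    con = sqlite3.connect(":memory:")
    # "states" is a hypothetical table name used only for this sketch.
    con.execute("CREATE TABLE states (state TEXT, StateNum INTEGER)")
    con.executemany("INSERT INTO states (state) VALUES (?)", [(s,) for s in rows])
    # Counting the distinct states that sort at or before the current one
    # gives the same value as DENSE_RANK() OVER (ORDER BY state).
    con.execute(
        """
        UPDATE states
        SET StateNum = (SELECT COUNT(DISTINCT s2.state)
                        FROM states AS s2
                        WHERE s2.state <= states.state)
        """
    )
    result = con.execute(
        "SELECT state, StateNum FROM states ORDER BY rowid"
    ).fetchall()
    con.close()
    return result
```

Note that, like the DENSE_RANK version, this numbers states alphabetically (Florida before Texas), not in order of first appearance.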

Related

How to update duplicate rows in a table in PostgreSQL

I have created synthetic data for a typical call center.
Below is the screenshot of the table I have created.
Table 1:
Problem statement: Since this is completely random data, I noticed that there are some customers who are being assigned to the same agents whenever they call again.
So using this query I was able to test such a case and count the number of times agents are being repeated for each customer.
select agentid, customerid, count(customerid) from aa_dev.calls group by agentid, customerid having count(customerid) > 1 ;
Table 2
I have a separate agents table called aa_dev.agents in which the agent IDs are stored.
Now I want to replace the agentid in such cases: if an agentid is repeated 6 times for a single customer, then 5 of those times the agent ID should be updated with any other agentid from the table. The call times shouldn't overlap, meaning the replacement agent should not be busy while the call is going on.
I have assigned row numbers to each repeated ones.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY agentid, customerid ORDER BY random()) rn,
COUNT(*) OVER (PARTITION BY agentid, customerid) cnt
FROM aa_dev.calls
)
SELECT agentid, customerid, rn
FROM cte
WHERE cnt > 1;
This way I could visualize the repetition clearly.
So I don't want to update row 1 but the rest.
Is there any way I can achieve this? Can I use the row number and write a query that updates rows from row number 2 onwards, one by one, with each row getting a unique agent?
If you don't want duplicates in your artificial data, it's probably better to not generate them.
But if you already have a table with duplicates and want to work on the duplicates, either updating them or deleting, here is the easy way:
You need a unique ID for each updated row. If you don't have one, add it temporarily. Then you can use the pattern below to update all duplicates except the first one.
To add an artificial id column to a preexisting table, use:
ALTER TABLE calls ADD id serial;
In my case I generated a test table with 100 random rows:
CREATE TEMP TABLE calls (id serial, agentid int, customerid int);
INSERT INTO calls (agentid, customerid)
SELECT (random()*10)::int, (random()*10)::int
FROM generate_series(1, 100) n;
Define what constitutes a duplicate and find duplicates in data:
SELECT agentid, customerid, count(*), array_agg(id) id
FROM calls
GROUP BY 1,2 HAVING count(*)>1
ORDER BY 1,2;
Update all the duplicate rows except the first one (whatever_needed is a placeholder for the new value, for example NULL):
UPDATE calls SET agentid = whatever_needed
FROM (
SELECT array_agg(id) id, min(id) idmin FROM calls
GROUP BY agentid, customerid HAVING count(*)>1
) AS dup
WHERE calls.id = ANY(dup.id) AND calls.id <> dup.idmin;
Alternatively, remove all duplicates except first one:
DELETE FROM calls
USING (
SELECT array_agg(id) id, min(id) idmin FROM calls
GROUP BY agentid, customerid HAVING count(*)>1
) AS dup
WHERE calls.id = ANY(dup.id) AND calls.id <> dup.idmin;
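The "keep the first row, change the rest" pattern can be tried without a PostgreSQL server. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the function name is hypothetical, NULL is used as the stand-in for whatever_needed, and a NOT IN (SELECT MIN(id) ...) filter replaces the array_agg/ANY form above:

```python
import sqlite3

def null_out_repeats(calls):
    """Set agentid to NULL on every duplicate (agentid, customerid) row
    except the one with the smallest id."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE calls (id INTEGER PRIMARY KEY, agentid INTEGER, customerid INTEGER)"
    )
    con.executemany("INSERT INTO calls (agentid, customerid) VALUES (?, ?)", calls)
    # MIN(id) per group identifies the row to keep; every other row in
    # the group is a repeat and gets the placeholder value (NULL here).
    con.execute(
        """
        UPDATE calls
        SET agentid = NULL
        WHERE id NOT IN (SELECT MIN(id)
                         FROM calls
                         GROUP BY agentid, customerid)
        """
    )
    result = con.execute(
        "SELECT id, agentid, customerid FROM calls ORDER BY id"
    ).fetchall()
    con.close()
    return result
```

Choosing a real replacement agent who is free at the call time would need the agents table and the call timestamps, which this sketch does not model.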

How to use a declare statement to update a table

I have this Declare Statement
declare @ReferralLevelData table([Type of Contact] varchar(10));
insert into @ReferralLevelData values ('f2f'),('nf2f'),('Travel'),('f2f'),('nf2f'),('Travel'),('f2f'),('nf2f'),('Travel');
select (row_number() over (order by [Type of Contact]) % 3) +1 as [Referral ID]
,[Type of Contact]
from @ReferralLevelData
order by [Referral ID]
,[Type of Contact];
It does not insert into the table, so I feel this is not working as expected, i.e. it doesn't modify the table.
If it did work, I was hoping to modify the statement to make it an update.
At the moment the statement just prints this result:
1 f2f
1 nf2f
1 Travel
2 f2f
2 nf2f
2 Travel
3 f2f
3 nf2f
3 Travel
EDIT:
I want to update the table to enter recurring data in groups of three.
I have a table of data; it is duplicated twice in the same table to make three sets.
Its "ReferenceID" is the primary key. I want to group the three rows with the same ReferenceID and inject the three values "f2f", "NF2F" and "Travel" into the column called "Type" in any order, but ensure that each ReferenceID only has one of those values.
Do you mean the following?
declare @ReferralLevelData table(
[Referral ID] int,
[Type of Contact] varchar(10)
);
insert into @ReferralLevelData([Referral ID],[Type of Contact])
select
(row_number() over (order by [Type of Contact]) % 3) +1 as [Referral ID]
,[Type of Contact]
from
(
values ('f2f'),('nf2f'),('Travel'),('f2f'),('nf2f'),('Travel'),('f2f'),('nf2f'),('Travel')
) v([Type of Contact]);
If it suits you then you also can use the next query to generate data:
select r.[Referral ID],ct.[Type of Contact]
from
(
values ('f2f'),('nf2f'),('Travel')
) ct([Type of Contact])
cross join
(
values (1),(2),(3)
) r([Referral ID]);

Update table to set the rows of a second group to be the same as the first row of the first group in sql

How do I update my table in SQL so that all rows of a group have the same second-group values as the first row of that group?
Please find the attached screen shot for better understanding.
Current and Expected Results:
Create Table Script
CREATE TABLE CurrentTable
(
Group1ID INT,
Group1Name VARCHAR(100),
Group2ID INT,
Group2Name VARCHAR(100)
)
INSERT INTO CurrentTable VALUES ('360943','Group 1','24418','Donald')
INSERT INTO CurrentTable VALUES ('360943','Group 1','24419','Kalamaz')
INSERT INTO CurrentTable VALUES ('360944','Group 2','24410','Adam')
INSERT INTO CurrentTable VALUES ('360944','Group 2','24411','Suzan')
Now Table Looks like
Group1ID Group1Name Group2ID Group2Name
360943 Group 1 24418 Donald
360943 Group 1 24419 Kalamaz
360944 Group 2 24410 Adam
360944 Group 2 24411 Suzan
Update Script
UPDATE O SET O.Group2ID = (SELECT TOP 1 Group2ID FROM CurrentTable I WHERE O.Group1ID = I.Group1ID),
O.Group2Name = (SELECT TOP 1 Group2Name FROM CurrentTable I WHERE O.Group1ID = I.Group1ID)
FROM CurrentTable O
Post Running Update
Group1ID Group1Name Group2ID Group2Name
360943 Group 1 24418 Donald
360943 Group 1 24418 Donald
360944 Group 2 24410 Adam
360944 Group 2 24410 Adam
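TOP 1 without an ORDER BY does not guarantee which row counts as "first". A minimal sketch of a deterministic variant follows, assuming Python's built-in sqlite3 module and assuming that "first row" means the row with the smallest Group2ID in each Group1ID (which matches the sample data); the function name is hypothetical:

```python
import sqlite3

def copy_first_row_per_group(rows):
    """Overwrite Group2ID/Group2Name in every row with the values from
    the row that has the smallest Group2ID in the same Group1ID."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE CurrentTable ("
        "Group1ID INTEGER, Group1Name TEXT, Group2ID INTEGER, Group2Name TEXT)"
    )
    con.executemany("INSERT INTO CurrentTable VALUES (?, ?, ?, ?)", rows)
    # ORDER BY ... LIMIT 1 plays the role of TOP 1, but with an explicit
    # ordering so the chosen row is well defined.
    con.execute(
        """
        UPDATE CurrentTable
        SET Group2Name = (SELECT i.Group2Name
                          FROM CurrentTable AS i
                          WHERE i.Group1ID = CurrentTable.Group1ID
                          ORDER BY i.Group2ID
                          LIMIT 1),
            Group2ID = (SELECT MIN(i.Group2ID)
                        FROM CurrentTable AS i
                        WHERE i.Group1ID = CurrentTable.Group1ID)
        """
    )
    result = con.execute("SELECT * FROM CurrentTable ORDER BY rowid").fetchall()
    con.close()
    return result
```

In T-SQL the equivalent change would be adding an ORDER BY inside each TOP 1 subquery.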

PostgreSQL Removing duplicates

I am working on a Postgres query to remove duplicates from a table. The following table is dynamically generated, and I want to write a select query which will remove a record if its first column has a duplicate value.
The table looks something like this
1st col 2nd col
4 62
6 34
5 26
5 12
I want to write a select query which removes either row 3 or row 4.
There is no need for an intermediate table:
delete from df1
where ctid not in (select min(ctid)
from df1
group by first_column);
If you are deleting many rows from a large table, the approach with an intermediate table is probably faster.
If you just want to get unique values for one column, you can use:
select distinct on (first_column) *
from the_table
order by first_column;
Or simply
select first_column, min(second_column)
from the_table
group by first_column;
To select only the rows whose first value is not duplicated at all:
select count(first) as cnt, first, min(second) as second
from df1
group by first
having count(first) = 1;
If you want to keep one of the rows (sorry, I initially missed that you wanted that):
select first, min(second)
from df1
group by first
Where the table's name is df1 and the columns are named first and second.
You can actually leave off the count(first) as cnt if you want.
At the risk of stating the obvious, once you know how to select the data you want (or don't want), deleting the records in any of a dozen ways is simple.
If you want to replace the table or make a new table you can just use create table as for the deletion:
create table tmp as
select first, min(second) as second
from df1
group by first
having count(first) = 1;
drop table df1;
create table df1 as select * from tmp;
or using DELETE FROM:
DELETE FROM df1 WHERE first NOT IN (SELECT first FROM tmp);
You could also use select into, etc, etc.
if you want to SELECT unique rows:
SELECT * FROM ztable u
WHERE NOT EXISTS ( -- There is no other record
SELECT * FROM ztable x
WHERE x.id = u.id -- with the same id
AND x.ctid < u.ctid -- , but with a different(lower) "internal" rowid
); -- so u.* must be unique
if you want to SELECT the other rows, which were suppressed in the previous query:
SELECT * FROM ztable nu
WHERE EXISTS ( -- another record exists
SELECT * FROM ztable x
WHERE x.id = nu.id -- with the same id
AND x.ctid < nu.ctid -- , but with a different(lower) "internal" rowid
);
if you want to DELETE records, making the table unique (but keeping one record per id):
DELETE FROM ztable d
WHERE EXISTS ( -- another record exists
SELECT * FROM ztable x
WHERE x.id = d.id -- with the same id
AND x.ctid < d.ctid -- , but with a different(lower) "internal" rowid
);
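The EXISTS-based delete can be tried outside PostgreSQL as well. Here is a minimal sketch, assuming Python's built-in sqlite3 module, where SQLite's rowid stands in for PostgreSQL's ctid (an assumption of this sketch, not something the queries above use) and val is a hypothetical second column:

```python
import sqlite3

def delete_later_duplicates(rows):
    """Keep the earliest-inserted row for each id and delete the rest."""
    con = sqlite3.connect(":memory:")
    # id is deliberately NOT the primary key, so duplicates are allowed
    # and rowid remains a separate internal row identifier.
    con.execute("CREATE TABLE ztable (id INTEGER, val INTEGER)")
    con.executemany("INSERT INTO ztable VALUES (?, ?)", rows)
    # A row is deleted when another row with the same id and a lower
    # internal row identifier exists.
    con.execute(
        """
        DELETE FROM ztable
        WHERE EXISTS (SELECT 1
                      FROM ztable AS x
                      WHERE x.id = ztable.id
                        AND x.rowid < ztable.rowid)
        """
    )
    result = con.execute("SELECT id, val FROM ztable ORDER BY rowid").fetchall()
    con.close()
    return result
```

With the question's sample data, the row (5, 12) is removed and (5, 26) is kept.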
So basically I did this:
create temp table t1 as
select first, min(second) as second
from df1
group by first;
select * from df1
inner join t1 on t1.first = df1.first and t1.second = df1.second;
It's a satisfactory answer. Thanks for your help, @Hack-R.

Updating with Nested Select Statements

I have a table that holds 3 fields of data: Acct#, YMCode, and EmployeeID. The YMCode is an INT formatted like 201308, 201307, etc. For each Acct#, I need to select the EmployeeID used for YMCode 201308 and then update all of the other YMCodes for that Acct# to the EmployeeID used in 201308.
so for each customer account in the table...
Update MyTable
Set EmployeeID = EmployeeID used in YMCode 201308
Having a hard time with it.
Put it in a transaction and look at the results before committing, but I think this is what you want:
UPDATE b
SET EmployeeID = a.EmployeeID
FROM MyTable a
INNER JOIN MyTable b
ON a.[Acct#] = b.[Acct#]
where a.YMCode =
(SELECT MAX(YMCode) from MyTable)
The SELECT at the end supplies the max YMCode, so the value does not have to be hard-coded.
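The same propagation can be checked on a small data set. This is a minimal sketch, assuming Python's built-in sqlite3 module, a column named Acct in place of Acct# (renamed for this sketch), at most one row per account and YMCode, and a correlated subquery instead of the self-join above:

```python
import sqlite3

def propagate_employee(rows, ym_code):
    """Copy the EmployeeID from the ym_code row of each account to all
    other rows of that account; accounts without that row are untouched."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE MyTable (Acct INTEGER, YMCode INTEGER, EmployeeID INTEGER)"
    )
    con.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    # The EXISTS filter prevents accounts that have no ym_code row from
    # being overwritten with NULL.
    con.execute(
        """
        UPDATE MyTable
        SET EmployeeID = (SELECT a.EmployeeID
                          FROM MyTable AS a
                          WHERE a.Acct = MyTable.Acct
                            AND a.YMCode = :ym)
        WHERE EXISTS (SELECT 1
                      FROM MyTable AS a
                      WHERE a.Acct = MyTable.Acct
                        AND a.YMCode = :ym)
        """,
        {"ym": ym_code},
    )
    result = con.execute(
        "SELECT Acct, YMCode, EmployeeID FROM MyTable ORDER BY rowid"
    ).fetchall()
    con.close()
    return result
```

For example, propagate_employee(rows, 201308) leaves an account that only has a 201307 row unchanged.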