I have a table in Postgres:
create table foo(bar int);
I know I can insert sequential data like this
insert into public.foo (select i from generate_series(1, 10) as i);
Now I want to update rows
update public.foo set bar = sq.t from (select t from generate_series(100, 10000) as t) as sq;
but this sets the column to the same value in every row.
I know I need a WHERE clause somehow, but how can I use one when neither side has a primary key to join on?
EDIT:
Here is more real-life detail. I have a complex table with around 20 columns and around 40k rows. I am interested in two columns here: pk (or id, an integer backed by id_seq) and created_date.
I populated this table by duplicating the initial 10 rows, so the created_date values repeat (like 123123123). I want to take a large range of timestamps from generate_series at 1-minute intervals and put them in the created_date column so that the data there is sequential. Ideally I would also regenerate the ids starting from 1. How can I do it?
You're looking for
UPDATE foo
SET bar = CASE WHEN new_value < N THEN new_value + start END
FROM (
SELECT
ctid,
row_number() OVER () as new_value
FROM foo
) AS new_foo
WHERE foo.ctid = new_foo.ctid;
TABLE foo;
(online demo)
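For the created_date case described in the edit, the same ctid plus row_number() pairing can be sketched as follows. This is only an illustration under assumptions: the table name events, its two columns, and the start timestamp are hypothetical stand-ins for the real 20-column table, and the numbering from row_number() OVER () follows whatever order the scan happens to return.

```sql
-- Hypothetical stand-in for the real table: repeating ids and dates.
CREATE TABLE events (
    id           integer,
    created_date timestamp
);

INSERT INTO events (id, created_date)
SELECT 1 + (i % 10), timestamp '2020-01-01 00:00:00'
FROM generate_series(1, 40) AS i;

-- Number every physical row once, then join back on ctid so each row
-- receives its own number instead of one shared value.
UPDATE events AS e
SET id           = n.rn,
    created_date = timestamp '2021-01-01 00:00:00'        -- assumed start
                   + (n.rn - 1)::int * interval '1 minute'
FROM (
    SELECT ctid, row_number() OVER () AS rn
    FROM events
) AS n
WHERE e.ctid = n.ctid;
```

If id carries a primary key or another non-deferrable unique constraint, renumbering it in a single UPDATE can fail with a transient duplicate-key error, because PostgreSQL checks such constraints row by row; in that case it may be safer to update created_date alone.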
I'm using this query to find duplicate dates, but I'm not sure how to sum the values for each duplicated date, average them, and remove the duplicate rows.
DB Schema
date_time
datapoint_1
datapoint_2
SQL Query
SELECT date_time, COUNT(date_time)
FROM MYTABLE
GROUP BY date_time
HAVING COUNT(date_time) > 1
ORDER BY COUNT(date_time)
I would create a new table to replace the old one. That is easier and might even perform better:
CREATE TABLE mytable2 (LIKE mytable);
INSERT INTO mytable2 (date_time, datapoint_1, datapoint_2)
SELECT m.date_time, avg(m.datapoint_1), avg(m.datapoint_2)
FROM mytable AS m
GROUP BY m.date_time;
Then you can drop mytable and rename mytable2 to replace it.
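The swap itself could look like the sketch below. It assumes PostgreSQL (suggested by the LIKE and ON CONFLICT syntax in this answer) and that no foreign keys or views depend on mytable.

```sql
-- DDL is transactional in PostgreSQL, so other sessions never see a
-- moment where "mytable" is missing.
BEGIN;
DROP TABLE mytable;
ALTER TABLE mytable2 RENAME TO mytable;
COMMIT;
```

Note that CREATE TABLE ... (LIKE mytable) copies only column names, types, and NOT NULL constraints by default; indexes and defaults would need INCLUDING ALL or have to be recreated on the new table.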
To prevent new rows from creating duplicates, you could change the way you insert data:
-- to keep track of counts
ALTER TABLE mytable ADD numval integer DEFAULT 1;
-- to prevent duplicates
ALTER TABLE mytable ADD UNIQUE (date_time);
-- to insert new rows
INSERT INTO mytable (date_time, datapoint_1, datapoint_2)
VALUES ('2021-06-30', 42.0, -34.9)
ON CONFLICT (date_time)
DO UPDATE SET numval = mytable.numval + 1,
datapoint_1 = mytable.datapoint_1 + excluded.datapoint_1,
datapoint_2 = mytable.datapoint_2 + excluded.datapoint_2;
-- to select the averages
SELECT date_time,
datapoint_1 / numval AS datapoint_1,
datapoint_2 / numval AS datapoint_2
FROM mytable;
When you use GROUP BY, you can also use aggregate functions to reduce multiple rows to a single one (COUNT, which you already used, is one such function). In your case the query would be:
SELECT date_time, avg(datapoint_1), avg(datapoint_2)
FROM MYTABLE
GROUP BY date_time
For every distinct date_time you will get a single row with the average of datapoint_1 and datapoint_2.
I am trying to create a unique index for a subset of data in a particular table. The existing data is something like this -
But the actual data should look like this -
The subset of rows will be the rows with the condition status as A or B. For these set of rows, the unique_id and amount value combination should be unique.
The DB2 version being used here is 9.7 on a Windows server. Is a partial index or conditional index possible in DB2?
New table
create or replace function generate_unique_det()
returns varchar(13) for bit data
deterministic
no external action
contains sql
return generate_unique();
create table test_unique (
unique_id int not null
, status char(1) not null
, amount int not null
, status2 varchar(13) for bit data not null generated always as
(case when status in ('A', 'B') then '' else generate_unique_det() end)
) in userspace1;
create unique index test_unique1 on test_unique (unique_id, amount, status2);
insert into test_unique (unique_id, status, amount)
values
(1234, 'A', 400)
--, (1234, 'B', 400)
, (1234, 'Z', 400)
, (1234, 'Z', 400);
The standard generate_unique function is not deterministic.
Such functions are not allowed in the generated always clause.
This is why we create our own function based on the standard one.
The risk with such a "fake" deterministic function is that Db2 might not actually call it for each inserted or updated row during a multi-row change operation (why call a deterministic function several times with the same parameters when it is enough to call it once and reuse the result for the other affected rows?). In practice it works: Db2 does call the function for every affected row, which is what we want in this case.
You are not able to insert the commented out row in the last statement.
Existing table
set integrity for test_unique off;
alter table test_unique add
status2 varchar(13) for bit data not null
generated always as (case when status in ('A', 'B') then '' else generate_unique_det() end);
set integrity for test_unique immediate checked force generated;
-- If you need to save the rows that would violate the future unique index
create table test_unique_exc like test_unique in userspace1;
-- If you don't need to save the rows that would violate the future unique index,
-- then run only the inner DELETE statement,
-- which simply removes these rows.
-- The whole statement inserts the deleted rows into the "exception table".
with d as (
select unique_id, status, amount, status2
from old table (
delete from (select t.*, rownumber() over (partition by unique_id, amount, status2) rn_ from test_unique t) where rn_>1
)
)
select count(1)
from new table (
insert into test_unique_exc select * from d
);
create unique index test_unique1 on test_unique (unique_id, amount, status2);
I have two tables. One table A has n rows of data and the other table B is empty. I want to insert n rows into table B, 1 row for each row in table A. Table B will have a couple of fields from table A in it, including a foreign key from table A.
In the end I want one row in B for each row in A. To do this I used:
INSERT INTO B(Col1
,Col2
,Col3
,Col4
,Col5
)
SELECT 100
,25
,'ABC'
,1
,A.ID
FROM Auctions A
Now, I've put this code in a stored procedure and this SP takes an int param called NumInserts.
I want to insert n * NumInserts rows. So, if n is 10 and NumInserts is 5 I want to run this code 5 * 10 (50) times.
In other words for each row in table A I want to insert 5 rows in table B. How would I do that?
create procedure insert_into_b
    @numInserts int
as
begin
    while @numInserts > 0
    begin
        insert into b (id)
        select id from a
        set @numInserts = @numInserts - 1
    end
end
go
exec insert_into_b 2
This is a hack and I wouldn't recommend using it in production or big volumes of data. However, in development quick-and-dirty scenarios I found it often useful:
Use GO [count] to execute a batch of commands a specified number of times.
Concretely, if you had a stored procedure called InsertAIntoB, you could run this in Management Studio:
exec InsertAIntoB
GO 10
(replace 10 with whatever NumInserts is)
I prefer to avoid looping when I can, just so I don't have to maintain some easily breakable and somewhat ugly loop structure in my stored procedure.
You could easily do this with a Numbers table, the CROSS APPLY statement, and your existing INSERT statement.
Given that your numbers table would look like this:
Number
======
0
1
2
...
Your SQL statement simply becomes:
INSERT INTO B
(
[Col1]
,[Col2]
,[Col3]
,[Col4]
,[Col5]
)
SELECT
100
,25
,'ABC'
,1
,a.ID
FROM
Auctions a
CROSS APPLY
Numbers n
WHERE
n.Number BETWEEN 1 AND @NumInserts
Numbers tables can be useful if used appropriately. If you're unfamiliar with them, here are a few resources and some pros/cons:
http://dataeducation.com/you-require-a-numbers-table/ (the code to create a numbers table in this article is shown below)
http://archive.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=NumbersTable
https://dba.stackexchange.com/questions/11506/why-are-numbers-tables-invaluable
Maybe this solution is overkill if @NumInserts is always going to be a reasonably small number, but if you already have a Numbers table sitting around, you might as well take advantage of it!
UPDATE:
Here's a quick and dirty method to populate a numbers table from 0 to 65,535:
CREATE TABLE Numbers
(
Number INT NOT NULL,
CONSTRAINT PK_Numbers
PRIMARY KEY CLUSTERED (Number)
WITH FILLFACTOR = 100
)
GO
INSERT INTO Numbers
SELECT
(a.Number * 256) + b.Number AS Number
FROM
(
SELECT number
FROM master..spt_values
WHERE
type = 'P'
AND number <= 255
) a (Number),
(
SELECT number
FROM master..spt_values
WHERE
type = 'P'
AND number <= 255
) b (Number)
GO
Credit: http://dataeducation.com/you-require-a-numbers-table/
Create procedure DoitNTimes
    @N integer = 1
As
Set NoCount On
While @N > 0 Begin
    Insert B (Col1, Col2, Col3, Col4, Col5)
    Select 100, 25, 'ABC', 1, A.ID
    From Auctions A
    -- -----------------------------------
    Set @N -= 1
End
If using SQL Server 2005 or earlier, replace Set @N -= 1 with Set @N = @N - 1.
And if you really want to avoid a loop that uses T-SQL variables, then use a CTE, not a disk-based table:
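Create procedure DoitNTimes
    @N integer = 1
As
-- The statement before a CTE must be terminated with a semicolon.
Set NoCount On;
-- Recursive CTE yielding the numbers @N, @N - 1, ..., 1 (one row per copy).
With nums(num) As
    (Select @N Union All
     Select num - 1
     From nums
     Where num > 1)
Insert B (Col1, Col2, Col3, Col4, Col5)
Select 100, 25, 'ABC', 1, A.ID
-- Cross join pairs every Auctions row with every number.
From Auctions A Cross Join nums
Option(MaxRecursion 10000)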
but of course, this is also still looping, just like any solution to this issue.
Very late answer, but there is no need to loop, and it's a little simpler than Corey's good answer:
DECLARE @n int = 10;
INSERT INTO B(Col1,Col2,Col3,Col4,Col5)
SELECT 100,25,'ABC',1,A.ID
FROM Auctions A
JOIN (SELECT TOP(@n) 1 [junk] FROM sys.all_objects) as copies ON 1 = 1
You could use any table in the join as long as it has the number of rows you'll need. You could also change "1 [junk]" to "ROW_NUMBER() OVER(ORDER BY object_id) [copyno]" if you wanted a copy number somewhere in the insert table.
Hopefully this will save someone a little work down the road...
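As a sketch of the copy-number variant mentioned above, the derived table can expose ROW_NUMBER() instead of a constant. The inner ORDER BY is an addition here so that TOP returns exactly the rows numbered 1 through @n; the outer SELECT only displays the pairs and is not tied to table B's columns.

```sql
DECLARE @n int = 10;

-- Each Auctions row is paired with copy numbers 1..@n.
SELECT A.ID, copies.copyno
FROM Auctions A
CROSS JOIN (
    SELECT TOP (@n)
           ROW_NUMBER() OVER (ORDER BY object_id) AS copyno
    FROM sys.all_objects
    ORDER BY object_id
) AS copies;
```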
Try this (on SQL server databases):
DECLARE @NumInserts SMALLINT = 3
INSERT INTO B (Col1, Col2, Col3, Col4, Col5)
SELECT 100, 25, 'ABC', 1, A.ID
FROM Auctions A
JOIN master.dbo.spt_values numbers ON numbers.number < @NumInserts
WHERE numbers.[type] = 'P'
Note: This will only work if @NumInserts is less than or equal to 2048
master.dbo.spt_values WHERE type = 'P' is just a built-in SQL Server table of numbers from 0 to 2047
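If more than 2048 copies are ever needed, one possible workaround (a swapped-in technique, not part of the answer above) is to cross join a catalog view with itself so the row source is large enough. The value 5000 is purely illustrative.

```sql
DECLARE @NumInserts int = 5000;  -- illustrative value above the 2048 limit

INSERT INTO B (Col1, Col2, Col3, Col4, Col5)
SELECT 100, 25, 'ABC', 1, A.ID
FROM Auctions A
CROSS JOIN (
    -- sys.all_objects crossed with itself yields far more rows than
    -- spt_values; TOP trims that to the requested number of copies.
    SELECT TOP (@NumInserts) 1 AS junk
    FROM sys.all_objects AS o1
    CROSS JOIN sys.all_objects AS o2
) AS copies;
```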
I have the following table:
RecordID
Name
Col1
Col2
....
ColN
The RecordID is BIGINT PRIMARY KEY CLUSTERED IDENTITY(1,1) and RecordID and Name are initialized. The other columns are NULLs.
I have a function which returns information about the other columns by Name.
To initialize my table I use the following algorithm:
Create a LOOP
Get a row, select its Name value
Execute the function using the selected name, and store its result
in temp variables
Insert the temp variables in the table
Move to the next record
Is there a way to do this without looping?
Cross apply was basically built for this
SELECT D.deptid, D.deptname, D.deptmgrid
,ST.empid, ST.empname, ST.mgrid
FROM Departments AS D
CROSS APPLY fn_getsubtree(D.deptmgrid) AS ST;
Using APPLY
UPDATE st
SET some_row = ca.another_row,
    some_row2 = ca.another_row / 2
FROM some_table st
CROSS APPLY
    -- The applied subquery needs its own alias (ca) to be referenced in SET.
    (SELECT TOP 1 a.another_row FROM another_table a WHERE a.shared_id = st.shared_id) ca
WHERE ...
using cross apply in an update statement
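Applied to the question, a set-based version of the loop might look like this sketch. It assumes the lookup can be written as a table-valued function; dbo.GetColumnsByName and the table name MyTable are hypothetical names used only for illustration.

```sql
-- dbo.GetColumnsByName is a hypothetical table-valued function that
-- returns one row (Col1, Col2) for the name passed in.
UPDATE t
SET Col1 = f.Col1,
    Col2 = f.Col2
FROM MyTable AS t
CROSS APPLY dbo.GetColumnsByName(t.Name) AS f;
```

CROSS APPLY skips rows for which the function returns nothing, so those rows keep their NULLs; OUTER APPLY would keep them in the join and assign NULL explicitly.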
You can simply say the following if you already have the records in the table.
UPDATE MyTable
SET
col1 = dbo.col1Method(Name),
col2 = dbo.col2Method(Name),
...
While inserting new records, assuming RecordID is auto-generated, you can say
INSERT INTO MyTable(Name, Col1, Col2, ...)
VALUES(@Name, dbo.col1Method(@Name), dbo.col2Method(@Name), ...)
where @Name contains the value for the Name column.
Let's say we have a table with some data in it.
IF OBJECT_ID('dbo.table1') IS NOT NULL
BEGIN
DROP TABLE dbo.table1;
END
CREATE TABLE table1 ( DATA INT );
---------------------------------------------------------------------
-- Generating testing data
---------------------------------------------------------------------
INSERT INTO dbo.table1(data)
SELECT 100
UNION ALL
SELECT 200
UNION ALL
SELECT NULL
UNION ALL
SELECT 400
UNION ALL
SELECT 400
UNION ALL
SELECT 500
UNION ALL
SELECT NULL;
How to delete the 2nd, 5th, 6th records in the table? The order is defined by the following query.
SELECT data
FROM dbo.table1
ORDER BY data DESC;
Note, this is in SQL Server 2000 environment.
Thanks.
In short, you need something in the table to indicate sequence. The "2nd row" is a non-sequitur when there is nothing that enforces sequence. However, a possible solution might be (toy example => toy solution):
If object_id('tempdb..#NumberedData') Is Not Null
Drop Table #NumberedData
Create Table #NumberedData
(
Id int not null identity(1,1) primary key clustered
, data int null
)
Insert #NumberedData( data )
SELECT 100
UNION ALL SELECT 200
UNION ALL SELECT NULL
UNION ALL SELECT 400
UNION ALL SELECT 400
UNION ALL SELECT 500
UNION ALL SELECT NULL
Begin Tran
Delete table1
Insert table1( data )
Select data
From #NumberedData
Where Id Not In(2,5,6)
-- Commit only when the last statement succeeded; otherwise undo everything.
If @@Error = 0
    Commit Tran
Else
    Rollback Tran
Obviously, this type of solution is not guaranteed to work exactly as you want but the concept is the best you will get. In essence, you stuff your rows into a table with an identity column and use that to identify the rows to remove. Removing the rows entails emptying the original table and re-populating with only the rows you want. Without a unique key of some kind, there just is no clean way of handling this problem.
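To tie the numbering to the ordering given in the question rather than to hand-typed values, the numbered copy could be filled from table1 itself. This is a sketch under the assumption that identity values are assigned in the order of the SELECT's ORDER BY, which Microsoft documents for current SQL Server versions; whether that holds on SQL Server 2000 should be verified. The temp table name #Ordered is hypothetical.

```sql
CREATE TABLE #Ordered
(
    Id   int NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    data int NULL
);

-- Identity values follow the requested ordering (data DESC).
INSERT #Ordered (data)
SELECT data
FROM dbo.table1
ORDER BY data DESC;

BEGIN TRAN;
-- Empty the original table and put back everything except rows 2, 5, 6.
DELETE dbo.table1;
INSERT dbo.table1 (data)
SELECT data
FROM #Ordered
WHERE Id NOT IN (2, 5, 6);
COMMIT TRAN;

DROP TABLE #Ordered;
```

With the sample data, the descending order is 500, 400, 400, 200, 100, NULL, NULL, so the rows left in table1 would be 500, 400, 200, and one NULL.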
As you are probably aware you can do this in later versions using row_number very straightforwardly.
delete t from
(select ROW_NUMBER() over (order by data) r from table1) t
where r in (2,5,6)
Even without that it is possible to use the undocumented %%LOCKRES%% function to differentiate between 2 identical rows
SELECT data,%%LOCKRES%%
FROM dbo.table1
I don't think that's available in SQL Server 2000 though.
In SQL Sets don't have order but cursors do so you could use something like the below. NB: I was expecting to be able to use DELETE ... WHERE CURRENT OF but that relies on a PK so the code to delete a row is not as simple as I was hoping for.
In the event that the data to be deleted is a duplicate then there is no guarantee that it will delete the same row as CURRENT OF would have. However in this eventuality the ordering of the tied rows is arbitrary anyway so whichever row is deleted could equally well have been given that row number in the cursor ordering.
DECLARE @RowsToDelete TABLE
(
    rowidx INT PRIMARY KEY
)
INSERT INTO @RowsToDelete SELECT 2 UNION SELECT 5 UNION SELECT 6
DECLARE @PrevRowIdx int
DECLARE @CurrentRowIdx int
DECLARE @Offset int
SET @CurrentRowIdx = 1
DECLARE @data int
DECLARE ordered_cursor SCROLL CURSOR FOR
SELECT data
FROM dbo.table1
ORDER BY data
OPEN ordered_cursor
FETCH NEXT FROM ordered_cursor INTO @data
WHILE EXISTS(SELECT * FROM @RowsToDelete)
BEGIN
    SET @PrevRowIdx = @CurrentRowIdx
    SET @CurrentRowIdx = (SELECT TOP 1 rowidx FROM @RowsToDelete ORDER BY rowidx)
    SET @Offset = @CurrentRowIdx - @PrevRowIdx
    DELETE FROM @RowsToDelete WHERE rowidx = @CurrentRowIdx
    FETCH RELATIVE @Offset FROM ordered_cursor INTO @data
    /*Can't use DELETE ... WHERE CURRENT OF as here that requires a PK*/
    SET ROWCOUNT 1
    /*Match the fetched value; a NULL only matches another NULL*/
    DELETE FROM dbo.table1 WHERE (data = @data OR (data IS NULL AND @data IS NULL))
    SET ROWCOUNT 0
END
CLOSE ordered_cursor
DEALLOCATE ordered_cursor
To perform any action on a set of rows (such as deleting them), you need to know what identifies those rows.
So, you have to come up with criteria that identifies the rows you want to delete.
Providing a toy example, like the one above, is not particularly useful.
You plan ahead and if you anticipate this is possible you add a surrogate key column or some such.
In general you make sure you don't create tables without PK's.
It's like asking "Say I don't look both directions before crossing the road and I step in front of a bus."