PostGresql: Copy data from a random row of another table - postgresql

I have two tables, stuff and nonsense.
create table stuff(
id serial primary key,
details varchar,
data varchar,
more varchar
);
create table nonsense (
id serial primary key,
data varchar,
more varchar
);
insert into stuff(details) values
('one'),('two'),('three'),('four'),('five'),('six');
insert into nonsense(data,more) values
('apple','accordion'),('banana','banjo'),('cherry','cor anglais');
See http://sqlfiddle.com/#!17/313fb/1
I would like to copy random values from nonsense to stuff. I can do this for a single value using the answer to my previous question: SQL Server Copy Random data from one table to another:
update stuff
set data=(select data from nonsense where stuff.id=stuff.id
order by random() limit 1);
However, I would like to copy more than one value (data and more) from the same row, and the sub query won’t let me do that, of course.
I Microsoft SQL, I can use the following:
update stuff
set data=sq.town,more=sq.state
from stuff s outer apply
(select top 1 * from nonsense where s.id=s.id order by newid()) sq
I have read that PostGresql uses something like LEFT JOIN LATERAL instead of OUTER APPPLY, but simply substituting doesn’t work for me.
How can I update with multiple values from a random row of another table?

As of Postgres 9.5, you can assign multiple columns from a subquery:
update stuff
set (data, more) = (
select data, more
from nonsense
where stuff.id=stuff.id
order by random()
limit 1
);

Related

Is there a way to add the same row multiple times with different ids into a table with postgresql?

I am trying to add the same data for a row into my table x number of times in postgresql. Is there a way of doing that without manually entering the same values x number of times? I am looking for the equivalent of the go[count] in sql for postgres...if that exists.
Use the function generate_series(), e.g.:
insert into my_table
select id, 'alfa', 'beta'
from generate_series(1,4) as id;
Test it in db<>fiddle.
Idea
Produce a resultset of a given size and cross join it with the record that you want to insert x times. What would still be missing is the generation of proper PK values. A specific suggestion would require more details on the data model.
Query
The sample query below presupposes that your PK values are autogenerated.
CREATE TABLE test ( id SERIAL, a VARCHAR(10), b VARCHAR(10) );
INSERT INTO test (a, b)
WITH RECURSIVE Numbers(i) AS (
SELECT 1
UNION ALL
SELECT i + 1
FROM Numbers
WHERE i < 5 -- This is the value `x`
)
SELECT adhoc.*
FROM Numbers n
CROSS JOIN ( -- This is the single record to be inserted multiple times
SELECT 'value_a' a
, 'value_b' b
) adhoc
;
See it in action in this db fiddle.
Note / Reference
The solution is adopted from here with minor modifications (there are a host of other solutions to generate x consecutive numbers with SQL hierachical / recursive queries, so the choice of reference is somewhat arbitrary).

using 1 WITH statement with multiple (n) INSERTs

I wanted to know if it is possible to create one WITH statement and add multiple data values to other table with select, or do any equivalent thing.
I have 2 tables
one has data
create table bg_item(
item_id text primary key default 'a'||nextval('bg_item_seq'),
sellerid bigint not null references bg_user(userid),
item_type char(1)not null default 'N', --NORMAL (public)
item_upload_date date NOT NULL default current_date,
item_name varchar(30) not null,
item_desc text not null,
other has images link
create table item_images(
img_id bigint primary key default nextval('bg_item_image_seq'),
item_id text not null references bg_item (item_id),
image_link text not null
);
The user can add item to sell and upload images of it, now these images can be 3 or more, now when i will add the images, and complete item's description and everything from app, my request goes to backend, and i want to perform the query that it adds the user's item, sends me the item's id (which is a sequence in PostgresSQL) and use that id to reference images that i am inserting.
Currently i was doing this (for 1 image):
WITH ins1 AS (
INSERT INTO bg_item(sellerid,item_type,item_date,item_name,item_desc,item_specs,item_category)
VALUES (1005, 'k',default,'asdf','asdf','asd','asd')
RETURNING item_id
)
INSERT INTO item_images (item_id, image_link)
select item_id,'asdfg.asd.asdf.com' from ins1
(for 3 images)
INSERT INTO bg_item(sellerid,item_type,item_date,item_name,item_desc,item_specs,item_category)
VALUES (1005, 'k',default,'asdf','asdf','asd','asd')
RETURNING item_id
)
INSERT INTO item_images (item_id, image_link)
select item_id,'asdfg.asd.asdf.com' from ins1
select item_id,'asdfg.asdaws3f.com' from ins1
select item_id,'asdfg.gooolefnsd.sfsjf.com' from ins1
This would work for 3 images.
So my question is how to do it with n number of images? (as user can upload from 1 to n images)
Can i write a for loop?
a procedure or function?
References:
With and Insert
Sql multiple insert select
I didn't understand the Edit 3 (if it is related to my answer) in the above one.
One Solution i can think of is to write a procedure to return me item_id and write one more procedure to run multiple inserts, but i want a more efficient solution.
If you are going to work with SQL then there is a concept you need to expel from your thoughts -- LOOP. As soon as you think it, it is time to rethink. It does not exist is SQL and is not typically needed. SQL works in sets of qualifying things not individual things.
Now to your issue, it can be done is 1 statement. You pass your image list as an array of text in the with clause, then unnest that array and join to your existing cte during the Insert/Select:
with images (ilist) as
(
select array['image1','image2','image3','image4','image5']
)
, item (item_id) as
(
insert into bg_item(sellerid,item_type,item_date,item_name,item_desc,item_specs,item_category)
values (1005, 'k',default,'asdf','asdf','asd','asd')
returning item_id
)
insert into item_images (item_id, image_link)
select item_id,unnest (ilist)
from images
join item on true;

PostgreSQL - Update rows in table with generate_series()

I have the following table:
create table test(
id serial primary key,
firstname varchar(32),
lastname varchar(64),
id_desc char(8)
);
I need to insert 100 rows of data. Getting the names is no problem - I have two tables one containing ten rows of first names and the other containing ten last names. By doing a insert - select query with a cross join I am able to get 100 rows of data (10x10 cross join).
id_desc contains of eight characters (fixed size is mandatory). It always starts with the same pattern (e.g. abcde) followed by 001, 002 etc. up to 999. I have tried to achieve this with the following statement:
update test set id_desc = 'abcde' || num.id
from (select * from generate_series(1, 100) as id) as num
where num.id = (select id from test where id = num.id);
The statement executes but affects zero rows. I know that the where-clause probably does not make much sense; I have been trying to finally get this to work and just started trying a couple of things. Didn't want to omit it though when posting here because I know it is definitely required.
Laurenz's suggestion fits this specific case very well. I recommend using it.
The rest of this is for the more general case where that simplification is not appropriate.
In my tests this doesn't work in this way.
I think you are better off using a WITH clause and a window function.
WITH ranked_ids (id, rank) AS (
select id, row_number() OVER (rows unbounded preceding)
FROM test
)
update test set id_desc = 'abcde' || ranked_ids.rank
from ranked_ids WHERE test.id = ranked_ids.id;
It should be as simple as
UPDATE test SET id_desc = 'abcde' || to_char(id, 'FM099');

Primary key duplicate in a table-valued parameter in stored procedure

I am using following code to insert date by Table Valued Parameter in my SP. Actually it works when one record exists in my TVP but when it has more than one record it raises the following error :
'Violation of Primary key constraint 'PK_ReceivedCash''. Cannot insert duplicate key in object 'Banking.ReceivedCash'. The statement has been terminated.
insert into banking.receivedcash(ReceivedCashID,Date,Time)
select (select isnull(Max(ReceivedCashID),0)+1 from Banking.ReceivedCash),t.Date,t.Time from #TVPCash as t
Your query is indeed flawed if there is more than one row in #TVPCash. The query to retrieve the maximum ReceivedCashID is a constant, which is then used for each row in #TVPCash to insert into Banking.ReceivedCash.
I strongly suggest finding alternatives rather than doing it this way. Multiple users might run this query and retrieve the same maximum. If you insist on keeping the query as it is, try running the following:
insert into banking.receivedcash(
ReceivedCashID,
Date,
Time
)
select
(select isnull(Max(ReceivedCashID),0) from Banking.ReceivedCash)+
ROW_NUMBER() OVER(ORDER BY t.Date,t.Time),
t.Date,
t.Time
from
#TVPCash as t
This uses ROW_NUMBER to count the row number in #TVPCash and adds this to the maximum ReceivedCashID of Banking.ReceivedCash.

Compact or renumber IDs for all tables, and reset sequences to max(id)?

After running for a long time, I get more and more holes in the id field. Some tables' id are int32, and the id sequence is reaching its maximum value. Some of the Java sources are read-only, so I cannot simply change the id column type from int32 to long, which would break the API.
I'd like to renumber them all. This may be not good practice, but good or bad is not concerned in this question. I want to renumber, especially, those very long IDs like "61789238", "548273826529524324". I don't know why they are so long, but shorter IDs are also easier to handle manually.
But it's not easy to compact IDs by hand because of references and constraints.
Does PostgreSQL itself support of ID renumbering? Or is there any plugin or maintaining utility for this job?
Maybe I can write some stored procedures? That would be very nice so I can schedule it once a year.
The question is old, but we got a new question from a desperate user on dba.SE after trying to apply what is suggested here. Find an answer with more details and explanation over there:
Compacting a sequence in PostgreSQL
The currently accepted answer will fail for most cases.
Typically, you have a PRIMARY KEY or UNIQUE constraint on an id column, which is NOT DEFERRABLE by default. (OP mentions references and constraints.) Such constraints are checked after each row, so you most likely get unique violation errors trying. Details:
Constraint defined DEFERRABLE INITIALLY IMMEDIATE is still DEFERRED?
Typically, one wants to retain the original order of rows while closing gaps. But the order in which rows are updated is arbitrary, leading to arbitrary numbers. The demonstrated example seems to retain the original sequence because physical storage still coincides with the desired order (inserted rows in desired order just a moment earlier), which is almost never the case in real world applications and completely unreliable.
The matter is more complicated than it might seem at first. One solution (among others) if you can afford to remove the PK / UNIQUE constraint (and related FK constraints) temporarily:
BEGIN;
LOCK tbl;
-- remove all FK constraints to the column
ALTER TABLE tbl DROP CONSTRAINT tbl_pkey; -- remove PK
-- for the simple case without FK references - or see below:
UPDATE tbl t -- intermediate unique violations are ignored now
SET id = t1.new_id
FROM (SELECT id, row_number() OVER (ORDER BY id) AS new_id FROM tbl) t1
WHERE t.id = t1.id;
-- Update referencing value in FK columns at the same time (if any)
SELECT setval('tbl_id_seq', max(id)) FROM tbl; -- reset sequence
ALTER TABLE tbl ADD CONSTRAINT tbl_pkey PRIMARY KEY(id); -- add PK back
-- add all FK constraints to the column back
COMMIT;
This is also much faster for big tables, because checking PK (and FK) constraint(s) for every row costs a lot more than removing the constraint(s) and adding it (them) back.
If there are FK columns in other tables referencing tbl.id, use data-modifying CTEs to update all of them.
Example for a table fk_tbl and a FK column fk_id:
WITH u1 AS (
UPDATE tbl t
SET id = t1.new_id
FROM (SELECT id, row_number() OVER (ORDER BY id) AS new_id FROM tbl) t1
WHERE t.id = t1.id
RETURNING t.id, t1.new_id -- return old and new ID
)
UPDATE fk_tbl f
SET fk_id = u1.new_id -- set to new ID
FROM u1
WHERE f.fk_id = u1.id; -- match on old ID
More in the referenced answer on dba.SE.
Assuming your ids are generated from a bignum sequence, just RESTART the sequence and update the table with idcolumn = DEFAULT.
CAVEAT: If this id column is used as a foreign key by other tables, make sure you have the on update cascade modifier turned on.
For example:
Create the table, put some data in, and remove a middle value:
db=# create sequence xseq;
CREATE SEQUENCE
db=# create table foo ( id bigint default nextval('xseq') not null, data text );
CREATE TABLE
db=# insert into foo (data) values ('hello'), ('world'), ('how'), ('are'), ('you');
INSERT 0 5
db=# delete from foo where data = 'how';
DELETE 1
db=# select * from foo;
id | data
----+-------
1 | hello
2 | world
4 | are
5 | you
(4 rows)
Reset your sequence:
db=# ALTER SEQUENCE xseq RESTART;
ALTER SEQUENCE
Update your data:
db=# update foo set id = DEFAULT;
UPDATE 4
db=# select * from foo;
id | data
----+-------
1 | hello
2 | world
3 | are
4 | you
(4 rows)
new id column and Foreign Key(s) while the old ones are still in use. With some (quick) renaming, applications do not have to be aware. (But applications should be inactive during the final renaming step)
\i tmp.sql
-- the test tables
CREATE TABLE one (
id serial NOT NULL PRIMARY KEY
, payload text
);
CREATE TABLE two (
id serial NOT NULL PRIMARY KEY
, the_fk INTEGER REFERENCES one(id)
ON UPDATE CASCADE ON DELETE CASCADE
);
-- And the supporting index for the FK ...
CREATE INDEX ON two(the_fk);
-- populate
INSERT INTO one(payload)
SELECT x::text FROM generate_series(1,1000) x;
INSERT INTO two(the_fk)
SELECT id FROM one WHERE random() < 0.3;
-- make some gaps
DELETE FROM one WHERE id % 13 > 0;
-- SELECT * FROM two;
-- Add new keycolumns to one and two
ALTER TABLE one
ADD COLUMN new_id SERIAL NOT NULL UNIQUE
;
-- UPDATE:
-- This could need DEFERRABLE
-- Note since the update is only a permutation of the
-- existing values, we dont need to reset the sequence.
UPDATE one SET new_id = self.new_id
FROM ( SELECT id, row_number() OVER(ORDER BY id) AS new_id FROM one ) self
WHERE one.id = self.id;
ALTER TABLE two
ADD COLUMN new_fk INTEGER REFERENCES one(new_id)
;
-- update the new FK
UPDATE two t
SET new_fk = o.new_id
FROM one o
WHERE t.the_fk = o.id
;
SELECT * FROM two;
-- The crucial part: the final renaming
-- (at this point it would be better not to allow other sessions
-- messing with the {one,two} tables ...
-- --------------------------------------------------------------
ALTER TABLE one DROP COLUMN id CASCADE;
ALTER TABLE one rename COLUMN new_id TO id;
ALTER TABLE one ADD PRIMARY KEY(id);
ALTER TABLE two DROP COLUMN the_fk CASCADE;
ALTER TABLE two rename COLUMN new_fk TO the_fk;
CREATE INDEX ON two(the_fk);
-- Some checks.
-- (the automatically generated names for the indexes
-- and the sequence still contain the "new" names.)
SELECT * FROM two;
\d one
\d two
UPDATE: added the permutation of new_id (after creating it as a serial)
Funny thing is: it doesn't seem to need 'DEFERRABLE'.
*This script will work for postgresql
This is a generic solution that works for all cases
This query find the desciption of the fields of all tables from any database.
WITH description_bd AS (select colum.schemaname,coalesce(table_name,relname) as table_name , column_name, ordinal_position, column_default, data_type, is_nullable, character_maximum_length, is_updatable,description from
( SELECT columns.table_schema as schemaname,columns.table_name, columns.column_name, columns.ordinal_position, columns.column_default, columns.data_type, columns.is_nullable, columns.character_maximum_length, columns.character_octet_length, columns.is_updatable, columns.udt_name
FROM information_schema.columns
) colum
full join (SELECT schemaname, relid, relname,objoid, objsubid, description
FROM pg_statio_all_tables ,pg_description where pg_statio_all_tables.relid= pg_description.objoid ) descre
on descre.relname = colum.table_name and descre.objsubid=colum.ordinal_position and descre.schemaname=colum.schemaname )
This query propose a solution to fix the sequence of all database tables (this generates a query in the req field which fixes the sequence of the different tables).
It finds the number of records of the table and then increment this number by one.
SELECT table_name, column_name, ordinal_position,column_default,
data_type, is_nullable, character_maximum_length, is_updatable,
description,'SELECT setval('''||schemaname||'.'|| replace(replace(column_default,'''::regclass)',''),'nextval(''','')||''', (select max( '||column_name ||')+1 from '|| table_name ||' ), true);' as req
FROM description_bd where column_default like '%nextva%'
Since I didn't like the answers, I wrote a function in PL/pgSQL to do the job.
It is called like this :
=> SELECT resequence('port','id','port_id_seq');
resequence
--------------
5090 -> 3919
Takes 3 parameters
name of table
name of column that is SERIAL
name of sequence that the SERIAL uses
The function returns a short report of what it has done, with the previous value of the sequence and the new value.
The function LOOPs over the table ORDERed by the named column and makes an UPDATE for each row. Then sets the new value for the sequence. That's it.
The order of the values is preserved.
No ADDing and DROPing of temporary columns or tables involved.
No DROPing and ADDing of constraints and foreign keys needed.
Of course You better have ON UPDATE CASCADE for those foreign keys.
The code :
CREATE OR REPLACE FUNCTION resequence(_tbl TEXT, _clm TEXT, _seq TEXT) RETURNS TEXT AS $FUNC$
DECLARE
_old BIGINT;_new BIGINT := 0;
BEGIN
FOR _old IN EXECUTE 'SELECT '||_clm||' FROM '||_tbl||' ORDER BY '||_clm LOOP
_new=_new+1;
EXECUTE 'UPDATE '||_tbl||' SET '||_clm||'='||_new||' WHERE '||_clm||'='||_old;
END LOOP;
RETURN (nextval(_seq::regclass)-1)||' -> '||setval(_seq::regclass,_new);
END $FUNC$ LANGUAGE plpgsql;