PostgreSQL Populate Column with Data - postgresql

I am trying to populate rows that have a NULL name with data from the same table. Here is my code:
create table public.testdata(
id INTEGER,
person INTEGER,
name varchar(10));
insert into testdata (id, person,name) VALUES ( 1,1,'Jane' ), ( 2,1,'Jane' ), ( 3,1,NULL ), ( 4,2,'Tom' ), ( 5,2,NULL );
select * from testdata;
Basically, I would like to have the name 'Jane' in the 3rd row and the name 'Tom' in the 5th.
Here is the answer I found online to a similar problem:
Update testdata
SET name = COALESCE(a1.name, b1.name)
FROM testdata a1
JOIN testdata b1
on a1.person = b1.person
and a1.id <> b1.id
where a1.name is NULL;
But if I run this code, I get the name 'Jane' in every row, which is not what I want. I appreciate any help and suggestions.

Example for you:
select t1.id, t1.person, t2.name from testdata t1
left join
(
select distinct person, name from testdata
where name is not null
) t2 on t1.person = t2.person
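The SELECT above only displays the filled-in names. One way to turn it into an actual update is PostgreSQL's UPDATE ... FROM, sketched below under the assumption that each person has at most one distinct non-null name. Note that the target table is not repeated in the FROM list: the attempt in the question lists testdata again as a1 and b1 without tying either to the row being updated, which is why every row received the same value.

```sql
-- Sketch: fill NULL names from other rows of the same person.
-- Assumes at most one distinct non-null name per person.
UPDATE testdata t1
SET name = t2.name
FROM (
    SELECT DISTINCT person, name
    FROM testdata
    WHERE name IS NOT NULL
) t2
WHERE t1.person = t2.person   -- correlates the target row with the source row
  AND t1.name IS NULL;        -- only touch rows that are missing a name
```

Rows whose person has no named row at all are simply left unchanged.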

Get the person (id?) and the desired name via a CTE, then use the results to update the names (see demo):
with namer (person, name) as
( select distinct on (person)
person, name
from testdata
where name is not null
order by person, name
)
update testdata d1
set name = (select n1.name
from namer n1
where n1.person = d1.person
)
where d1.name is null;
NOTE: The demo contains additional rows whose entry sequence is not ideal, and not every person value has an associated name.

Related

Filter taking too much time in PostgreSQL on gender field

I have a table with more than 100M rows that looks like this:
Create table member (
id bigint,
gender text,
-- ...other fields
primary key (id)
);
The gender field has two possible values, 'M' or 'F'.
Whenever I use the gender field, the query takes too much time. I have indexes on other fields such as id, member details and mobile number.
select
count(1) filter (where mod.is_active and m.gender = 'M') as male,
count(1) filter (where mod.is_active and m.gender = 'F') as female
from member_other_details mod
inner join member m on m.id = mod.member_id
This query takes hours to complete.
How can I optimize this?
Personally, I would execute this query:
select m.gender,count(*)
from member_other_details mod inner join member m on m.id = mod.member_id
where mod.is_active
group by m.gender
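The rewritten query still has to join both tables, so indexing may matter as much as the query shape. The following is only a sketch: it assumes that a small share of member_other_details rows are active and that no suitable index exists yet. The index name is hypothetical, and whether the planner actually uses the index depends on the data distribution, so the plan should be checked with EXPLAIN.

```sql
-- Hypothetical partial index: covers only the active rows used by the query.
CREATE INDEX member_other_details_active_idx
    ON member_other_details (member_id)
    WHERE is_active;

-- Inspect the plan. Note that ANALYZE actually executes the query,
-- so on a table of this size it can take as long as the query itself.
EXPLAIN (ANALYZE, BUFFERS)
SELECT m.gender, count(*)
FROM member_other_details mod
INNER JOIN member m ON m.id = mod.member_id
WHERE mod.is_active
GROUP BY m.gender;
```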

Getting jsonb field names from query result

I have two tables like this:
create table product (
id serial primary key,
name text
);
create table selectedattribute (
id serial primary key,
product integer references product,
attribute text,
val text
);
and I'm creating a materialized view with this select query
select product.name,
jsonb_build_object(
'color', COALESCE(jsonb_agg(val) FILTER (WHERE attribute='color'), '[]'),
'diameter', COALESCE(jsonb_agg(val) FILTER (WHERE attribute='diameter'), '[]')
)
from product
left join selectedattribute on product.id = selectedattribute.product
group by product.id;
The problem with this select query is that when I add a new attribute, I have to add it to the select query in order to create an up-to-date materialized view.
Is there a way to write an aggregate expression that dynamically gets attributes without all these hard-coded attribute names?
You can try my code in SQL Fiddle: http://sqlfiddle.com/#!17/c4150/4
You need to nest the aggregation. First collect all values for an attribute then aggregate that into a JSON:
select id, name, jsonb_object_agg(attribute, vals)
from (
select p.id, p.name, a.attribute, jsonb_agg(a.val) vals
from product p
left join selectedattribute a on p.id = a.product
group by p.id, a.attribute
) t
group by id, name;
Updated SQLFiddle: http://sqlfiddle.com/#!17/c4150/5
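One caveat with the nested query: because of the LEFT JOIN, a product with no attributes yields a row whose attribute is NULL, and jsonb_object_agg raises an error on a NULL key. A minimal sketch of one way to guard against that, assuming such products should get an empty object (the output alias attributes is illustrative):

```sql
-- Sketch: skip NULL keys and fall back to an empty object.
SELECT id,
       name,
       COALESCE(
           jsonb_object_agg(attribute, vals) FILTER (WHERE attribute IS NOT NULL),
           '{}'::jsonb
       ) AS attributes
FROM (
    -- Inner level: one row per product and attribute, values collected into an array.
    SELECT p.id, p.name, a.attribute, jsonb_agg(a.val) AS vals
    FROM product p
    LEFT JOIN selectedattribute a ON p.id = a.product
    GROUP BY p.id, a.attribute
) t
GROUP BY id, name;
```

The FILTER clause keeps the NULL-key row out of the aggregate, and COALESCE replaces the resulting NULL with an empty JSON object.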

Joined aggregate function update?

I'm trying to denormalize an aggregate bit of data for performance, but can't figure out how to get the aggregation working...
CREATE TABLE brands (
id SERIAL,
name TEXT,
total INTEGER,
unitcount INTEGER
);
CREATE TABLE items (
brandid INTEGER,
id SERIAL,
unitvalue INTEGER
);
UPDATE brands SET b.total = i.sumScore,
b.unitcount = i.unitcount
FROM brands b
INNER JOIN
(
SELECT brandid,
SUM(unitvalue) sumScore,
COUNT(unitvalue) unitcount
FROM items
group by brandid
) as a
ON i.brandid = b.id
This updates EVERY record in brand with the same values, despite the inner join query showing a correct table set of distinct values for each brand. How can I get that correlated?
Please try:
UPDATE brands b
INNER JOIN (
SELECT brandid,SUM(unitvalue) value_to_update,COUNT(unitvalue) cnt
FROM items GROUP BY brandid) abc
ON b.id=abc.brandid
SET b.total=abc.value_to_update,b.unitcount=abc.cnt
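The statement above uses the UPDATE ... JOIN ... SET form, which is MySQL syntax and is not accepted by PostgreSQL. A sketch of the PostgreSQL form, UPDATE ... FROM, using the tables from the question follows; the alias agg and the column names sum_value and cnt are illustrative. It also shows why the original attempt updated every row with the same values: listing brands a second time in FROM creates a separate copy of the table that is never tied to the row being updated.

```sql
-- Sketch: correlate the aggregate with the target row in the WHERE clause.
UPDATE brands b
SET total     = agg.sum_value,   -- SET columns are not prefixed with the alias
    unitcount = agg.cnt
FROM (
    SELECT brandid,
           SUM(unitvalue)   AS sum_value,
           COUNT(unitvalue) AS cnt
    FROM items
    GROUP BY brandid
) agg
WHERE b.id = agg.brandid;
```

Brands that have no rows in items are not matched and keep their current values.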

PostgreSQL unable to create view due to "duplicate column"

I am trying to create a pair of country name and country cid for every two countries that are neighbours:
Here's the schema:
CREATE TABLE country (
cid INTEGER PRIMARY KEY,
cname VARCHAR(20) NOT NULL,
height INTEGER NOT NULL,
population INTEGER NOT NULL);
CREATE TABLE neighbour (
country INTEGER REFERENCES country(cid) ON DELETE RESTRICT,
neighbor INTEGER REFERENCES country(cid) ON DELETE RESTRICT,
length INTEGER NOT NULL,
PRIMARY KEY(country, neighbor));
My query:
create view neighbour_pair as (
select c1.cid, c1.cname, c2.cid, c2.cname
from neighbour n join country c1 on c1.cid = n.country
join country c2 on n.neighbor = c2.cid);
I am getting error code 42701 which means that there is a duplicate column.
The actual error message I am getting is:
ERROR: column "cid" specified more than once
********** Error **********
ERROR: column "cid" specified more than once
SQL state: 42701
I am unsure how to get around the error, since I WANT the pair of neighbouring countries with each country's name and cid.
Never mind. I edited the first line of the query and aliased the column names:
create view neighbour_pair as
select c1.cid as c1cid, c1.cname as c1name, c2.cid as c2cid, c2.cname as c2name
from neighbour n join country c1 on c1.cid = n.country
join country c2 on n.neighbor = c2.cid;
I ran into a similar issue recently. I had a query like:
CREATE VIEW pairs AS
SELECT p.id, p.name,
(SELECT count(id) from results
where winner = p.id),
(SELECT count(id) from results
where winner = p.id OR loser = p.id)
FROM players p LEFT JOIN matches m ON p.id = m.id
GROUP BY 1,2;
The error was telling me: ERROR: column "count" specified more than once. The query WAS working via psycopg2, however when I brought it into a .sql file for testing the error arose.
I realized I just needed to alias the 2 count subqueries:
CREATE VIEW pairs AS
SELECT p.id, p.name,
(SELECT count(id) from results
where winner = p.id) as wins,
(SELECT count(id) from results
where winner = p.id OR loser = p.id) as matches
FROM players p LEFT JOIN matches m ON p.id = m.id
GROUP BY 1,2;
You can use an alias with AS. Every column of the view needs a unique name, so the second cname needs an alias as well. For example, your view could be as follows:
create view neighbour_pair as
(
select c1.cid
, c1.cname
, c2.cid AS cid_c2
, c2.cname AS cname_c2
from neighbour n
join country c1 on c1.cid = n.country
join country c2 on n.neighbor = c2.cid
);
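Another option, sketched here with the column names from the asker's own fix, is to name the view's columns in the CREATE VIEW column list instead of aliasing each expression in the SELECT:

```sql
-- Sketch: the column list after the view name supplies unique output names.
CREATE VIEW neighbour_pair (c1cid, c1name, c2cid, c2name) AS
SELECT c1.cid, c1.cname, c2.cid, c2.cname
FROM neighbour n
JOIN country c1 ON c1.cid = n.country
JOIN country c2 ON n.neighbor = c2.cid;
```

Either way, the rule is the same: no two columns of a view may share a name.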

Order by objects relation (PostgreSQL)

I have two tables, for example.
The first has object and parent columns:
object | parent
-------+---------
object1| null
object2| object1
object3| null
The second has object and reference columns:
object | reference
-------+---------
object1| null
object2| null
object3| object1
I need to query the tables so that the result is ordered as follows: the parent first, then its children, then the objects that have references to the parent.
object1
object2
object3
Is it possible to do this in one SQL query, or do I need to sort manually in an array? It seems like a classic task, so a solution probably already exists somewhere.
Is this what you're looking for?
CREATE TABLE oparen (object varchar(10), parent varchar(10));
CREATE TABLE oref (object varchar(10), ref varchar(10));
INSERT INTO oparen VALUES
('object1',null),('object2','object1'),
('object3',null),('object4','object2');
INSERT INTO oref VALUES
('object1',null),('object2',null),('object3','object1'),
('object5','object6'),('object6','object1'),('object7','object4');
WITH hier AS (
SELECT parent AS obj, 1 AS rank FROM oparen
WHERE parent IS NOT NULL
UNION
SELECT object, 2 FROM oparen
WHERE parent IS NOT NULL
UNION
SELECT object, 3 FROM oref
WHERE ref IS NOT NULL),
allobj AS (
SELECT object AS obj FROM oparen
UNION
SELECT object FROM oref)
SELECT a.obj, coalesce(h.rank, 4) AS rank
FROM allobj a LEFT JOIN hier h ON a.obj = h.obj
ORDER BY coalesce(h.rank, 4), a.obj;
EDIT: After the improved example in the answer below, the following query should do the trick:
WITH parents AS (
SELECT parent AS obj, 1 AS rank FROM oparen
WHERE parent IS NOT NULL
),
family AS (
SELECT * FROM parents
UNION ALL
SELECT object, 2 FROM oparen op
WHERE parent IS NOT NULL
AND NOT EXISTS (SELECT obj FROM parents WHERE obj = op.object)
),
hier AS (
SELECT * FROM family
UNION ALL
SELECT object AS obj, coalesce(f.rank + 2, 5) AS rank
FROM oref LEFT JOIN family f ON oref.ref = f.obj
WHERE ref IS NOT NULL
),
allobj AS (
SELECT object AS obj FROM oparen
UNION
SELECT object FROM oref)
SELECT a.obj, h.rank AS rank
FROM allobj a LEFT JOIN hier h ON a.obj = h.obj
ORDER BY h.rank, a.obj;
The testbed creation at the top has been updated according to the new requirements.
I inserted the following data:
INSERT INTO oparen VALUES
('object1',null),('object2','object1'),('object3',null),('object4','object2');
INSERT INTO oref VALUES
('object1',null),('object2',null),('object3','object1'),('object5','object6'),('object6','object1');
The order is incorrect and object2 is listed twice. DISTINCT on obj also breaks the order. object6 should come before object5.
No, that does not work either. I checked with other data and simplified the query to use only the oref table content:
INSERT INTO oref VALUES
('object1',null),('object2',null),('object3','object1'),
('object5','object6'),('object6','object1'),('object7','object4'), ('object4','object5');
WITH family AS (
SELECT object AS obj, 1 AS rank FROM oref
WHERE ref IS NULL
),
hier AS (
SELECT * FROM family
UNION ALL
SELECT object AS obj, coalesce(f.rank + 2, 5) AS rank
FROM oref LEFT JOIN family f ON oref.ref = f.obj
WHERE ref IS NOT NULL
),
allobj AS (
SELECT object AS obj FROM oref)
SELECT a.obj, h.rank AS rank
FROM allobj a
LEFT JOIN hier h ON a.obj = h.obj
ORDER BY h.rank, a.obj;
I think recursive queries are needed here. I will write one and post it here.
The following recursive query works:
WITH RECURSIVE tables(object, rank) AS (
SELECT DISTINCT o.object, 1 AS rank FROM oref o
WHERE o.ref IS NULL
UNION
SELECT o.object, t.rank + 1 AS rank
FROM (SELECT DISTINCT o.object, o.ref FROM oref o
WHERE ref IS NOT NULL) o, tables t
WHERE o.ref = t.object AND rank <= t.rank
),
ordered AS (
SELECT * FROM tables
)
SELECT * FROM tables
WHERE tables.rank = (SELECT MAX(rank) FROM ordered WHERE ordered.object = tables.object)
ORDER BY rank;
Any comments, questions, objections, propositions? ;)
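A slightly shorter variant of the recursive query above, offered as a sketch rather than a replacement: it takes the deepest rank per object with GROUP BY instead of the correlated subquery. It assumes the references in oref contain no cycles, since a cycle would keep the recursion producing ever larger ranks and it would never terminate. The CTE name ranked is illustrative.

```sql
WITH RECURSIVE ranked(object, rank) AS (
    -- Roots: objects that reference nothing.
    SELECT object, 1
    FROM oref
    WHERE ref IS NULL
    UNION
    -- Each referencing object sits one level below the object it references.
    SELECT o.object, r.rank + 1
    FROM oref o
    JOIN ranked r ON o.ref = r.object
)
SELECT object, max(rank) AS rank
FROM ranked
GROUP BY object
ORDER BY rank, object;
-- If oref holds only the seven rows from the last INSERT above, this lists
-- object1, object2 (rank 1), object3, object6 (rank 2), object5 (3),
-- object4 (4), object7 (5).
```

Objects whose reference chain never reaches a root (for example, a reference to an object that is not in oref) do not appear in the result.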