Join and Concatenate rows from table into into string - postgresql

I have 2 tables consider table named as fp and batch, I have to join 2 tables based on fp[primary key] of 1st table and fp_inst_id from 2nd table such that my output is :
First table all columns and 2nd table one column which is concatenated string of all the rows from join of table 1 and table 2 on fp.id and batch.fp_inst_id.
Note :
[there will be multiple fp_inst_id(of table 2) for unique ID(of table 1)]
Let me give you an example :
Created tables :
CREATE TABLE fp (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
CREATE TABLE batch (
batchID int,
fp_inst_id int,
xyz varchar(255),
abc varchar(255)
);
insert into fp values(1,'savan','nahar','abc','xyz');
insert into fp values(2,'mmm','asmd','aawd','12k3mn');
insert into batch values(1,1,'garbage1', 'abc1');
insert into batch values(2,1,'garbage2', 'abc2');
insert into batch values(3,1,'garbage3', 'abc3');
insert into batch values(4,2,'garbage9', 'abc9');
If i do normal join like this :
select * from fp join batch on fp.PersonID = batch.fp_inst_id;
What I want is :
Batch columns can be different like it's ok if it has some other delimiter of not surrounded by [] and separated on ';' or something.
What I have tried:
The same thing can be done using MYSQL using STUFF, FOR XML PATH
But it seems to be difficult in POSTGRES SQL as it doesn't support these things,
In POSTGRES SQL I tried string_agg, but it says me to group by everything
2nd thing I was trying was :
Using with clause first create the concatenated strings of table 2 group by on fp_inst_id, but in POSTGRES SQL, it allows group by on primary key(which is normal select) or it asks to use the aggregate function
I'm trying to do this in POSTGRES SQL through a query.
Thanks for the help in advance

Use array_agg to combine the batch rows and group-by to bracket the combination.
select personid,lastname,firstname,address,city,
array_agg(batch)
from fp
join batch on fp.PersonID = batch.fp_inst_id
group by personid,lastname,firstname,address,city;
eg:
jasen=# select personid,lastname,firstname,address,city,array_agg(batch) from fp join batch on fp.PersonID = batch.fp_inst_id group by 1,2,3,4,5;
personid | lastname | firstname | address | city | array_agg
----------+----------+-----------+---------+--------+---------------------------------------------------------------------
2 | mmm | asmd | aawd | 12k3mn | {"(4,2,garbage9,abc9)"}
1 | savan | nahar | abc | xyz | {"(1,1,garbage1,abc1)","(2,1,garbage2,abc2)","(3,1,garbage3,abc3)"}
here the batch column technically contains an array of tuples, but the sting representation seems acceptable.

Alternatively you can use concat_ws() to concat the values and then group by
select personid,lastname,firstname, address,city, array_agg(batch_columns) as batch_columns
from
(select fp.*, concat_ws(' / ',batch.batchid,batch.fp_inst_id, batch.xyz,batch.abc)::text as batch_columns
from fp
join batch
on fp.personid=batch.fp_inst_id)as table1
group by 1,2,3,4,5;
personid | lastname | firstname | address | city | batch_columns
----------+----------+-----------+---------+--------+---------------------------------------------------------------------------------
1 | savan | nahar | abc | xyz | {"1 / 1 / garbage1 / abc1","2 / 1 / garbage2 / abc2","3 / 1 / garbage3 / abc3"}
2 | mmm | asmd | aawd | 12k3mn | {"4 / 2 / garbage9 / abc9"}

Related

Join Same Column from Same Table Twice

I am new to the SQL world. I would like to replace the Games.home_team_id and Games.away_team_id with the Corresponding entry in the Teams.name column.
First I start by initializing a small table of data:
CREATE TABLE Games (id,away_team_id INT,away_team_score INT,home_team_id INT, home_team_score INT);
CREATE TABLE
INSERT INTO Games (id,away_team_id,away_team_score,home_team_id,home_team_score)
VALUES
(1,1,1,2,4),
(2,1,3,3,2),
(3,1,1,4,1),
(4,2,0,3,2),
(5,2,3,4,1),
(6,3,5,4,2)
;
INSERT 0 6
Then I create a template of a reference table
CREATE TABLE Teams (id INT, name VARCHAR(63);
CREATE TABLE
INSERT INTO Teams (id, name)
VALUES
(1, 'Oogabooga FC'),
(2, 'FC Milawnchair'),
(3, 'Ron\'s Footy United'),
(4, 'Pylon City FC')
;
INSERT 0 4
I would like to have the table displayed as such:
| id | away_team_name | away_team_score | home_team_name | home_team_score |
-----+----------------+-----------------+----------------+------------------
| 1 | Oogabooga FC | 1 | FC Milawnchair | 4 |
...
I managed to get a join query to show the first value from Teams.name in the away_team_name field using this JOIN:
SELECT
Games.id,
Teams.name AS away_team_name,
Games.away_team_score,
Teams.name AS home_team_name,
Games.home_team_score
FROM Games
JOIN Teams ON Teams.id = Games.away_team_id
;
| id | away_team_name | away_team_score | home_team_name | home_team_score |
-----+----------------+-----------------+----------------+------------------
| 1 | Oogabooga FC | 1 | Oogabooga FC | 4 |
...
But now I am stuck when I call it twice as a JOIN it shows the error:
SELECT
Games.id,
Teams.name AS away_team_name,
Games.away_team_score,
Teams.name AS home_team_name,
Games.home_team_score
FROM Games
JOIN Teams ON Teams.id = Games.away_team_id
JOIN Teams ON Teams.id = Games.home_team_id
;
ERROR: table name "teams" specified more than once
How do you reference the same reference the same column of the same table twice for a join?
You need to specify an alias for at least one of the instances of the table; preferably both.
SELECT
Games.id,
Away.name AS away_team_name,
Games.away_team_score,
Home.name AS home_team_name,
Games.home_team_score
FROM Games
JOIN Teams AS Away ON Away.id = Games.away_team_id
JOIN Teams AS Home ON Home.id = Games.home_team_id
Explanation: As you are joining to the same table twice, the DBMS (in your case, PostgreSQL) is unable to identify which of the tables you're referencing to when using its fields; the way to solve this is to assign an alias to the joined tables the same way you assign aliases for your columns. This way you can specify which of the joined instances are you referencing to in your SELECT, JOIN and WHERE statements.

How to remove bulk rows of duplicated chars from string in Postgres or SQLAlchemy?

I have a table with a column named "ids" , type of String. Could someone tell me how to remove the duplicated values in each of the rows?
Example, table is:
--------------------------------------------------
primary_key | ids
--------------------------------------------------
1 | {23,40,23}
--------------------------------------------------
2 | {78,40,13,78}
--------------------------------------------------
3 | {20,13,20}
--------------------------------------------------
4 | {7,2,7}
--------------------------------------------------
and I want to update it into:
--------------------------------------------------
primary_key | ids
--------------------------------------------------
1 | {23,40}
--------------------------------------------------
2 | {78,40,13}
--------------------------------------------------
3 | {20,13}
--------------------------------------------------
4 | {7,2}
--------------------------------------------------
In postgres I wrote:
UPDATE table_name
SET ids = (SELECT DISTINCT UNNEST(
(SELECT ids FROM table_name)::text[]))
In sqlalchemy I wrote:
session.query(table_name.ids).\
update({table_name.ids: func.unnest(table_name.ids,String).alias('data_view')},
synchronize_session=False)
None of these are working, so please help me, thanks in advance!
I think you could improve the design by storing these ids in another table one id per row with a foreign key referencing table_name.primary_key.
Also storing Array data as text strings seems strange.
Anyway, here is one way to do it: I wrapped the set returned by UNNEST with an inner subselect to be able to apply the aggregate_function needed to concatenate the strings again.
UPDATE table_name
SET ids = new_ids
FROM LATERAL (
SELECT primary_key, array_agg(elem)::text AS new_ids
FROM (SELECT DISTINCT primary_key, UNNEST(ids::text[]) as elem
FROM table_name ) t_inner
GROUP by primary_key )t_sub
WHERE t_sub.primary_key = table_name.primary_key

Fetch records with distinct value of one column while replacing another col's value when multiple records

I have 2 tables that I need to join based on distinct rid while replacing the column value with having different values in multiple rows. Better explained with an example set below.
CREATE TABLE usr (rid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(12) NOT NULL,
email VARCHAR(20) NOT NULL);
CREATE TABLE usr_loc
(rid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
code CHAR NOT NULL PRIMARY KEY,
loc_id INT NOT NULL PRIMARY KEY);
INSERT INTO usr VALUES
(1,'John','john#product'),
(2,'Linda','linda#product'),
(3,'Greg','greg#product'),
(4,'Kate','kate#product'),
(5,'Johny','johny#product'),
(6,'Mary','mary#test');
INSERT INTO usr_loc VALUES
(1,'A',4532),
(1,'I',4538),
(1,'I',4545),
(2,'I',3123),
(3,'A',4512),
(3,'A',4527),
(4,'I',4567),
(4,'A',4565),
(5,'I',4512),
(6,'I',4567);
(6,'I',4569);
Required Result Set
+-----+-------+------+-----------------+
| rid | name | Code | email |
+-----+-------+------+-----------------+
| 1 | John | B | 'john#product' |
| 2 | Linda | I | 'linda#product' |
| 3 | Greg | A | 'greg#product' |
| 4 | Kate | B | 'kate#product' |
| 5 | Johny | I | 'johny#product' |
| 6 | Mary | I | 'mary#test' |
+-----+-------+------+-----------------+
I have tried some queries to join and some to count but lost with the one which exactly satisfies the whole scenario.
The query I came up with is
SELECT distinct(a.rid)as rid, a.name, a.email, 'B' as code
FROM usr
JOIN usr_loc b ON a.rid=b.rid
WHERE a.rid IN (SELECT rid FROM usr_loc GROUP BY rid HAVING COUNT(*) > 1);`
You need to group by the users and count how many occurrences you have in usr_loc. If more than a single one, then replace the code by B. See below:
select
rid,
name,
case when cnt > 1 then 'B' else min_code end as code,
email
from (
select u.rid, u.name, u.email, min(l.code) as min_code, count(*) as cnt
from usr u
join usr_loc l on l.rid = u.rid
group by u.rid, u.name, u.email
) x;
Seems to me that you are using MySQL, rather than IBM DB2. Is that so?

Need cleaner update method in PostgreSQL 9.1.3

Using PostgreSQL 9.1.3 I have a points table like so (What's the right way to show tables here??)
| Column | Type | Table Modifiers | Storage
|--------|-------------------|-----------------------------------------------------|----------|
| id | integer | not null default nextval('points_id_seq'::regclass) | plain |
| name | character varying | not null | extended |
| abbrev | character varying | not null | extended |
| amount | real | not null | plain |
In another table, orders I have a bunch of columns where the name of the column exists in the points table via the abbrev column, as well as a total_points column
| Column | Type | Table Modifiers |
|--------------|--------|--------------------|
| ud | real | not null default 0 |
| sw | real | not null default 0 |
| prrv | real | not null default 0 |
| total_points | real | default 0 |
So in orders I have the sw column, and in points I'll now have an amount that realtes to the column where abbrev = sw
I have about 15 columns like that in the points table, and now I want to set a trigger so that when I create/update an entry in the points table, I calculate a total score. Basically with just those three shown I could do it long-hand like this:
UPDATE points
SET total_points =
ud * (SELECT amount FROM points WHERE abbrev = 'ud') +
sw * (SELECT amount FROM points WHERE abbrev = 'sw') +
prrv * (SELECT amount FROM points WHERE abbrev = 'prrv')
WHERE ....
But that's just plain ugly and repetative, and like I said there are really 15 of them (right now...). I'm hoping there's a more sophisticated way to handle this.
In general each of those silly names on the orders table represents a type of work associated with the order, and each of those types has a 'cost' to it, which is stores in the points table. I'm not married to this structure if there's a cleaner setup.
"Serialize" the costs for orders:
CREATE TABLE order_cost (
order_cost_id serial PRIMARY KEY
, order_id int NOT NULL REFERENCES order
, cost_type_id int NOT NULL REFERENCES points
, cost int NOT NULL DEFAULT 0 -- in Cent
);
For a single row:
UPDATE orders o
SET total_points = COALESCE((
SELECT sum(oc.cost * p.amount) AS order_cost
FROM order_cost oc
JOIN points p ON oc.cost_type_id = p.id
WHERE oc.order_id = o.order_id
), 0);
WHERE o.order_id = $<order_id> -- your order_id here ...
Never use the lossy type real for currency data. Use exact types like money, numeric or just integer - where integer is supposed to store the amount in Cent.
More advice in this closely related example:
How to implement a many-to-many relationship in PostgreSQL?

how to make array_agg() work like group_concat() from mySQL

So I have this table:
create table test (
id integer,
rank integer,
image varchar(30)
);
Then some values:
id | rank | image
---+------+-------
1 | 2 | bbb
1 | 3 | ccc
1 | 1 | aaa
2 | 3 | c
2 | 1 | a
2 | 2 | b
I want to group them by id and concatenate the image name in the order given by rank. In mySQL I can do this:
select id,
group_concat( image order by rank asc separator ',' )
from test
group by id;
And the output would be:
1 aaa,bbb,ccc
2 a,b,c
Is there a way I can have this in postgresql?
If I try to use array_agg() the names will not show in the correct order and apparently I was not able to find a way to sort them. (I was using postgres 8.4 )
In PostgreSQL 8.4 you cannot explicitly order array_agg but you can work around it by ordering the rows passed into to the group/aggregate with a subquery:
SELECT id, array_to_string(array_agg(image), ',')
FROM (SELECT * FROM test ORDER BY id, rank) x
GROUP BY id;
In PostgreSQL 9.0 aggregate expressions can have an ORDER BY clause:
SELECT id, array_to_string(array_agg(image ORDER BY rank), ',')
FROM test
GROUP BY id;