Join column from a table with concat columns from another one - postgresql

I am trying to match names from one table against a concatenation of columns in another table, using Postgres.
What I have:
Table A:
id,name
1,John Smith
2,Laura Doe Van Renburg
3,Laura Thorpe
4,Carl Leonard Dong
Table B:
id,firstname,lastname
1,Aloys,Smith
2,Laura,Doe Van Renburg
3,Pedro,De Mung
4,Carl Leonard,Dong
The result I expect
Laura Doe Van Renburg
Carl Leonard Dong
What I tried
I think concatenating the columns firstname and lastname from table B could help, but I can't figure out the correct syntax.
select A.name from A
join (select concat(firstname,' ',lastname) from B) as firstandlast
on a.name = firstandlast;
But it's not the correct way. Any clue would be welcome!

You were close:
select a.name
from table_a a
join table_b b on concat(b.firstname, ' ', b.lastname) = a.name
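For comparison, the subquery form attempted in the question can also be made to work. This is a minimal sketch, assuming the tables are named A and B as in the question; the column alias fullname is made up for illustration. The original attempt failed because the computed column had no name to refer to, and the join condition compared a.name with the subquery alias rather than with a column.

```sql
-- Sketch only: subquery variant of the join above (PostgreSQL).
select a.name
from A as a
join (
    -- give the concatenated value a column alias (fullname is a hypothetical name)
    select concat(firstname, ' ', lastname) as fullname
    from B
) as firstandlast
  on a.name = firstandlast.fullname;  -- compare against the column, not the subquery alias
```

Note that in PostgreSQL concat() ignores NULL arguments, whereas the || operator returns NULL if either side is NULL.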

Related

Redshift : Join the tables with like condition

I want to join TableA to TableB when the value in column_A matches part or all of the string in Column_B of TableB.
Ex:
TableA:
column_A
Denver
Chicago
Newyork
Dallas
TableB:
Column_B
Chicago
Newyork, Dallas
$Denver
Expected Result
column_A  Column_B
Denver    $Denver
Chicago   Chicago
Newyork   Newyork, Dallas
Dallas    Newyork, Dallas
I am trying this -
SELECT *
FROM TableA a
JOIN TableB b ON a.Column_A LIKE '%' || b.Column_B || '%'
The above concat seems to work for single values but not where we have commas. Any suggestions would be appreciated. Thanks.
I think you can simply use the SIMILAR TO operator. See the Redshift documentation on pattern-matching conditions.
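Another possible reading of the problem is that the LIKE operands in the question are the wrong way round: the pattern should be built from the short value (column_A) and tested against the longer string (Column_B). A hedged sketch, using only the table and column names from the question:

```sql
-- Sketch only: same join as in the question, with the LIKE operands swapped.
SELECT a.Column_A, b.Column_B
FROM TableA a
JOIN TableB b
  ON b.Column_B LIKE ('%' || a.Column_A || '%');  -- is the city contained in Column_B?
```

With the sample data, '$Denver' contains 'Denver' and 'Newyork, Dallas' contains both 'Newyork' and 'Dallas', so all four expected rows are returned. Keep in mind that this is a plain substring match, so a value such as 'York' would also match 'Newyork', and any % or _ characters stored in Column_A act as wildcards.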

Concat Names against row_number() or similar function

My data repeats rows for individual relationships between people. For example, the rows below state that John Smith is known by 3 employees:
Person      EmployeeWhoKnowsPerson
John Smith  Derek Jones
John Smith  Adrian Daniels
John Smith  Peter Low
I am looking to do the following:
1) Count the number of people who know John Smith. I have done this via the row_number() function and it appears to be behaving:
select Person, MAX(rowrank) as rowrank
from (
select Person, EmployeeWhoKnowsPerson, rowrank=ROW_NUMBER() over (partition by Person order by EmployeeWhoKnowsPerson desc)
from Data
) as t
group by Person
Which returns:
Person rowrank
John Smith 3
But now I would like to concatenate the EmployeeWhoKnowsPerson column into a single value per person, and was wondering how this might be possible:
Person      rowrank  EmployeesWhoKnow
John Smith  3        Derek Jones, Adrian Daniels, Peter Low
For SQL Server 2017+:
select
person,
count(*) as KnowsCount,
string_agg(EmployeeWhoKnowsPerson, ',') WITHIN GROUP (ORDER BY EmployeeWhoKnowsPerson ASC) AS EmployeesWhoKnowPerson
from
data
group by person;
For prior versions:
select
person,
count(*) as KnowsCount,
stuff((select ',' + EmployeeWhoKnowsPerson
from data as dd
where dd.Person = d.Person
order by EmployeeWhoKnowsPerson
for xml path('')), 1, 1, '') AS EmployeesWhoKnowPerson
from
data as d
group by person;
And you're overthinking the whole "count of who knows" piece: count(*) with group by already gives you that number.
Here's a SQL Fiddle Demo with an extra name thrown in.
If 2017+, you can use string_agg() in a simple group by
Example
Declare @YourTable Table ([Person] varchar(50),[EmployeeWhoKnowsPerson] varchar(50))
Insert Into @YourTable Values
('John Smith','Derek Jones')
,('John Smith','Adrian Daniels')
,('John Smith','Peter Low')
Select Person
,rowrank = sum(1)
,[EmployeeWhoKnowsPerson] = string_agg([EmployeeWhoKnowsPerson],', ')
From @YourTable
Group By Person
Returns
Person      rowrank  EmployeeWhoKnowsPerson
John Smith  3        Derek Jones, Adrian Daniels, Peter Low
If <2017, use the stuff()/xml approach
Select Person
,rowrank = sum(1)
,[EmployeeWhoKnowsPerson] = stuff((Select ', ' + [EmployeeWhoKnowsPerson]
From @YourTable
Where Person=A.Person
For XML Path ('')),1,2,'')
From @YourTable A
Group By Person
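One caveat with the FOR XML PATH('') approach is that characters such as &, < and > in the names come back XML-escaped (for example &amp;amp;). A commonly used variation adds the TYPE directive and converts the result back to text with .value(). A sketch for SQL Server versions before 2017, assuming the table is called Data as in the question:

```sql
-- Sketch only: pre-2017 string aggregation that does not XML-escape the names.
SELECT d.Person,
       COUNT(*) AS KnowsCount,
       STUFF((SELECT ', ' + dd.EmployeeWhoKnowsPerson
              FROM Data AS dd
              WHERE dd.Person = d.Person
              ORDER BY dd.EmployeeWhoKnowsPerson
              -- TYPE returns xml; .value() turns it back into plain text
              FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),
             1, 2, '') AS EmployeesWhoKnowPerson  -- strip the leading ', '
FROM Data AS d
GROUP BY d.Person;
```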

PostgreSQL join with similar address

I am trying to join data from disparate sources. The only common field to join on is the address. In table 1, the address has extra data (representing the neighborhood) between the street and the state. Is there a way to join these tables on the most similar address? I have 85,000 addresses, so a manual search using LIKE and wildcards will not work.
Table 1:
"239 Dudley St Dudley Square Roxbury MA 02119"
"539 Dudley St Dudley Square Roxbury MA 02119"
Table 2:
"239 Dudley St Roxbury MA 02119"
"539 Dudley St Roxbury MA 02119"
I have two suggestions:
1) "All words in the table 2 address are present in the table 1 address":
select *
from t1 join
t2 on (string_to_array(t2.address,' ') <@ string_to_array(t1.address,' '));
2) "For each table 1 address find the most similar address from the table 2":
select distinct on(t1.address) *
from t1 cross join t2
order by t1.address, similarity(t1.address, t2.address) desc;
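Note that similarity() is not built in; it comes from the pg_trgm extension. A cross join over 85,000 addresses also compares every pair of rows. The following is a rough sketch of how the second suggestion might be set up with a trigram filter, assuming both tables have a text column named address; the index name is made up for illustration.

```sql
-- Sketch only: trigram-based "most similar address" join (PostgreSQL with pg_trgm).
create extension if not exists pg_trgm;   -- provides similarity() and the % operator

-- Hypothetical index name; a trigram index can help the % filter below.
create index t2_address_trgm_idx on t2 using gist (address gist_trgm_ops);

select distinct on (t1.address)
       t1.address as t1_address,
       t2.address as t2_address,
       similarity(t1.address, t2.address) as score
from t1
join t2 on t1.address % t2.address        -- keep pairs above pg_trgm.similarity_threshold
order by t1.address, score desc;          -- best match first for each t1 address
```

Unlike the cross join, this returns nothing for a table 1 address whose best candidate falls below the threshold, which may or may not be what you want.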

Combine similar rows using case statement

I have a query currently populating a report which has a few rows of "duplicate" information. Similar IDs are being passed through which should be combined, but they are unique enough that we do not want to Concat/Insert them within our model. For the report to be processed correctly, I need to sum their $ values. (The only information I actually need to preserve is the name, the final summed amount, and the ID.)
Is there a simple way to achieve this with a CASE statement that solely sums the Amount field? I tried using SUM(CASE WHEN ...), but I do not want a new column, since my report only uses that field to populate $$ information. Here is a sample of my issue:
ID       Name       Amount    Person
-------  ---------  --------  --------
21011    Place A     -210.30  John Doe
210115   Place A-a   6500.70  John Doe
21060    Place B      255.00  Wayne C
2106015  Place Bb     212.30  Wayne C
2106015  Place Bb    1212.30  Wayne C
2106015  Place Bb     212.30  Wayne C
21080    Place J    57212.30  Billy J
My desired result for this would be:
ID       Name       Amount    Person
-------  ---------  --------  --------
21011    Place A     6290.40  John Doe
21060    Place B     1891.90  Wayne C
21080    Place J    57212.30  Billy J
Is there a simplified way to combine these rows in TSQL without modifying the db?
You can try this (provided your ID column is a number and not a character field):
;WITH cte_getsum AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY Person ORDER BY ID) AS RowNum,
ID,
NAME,
(SELECT SUM(Amount) FROM TableName WHERE TableName.Person = t1.Person) AS SumAmount,
Person
FROM
TableName t1
)
SELECT * FROM cte_getsum
WHERE rownum = 1
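The same idea can be written without the correlated subquery by using SUM() as a window function. This is a sketch under the same assumptions as the answer above (a numeric ID column and one group per Person); TableName is the placeholder name used there.

```sql
-- Sketch only: per-person total via a window aggregate (T-SQL).
WITH cte_getsum AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY Person ORDER BY ID) AS RowNum,  -- 1 = smallest ID per person
           ID,
           Name,
           SUM(Amount) OVER (PARTITION BY Person) AS SumAmount,            -- total for that person
           Person
    FROM TableName
)
SELECT ID, Name, SumAmount AS Amount, Person
FROM cte_getsum
WHERE RowNum = 1;
```

Selecting SumAmount AS Amount keeps the report's column name, so no extra column is exposed.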
You can try the script below. I created a temp table just for the sample data, but in your case you can refer directly to the table you have.
SELECT * INTO #tmpInput
FROM (VALUES('21011','Place A', -210.30,'John Doe'),
('210115','Place A-a',6500.70,'John Doe'),
('21060', 'Place B' ,255.00,'Wayne C'),
('2106015', 'Place Bb' ,212.30,'Wayne C'),
('2106015' , 'Place Bb' ,1212.30,'Wayne C'),
('2106015' , 'Place Bb' ,212.30 ,'Wayne C')
,('21080' , 'Place J' ,57212.30,'Billy J')
)Input (ID,Name,Amount,Person)
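-- SUBSTRING with start 0 and length 6 returns the first 5 characters (T-SQL positions are 1-based)
SELECT SUBSTRING(t1.ID,0,6) ID
,t2.Name
,SUM(t1.Amount) AMOUNT
,t2.Person
FROM #tmpInput t1
INNER JOIN #tmpInput t2 ON t2.ID=SUBSTRING(t1.ID,0,6)
GROUP BY SUBSTRING(t1.ID,0,6),t2.Name,t2.Person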

Compare 2 tables fields and if they match, copy primary key over to form relation POSTGRESQL

This shows some sample data that I might have (real data is much larger):
table1:
date  forename  surname  PK
1998  john      harry
1928  fred      kale
table2:
date  forename  surname  PK
1998  john      harry    2
1928  fred      kale     98
I need to compare table2 with table1 and if they match then I need to add the same PK from table2 into table1 to form a relation.
EDIT: I would like to add that in table1, the "people" can appear twice but only once in table2.
PostgreSQL:
UPDATE table1
SET FK = table2.PK
FROM table2
WHERE
table1.date = table2.date
AND table1.forename = table2.forename
AND table1.surname = table2.surname
SQL Server:
UPDATE t1
SET FK = t2.PK
FROM
table1 t1 INNER JOIN table2 t2
ON t1.date = t2.date
AND t1.forename = t2.forename
AND t1.surname = t2.surname
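For the PostgreSQL variant, an end-to-end sketch might look like the following. It assumes the target column in table1 is called fk, as in the answers above, and that an integer type fits the keys in table2; both are assumptions rather than something the question states.

```sql
-- Sketch only: add the column, copy the keys, then check what was left unmatched (PostgreSQL).
alter table table1 add column if not exists fk integer;  -- assumed name and type

update table1
set fk = table2.pk
from table2
where table1.date = table2.date
  and table1.forename = table2.forename
  and table1.surname = table2.surname;

-- Rows that found no partner in table2 keep a NULL fk.
select table1.date, table1.forename, table1.surname
from table1
where table1.fk is null;
```

Because each person appears only once in table2, a person who appears twice in table1 simply receives the same key on both rows.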