How to upsert based on values from another table? - postgresql

I'm trying to UPSERT into a Postgres DB based on values from another table, using peewee.
**table1**
pk_t1 int
name
city
country
**table2**
pk_t2 int
name
city
country
comments
INSERT INTO table2 (pk_t2, name, city, country)
SELECT pk_t1, name, city, country
FROM table1
ON CONFLICT (pk_t2) DO UPDATE
SET name = excluded.name, city = excluded.city, country = excluded.country;
But I'm unable to find a suitable peewee example in the documentation or on SO.

Here you go:
q = T1.select()
iq = (T2
.insert_from(q, fields=[T2.id, T2.name, T2.city, T2.country])
.on_conflict(conflict_target=[T2.id], preserve=[T2.name, T2.city, T2.country]))
Corresponding SQL peewee generates:
insert into t2 (id, name, city, country)
select t1.id, t1.name, t1.city, t1.country
from t1
on conflict(id) do update set
name=excluded.name,
city=excluded.city,
country=excluded.country

Related

Selecting distinct values

The domain is:
company (id, name, adress)
employee (id, name, adress, company_id, expertise_id)
dependantrelative (id, name, employee_id)
expertise (id, name, class)
I want to know how to get the number of dependantrelatives of each employee who is a unique expert (the only one with a given expertise) in their company.
The query below does not return the correct answer. Can you help me?
SELECT DISTINCT dependantrelative.employee_id
, COUNT(*) AS qty_dependantrelatives
FROM dependantrelative
INNER JOIN employee
ON employee.id = dependantrelative.employee_id
GROUP BY dependantrelative.employee_id
I just tried the query below and it works, but I want to know if there is a faster and simpler way of getting the answer.
SELECT employee.id
,COUNT(dependantrelative.employee_id) AS qty_dependantrelatives
FROM (
SELECT employee.company_id
, employee.expertise_id AS expert
, COUNT(employee.expertise_id)
FROM employee
GROUP BY employee.company_id
, employee.expertise_id
HAVING COUNT(employee.expertise_id)<2
) AS uniexpert
LEFT JOIN employee
ON employee.expertise_id = uniexpert.expert
AND employee.company_id = uniexpert.company_id
LEFT JOIN dependantrelative
ON dependantrelative.employee_id = employee.id
GROUP BY employee.id
ORDER BY employee.id
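To sanity-check the "unique expert" logic, here is a minimal sketch in Python using the standard-library sqlite3 module rather than the asker's database. The rows are made up for illustration, the name/adress/class columns are omitted, and the employee join matches on both company_id and expertise_id so that an expertise that is unique in one company does not pull in employees of another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER PRIMARY KEY, company_id INTEGER, expertise_id INTEGER);
    CREATE TABLE dependantrelative (id INTEGER PRIMARY KEY, employee_id INTEGER);
    -- Hypothetical data. Company 1: employees 1 and 2 share expertise 10,
    -- employee 3 is the only one with expertise 20.
    -- Company 2: employee 4 is the only one with expertise 10.
    INSERT INTO employee VALUES (1, 1, 10), (2, 1, 10), (3, 1, 20), (4, 2, 10);
    INSERT INTO dependantrelative VALUES (1, 1), (2, 3), (3, 3), (4, 2);
    """
)

rows = conn.execute(
    """
    SELECT e.id, COUNT(d.employee_id) AS qty_dependantrelatives
    FROM (
        -- (company, expertise) pairs held by exactly one employee
        SELECT company_id, expertise_id
        FROM employee
        GROUP BY company_id, expertise_id
        HAVING COUNT(*) = 1
    ) AS uniexpert
    JOIN employee e
      ON e.company_id = uniexpert.company_id
     AND e.expertise_id = uniexpert.expertise_id
    LEFT JOIN dependantrelative d
      ON d.employee_id = e.id
    GROUP BY e.id
    ORDER BY e.id
    """
).fetchall()

print(rows)
```

Employees 3 and 4 are the only unique experts here; employee 4 has no relatives, and the LEFT JOIN keeps that row with a count of 0.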

Returning rows with distinct column value with data jpa named query

Assuming I have a table with 3 columns (ID, Name, City) and I want to use a named query to return rows with a unique city, can it be done?
Are you asking whether it is possible to write a query that will return the cities that appear in exactly one row, in a table that has ID/Name/City triplets where there could be multiple rows for the same city but with different names?
If so, it would depend on the database engine behind the scenes - but you could try things like:
with candidates (city, num) as (
select city, count(*) from table
group by city
)
select city from candidates where num = 1
Or
select t1.city from table t1
where not exists (
select * from table t2
where t2.city = t1.city and t2.id <> t1.id
)
where table is your table with these triplets.
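Both variants can be tried out quickly. The sketch below uses Python's built-in sqlite3 module, with a hypothetical table called people and made-up rows standing in for "your table"; it is only meant to show that the two queries agree, not how the named query would be declared in JPA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ann", "Oslo"), (2, "Bob", "Paris"), (3, "Cid", "Paris"), (4, "Dee", "Rome")],
)

# Variant 1: count rows per city, keep the cities counted once.
grouped = conn.execute(
    """
    WITH candidates (city, num) AS (
        SELECT city, COUNT(*) FROM people GROUP BY city
    )
    SELECT city FROM candidates WHERE num = 1 ORDER BY city
    """
).fetchall()

# Variant 2: keep rows for which no other row has the same city.
anti_join = conn.execute(
    """
    SELECT p1.city FROM people p1
    WHERE NOT EXISTS (
        SELECT * FROM people p2
        WHERE p2.city = p1.city AND p2.id <> p1.id
    )
    ORDER BY p1.city
    """
).fetchall()

print(grouped)
print(anti_join)
```

Paris appears twice in the sample rows, so both queries return only Oslo and Rome.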

Using "UNION ALL" and "GROUP BY" to implement "Intersect"

I've written the following query to find the common records in 2 data sets, but it's difficult for me to verify that it is correct because I have a lot of records in my DB.
Is it OK to implement Intersect between the "Customers" & "Employees" tables using UNION ALL and applying GROUP BY on the result, like below?
SELECT D.Country, D.Region, D.City
FROM (SELECT DISTINCT Country, Region, City
FROM Customers
UNION ALL
SELECT DISTINCT Country, Region, City
FROM Employees) AS D
GROUP BY D.Country, D.Region, D.City
HAVING COUNT(*) = 2;
So can we say that any record which exists in the result of this query also exists in the Intersect set between "Customers & Employees" tables AND any record that exists in Intersect set between "Customers & Employees" tables will be in the result of this query too?
So is it right to say any record in result of this query is in
"Intersect" set between "Customers & Employees" "AND" any record that
exist in "Intersect" set between "Customers & Employees" is in result
of this query too?
YES.
... Yes, but it won't be as efficient because you are filtering out duplicates three times instead of once. In your query you're:
1. Using DISTINCT to pull unique records from employees
2. Using DISTINCT to pull unique records from customers
3. Combining both queries using UNION ALL
4. Using GROUP BY in your outer query to filter the records you retrieved in steps 1, 2 and 3.
Using INTERSECT will return identical results but more efficiently. To see for yourself you can create the sample data below and run both queries:
use tempdb
go
if object_id('dbo.customers') is not null drop table dbo.customers;
if object_id('dbo.employees') is not null drop table dbo.employees;
create table dbo.customers
(
customerId int identity,
country varchar(50),
region varchar(50),
city varchar(100)
);
create table dbo.employees
(
employeeId int identity,
country varchar(50),
region varchar(50),
city varchar(100)
);
insert dbo.customers(country, region, city)
values ('us', 'N/E', 'New York'), ('us', 'N/W', 'Seattle'),('us', 'Midwest', 'Chicago');
insert dbo.employees
values ('us', 'S/E', 'Miami'), ('us', 'N/W', 'Portland'),('us', 'Midwest', 'Chicago');
Run these queries:
SELECT D.Country, D.Region, D.City
FROM
(
SELECT DISTINCT Country, Region, City
FROM Customers
UNION ALL
SELECT DISTINCT Country, Region, City
FROM Employees
) AS D
GROUP BY D.Country, D.Region, D.City
HAVING COUNT(*) = 2;
SELECT Country, Region, City
FROM dbo.customers
INTERSECT
SELECT Country, Region, City
FROM dbo.employees;
Results:
Country Region City
----------- ---------- ----------
us Midwest Chicago
Country Region City
----------- ---------- ----------
us Midwest Chicago
If using INTERSECT is not an option OR you want a faster query you could improve the query you posted a couple different ways, such as:
Option 1: let GROUP BY handle ALL the de-duplication like this:
This is the same as what you posted but without the DISTINCTs. Each branch is tagged with its source table, so that duplicates inside one table are not counted as a match:
SELECT D.Country, D.Region, D.City
FROM
(
SELECT Country, Region, City, 'C' AS Src
FROM Customers
UNION ALL
SELECT Country, Region, City, 'E' AS Src
FROM Employees
) AS D
GROUP BY D.Country, D.Region, D.City
HAVING COUNT(DISTINCT D.Src) = 2;
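Option 2: Use ROW_NUMBER
This would be my preference and will likely be most efficient, provided neither table repeats a (Country, Region, City) combination; with duplicates inside one table, rn = 2 no longer implies that both tables contain the row.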
SELECT Country, Region, City
FROM
(
SELECT
rn = row_number() over (partition by D.Country, D.Region, D.City order by (SELECT null)),
D.Country, D.Region, D.City
FROM
(
SELECT Country, Region, City
FROM Customers
UNION ALL
SELECT Country, Region, City
FROM Employees
) AS D
) uniquify
WHERE rn = 2;
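The equivalence with INTERSECT can also be checked outside SQL Server. The following is a rough sketch using Python's built-in sqlite3 module; the rows are a variation of the sample data above with a duplicated location added to each table (an assumption made here to exercise the de-duplication). It also shows why the DISTINCT, or some other way of telling the two sources apart, matters once a table repeats a location: a bare COUNT(*) = 2 over the raw union counts duplicates coming from a single table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (country TEXT, region TEXT, city TEXT);
    CREATE TABLE employees (country TEXT, region TEXT, city TEXT);
    -- 'New York' is duplicated in customers, 'Chicago' in employees.
    INSERT INTO customers VALUES
        ('us', 'N/E', 'New York'), ('us', 'N/E', 'New York'),
        ('us', 'N/W', 'Seattle'), ('us', 'Midwest', 'Chicago');
    INSERT INTO employees VALUES
        ('us', 'S/E', 'Miami'), ('us', 'N/W', 'Portland'),
        ('us', 'Midwest', 'Chicago'), ('us', 'Midwest', 'Chicago');
    """
)

INTERSECT_SQL = """
    SELECT country, region, city FROM customers
    INTERSECT
    SELECT country, region, city FROM employees
"""

# The query from the question: DISTINCT per table, then count per group.
UNION_DISTINCT_SQL = """
    SELECT country, region, city
    FROM (SELECT DISTINCT country, region, city FROM customers
          UNION ALL
          SELECT DISTINCT country, region, city FROM employees) AS d
    GROUP BY country, region, city
    HAVING COUNT(*) = 2
"""

# Same query with the DISTINCTs simply removed and nothing else changed.
UNION_PLAIN_SQL = """
    SELECT country, region, city
    FROM (SELECT country, region, city FROM customers
          UNION ALL
          SELECT country, region, city FROM employees) AS d
    GROUP BY country, region, city
    HAVING COUNT(*) = 2
"""

via_intersect = sorted(conn.execute(INTERSECT_SQL).fetchall())
via_union_distinct = sorted(conn.execute(UNION_DISTINCT_SQL).fetchall())
via_union_plain = sorted(conn.execute(UNION_PLAIN_SQL).fetchall())

print(via_intersect)
print(via_union_distinct)
print(via_union_plain)
```

With these rows the first two queries both return only Chicago, while the last one returns New York (two customer rows, no employee) and misses Chicago (three rows in total).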

Consume The Changes / Deltas using Postgresql

Following is my scenario:
I have 2 landing tables source_table and destination_table.
I need a query/queries which will update the destination table with the new rows as well as the updated rows from source table.
Sample Data would be:
source table:
id name salary
1 P1 10000
2 P2 20000
target table:
id name salary
1 P1 8000
And the expected output should be:
target table:
id name salary
1 P1 10000 (salary updated)
2 P2 20000 (new row inserted)
This doesn't seem to work:
select * from user_source
except
select * from user_target as s
INSERT INTO user_target (id, name, salary)
VALUES (s.id, s.name, s.salary) WHERE id !=s.id
UPDATE user_target
SET name=s.name, salary=s.salary,
WHERE id = s.id
Seems like a simple insert ... on conflict to me:
insert into target_table (id, name, salary)
select id, name, salary
from source_table
on conflict (id) do update
set name = excluded.name,
salary = excluded.salary;
This assumes that the id column is the primary (or unique) key. Looking at the sample data, (id, name) might also be unique. In that case you need to change the on conflict() clause and obviously remove the update of the name column as well.
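As a quick check of the statement against the sample data from the question, here is a small sketch run through Python's built-in sqlite3 module instead of PostgreSQL (an assumption made only so the example is self-contained). SQLite accepts the same ON CONFLICT ... DO UPDATE syntax, but it needs a WHERE clause on the SELECT to parse the statement unambiguously, hence the extra "WHERE true" that PostgreSQL does not require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE source_table (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    CREATE TABLE target_table (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    INSERT INTO source_table VALUES (1, 'P1', 10000), (2, 'P2', 20000);
    INSERT INTO target_table VALUES (1, 'P1', 8000);
    """
)

# "WHERE true" is a SQLite parsing requirement for INSERT ... SELECT ... ON CONFLICT;
# drop it when running against PostgreSQL.
conn.execute(
    """
    INSERT INTO target_table (id, name, salary)
    SELECT id, name, salary FROM source_table WHERE true
    ON CONFLICT (id) DO UPDATE
    SET name = excluded.name,
        salary = excluded.salary
    """
)

rows = conn.execute(
    "SELECT id, name, salary FROM target_table ORDER BY id"
).fetchall()
print(rows)
```

Row 1 ends up with the updated salary and row 2 is inserted, which matches the expected output listed in the question.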

Query-Sql Developer

I am creating some queries for my project, but I am having difficulties with the following ones:
A SELECT statement containing a subquery to retrieve a list of Locations (location id and street_address) that have employees with higher salary than the average of their department. The list must contain the number of those employees and their total salary per location. Name these aggregates respectively "emp" and "totalsalary". The locations in the list must be ordered by location_id.
Select LOCATION_ID, STREET_ADDRESS
from HR.LOCATIONS IN
(Select Employee_id
from HR.Employees
Where Salary > round(avg(SALARY)))
order by location_id;
error: SQL command not properly ended
and the second query is the following
The JOB_HISTORY table can contain more than one entry for an employee who was hired more than once. Create a query to retrieve a list of Employees that were hired more than once. Include the columns EMPLOYEE_ID, LAST_NAME, FIRST_NAME and the aggregate "Times Hired".
SELECT FIRST_NAME,LAST_NAME,EMPLOYEE_ID,
count (*)as TIMES_HIRED
from HR.JOB_HISTORY, HR.EMPLOYEES
where EMPLOYEE_ID= LAST_NAME
having COUNT(*) >1;
error: not a single-group
Try these, hope they help. I am assuming the standard Oracle HR sample schema, where EMPLOYEES has a DEPARTMENT_ID and DEPARTMENTS has a LOCATION_ID. Employee_id is left out of the Group By so that emp and totalsalary are aggregated per location:
Select L.LOCATION_ID, L.STREET_ADDRESS, Count(E.Employee_id) AS emp, SUM(E.salary) AS totalsalary
from HR.LOCATIONS L
INNER JOIN HR.DEPARTMENTS D ON D.LOCATION_ID = L.LOCATION_ID
INNER JOIN HR.Employees E ON E.DEPARTMENT_ID = D.DEPARTMENT_ID
Where E.salary > (Select avg(E2.salary)
from HR.Employees E2
Where E2.DEPARTMENT_ID = E.DEPARTMENT_ID)
Group By L.LOCATION_ID, L.STREET_ADDRESS
order by L.LOCATION_ID;
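To see the "above the average of their own department" filter in action, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The three tables are cut-down, hypothetical stand-ins that assume the usual HR layout (employees reach locations through departments), and all rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE locations (location_id INTEGER PRIMARY KEY, street_address TEXT);
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, location_id INTEGER);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary INTEGER, department_id INTEGER);
    INSERT INTO locations VALUES (1000, '1 Main St'), (2000, '2 Side St');
    INSERT INTO departments VALUES (10, 1000), (20, 1000), (30, 2000);
    INSERT INTO employees VALUES
        (1, 1000, 10), (2, 3000, 10),   -- department 10 averages 2000
        (3, 4000, 20), (4, 6000, 20),   -- department 20 averages 5000
        (5, 5000, 30), (6, 5000, 30);   -- department 30: nobody above average
    """
)

rows = conn.execute(
    """
    SELECT l.location_id, l.street_address,
           COUNT(e.employee_id) AS emp, SUM(e.salary) AS totalsalary
    FROM locations l
    JOIN departments d ON d.location_id = l.location_id
    JOIN employees e ON e.department_id = d.department_id
    -- correlated subquery: average salary of this employee's department
    WHERE e.salary > (SELECT AVG(e2.salary)
                      FROM employees e2
                      WHERE e2.department_id = e.department_id)
    GROUP BY l.location_id, l.street_address
    ORDER BY l.location_id
    """
).fetchall()

print(rows)
```

Only employees 2 and 4 earn more than their department average, and both sit at location 1000, so that location is the only row returned.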
For the second question:
SELECT E.FIRST_NAME, E.LAST_NAME, E.EMPLOYEE_ID,
count(JH.Employee_id) as TIMES_HIRED
from HR.JOB_HISTORY JH inner join HR.EMPLOYEES E On JH.Employee_id = E.Employee_id
Group By E.FIRST_NAME, E.LAST_NAME, E.EMPLOYEE_ID
Having count(JH.Employee_id) > 1;
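The same GROUP BY / HAVING pattern can be tried on a toy data set. This is a sketch using Python's built-in sqlite3 module with hypothetical, trimmed-down employees and job_history tables and made-up rows; qualifying every column with a table alias avoids the ambiguity of EMPLOYEE_ID existing in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE job_history (employee_id INTEGER, start_date TEXT);
    INSERT INTO employees VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cid', 'Fox');
    -- Employee 1 has two history rows, employee 2 has one, employee 3 has none.
    INSERT INTO job_history VALUES (1, '2001-01-01'), (1, '2005-06-01'), (2, '2003-03-01');
    """
)

rows = conn.execute(
    """
    SELECT e.first_name, e.last_name, e.employee_id,
           COUNT(jh.employee_id) AS times_hired
    FROM job_history jh
    JOIN employees e ON jh.employee_id = e.employee_id
    GROUP BY e.first_name, e.last_name, e.employee_id
    HAVING COUNT(jh.employee_id) > 1
    """
).fetchall()

print(rows)
```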