How to manage NULL strings or dates in SQL queries (PostgreSQL)

PostgreSQL 11.1
In the SQL query below, $1 and $2 are strings and $3 is a timestamp. How can the query be rewritten so that a null value in $3 selects every date (not just null dates)?
SELECT lastname, firstname, birthdate FROM patients
WHERE UPPER(lastname) LIKE UPPER($1)||'%' and UPPER(firstname) LIKE UPPER($2)||'%' AND birthdate::date = $3::date
UNION
SELECT lastname, firstname, birthdate FROM appointment_book
WHERE UPPER(lastname) LIKE UPPER($1)||'%' and UPPER(firstname) LIKE UPPER($2)||'%' and birthdate::date = $3::date
That is, if $3 is null, then this should reduce to:
SELECT lastname, firstname, birthdate FROM patients
WHERE UPPER(lastname) LIKE UPPER($1)||'%' and UPPER(firstname) LIKE UPPER($2)||'%'
UNION
SELECT lastname, firstname, birthdate FROM appointment_book
WHERE UPPER(lastname) LIKE UPPER($1)||'%' and UPPER(firstname) LIKE UPPER($2)||'%'

Untested, but I think you can handle that with a CASE expression:
SELECT lastname, firstname, birthdate FROM patients p
WHERE UPPER(p.lastname) LIKE UPPER($1)||'%'
AND UPPER(p.firstname) LIKE UPPER($2)||'%'
AND (CASE WHEN $3 IS NULL THEN TRUE
ELSE p.birthdate::date = $3::date
END)
UNION
SELECT lastname, firstname, birthdate FROM appointment_book ab
WHERE UPPER(ab.lastname) LIKE UPPER($1)||'%'
AND UPPER(ab.firstname) LIKE UPPER($2)||'%'
AND (CASE WHEN $3 IS NULL THEN TRUE
ELSE ab.birthdate::date = $3::date
END);
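The same "optional filter" idea can also be written as `$3 IS NULL OR ...`, which behaves like the CASE expression above. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of PostgreSQL, a single hypothetical `patients` table with made-up rows, named parameters (`:ln`, `:fn`, `:bd`) in place of $1..$3, and SQLite's `DATE()` in place of the `::date` cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (lastname TEXT, firstname TEXT, birthdate TEXT);
    INSERT INTO patients VALUES
        ('Smith',  'Anna',   '1980-05-01'),
        ('Smith',  'Andrew', '1975-11-23'),
        ('Smythe', 'Ann',    NULL);
""")

# ":bd IS NULL OR ..." makes the date filter optional: when :bd is NULL the
# parenthesised condition is true for every row, including rows whose
# birthdate is itself NULL.
QUERY = """
    SELECT lastname, firstname, birthdate
    FROM patients
    WHERE UPPER(lastname)  LIKE UPPER(:ln) || '%'
      AND UPPER(firstname) LIKE UPPER(:fn) || '%'
      AND (:bd IS NULL OR DATE(birthdate) = DATE(:bd))
    ORDER BY firstname
"""


def find_patients(ln, fn, bd):
    """Run the search; pass bd=None to skip the birthdate filter."""
    return conn.execute(QUERY, {"ln": ln, "fn": fn, "bd": bd}).fetchall()


all_matches = find_patients("sm", "an", None)
one_match = find_patients("sm", "an", "1980-05-01")
```

With no date supplied, all three sample rows match; with a date supplied, only the row with that birthdate is returned.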

Related

SqlAlchemy: groupby of a union Error: SELECT construct for inclusion in a UNION or other set construct expected

I am trying to run 2 queries against different tables, union them, and then do a group by with a sum.
Here is the PostgreSQL code snippet:
SELECT
to_char(worklist.date,'YYYY-MM') AS year_month,
department.department_name AS department_name,
user.username AS user_name,
SUM(worklist.hour) AS TotalHour,
SUM(worklist.overtime_hour) AS overtime_TotalHour,
CAST(NULL AS numeric(20)) AS total_price
FROM user, worklist, department
WHERE user.id = worklist.user_id AND
user.department_id = department.id
group BY year_month, user_name, department_name
UNION
SELECT
to_char(expenditure.date,'YYYY-MM') AS year_month,
department.department_name AS department_name,
user.username AS user_name,
CAST(NULL AS numeric(20)) AS TotalHour,
CAST(NULL AS numeric(20)) AS overtime_TotalHour,
SUM(expenditure.price) AS total_price
FROM user, department, expenditure
WHERE user.id = expenditure.user_id AND
user.department_id = department.id
group BY year_month, user_name, department_name
ORDER BY year_month DESC
I want to implement this in SQLAlchemy, and here is how I would approach it:
def get_allusers_monthly(db: Session):
qry1 = (db.query(
func.to_char(workhour.Workhour.date,'YYYY-MM').label('year_month'),
department.Department.department_name.label('department_name'),
user.User.username.label('user_name'),
func.sum(workhour.Workhour.hour).label('total_hour'),
func.sum(workhour.Workhour.overtime_hour).label('total_overtime_hour'),
cast(expen.Expenditure.price, Numeric(10)).label('total_pric')
).filter(user.User.id == workhour.Workhour.user_id and user.User.department_id == department.Department.id
).group_by(
'year_month',
'department_name',
'user_name',
'total_pric'
).all())
qry2 = (db.query(
func.to_char(expen.Expenditure.date,'YYYY-MM').label('year_month'),
department.Department.department_name.label('department_name'),
user.User.username.label('user_name'),
cast(workhour.Workhour.hour, Numeric(10)).label('total_hour'),
cast(workhour.Workhour.overtime_hour, Numeric(10)).label('total_overtime_hour'),
func.sum(expen.Expenditure.price).label('total_pric')
).filter(user.User.id == expen.Expenditure.user_id and user.User.department_id == department.Department.id
).group_by(
'year_month',
'department_name',
'user_name',
'total_hour',
'total_overtime_hour'
).all())
all_queries = [qry1, qry2]
golden_set = union(*all_queries).subquery()
return golden_set
Running it produces this error:
sqlalchemy.exc.ArgumentError: SELECT construct for inclusion in a UNION or other set construct expected
Where am I going wrong? Or is there a better way to implement this?
Can anybody please help?
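One thing worth checking first: `.all()` executes a query and returns a list of rows, so `qry1` and `qry2` are no longer SELECT constructs by the time they reach `union()`, which would be consistent with the error message. Python's `and` inside `.filter()` also does not build a SQL AND; passing the two conditions as separate arguments to `.filter()` does. Independently of SQLAlchemy, the SQL shape described in the question (union first, then group by with a sum) can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module rather than PostgreSQL, simplified hypothetical tables that carry `user_name` directly, made-up rows, and `strftime` in place of `to_char`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE worklist (user_name TEXT, date TEXT, hour REAL, overtime_hour REAL);
    CREATE TABLE expenditure (user_name TEXT, date TEXT, price REAL);
    INSERT INTO worklist VALUES
        ('ann', '2021-03-02', 8, 1),
        ('ann', '2021-03-03', 7, 0),
        ('bob', '2021-03-02', 6, 2);
    INSERT INTO expenditure VALUES
        ('ann', '2021-03-05', 40),
        ('ann', '2021-03-09', 10);
""")

# Stack the raw rows from both tables with UNION ALL (padding the missing
# columns with NULL), then aggregate once over the combined rows.
rows = conn.execute("""
    SELECT year_month, user_name,
           SUM(hour)          AS total_hour,
           SUM(overtime_hour) AS total_overtime_hour,
           SUM(price)         AS total_price
    FROM (
        SELECT strftime('%Y-%m', date) AS year_month, user_name,
               hour, overtime_hour, NULL AS price
        FROM worklist
        UNION ALL
        SELECT strftime('%Y-%m', date), user_name,
               NULL, NULL, price
        FROM expenditure
    ) AS combined
    GROUP BY year_month, user_name
    ORDER BY year_month DESC, user_name
""").fetchall()
```

Aggregating after the union yields one row per month and user, instead of one row from each half of the union; a user with no expenditure rows gets NULL for the price total.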

Am able to parse the first and last name, from full name, how do I parse the Middle Name?

I am able to parse the first and last name from a full name, but how do I parse the middle name? No titles such as 'MR', 'MS', 'DR', 'FR', 'MRS', 'LRD', 'SIR', 'LORD', 'LADY', 'MISS', 'PROF' are used, so I think I can use SUBSTRING. The name format can be firstname middlename lastname, or firstname lastname, separated by spaces.
UPDATE p
SET p.LAST_NAME = c.LASTNAME --tested that join is correct, contact name is combined, will need to parse it out ***, need to reference inserted
--Need FIRST_NAME, MIDDLE_NAME, LAST_NAME
p.FIRST_NAME = SUBSTRING(c.CONTACT, 1, CHARINDEX(' ', c.CONTACT) - 1) AS FirstName,
p.MIDDLE_NAME = --need middle name
p.LAST_NAME = SUBSTRING(CONTACT, CHARINDEX(' ', CONTACT) + 1, len(CONTACT)) AS LastName
FROM GMUnitTest.dbo.CONTACT1 c
JOIN PCUnitTest.dbo.PEOPLE p
ON p.PEOPLE_ID = c.KEY4
WHERE c.Key1 = '31';
Based on what you said, that there must be a middle name, you can use something like this:
declare @table table (fullName varchar(256))
insert into @table values
('First Middle Last'),
('John Mary-Lou Smith'),
('Frank NMN Sanatra')
select
CHARINDEX(' ',fullName,1)
,left(fullName,CHARINDEX(' ',fullName,1) - 1) as FirstName
-- everything between the first space and the last space
,substring(fullName,CHARINDEX(' ',fullName,1) + 1,(len(fullName) - CHARINDEX(' ',fullName,1)) - charindex(' ',reverse(fullName),1)) as MiddleName
-- subtract 1 so the leading space is not part of the last name
,right(fullName,charindex(' ',reverse(fullName),1) - 1) as LastName
from
@table
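Because the question also allows names without a middle name, it may help to see the splitting rule spelled out separately from the T-SQL string functions. The following is a minimal sketch in Python, not a translation of the query above; `split_full_name` is a hypothetical helper, and treating everything between the first and last word as the middle name is an assumption.

```python
def split_full_name(full_name):
    """Split 'First [Middle ...] Last' into (first, middle, last).

    middle is None when the name has only two parts.
    """
    parts = full_name.split()
    if not parts:
        return ("", None, "")
    if len(parts) == 1:
        # Only one word: treat it as the first name.
        return (parts[0], None, "")
    first, last = parts[0], parts[-1]
    # Anything between the first and last word counts as the middle name.
    middle = " ".join(parts[1:-1]) or None
    return (first, middle, last)


with_middle = split_full_name("John Mary-Lou Smith")
without_middle = split_full_name("First Last")
```

In T-SQL, the equivalent guard would be to check whether the first and last space positions coincide before taking the middle substring.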

How to separate text using substring

I was wondering how can I separate a column containing the following:
BURGER, Petrus (CHV 494081)
Into 3 columns:
FirstName, LastName, ID
SELECT
a[2] AS FirstName,
a[1] AS LastName,
a[3] AS ID
FROM (
SELECT regexp_matches(column_name, '(.+), (.+) \((.+)\)')
FROM table_name
) t(a)
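The pattern in the query above captures three groups: the text before the comma, the text between the comma and the opening parenthesis, and the text inside the parentheses. To see what each group holds, here is a small sketch using the same pattern with Python's `re` module; `split_contact` is a hypothetical helper name, and the input is the sample value from the question.

```python
import re

# Same pattern as in the SQL: "<last>, <first> (<id>)"
PATTERN = re.compile(r"(.+), (.+) \((.+)\)")


def split_contact(value):
    """Return the three captured parts as a dict, or None if the text does not match."""
    match = PATTERN.fullmatch(value.strip())
    if match is None:
        return None
    last_name, first_name, id_ = match.groups()
    return {"FirstName": first_name, "LastName": last_name, "ID": id_}


result = split_contact("BURGER, Petrus (CHV 494081)")
```

Rows that do not follow the format return None here; in the SQL version, `regexp_matches` simply produces no row for them.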

Identifying duplicates within a table: looking for query advice

I am trying to identify duplicated contact records within an account, and I am looking for the best way to do this. There is an account table and a contact table. Below is the query I've come up with to give me what I need, but I feel like there is probably a better, more efficient way to do this, so I am looking for any feedback or advice. Thanks in advance!
SELECT * FROM sysdba.CONTACT a WITH(NOLOCK)
WHERE EXISTS
(
SELECT ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL FROM sysdba.CONTACT b WITH(NOLOCK)
GROUP BY ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL
HAVING COUNT(*) > 1
AND a.ACCOUNTID = b.ACCOUNTID AND a.FIRSTNAME = b.FIRSTNAME AND a.LASTNAME = b.LASTNAME AND a.EMAIL = b.EMAIL
)
ORDER BY ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL
Here is another way I can do this, but having to use DISTINCT seems ugly:
SELECT DISTINCT a.CONTACTID, a.FIRSTNAME, a.LASTNAME, a.EMAIL FROM sysdba.CONTACT a WITH(NOLOCK)
JOIN sysdba.CONTACT b WITH(NOLOCK)
ON a.ACCOUNTID = b.ACCOUNTID AND a.FIRSTNAME = b.FIRSTNAME AND a.LASTNAME = b.LASTNAME AND a.EMAIL = b.EMAIL AND a.CONTACTID != b.CONTACTID
ORDER BY a.CONTACTID, a.FIRSTNAME, a.LASTNAME, a.EMAIL
When checking the execution plans for both, the first query comes in at 37% compared to 63% for the second, which is surprising, as I've always thought (apparently wrongly) that using joins is quicker than relying on a WHERE clause.
A common practice when identifying duplicates is to use windowed aggregate functions, such as COUNT() OVER (...) and ROW_NUMBER() OVER (...).
The query below should return groups of records where there is more than one CONTACTID for the same ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL combination. In other words, it returns the records that have duplicates, along with those duplicates:
;WITH cteCONTACT
AS (
SELECT ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL, CONTACTID,
CNT = COUNT(*) OVER (PARTITION BY ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL)
FROM sysdba.CONTACT
)
SELECT ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL, CONTACTID
FROM cteCONTACT
WHERE CNT > 1;
And the following query should return only the duplicates, without the records they duplicate:
;WITH cteCONTACT
AS (
SELECT ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL, CONTACTID,
NUM = ROW_NUMBER() OVER (
PARTITION BY ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL
ORDER BY CONTACTID)
FROM sysdba.CONTACT
)
SELECT ACCOUNTID, FIRSTNAME, LASTNAME, EMAIL, CONTACTID
FROM cteCONTACT
WHERE NUM > 1;
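The difference between the two filters (`CNT > 1` versus `NUM > 1`) is easiest to see on a handful of rows. The sketch below is an illustration under assumptions, not the SQL Server queries themselves: it uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), a hypothetical lowercase `contact` table with made-up rows, and `... AS cnt` aliasing instead of the T-SQL `CNT = ...` form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (contactid INTEGER PRIMARY KEY, accountid INTEGER,
                          firstname TEXT, lastname TEXT, email TEXT);
    INSERT INTO contact VALUES
        (1, 10, 'Ann', 'Lee', 'ann@example.com'),
        (2, 10, 'Ann', 'Lee', 'ann@example.com'),
        (3, 10, 'Bob', 'Ray', 'bob@example.com'),
        (4, 20, 'Ann', 'Lee', 'ann@example.com'),
        (5, 10, 'Ann', 'Lee', 'ann@example.com');
""")

# cnt = size of each (account, name, email) group;
# num = position of the row inside its group, ordered by contactid.
WINDOWED = """
    SELECT contactid,
           COUNT(*) OVER (PARTITION BY accountid, firstname, lastname, email) AS cnt,
           ROW_NUMBER() OVER (PARTITION BY accountid, firstname, lastname, email
                              ORDER BY contactid) AS num
    FROM contact
"""

# Every member of a duplicated group.
group_members = [r[0] for r in conn.execute(
    f"SELECT contactid FROM ({WINDOWED}) WHERE cnt > 1 ORDER BY contactid")]

# Only the extra copies; the lowest contactid in each group is kept out.
extras_only = [r[0] for r in conn.execute(
    f"SELECT contactid FROM ({WINDOWED}) WHERE num > 1 ORDER BY contactid")]
```

Contact 4 has the same name and email as contacts 1, 2 and 5 but belongs to a different account, so it is not reported by either filter.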

How do you perform a search on a 1-to-many relationship when the criteria could be on either table?

I am using T-SQL. I have what I thought would be an easy search. There is a 1-to-many relationship between SalesPerson and TradeShow: one salesperson could have gone to many trade shows. I need to be able to search on the SalesPerson. I also need to be able to search on the LAST trade show they attended. I thought I would be able to do a simple join and group on their last trade show, but I cannot display the City or State.
SELECT SalesPerson.SalesPersonID, FirstName, LastName, TradeShow.DateLastWent
FROM SalesPerson INNER JOIN
(SELECT SalesPersonID, MAX(DateLastWent) AS DateLastWent
FROM TradeShow
GROUP BY SalesPersonID) AS TradeShow ON SalesPerson.SalesPersonID = TradeShow.SalesPersonID
This works, but TradeShow also has City and State. I need to be able to search on and display City and State. But if I include them in the subquery, I have to wrap them in an aggregate function, and if I do that, I get the incorrect city and state.
The tables are simple:
SALESPERSON
salespersonID PK
firstname
lastname
TRADESHOW
tradeshowID PK
datelastwent
city
state
salespersonID FK
Re-word it: what you want is the salesperson, plus the information from the last show that they have been to.
Select
SalesPerson.SalesPersonID,
FirstName,
LastName,
TradeShow.DateLastWent,
TradeShow.City,
TradeShow.State
From
SalesPerson
Inner Join TradeShow
On SalesPerson.SalesPersonID = TradeShow.SalesPersonID
Where
TradeShow.TradeShowID =
(Select Top 1 Latest.TradeShowID
From TradeShow As Latest
Where SalesPerson.SalesPersonID = Latest.SalesPersonID
Order By Latest.DateLastWent Desc)
You can join TradeShow twice:
SELECT SalesPerson.SalesPersonID, FirstName, LastName, TS1.DateLastWent,
TS2.City, TS2.State
FROM SalesPerson INNER JOIN
(SELECT SalesPersonID, MAX(DateLastWent) AS DateLastWent
FROM TradeShow
GROUP BY SalesPersonID
) AS TS1 ON (SalesPerson.SalesPersonID = TS1.SalesPersonID)
INNER JOIN TradeShow TS2 ON
(TS2.SalesPersonID = TS1.SalesPersonID AND TS2.DateLastWent = TS1.DateLastWent)
WHERE TS2.City = 'CityName'
There is likely a more elegant way to solve this, but my first thought is to simply grab the newest TradeShow record to join with:
SELECT SalePersonID, FirstName, LastName, TradeShow.DateLastWent
FROM SalesPerson
INNER JOIN (
SELECT *
FROM (
SELECT TradeShowId, DateLastWent, City, State, SalesPersonId
FROM TradeShow
ORDER BY datelastwent DESC
)
WHERE ROWNUM <= 1
) ON SalesPerson.SalesPersonId = TradeShow.SalesPersonId
Edit
Oops... been playing with Oracle too much.
ROW_NUMBER() OVER(order by date) or SELECT TOP X
would be the SQL Server way of doing this. I don't have an instance of SQL Server running, but I'm pretty sure the syntax ends up being something like:
SELECT SalesPerson.SalesPersonId, FirstName, LastName, TradeShow.DateLastWent
FROM SalesPerson
INNER JOIN (
SELECT TradeShowId, DateLastWent, City, State, SalesPersonId, ROW_NUMBER() OVER(PARTITION BY SalesPersonId ORDER BY DateLastWent DESC) AS RowNumber
FROM TradeShow
) AS TradeShow ON SalesPerson.SalesPersonId = TradeShow.SalesPersonId AND TradeShow.RowNumber = 1
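Since the ROW_NUMBER() version above was written without a server to test against, here is a small runnable sketch of the same "latest row per salesperson" idea, extended with City and State and an optional city filter. It is an illustration under assumptions: it uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or later) rather than SQL Server, lowercase table and column names taken from the schema in the question, made-up rows, and a hypothetical `:city` parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salesperson (salespersonid INTEGER PRIMARY KEY,
                              firstname TEXT, lastname TEXT);
    CREATE TABLE tradeshow (tradeshowid INTEGER PRIMARY KEY, datelastwent TEXT,
                            city TEXT, state TEXT, salespersonid INTEGER);
    INSERT INTO salesperson VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO tradeshow VALUES
        (1, '2019-01-10', 'Austin', 'TX', 1),
        (2, '2020-06-01', 'Denver', 'CO', 1),
        (3, '2018-03-15', 'Denver', 'CO', 2),
        (4, '2021-02-20', 'Boston', 'MA', 2);
""")

# rn = 1 marks each salesperson's most recent trade show; joining on it keeps
# the city and state that belong to that same row.
LATEST = """
    SELECT sp.salespersonid, sp.firstname, sp.lastname,
           ts.datelastwent, ts.city, ts.state
    FROM salesperson AS sp
    JOIN (
        SELECT t.*, ROW_NUMBER() OVER (PARTITION BY salespersonid
                                       ORDER BY datelastwent DESC) AS rn
        FROM tradeshow AS t
    ) AS ts ON ts.salespersonid = sp.salespersonid AND ts.rn = 1
    WHERE (:city IS NULL OR ts.city = :city)
    ORDER BY sp.salespersonid
"""

everyone = conn.execute(LATEST, {"city": None}).fetchall()
in_denver = conn.execute(LATEST, {"city": "Denver"}).fetchall()
```

Searching for Denver returns only Ann: Bob attended a Denver show too, but it was not his most recent one, which is the behavior the question asks for.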