Matching two date columns in a postgres query with multiple joined tables - postgresql

I am getting all of the information that I need from the following query:
SELECT
o.id,
o.customer_context,
o.organization_name,
o.shipping_name,
o.shipping_street1,
o.shipping_city,
o.shipping_state,
o.shipping_postal_code,
order_total,
shipping_charge,
sales_tax_charge,
discount_amount,
charge_date,
ship_date,
o.email,
shipping_country,
c.status,
c.unsubscribe,
c.last_logon,
c.last_action,
c.full_name AS customer_name,
c.email AS customer_email,
c.billing_email AS customer_billing_email,
c.organization_name AS customer_org_name,
c.phone AS customer_phone,
li.valid_from_dt,
li.valid_thru_dt,
pr.name,
sum(wi.order_qty) AS printed_book_count
FROM
online_order_onlineorder AS o
LEFT OUTER JOIN online_order_weborderitem AS wi ON (wi.web_order_id = o.id
AND format = 'PRT')
LEFT OUTER JOIN customer_customer AS c ON (c.id = o.customer_id)
LEFT OUTER JOIN customer_customer_curriculum_license AS li ON (li.customer_id = c.id)
LEFT OUTER JOIN product_curriculumlicense AS pli ON (pli.product_ptr_id = li.license_id)
LEFT OUTER JOIN product_product AS pr ON (pr.id = pli.product_ptr_id)
WHERE
o.status in('F', 'FF')
AND o.charge_date >= '2019-03-05'
AND o.charge_date < '2020-10-05'
GROUP BY
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28
ORDER BY
charge_date,
shipping_name
The only problem is that I cannot get o.charge_date to match li.valid_from_dt.
I have tried adding comparison conditions such as:
WHERE
o.status in('F', 'FF')
AND o.charge_date >= '2019-03-05'
AND o.charge_date < '2020-10-05'
AND li.valid_from_dt >= '2019-03-05'
AND li.valid_from_dt < '2020-10-05'
but, as expected, that just narrows the range of li.valid_from_dt values; they still don't line up with o.charge_date. I also need to account for the fact that certain orders will have NULL for li.valid_from_dt.
The only relation between o.charge_date and li.valid_from_dt is that both tables reference c.id. Is there some way to join these tables so that the two dates match, while keeping all of the other data the same?
I have spent a while working on this, and any help is greatly appreciated.
EDIT: Additional info, here is an example of the customer_customer_curriculum_license table from the same customer.
id    valid_from_dt  valid_thru_dt  created_on                  updated_on                  purchase_price  max_head_count  customer_id  license_id
2262  2014-06-03     2015-06-03     2015-06-24 18:35:36.884+00  2015-07-01 21:43:55.125+00  440.00          29              4178         1
2263  2014-06-03     2015-06-03     2015-06-24 18:35:36.888+00  2015-07-01 21:43:55.128+00  440.00          19              4178         17
2264  2014-06-03     2015-07-13     2015-06-24 18:35:36.891+00  2015-06-29 21:55:30.095+00  440.00          29              4178         13
2265  2014-06-03     2015-07-13     2015-06-24 18:35:36.894+00  2015-06-29 21:54:16.496+00  440.00          19              4178         20
2266  2014-06-03     2015-07-13     2015-06-24 18:35:36.897+00  2015-07-01 21:43:55.126+00  440.00          29              4178         14
2267  2014-06-03     2015-07-13     2015-06-24 18:35:36.901+00  2015-06-29 21:41:29.784+00  440.00          29              4178         16
And an example of the online_order_onlineorder table.
id
status
customer_context
email
phone
billing_name
billing_street1
billing_street2
billing_state
billing_city
billing_postal_code
shipping_name
shipping_street1
shipping_street2
shipping_state
shipping_city
shipping_postal_code
order_total
shipping_charge
sales_tax_rate
sales_tax_charge
discount_amount
authorization_code
reference_number
transaction_id
created_on
updated_on
ship_date
charge_date
is_shipped
tracking_number
applied_offer_id
customer_id
billing_country
shipping_country
shipping_option_id
shipping_weight
customer_name
organization_contact
organization_name
gift_message
is_gift
418
FF
O
example#example.com
0.00
0.00
0.00000
0.00
0.00
2012-06-28 05:00:00+00
2015-06-24 18:40:55.194+00
f
4177
US
0.00
f
420
FF
O
example#example.com
0.00
0.00
0.00000
0.00
0.00
2012-07-05 05:00:00+00
2015-06-24 18:40:55.214+00
f
4177
US
0.00
f
The only related field between the two tables is customer_id. For each order, I need to match o.charge_date with li.valid_from_dt for the same o.customer_id, so that I get the date of the purchase together with the license that started that same day.
My results: as you can see, the customer ordered on 2019-03-05 16:02:24.10583+00, but the valid_from_dt is incorrect; it should be the same date as the order.
id
customer_context
organization_name
shipping_name
shipping_street1
shipping_city
shipping_state
shipping_postal_code
order_total
shipping_charge
sales_tax_charge
discount_amount
charge_date
ship_date
email
shipping_country
status
unsubscribe
last_logon
last_action
customer_name
customer_email
customer_billing_email
customer_org_name
customer_phone
valid_from_dt
valid_thru_dt
name
printed_book_count
33733
O
None
John Doe
111 test
Test
HI
99999
180.00
0.00
0.00
0.00
2019-03-05 16:02:24.10583+00
example#example.com
United States (Domestic And Apo/Fpo/Dpo Mail)
O
f
2020-08-19 13:49:26.338082+00
None
John Doe
example#example.com
2017-04-25
2018-04-24
Valid for 365 Days

With much-appreciated help from Richard Huxton, I have settled on the following SQL query.
EDIT: the previous query had an issue; the following is the correct query to use. I cast both o.charge_date and li.valid_from_dt using ::date, and was then able to match on the exact data that I needed!
SELECT DISTINCT ON (o.id)
o.id,
o.customer_context,
o.organization_name,
o.shipping_name,
o.shipping_street1,
o.shipping_city,
o.shipping_state,
o.shipping_postal_code,
order_total,
shipping_charge,
sales_tax_charge,
discount_amount,
charge_date,
ship_date,
o.email,
shipping_country,
c.status,
c.unsubscribe,
c.last_logon,
c.last_action,
c.full_name AS customer_name,
c.email AS customer_email,
c.billing_email AS customer_billing_email,
c.organization_name AS customer_org_name,
c.phone AS customer_phone,
li.valid_from_dt,
li.valid_thru_dt,
pr.name,
sum(wi.order_qty) AS printed_book_count
FROM
online_order_onlineorder AS o
LEFT OUTER JOIN online_order_weborderitem AS wi ON (wi.web_order_id = o.id
AND format = 'PRT')
LEFT OUTER JOIN customer_customer AS c ON (c.id = o.customer_id)
LEFT OUTER JOIN customer_customer_curriculum_license AS li ON (li.valid_from_dt::date = o.charge_date::date)
LEFT OUTER JOIN product_curriculumlicense AS pli ON (pli.product_ptr_id = li.license_id)
LEFT OUTER JOIN product_product AS pr ON (pr.id = pli.product_ptr_id)
WHERE
o.status in('F', 'FF')
AND o.charge_date >= '2019-03-05'
AND o.charge_date < '2020-10-05'
GROUP BY
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28
ORDER BY
id,
charge_date,
shipping_name
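A hedged side note on the join: as written, the license join matches on date alone, so a license that a different customer started on the same day could be attached to an order, and DISTINCT ON then keeps just one of the matches. Assuming the license should belong to the ordering customer, a condition such as li.customer_id = c.id can be kept alongside the date comparison. The sketch below illustrates the difference with Python's sqlite3 module standing in for PostgreSQL; the orders and licenses tables, their rows, and SQLite's date() function (used in place of the ::date cast) are simplified assumptions, not the real schema.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the order and license tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, charge_date TEXT);
CREATE TABLE licenses (id INTEGER PRIMARY KEY, customer_id INTEGER, valid_from_dt TEXT);
INSERT INTO orders VALUES
  (1, 10, '2019-03-05 16:02:24'),
  (2, 20, '2019-03-05 09:00:00'),
  (3, 30, '2019-04-01 12:00:00');
INSERT INTO licenses VALUES
  (100, 10, '2017-04-25'),
  (101, 10, '2019-03-05'),
  (102, 20, '2019-03-05');
""")

# Join on the calendar date only: licenses of other customers can match.
date_only = conn.execute("""
SELECT o.id, li.id
FROM orders AS o
LEFT JOIN licenses AS li
  ON date(li.valid_from_dt) = date(o.charge_date)
ORDER BY o.id, li.id
""").fetchall()

# Join on customer and calendar date: each order sees only its own licenses.
date_and_customer = conn.execute("""
SELECT o.id, li.id
FROM orders AS o
LEFT JOIN licenses AS li
  ON li.customer_id = o.customer_id
 AND date(li.valid_from_dt) = date(o.charge_date)
ORDER BY o.id, li.id
""").fetchall()

print(date_only)
print(date_and_customer)
```

With the date-only condition, orders 1 and 2 each pick up both same-day licenses. With the customer condition added, each order gets only its own customer's license, and order 3 keeps its row with a NULL license because of the LEFT JOIN.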

Related

find max value for specific column while still seeing other columns

Given the tables patient and labh:
patient
id lastname
19 patientone
20 patienttwo
labh
patientid lastname loinc datetime numerical
19 patientone 4548-4 2014-05-15 00:00:00 6.5
19 patientone 4548-4 2015-05-15 00:00:00 7.5
19 patientone 4548-4 2016-05-15 00:00:00 3.5
19 patientone 4548-4 2017-05-15 00:00:00 5.5
19 patientone 5000-3 2018-05-15 00:00:00 123
20 patienttwo 4548-4 2013-05-15 00:00:00 2.5
20 patienttwo 4548-4 2012-05-15 00:00:00 1.5
20 patienttwo 4548-4 2011-05-15 00:00:00 9.5
20 patienttwo 4548-4 2010-05-15 00:00:00 3.5
Desired output:
patientid lastname datetime numerical
19 patientone 2017-05-15 00:00:00 5.5
20 patienttwo 2013-05-15 00:00:00 2.5
The labh table holds lab values (numerical), the type of lab (loinc), and when they were done (datetime). I'd like to query for the most recent value of loinc = 4548-4, and I'd like the output to show both the date and the value.
I've tried the query below. It shows the most recent dates, but I can't see the values (numerical) at the same time. When I add the numerical column, it shows all the values, not just the most recent.
Select Distinct patient.id, patient.lastname, Max(Date_Trunc('day', labh.datetime)) As "Date"
From patient
Inner Join labh On patient.id = labh.patientid
Where labh.loinc = '4548-4'
Group By patient.id, patient.lastname, patient.firstname
Order By patient.id
You haven't selected the numerical column in your query. You can use a CTE that ranks each patient's rows with ROW_NUMBER(), partitioning by patient id and ordering each partition by date, newest first, and then keep only the top-ranked row.
So, according to this, you can try:
WITH summary AS (
SELECT p.id as "Patient ID",
p.lastname as "Patient Name",
l.datetime As "Date",
l.numerical as "Numerical",
ROW_NUMBER() OVER (PARTITION BY p.id
ORDER BY l.datetime DESC) AS rank
FROM patient p
Inner Join labh l
On p.id = l.patientid)
SELECT "Patient ID",
"Patient Name",
"Date",
"Numerical"
FROM summary
WHERE rank = 1;
And this will give you:
Patient ID  Patient Name  Date                      Numerical
19          patientone    2017-05-15T00:00:00.000Z  5.5
20          patienttwo    2013-05-15T00:00:00.000Z  2.5
UPDATE
Since you've updated the question and changed the expected output, the only modification needed is a WHERE condition inside the CTE:
WITH summary AS (
SELECT p.id as "Patient ID",
p.lastname as "Patient Name",
l.datetime As "Date",
l.numerical as "Numerical",
ROW_NUMBER() OVER (PARTITION BY p.id
ORDER BY l.datetime DESC) AS rank
FROM patient p
Inner Join labh l
On p.id = l.patientid
where l.loinc = '4548-4') -- Added this line
SELECT "Patient ID",
"Patient Name",
"Date",
"Numerical"
FROM summary
WHERE rank = 1;
This will give you the same result:
Patient ID  Patient Name  Date                      Numerical
19          patientone    2017-05-15T00:00:00.000Z  5.5
20          patienttwo    2013-05-15T00:00:00.000Z  2.5
To achieve what you're looking for in Postgres (and other SQL RDBMSes), you essentially need to identify the max value and its corresponding key, then join it back to the rest of the data set you want to retrieve. Note that the aggregate needs an alias so that the outer join condition can refer to it:
SELECT patient.*, labh.*
FROM patient
JOIN labh
ON patient.id = labh.patientid
JOIN (SELECT patientid, max(datetime) AS datetime
FROM labh
GROUP BY patientid) maxvals
ON maxvals.patientid = labh.patientid AND
maxvals.datetime = labh.datetime
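The join-back-to-max idea can be tried against the sample data from the question. This is a minimal sketch using Python's sqlite3 module as a stand-in for PostgreSQL, so treat it as an illustration rather than the exact production query. Two details are assumptions added here: the derived table aliases MAX(datetime) as max_dt, and both sides filter on loinc = '4548-4' so that only that lab type is considered. The redundant lastname column of labh is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (id INTEGER PRIMARY KEY, lastname TEXT);
CREATE TABLE labh (patientid INTEGER, loinc TEXT, datetime TEXT, numerical REAL);
INSERT INTO patient VALUES (19, 'patientone'), (20, 'patienttwo');
INSERT INTO labh VALUES
  (19, '4548-4', '2014-05-15 00:00:00', 6.5),
  (19, '4548-4', '2015-05-15 00:00:00', 7.5),
  (19, '4548-4', '2016-05-15 00:00:00', 3.5),
  (19, '4548-4', '2017-05-15 00:00:00', 5.5),
  (19, '5000-3', '2018-05-15 00:00:00', 123),
  (20, '4548-4', '2013-05-15 00:00:00', 2.5),
  (20, '4548-4', '2012-05-15 00:00:00', 1.5),
  (20, '4548-4', '2011-05-15 00:00:00', 9.5),
  (20, '4548-4', '2010-05-15 00:00:00', 3.5);
""")

# Find each patient's latest 4548-4 timestamp, then join back to labh
# to pick up the value recorded at that timestamp.
latest = conn.execute("""
SELECT p.id, p.lastname, l.datetime, l.numerical
FROM patient AS p
JOIN labh AS l
  ON l.patientid = p.id AND l.loinc = '4548-4'
JOIN (SELECT patientid, MAX(datetime) AS max_dt
      FROM labh
      WHERE loinc = '4548-4'
      GROUP BY patientid) AS m
  ON m.patientid = l.patientid AND m.max_dt = l.datetime
ORDER BY p.id
""").fetchall()

for row in latest:
    print(row)
```

Without the loinc filter, patient 19's latest row would be the 5000-3 result from 2018, which is not what the question asks for.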

Problem Displaying Multiple Items From Same Column In One Row

I have three tables: DailyFieldRecord, AB953, and Lookup. The DailyFieldRecord table contains DailyFieldRecordID. The AB953 table contains DailyFieldRecordID, GroupID, LookupID, and PersonID. The Lookup table contains GroupID, Description, and LookupID. I'm trying to display each person's ethnicity, age, and gender in the same row, based on each DailyFieldRecordID and PersonID. The problem I'm having is that the descriptions of ethnicity, age, and gender are all in the same column of the Lookup table. I've tried different approaches, but I am only able to get the correct information for one person. Any input would be helpful.
DailyFieldRecord:
DailyFieldRecordID:
1111
AB953:
DailyFieldRecordID:  LookupID:  GroupID:  PersonID:
1111                 1260       300       1
1111                 1262       200       1
1111                 1264       310       1
1111                 1258       300       2
1111                 1261       200       2
1111                 1265       310       2
Lookup:
GroupID:  Description:  LookupID:
300       white         1260
300       latin         1258
200       17            1262
200       18            1261
310       male          1264
310       female        1265
Select ab.DailyFieldRecordID, lkp.Description as Ethnicity,
lkp2.Description as Age, lkp3.Description as Gender,
ab.PersonID
FROM DailyFieldRecord dfr
LEFT JOIN AB953 ab ON ab.DailyFieldRecordID=dfr.DailyFieldRecordID
and ab.GroupID=300 and ab.PersonID=1
LEFT JOIN AB953 ab2 ON ab2.DailyFieldRecordID=dfr.DailyFieldRecordID
and ab2.GroupID=200 and ab2.PersonID=1
LEFT JOIN AB953 ab3 ON ab3.DailyFieldRecordID=dfr.DailyFieldRecordID
and ab3.GroupID=310 and ab3.PersonID=1
LEFT JOIN Lookup lkp ON lkp.LookupID=ab.ItemID
LEFT JOIN Lookup lkp2 ON lkp2.LookupID=ab2.ItemID
LEFT JOIN Lookup lkp3 ON lkp3.LookupID=ab3.ItemID
Current output:
DailyFieldRecordID: Ethnicity: Age: Gender: PersonID:
1111 white 17 male 1
Expected output:
DailyFieldRecordID: Ethnicity: Age: Gender: PersonID:
1111 white 17 male 1
1111 latin 18 female 2
Though I must say this is a very bad DB design, you are only getting the first person because you are using PersonID = 1 in the query. Please try the query below, which removes PersonID = 1 and pivots the descriptions with conditional aggregation. Each CASE needs a closing END, and Lookup is joined on LookupID so that every AB953 row picks up exactly its own description.
Select ab.DailyFieldRecordID
,MAX(CASE WHEN lkp.GroupID = 300 THEN lkp.Description END) as Ethnicity
,MAX(CASE WHEN lkp.GroupID = 200 THEN lkp.Description END) as Age
,MAX(CASE WHEN lkp.GroupID = 310 THEN lkp.Description END) as Gender
,ab.PersonID
FROM DailyFieldRecord dfr
LEFT JOIN AB953 ab ON ab.DailyFieldRecordID=dfr.DailyFieldRecordID
LEFT JOIN Lookup lkp ON lkp.LookupID=ab.LookupID
GROUP BY ab.DailyFieldRecordID, ab.PersonID
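For anyone who wants to check the conditional-aggregation pivot against the sample rows from the question, here is a minimal sketch using Python's sqlite3 module as a stand-in database. The table definitions are assumptions inferred from the question's description. The sketch joins Lookup on LookupID (not GroupID) and closes each CASE with END, so each person's three attribute rows collapse into one output row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DailyFieldRecord (DailyFieldRecordID INTEGER);
CREATE TABLE AB953 (DailyFieldRecordID INTEGER, LookupID INTEGER,
                    GroupID INTEGER, PersonID INTEGER);
CREATE TABLE Lookup (GroupID INTEGER, Description TEXT, LookupID INTEGER);
INSERT INTO DailyFieldRecord VALUES (1111);
INSERT INTO AB953 VALUES
  (1111, 1260, 300, 1), (1111, 1262, 200, 1), (1111, 1264, 310, 1),
  (1111, 1258, 300, 2), (1111, 1261, 200, 2), (1111, 1265, 310, 2);
INSERT INTO Lookup VALUES
  (300, 'white', 1260), (300, 'latin', 1258),
  (200, '17', 1262), (200, '18', 1261),
  (310, 'male', 1264), (310, 'female', 1265);
""")

# One output row per (record, person); each CASE keeps only the
# description belonging to its group, and MAX collapses the NULLs.
pivoted = conn.execute("""
SELECT ab.DailyFieldRecordID,
       MAX(CASE WHEN lkp.GroupID = 300 THEN lkp.Description END) AS Ethnicity,
       MAX(CASE WHEN lkp.GroupID = 200 THEN lkp.Description END) AS Age,
       MAX(CASE WHEN lkp.GroupID = 310 THEN lkp.Description END) AS Gender,
       ab.PersonID
FROM DailyFieldRecord AS dfr
LEFT JOIN AB953 AS ab ON ab.DailyFieldRecordID = dfr.DailyFieldRecordID
LEFT JOIN Lookup AS lkp ON lkp.LookupID = ab.LookupID
GROUP BY ab.DailyFieldRecordID, ab.PersonID
ORDER BY ab.PersonID
""").fetchall()

for row in pivoted:
    print(row)
```

Joining Lookup on GroupID instead would pair every AB953 row with all descriptions in that group, and MAX would then return the same value for both people.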

PostgreSQL - filter function for dates

I am trying to use the built-in filter function in PostgreSQL to filter for a date range in order to sum only entries falling within this time-frame.
I cannot understand why the filter isn't being applied.
I am trying to filter for all product transactions that have a created_at date of the previous month (so in this case that were created in June 2017).
SELECT pt.created_at::date, pt.customer_id,
sum(pt.amount/100::double precision) filter (where (date_part('month', pt.created_at) =date_part('month', NOW() - interval '1 month') and
date_part('year', pt.created_at) = date_part('year', NOW()) ))
from
product_transactions pt
LEFT JOIN customers c
ON c.id= pt.customer_id
GROUP BY pt.created_at::date,pt.customer_id
Please find my expected results (sum of the amount for each day in the previous month - for each customer_id if an entry for that day exists) and the actual results I get from the query - below (using date_trunc).
Expected results:
created_at| customer_id | amount
2017-06-30 1 220.5
2017-06-28 15 34.8
2017-06-28 12 157
2017-06-28 48 105.6
2017-06-27 332 425.8
2017-06-25 1 58.0
2017-06-25 23 22.5
2017-06-21 14 88.9
2017-06-17 2 34.8
2017-06-12 87 250
2017-06-05 48 135.2
2017-06-05 12 95.7
2017-06-01 44 120
Results:
created_at| customer_id | amount
2017-06-30 1 220.5
2017-06-28 15 34.8
2017-06-28 12 157
2017-06-28 48 105.6
2017-06-27 332 425.8
2017-06-25 1 58.0
2017-06-25 23 22.5
2017-06-21 14 88.9
2017-06-17 2 34.8
2017-06-12 87 250
2017-06-05 48 135.2
2017-06-05 12 95.7
2017-06-01 44 120
2017-05-30 XX YYY
2017-05-25 XX YYY
2017-05-15 XX YYY
2017-04-30 XX YYY
2017-03-02 XX YYY
2016-11-02 XX YYY
The actual results give me the sum for all dates in the database, so no date time-frame is being applied in the query for a reason I cannot understand. I'm seeing dates that are both not for June 2017 and also from previous years.
Use the date_trunc(..) function to compare whole months, and make sure every non-aggregated column in the select list also appears in the GROUP BY:
SELECT pt.created_at::date, pt.customer_id, c.name,
sum(pt.amount/100::double precision) filter (where date_trunc('month', pt.created_at) = date_trunc('month', NOW() - interval '1 month'))
from
product_transactions pt
LEFT JOIN customers c
ON c.id= pt.customer_id
GROUP BY pt.created_at::date, pt.customer_id, c.name
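One point worth adding, as a hedged aside: FILTER only restricts which rows feed the aggregate. It does not remove groups from the result, so days outside the month still appear, just with a NULL sum. Dropping those rows takes a WHERE clause; in PostgreSQL that could look like pt.created_at >= date_trunc('month', now() - interval '1 month') AND pt.created_at < date_trunc('month', now()). The sketch below shows the difference using Python's sqlite3 module as a stand-in for PostgreSQL. The sample rows, the fixed month '2017-06', and strftime (SQLite's substitute for date_trunc here) are assumptions for illustration, and the FILTER clause needs SQLite 3.30 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_transactions (customer_id INTEGER, created_at TEXT, amount INTEGER);
INSERT INTO product_transactions VALUES
  (1,  '2017-06-30 10:00:00', 22050),
  (23, '2017-06-25 08:30:00', 2250),
  (1,  '2017-05-30 09:00:00', 1000);
""")

# FILTER limits what is summed, but every (day, customer) group is still returned.
with_filter = conn.execute("""
SELECT date(created_at), customer_id,
       SUM(amount / 100.0) FILTER (WHERE strftime('%Y-%m', created_at) = '2017-06')
FROM product_transactions
GROUP BY date(created_at), customer_id
ORDER BY date(created_at) DESC, customer_id
""").fetchall()

# WHERE removes the rows before grouping, so only June groups remain.
with_where = conn.execute("""
SELECT date(created_at), customer_id, SUM(amount / 100.0)
FROM product_transactions
WHERE strftime('%Y-%m', created_at) = '2017-06'
GROUP BY date(created_at), customer_id
ORDER BY date(created_at) DESC, customer_id
""").fetchall()

print(with_filter)
print(with_where)
```

In the FILTER version the May row is still returned with a NULL amount, which matches the symptom in the question; the WHERE version returns only the June rows.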

Problems with Group By - want to call a column without using in Group By (T-SQL, SQL Server)

I want to be able to select the top Max(HR) leaders by LgID and YearID. But I also want the Player's name column. (T-SQL, SQL Server 2012 Express)
When I query with the player name it returns '0' for every Max(HR) output. It seems SQL Server 2012 Express won't allow me to omit the PlayerID in the group by when I have it in the select statement. Is there a way to get by this? A Case when? Or something else?
Select
playerID,
yearID,
lgID,
Max(HR) HR_Leader
from batting
group by
yearID,
lgID
order by
yearID desc,
lgID,
Max(HR)
Returns this error:
Msg 8120, Level 16, State 1, Line 2
Column 'batting.playerID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
But when I comment out the PlayerID, it runs, but I have no name, as seen here:
Select
--playerID,
yearID,
lgID,
Max(HR) HR_Leader
from batting
group by
yearID,
lgID
order by
yearID desc,
lgID,
Max(HR)
yearID lgID HR_Leader
2013 AL 53
2013 NL 36
2012 AL 44
2012 NL 41
2011 AL 43
2011 NL 39
2010 AL 54
2010 NL 42
2009 AL 39
2009 NL 47
Update after 1st comment/question.
Select
playerID,
yearID,
lgID,
Max(HR) HR_Leader
from batting
group by
playerID,
yearID,
lgID
order by
yearID desc,
lgID,
Max(HR) desc
The query returns this, which doesn't look like the first output:
playerID yearID lgID HR_Leader
davisch02 2013 AL 53
cabremi01 2013 AL 44
encared01 2013 AL 36
dunnad01 2013 AL 34
trumbma01 2013 AL 34
jonesad01 2013 AL 33
longoev01 2013 AL 32
ortizda01 2013 AL 30
mossbr01 2013 AL 30
beltrad01 2013 AL 30
What I want is this:
PlayerID yearID lgID HR_Leader
Player1 2013 AL 53
Player2 2013 NL 36
Player3 2012 AL 44
Player4 2012 NL 41
Player5 2011 AL 43
Player6 2011 NL 39
Player7 2010 AL 54
Player8 2010 NL 42
Player9 2010 NL 42
Here is a simple way. Use a common table expression (CTE) to get the top HR for each league and year, then join that back to batting to get the players who own those numbers. The sample data includes a tie, which returns both players in no particular order.
CREATE TABLE batting (playerID INT, yearID INT, lgID CHAR(2), HR INT)
INSERT INTO batting
SELECT 1, 2010, 'AL', 40 UNION
SELECT 2, 2010, 'AL', 35 UNION
SELECT 3, 2010, 'NL', 35 UNION
SELECT 4, 2010, 'NL', 30 UNION
SELECT 5, 2011, 'AL', 50 UNION
SELECT 6, 2011, 'AL', 45 UNION
SELECT 3, 2011, 'NL', 45 UNION
SELECT 7, 2011, 'NL', 45 UNION
SELECT 4, 2011, 'NL', 40
;WITH cte AS (
SELECT yearID
,lgID
,MAX(HR) HR_Leader
FROM batting
GROUP BY yearID
,lgID
)
SELECT playerID
,c.*
FROM batting b
INNER JOIN
cte c ON b.yearID=c.yearID
AND b.lgID=c.lgID
AND b.HR=c.HR_Leader
ORDER BY c.yearID DESC
,c.lgID
DROP TABLE batting

Select latest date

SELECT
distinct
HRM_Employee.EmployeeId EmployeeXId,
([HRM_Employee].[FirstName] +' '+ISNULL([HRM_Employee].[MiddleName],' ')+' '+ISNULL([HRM_Employee].[LastName],' ')) AS FirstName
-- ,[FirstName]
,[HRM_Employee].[MiddleName]
,[HRM_Employee].[LastName]
,[HRM_Employee].[Code]
,[HRM_Employee].[UserName]
,[HRM_Employee].[Password]
,[HRM_Employee].[DateOfBirth]
,[HRM_Employee].[OriginalBirthDate]
,[HRM_Employee].[Gender]
,[HRM_Employee].[BloodGroup]
,[HRM_Employee].[Height]
,[HRM_Employee].[MaritalStatus]
,[HRM_Employee].[DateOfMarriage]
,[HRM_Employee].[IdentificationMark1]
,[HRM_Employee].[IdentificationMark2]
,[HRM_Employee].[Religion]
,(SELECT [A].[FirstName] +' '+ [A].[MiddleName] +' '+ [A].[LastName]
FROM [dbo].[HRM_Employee] [A] WHERE [A].EmployeeId = [HRM_Transfer].[ReportingOfficerXId]
) [PersonInCharge]
,[HRM_Department].[Name] [DepartmentName]
,[HRM_Branch].[Name] [BranchName]
,[HRM_Division].[Name] [DivisionName]
,[HRM_Designation].[Name] [DesignationName]
,HRM_Transfer.TransferDate
from HRM_Employee
LEFT join [dbo].[HRM_Division]
ON [HRM_Employee].DivisionXId = [HRM_Division].DivisionId
JOIN [dbo].[HRM_Designation]
ON [HRM_Employee].DesignationXId = [HRM_Designation].DesignationId
JOIN [HRM_Department]
ON [HRM_Employee].[DepartmentXId] = [HRM_Department].[DepartmentId]
JOIN [HRM_Branch]
ON [HRM_Employee].[BranchXId] = [HRM_Branch].[BranchId]
INNER JOIN HRM_Transfer
ON HRM_Transfer.EmployeeXId=HRM_Employee.EmployeeId
WHERE
Convert(varchar(11),HRM_Transfer.TransferDate,103) <=Convert(varchar(11), getdate(),103)
END
When I execute this, I get the following output:
EmployeeXId FirstName TransferDate
34 Ambarish V 2012-08-09 00:00:00.000
54 Anil N P 2012-08-09 00:00:00.000
55 Ann Rose Abraham 2012-08-08 00:00:00.000
55 Ann Rose Abraham 2012-08-09 00:00:00.000
74 Anees M S 2012-08-09 00:00:00.000
From this, I want to display only the row with the latest transfer date for each employee. That is, for EmployeeId 55 I need to display only the row with TransferDate 2012-08-09 00:00:00.000. What modification do I need to make to the above SP to get the required result? Please help me solve this.
Group by EmployeeXId and select MAX(TransferDate).
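A minimal sketch of that suggestion, using Python's sqlite3 module as a stand-in for SQL Server and only the two columns that matter here (the cut-down HRM_Transfer table and the LatestTransferDate alias are assumptions for illustration). In the full procedure, an aggregate like this would typically sit in a derived table that is joined back to HRM_Transfer on both EmployeeXId and TransferDate, so the other columns can still be selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE HRM_Transfer (EmployeeXId INTEGER, TransferDate TEXT);
INSERT INTO HRM_Transfer VALUES
  (34, '2012-08-09 00:00:00.000'),
  (54, '2012-08-09 00:00:00.000'),
  (55, '2012-08-08 00:00:00.000'),
  (55, '2012-08-09 00:00:00.000'),
  (74, '2012-08-09 00:00:00.000');
""")

# One row per employee, carrying only the most recent transfer date.
latest_transfers = conn.execute("""
SELECT EmployeeXId, MAX(TransferDate) AS LatestTransferDate
FROM HRM_Transfer
GROUP BY EmployeeXId
ORDER BY EmployeeXId
""").fetchall()

for row in latest_transfers:
    print(row)
```

Employee 55 now appears once, with the 2012-08-09 transfer only.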