Creating a column that returns date based on various conditions - tsql

Context: I'm fairly new to coding as a whole and am learning SQL. This is one of my practice/training sessions.
I'm trying to create a dimension table called "Employee Info" using the AdventureWorks2019 public database. Below is my attempt at a query to fetch all the data needed for this table.
SELECT
e.BusinessEntityID AS EmployeeID,
EEKey = ROW_NUMBER() OVER(ORDER BY(SELECT NULL)),
p.FirstName,
p.MiddleName,
p.LastName,
p.PersonType,
e.Gender,
e.JobTitle,
ep.Rate,
ep.PayFrequency,
e.BirthDate,
e.HireDate,
ep.RateChangeDate AS PayFrom,
e.MaritalStatus
FROM HumanResources.Employee AS e
FULL JOIN Person.Person AS p ON p.BusinessEntityID = e.BusinessEntityID
FULL JOIN Person.BusinessEntityAddress AS bea ON bea.BusinessEntityID = e.BusinessEntityID
FULL JOIN HumanResources.EmployeePayHistory AS ep ON ep.BusinessEntityID = e.BusinessEntityID
WHERE
PersonType='SP'
OR PersonType='EM'
ORDER BY EmployeeID;
Query result
Each employee (EE for short) will have a unique [EmployeeID]. The [EEKey] is simply used to mark ordinal numbers of each record.
EEs are paid different rates shown in the [Rate] column. There will be duplicate records if any EE receives a change in his/her pay rate.
There is currently a [PayFrom] column indicating the first date a pay rate is being applied to each record.
Current requirement: create a [PayTo] column to the right of [PayFrom] that returns the last date on which each EE is paid the corresponding pay rate. There are 2 scenarios:
If the EE being checked has multiple records, meaning his/her pay rate was adjusted at some point, [PayTo] returns the [PayFrom] date of that EE's next record minus 1 day.
If the EE being checked has no later record indicating a pay rate change, [PayTo] returns a fixed, specified date (say 31/12/2070).
Example:
[EmployeeID] no. 4 - Rob Walters has 3 consecutive records, in Lines 4, 5, and 6. In Line 4, the [PayTo] column is expected to return the [PayFrom] date of Line 5 minus 1 day (2010-05-30). The same rule applies to Line 5, which returns (2011-12-14).
As for Line 6, since there is no later record for this EE to fetch data from, it returns the specified date (2070-12-31), the same rule that applies to every single-record EE.
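The two scenarios above map onto a "look at the next row of the same employee" window function. Below is a minimal sketch, not the definitive solution: it uses Python's built-in sqlite3 so it can be run anywhere, and the table name pay_history, the rates, and the earliest dates are made up for illustration. In SQL Server the same idea would typically be written with LEAD(...) OVER (PARTITION BY ... ORDER BY ...) wrapped in DATEADD(DAY, -1, ...) and ISNULL or COALESCE for the fixed end date.

```python
import sqlite3

# Hypothetical miniature of the pay-history rows. Table and column names
# are made up for this sketch, as are the rates and the earliest dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pay_history (EmployeeID INTEGER, Rate REAL, PayFrom TEXT)")
conn.executemany(
    "INSERT INTO pay_history VALUES (?, ?, ?)",
    [
        (4, 8.62, "2007-12-05"),
        (4, 23.72, "2010-05-31"),
        (4, 29.85, "2011-12-15"),
        (5, 32.69, "2008-01-06"),  # single-record employee
    ],
)

# LEAD() reads PayFrom from the next row of the same employee.
# For the last (or only) row LEAD() yields NULL, date(NULL, ...) stays NULL,
# and COALESCE substitutes the fixed end date.
query = """
SELECT EmployeeID,
       PayFrom,
       COALESCE(
           date(LEAD(PayFrom) OVER (PARTITION BY EmployeeID ORDER BY PayFrom), '-1 day'),
           '2070-12-31'
       ) AS PayTo
FROM pay_history
ORDER BY EmployeeID, PayFrom
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Partitioning by EmployeeID is what keeps one employee's last record from picking up the next employee's first [PayFrom].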
As I have mentioned, I am a fresher and completely new to coding, so my interpretation and method might be off. If you could point out what I'm doing wrong or show me what I should do to solve this issue, it would be much appreciated.

Related

How to select max date value while selecting max value

I have the following sample from a table of student results, with dates, for a school entry exam.
First student passed the exam - this is the most common record found for most students
Second student failed the first entry attempt and passed the second time, based on the date
Third student had an incorrect entry that was corrected, based on the Version
I need the results to look like the picture above, so we take the latest date and the highest version into account!
My basic query thus far is
select studentid
,examdate --(Date)
,result -- (varchar)
from StudentEntryExam
How should I approach this issue?
SELECT DISTINCT ON (studentid)
*
FROM mytable
ORDER BY studentid, examdate DESC, version DESC
DISTINCT ON returns the first record of each ordered group. In this case the groups are the studentids. You must find the sort order that puts the required record first. So you need to order by studentid, of course. Then you need the most recent examdate first, which is achieved with DESC order. If there are two records on the same date, you also need the highest version first, again using the DESC modifier.
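DISTINCT ON is specific to PostgreSQL. The same "first row of each ordered group" idea can also be expressed with ROW_NUMBER(), which is a different technique from the query above. A minimal sketch follows, run through Python's built-in sqlite3; the sample rows are invented for illustration, since the original data was only shown as a picture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (studentid INTEGER, examdate TEXT, result TEXT, version INTEGER)"
)
# Invented sample rows covering the three cases from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-10", "pass", 1),  # passed first time
        (2, "2020-01-10", "fail", 1),  # failed, then ...
        (2, "2020-03-15", "pass", 1),  # ... passed on a later date
        (3, "2020-01-10", "fail", 1),  # wrong entry, then ...
        (3, "2020-01-10", "pass", 2),  # ... corrected by a higher version
    ],
)

# Number the rows of each student so that the latest date, and within a
# date the highest version, gets rn = 1; then keep only that row.
query = """
SELECT studentid, examdate, result, version
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY studentid
               ORDER BY examdate DESC, version DESC
           ) AS rn
    FROM mytable
) AS ranked
WHERE rn = 1
ORDER BY studentid
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```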

Compare 2 Tables When 1 Is Null in PostgreSQL

I am fairly new to PostgreSQL and am having difficulty getting the result that I want.
To get the appropriate result I need to make multiple joins, and I also have difficulty counting and grouping them in one query.
The table names are as follows: pers_person, pers_position, and acc_transaction.
What I want to accomplish is:
To see who was absent on which date by comparing pers_person with acc_transaction. If there is any record for that date, that is fine, but if there is no record (NULL), the person was definitely absent.
I want to count the absences per pers_person, i.e. how many times in the month this person was absent.
The person's hire_date should also be considered: a person hired in November should be filtered out of the October report.
The pers_position table provides the position information of that person.
SELECT tr.create_time::date AS Date, pers.pin, tr.dept_name, tr.name, tr.last_name, pos.name, Count(*)
FROM acc_transaction AS tr
RIGHT JOIN pers_person as pers
ON tr.pin = pers.pin
LEFT JOIN pers_position as pos
ON pers.position_id=pos.id
WHERE tr.event_no = 0 AND DATE_PART('month', DATE)=10 AND DATE_PART('month', pr.hire_date::date)<=10 AND pr.pin IS DISTINCT FROM tr.pin
GROUP BY DATE
ORDER BY DATE
*This is the report for October.
*Pin is the ID number
I'd start by:
changing the RIGHT JOIN to a LEFT JOIN, as the two work the same way in reverse but it's confusing to keep both in mind
removing the pers_position table for now, as it only adds information and does not change which rows are returned
replacing the unknown alias pr with pers, which I assume is what was meant (?)
removing the strange WHERE conditions that this leads to:
"pers.pin IS DISTINCT FROM tr.pin" (this contradicts the join condition tr.pin = pers.pin, so combined with tr.event_no = 0 no row can satisfy it)
"AND DATE_PART('month', DATE)=10" (DATE is the SELECT alias here, and PostgreSQL does not allow an output alias to be referenced in WHERE)
Giving the resulting query:
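SELECT tr.create_time::date AS date, pers.pin, tr.dept_name, tr.name, tr.last_name, COUNT(*)
FROM pers_person AS pers
LEFT JOIN acc_transaction AS tr ON tr.pin = pers.pin
-- note: filtering on tr in WHERE drops persons with no transactions
WHERE tr.event_no = 0
AND DATE_PART('month', pers.hire_date::date) <= 10
-- every non-aggregated column must appear in GROUP BY
GROUP BY tr.create_time::date, pers.pin, tr.dept_name, tr.name, tr.last_name
ORDER BY date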
At the end, I don't know if that answers the question, since the title says "Compare 2 Tables When 1 Is Null in PostgreSQL" and the content of the question says nothing about it.
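For the absence count itself, one common pattern is to build the list of report dates, pair each date with every person already hired on that date, and keep the pairs that have no matching transaction. The following is only a sketch under stated assumptions: it runs on Python's built-in sqlite3 rather than PostgreSQL, the sample rows and the four-day report window are invented, and it ignores event_no, weekends, and holidays. In PostgreSQL the date list could be produced with generate_series instead of the recursive CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pers_person (pin TEXT, name TEXT, hire_date TEXT);
CREATE TABLE acc_transaction (pin TEXT, create_time TEXT);
-- invented sample rows
INSERT INTO pers_person VALUES
    ('A1', 'Ann', '2020-01-01'),
    ('B2', 'Bob', '2021-10-03'),
    ('C3', 'Cy',  '2021-11-01');
INSERT INTO acc_transaction VALUES
    ('A1', '2021-10-01 08:00:00'),
    ('A1', '2021-10-02 08:05:00'),
    ('B2', '2021-10-03 09:00:00');
""")

# days: one row per date in the report window.
# The LEFT JOIN keeps every (date, person) pair; pairs without a
# transaction have tr.pin IS NULL and count as an absence.
query = """
WITH RECURSIVE days(d) AS (
    SELECT :start
    UNION ALL
    SELECT date(d, '+1 day') FROM days WHERE d < :end
)
SELECT p.pin, COUNT(*) AS absences
FROM days
JOIN pers_person AS p ON p.hire_date <= days.d
LEFT JOIN acc_transaction AS tr
       ON tr.pin = p.pin AND date(tr.create_time) = days.d
WHERE tr.pin IS NULL
GROUP BY p.pin
ORDER BY p.pin
"""
absences = conn.execute(query, {"start": "2021-10-01", "end": "2021-10-04"}).fetchall()
print(absences)
```

Persons with no absences in the window do not appear in this result, and a person hired after the window (Cy here) is never paired with any date.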

Tableau: Getting Aggregate Count Based on Boolean Attributes

I am really new to Tableau and need help with some calculations.
My simplified data consists of three columns:
customer no, transaction date, lost_flag
Here lost_flag is a boolean that is true if a customer has not made a transaction in the last 365 days:
(max([transaction date]) < dateadd('year',-1,max([Report Date])))
I need to find the:
1. number of customers that are lost
2. number of customers that are not lost
3. attrition rate
For number one, I initially did
countd(if ([Lost_flag]) then [Customer No] else "" END)
But obviously it did not work.
Note: Customer_No is not unique here since this is a transactional sales data source
Thanks in advance.
First you need to make sure that your lost flag is calculated at the customer level rather than the transaction level. To do this, use the following formula. It is similar to yours, but I have fixed it at the customer ID and replaced the report date with today's date:
Lost Flag = { FIXED [Customer ID]: (max([Transaction Date])<dateadd('year',-1,max(TODAY())))}
This adds a TRUE or FALSE flag against every transaction for a customer. It is important that this is fixed at the customer ID level rather than the transaction level; otherwise all old transactions for a customer will be flagged as lost even if the customer has a recent transaction.
So in order to see how many customers are lost do the following:
1) drag lost_flag onto the rows shelf
2) drag customer id onto the text mark and then right click- measure - count distinct.
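To make the customer-level logic concrete outside Tableau, here is a small Python sketch of the same computation. It is an illustration only: the transactions and the report date are invented, and it uses a fixed 365-day cutoff as an approximation of dateadd('year', -1, ...).

```python
from datetime import date, timedelta

# Invented transactional rows: (customer no, transaction date).
transactions = [
    ("C1", date(2019, 1, 5)),
    ("C1", date(2020, 6, 1)),
    ("C2", date(2018, 3, 9)),
    ("C3", date(2020, 2, 20)),
    ("C3", date(2017, 7, 7)),
]
report_date = date(2020, 12, 31)  # assumed report date


def lost_flags(rows, today):
    """Return {customer: True if the latest transaction is older than 365 days}."""
    last_seen = {}
    for customer, when in rows:
        # Keep only the most recent transaction per customer,
        # which is what fixing the calculation at customer level does.
        if customer not in last_seen or when > last_seen[customer]:
            last_seen[customer] = when
    cutoff = today - timedelta(days=365)
    return {customer: last < cutoff for customer, last in last_seen.items()}


flags = lost_flags(transactions, report_date)
lost = sum(flags.values())           # 1. customers that are lost
not_lost = len(flags) - lost         # 2. customers that are not lost
attrition_rate = lost / len(flags)   # 3. attrition rate
print(lost, not_lost, attrition_rate)
```

Because each customer appears once in the flags dictionary, the counts are distinct-customer counts even though the source rows are transactional.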

Checking for rows relating to previous days which might not exist

I am having some trouble checking whether or not I received prices yesterday for, let's say, my apples.
The tricky part is that in the table where prices are stored, there won't be any row relating to yesterday if I did not get prices yesterday. So how can I run my check every day if I want to be sure that I got some prices the day before?
If you have a Calendar table (see here for example) with a field called Date and making some assumptions about your data structure:
SELECT c.[Date],
ISNULL(p.Prices,'No Prices')
FROM Calendar c
LEFT JOIN Prices p ON c.[Date] = p.[Date]
Your question is not very clear, but it actually might even be as simple as just checking for the presence of a row for the previous day, rather than reporting across all dates (in this case I consider there are multiple products):
SELECT DISTINCT
prod.Product,
CASE WHEN prev.Product IS NULL
THEN 'No Prices for yesterday'
ELSE 'Prices recorded for yesterday'
END AS PricesYesterday
FROM Prices prod
LEFT JOIN Prices prev ON prev.Product = prod.Product
AND prev.[Date] = dateadd(day,datediff(day,0,GETDATE()),0) - 1
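The self-join above can be tried out on a tiny data set. This is a sketch only, using Python's built-in sqlite3 instead of SQL Server; the Prices layout (Product, Date, Price) and the sample rows are assumptions, and "yesterday" is computed in Python and passed as a parameter rather than with GETDATE().

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Prices (Product TEXT, Date TEXT, Price REAL)")
# Invented rows: pears have no price on 2024-03-10.
conn.executemany(
    "INSERT INTO Prices VALUES (?, ?, ?)",
    [
        ("apples", "2024-03-09", 1.10),
        ("apples", "2024-03-10", 1.15),
        ("pears", "2024-03-09", 2.00),
    ],
)


def products_missing_yesterday(connection, today):
    """List products that have rows in Prices but none dated the day before `today`."""
    yesterday = (today - timedelta(days=1)).isoformat()
    rows = connection.execute(
        """
        SELECT DISTINCT prod.Product
        FROM Prices AS prod
        LEFT JOIN Prices AS prev
               ON prev.Product = prod.Product AND prev.Date = ?
        WHERE prev.Product IS NULL   -- no row for yesterday was found
        ORDER BY prod.Product
        """,
        (yesterday,),
    ).fetchall()
    return [r[0] for r in rows]


print(products_missing_yesterday(conn, date(2024, 3, 11)))
```

Note that this only reports products that already have at least one row in Prices; a product that has never received a price would need a separate product list to join from.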

SQL - using the Min field to achieve desired result

Wondering the best SQL to handle below situation: Client only wants to see invoices that have been declined. I started with only show me when STATUS_ID = 2, but then realized that it was paid as it was resubmitted and accepted so that didn't work. What is the best way to handle 2 records like below where I don't want the SQL to return any records if manifest + order code have a 1. Would you do a Min on Status ID or something of that nature?
VENDOR NAME manifest ORDER_CODE STATUS_ID
VENDOR 12345 BHGSDKJF1234 RU07 2 (invoice decline)
VENDOR 12345 BHGSDKJF1234 RU07 1 (paid)
This trick can work for you in this case, but it doesn't solve the general case (what happens if the STATUS_ID for paid is 3, and the possible values are 0-5?).
In general you can use a CASE expression that gives you 1 (true) if the row has STATUS_ID = 1, and 0 otherwise. Then pick the MAX() for each invoice.
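The CASE plus MAX() idea can be sketched as follows, run through Python's built-in sqlite3. The table name invoices, the lowercase column names, and the second invoice are hypothetical; the first two rows mirror the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (manifest TEXT, order_code TEXT, status_id INTEGER)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [
        ("BHGSDKJF1234", "RU07", 2),    # declined ...
        ("BHGSDKJF1234", "RU07", 1),    # ... then resubmitted and paid
        ("HYPOTHETICAL01", "RU08", 2),  # declined and never paid (invented row)
    ],
)

# For each invoice, MAX(CASE ...) is 1 if any of its rows has that status.
# Keep invoices that were declined at least once and never paid.
query = """
SELECT manifest, order_code
FROM invoices
GROUP BY manifest, order_code
HAVING MAX(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) = 0
   AND MAX(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) = 1
ORDER BY manifest, order_code
"""
declined_only = conn.execute(query).fetchall()
print(declined_only)
```

Unlike MIN(STATUS_ID), this does not depend on the numeric ordering of the status codes, so it keeps working if the paid status is 3 rather than 1 (only the literals in the CASE expressions change).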
You can also consider another design that might work for you:
Add a time/timestamp column (for your purpose, you could perhaps use SYSDATE at insertion time of the record into the db).
Once you have a time column, you can choose the most recent STATUS_ID for each invoice (take the STATUS_ID from the row with the max time).