Redshift - Extracting data from table based on filter condition - amazon-redshift

I have some sales data that shows whether a store has made a sale or not. I am trying to pull out all stores that have made no sale to date. Given below is the sample data I am working with.
store_name,sale_made,count
store_a,0,100
store_a,1,23
store_b,1,18
store_c,0,32
store_d,0,50
store_d,1,70
Expected output:
store_name,sale_made,count
store_c,0,32
The reason is that only store_c in that list has a row with sale_made = 0 and no row with sale_made = 1.

You need something like this, which keeps only the stores whose sale_made values sum to zero:
SELECT store_name FROM table
GROUP BY 1
HAVING SUM(sale_made)=0
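The expected output in the question also lists sale_made and count, so here is a minimal sketch that feeds the same HAVING filter into an outer query to return the full rows. It runs on SQLite through Python only so it can be tried locally; the table name sales is an assumption, and the SQL itself should carry over to Redshift largely unchanged.

```python
import sqlite3

# Sample rows from the question; the table name "sales" is an assumption.
rows = [
    ("store_a", 0, 100),
    ("store_a", 1, 23),
    ("store_b", 1, 18),
    ("store_c", 0, 32),
    ("store_d", 0, 50),
    ("store_d", 1, 70),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE sales (store_name TEXT, sale_made INTEGER, "count" INTEGER)')
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Inner query: stores whose sale_made values sum to zero (no sale at all).
# Outer query: return the complete rows for those stores.
query = """
SELECT store_name, sale_made, "count"
FROM sales
WHERE store_name IN (
    SELECT store_name
    FROM sales
    GROUP BY store_name
    HAVING SUM(sale_made) = 0
)
ORDER BY store_name
"""
no_sale_rows = conn.execute(query).fetchall()
print(no_sale_rows)
```

This relies on sale_made only ever being 0 or 1; if it could be NULL or negative, MAX(sale_made) = 0 would be a safer condition.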

Related

I have problem in getting data perfectly in adf

Consider a CSV file with employee details and attendance, where attendance is marked with 0 and 1: 1 indicates the employee is present and 0 indicates the employee is absent. My problem is to get the working date of each employee. If the employee is present (1), it should be the same day. If the employee is absent (0), it should be the next working day, obtained by reading the previous row.
emp id,working,working day
123,1,11/14/2022
123,0,11/15/2022
123,1,11/14/2022
I have tried using a data flow in ADF, but I am not getting the expected result. Please provide a solution in Azure Data Factory.
To manipulate CSV data in Data Factory, use a Data Flow activity in Azure Data Factory.
I reproduced your scenario in my environment. Please follow the steps below to resolve the issue:
I took a sample file as the source in the Data Flow activity, similar to the data you provided (assuming you don't have date values when the employee is absent):
Then I added a Window transformation over emp id, sorted by emp id, and created a window column named working date that holds the previous row's date plus one day, using the expression:
lag(addDays({working day}, 1),1)
Window transformation data preview:
Next, I added a Derived Column transformation to get the working date of the employee whether they are present or absent. It updates the column working day: if working is 0, the value becomes working date; otherwise it stays working day.
iif(working==0,{working date},{working day})
Derived column transformation data preview:
Finally, use a Select transformation to drop the unnecessary columns and store the data in the sink.
Select transformation data preview:
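To make the logic of the two expressions easier to follow outside ADF, here is a rough Python sketch of the same idea: a per-employee lag of the previous row's date plus one day, then a conditional pick. The rows are hypothetical (the absent row has no date, as the answer assumes) and fill_working_day is an illustrative helper, not an ADF function.

```python
from datetime import date, timedelta

# Hypothetical rows (emp id, working, working day); the absent row has no date.
rows = [
    (123, 1, date(2022, 11, 14)),
    (123, 0, None),
    (123, 1, date(2022, 11, 16)),
]

def fill_working_day(rows):
    """Mimic lag(addDays({working day}, 1), 1) followed by the iif step."""
    result = []
    prev_day = {}  # previous row's working day per emp id (the window partition)
    for emp_id, working, working_day in rows:
        previous = prev_day.get(emp_id)
        # Window column "working date": previous row's working day plus one day.
        working_date = previous + timedelta(days=1) if previous is not None else None
        # Derived column: use the window column only when the employee is absent.
        final_day = working_date if working == 0 else working_day
        result.append((emp_id, working, final_day))
        prev_day[emp_id] = working_day  # lag reads the original column value
    return result

filled = fill_working_day(rows)
for row in filled:
    print(row)
```

Because the lag reads the original column, two consecutive absent rows would leave the second one without a date; that case would need extra handling.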

Selecting record based on dates

I have a sample source table here and currently, it is returning multiple records due to multiple tutors.
I want to select only the tutor_name if the exam_date is between the effective date and the next record's effective date.
Sample desired output:

Tableau Calculated Field to Display Specific Rows

I'm trying to find a way to display only certain rows of my data based off a very specific criteria. I will try to explain it the best way I can. Let's start with a screenshot here:
Picture of part of the Tableau sheet as-is
What I'm trying to do is create a way to display only the values of "Order: Sales Order #" that have a value filled in for "Item: Connected Product Category". As you see on the screenshot, order number 15589543 has one Connected Product Category that displays "Connectable".
Since this order number does not have only null fields for the Connected Product Category, I would like ALL of the rows (even the blank ones) to be displayed for order # 15589543. If an order # has NO rows that have "connectable" displayed in them (orders 10305573, 15573299, 15699578, etc.) I would like these orders to be filtered out.
This is a screenshot of just a small part of the data. Basically, if an order has a "connectable" field in it, I need all of the rows for that order # to be displayed.
I tried to do logic such as IF [Item: Connected Product Category] = "Connectable" THEN [Order: Sales Order #] ELSE NULL END but this only displays the rows that literally contain "connectable" in them, not all of the rows for that order number.
Any assistance would be greatly appreciated. After extensive research I'm not sure if this is even possible. Thanks
It is simple. Create a calculated field (named, for example, desired filter) as:
{FIXED [Order: Sales Order #] : SUM(
IF [Item: Connected Product Category] = 'Connectable' THEN 1 ELSE 0 END
)} > 0
This calculated field evaluates to TRUE or FALSE, and setting a filter on this field to TRUE will filter the records as desired.
Try this. Good luck
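For readers who think in SQL, the FIXED calculation behaves roughly like a per-order windowed sum that is attached to every row of that order. The sketch below shows this on SQLite through Python; the table name order_items, the column names, and the rows are assumptions made for illustration, and it needs an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Hypothetical rows: (order number, connected product category or NULL).
rows = [
    (15589543, "Connectable"),
    (15589543, None),
    (15589543, None),
    (10305573, None),
    (15573299, None),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (order_no INTEGER, category TEXT)")
conn.executemany("INSERT INTO order_items VALUES (?, ?)", rows)

# The windowed SUM plays the role of {FIXED [Order: Sales Order #] : SUM(...)}:
# every row of an order carries that order's count of 'Connectable' rows.
query = """
SELECT order_no, category
FROM (
    SELECT order_no, category,
           SUM(CASE WHEN category = 'Connectable' THEN 1 ELSE 0 END)
               OVER (PARTITION BY order_no) AS connectable_rows
    FROM order_items
)
WHERE connectable_rows > 0
"""
kept = conn.execute(query).fetchall()
print(kept)
```

All three rows of order 15589543 survive, including the blank ones, while orders with no 'Connectable' row are dropped.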

Identifying next closest record by date in tableau

I have a table of users and another table of transactions.
The transactions all have a date against them. What I am trying to ascertain for each user is the average time between transactions.
User | Transaction Date
-----+-----------------
A | 2001-01-01
A | 2001-01-10
A | 2001-01-12
Consider the above transactions for user A. I am basically looking for the distance from one transaction to the next chronologically to determine the distances.
There are 9 days between transactions one and two, and there are 2 days between transactions two and three. The average of these is 5.5, so I would want to identify the average time between user A's transactions to be 5.5 days.
Any idea of how to achieve this in Tableau?
I am trying to create a calculated field for each transaction to identify the date of the "next" transaction but I am struggling.
{ FIXED [user id] : MIN(IF [Transaction Date] > **this transaction date** THEN [Transaction Date]) }
I am not sure what to replace this transaction date with or whether this is the right approach at all.
Any advice would be greatly appreciated.
LODs don't have access to previous values directly, so you need to create a self join in your data connection. Follow the steps below to achieve what you want.
Create a self join on your data with the following criteria
Create an LOD calculation as below
{FIXED [User],[Transaction Date]:
MIN(DATEDIFF('day',[Transaction Date],[Transaction Date (Data1)]))
}
Build the View
PS: If you want to improve the performance, Custom SQL might be the way.
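If you go the Custom SQL route, the self join could look roughly like the sketch below, run here on SQLite through Python so it can be tested locally. The join criteria (same user, later transaction date) are an assumption about what the self join is meant to do, and the table and column names are made up for illustration.

```python
import sqlite3

# Sample transactions for user A from the question.
rows = [("A", "2001-01-01"), ("A", "2001-01-10"), ("A", "2001-01-12")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id TEXT, transaction_date TEXT)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)

# Pair each transaction with every later one of the same user,
# then keep the smallest gap, i.e. the gap to the next transaction.
query = """
SELECT t1.user_id, t1.transaction_date,
       MIN(julianday(t2.transaction_date) - julianday(t1.transaction_date)) AS days_to_next
FROM transactions t1
JOIN transactions t2
  ON t2.user_id = t1.user_id
 AND t2.transaction_date > t1.transaction_date
GROUP BY t1.user_id, t1.transaction_date
ORDER BY t1.transaction_date
"""
gaps = conn.execute(query).fetchall()
average_gap = sum(days for _, _, days in gaps) / len(gaps)
print(gaps, average_gap)
```

The last transaction of each user has no later partner and therefore drops out of the join, which is what you want when averaging gaps.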
The only type of calculation that can take order sequence into account (e.g., when the value for a calculated field depends on the value of the immediately preceding row) is a table calc. You can't use an LOD calc for this kind of problem.
You'll need to understand how partitioning and addressing works with table calcs, along with specifying your sort order criteria. See the online help. You can then do something like, for example, define days_since_last_transaction as:
if first() < 0 then min([Transaction Date]) -
lookup(min([Transaction Date]), -1) end
If you have very large data, or for other reasons want to do your calculations in the database instead of in Tableau with a table calc, you can use SQL window (aka analytic) queries instead via Tableau's custom SQL.
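As a rough illustration of the window-query option, the sketch below uses LAG to fetch each user's previous transaction date and averages the gaps. It runs on SQLite (3.25 or later) through Python; the table and column names are assumptions, and the date arithmetic would need to be adapted to your database's functions.

```python
import sqlite3

# Sample transactions for user A from the question.
rows = [("A", "2001-01-01"), ("A", "2001-01-10"), ("A", "2001-01-12")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id TEXT, transaction_date TEXT)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)

# LAG returns the previous transaction date within each user's partition;
# it is NULL for the first transaction, and AVG skips NULL gaps.
query = """
SELECT user_id, AVG(days_since_last) AS avg_days_between
FROM (
    SELECT user_id,
           julianday(transaction_date)
             - julianday(LAG(transaction_date) OVER (
                   PARTITION BY user_id ORDER BY transaction_date)) AS days_since_last
    FROM transactions
)
GROUP BY user_id
"""
averages = conn.execute(query).fetchall()
print(averages)
```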
Please attach an example workbook and anything you tried along with the error you have.
This might not be useful if you cannot set the User ID field as a filter.
If you can, set User ID as a filter. Then following the steps mentioned here will let you calculate the difference between any two dates. Ideally, if you select any one value in the filter, the calculated field from the link should give you the difference between the dates in the transaction date column.

Trying to select second oldest record for each customer and group them by month shipped

I'm trying to create a report which picks the 2nd oldest order for every user and groups them by month/year. I know how to select the oldest order for each, but can't figure out how to get the second oldest.
The input data is a table which has every customer's order as a single row, with the relevant columns being
order.id, order.user_id, order.date, product.name...
The ideal result I'm looking for is something like:
mon/year : number of second orders
12/2013 : 14
01/2014 : 2
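Try this code, which ranks each user's orders by date and counts the rows ranked second per month:
select to_char(date_trunc('month', date), 'MM/YYYY') as mon_year, count(*) as second_orders
from (select date, row_number() over (partition by user_id order by date) as rn from "order") o
where rn = 2
group by date_trunc('month', date)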
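One way to pick the second-oldest order per user is to number each user's orders by date with ROW_NUMBER and keep only the rows numbered 2. The sketch below tries this on SQLite (3.25 or later) through Python; the table name orders, the column order_date, and the sample rows are all hypothetical, and strftime stands in for whatever date formatting your database offers.

```python
import sqlite3

# Hypothetical orders: (id, user_id, order_date).
rows = [
    (1, 10, "2013-11-02"),
    (2, 10, "2013-12-05"),  # second order of user 10
    (3, 10, "2014-01-20"),
    (4, 11, "2013-12-01"),
    (5, 11, "2014-01-15"),  # second order of user 11
    (6, 12, "2013-12-30"),  # user 12 has only one order
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, user_id INTEGER, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# Number each user's orders from oldest to newest, keep number 2,
# then count those orders per month.
query = """
SELECT strftime('%m/%Y', order_date) AS mon_year, COUNT(*) AS second_orders
FROM (
    SELECT order_date,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date) AS rn
    FROM orders
)
WHERE rn = 2
GROUP BY strftime('%Y-%m', order_date)
ORDER BY strftime('%Y-%m', order_date)
"""
second_orders = conn.execute(query).fetchall()
print(second_orders)
```

Users with a single order never get a row numbered 2, so they are left out of the counts.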