Talend Studio - Get the number of non-working days between start_date and end_date using tMap

I am a beginner in Talend and I need to achieve the following:
For example, I have two input tables in the TMAP component.
Table 1:
Start_Date|End_Date
25/8/2022 |1/9/2022
Table 2 (Lookup Table):
Non_working_days|Remark
27/8/2022       |Weekend
28/8/2022       |Weekend
31/8/2022       |Weekend
I want my output to count the non-working days from the lookup table that fall between Start_Date and End_Date.
For example:
Start_Date|End_Date|No_of_non_working_days
25/8/2022 |1/9/2022|3
Can this be achieved with the expression editor in the tMap component, or will I need to create a routine?
Thanks.

This is doable with a subjob. It is a bit complex, but an interesting one.
Main idea: generate all dates between startDate and endDate, compare each of these dates to the content of table 2, then count the matching dates.
tFixedFlow1 (table 1): place your input table 1 here.
tFlowToIterate: creates global variables for startDate and endDate, which are needed in the next steps.
tLoop: generates all dates between startDate and endDate.
tIterateToFlow: once all dates between startDate and endDate have been generated, regroups the iterations into a single flow.
tLogRow: lets you check the content.
tMap + table 2: joins the input flow with the lookup from your table 2. Make it an inner join.
tAggregate: counts the number of rows in the output.
tLogRow: prints the result.
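If a routine turns out to be easier to maintain than the subjob, the same idea (walk every date in the range, keep the ones found in the lookup, count them) can be sketched in plain Java. This is only a sketch: the class and method names are made up for illustration, the d/M/yyyy pattern is assumed from the sample values, and both ends of the range are assumed to be inclusive.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Set;

class NonWorkingDayCounter {
    // Assumed pattern, based on sample values such as 25/8/2022 and 1/9/2022.
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("d/M/yyyy");

    // Counts the lookup dates that fall in [start, end], both ends inclusive.
    static long count(LocalDate start, LocalDate end, Set<LocalDate> nonWorkingDays) {
        return start.datesUntil(end.plusDays(1))      // every date in the range (Java 9+)
                    .filter(nonWorkingDays::contains) // keep dates present in the lookup
                    .count();
    }

    public static void main(String[] args) {
        Set<LocalDate> lookup = Set.of(
                LocalDate.parse("27/8/2022", FMT),
                LocalDate.parse("28/8/2022", FMT),
                LocalDate.parse("31/8/2022", FMT));
        long n = count(LocalDate.parse("25/8/2022", FMT),
                       LocalDate.parse("1/9/2022", FMT),
                       lookup);
        System.out.println(n); // prints 3
    }
}
```

With the sample tables from the question, all three lookup dates fall inside the range, so the count is 3.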

Related

Is there a way to pull just the Year out a VARCHAR datetime value?

I am working on a project in Snowflake that requires me to combine pest and weather data tables, but the two sets of tables do not share a common column. My solution has been to create a view that extracts the year from the Pest table dates, for example
CREATION_DATE: 03/26/2020 09:11:15 PM,
to match the YEAR column in the Weather tables, for example
DATEYEAR: 2021.
However, I have found that the dates in the pest report are VARCHAR rather than true date/datetime values. Is there a way to pull just the year out of the VARCHAR date value? Additional information: I cannot change the tables themselves; I need to create a view that preserves all other columns and adds a new "DATEYEAR" column.
Yes, it can be done. Below is a working example:
create table test (dt string);
insert into test(dt) values ('01/04/2022');
select dt, DATE_PART(year, dt::date) from test;
To make it easy, you can split the string into an array and take the third member of the array (using 2 since arrays are 0 based):
select strtok_to_array('03/26/2020', '/')[2]::int as MY_YEAR;
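To see the two approaches side by side outside the database, here is a small Java sketch (not Snowflake code) that runs both on the full CREATION_DATE sample from the question. The class and method names are invented for illustration, and the MM/dd/yyyy hh:mm:ss a pattern is assumed from that single sample. One thing it makes visible: the sample value carries a time after the year, so splitting on "/" alone would leave the time attached to the third token. The sketch therefore splits on both "/" and space.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class YearFromText {
    // Assumed pattern, taken from the sample value 03/26/2020 09:11:15 PM.
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MM/dd/yyyy hh:mm:ss a", Locale.ENGLISH);

    // Approach 1: parse the whole text as a timestamp, then read the year.
    static int byParsing(String text) {
        return LocalDateTime.parse(text, FMT).getYear();
    }

    // Approach 2: split on '/' or space and take the third token (index 2).
    static int bySplitting(String text) {
        return Integer.parseInt(text.split("[/ ]")[2]);
    }

    public static void main(String[] args) {
        String sample = "03/26/2020 09:11:15 PM";
        System.out.println(byParsing(sample));   // prints 2020
        System.out.println(bySplitting(sample)); // prints 2020
    }
}
```

Parsing rejects malformed values, while splitting is more forgiving but will happily return a wrong token if the layout changes.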

Calculated Field to Count While Between Dates

I am creating a Tableau visualization for floor stock in our plant. We have a column for incoming date, quantity, and outgoing date. I am trying to create a visualization that sums the quantity but only while between the 2 columns.
So for example, if we have 9 parts in stock that arrived on 9/1 and are scheduled to ship out on 9/14, I would like the visualization to include these 9 parts in the sum only while they are in our stock between those 2 dates. Here is an example of some of the data I am working with (incoming date, quantity, outgoing date).
4/20/2018|006|5/30/2018
4/20/2018|017|5/30/2018
4/20/2018|008|5/30/2018
6/29/2018|161|9/7/2018
Create a new calculation:
if [ArrivalDate]>="2018-09-01" and [ArrivalDate]<"2018-09-15"
and [Shipdate]<"2018-09-15"
then [MEASUREofStock] else 0 end
Here is a solution using unions, written before Tableau added support for Union (so it required custom SQL):
"Volume of an Incident Queue at a Point in Time"
Tableau has supported Union directly for several years now, so the same effect is possible without writing custom SQL, but the concept is the same.
The main thing to understand is that you need a data row per event (per arrival or per departure) and a single date column, not two. That will let you calculate the net change in quantity per day, and you can then use a running total if you want to see the absolute quantity at the close of each day.
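The one-row-per-event idea can be sketched in Java as follows, using the four sample rows from the question. This is an illustration only, not Tableau code: the class, field, and method names are invented, and a departure is assumed to leave stock by the close of its outgoing date.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

class StockRunningTotal {
    // Sample rows from the question: incoming date, quantity, outgoing date.
    static final LocalDate[] INCOMING = {
        LocalDate.of(2018, 4, 20), LocalDate.of(2018, 4, 20),
        LocalDate.of(2018, 4, 20), LocalDate.of(2018, 6, 29)};
    static final int[] QUANTITY = {6, 17, 8, 161};
    static final LocalDate[] OUTGOING = {
        LocalDate.of(2018, 5, 30), LocalDate.of(2018, 5, 30),
        LocalDate.of(2018, 5, 30), LocalDate.of(2018, 9, 7)};

    // Returns the stock level at the close of each day that has an event.
    static TreeMap<LocalDate, Integer> closingStock(LocalDate[] in, int[] qty, LocalDate[] out) {
        // Step 1: one event per arrival (+qty) and per departure (-qty), netted per day.
        TreeMap<LocalDate, Integer> netChange = new TreeMap<>();
        for (int i = 0; i < qty.length; i++) {
            netChange.merge(in[i], qty[i], Integer::sum);
            netChange.merge(out[i], -qty[i], Integer::sum);
        }
        // Step 2: running total over the days in date order.
        TreeMap<LocalDate, Integer> closing = new TreeMap<>();
        int running = 0;
        for (Map.Entry<LocalDate, Integer> e : netChange.entrySet()) {
            running += e.getValue();
            closing.put(e.getKey(), running);
        }
        return closing;
    }

    public static void main(String[] args) {
        closingStock(INCOMING, QUANTITY, OUTGOING)
                .forEach((day, stock) -> System.out.println(day + " " + stock));
    }
}
```

For the sample rows, the closing stock is 31 on 2018-04-20, 0 on 2018-05-30, 161 on 2018-06-29, and 0 on 2018-09-07.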
There is no simple way to display the total quantity between the two dates without changing the input table structure. If you want to show all dates and the "eligible" quantity on each day, you should:
Create a calendar table that has all dates from 1990-01-01 to 2029-12-31. (You can limit the dates displayed in the dashboard later by applying a date filter, but here you want to be safe and include all dates that may exist in your stock table.)
Left join the date table to the stock table and calculate the eligible quantity on each day.
SELECT
a.date,
SUM(CASE WHEN b.quantity IS NULL THEN 0 ELSE b.quantity END) AS quantity
FROM date a
LEFT JOIN
stock b on a.date BETWEEN b.Incoming_Date AND b.Outgoing_Date
GROUP BY a.date
Import the output table to Tableau, and simply add dates and quantity to the chart.
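To check what the join should return for a given day, the same inclusive BETWEEN test can be sketched in Java with the sample rows from the question. The class, field, and method names are hypothetical; the logic mirrors the join condition `a.date BETWEEN b.Incoming_Date AND b.Outgoing_Date`.

```java
import java.time.LocalDate;

class DailyStock {
    // Sample rows from the question: incoming date, quantity, outgoing date.
    static final LocalDate[] INCOMING = {
        LocalDate.of(2018, 4, 20), LocalDate.of(2018, 4, 20),
        LocalDate.of(2018, 4, 20), LocalDate.of(2018, 6, 29)};
    static final int[] QUANTITY = {6, 17, 8, 161};
    static final LocalDate[] OUTGOING = {
        LocalDate.of(2018, 5, 30), LocalDate.of(2018, 5, 30),
        LocalDate.of(2018, 5, 30), LocalDate.of(2018, 9, 7)};

    // Sums the quantity of every row whose [incoming, outgoing] range contains the day.
    static int quantityOn(LocalDate day, LocalDate[] in, int[] qty, LocalDate[] out) {
        int total = 0;
        for (int i = 0; i < qty.length; i++) {
            boolean inStock = !day.isBefore(in[i]) && !day.isAfter(out[i]); // inclusive ends
            if (inStock) {
                total += qty[i];
            }
        }
        return total; // 0 when nothing is in stock, like the CASE ... ELSE 0 branch
    }

    public static void main(String[] args) {
        System.out.println(quantityOn(LocalDate.of(2018, 5, 30), INCOMING, QUANTITY, OUTGOING)); // prints 31
        System.out.println(quantityOn(LocalDate.of(2018, 6, 1), INCOMING, QUANTITY, OUTGOING));  // prints 0
        System.out.println(quantityOn(LocalDate.of(2018, 7, 1), INCOMING, QUANTITY, OUTGOING));  // prints 161
    }
}
```

Because BETWEEN includes both ends, a part still counts on its outgoing date; adjust the comparison if it should drop out that day.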

Managing dates in SPSS - Time difference in months

I'm a novice SPSS user working on a data set with two columns, customer ID and order date. I want to create a third variable holding the number of inactive months (as an integer) since the same customer's previous order date.
This will create some sample data to demonstrate on:
data list list/ID (f3) OrderDate (adate10).
begin data
1 09/18/2016
1 03/02/2017
1 05/12/2017
2 06/06/2016
2 09/09/2017
end data.
Now you can run the following syntax to create a variable that contains the number of complete months between the date in the present row and the date in the previous row:
sort cases by ID OrderDate.
if ID=lag(ID) MonthSince=DATEDIFF(OrderDate, lag(OrderDate), "months").
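As a cross-check on the numbers to expect, the "complete months between this row and the previous row of the same ID" logic can be sketched in Java on the same sample data. This is not SPSS syntax; the class and method names are invented, and `ChronoUnit.MONTHS.between` is used here as a stand-in that also counts only whole months.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class MonthsSinceLastOrder {
    // Whole months elapsed from the previous order date to the current one.
    static long completeMonths(LocalDate previous, LocalDate current) {
        return ChronoUnit.MONTHS.between(previous, current);
    }

    public static void main(String[] args) {
        // ID 1: 09/18/2016 -> 03/02/2017 -> 05/12/2017
        System.out.println(completeMonths(LocalDate.of(2016, 9, 18), LocalDate.of(2017, 3, 2)));  // prints 5
        System.out.println(completeMonths(LocalDate.of(2017, 3, 2), LocalDate.of(2017, 5, 12)));  // prints 2
        // ID 2: 06/06/2016 -> 09/09/2017
        System.out.println(completeMonths(LocalDate.of(2016, 6, 6), LocalDate.of(2017, 9, 9)));   // prints 15
    }
}
```

The first row of each ID has no previous order, which is why the SPSS syntax only computes a value when ID equals lag(ID).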

How to get the time difference in talend?

How can I get the time difference between a value and the previous value in the same column? For example, given:
2017-01-01 13:00:00
2017-01-01 13:15:00
I need the result to be a difference of 15 minutes. How can I do this?
First, use TalendDate.diffDate(column1,column2,"pattern") to get the time difference; the pattern selects the unit (for example "mm" for minutes).
Then, to compare the current value with the previous one (in the same column), set a sequence on your flow to identify which row is the previous one. Read the flow twice and inner join the current sequence with the current sequence - 1 to get the current date and the previous date.
First subjob:
YourFlow -> tMap -> tHashOutput
In tMap, add a new "sequence" column to your flow and use Numeric.sequence("s1",1,1).
This way all rows will have an ID.
Then read your hash twice, and join the flows on "sequence - 1":
tHashInput_1----|
                |--tMap--->Output
tHashInput_2----|
Put the TalendDate.diffDate() call in the output, using the two date fields.
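Outside of Talend, the "compare each row with the previous one" logic boils down to the loop below. It is a plain Java sketch rather than a Talend job: the class and method names are invented, the yyyy-MM-dd HH:mm:ss pattern is assumed from the sample values, and `java.time.Duration` stands in for TalendDate.diffDate.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

class PreviousRowDiff {
    // Assumed pattern, based on sample values such as 2017-01-01 13:00:00.
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // For every row after the first, returns the minutes elapsed since the previous row.
    static List<Long> minutesSincePrevious(List<String> timestamps) {
        List<Long> diffs = new ArrayList<>();
        LocalDateTime previous = null;
        for (String text : timestamps) {
            LocalDateTime current = LocalDateTime.parse(text, FMT);
            if (previous != null) {
                diffs.add(Duration.between(previous, current).toMinutes());
            }
            previous = current; // the current row becomes "sequence - 1" for the next one
        }
        return diffs;
    }

    public static void main(String[] args) {
        System.out.println(minutesSincePrevious(
                List.of("2017-01-01 13:00:00", "2017-01-01 13:15:00"))); // prints [15]
    }
}
```

The first row has no predecessor, so the result has one entry fewer than the input, just as the inner join on "sequence - 1" drops the first row.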
Here is an alternative, which measures the job's own execution time:
Start by recording the job's start time (here in a tJava, but you can also use a tSetGlobalVar component):
globalMap.put("startDate", TalendDate.getDate("CCYY/MM/DD hh:mm:ss"));
The following code is used later in the job, inside another tJava:
// Assumption: this parser pattern mirrors the TalendDate pattern above, with hh read as a 24-hour value.
java.text.SimpleDateFormat format = new java.text.SimpleDateFormat("yyyy/MM/dd HH:mm:ss");
String endDate = TalendDate.getDate("CCYY/MM/DD hh:mm:ss");
long executionTime = format.parse(endDate).getTime() - format.parse(((String)globalMap.get("startDate"))).getTime();
System.out.println("Execution Time : "+(executionTime/(60*60*1000))+" Hour(s) "+(executionTime/(60*1000)%60)+" Minute(s) "+(executionTime/1000%60)+" second(s).");

SQL query to find date between a range of dates

Table X has start_date and end_date columns, plus related information associated with these dates. Both are stored in the DB in date format.
I need to take a specific date, say 12 January 2000, and extract the rows whose date range includes it.
Could someone please help me out? I am new to SQL, so I need some guidance here.
Table X:
ID |start_date|end_date
1 |12/30/1999|01/12/2000
2 |01/20/2000|01/30/2000
3 |01/07/2000|01/15/2000
Thus my query should give back the ID-3 since 12th January falls in the range 01/07/2000-01/15/2000
Thanks
Use the BETWEEN operator:
SELECT *
FROM TableX
WHERE DATE'2000-01-12' BETWEEN start_date AND end_date;
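BETWEEN includes both end points, which matters for the sample rows. The Java sketch below (not SQL; the class, field, and method names are made up for illustration) applies the same inclusive test to the three rows from the question. It returns ID 1 as well as ID 3, because ID 1 ends exactly on 01/12/2000.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

class RangeLookup {
    // Sample rows from the question: ID, start_date, end_date.
    static final int[] ID = {1, 2, 3};
    static final LocalDate[] START = {
        LocalDate.of(1999, 12, 30), LocalDate.of(2000, 1, 20), LocalDate.of(2000, 1, 7)};
    static final LocalDate[] END = {
        LocalDate.of(2000, 1, 12), LocalDate.of(2000, 1, 30), LocalDate.of(2000, 1, 15)};

    // Returns the IDs whose [start_date, end_date] range contains the target, ends included.
    static List<Integer> idsContaining(LocalDate target) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < ID.length; i++) {
            if (!target.isBefore(START[i]) && !target.isAfter(END[i])) {
                ids.add(ID[i]);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(idsContaining(LocalDate.of(2000, 1, 12))); // prints [1, 3]
    }
}
```

If a row ending on the target date should not match, replace BETWEEN with explicit comparisons such as start_date <= d AND d < end_date.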