I need to compare a date column with the max of that date column while making a filter selection.
E.g., when I compare [Date] = {MAX([Date])}, it finds the max/latest date in the entire data set and compares against that. This gives me the correct result when the latest month is included in the filter, but fails if I keep all months except the latest.
Is there a way in which the latest date can be found within the subset of the data (based on the filter selection)?
I am working with Redshift database (live connection).
Take a look at the attached workbook: https://www.dropbox.com/s/5zdkw9n003rxgvl/170524%20stack%20question.twbx?dl=0
{FIXED : MAX([Date])} will reflect only what is in the context filter, so add your date filter to context.
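If it helps to see what this amounts to on the database side (the question mentions a live Redshift connection), here is a rough SQL sketch of finding the latest date within the filtered subset rather than in the whole table. The table and column names are assumed for illustration:
-- Hypothetical table "sales" with a date column "event_date"; the WHERE clause
-- stands in for the filter selection.
SELECT *
FROM (
    SELECT s.*,
           MAX(event_date) OVER () AS max_date_in_subset  -- max over the filtered rows only
    FROM sales s
    WHERE event_date < DATE '2017-05-01'                  -- the filter selection
) t
WHERE event_date = max_date_in_subset;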
I am very new to Azure Data Factory. I have created a simple pipeline using the same source and target table. The pipeline is supposed to take the date column (datatype date, as shown in the schema below) from the source table, apply an expression to it, and load 1 into the last_7_days column (as in the schema) if the date is within the last 7 days, or 0 otherwise.
The schema for both source and target tables look like this:
Now, I am facing a challenge writing the expression in the DerivedColumn component. I have managed to find the date 7 days ago with the expression subDays(currentDate(), 7).
In summary, the idea is to load the last_7_days column in the target table with the value 1 if date >= current date - interval 7 day and date <= current date, as in SQL. I would be very grateful if anyone could help me with any tips or suggestions. If you require further information, please let me know.
Just for more information: the source/target table's date column is static, holding 10 years of dates from 2020 to 2030 in yyyy-mm-dd format. The ETL should run every day and only put the value 1 into the last_7_days column for the last 7 days, looking back from the current date. All other entries must receive the value 0.
You currently use the expression below:
case(date > currentDate(), 0,
     date == currentDate(), 1,
     date >= subDays(currentDate(), 7), 1,
     0)
If I were you, I would also choose the case() function to build the expression.
About your question in the comment: I'm afraid not, there isn't another, more elegant way. To achieve this, a Data Flow expression can be complex; it may be composed of many functions. The case() function is the best one for you.
It's very clear and easy to understand.
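Since the question describes the target logic in SQL terms, a rough SQL analogue of the same rule might look like this (a sketch only: the table name is assumed, and the date column is quoted because date is a reserved word in many dialects):
-- Flag rows whose date falls within the last 7 days (including today); everything else gets 0.
UPDATE target_table
SET last_7_days = CASE
                      WHEN "date" > CURRENT_DATE THEN 0                       -- future dates
                      WHEN "date" >= CURRENT_DATE - INTERVAL '7' DAY THEN 1   -- last 7 days
                      ELSE 0
                  END;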
I am building a dashboard for retention. The data that I am getting for the most recent day seems to have huge spikes, because the denominator does not yet include the entire day's data.
So I just want to show the data up to the previous day, and this should happen automatically.
Please let me know if anyone had dealt with the same problem.
Thanks,
Sai
The best way to do this is through creating a calculated field:
Create a field called Recent Date as follows:
DATETRUNC('day',[Date]) = {FIXED : MAX(DATETRUNC('day',[Date]))}
This creates a Boolean field where the most recent date will be flagged as TRUE and all others as FALSE.
Drag this field into the Filters pane and select FALSE. This will remove the most recent date's data.
Create a calculated field that returns a Boolean value indicating whether the data is from today's date, then filter on False.
DATETRUNC('day', [Your Date Field]) = DATETRUNC('day', TODAY())
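For what it's worth, both exclusions can also be expressed directly in SQL if you would rather filter at the source. A sketch only; the table and column names are assumed:
-- Option 1: drop the most recent date present in the data.
SELECT *
FROM retention_events
WHERE event_date < (SELECT MAX(event_date) FROM retention_events);

-- Option 2: drop today's data, regardless of what the latest loaded date is.
SELECT *
FROM retention_events
WHERE event_date < CURRENT_DATE;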
Let's say I have a date dimension and from my business requirements I know that the most granular I would need to go is to examine the specific day of the month that an event occurred.
The data I am given provides me with the exact time that an event occurred (YYYY-MM-DD HH:MM:SS). I have two options:
Before loading the data into the date dimension, slice the HH:MM:SS from the date.
Create the time attributes in my date dimension and insert the full date time.
The way I see it, I should go with option 1. This would remove redundant data and save some space. However, if I go with option 2, should the business requirements ever change, or should my manager suddenly want to be more granular, I wouldn't need to modify my original design. Which option is more commonly used? Are there other options that I did not consider?
Update - follow up question
I receive new data every month. If I used a pre-built date dimension with all the dates, would I then need to run my script every month to populate the table with that month's new dates, or would I have a continuous process whereby, every day, one row is inserted into the table for that date?
I would agree with you and avoid option 2. A standard date dimension table is at the individual date level. If you did need to analyse by time of day, you could create an additional time of day dimension at the level of a second in a single day, and link to that from your fact table.
Your date dimension should be created by script automatically, rather than from the dates that events occurred. This allows you to analyse across a range of events from other facts, and on dates where no events occur, using a standard, prebuilt dimension.
I would also include the full date/time stamp as a column in the fact table, along with the 'DateKey' to the dimension table. This would allow you some visibility/analysis of the timestamp, you would not lose the data, and would still allow you to analyse by the date dimension.
Update - follow up question
Your pre-built date dimension (the standard way of doing it) would usually contain some dates in the future. There's no reason not to, for example, include another 5 years of dates in the table. But if you'd like it to grow gradually over time, you could have a script that runs once a day, once a month, or once a year to add new dates. It's totally up to you! There are many example scripts for building date dimensions; just google "date dimension script". They exist for the language of your choice, e.g. SQL, C#, Power Query, etc.
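For example, a minimal Postgres-style sketch (assuming a dim_date table with the columns shown; the exact syntax varies by database):
-- Build a basic date dimension covering 2020-2030 in one statement.
INSERT INTO dim_date (date_key, full_date, year, month, day, day_of_week)
SELECT CAST(to_char(d, 'YYYYMMDD') AS integer) AS date_key,     -- surrogate key, e.g. 20250101
       CAST(d AS date)                         AS full_date,
       EXTRACT(year FROM d)                    AS year,
       EXTRACT(month FROM d)                   AS month,
       EXTRACT(day FROM d)                     AS day,
       EXTRACT(isodow FROM d)                  AS day_of_week   -- 1 = Monday ... 7 = Sunday
FROM generate_series(DATE '2020-01-01', DATE '2030-12-31', INTERVAL '1 day') AS g(d);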
Tableau is reading my dates wrong. I have two columns: Date and a per-day number.
The date format is "yyyymmdd", e.g. 20160617, and the per-day number is an integer. I am fetching this data directly from SQL Server, and my problem is that Tableau is reading my dates wrong.
So I tried DATEPARSE() to convert my date.
My DATEPARSE function is DATEPARSE("yyyymmdd", "Date"), and after using the DATEPARSE function, I get NULL for my dates.
Can anyone please help me understand why I get NULL for the dates? My query returns 30 days of data, divided into per-day counts.
Sample after running the query in SQL:
Date Per day number
20160617 215674
Tableau does not accept this date format, so I applied DATEPARSE(), which I guess is returning a string, since my date is NULL. I would ideally like to get the correct date so I can apply a trend line to my data.
Thanks in advance.
Cheers!
You aren't using DATEPARSE() correctly. The second parameter, which you have as the literal string "Date", should be the field you want parsed. Note also that the format string is case-sensitive: capital MM means month, while lowercase mm means minutes; and because DATEPARSE expects a string, an integer field needs to be wrapped in STR(). So, for example, if you store 20160617 in a field called my_date_as_integer, your function should be DATEPARSE("yyyyMMdd", STR([my_date_as_integer]))
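Alternatively, since the data comes from SQL Server, you could convert at the source so Tableau receives a real date column. A sketch, with the table and column names assumed:
-- Style 112 is the ISO yyyymmdd format, so CONVERT parses 20160617 correctly.
SELECT CONVERT(date, CAST([Date] AS char(8)), 112) AS event_date,
       [Per day number]
FROM daily_counts;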
I am new to Cognos, and I am trying to add a filter to a column that only allows rows between yesterday at 4 AM and today at 3 AM. I have a working query in DB2, but when I try to add it to the filter in Cognos I get a parsing error. Also, I found in the properties that the data type of the column I am trying to filter on is Unknown (Unsupported). I started off by creating two data item expressions, one for each end of the time frame I am trying to limit the data by, but I got a parsing error on the first one:
[Presentation Layer].[Cr dtime]=timestamp(current date) - 1 day + 4 hour
This works in my local DB2 test database but doesn't even compile in Cognos. I also tried casting the column to a timestamp, but that isn't working either. Any help is appreciated. I also tried using the _add_days function, but I still get a parsing error. Also, sampling the column, I get values that appear to be timestamps, as in this string: 2016-01-02T11:11:45.000000000
Eventually if I get the two filters working I expect the original filter to be close to this syntax:
[Presentation Layer].[Cr dtime] is between [Yesterday 4AM] AND [Today 3AM]
Here is your filter:
[Presentation Layer].[Cr dtime] between
cast(_add_hours(_add_days(current_date,-1),4),timestamp)
and
cast(_add_hours(current_date,3),timestamp)
This works because current_date in Cognos does not have a time component. If you were to cast it directly to a timestamp type, you would see the time part of the date as 12:00:00.000 AM, i.e. midnight. Knowing this, we can simply add however much time after midnight we want, cast to a timestamp type, and use that in the filter.
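For comparison, the same window expressed directly in DB2 SQL would be roughly the following (a sketch; the real table and column names are assumed):
-- Yesterday 04:00:00 up to today 03:00:00, both built from CURRENT DATE (midnight).
SELECT *
FROM my_table
WHERE cr_dtime BETWEEN TIMESTAMP(CURRENT DATE - 1 DAY, '04:00:00')
                   AND TIMESTAMP(CURRENT DATE, '03:00:00');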