Finding the last seven days in a time series - date

I have a spreadsheet with column A which holds a timestamp and updates daily. Column B holds a value. Like the following:
+--------------------+---------+
| 11/24/2012 1:14:21 | $487.20 |
| 11/25/2012 1:14:03 | $487.20 |
| 11/26/2012 1:14:14 | $487.20 |
| 11/27/2012 1:14:05 | $487.20 |
| 11/28/2012 1:13:56 | $487.20 |
| 11/29/2012 1:13:57 | $487.20 |
| 11/30/2012 1:13:53 | $487.20 |
| 12/1/2012 1:13:54 | $492.60 |
+--------------------+---------+
What I am trying to do is get the average of the last 7, 14, 30 days.
I been playing with GoogleClock() function in order to filter the dates in column A but I can't seem to find the way to subtract TODAY - 7 days. I suspect FILTER will also help, but I am a little bit lost.

There are a few ways to go about this; one way is to return an array of values with a QUERY function (this assumes a header row in row 1, and you want the last 7 dates):
=QUERY(A2:B;"select B order by A desc limit 7";0)
and you can wrap this in whatever aggregation function you like:
=AVERAGE(QUERY(A2:B;"select B order by A desc limit 7";0))

Related

How to get non-aggregated measures?

I calculate my metrics with SQL and publish the resulting table to Tableau Server. Afterward, use this data source to create charts and dashboards.
For one analysis, I already calculated the measures per day with SQL. When I use the resulting table in Tableau, it aggregates these measures to SUM by default. However, I don't want to have SUM or AVG of the average or SUM of the Percentiles.
What I want is the result when I don't select date dimension and not GROUP BY date in SQL as attached below.
Here is the query:
SELECT
-- date,
COUNT(DISTINCT id) AS count_of_id,
AVG(timediff_in_sec) AS avg_timediff,
PERCENTILE_CONT(0.25) WITHIN GROUP(ORDER BY timediff_in_sec) AS percentile_25,
PERCENTILE_CONT(0.50) WITHIN GROUP(ORDER BY timediff_in_sec) AS percentile_50
FROM
(
--subquery
) AS t1
-- GROUP BY date
Here are the first 10 rows of the resulting table:
+------------+--------------+-------------+---------------+---------------+
| date | avg_timediff | count_of_id | percentile_25 | percentile_50 |
+------------+--------------+-------------+---------------+---------------+
| 10/06/2020 | 61,65186364 | 22 | 8,5765 | 13,3015 |
| 11/06/2020 | 127,2913333 | 3 | 15,6045 | 17,494 |
| 12/06/2020 | 306,0348214 | 28 | 12,2565 | 17,629 |
| 13/06/2020 | 13,2664 | 5 | 11,944 | 13,862 |
| 14/06/2020 | 16,728 | 7 | 14,021 | 17,187 |
| 15/06/2020 | 398,6424595 | 37 | 11,893 | 19,271 |
| 16/06/2020 | 293,6925152 | 33 | 12,527 | 17,134 |
| 17/06/2020 | 155,6554286 | 21 | 13,452 | 16,715 |
| 18/06/2020 | 383,8101429 | 7 | 266,048 | 493,722 |
+------------+--------------+-------------+---------------+---------------+
How can I achieve the desired output above?
Drag them all into the dimensions list, then they will be static dimensions. For your use you could also just drag the Date field to Rows. Aggregating 1 value, which you have for each date, returns the same value whatever the aggregation type.

Lead on datastage for timestamp?

I need to use the Lead function on Datastage with the column timestamp, but when I need to put the next column substriting-10 minutes, if someone did it before I tried, and I had also, that can't put strings on the timestamp.
How can I solve this problem?
Input
Id | date. |
1 | 01/04/2016 13:45:25|
2 | 10/04/2016 01:25:36|
3 | 26/10/2017 22:35:13|
Output
Id| date. | Befor. |
1 | 01/04/2016 13:45:25 | 10/04/2016 01:15:36|
2 | 10/04/2016 01:25:36 | 26/10/2017 22:35:13|
3 | 26/10/2017 22:35:13 |null |

Tableau - Calculated field for difference between date and maximum date in table

I have the following table that I have loaded in Tableau (It has only one column CreatedOnDate)
+-----------------+
| CreatedOnDate |
+-----------------+
| 1/1/2016 |
| 1/2/2016 |
| 1/3/2016 |
| 1/4/2016 |
| 1/5/2016 |
| 1/6/2016 |
| 1/7/2016 |
| 1/8/2016 |
| 1/9/2016 |
| 1/10/2016 |
| 1/11/2016 |
| 1/12/2016 |
| 1/13/2016 |
| 1/14/2016 |
+-----------------+
I want to be able to find the maximum date in the table, compare it with every date in the table and get the difference in days. For the above table, the maximum date in table is 1/14/2016. Every date is compared to 1/14/2016 to find the difference.
Expected Output
+-----------------+------------+
| CreatedOnDate | Difference |
+-----------------+------------+
| 1/1/2016 | 13 |
| 1/2/2016 | 12 |
| 1/3/2016 | 11 |
| 1/4/2016 | 10 |
| 1/5/2016 | 9 |
| 1/6/2016 | 8 |
| 1/7/2016 | 7 |
| 1/8/2016 | 6 |
| 1/9/2016 | 5 |
| 1/10/2016 | 4 |
| 1/11/2016 | 3 |
| 1/12/2016 | 2 |
| 1/13/2016 | 1 |
| 1/14/2016 | 0 |
+-----------------+------------+
My goal is to create this Difference calculated field. I am struggling to find a way to do this using DATEDIFF.
And help would be appreciated!!
woodhead92, this approach would work, but means you have to use table calculations. Much more flexible approach (available since v8) is Level of Details expressions:
First, define a MAX date for the whole dataset with this calculated field called MaxDate LOD:
{FIXED : MAX(CreatedOnDate) }
This will always calculate the maximum date on table (will overwrite filters as well, if you need to reflect them, make sure you add them to context.
Then you can use pretty much the same calculated field, but no need for ATTR or Table Calculations:
DATEDIFF('day', [CreatedOnDate], [MaxDate LOD])
Hope this helps!

Tableau: DATEDIFF( 'days', MIN([Start Date]), [End Date])

Cheers!
I'm trying to get a chart working that shows me the count of work orders that are completed each day after work on a unit (serial number) starts. I'd like to be able to "shadow" multiple serial numbers on top of each other, normalized to a start date of '0'.
Currently I have columns in my data set:
Work order number (0..999), repeats for each serial number
Serial number (0..999)
Work order start date (Datetime)
Work order end date (Datetime)
Say for instance that a new serial number starts each day, contains 5 work orders, and requires 5 days to complete (there are 5 units in WIP at any given time).
The data might look like (dates shown as ints):
| Work order number | Serial number | Work order start date | Work order end date |
| ----------------- | ------------- | --------------------- | ------------------- |
| 1 | 1 | 1 | 2 |
| 2 | 1 | 1 | 3 |
| 3 | 1 | 2 | 4 |
| 4 | 1 | 3 | 5 |
| 5 | 1 | 4 | 5 |
| 1 | 2 | 2 | 3 |
| 2 | 2 | 2 | 4 |
| 3 | 2 | 3 | 5 |
| 4 | 2 | 4 | 6 |
| 5 | 2 | 5 | 6 |
I'm assuming I'll need a calculated column that would perhaps go something like:
[Work order end days since start] =
[Work order end date] - MIN(
IF(*serial number matches current*, [Work order start date], NULL)
)
I (clearly) have no idea how to actually create such a calculated field in Tableau.
The values in the column (same order as the data above) should be:
| Work order end days since start |
| ------------------------------- |
| 1 |
| 2 |
| 3 |
| 4 |
| 4 |
| 1 |
| 2 |
| 3 |
| 4 |
| 4 |
Any guidance or help? Happy to clarify anything as well. Many thanks! Cheers!
You will have better results with this kind of data if you reshape it to have a single date column and add a type column indicating whether the current row describes the start or completion of a workorder.
| Work order number | Serial number | date | type |
Think of each row representing a state change, not a work order.
Open work orders on a particular date would be those that have a start record prior to that date, but don't have a completion record prior to that date. If you define a calculated field as +1 if type = New and -1 if type = Completion, then you can use a running total of that field to view the number of open work orders over time.

Join column with timestamps where value is maximum

I have a table that looks like
+-------+-----------+
| value | timestamp |
+-------+-----------+
and I'm trying to build a query that gives a result like
+-------+-----------+------------+------------------------+
| value | timestamp | MAX(value) | timestamp of max value |
+-------+-----------+------------+------------------------+
so that the result looks like
+---+----------+---+----------+
| 1 | 1.2.1001 | 3 | 1.1.1000 |
| 2 | 5.5.1021 | 3 | 1.1.1000 |
| 3 | 1.1.1000 | 3 | 1.1.1000 |
+---+----------+---+----------+
but I got stuck on joining the column with the corresponding timestamps.
Any hints or suggestions?
Thanks in advance!
For further information (if that helps):
In the real project the max-values are grouped by month and day (with group by clause, which works btw), but somehow I got stuck on joining the timestamps for max-values.
EDIT
Cross joins are a good idea, but I want to have them grouped by month e.g.:
+---+----------+---+----------+
| 1 | 1.1.1101 | 6 | 1.1.1300 |
| 2 | 2.6.1021 | 5 | 5.6.1000 |
| 3 | 1.1.1200 | 6 | 1.1.1300 |
| 4 | 1.1.1040 | 6 | 1.1.1300 |
| 5 | 5.6.1000 | 5 | 5.6.1000 |
| 6 | 1.1.1300 | 6 | 1.1.1300 |
+---+----------+---+----------+
EDIT 2
I've added a fiddle for some sample data and and example of the current query.
http://sqlfiddle.com/#!1/efa42/1
How to add the corresponding timestamp to the maximum?
Try a cross join with two sub queries, the first one selects all records, the second one gets one row that represents the time_stamp of the max value, <3;"1000-01-01"> for example.
SELECT col_value,col_timestamp,max_col_value, col_timestamp_of_max_value FROM table1
cross join
(
select max(col_value) max_col_value ,col_timestamp col_timestamp_of_max_value from table1
group by col_timestamp
order by max_col_value desc
limit 1
) A --One row that represents the time_stamp of the max value, ie: <3;"1000-01-01">
Use the window cause you use with pg
Select *, max( value ) over (), max( timestamp ) over() from table
That gives you the max values from all values in every row
http://www.postgresql.org/docs/9.1/static/tutorial-window.html