I have a peculiar problem which I need some help with. I have a reading taken every day of the year for the last 10 years. Column A contains dates from 01 Jan to 31 Dec, column B contains the values taken every day in 2012, column C contains the values taken every day in 2013, and so on.
What I am looking for is to copy data from Col A and Col B and paste it onto a new worksheet only where the reading was 0 on consecutive days. For example:
Col A: Date | Col B: 2012
01 Jan | 0
02 Jan | 1
03 Jan | 0
04 Jan | 0
05 Jan | 3
06 Jan | 0
07 Jan | 0
08 Jan | 0
09 Jan | 2
10 Jan | 0
From the above, I need to pick 03 Jan, 04 Jan, 06 Jan, 07 Jan, and 08 Jan only, since these are the dates where 0 occurs consecutively.
Many thanks in advance for help!
I understand that isolating dates where 0 occurs is very simple. It's the "consecutive" condition that has proven tricky for me.
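To make the "consecutive" rule concrete, here is a minimal sketch in Python (not Excel or VBA), assuming the dates and readings have already been read into two plain lists. The function name `consecutive_zero_rows` is made up for illustration. A row is picked when its value is 0 and the row directly before or after it is also 0.

```python
def consecutive_zero_rows(dates, values):
    """Return (date, value) pairs where a 0 sits next to another 0."""
    picked = []
    for i, value in enumerate(values):
        if value != 0:
            continue
        # Guard the neighbour checks so the first and last rows do not wrap around
        prev_zero = i > 0 and values[i - 1] == 0
        next_zero = i + 1 < len(values) and values[i + 1] == 0
        if prev_zero or next_zero:
            picked.append((dates[i], value))
    return picked


# Sample data copied from the question (column A and column B)
dates = ["01 Jan", "02 Jan", "03 Jan", "04 Jan", "05 Jan",
         "06 Jan", "07 Jan", "08 Jan", "09 Jan", "10 Jan"]
values = [0, 1, 0, 0, 3, 0, 0, 0, 2, 0]

picked_dates = [d for d, _ in consecutive_zero_rows(dates, values)]
print(picked_dates)
```

With the sample data this picks 03 Jan, 04 Jan, 06 Jan, 07 Jan and 08 Jan, and skips the lone zeros on 01 Jan and 10 Jan.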
I'm trying to set up a year-over-year (YoY) comparison by quarter, i.e. (Q1 2016 Rev / Q1 2015 Rev) - 1.
My data is in quarters, so I'm trying to set up a calculated field with (Rev at Current Quarter / Rev at (Current Quarter - 4)) - 1.
But I'm not sure how to set up that dependency in Tableau.
Thanks for reading
EDIT:
Example of Data
quarter_id | quarter_revenue
10 | 200
11 | 430
12 | 250
13 | 300
14 | 405
15 | 493
16 | 299
So quarter_id 10 corresponds to 2015 Q1, then 11 is 2015 Q2, etc. Currently I can set this into Tableau and use Quick Table Calculation: Percent Difference on Quarter_Revenue which gets me the difference for id 11 and 10 (2015 Q2 and 2015 Q1).
What I want instead is to compare quarters that are a year (4 quarters) apart. So to compare 2015 Q1 with 2016 Q1, I would need to look at id 14 and id 10, and the Percent Difference would be (405/200) - 1.
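The arithmetic being asked for can be sketched outside Tableau. The Python below is only an illustration of the "value 4 quarters back" lookup, using the sample data from the question. The function name `yoy_change` and the `lag` parameter are made up for this example. In Tableau itself, a table calculation built on LOOKUP with an offset of -4 is the usual way to reach four rows back, but that setup is not shown here.

```python
def yoy_change(revenue_by_quarter, lag=4):
    """Percent difference between each quarter and the one `lag` quarters earlier."""
    changes = {}
    for quarter_id, revenue in revenue_by_quarter.items():
        prior = revenue_by_quarter.get(quarter_id - lag)
        # Skip quarters with no comparison quarter (or a zero base)
        if prior:
            changes[quarter_id] = revenue / prior - 1
    return changes


# quarter_id -> quarter_revenue, copied from the question
revenue = {10: 200, 11: 430, 12: 250, 13: 300, 14: 405, 15: 493, 16: 299}

yoy = yoy_change(revenue)
for quarter_id in sorted(yoy):
    print(quarter_id, round(yoy[quarter_id], 4))
```

Only ids 14, 15 and 16 get a value, because ids 10 to 13 have no quarter four positions earlier in the sample.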
I have data like this:
Date Temperature Load
2017-08-05 7:45:00 PM 10 5
2017-08-05 7:46:00 PM 10 4
2017-08-05 7:47:00 PM 10 2
2017-08-05 7:48:00 PM 10 1
2017-08-05 7:49:00 PM 10 5
.........
In Tableau, how can I sum the Load by hour and draw a scatter plot?
I would drag SUM(Load) onto Rows, AVG(Temperature) onto Columns, and the discrete hour of Date onto Detail. That gives you a scatter plot with one mark per hour.
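For reference, the aggregation behind that view (sum of Load and average Temperature per hour) can be sketched in Python. This is an illustration of the grouping only, not something Tableau needs. The function name `aggregate_by_hour` is made up, and the rows are the five sample rows from the question.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_by_hour(rows):
    """Map hour of day -> (average temperature, total load)."""
    loads = defaultdict(int)
    temperatures = defaultdict(list)
    for stamp, temperature, load in rows:
        # %I and %p parse the 12-hour clock with AM/PM used in the sample
        hour = datetime.strptime(stamp, "%Y-%m-%d %I:%M:%S %p").hour
        loads[hour] += load
        temperatures[hour].append(temperature)
    return {
        hour: (sum(temperatures[hour]) / len(temperatures[hour]), loads[hour])
        for hour in loads
    }


rows = [
    ("2017-08-05 7:45:00 PM", 10, 5),
    ("2017-08-05 7:46:00 PM", 10, 4),
    ("2017-08-05 7:47:00 PM", 10, 2),
    ("2017-08-05 7:48:00 PM", 10, 1),
    ("2017-08-05 7:49:00 PM", 10, 5),
]

points = aggregate_by_hour(rows)
print(points)
```

Each entry is one point of the scatter: average temperature on one axis, summed load on the other. All five sample rows fall in hour 19, so the sample produces a single point.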
Here is what my structure looks like:
SET OF
SEQUENCE:
INTEGER: XX
INTEGER: YY
My encoding looks like this:
11 08 10 06 02 01 XX 02 01 YY
11 08 -- SET OF
10 06 -- SEQUENCE
However, when I decode with openssl, I don't see the expected output. It looks like
0:d=0 hl=2 l= 8 prim: SET
0000 - 10 06 02 01 XX 02 01 YY-
This is not what I expected to see (compare it with the structure above).
I am not sure what I am missing. Any help would be much appreciated.
SET and SEQUENCE are constructed types. That means the bit that marks a constructed type in the tag byte (0x20) needs to be set. That is bit 5 or bit 6, depending on whether you number bits from 0 or from 1. If the bit isn't set, the parser treats the element as a primitive type, which means it has a single value instead of children. This is why you get prim in your output. The tag number is still 17 or 16, which denotes a SET OF or a SEQUENCE, so the outer element is still reported as a SET.
So instead of 11 and 10 you should be using the hex values 31 and 30. Then your encoding should parse correctly.
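To see the effect of the constructed bit, here is a small Python sketch of a TLV walker. It is not a full BER/DER parser: it assumes single-byte tags and short-form lengths (under 128), which is all the example needs. The function name `parse_tlv` is made up, and the values 0x05 and 0x07 are hypothetical stand-ins for XX and YY.

```python
def parse_tlv(data):
    """Parse single-byte-tag, short-form-length TLVs.

    Returns a list of (tag, value) tuples, where value is a nested list
    for constructed elements and raw bytes for primitive ones.
    """
    items = []
    pos = 0
    while pos < len(data):
        tag, length = data[pos], data[pos + 1]
        value = data[pos + 2 : pos + 2 + length]
        if tag & 0x20:
            # Constructed bit set: the contents are themselves TLVs
            items.append((tag, parse_tlv(value)))
        else:
            # Primitive: the contents are an opaque value
            items.append((tag, value))
        pos += 2 + length
    return items


# Encoding from the question (constructed bit missing), XX=0x05, YY=0x07 assumed
wrong = bytes([0x11, 0x08, 0x10, 0x06, 0x02, 0x01, 0x05, 0x02, 0x01, 0x07])
# Same bytes with the constructed bit set on the SET OF and SEQUENCE tags
fixed = bytes([0x31, 0x08, 0x30, 0x06, 0x02, 0x01, 0x05, 0x02, 0x01, 0x07])

print(parse_tlv(wrong))
print(parse_tlv(fixed))
```

With tag 0x11 the walker stops at the outer element and returns its 8 content bytes as one primitive value, which mirrors the openssl output in the question. With 0x31 and 0x30 it descends into the SEQUENCE and finds the two INTEGERs.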
I am working with time series data in MATLAB. I have got two vectors of date numbers, one of which relies on a somewhat subjective data source. If both vectors were perfectly accurate, all date numbers should correspond to trading days while one vector would be a 'proper subset' of the other. Unfortunately this is not the case because one of the vectors contains several date numbers which correspond to non-business days. I would like to find a way to replace all non-business days in this vector with the next available business day.
Example:
datenumbers = [736062;736063;736064;736065;736066;736067]
% corresponds to [wed, thu, fri, sat, sun, mon]
This contains [736065;736066], which correspond to the upcoming weekend. Because these are not working days, I would like to identify the date number of the subsequent Monday and change both entries so that:
datenumbers = [736062;736063;736064;736067;736067;736067]
% corresponds to [wed, thu, fri, mon, mon, mon]
One way to do it is simply to loop through all your values and change them if they are Saturday or Sunday.
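datenumbers = [736062;736063;736064;736065;736066;736067];
for i = 1:length(datenumbers)
    % For these date numbers, mod(...,7) is 1 on Saturday and 2 on Sunday
    weekDay = mod(datenumbers(i),7);
    if weekDay == 1 || weekDay == 2
        % Shift Saturday by 2 days and Sunday by 1 day to land on Monday
        datenumbers(i) = (3-weekDay) + datenumbers(i);
    end
end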
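As a cross-check of the mod-7 rule, the same roll-forward can be sketched in Python using calendar dates instead of remainders. This assumes the usual offset of 366 between MATLAB date numbers and Python's proleptic Gregorian ordinals. The function name `roll_to_next_weekday` is made up, and public holidays are not handled, only Saturdays and Sundays.

```python
from datetime import date, timedelta

# Assumed offset: MATLAB datenum = Python ordinal + 366
MATLAB_ORDINAL_OFFSET = 366


def roll_to_next_weekday(datenumbers):
    """Move Saturday/Sunday date numbers forward to the following Monday."""
    rolled = []
    for number in datenumbers:
        day = date.fromordinal(number - MATLAB_ORDINAL_OFFSET)
        # weekday() is 0 for Monday ... 5 for Saturday, 6 for Sunday
        while day.weekday() >= 5:
            day += timedelta(days=1)
        rolled.append(day.toordinal() + MATLAB_ORDINAL_OFFSET)
    return rolled


datenumbers = [736062, 736063, 736064, 736065, 736066, 736067]
result = roll_to_next_weekday(datenumbers)
print(result)
```

For the sample vector this gives the same result as the loop above, with 736065 and 736066 both replaced by 736067.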
I have an array of dates and I would like to discard any date that doesn't have at least one other date within a specific time interval, for example 5 minutes. I need to find a smart way to do it, as loops take forever with a larger dataset.
input data:
2009 07 07 16:01:30
2009 07 07 16:04:06
2009 07 07 16:05:00
2009 07 07 16:12:00
2009 07 07 16:19:43
2009 07 07 16:24:00
results:
2009 07 07 16:01:30
2009 07 07 16:04:06
2009 07 07 16:05:00
2009 07 07 16:19:43
2009 07 07 16:24:00
The value 2009 07 07 16:12:00 was discarded because it was more than 5 minutes away from any other timestamp.
Thanks,
Cristi
Secondary issue:
Both Dan and nkjt suggested an implementation that worked, thanks! What if the dates belong to two groups, A and B, and I want to check whether each date in group A has a corresponding date in group B within a given number of seconds or minutes? If it doesn't, the date should simply be removed from group A.
You can use diff. You'll need to use datenum to convert your data into a vector of values. In MATLAB datenums, "1" is a single day, so you can define a datenum step in terms of a time unit divided by the number of those in a day:
s = num_mins/(24*60);
Here's the trick with diff:
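x = datenum(mydata);
x = x(:).';              % force a row vector so the concatenations below work
s = num_mins/(24*60);    % threshold expressed in days
% for increasing times we shouldn't need the `abs` but to be safe
d = abs(diff(x));
% true where the gap on both sides exceeds s, i.e. the point is isolated
q = [d (s+1)]>s & [(s+1) d]>s;
(q marks the points to discard, so the values to keep are x(~q). You can use datestr to convert back, or apply ~q to the original data.)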
How it works:
The output of diff is one shorter than the original - it's just the difference between neighbouring values. We need it to be directional - to check each value against the one that comes before and after.
[d (s+1)]>s makes a vector the same length as the original, and checks if the difference values are larger than s. Because we set the last value to be s+1, the final value will always return true. This is a check to whether there's a gap between a value and the one following it (so for the final value this is always true).
[(s+1) d]>s does the same but on the other side. Again, we are setting one value, this time the first, to be larger than s so it's always true.
Combining these with & gives us the points where the difference is more than five minutes on both sides (or, for the end points, on their one side). These are the isolated points to discard.
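The same neighbour-gap logic can be sketched in Python for comparison. This is a port of the idea rather than of the MATLAB code itself, and it assumes the timestamps are already sorted in increasing order. The function name `isolated_mask` is made up for illustration.

```python
from datetime import datetime, timedelta


def isolated_mask(times, max_gap):
    """True where a timestamp has no neighbour within max_gap (times sorted)."""
    if not times:
        return []
    # Gap between each timestamp and the next one (one element shorter)
    too_far = [(b - a) > max_gap for a, b in zip(times, times[1:])]
    # Pad with True so the end points only depend on their single neighbour
    gap_after = too_far + [True]
    gap_before = [True] + too_far
    return [after and before for after, before in zip(gap_after, gap_before)]


stamps = [
    "2009 07 07 16:01:30",
    "2009 07 07 16:04:06",
    "2009 07 07 16:05:00",
    "2009 07 07 16:12:00",
    "2009 07 07 16:19:43",
    "2009 07 07 16:24:00",
]
times = [datetime.strptime(s, "%Y %m %d %H:%M:%S") for s in stamps]

mask = isolated_mask(times, timedelta(minutes=5))
kept = [s for s, isolated in zip(stamps, mask) if not isolated]
print(kept)
```

On the sample input only 16:12:00 is flagged, since its neighbours are 7 minutes and 7 minutes 43 seconds away, so the remaining five timestamps are kept.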