d3 date/time plotting out by 100 years - date

Can anyone help a beginner d3'er? I've created a scatter plot using d3 that shows Filing Date (of a Document Family) on the x-axis, and # family Members on the y-axis. Dots are all the same radius and are colour coded by Status. I have 300+ data records in a csv file format and all but the first 4 of them plot perfectly. All dates are post 1900.
Included below are the first 9 lines of the csv data set. (I've used line breaks here to make it a bit more readable.) The first 4 records plot 100 years later than their actual date, e.g. 1934 plots as 2034, but 3-Nov-75 and the ~300 others all plot as expected. The data came from an Excel spreadsheet; all date cells were formatted the same way, all show the correct 4-digit year, and none behave as text strings instead of dates. The date display format in Excel was dd-mmm-yy.
Num,Word,FilingDate,Status,Type,FamNum
63872,Word1,23-May-34,Removed,Word,2
69105,Word2,19-Oct-36,Registered,Word,1
175164,Word3,31-Jul-62,Registered,Word,6
207804,Word4,1-Feb-67,Registered,Word,6
291765,Word5,3-Nov-75,Registered,Word,12
381067,Word6,15-Sep-82,Removed,Word,2
381069,Word7,15-Sep-82,Removed,Word,2
402936,Word8,27-Jan-84,Removed,Word,2
410476,Word9,20-Jun-84,Removed,Word,2
I've included the following in the html code, which is working for all but the first 4 records.
// Parse the date/time
var parseDate = d3.time.format("%d-%b-%y").parse;
Does anyone have any suggestions on what may be going haywire?
Thanks.

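You will have to hack a little to produce the date you want. With %y, d3 reads two-digit years 00-68 as 2000-2068 and 69-99 as 1969-1999, which is why only the pre-1969 records are off by 100 years.
Instead of using
var parseDate = d3.time.format("%d-%b-%y").parse;
create a new function to parse your date (this assumes none of your dates are in 2000 or later):
var parseDate = function(d) {
  var last2 = parseInt(d.slice(-2), 10); // two-digit year from the end of the string
  var date = d3.time.format("%d-%b-%y").parse(d);
  if (last2 < 69) { // d3 parsed this year as 20xx
    date.setFullYear(1900 + last2); // move it back to 19xx
  }
  return date;
};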
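The same pivot can be reproduced outside d3. Python's strptime follows the same convention for %y (69-99 become 1969-1999, 0-68 become 2000-2068), so the sketch below, written in Python purely for illustration, shows both the symptom and the correction. parse_filing_date is a hypothetical helper, and it assumes an English locale for the month names and that every date in the file belongs to the 1900s.

```python
from datetime import datetime

def parse_filing_date(text):
    """Parse 'dd-Mon-yy' and force the year into 1900-1999.

    Assumption: no record is from 2000 or later.
    """
    date = datetime.strptime(text, "%d-%b-%y")
    if date.year >= 2000:
        # %y expanded the two-digit year to 20xx; move it back a century.
        date = date.replace(year=date.year - 100)
    return date

raw = datetime.strptime("23-May-34", "%d-%b-%y")   # parsed as 2034
fixed = parse_filing_date("23-May-34")             # corrected to 1934
unchanged = parse_filing_date("3-Nov-75")          # already 1975
```

If the file could contain filings from 2000 onwards, a fixed century rule like this would misplace them, and exporting four-digit years from Excel would be the safer route.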

Related

Plot Two Regression Lines on Same Scatter Plot By Year: X-Axis Date MM/DD

I have a scatter plot of calls / time. My x variable is the date (Day/Month) and my Y variable is a number of calls on each date. I would like to plot two regression lines using PROC SGPLOT REG, one for 2019 and one for 2020. However, when I try to do this, all I get is a regular scatter plot with no regression lines. Here is my code:
proc sgplot data=intern.bothphase1;
  reg x=date y=count / group=Year;
  label count="Calls Per Day" year="Year";
  Title "Comparison of EMS Calls per Day 1/1 - 3/31 in 2019 vs. 2020";
run;
The scatter plot comes up without issue (2019 and 2020 values in different colors) but I want to see how the trends differed between the two time periods, so I really want to get the regression lines on there. Can anyone help?
I imagine this has to do with the fact that I concatenated my day and month with a / so it is a character variable and so SAS cannot calculate the regression. I did this so I could use year as a class variable. I still have the original date variable in my table, is there a way I could get SAS to give me the month/day from that as a numeric variable?
Thanks!
EDIT: I used a date value in SAS and changed the format to mm/dd, but this doesn't help because the regression lines sit at either end of the graph rather than overlapping (picture attached). This is because SAS dates are stored as the number of days since 1/1/1960. What I want is for the mm/dd values to correspond to the numbers 1-365, so that I get two overlapping regression lines for the same time period in 2019 vs. 2020, showing how the trend changed from one year to the next. Does anyone know how I can do this?
So two steps here: first, you need to generate a "day" value that's 1-365... so let's just subtract out 01JAN from the day value.
data have;
  do date = '01JAN2019'd to '31DEC2020'd;
    count = 25+2*rand('uniform');
    year = year(date);
    if month(date) le 3 then output;
  end;
  format date date9.;
run;
data adjusted;
  set have;
  date_fixed = date - intnx('year',date,0,'b') + 1; *current date minus jan 1 plus 1 (otherwise off by 1);
  format date_fixed date5.; *this does not actually affect the graph axis, oddly;
run;
proc sgplot data=adjusted;
  reg x=date_fixed y=count / group=Year;
  xaxis valuesformat=date5.; *this seems to be needed for some reason;
  label count="Calls Per Day" year="Year";
  Title "Comparison of EMS Calls per Day 1/1 - 3/31 in 2019 vs. 2020";
run;
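Second, we add the xaxis statement because, for some reason, the axis won't obey the DATE5. format set in the data step, so we force it with VALUESFORMAT= (MMDDYY5. would also work, as Reeza noted in the comments). Keep in mind that SAS date 0 is 01JAN1960, so a day number of 1 is labelled 02JAN; the tick labels can be a day off from the calendar date.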
Here is what I get. You can use other axis options to further limit things, so for example 01APR doesn't show up.
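The "subtract 1 January, then add 1" step is easy to check outside SAS. The following is a small Python sketch of the same arithmetic, not SAS code; day_of_year is a hypothetical helper name.

```python
from datetime import date

def day_of_year(d):
    """Return 1 for 1 January, 2 for 2 January, and so on."""
    return (d - date(d.year, 1, 1)).days + 1

first_2019 = day_of_year(date(2019, 1, 1))     # 1
end_q1_2019 = day_of_year(date(2019, 3, 31))   # 90
end_q1_2020 = day_of_year(date(2020, 3, 31))   # 91, because 2020 is a leap year
```

Because 2020 has a 29 February, the same calendar date lands one day number later in 2020 than in 2019 from March onwards, which is worth remembering when comparing the two regression lines.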

Spotfire: Empty values on x-axis

Spotfire question: I have a data table with monthly data that is visualized in a bar chart. I however also want to visualize the information in quarters, by showing the latest month of the quarter in the format '20Q1, 20Q2, etc.'. (So I don't want to use standard 'date format'.)
My idea was to create an additional column that is filled for March, June, Sep, Dec and empty for the other months. Then with a document property, the user can select to either view the data in months or in quarters (i.e. the last month of the quarter).
So far so good, my data now looks like this:
Month     Value   YearQuarter
Jan-20    100     (empty)
Feb-20    200     (empty)
Mar-20    400     20Q1
Apr-20    125     (empty)
May-20    101     (empty)
Jun-20    300     20Q2
The problem now is that when I visualize the data with YearQuarter on the x-axis, it also shows all the (empty) values in a bucket. See below. How to solve this? Note that the x-axis has a custom expression "<$esc(${Granularity})>", where Granularity is a document property to determine what column to pick.
Did you try limiting the data using the same expression you put on the x-axis, so that rows with an empty value are excluded?
Thanks

MATLAB: Find all values on one date, then filter down to an hour and find average [duplicate]

I have a year's worth of data, recorded at one-minute intervals each day of the year.
The date and time were imported from Excel (in the form 243.981944; adding 42004 to place it in 2015 and formatting the cell as a date gives 31.8.15 23:34:00).
Importing to MATLAB it becomes
'31/08/2015 23:34:00'
I require the data for each day of the year to be at hourly intervals, so I need to sum the data recorded in each hour and divide that by the number of data recorded for that hour, giving me the hourly average.
For some reason the data in August actually increments in 2 minute intervals, data for every other month increments in one minute intervals.
ie
...
31/07/2015 23:57:00
31/07/2015 23:58:00
31/07/2015 23:59:00
31/08/2015 00:00:00
31/08/2015 00:02:00
31/08/2015 00:04:00
...
I'm not sure how I can find all the values for a specific date and hour in order to work out the averages. I was thinking of using a for loop to find the values on each day, but when I got down to writing code realised this wouldn't work the way I was thinking.
I presume there must be some kind of functions available that would allow for data to be filtered by the date and time?
edit:
So I tried the following but I get these errors.
dates is a 520000x1 cell array containing the date strings in the format formatIn.
formatIn = 'DD/MM/YYYY HH:MM:SS';
[~,M,D,H] = datevec(dates, formatIn);
Error using cnv2icudf (line 131)
Unrecognized minute format. Format string: DD/MM/YYYY HH:MM:SS.
Error in datevec (line 112)
icu_dtformat = cnv2icudf(varargin{isdateformat});
Assume your dates are in a matrix or cell array of strings called A, and your other data is in a vector X. Let's say all the data is in the same year (so we can ignore years). Note the lowercase dd/mm/yyyy in the format string: datevec reads uppercase MM as minutes, which is why your format string fails.
[~,M,D,H] = datevec(A, 'dd/mm/yyyy HH:MM:SS');
mean_A = accumarray([M, D, H+1], X, [], @mean);
Then data from February will be in
mean_A(2,:,:)
To look at the data, you may find the squeeze() function useful, e.g.
squeeze(mean_A(2,1:10,13:24))
shows the average for the hours after midday (by column) for the first ten days (by row) of February.
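For readers who want to see the grouping spelled out, here is a rough Python sketch of what the accumarray call does: bucket every reading by (month, day, hour) and average each bucket. hourly_means is a hypothetical helper, and the timestamps and readings are made-up sample values, not data from the question.

```python
from collections import defaultdict
from datetime import datetime

def hourly_means(timestamps, values, fmt="%d/%m/%Y %H:%M:%S"):
    """Average the values that fall in each (month, day, hour) bucket."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for text, value in zip(timestamps, values):
        t = datetime.strptime(text, fmt)
        key = (t.month, t.day, t.hour)
        sums[key] += value
        counts[key] += 1
    # Dividing by the actual count copes with 1-minute and 2-minute spacing alike.
    return {key: sums[key] / counts[key] for key in sums}

# Made-up sample readings for illustration only.
stamps = ["31/07/2015 23:57:00", "31/07/2015 23:58:00",
          "01/08/2015 00:00:00", "01/08/2015 00:02:00"]
readings = [1.0, 3.0, 10.0, 20.0]
means = hourly_means(stamps, readings)
```

Here means[(7, 31, 23)] is the average for 23:00-23:59 on 31 July, and means[(8, 1, 0)] is the average for the first hour of 1 August.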
See also:
Counting values by day/hour with timeseries in MATLAB

Converting a date/math formula in Excel into Numbers for Mac

I have a formula in Excel that subtracts a birth date from today's date and divides by 365 which gives the age in decimal format. Example below.
B4 is equal to birthday of 10/03/2011.
E4 is today's date.
The result is 2.73. My child is a little over 2 and 1/2.
=IF(B4>0,(E$4-B4)/365," ")
When I try to use this formula in Numbers for Mac, it gives me an error about comparing dates with numbers. I looked at DATEDIF, TIMEVALUE, and DATEVALUE but couldn't figure out how to do it in Numbers.
Anyone know how I could get this formula to return a decimal value of 2.73 years of age?
Assuming B4 is in "date" format, you could try the following formula
=IF(ISBLANK(B4),"",YEARFRAC(B4,TODAY(),1))
Here is some documentation on YEARFRAC from the horse's mouth
To see why the original formula fails, try entering the formula
=B4>0
in a different cell. You will then get this error:
You can’t compare a date with a number because their data types are different.
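The arithmetic behind the original Excel formula can be sketched in Python as a sanity check. This uses the question's simple days / 365 rule rather than YEARFRAC's day-count basis, so the two may differ slightly in the last decimals. age_in_years is a hypothetical helper; reading 10/03/2011 as 3 October 2011 and using 25 June 2014 as "today" are assumptions chosen only to illustrate a result near 2.73.

```python
from datetime import date

def age_in_years(birth, today):
    """Decimal age using the question's days / 365 rule."""
    if birth is None:
        # Mirrors the ISBLANK(B4) guard: no birth date, no result.
        return ""
    return (today - birth).days / 365

# Assumed dates, for illustration only.
age = age_in_years(date(2011, 10, 3), date(2014, 6, 25))
rounded = round(age, 2)
```

The blank check replaces the B4>0 comparison, which is the part Numbers rejects.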

Concatenate variable horizontally to make one variable in matlab

Is it possible to concatenate multiple variables horizontally to make a single variable in MATLAB?
For Example, I want to join:
year = 2001, month = 06, day = 15
to make one variable '20010615' which I could search in a matrix.
I hope I am clear.
Regards,
If you want a string output, use string formatting and sprintf
sprintf('%04d%02d%02d', year, month, day );
If you want a numeric output, simply multiply
day + 100 * month + 10000 * year
Update:
Thanks to @Joshua's comment: if you are indeed working with date/time information, you should also look into datestr, which allows more specialized formatting of date and time information.
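Both approaches carry over to other languages unchanged. As an illustrative Python sketch (the function names are hypothetical), zero-padded formatting gives the string form and place-value arithmetic gives the numeric form:

```python
def date_key_string(year, month, day):
    """Zero-padded yyyymmdd string, like sprintf('%04d%02d%02d', ...)."""
    return f"{year:04d}{month:02d}{day:02d}"

def date_key_number(year, month, day):
    """Same digits as a single integer: day + 100*month + 10000*year."""
    return day + 100 * month + 10000 * year

as_text = date_key_string(2001, 6, 15)
as_number = date_key_number(2001, 6, 15)
```

The numeric form is usually the easier one to search for in a numeric matrix, since it avoids converting every element to text.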