Hello, currently my dates are stored as numeric values in the form of 40547. How can I convert these to MMDDYY10.?
data SevenSec11;
set Seven11;
DateRecieved = input(put(DateRecieved, 8.), MMDDYY10.);
format DateRecieved MMDDYY10.;
run;
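How to convert it depends on what the value represents. If it is a date as stored by Excel (a day count starting at 30DEC1899), then adjust for the different offset, because SAS counts days from 01JAN1960. If it is supposed to represent the digits of an MMDDYY value, then use the Z6. format in your PUT() function call so that the truncated leading zero is restored.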
data test;
input num ;
sasdate1 = num + '30DEC1899'd ;
sasdate2 = input(put(num ,z6.),mmddyy10.);
format num comma7. sasdate: yymmdd10. ;
cards;
40547
;
Result:
Obs num sasdate1 sasdate2
1 40,547 2011-01-04 1947-04-05
Note that using Y-M-D order for dates will eliminate confusion that truncated leading zeros can cause. It will also prevent half of your audience from confusing April 5th with May 4th.
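As a cross-check of the offset arithmetic outside SAS, here is a minimal Python sketch. Python is not part of the original answer and the function names are made up for illustration. It assumes 40547 is an Excel serial number (1900 date system, day 0 at 30 December 1899) and that SAS date values count days from 1 January 1960.

```python
from datetime import date, timedelta

EXCEL_EPOCH = date(1899, 12, 30)  # day 0 of Excel serial dates (assumed 1900 date system)
SAS_EPOCH = date(1960, 1, 1)      # day 0 of SAS date values


def excel_serial_to_date(serial: int) -> date:
    """Calendar date for an Excel serial day number."""
    return EXCEL_EPOCH + timedelta(days=serial)


def excel_serial_to_sas(serial: int) -> int:
    """SAS date value for an Excel serial; same as num + '30DEC1899'd in SAS."""
    return serial + (EXCEL_EPOCH - SAS_EPOCH).days


print(excel_serial_to_date(40547))  # 2011-01-04
print(excel_serial_to_sas(40547))   # 18631
```

The second function mirrors sasdate1 in the data step above: the date literal '30DEC1899'd is simply the negative day count between the two epochs.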
I am currently trying to write some code that goes through my data and assigns a number 0-12 based on the date in the "Week" column. This number appears in a new column called group, which is created by the code you see below. The problem is that this column is periods all the way down and not numbers. There are no error messages in the log, so I don't know where I went wrong (I'm fairly new to SAS). PS: the dates range from 6/17 to 9/9.
data have;
set have;
if today()+84 = Week > today()+79 then group=12;
else if today()+77 = Week > today()+72 then group=11;
else if today()+70 = Week > today()+65 then group=10;
else if today()+63 = Week > today()+58 then group=9;
else if today()+56 = Week > today()+51 then group=8;
else if today()+49 = Week > today()+45 then group=7;
else if today()+42 = Week > today()+37 then group=6;
else if today()+35 = Week > today()+30 then group=5;
else if today()+28 = Week > today()+23 then group=4;
else if today()+21 = Week > today()+16 then group=3;
else if today()+14 = Week > today()+11 then group=2;
else if today()+7 = Week > today()+2 then group=1;
else if today() = Week > today()-5 then group=0;
run;
Update:
The first column is called week and is a Monday date that goes 12 weeks into the future. The rest of the columns are variables that I will end up summing based on the group that row is in.
ex:
week ID var2 ... var18
17jun2019 1 x x
24jun2019 1 x x
and it continues until 09sep2019. It does this for each ID (roughly 10,000 of them), but not every ID goes 12 weeks out; that's why I am using the else if.
I would like it to look like
week ID var2 ... var18 group
17jun2019 1 x x 0
24jun2019 1 x x 1
01july2019 1 x x 2
A full reference to SAS operators can be found in SAS help by searching for "SAS Operators in Expressions". SAS expressions can use some operators that are relatively unusual across the spectrum of coding languages. Here are some that are not typically found in newly written SAS code (at the time of this post):
<> MAX operator
>< MIN operator
implied AND operator
Two comparisons with a common variable linked by AND can be condensed with an implied AND.
So the uninitiated readers of the question may misunderstand
…
if today()+35 = Week > today()+30 then group=5;
…
as incorrect, instead of recognizing it as an implied AND
…
if today()+35 = Week AND Week > today()+30 then group=5;
…
When syntactically correct, the = in the implied AND causes the expression to be true only on equality. A week value in the open interval ( today()+30, today()+35 ) will never evaluate as true in the above expression. This is the likely cause of the missing values (.) you are seeing.
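Python happens to chain comparisons the same way, so the effect of the implied AND can be sketched there. This is only an illustration in a different language, not SAS; the function name and the day numbers are hypothetical.

```python
def matches(week: int, today: int) -> bool:
    # Chained comparison: a == b > c means (a == b) and (b > c),
    # the same reading SAS gives to its implied AND.
    return today + 35 == week > today + 30


today = 0  # hypothetical day number standing in for today()
hits = [w for w in range(today + 28, today + 38) if matches(w, today)]
print(hits)  # [35]
```

Only the single value today+35 passes; every other day in the intended range falls through all the branches, which is why group stays missing.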
Why does the code exhibit a non-constant delta in the sequence 30,23,16,11,2,-5 ?
Should it be 30,23,16,9,2,-5 ?
In other words, why is group 2 apparently shooting for a 3-day range (+11, +14] when the others span 5 days, such as (+2, +7] ?
Why are there 2-day gaps, presumed weekends, in which group is not assigned, and would thus be missing (.) ?
This type of wallpaper code is often better represented by an arithmetic expression.
For example, presuming integer SAS date values:
group = ifn ( MOD (week-today(), 7) in (1,2)
, .
, CEIL ( (week-today()) / 7 )
);
if not ( 0 <= group <= 12 ) then group = .; * probably dont want this but makes it compliant with OP;
Tomorrow the group value could be 'wrong' because it is today() based. Consider coding a view instead of creating a permanent data set -- OR -- place meta information in the variable name group_on_20190622 = …
If you insist on wallpaper, consider using a select statement which is less prone to typing errors that can happen with errant semi-colons or missing elses.
It is not at all clear what you are trying to do. It sounds a little like you want to group observations based on how many weeks the date variable (called WEEK) is away from today's date. It might be easiest to just use the INTCK() function. That will count how many week boundaries are crossed between the two dates.
data have ;
input id week date9.;
format week date9.;
cards;
1 17jun2019
1 24jun2019
1 01jul2019
2 24jun2019
2 01jul2019
2 08jul2019
;
data want ;
set have;
group = intck('week',today(),week);
run;
You can then summarize the number of observations per group.
proc freq data=want;
tables group;
run;
Results:
The FREQ Procedure

                                  Cumulative    Cumulative
group    Frequency     Percent     Frequency       Percent
----------------------------------------------------------
   -1            1       16.67             1         16.67
    0            2       33.33             3         50.00
    1            2       33.33             5         83.33
    2            1       16.67             6        100.00
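To make the boundary counting concrete, here is a rough Python sketch of the same idea. It is not SAS; it assumes weeks start on Sunday (the default for the SAS 'week' interval) and a hypothetical run date of 26 June 2019, which happens to reproduce the frequencies shown above. The helper names are invented for this example.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Sunday on or before d (isoweekday: Mon=1 ... Sun=7)."""
    return d - timedelta(days=d.isoweekday() % 7)


def weeks_crossed(start: date, end: date) -> int:
    """Number of Sunday boundaries between start and end (negative if end is earlier)."""
    return (week_start(end) - week_start(start)).days // 7


today = date(2019, 6, 26)  # assumed run date, stands in for today()
weeks = [date(2019, 6, 17), date(2019, 6, 24), date(2019, 7, 1),
         date(2019, 6, 24), date(2019, 7, 1), date(2019, 7, 8)]

counts = Counter(weeks_crossed(today, w) for w in weeks)
print(sorted(counts.items()))  # [(-1, 1), (0, 2), (1, 2), (2, 1)]
```

Because the count is of boundaries crossed rather than of elapsed 7-day spans, the result changes when the run date moves into a new week, not every day.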
Assuming week is date and not datetime.
data test;
do i = 1 to 30;
dt = intnx('day',today(),1*i);
output;
end;
format dt date9.;
run;
data test2;
set test;
if dt ge today() and dt le today()+7 then dt2 = 1;
else if dt ge today()+8 and dt le today()+14 then dt2 = 2;
else if dt ge today()+15 and dt le today()+21 then dt2 = 3;
else if dt ge today()+22 and dt le today()+28 then dt2 = 4;
else if dt ge today()+29 and dt le today()+35 then dt2 = 5;
/* another way */
dt3 = ceil(intck('day',today(),dt)/7);
run;
I know the date in SAS looks like 01Jan2017. What I want is 1 January 2017. Is there a function to do that?
Thanks,
Andrea
Your question is pretty vague - it's hard to tell if you just want to display it differently or store it differently.
Display it differently: Just change the format to keep the date as a number in the background (number of days since 01JAN1960) but display it how you would like.
data ds1;
date = '01JAN2017'd;
format date WORDDATX20.;
run;
Store it differently: You can use the put function to create a separate character variable containing the formatted version of your date.
data ds2;
date1 = '01JAN2017'd;
date2 = put(date1, WORDDATX20.);
run;
Answers provided by Bhavika and pm2r should also help you understand what's going on here.
your code would look like this:
data _null_;
length date1 8. string $40;
date1=today();
string = cats( date1 );
put string=;
string = cats( put(date1,date10.) );
put string=;
string = cats( put(date1,WORDDATX20.) );
put string=;
run;
and the output would be:
string=20838
string=19JAN2017
string=19 January 2017
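The relationship between the stored number and the displayed text can be checked outside SAS. The following Python sketch is only an illustration with made-up names; it assumes, as stated earlier, that a SAS date is the number of days since 01JAN1960.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # day 0 of SAS date values
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def sas_to_date(n: int) -> date:
    """Calendar date for a SAS date value."""
    return SAS_EPOCH + timedelta(days=n)


def long_form(d: date) -> str:
    """Day without leading zero, full month name, four-digit year."""
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"


d = sas_to_date(20838)
print(d)             # 2017-01-19
print(long_form(d))  # 19 January 2017
```

The number never changes; only the rendering does, which is the same distinction as between applying a format and creating a new character variable with put().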
I am trying to maintain a table using some panel data. I have all the data outputting fine, but I am having difficulty getting the correct dates to display. The method I am using is the following:
gen ymdny = date(date,"MDY"); /*<- date var from panel dataset that i import*/
sort name ymdny;
summ ymdny;
local lastdate : disp %tdM-D r(max);
local lastdate2 : disp %tdM-D (r(max)-1);
local lastw : disp %tdM-D (r(max)-7);
This would work fine if the data were daily, but the dataset I have is actually business daily (i.e. missing for weekends and national bank holidays). It seems silly, but I have not been able to figure out a workaround that does the job. Ideally, there is a function that I can use to print the date corresponding to a particular value.
For example:
gen resbal_1d = round(l1.resbal,0.1);
gen dateOf = dateOf(resbal_1d); /* <- pseudocode example of what I would like */
I'm not sure what you're asking for, but my guess is that you want to see a human-readable date as the output, given a numerical input. (This is your last sentence.) So simply try something like:
display %td 10
The format is important as the following shows (see help format):
display %tq 10
Same numerical input, different format, different output.
Two other examples from the manual:
* string to integer
display date("5-12-1998", "MDY")
* string to date format
display %td date("5-12-1998", "MDY")
As for your example code, I don't get what you're aiming for. In effect, you can summarize the date variable because in Stata, dates are just integers. It's legal, but I couldn't say if it's good form. Below is a simple example.
clear all
set more off
set obs 10
gen date = _n // create the data
format date %td // give date format
list
summarize date
local onedate = r(max)
display %td `onedate'
Some references:
[U] 24 Working with dates and time
help datetime
help datetime business calendars
http://www.stata.com/support/faqs/data-management/creating-date-variables/
http://www.ats.ucla.edu/stat/stata/modules/dates.htm
(Maybe you can explain with more detail and context what it is you want.)
Edit
Your comment
I do not see how this helps with the date output. For example,
displaying r(max) - 1 on a monday will still display the sunday date.
does not explain, at all, the problems you're having with Stata's business calendars.
I'm adding what is basically an example taken from the help file I already referenced. I do this with the hope of convincing you that (re)-reading the help files is worthwhile.
clear all
set more off
* import string dates
infile str10 sdate float x using http://www.stata-press.com/data/r13/bcal_simple
list
*----- Regular dates -----
* create elapsed dates - Stata's way of managing dates
generate rdate = date(sdate, "MD20Y")
format rdate %td
drop sdate x
list
* compute previous and next dates
generate tomorrow1 = rdate + 1
format tomorrow1 %td
generate yesterday1 = rdate - 1
format yesterday1 %td
list
*----- Business dates -----
* convert regular date to business dates
generate bdate = bofd("simple", rdate)
format bdate %tbsimple
* compute previous and next dates
generate tomorrow2 = bdate + 1
format tomorrow2 %tbsimple
generate yesterday2 = bdate - 1
format yesterday2 %tbsimple
order yesterday1 rdate tomorrow1 yesterday2 bdate tomorrow2
list
/*
The stbcal-file for simple, the calendar shown below,
November 2011
Su Mo Tu We Th Fr Sa
---------------------------
1 2 3 4 X
X 7 8 9 10 11 X
X 14 15 16 17 18 X
X 21 22 23 X X X
X 28 29 30
---------------------------
*/
Notice that if you add or subtract 1 from a regular date, then business days are not taken into account. If you do the same with a business calendar date, you get what you want. Business calendars are defined by .stbcal files; the example uses a built-in calendar called simple. You may need to make your own .stbcal file, but it is not difficult. Again, the details are in the help files.
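For readers who want to see the logic without Stata, here is a small Python sketch of what a business calendar does when you step by one day. It is not Stata's implementation; the holiday set is read off the November 2011 calendar above (24 and 25 November marked X on weekdays), and the function names are hypothetical.

```python
from datetime import date, timedelta

# Weekday closures taken from the "simple" calendar shown above (assumption)
HOLIDAYS = {date(2011, 11, 24), date(2011, 11, 25)}


def is_business_day(d: date) -> bool:
    """Monday to Friday and not a listed holiday."""
    return d.weekday() < 5 and d not in HOLIDAYS


def shift_business_days(d: date, n: int) -> date:
    """Move n business days forward (n > 0) or backward (n < 0)."""
    step = 1 if n > 0 else -1
    remaining = abs(n)
    while remaining:
        d += timedelta(days=step)
        if is_business_day(d):
            remaining -= 1
    return d


monday = date(2011, 11, 28)
print(monday - timedelta(days=1))        # 2011-11-27, a Sunday
print(shift_business_days(monday, -1))   # 2011-11-23, the previous business day
```

The first print corresponds to yesterday1 in the Stata example and the second to yesterday2.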
I need to convert date and time into a numerical value. for example:
>> num = datenum('2011-05-07 11:52:23')
num =
7.3463e+05
How would I write a script to do this for numerous values without inputting the date and time manually?
You can store your date strings first in a cell array (or a matrix, provided they are of fixed format), and feed it straight to datenum. For example:
C = {'2011-05-07 11:52:23'
'2011-03-01 20:30:01'};
vals = datenum(C)
I have a dataset for which I have extracted the date at which an event occurred. The date is in the format MMDDYY, although MATLAB does not show leading zeros, so often it's MDDYY.
Is there a method to find the mean or median (I could use either) date? median works fine when there is an odd number of dates, but for even numbers I believe it is averaging the two middle ones, which doesn't produce sensible values. I've been trying to convert the dates to a MATLAB format with regexp and put it back together, but I haven't gotten it to work. Thanks
dates=[32381 41081 40581 32381 32981 41081 40981 40581];
You can use datenum to convert dates to a serial date number (1 is 1 January 0000, 2 is 2 January 0000, 367 is 1 January 0001, etc.):
strDate='27112011';
numDate = datenum(strDate,'ddmmyyyy')
Any arithmetic operation can then be performed on these date numbers, like taking a mean or median:
mean(numDates)
median(numDates)
The only problem here is that you don't have your dates as strings, but as numbers. Luckily datenum also accepts numeric input, but you'll have to give the year, month and day as separate elements of a vector:
numDate = datenum([year month day])
or as rows in a matrix if you have multiple timestamps.
So for your specified example data:
dates=[32381 41081 40581 32381 32981 41081 40981 40581];
years = mod(dates,100);
dates = (dates-years)./100;
days = mod(dates,100);
months = (dates-days)./100;
years = years + 1900; % set the years to the 20th century
numDates = datenum([years(:) months(:) days(:)]);
fprintf('The mean date is %s\n', datestr(mean(numDates)));
fprintf('The median date is %s\n', datestr(median(numDates)));
In this example I converted the resulting mean and median back to a readable date format using datestr, which takes the serial date number as input.
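The same decomposition can be cross-checked in Python. This is a sketch under the same assumptions as the MATLAB code (values are MMDDYY with the leading zero dropped, all years in the 1900s); the function name is invented, and the mean is rounded to the nearest whole day.

```python
from datetime import date
from statistics import mean, median


def decode_mmddyy(n: int, century: int = 1900) -> date:
    """Split an integer such as 32381 into month 3, day 23, year 81."""
    yy = n % 100
    dd = (n // 100) % 100
    mm = n // 10000
    return date(century + yy, mm, dd)


raw = [32381, 41081, 40581, 32381, 32981, 41081, 40981, 40581]
ordinals = [decode_mmddyy(n).toordinal() for n in raw]  # day counts, safe to average

median_date = date.fromordinal(round(median(ordinals)))
mean_date = date.fromordinal(round(mean(ordinals)))
print(median_date)  # 1981-04-05
print(mean_date)    # 1981-04-03
```

Averaging day counts is what makes the even-length median sensible: the midpoint of two day numbers is still a point in time, whereas the midpoint of two MMDDYY integers is not.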
Try this:
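dates=[32381 41081 40581 32381 32981 41081 40981 40581];
d=zeros(1,length(dates));
for i=1:length(dates)
% pad to six digits to restore the month's leading zero, then parse as
% month-day-year (two-digit years are resolved by datenum's pivot year)
d(i)=datenum(sprintf('%06d',dates(i)),'mmddyy');
end
m=mean(d);
m_str=datestr(m,'dd.mm.yy')
I hope this is useful. Regards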
Store the dates as YYMMDD, rather than as MMDDYY. This has the useful side effect that the numeric order of the dates is also the chronological order.
Here is a sketch in MATLAB of the conversion you could write.
dates = [32381 41081 40581 32381 32981 41081 40981 40581]; % MMDDYY
yy = mod(dates, 100); % last two digits are the year
rest = (dates - yy) / 100; % remaining digits are MMDD
dd = mod(rest, 100); % day
mm = (rest - dd) / 100; % month
newdates = yy*100*100 + mm*100 + dd % YYMMDD
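Once you have the dates in YYMMDD format, find the median (numerically), and this is also the median chronologically. With an even number of dates, pick one of the two middle values rather than averaging them, because the average of two YYMMDD numbers is not necessarily a valid date.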
You can see above how to represent dates as numbers.
I will add to that the issue of finding the median of the list. The default MATLAB median function averages the two middle values when there is an even number of values.
But you can do it yourself! Try this:
dates; % is your array of dates in numeric form
sdates = sort(dates);
mediandate = sdates(round((length(sdates)+1)/2));