I want to assign the current year in a YY format to either a macro or data set variable.
I am able to use the automatic macro variables &sysdate or &sysdate9 to get the current date. However, extracting the year in a YY format is proving to be a nightmare. Below are some examples of what I've been trying.
There exists the YEARw. format. But when I try to use it I get errors or weird results. For instance, running
data _null_;
yy = year(input("&sysdate9.", year2.));
put yy=;
run;
produces the error
ERROR 48-59: The informat YEAR was not found or could not be loaded.
If I try to format the variable in the output, I get 1965 instead of the current year. The following
data _null_;
yy = year(input("&sysdate9.", date9.));
put yy= yy year2.;
run;
outputs
yy=2016 65
Please help.
This works to get you the 2-digit year number of the current year:
DATA _NULL_;
YEAR = PUT(TODAY(),YEAR2.);
PUT YEAR;
RUN;
/* Returns: 16 */
To break down what I am doing here:
I use TODAY() to get the current date as a SAS date value. &SYSDATE would first need to be converted to a date value, and it holds the date on which the SAS session started, whereas TODAY() returns the current date.
PUT allows us to pass in a non-character (numeric/date) value, which is why it is used with TODAY() as opposed to INPUT.
I think it is worth exploring the issues here in more detail.
First, formats are patterns for converting numeric values to a human-readable form. That's what you want to do here: convert a date value to a human-readable form, in this case to a year.
Informats, on the other hand, convert human-readable information to numeric values. That's not what you're doing here; you have a value already.
Second, put pairs with formats, and input pairs with informats, exclusively.
Third, you get close in your last try: but you misuse the year format. Formats are basically value mappings, so they map every possible numeric value in their range (sometimes "all values" is the range, sometimes not) to a display value (string). You need to know what kind of value is expected on the input. YEARw. expects a date value as input, not a year value: meaning input is "number of days from 1/1/1960", mapped to "year". So you cannot take a value you've already mapped to a year value and map it again with that method; it will not make any sense.
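To see where the 65 comes from, the mapping can be reproduced outside SAS. The following is a small Python sketch, not SAS code; it assumes only that a SAS date value counts days from 1 January 1960, as described above. The names SAS_EPOCH and year_format are made up for this illustration.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # day 0 of the SAS date scale

def year_format(days_since_epoch, width=4):
    """Roughly mimic a YEARw.-style format: treat the number as a day count."""
    year = (SAS_EPOCH + timedelta(days=days_since_epoch)).year
    return str(year)[-width:]

# 2016 is read as "2016 days after 1 Jan 1960", which falls in 1965.
misused = year_format(2016, width=2)

# A genuine date value for 19 Dec 2016 gives the intended answer.
real_date_value = (date(2016, 12, 19) - SAS_EPOCH).days
intended = year_format(real_date_value, width=2)

print(misused, intended)
```

The first call reproduces the surprising result from the question, and the second shows that the same mapping behaves as expected once it is handed a day count rather than a year.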
Let's look at it:
data _null_;
yy = year(input("&sysdate9.", date9.));
put yy= yy year2.;
run;
yy contains the result of the year function - 2016. Good so far. Now, you need the 2 digit year (16); you can get that through mod function, if you like, or put/substr/input:
data _null_;
yy = input(substr(put(year(input("&sysdate9.", date9.)),4.),3,2),2.);
put yy=;
run;
mod is probably easier though, since it's a number. But of course you could've used the YEAR2. format directly:
data _null_;
yy = put(input("&sysdate9.", date9.),year2.);
put yy=;
run;
Now, yy is character, so you could wrap that with input(...,2.) or leave it character depending on your purposes.
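The numeric-versus-character distinction matters mostly for years such as 2005, where the number loses its leading zero. A short Python illustration of the two routes (remainder versus last two characters), again only a sketch of the arithmetic and not SAS code:

```python
year = 2016

yy_numeric = year % 100      # remainder after dividing by 100
yy_text = str(year)[-2:]     # last two characters of "2016"

# For 2005 the numeric route yields 5, so pad it if a two-character
# result is wanted.
early = 2005
early_numeric = early % 100
early_text = f"{early % 100:02d}"

print(yy_numeric, yy_text, early_numeric, early_text)
```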
Finally - a use note on &sysdate9.. You can easily make this a date without input:
"&sysdate9."d
So:
yy = put("&sysdate9."d,year2.);
That's called a date literal (and "..."dt and "..."t also work for datetime and time). The quoted text has to be in the standard SAS form, such as 19DEC2016 for a date, to work properly.
And as pointed out in Nicarus' answer, today() is a bit better than &sysdate9 since it is guaranteed to be today. If you're running this in batch or restart your session daily, this won't matter, but it will if you have a long-running session.
Apply the year function to the date variable
Convert to string
Take last 2 digits
EDIT: change input to PUT
Year = substr(put(year(today()), 4.), 3);
I've looked for help on the internet for the following, but I could not find a satisfying answer: for an assignment, I need to plot the time series of a certain variable (the term spread in percentages), with years on the x-axis.
However, we use daily data. Does anybody know a convenient way in which this can be done? The 'date' variable that I've got is formulated in the following way: 20111017 represents the 17th of October 2011.
I tried to extract the first 4 numbers of the variable 'date', by using the substr(date, 1, 4) command, but the message 'type mismatch' popped up. Also, I'm not quite sure if it gives the right information if I only use the years to plot daily data (over the years). It now gives the following graph, which doesn't look that nice.
Answering the question in your title.
The date() function expects a string. If your variable with value 20111017 is in a numeric format you can convert it like this: tostring datenum , gen(datestr).
Then when using the date() function you must provide a mask that tells Stata what format the date string is in. Below is a reproducible example you can run to see how this works.
* Example generated by -dataex-. For more info, type help dataex
clear
input float datenum
20111016
end
* Convert numeric variable to string
tostring datenum , gen(datestr)
* Convert string to date
gen date = date(datestr, "YMD")
* Display date as date
format date %td
If this does not help you, try to provide a reproducible example.
This adds some details to the helpful answer by @TheIceBear.
As he indicates, one way to get a Stata daily date from your run-together date variable is to convert it to a string first. But tostring is just one way to do that and not essential. (I have nothing against tostring, as its original author, but it is better suited to other tasks.)
Here I use daily() not date(): the results are identical, but it's a good idea to use daily(): date() is all too often misunderstood as a generic date function, whereas all it does is produce daily dates (or missings).
To get a numeric year variable, just divide by 10000 and round down. You could convert to a string, extract the first 4 characters, and then convert to numeric, but that's more operations.
clear
set obs 1
gen long date = 20111017
format date %8.0f
gen ddate = daily(strofreal(date, "%8.0f"), "YMD")
format %td ddate
gen year = floor(date/10000)
list
+-----------------------------+
| date ddate year |
|-----------------------------|
1. | 20111017 17oct2011 2011 |
+-----------------------------+
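The divide-and-round-down idea extends to the month and day as well. Here is a minimal Python sketch of that integer arithmetic (not Stata code; the function name split_yyyymmdd is hypothetical):

```python
from datetime import date

def split_yyyymmdd(n):
    """Split a run-together integer such as 20111017 into a calendar date."""
    year, rest = divmod(n, 10000)   # 2011 and 1017
    month, day = divmod(rest, 100)  # 10 and 17
    return date(year, month, day)

d = split_yyyymmdd(20111017)
print(d.year, d.month, d.day)
```

Constructing a real date at the end has the side benefit of rejecting impossible values such as 20111332.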
* date is in %td format
gen date1 = real(string(mofd(daily(date, "DMY")), "%tmCYN"))
* type mismatch error
tostring date, gen(dt)
gen date1 = real(string(mofd(daily(dt, "DMY")), "%tmCYN"))
* the code runs but generates no results
tostring date, gen(dt)
gen date2=date(dt, "YMD")
* the code runs but generates no results
If a date variable has a display format %td it must be numeric and stored as some kind of integer. The display format is, and is only, an instruction to Stata on how to display such integers. Confusions about conversion often seem to hinge on a misunderstanding about what format means, as format is an overloaded word in computing, referring variously to file format (as in graphics file format, .png or .jpg or whatever); data layout (as in wide or long layout, structure or format); variable or storage type; and (here) display format. There could well be yet other meanings.
A date displayed as 30jan2015 is stored as an integer, namely
. display mdy(1, 30, 2015)
20118
and a glance at help data types shows that your variable date could be stored as an int, float, long or double. All would work, although int is least demanding of memory. You would need (e.g.) to run describe date to find out which type is being used in your case, but nothing to come in this answer depends on knowing that type. Note that finding out what Stata is doing and thinking can be illuminated by running display with simple, single examples.
Your question is ambiguous.
Want to change display format? If you wish merely to see your dates in a display format exemplified by 20150130 then consulting help datetime display formats shows that the display format is as tested here with display, which can be abbreviated all the way down to di
. di %tdCCYYNNDD 20118
20150130
so
format date %tdCCYYNNDD
is what you need. That instructs Stata to change the display format, but the numbers stored remain precisely as they were.
Want such dates as variables held as integers? If you want the dates to be held as integers like 20150130 then you could convert it to string using the display format above, and then to a real value. A minimal sandbox dataset shows this:
. clear
. set obs 1
Number of observations (_N) was 0, now 1.
. gen date = 20118
. gen wanted = real(strofreal(date, "%tdCCYYNNDD"))
. format wanted %8.0f
. l
+------------------+
| date wanted |
|------------------|
1. | 20118 20150130 |
+------------------+
A display format such as %8.0f is needed to see such values directly.
Another method is to generate a large integer directly. You need to be explicit about a suitable storage type and (as just mentioned) need to set an appropriate format, but it can be got to work:
. gen long also = 10000 * year(date) + 100 * month(date) + day(date)
. format also %8.0f
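The same packing arithmetic can be checked outside Stata. This Python sketch assumes only that a Stata daily date counts days from 1 January 1960; STATA_EPOCH and daily_to_yyyymmdd are names invented for the illustration.

```python
from datetime import date, timedelta

STATA_EPOCH = date(1960, 1, 1)  # daily date 0

def daily_to_yyyymmdd(daily):
    """Turn a day count since 1 Jan 1960 into an integer like 20150130."""
    d = STATA_EPOCH + timedelta(days=daily)
    return 10000 * d.year + 100 * d.month + d.day

packed = daily_to_yyyymmdd(20118)
print(packed)
```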
Want such dates as variables held as strings? This is the previous solution, but leave off the real(). The default display format will work fine.
. gen WANTED = strofreal(date, "%tdCCYYNNDD")
. l
+-----------------------------+
| date wanted WANTED |
|-----------------------------|
1. | 20118 20150130 20150130 |
+-----------------------------+
I have not used tostring here but as its original author I have no bias against it. The principles needed here are better illustrated using the underlying function strofreal(). The older name string() will still work.
Turning to your code,
tostring date, gen(dt)
will just put integers like 20118 in string form, so "20118", but there is no way that Stata can understand that alone to be a daily date. You could have run tostring with a format argument, which would have been equivalent to the code above. The advantage of tostring would only be if you had several such variables you wished to convert at once, as tostring would loop over such variables for you.
I can't follow why you thought that conversion to a monthly date or use of a monthly date display format was needed or helpful, as at best you'd lose the information on day of the month. Thus at best Stata can only map a monthly date back to the first day of that month, and at worst a monthly date (here 660) could not be understood as anything you want.
. di mofd(20118)
660
. di %td mofd(20118)
22oct1961
. di %td dofm(mofd(20118))
01jan2015
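What happens in those three display calls can be mimicked in a few lines of Python. This is only a sketch of the arithmetic, assuming daily dates count days and monthly dates count months from the start of 1960; the functions mofd and dofm below are simplified stand-ins named after the Stata functions, not the Stata functions themselves.

```python
from datetime import date, timedelta

EPOCH = date(1960, 1, 1)

def mofd(daily):
    """Months since January 1960 for a daily date (day count since 1 Jan 1960)."""
    d = EPOCH + timedelta(days=daily)
    return (d.year - 1960) * 12 + (d.month - 1)

def dofm(monthly):
    """Daily date of the first day of a monthly date."""
    years, months = divmod(monthly, 12)
    return (date(1960 + years, months + 1, 1) - EPOCH).days

m = mofd(20118)                                    # month count for 30 Jan 2015
misread = EPOCH + timedelta(days=m)                # month count treated as days
first_of_month = EPOCH + timedelta(days=dofm(m))   # round trip loses the day

print(m, misread, first_of_month)
```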
There is no shortcut to understanding how Stata thinks about dates that doesn't involve reading the needed parts of help datetime and help datetime display formats.
Yet more explanation and examples can be found at https://www.stata-journal.com/article.html?article=dm0067
I have a dataset with dates stored as strings in the format ddMonyy, e.g. 19Dec16.
When converting the strings using date7. informat to SAS date, some years are interpreted as 19yy and some as 20yy.
Here is a sample code
data strDates;
infile cards;
input StringDate $;
cards;
31Dec99
01Jan00
19Dec16
31Dec25
01Jan26
;
run;
data convertTest;
set strDates;
format Date date9.;
Date=input(StringDate,date7.);
run;
Running the code today (19 Dec 2016) produces the following results
StringDate  Date
31Dec99     31DEC1999
01Jan00     01JAN2000
19Dec16     19DEC2016
31Dec25     31DEC2025
01Jan26     01JAN1926
Dates between 01Jan00 and 31Dec25 are assigned to years 2000-2025, while dates from 01Jan26 to 31Dec99 are treated as years 1926-1999.
Question:
How is it determined whether 2000 or 1900 is to be added to the year? I suspect it depends on the runtime (the calendar year when the code is run?), but I was not able to find any reference to this in the SAS documentation.
There is a system option, YEARCUTOFF, which depending on your system and version probably has a value of either 1920 or 1926. See KB note 46368 for more information on the change.
It sounds like you're using SAS 9.4, which means the default is 1926: two-digit years from 00-25 will be read as '20xx' and anything from 26-99 will be read as '19xx'. You can change the YEARCUTOFF option if that value does not work for your data (or construct the 4-digit year yourself).
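The windowing rule can be sketched in a few lines. This is a Python illustration of the logic, not SAS code; it assumes the cutoff is given as a four-digit year that marks the first year of a 100-year window, and the function name expand_two_digit_year is made up.

```python
def expand_two_digit_year(yy, yearcutoff=1926):
    """Map a two-digit year into the 100-year window starting at yearcutoff."""
    century = yearcutoff - yearcutoff % 100  # 1900 for a cutoff of 1926
    year = century + yy
    if year < yearcutoff:                    # falls before the window starts
        year += 100
    return year

for yy in (99, 0, 16, 25, 26):
    print(f"{yy:02d} -> {expand_two_digit_year(yy)}")
```

With a cutoff of 1926 this reproduces the results in the question; passing yearcutoff=1920 instead moves the boundary so that 20 maps to 1920 and 19 maps to 2019.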
I currently have a dataset with dates in the format "FY15 FEB". In attempting to format this variable for use with SAS's times and dates, I've done the following:
data temp;
set pre_temp;
yr = substr(fiscal,3,2);
month = substr(fiscal,6,length(fiscal));
mmmyy = month||yr;
input mmmyy MONYY5.;
datalines;
run;
So, I have the strings representing the year and corresponding month. However, running this code gives me the error "The informat $MONYY was not found or could not be loaded." Doing some background on this error tells me that it has something to do with passing the informat a value with the wrong type; what should I alter in order to get the correct output?
*Edit: I see on the SAS support page for formats that "MONYYw. expects a SAS date value as input;" given this, how do I go from strings to a different date format before this one?
When you see a $, it means a character value. In this case, you're feeding SAS a character variable and giving it a numeric informat. SAS inserts the $ for you, but there is no informat by that name.
I'm going to ignore the datalines statement, because it isn't needed when the data come from a set statement. You might have an easier time just changing your program to:
data temp;
set pre_temp;
yr = substr(fiscal,3,2);
month = substr(fiscal,6,length(fiscal));
pre_mmmyy = strip(month)||strip(yr);
/* input() converts the character value to a numeric SAS date */
mmmyy=input(pre_mmmyy,MONYY5.);
/* display the numeric date as e.g. FEB15 */
format mmmyy monyy5.;
run;
You can also remove length(fiscal) from the substr call. The third argument to substr is optional; when it is omitted, the result runs to the end of the string.
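For comparison, the same string surgery looks like this in Python. It is only a sketch of the parsing steps, not SAS code: it assumes every label has the exact layout "FYyy MON", that the two-digit year belongs to the 2000s, and that the year in the label is used as-is (no fiscal-to-calendar adjustment). MONTHS and parse_fiscal are hypothetical names.

```python
from datetime import date

MONTH_NAMES = "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split()
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

def parse_fiscal(label):
    """Parse a label such as 'FY15 FEB' into the first day of that month."""
    yr = int(label[2:4])                  # characters 3-4, like substr(fiscal,3,2)
    month_name = label[5:].strip().upper()  # from character 6 to the end
    return date(2000 + yr, MONTHS[month_name], 1)

parsed = parse_fiscal("FY15 FEB")
print(parsed)
```

Note that SAS substr positions are 1-based while Python slices are 0-based, which is why position 3 becomes index 2.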
Hi, I have a date conversion problem in SAS.
I imported an Excel file which has the following dates:
2012-01-09
2011-01-31
2010-06-28
2005-06-10
2012-09-19
2012-09-19
2007-06-12
2012-09-20
2004-11-01
2007-03-27
2008-06-23
2006-04-20
2012-09-20
2010-07-14
After I imported the file, the dates had changed to this:
40917
40574
40357
38513
41171
41171
39245
41172
38292
39168
39622
38827
41172
40373
I have used the input function to convert the dates, but it gives a strange result.
The code I used:
want_date=input(have_date, anydtdte12.);
informat want_date date9.; format have_date date9.;run;
I get very strange, out-of-this-world dates. Any idea how I can convert these?
You can encourage SAS to convert the data as date during the import, although this isn't necessarily a panacea.
proc import file=whatever out=whatever dbms=excel replace;
dbdsopts=(dbSasType=( datevar=date ) );
run;
where datevar is your date column name. This tells SAS to expect this to be a date and to try to convert it.
See So Your Data Are in Excel for more information, or the documentation.
From: http://www2.sas.com/proceedings/sugi29/068-29.pdf
Times are counted internally in SAS as seconds since midnight and
date/time combinations are calculated as the number of seconds since
midnight 1 January 1960.
Excel also uses simple numerical values for dates and times
internally. For the date values the difference with the SAS date is
only the anchor point. Excel uses 1 January 1900 as day one.
So subtract a constant to shift the anchor point (and multiply by 86400 to turn fractions of a day into seconds).
EXAMPLES:
SAS_date = Excel_date - 21916;
SAS_time = Excel_time * 86400;
SAS_date_time = (Excel_date_time - 21916) * 86400;
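The constant 21916 can be checked independently. The Python sketch below is not SAS code. It relies on the fact that Excel's 1900 date system counts a nonexistent 29 February 1900, so for any date after February 1900 the effective day zero is 30 December 1899. The names used here are made up for the illustration.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)      # SAS date 0
EXCEL_EPOCH = date(1899, 12, 30)  # effective Excel serial 0 (see note above)

OFFSET = (SAS_EPOCH - EXCEL_EPOCH).days  # days between the two anchor points

def excel_serial_to_date(serial):
    """Convert an Excel serial number to a calendar date via the SAS date scale."""
    sas_date = serial - OFFSET
    return SAS_EPOCH + timedelta(days=sas_date)

print(OFFSET)
print(excel_serial_to_date(40917), excel_serial_to_date(40574))
```

The first two serial numbers from the question map back to the first two dates in the original Excel column.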
As Justin wrote, you need to correct for the different zero date (SAS vs. Excel).
Then you just need to apply a format (if you want to get a date variable to do calculations):
want_date = have_date-21916;
format want_date date9.;
Or convert it to a string:
want_date = put(have_date-21916, date9.);
In either case you can choose the date format you prefer.