Convert Character Date variable to SAS Date

I have the following Variable called Date in an excel file which I'm reading into SAS:
Date
May2005
June2005
July2005
..
July2015
Both the format and the informat are characters ($8)
I wanted to convert these into a SAS Date variable.
How can I accomplish this task?
I thought about using substr to first create a month and a year variable,
then using proc format to convert all the months to numeric (e.g. 'jan' = 1),
and then using the mdy date function to create a new date. But I wonder if there is a shorter way to accomplish this task?

You can use the ANYDTDTE. informat if you prepend a day to your month+year string.
data want ;
set have ;
actual_date = input('01'||date,anydtdte32.); /* explicit width: the default width of 9 would cut off longer month names */
format actual_date date9.;
run;
Note that the FORMAT or INFORMAT attached to the character variable is meaningless, but having a variable of only length 8 will not allow room to store longer month names. Perhaps the length got set to only 8 because your particular example set of data did not include any longer month names.
If you are running such an old version of SAS that the ANYDTDTE. informat does not exist or does not work with fully spelled out months then you will need to work a little harder. You could transform the string into DATE9 format.
actual_date = input
('01'||substr(date,1,3)||substr(date,length(date)-3)
,DATE9.);

As @Tom hints, you have to use an informat that SAS can interpret as a numeric value when reading in character dates. I'm not sure whether there is one that directly reads a full month name followed by a four-digit year (naturally, ANYDTDTE works, but I prefer to avoid it). In this case, I would use MONYYw., combined with substr to get the three-letter month abbreviation and the two-digit year:
data have;
input Date $13.;
datalines;
January2005
February2005
March2005
April2005
May2005
June2005
July2005
August2005
September2005
October2005
November2005
December2005
;
run;
data want;
set have;
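/* The string built here is 5 characters long (e.g. Jan05); the MONYY informat accepts widths 5 to 7 */
Date2 = input(SUBSTR(Date,1,3)||SUBSTR(Date,length(date)-1,2),MONYY5.);
Format Date2 DATE9.;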
run;
proc print data = want; run;
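A variant of the same idea, as a minimal sketch: keeping all four year digits and reading with MONYY7. means the century does not have to be inferred from the YEARCUTOFF= option. The dataset name want4 and the sample rows are only illustrative.

```sas
data want4;
  length Date $13;
  input Date $;
  /* First three letters of the month name plus the last four characters (the year) */
  Date2 = input(substr(Date, 1, 3) || substr(Date, length(Date) - 3, 4), monyy7.);
  format Date2 date9.;
  datalines;
May2005
September2005
July2015
;
run;

proc print data=want4; run;
```

Each value should come out as the first day of the given month, since MONYY supplies day 1.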

Related

Convert character date to SAS date

My data is currently structured in the following way (dummy data below), where the date is formatted as in this example: 3/7/20 (M/D/YY), but I need the data in the following month-year form: 032020 (i.e. I need format mmyyn6.). I have tried a number of different things to get it into this format, but nothing has worked.
Current data structure:
DATA HAVE;
INPUT GROUP $ DATE $ COUNT_CUMU;
DATALINES;
A 3/7/20 2
A 3/8/20 8
A 3/9/20 16
RUN;
These solutions don't work, and give me extraneous numbers.
DATA WANT1;
SET HAVE;
MONTH_YEAR = INPUT(DATE,mmyyn6.);
FORMAT DATE MMDDYY8.;
RUN;
PROC SQL;
CREATE TABLE WANT2 AS
SELECT *,
INPUT(DATE, ANYDTDTM.) AS MONTH_YEAR FORMAT=mmyyn6.
FROM HAVE;
QUIT;
This solution works, but is not the format I need it in.
PROC SQL;
CREATE TABLE WANT3 AS
SELECT *,
INPUT(DATE, ANYDTDTM.) AS MONTH_YEAR FORMAT=DTMONYY7.
FROM HAVE;
QUIT;
Thank you for any advice or code you can share.
It is easy to do what you asked for.
Use the MMDDYY informat to convert the strings into date values. Note that the INPUT() function does not care if you use a width on the informat that is larger than the length of the string being read, so use the maximum width the informat supports.
You can use the MMDDYYN format to display dates without any separator character.
You can use the MMYYN format to display only the month and year. But in that case you might also want to change the date values to the first of the month.
And it works for the example data you provided.
DATA HAVE;
INPUT GROUP $ DATE $ COUNT_CUMU;
DATALINES;
A 3/7/20 2
A 3/8/20 8
A 3/9/20 16
;
data want;
set have;
real_date = input(date,mmddyy10.);
format real_date yymmdd10. ;
month_year = intnx('month',real_date,0);
year_month = month_year;
format month_year mmyyn6. year_month yymmn6. ;
run;
Results:
                          COUNT_                  month_    year_
Obs    GROUP    DATE        CUMU    real_date     year      month
  1    A        3/7/20         2    2020-03-07    032020    202003
  2    A        3/8/20         8    2020-03-08    032020    202003
  3    A        3/9/20        16    2020-03-09    032020    202003
If it does not work for you then you need to show examples of the input strings that do not work. Or explain how having a date value that is displayed using the MMDDYYN format does not work for you.
PS You should avoid using only two digits to record or display years. Look up Y2K problem. You should also avoid either MDY or DMY ordering of date digits to avoid confusing 50% of your audience. If you want to use only digits then use YMD order (YYMMDD or YYMMDDN format).
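To illustrate the two-digit-year point, here is a small sketch. The YEARCUTOFF= system option sets the first year of the 100-year window SAS uses to assign a century to two-digit years, so the same string can mean different dates under different settings. The value 1950 below is only an assumption chosen for the demonstration; note that it changes the option for the rest of the session.

```sas
/* Assumed setting for the demo: two-digit years fall in 1950-2049 */
options yearcutoff=1950;

data _null_;
  two_digit  = input('3/7/20',   mmddyy10.);  /* century inferred from YEARCUTOFF= */
  four_digit = input('3/7/2020', mmddyy10.);  /* century stated explicitly */
  put two_digit= yymmdd10. four_digit= yymmdd10.;
run;
```

With this setting both values should agree; with a four-digit year the result does not depend on the option at all.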

How to Display SAS date internal value

How can I display the internal date value of a date variable in sas?
I have it currently formatted as a date in the format ddmmyy10. and I would like to display the internal date value.
I initially thought of using the datediff function to get the difference from my date and January 1st 1960 but was wondering if there were a simpler way.
Thanks in advance
Alex
Simply set it as the numeric format 8.
/* Example data */
data have;
date = '02MAY2022'd;
format date date9.;
run;
/* Change the format of date in the dataset 'have' */
proc datasets lib=work nolist;
modify have;
format date 8.;
quit;
Output:
date
22767
Or, in Enterprise Guide, change the format through the GUI.
Just use a different format. Since recent dates are in the tens of thousands of days since 1/1/1960 the COMMA format would work well.
proc print data=have ;
format date comma8. ;
run;
Or remove the format completely and let SAS use its default display method for the numbers.
proc print data=have ;
format date ;
run;
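If you only want to see the number in the log rather than change the dataset, a sketch along these lines should also work. It additionally shows the subtraction from 1 January 1960 that the question mentions; the variable name days_since_1960 is illustrative.

```sas
data _null_;
  date = '02MAY2022'd;
  /* '01JAN1960'd is day 0, so the difference equals the stored value */
  days_since_1960 = date - '01JAN1960'd;
  /* Same variable written twice: once as a date, once as a plain number */
  put date= date9. date= best8. days_since_1960=;
run;
```

Both numeric values should be 22767, matching the output shown above.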

How do you change a variable in a CSV from text format to date format in SAS?

I'm trying to import a csv which has a variable called date. I'm trying to import it so the format once it's in a SAS table is date9 (e.g. 01MAY2021). Here is what I've tried:
Data Test;
infile "\\file.csv"
delimiter="," missover firstobs=2 dsd lrecl=32767;
informat Date ddmmyyyy.;
format Name $100.;
format Location $100.;
format Date date9.;
format Type $10.;
input Name $
Location $
Date
Type $;
run;
This currently just returns blank rows for the Date variable... The rows under the Date column in the csv are all populated as 'May-21'.
While I do not know what format your date variable is in for your csv file, use the informat anydtdte. to read the date. It checks the value against a variety of date layouts and parses whichever one matches. Use a : on your date variable in your input statement to specify the informat to use.
Data test;
infile "\\file.csv" dlm="," missover firstobs=2 dsd lrecl=32767;
length Name Location $100 Date 8 Type $10;
format date date9.;
input name$ location$ date:anydtdte. type$;
run;
The ANYDTDTE informat reads input data that corresponds to any of the
following informats or date, time, or datetime forms. Then, the
informat extracts the date part from the derived value.
You say your date shows as May-21, which is (roughly) a MONYY pattern, but you then use an INFORMAT of ddmmyyyy, which doesn't match. Your informat should reflect how your data looks before you read it in. You can use the MONYY informat to ensure it's read correctly.
data have;
infile cards truncover;
informat orig monyy.;
format orig date9.;
input orig ;
cards;
May-21
Jun-21
Sep-20
Jan-19
;;;;
run;
proc print data=have;
run;
Results:
Obs orig
1 01MAY2021
2 01JUN2021
3 01SEP2020
4 01JAN2019
Use an INFORMAT that matches the pattern of the text in the file. The MONYY informat will work with your example string.
You can then attach any date type FORMAT you want to control how the date value is printed. I wouldn't use DMY order or MDY order for dates as either choice will confuse half of the audience. You could use DATE or YYMMDD to avoid confusion. Or you could use MONYY format to get values similar to the original text. But make sure to use 4 digits for the year to avoid confusion about which century they mean.
There is no need to attach formats to your character variables. Also no need to include $ in the INPUT statement if you have already set the variable type before that statement.
data Test;
infile "\\file.csv" dsd truncover firstobs=2;
length Name $100 Date 8 Location $100 Type $10 ;
informat date monyy.;
format date monyy7. ;
input name date location type;
run;

How to Define date format from a given date

I want to know how to define a date format from a given date.
For example, I have the date 20180423, so in SAS I want to define the format as 'yyyymmdd'.
Similarly, if the date in the data is 12022018, I want to define it as 'ddmmyyyy'.
Please note that the date is provided to me as a proper date, but I want to define the format now.
The date layout may be different in future,
so I need to handle all of the date formats in SAS.
What I thought was, given the date 20180422,
use the substr function:
data test;
a=20180422;
a=substr(a,1,4);
b=substr(a,5,1);
c=substr(a,7,1);
run;
but not sure.
If anyone can provide the solution,then it really helps me in my project work.
Thanks in Advance for help.
It sounds like you want to convert various values to a date. SAS stores dates as a number, being the number of days since 1st Jan 1960. It's then usual to format this number to display as a date, in whichever format is preferred.
When importing dates that are already in a format, it is necessary to use the input function, along with an informat, to convert the formatted value to a SAS date. If the date values being read in are all in the same format, then the specific informat can be used. In your case, where different formats are used, you can use the anydtdte. informat, which will convert most of the standard date formats to a SAS date.
The example below converts 3 different date formats to a SAS date, then displays the SAS date in the date9. format. I've printed both the unformatted and formatted new values to the log, just so you can see they are stored as numbers.
data _null_;
input date_in $20.;
date_out = input(date_in, anydtdte20.);
put date_in date_out date_out :date9.;
datalines;
20180422
12022018
27apr2018
;
run;
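For the simpler case mentioned above, where every value is known to share one layout, a specific informat can be named directly instead of letting ANYDTDTE guess. A minimal sketch using the two strings from the question (the variable names ymd and dmy are illustrative):

```sas
data _null_;
  ymd = input('20180422', yymmdd8.);  /* YYYYMMDD, no separators */
  dmy = input('12022018', ddmmyy8.);  /* DDMMYYYY, no separators */
  put ymd= date9. dmy= date9.;
run;
```

Because the layout is stated explicitly, the result does not depend on the DATESTYLE= option or the session locale.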
Use input(a,anydtdte20.); this will convert most common date strings to a SAS date. Then use the functions Year(), Month(), Day() to extract the parts you want.
You will find this SAS Post very useful about dates and locales.
Solution:
I created a table with two rows, each in a different date format (YYYYMMDD and DDMMYYYY), to show how the code handles different date formats, saves them as SAS dates and breaks them down into year, month and day:
options DATESTYLE=DMY;
data have;
input a $; /* read as character so input() below needs no automatic numeric-to-character conversion */
datalines;
20180422
12022018
;
run;
data test;
set have;
format date_a date9.;
date_a=input(a,anydtdte20.);
Year_a=year(date_a);
month_a=month(date_a);
day_a=day(date_a);
run;
Output:
a=20180422 date_a=22APR2018 Year_a=2018 month_a=4 day_a=22
a=12022018 date_a=12FEB2018 Year_a=2018 month_a=2 day_a=12
You can use an IF condition inside a data step. Check whether the date value satisfies the required criteria, then format it using the PUT function. PUT takes a source as its first argument and a format as its second, and returns the formatted value. Different values of the same column can be given different formats that way.
Something like this,
if a = 'date1CheckCondtion' then newA = put(a , dateformat1.);
if a = 'date2' then newA = put(a , dateformat2.);
You may then choose to get all values in a common format like this:
dateA=input(newA,mmddyy6.);
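One way to make the conditional idea concrete, as a sketch only: try one specific informat and fall back to a second when the first yields a missing value. It assumes the values are 8-character strings in either YYYYMMDD or DDMMYYYY layout; the dataset name want_fallback is illustrative, and other layouts would need further branches.

```sas
data want_fallback;
  input a $8.;
  /* ?? suppresses the invalid-data messages when a string does not fit the informat */
  date_a = input(a, ?? yymmdd8.);                           /* try YYYYMMDD first */
  if missing(date_a) then date_a = input(a, ?? ddmmyy8.);   /* fall back to DDMMYYYY */
  format date_a date9.;
  datalines;
20180422
12022018
;
run;

proc print data=want_fallback; run;
```

The second row fails the first attempt because 12022018 read as YYYYMMDD would have month 20, so it is picked up by the DDMMYYYY fallback.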

SAS Num to Date in Quarter format

My dataset has a date in NUM format 201011 which is Nov 2010. I want it converted to 2010Q4 in date format. I applied YYQ6. format but it shows results as 2510Q4. What is wrong in here?
data abc;
date=201011;
run;
data abc2;
set abc;
format date YYQ.;
run;
You need to write your date as a date literal, otherwise SAS will interpret it as the number of days since 1st January 1960.
Try this:
data abc;
date='01nov2010'd;
run;
data abc2;
set abc;
format date YYQ.;
run;
If you have an existing numeric variable in the format yyyymm, you will need to create a new one first before applying the format, e.g.
newdate = input(put(date,6.), yymmn6.);
format newdate yyq.;
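Put together as a complete step, the conversion above might look like the following sketch (the dataset name abc3 is illustrative; the value 201011 is the one from the question):

```sas
data abc3;
  date = 201011;                              /* numeric yyyymm, not a SAS date */
  newdate = input(put(date, 6.), yymmn6.);    /* write as text, then read as year and month */
  format newdate yyq6.;
  put newdate=;
run;
```

The log line should show newdate=2010Q4, since the YYMMN informat yields the first day of November 2010.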
data _null_;
datenum=201011;
dateval=mdy(mod(datenum,100),1,floor(datenum/100));
put dateval mmddyy10.;
run;
Not sure the FLOOR function is necessary--in this case, it worked okay without it.
The question is, is this arithmetic manipulation and use of MDY more or less efficient than converting to character and then back to a date?