Feels like an obvious question, but Stata help hasn't yielded answers. Most Stata users are interested in converting a non-date variable into a date variable, but I want the opposite.
I have a date variable date, type long, format %tdCCYYNN. I'm trying to append it to a dataset in which the same variable date is type long and format %12.0g. To accurately do this, I need to convert date in the first dataset from %tdCCYYNN to %12.0g. When I do format %12.0g date, date values change to incorrect ones.
Let's say, in the first dataset, I have date=201204. I still want it to read 201204, just as a %12.0g variable. Is there a way to do this?
I +1 all the comments above by Nick and William and suggest you read help datetime. I have been using Stata for a few years and still frequently visit this help file. Stata's date/time functionality is fantastic and you will benefit from learning it earlier rather than later.
I would convert the other data to Stata date format. Really. But if you need to convert your %td date to an "integer YYYYNN" date, then pass through a temporary file. If you write your %td date to plain text, then it will keep the displayed format and you can read it back as an integer YYYYNN date.
// data that matches your description
clear
set obs 1
generate date = date("20120401", "YMD")
format date %tdCCYYNN
list
// write to tempfile as plain text
tempfile plainText
outsheet using "`plainText'"
// read back with dates as integers
preserve
tempfile StataData
insheet using "`plainText'", clear
rename date dateInteger
save "`StataData'"
restore
// merge to original data
merge 1:1 _n using "`StataData'"
list
describe
This yields the following.
. list

     +---------------------------------+
     |   date   dateIn~r        _merge |
     |---------------------------------|
  1. | 201204     201204   matched (3) |
     +---------------------------------+

. describe

Contains data
  obs:             1
 vars:             3
 size:             7
-----------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------------------------------------
date            int     %tdCCYYNN
dateInteger     long    %12.0g
_merge          byte    %23.0g     _merge
-----------------------------------------------------------------------------------------------------
Sorted by:
     Note: Dataset has changed since last saved.
But I suggest you take advantage of Stata's date/time functionality.
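To illustrate why changing the display format alone changes what you see: a %td variable stores a count of days since 01jan1960, and %tdCCYYNN only controls how that count is displayed. The sketch below is plain Python, not Stata, and the helper name td_to_yyyymm is made up for illustration. It shows the arithmetic that turns the stored day count into an integer that reads as YYYYMM.

```python
from datetime import date, timedelta

# Stata %td values count days since 01jan1960.
STATA_EPOCH = date(1960, 1, 1)

def td_to_yyyymm(days: int) -> int:
    """Turn a day count since 1960-01-01 into an integer such as 201204."""
    d = STATA_EPOCH + timedelta(days=days)
    return d.year * 100 + d.month

# The day count stored for 1 April 2012 (what %tdCCYYNN displays as 201204).
stored = (date(2012, 4, 1) - STATA_EPOCH).days
print(stored, td_to_yyyymm(stored))
```

In Stata itself the same arithmetic could presumably be written with the year() and month() functions, as in year(date)*100 + month(date), which would avoid the round trip through a temporary file.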
I have a spark data frame having two columns (SEQ - Integer, MAIN_DATE - Date) as:
Now I want to add a column based on the condition that if the format of MAIN_DATE is "MMM-YYYY" then it should be converted to Last day of the month and new data frame should look like this:
Any suggestion will be much appreciated.
You can use Spark's when/otherwise methods in order to operate differently for each different date format of the MAIN_DATE column.
More specifically, you can match the MMM-yyyy values of the column by their String length (since we know that those values will always have 8 characters) as the condition in when, and then:
use to_date to convert the String value to a valid date based on a format we give as an argument, and
use last_day to get the last day of the month that each date in MAIN_DATE refers to.
As for the "regular" rows with the dd-MMM-yyyy date format, a to_date conversion within the otherwise method is sufficient.
After that, all that's left to do is to convert the dates back to the desired dd-MMM-yyyy format (because to_date returns a date column, which is displayed in the yyyy-MM-dd format).
This is the solution in Scala (split into two withColumns to make it more readable, instead of a one-liner):
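import org.apache.spark.sql.functions._

df.withColumn("END_DATE",
    // 8-character values are MMM-yyyy: parse them, then move to the last day of that month
    when(length(col("MAIN_DATE")).equalTo(8), last_day(to_date(col("MAIN_DATE"), "MMM-yyyy")))
      .otherwise(to_date(col("MAIN_DATE"), "dd-MMM-yyyy")))
  .withColumn("END_DATE", date_format(col("END_DATE"), "dd-MMM-yyyy"))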
This is what the resulting df DataFrame will look like:
+---+-----------+-----------+
|SEQ| MAIN_DATE| END_DATE|
+---+-----------+-----------+
| 1|16-JAN-2020|16-Jan-2020|
| 2| FEB-2017|28-Feb-2017|
+---+-----------+-----------+
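The same branching logic can be sketched outside Spark. The following is plain Python rather than Spark code, the function name end_date is hypothetical, and it assumes an English locale so that %b matches month abbreviations such as FEB. It dispatches on the string length, moves MMM-yyyy values to the last day of their month, and formats everything back to dd-MMM-yyyy.

```python
import calendar
from datetime import datetime

def end_date(main_date: str) -> str:
    """Return main_date as dd-Mon-yyyy, using month end for MMM-yyyy inputs."""
    if len(main_date) == 8:
        # MMM-yyyy: parse, then replace the day with the month's last day
        parsed = datetime.strptime(main_date, "%b-%Y")
        last_day = calendar.monthrange(parsed.year, parsed.month)[1]
        parsed = parsed.replace(day=last_day)
    else:
        # dd-MMM-yyyy: a straight parse is enough
        parsed = datetime.strptime(main_date, "%d-%b-%Y")
    return parsed.strftime("%d-%b-%Y")

for value in ["16-JAN-2020", "FEB-2017"]:
    print(value, end_date(value))
```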
I need help to convert Numbers into date format using Power Query Editor in either Excel or PowerBI
The date appears in number form like this 930101 and I want to convert it to normal Uk date format
I'm not sure which part of "0101" in your string is the month and which is the day, but you can work that out yourself and follow the steps below to get your required output in Power Query Editor:
First, split your string value into fixed 2-character pieces, so that your data is divided into 3 columns. Decide which one is the Year, the Month and the Day.
Now, merge those 3 columns in the UK pattern DD/MM/YY using "/" as the separator, and you will get a string like "01/01/93".
Finally, create a custom column using the below code-
Date.From([Merged],"en-GB")
Here is the final output-
In the above image, you can see the date is still in US format, simply because of my laptop's locale settings.
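As a rough cross-check of the split-and-merge steps, here is a minimal sketch in Python rather than Power Query M. It assumes the digits are ordered YYMMDD, which the question does not confirm, and the function name to_uk_date is made up for illustration.

```python
from datetime import datetime

def to_uk_date(raw: str) -> str:
    """Read a 6-digit string as YYMMDD (an assumption) and return DD/MM/YYYY."""
    parsed = datetime.strptime(raw, "%y%m%d")
    return parsed.strftime("%d/%m/%Y")

print(to_uk_date("930101"))
```

Note that Python's %y maps two-digit years 69 to 99 onto 1969 to 1999, so 93 is read as 1993; Power Query's own two-digit-year rule may differ.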
I have table ABC in which I have column Z of datatype Date. The format of the data is YYYYMMDD. Now I am looking to convert the above format to YYYY-MON-DD format. Can someone help?
You can use to_char
TO_CHAR(Z,'YYYY-MON-DD')
Depending on what the purpose of the reformatting is, you can either explicitly cast it to a VARCHAR/CHAR and define the format, or you can change your display format to however you'd like to see all dates:
ALTER SESSION SET DATE_OUTPUT_FORMAT = 'YYYY-MON-DD';
It's important to understand that if the data is in a DATE field, then it is stored as a date, and the format of the date is dependent on your viewing preferences, not how it is stored.
Since the value of the date field is actually stored as a number, you have to convert it to a string first and then to a date.
ALTER SESSION SET DATE_OUTPUT_FORMAT = 'YYYY-MON-DD';
select to_date(to_char(z), 'YYYYMMDD') from ABC;
(adding this answer to summarize and resolve the question - since the clues and answers are scattered through comments)
The question stated that column Z is of type DATE, but it really seems to be a NUMBER.
Then before parsing a number like 20201017 to a date, first you need to transform it to a STRING.
Once the original number is parsed to a date, it can be represented as a new string formatted as desired.
WITH data AS (
SELECT 20201017 AS z
)
SELECT TO_CHAR(TO_DATE(TO_CHAR(z), 'YYYYMMDD'), 'YYYY-MON-DD')
FROM data;
-- 2020-Oct-17
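The number, string, date, string chain is not specific to SQL. As a minimal illustration in Python (the function name reformat_number is hypothetical, and an English locale is assumed for the month abbreviation):

```python
from datetime import datetime

def reformat_number(z: int) -> str:
    """Number -> string -> date -> string formatted as YYYY-Mon-DD."""
    as_text = str(z)                               # 20201017 -> "20201017"
    as_date = datetime.strptime(as_text, "%Y%m%d")  # parse the digits as a date
    return as_date.strftime("%Y-%b-%d")             # render in the new format

print(reformat_number(20201017))  # 2020-Oct-17
```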
I want to know how to define a date format based on a given date.
For example, if I have the date 20180423, then in SAS I want to define the format as 'yyyymmdd'.
Similarly, if the date given in the data is 12022018, then I want to define it as 'ddmmyyyy'.
Please note that the date is provided to me as a proper date, but I want to define the format now.
The date given may be in a different format in the future,
so I need to handle all of the date formats in SAS.
What I thought was: given the date 20180422,
use the substr function
data test;
a=20180422;
a=substr(a,1,4);
b=substr(a,5,1);
c=substr(a,7,1);
run;
but not sure.
If anyone can provide a solution, it would really help me in my project work.
Thanks in Advance for help.
It sounds like you want to convert various values to a date. SAS stores dates as a number, being the number of days since 1st Jan 1960. It's then usual to format this number to display as a date, in whichever format is preferred.
When importing dates that are already in a format, it is necessary to use the input function, along with an informat, to convert the formatted value to a SAS date. If the date values being read in are all in the same format, then the specific informat can be used. In your case, where different formats are used, you can use the anydtdte. informat, which will convert most of the standard date formats to a SAS date.
The example below converts 3 different date formats to a SAS date, then displays the SAS date in the date9. format. I've printed both the unformatted and formatted new values to the log, just so you can see they are stored as numbers.
data _null_;
input date_in $20.;
date_out = input(date_in, anydtdte20.);
put date_in date_out date_out :date9.;
datalines;
20180422
12022018
27apr2018
;
run;
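For readers without SAS to hand, the idea can be imitated in Python. This is only a sketch: it is not how anydtdte. works internally, and the list of candidate layouts and the function names are assumptions chosen to cover the three sample values. Each layout is tried in order, and the result is also shown as a SAS-style day count since 1 Jan 1960.

```python
from datetime import date, datetime

SAS_EPOCH = date(1960, 1, 1)

# Candidate layouts, tried in order (an assumption, not SAS's actual rules).
CANDIDATE_FORMATS = ["%Y%m%d", "%d%m%Y", "%d%b%Y"]

def parse_any(text: str) -> date:
    """Return the first successful parse of text against the candidate layouts."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this layout does not fit; try the next one
    raise ValueError(f"unrecognised date: {text!r}")

def to_sas_date(d: date) -> int:
    """Number of days since 1 Jan 1960, which is how SAS stores a date."""
    return (d - SAS_EPOCH).days

for raw in ["20180422", "12022018", "27apr2018"]:
    parsed = parse_any(raw)
    print(raw, to_sas_date(parsed), parsed.strftime("%d%b%Y").upper())
```

As with anydtdte., values that fit more than one layout are ambiguous, so the order of the candidates matters.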
Use input(a, anydtdte20.); this converts most standard date representations to a SAS date. Then use the functions Year(), Month() and Day() to extract the parts you want.
You will find this SAS Post very useful about dates and locales.
Solution:
I created a table with two rows; each row has a different date format (YYYYMMDD and DDMMYYYY) to show you how the code handles different date formats. The values are saved as SAS dates and broken down into Year, Month and Day:
options DATESTYLE=DMY;
data have;
input a;
datalines;
20180422
12022018
;
run;
data test;
set have;
format date_a date9.;
date_a=input(a,anydtdte20.);
Year_a=year(date_a);
month_a=month(date_a);
day_a=day(date_a);
run;
Output:
a=20180422 date_a=22APR2018 Year_a=2018 month_a=4 day_a=22
a=12022018 date_a=12FEB2018 Year_a=2018 month_a=2 day_a=12
You can use an if condition inside a data step. Use the if condition to check whether the date value satisfies the required criteria, then format the date using the put function. The put function takes a source as its first argument and a format as its second argument, and returns the formatted value. Different values in the same column can be given different formats that way.
Something like this,
if a = 'date1CheckCondition' then newA = put(a , dateformat1.);
if a = 'date2' then newA = put(a , dateformat2.);
You may then choose to get all values in a common format like this:
dateA=input(newA,mmddyy6.);
I am trying to see if a variable falls into a boundary of dates.
I have a DATE1 already in MMDDYY10.
I use the following code
DATA GIANT;
SET GIANT;
UPPER_BOUND= intnx('week', DATE1, 2);
run;
it gives me back something in Num 8.
I want to restore it to MMDDYY10. so that I can compare it to my other dates.
Two Questions:
How can I convert a NUMERIC of length 8 into a date?
Why does intnx ... designed to work with dates return a numeric and not something in the same format?
I tried to convert it like this:
DATA GIANT;
SET GIANT;
UP_DATE=INPUT(PUT(UPPER_BOUND, 8.), MMDDYY10.);
FORMAT UP_DOS MMDDYY10.;
run;
but now it all comes up as null.
SAS Dates are always numeric (# of days since 1/1/1960). Date formats are simply a way of making that numeric readable. INTNX returns a numeric because that's all a date is; it's up to you to apply a date format to the new variable.
In your case it's very simple. You almost got it right in your attempt, but you don't need the input/put business.
data giant;
set giant;
upper_bound=intnx('week',Date1,2);
format upper_bound MMDDYY10.;
run;
INPUT converts human readable text into a value (usually a number). PUT converts a value into human readable text. PUT(INPUT(...)) is commonly used to convert a formatted value into a different kind of formatted value (for example, to convert the string "1/1/1960" to "01JAN1960"); INPUT(PUT(...)) is not very commonly used unless you are parsing the string that PUT created (such as, to read just a particular date element or something like that). Both change the type (from numeric to character in PUT or other way in INPUT) in most cases and certainly change the actual stored value.
Applying a format to a numeric column leaves the column as a numeric (which is usually good) but tells SAS how to display that numeric so you can understand it (also usually good). So underneath the value is 19857 but what is displayed is 05/14/2014.
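The same separation between stored value and display can be sketched in Python. This is an illustration only: the function names are made up, and intnx_week is an approximation of intnx('week', ...) that assumes SAS's default behaviour of Sunday-based weeks aligned to the start of the interval.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)

def sas_to_date(n: int) -> date:
    """Interpret a SAS date number (days since 1 Jan 1960) as a calendar date."""
    return SAS_EPOCH + timedelta(days=n)

def date_to_sas(d: date) -> int:
    """Convert a calendar date back to a SAS date number."""
    return (d - SAS_EPOCH).days

def intnx_week(n: int, weeks: int) -> int:
    """Approximate intnx('week', n, weeks): start of the week `weeks` ahead."""
    d = sas_to_date(n)
    days_since_sunday = (d.weekday() + 1) % 7  # Python's weekday(): Monday is 0
    week_start = d - timedelta(days=days_since_sunday)
    return date_to_sas(week_start + timedelta(weeks=weeks))

value = 19857
print(value, sas_to_date(value).strftime("%m/%d/%Y"))  # 19857 05/14/2014
upper = intnx_week(value, 2)
print(upper, sas_to_date(upper).strftime("%m/%d/%Y"))
```

The result of intnx_week is still just a number; only the strftime call, like a SAS format, decides how it is shown.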