Referring to an exact date in calculation - date

I have a date variable (dd/mm/yyyy).
I need to create a similar variable that is equivalent to Dec. 31 2016 to use it in a calculation.
How would I do this?

You need to use the daily() function and then format the numeric variable accordingly:
clear
set obs 1
generate date = daily("31Dec2016", "DMY")
format %tdMonDDCCYY date
list
     +-----------+
     |      date |
     |-----------|
  1. | Dec312016 |
     +-----------+
Type help daily() and help format from Stata's command prompt for details.

I take it that you have a numeric daily date variable. Some people hold dates as strings, which isn't very useful in Stata, and there are other kinds of numeric date variable.
A date like 31 December 2016 is a constant which can be calculated as
. di mdy(12, 31, 2016)
20819
and for display could be
. di %td mdy(12, 31, 2016)
31dec2016
You can get the same result in other ways, such as
. di daily("31 Dec 2016", "DMY")
20819
Nothing stops you putting this constant in a variable, but that just copies the same value as many times as you have observations, and is for most purposes pointless. Either use it directly or make your code easier to understand by using some evocative macro or scalar name:
. local Dec_31_2016 = mdy(12, 31, 2016)
. local today = mdy(8, 7, 2018)
. di `today' - `Dec_31_2016'
584
I have guessed that the most likely use for a date constant is to calculate time elapsed since some benchmark date.
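For anyone who wants to cross-check these numbers outside Stata, here is a minimal Python sketch. It assumes only that Stata daily dates count days from 1 January 1960; the helper name stata_daily is made up for illustration and is not part of any library.

```python
from datetime import date

STATA_EPOCH = date(1960, 1, 1)  # origin of Stata's daily date scale


def stata_daily(year, month, day):
    """Hypothetical helper: days elapsed since 1 Jan 1960, mirroring mdy(month, day, year)."""
    return (date(year, month, day) - STATA_EPOCH).days


dec_31_2016 = stata_daily(2016, 12, 31)
today = stata_daily(2018, 8, 7)

print(dec_31_2016)          # 20819
print(today - dec_31_2016)  # 584
```

Both results agree with the di output above, which suggests the constant really is just an integer count of days.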

Problem with displaying reformatted string into a four-digit year in Stata 17

I turned to the Stata video "Data management: How to create a date variable from a date stored as a string" by Chuck Huber to make sure my date variable was formatted properly. However, I cannot get the reformatted variable (school_year2) to display as a year (e.g. 2018).
Can someone let me know what I may be missing here?
Thank you,
.do file
gen school_year2 = date(school_year,"Y")
format %ty school_year2
list school_year school_year2 in 1/10
+---------------------+
| school~r school~2 |
|---------------------|
1. | 2016 2.0e+04 |
2. | 2016 2.0e+04 |
3. | 2016 2.0e+04 |
4. | 2016 2.0e+04 |
5. | 2016 2.0e+04 |
|---------------------|
6. | 2016 2.0e+04 |
7. | 2016 2.0e+04 |
8. | 2016 2.0e+04 |
9. | 2016 2.0e+04 |
10. | 2016 2.0e+04 |
+---------------------+
.
end of do-file
Because you are using the date() function, the underlying value is still a count of days from 1 Jan 1960. So keep a %td format, as you are working with days here, not years. You can still have it display only the year by using %tdCCYY, with CC standing for century and YY for year. But remember, the underlying data point is still the day 1 Jan 2016 and not the year 2016.
clear
input str4 school_year
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
end
gen school_year2 = date(school_year,"Y")
format %tdCCYY school_year2
list school_year school_year2 in 1/10
If the year is all you want to work with, then use the year() function to extract the year from the date. The example below details steps you can play around with.
clear
input str4 school_year
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
"2016"
end
gen school_year2 = date(school_year,"Y")
gen school_year3 = year(school_year2)
format %tdCCYY school_year2
format %ty school_year3
list in 1/10
Note that in the last example, all values look the same to you. But the first variable is a string with the text "2016", the second is a date stored as the number of days from 1 Jan 1960 with only its year value displayed, and the last is a number with the number of years from year 0 displayed as a year (which in this case would have been the same had it been displayed as its underlying number).
@TheiceBear has already explained the main point, but here is the story told a little differently in case that is helpful.
The fallacy here is to expect a change of (display) format to change the data. Changing the format is just that, a change in format. It has no effect on what is stored, that is, on the values of the data held within the variables in the question.
You are using generate to create new variables, which is fine, but the basic principles can be seen directly using di (display) on scalar constants. That's also a good way to check understanding of Stata's rules.
The date() function -- despite its historic name -- is for creating numeric daily dates (only). If you tell date() that your input is a string containing the year only, then it imputes 1 January as day and month. The result is an integer, counted from the origin of the scale at 1 January 1960.
. di date("2016", "Y")
20454
. di date("1 Jan 2016", "DMY")
20454
. di date("1 Jan 1960", "DMY")
0
It is a fair bet that few are willing or able to work out what 20454 is on such a scale, but you can specify a daily date display format so that you and readers of your code can see directly.
. di %td 20454
01jan2016
There are many minor variations on that to display daily dates (or parts of them, such as monthly or yearly dates). The different format names for daily dates all start %td.
Conversely, if you say that the value 20454 is to be displayed using a yearly format, you are referring to the year 20454, several thousand years into the future. Stata doesn't act puzzled, except that it doesn't expect such values as years and just shows you a year rounded to 2.0e+04, that is 20000. If you had good reason to work with dates thousands or millions of years into the future, date display formats are likely to be neither needed nor helpful.
. di %ty 20454
2.0e+04
This paper riffs on the idea that a change in display format is only that and doesn't affect stored values.
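The same distinction between a stored number and the way it is displayed can be sketched in Python. This is only an illustration under the assumption stated above, that daily dates count days from 1 January 1960; none of the names below come from Stata.

```python
from datetime import date, timedelta

STATA_EPOCH = date(1960, 1, 1)

# What date("2016", "Y") stores: the day count for 1 Jan 2016
stored = (date(2016, 1, 1) - STATA_EPOCH).days

# Reading that number as a daily date recovers the calendar day
as_day = STATA_EPOCH + timedelta(days=stored)

print(stored)                   # 20454
print(as_day.isoformat())       # 2016-01-01
print(as_day.year)              # 2016, the analogue of year() in Stata
print(f"{float(stored):.1e}")   # 2.0e+04, the same number read as if it were a year
```

The stored value never changes in this sketch; only the way it is interpreted for display does.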

Date Increment Using Autohotkey

I'm looking for a way to set an arbitrary date, and every time I press a key it will print the day after it (tomorrow).
global jDate = "June 1, 1986"
^+z::
;Output our date in LongDate format
FormatTime, TimeString, %jDate%, LongDate
SendInput, %TimeString%
;Increment the date by a single day
jDate += 1, Days
Return
Unfortunately, the code keeps starting jDate at today's current date/time rather than the past date I specify in the initial variable assignment. Not sure why. The incrementing works fine; it just increments starting from today's date rather than the 1986 date.
FormatTime is expecting any date/time input to be in the "YYYYMMDD..." format. Since what you've assigned to jDate doesn't fit that criterion, it assumes it's invalid and uses today's date. To make it work how you expect, just modify your jDate input.
jDate := "19860601" ; 1986 -> YYYY, 06 -> MM, 01 ->DD
A couple of things to note: (1) global is not needed in this context; (2) I would recommend getting out of the habit of assigning variables with = (use the := assignment operator instead). The = form only works for legacy reasons and generates more confusion than it's worth. In the context in which you're using it, the quotes would need to be removed.
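The idea of keeping the date as a YYYYMMDD stamp and stepping it one day at a time can also be sketched outside AutoHotkey. The Python below is only an illustration of that arithmetic, not a replacement for the hotkey script; next_day is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def next_day(stamp):
    """Hypothetical helper: take a YYYYMMDD stamp and return the following day's stamp."""
    day = datetime.strptime(stamp, "%Y%m%d") + timedelta(days=1)
    return day.strftime("%Y%m%d")


j_date = "19860601"  # 1986 -> YYYY, 06 -> MM, 01 -> DD
for _ in range(3):
    print(j_date)    # 19860601, then 19860602, then 19860603
    j_date = next_day(j_date)
```

Because the stamp is parsed as a real date before a day is added, month ends roll over correctly, for example from 19860630 to 19860701.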

Transform string monthly dates in Stata

I have a problem in Stata with the format of the dates. I believe it is a very simple question but I can't see how to fix it.
I have a csv file (file.csv) that looks like
v1 v2
01/01/2000 1.1
01/02/2000 1.2
01/03/2000 1.3
...
01/12/2000 1.12
01/02/2001 1.1
...
01/12/2001 1.12
The form of v1 is dd/mm/yyyy.
I import the file in Stata using import delimited ...file.csv
v1 is a string variable, v2 is a float.
I want to transform v1 into a monthly date that Stata can read.
My attempts:
1)
gen Time = date(v1, "DMY")
format Time %tm
which gives me
Time
3177m7
3180m2
3182m7
...
that looks wrong.
2) Alternatively
gen v1_1=v1
replace v1_1 = substr(v1_1,4,length(v1_1))
gen Time_1 = date(v1_1, "MY")
format Time_1 %tm
which gives exactly the same result.
And if I type
tsset Time, format(%tm)
it tells me that there are gaps but there are no gaps in the data.
Could you help me to understand what I'm doing wrong?
Stata has wonderful documentation on dates and times, which you should read from beginning to end if you plan on using time-related variables. Reading this documentation will not only solve your current problem, but will potentially prevent costly errors in the future. The section related to your question is titled "SIF-to-SIF conversion." SIF means "Stata internal form."
To explain your current issue:
Stata stores dates as numbers; you interpret them as "dates" when you assign a format. Consider the following:
set obs 1
gen dt = date("01/01/2003", "DMY")
list dt
// 15706
So that date is assigned the value 15706. Let's format it to look like a day:
format dt %td
list
// 01jan2003
Now let's format it to be a month:
format dt %tm
list
// 3268m11
Notice that dt is just a number that you can format and use like a day or month. To get a "month number" from a "day number", do the following:
gen mt = mofd(dt) // mofd = month of day
format mt %tm
list
// dt mt
// 3268m11 2003m1
The variable mt now equals 516. January 2003 is 516 months from January 1960. Stata's "epoch time" is January 1, 1960 00:00:00.000. Date variables are stored as days since the epoch time, and datetime variables are stored as milliseconds since the epoch time. A month variable can be stored as months since the epoch time (that's how the %tm formatting determines which month to show).
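To make the arithmetic concrete, here is a rough Python sketch of what a day-to-month conversion and a monthly display format do with these numbers. Both function names are chosen for illustration; this is not Stata's implementation, only the same counting from January 1960.

```python
from datetime import date, timedelta

EPOCH = date(1960, 1, 1)


def mofd(days_since_1960):
    """Sketch of the idea behind mofd(): convert a day count to a month count."""
    d = EPOCH + timedelta(days=days_since_1960)
    return (d.year - 1960) * 12 + (d.month - 1)


def show_tm(months_since_1960):
    """Render a month count roughly as a %tm format would, e.g. 2003m1."""
    years, month0 = divmod(months_since_1960, 12)
    return f"{1960 + years}m{month0 + 1}"


dt = 15706                 # date("01/01/2003", "DMY")
print(show_tm(dt))         # 3268m11: a day count misread as a month count
print(mofd(dt))            # 516
print(show_tm(mofd(dt)))   # 2003m1
print(show_tm(14610))      # 3177m7: the first odd value in the question
```

The last line uses 14610, the day count for 1 January 2000, and reproduces the 3177m7 shown in the question, which is consistent with the explanation that a daily value was being read as months.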

Stata: adding a number to a date variable

I am working with hospital admission data, where information on admission date and discharge date is stored in clock format %tcCCYY-NN-DD_hh:MM_AM, i.e. for example
discharge date
2009-04-21 9:00 AM
So the data information is stored as milliseconds since January 1, 1960, and transforming this into a numeric double variable gives me
discharge date
1556269200000
Now, I would like to shift some of my date variables by 1 minute (just an example), and generate a new variable
gen new_discharge_date = discharge_date + 60*1000
Only by coincidence will this shift the discharge date by exactly one minute.
In the example above this will instead give me
new_discharge_date
2009-04-25 9:00 AM
or as double
new_discharge_date
1556269236224
The difference between new_discharge_date and discharge_date is only 36224 milliseconds instead of 60000.
The problem occurs systematically, sometimes the number of milliseconds since January 1, 1960, will even be lower than before.
Any idea what I am doing wrong?
Executive summary: Adding a constant to a date-time variable with units milliseconds creates another date-time variable. Both variables should be type double.
First note that clock is not a storage format in Stata. Clock date-time variables are stored as integers; clock format is a numeric display format, which is quite different. In fact the description in the original question is backwards: the date-time data arrive as strings, which are then converted to milliseconds with the clock() function.
You are correct that clock date-times should be stored as doubles, as they are often very large integers, but for precisely that reason your shifted date-time (1 minute more than the original values) should not be stored in a float, which is what your generate does by default. You need to specify double in the generate statement. Using float instead just gives a crude approximation, which is why you observe errors. This is easy to check using your example as sandbox.
. clear
. set obs 1
number of observations (_N) was 0, now 1
. gen s_discharge_date = "2009-04-21 9:00 AM"
. gen double discharge_date = clock(s_discharge_date, "YMD hm")
. format discharge_date %tc
. gen double new_discharge_date = discharge_date + 60*1000
. format new %tc
. gen long new_discharge_date2 = discharge_date + 60*1000
. format new_discharge_date2 %tc
. list
+--------------------------------------------------------------+
1. | s_discharge_date | discharge_date | new_discharge_date |
| 2009-04-21 9:00 AM | 21apr2009 09:00:00 | 21apr2009 09:01:00 |
|--------------------------------------------------------------|
| new_di~2 |
| . |
+--------------------------------------------------------------+
The advice given in a comment to use long is wrong, as the last experiment shows immediately. Fairly recent date-times have values in the trillions, some orders of magnitude larger than could be held in a long. help data types shows the limits on values in various types.
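The size of the rounding error can be reproduced outside Stata. The following Python sketch assumes that a Stata float is an ordinary 32-bit IEEE float and that a long is a 32-bit integer; as_float32 is a hypothetical helper, and the millisecond value is the one quoted in the question.

```python
import struct


def as_float32(x):
    """Round x to the nearest 32-bit float (assumed here to match a Stata float)."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]


discharge = 1556269200000        # milliseconds, as quoted in the question
shifted = discharge + 60 * 1000  # exact when held in a 64-bit double

stored_as_float = int(as_float32(shifted))

print(stored_as_float)               # 1556269236224
print(stored_as_float - discharge)   # 36224, not 60000
print(shifted > 2**31 - 1)           # True: too large for a 32-bit integer
```

At this magnitude adjacent 32-bit floats are 131072 apart, so the shifted value snaps to the nearest representable number, which is exactly the 1556269236224 reported in the question.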

Subtracting a Minute in Perl

I have a variable in Perl that I initialized as $invoice_date = '1/6/14' (June 1st 2014). How can I determine the datatype that Perl considers this variable to be?
I'd like to subtract a minute from the invoice date to get May 31 2014 11:59PM. How can I do this with or without declaring $invoice_date to be a certain datatype?
Update: Thanks for the comments and answers. Since it is a string, I am going to try to concatenate the time portion. I have another variable, $period_end_date, which is set to May 31, 2014. I'm going to try to concatenate the 11:59PM to it.
The string is subsequently sent in a SQL statement. If I can figure out what SQL expects for the string, it should be possible to insert the time portion.
You need a date manipulation module, as '1/6/14' is a plain string, and two-digit years were abandoned before the Y2K event.
use Time::Piece;
use Time::Seconds;
my $t = Time::Piece->strptime("1/6/2014", "%d/%m/%Y");
$t -= ONE_MINUTE;
print $t;
output
Sat May 31 23:59:00 2014
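For comparison, the same subtraction can be sketched in Python with the standard datetime module. The final strftime pattern is only an assumption about what the SQL side might accept, since the question does not say which database or format is expected.

```python
from datetime import datetime, timedelta

# Day-first parsing, so "1/6/2014" is 1 June 2014
invoice_date = datetime.strptime("1/6/2014", "%d/%m/%Y")
period_end = invoice_date - timedelta(minutes=1)

print(period_end)                                # 2014-05-31 23:59:00
print(period_end.strftime("%Y-%m-%d %H:%M:%S"))  # same text, built explicitly (assumed SQL-friendly)
```

As with Time::Piece, the key step is parsing the string into a real date-time value before doing any arithmetic on it.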