Find number of days between two given dates in Rcpp - date

I'm newbie in Rcpp, but I have a task which's connected with Date and Datetime.
Let me have market data in DataFrame in my Rcpp function. So, Date field has formatting like this:
2016-04-19 00:01:00
Dataframe field name which contains Date values is "Date". So, I get 2 vectors:
DatetimeVector datetime = df["Date"];
DateVector pureDate = df["Date"];
Problems:
1) I can't take difference between 2 Date values of Date (I don't know why, but gcc-4.9.3 gives me an error on difference like this:
Date pureDay = pureDate[0];
auto tmp = pureDate[j+1] - pureDay;
error: ambiguous overload for 'operator-' (operand types are
'Rcpp::traits::storage_type<14>::type {aka double}' and 'Rcpp::Date')
auto tmp = tmpDate[j+1] - tmpTradeDay;
But if I use code like this:
Date pureDay = pureDate[0];
auto tmp = pureDate[j+1] - pureDate[j];
It works well.
2) How it possible to format an output for Date and Datetime objects? to_string won't format it well - I give a result like this: 1461176460.000000
3) I expected that syntax like Date(datetime[i]) will give me a Date object. But it won't. I know that pureDate[1] - pureDate[0] should have the same Y-M-D value, but they differ for series lag (60 seconds).
Thnxs. Could anyone helps me with these problems?

You seem a little lost, and there are really multiple questions in this one.
Question 1) will give a short example below.
Question 2) is mostly about formatting, you may want to look into the class documentation and header; both Date and Datetime have a format() method that works just like the R equivalent or C library function for date(time) formating, the fabled strftime().
Question 3) is unclear; I am not sure what you are asking. Maybe the answer to question 1) below helps.
Simple example for question 1:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double question1(DateVector dv) {
double d = dv[1] - dv[0];
return d;
}
/*** R
set.seed(123)
datevector <- Sys.Date() + cumsum(runif(3)*30);
datevector
diff(datevector)
question1(datevector)
*/
and its outcome:
R> Rcpp::sourceCpp("/tmp/datequestion.cpp")
R> set.seed(123)
R> datevector <- Sys.Date() + cumsum(runif(3)*30);
R> datevector
[1] "2018-03-28" "2018-04-21" "2018-05-03"
R> diff(datevector)
Time differences in days
[1] 23.6492 12.2693
R> question1(datevector)
[1] 23.6492
R>
Same answer as from R. Your code still has an index calculation, that sometimes confuses the compiler. Making it simpler (ie more steps) often helps.
Lastly, maybe look at some Rcpp documentation and examples. The RcppExamples package has a function on dates and datetimes ...

Related

Trying to build Expression for Table field to sort text dates, some with missing elements

Hi I am a newbie and have a problem I have been trying to solve for weeks. I have a table imported from excel with dates in text format (because dates go back to 1700s) Most are in the format "mmmyyyy", so it is relatively easy to add "1" to the date, convert to date format, and sort in correct date order. The problem I have is that some of the dates in the table are simply "yyyy", and some are empty. I cannot find an expression that works to convert these last two to eg 1 Jan yyyy and 1 Jan 1000 within the same expression. Is this possible, or would I need to do this in two queries? Sorry if this question is very basic - I cannot find an answer anywhere.
TIA
You can do something like:
Public Function ConvertDate(Byval Expression As Variant) As Date
Dim Result As Date
If IsNull(Expression) Then
Result = DateSerial(1000, 1, 1)
ElseIf Len(Expression) = 4 Then
Result = DateSerial(Expression, 1, 1)
Else
Result = DateValue(Right(Expression, 4) & "/" & Left(Expression, 3) & "/1")
End If
ConvertDate = Result
End Function

Why won't my factor column value change to a date value?

I know this is elementary but I can't seem to figure it out, even after reading other posts.
In a dataset, I want to convert an entire column into a date. The current class is factor.
The value in the field looks like this 12/25/2012
This is what I've tried.
C$DateofDeath=as.Date(C$DateofDeath,'%m/%d/%Y')
Error in as.Date.default(C$DateofDeath, "%m/%d/%Y") :
do not know how to convert 'C$DateofDeath' to class “Date”
C$DateofDeath=as.Date(C$DateofDeath,"%m/%d/%Y")
Error in as.Date.default(C$DateofDeath, "%m/%d/%Y") :
do not know how to convert 'C$DateofDeath' to class “Date”
Claims$DateofDeath=strptime(as.character(Claims$DateofDeath),format= '%m/%d/%Y')
Error in `$<-.data.frame`(`*tmp*`, "DateofDeath", value = list(sec = numeric(0), :
replacement has 0 rows, data has 71616
Claims$DateofDeath=strptime(as.character(Claims$DateofDeath),format= "%m/%d/%Y")
Error in `$<-.data.frame`(`*tmp*`, "DateofDeath", value = list(sec = numeric(0), :
replacement has 0 rows, data has 71616
Use as.POSIXct
C$DateOfDeath<-as.POSIXct(as.character(C$DateOfDeath), format = "%d/%m/%Y")
There are lots of R experts here but you have to specify R as one of your tags to get them to notice your question.
Looks like you have tried a bunch of combinations but not the right one.
> C <- data.frame(DateofDeath="12/25/2012",other=TRUE)
> as.Date(as.character(C$DateofDeath),format="%m/%d/%Y")
[1] "2012-12-25"
Notice that as.Date() takes a character input, not a factor. So you need to convert to character, then to Date.
Your strptime() versions seem fine to me except that you call are referring to the dataframe Claims instead of C. Actually strptime() should convert the factor to character for you, so you don't need the as.character() part with those.

displaying dates of values - time series data

I am trying to maintain a table using some panel data. I have all the data outputting fine, but I am having difficulty getting the correct dates to display. The method I am using is the following:
gen ymdny = date(date,"MDY"); /*<- date var from panel dataset that i import*/
sort name ymdny;
summ ymdny;
local lastdate : disp %tdM-D r(max);
local lastdate2 : disp %tdM-D (r(max)-1);
local lastw : disp %tdM-D (r(max)-7);
This would work fine if the data were daily, but the dataset I have is actually business daily (ie. missing for the weekends and bank national holidays). It seems silly but I have not been able to figure out a workaround that does the job. Ideally - there is a function that i can use to print the corresponding date to a particular value.
For example:
gen resbal_1d = round(l1.resbal,0.1);
gen dateOf = dateOf(resbal_1d); /* <- pseudocode example of what I would like */
I'm not sure what you're asking for but my guess is that you want to see a human readable form date as the output, given a numerical input. (This is your last sentence.) So simply try something like:
display %td 10
The format is important as the following shows (see help format):
display %tq 10
Same numerical input, different format, different output.
Two other examples from the manual:
* string to integer
display date("5-12-1998", "MDY")
* string to date format
display %td date("5-12-1998", "MDY")
As for your example code, I don't get what you're aiming for. In effect, you can summarize the date variable because in Stata, dates are just integers. It's legal but couldn't say if it's good form. Below a simple example.
clear all
set more off
set obs 10
gen date = _n // create the data
format date %td // give date format
list
summarize date
local onedate = r(max)
display %td `onedate'
Some references:
[U] 24 Working with dates and time
help datetime
help datetime business calendars
http://www.stata.com/support/faqs/data-management/creating-date-variables/
http://www.ats.ucla.edu/stat/stata/modules/dates.htm
(Maybe you can explain with more detail and context what it is you want.)
Edit
Your comment
I do not see how this helps with the date output. For example,
displaying r(max) - 1 on a monday will still display the sunday date.
does not explain, at all, the problems you're having with Stata's business calendars.
I'm adding what is basically an example taken from the help file I already referenced. I do this with the hope of convincing you that (re)-reading the help files is worthwhile.
*clear all
set more off
* import string dates
infile str10 sdate float x using http://www.stata-press.com/data/r13/bcal_simple
list
*----- Regular dates -----
* create elapsed dates - Stata's way of managing dates
generate rdate = date(sdate, "MD20Y")
format rdate %td
drop sdate x
list
* compute previous and next dates
generate tomorrow1 = rdate + 1
format tomorrow1 %td
generate yesterday1 = rdate - 1
format yesterday1 %td
list
*----- Business dates -----
* convert regular date to business dates
generate bdate = bofd("simple", rdate)
format bdate %tbsimple
* compute previous and next dates
generate tomorrow2 = bdate + 1
format tomorrow2 %tbsimple
generate yesterday2 = bdate - 1
format yesterday2 %tbsimple
order yesterday1 rdate tomorrow1 yesterday2 bdate tomorrow2
list
/*
The stbcal-file for simple, the calendar shown below,
November 2011
Su Mo Tu We Th Fr Sa
---------------------------
1 2 3 4 X
X 7 8 9 10 11 X
X 14 15 16 17 18 X
X 21 22 23 X X X
X 28 29 30
---------------------------
*/
Notice that if you add or substract 1 from a regular date, then business days are not taken into account. If you do the same with a business calendar date, you get what you want. Business calendars are defined by .stbcal files; the example uses a built-in calendar called simple. You maybe need to make your own .stbcal file but it is not difficult. Again, the details are in the help files.

Function equivalent to SUM() for multiplication in SQL Reporting

I'm looking for a function or solution to the following:
For the chart in SQL Reporting i need to multiply values from a Column A. For summation i would use =SUM(COLUMN_A) for the chart. But what can i use for multiplication - i was not able to find a solution so far?
Currently i am calculating the value of the stacked column as following:
=ROUND(SUM(Fields!Value_Is.Value)/SUM(Fields!StartValue.Value),3)
Instead of SUM i need something to multiply the values.
Something like that:
=ROUND(MULTIPLY(Fields!Value_Is.Value)/MULTIPLY(Fields!StartValue.Value),3)
EDIT #1
Okay tried to get this thing running.
The expression for the chart looks like this:
=Exp(Sum(Log(IIf(Fields!Menge_Ist.Value = 0, 10^-306, Fields!Menge_Ist.Value)))) / Exp(Sum(Log(IIf(Fields!Startmenge.Value = 0, 10^-306, Fields!Startmenge.Value))))
If i calculate my 'needs' manually i have to get the following result:
In my SQL Report i get the following result:
To make it easier, these are the raw values:
and you have the possibility to group the chart by CW, CQ or CY
(The values from the first pictures are aggregated Sum values from the raw values by FertStufe)
EDIT #2
Tried your expression, which results in this:
Just to make it clear:
The values in the column
=Value_IS / Start_Value
in the first picture are multiplied against each other
0,9947 x 1,0000 x 0,59401 = 0,58573
Diffusion Calenderweek 44 Sums
Startvalue: 1900,00 Value Is: 1890,00 == yield:0,99474
Waffer unbestrahlt Calenderweek 44 Sums
Startvalue: 620,00 Value Is: 620,00 == yield 1,0000
Pellet Calenderweek 44 Sums
Startvalue: 271,00 Value Is: 160,00 == yield 0,59041
yield Diffusion x yield Wafer x yield Pellet = needed Value in chart = 0,58730
EDIT #3
The raw values look like this:
The chart ist grouped - like in the image - on these fields
CY (Calendar year), CM (Calendar month), CW (Calendar week)
You can download the data as xls here:
https://www.dropbox.com/s/g0yrzo3330adgem/2013-01-17_data.xls
The expression i use (copy / past from the edit window)
=Exp(Sum(Log(Fields!Menge_Ist.Value / Fields!Startmenge.Value)))
I've exported the whole report result to excel, you can get it here:
https://www.dropbox.com/s/uogdh9ac2onuqh6/2013-01-17_report.xls
it's actually a workaround. But I am pretty sure is the only solution for this infamous problem :D
This is how I did:
Exp(∑(Log(X))), so what you should do is:
Exp(Sum(Log(Fields!YourField.Value)))
Who said math was worth nothing? =D
EDIT:
Corrected the formula.
By the way, it's tested.
Addressing Ian's concern:
Exp(Sum(Log(IIf(Fields!YourField.Value = 0, 10^-306, Fields!YourField.Value))))
The idea is change 0 with a very small number. Just an idea.
EDIT:
Based on your updated question this is what you should do:
Exp(Sum(Log(Fields!Value_IS.Value / Fields!Start_Value.Value)))
I just tested the above code and got the result you hoped for.

Vectorising Date Array Calculations

I simply want to generate a series of dates 1 year apart from today.
I tried this
CurveLength=30;
t=zeros(CurveLength);
t(1)=datestr(today);
x=2:CurveLength-1;
t=addtodate(t(1),x,'year');
I am getting two errors so far?
??? In an assignment A(I) = B, the number of elements in B and
Which I am guessing is related to the fact that the date is a string, but when I modified the string to be the same length as the date dd-mmm-yyyy i.e. 11 letters I still get the same error.
Lsstly I get the error
??? Error using ==> addtodate at 45
Quantity must be a numeric scalar.
Which seems to suggest that the function can't be vectorised? If this is true is there anyway to tell in advance which functions can be vectorised and which can not?
To add n years to a date x, you do this:
y = addtodate(x, n, 'year');
However, addtodate requires the following:
x must be a scalar number, not a string.
n must be a scalar number, not a vector.
Hence the errors you get.
I suggest you use a loop to do this:
CurveLength = 30;
t = zeros(CurveLength, 1);
t(1) = today; % # Whatever today equals to...
for ii = 2:CurveLength
t(ii) = addtodate(t(1), ii - 1, 'year');
end
Now that you have all your date values, you can convert it to strings with:
datestr(t);
And here's a neat one-liner using arrayfun;
datestr(arrayfun(#(n)addtodate(today, n, 'year'), 0:CurveLength))
If you're sequence has a constant known start, you can use datenum in the following way:
t = datenum( startYear:endYear, 1, 1)
This works fine also with months, days, hours etc. as long as the sequence doesn't run into negative numbers (like 1:-1:-10). Then months and days behave in a non-standard way.
Here a solution without a loop (possibly faster):
CurveLength=30;
t=datevec(repmat(now(),CurveLength,1));
x=[0:CurveLength-1]';
t(:,1)=t(:,1)+x;
t=datestr(t)
datevec splits the date into six columns [year, month, day, hour, min, sec]. So if you want to change e.g. the year you can just add or subtract from it.
If you want to change the month just add to t(:,2). You can even add numbers > 12 to the month and it will increase the year and month correctly if you transfer it back to a datenum or datestr.