Extracting date from timestamp in Pig

First, I ran the code below and it went through. Now I want to extract only the date from a timestamp that includes both date and time, but I have no idea how to do it. I tried GetYear, GetMonth, and GetDay, but an error popped up every time.
define Quantile datafu.pig.stats.Quantile('21');
data_raw = LOAD 'California/2016/March-2016.csv' USING PigStorage(',') AS (tmc_code:chararray, measurement_tstamp:chararray, speed:int, average_speed:int, reference_speed:int, travel_time_minutes:int, confidence_score:int, cvalue:int);
filtered_data = FILTER data_raw BY confidence_score == 30;
data_reqd = GROUP filtered_data BY (tmc_code, measurement_tstamp);
quantiles = FOREACH data_reqd GENERATE group.tmc_code, ToDate(group.measurement_tstamp,'YYYY-MM-DD HH:mm:ss') AS date, Quantile(filtered_data.speed);
results = Limit quantiles 10;
DUMP results;
I would appreciate it if someone can help me to extract only date from measurement_tstamp.

This is how you could do it
results_new = FOREACH quantiles GENERATE CONCAT(CONCAT(CONCAT((chararray)GetYear(date), '-'), CONCAT((chararray)GetMonth(date), '-')), (chararray)GetDay(date)) AS Day;
I've answered a similar question here
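One thing worth checking with this approach: GetYear, GetMonth, and GetDay return integers, so casting and concatenating them gives unpadded parts such as 2016-3-1 rather than 2016-03-01. The sketch below is in Python rather than Pig, uses a made-up sample timestamp, and only illustrates that difference; it is not part of the original answer.

```python
from datetime import datetime

def date_parts_joined(ts: str) -> str:
    """Join year, month and day as plain integers (no zero padding)."""
    d = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return f"{d.year}-{d.month}-{d.day}"

def date_only(ts: str) -> str:
    """Return the zero-padded date portion of the timestamp."""
    d = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return d.strftime("%Y-%m-%d")

sample = "2016-03-01 00:05:00"  # hypothetical value, not from the real CSV
print(date_parts_joined(sample))
print(date_only(sample))
```

If the padded form is needed, the month and day would have to be padded explicitly before concatenating.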

Related

Manipulating last two rows if there's data based on a Cut date

This question is a slightly varied version of this one...
Now I'm using Measures instead of Calculated columns and the date is static instead of having it based on a dropdown list.
Here's the Power BI test .pbix file:
https://drive.google.com/open?id=1OG7keqhdvDUDYkFQFMHyxcpi9Zi6Pn3d
This printscreen describes what I'm trying to accomplish:
Basically the date in P6 Update table is used as a cut date and will be fixed/static. It's imported from an Excel sheet where the user can customize it however they want.
Here's what should happen when a matching row in Test data table is found for P6 Update date:
column Earned Daily - must have its value summed with the next row if there's one;
column Earned Cum - must grab the next row's value;
all the previous rows should remain intact, that is, their values won't change;
all subsequent rows must have their values assigned 0.
So for example:
If P6 Update is 1-May-2018, this is the expected result:
1-May 7,498 52,106
2-May 0 0
If P6 Update is 30-Apr-2018, this is the expected result:
30-Apr 13,173 50,699
1-May 0 0
2-May 0 0
If P6 Update is 29-Apr-2018, this is the expected result:
29-Apr 11,906 44,608
30-Apr 0 0
1-May 0 0
2-May 0 0
and so on...
Hope this makes sense.
This is easier in Excel, but trying to do this in Power BI is making me go nuts.
I will ignore previously asked related questions and start from scratch.
First, create a measure:
Current Earn =
CALCULATE (
    SUM ( 'Test data'[Value] ),
    'Test data'[Act Rem] = "Actual Units",
    'Test data'[Type] = "Current"
)
This measure will be used in other measures, to save you from typing all these conditions ("Actual Units" and "Current") again and again. It's a great practice to re-use measures in other measures - saves work, makes code cleaner and easier to refactor.
Create another measure:
Cut Date = SELECTEDVALUE('P6 Update'[Date])
We will use this measure whenever we need a cut off date. Please note that it does not have to be hard-coded - if P6 table contains a list of dates, you can create a pull-down slicer from the dates, and can choose the cut-off date dynamically. The formula will work properly.
Create third measure:
Next Earn =
VAR Cut_Date = [Cut Date]
VAR Current_Date = MAX ( 'Test data'[Date] )
VAR Next_Date = Current_Date + 1
VAR Current_Earn = [Current Earn]
VAR Next_Earn = CALCULATE ( [Current Earn], 'Test data'[Date] = Next_Date )
RETURN
    SWITCH (
        TRUE,
        Current_Date < Cut_Date, Current_Earn,
        Current_Date = Cut_Date, Current_Earn + Next_Earn,
        BLANK ()
    )
I am not sure if "Next Earn" is a good name for it, hopefully you will find a more intuitive name. The way it works: we save all necessary inputs into variables, and then use SWITCH function to define the results. Hopefully it's self-explanatory. (Note: if you need 0 above Cut Date, replace BLANK() with 0).
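To check the branching outside Power BI, the same three cases can be sketched in Python. This is only an illustration, not DAX: the daily values below are hypothetical, back-computed from the expected results listed in the question, and the function name next_earn is just a stand-in for the measure.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical daily values, back-computed from the question's expected results.
daily_earn = {
    date(2018, 4, 29): 4824,
    date(2018, 4, 30): 7082,
    date(2018, 5, 1): 6091,
    date(2018, 5, 2): 1407,
}

def next_earn(current: date, cut: date) -> Optional[int]:
    """Mirror the SWITCH: unchanged before the cut date,
    merged with the next day on the cut date, blank afterwards."""
    current_value = daily_earn.get(current, 0)
    next_value = daily_earn.get(current + timedelta(days=1), 0)
    if current < cut:
        return current_value
    if current == cut:
        return current_value + next_value
    return None  # stands in for BLANK()

cut = date(2018, 4, 30)
for day in sorted(daily_earn):
    print(day, next_earn(day, cut))
```

With a cut date of 30-Apr-2018 the cut row comes out as 13,173, which matches the expected result in the question.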
Finally, we define a measure for cumulative earn. It does not require any special logic, because previous measure takes care of it properly:
Cum Earn =
VAR Current_Date = MAX ( 'Test data'[Date] )
RETURN
    CALCULATE (
        [Next Earn],
        FILTER ( ALL ( 'Test data'[Date] ), 'Test data'[Date] <= Current_Date )
    )
Result:

How can I convert one date and time from two columns?

I'm trying to convert the first two columns of a cell array into a MATLAB time. The first column {1,1} is the date in YYYY-MM-DD format and the second is the time in HH:MM format.
Any ideas where I'm going wrong? My code:
file = 'D:\Beach Erosion and Recovery\Bournemouth\Bournemouth Tidal Data\tidal_data_jtide.txt'
fileID = fopen(file);
LT_celldata = textscan(fileID,'%D%D%D%D%d%[^\n\r]','delimiter',',');
formattime = 'yyyy-mm-dd HH:MM'
date = LT_celldata{1,1};
time = LT_celldata{1,2};
date_time = datenum('date','time');
Screenshot below is LT_celldata{1,1} :
You can combine variables date and time with the following code:
date = datetime(LT_celldata{1,1},'InputFormat','yyyy-MM-dd');
time = datetime(LT_celldata{1,2},'InputFormat','HH:mm:ss','Format','HH:mm:ss');
myDatetime = datetime(date + timeofday(time),'Format','yyyy-MM-dd HH:mm:ss');
The code uses timeofday function to combine date and time information from the two different variables. You may find more information and examples at this documentation page.
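For comparison, the same idea (parse the date and the time separately, then combine them) can be sketched in Python. This is not MATLAB and not part of the answer above; the function name combine and the sample values are made up, and the formats assume the YYYY-MM-DD and HH:MM layouts described in the question.

```python
from datetime import datetime

def combine(date_text: str, time_text: str) -> datetime:
    """Parse a date string and a time string, then merge them into one datetime."""
    day = datetime.strptime(date_text, "%Y-%m-%d").date()
    clock = datetime.strptime(time_text, "%H:%M").time()
    return datetime.combine(day, clock)

# Hypothetical sample values in the formats described in the question.
print(combine("2017-01-15", "13:45"))
```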

How to check the date using the internet in MATLAB?

When I use the date command in MATLAB, I get the system's current date. Is there a way to get the current date from the internet in MATLAB? The link below is to a similar question, but that one was asked for VB, and I'm trying to do this in MATLAB.
https://stackoverflow.com/questions/21198527/how-to-check-the-real-date-time-through-an-internet-connection
MATLAB Code based on urlread to get current date using internet (URL) and with couple of bounding keys (to find the date string) -
URL = 'http://time.is/';
key1 = 'title="Click for calendar">';
key2 = '</h2>';
data = urlread(URL);
start_ind = strfind(data,key1);
data1 = data(start_ind:end);
off_stop_ind = strfind(data1,key2);
current_date = data(start_ind+ numel(key1):start_ind + off_stop_ind(1)-2)
Output at my location -
current_date =
Saturday, September 6, 2014, week 36
If you would like to have it in the dd-mmm-yyyy format, use this -
date_split = strsplit(current_date,',')
current_date1 = datestr(strcat(date_split(2),date_split(3)))
Output -
current_date1 =
06-Sep-2014
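The bounding-key technique itself (find a start marker, find the next end marker, take what lies between) can be sketched in Python as below. The snippet works on a hard-coded sample string rather than a live page; the HTML fragment and the helper name extract_between are assumptions for illustration, and the real page markup may differ or change over time.

```python
from datetime import datetime

def extract_between(text, start_key, end_key):
    """Return the text between start_key and the next end_key, or None if either is missing."""
    start = text.find(start_key)
    if start == -1:
        return None
    start += len(start_key)
    end = text.find(end_key, start)
    if end == -1:
        return None
    return text[start:end]

# Hypothetical page fragment; the real markup is not guaranteed to look like this.
page = '<h2 title="Click for calendar">Saturday, September 6, 2014, week 36</h2>'
current_date = extract_between(page, 'title="Click for calendar">', '</h2>')

# Keep the "September 6" and "2014" parts and reformat them (assumes English month names).
parts = [p.strip() for p in current_date.split(",")]
formatted = datetime.strptime(parts[1] + " " + parts[2], "%B %d %Y").strftime("%d-%b-%Y")
print(current_date)
print(formatted)
```

Returning None when a key is missing makes it easier to notice when the page layout has changed.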

MATLAB - Count Number of Entries in Each Year

I have a .mat file that contains data from the years 2006-2100. Each year, there is a different number of lines. I need to count how many lines are 2006, how many are 2007, etc.
The set up, by column, is: Year, Month, Day, Lat, Long
I just want to count the number of rows containing the same Year entry and get an array back with an array containing that info.
I'm thinking a for or while loop should work, but I don't know how to write it.
If we assume your data are in a numeric matrix, you can just do:
num_lines2006 = sum(data(:,1)==2006);
data2006 = data(data(:,1)==2006,:);
If you want to add a column with number of rows for corresponding year, here is a solution with a loop:
for k=size(data,1):-1:1
num_year(k,1) = sum(data(:,1)==data(k,1));
end
data = [data num_year];
Here is a solution without loop:
[unq_year,~,idx] = unique(data(:,1),'stable');
num_year = grpstats(data(:,1),idx,@numel);
data = [data num_year(idx)];
To count numeric entries, you may want to use histc
years = unique(data(:,1));
counts = histc(data(:,1),years);
Since you just want to count the number of rows you could just write something simple like:
years = unique(data(:, 1));
counts = arrayfun(@(year) nnz(data(:, 1) == year), years);
years contains the unique years, and counts the number of times each one is found.
You could also use a one-liner inspired by Jonas' answer:
[counts, years] = hist(data(:,1), unique(data(:,1))');
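The counting step shared by all of these answers can also be sketched in Python with a Counter. The rows below are made-up sample data in the Year, Month, Day, Lat, Long layout from the question, and count_rows_per_year is a hypothetical helper name.

```python
from collections import Counter

def count_rows_per_year(rows):
    """Count rows by their first column (the year) and return (year, count) pairs sorted by year."""
    counts = Counter(row[0] for row in rows)
    return sorted(counts.items())

# Hypothetical rows: Year, Month, Day, Lat, Long
rows = [
    (2006, 1, 1, 10.0, 20.0),
    (2006, 1, 2, 10.5, 20.5),
    (2007, 3, 4, 11.0, 21.0),
    (2006, 5, 6, 12.0, 22.0),
    (2008, 7, 8, 13.0, 23.0),
]
print(count_rows_per_year(rows))
```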

Difference in minutes between 2 dates and times?

I need to compute time difference in minutes with four input parameters, DATE_FROM, DATE_TO, TIME_FROM, TIME_TO. And one output parameter DIFF_TIME. I have created a function module, I need to write a formula which computes the time diff in minutes.
Any help would be great!
Thanks,
Sai.
Use CL_ABAP_TSTMP=>TD_SUBTRACT to get the number of seconds between two date/time pairs.
(then, to get the number of minutes, divide the number of seconds by 60).
Example:
DATA(today_date) = CONV d( '20190704' ).
DATA(today_time) = CONV t( '000010' ).
DATA(yesterday_date) = CONV d( '20190703' ).
DATA(yesterday_time) = CONV t( '235950' ).
cl_abap_tstmp=>td_subtract(
EXPORTING
date1 = today_date
time1 = today_time
date2 = yesterday_date
time2 = yesterday_time
IMPORTING
res_secs = DATA(diff) ).
ASSERT diff = 20. " verify expectation or short dump
If the values are guaranteed to be in the same time zone, it's easy enough that you don't need any special function module or utility method. Read this, then get the difference of the dates and multiply that by 24 * 60 and get the difference of the times (which is in seconds) and divide that by 60. Sum it up and there you are.
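The arithmetic described above can be sketched as follows. It is written in Python rather than ABAP purely to show the calculation; the function name diff_minutes is made up, and the sample values are the ones from the TD_SUBTRACT example. Like the description, it assumes both date/time pairs are in the same time zone.

```python
from datetime import date, time

def diff_minutes(date_from: date, time_from: time, date_to: date, time_to: time) -> float:
    """Difference in minutes: whole days converted to minutes plus the time-of-day difference."""
    day_minutes = (date_to - date_from).days * 24 * 60

    def seconds_of_day(t: time) -> int:
        return t.hour * 3600 + t.minute * 60 + t.second

    second_diff = seconds_of_day(time_to) - seconds_of_day(time_from)
    return day_minutes + second_diff / 60

# Same values as the example above: 2019-07-03 23:59:50 to 2019-07-04 00:00:10.
print(diff_minutes(date(2019, 7, 3), time(23, 59, 50), date(2019, 7, 4), time(0, 0, 10)))
```

The time-of-day difference may be negative, as it is here; adding it to the day difference still gives the right total, in this case a third of a minute (20 seconds).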