Cumulative sum computation from hourly rainfall data MATLAB - matlab

I have an hourly rainfall time series from 1970 to 2003. I want to calculate:
n-hour values, for n = 2, 3, 6, 12, 24 and 48 hr. 24-hr rainfall can be calculated by accumulating 24 consecutive 1-hr values.
Similarly, 48-hr rainfall can be calculated by adding 2-day rainfall values.
From each n-hr time series, I want to calculate the maximum rainfall value for each year.
I can compute the other accumulation periods in the same way. However, I need suggestions for calculating the annual maximum rainfall from the n-hourly (aggregated) time series, i.e. the maximum of each year's n-hourly values. For example, from 1970 to 2003, I want to extract 34 annual maxima for each of the 2, ..., 24 hr series and 19 annual maxima for the 48-hr series. Please find a sample dataset here:
https://docs.google.com/document/d/1e8g54c6KDw8lwdQ53xi0Bs9fJmasTqbIk2n4LbKA-gM/edit
The first, second, third and fourth columns contain the year, month, day and values, respectively.
I tried this code:
ny_p = []; Ann_Max = [];grp_pr = [];
for yr = 1970:1975
i = yr - 1969;
matched = ismember(Precip_Final(:,1), yr, 'rows');
grp_pr = Precip_Final(matched,4); % extracting hourly value of the same year
[nrow,ncol] = size(grp_pr);
for row = 6:nrow % to get 6~ hourly sum
p_new = sum(grp_pr(row-5:row));
ny_p(end+1) = p_new;
end
Max_p = max(ny_p);
Ann_Max = [Ann_Max;Max_p];
clear matched; clear grp_pr; clear i; clear Max_p;
end
I edited my code. The problem now is that the matrix ny_p still holds the earlier years' values on each pass, so the maximum is taken over more than one year. I want Ann_Max to contain the maximum n-hourly value of each year.
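One way to avoid the carry-over is to compute the rolling sums separately for each year, so that nothing from an earlier year remains in ny_p. The following is a minimal sketch, not a tested solution for the full dataset: it assumes Precip_Final has the column layout described above, uses a small synthetic matrix in its place, and swaps the inner loop for conv with a window of ones (a choice made for this sketch, not something from the original code). Windows do not cross year boundaries here.

```matlab
% Synthetic stand-in for Precip_Final: [year, month, day, hourly value].
% Two short "years" of 48 hourly values each, for illustration only.
yrs = [repmat(1970, 48, 1); repmat(1971, 48, 1)];
Precip_Final = [yrs, ones(96, 1), ones(96, 1), (1:96)'];

n = 6;                                  % accumulation window in hours
years = unique(Precip_Final(:, 1));
Ann_Max = nan(numel(years), 1);         % one maximum per year

for k = 1:numel(years)
    % Hourly values of this year only
    grp_pr = Precip_Final(Precip_Final(:, 1) == years(k), 4);
    if numel(grp_pr) >= n
        % Sums over every run of n consecutive hours within the year
        ny_p = conv(grp_pr, ones(n, 1), 'valid');
        Ann_Max(k) = max(ny_p);
    end
end
disp(Ann_Max)
```

Changing n to 2, 3, 12, 24 or 48 would give the other accumulation periods; a year with fewer than n hourly values is left as NaN.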

Related

find the indices to calculate the monthly averages of some hours in a time series

If I have one year worth of data
Jday = datenum('2010-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
datenum('2010-12-31 23:00','yyyy-mm-dd HH:MM');
dat = rand(length(Jday),1);
and would like to calculate the monthly averages of 'dat', I would use:
% monthly averages
dateV = datevec(Jday);
[~,~,b] = unique(dateV(:,1:2),'rows');
monthly_av = accumarray(b,dat,[],@nanmean);
I would, however, like to calculate the monthly averages for the points that occur during the day i.e. between hours 6 and 18, how can this be done?
I can isolate the hours I wish to use in the monthly averages:
idx = dateV(:,4) >= 6 & dateV(:,4) <= 18;
and can then change 'b' to include only these points by:
b(double(idx) == 0) = 0;
and then calculate the averages by
monthly_av_new = accumarray(b,dat,[],@nanmean);
but this doesn't work because accumarray can only work with positive integers thus I get an error
Error using accumarray
First input SUBS must contain positive integer subscripts.
What would be the best way of doing what I've outlined? Keep in mind that I do not want to alter the variable 'dat' when doing this i.e. remove some values from 'dat' prior to calculating the averages.
Thinking about it, would the best solution be
monthly_av = accumarray(b(idx),dat(idx),[],@nanmean);
You almost have it. Just use logical indexing with idx in b and in dat:
monthly_av_new = accumarray(b(idx),dat(idx),[],@nanmean);
(and the line b(double(idx) == 0) = 0; is no longer needed).
This way, b(idx) contains only the indices corresponding to your desired hour interval, and dat(idx) contains the corresponding values.
EDIT: Now I see you already found the solution! Yes, I think it's the best approach.
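As a small self-contained check of the indexing above, the sketch below builds a synthetic year in which each value equals its hour of day, so the daytime mean for every month should be 12. The hour and month vectors are constructed directly rather than from Jday, and mean is used in place of nanmean because the synthetic data has no NaNs; both are assumptions of this sketch.

```matlab
hrs = (0:8759)';                                     % hour counter for 2010 (365 days)
hourOfDay = mod(hrs, 24);                            % 0..23
dateV = datevec(datenum(2010,1,1) + floor(hrs/24));  % calendar date of each hour
dat = hourOfDay;                                     % synthetic data, left unmodified

[~,~,b] = unique(dateV(:,1:2), 'rows');              % month group index, 1..12
idx = hourOfDay >= 6 & hourOfDay <= 18;              % daytime hours only

% Average only the daytime points of each month
monthly_av_day = accumarray(b(idx), dat(idx), [], @mean);
disp(monthly_av_day')
```

Because only the subscripts and values passed to accumarray are filtered, dat keeps all 8760 points.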

Splitting a numerical matrix by column values in MATLAB

I have a matrix in MATLAB of 50572x4 doubles. The last column has datenum format dates, increasing values from 7.3025e+05 to 7.3139e+05. The question is:
How can I split this matrix into sub-matrices, each that cover intervals of 30 days?
If I'm not being clear enough… the difference between the first element in the 4th column and the last element in the 4th column is 7.3139e5 − 7.3025e5 = 1.1376e3, or 1137.6. I would like to partition this into 30 day segments, and get a bunch of matrices that have a range of 30 for the 4th columns. I'm not quite sure how to go about doing this...I'm quite new to MATLAB, but the dataset I'm working with has only this representation, necessitating such an action.
Note that a unit interval between datenum timestamps represents 1 day, so your data in fact covers a time period of 1137.6 days. The straightforward approach is to compare each timestamp with the interval edges to determine which 30-day interval it belongs to:
t = A(:, end) - min(A(:, end)); %// Normalize timestamps to start from 0
idx = sum(bsxfun(@lt, t, 30:30:max(t))); %// Number of rows before each 30-day edge
rows = diff([0, idx, numel(t)]); %// Number of rows in each interval
where A is your data matrix, whose last column is assumed to contain the timestamps in increasing order. rows stores the number of rows in each 30-day interval. Finally, you can employ cell arrays to split the original data matrix:
C = mat2cell(A, rows, size(A, 2)); %// Split matrix into intervals
C = C(~cellfun('isempty', C)); %// Remove empty matrices
Hope it helps!
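To see what the splitting step produces, here is a small worked sketch on a made-up 7-row matrix (the values are for illustration only). It assumes the timestamps in the last column are sorted in increasing order, and it passes the dimension to sum explicitly so that a single-row input would not be summed across columns.

```matlab
% Made-up data: 4 columns, datenum-like timestamps in the last column
A = [(1:7)', zeros(7,2), 730250 + [0 10 29 30 45 61 95]'];

t = A(:, end) - min(A(:, end));            % days since the first timestamp
edges = 30:30:max(t);                      % 30, 60, 90
idx = sum(bsxfun(@lt, t, edges), 1);       % rows before each edge: 3 5 6
rows = diff([0, idx, numel(t)]);           % rows per interval: 3 2 1 1

C = mat2cell(A, rows, size(A, 2));         % one cell per 30-day interval
C = C(~cellfun('isempty', C));             % drop intervals with no rows
disp(cellfun('size', C, 1)')
```

Each cell of C then holds the rows whose timestamps fall in one 30-day window counted from the first timestamp.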
Well, all you need is to find the edge times and the matrix indices between them. Since your numbers are in datenum format, one unit is one day, which means we can step in 30-unit increments until we get as close as we can to the end, as follows:
startTime = originalMatrix(1,4);
endTime = originalMatrix(end,4);
edgeTimes = startTime+30:30:endTime; % upper edge of each full 30-day cycle
% Then loop through the edges, collecting the samples that complete each cycle:
nEdges = numel(edgeTimes);
totalMeasures = size(originalMatrix,1);
subMatrix = cell(1,nEdges);
prevEdgeIdx = 0;
for curEdgeIdx = 1:nEdges
    nearIdx = getNearestIdx(originalMatrix(:,4),edgeTimes(curEdgeIdx));
    if originalMatrix(nearIdx,4) > edgeTimes(curEdgeIdx)
        nearIdx = nearIdx-1; % step back so the sample is not past the edge
    end
    if nearIdx > 0 && nearIdx <= totalMeasures
        subMatrix{curEdgeIdx} = originalMatrix(prevEdgeIdx+1:nearIdx,:);
        prevEdgeIdx = nearIdx;
    else
        error('For some reason the edge was not inbound.');
    end
end
% Now collect the remaining samples after the last edge, which do not complete a 30-day cycle:
if prevEdgeIdx < totalMeasures
    subMatrix{end+1} = originalMatrix(prevEdgeIdx+1:end,:);
end
The function getNearestIdx was discussed here. It gives you the nearest point to the input value without checking all possible points; it does so by interpolating the index linearly, so it assumes the values are roughly evenly spaced.
function vIdx = getNearestIdx(values,point)
if isempty(values) || ~numel(values)
vIdx = [];
return
end
vIdx = 1+round((point-values(1))*(numel(values)-1)...
/(values(end)-values(1)));
if vIdx < 1, vIdx = []; end
if vIdx > numel(values), vIdx = []; end
end
Note: This is pseudocode and may contain errors. Please try to adjust it into your problem.

find the correlation for a time series of hourly measurements

I have the following example:
DateTime=datestr(datenum('2011-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
datenum('2011-12-31 23:00','yyyy-mm-dd HH:MM'),...
'yyyy-mm-dd HH:MM');
Data = [datenum(DateTime) - datenum(2011,0,0),rand(13,length(DateTime(:,1)))'];
This data contains the day of year in the first column, water temperature in column (2:end-1) and air temperature in the last column. I would like to calculate the correlation between air temperature (last column) and each column of temperature. I can do this as follows:
R = arrayfun(@(i)nonzeros(tril(corrcoef(Data(:,i),Data(:,end)),-1)),2:size(Data,2)-1,'un',0);
Next, I am trying to generate a matrix of the correlation values for each individual day (i.e. each 24 rows). So my question is: how can I calculate the correlation between each column of temperature and air temperature, as indicated above, but for each individual day as denoted by 'Data(:,i)'? The outcome should have 365 rows (days) and 12 columns (temperatures).
In addition, I can find the row number for each day by:
[a,b,b] = unique(floor(Data(:,1)));
Try the following:
dayIdx = floor(Data(:,1));
R = zeros(365,12);
for i=1:365
c = corrcoef( Data(dayIdx==i,:) ); %# corr between all variables for one day
R(i,:) = c(end,2:end-1); %# extract those between water temps and air temp
end
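A quick way to sanity-check this loop is to run it on synthetic data where the answer is known. The sketch below is an illustration under made-up assumptions: it uses 3 days instead of 365, an integer day column, and an air temperature that is an exact linear function of the first water temperature column, so the first column of R should come out very close to 1.

```matlab
nDays = 3; nHours = 24;                         % small stand-in for 365 days
dayIdx = ceil((1:nDays*nHours)' / nHours);      % 1..nDays, 24 rows each
water = rand(nDays*nHours, 12);                 % 12 synthetic water temperatures
air = 2*water(:,1) + 1;                         % air is a linear function of column 1
Data = [dayIdx, water, air];

R = zeros(nDays, 12);
for i = 1:nDays
    c = corrcoef(Data(dayIdx == i, :));         % 14x14; constant day column gives NaN entries
    R(i,:) = c(end, 2:end-1);                   % air vs. each water column
end
disp(R(:,1)')                                   % close to 1 for every day
```

The NaN entries produced by the constant day column sit in the first row and column of c, so they do not reach R.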

matlab updating time vector

I have 19 cells (19x1) with temperature data for an entire year where the first 18 cells represent 20 days (each) and the last cell represents 5 days, hence (18*20)+5 = 365days.
In each cell there should be 7200 measurements (apart from cell 19) where each measurement is taken every 4 minutes thus 360 measurements per day (360*20 = 7200).
The time vector for the measurements is only expressed as day number i.e. 1,2,3...and so on (thus no decimal day),
which is therefore displayed as 360 x 1's... and so on.
As the sensor failed during some days, some of the cells contain less than 7200 measurements, where one in
particular only contains 858 rows, which looks similar to the following example:
a=rand(858,3);
a(1:281,1)=1;
a(281:327,1)=2;
a(327:328,1)=5;
a(329:330,1)=9;
a(331:498,1)=19;
a(499:858,1)=20;
Where column 1 = day, column 2 and 3 are the data.
Knowing that each day number should be repeated 360 times, is there a method for adding
the missing copies of every value from 1:20 to make up the 360? For example, the first column requires
79 x 1's, 46 x 2's, 360 x 3's... and so on; the final array should therefore have 7200 values in
order from 1 to 20.
If this is possible, in the rows where these values have been added, the second and third columns should
be set to nan.
I realise that this is an unusual question, and that it is difficult to understand what is asked, but I hope I have been clear in expressing what I'm attempting to
achieve. Any advice would be much appreciated.
Here's one way to do it for a given element of the cell matrix:
full=zeros(7200,3)+NaN;
for i = 1:20 % for each day
starti = (i-1)*360; % find corresponding 360 indices into full array
full( starti + (1:360), 1 ) = i; % assign the day
idx = find(a(:,1)==i); % find any matching data in a for that day
full( starti + (1:length(idx)), 2:3 ) = a(idx,2:3); % copy matching data over
end
You could probably use arrayfun to make this slicker, and maybe (??) faster.
You could make this into a function and use cellfun to apply it to your cell.
PS - if you ask your question at the Matlab help forums you'll most definitely get a slicker & more efficient answer than this. Probably involving bsxfun or arrayfun or accumarray or something like that.
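To make the effect of the loop easier to follow, here is the same idea on a tiny made-up case: 3 days with 4 slots per day instead of 20 days with 360. The names nDays, perDay and filled are introduced for this sketch only (filled is used rather than full to avoid shadowing MATLAB's built-in full function).

```matlab
nDays = 3; perDay = 4;                          % small stand-ins for 20 and 360
a = [1 10 11; 1 12 13; 3 30 31];                % day 2 missing, days 1 and 3 partial

filled = nan(nDays*perDay, 3);
filled(:,1) = ceil((1:nDays*perDay)' / perDay); % day column: 1 1 1 1 2 2 2 2 3 3 3 3
for d = 1:nDays
    idx = find(a(:,1) == d);                    % rows of a belonging to day d
    % copy the available data into the first slots of that day's block
    filled((d-1)*perDay + (1:numel(idx)), 2:3) = a(idx, 2:3);
end
disp(filled)
```

Rows with no matching measurement keep NaN in the second and third columns, which is what the question asks for.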
Update - to do this for each element in the cell array, the only change is that instead of searching for i as the day number, you calculate it based on how far along the cell array you are. You'd do something like (untested):
filled = cell(size(cellarray));
for k = 1:length(cellarray)
    a = cellarray{k}; % data for this block of days
    nDays = min(20, 365 - (k-1)*20); % 20 days per cell, 5 in the last one
    full = nan(nDays*360,3);
    for i = 1:nDays
        starti = (i-1)*360; % ... as before
        day = (k-1)*20 + i; % first cell is days 1-20, second is 21-40,...
        full( starti + (1:360),1 ) = day; % <-- replace i with day
        idx = find(a(:,1)==day); % <-- replace i with day
        full( starti + (1:length(idx)), 2:3 ) = a(idx,2:3); % same as before
    end
    filled{k} = full; % padded copy of cell k
end
I am not sure I understood correctly what you want to do, but the code below works out how many measurements are missing for each day and appends extra rows to the bottom of your 'a' matrix, so that you get the full 7200x3 matrix.
nbMissing = 7200-size(a,1);
a1 = nan(nbMissing,3);
l = 0;
for i = 1:20
    nbMissing_i = 360-sum(a(:,1)==i);
    a1(l+1:l+nbMissing_i,1) = i;
    l = l+nbMissing_i;
end
a_filled = [a;a1];
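Since the padding rows are appended at the bottom, the day column of the result is not in order from 1 to 20 as the question requests. One possible follow-up, shown here as a sketch on made-up data with 3 days and 2 slots per day, is to sort the combined matrix by its first column with sortrows.

```matlab
perDay = 2; nDays = 3;                 % small stand-ins for 360 and 20
a = [1 0.5 0.6; 3 0.7 0.8];            % made-up data: one row each for day 1 and day 3

nbMissing = nDays*perDay - size(a,1);  % total number of rows to add
a1 = nan(nbMissing, 3);
l = 0;
for i = 1:nDays
    nbMissing_i = perDay - sum(a(:,1) == i);   % rows missing for day i
    a1(l+1:l+nbMissing_i, 1) = i;
    l = l + nbMissing_i;
end
a_filled = sortrows([a; a1], 1);       % order rows by day number
disp(a_filled(:,1)')
```

Sorting on column 1 alone means the NaN values in the other columns play no part in the ordering.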

representing the day and some parameters of a month

Could you please help me with this matter?
I have 3 matrices, P (Power), T (Temperature) and H (Humidity)
every matrix has 31 columns (days) and there are 24 rows for every column
which are the data for the March of year 2000, i.e.
for example, the matrix P has 31 columns where every column represents
a day data for Power through 24 hours and the same idea is for T and H
I tried to write a MATLAB program that accomplishes my goal, but
it gave me errors.
My aim is:
In the MATLAB command window, the program should ask the user the following phrase:
Please enter the day number of March, 2000 from 1 to 31:
And I know it is as follows:
Name=input (Please enter the day number of March, 2000 from 1 to 31:)
Then, when, for example, number 5 is entered, the result shown is a matrix containing the following:
1st column: The day name or it can be represented by numbers
2nd column: simple numbers from 1 to 24 representing the hours for that day
3rd column: the 24 points of P of that day extracted from the original P
(the column number 5 of the original P)
4th column: the 24 points of T of that day extracted from the original T
(the column number 5 of the original T)
5th column: the 24 points of H of that day extracted from the original H
(the column number 5 of the original H)
Any help will be highly appreciated,
Regards
Here is what you ask for:
% some sample data
P = rand(24,31);
T = rand(24,31);
H = rand(24,31);
% input day number
daynum=input('Please enter the day number of March, 2000 from 1 to 31: ');
[r, c] = size(P);
% generate output matrix
OutputMatrix = zeros(r,5);
OutputMatrix(:,1) = repmat(weekday(datenum(2000,3,daynum)),r,1);
OutputMatrix(:,2) = 1:r;
OutputMatrix(:,3) = P(:,daynum);
OutputMatrix(:,4) = T(:,daynum);
OutputMatrix(:,5) = H(:,daynum);
disp(OutputMatrix)
The matrix can be generated in one line, but this way is clearer.
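For comparison, a one-line version might look like the sketch below. It assumes the same P, T and H layout (24 rows by 31 columns) and fixes daynum to 5 instead of calling input, so that it can run without user interaction.

```matlab
P = rand(24,31); T = rand(24,31); H = rand(24,31);   % sample data, as above
daynum = 5;                                          % fixed here instead of using input

% Columns: weekday number, hour, then P, T and H for the chosen day
OutputMatrix = [repmat(weekday(datenum(2000,3,daynum)), 24, 1), (1:24)', ...
                P(:,daynum), T(:,daynum), H(:,daynum)];
disp(size(OutputMatrix))
```

weekday returns 1 for Sunday through 7 for Saturday, so the first column is a day-of-week code rather than a day name.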
Is it always for March 2000? :) Where do you get this information from?