I have SST data for 30 days over one region. However, the data for several of the days are missing, as shown in the figure. I want to calculate the average of the observed SST values over these 30 days, even though some days are missing.
My data is in .mat format. How can I calculate the average SST value for these 30 days?
For example, is there a function (in MATLAB or Python) I can use to calculate the average value even though some of the data are missing?
Note: 'NaN' indicates missing data.
Thanks in advance!
Just use the 'omitnan' flag of the mean() function:
% Random input array
x = rand(30, 1);
% Insert a random number of NaNs in the array
nNaN = randi(30, 1);
idx = randi(30, [nNaN, 1]);
x(unique(idx)) = NaN;
% Calculate the average
xav = mean(x, 'omitnan');
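Since the question mentions Python as an option: numpy.nanmean does the same thing there. For completeness, a dependency-free sketch (the helper name is my own, not a standard function):

```python
import math

def mean_omit_nan(values):
    """Average the values, skipping NaN entries (like MATLAB's mean(x, 'omitnan'))."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        return float('nan')  # every entry was missing
    return sum(kept) / len(kept)

sst = [20.1, float('nan'), 21.3, 19.8, float('nan')]
print(mean_omit_nan(sst))  # averages only the three observed values
```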
I have a dataset with two columns: the first column is duration (a length of time, e.g. 5 min) and the second column is firing rate. Is it possible to plot this in such a way that firing rates are binned according to duration (e.g. 5, 10, 15 min), and then plot bars with firing rate on the y-axis and time on the x-axis?
I'm sure this can be accomplished without the for loop. The solution below uses the discretize function to do the grouping; other approaches are possible.
% MATLAB R2017a
% Sample data
D = 20*rand(25,1);
FR = 550*rand(25,1);
D_bins = (0:5:20)';
ind = discretize(D,D_bins); % groups data
FR_mean = zeros(length(D_bins)-1,1); % discretize returns bin indices 1..numel(edges)-1
for k = 1:length(D_bins)-1
FR_mean(k) = mean(FR(ind==k));
end
bar(D_bins(2:end),FR_mean) % bar plot, labeled by upper bin edge (5,10,15,20)
% Cosmetics
xlabel('Duration (min)')
ylabel('Mean Firing Rate (unit)')
I'm positive there's a more efficient way to get the means for each group, possibly using arrayfun or some other nifty functions, but will hold off until OP provides more details.
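For comparison, here is the same bin-and-average idea sketched in Python with the standard library only. The data below is random stand-in data (like the MATLAB sample above), not the OP's:

```python
import random
from collections import defaultdict

random.seed(0)
durations = [random.uniform(0, 20) for _ in range(25)]  # stand-in for D
rates = [random.uniform(0, 550) for _ in range(25)]     # stand-in for FR

bin_width = 5
groups = defaultdict(list)
for d, fr in zip(durations, rates):
    groups[int(d // bin_width)].append(fr)  # bin index, like discretize

# mean firing rate per bin, keyed by the bin's lower edge
mean_rate = {b * bin_width: sum(v) / len(v) for b, v in sorted(groups.items())}
print(mean_rate)
```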
In MATLAB, given a vector A (please find it here: https://www.dropbox.com/s/otropedwxj0lki7/A.mat?dl=0 ), how could I find the n-sample subset with the smallest range (or standard deviation)?
I am trying a potential solution: reshaping the vector into columns, computing the range of each column, and selecting the smallest. However, reshape does not always work well when applied to other examples with different lengths. How could this be done in an easier and more efficient way?
Fs = 1000; % sampling frequency
time = round(length(A)/Fs)-1; % approximate total duration in seconds (rounded)
A_reshaped = reshape(A(1:time*Fs), [], time/2); % reshape A (dropping trailing samples) into time/2 columns
D(1,:) = mean(A_reshaped);
D(2,:) = range(A_reshaped);
[~,idx] = min(D(2,:));
Value = D(1,idx);
Any help is much appreciated.
To find the n-sample subset with minimum range, you can sort the vector and subtract the first section of the sorted vector from the last section. Then use the index of the minimum to recover the n samples:
n=4
a= rand(1,10);
s= sort(a);
[~,I]=min(s(n:end)-s(1:end-n+1))
result = s(I:I+n-1)
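The same sort-and-slide idea in Python (a hedged sketch; the function name is illustrative). After sorting, the tightest window of n consecutive values minimizes s[i+n-1] - s[i]:

```python
def min_range_subset(a, n):
    """Return the n values of a with the smallest range (max - min)."""
    s = sorted(a)
    # Range of each window of n consecutive sorted values; keep the tightest
    i = min(range(len(s) - n + 1), key=lambda i: s[i + n - 1] - s[i])
    return s[i:i + n]

print(min_range_subset([7.0, 1.0, 5.2, 5.0, 5.1, 9.0], 3))  # -> [5.0, 5.1, 5.2]
```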
I have some travel-time data stored as column vectors. I want to write a script that runs a linear interpolation between specified initial and final values, producing a column of distances so I can calculate velocity.
Example: Column 1: t1, t2, t3, ..., tn; Column 2 (created by the linear interpolation): d1, d2, d3, ..., dn.
So here we have generated a distance for each travel time based on an initial distance and a final distance.
Then it should be simple to generate a new column that is simply the interpolated distances divided by the travel times. Thanks for your help. Cheers
interp1 is your friend here:
% from zero to one hour
measuredTime = [0 1];
% from 0 to 100 km
measuredDistance = [0 100];
% 10 minute intervals
intermediateTimes = measuredTime(1):10/60:measuredTime(end);
% interpolated distances
intermediateDistances = interp1(measuredTime,measuredDistance,intermediateTimes);
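If Python is an option, numpy.interp does this directly; below is a stdlib-only sketch of the same endpoint-to-endpoint linear interpolation (the function name is my own):

```python
def lerp(t, t0, t1, d0, d1):
    """Linearly interpolate the distance at time t between (t0, d0) and (t1, d1)."""
    return d0 + (d1 - d0) * (t - t0) / (t1 - t0)

t0, t1 = 0.0, 1.0        # measured times: zero to one hour
d0, d1 = 0.0, 100.0      # measured distances: 0 to 100 km
times = [i * 10 / 60 for i in range(7)]  # 10-minute intervals
distances = [lerp(t, t0, t1, d0, d1) for t in times]
print(distances)
```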
I have a 4-D matrix. The dimensions are longitude, latitude, days, years, i.e. [17,14,122,16].
I have to find the frequency of values above the 98th percentile for each cell, so that the final output is a 17x14 array containing the number of occurrences above the 98th-percentile threshold.
I did something that gives me a 17x14 matrix of the 98th-percentile values for each cell, but I am unable to determine the frequency of occurrences.
k=0;
p=cell(1,238);
r=cell(1,238);
for i=1:17
    for j=1:14
        n=squeeze(m(i,j,1:122,1:16));
        k=k+1;
        q=prctile(n(:),98);
        r{k}=nansum(nansum(n>=q));
        p{k}=q;
    end
end
This code computes p fine, but r contains the same value for all cells. How is this possible? What am I doing wrong? Please help.
By definition, the frequency of values above the 98th percentile is 2%.
I'm guessing the value you keep getting for r is 39: the number of elements in the top 2% of your 122x16 matrix (1952 elements).
r = 0.02*1952
r =
   39.0400
Your code is verifying the theoretical value. Perhaps you are thinking of a different question?
Here's a simulated example, using randomly generated data (uniform distribution from 0 to 100) in place of your data (n).
k=0; % note: k must be initialized, otherwise k=k+1 below errors
p=cell(1,238);
r=cell(1,238);
for i=1:17
    for j=1:14
        % n=squeeze(m(i,j,1:122,1:16)); % with real data: squeeze gives a
        % 2-D matrix of 122x16
        n = rand(122,16)*100; % simulation of your 2-D matrix
        k=k+1;
        q=prctile(n(:),98);
        r{k}=nansum(nansum(n>=q));
        p{k}=q;
    end
end
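The 2%-by-definition point can also be checked numerically. A hedged Python sketch (stdlib only, simulated data, simple empirical percentile rather than MATLAB's interpolated prctile):

```python
import random

random.seed(1)
n_vals = 122 * 16  # same cell count as the 122x16 slice
data = [random.uniform(0, 100) for _ in range(n_vals)]

s = sorted(data)
q98 = s[int(0.98 * n_vals)]         # simple empirical 98th percentile
count_above = sum(v >= q98 for v in data)
print(count_above, 0.02 * n_vals)   # the count is ~2% of n_vals by construction
```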
I have a time series of prices with 2000 entries.
I have created 12 vectors, each containing the data for one month. They don't have the same length; it varies by about 20, between 160 and 180 values.
Now I need to plot all these vectors in the same plot, in sequence of course, starting with the January data, with a little space in between, and with the month names on the x-axis (which I have in an array: ['jan' 'feb' etc]).
For an example click on the link and scroll down to seasonal subseries plot
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc443.htm
To obtain a graph similar to the one above, you can insert a row of NaNs after each month. Since each month has a different number of rows, you cannot simply reshape, concatenate NaNs, and reshape back.
Suppose you have timestamps in the first column and some data in the second column:
data = [(now-11:now+13)' rand(25,1)];
% Count in 'idx' when each year-month pair ends
[y,m] = datevec(data(:,1));
[~, idx] = unique([y,m],'rows','last');
% Preallocate expanded Out with NaN separations between each month
szData = size(data);
Out = NaN(szData(1) + numel(idx)-1,2);
% Reposition 'data' within 'Out'
pos = ones(szData(1),1);
pos(idx(1:end-1)+1) = 2;
Out(cumsum(pos),:) = data;
% Example plot
plot(Out(:,1),Out(:,2))
set(gca,'Xtick',data([1 11 12 25],1))
datetick('x','dd-mmm','keepticks')