I want to make a timeline.
The code below extracts information from columns A and B of some Excel workbooks. Column A contains years; column B contains the day number (within that year) on which an event happened.
My question is: how can I plot this with Station1, Station2, etc. on the y-axis and the year on the x-axis? I want the graph to show a point at the day (in the right year) for which my Excel sheet has data.
num = xlsread('station1.xlsx', 1, 'A:B');
num3 = xlsread('station2.xlsx', 1, 'A:B');
num4 = xlsread('station3.xlsx', 1, 'A:B');
num5 = xlsread('station5.xlsx', 1, 'A:B');
Example data:
num = 2000 193
2000 199
2000 220
2000 228
2000 241
2000 244
2000 250
2000 257
2016 287
2016 292
2016 294
2016 300
Use datetime and caldays to convert your year / day of year data into actual dates. Subtract 1 from the day number, because day 1 is already 1 January:
dnum = datetime(num(:,1),1,1) + caldays(num(:,2) - 1);
% dnum = '11-Jul-2000'
% '17-Jul-2000'
% '07-Aug-2000'
% ...
Plot a line with marks on every date:
hold on % to plot multiple lines
plot(dnum, 1*ones(size(dnum)), 'x-') % Change the 1 to the y-axis value
plot(dnum2, 2*ones(size(dnum2)), 'x-') % Line at y=2 with other dates dnum2
hold off
[Figure: output plot, zoomed in on the x-axis to show the year 2000 dates]
If your files are named as in your example, then you can replace your whole code with a loop, which avoids declaring many num variables and calling plot on many separate lines:
figure; hold on;
for ii = 1:5 % adjust to the station numbers you actually have files for
    num = xlsread(['station', num2str(ii), '.xlsx'], 1, 'A:B');
    % day 1 of the year is 1 January, hence the -1
    dnum = datetime(num(:,1),1,1) + caldays(num(:,2) - 1);
    plot(dnum, ii*ones(size(dnum)), 'x-'); % one horizontal line per station, at y = ii
end
hold off
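To get Station1, Station2, etc. on the y-axis as the question asks, the tick labels can be replaced after plotting. The following is only a sketch under stated assumptions: the `stations` cell array and the `labels` names are made up here to stand in for the Excel files, and day 1 is taken to be 1 January. It relies on `yticks` and `yticklabels`, which are only available in newer MATLAB releases.

```matlab
% Hypothetical stand-in for the Excel files: each cell is [year, day of year]
stations = { [2000 193; 2000 199; 2016 287], ...
             [2000 220; 2016 292; 2016 300] };
labels = {'Station1', 'Station2'};   % assumed display names

figure; hold on;
for ii = 1:numel(stations)
    num = stations{ii};
    % 1 January of the year plus (day number - 1) calendar days
    dnum = datetime(num(:,1), 1, 1) + caldays(num(:,2) - 1);
    plot(dnum, ii*ones(size(dnum)), 'x-');   % station ii drawn at y = ii
end
hold off

% Show the station names on the y-axis instead of 1, 2, ...
yticks(1:numel(stations));
yticklabels(labels);
ylim([0, numel(stations) + 1]);   % leave some room above and below the lines
```

With real data, the cell array would be filled from the xlsread calls instead.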
An Excel file contains 5 columns: the first column contains the year (1987 to 2080), the second the month, the third the day, and the fourth and fifth contain values. I would like to get the sums of columns four and five for each year in column one: for example, the sums for 1987, then 1988, then 1989, and so on.
Example of data file is attached
I have tried the following code considering that each year contains 365 days.
n=1;
for i=1:365:size(data,1)
Total(n,:) = sum(data(i:i+365-1,:));
n=n+1;
end
But the problem is that not all years contain 365 days. Some of them (e.g. 1988, 1992) are leap years and contain 366 days. In those cases, the sums become incorrect.
Looking for your help to get the sum values of columns 4 and 5 according to the year in column 1.
It would be greatly appreciated.
UPDATE: much faster solution at the end!
It can be done as follows with one line for each column:
% some example data
years = ceil(1987:0.3:2080)';
months = randi(12,numel(years),1);
days = randi(30,numel(years),1);
values = randi(42,numel(years),2);
% data similar to yours;
data = [ years months days values ];
The readable long way would be:
% years
y = data(:,1);
% unique years
uy = unique(y);
% for column 4
C4 = arrayfun(@(x) sum( data(y == x, 4) ), uy )
% for column 5
C5 = arrayfun(@(x) sum( data(y == x, 5) ), uy )
or, in short, one line per column:
C4 = arrayfun(@(x) sum( data( (data(:,1) == x), 4) ), unique(data(:,1)) )
returning a 94x1 double array with all sums for all 94 unique years of the example data.
If you want to arrange it somehow you could do it as follows:
summary = [uy, C4, C5]
returning something like:
summary =
% year   sum of column 4   sum of column 5
1987 3 3
1988 40 40
1989 56 56
1990 96 96
1991 54 54
1992 15 15
1993 73 73
1994 42 42
1995 66 66
1996 56 56
...
You could also do all columns at once. Even with just 2 columns it should be about 50% faster.
cols = 4:5;
C = cell2mat( arrayfun(@(x) sum( data(y == x, cols),1 ), uy,'uni',0 ) )
The problem with that solution is that your matrix is about 30000x5 in size, and for every unique year the indexing is applied to the whole matrix to "search" for the year being summed. There is a built-in function for exactly this kind of grouped sum.
A simpler and much faster solution uses accumarray:
[~,~, i_uy] = unique(data(:,1));
C4 = accumarray(i_uy,data(:,4));
C5 = accumarray(i_uy,data(:,5));
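To see what the third output of unique and accumarray do together, here is a small worked example. The four rows of data are made up for illustration and include one leap-year date; everything else follows the approach above.

```matlab
% Tiny made-up data set: [year month day value4 value5]
data = [1987  1  1   1  10;
        1987  6 15   2  20;
        1988  2 29   3  30;
        1988 12 31   4  40];

% uy: the unique years; i_uy: for each row, the index of its year in uy
[uy, ~, i_uy] = unique(data(:,1));

% accumarray adds up all values that share the same group index,
% so the number of rows per year does not matter
C4 = accumarray(i_uy, data(:,4));
C5 = accumarray(i_uy, data(:,5));

summary = [uy, C4, C5]
% summary =
%     1987     3    30
%     1988     7    70
```

Because the grouping comes from the year column itself, leap years need no special handling.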
I have a matrix in which one column is the day of year and the other is the data associated with that day. On some days there are multiple data points, while on others there is one or none. This makes the information difficult to plot, so I would like to plot the mean and standard deviation of the data instead. For example, if data was collected three times on the 320th day of the year, the mean and standard deviation of those three points would be computed; the plotted line would pass through the mean, and the standard deviation would be shown as error bars. Say the data is:
DOY DATA
30, 12
30, 10
30, 8
120, 6
110, 5
I'd like to transform it to:
DOY DATA STD
30, 10, 2
120, 6, 0
110, 5, 0
I then wish to plot this data with the standard deviation representing error bars.
How would I go about this?
Thanks
You can use MATLAB's dataset to get easy grouping:
>> doy = [30 30 30 120 110]';
>> data = [12 10 8 6 5]';
The next line creates a dataset object with two columns, called "doy" and "data"
>> ds = dataset(doy, data);
This line calculates group statistics, using "doy" as the grouping variable and computing the mean and std for each group. It also gives the number of observations in each group in the column GroupCount.
>> grpstats(ds, 'doy', {'mean', 'std'})
ans =
doy GroupCount mean_data std_data
30 30 3 10 2
110 110 1 5 0
120 120 1 6 0
You could also use accumarray, especially if you don't have the Statistics Toolbox:
doy = [30 30 30 120 110]';
data = [12 10 8 6 5]';
[~,ind,subs] = unique(doy); % subs: group index for each row
means = accumarray(subs, data, size(ind), @mean);
stds = accumarray(subs, data, size(ind), @std);
final = [doy(ind), means, stds]
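The question also asks how to plot the result with error bars, which the answers above do not show. One possible sketch uses errorbar on the grouped values; the marker style and axis labels are arbitrary choices, and the grouping is the same accumarray approach as above.

```matlab
doy  = [30 30 30 120 110]';
data = [12 10 8 6 5]';

% Group by day of year; unique returns the days sorted in ascending order
[udoy, ~, subs] = unique(doy);
means = accumarray(subs, data, [], @mean);
stds  = accumarray(subs, data, [], @std);   % a single-point group has std 0

% Line through the means, with +/- one standard deviation as error bars
figure;
errorbar(udoy, means, stds, 'o-');
xlabel('Day of year');
ylabel('Data (mean \pm std)');
```

Since unique sorts the days, the line is drawn in day-of-year order even though the input rows are not sorted.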
Say that I have a dataset:
Jday = datenum('2009-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
datenum('2009-01-05 23:00','yyyy-mm-dd HH:MM');
DateV = datevec(Jday);
DateV(4,:) = [];
DateV(15,:) = [];
DateV(95,:) = [];
Dat = rand(size(DateV,1),1); % one measurement per remaining timestamp
How can I remove all of the days that have fewer than 24 measurements? For example, the first day has only 22 measurements, so I would need to remove that entire day. How could I repeat this for the whole array?
A quick solution is to group by year, month and day with unique(), count the observations per day with accumarray(), and exclude the days with fewer than 24 observations in two steps of logical indexing:
% Count observations per day
[unDate,~,subs] = unique(DateV(:,1:3),'rows');
counts = [unDate accumarray(subs,1)]
counts =
2009 1 1 22
2009 1 2 24
2009 1 3 24
2009 1 4 24
2009 1 5 23
Then, apply criteria to the counts and retrieve logical index
% index only those that meet criteria
idxC = counts(:,end) == 24
idxC =
0
1
1
1
0
% keep those which meet criteria (optional, for visual inspection)
counts(idxC,:)
ans =
2009 1 2 24
2009 1 3 24
2009 1 4 24
Finally, find the members of Dat that fall into the selected days with a second round of logical indexing through ismember():
idxDat = ismember(subs,find(idxC))
Dat(idxDat,:)
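Put together, the steps above can be sketched as one self-contained script. Two things here are assumptions rather than part of the answer: the date vectors are built directly with ndgrid instead of datenum, and Dat is taken to hold exactly one value per remaining row of DateV. The final index uses counts(subs) == 24, a compact variation that selects the same rows as the ismember() call.

```matlab
% Hourly timestamps for 1-5 January 2009 as date vectors [Y M D h m s]
[h, d] = ndgrid(0:23, 1:5);
DateV = [repmat([2009 1], 120, 1), d(:), h(:), zeros(120, 2)];

% Drop three rows one after another, as in the question
DateV(4,:)  = [];
DateV(15,:) = [];
DateV(95,:) = [];
Dat = rand(size(DateV,1), 1);   % assumed: one measurement per remaining row

% Count the observations per calendar day
[~, ~, subs] = unique(DateV(:,1:3), 'rows');
counts = accumarray(subs, 1);

% For every row, look up the count of its day and keep only complete days
keep = counts(subs) == 24;
DateVfull = DateV(keep, :);
Datfull   = Dat(keep);
```

Filtering DateV and Dat with the same logical index keeps the timestamps and measurements aligned.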
Rather long answer, but I think it should be useful. I would do this using containers.Map. Possibly there is a faster way, but maybe for now this one will be good.
Jday = datenum('2009-01-01 00:00','yyyy-mm-dd HH:MM'):1/24:...
datenum('2009-01-05 23:00','yyyy-mm-dd HH:MM');
DateV = datevec(Jday);
DateV(4,:) = [];
DateV(15,:) = [];
DateV(95,:) = [];
% create a map
dateMap = containers.Map();
% count measurements in each date (i.e. first three columns of DateV)
for rowi = 1:1:size(DateV,1)
dateRow = DateV(rowi, :);
dateStr = num2str(dateRow(1:3));
if ~isKey(dateMap, dateStr)
% initialize Map for a given date with 1 measurement (i.e. our
% counter of measurements)
dateMap(dateStr) = 1;
continue;
end
% increment measurement counter for given date
dateMap(dateStr) = dateMap(dateStr) + 1;
end
% get the dates
dateStrSet = keys(dateMap);
for keyi = 1:numel(dateStrSet)
dateStrCell = dateStrSet(keyi);
dateStr = dateStrCell{1};
% get number of measurements in a given date
numOfmeasurements = dateMap(dateStr);
% if less than 24 do something about it, e.g. save the date
% for later removal from DateV
if numOfmeasurements < 24
fprintf(1, 'This date has less than 24 measurement: %s\n', dateStr);
end
end
The result is:
This date has less than 24 measurement: 2009 1 1
This date has less than 24 measurement: 2009 1 5
I have observed daily data that I need to compare to generated monthly data, so I need the mean of each month for each year of the thirty-year period.
My observed data set is currently 365x31: each row is a day (no leap years!), the first column is the month number (1-12), and the remaining 30 columns are the years.
The problem is that I can only get a script to compute the mean over all years, i.e. I cannot figure out how to do it for each column separately. An example of the data is below:
1 12 14
1 -15 10
2 13 3
2 2 37
...all the way to 12 for 365 rows
SO: to recap, I need to get the mean of [12; -15; 13; 2] then [14; 10; 3; 37] and so on.
I have been trying to loop over the result of unique(), which gets the right number of rows to average but gives incorrect means. I need it to handle each month (28-31 rows) and each column individually; the result should be a 12x30 matrix. I feel like I am missing something SIMPLE. Code:
u = unique(m); % unique values of m (months), i.e. 1-12
for i=1:length(u)
month(i) = mean(obatm(u(i), 2:31)); % the average for each month of each year
end
Appreciate any ideas! Thanks!
You can simply filter the rows for each month and then apply mean, like so:
month = zeros(12, size(obatm, 2));
for k = 1:12
month(k, :) = mean(obatm(obatm(:, 1) == k, :));
end
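As a quick check of the loop approach, here it is applied to the four example rows from the question. This sketch differs from the loop above in one respect, which is my own choice: it skips the month column, so the result has one column per year only, matching the 12x30 shape the question asks for. The name monthlyMean is likewise just illustrative.

```matlab
% First column is the month; the remaining columns are years
% (only the four example rows from the question)
obatm = [1  12 14;
         1 -15 10;
         2  13  3;
         2   2 37];

months = unique(obatm(:,1));
monthlyMean = zeros(numel(months), size(obatm,2) - 1);
for k = 1:numel(months)
    rows = obatm(:,1) == months(k);
    % mean(..., 1) averages down each column, even if a month has one row
    monthlyMean(k,:) = mean(obatm(rows, 2:end), 1);
end
monthlyMean
% monthlyMean =
%    -1.5000   12.0000
%     7.5000   20.0000
```

The first row reproduces the means of [12; -15] and [14; 10] from the recap in the question.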
EDIT:
If you want something fancy, you can also do this:
cols = size(obatm, 2) - 1;
subs = bsxfun(@plus, obatm(:, 1), (0:12:12 * (cols - 1)));
vals = obatm(:, 2:end);
month = reshape(accumarray(subs(:), vals(:), [12 * cols, 1], @mean), 12, cols)
Look, Ma, no loops!
Could you please help me with this matter?
I have 3 matrices: P (Power), T (Temperature) and H (Humidity).
Every matrix has 31 columns (days) and 24 rows (hours),
holding the data for March 2000.
For example, each of the 31 columns of P holds
one day of Power data over 24 hours, and the same goes for T and H.
I tried to write a MATLAB program that accomplishes my goal, but
it gave me errors.
My aim is:
In the MATLAB command window, the program should ask the user the following phrase:
Please enter the day number of March, 2000 from 1 to 31:
And I know it is as follows:
Name = input('Please enter the day number of March, 2000 from 1 to 31: ')
Then, when, for example, number 5 is entered, the result shown is a matrix containing the following:
1st column: The day name or it can be represented by numbers
2nd column: simple numbers from 1 to 24 representing the hours for that day
3rd column: the 24 points of P of that day extracted from the original P
(the column number 5 of the original P)
4th column: the 24 points of T of that day extracted from the original T
(the column number 5 of the original T)
5th column: the 24 points of H of that day extracted from the original H
(the column number 5 of the original H)
Any help will be highly appreciated,
Regards
Here is what you ask for:
% some sample data
P = rand(24,31);
T = rand(24,31);
H = rand(24,31);
% input day number
daynum=input('Please enter the day number of March, 2000 from 1 to 31: ');
[r, c] = size(P);
% generate output matrix
OutputMatrix = zeros(r,5);
OutputMatrix(:,1) = repmat(weekday(datenum(2000,3,daynum)),r,1);
OutputMatrix(:,2) = 1:r;
OutputMatrix(:,3) = P(:,daynum);
OutputMatrix(:,4) = T(:,daynum);
OutputMatrix(:,5) = H(:,daynum);
disp(OutputMatrix)
The matrix can be generated in one line, but this way is clearer.
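For reference, a one-line version might look like the sketch below. The day number is fixed to 5 here instead of being read with input(), purely so the snippet runs unattended; the sample data is random as above.

```matlab
% Sample data: 24 hours (rows) by 31 days (columns)
P = rand(24,31);
T = rand(24,31);
H = rand(24,31);

daynum = 5;        % fixed for illustration; normally read with input()
r = size(P, 1);    % number of hours per day

% Concatenate the five columns side by side in a single statement:
% weekday number, hour, and the selected day's column of P, T and H
OutputMatrix = [repmat(weekday(datenum(2000,3,daynum)), r, 1), (1:r)', ...
                P(:,daynum), T(:,daynum), H(:,daynum)];
disp(OutputMatrix)
```

Note that (1:r)' has to be transposed into a column for the horizontal concatenation to work.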
Is it always for March 2000? :) Where do you get this information from?