I have time series data in an xlsx file with columns like the ones below. I would like to extract the rows whose times fall between 8:00:00 AM and 10:00:00 AM for my analysis. Can anyone help me out?
Add Velocity Time
0.128835374 10.34912454 8:44:23 AM
0.20423977 8.078739988 8:47:01 AM
0.110629502 13.4081172 9:19:46 AM
0.088979639 5.057336749 9:24:02 AM
0.128835374 10.60785265 10:21:29 AM
0.20423977 9.46599837 10:23:06 AM
[num, txt] = xlsread('Consective_result.xlsx');
T = num(:,3);
TimeVector = datevec(T)
You almost have it right. Use the third column of your txt cell array, and skip the first row so you don't pick up the Time header. I'm going to assume that your times are entered as text. Once you have them, use datenum and find the times that are at or after 8:00 AM and at or before 10:00 AM. datenum conveniently accepts a cell array of strings and returns a numeric vector in which each time string is converted to its numerical representation.
Once you have found those rows, you can filter num and txt down to them before you continue. Therefore:
[num, txt] = xlsread('Consective_result.xlsx');
times = txt(2:end,3); %// Get the 3rd column, skip 1st row
time_nums = datenum(times); %// Get the numerical representation of the times
%// Figure out those rows that are between 8:00 AM and 10:00 AM
times_to_choose = time_nums >= datenum('8:00:00 AM') & time_nums <= datenum('10:00:00 AM');
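%// Keep only those rows, then continue
num = num(times_to_choose, :);
txt = txt([true; times_to_choose], :); %// the leading true keeps the header row
Take special care with the offset: txt still contains the header row, so its logical mask needs one extra leading element, whereas num (assuming xlsread dropped the text-only header row, as it normally does) holds only the data rows and can be indexed directly. Now, num and txt should only contain the rows whose times are between 8:00 AM and 10:00 AM.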
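For anyone who wants to try the idea without the spreadsheet, here is a minimal sketch using the sample rows from the question. The variable names (velocity, tod, keep) are my own, and I'm assuming the times are text in the 'HH:MM:SS PM' style shown above. Taking the fractional part of datenum leaves only the time of day, so the comparison does not depend on which date datenum fills in.

```matlab
% Sample rows copied from the question (time column stored as text)
times = {'8:44:23 AM'; '8:47:01 AM'; '9:19:46 AM'; ...
         '9:24:02 AM'; '10:21:29 AM'; '10:23:06 AM'};
velocity = [10.34912454; 8.078739988; 13.4081172; ...
            5.057336749; 10.60785265; 9.46599837];

% Convert to serial date numbers, then keep only the fraction of a day
t = datenum(times, 'HH:MM:SS PM');
tod = t - floor(t);                 % 0 = midnight, 0.5 = noon

% Logical mask for 8:00 AM <= time <= 10:00 AM
keep = tod >= 8/24 & tod <= 10/24;

selectedTimes = times(keep);        % the four rows before 10:00 AM
selectedVelocity = velocity(keep);
```

The same mask can be applied to any other column that has one entry per data row.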
I have a CSV file containing hurricane location coordinates.
I'm new to Matlab, so I'm not sure how to handle the date and hour cells correctly, especially since they are stored in an unconventional format.
I need to apply linear interpolation so that I get the data at 30-minute intervals.
Let's assume you read the data in as numerical values
Now you have some matrix like so:
data = [20130928 0 21.1 50.0
20130928 600 22.2 50.3
20130928 1200 23.3 50.6
20130928 1800 24.2 50.6];
To convert the first two columns to datetime values, we could do this:
% Concatenate first two columns, including making all times 4 digits by 0 padding
fulltime = [num2str(data(:,1)), num2str(data(:,2), '%.4u')]
% Use datetime to convert (cell) times to dates with given format
dates = datetime(cellstr(fulltime),'inputformat', 'yyyyMMddHHmm');
>> dates = 28-Sep-2013 00:00:00
28-Sep-2013 06:00:00
28-Sep-2013 12:00:00
28-Sep-2013 18:00:00
Now we can easily interpolate. First create an array of times we want to use:
% Data value every 30 mins
interpdates = dates(1):hours(0.5):dates(end)
Then use interp1
interpolateddata = interp1(dates, data(:,3:4), interpdates);
>> interpolateddata = 21.1000 50.0000
21.1917 50.0250
21.2833 50.0500
21.3750 50.0750
...
24.1250 50.6000
24.2000 50.6000
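If your release does not accept datetime values in interp1, a variant is to interpolate on elapsed hours instead. This is only a sketch under that assumption: the names ymd, hhmm, elapsed and query are mine, and the dates are built arithmetically from the two numeric columns rather than through strings.

```matlab
data = [20130928    0 21.1 50.0
        20130928  600 22.2 50.3
        20130928 1200 23.3 50.6
        20130928 1800 24.2 50.6];

% Split yyyymmdd and hhmm into numeric components
ymd  = data(:,1);
hhmm = data(:,2);
dates = datetime(floor(ymd/10000), mod(floor(ymd/100),100), mod(ymd,100), ...
                 floor(hhmm/100), mod(hhmm,100), 0);

% Elapsed time in hours since the first observation: [0; 6; 12; 18]
elapsed = hours(dates - dates(1));

% Query points every half hour, then plain numeric interpolation
query = (0:0.5:elapsed(end))';
interpolated = interp1(elapsed, data(:,3:4), query);

% Matching timestamps for the interpolated rows
interpdates = dates(1) + hours(query);
```

The values match the output above; only the x axis handed to interp1 is different.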
I have a cell of size 1x7 where each cell inside of that is 365x5xN in which each N is a different location (siteID). It is already sorted according to column 5 (the columns are Lat, Lon, siteID, date, and data).
(The data can be found here: https://www.dropbox.com/sh/li3hh1nvt11vok5/4YGfwStQlo. Variable in question is PM25)
I want to go through the entire 1x7 cell and, looking at only the top 36 rows (basically, the top 10 percentile), count the number of times each date shows up. In other words, I want to know on which days the data value fell in the top 10 percentile.
Does anyone know how I can do this? I can't get my mind around how to approach this issue --> counting across all these cells and spitting out a quantity for each day of the year
Assuming you have a sorted cell array, you may use this -
%%// Get all the dates for all the rows in sorted cell array
all_dates = [];
for k1=1:size(sorted_cell,2)
all_dates = [all_dates reshape(cell2mat(sorted_cell{1,k1}(:,4,:)),1,[])];
end
all_unique_dates = unique(all_dates);
all_out = [num2cell(all_unique_dates)' num2cell(zeros(numel(all_unique_dates),1))];%%//'
%%// Get all the dates for the first 36 rows in sorted cell array
dates = [];
for k1=1:size(sorted_cell,2)
dates = [dates reshape(cell2mat(sorted_cell{1,k1}(1:36,4,:)),1,[])];
end
%%// Get unique dates and their counts
unique_dates = unique(dates);
count = histc(dates, unique_dates);
%%// As output create a cell array with the first column as dates
%%// and the second column as the counts
out = [num2cell(unique_dates)' num2cell(count)']
%%// Get all the dates and the corresponding counts.
%%// Thus many would still have counts as zeros.
all_out(ismember(all_unique_dates,unique_dates),:)=out;
Often when something looks tricky from the outside, it's easier to start from the inside instead. How can we get the top dates from a single array?
dates = unique(array(1:36,4));
Now, how to do that for each cell? A loop is always straightforward, but this is a pretty simple function, so let's use the one-liner:
datecell = cellfun(@(x) unique(x(1:36,4)), cellarray, 'UniformOutput', false);
Now we have just the dates we want, for each cell. If there's no need to keep them separated, let's just stick them all together into one big array:
dates = vertcat(datecell{:}); % the per-cell vectors can differ in length, so cell2mat would fail
dates = unique(dates); % in case there are any duplicates
If you want to actually count each date as well (it's a little unclear), it might be a little too involved for an anonymous function, so we could either write our own function to pass to cellfun, or just cop out and stick it in a loop:
dates = {};
counts = {};
for ii = 1:length(cellarray)
[dates{ii}, ~, idx] = unique(cellarray{ii}(1:36,4));
counts{ii} = accumarray(idx, 1);
end
Now, those cell arrays may contain duplication, so we'll have to combine the counts where necessary in a similar way:
dates = vertcat(dates{:}); % stack the column vectors; they may have different lengths
counts = vertcat(counts{:});
[dates, ~, idx] = unique(dates);
counts = accumarray(idx, counts); % add the counts of duplicated dates together
Note that re-assigning different data to the same variable names like this isn't particularly good practice - I'm just feeling exceptionally lazy tonight, and coming up with good, descriptive names is hard ;)
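To make the counting step concrete, here is a small self-contained sketch. The data is a made-up stand-in (two cells, four rows each, dates as plain numbers in column 4, topN = 2 instead of 36), and the names topDates, uniqueDates and counts are mine. It also takes the third dimension into account by pulling column 4 from every page.

```matlab
% Hypothetical stand-in data: rows x 5 columns x sites, date in column 4
c1 = zeros(4,5,2);
c1(:,4,1) = [10; 11; 12; 13];
c1(:,4,2) = [10; 12; 14; 15];
c2 = zeros(4,5,1);
c2(:,4,1) = [11; 10; 16; 17];
cellarray = {c1, c2};

topN = 2;   % the question uses 36

% Collect the dates of the top rows from every site in every cell
topDates = cellfun(@(x) reshape(x(1:topN,4,:), [], 1), cellarray, ...
                   'UniformOutput', false);
topDates = vertcat(topDates{:});

% Count how often each date appears
[uniqueDates, ~, idx] = unique(topDates);
counts = accumarray(idx, 1);
```

Here uniqueDates is [10; 11; 12] and counts is [3; 2; 1], i.e. date 10 was in the top rows three times.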
I have a cell array in which some of the entries have two data points. I want to average the two data points if the data were collected on the same day.
The first column of cell array 'site' is the date. The fourth column is the data concentration. I want to average the fourth column if the data comes from the same day.
For example, if my cell array looks like this:
01/01/2011 36-061-0069 1 10.4
01/01/2011 36-061-0069 2 10.1
01/04/2011 36-061-0069 1 7.9
01/05/2011 36-061-0069 1 13
I want to average the fourth column (10.4 and 10.1) into one row and leave everything else the same.
Help? Would an if elseif loop work? I'm not sure how to approach this issue, especially since cell arrays work a little differently than matrices.
You can do it succinctly without a loop, using a combination of unique, diff and accumarray.
Define data:
data = {'01/01/2011' '36-061-0069' '1' '10.4';
'01/01/2011' '36-061-0069' '2' '10.1';
'01/04/2011' '36-061-0069' '1' '7.9';
'01/05/2011' '36-061-0069' '1' '13'};
Then:
dates = datenum(data(:,1),'mm/dd/yyyy'); % change the format string for other date formats
[dates_sort ind_sort] = sort(dates);
[~, ii, jj] = unique(dates_sort);
n = diff([0; ii]);
result = accumarray(jj,vertcat(str2double(data(ind_sort,4))))./n;
gives the desired result:
result =
10.2500
7.9000
13.0000
If needed, you can get the non-repeated, sorted dates with data(ind_sort(ii),1).
Explanation of the code: the dates are first converted to numbers and sorted. The unique dates and repeated dates are then extracted. Finally, data in repeated rows are summed and divided by the number of repetitions to obtain the averages.
Compatibility issues for Matlab 2013a onwards:
The function unique has changed in Matlab 2013a. For that version onwards, add 'legacy' flag to unique, i.e. replace the line [~, ii, jj] = unique(dates_sort) by
[~, ii, jj] = unique(dates_sort,'legacy')
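A variant that sidesteps the 'legacy' question altogether is to let accumarray compute the mean per group, so the counts from diff are not needed. This is a sketch of that alternative, not the code above; the names values, group and dailyMean are mine.

```matlab
data = {'01/01/2011' '36-061-0069' '1' '10.4';
        '01/01/2011' '36-061-0069' '2' '10.1';
        '01/04/2011' '36-061-0069' '1' '7.9';
        '01/05/2011' '36-061-0069' '1' '13'};

dates  = datenum(data(:,1), 'mm/dd/yyyy');   % serial date numbers
values = str2double(data(:,4));              % concentrations as numbers

% group(k) is the index of dates(k) within the sorted unique dates
[uniqueDates, ~, group] = unique(dates);

% Average all values that share a group index
dailyMean = accumarray(group, values, [], @mean);

% Human-readable dates for the averaged rows
uniqueDateStr = cellstr(datestr(uniqueDates, 'mm/dd/yyyy'));
```

Because only the third output of unique is used, this behaves the same before and after the 2013a change.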
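It sounds like you want to do something like this (assuming the fourth column holds numeric values):
total = 0;
for it = 1:size(cellArray,1)
    total = total + cellArray{it,4}; % or .NameOfColumn if it is a cell containing structs
end
average = total / size(cellArray,1)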
I want to do something like
scatter(timesRefined, upProb)
where timesRefined is a cell array in which each entry is a string corresponding to a time moment, such as 8:32:21.122 and upProb is simply a vector of numbers with same length as cell array. What is the most convenient way to do this?
You can convert your timesRefined cell to a numeric representation of date with datenum
>> timesRefined = {'8:32:21.122','9:30:54.123'};
>> datenum(timesRefined)
ans =
734869.355800023
734869.396459757
The resulting number expresses a date as days from the epoch. Since you are not concerned with days, just time, and provided your observations are contained within one day, you can simply take the fractional part of the datenum output:
>> datestr(mod(datenum(timesRefined),1))
ans =
8:32 AM
9:30 AM
and do scatter(mod(datenum(timesRefined),1),upProb)
EDIT:
As pointed out by Pursuit, you can use the result of datenum directly as your x values and use datetick('x','HH:MM:SS.FFF')
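Putting the pieces together, a minimal sketch could look like this. The upProb values are made up for illustration, and I'm assuming every string follows the 'HH:MM:SS.FFF' pattern so the format can be passed explicitly.

```matlab
timesRefined = {'8:32:21.122'; '9:30:54.123'};
upProb = [0.4; 0.7];              % hypothetical values, same length as timesRefined

% Parse with an explicit format, then keep only the time-of-day fraction
x = datenum(timesRefined, 'HH:MM:SS.FFF');
x = x - floor(x);

scatter(x, upProb);
datetick('x', 'HH:MM:SS');        % label the x axis as clock times
```

Multiplying x by 86400 gives seconds since midnight if you prefer a plain numeric axis.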
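strsplit (on the Matlab file exchange, and built in to newer releases) should help. The built-in version returns a cell array of strings, so convert the pieces with str2double before doing arithmetic.
timestr = '8:32:21.122';
timenum = str2double(strsplit(timestr,':')); % 1x3 row: [hours minutes seconds]
convmat = [60*60; 60; 1];
time_in_seconds = timenum * convmat; % row times column gives a scalar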
I have 19 cells (19x1) with temperature data for an entire year where the first 18 cells represent 20 days (each) and the last cell represents 5 days, hence (18*20)+5 = 365days.
In each cell there should be 7200 measurements (apart from cell 19) where each measurement is taken every 4 minutes thus 360 measurements per day (360*20 = 7200).
The time vector for the measurements is only expressed as day number i.e. 1,2,3...and so on (thus no decimal day), which is therefore displayed as 360 x 1's... and so on.
As the sensor failed during some days, some of the cells contain less than 7200 measurements, where one in particular only contains 858 rows, which looks similar to the following example:
a=rand(858,3);
a(1:281,1)=1;
a(281:327,1)=2;
a(327:328,1)=5;
a(329:330,1)=9;
a(331:498,1)=19;
a(499:858,1)=20;
Where column 1 = day, column 2 and 3 are the data.
Knowing that each day number should be repeated 360 times, is there a method for adding the missing number of entries for every value from 1:20 in order to make up the 360? For example, the first column requires 79 x 1's, 46 x 2's, 360 x 3's... and so on; the final array should therefore have 7200 values in order from 1 to 20.
If this is possible, in the rows where these values have been added, the second and third columns should be set to nan.
I realise that this is an unusual question, and that it is difficult to understand what is asked, but I hope I have been clear in expressing what I'm attempting to achieve. Any advice would be much appreciated.
Here's one way to do it for a given element of the cell matrix:
full=zeros(7200,3)+NaN;
for i = 1:20 % for each day
starti = (i-1)*360; % find corresponding 360 indices into full array
full( starti + (1:360), 1 ) = i; % assign the day
idx = find(a(:,1)==i); % find any matching data in a for that day
full( starti + (1:length(idx)), 2:3 ) = a(idx,2:3); % copy matching data over
end
You could probably use arrayfun to make this slicker, and maybe (??) faster.
You could make this into a function and use cellfun to apply it to your cell.
PS - if you ask your question at the Matlab help forums you'll most definitely get a slicker & more efficient answer than this. Probably involving bsxfun or arrayfun or accumarray or something like that.
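To check the logic on something you can inspect by eye, here is the same padding idea at a toy scale. The sizes (3 days, 4 samples per day) and the matrix a are made up; the question uses 20 days and 360 samples. I've named the result padded rather than full to avoid shadowing the built-in full function.

```matlab
perDay = 4;  nDays = 3;          % hypothetical small sizes
a = [1 0.11 0.12
     1 0.21 0.22
     3 0.31 0.32];               % day 2 is missing entirely

% Start with all NaN, then write the day labels into column 1
padded = nan(nDays*perDay, 3);
padded(:,1) = kron((1:nDays)', ones(perDay,1));

for d = 1:nDays
    rows = find(a(:,1) == d);            % measurements recorded for day d
    n = min(numel(rows), perDay);        % guard against more than perDay rows
    padded((d-1)*perDay + (1:n), 2:3) = a(rows(1:n), 2:3);
end
```

Each day's block starts with the real measurements and is followed by NaN rows, so day 2 ends up as four NaN rows.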
Update - to do this for each element in the cell array, the main change is that instead of searching for i as the day number, you calculate the day based on how far along the cell array you are. You'd do something like this (untested):
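filled = cell(size(cellarray));
for k = 1:length(cellarray)
    a = cellarray{k}; % data for this block of days
    ndays = min(20, 365 - (k-1)*20); % the last cell only covers 5 days
    full = nan(ndays*360, 3);
    for i = 1:ndays
        starti = (i-1)*360; % ... as before
        day = (k-1)*20 + i; % first cell is days 1-20, second is 21-40,...
        full( starti + (1:360),1 ) = day; % <-- replace i with day
        idx = find(a(:,1)==day); % <-- replace i with day
        full( starti + (1:length(idx)), 2:3 ) = a(idx,2:3); % same as before
    end
    filled{k} = full; % keep the padded result for this cell
end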
I am not sure I understood correctly what you want to do, but the code below works out how many measurements are missing for each day and appends that many extra rows to the bottom of your 'a' matrix, so you end up with the full 7200x3 matrix.
nbMissing = 7200 - size(a,1); % total number of rows to add
a1 = nan(nbMissing,3);
l = 0;
for i = 1:20
    nbMissing_i = 360 - sum(a(:,1)==i); % rows missing for day i
    a1(l+1:l+nbMissing_i,1) = i;
    l = l + nbMissing_i;
end
a_filled = [a; a1];
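Since the question asks for the final array to be in order from 1 to 20, one more step is needed after appending: sort the rows by the day column. Here is a toy-sized sketch of the whole thing, with made-up sizes (3 days, 2 samples per day) and my own variable names.

```matlab
perDay = 2;  nDays = 3;              % hypothetical small sizes; the question uses 360 and 20
a = [1 0.1 0.2
     3 0.3 0.4];                     % day 2 missing, days 1 and 3 one sample short

% Build the padding rows: day label in column 1, NaN elsewhere
a1 = nan(nDays*perDay - size(a,1), 3);
l = 0;
for d = 1:nDays
    nMissing = perDay - sum(a(:,1) == d);
    a1(l+1:l+nMissing, 1) = d;
    l = l + nMissing;
end

% Append the padding, then order everything by day number
a_sorted = sortrows([a; a1], 1);
```

After sorting, column 1 reads 1 1 2 2 3 3 and the four padded rows carry NaN in columns 2 and 3.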