I have a structure named sacfile that holds data for several stations (sta1-sta6). The sacfile is further broken up into day increments (sacfile.day, per station), and each day into hourly increments (sacfile.day.hour). I would like to loop through each day, and within each day through each hour, comparing stations: for example, for day 032, compare sta1 hr 1 with sta2 hr 1, sta3 hr 1, sta4 hr 1, sta5 hr 1 and sta6 hr 1, and so on through all the hours of that day, then move on to the next day. The stations are defined in sacfile.sta. Does anyone have a suggestion for how I can do this simply?
*I only want to loop through the same day and hour for the stations, then move onto the subsequent day and hour. I don't want to cross compare different days and hours. This is important for the loop.
I tried the following:
for i = 1:length(sacfile)
for j = 1:length(sacfile(i,1).day)
for h = 1:length(sacfile(i,1).day.hour)
but that seems to loop through every hour point. Will this work? How can I be sure it is looping through the correct days, i.e., that day 1 for sta1 is the same day as day 1 for sta2?
Here's an example of one of the structures:
name: '2013.032.00.00.00.0000.TA.POKR..BHE.sac'
date: '31-Mar-2014 12:25:33'
bytes: 11949036
isdir: 0
datenum: 7.3569e+05
net: 'TA'
sta: 'POKR'
loc: ''
comp: 'BHE'
day: [1x1 struct]
data: [2987101x1 double]
time: [1x2987101 double]
header: [1x1 struct]
The only relevant ones are net, sta, loc, comp, day and data. The net, sta, loc and comp fields are the key identifying fields for the file. The name is the name of the file. Day has the data broken up into hours within it. Make sense?
If I understood your problem well, the functions extractfield() and fieldnames() should help.
fields = fieldnames(sacfile);
for i = 1:numel(fields)
b = extractfield(sacfile.(fields{i}).day, 'day3');
c(i) = extractfield(b{1}.hour, 'hour_x');
end
The function extractfield() returns a 1x1 cell containing a structure rather than the structure itself, which is why I index it with b{1}.
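A plain nested loop may be easier to reason about here. The following is only a sketch: it assumes a hypothetical layout sacfile(s).day(d).hour(h).data in which every station uses the same day and hour indexing (the original post does not confirm this), builds dummy data, and uses corrcoef as a stand-in for whatever comparison is actually wanted.

```matlab
% Dummy data in an ASSUMED layout: sacfile(s).day(d).hour(h).data
nSta = 3; nDay = 2; nHour = 24;
for s = 1:nSta
    sacfile(s).sta = sprintf('sta%d', s);
    for d = 1:nDay
        for h = 1:nHour
            sacfile(s).day(d).hour(h).data = rand(10, 1);
        end
    end
end

pairs = nchoosek(1:nSta, 2);   % every unordered station pair, one per row
nCompared = 0;
for d = 1:nDay                 % the day is fixed first ...
    for h = 1:nHour            % ... then the hour ...
        for p = 1:size(pairs, 1)   % ... so only same-day, same-hour data meet
            a = sacfile(pairs(p, 1)).day(d).hour(h).data;
            b = sacfile(pairs(p, 2)).day(d).hour(h).data;
            c = corrcoef(a, b);    % placeholder comparison
            nCompared = nCompared + 1;
        end
    end
end
```

Because the station index is the innermost loop, different days or hours are never cross-compared. If the stations do not share the same day ordering, the day index would first have to be matched on an explicit day label rather than on position.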
I have some ECG data for a number of subjects. For each subject, I can export an Excel file with the RR interval, heart rate and other measures. The problem is that the timestamps start at the time of recording (in this case 11:22:03:000).
I need to compare the date with other subjects and I want to automate the procedure in Matlab.
I need to flexibly compare, for instance, the first 3 minutes of subjects in condition 1 with those of sbj in condition 2. Or minutes 4 to 8 of condition 1 and 2 and so forth. To do this, I am thinking that the best way is to shift the time vector for each subject so that it starts from 0.
There are a couple of problems to note: I CANNOT create just one vector for all subjects. This would be inaccurate because the heart measures are variable for each individual.
So, IN SHORT I need to shift the time vector for each participant so that it starts at 0 and increases exactly like the original one. So, in this example:
H: M: S: MS RR HR
11:22:03:000 0.809 74.1
11:22:03:092 0.803 74.7
11:22:03:895 0.768 78.1
11:22:04:663 0.732 81.9
11:22:05:395 0.715 83.9
11:22:06:110 0.693 86.5
11:22:06:803 0.705 85.1
11:22:07:508 0.706 84.9
11:22:08:214 0.749 80.1
11:22:08:963 0.762 78.7
11:22:09:725 0.766 78.3
would become:
00:00:00:000
00:00:00:092
00:00:00:895
00:00:01:663
and so forth...
I would like to do it in Matlab...
P.S.
I was working around the idea of extracting the info in 4 different variables.
Then, I could subtract the values for each cell from the first cell.
For instance:
11-11 = 0; 22-22=0; 03-03=0; ms: keep the same value
Maybe this could kind of work, except that it wouldn't if I have a subject that started, say, at 11:55:05:00
Thank you all for any help.
Gluce
Basic timestamp normalization just subtracts the minimum (or first, assuming they're properly ordered) time from the rest.
With MATLAB's datetime object, this is just subtraction, which yields a duration object:
ts = ["11:22:03:000", "11:22:03:092", "11:22:03:895", "11:22:04:663"];
% Convert to datetime & normalize
t = datetime(ts, 'InputFormat', 'HH:mm:ss:SSS');
t.Format = 'HH:mm:ss:SSS';
nt = t - t(1);
% Reformat & display
nt.Format = 'hh:mm:ss.SSS';
Which returns:
>> nt
nt =
1×4 duration array
00:00:00.000 00:00:00.092 00:00:00.895 00:00:01.663
Alternatively, you can normalize the datetime array itself:
ts = ["11:22:03:000", "11:22:03:092", "11:22:03:895", "11:22:04:663"];
t = datetime(ts, 'InputFormat', 'HH:mm:ss:SSS');
t.Format = 'HH:mm:ss:SSS';
[h, m, s] = hms(t);
[t.Hour, t.Minute, t.Second] = deal(h - h(1), m - m(1), s - s(1));
Which returns the same:
>> t
t =
1×4 datetime array
00:00:00:000 00:00:00:092 00:00:00:895 00:00:01:663
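Once the times are expressed as elapsed durations, selecting a window such as "the first 3 minutes" becomes a logical mask. A minimal sketch, assuming the RR values sit in a vector aligned with the timestamps (the variable names rr, elapsed and inWindow are illustrative, and a 1-second window is used only because the sample is so short):

```matlab
ts = ["11:22:03:000", "11:22:03:092", "11:22:03:895", "11:22:04:663"];
rr = [0.809 0.803 0.768 0.732];      % RR intervals from the example table

t = datetime(ts, 'InputFormat', 'HH:mm:ss:SSS');
elapsed = t - t(1);                  % duration array starting at zero

% Keep samples in [0 s, 1 s); use minutes(0) and minutes(3) for real data
inWindow = elapsed >= seconds(0) & elapsed < seconds(1);
meanRR = mean(rr(inWindow));
```

The same mask can be built per subject, so each subject keeps its own time vector and only the window boundaries are shared across conditions.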
I want to have a for-loop, which goes through each entry in a Time Series Object.
Below are the properties of my timeseries object, named ts.
Common Properties:
Name: 'unnamed'
Time: [70001x1 double]
TimeInfo: [1x1 tsdata.timemetadata]
Data: [70001x1 double]
DataInfo: [1x1 tsdata.datametadata]
How can I easily go through each time-value pair inside a for-loop? I want access to both the data value and the time value so that I can temporarily store them for calculations. I couldn't find the exact syntax for this in the documentation. I hope you can help me out!
For example, this is what I'm looking for, written in pseudo-code:
dataValue = ts(22).data (comment: data value from entry #22 of Time Series Object ts)
TimeValue = ts(22).time (comment: time value from entry #22 of Time Series Object ts)
Time series objects can be used directly; there is no need to copy them or their contents into other variables.
For example
x = rand(5,1);
ts = timeseries(x)
D = ts.data;
T = ts.time;
D and T are plain vectors, so an element can be accessed as D(3), or directly from the object as ts.data(3).
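Put into the for-loop the question asks for, this might look like the sketch below. It assumes a timeseries with one scalar sample per time step, as in the example above; the variable names are illustrative.

```matlab
x = rand(5, 1);
ts = timeseries(x);              % default time vector is 0, 1, 2, ...

runningSum = 0;
for k = 1:numel(ts.Time)
    timeValue = ts.Time(k);      % time of entry k
    dataValue = ts.Data(k);      % data of entry k
    runningSum = runningSum + dataValue;   % placeholder calculation
end
```

Indexing the Time and Data properties with the same k keeps each time-value pair together inside the loop.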
I am trying to find the indices in Date3 that match a given day. Date3 is a column vector of date numbers from 01/01/2008 to 01/31/2014, and that sequence is repeated many times. I basically want to organize idx into a cell array, idx{i}, in which each cell holds all the indices where Date3 equals one of the days between 01/01/2008 and 01/31/2014. Eventually, I want to pull out the data for each day by applying those indices to the variable Data2, so that instead of one long column vector of concentrations I have a cell array in which each cell is all the data from one day.
This is what I have been doing:
for day = datenum(2008,01,01):1:datenum(2014,01,31); % All the days under consideration
idx = find(Date3, day); % Index of where Date3 equals the day under consideration
Data_PM25 = Data2(idx); % Pull out the data based on the idx
end
Example:
If Date3 looks like the following (It's actually much larger and repeats many many more times)
733408
733409
733410
733411
733412
733413
733414
733415
733416
733417
733418
733419
733420
733421
733408
733409
733410
733411
733412
733413
733414
733415
733416
733417
733418
733419
733420
733421
I want idx to be
idx{1} = (1, 15) % Where 733408 repeats
idx{2} = (2, 16) % Where 733409 repeats
...
And then Data2, which looked like:
[NaN]
[NaN]
[NaN]
[NaN]
[NaN]
[NaN]
[NaN]
[NaN]
[NaN]
[NaN]
'25.8'
'26.1'
'28.9'
'37.5'
'25.2'
'20'
'32.3'
'41'
'46.7'
'28.2'
'34.5'
'31.8'
'37.6'
'45.5'
'54.9'
'54.8'
'36.3'
'18.5'
Will now look like
Data_PM25{1} = ([NaN], '25.2')
Data_PM25{2} = ([NaN], '20')
...
Of course, the actual outputs will be much longer than just two matches.
What appears to be happening though is that I am comparing every day against Date3, a list of days, so I am getting all the days back.
This question expands off of a previous question: Find where a value matches and concatentate into column vector MATLAB
I used find(ismember…) to find where a given day shows up in the long column of Date3, and then used that index to pull out all the data, Lat, and Lon for that day.
For some reason, group2cell created double the number of days there were supposed to be. It somehow pulled out two years, and the first year it pulled out (1:365 or 1:366) was random.
day = datenum(years(y), 01, 01):datenum(years(y), 12, 31); % Create a column vector of all days in one year
for d = 1:length(day)
% Find index where Date3 matches one day, one day at a time
ind{d} = find(ismember(datenum(Date3), day(d)) == 1);
data_O3{d} = Data2(ind{d});
end
How about this: The result should be a cell array of vectors, each vector corresponding to the data from a given day.
This of course assumes Date3 and Data2 are the same size
Date3=[733408
733409
733410
733411
733412
733413
733414
733415
733416
733417
733418
733419
733420
733421
733408
733409
733410
733411
733412
733413
733414
733415
733416
733417
733418
733419
733420
733421];
Data2 = Date3.*50; % //just some dummy data for testing
Data_pm25=cell(0);
for day=datenum(2008,01,01):datenum(2014,01,01)
idx=day==Date3;
Data_pm25 = [Data_pm25,Data2(idx)];
end
The only problem I can see with this is if you don't have data for every day, you wouldn't necessarily know which day each cell is representing. This could easily be solved by storing a vector of dates with data.
As I mentioned in your other question try GROUP2CELL function from FileExchange.
You can use it on index as
[newidx, uniqueDate] = group2cell(idx,Date3);
or directly on data
[Data_PM25, uniqueDate] = group2cell(Data2,Date3);
uniqueDate will be an array of all unique Dates. uniqueDate(i) will correspond to Data_PM25(i).
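If you would rather stay with built-in functions, the same grouping can be sketched with unique and accumarray. This is an alternative to the approaches above, not a description of how group2cell works, and it assumes Data2 is a numeric column the same size as Date3 (shortened dummy data here):

```matlab
Date3 = [733408; 733409; 733410; 733408; 733409; 733410];
Data2 = [10; 20; 30; 40; 50; 60];          % dummy values for testing

% g(k) is the group number of Date3(k); uniqueDays(g) reproduces Date3
[uniqueDays, ~, g] = unique(Date3);

% Collect the row numbers belonging to each day into one cell per day
idx = accumarray(g, (1:numel(Date3))', [], @(v) {sort(v)});

% Apply each index list to the data
Data_PM25 = cellfun(@(i) Data2(i), idx, 'UniformOutput', false);
```

Here idx{1} holds the rows where the first day occurs, and uniqueDays(i) is the date that idx{i} and Data_PM25{i} belong to, so days with no data simply do not appear.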
I simply want to generate a series of dates 1 year apart from today.
I tried this
CurveLength=30;
t=zeros(CurveLength);
t(1)=datestr(today);
x=2:CurveLength-1;
t=addtodate(t(1),x,'year');
I am getting two errors so far?
??? In an assignment A(I) = B, the number of elements in B and
Which I am guessing is related to the fact that the date is a string, but when I modified the string to be the same length as the date dd-mmm-yyyy i.e. 11 letters I still get the same error.
Lastly, I get the error
??? Error using ==> addtodate at 45
Quantity must be a numeric scalar.
Which seems to suggest that the function can't be vectorised? If this is true, is there any way to tell in advance which functions can be vectorised and which cannot?
To add n years to a date x, you do this:
y = addtodate(x, n, 'year');
However, addtodate requires the following:
x must be a scalar number, not a string.
n must be a scalar number, not a vector.
Hence the errors you get.
I suggest you use a loop to do this:
CurveLength = 30;
t = zeros(CurveLength, 1);
t(1) = today; % # Whatever today equals to...
for ii = 2:CurveLength
t(ii) = addtodate(t(1), ii - 1, 'year');
end
Now that you have all your date values, you can convert it to strings with:
datestr(t);
And here's a neat one-liner using arrayfun, producing the same CurveLength dates as the loop:
datestr(arrayfun(@(n)addtodate(today, n, 'year'), 0:CurveLength-1))
If your sequence has a constant, known start, you can use datenum in the following way:
t = datenum( startYear:endYear, 1, 1)
This works fine also with months, days, hours etc. as long as the sequence doesn't run into negative numbers (like 1:-1:-10). Then months and days behave in a non-standard way.
Here is a solution without a loop (possibly faster):
CurveLength=30;
t=datevec(repmat(now(),CurveLength,1));
x=[0:CurveLength-1]';
t(:,1)=t(:,1)+x;
t=datestr(t)
datevec splits the date into six columns [year, month, day, hour, min, sec]. So if you want to change e.g. the year you can just add or subtract from it.
If you want to change the month just add to t(:,2). You can even add numbers > 12 to the month and it will increase the year and month correctly if you transfer it back to a datenum or datestr.
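On releases that provide datetime (R2014b or later, which is an assumption about your installation), the same series can be sketched with calendar durations instead of date vectors:

```matlab
CurveLength = 30;

% Today at midnight, then today + 1 year, + 2 years, ...
t = datetime('today') + calyears(0:CurveLength-1)';

% Convert to text only when needed for display
labels = cellstr(datestr(t));
```

calyears adds whole calendar years, so leap years are handled by the calendar arithmetic rather than by adding a fixed number of days.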
So, I'm beginning to use timeseries in MATLAB and I'm kinda stuck.
I have a list of timestamps of events which I imported into MATLAB. It's now a 3000x25 array which looks like
2000-01-01T00:01:01+00:00
2000-01-01T00:01:02+00:00
2000-01-01T00:01:03+00:00
2000-01-01T00:01:04+00:00
As you can see, each event was recorded by date, hour, minute, second, etc.
Now, I would like to count the number of events by date, hour, etc. and then do various analyses (regression, etc.).
I considered creating a timeseries object for each day, but considering the size of the data, that's not practical.
Is there any way to manipulate this array such that we have "date: # of events"?
Perhaps there's just a simpler way to count events using timeseries?
As others have suggested, you should convert the string dates to serial date numbers. This makes it easy to work with the numeric data.
An efficient way to count number of events per interval (days, hours, minutes, etc...) is to use functions like HISTC and ACCUMARRAY. The process will involve manipulating the serial dates into units/format required by such functions (for example ACCUMARRAY requires integers, whereas HISTC needs to be given the bin edges to specify the ranges).
Here is a vectorized solution (no loop) that uses ACCUMARRAY to count the number of events. This is a very efficient function, even on large inputs. At the beginning I generate some sample data of 5000 timestamps unevenly spaced over a period of 4 days. You obviously want to replace it with your own:
%# let's generate some random timestamps between two points (unevenly spaced)
%# 5000 timestamps over a period of 4 days
dStart = datenum('2000-01-01'); % inclusive
dEnd = datenum('2000-01-5'); % exclusive
t = sort(dStart + (dEnd-dStart).*rand(5000,1));
%#disp( datestr(t) )
%# shift values, by using dStart as reference point
dRange = (dEnd-dStart);
tt = t - dStart;
%# number of events by day/hour/minute
numEventsDays = accumarray(fix(tt)+1, 1, [dRange*1 1]);
numEventsHours = accumarray(fix(tt*24)+1, 1, [dRange*24 1]);
numEventsMinutes = accumarray(fix(tt*24*60)+1, 1, [dRange*24*60 1]);
%# corresponding datetime range/interval label
days = cellstr(datestr(dStart:1:dEnd-1));
hours = cellstr(datestr(dStart:1/24:dEnd-1/24));
minutes = cellstr(datestr(dStart:1/24/60:dEnd-1/24/60));
%# display results
[days num2cell(numEventsDays)]
[hours num2cell(numEventsHours)]
[minutes num2cell(numEventsMinutes)]
Here is the output for the number of events per day:
'01-Jan-2000' [1271]
'02-Jan-2000' [1258]
'03-Jan-2000' [1243]
'04-Jan-2000' [1228]
And an extract of the number of events per hour:
'02-Jan-2000 09:00:00' [50]
'02-Jan-2000 10:00:00' [54]
'02-Jan-2000 11:00:00' [53]
'02-Jan-2000 12:00:00' [74]
'02-Jan-2000 13:00:00' [49]
'02-Jan-2000 14:00:00' [59]
similarly for minutes:
'03-Jan-2000 08:54:00' [1]
'03-Jan-2000 08:55:00' [1]
'03-Jan-2000 08:56:00' [1]
'03-Jan-2000 08:57:00' [0]
'03-Jan-2000 08:58:00' [0]
'03-Jan-2000 08:59:00' [0]
'03-Jan-2000 09:00:00' [1]
'03-Jan-2000 09:01:00' [2]
You can convert those timestamps to a number with datenum:
A serial date number represents the whole and fractional number of days from a specific date and time, where datenum('Jan-1-0000 00:00:00') returns the number 1. (The year 0000 is merely a reference point and is not intended to be interpreted as a real year in time.)
This way, it's easier to check where a period starts and ends. E.g., the week you're looking for starts at x and ends at x+6.999...; all you have to do to find events in that period is check whether the datenum value is at least x and less than x+7:
week_x_events = find(dn_timestamp>=x & dn_timestamp<x+7)
The difficulty is in converting your timestamp to datenum acceptable format, which is doable using regexp, good luck!
I don't know what +00:00 means (maybe time zone?), but you can simply convert your string timestamps into numerical format:
>> t = datenum('2000-01-01T00:01:04+00:00', 'yyyy-mm-ddTHH:MM:SS')
t =
7.3049e+005
>> datestr(t)
ans =
01-Jan-2000 00:01:04
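With datetime the trailing +00:00 can be parsed as a UTC offset instead of being ignored, and the events can then be counted per hour. This is a sketch that assumes the timestamps are rows of a char array, as described in the question, and uses four made-up rows:

```matlab
stamps = ['2000-01-01T00:01:01+00:00'
          '2000-01-01T00:01:02+00:00'
          '2000-01-01T01:30:00+00:00'
          '2000-01-02T00:00:05+00:00'];

% XXX matches an ISO 8601 offset such as +00:00; quotes escape the literal T
t = datetime(cellstr(stamps), ...
    'InputFormat', 'yyyy-MM-dd''T''HH:mm:ssXXX', 'TimeZone', 'UTC');

% Round each event down to the start of its hour, then count per hour
hourBin = dateshift(t, 'start', 'hour');
[uniqueHours, ~, g] = unique(hourBin);
counts = accumarray(g, 1);
```

Replacing 'hour' with 'day' or 'minute' in dateshift gives counts at the other resolutions; only the hours that actually contain events appear in uniqueHours.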