I have a lot of dates in MATLAB (over 2 million). All of these dates are in a cell array in 'yyyymmdd' format, and I want to convert them to 'yyyy-mm-dd' format and store the result in a cell array (not in a char matrix).
I know that I can use
temp = datestr(datenum(datesArray,'yyyymmdd'),'yyyy-mm-dd'),
and then use
mat2cell(temp, ones(1,n),10),
where n is the number of rows of datesArray (in this case approximately 2 million) in order to get my result, but this approach is very slow.
So I would like to know a faster way to do this.
Regards.
You could avoid for loops by using cellfun. Let's say your date cell array is
dates = {'20120101', '20120102', '20120103'}
You can then convert them to your format as
cellfun(@(x)[x(1:4),'-',x(5:6),'-',x(7:8)], dates, 'Uniform', false)
Hope that helps.
If your date format is always "yyyymmdd" and it's in a linear cell array called datesArray, you could maybe do it by accessing the strings in datesArray and transforming them by inserting hyphens and concatenating the string.
newDatesArray = cell(size(datesArray)); % preallocate the output
for i=1:length(datesArray)
newDatesArray{i} = [datesArray{i}(1:4), '-', datesArray{i}(5:6), '-', datesArray{i}(7:8)];
end
Convert your dates to serial date numbers and keep them that way! However, here's a solution:
% Create dummy dates (takes 10 seconds on my pc)
tic;d = cellstr(datestr(now-2e5+1:now,'yyyymmdd'));toc
% Convert to char, then concatenate with '-' and back to `cellstr()` (1 sec):
c = char(d);
dash = repmat('-',2e5,1);
c = cellstr([c(:,1:4) dash c(:,5:6) dash c(:,7:8)]);
Here is my solution, which I think is quite nice!
dates = {'20120101', '20120102', '20120103'}
You can convert them using this:
cellfun(@(x)regexprep(num2str(x), '(?<=\d{4})\d{2}', '-$0'),dates,'Uniform',false)
The answer is similar to radarhead's, but it uses the regexprep function instead.
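Since regexprep accepts a cell array of strings directly, the cellfun wrapper can probably be dropped altogether. A minimal sketch, using the same three sample dates and assuming every entry is exactly eight digits (the variable name newDates is only illustrative):

```matlab
% Sample input in 'yyyymmdd' format
dates = {'20120101', '20120102', '20120103'};

% Capture year, month and day as tokens and rejoin them with hyphens.
% regexprep operates on every cell and returns a cell array of the same size.
newDates = regexprep(dates, '^(\d{4})(\d{2})(\d{2})$', '$1-$2-$3');
```

Whether this beats the char-matrix approach on 2 million entries would need to be timed with tic/toc on the real data.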
Related
Suppose that I have a cell array of strings holding the heights of a group of people
height_str ={'1.76000000000000';
'1.55000000000000';
'1.61000000000000';
'1.71000000000000';
'1.74000000000000';
'1.79000000000000';
'1.74000000000000';
'1.86000000000000';
'1.72000000000000';
'1.82000000000000';
'1.72000000000000';
'1.63000000000000'}
and a single height value.
height_val = 177;
I would like to find the indices of the people whose height is within height_val ± 3 cm.
To find the exact match I would do this:
[idx_height,~]=find(ismember(cell2mat(height_str),height_val/100));
How can I include the matches in the previous range (174-180)?
idx_height should be = [1 5 6 7]
You can convert your strings into a numeric array (as @Divakar mentioned) by
height = str2num(char(height_str))*100; % in cm
Then just
idx_height = find(height>=height_val-3 & height<=height_val+3);
Assuming that the heights are always given to a precision of 0.01 m (that is, whole centimetres), you can use a combination of str2double and ismember for a one-liner -
idx_height = find(ismember(str2double(height_str)*100,[height_val-3:height_val+3]))
The magic with str2double is that it works directly with cell arrays to get us a numeric array without resorting to a combined effort of converting that cell array to a char array and then to a numeric array.
After the use of str2double, we can use ismember as you tried in your problem to get us the matches as a logical array, whose indices are picked up with find. That's the whole story really.
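One caveat: ismember tests for exact equality, and multiplying a parsed decimal by 100 is not guaranteed to give an exact integer in floating point. A hedged variant, assuming the heights really are whole centimetres, rounds first and then compares against a tolerance (tol and height_cm are names introduced here for illustration):

```matlab
height_str = {'1.76000000000000'; '1.55000000000000'; '1.61000000000000'; ...
              '1.71000000000000'; '1.74000000000000'; '1.79000000000000'; ...
              '1.74000000000000'; '1.86000000000000'; '1.72000000000000'; ...
              '1.82000000000000'; '1.72000000000000'; '1.63000000000000'};
height_val = 177;   % target height in cm
tol = 3;            % accepted deviation in cm

% Parse to metres, convert to cm, and round away floating-point noise
height_cm = round(str2double(height_str) * 100);

% Indices of everyone within height_val +/- tol
idx_height = find(abs(height_cm - height_val) <= tol);
```

For the sample data this selects entries 1, 5, 6 and 7, returned as a column vector.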
Late addition, but for binning my first choice would be to go with bsxfun and logical operations:
idx_height = find(bsxfun(@le,str2double(height_str)*100,height_val+3) & ...
bsxfun(@ge,str2double(height_str)*100,height_val-3))
I'm trying to loop through an array of dates/times in MATLAB, split each entry using regexp with the following delimiters ('/' or ':' or '.'), and store each part separately as year, day, hour, min, sec, ss, respectively. Ultimately I'm trying to turn this array of Julian dates and times into a plot-able format in MATLAB. So far I've been able to loop through my array called 'time' and create a new 1x6 cell called 'clean2_time' which splits each row into 6 columns (year, day, hour, min, sec, ss) based on the delimiters '/' ':' and '.'. My issue is that the loop overwrites 'clean2_time' every iteration and I am left with only the final 1x6 time stamp for the last row. I have tried creating a new variable of all zeros 'z' and setting 'clean2_time' equal to z but have had no luck.
Sample of 'time':
'2013/231/21:38:09.856619'
'2013/231/21:38:09.955640'
'2013/231/21:38:10.156685'
'2013/231/21:38:10.356550'
'2013/231/21:38:10.556770'
'2013/231/21:38:10.756565'
'2013/231/21:38:10.955627'
'2013/231/21:38:11.256588'
'2013/231/21:38:11.556649'
'2013/231/21:38:11.955597'
'2013/231/21:38:12.356627'
'2013/231/21:38:12.856557'
'2013/231/21:38:13.356558'
'2013/231/21:38:14.156530'
'2013/231/21:38:14.970500'
'2013/231/21:38:16.256545'
'2013/231/21:38:16.266736'
'2013/231/21:38:18.156398'
Code I've tried so far:
z=zeros(size(time,1),6);
for i = 1:size(time,1) % for i = 1 to 5922
clean2_time = regexp(time{i,1}, '[/:.]', 'split');
z{i,1} = clean2_time(i,1)
z{i,2} = clean2_time(i,2)
z{i,3} = clean2_time(i,3)
z{i,4} = clean2_time(i,4)
z{i,5} = clean2_time(i,5)
z{i,6} = clean2_time(i,6)
end
You are on the right track; however, you don't need the for loop.
Simply doing this would suffice:
clean2_time=regexp(time, '[/:.]', 'split');
Then clean2_time is a cell array in which every row contains another 1x6 cell array. You can then access the different values with: clean2_time{row}{column}. If you really want clean2_time to be an nx6 numerical matrix instead of this cell array of strings, simply use this to convert it:
clean2_time=cellfun(@str2num,vertcat(clean2_time{:}))
clean2_time=cell(size(time,1),6); % preallocate a cell array, since regexp returns cells
for i = 1:size(time,1) % for i = 1 to 5922
clean2_time(i,:)=regexp(time{i,1}, '[/:.]', 'split');
end
clean2_time(i,:) indexes the i-th row of the cell array.
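To get from the split fields to something plot-able, one option is to build serial date numbers with datenum. The following is only a sketch under two assumptions: the second field is the day of the year, and the seconds are kept as one fractional field by splitting on '/' and ':' only. It relies on datenum rolling day numbers beyond the end of January over into later months. The names parts, vals and t are introduced here for illustration.

```matlab
% Two sample timestamps in 'year/day-of-year/HH:MM:SS.ffffff' form
time = {'2013/231/21:38:09.856619'; '2013/231/21:38:09.955640'};

% Split on '/' and ':' only, so the fractional seconds stay in one field
parts = regexp(time, '[/:]', 'split');   % n-by-1 cell of 1-by-5 cells
parts = vertcat(parts{:});               % n-by-5 cell of strings
vals  = str2double(parts);               % n-by-5: [year doy hour min sec]

% Treat day-of-year as a day count in January; datenum carries the overflow
t = datenum(vals(:,1), 1, vals(:,2), vals(:,3), vals(:,4), vals(:,5));
```

The resulting t can be passed to plot and labelled with datetick.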
I need to convert a date and time into a numerical value. For example:
>> num = datenum('2011-05-07 11:52:23')
num =
7.3463e+05
How would I write a script to do this for numerous values without inputting the date and time manually?
You can store your date strings first in a cell array (or a matrix, provided they are of fixed format), and feed it straight to datenum. For example:
C = {'2011-05-07 11:52:23'
'2011-03-01 20:30:01'};
vals = datenum(C)
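When all strings share one known layout, passing the format explicitly avoids making datenum guess it, which is usually safer and often faster for large inputs. A minimal sketch with the same two sample strings (the format string is an assumption based on the example shown):

```matlab
C = {'2011-05-07 11:52:23'
     '2011-03-01 20:30:01'};

% Tell datenum exactly how the strings are laid out
vals = datenum(C, 'yyyy-mm-dd HH:MM:SS');
```

vals is a 2-by-1 column of serial date numbers; use format long to display more digits than the default 7.3463e+05.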
I simply want to generate a series of dates 1 year apart from today.
I tried this
CurveLength=30;
t=zeros(CurveLength);
t(1)=datestr(today);
x=2:CurveLength-1;
t=addtodate(t(1),x,'year');
I am getting two errors so far.
??? In an assignment A(I) = B, the number of elements in B and
Which I am guessing is related to the fact that the date is a string, but when I modified the string to be the same length as the date dd-mmm-yyyy i.e. 11 letters I still get the same error.
Lastly, I get the error
??? Error using ==> addtodate at 45
Quantity must be a numeric scalar.
Which seems to suggest that the function can't be vectorised? If this is true, is there any way to tell in advance which functions can be vectorised and which cannot?
To add n years to a date x, you do this:
y = addtodate(x, n, 'year');
However, addtodate requires the following:
x must be a scalar number, not a string.
n must be a scalar number, not a vector.
Hence the errors you get.
I suggest you use a loop to do this:
CurveLength = 30;
t = zeros(CurveLength, 1);
t(1) = today; % today's serial date number
for ii = 2:CurveLength
t(ii) = addtodate(t(1), ii - 1, 'year');
end
Now that you have all your date values, you can convert them to strings with:
datestr(t);
And here's a neat one-liner using arrayfun:
datestr(arrayfun(@(n)addtodate(today, n, 'year'), 0:CurveLength-1))
If your sequence has a constant known start, you can use datenum in the following way:
t = datenum( startYear:endYear, 1, 1)
This works fine also with months, days, hours etc. as long as the sequence doesn't run into negative numbers (like 1:-1:-10). Then months and days behave in a non-standard way.
Here is a solution without a loop (possibly faster):
CurveLength=30;
t=datevec(repmat(now(),CurveLength,1));
x=[0:CurveLength-1]';
t(:,1)=t(:,1)+x;
t=datestr(t)
datevec splits the date into six columns [year, month, day, hour, min, sec]. So if you want to change e.g. the year you can just add or subtract from it.
If you want to change the month just add to t(:,2). You can even add numbers > 12 to the month and it will increase the year and month correctly if you transfer it back to a datenum or datestr.
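The same idea can be written a little more compactly by letting datenum expand a vector of years itself. This is a sketch with a fixed, assumed start date so the result is reproducible; in practice startDate would be now or today. The names startDate, y0, m0, d0 and dateStrings are illustrative.

```matlab
CurveLength = 30;
startDate = datenum(2012, 3, 15);       % assumed start date for illustration

% Split the start date into year, month and day
[y0, m0, d0] = datevec(startDate);

% One serial date number per year; the scalar month and day are expanded
t = datenum(y0 + (0:CurveLength-1).', m0, d0);

% Convert to a 30-by-10 char matrix of 'yyyy-mm-dd' strings
dateStrings = datestr(t, 'yyyy-mm-dd');
```

Note that a start date of 29 February rolls over to 1 March in non-leap years, which may or may not be what you want.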
I have a cell array of dates, as shown below. How can I extract the year, i.e. the last 4 digits of each string? Could anyone show me how to locate the year in the string? Thank you!
'31.12.2001'
'31.12.2000'
'31.12.2004'
'31.12.2003'
'31.12.2002'
'31.12.2000'
'31.12.1999'
'31.12.1998'
'31.12.1997'
'31.12.2005'
'31.12.2004'
'31.12.2003'
'31.12.2002'
'31.12.2001'
'31.12.2000'
'31.12.1999'
'31.12.1998'
'31.12.2005'
'31.12.2004'
'31.12.2003'
'31.12.2002'
'31.12.2005'
Example cell array:
A = {'31.12.2001'; '31.12.2002'; '31.12.2003'};
Apply some regular expressions:
B = regexp(A, '\d\d\d\d', 'match')
B = [B{:}];
EDIT: I never realized that matlab will "nest" an extra layer of cells until I tested this. I don't like this solution as much now that I know the second line is necessary. Here is an alternative approach that gets you the years in numeric form:
C = datevec(A, 'dd.mm.yyyy');
C = C(:, 1);
SECOND EDIT: Surprisingly, if your cell array has fewer than 10000 elements, the regexp approach is faster on my machine. But its output is another cell array (which takes up much more memory than a numeric matrix). You can use B = cell2mat(B) to get a character array instead, but this brings the two approaches to approximately equal efficiency.
Just to add a fun answer, designed to take the OP to the stranger regions of Matlab:
D = char(C); % C is the cell array of date strings; D is an n-by-10 char matrix
y = (D(:,7:end)-'0') * 10.^(3:-1:0).'
which is an order of magnitude faster than anything posted in the other answers :)
Or, to stay a bit closer to home,
y = cellfun(@(x)str2double(x(7:end)),C);
or, yet another regexp variation:
y = str2num(char(regexprep(C, '\d+\.\d+\.','')));
Assuming your matrix with dates is M or a cell array C:
In case your data is in a cell array start with
M = cell2mat(C)
Then get the relevant part
Y=M(:,end-3:end)
If required you can even make the year a number
Year = str2num(Y)
Using regexp, this also works with dates in slightly different formats, like 1.1.2000, which can throw off fixed offsets
res = regexp(dates, '(?<=\d+\.\d+\.)\d+', 'match')
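If the nested cells returned by 'match' are a nuisance, the 'once' option makes regexp return one string per input cell, which str2double can then turn into numbers. A minimal sketch, assuming the year is always the last four characters of each string (yearStr and years are names introduced here):

```matlab
% Sample dates, including one with single-digit day and month
A = {'31.12.2001'; '31.12.2002'; '1.1.2003'};

% With 'once', each cell holds the matched string itself, not a nested cell
yearStr = regexp(A, '\d{4}$', 'match', 'once');

% Convert the cell array of strings to a numeric column vector
years = str2double(yearStr);
```

For this input, years is the column vector [2001; 2002; 2003].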