Matlab cell array parsing - matlab

I have a large cell array and I'm trying to vectorize some string parsing. The cell is 100000 x 1 and looks like this:
data =
'"2016-07-27T14:18:08.519Z"'
'"2016-07-27T14:18:16.549Z"'
'"2016-07-27T14:18:21.544Z"'
'"2016-07-27T14:18:27.517Z"'
I want to parse this into two cell arrays that look like this:
date_str, which would look like this:
'2016-07-27'
'2016-07-27'
'2016-07-27'
'2016-07-27'
time_str, which would look like this:
'14:18:08.519'
'14:18:16.549'
'14:18:21.544'
'14:18:27.517'
I have looked at using cellfun(@strsplit,data), but it doesn't let me specify a delimiter for the "strsplit" function.

You can use regexprep to remove the double quotes (and the trailing 'Z', which does not appear in your desired output), and then regexp (with the 'split' option) to split at the desired character. I'm assuming the splitting criterion is simply the occurrence of 'T'.
data = regexprep(data, '^"|Z?"$',''); % remove double quotes and trailing 'Z'
result = regexp(data, 'T', 'split'); % split at 'T'
result = vertcat(result{:}); % un-nest cell array
date_str = result(:,1);
time_str = result(:,2);
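As for the original cellfun attempt: wrapping strsplit in an anonymous function is one way to pass it a delimiter. The following is a minimal sketch on a two-row sample (the sample data and the variable names clean and parts are illustrative, not from the question). It also strips the trailing 'Z' so that time_str matches the requested output, and it assumes every entry contains exactly one 'T'.

```matlab
% Two sample rows in the same format as the question.
data = {'"2016-07-27T14:18:08.519Z"'; '"2016-07-27T14:18:16.549Z"'};

% Strip the surrounding double quotes and the trailing 'Z'.
clean = regexprep(data, '^"|Z?"$', '');

% The anonymous function lets cellfun hand the delimiter to strsplit.
parts = cellfun(@(s) strsplit(s, 'T'), clean, 'UniformOutput', false);
parts = vertcat(parts{:});   % N-by-2 cell array (one 'T' per entry assumed)

date_str = parts(:, 1);
time_str = parts(:, 2);
```

For very large arrays the regexp version is likely faster, since cellfun with an anonymous function calls strsplit once per element.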

Related

Split a Cell Array

I have a 150X1 cell array. Within the array there are multiple data types. The first cell contains 0.9VA = 1.012207; the second: 0.9VA_CLK = 0.020752; and so on like this (for the most part). I would like to split the cell into two cells using the = as the delimiter. Thus, {1,1}: 0.9VA and {1,2}: 1.012207; {2,1}: 0.9VA_CLK and {2,2}: 0.020752; so on and so forth. I have tried converting them to strings and then using strsplit; however, I run into problems because the string arrays are variable in size.
If there is any other information that I can provide please let me know. Thank you for your help and time in advance.
You can indeed apply strsplit to each of the strings (char arrays) in the cell array. To do so, you can use cellfun:
c{1} = '0.9VA = 1.012207';
c{2} = '0.9VA_CLK = 0.020752';
c{3} = 'CSIPhgenSWoffList = [0, 0, 0, 0]';
c{4} = 'SomethingElse = [0.020752, 0.24564]';
c = cellfun(@(x)strsplit(x,'='),c,'UniformOutput',false);
c = cat(1,c{:});
I use a small example cell array c here, containing four strings; I hope this is representative. I apply strsplit to each cell in c using cellfun with the anonymous function @(x)strsplit(x,'='), which splits at the equal sign and returns a cell array of cell arrays. That is, each string in c is turned into a cell array with 2 strings (e.g. '0.9VA ' and ' 1.012207'). This does leave some spaces at the beginning and end of the strings.
The next line, cat, converts this cell array of cell arrays into a two-dimensional cell array. The final output is a cell array c containing the same number of rows as the original cell array, and with 2 columns. The first column corresponds to the part before the equal sign, the second column to the part after the equal sign.
To remove the spaces, you can use cellfun again, with strtrim:
c = cellfun(@strtrim,c,'UniformOutput',false);
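As a possible alternative, the split and the trim can be combined by letting a regular expression absorb the whitespace around the equal sign. This is only a sketch: it assumes each string contains exactly one '=', and the name parts is illustrative.

```matlab
% Small example cell array, as a column like the 150x1 array in the question.
c = {'0.9VA = 1.012207'; '0.9VA_CLK = 0.020752'};

% Split at '=' together with any surrounding whitespace.
parts = regexp(c, '\s*=\s*', 'split');

% Un-nest into an N-by-2 cell array (requires one '=' per string).
parts = vertcat(parts{:});
```

Here parts(:,1) holds the names and parts(:,2) the values, both still as char vectors.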

Converting cell array of string arrays to a double array

I have a 55X1 cell array. Each cell contains a 1X178 string array of numbers. I would like to convert all the cells to a double array, but in such a way that it forms a 55X178 double array.
Take, for example, the 55X1 cell array dataCellOut = {each cell has a 1X178 string}. I can use: na=str2num(dataCellOut{1}) and this will output a 1X178 double array. I have tried using: na=cellfun(@str2num, dataCellOut, 'UniformOutput', false) and this does not work (error: "input must be a character vector or string scalar"). I have worked on this for a while to no avail.
I hope this makes sense and if there is anything else that I can offer please don't hesitate to let me know. Thank you in advance!
According to the documentation for str2num:
The str2num function does not convert cell arrays or nonscalar string arrays, and is sensitive to spacing around + and - operators. In addition, str2num uses the eval function, which can cause unintended side effects when the input includes a function name. To avoid these issues, use str2double.
str2double, however, does just as you want:
X = str2double(str) converts the text in str to double precision values. [...] str can be a character vector, a cell array of character vectors, or a string array. [...] If str is [...] a string array, then X is a numeric array that is the same size as str.
Thus, this should work:
na = cellfun(@str2double, dataCellOut, 'UniformOutput', false);
na = cat(1,na{:});
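Which call works depends on what each cell actually holds, and the question leaves that somewhat open. The sketch below covers two assumed layouts on small made-up data: a string array with one number per element, and a single char vector of space-separated numbers. In the second case str2double would return NaN for the whole vector, so sscanf is used instead. String arrays with double quotes require R2017a or later.

```matlab
% Case 1: each cell holds a 1-by-N string array, one number per element.
cellOfStrings = {["1.5" "2" "3"]; ["4" "5" "6.25"]};
rows = cellfun(@str2double, cellOfStrings, 'UniformOutput', false);
naStrings = cat(1, rows{:});          % 2-by-3 double

% Case 2: each cell holds one char vector of space-separated numbers.
cellOfChars = {'1.5 2 3'; '4 5 6.25'};
rows = cellfun(@(s) sscanf(s, '%f').', cellOfChars, 'UniformOutput', false);
naChars = cat(1, rows{:});            % 2-by-3 double
```

Both variants avoid str2num and therefore eval. The concatenation only works if every row yields the same number of values.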
This simple statement works for me.
str2num(char(cellstr_array))

How to display selected entries of an array of structures in MATLAB

Suppose we have an array of structures. Each structure has the fields name, price and cost.
Suppose the array A has size n x 1. If I'd like to display the names of the 1st, 3rd and the 4th structure, I can use the command:
A([1,3,4]).name
The problem is that it prints the following thing on screen:
ans =
name_of_item_1
ans =
name_of_item_3
ans =
name_of_item_4
How can I remove those ans = things? I tried:
disp(A([1,3,4]).name);
only to get an error/warning.
By doing A([1,3,4]).name, you are returning a comma-separated list. This is equivalent to typing in the following in the MATLAB command prompt:
>> A(1).name, A(3).name, A(4).name
That's why you'll see the MATLAB command prompt give you ans = ... three times.
If you want to display all of the strings together, consider using strjoin to join all of the names together and we can separate the names by a comma. To do this, you'll have to place all of these in a cell array. Let's call this cell array names. As such, if we did this:
names = {A([1,3,4]).name};
This is the same as doing:
names = {A(1).name, A(3).name, A(4).name};
This will create a 1 x 3 cell array of names and we can use these names to join them together by separating them with a comma and a space:
names = {A([1,3,4]).name};
out = strjoin(names, ', ');
You can then show what this final string looks like:
disp(out);
You can use:
[A([1,3,4]).name]
which will, however, concatenate all of the names into a single string.
The better way is to make a cell array using:
{ A([1,3,4]).name }
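Both suggestions can be tried on a small struct array. The contents of A below are made up for illustration; only the field names come from the question.

```matlab
% Hypothetical struct array with the fields from the question.
A = struct('name',  {'apple', 'pear', 'plum', 'fig'}, ...
           'price', {1.20, 0.80, 2.50, 3.10}, ...
           'cost',  {0.70, 0.40, 1.90, 2.00});

% Collect the comma-separated list of names in a cell array.
names = {A([1,3,4]).name};

% Join into one line and display it without any "ans =".
out = strjoin(names, ', ');
disp(out);

% Alternatively, print one name per line; fprintf reuses the format
% for each element of the expanded cell array.
fprintf('%s\n', names{:});
```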

How to read data from a string

I have a string in the following format :
fileName.jpg,10,20,10,10,...,12,14,True
Basically, I have a string with comma separated values. The first value is a string, then it follows an array of 100 values and lastly another string being true or false.
Is there a way of directly reading these values into 3 variables: two strings and an array?
The array of values might contain n\a values, which I want to treat as -1 or something similar, or alternatively store in a cell array with an empty cell for those. Can you recommend something for this type of problem?
You can use textscan:
n = 100; % number of integers between filename and logical values
M = textscan(str, ['%s' repmat('%d',1, n) '%s'], 'delimiter', ',',...
'TreatAsEmpty', 'n\a', 'EmptyValue', -1, 'CollectOutput', true);
The result M is a cell array with the file name in the first cell, the 100 integer values in the second, and a string containing the logical value in the last cell.
You can use strsplit to extract the values from your string and store them in separate variables.
Code Sample:
a = strsplit("fileName.jpg,10,20,10,10,...,12,14,True",",")
fileName = a(1)
flag = a(end)
data = a(2:end-1)
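The strsplit approach can be extended to convert the middle fields to numbers and to handle the n\a entries. This is a sketch on a shortened, made-up input line; it relies on str2double returning NaN for any non-numeric text, and the variable names parts and values are illustrative.

```matlab
% Shortened example line with one missing value.
str = 'fileName.jpg,10,20,n\a,12,14,True';

parts = strsplit(str, ',');              % 1-by-7 cell array of char vectors

fileName = parts{1};                     % first field: file name
flag = strcmpi(parts{end}, 'true');      % last field as a logical

values = str2double(parts(2:end-1));     % 'n\a' becomes NaN
values(isnan(values)) = -1;              % replace missing entries with -1
```

Note that any other non-numeric field would also become -1 this way, so the NaN check may need to be stricter for untrusted input.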

Convert nonuniform cell array to numeric array

I am using xlsread in MATLAB to read in sheets from an Excel file. My goal is to have each column of the Excel sheet read as a numeric array. One of the columns has a mix of numbers and numbers+char. For example, the values could be 200, 300A, 450, 500A, 200A, 100. Here is what I have so far:
[num, txt, raw] = xlsread(fileIn, sheets{ii}); % Reading in each sheet from a for loop
myCol = raw(:, 4) % I want all rows of column 4
for kk=1:numel(myCol)
    if iscellstr(myCol(kk))
        myCol(kk) = (cellfun(@(x)strrep(x, 'A', ''), myCol(kk), 'UniformOutput', false));
    end
end
myCol = cell2mat(myCol);
This is able to strip off the char from the number but then I am left with:
myCol =
[200]
'300'
[450]
'500'
'200'
[100]
which errors out on cell2mat with:
cell2mat(myCol)
??? Error using ==> cell2mat at 46
All contents of the input cell array must be of the same data type.
I feel like I am probably mixing up () and {} somewhere. Can someone help me out with this?
Let me start by reading the file:
[num, txt, raw] = xlsread('test.xlsx');
myCol = raw(:, 4);
idx = cellfun(@ischar,myCol ); %# find strings
data = zeros(size(myCol)); %# preallocate matrix for numeric data
data(~idx) = cell2mat(myCol(~idx)); %# convert numeric data
data(idx) = str2double(regexprep(myCol(idx),'\D','')); %# remove non-digits and convert to numeric
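One caveat with the '\D' pattern is that it also removes decimal points, so a value such as '7.25B' would become 725. If the column can contain decimals, a pattern that keeps digits and dots may be a better fit. The following sketch uses made-up in-memory data instead of an Excel file and assumes the values are non-negative with at most one decimal point.

```matlab
% Mixed column with numeric entries and text entries carrying a letter suffix.
myCol = {200; '300A'; 12.5; '7.25B'};

idx = cellfun(@ischar, myCol);               % logical index of text entries
data = zeros(size(myCol));                   % preallocate numeric result
data(~idx) = cell2mat(myCol(~idx));          % copy the numeric entries

% Keep only digits and decimal points, then convert to double.
data(idx) = str2double(regexprep(myCol(idx), '[^0-9.]', ''));
```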
The variable myCol is initially a cell array containing both numbers and strings, something like this in your example:
myCol = {200; '300A'; 450; '500A'; '200A'; 100};
The steps you have to follow to convert the string entries into numeric values are:
Identify the cell entries in myCol that are strings. You can use a loop to do this, as in your example, or you can use the function CELLFUN to get a logical index like so:
index = cellfun(@ischar,myCol);
Remove the letters. If you know the letters to remove will always be 'A', as in your example, you can use a simple function like STRREP on all of your indexed cells like so:
strrep(myCol(index),'A','')
If you can have all sorts of other characters and letters in the string, then a function like REGEXPREP may work better for you. For your example, you could do this:
regexprep(myCol(index),'\D','')
Convert the strings of numbers to numeric values. You can do this for all of your indexed cells using the function STR2DOUBLE:
str2double(regexprep(myCol(index),'\D',''))
The final result of the above can then be combined with the original numeric values in myCol. Putting it all together, you get the following:
>> index = cellfun(@ischar,myCol);
>> result(index,1) = str2double(regexprep(myCol(index),'\D',''));
>> result(~index) = [myCol{~index}]
result =
200
300
450
500
200
100