Convert a column of a table to vector - matlab

I have a table named cellName of size 10000*1. Each entry is a character string holding a cell name, and the names have different lengths.
I want to coerce it into a vector with 10000 elements. How can I do that in MATLAB? It should be as easy as it is in R, but I couldn't find such a command in MATLAB.
Or: I used readtable to load the 10000*1 table from a CSV file at the very beginning. It would be great if I could read the 10000*1 entries directly as a single vector too. This is what I did at first:
cellName = readtable('cell.csv');
cellName=cellName(1:10000,1);
Thank you in advance!
Clear example: A is a table of 5*1.
A= apple
banana
pear
peach
watermelon
And I want to coerce A into a vector of 5 elements, A=[apple,banana,pear,peach,watermelon], instead of a table.

If a cell array of character strings is what you want, I may have your answer. I also suggest reading up on how to access data in a table.
readtable returns a MATLAB table on success. A table column can be accessed like a struct field, using the column name as the field name, or indexed with the {} operator as you would index a cell array.
In your example, assume A is the return value of readtable and is equivalent to
A = table({'apple','banana','pear','peach','watermelon'}','variableNames',{'cellName'})
Then you can either call
cellName = A.cellName
or
cellName = A{:,1}
to get your cell array.
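Here is a minimal sketch of the two access styles side by side. The table is built by hand as a stand-in for what readtable would return, and the names byName and byIndex are made up for illustration.

```matlab
% Hypothetical stand-in for the table that readtable would return.
A = table({'apple'; 'banana'; 'pear'; 'peach'; 'watermelon'}, ...
    'VariableNames', {'cellName'});

byName  = A.cellName;   % dot indexing with the column name
byIndex = A{:, 1};      % brace indexing with row and column subscripts

% Both forms should give the same 5x1 cell array of character vectors.
disp(iscellstr(byName))
disp(isequal(byName, byIndex))
```

With the real file, the table(...) call would be replaced by readtable('cell.csv'), and the column name would be whatever header the CSV provides.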

Related

How to place columns of data set from excel file into numeric arrays

I have an excel file that I grab by:
ds = dataset('XLSFile',fullfile('file path here', 'waterReal.xlsx'))
It looks like this:
I want each column in its own numeric array though! Like how when I load an example dataset: load carsmall, I get a bunch of individual numeric arrays. But I can't figure out how to do that.
I can do this individually by writing:
A = ds.TEMP, B = ds.PROD, ...
But what if I had a BIG Excel file? What then?
You can convert a dataset to a struct or a cell like this:
To struct:
s = dataset2struct(ds, 'AsScalar',true)
To cell:
fnames = fieldnames(ds);
c = cell(1, numel(fnames));
for i = 1:numel(fnames)
c{i} = ds.(fnames{i});
end
By the way: use tables instead of datasets. They're newer and better. Use the readtable function to read your Excel file into a table. Tables are convenient enough that you might not bother converting them into a simpler cell array, because you can just grab a column with t{:,i}, where t is your table and i is the index of the column you want.
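For the table route, a rough equivalent of the two conversions above might look like this. The table contents are invented for illustration (only the TEMP and PROD names come from the question), and in practice t would come from readtable.

```matlab
% Hypothetical data standing in for the Excel file contents.
t = table([20; 25; 30], [1.5; 2.5; 3.5], ...
    'VariableNames', {'TEMP', 'PROD'});

% To a scalar struct: one field per column.
s = table2struct(t, 'ToScalar', true);

% To a cell array: one cell per column, however many columns there are.
names = t.Properties.VariableNames;
c = cell(1, numel(names));
for i = 1:numel(names)
    c{i} = t{:, i};
end
```

Either way the loop scales with the number of columns, so nothing has to be written out by hand for a big file.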

Get a whole column in a struct field in MATLAB

I have a struct in MATLAB with the size 46x6, the fields are:
name, folder, date, bytes, isdir, datenum
Now I want all 46 entries of name. However, the MATLAB function getfield(structname, 'name') only returns the first entry.
How can I get all elements of the struct?
Name holds strings
If you want the results as a cell array you can call {structname(:).name}.
To return an array you can call [structname(:).name]. Note that for character names this concatenates them all into a single character vector.
First I had to convert the struct to a cell array, and then access it with round brackets:
tmp = struct2cell(mystruct)
tmp(1,:)
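% Names are strings of different lengths, so collect them in a cell array
name = cell(1, numel(structname));
for i = 1:numel(structname)
name{i} = structname(i).name;
end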
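To see the difference between the {} and [] forms, here is a small sketch on a made-up struct array. The variable files and its contents are hypothetical; a real one would typically come from dir.

```matlab
% Hypothetical 1x3 struct array, similar in shape to the output of dir.
files = struct('name',  {'a.txt', 'bb.txt', 'ccc.txt'}, ...
               'bytes', {10, 20, 30});

names  = {files.name};    % cell array, one name per cell
sizes  = [files.bytes];   % numeric row vector
joined = [files.name];    % one character vector, all names run together
```

So the brace form is the one to use for string fields, and the bracket form suits numeric fields such as bytes.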

Create structure fieldnames from array of numbers

I have a dataset that I would like to categorise and store in a structure based on the value in one column of the dataset. For example, the data can be categorised into element 'label_100', 'label_200' or 'label_300' as I attempt below:
%The labels I would like are based on the dataset
example_data = [repmat(100,1,100),repmat(200,1,100),repmat(300,1,100)];
data_names = unique(example_data);
%create a cell array of strings for the structure fieldnames
for i = 1:length(data_names)
cell_data_names{i}=sprintf('label_%d', data_names(i));
end
%create a cell array of data (just 0's for now)
others = num2cell(zeros(size(cell_data_names)));
%try and create the structure
data = struct(cell_data_names{:},others{:})
This fails and I get the following error message:
"Error using struct
Field names must be strings."
(Also, is there a more direct method to achieve what I am trying to do above?)
According to the documentation of struct,
S = struct('field1',VALUES1,'field2',VALUES2,...) creates a
structure array with the specified fields and values.
So you need to have each value right after its field name. The way you are calling struct now is
S = struct('field1','field2',VALUES1,VALUES2,...)
instead of the correct
S = struct('field1',VALUES1,'field2',VALUES2,...).
You can solve that by concatenating cell_data_names and others vertically and then using {:} to produce a comma-separated list. This will give the cells' contents in column-major order, so each field name will be immediately followed by the corresponding value:
cell_data_names_others = [cell_data_names; others]
data = struct(cell_data_names_others{:})
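As for a more direct method, one possibility is sketched below on a shortened version of the question's data. It builds the names with arrayfun instead of a loop and compares the interleaving trick with cell2struct; the variable names here are my own, not from the question.

```matlab
labels = [100 200 300];   % stands in for unique(example_data)

% One field name per label, built without an explicit loop.
fieldNames = arrayfun(@(v) sprintf('label_%d', v), labels, ...
    'UniformOutput', false);
values = num2cell(zeros(size(fieldNames)));

% Interleave names and values: the 2x3 cell unrolls column by column.
pairs = [fieldNames; values];
data  = struct(pairs{:});

% Alternative: cell2struct pairs each value with its field name directly.
data2 = cell2struct(values, fieldNames, 2);
```

Both calls should produce the same 1x1 struct with fields label_100, label_200 and label_300.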

Matlab: How do you separate text in existing cell

I'm a bit new to the matlab world, and I'm running into an issue that I'm sure has an easy solution.
I've imported some data from a text file and parsed out the headers, which resulted in a 1x35 cell called Data. In each cell (for example Data{1,1,1}) is data that looks like:
'600000 -947.772827 -107.045776 -70.818062'
'600001 -920.431396 -86.098122 -56.485119'
'600002 -878.332886 -88.673630 -85.249130'
'600003 -851.637695 -68.546539 -96.691711'
'600004 -834.707642 -28.951260 -73.218872'
'600005 -783.431580 40.657402 24.242268'
The problem is, each line is contained in a single column. I'd like to parse it out so that I have 4 columns instead of one.
I tried parsing out the Data cell even further using:
textscan(Data{1,1,1}, '%u%f10%f10%f10', 1)
But it resulted in the following error:
Error using textscan
First input must be of type double or string.
Can I use textscan this way, or do I need to use some other method to break out the text?
With textscan, the first input can only be a single string or a single number (a file identifier). I suspect your input is a 6 x 1 cell array of strings. As such, you have no choice but to iterate over the cells and convert the contents of each one with textscan. Also, get rid of the 10 after each %f, as it's actually screwing up where the string gets parsed. Finally, read the first number as a double (%f) as opposed to an unsigned integer (%u) to allow for easier conversion.
Therefore, do something like this:
>> Data{1,1,1} = {'600000 -947.772827 -107.045776 -70.818062'
'600001 -920.431396 -86.098122 -56.485119'
'600002 -878.332886 -88.673630 -85.249130'
'600003 -851.637695 -68.546539 -96.691711'
'600004 -834.707642 -28.951260 -73.218872'
'600005 -783.431580 40.657402 24.242268'};
>> format long g;
>> vals = cell2mat(cellfun(@(x) cell2mat(textscan(x, '%f%f%f%f', 1)), Data{1,1,1}, 'uni', 0))
vals =
Columns 1 through 3
600000 -947.772827 -107.045776
600001 -920.431396 -86.098122
600002 -878.332886 -88.67363
600003 -851.637695 -68.546539
600004 -834.707642 -28.95126
600005 -783.43158 40.657402
Column 4
-70.818062
-56.485119
-85.24913
-96.691711
-73.218872
24.242268
That statement vals = ... is quite a mouthful, but easy to explain. Start with this statement:
cell2mat(textscan(x, '%f%f%f%f', 1))
For a given cell x in Data{1,1,1}, we want to parse out four numbers for each string that is stored in x. textscan will place these numbers as individual cell elements into a cell array. We want to convert each element into a numeric array, and so cell2mat is required for us to do so.
In order to operate over all of the elements in Data{1,1,1}, we need to use cellfun to allow us to do so:
cellfun(@(x) cell2mat(textscan(x, '%f%f%f%f', 1)), Data{1,1,1}, 'uni', 0)
The first input is a function that operates on each cell stored in Data{1,1,1} (the second input). We are basically telling cellfun that we want to operate on each cell in the cell array stored in Data{1,1,1} in the way I talked about before. This function has input parameter x, which is one cell from Data{1,1,1}. Now, the uni flag is set to 0 because the output of cellfun will not be a single number, but an array of numbers - one array per line that you have in your cell array. The output of this stage would be a 6 element cell array where each location is a 4 element numeric array. To finish it off, we call cell2mat on this output to finally convert our text into a 2D matrix and therefore:
vals = cell2mat(cellfun(@(x) cell2mat(textscan(x, '%f%f%f%f', 1)), Data{1,1,1}, 'uni', 0))
format long g allows for better display formatting so we can see both the dominant number as well as the floating point numbers neatly.
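If the nested textscan call feels heavy, a similar result can probably be had with sscanf. This is a swapped-in technique rather than the method above, shown on just the first two lines of the sample data; rawLines, rows and vals are illustrative names.

```matlab
% Two of the sample lines, stored as a 2x1 cell array of strings.
rawLines = {'600000 -947.772827 -107.045776 -70.818062'
            '600001 -920.431396 -86.098122 -56.485119'};

% sscanf returns a column vector per line; transpose it into a row.
rows = cellfun(@(x) sscanf(x, '%f').', rawLines, 'UniformOutput', false);

% Stack the 1x4 rows into a numeric matrix with one row per line.
vals = cell2mat(rows);
```

This assumes every line holds the same count of numbers; otherwise cell2mat will refuse to stack the rows.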

Scanning data from cell array and removing based on file extensions

I have a cell array that is a list of file names. I transposed them because I find that easier to work with. Now I am attempting to go through each line in each cell and remove the lines based on their file extension. Eventually, I want to use this list as file names to import data from. This is how I transpose the list
for i = 1:numel(F);
a = F(1,i);
b{i} = [a{:}'];
end;
The code I am using to try and read the data in each cell keeps giving me the error input must be of type double or string. Any ideas?
for i = 1:numel(b);
for k = 1:numel(b{1,i});
b(cellfun(textscan(b{1,i}(k,1),'%s.lbl',numel(b)),b))=[];
end;
end;
Thanks in advance.
EDIT: This is for MATLAB. Should have been clear on that. Thanks Brian.
EDIT2: whos for F is
Name Size Bytes Class Attributes
b 1x11 13986188 cell
while for a is
Name Size Bytes Class Attributes
a 1x1 118408 cell
From your description I am not certain how your F array looks, but assuming
F = {'file1.ext1', 'file2.ext2', 'file3.ext2', 'file2.ext1'};
you could remove all files ending with .ext2 like this:
F = F(cellfun('isempty', regexpi(F, '\.ext2$')));
regexpi, which operates on each element in the cell array, returns [] for all files not matching the expression. The cellfun call converts the cell array to a logical array with false at positions corresponding to files ending with .ext2 and true for all others. The resulting array may be used as a logical index into F that returns the files that should be kept.
You're using cellfun wrong. Its signature is [A1,...,Am] = cellfun(func,C1,...,Cn). It takes a function as first argument, but you're passing it the result of textscan, which is a cell array of the matching strings. The second argument is a cell array as it should be, but it doesn't make sense to call it over and over in a loop. cellfun's job is to write the loop for you when you want to do the same thing to every cell in a cell array.
Instead of parsing the filename yourself with textscan, I suggest you use fileparts.
Since you're already looping over the cell array in the transpose step, it might make sense to do the filtering there. It might look something like this:
for i = 1:numel(F);
a = F(1,i);
[~,~,ext] = fileparts(a{:});
if strcmpi(ext, '.lbl')
b{i} = [a{:}'];
end
end;
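If F turns out to be a flat cell array of file names, the loop could likely be dropped altogether. The sketch below assumes exactly that layout, with invented file names, and lets cellfun call fileparts on every entry.

```matlab
% Hypothetical flat list of file names; the real F may be nested differently.
F = {'run1.lbl', 'run1.dat', 'run2.LBL', 'notes.txt'};

% Third output of fileparts is the extension, including the dot.
[~, ~, ext] = cellfun(@fileparts, F, 'UniformOutput', false);

% Case-insensitive match gives a logical mask over F.
isLbl    = strcmpi(ext, '.lbl');
lblFiles = F(isLbl);    % entries with the .lbl extension
others   = F(~isLbl);   % everything else
```

Whether to keep lblFiles or others depends on which extensions should be removed, which the question leaves open.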