Unable to create an array from a table - matlab

I'm trying to load an external CSV file using MATLAB.
I managed to download it using webread, but I only need a subset of the columns.
I tried
Tb = webread('https://datahub.io/machine-learning/iris/r/iris.csv');
X = [sepallength sepalwidth petallength petalwidth];
But I cannot form X this way because the names are not recognized. How can I create X correctly?

The line
Tb = webread('https://datahub.io/machine-learning/iris/r/iris.csv');
produces a table. Its column names are table variables, not workspace variables, so referring to them by bare name fails. Access them through the table instead:
X = [Tb.sepallength Tb.sepalwidth Tb.petallength Tb.petalwidth];
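For what it's worth, here is a minimal sketch of three equivalent ways to pull numeric columns out of a table. To avoid depending on the download, it builds a small stand-in table with the same column names; the values and the `class` column are assumptions made up for illustration.

```matlab
% Stand-in for the table returned by webread (values are invented).
Tb = table([5.1; 4.9], [3.5; 3.0], [1.4; 1.4], [0.2; 0.2], ...
    {'Iris-setosa'; 'Iris-setosa'}, ...
    'VariableNames', {'sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'class'});

cols = {'sepallength', 'sepalwidth', 'petallength', 'petalwidth'};

% 1) Dot indexing, one column at a time.
X1 = [Tb.sepallength Tb.sepalwidth Tb.petallength Tb.petalwidth];

% 2) Brace indexing with a list of variable names returns a numeric matrix.
X2 = Tb{:, cols};

% 3) Select a sub-table with parentheses, then convert it.
X3 = table2array(Tb(:, cols));
```

Brace indexing is probably the most convenient when the list of columns is stored in a variable.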


Write Matrix Data to Each Member of Datatype in HDF5 file via MATLAB

This is my first go at trying to create an HDF5 file from scratch using the Low-Level commands via MATLAB.
My issue is that I am having a hard time trying to write data to each specific member in the datatype on my dataset.
First, I create a new HDF5 file, and set the right layer of groups:
new_h5 = H5F.create('new_hdf5_file.h5','H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
new_h5 = H5G.create(new_h5,'first','H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
new_h5 = H5G.create(new_h5,'second','H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
Then, I create my datatype:
datatype = H5T.create('H5T_compound',20);
H5T.insert(datatype,'first_element',0,'H5T_NATIVE_INT');
H5T.insert(datatype,'second_element',4,'H5T_NATIVE_DOUBLE');
H5T.insert(datatype,'third_element',12,'H5T_NATIVE_DOUBLE');
Then, I format that into my dataset:
new_h5 = H5D.create(new_h5,'location',datatype,H5S.create('H5S_SCALAR'),'H5P_DEFAULT');
subset = H5D.get_type(H5D.open(new_h5,'/first/second/location'));
mem_type = H5T.get_member_type(subset,0);
I receive an error with the following command:
H5D.write(mem_type,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data);
Error using hdf5lib2
Unhandled HDF5 class (H5T_NO_CLASS) encountered. It is not possible to write to this attribute or dataset.
So, I try this method instead:
new_h5 = H5D.create(new_h5,'location',datatype,H5S.create_simple(2,dims,dims),'H5P_DEFAULT'); %where dims are the dimensions of all matrices of data structure
H5D.write(mem_type,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data); %where data is a structure
I receive an error with the following command:
H5D.write(mem_type,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data);
Error using hdf5lib2
Attempted to transfer too many values to or from the library buffer.
When looking here for the XML tags for the error messages, it describes the above error as "illegalArrayAccess." Apparently, according to this question, you can only write to 4 members without the buffer throwing an error?
Is this correct? How can I correctly write to each member? I am about to reach my mental limit trying to figure this one out.
EDIT:
References kept here for general information:
HDF5 Compound Datatypes Example
HDF5 Compound Datatypes
H5D.write MATLAB Command
I found out why I cannot write data. I have solved the problem. I had my dimensions set incorrectly (which is code I forgot to include originally). My apologies. I had my dimensions like this:
dims = fliplr(size(data_matrix));
Here data_matrix was a 15x250 matrix, so dims was [250 15]. The error occurred because the library tried to write a 250x15 block for each member, while the structure only held a 250x1 vector for each member.
The following code will (generically) work for writing data to each member:
new_h5 = H5F.create('new_hdf5_file.h5','H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
new_h5 = H5G.create(new_h5,'first','H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
new_h5 = H5G.create(new_h5,'second','H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
datatype = H5T.create('H5T_compound',20);
H5T.insert(datatype,'first_element',0,'H5T_NATIVE_INT');
H5T.insert(datatype,'second_element',4,'H5T_NATIVE_DOUBLE');
H5T.insert(datatype,'third_element',12,'H5T_NATIVE_DOUBLE');
dims = fliplr(size(data_matrix)); dims = [1 dims(1,2)];
new_h5 = H5D.create(new_h5,'location',datatype,H5S.create_simple(2,dims,dims),'H5P_DEFAULT');
H5D.write(new_h5,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data_structure);
where data_matrix is a 15x250 matrix containing all data, and data_structure is a structure containing 15 fields, each 250x1 in size.
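In case it helps later readers, below is a hedged, self-contained sketch of the same idea with three members. It follows the pattern of computing member offsets from the type sizes, passing the compound type itself as the memory type to H5D.write (rather than 'H5ML_DEFAULT'), and closing each identifier. The record count `n`, the random data, and the temporary file name are assumptions for illustration only.

```matlab
% Sketch: write one int and two double members to a compound dataset.
n = 250;                                        % number of records (assumed)
data_structure.first_element  = int32((1:n)');  % one n-by-1 vector per member
data_structure.second_element = rand(n, 1);
data_structure.third_element  = rand(n, 1);

filename = [tempname '.h5'];                    % hypothetical output path

% Member types, with byte offsets derived from their sizes.
intType    = H5T.copy('H5T_NATIVE_INT');
doubleType = H5T.copy('H5T_NATIVE_DOUBLE');
sz      = [H5T.get_size(intType), H5T.get_size(doubleType), H5T.get_size(doubleType)];
offsets = [0, cumsum(sz(1:end-1))];

datatype = H5T.create('H5T_COMPOUND', sum(sz));
H5T.insert(datatype, 'first_element',  offsets(1), intType);
H5T.insert(datatype, 'second_element', offsets(2), doubleType);
H5T.insert(datatype, 'third_element',  offsets(3), doubleType);

% Keep a separate identifier for the file and for each group.
file_id = H5F.create(filename, 'H5F_ACC_TRUNC', 'H5P_DEFAULT', 'H5P_DEFAULT');
group1  = H5G.create(file_id, 'first',  'H5P_DEFAULT', 'H5P_DEFAULT', 'H5P_DEFAULT');
group2  = H5G.create(group1,  'second', 'H5P_DEFAULT', 'H5P_DEFAULT', 'H5P_DEFAULT');

% The dataspace describes the number of records, not the size of a member.
space_id = H5S.create_simple(1, n, []);
dset_id  = H5D.create(group2, 'location', datatype, space_id, 'H5P_DEFAULT');
H5D.write(dset_id, datatype, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', data_structure);

% Release identifiers in roughly the reverse order of creation.
H5D.close(dset_id);
H5S.close(space_id);
H5T.close(datatype);
H5T.close(intType);
H5T.close(doubleType);
H5G.close(group2);
H5G.close(group1);
H5F.close(file_id);

% Read the dataset back with the high-level interface to check the result.
readback = h5read(filename, '/first/second/location');
```

Using distinct variable names for the file, groups, and dataset (instead of reusing new_h5) makes it possible to close each one explicitly.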

Sequential import of datafiles according to rule in Matlab

I have a list of .txt data files to import. Suppose they are named
file100data.txt file101data.txt ... file109data.txt and I want to import them all using readtable.
I tried using a for loop over a vector a = [0:9] so that MATLAB would repeat the readtable command for each value, but I cannot make it work.
for a = [0:9]
T_a_ = readtable('file10_a_data.txt')
end
I know I cannot just put _a_ where I want the vector to loop through, so my question is how can I actually do it?
Thank you in advance!
Here is a solution that should work even if some files are missing from your folder (e.g. you have file100data.txt to file107data.txt, but file108data.txt and file109data.txt are missing):
files=dir('file10*data.txt'); %list all data files in your folder
nof=size(files,1); %number of files
for i=1:nof %loop over the number of files
    table_index=files(i).name(7); %recover table index from data filename
    eval(sprintf('T%s = readtable(files(i).name);', table_index)); %read table
end
Now, please note that it is generally regarded as poor practice to dynamically name variables in MATLAB (see this post for example). You may want to use structures or cell arrays to store your data instead.
You need to convert the value of a into a string and combine strings together, like this:
Tables = struct();
for a = 0:9
    % note: using dynamic structure field names to store the imported tables
    fname = ['file10' num2str(a) 'data'];   % e.g. 'file100data'
    Tables.(fname) = readtable([fname '.txt']);
end
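As a rough sketch of the cell-array alternative mentioned above: build each file name with sprintf and store the tables by index. So that the snippet runs on its own, it first writes ten small dummy files into a temporary folder; the folder and the column names `n` and `n_squared` are assumptions, not part of the original question.

```matlab
% Create ten dummy data files so the example is self-contained.
folder = tempname;
mkdir(folder);
for a = 0:9
    dummy = table(a, a^2, 'VariableNames', {'n', 'n_squared'});
    writetable(dummy, fullfile(folder, sprintf('file10%ddata.txt', a)));
end

% Import them: one cell per file, indexed by a+1 because a starts at 0.
T = cell(1, 10);
for a = 0:9
    filename = fullfile(folder, sprintf('file10%ddata.txt', a));
    T{a + 1} = readtable(filename);
end
```

The table for file103data.txt is then T{4}, and no variable names need to be generated at run time.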

Trouble with looping function into structure index

I'm relatively new to matlab and would really appreciate any help.
Currently, I have a function (we'll call it readf) that reads in data from a single ascii file into a struct of multiple fields (we'll call it cdata).
names = cellstr(char('A','B','C','D','E','F','G'));
cdata = readf('filestring','dataNames',names);
The function works fine and gives me the correct output of a struct with these field names, with the value of each field name being a cell array of the corresponding data.
My task is to create a for loop that uses this readf function to read in a folder of these ascii files at once. I'm trying to work it so that the for loop creates a struct with an index of the different cdata structs. After trying a few different methods, I am stumped.
This is what I have so far.
files = struct2cell(dir('folderstring')); %creates a cell array of the names of the files within the folder
for ii=length(files);
cdata(ii) = readf([folderstring,files(1,1:ii),names],'dataName',names);
end;
This is currently giving me the following error.
"Error using horzcat
Dimensions of matrices being concatenated are not consistent."
I am not sure what is wrong. How can I fix this code so I can read in all the data from a folder at once? Is there a better and more efficient way to do this than indexing into this struct? Perhaps a cell array of different structures, or even a structure of nested structures? Thanks!
Change:
for ii=length(files);
cdata(ii) = readf([folderstring,files(1,1:ii),names],'dataName',names);
end;
To:
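for ii=1:size(files,2) % struct2cell(dir(...)) has one column per directory entry; CHECK that this is the right number
    % row 1 holds the file names; the cell array names is not part of the path
    cdata(ii) = readf([folderstring,files{1,ii}],'dataName',names);
end
% CHECK that files{1,ii}, with ii = 1,2,3 etc., gives you the correct file name (dir also lists '.' and '..').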
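A possibly simpler variant is to skip struct2cell and work with the struct array that dir returns. The sketch below assumes that approach; since readf is the asker's own function, it is replaced here by a hypothetical stand-in that just records the file name, and the folder and its three files are created only so the example runs.

```matlab
% Create a temporary folder with three small files (illustration only).
folderstring = tempname;
mkdir(folderstring);
for k = 1:3
    fid = fopen(fullfile(folderstring, sprintf('run%d.txt', k)), 'w');
    fprintf(fid, '%d\n', k);
    fclose(fid);
end

% Hypothetical stand-in for the asker's readf; it only stores the file name.
readf = @(filename, varargin) struct('source', filename);
names = {'A', 'B', 'C'};

% dir returns a struct array; drop directories such as '.' and '..'.
listing = dir(folderstring);
listing = listing(~[listing.isdir]);

% Read each file into its own cell, then join into a struct array.
results = cell(1, numel(listing));
for ii = 1:numel(listing)
    fullname = fullfile(folderstring, listing(ii).name);
    results{ii} = readf(fullname, 'dataNames', names);
end
cdata = [results{:}];
```

Collecting into a cell first and concatenating afterwards also works when cdata does not exist before the loop.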

Matlab: dynamic name for structure

I want to create a structure with a variable name in a matlab script. The idea is to extract a part of an input string filled by the user and to create a structure with this name. For example:
CompleteCaseName = input('s');
USER WRITES '2013-06-12_test001_blabla';
CompleteCaseName = '2013-06-12_test001_blabla'
casename(12:18) = struct('x','y','z');
In this example, casename(12:18) gives me the result test001.
I would like to do this to allow me to compare easily two cases by importing the results of each case successively. So I could write, for instance :
plot(test001.x,test001.y,test002.x,test002.y);
The problem is that the line casename(12:18) = struct('x','y','z'); is invalid for Matlab because it makes me change a string to a struct. All the examples I find with struct are based on a definition like
S = struct('x','y','z');
And I can't find a way to make a dynamical name for S based on a string.
I hope someone understood what I write :) I checked on the FAQ and with Google but I wasn't able to find the same problem.
Use a structure with a dynamic field name.
For example,
mydata.(casename(12:18)) = struct;
will give you a struct mydata with a field test001.
You can then later add your x, y, z fields to this.
You can use the fields later either by mydata.test001.x, or by mydata.(casename(12:18)).x.
If at all possible, try to stay away from using eval, as another answer suggests. It makes things very difficult to debug, and the example given there, which directly evals user input:
eval('%s = struct(''x'',''y'',''z'');',casename(12:18));
is even a security risk - what happens if the user types in a string where the selected characters are system(''rm -r /''); a? Something bad, that's what.
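To make the dynamic field name approach concrete, here is a small sketch using the case name from the question. The x and y values are made up, and the isvarname check is an extra safeguard I am assuming you would want for user input.

```matlab
CompleteCaseName = '2013-06-12_test001_blabla';
casename = CompleteCaseName(12:18);   % characters 12 to 18 of the input

% Reject anything that is not a valid identifier before using it as a field.
if ~isvarname(casename)
    error('Case name "%s" is not a valid field name.', casename);
end

mydata = struct();
mydata.(casename).x = 1:5;            % illustrative data
mydata.(casename).y = (1:5).^2;

% Both access styles refer to the same data.
y_static  = mydata.test001.y;
y_dynamic = mydata.(casename).y;
```

Because the user string only ever becomes a field name, nothing typed by the user gets executed.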
As I already commented, the best case scenario is when all your x and y vectors have same length. In this case you can store all data from the different files into 2 matrices and call plot(x,y) to plot each column as a series.
Alternatively, you can use a cell array such that:
c = cell(2,numfiles);
for ii = 1:numfiles
    c{1,ii} = []; % replace [] with the x data imported from file ii
    c{2,ii} = []; % replace [] with the y data imported from file ii
end
plot(c{:})
A structure, on the other hand, would be filled like this:
s.('test001').x = ...
s.('test001').y = ...
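Here is a runnable sketch of the cell array variant, with synthetic curves standing in for the imported data (the lengths and values are assumptions). It relies on c{:} expanding column by column, so plot receives x1, y1, x2, y2, and so on.

```matlab
numfiles = 3;
c = cell(2, numfiles);
for ii = 1:numfiles
    n = 10 * ii;                     % series may have different lengths
    c{1, ii} = linspace(0, 1, n);    % x data for series ii
    c{2, ii} = ii * c{1, ii}.^2;     % y data for series ii
end

% Invisible figure so the example can run without opening a window.
fig = figure('Visible', 'off');
h = plot(c{:});                      % same as plot(x1, y1, x2, y2, x3, y3)
```

Since each x/y pair is passed separately, the series do not need to have the same length.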
Use eval:
eval(sprintf('%s = struct(''x'',''y'',''z'');',casename(12:18)));
Edit: apologies, forgot the sprintf.

HDF format on Matlab

I have a Modis image with hdf format.
fileinfo = hdfinfo('MOD09GA.A2011288.hdf');
I'm trying to create a matrix but I only need three bands that are stored on the attributes (I know it because I've checked on Erdas). I've checked the structure of the attributes and there are 12 bands (fileinfo.Attributes= <1x12 struct>). How can I extract and create a matrix with three bands?
sds_info = fileinfo.SDS(2);
What I'm trying to do is the following...
data1 = hdfread(sds_info.Attributes)
But I get the following error:
??? Error using ==> hdfread>dataSetInfo at 418
HINFO must be a structure describing a specific data set in the file.
Checking the help I know I have to use that structure. How can I know the content of the attributes? How can I select and create a matrix with that information?
data1 = hdfread(s.Vdata(1), 'Fields', {'Idx', 'Temp', 'Dewpt'})
PS) I'm currently using hdftool to import every band. Is there another way to do it?
In the end, this is what I did (I'm leaving the post up in case it helps someone):
sur_refl_b01_1 = hdfread('MOD09GA.A2011288.h17v05.005.2011293000105.hdf', '/MODIS_Grid_500m_2D/Data Fields/sur_refl_b01_1', 'Index', {[1 1],[1 1],[2400 2400]});