Matlab: list of trees - matlab

I have implemented a binary tree in Matlab using 'struct' type elements as nodes. An N-node tree has, say, N such structs.
My problem is that I have M such trees, each having a different number of nodes, N_1, N_2, ..., N_M. How can I hold these trees in a list or array which can be iterated? Several trials like struct of structs did not seem to work.
Edit: I want to do something like the following. myClassTree returns a tree with N_i nodes.
trees = struct;
for i=1:nTrees
tree = myClassTree(train(bags(i,:),:), entropy, depth);
trees(i) = tree;
end

The easiest thing is to create a cell array. Simply replace trees(i) = tree; with trees{i} = tree; (note braces, rather than parentheses).
Cell arrays are useful whenever you want to store an array of mixed data types. To access elements of a cell array, you can use the braces again. For example, this should work as you expect:
currentTree = trees{someIndex};
The code that you posted creates an array of structs, which only works if the structures have the same fieldnames.
If you wanted (not recommended), you could create a struct of structs by doing something like this: trees.(['n' sprintf('%04d',i)]) = tree; (But please don't.)
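A minimal sketch of the cell array approach. Since myClassTree is not shown in the question, the two trees here are hypothetical hand-built structs with deliberately different field names, which is exactly the case a plain struct array cannot hold:

```matlab
% Two hypothetical "trees" with different fields (stand-ins for myClassTree output)
treeA = struct('value', 1, 'left', [], 'right', []);
treeB = struct('label', 'root', 'children', {{}});

% A cell array can hold both, whatever their fields or sizes
trees = cell(1, 2);
trees{1} = treeA;
trees{2} = treeB;

% Iterate with brace indexing to get each tree back as a struct
for i = 1:numel(trees)
    currentTree = trees{i};
    disp(fieldnames(currentTree));
end
```

Preallocating with cell(1, nTrees) avoids growing the array inside the loop.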

Related

Contents of a subfield

I need to obtain a list (cell array) of the contents of a single subfield in a PDB file. I have prepared my structure and am now looking for something like
resnames = getfield(PS.Model.Atom,'resName')
This however only leaves me with the first entry. I need an output similar to the what command.
I believe you may have something like this:
a{1}='s';
a{2}='t';
Now calling it like so
a{:}
will return several times:
ans =
s
ans =
t
However, if you wrap it in braces:
{a{:}}
It will nicely return everything at once:
ans =
's' 't'
Now ans is a cell array.
I have been unable to guess your exact structure, but hopefully this solution (wrapping the expression in braces) still works. If not, you can always just loop through your variable and extract the elements one by one.
Your struct array is not scalar at some point in the tree. Somewhere in the hierarchy you have 1xN struct array, which would give you multiple ans outputs like that. Another suggestion is to use dynamic field names. Instead of getfield, this would be:
PS.Model.Atom.('resName')
Thus, you could take Dennis' suggestion and form a cell array of all the strings like:
resnames = {PS.Model.Atom.('resName')};
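As a small self-contained illustration, assuming Atom is a 1x3 struct array (the real PDB structure may be nested differently, and the residue names here are made up):

```matlab
% Hypothetical stand-in for the PDB structure: Atom is a 1x3 struct array
PS.Model.Atom = struct('resName', {'ALA', 'GLY', 'SER'});

% PS.Model.Atom.resName expands to a comma-separated list of three values;
% wrapping it in braces collects them into one cell array
resnames = {PS.Model.Atom.resName};
```

This only works when every level above Atom (here PS and Model) is scalar; if Model were itself a struct array, a loop over its elements would be needed.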

'Find' and 'unique' in an array of user-defined class objects (MATLAB)

After dabbling in C# I'm now keen to use some OOP in Matlab - to date I've done none!
Let's say that I define a class for a data object with a handful of properties...
classdef TestRigData
properties
testTemperature
sampleNumber
testStrainAmplitude
sampleMaterial
rawData
end
methods
% Some constructors and data manipulation methods in here
end
end
...where 'rawData' would be a m-by-n array of actual experimental data, and the other values being doubles or strings to help identify each specific experiment.
If I have an array of TestRigData objects, what would be the best way of finding the indices of objects which meet specific criteria (e.g. testTemperature == 200)? Or getting all the unique values of a property (e.g. all the unique sample numbers in this collection).
If they were arrays of their own, (myNewArray = [3 5 7 22 60 60 5]) it would be easy enough using the find() and unique() functions. Is there a class I can inherit from here which will allow something like that to work on my object array? Or will I have to add my own find() and unique() methods to my class?
You can assign an ID value (a hash value in the general case) to TestRigData objects and store it as a new property. You can then extract all ID values at once to a cell array, e.g. {yourarray.id} (or [yourarray.id] if the ID values are scalars), allowing you to apply find and unique with ease.
Adding your own find and unique is definitely possible, of course, but why make life harder? ;)
The suggestion of creating this as a handle class (rather than value class) is something I need to think about more in the future... after having put together some initial code, going back and trying to change classdef TestRigData to classdef TestRigData < handle seems to be causing issues with the constructor.
Bit unclear of how I would go about using a hash value unique to each object... but the syntax of extracting all values to an array is ultimately what got me in the right direction.
Getting a new object array which is the subset of the original big data array conforming to a certain property value is as easy as:
newObjectArray = oldObjectArray([oldObjectArray.testTemperature]==200);
Or for just the indices...
indicesOfInterest = find([oldObjectArray.testTemperature]==200);
Or in the case of non-scalar values, e.g. string property for sample material...
indicesOfInterest = find(strcmpi({oldObjectArray.sampleMaterial},'steel'));
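The same extraction syntax covers the unique part of the question. A sketch, using a struct array as a stand-in for the TestRigData object array (the property access syntax is the same for both; the values are made up):

```matlab
% Hypothetical stand-in for an array of TestRigData objects
oldObjectArray = struct( ...
    'testTemperature', {200, 150, 200}, ...
    'sampleNumber',    {3, 5, 3}, ...
    'sampleMaterial',  {'steel', 'brass', 'Steel'});

% Scalar numeric properties: concatenate with [] and call unique
uniqueSamples = unique([oldObjectArray.sampleNumber]);

% String properties: collect into a cell array with {} and call unique
uniqueMaterials = unique(lower({oldObjectArray.sampleMaterial}));
```

lower is applied first so that 'steel' and 'Steel' count as the same material, mirroring the case-insensitive strcmpi above.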

Is there a way to remove all but a few desired fields from a struct in MATLAB?

So I have several structs that contain data that is used in a dozen or so scripts. The problem is that for each script I only need a handful of variables and the rest I can ignore. I am using a massive amount of data (gigs of data) and MATLAB often gives me out of memory errors, so I need to remove all unnecessary fields from the structs.
Currently I have a cell that contains all unneeded fields and then I call rmfield on the structs. But the fields in the structs often change and it is getting to be a pain to be constantly updating the list of unneeded fields. So is there a way to tell MATLAB to keep only those fields I want and remove everything else even if I don't know what everything else is?
Here is an example,
Struct 1 has: A, B, C, D, E fields
Struct 2 has: A, B, C, D, E, F fields
Struct 3 has: A, B, C, D, E, F, G, H, I fields
Sometimes Struct 3 might only have A thru G.
I want to keep only A, B, and C fields and remove all other data from all the structs.
Here is one way to do it:
Get the list of all fieldnames using fieldnames
Remove the ones that you want to keep from the list
Remove everything that is left in the list
Example
s.a=1
s.b=2
s.c=3
s.d='chitchat'
tokeep = {'a','b'}
f=fieldnames(s)
toRemove = f(~ismember(f,tokeep));
s = rmfield(s,[toRemove])
You could copy your struct's desired fields to a new variable in a function.
function newVar = getABC(strct)
newVar.A = strct.A;
newVar.B = strct.B;
newVar.C = strct.C;
end
strct will not be copied in memory because you will not be manipulating it.
MATLAB uses a system commonly called "copy-on-write" to avoid making a
copy of the input argument inside the function workspace until or
unless you modify the input argument. If you do not modify the input
argument, MATLAB will avoid making a copy.
You can get newVar and then clear strct from memory.
Fred's generalized version:
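function newVar = getFields(oldVar, desiredCell)
% Copy only the fields named in desiredCell from oldVar into newVar
newVar = struct(); % defined even if desiredCell is empty
for idx = 1:length(desiredCell)
newVar.(desiredCell{idx}) = oldVar.(desiredCell{idx});
end
end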
1) Let's say you have a structure S
2) You want to keep only the first three fields of S and delete all the others
fieldsS = fieldnames(S);
S = rmfield(S,fieldsS(4:end));
The rmfield method in MATLAB is rather slow, so when dealing with large structures it is best to avoid it.
This MATLAB file exchange item: kpfield is basically the inverse of rmfield and should work exactly as you require.
It converts the structure to a cell array before keeping only the required indices by creating a logical array based on whether the fields exist in the fieldnames or not. The modified cell array is then converted back to a structure.
Disclaimer: I have written kpfield as I came across exactly the same issue.
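The struct-to-cell round trip described above can be sketched with built-in functions only. This is not the kpfield source, just an illustration of the same idea for a scalar struct, with made-up field values:

```matlab
% Example struct; only A, B and C should survive
S = struct('A', 1, 'B', 2, 'C', 3, 'D', 'chitchat', 'E', [4 5 6]);
keep = {'A', 'B', 'C'};

names = fieldnames(S);          % N-by-1 cell array of field names
vals  = struct2cell(S);         % N-by-1 cell array of field values
mask  = ismember(names, keep);  % logical index of the fields to keep

% Rebuild a struct from the kept names and values
T = cell2struct(vals(mask), names(mask), 1);
```

Names in keep that do not exist in S are simply ignored, so the same keep list can be reused across structs with different sets of extra fields.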
I had success loading the struct, deleting the fields I wanted to remove, and then re-saving the struct.
Deleting fields in the workspace does not delete them from the original struct - so resaving is necessary.

What does the . operator do in matlab?

I came across some matlab code that did the following:
thing.x=linspace(...
I know that usually the . operator makes the next operation elementwise, but what does it do by itself? Is this just a sub-object operator, like in C++?
Yes, it's a sub-object operator.
You can have things like
Roger.lastname = "Poodle";
Roger.SSID = 111234997;
Roger.children.boys = {"Jim", "John"};
Roger.children.girls = {"Lucy"};
And the things to the right of the dots are called fields.
You can also define classes in Matlab, instantiate objects of those classes, and then if thing was one of those objects, thing.x would be an instance variable in that object.
The matlab documentation is excellent, look up "fields" and "classes" in it.
There are other uses for the dot. M*N means multiply two things: if M and N are both matrices, this applies the rules of matrix multiplication to produce a new matrix. But M.*N means, if M and N are the same shape, multiply them element by element. And so on for the other operators, with more subtleties, but that is out of scope of what you asked here.
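A quick illustration of the difference, using two small matrices chosen for the example:

```matlab
M = [1 2; 3 4];
N = [5 6; 7 8];

P = M * N;    % matrix product:       [19 22; 43 50]
Q = M .* N;   % element-wise product: [ 5 12; 21 32]
```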
As @marc points out, dot is also used to reference fields and subfields of something matlab calls a struct or structure. These are a lot like classes, subclasses and enums, it seems to me. The idea is you can have a struct, data say, and store all the info that goes with data like this:
olddata = data; % we assume we have an old struct like the one we are creating, we keep a reference to it
data.date_created=date();
data.x_axis = [1 5 2 9];
data.notes = "This is just a trivial example for stackoverflow. I didn't check to see if it runs in matlab or not, my bad."
data.versions.current = "this one";
data.versions.previous = olddata;
The point is ANY matlab object/datatype/whatever you want to call it, can be referenced by a field in the struct. The last entry shows that we can even reference another struct in the field of a struct. The implication of this last bit is we could look at the date of creation of the previous version:
data.versions.previous.date_created
To me this looks just like objects in java EXCEPT I haven't put any methods in there. Matlab does support java objects which to me look a lot like these structs, except some of the fields can reference functions.
Technically, it's a form of indexing, as per mwengler's answer. However, it can also be used for method invocation on objects in recent versions of MATLAB, i.e.
obj.methodCall;
However note that there is some inefficiency in that style - basically, the system has to first work out if you meant indexing into a field, and if not, then call the method. It's more efficient to do
methodCall(obj);

Avoiding eval in assigning data to struct array

I have a struct array called AnalysisResults, that may contain any MATLAB datatypes, including other struct arrays and cell arrays.
Then I have a string called IndexString, which is the index to a specific subfield of AnalysisResults, and it may contain several indices to different struct arrays and cell arrays, for example:
'SubjectData(5).fmriSessions{2}.Stats' or 'SubjectData(14).TestResults.Test1.Factor{4}.Subfactor{3}'.
And then I have a variable called DataToBeEntered, which can be of any MATLAB datatype, usually some kind of struct array, cell array or matrix.
Using eval, it is easy to enter the data to the field or cell indexed by IndexString:
eval([ 'AnalysisResults.', IndexString, ' = DataToBeEntered;' ])
But is it possible to avoid using eval in this? setfield doesn't work for this.
Thank you :)
Well, eval surely is the easiest way, but also the dirtiest.
The "right" way to do so, I guess, would be to use subsasgn. You will have to parse the partial MATLAB command (e.g. SubjectData(5).fmriSessions{2}.Stats) into the proper representation for those functions. Part of the work can be done by substruct, but that is the lightest part.
So for example, SubjectData(5).fmriSessions{2}.Stats would need to be translated into
indexes = {'.' , 'SubjectData', ...
'()', {5}, ...
'.' , 'fmriSessions', ...
'{}', {2}, ...
'.' , 'Stats'}; % a single row, so indexes{:} expands in type/subs order
indexStruct = substruct(indexes{:});
AnalysisResults = subsasgn(AnalysisResults, indexStruct, DataToBeEntered);
Where you have to develop the code such that the cell array indexes is made as above. It shouldn't be that hard, but it isn't trivial either. Last year I ported some eval-heavy code with similar purpose and it seemed easy, but it is quite hard to get everything exactly right.
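A rough sketch of such a parser, under strong assumptions: the index string contains only field names, (n) and {n} with a single positive integer inside. Ranges, multiple subscripts, end and colon indexing are not handled. The regular expression and variable names are illustrative, not taken from the original code.

```matlab
AnalysisResults = struct();
IndexString = 'SubjectData(5).fmriSessions{2}.Stats';
DataToBeEntered = magic(3);

% Split into field names, (n) tokens and {n} tokens; the dots are skipped
tokens = regexp(IndexString, '\w+|\(\d+\)|\{\d+\}', 'match');

args = {};
for k = 1:numel(tokens)
    t = tokens{k};
    switch t(1)
        case '('   % paren indexing, e.g. (5)
            args(end+1:end+2) = {'()', {str2double(t(2:end-1))}};
        case '{'   % brace indexing, e.g. {2}
            args(end+1:end+2) = {'{}', {str2double(t(2:end-1))}};
        otherwise  % plain field name
            args(end+1:end+2) = {'.', t};
    end
end

% Build the index structure and assign without eval
indexStruct = substruct(args{:});
AnalysisResults = subsasgn(AnalysisResults, indexStruct, DataToBeEntered);
```

Afterwards, AnalysisResults.SubjectData(5).fmriSessions{2}.Stats should hold the data, as it would after the eval version.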
You can use dynamic field names:
someStruct.(someField) = DataToBeEntered;
where someField is a variable holding the field name, but you will have to parse your IndexString into individual field names and indices.