Undefined function or variable 'DataMatrix' - matlab

I was trying to create a DataMatrix variable by calling the DataMatrix() function.
But that function doesn't exist. If I type this:
>> DataMatrix
I got this error message:
Undefined function or variable 'DataMatrix'.
I did install the Bioinformatics Toolbox and my version is 2016b on Mac
Any ideas?

As @Andras mentioned in the comments, the procedure to import and use this class is already described in the documentation for the class (though you might be forgiven if you missed it, as it's not on the top part of the page dealing with syntax).
The tl;dr version is that you should either access the class constructor as, e.g.:
D = bioma.data.DataMatrix(...);
or, import the class from the package / namespace first, and then use it directly, i.e.:
import bioma.data.DataMatrix;
D = DataMatrix(...);
Explanation
The reason you need this step in the first place is that this class is enclosed inside a "package" (a.k.a. a "namespace"). Read the section called "Packages Create Namespaces" in the matlab documentation to find out more about what this means.
However, in principle it boils down to the fact that, if you have a folder whose name has a + prefix, then this acts as a namespace for the functions contained within.
So, if you have a folder called +MyPackage on your path, and this contains a function m-file called myfunction.m (but this is not in your path), then you can access this function in the matlab terminal by typing MyPackage.myfunction().
Or, you can import MyPackage.myfunction from that package / namespace and then use myfunction directly.
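As a minimal sketch of the two calling styles, the script below builds a throwaway package in a temporary folder and calls the same function both ways. The names +MyPackage and myfunction follow the example above; the function body (doubling its input) is an assumption made purely for illustration.

```matlab
% Create a temporary folder that holds a package folder named +MyPackage.
root = tempname;
mkdir(fullfile(root, '+MyPackage'));

% Write a tiny function file into the package (illustrative body only).
fid = fopen(fullfile(root, '+MyPackage', 'myfunction.m'), 'w');
fprintf(fid, 'function y = myfunction(x)\ny = 2 * x;\nend\n');
fclose(fid);

% Only the PARENT of the +MyPackage folder goes on the path.
addpath(root);

% Style 1: fully qualified call.
a = MyPackage.myfunction(21);

% Style 2: import the name first, then call it unqualified.
import MyPackage.myfunction
b = myfunction(21);

% Clean up the path entry and the temporary files.
rmpath(root);
rmdir(root, 's');
```

Both a and b hold the same result; the only difference is how the name is resolved.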
So, going back to DataMatrix, you will see that if you search where the class definition is located in your matlab folder, you'll find it here:
./toolbox/bioinfo/microarray/+bioma/+data/@DataMatrix/DataMatrix.m
and presumably ./toolbox/bioinfo/microarray is already in your path.
I.e. the bioma package/namespace is in your path, and you can access the data package/namespace below it, and then the class definition for DataMatrix by doing bioma.data.DataMatrix.
PS: Furthermore, the "@" prefix in front of a folder name denotes a class folder, containing the constructor and class methods. If this "@folder" is in your path (or imported etc), then this means you have access to the underlying constructor. This is a remnant from matlab's old object-oriented style, before the classdef keyword was introduced. You can read more about class directories here if you're interested.

Related

How to arrange matlab code?

Let's say I have some MATLAB code that uses some functions.
I don't want to define the functions in the same file as the code that uses
the functions.
On the other hand, the solution of making an m file for each function is also not good for me because I don't want a lot of files.
What I want is something like a utils file that holds those functions, from which I can import the functions like we do in python, for example.
What would you recommend?
What you probably want is to use a package, which is kind of like a python module in that it is a folder that can hold multiple files. You do this by putting a + at the beginning of the folder name, like +mypackage. You can then access the functions and classes in the folder using package.function notation similar to Python without it polluting the global list of functions (only the package is added to the global list, rather than every function in it). You can also import individual functions or classes. However, you always have to use the full function path; there is no such thing as relative paths like in Python.
However, if you really want multiple functions per file, probably the best you can do is create a top-level function that returns a struct of function handles for the other functions in the file, and then access the function handles from that struct. Since MATLAB doesn't require the use of () with functions that don't require any inputs, this would superficially behave similarly to a python module (although I don't know how it will affect performance).
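A minimal sketch of that struct-of-handles idea, saved as a single file utils.m. The names utils, add, and scale are hypothetical and only serve the illustration.

```matlab
function u = utils
% UTILS  Return a struct of handles to the local functions in this file.
% (The names utils, add, and scale are hypothetical.)
u.add   = @add;
u.scale = @scale;
end

function c = add(a, b)
% Local function: reachable from outside only through its handle.
c = a + b;
end

function y = scale(x, k)
% Local function: multiply x by the factor k.
y = k * x;
end
```

A caller would fetch the struct once, e.g. u = utils;, and then call u.add(1, 2) or u.scale(2, 5). Only utils.m needs to be on the path.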
I know this is a pain in the neck. There is no reason mathworks couldn't allow using files as packages like they currently do for folders, such as by putting + at the beginning of the file name. But they don't.
A solution close to what you are looking for could be the use of classes. A class contains methods that can be public (visible from outside) or private (only visible from inside). Implementations of the methods of a class can be either in multiple files or in the same file.
Here is a simplistic example
classdef Class1
    methods (Static)
        function Hello
            disp('hello');
        end
        function x = Add(a,b)
            x = Class1.AddInternal(a,b);
        end
    end
    methods (Static, Access=private)
        function x = AddInternal(a,b)
            x = a + b;
        end
    end
end
Example of use:
>> Class1.Hello
hello
>> Class1.Add(1,2)
ans =
3
>> Class1.AddInternal(2,3)
Error using Class1.AddInternal
Cannot access method 'AddInternal' in class 'Class1'.

Source tree organization in MATLAB (#include)

Suppose I have a lot of source files that I want to organize in a folder tree.
Is it possible to have several files with the same name and use each of them from wherever I need it, or must all functions and classes have different names?
In C++ I have #include to bring in the functions I need. What is the equivalent here?
Just to illustrate:
.\main.m
.\Algorithms\QR\Factory.m % function Factory
.\Algorithms\QR\Algorithm.m % function Algorithm
.\Algorithms\SVD\Factory.m % function Factory
.\Algorithms\SVD\Algorithm.m % function Algorithm
MATLAB has support for namespaces. So in your example you would create the following:
C:\some\path\main.m
C:\some\path\+algorithms\+qr\factor.m
C:\some\path\+algorithms\+svd\factor.m
(Note: Only the top-level package folder's parent folder must be on the MATLAB path, i.e: addpath('C:\some\path'))
Then you could invoke each using its fully qualified name:
>> y = algorithms.qr.factor(x)
>> y = algorithms.svd.factor(x)
You could also import the package inside some scope. For example:
function y = main(x)
    import algorithms.svd.*;
    y = factor(x)
end
To understand the problem, I need to explain a difference between how c++ source and header files relate to each other and how .m files work.
First: In matlab, the only function you can call from outside is the one defined first in the .m file. This function defines the top of the hierarchy. Subfunctions can be implemented in the same .m file, but they can only be used inside that file.
Secondly: In addition to this, matlab searches its path for a specific filename and assumes that the function inside the file has the same name. You will notice this from a warning if you define the function with a different name than the filename. The point here is that you cannot have 2 matlab functions with the same name if all functions are global. This would be the same as having 2 functions with the same name in the same namespace in c++.
Note: The path in matlab is typically set up by a file with hardcoded folders in the top folder of your program. This file uses the matlab function addpath.
This is a fundamental difference from c/c++, where multiple functions may be defined in the same source file. The header file then selects which source code you use in the program by providing the function declarations. The important thing here is that the header name is completely disconnected from the function names, which is not the case for file names in matlab. This means that the analogy in your example is not exactly accurate. What you propose is to "include" 2 functions with the same name. This is not possible in either c/c++ (assuming the functions use the same namespace or are global) or matlab.
Example: If the headers topFolder/foo/bar.h and topFolder/baz/bar.h both contained the function void myDup(int a), and both headers used the same namespace (or were global), then that would generate an error.
However, if the functions are only used by a limited number of other functions, then a function, e.g. Factory.m, could be placed as a private function in different folders. That would also mean that only code in that folder can access it. It is also possible to use matlab namespaces, as described in Amro's answer.

How to convert a directory into a package?

I have a directory with some helper functions that should be put into a package. Step one is obviously naming the directory something like +mypackage\ so I can call functions with mypackage.somefunction. The problem is, some functions depend on one another, and apparently MATLAB requires package functions to call functions in the very same package still by explicitly stating the package name, so I'd have to rewrite all function calls. Even worse, should I decide to rename the package, all function calls would have to be rewritten as well. These functions don't even work correctly anymore when I cd into the directory as soon as its name starts with a +.
Is there an easier solution than rewriting a lot? Or at least something self-referential like import this.* to facilitate future package renaming?
edit I noticed the same goes for classes and static methods, which is why I put the self-referential part into this separate question.
In truth, I don't know that you should really be renaming your packages often. It seems to me that the whole idea behind a package in MATLAB is to organize a set of related functions and classes into a single collection that you could easily use or distribute as a "toolbox" without having to worry about name collisions.
As such, placing functions and classes into packages is like a final step that you perform to make a nice polished collection of tools, so you really shouldn't have much reason to rename your packages. Furthermore, you should only have to go through once prepending the package name to package function calls.
... (pausing to think if what I'm about to suggest is a good idea ;) ) ...
However, if you really want to avoid having to go through your package and prepend your function calls with a new package name, one approach would be to use the function mfilename to get the full file path for the currently running package function, parse the path string to find the parent package directories (which start with "+"), then pass the result to the import function to import the parent packages. You could even place these steps in a separate function packagename (requiring that you also use the function evalin):
function name = packagename
    % Get full path of calling function:
    callerPath = evalin('caller', 'mfilename(''fullpath'')');
    % Parse the path string to get package directories:
    name = regexp(callerPath, '\+(\w)+', 'tokens');
    % Format the output:
    name = strcat([name{:}], [repmat({'.'}, 1, numel(name)-1) {''}]);
    name = [name{:}];
end
And you could then place this at the very beginning of your package functions to automatically have them include their parent package namespace:
import([packagename '.*']);
Is this a good idea? Well, I'm not sure what the computational impacts will be if you're doing this every time you call a package function. Also, if you have packages nested within packages you will get output from packagename that looks like this:
'mainpack.subpack.subsubpack'
And the call to import will only include the immediate parent package subsubpack. If you also want to include the other parent packages, you would have to sequentially remove the last package from the above string and import the remainder of the string.
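The "sequentially remove the last package" step could be sketched as below. It only builds the list of wildcard import targets from a dotted package string; the input string is the example output shown above, and the variable names are hypothetical.

```matlab
% Example output of packagename for a function nested three levels deep.
pkg = 'mainpack.subpack.subsubpack';

% Split the dotted name into individual package names.
parts = strsplit(pkg, '.');

% Build one wildcard import target per ancestor, innermost first.
targets = cell(1, numel(parts));
for k = 1:numel(parts)
    n = numel(parts) - k + 1;                     % leading parts to keep
    targets{k} = [strjoin(parts(1:n), '.') '.*'];
end
% targets is now:
%   {'mainpack.subpack.subsubpack.*', 'mainpack.subpack.*', 'mainpack.*'}
```

Each entry could then be passed to import in the same way as the single call above. That step has to run inside the package function itself, because import only changes the import list of the function that calls it.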
In short, this isn't a very clean solution, but it is possible to make your package a little easier to rename in this way. However, I would still suggest that it's better to view the creation of a package as a final step in the process of creating a core set of tools, in which case renaming should be an unlikely scenario and prepending package function calls with the package name would only have to be done once.
I have been exploring answers to the same question and I have found that combining package with private folders can allow most or all of the code to be used without modification.
Say you have
+mypackage\intfc1.m
+mypackage\intfc2.m
+mypackage\private\foo1.m
+mypackage\private\foo2.m
+mypackage\private\foo3.m
Then from intfc1, foo1, foo2, and foo3 are all reachable without any package qualifiers or import statements, and foo1, foo2, and foo3 can also call each other without any package qualifiers or import statements. If foo1, foo2, or foo3 needs to call intfc1 or intfc2, then that needs qualification as mypackage.intfc1 or an import statement.
In the case that you have a large set of mutually interdependent functions and a small number of entry points, this reduces the burden of adding qualifiers or import statements.
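A small sketch of this layout, built in a temporary folder so it can be tried without touching existing code. The file names intfc1 and foo1 follow the listing above; the function bodies are assumptions for illustration.

```matlab
% Build +mypackage and its private subfolder under a temporary root.
root = tempname;
pkgDir = fullfile(root, '+mypackage');
mkdir(fullfile(pkgDir, 'private'));   % also creates +mypackage

% Private helper: not reachable from outside +mypackage.
fid = fopen(fullfile(pkgDir, 'private', 'foo1.m'), 'w');
fprintf(fid, 'function y = foo1(x)\ny = x + 1;\nend\n');
fclose(fid);

% Entry point: calls foo1 with no package qualifier and no import.
fid = fopen(fullfile(pkgDir, 'intfc1.m'), 'w');
fprintf(fid, 'function y = intfc1(x)\ny = foo1(x);\nend\n');
fclose(fid);

% Put the parent of +mypackage on the path and call the entry point.
addpath(root);
result = mypackage.intfc1(1);

% Clean up.
rmpath(root);
rmdir(root, 's');
```

Callers outside the package still need the mypackage. qualifier (or an import) for intfc1, while the code inside stays free of package names.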
To go even further, you could create new wrapper functions at the package level with the same name as private functions
+mypackage\foo1.m <--- new interface layer wraps private foo1
+mypackage\private\foo1.m <--- original function
where for example +mypackage\foo1.m might be:
function answer = foo1(some_parameter)
    answer = foo1(some_parameter); % calls private function, not itself
end
This way, unlike the intfc1 example above, all the private code can run without modification. In particular there is no need for package qualifiers when calling any other function, regardless of whether it is exposed by a wrapper at the package level.
With this configuration, all functions including the package-level wrappers are oblivious of the package name, so renaming the package is nothing more than renaming the folder.

Using MATLAB, why does something like fts.data work in one directory but not another?

I am working with the Financial Toolbox, which has a type called FINTS. If I copy some code out of its toolbox directory to customize it, then when I try to do something like fts.data, I get
The specified field, 'data', does not exist in the object.
But the same thing works fine in the MATLAB library directory. They are both in my path, so what else do I need to change?
I think, but I haven't checked the documentation on this one, that it is a peculiarity of MATLAB that the class FINTS must be defined in the directory @fints. So if you want to extend the class, you have to put your code into that directory. And if you want to work on a class MYFINTS, you need to put the code into directory @myfints.
OK, I figured it out. MATLAB defines class methods in what it calls method directories, which are named after the class. So in this case, the class is fints, so all its methods are in @fints. All I had to do was make a new directory in my own workspace called @fints and put my function in it, and it becomes another class method of fints. You can see all the methods a class has by calling what className.
Make sure the path is specified from the root directory, and not relative.
For instance
addpath 'c:\...\...\MATLAB\mytoolbox'
not
addpath 'mytoolbox'
the latter will break if you change your working directory
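One way to get an absolute path without hard-coding the root is to derive it from the location of the file that does the setup. This is only a sketch: the file name setupMyToolbox.m is hypothetical, and it assumes the mytoolbox folder sits next to that file.

```matlab
function toolboxDir = setupMyToolbox
% SETUPMYTOOLBOX  Add the sibling folder 'mytoolbox' using an absolute path.
% (Hypothetical helper; assumes 'mytoolbox' lives next to this file.)

% Folder containing this file, as an absolute path.
thisDir = fileparts(mfilename('fullpath'));
toolboxDir = fullfile(thisDir, 'mytoolbox');

if exist(toolboxDir, 'dir') == 7
    addpath(toolboxDir);   % stays valid after cd, because it is absolute
else
    warning('setupMyToolbox:missingFolder', ...
            'Folder not found: %s', toolboxDir);
end
end
```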

What is the closest thing MATLAB has to namespaces?

We have a lot of MATLAB code in my lab. The problem is there's really no way to organize it. Since all the functions have to be in the same folder to be called (or you have to add a bunch of folders to MATLAB's path environment variable), it seems that we're doomed to have loads of files in the same folder, all in the global namespace. Is there a better way to organize our files and functions? I really wish there were some sort of module system...
MATLAB has a notion of packages which can be nested and include both classes and functions.
Just make a directory somewhere on your path with a + as the first character, like +mypkg. Then, if there is a class or function in that directory, it may be referred to as mypkg.mything. You can also import from a package using import mypkg.mysubpkg.*.
The one main gotcha about moving a bunch of functions into a package is that functions and classes do not automatically import the package they live in. This means that if you have a bunch of functions in different m-files that call each other, you may have to spend a while dropping imports in or qualifying function calls. Don't forget to put imports into subfunctions that call out as well. More info:
http://www.mathworks.com/help/matlab/matlab_oop/scoping-classes-with-packages.html
I don't see the problem with having to add some folder to Matlab's search path. I have modified startup.m so that it recursively looks for directories in my Matlab startup directory, and adds them to the path (it also runs svn update on everything). This way, if I change the directory structure, Matlab is still going to see all the functions the next time I start it.
Otherwise, you can look into object-oriented code, where you store all the methods in a @objectName folder. However, this may lead to a lot of re-writing code that can be avoided by updating the path (there is even a button add with subfolders if you add the folder to the path from the File menu) and doing a bit of moving code.
EDIT
If you would like to organize your code so that some functions are only visible to the functions that call them directly (and if you don't want to re-write in OOP), you put the calling functions in a directory, and within this directory, you create a subdirectory called private. The functions in there will only be visible to the functions in the parent directory. This is very useful if you have to overload some built-in Matlab functions for a subset of your code.
Another way to organize & reuse code is using matlab's object-oriented features. Each object is customarily in a folder that begins with an "@" and has the file(s) for that class inside (though the newer syntax does not require this for a class defined in a single file). Using private folders inside class folders, matlab even supports private class members. Matlab's new class notation is relatively fully-featured, but even the old syntax is useful.
BTW, my startup.m examines a well-known location that I do my SVN checkouts into, and adds all of the subfolders onto my path automatically.
The package system is probably the best. I use the class system (@ClassName folder), but I actually write objects. If you're not doing that, it's silly just to write a bunch of static methods. One thing that can be helpful is to put all your matlab code into a folder that isn't on the matlab path. Then you can selectively add just the code you need to the path.
So say you have two projects, stored in "c:\matlabcode\foo" and "c:\matlabcode\bar", that both use common code stored in "c:\matlabcode\common". You might have a function "setupPaths.m" like this:
function setupPaths(projectName)
    basedir = fullfile('c:', 'matlabcode');
    addpath(genpath(fullfile(basedir, projectName)));
    switch (projectName)
        case {'foo', 'bar'}
            addpath(genpath(fullfile(basedir, 'common')));
    end
end
Of course you could extend this. An obvious extension would be to include a text file in each directory saying what other directories should be added to the path to use the functions in that directory.
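A rough sketch of that text-file extension follows. The function name addWithDependencies and the file name dependencies.txt are hypothetical; it assumes one folder name per line, relative to the base directory, and that every listed folder exists.

```matlab
function addWithDependencies(baseDir, projectName)
% Add a project folder plus every folder listed in its dependency file.
% Assumption: the project folder may contain 'dependencies.txt' with one
% folder name (relative to baseDir) per line. The file name is hypothetical.
projectDir = fullfile(baseDir, projectName);
addpath(genpath(projectDir));

depFile = fullfile(projectDir, 'dependencies.txt');
if exist(depFile, 'file') ~= 2
    return;                          % no dependency list, nothing more to do
end

% Read the file and split it into trimmed, non-empty lines.
names = strtrim(strsplit(fileread(depFile), '\n'));
names = names(~cellfun(@isempty, names));

% Add each listed folder (assumed to exist) with its subfolders.
for k = 1:numel(names)
    addpath(genpath(fullfile(baseDir, names{k})));
end
end
```

With this in place, the switch statement in setupPaths is no longer needed: each project declares its own dependencies.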
Another useful thing if you share code is to set up a "user specific/LabMember" directory structure, where you have different lab members save code they are working on. That way you have access to their code if you need it, but don't get clobbered when they write a function with the same name as one of yours.