How to convert a directory into a package? - matlab

I have a directory with some helper functions that should be put into a package. Step one is obviously naming the directory something like +mypackage\ so I can call functions with mypackage.somefunction. The problem is, some functions depend on one another, and apparently MATLAB requires package functions to call functions in the very same package still by explicitly stating the package name, so I'd have to rewrite all function calls. Even worse, should I decide to rename the package, all function calls would have to be rewritten as well. These functions don't even work correctly anymore when I cd into the directory as soon as its name starts with a +.
Is there an easier solution than rewriting a lot? Or at least something self-referential like import this.* to facilitate future package renaming?
edit I noticed the same goes for classes and static methods, which is why I put the self-referential part into this separate question.

In truth, I don't know that you should really be renaming your packages often. It seems to me that the whole idea behind a package in MATLAB is to organize a set of related functions and classes into a single collection that you could easily use or distribute as a "toolbox" without having to worry about name collisions.
As such, placing functions and classes into packages is like a final step that you perform to make a nice polished collection of tools, so you really shouldn't have much reason to rename your packages. Furthermore, you should only have to go through once prepending the package name to package function calls.
... (pausing to think if what I'm about to suggest is a good idea ;) ) ...
However, if you really want to avoid having to go through your package and prepend your function calls with a new package name, one approach would be to use the function mfilename to get the full file path for the currently running package function, parse the path string to find the parent package directories (which start with "+"), then pass the result to the import function to import the parent packages. You could even place these steps in a separate function packagename (requiring that you also use the function evalin):
function name = packagename
% Get full path of calling function:
callerPath = evalin('caller', 'mfilename(''fullpath'')');
% Parse the path string to get package directories:
name = regexp(callerPath, '\+(\w)+', 'tokens');
% Format the output:
name = strcat([name{:}], [repmat({'.'}, 1, numel(name)-1) {''}]);
name = [name{:}];
end
And you could then place this at the very beginning of your package functions to automatically have them include their parent package namespace:
import([packagename '.*']);
Is this a good idea? Well, I'm not sure what the computational impacts will be if you're doing this every time you call a package function. Also, if you have packages nested within packages you will get output from packagename that looks like this:
'mainpack.subpack.subsubpack'
And the call to import will only include the immediate parent package subsubpack. If you also want to include the other parent packages, you would have to sequentially remove the last package from the above string and import the remainder of the string.
In short, this isn't a very clean solution, but it is possible to make your package a little easier to rename in this way. However, I would still suggest that it's better to view the creation of a package as a final step in the process of creating a core set of tools, in which case renaming should be an unlikely scenario and prepending package function calls with the package name would only have to be done once.

I have been exploring answers to the same question and I have found that combining package with private folders can allow most or all of the code to be used without modification.
Say you have
+mypackage\intfc1.m
+mypackage\intfc2.m
+mypackage\private\foo1.m
+mypackage\private\foo2.m
+mypackage\private\foo3.m
Then from intfc1, foo1, foo2, and foo3 are all reachable without any package qualifiers or import statements, and foo1, foo2, and foo3 can also call each other without any package qualifiers or import statements. If foo1, foo2, or foo3 needs to call intfc1 or intfc2, then that needs qualification as mypackage.intfc1 or an import statement.
In the case that you have a large set of mutually interdependent functions and a small number of entry points, this reduces the burden of adding qualifiers or import statements.
To go even further, you could create new wrapper functions at the package level with the same name as private functions
+mypackage\foo1.m <--- new interface layer wraps private foo1
+mypackage\private\foo1.m <--- original function
where for example +mypackage\foo1.m might be:
function answer = foo1(some_parameter)
answer = foo1(some_parameter); % calls private function, not itself
end
This way, unlike the intfc1 example above, all the private code can run without modification. In particular there is no need for package qualifiers when calling any other function, regardless of whether it is exposed by a wrapper at the package level.
With this configuration, all functions including the package-level wrappers are oblivious of the package name, so renaming the package is nothing more than renaming the folder.

Related

Using two versions of matlab code - how to handle paths?

I would like to use two versions of the same Matlab software package at the same time (I would like to compare their outputs for testing). The package modifies the path so that functions in subdirectories can be found. This seems problematic since the package assumes that it is the only copy running on the machine. The path is essentially a global variable which is unintentionally shared between the two copies of the code.
Example simplified code structure:
/main_code.m
/compare_results.m
/code_a/somefn.m
/code_a/submethods/
/code_b/somefn.m
/code_b/submethods/
Note that somefn.m adds the submethods directory to the path, and calls code from the submethods folder by relying on the path.
Example of code that I would like to run:
for i = 1:1000000
% Run version A:
result_a = code_a.somefn(i);
% Run version B:
result_b = code_b.somefn(i);
% Compare the output from the two versions:
compare_results(a,b);
end
One solution that I can think of is to manually update the Matlab path every time that I want to switch to a different version of the package. This seems like unnecessary coding overhead, and potentially a performance problem (due to switching the path so often).
Another solution might be to rewrite the code to be object oriented, so that the functions are attached to objects, and I can create objects of different versions. The problem with this is that in reality the code package contains hundreds of files, and I was not the original author so rewriting would be a huge task .
(Yet another option would be to change directory all the time, so that the code to run is always in the current directory. This would be so much of a headache due to the number of subfolders that I do not think this is a serious solution. It also has potential performance overhead drawbacks similar to always changing the path.)
Is there a cleaner way to handle this? Can I somehow specify the folder of the code that I want to run? What is the best way to design such a code package so that this problem does not come up?
Just create packages, which can contain functions with the same names: Packages Create Namespaces
Basically create two folders named, say, +package1 and +package2. The "+" in the folder name is important. Then place your functions under both of them, say, foo.m. Then, you can call each separately without messing with MATLAB path as:
>> package1.foo
>> package2.foo
You can use private functions. Change the directory structure as:
/main_code.m
/compare_results.m
/+code_a/somefn.m
/+code_a/private/
/+code_b/somefn.m
/+code_b/private/
Each somefn has access to the functions contained in its private sub-folder. So there is no need to create global variable and to add private sub-folders to the path.

Reference function within package matlab

I would like to structure my project in several packages. Each package should be each own namespace (so as to avoid conflicting filename) but within a package, I want everything to be in the same namespace (without having to put all the files in the same folder; I'd like different folders).
In practice I would like this structure
Project
main.m
commonLibrary
+part1Project
mainPart1.m
otherFolder
supportFile.m
+part2Project
mainPart2.m
otherFolder2
supportFile2.m
This is the behavious I would like:
When in main.m, I can call everything in common library and everything in any sub-project, including the functions inside the subfolder. So I would like to call part1Project.supportFile
When in mainPart1.m, I want to call the support files without using the prefix of the current package (i.e. I want to call supportFile directly)
When in mainPart2, I want to call supportFile2 directly. If I want access to files in the the part 1 of the project, I can call part1Project.supportFile.
The current setup is that I added the Project folder and all the subfolders to the matlab path. But this means that
I CANNOT call supportFile from anywhere; not from main (part1Project.supportFile will not work) and not even from mainPart1 (supportFile can't be found)
Much in the same way, it is hard to access elements of part1Project from part2Project
How can I achieve the behaviour I want?
You cannot access functions within a subfolder of a package unless that subfolder is a private folder in which case it will only be accessible to the functions in the immediate parent folder.
If you do use the private folder approach, then you can call functions within this private folder from the functions in the containing folder without using the fully-qualified package name.
Your layout would look like:
Project
main.m
commonLibrary
+part1Project
mainPart1.m
private
supportFile.m
+part2Project
mainPart2.m
private
supportFile2.m
Your first point will not work but the other two will. There is no built-in way to accomplish the first point.
Another option would be to use import statements in all functions within each package such that it imports all package members at the beginning of the function.
Your layout would look like
Project
main.m
commonLibrary
+part1Project
mainPart1.m
supportFile.m
+part2Project
mainPart2.m
supportFile2.m
And the contents of mainPart1.m (any any function) would look something like:
function mainPart1()
% Import the entire namespace
import part1Project.*
% No package name required
supportFile()
end
And then from main you could access supportFile
function main()
part1Project.supportFile()
end

How do I wrap a Matlab library without polluting my path variable?

Let us assume, I want to use a foreign Matlab library with a structure like this:
folderName
play.m
run.m
open.m
If I simply add folderName to my Matlab path variable, it will easily yield name conflicts. I don't want to rename the files, to be able to obtain new releases of the example library (the package concept is not used in the example library). Renaming would need to modify the code as well, if there are calls from one library function to the other.
How do I write local wrappers, which wrap the functions from that example library? My wrappers could then have my desired names and input parameters.
Clarification: How do I use an external library (toolbox) without name conflicts, without renaming and without modifying each function?
Rename files: Makes it hard to update the external library.
Simply put them in a package folder: This will break internal library function calls.
You want to use a package, which will establish a namespace, such that things in the package, are then qualified with the package name. You can find more information here: http://www.mathworks.com/help/matlab/matlab_oop/scoping-classes-with-packages.html

Calling functions in a nested package

I'm trying out organising my matlab code into packages, but having to use fully qualified names in nested package functions is killing me.
Say I have a package called +myPack that looks like this:
+myPack
bar.m
baz.m
The bar function might look like
function bar()
myPack.baz()
end
This is all fine and logical. However, +myPack is a componant that will be reused in multiple other packages. Lets say one looks like this:
+mySuperPack
foo.m
+myPack
bar.m
baz.m
This time, foo calls bar, which in turn calls baz. However, the original code for bar will fail because I have not used the fully qualified name
mySuperPack.myPack.baz()
Obviously +myPack doesn't know which super pack it is in, so I can't do that.
This also stops you from being able to use static methods in classes that are in packages; the class has to know which package it is in to call its own static methods, which seems crazy.
Is there any way to use nested packages like this, or am I doing packages totally wrong?
Instead of writing
mySuperPack.myPack.baz()
you can write
import mySuperPack.myPack.*
baz()
You only need to write the import statement once. Unfortunately (and this is one of the few things that really bugs me with MATLAB) import only imports into the current workspace, so you need to write it once per function/method.
I really wish you could just write it once at the top of a file, and have it import into all functions in the file, or at the top of a class and have it import into all methods of the class, but there you go. At least it's better than always needing fully qualified names everywhere.
PS on the issue of static methods: although you might typically call them as ClassName.staticMethodName, or pkgName.ClassName.staticMethodName, if you have an object obj of that class, you can also call it using obj.staticMethodName. Obviously that's not always relevant, but if you're calling a static method from within a normal method it can be convenient, as you don't need to mention (or import) the package.

What is the closest thing MATLAB has to namespaces?

We have a lot of MATLAB code in my lab. The problem is there's really no way to organize it. Since all the functions have to be in the same folder to be called (or you have to add a bunch of folders to MATLAB's path environment variable), it seems that we're doomed have loads of files in the same folder, all in the global namespace. Is there a better way to organize our files and functions? I really wish there were some sort of module system...
MATLAB has a notion of packages which can be nested and include both classes and functions.
Just make a directory somewhere on your path with a + as the first character, like +mypkg. Then, if there is a class or function in that directory, it may be referred to as mypkg.mything. You can also import from a package using import mypkg.mysubpkg.*.
The one main gotcha about moving a bunch of functions into a package is that functions and classes do not automatically import the package they live in. This means that if you have a bunch of functions in different m-files that call each other, you may have to spend a while dropping imports in or qualifying function calls. Don't forget to put imports into subfunctions that call out as well. More info:
http://www.mathworks.com/help/matlab/matlab_oop/scoping-classes-with-packages.html
I don't see the problem with having to add some folder to Matlab's search path. I have modified startup.m so that it recursively looks for directories in my Matlab startup directory, and adds them to the path (it also runs svn update on everything). This way, if I change the directory structure, Matlab is still going to see all the functions the next time I start it.
Otherwise, you can look into object-oriented code, where you store all the methods in a #objectName folder. However, this may lead to a lot of re-writing code that can be avoided by updating the path (there is even a button add with subfolders if you add the folder to the path from the File menu) and doing a bit of moving code.
EDIT
If you would like to organize your code so that some functions are only visible to the functions that call them directly (and if you don't want to re-write in OOP), you put the calling functions in a directory, and within this directory, you create a subdirectory called private. The functions in there will only be visible to the functions in the parent directory. This is very useful if you have to overload some built-in Matlab functions for a subset of your code.
Another way to organize & reuse code is using matlab's object-oriented features. Each Object is customarily in a folder that begins with an "#" and has the file(s) for that class inside. (though the newer syntax does not require this for a class defined in a single file.) Using private folders inside class folders, matlab even supports private class members. Matlab's new class notation is relatively fully-featured, but even the old syntax is useful.
BTW, my startup.m that examines a well-known location that I do my SVN checkouts into, and adds all of the subfolders onto my path automatically.
The package system is probably the best. I use the class system (#ClassName folder), but I actually write objects. If you're not doing that, it's silly just to write a bunch of static methods. One thing that can be helpful is to put all your matlab code into a folder that isn't on the matlab path. Then you can selectively add just the code you need to the path.
So say you have two projects, stored in "c:\matlabcode\foo" and "c"\matlabcode\bar", that both use common code stored in "c:\matlabcode\common," you might have a function "setupPaths.m" like this:
function setupPaths(projectName)
basedir = fullfile('c:', 'matlabcode');
addpath(genpath(fullfile(basedir, projectName)));
switch (projectName)
case {'foo', 'bar'}
addpath(genpath(fullfile(basedir, 'common')));
end
Of course you could extend this. An obvious extension would be to include a text file in each directory saying what other directories should be added to the path to use the functions in that directory.
Another useful thing if you share code is to set up a "user specific/LabMember" directory structure, where you have different lab members save code they are working on. That way you have access to their code if you need it, but don't get clobbered when they write a function with the same name as one of yours.