I am writing a script (let's call it main.m) that calls a function I wrote (let's call it myfunc.m). It seems I have several of these myfunc.m functions at different locations on my MATLAB path.
I would like to restrict MATLAB to look only within the folder containing my main.m script when resolving custom functions.
So for example if I have in
C:\example\main.m
C:\example\myfunc.m
and
C:\asd\main.m
C:\asd\myfunc.m
and I open main.m in folder C:\example\, then when the call to myfunc.m comes, it can ONLY call the function within folder C:\example\. The same goes if I run main.m in folder C:\asd\.
I hope this makes sense, thanks.
In the short term, a quick solution is to turn your myfunc.m files into private functions. Private functions take precedence over regular functions on the path, and they can only be called by functions in the same parent folder.
Simply place your myfunc.m files in a folder called private:
C:\example\main.m
C:\example\private\myfunc.m
and
C:\asd\main.m
C:\asd\private\myfunc.m
Now example\private\myfunc.m is only callable from the folder example, and asd\private\myfunc.m is only callable from the folder asd. In addition, private functions are higher in precedence than other functions, so you can ensure that the right one is always called.
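As a quick sanity check (a sketch, assuming the layout above), you can ask MATLAB how a call inside main.m will resolve using the `which ... in ...` form:

```matlab
% Assuming C:\example\main.m and C:\example\private\myfunc.m exist.
cd('C:\example');
% Show which myfunc.m a call made from inside main.m would dispatch to:
which myfunc in main
% Should report the private copy under C:\example\private\
```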
Longer term, you might benefit from taking a look at some of the other more extensive ways that MATLAB provides for managing namespace conflicts, such as subfunctions, object-oriented programming, and packages.
Subfunctions are extremely simple to get the hang of. Packages are not at all complicated but require a bit of thought about how to organize your code (which is usually well worth it). OO programming is a much larger change in typical programming style, but for larger applications is pretty essential.
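For instance, packages are simply folders whose names begin with a `+`; a minimal sketch, with hypothetical folder and function names:

```matlab
% Folder layout (the parent of each +folder must be on the path):
%   C:\work\+example\myfunc.m
%   C:\work\+asd\myfunc.m
%
% Each copy now lives in its own namespace, so calls are unambiguous:
addpath('C:\work');
y1 = example.myfunc(1);   % calls +example\myfunc.m
y2 = asd.myfunc(1);       % calls +asd\myfunc.m
```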
Related
I'm working on a large data analysis that incorporates lots of different elements and hence I make heavy use of external toolboxes and functions from file exchange and github. Adding them all to the path via startup.m is my current working method but I'm running into problems of shadowing function names across toolboxes. I don't want to manually change function names or turn them into packages, since a) it's a lot of work to check for shadowing and find all function calls and more importantly b) I'm often updating the toolboxes via git. Since I'm not the author all my changes would be lost.
Is there a programmatic way of packaging the toolboxes so that each gets its own namespace? (With as little overhead as possible?)
Thanks for the help
You can achieve this. The basic idea is to put all the toolbox functions in a private folder and expose only a single entry point. This entry point is the only file that sees the toolbox, and it sees the toolbox functions first regardless of the order of the search path.
toolbox/exampleToolbox.m
function varargout = exampleToolbox(fcn, varargin)
    % Dispatch a call by name into the toolbox's private functions.
    fcn = str2func(fcn);
    varargout = cell(1, nargout);
    [varargout{:}] = fcn(varargin{:});
end
with toolbox/private/foo.m being an example function (the private folder must sit in the same folder as exampleToolbox.m for its contents to be visible to it).
Now you can call foo(1,2,3) via exampleToolbox('foo',1,2,3).
The same technique could be used to generate a class instead. That way you could write exampleToolbox.foo(1,2,3).
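A sketch of that class-based variant, assuming the same toolbox/private layout (the private file is named foo_impl.m here, a hypothetical name, so it doesn't clash with the static method's name):

```matlab
% toolbox/exampleToolbox.m -- hypothetical sketch of the class-based wrapper
classdef exampleToolbox
    methods (Static)
        function varargout = foo(varargin)
            % Forward to toolbox/private/foo_impl.m. Private functions are
            % visible here because this class file sits in their parent folder.
            varargout = cell(1, nargout);
            [varargout{:}] = foo_impl(varargin{:});
        end
    end
end
```

Callers would then write exampleToolbox.foo(1,2,3), and only the class file is visible on the path.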
What was the original reason for MATLAB's one (primary) function = one file, and why is it still so, after so many years of development?
What are the advantages of this approach, compared to its disadvantages (people put too many things in functions and scripts, when they should obviously be separated ... resulting in loss of code clarity)?
Matlab's scheme of loading one class/function per file seems to match Java's choice in this matter. I am betting there were also technical reasons for speeding up the parser when it was introduced in the 1980s. Java chose this scheme to discourage extremely large files with everything stuffed inside, which has been the primary argument for every language I've seen using one-file class semantics.
However, forcing one-class-per-file semantics doesn't stop mega files -- KPIB is a perfect example of a complicated, horrifically long function/class file (though a quite useful mega file). So the one-class-per-file system is more a way of making the user aware of code abstraction than a functionally useful mechanism.
A positive result of the one function/class per file system of Matlab is that it's very easy to know what functions are available at a quick glance of a project directory. Additionally, many of the names had to be made descriptive enough to differentiate them from other files, so naming as a minor form of documentation is present as a side effect.
In the end I don't think there are strong arguments for or against one-file classes, as it's usually just a minor semantic change to go from one to the other (unless your code is in a horribly unorganized state... in which case you should be shamed into fixing it).
EDIT!
I fixed the bad reference to Matlab adopting Java's one class file system -- after more research it appears that both developers adopted this style independently (or rather didn't specify that the other language influenced their decision). This is especially true since Matlab didn't bundle Java until 2000.
I don't think there is any advantage. But you can put as many functions as you need in a single file.
For example:
classdef UTILS
    methods (Static)
        function help
            % prints help for all functions
            disp(char(methods(mfilename, '-full')));
        end
        function func_01()
        end
        function func_02()
        end
        % ...more functions
    end
end
I find it very neat.
>> UTILS.help
obj UTILS
Static func_01
Static func_02
Static help
>> UTILS.func_01()
I recently worked on a project that we had to deploy using PowerShell scripts. We wrote roughly 2,000 lines of code spread across different files. Some files were dedicated to common methods but, after each grew to about 500 lines, it was hard to find which method to use, or to tell whether a new one needed to be implemented.
So, my question is about the best way to implement a PowerShell function library:
Is it better to have a few files with a lot of code, or a lot of files with a few lines of code each?
The answer from @MikeShepard is conceptually the way to go. Here are just a few implementation ideas to go along with it:
I have open-source libraries in a number of languages. My PowerShell API starts with the top-level being organized into different topics, as Mike Shepard suggested.
For those topics (modules) with more than one function (e.g. SvnSupport), each public function is in a separate file with its private support functions and variables, thereby increasing cohesion and decreasing coupling.
To wrap the collection of functions within a topic (module) together, you could enumerate them individually (either by dot-sourcing or by including them in the manifest, as @Thomas Lee suggested). But my preference is for a technique I picked up from Scott Muc. Use the following code as the entire contents of your .psm1 file and place each of your other functions in separate .ps1 files in the same directory.
Resolve-Path $PSScriptRoot\*.ps1 |
? { -not ($_.ProviderPath.Contains(".Tests.")) } |
% { . $_.ProviderPath }
There is actually quite a lot more to say about functions and modules; the interested reader might find my article Further Down the Rabbit Hole: PowerShell Modules and Encapsulation published on Simple-Talk.com a useful starting point.
You can create a Module where you can store all your script dedicated to common jobs.
I agree with @Christian's suggestion and use a module.
One tactic you might use is to break the module up into multiple scripts and include them all in the final module. You can either explicitly dot-source them in a .PSM1 file, or specify the files in a manifest (.PSD1 file).
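For the manifest route, a minimal sketch (module name, file names, and function names are all hypothetical):

```powershell
# MyTools.psd1 -- a minimal manifest for a module whose functions live in
# separate .ps1 files dot-sourced by MyTools.psm1
@{
    ModuleVersion     = '1.0.0'
    RootModule        = 'MyTools.psm1'
    # Export only the public functions; private helpers stay hidden.
    FunctionsToExport = @('Get-Widget', 'Set-Widget')
}
```

Listing the public functions explicitly in FunctionsToExport also speeds up module auto-discovery, since PowerShell doesn't have to load the module to learn what it exports.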
I tend to have multiple modules based on subject matter (loosely, nouns). For instance, if I had a bunch of functions dealing with MongoDB, I'd have a MongoDB module. That makes it easy to pull them into a session if I need them, but doesn't clutter every session with a bunch of functions that I rarely use. A consistent naming convention will make it easy to know what to import. For example, modMongoDB.psm1 would be an easy name to remember.
As a side note, in 3.0 module loading can be configured to be automatic so there's no need to preload a bunch of modules in your profile.
This is a general design question not relating to any language. I'm a bit torn between going for minimum code or optimum organization.
I'll use my current project as an example. I have a bunch of tabs on a form that perform different functions. Let's say Tab 1 reads in a file with a specific layout, Tab 2 exports a file to a specific location, etc. The problem I'm running into now is that I need these tabs to do something slightly different based on the contents of a variable. If it contains a 1, I may need to use Layout A and perform some extra concatenation; if it contains a 2, I may need to use Layout B, do no concatenation, but add two integer fields; etc. There could be 10+ codes that I will be looking at.
Is it preferable to create an individual path for each code early on, or to attempt to create a single path that branches out only when absolutely required?
Creating an individual path for each code would allow my code to be extremely easy to follow at a glance, which in turn will help me out later on down the road when debugging or making changes. The downside to this is that I will increase the amount of code written by calling some of the same functions in multiple places (for example, steps 3, 5, and 9 may be exactly the same for every single code).
Creating a single path that would branch out only when required will be a bit messier and more difficult to follow at a glance, but I would create less code by placing conditionals only at steps that are unique.
I realize that this may be a case-by-case decision, but in general, if you were handed a previously built program to work on, which would you prefer?
Edit: I've drawn some simple images to help express it. Codes 1/2/3 are the variables and the lines under them represent the paths they would take. All of these steps need to be performed in a linear chronological fashion, so there would be a function to essentially just call other functions in the proper order.
Different Paths
Single Path
Creating a single path that would branch out only when required will be a bit messier and more difficult to follow at a glance, but I would create less code by placing conditionals only at steps that are unique.
I'm not buying this statement. There is a level of finesse involved in deciding when to write new functions. Functions should be as simple and reusable as possible (but no simpler). The correct answer is almost never 'one big file that does a lot of branching'.
Less LOC (lines of code) should not be the goal. Readability and maintainability should be the goal. When you create functions, the names should be self documenting. If you have a large block of code, it is good to do something like
function doSomethingComplicated() {
    stepOne();
    stepTwo();
    // and so on
}
where the function names are self documenting. Not only will the code be more readable, you will make it easier to unit test each segment of the code in isolation.
For the case where many methods call exactly the same methods, you can use good OO design and design patterns to minimize the number of functions that do the same thing. This is in reference to your statement: "The downside to this is that I will increase the amount of code written by calling some of the same functions in multiple places (for example, steps 3, 5, and 9 for every single code may be exactly the same)."
The biggest danger in starting with one big block of code is that it will never actually get refactored into smaller units. Just start down the right path to begin with....
EDIT --
For your picture, I would create a base class with all of the common methods. The base class would be abstract, with an abstract method for the step that varies. Subclasses would implement the abstract method and reuse the common functions they need. Of course, replace 'abstract' with whatever your language of choice provides.
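Since the question is language-agnostic, here is a sketch of that suggestion in Python (class names, method names, and the return values are all hypothetical):

```python
from abc import ABC, abstractmethod

class CodeHandler(ABC):
    """Base class holding the steps shared by every code path."""

    def run(self):
        # The common steps run for every code; only unique_step varies.
        return [self.step_three(), self.unique_step(), self.step_five()]

    def step_three(self):
        return "step3"

    def step_five(self):
        return "step5"

    @abstractmethod
    def unique_step(self):
        """The one step each code implements differently."""

class CodeOne(CodeHandler):
    def unique_step(self):
        return "layout A + concatenation"

class CodeTwo(CodeHandler):
    def unique_step(self):
        return "layout B + add integer fields"

print(CodeOne().run())  # ['step3', 'layout A + concatenation', 'step5']
```

The shared steps live in exactly one place, so a change to step 3 or 5 automatically applies to every code, while each code's unique behavior stays easy to find.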
You should always err on the side of generalization, the only exception being early prototyping (where the throughput of generating working code is heavily impacted by designing correct abstractions/generalizations). Having said that, you should NEVER let that mess of non-generalized cloned branches survive past the early prototype stage, as it leads to messy, hard-to-maintain code (if you are doing almost the same thing in 3 different places and need to change that thing, you're almost sure to forget to change 1 of the 3).
Again, it's hard to answer such an open-ended question specifically, but I believe you don't have to sacrifice one for the other.
OOP techniques solve this issue by allowing you to encapsulate the reusable portions of your code and generate child classes to handle object-specific behaviors.
Personally, I think you might (if your API allows it) create inherited forms, create them on the fly on the master form (with tabs), pass arguments, and embed them in the tab container.
When to inherit a form and when to use arguments (code) to show/hide/add/remove functionality is up to you, but the master form should contain only decisions and argument passing, and the embeddable forms just plain functionality -- this way you can separate organisation from implementation.
I want to create one big file for all the cool functions I find somehow reusable and useful, and put them all into that single file. Well, for now I don't have many, so it's probably not worth thinking much about splitting them into several files. I would use pragma marks to separate them visually.
But the question: would those unused methods hurt in any way? Would my application bloat or perform worse? Or is the compiler/linker clever enough to know that functions A and B are not needed, and thus not copy their code into my resulting app?
This sounds like an absolute architectural and maintenance nightmare. As a matter of practice, you should never make a huge blob file with a random set of methods you find useful. Add the methods to the appropriate classes or categories. See here for information on the blob anti-pattern, which is what you are doing here.
To directly answer your question: no, methods that are never called will not affect the performance of your app.
No, they won't directly affect your app. Keep in mind, though, that all that unused code will make your functions file harder to read and maintain. Plus, writing functions you're not actually using at the moment makes it easy to introduce bugs that won't become apparent until much later, when you start using those functions -- which can be very confusing, because you've forgotten how they're written and will probably assume they're correct since you haven't touched them in so long.
Also, in an object oriented language like Objective-C global functions should really only be used for exceptional, very reusable cases. In most instances, you should be writing methods in classes instead. I might have one or two global functions in my apps, usually related to debugging, but typically nothing else.
So no, it's not going to hurt anything, but I'd still avoid it and focus on writing the code you need now, at this very moment.
The code will still be compiled and linked into the project; it just won't be used by your code, meaning your resulting executable will be larger.
I'd probably split the functions into separate files according to the common areas they address, so I'd have a library of image functions separate from a library of string-manipulation functions, then include whichever are pertinent to the project at hand.
I don't think having unused functions in the .h file will hurt you in any way. If you compile all the corresponding .m files containing the unused functions in your build target, then you will end up making a bigger executable than is required. Same goes for if you include the code via static libraries.
If you do use a function but you didn't include the right .m file or library, then you'll get a link error.