I've written a piece of code which calls numerous functions, which in turn also call several sub-functions.
I call the main file from the command line and supplement the call with arguments that trigger certain modes I have accounted for.
E.g. octave classify_file.m --debug <file> would run in my custom debug mode, which sets a constant debug to 1 and subsequently outputs all plots and relevant variables. Omitting the argument outputs only 1 variable.
Similarly, I have a template and a histogram mode, which essentially all do the same thing, except that they output some more variables, matrices and plots depending on the mode.
As it is now, I have to include the debug, template and histogram constants as arguments to each and every function if I want them to be influenced by the respective modes.
This is cumbersome and confusing; there has to be a better way. I've never worked with global variables, but would this be a good place to use one? What's an elegant solution for this problem?
This is a situation in which global variables would come in handy, although as you may be aware they are sometimes frowned upon, and they can also have certain performance implications in MATLAB. Personally I don't think passing the mode all the way down the call stack is too bad - although are you treating all three as separate arguments? The least you could do is put them in a struct in your highest-level function so that you only have one argument:
mode.debug = [whatever]
mode.histogram = [whatever]
mode.template = [whatever]
myFunction(mode);
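Then, inside any sub-function, you can branch on the fields. A minimal sketch (process here is a hypothetical stand-in for whatever work your function actually does):
function result = mySubFunction(data, mode)
result = process(data);        % hypothetical main computation
if mode.debug
    figure; plot(result);      % extra plots only in debug mode
end
end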
Or, if you can only have one mode at a time, what about some integer constants?
mode = MODE_DEBUG
or
mode = MODE_NONE
I would define the constants by creating short functions; this is how the pi constant works, for example.
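A minimal sketch of such a constant function (the names MODE_DEBUG and MODE_NONE are my own illustration, not anything built in):
% In a file MODE_DEBUG.m on the path:
function m = MODE_DEBUG()
m = 1;    % like pi: a function that always returns the same value
end
% MODE_NONE.m, MODE_HISTOGRAM.m, etc. would follow the same pattern with different values.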
Finally, there is an alternative to global variables that I rather like: functions that use persistent variables. For example:
function m = debugMode(newValue)
persistent isModeOn;
if nargin > 0
    isModeOn = newValue;
end
m = isModeOn;
end
This way you can call debugMode(1) to set it on, and you can call m = debugMode anywhere to get the value.
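A hypothetical usage sketch of the accessor above:
debugMode(1);      % switch debug mode on once, e.g. at the top level
if debugMode()     % query the same flag anywhere down the call stack
    disp('debug output goes here');
end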
My problem is the following:
I have very many (~1000) mutually calling MATLAB scripts, which are very poorly written, regularly damage each other's environments and have generally become unmanageable.
One of the reasons I even got this problem is that I need to write a test suite covering a big part of them. Luckily, for most of them the main criterion of 'correctness' is 'they don't crash'.
Just running them one by one in a loop is generally not an option, because they regularly call clear classes, close all, clc, shadow built-in functions and operators, et cetera.
So my original aim was to find a way to run a matlab script in sort of an 'isolated environment', but I didn't find a good way to do it. (Suggestions welcome, but it is not the main question.)
Since I will need to convert them all to functions anyway, I am looking for some way to do it auto-magically, or at least semi-automatically.
What I mean by semi-automatically:
1. Just add a line function varargout = $filename( varargin ) as the first line of the file, and end as the last one. This will at least make them runnable as functions with feval and the like, and (more importantly) prevent them from damaging the test runner.
2. Do point 1 and scan the file for references to undeclared variables, adding them as function arguments. This should also be doable, since the names of the variables are known. It will not help with identifying output variables, but it will still be a lot of help; for example, we could pack the whole workspace into one big output structure.
3. Do a runtime version of point 2. This way the 'magical converter' can actually track execution environments (workspaces) and identify which variables are implicitly used as 'input arguments' of a script, and which are later used as 'output arguments'. This option looks EXPHARD, but for a small number of calls it should not be too bad in practice.
Point 1 I can implement myself using sed, as I can also get rid of all the clear classes and clc calls, but options 2 and 3 seem much harder. Is there anything at least remotely resembling options 2 or 3?
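For reference, a rough sketch of point 1 done from within MATLAB itself rather than sed; the name script2function is made up, and it overwrites the file in place, so run it on copies:
function script2function(scriptFile)
% Wrap an existing script file in a function definition so it can be
% called via feval without touching the caller's workspace.
[~, name] = fileparts(scriptFile);   % function name = file name
body = fileread(scriptFile);         % original script text
fid = fopen(scriptFile, 'w');
fprintf(fid, 'function varargout = %s(varargin)\n%s\nend\n', name, body);
fclose(fid);
end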
This is probably just a short syntax question. I have:
clear all
macro drop _all
global variables var1 var2
and I want
global means m_var1 m_var2
which I have generated elsewhere. The goal is to use both globals in a Mundlak regression (like reg depvar $variables $means) and not have to calculate/include the means by hand for different specifications. My idea was something along the lines of:
global means "m_`variables'"
but that simply ignores the variables global. Again, sorry for the R-think...
Edit: My strategy: I am trying to write a program which runs models (Mundlak/Chamberlain random effects logit, see Wooldridge's Panel book, 2nd ed., p. 487) on several distinct lists of variables and returns graphs of regression results. This should be done such that I only have to change the globals/locals specifying these variables in the beginning. Thus, I need code that creates time averages of the globals and uses both these and the original global in the logit specification.
I'm not convinced your general strategy is a good one, but I don't have information on the issue you face, so I won't comment much more.
I'll state that using locals is a better idea if you can spare the globals, and that you can redefine the contents of a macro using a loop:
clear all
set more off
local variables var1 var2
// original attempt: only the first variable gets the m_ prefix
local means "m_`variables'"
// loop
local means2
foreach v of local variables {
local means2 `means2' m_`v'
}
display "`means'"
display "`means2'"
I have a rather bulky program that I've been running as a script from the MATLAB command line. I decided to clean it up a bit with some nested functions (I need to keep everything in one file), but in order for that to work it required me to also make the program itself a function. As a result, the program no longer runs in the base workspace like it did when it was a script. This means I no longer have access to the dozens of useful variables that used to remain after the program runs, which are important for extra calculations and information about the run.
The suggested workarounds I can find are to use assignin, evalin, define the variables as global, or set the output in the definition of the now function-ized program. None of these solutions appeal to me, however, and I would really like to find a way to force the workspace itself to base. Does any such workaround exist? Or is there any other way to do this that doesn't require me to manually define or label each specific variable I want to get out of the function?
Functions should clearly define their input and output variables. Organizing the code differently will make it much more difficult to understand and to modify later on. In the end, working with an unorthodox style will most likely cost you more time than investing in some restructuring.
If you have a huge number of output variables, I would suggest organizing them in structure arrays, which might be easy to handle as output variables.
The only untidy workaround I can imagine would use whos, assignin and eval:
function your_function()
x = 'hello' ;
y = 'world' ;
variables = whos ;   % list every variable in the function workspace
for k = 1:length(variables)
    % copy each local variable into the base workspace under the same name
    assignin('base', variables(k).name, eval(variables(k).name))
end
end
But I doubt that this will help with the aim to clean up your program. As mentioned above I suggest ordering things manually in structures:
function out = your_function()
x = 'hello' ;
y = 'world' ;
out.x = x ;
out.y = y ;
end
If the functions you would like to define are simple and have a single output, one option is to use anonymous functions.
Another option is to store all the variables you would like to use afterwards in a struct and have your big function return this struct as an output.
function AllVariables = GlobalFunction(varargin);
% bunch of stuff
AllVariables= struct('Variable1', Variable1, 'Variable2', Variable2, …);
end
I just discovered (to my surprise) that calling the following function
function foo()
if false
fprintf = 1;
else
% do nothing
end
fprintf('test')
gives an error: Undefined function or variable "fprintf". My conclusion is that the scope of variables is determined before runtime (in my limited understanding of how interpretation of computer languages, and specifically MATLAB, works). Can anyone give me some background information on this?
Edit
Another interesting thing I forgot to mention above is that
function foo()
if false
fprintf = 1;
else
% do nothing
end
clear('fprintf')
fprintf('test')
produces Reference to a cleared variable fprintf.
MATLAB parses the function before it's ever run. It looks for variable names, for instance, regardless of the branching that activates (or doesn't activate) those variables. That is, scope is not determined at runtime.
ADDENDUM: I wouldn't recommend doing this, but I've seen a lot of people doing things with MATLAB that I wouldn't recommend. But... consider what would happen if someone were to define their own function called "false". The pre-runtime parser couldn't know what would happen if that function were called.
It seems that the first time the MATLAB JIT compiler parses the m-file, it identifies all variables declared in the function. It doesn't seem to care whether said variable is being declared in unreachable code. So your local fprintf variable immediately hides the builtin function fprintf. This means that, as far as this function is concerned, there is no builtin function named fprintf.
Of course, once that happens, every reference within the function to fprintf refers to the local variable, and since the variable never actually gets created, attempting to access it results in errors.
Clearing the variable simply clears the local variable, if it exists; it does not bring the builtin function back into scope.
To call a builtin function explicitly, you can use the builtin function.
builtin( 'fprintf', 'test' );
The line above will always print the text at the MATLAB command line, irrespective of local variables that may shadow the fprintf function.
Interesting situation. I doubt if there is detailed information available about how the MATLAB interpreter works in regard to this strange case, but there are a couple of things to note in the documentation...
The function precedence order used by MATLAB places variables first:
Before assuming that a name matches a function, MATLAB checks for a variable with that name in the current workspace.
Of course, in your example the variable fprintf doesn't actually exist in the workspace, since that branch of the conditional statement is never entered. However, the documentation on variable naming says this:
Avoid creating variables with the same name as a function (such as i, j, mode, char, size, and path). In general, variable names take precedence over function names. If you create a variable that uses the name of a function, you sometimes get unexpected results.
This must be one of those "unexpected results", especially when the variable isn't actually created. The conclusion is that there must be some mechanism in MATLAB that parses a file at runtime to determine what possible variables could exist within a given scope, the net result of which is functions can still get shadowed by variables that appear in the m-file even if they don't ultimately appear in the workspace.
EDIT: Even more baffling is that functions like exist and which aren't even aware of the fact that the function appears to be shadowed. Adding these lines before the call to fprintf:
exist('fprintf')
which('fprintf')
Gives this output before the error occurs:
ans =
5
built-in (C:\Program Files\MATLAB\R2012a\toolbox\matlab\iofun\fprintf)
Indicating that they still see the built-in fprintf.
These may provide insight:
https://www.mathworks.com/help/matlab/matlab_prog/base-and-function-workspaces.html
https://www.mathworks.com/help/matlab/matlab_prog/share-data-between-workspaces.html
This can give you some info about what is shadowed, e.g. for the case above:
which -all fprintf
(Below was confirmed as a bug)
One gotcha is that Workspace structs, and classes on the path, have particular scoping and type precedence that (if you are me) may catch you out.
E.g. in 2017b:
% In C.m, saved in the current directory
classdef C
    properties (Constant)
        x = 100;
    end
end
% In Command window
C.x = 1;
C.x % 100
C.x % 1 (Note the space)
C.x*C.x % 1
disp(C.x) % 1
This is a rough example of what I am thinking of:
test = 'x > 0';
while str2func(test)
    % do your thing
    x = x - 1;
end
Is it possible to store whole logical operations in a variable like this?
Of course the str2func will break here; if this is possible, the function will likely be something else. And I have only added quotes around the test variable's content because I cannot think of what else the storing method would be.
I can see it being useful when sending arguments to functions and the like. But mostly I'm just wondering, because I have never seen it done in any programming language before.
You can store the textual representation of a function in a variable and evaluate it, for example
test = 'x > 0';
eval(test)
should result in 1 or 0 depending on x's value.
But you shouldn't use eval for reasons too-often covered here on SO for me to bother repeating. You should instead become familiar with functions and function handles. For example
test = @(x) x > 0
makes test a handle to a function which tests whether its argument is greater than 0 or not.
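To tie this back to the loop in the question, a small sketch using such a handle (my own illustration):
test = @(x) x > 0;   % the condition stored in a variable
x = 3;
while test(x)        % evaluate the stored condition each iteration
    x = x - 1;       % do your thing
end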
Many languages which are interpreted at run-time, as opposed to compiled languages, have similar capabilities.