In short: is there an elegant way to restrict the scope of anonymous functions, or is Matlab broken in this example?
I have a function that creates a function handle to be used in a pipe network solver. It takes as input a Network state which includes information about the pipes and their connections (or edges and vertices if you must), constructs a large string which will return a large matrix when in function form and "evals" that string to create the handle.
function [Jv,...] = getPipeEquations(Network)
... %// some stuff happens here
Jv_str = ['[listConnected(~endNodes,:)',...
' .* areaPipes(~endNodes,:);\n',...
anotherLongString,']'];
Jv_str = sprintf(Jv_str); %// This makes debugging the string easier
eval(['Jv = @(v,f,rho)', Jv_str, ';']);
This function works as intended, but whenever I need to save later data structures that contain this function handle, it requires a ridiculous amount of memory (150MB) - coincidentally about as much as the entire Matlab workspace at the time of this function's creation (~150MB). The variables that this function handle requires from the getPipeEquations workspace are not particularly large, but what's even crazier is that when I examine the function handle:
>> f = functions(Network.jacobianFun)
f =
function: [1x8323 char]
type: 'anonymous'
file: '...\pkg\+adv\+pipe\getPipeEquations.m'
workspace: {2x1 cell}
...the workspace field contains everything that getPipeEquations had (which, incidentally is not the entire Matlab workspace).
If I instead move the eval statement to a sub-function in an attempt to force the scope, the handle will save much more compactly (~1MB):
function Jv = getJacobianHandle(Jv_str,listConnected,areaPipes,endNodes,D,L,g,dz)
eval(['Jv = @(v,f,rho)', Jv_str, ';']);
Is this expected behavior? Is there a more elegant way to restrict the scope of this anonymous function?
As an addendum, when I run the simulation that includes this function several times, clearing workspaces becomes painfully slow, which may or may not be related to Matlab's handling of the function and its workspace.
I can reproduce: anonymous functions for me are capturing copies of all variables in the enclosing workspace, not just those referenced in the expression of the anonymous function.
Here's a minimal repro.
function fcn = so_many_variables()
a = 1;
b = 2;
c = 3;
fcn = @(x) a+x;
a = 42;
And indeed, it captures a copy of the whole enclosing workspace.
>> f = so_many_variables;
>> f_info = functions(f);
>> f_info.workspace{1}
ans =
a: 1
>> f_info.workspace{2}
ans =
fcn: @(x)a+x
a: 1
b: 2
c: 3
This was a surprise to me at first. But it makes sense when you think about it: because of the presence of feval and eval, Matlab can't actually know at construction time what variables the anonymous function is actually going to end up referencing. So it has to capture everything in scope just in case they get referenced dynamically, like in this contrived example. This uses the value of foo but Matlab won't know that until you invoke the returned function handle.
function fcn = so_many_variables()
a = 1;
b = 2;
foo = 42;
fcn = @(x) x + eval(['f' 'oo']);
The workaround you're doing - isolating the function construction in a separate function with a minimal workspace - sounds like the right fix.
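A minimal sketch of that idea without eval, assuming a hypothetical helper make_adder saved in its own file (the name is made up for illustration). Because the helper's workspace holds only its input, that is all the handle can carry along:

```matlab
function fcn = make_adder(a)
% MAKE_ADDER  Hypothetical helper (not from the question): builds the
% handle in a workspace that contains only the variable it needs.
fcn = @(x) a + x;
end

% Usage from the command line or another function:
%   f = make_adder(1);
%   f(2)                      % evaluates 1 + 2
%   info = functions(f);
%   info.workspace{1}         % struct holding the captured value of a
```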
Here's a generalized way to get that restricted workspace to build your anonymous function in.
function eval_with_vars_out = eval_with_vars(eval_with_vars_expr, varargin)
% Assign variables to the local workspace so they can be captured
ewvo__reserved_names = {'varargin','eval_with_vars_out','eval_with_vars_expr','ewvo__reserved_names','ewvo_i'};
for ewvo_i = 2:nargin
if ismember(inputname(ewvo_i), ewvo__reserved_names)
error('variable name collision: %s', inputname(ewvo_i));
end
eval([ inputname(ewvo_i) ' = varargin{ewvo_i-1};']);
end
clear ewvo_i ewvo__reserved_names varargin;
% And eval the expression in that context
eval_with_vars_out = eval(eval_with_vars_expr);
The long variable names here hurt readability, but reduce the likelihood of collision with the caller's variables.
You just call eval_with_vars() instead of eval(), and pass in all the input variables as additional arguments. Then you don't have to type up a static function definition for each of your anonymous function builders. This'll work as long as you know up front what variables are actually going to be referenced, which is the same limitation as the approach with getJacobianHandle.
Jv = eval_with_vars(['@(v,f,rho) ' Jv_str],listConnected,areaPipes,endNodes,D,L,g,dz);
Anonymous functions capture everything within their scope and store them in the function workspace. See MATLAB documentation for anonymous functions
In particular:
"Variables specified in the body of the expression. MATLAB captures these variables and holds them constant throughout the lifetime of the function handle.
The latter variables must have a value assigned to them at the time you construct an anonymous function that uses them. Upon construction, MATLAB captures the current value for each variable specified in the body of that function. The function will continue to associate this value with the variable even if the value should change in the workspace or go out of scope."
An alternative workaround is to use the fact that the MATLAB save function can save only the specific variables you name. I have had issues with save writing far too much data (in a very different context from yours), but some judicious naming conventions and the use of wildcards in the variable list made all my problems go away.
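As a rough sketch of the wildcard approach, assuming made-up variable names and a file in the temporary directory:

```matlab
% Hypothetical variables: two results worth keeping and one large temporary.
res_x = 1;
res_y = [2 3 4];
bigTemp = rand(500);

outFile = fullfile(tempdir, 'results_only.mat');   % assumed file name
save(outFile, 'res_*');          % wildcard: only variables starting with res_

saved = whos('-file', outFile);  % inspect what actually landed in the file
disp({saved.name});
```

This only limits which variables are written. It does not shrink a function handle that is itself stored inside one of the saved variables.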
Related
Considering we have a huge matrix called A and we pass it to the function func(A) wherein func I do a set of computations like:
func(A):
B=A;
%% a lot of processes will happen on B here
return B;
end
The fact is that as soon as I pass A to B I would not have anything to do with A anymore in my Matlab session so it takes an unnecessary space in memory. Is it possible to remove its instance in the scope of the script that called func?
Using evalin with the option caller you can evaluate expression clear A:
function A = func(A)
evalin('caller', 'clear A')
A(1) = 5;
end
However we usually don't know the name of the input variable so we can use inputname to get the actual name of the workspace variable:
function A = func(A)
name = inputname(1);
if ~isempty(name)
evalin('caller', ['clear ' name])
end
A(1)=4;
end
1. Here inputname(1) means the actual name of the first argument.
2. Work directly with A, because if you copy A into B the function scope will have two copies of A.
If you write your function as
function A = func(A)
% Do lots of processing on A here
and call it as
A = func(A);
Then MATLAB will optimize it such that you work on A in-place, meaning that no copy is made. There is no need to delete A from the workspace.
This behavior is not expressly documented as far as I know, but it is well-known. See for example on Undocumented MATLAB, or on Loren's blog.
I have a statement in my MATLAB program:
f = @(A)DistanceGauss(A,x_axis,Y_target,Y_initial,numOf,modus);
I understood that f is defined as the function handle to the function distancegauss which contains the parameters/arg list present inside the parentheses.
What does the variable A in @(A) do? Does it have any importance? While browsing I found that the variables within the parentheses after @ are the input arguments of an anonymous function.
Can anyone explain what that A does? Will this handle work even without that A after the @ symbol? After all, it is already present as an argument passed after the function name.
Your code will create an anonymous function f which accepts one input A. In particular f will call the function DistanceGauss(A,x_axis,Y_target,Y_initial,numOf,modus); where the value of A is whatever you input with f(A) and the other inputs must already exist in your workspace and will be passed to the function. Note: if the other variables don't exist you should get an error when calling f.
Now a reasonable question is why you would want to do this, when you could just call DistanceGauss(A,x_axis,Y_target,Y_initial,numOf,modus); directly with whatever values you want, without having to worry about whether some of them exist.
There are two main reasons I can think of why you would do this (I'm sure there are others). Firstly for simplicity where your other inputs don't change and you don't want to have to keep retyping them or have users accidentally change them.
The other reason is when optimizing/minimizing a function, for example with fminsearch. The MATLAB optimization functions will vary all inputs. If you want to vary only some of them, you can use this sort of syntax to reduce the number of input variables.
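A small sketch of that second use, with a made-up two-input cost function standing in for DistanceGauss (whose code we can't see):

```matlab
% Hypothetical two-input cost function: squared distance from p to a target.
cost = @(p, target) (p - target).^2;

target = 3;                    % fixed parameter, captured by the handle below
f = @(p) cost(p, target);      % fminsearch sees a function of p only

pBest = fminsearch(f, 0);      % start the search at p = 0
```

Only p is varied by the search; target stays at the value it had when f was created, so pBest should land close to 3.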
As to what A actually is in your case this will depend on what it does in DistanceGauss, which is not a standard MATLAB function so I suggest you look at the code for that.
"f(A)" or "f of A" or "The function of A" here has the handle "f"
DistanceGauss() here is another function that was defined elsewhere in your code.
You would set x_axis, Y_target, Y_initial, numOf, & modus before creating the function f. These arguments would stay the same for Function f, even if you try and set them again later.
'A' though, is different. You don't set it before you make the function. You can later operate on all values of A, such as if you plot the function or get the integral of the function. In that case, it would be performing the DistanceGauss function on every value of 'A', (though we can't see here what DistanceGauss function does. )
Anonymous function should be defined such as:
sqr = @(x) x.^2;
in which x is the variable of the function, and the function itself has no name (hence anonymous!).
Although you can do something like this:
c = 10;
mygrid = @(x,y) ndgrid((-x:x/c:x),(-y:y/c:y));
[x,y] = mygrid(pi,2*pi);
in which you are modifying an existing function ndgrid to make a new anonymous function.
In your case also:
f = @(A)DistanceGauss(A,x_axis,Y_target,Y_initial,numOf,modus);
This creates a new anonymous function that wraps DistanceGauss so that it takes a single variable A.
If you remove (A) from the code, then f would be a handle to the existing function DistanceGauss:
f = @DistanceGauss;
Now you can evaluate the function simply by using the handle:
f(A,x_axis,...)
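To see the two forms side by side, here is a sketch that uses the built-in hypot in place of DistanceGauss (an illustrative substitution only):

```matlab
c = 3;
g = @(x) hypot(x, c);   % anonymous: one free input x, c is captured
h = @hypot;             % plain handle: caller must supply both inputs

r1 = g(4);              % calls hypot(4, 3)
r2 = h(4, 3);           % same call, written out in full
```

Both calls end up evaluating hypot(4, 3); the only difference is who supplies the second argument.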
This doesn't make sense in any other language I've seen:
for...
if (...)
if (...)
ids = [1,2,3;4,5,6]
end
end
end
K = ids(:,3)
I can't find any reference in the Matlab docs, but that to me in C, Ruby, Javascript, PHP, Java, Python, heck even Ada95, should not work. It's not in the input parameters of the function, it's not declared anywhere else.
This approach is used twice in this code attached to a paper though. Can anyone shed light? Is there just global scope in Matlab?
The variable first declared and defined inside a loop is not global, but you can declare a variable just about anywhere. I don't believe there is scope more local than a function. In general, scope is very broad in MATLAB. I'd agree that the scope of a non-global variable is limited to the function in which it's defined, but there are a couple of unusual ways in which variables can get passed around in MATLAB.
One oddity that seems to defy usual scope rules is function handles. Unlike many other languages where a function handle is little more than a pointer to the function in memory, MATLAB stores a workspace for a function handle. For example:
>> a = pi;
>> aFun = @(r) a*r.^2;
>> a = 1
>> aFun(1/sqrt(2))
ans =
1.5708
The handle swallows up the initial value of a:
>> finfo = functions(aFun)
finfo =
function: '@(r)a*r.^2'
type: 'anonymous'
file: ''
workspace: {[1x1 struct]}
>> finfo.workspace{1}
ans =
a: 3.1416
Function handles to nested functions are another way to make variables accessible outside of their original scope, including the nested function itself, which can even be accessed from outside that file! Again, it does this by storing the value at the time the handle is created. Consider the function:
function [y,hf] = nestTest(x)
a = 2; b = 1;
y = nestFun(x);
hf = @nestFun;
function y = nestFun(x)
y = a*x + b;
end
end
It calls the nested function, but also returns a handle to it. It would seem to not be defining a and b, but it works:
>> [y,hf] = nestTest(2)
y =
5
hf =
@nestTest/nestFun
And so does the handle:
>> hf(2)
ans =
5
Again because it stores an internal workspace with the values that it inherited when it was defined:
>> finfo = functions(hf)
finfo =
function: 'nestTest/nestFun'
type: 'nested'
file: 'C:\Users\Jon.bobs-tavern\Documents\MATLAB\nestTest.m'
workspace: {[1x1 struct]}
>> finfo.workspace{1}
ans =
y: 5
hf: @nestTest/nestFun
x: 2
a: 2
b: 1
See Preserving Data from the Workspace for more info. Also, the MATLAB editor has highlighting to help indicate scope.
Another thing to keep in mind, which should be familiar to users of other programming languages, is the stack (or the workspace in which the variable exists). You can use assignin to directly assign a variable to the caller or "base" workspace.
I believe (though if someone wishes to contradict me I'll be interested to learn) that MATLAB variable scope is limited to the function it's defined in, not the block. So a variable defined within an if-else block within a function is accessible outside that block, but only within the same function. Basically, each function has its own workspace, and variables defined within that function all go into that workspace. It gets a bit more complicated when we start using nested functions and such, and for that I refer you to the very helpful Art of MATLAB Blog.
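A short sketch of that function-level scoping, modelled on the snippet in the question (the loop bounds and the condition are made up):

```matlab
for k = 1:3
    if k == 2
        ids = [1,2,3;4,5,6];   % first assigned inside nested blocks
    end
end

K = ids(:,3);   % still visible here: blocks do not create a new scope
lastK = k;      % the loop variable also survives the loop
```

If the if branch never ran, ids would not exist and the last assignment would raise an "undefined variable" error, which is the practical risk of relying on this.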
For your second question, MATLAB does have global scope: variables declared as
global var
are defined within the global workspace, and can be accessed anywhere* in MATLAB. If you declare a variable as global within one function, you can access the variable in another function by repeating the global var statement. Read here for more information.
*Note that global variables don't work well with parallelised code (for instance, within a parfor loop).
When I write a function in Matlab, I sometimes have equations that each contain constants, so I have to declare these constants inside my function. I wonder if there is a way to get the values of those constants from outside the function, if I have their values in the workspace.
I don't want to pass these values as inputs in the function declaration.
In addition to the solutions provided by Iterator, which are all great, I think you have some other options.
First of all, I would like to warn you about global variables (as Iterator also did): these introduce hidden dependencies and make it much more cumbersome to reuse and debug your code. If your only concern is ease of use when calling the functions, I would suggest you pass along a struct containing those constants. That has the advantage that you can easily save those constants together. Unless you know what you're doing, do yourself a favor and stay away from global variables (and functions such as eval, evalin and assignin).
Next to global, evalin and passing structs, there is another mechanism for global state: preferences. These are to be used when it concerns a nearly immutable setting of your code. These are unfit for passing around actual raw data.
If all you want is a more or less clean syntax for calling a certain function, this can be achieved in a few different ways:
You could use a variable number of parameters. This is the best option when your constants have a default value. I will explain by means of an example, e.g. a regular sine wave y = A*sin(2*pi*t/T) (A is the amplitude, T the period). In MATLAB one would implement this as:
function y = sinewave(t,A,T)
y = A*sin(2*pi*t/T);
When calling this function, we need to provide all parameters. If we extend this function to something like the following, we can omit the A and T parameters:
function y = sinewave(t,A,T)
if nargin < 3
T = 1; % default period is 1
if nargin < 2
A = 1; % default amplitude 1
end
end
y = A*sin(2*pi*t/T);
This uses the construct nargin, if you want to know more, it is worthwhile to consult the MATLAB help for nargin, varargin, varargout and nargout. However, do note that you have to provide a value for A when you want to provide the value of T. There is a more convenient way to get even better behavior:
function y = sinewave(t,A,T)
if ~exist('T','var') || isempty(T)
T = 1; % default period is 1
end
if ~exist('A','var') || isempty(A)
A = 1; % default amplitude 1
end
y = A*sin(2*pi*t/T);
This has the benefit that it is clearer what is happening, and you can omit A but still specify T (the same can be done for the previous example, but that gets complicated quite easily when you have a lot of parameters). You can do so by calling sinewave(1:10,[],4), where A will retain its default value. If an empty input should be valid, you should use another invalid input as the marker (e.g. NaN, inf or a negative value for a parameter that is known to be positive, ...).
Using the function above, all the following calls are equivalent:
t = rand(1,10);
y1 = sinewave(t,1,1);
y2 = sinewave(t,1);
y3 = sinewave(t);
If the parameters don't have default values, you could wrap the function into a function handle which fills in those parameters. This is something you might need to do when you are using some toolboxes that impose constraints onto the functions that are to be used. This is the case in the Optimization Toolbox.
I will consider the sinewave function again, but this time I use the first definition (i.e. without a variable number of parameters). Then you could work with a function handle:
f = @(x)(sinewave(x,1,1));
You can work with f as you would with any other function:
e.g. f(10) will evaluate sinewave(10,1,1).
That way you can write a general function (i.e. sinewave that is as general and simple as possible) but you create a function (handle) on the fly with the constants substituted. This allows you to work with that function, but also prevents global storage of data.
You can of course combine different solutions: e.g. create function handle to a function with a variable number of parameters that sets a certain global variable.
The easiest way to address this is via global variable:
http://www.mathworks.com/help/techdoc/ref/global.html
You can also get the values in other workspaces, including the base or parent workspace, but this is ill-advised, as you do not necessarily know what wraps a given function.
If you want to go that route, take a look at the evalin function:
http://www.mathworks.com/help/techdoc/ref/evalin.html
Still, the standard method is to pass all of the variables you need. You can put these into a struct, if you wish, and only pass the one struct.
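A minimal sketch of the struct approach, with hypothetical field names and a sine wave as the stand-in equation:

```matlab
% Hypothetical constants for the sine wave y = A*sin(2*pi*t/T).
params.A = 2;    % amplitude
params.T = 4;    % period

% The function receives every constant through one struct argument.
wave = @(t, p) p.A * sin(2*pi*t / p.T);

y = wave(1, params);   % evaluates 2*sin(pi/2)
```

Adding another constant later means adding a field to params, not changing every call site.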
I'm writing a program in MATLAB to solve integrals, and I have my function in a .M-file. Now I wonder how I can write a program in the .MAT-file that lets the user set a value that exists in the both files. The .M-file looks like this:
function fh = f(y)
fh = 62.5.*(b-y).*(40-20.*exp(-(0.01.*y).*(0.01.*y)));
and as you can see, the function depends on two variables, y and b. I want the user to set b. I tried putting b = input('Type in the value of b: ') in the .M-file but for some reason the user would then have to put in the same value four times.
Can I ask for the value of b in the .MAT-file?
Firstly, m-files store code (i.e. functions), while MAT-files store data (i.e. variables). You can save workspace variables to a MAT-file using the function SAVE and load them into a workspace from a file using the function LOAD. If you have a user choose a value for b, then save it to a MAT-file ('b_value.mat', for example), you can simply load the value from the MAT-file inside your m-file function like so:
function fh = f(y)
load('b_value.mat','b');
fh = 62.5.*(b-y).*(40-20.*exp(-(0.01.*y).*(0.01.*y)));
However, this is not a very good way to handle the larger problem I think you are having. It requires that you hardcode the name of the MAT-file in your function f, plus it will give you an error if the file doesn't exist or if b isn't present in the file.
Let's address what I think the larger underlying problem is, and how to better approach a solution...
You mention that you are solving integrals, and that probably means you are performing numerical integration using one or more of the various built-in integration functions, such as QUAD. As you've noticed, using these functions requires you to supply a function for the integrand which accepts a single vector argument and returns a single vector argument.
In your case, you have other additional parameters you want to pass to the function, which is complicated by the fact that the integration functions only accept integrand functions with a single input argument. There is actually a link in the documentation for QUAD (and the other integration functions) that shows you a couple of ways you can parameterize the integrand function without adding extra input arguments by using either nested functions or anonymous functions.
As an example, I'll show you how you can do this by writing f as an anonymous function instead of an m-file function. First you would have the user choose the parameter b, then you would construct your anonymous function as follows:
b = input('Type in the value of b: ');
f = @(y) 62.5.*(b-y).*(40-20.*exp(-(0.01.*y).^2));
Note that the value of b used by the anonymous function will be fixed at what it was at the time that the function was created. If b is later changed, you would need to re-make your anonymous function so that it uses the new value.
And here's an example of how to use f in a call to QUAD:
q = quad(f,lowerLimit,upperLimit);
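A sketch of the point about b being fixed at creation time. The value of b and the limits are made up, and integral is used here because quad is marked as not recommended in recent releases:

```matlab
b = 10;                                             % assumed value, normally from input()
f = @(y) 62.5.*(b-y).*(40-20.*exp(-(0.01.*y).^2));  % captures b = 10

q1 = integral(f, 0, 10);

b = 20;                          % changing b afterwards...
qSame = integral(f, 0, 10);      % ...does not affect the existing handle

f = @(y) 62.5.*(b-y).*(40-20.*exp(-(0.01.*y).^2));  % rebuild to pick up b = 20
q2 = integral(f, 0, 10);
```

Here q1 and qSame come from the same handle and therefore agree, while q2 reflects the new value of b.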
In your m file declare b as a global
function fh = f(y)
global b
fh = 62.5.*(b-y).*(40-20.*exp(-(0.01.*y).*(0.01.*y)));
This allows the variable to be accessed from another file without having to create another function to set the value of b. You could also add b to the arguments of your function f.