MATLAB variable passing and lazy assignment

I know that in Matlab, there is a 'lazy' evaluation when a new variable is assigned to an existing one. Such as:
array1 = ones(1,1e8);
array2 = array1;
The value of array1 won't be copied to array2 unless an element of array2 is modified.
From this I supposed that all the variables in Matlab are actually value-type and are all passed by values (although lazy evaluation is used). This also implies that the variables are created on the call stack.
Well, I am not judging the way it treats variables, although I have never seen another programming language do it this way. For possibly large data structures such as arrays, treating them as value types and passing them by value does not seem like a good idea. Though the lazy evaluation saves space and time, it just seems strange to me: an expression that mutates a variable (rather than initializing or assigning it) can lead to an out-of-memory error. As far as I know, in C array names are actually pointers, and in Fortran arrays are passed by reference. Most modern languages treat arrays as reference types.
So, can anyone tell me why Matlab uses such a not-so-common way to implement arrays? Is it true that in Matlab, nothing is or can be created on the heap?
By the way, I have asked some experienced Matlab users about it. They simply say that they never change a variable once it is created, and use function calls to create new variables. That means all mutable data is treated as immutable. Is there any gain or loss in programming this way?

You're phrasing your question in a confusing way, using terms from programming languages such as C and FORTRAN that are misleading when applied to other languages.
There is a distinction between variables being passed by value or by reference, and variables having value semantics or reference semantics.
In C, variables can be passed by value, or they can be passed by reference using a pointer.
MATLAB does not have pointers. Whatever you've been told, MATLAB always passes variables by value. Since it does not have pointers, it doesn't make sense to ask whether it is passing variables by value or by reference - it must be by value.
Nevertheless, MATLAB variables can have either value semantics or reference semantics. In MATLAB, a variable with reference semantics is called a handle variable.
To emphasise - even if the variable is being passed by value, it can have either value or reference semantics.
When you create a regular variable:
>> a = 1;
The variable a has value semantics. What this means is that if you create another variable from it and then change the original, the new variable does not change.
>> b = a;
>> b
b =
1
>> a = 2;
>> b
b =
1
But if you create, for example, a figure:
>> f = figure;
The variable f has reference, or handle semantics. What this means is that if you create another variable from it and then change the original, the new variable also changes.
>> get(f, 'Name')
ans =
''
>> g = f;
>> set(f, 'Name', 'hello')
>> get(g, 'Name')
ans =
hello
When you define your own variable types using MATLAB OO classes, you can specify whether the objects of that class will have value or reference/handle semantics by inheriting the class from the built-in class handle.
Objects that are instances of value classes will behave similarly to a above; objects that are instances of handle classes will behave similarly to f above.
And they are both, always, passed by value.
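The difference between the two kinds of semantics can also be seen without graphics. The following is only a minimal sketch; it assumes that the built-in containers.Map (a handle class) is an acceptable stand-in for a user-defined handle class, and the variable names are made up for illustration:

```matlab
% Value semantics: a struct is copied (lazily) on assignment
s1 = struct('count', 0);
s2 = s1;
s2.count = 5;            % only s2 changes
disp(s1.count);          % s1 still holds its original value

% Handle semantics: containers.Map is a built-in handle class
m1 = containers.Map();
m2 = m1;                 % m2 refers to the same underlying map as m1
m2('count') = 5;         % the change is visible through both variables
disp(m1('count'));
```

In both cases the assignment itself copies the variable by value; what differs is whether that value is the data itself or a handle to shared data.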
I'm guessing at the underlying reason for your question, but I would recommend that you take a look at how to create handle classes. They will probably provide you with the variable behaviour that you're hoping to achieve (i.e. being able to pass it around, take a copy of it without increasing memory significantly, and have it always refer to the same underlying thing).
If the "experienced MATLAB users" you have spoken to are using only value variables then they are losing a great deal - it is very often much more convenient to use handle variables. And I would actually bet that they are using them without realising it - pretty much all of MATLAB Handle Graphics relies on handle variables, like f above.
I believe the above is a complete explanation of the semantics of MATLAB variables. There are a couple of other wrinkles that confuse people, but they do not contradict the above:
Although MATLAB has pass-by-value behaviour (which, as explained above is different from whether variables have value or reference semantics), it also has lazy or copy-on-write behaviour. You describe this in your question, so you obviously get what it's doing, but it's simply an optimization that is a separate issue from the passing behaviour or variable semantics.
As mentioned in a comment by @Bernhard, if you implement functions using a syntax similar to x = myfun(x) rather than the more normal y = myfun(x), MATLAB can perform in-place optimizations on your code (i.e. overwriting the original variable rather than making a temporary copy) in some circumstances (in particular, the operations carried out on x within myfun have to be capable of being done in-place, such as arithmetic or trigonometric functions, not matrix operations like ' that would change the dimensions). But again, this is just an optimization, it doesn't change the semantics of the variables.
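The copy-on-write behaviour can be observed with a rough timing experiment. This is a sketch only: the array size is arbitrary, the timings depend on the machine, and the variable names are illustrative.

```matlab
n = 1e7;
a = ones(1, n);

tic; b = a;     tAssign = toc;  % expected to be cheap: no data is copied yet
tic; b(1) = 0;  tWrite  = toc;  % the first write to b forces the actual copy

fprintf('assignment: %.6f s, first write: %.6f s\n', tAssign, tWrite);

% Value semantics are preserved regardless of the optimization:
% a is untouched by the write to b.
disp(a(1));
```

Whatever the timings turn out to be, a and b behave as independent values, which is why the optimization does not affect the semantics.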
PS One more thing - stop thinking about the stack and the heap as well; there's not really an analogue in MATLAB, because you don't really have control over what area of memory your variables are stored in.

Related

Does MATLAB have any set-like datatype?

I am looking for a way to compare finite sequential data with non-deterministic ordering in MATLAB. Basically, what I want is an array, but without imposing an order on the contained elements. If I have the objects
a = [x y z];
and
b = [x z y];
I'd want isequal(a, b) to return true. With arrays, this is not the case. The easy fix would be to sort the entries before comparing them. Unfortunately, in my case the elements are complex objects which cannot easily be mapped to have an unambiguous numerical relationship to each other. Another approach would be not to use isequal, but rather a custom comparison function which asserts matching lengths and then simply checks if each element from the first array is contained in the second one. However, in my case the arrays are non-trivially nested inside the structs I am trying to compare via isequal, and it would be quite complicated to write a custom comparison function for the encapsulating structs. Other than this ordering problem, the inbuilt isequal function covers all of my needs, as it correctly handles arbitrarily nested structs with arbitrary fields, so I would really like to avoid writing a complicated custom function for that.
Is there any datatype in MATLAB which allows for the described behavior? Or is there a way to easily build such a custom type? In Java, I could simply write a wrapper class with a custom implementation for the equals method, but there seems to be no such mechanism in MATLAB?
I've found a way to solve my problem elegantly. Contrary to my previously stated belief, MATLAB actually does allow for class-specific overriding of isequal.
classdef CustomType
    properties
        value
    end
    methods
        function self = CustomType(value)
            self.value = value;
        end
        function equal = isequal(self, other)
            if not(isa(other, 'CustomType'))
                equal = false;
                return;
            end
            % implement custom comparison rules here
        end
    end
end
So, I can simply assign the fields in question like this and don't have to change anything else in my code:
a = CustomType([x y z]); % custom type
...
b = CustomType([x z y]);
...
isequal(a, b); % true
In my use case, I don't even need the uniqueness property of sets. So I only have to perform order-independent comparison and don't need to waste performance on ensuring unrequired properties. Furthermore, by using a dedicated type, I can differentiate explicitly between fields which have order (i.e. regular arrays) and those which don't, at the moment of assignment.
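For illustration, an order-independent comparison could look roughly like the following. This is a sketch under assumptions, not the code actually used above: the class name UnorderedList and the property items are hypothetical, the elements are assumed to be stored in a cell array, and only the two-argument form of isequal is handled. The class would need to live in its own file, UnorderedList.m.

```matlab
classdef UnorderedList   % hypothetical name, stored in UnorderedList.m
    properties
        items            % cell array of arbitrary elements (assumption)
    end
    methods
        function self = UnorderedList(items)
            self.items = items;
        end
        function equal = isequal(self, other)
            equal = false;
            if ~isa(other, 'UnorderedList') || ...
                    numel(self.items) ~= numel(other.items)
                return;
            end
            % Match each element of self against a not-yet-used element
            % of other, using the built-in isequal for the elements.
            unmatched = true(1, numel(other.items));
            for i = 1:numel(self.items)
                found = false;
                for j = find(unmatched)
                    if isequal(self.items{i}, other.items{j})
                        unmatched(j) = false;
                        found = true;
                        break;
                    end
                end
                if ~found
                    return;
                end
            end
            equal = true;
        end
    end
end
```

With this class on the path, isequal(UnorderedList({1, 'a', [2 3]}), UnorderedList({[2 3], 1, 'a'})) is expected to return true, while lists of different lengths or with a non-matching element compare as unequal. The pairwise matching is quadratic in the number of elements, which is the price for not requiring a sort order.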
Another solution might be to overwrite the inbuilt isequal and make it apply custom comparison rules when its arguments are of a specific type. However, this would slow down all comparisons in the whole program and make for bad encapsulation. I feel like using a custom type with an overridden isequal is the way to solve this kind of problem. But I still think that sets (and other types of commonly used containers) should be included in the basic repertoire of MATLAB.

Coupled variables in hyperparameter optimization in MATLAB

I would like to find optimal hyperparameters for a specific function, and I am using the bayesopt routine in MATLAB.
I can set the variables to optimize like the following:
a = optimizableVariable('a',[0,1],'Type','integer');
But I have coupled variables, i.e. variables whose relevance depends on the value of other variables, e.g. a={0,1}, and b={0,1} iff a=1.
This means that b only influences the function if a==1.
I thought about creating a single variable that encompasses all the possibilities, i.e. c=1 if a=0, c=2 if a=1,b=0, c=3 if a=1,b=1. The problem is that I am also interested in optimizing continuous variables, for which this approach no longer works.
I tried something along the lines of
b = a * optimizableVariable('b',[0,1],'Type','integer');
But MATLAB threw an error.
Undefined operator '*' for input arguments of type 'optimizableVariable'.
After three months almost to the day, buried deep down in MATLAB documentation, the answer was to use constrained variables.
https://www.mathworks.com/help/stats/constraints-in-bayesian-optimization.html#bvaw2ar
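As a rough sketch of what that page describes, a conditional constraint can be passed to bayesopt through the 'ConditionalVariableFcn' option (Statistics and Machine Learning Toolbox). The objective function and the helper names below are made up for illustration; only the a/b setup comes from the question.

```matlab
a = optimizableVariable('a', [0, 1], 'Type', 'integer');
b = optimizableVariable('b', [0, 1], 'Type', 'integer');

% b is only meaningful when a == 1; the conditional function below marks
% it as NaN otherwise, so the optimizer treats it as irrelevant there.
results = bayesopt(@toyObjective, [a, b], ...
    'ConditionalVariableFcn', @ignoreBWhenAIsZero, ...
    'MaxObjectiveEvaluations', 15, ...
    'Verbose', 0, 'PlotFcn', []);
disp(results.XAtMinObjective);

function X = ignoreBWhenAIsZero(X)
    % X is a table with one column per optimizable variable
    X.b(X.a == 0) = NaN;
end

function loss = toyObjective(x)
    % Hypothetical objective: x is a one-row table with fields a and b
    if x.a == 0
        loss = 1;
    else
        loss = (x.b - 1)^2;
    end
end
```

The objective has to cope with b being NaN whenever a is 0, which is why it branches on x.a before touching x.b.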

Built-in function for assignment in Matlab

I was having a doubt today :).
For
A=1;
is there any function f that does the same? like following
f(A,1);
It could help me in some cases like in cellfun or something like that.
You can do this easily if your variable A is a handle class object, thus giving it reference behavior. You could then create a method f for the class that accepts a class object A and a new value for it to store. See Object-Oriented Programming for more information.
For data types like double or cell there are no built-in functions that work this way. You could make your own function using assignin and inputname like so:
function f(var, value)
assignin('caller', inputname(1), value);
end
And call it as follows, with A already defined:
A = 0;
f(A, 1); % Changes the value of A to 1
However, this would generally be considered bad practice as it makes the code harder to follow, as call-by-value behavior is the expected norm.
In general no, MATLAB functions cannot change their input.
But, if you are brave, you can create a MEX-file that breaks that promise and does change the input. In a MEX-file you can write to the input array, but doing so carelessly causes havoc. For example,
B = A;
f(A,1); % <- modifies A
would cause B to also be modified, because MATLAB delays copying the data when you do B = A. That is, the two variables point to the same data until you modify one, at which point the data is copied. But in a MEX-file you can write to a matrix without doing this check, thereby modifying B also. The link I provided shows how to modify A carefully.

How can I get Matlab to use the variable value instead of name

In my code, I have a line that looks like this:
f=@(test) bf{i}(5);
where bf is a cell array with functions from str2func() stored in it, i is a variable storing an integer, and the 5 is the argument to pass to the function. How can I get matlab to evaluate the line using the current value of i? Right now when I display f it outputs:
@(test)bf{i}(5)
Let's say i=1, I want it to output:
@(test)bf{1}(5)
Although technically the bf{1} should also be replaced with whatever function is stored in bf{1}. How can I force matlab to evaluate the variables in this statement?
When you create a function handle, the workspace variables it uses are copied, and the expression is evaluated when you call the function handle (this is typically not a problem for memory consumption, since MATLAB only stores changes).
Now the problem is, to tell Matlab when to evaluate what part of the expression.
If you are aiming for better performance, pre-evaluate all constant parts of the function. Let's say your function is @(x)(g(3).*f(x)); in this case MATLAB would evaluate g(3) on every call.
Instead use:
f=@(x)(x.^2)
g_3=g(3)
h=@(x)(g_3.*f(x))
Now, having the constant parts evaluated, you want to see the constants instead of the variable name. I know two ways to achieve this.
You can use the symbolic toolbox, basically converting the function handle to a symbolic function, then to a function handle again. This not only displays the constants, but also substitutes f. This is not possible for all functions.
>> matlabFunction(h(sym('x')))
ans =
@(x)x.^2.*4.2e1
Another possibility is to use eval:
h=eval(['@(x)',sprintf('%e',g_3),'.*f(x)'])
Pre-evaluating constant parts of the expressions as I did in the first step is typically recommendable, but both solutions to get the constant visible in your function handle aren't really recommendable. The first solution using matlabFunction only applies to some functions, while the second comes with all the disadvantages of eval.
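The capture behaviour mentioned at the start of this answer can be checked directly. The sketch below assumes two arbitrary functions, sin and cos, in place of the str2func results from the question:

```matlab
bf = {@sin, @cos};      % stand-ins for the functions stored in bf
i = 1;
f = @(test) bf{i}(5);   % bf and i are captured with their current values

i = 2;                  % changing i afterwards does not affect f
bf{1} = @tan;           % nor does changing bf

result = f(0);          % still evaluates sin(5); the argument test is unused
disp(result == sin(5));
disp(func2str(f));      % the text of the handle still shows the names bf and i
```

So the handle already uses the value i had when it was created; only its displayed text keeps the variable names.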

A command to catch the variable values from the workspace, inside a function

When I write a function in MATLAB, I sometimes have equations, each of which has constants, and I have to declare these constants inside my function. I wonder if there is a way to read the values of those constants from outside the function, if I already have them in the workspace.
I don't want to write this values as inputs of my function in the function declaration.
In addition to the solutions provided by Iterator, which are all great, I think you have some other options.
First of all, I would like to warn you about global variables (as Iterator also did): these introduce hidden dependencies and make it much more cumbersome to reuse and debug your code. If your only concern is ease of use when calling the functions, I would suggest you pass along a struct containing those constants. That has the advantage that you can easily save those constants together. Unless you know what you're doing, do yourself a favor and stay away from global variables (and functions such as eval, evalin and assignin).
Next to global, evalin and passing structs, there is another mechanism for global state: preferences. These are to be used when it concerns a nearly immutable setting of your code. These are unfit for passing around actual raw data.
If all you want is a more or less clean syntax for calling a certain function, this can be achieved in a few different ways:
You could use a variable number of parameters. This is the best option when your constants have a default value. I will explain by means of an example, e.g. a regular sine wave y = A*sin(2*pi*t/T) (A is the amplitude, T the period). In MATLAB one would implement this as:
function y = sinewave(t,A,T)
y = A*sin(2*pi*t/T);
When calling this function, we need to provide all parameters. If we extend this function to something like the following, we can omit the A and T parameters:
function y = sinewave(t,A,T)
if nargin < 3
    T = 1; % default period is 1
    if nargin < 2
        A = 1; % default amplitude 1
    end
end
y = A*sin(2*pi*t/T);
This uses the construct nargin, if you want to know more, it is worthwhile to consult the MATLAB help for nargin, varargin, varargout and nargout. However, do note that you have to provide a value for A when you want to provide the value of T. There is a more convenient way to get even better behavior:
function y = sinewave(t,A,T)
if ~exist('T','var') || isempty(T)
    T = 1; % default period is 1
end
if ~exist('A','var') || isempty(A)
    A = 1; % default amplitude 1
end
y = A*sin(2*pi*t/T);
This has the benefit that it is clearer what is happening, and you can omit A but still specify T (the same can be done with the previous example, but that gets complicated quickly when you have a lot of parameters). You do this by calling sinewave(1:10,[],4), where A will retain its default value. If an empty input should be valid, use another invalid input as the marker (e.g. NaN, inf or a negative value for a parameter that is known to be positive, ...).
Using the function above, all the following calls are equivalent:
t = rand(1,10);
y1 = sinewave(t,1,1);
y2 = sinewave(t,1);
y3 = sinewave(t);
If the parameters don't have default values, you could wrap the function into a function handle which fills in those parameters. This is something you might need to do when you are using some toolboxes that impose constraints onto the functions that are to be used. This is the case in the Optimization Toolbox.
I will consider the sinewave function again, but this time I use the first definition (i.e. without a variable number of parameters). Then you could work with a function handle:
f = @(x)(sinewave(x,1,1));
You can work with f as you would with any other function:
e.g. f(10) will evaluate sinewave(10,1,1).
That way you can write a general function (i.e. sinewave that is as general and simple as possible) but you create a function (handle) on the fly with the constants substituted. This allows you to work with that function, but also prevents global storage of data.
You can of course combine different solutions: e.g. create function handle to a function with a variable number of parameters that sets a certain global variable.
The easiest way to address this is via global variable:
http://www.mathworks.com/help/techdoc/ref/global.html
You can also get the values in other workspaces, including the base or parent workspace, but this is ill-advised, as you do not necessarily know what wraps a given function.
If you want to go that route, take a look at the evalin function:
http://www.mathworks.com/help/techdoc/ref/evalin.html
Still, the standard method is to pass all of the variables you need. You can put these into a struct, if you wish, and only pass the one struct.
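A minimal sketch of the struct approach, reusing the sine wave example from the other answer. The names params and sinewaveP are illustrative only, and an anonymous function is used here just to keep the example in one piece:

```matlab
% Collect the constants in one struct instead of separate variables
params = struct('A', 2, 'T', 4);

% The function takes the struct as a single extra argument
sinewaveP = @(t, p) p.A * sin(2*pi*t / p.T);

t = 0:0.5:2;
y = sinewaveP(t, params);
disp(y);
```

The struct can be saved and loaded as a unit, and adding a new constant later does not change the function's argument list.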