Minimize inputs by loading values in MATLAB functions - matlab

In MATLAB is it be better (as far as optimization is concerned):
1)To have a function "foo" with a lot of inputs that come from the outputs from other functions
or
2)Save at the end of the functions the results to a results.mat file and load it in the "foo" function and minimize its inputs this way?

almost always: option 1.
As option 2 depends on file IO, and thus one write and one read to/from a hard disk, SSD or similar, it will likely lose out against keeping variables in RAM. Moreover, if you pass arguments to a function, and that function only reads them, no explicit copies of that variable are made. This is not true for the .mat file solution, as something that you already have in memory will be explicitly copied onto an extremely slow device (HDD, SSD), and then again read back into memory from the extremely slow device, just to save on a few input arguments.
So, unless you're working with big data sets and your variables give you out-of-memory errors, keep everything in RAM as much as possible.
You can minimize argument count by simply collecting data in a container data type. MATLAB has cell and struct for this purpose (or classdef, if you include value classes). You can transform this:
[outarg1, outarg2] = function(arg1, arg2, arg3,...)
into
[outarg1, outarg2] = function(S)
where
S = struct(...
'arg1', function1(X),...
'arg2', function2(X,Y,Z),...
'arg3', function3(X,Z),...
%// etc.
);
or
S = {
function1(X)
function2(X,Y,Z)
function3(X,Z)
%// etc.
}
or similar. Or you can make use of the special cell/functions called varargin/nargin and varargout/nargout:
varargout = function(varargin)
% rename input arguments
arg1 = varargin{1};
arg2 = varargin{2};
%// etc.
% assign all output arguments in one go
[varargout{1:nargout}] = deal(outargs{:}));
end % function
You can read more about all these things by typing help <thing> on the MATLAB command prompt. For example,
help cell
will give you a wealth of information on what a cell is and how to use it.

I think using global variables is a better way to optimize memory and processing requirements.
you can use keyword global.

Related

Clean methodology for running a function for a large set of input parameters (in Matlab)

I have a differential equation that's a function of around 30 constants. The differential equation is a system of (N^2+1) equations (where N is typically 4). Solving this system produces N^2+1 functions.
Often I want to see how the solution of the differential equation functionally depends on constants. For example, I might want to plot the maximum value of one of the output functions and see how that maximum changes for each solution of the differential equation as I linearly increase one of the input constants.
Is there a particularly clean method of doing this?
Right now I turn my differential-equation-solving script into a large function that returns an array of output functions. (Some of the inputs are vectors & matrices). For example:
for i = 1:N
[OutputArray1(i, :), OutputArray2(i, :), OutputArray3(i, :), OutputArray4(i, :), OutputArray5(i, :)] = DE_Simulation(Parameter1Array(i));
end
Here I loop through the function. The function solves a differential equation, and then returns the set of solution functions for that input parameter, and then each is appended as a row to a matrix.
There are a few issues I have with my method:
If I want to see the solution to the differential equation for a different parameter, I have to redefine the function so that it is an input of one of the thirty other parameters. For the sake of code readability, I cannot see myself explicitly writing all of the input parameters as individual inputs. (Although I've read that structures might be helpful here, but I'm not sure how that would be implemented.)
I typically get lost in parameter space and often have to update the same parameter across multiple scripts. I have a script that runs the differential-equation-solving function, and I have a second script that plots the set of simulated data. (And I will save the local variables to a file so that I can load them explicitly for plotting, but I often get lost figuring out which file is associated with what set of parameters). The remaining parameters that are not in the input of the function are inside the function itself. I've tried making the parameters global, but doing so drastically slows down the speed of my code. Additionally, some of the inputs are arrays I would like to plot and see before running the solver. (Some of the inputs are time-dependent boundary conditions, and I often want to see what they look like first.)
I'm trying to figure out a good method for me to keep track of everything. I'm trying to come up with a smart method of saving generated figures with a file tag that displays all the parameters associated with that figure. I can save such a file as a notepad file with a generic tagging-number that's listed in the title of the figure, but I feel like this is an awkward system. It's particularly awkward because it's not easy to see what's different about a long list of 30+ parameters.
Overall, I feel as though what I'm doing is fairly simple, yet I feel as though I don't have a good coding methodology and consequently end up wasting a lot of time saving almost-identical functions and scripts to solve fairly simple tasks.
It seems like what you really want here is something that deals with N-D arrays instead of splitting up the outputs.
If all of the OutputArray_ variables have the same number of rows, then the line
for i = 1:N
[OutputArray1(i, :), OutputArray2(i, :), OutputArray3(i, :), OutputArray4(i, :), OutputArray5(i, :)] = DE_Simulation(Parameter1Array(i));
end
seems to suggest that what you really want your function to return is an M x K array (where in this case, K = 5), and you want to pack that output into an M x K x N array. That is, it seems like you'd want to refactor your DE_Simulation to give you something like
for i = 1:N
OutputArray(:,:,i) = DE_Simulation(Parameter1Array(i));
end
If they aren't the same size, then a struct or a table is probably the best way to go, as you could assign to one element of the struct array per loop iteration or one row of the table per loop iteration (the table approach would assume that the size of the variables doesn't change from iteration to iteration).
If, for some reason, you really need to have these as separate outputs (and perhaps later as separate inputs), then what you probably want is a cell array. In that case you'd be able to deal with the variable number of inputs doing something like
for i = 1:N
[OutputArray{i, 1:K}] = DE_Simulation(Parameter1Array(i));
end
I hesitate to even write that, though, because this almost certainly seems like the wrong data structure for what you're trying to do.

display variable value each n iterations with condition outside the loop

When I have to display the variable value every n iterations of a for loop I always do something along these lines:
for ii=1:1000
if mod(ii,100)==0
display(num2str(ii))
end
end
I was wondering if there is a way to move the if condition outside the loop in order to speed up the code. Or also if there is something different I could do.
You can use nested loops:
N = 1000;
n = 100;
for ii = n:n:N
for k = ii-n+1:ii-1
thingsToDo(k);
end
disp(ii)
thingsToDo(ii);
end
where thingsToDo() get the relevant counter (if needed). This a little more messy, but can save a lot of if testing.
Unless the number of tested values is much larger than the number of printed values, I would not blame the if-statement. It may not seem this way at first, but printing is indeed a fairly complex task. A variable needs to be converted and sent to an output stream which is then printing in the terminal. In case you need to speed the code up, then reduce the amount of printed data.
Normally Matlab function takes vector inputs as well. This is the case for disp and display and does only take a single function call. Further, conversion to string is unnecessary before printing. Matlab should send the data to some kind of stream anyway (which may indeed take argument of type char but this is not the same char as Matlab uses), so this is probably just a waste of time. In addition to that num2str do a lot of things to ensure typesafe conversion. You already know that display is typesafe, so all these checks are redundant.
Try this instead,
q = (1:1000)'; % assuming q is some real data in your case
disp(q(mod(q,100)==0)) % this requires a single call to disp

advice with pointers in matlab

I am running a very large meta-simulation where I go through two hyperparameters (lets say x and y) and for each set of hyperparameters (x_i & y_j) I run a modest sized subsimulation. Thus:
for x=1:I
for y=1:j
subsimulation(x,y)
end
end
For each subsimulation however, about 50% of the data is common to every other subsimulation, or subsimulation(x_1,y_1).commondata=subsimulation(x_2,y_2).commondata.
This is very relevant since so far the total simulation results file size is ~10Gb! Obviously, I want to save the common subsimulation data 1 time to save space. However, the obvious solution, being to save it in one place would screw up my plotting function, since it directly calls subsimulation(x,y).commondata.
I was wondering whether I could do something like
subsimulation(x,y).commondata=% pointer to 1 location in memory %
If that cant work, what about this less elegant solution:
subsimulation(x,y).commondata='variable name' %string
and then adding
if(~isstruct(subsimulation(x,y).commondata)),
subsimulation(x,y).commondata=eval(subsimulation(x,y).commondata)
end
What solution do you guys think is best?
Thanks
DankMasterDan
You could do this fairly easily by defining a handle class. See also the documentation.
An example:
classdef SimulationCommonData < handle
properties
someData
end
methods
function this = SimulationCommonData(someData)
% Constructor
this.someData = someData;
end
end
end
Then use like this,
commonData = SimulationCommonData(something);
subsimulation(x, y).commondata = commonData;
subsimulation(x, y+1).commondata = commonData;
% These now point to the same reference (handle)
As per my comment, as long as you do not modify the common data, you can pass it as third input and still not copy the array in memory on each iteration (a very good read is Internal Matlab memory optimizations). This image will clarify:
As you can see, the first jump in memory is due to the creation of common and the second one to the allocation of the output c. If the data were copied on each iteration, you would have seen many more memory fluctuations. For instance, a third jump, then a decrease, then back up again and so on...
Follows the code (I added a pause in between each iteration to make it clearer that no big jumps occur during the loop):
function out = foo(a,b,common)
out = a+b+common;
end
for ii = 1:10; c = foo(ii,ii+1,common); pause(2); end

matlab local static variable

In order to test an algorithm in different scenarios, in need to iteratively call a matlab function alg.m.
The bottleneck in alg.m is something like:
load large5Dmatrix.mat
small2Dmatrix=large5Dmatrix(:,:,i,j,k) % i,j and k change at every call of alg.m
clear large5Dmatrix
In order to speed up my tests, i would like to have large5Dmatrix loaded only at the first call of alg.m, and valid for future calls, possibly only within the scope of alg.m
Is there a way to acheve this in matlab other then setting large5Dmatrix as global?
Can you think of a better way to work with this large matrix of constant values within alg.m?
You can use persistent for static local variables:
function myfun(myargs)
persistent large5Dmatrix
if isempty(large5Dmatrix)
load large5Dmatrix.mat;
end
small2Dmatrix=large5Dmatrix(:,:,i,j,k) % i,j and k change at every call of alg.m
% ...
end
but since you're not changing large5Dmatrix, #High Performance Mark answer is better suited and has no computational implications. Unless you really, really don't want large5Dmatrix in the scope of the caller.
When you pass an array as an argument to a Matlab function the array is only copied if the function updates it, if the function only reads the array then no copy is made. So any performance penalty the function pays, in time and space, should only kick in if the function updates the large array.
I've never tested this with a recursive function but I don't immediately see why it should start copying the large array if it is only read from.
So your strategy would be to load the array outside the function, then pass it into the function as an argument.
This note may clarify.

A command to catch the variable values from the workspace, inside a function

when I am doing a function in Matlab. Sometimes I have equations and every one of these have constants. Then, I have to declare these constants inside my function. I wonder if there is a way to call the values of that constants from outside of the function, if I have their values on the workspace.
I don't want to write this values as inputs of my function in the function declaration.
In addition to the solutions provided by Iterator, which are all great, I think you have some other options.
First of all, I would like to warn you about global variables (as Iterator also did): these introduce hidden dependencies and make it much more cumbersome to reuse and debug your code. If your only concern is ease of use when calling the functions, I would suggest you pass along a struct containing those constants. That has the advantage that you can easily save those constants together. Unless you know what you're doing, do yourself a favor and stay away from global variables (and functions such as eval, evalin and assignin).
Next to global, evalin and passing structs, there is another mechanism for global state: preferences. These are to be used when it concerns a nearly immutable setting of your code. These are unfit for passing around actual raw data.
If all you want is a more or less clean syntax for calling a certain function, this can be achieved in a few different ways:
You could use a variable number of parameters. This is the best option when your constants have a default value. I will explain by means of an example, e.g. a regular sine wave y = A*sin(2*pi*t/T) (A is the amplitude, T the period). In MATLAB one would implement this as:
function y = sinewave(t,A,T)
y = A*sin(2*pi*t/T);
When calling this function, we need to provide all parameters. If we extend this function to something like the following, we can omit the A and T parameters:
function y = sinewave(t,A,T)
if nargin < 3
T = 1; % default period is 1
if nargin < 2
A = 1; % default amplitude 1
end
end
y = A*sin(2*pi*t/T);
This uses the construct nargin, if you want to know more, it is worthwhile to consult the MATLAB help for nargin, varargin, varargout and nargout. However, do note that you have to provide a value for A when you want to provide the value of T. There is a more convenient way to get even better behavior:
function y = sinewave(t,A,T)
if ~exists('T','var') || isempty(T)
T = 1; % default period is 1
end
if ~exists('A','var') || isempty(A)
A = 1; % default amplitude 1
end
y = A*sin(2*pi*t/T);
This has the benefits that it is more clear what is happening and you could omit A but still specify T (the same can be done for the previous example, but that gets complicated quite easily when you have a lot of parameters). You can do such things by calling sinewave(1:10,[],4) where A will retain it's default value. If an empty input should be valid, you should use another invalid input (e.g. NaN, inf or a negative value for a parameter that is known to be positive, ...).
Using the function above, all the following calls are equivalent:
t = rand(1,10);
y1 = sinewave(t,1,1);
y2 = sinewave(t,1);
y3 = sinewave(t);
If the parameters don't have default values, you could wrap the function into a function handle which fills in those parameters. This is something you might need to do when you are using some toolboxes that impose constraints onto the functions that are to be used. This is the case in the Optimization Toolbox.
I will consider the sinewave function again, but this time I use the first definition (i.e. without a variable number of parameters). Then you could work with a function handle:
f = #(x)(sinewave(x,1,1));
You can work with f as you would with an other function:
e.g. f(10) will evaluate sinewave(10,1,1).
That way you can write a general function (i.e. sinewave that is as general and simple as possible) but you create a function (handle) on the fly with the constants substituted. This allows you to work with that function, but also prevents global storage of data.
You can of course combine different solutions: e.g. create function handle to a function with a variable number of parameters that sets a certain global variable.
The easiest way to address this is via global variable:
http://www.mathworks.com/help/techdoc/ref/global.html
You can also get the values in other workspaces, including the base or parent workspace, but this is ill-advised, as you do not necessarily know what wraps a given function.
If you want to go that route, take a look at the evalin function:
http://www.mathworks.com/help/techdoc/ref/evalin.html
Still, the standard method is to pass all of the variables you need. You can put these into a struct, if you wish, and only pass the one struct.