Parallel computing using Matlab - matlab

I am executing windows '.exe' file in 'cmd' prompt for various inputs through Matlab. The commands as follows.
for i = 1:n
filename = sprintf('input_%d.dat',i);
string = sprintf('!sfbox.exe %s', filename);
eval(string)
end
All input files are present and independent of each other. But if I attempt to parallelize the execution using 'parfor' as follows,
parfor i = 1:n
filename = sprintf('input_%d.dat',i);
string = sprintf('!sfbox.exe %s', filename);
eval(string)
end
I get an error, but the code runs serially without stopping
Explanation
MATLAB runs parfor loops on multiple MATLAB workers that have
multiple workspaces. The indicated function might not access the
correct workspace; therefore, its usage is invalid.
Is there a correct way to execute the eval using parfor?
(PS: I tried manually executing several .exe files in cmd prompt and it is feasible to run several .exe files at same time in command prompt. Problem is the way I attempt to do it in Matlab. Please suggest better methods.)

You are hitting issues with Matlab not knowing what eval is actually doing. While you know that it's doing the right thing, the eval command could be executing anything. There is a little documentation on transparency issues using eval statements in parfor and spmd statements.
Switching to use an feval statement should solve your problem, as Matlab will know that the only thing going into that statement is a string. More directly, you can use the system command to directly execute an arbitrary string in the cmd prompt from matlab.
parfor i = 1:n
filename = sprintf('input_%d.dat',i);
string = sprintf('sfbox.exe %s', filename);
system(string);
end

Related

How to write a single Octave script with function definitions compatible with Matlab scripts?

If I write this:
clc
clear
close all
format long
fprintf( 1, 'Starting...\n' )
function results = do_thing()
results = 1;
end
results = do_thing()
And run it with Octave, it works correctly:
Starting...
results = 1
But if I try to run it with Matlab 2017b, it throws this error:
Error: File: testfile.m Line: 13 Column: 1
Function definitions in a script must appear at the end of the file.
Move all statements after the "do_thing" function definition to before the first local function
definition.
Then, if I fix the error as follows:
clc
clear
close all
format long
fprintf( 1, 'Starting...\n' )
results = do_thing()
function results = do_thing()
results = 1;
end
It works correctly on Matlab:
Starting...
results =
1
But now, it stopped working with Octave:
Starting...
error: 'do_thing' undefined near line 8 column 11
error: called from
testfile at line 8 column 9
This problem was explained on this question: Run octave script file containing a function definition
How to fix it without having to create a separate and exclusive file for the function do_thing()?
Is this issue fixed on some newer version of Matlab as 2019a?
The answer is in the comments, but for the sake of clarity:
% in file `do_thing.m`
function results = do_thing()
results = 1;
end
% in your script file
clc; clear; close all; format long;
fprintf( 1, 'Starting...\n' );
results = do_thing();
Accompanying explanatory rant:
The canonical and safest way to define functions is to define them in their own file, and make this file accessible in octave / matlab's path.
Octave has supported 'dynamic' function definitions (i.e. in the context of a script or the command-line) since practically forever. However, for the purposes of compatibility, since matlab did not support this, most people did not use it, and quite sensibly relied on the canonical way instead.
Matlab has recently finally introduced dynamic function definitions too, but has opted to implement them explicitly in a way that breaks compatibility with octave, as you describe above. (rant: this may be a coincidence and an earnest design decision, but I do note that it also happens to go against prior matlab conventions regarding nested functions, which were allowed to be defined anywhere within their enclosing scope).
In a sense, nothing has changed. Matlab was incompatible with advanced octave functionality, and now that it has introduced its own implementation of this functionality, it is still incompatible. This is a blessing in disguise. Why? Because, if you want intercompatible code, you should rely on the canonical form and good programming practices instead of littering your scripts with dynamic functions, which is what you should be doing anyway.
Octave's implementation of local functions in scripts is different from Matlab's. Octave requires that local functions in scripts be defined before their use. But Matlab requires that local functions in scripts all be defined at the end of the file.
So you can use local functions in scripts on both applications, but you can't write a script that will work on both. So just use functions if you want code that will work on both Matlab and Octave.
Examples:
Functions at end
disp('Hello world')
foo(42);
function foo(x)
disp(x);
end
In Matlab R2019a:
>> myscript
Hello world
42
In Octave 5.1.0:
octave:1> myscript
Hello world
error: 'foo' undefined near line 2 column 1
error: called from
myscript at line 2 column 1
Functions before use
disp('Hello world')
function foo(x)
disp(x);
end
foo(42);
In Matlab R2019a:
>> myscript
Error: File: myscript.m Line: 7 Column: 1
Function definitions in a script must appear at the end of the file.
Move all statements after the "foo" function definition to before the first local function definition.
In Octave 5.1.0:
octave:2> myscript
Hello world
42
How it works
Note that technically the functions here in Octave are not "local functions", but "command-line functions". Instead of defining a function that is local to the script, they define global functions that come into existence when the function statement is evaluated.
The following code works on both Matlab and Octave:
if exist('do_nothing') == 0
disp('function not yet defined, run script again')
else
do_nothing
end
%====
function results = do_nothing()
results = 1;
end
When run on octave, the first attempt exits with the message, but subsequent attempts succeed. On Matlab, it works the first time. While this works on both platforms, it is less than ideal, since it requires that much of the script code be placed inside an "if" statement block.

Dealing with infinite loops in Matlab [duplicate]

I am iterating through a large test matrix in MATLAB and calling second-party proprietary software (running in MATLAB) each time. I cannot edit the software source code. Sometimes, the software hangs, so I want to exit it after a certain amount of time and move on to the next iteration.
In pseudocode, I'm doing this:
for i = 1:n
output(i) = proprietary_software(input(i));
end
How can I skip to the next iteration (and possibly save output(i)='too_long') if the proprietary software is taking too long?
You will need to call Matlab from another instance of Matlab. The other instance of Matlab will run the command and release control to the first instance of Matlab to wait while it either saves the results or reaches a certain time. In this case, it will wait 30 seconds.
You will need 1 additional function. Make sure this function is on the Matlab path.
function proprietary_software_caller(input)
hTic=tic;
output=proprietary_software(input);
hToc=toc(hTic);
if hToc<30
save('outfile.mat','output');
end
exit;
end
You will need to modify your original script this way
[status,firstPID] = str2double(system('for /f "tokens=2 delims=," %F in (''tasklist /nh /fi "imagename eq Matlab.exe" /fo csv) do #echo %~F'')'));
for i = 1:n
inputStr=num2str(input(i));
system(['matlab.exe -nodesktop -r proprietary_software_caller\(',inputStr,'\)&']);
hTic=tic;
hToc=toc(hTic);
while hToc<30 || ~(exist('outfile.mat','file')==2)
hToc=toc(hTic);
end
if hToc>=30
output(i)= 'too_long';
[status,allPIDs]=str2double(system('for /f "tokens=2 delims=," %F in (''tasklist /nh /fi "imagename eq Matlab.exe" /fo csv) do #echo %~F'')'));
allPIDs(allPIDs==firstPID)=[];
for a=1:numel(allPIDs)
[status,cmdout]=system(['taskkill /F /pid ' sprintf('%i',allPIDs(a))]);
end
elseif exist('outfile.mat','file')==2
loadedData=load('outfile.mat');
output(i)=loadedData.output;
delete('outfile.mat');
end
end
I hope this helps.
You are essentially asking for a way to implement a timeout on MATLAB code. This can be surprisingly tricky to implement. The first thing to state is that if the MATLAB code in question cannot terminate itself, either by exiting cleanly or throwing an error, then it is not possible to terminate the code without quitting or killing the MATLAB process in question. For example, throwing an error in an externally created timer does not work; the error is caught.
The first question to ask is therefore:
Can the over-running code be made to terminate itself?
This depends on the cause to the over-run, and also your access to the source code:
If the program gets stuck in an infinite (or very long-running) loop, either in MATLAB code or a mex file for which you have source code, or which calls a user-defined callback each iteration, then you can get this code to terminate itself.
If the program gets stuck inside a MATLAB builtin, or a p-code file or mex file for which you don't have the source code, and doesn't have support for calling a callback regularly, then it won't be possible for you to get the code to terminate itself.
Let's address the first case. The easiest way to get the code to terminate itself is to get it to throw an error, which is caught by the caller, if it exceeds the timeout time. E.g. in the OP's case:
for i = 1:n
tic();
try
output(i) = proprietary_software(input(i));
catch
end
end
with the following code somewhere in the over-running loop, or called in a loop callback or mex file:
assert(toc() < 10, 'Timed out');
Now for the second case. You need to kill this MATLAB process, so it makes sense for this to be a MATLAB process you have spawned from your current MATLAB session. You can do this using a system call similar to this:
system('matlab -nodisplay -r code_to_run()')
While it is possible for a MATLAB process to quit itself in some situations which could be of use here (e.g. a timer function calling quit('force')), the most reliable way of killing a MATLAB process is to do it with a system call, using taskkill (Windows) or kill (Linux/Mac).
A framework using the approach of spawning and killing timed-out MATLAB processes might work like this:
Using system calls, launch one or more new MATLAB processes from your MATLAB session, running the code you want.
Use the file system or a memory mapped file to communicate between the MATLAB processes the function inputs, loop progress, outputs, process ids and timeout times.
Use the original MATLAB process to check the timeout times haven't been reached, or if so to terminate the process in question and instantiate a new one.
Use the original MATLAB process to collect up the function outputs (either from the filesystem or memory mapped file) and exit. Workers should terminate when there is no more work left
I provide a sketch only because a full working implementation of this approach is fairly involved, and in fact it has already been implemented and is publicly available in the batch_job toolbox. In the OP's case, using this toolbox (with a 10 second timeout) you'd call:
output = batch_job(#proprietary_software, input(:)', '-timeout', 10);
Note that for the toolbox to work, its root directory needs to be on your MATLAB path at startup.

Using MATLAB on a cluster

I have a software which gets its inputs and outputs from files in text. I just need to run the software with a given input file SIM.DATA using the following command in matlab
[status, result]=system('eclrun -v 2014.2 SIM.DATA',-echo')
if I have multiple .DATA files like SIM1.DATA to SIM4.DATA, and I use a parfor lop to run the software with different input DATa files, it gives and error but when I run it with for loop it works just fine.
I have very basic knowledge of parralel computing and cluster computing, could you please let me know what is wrong with my approach? here is a piece of code i run.
parfor labindex=1:4
cmd{labindex} = ['eclrun -v 2014.2 e300 SIM',num2str(labindex),'.DATA'];
end
parfor labindex=1:4
[status, result] = system([cmd{labindex},'.DATA'],'-echo');
end

Running same script with different input variables using batch command

I would like to run a script called 'myscript' using the batch command:
j = batch('myscript')
My script has a function in the beginning so:
function myscript(input)
...
end
Is it somehow possible to run batch files with different input parameters for my function? I know that there are matlabpool, parfor etc commands, but it is unfortunately not working for me.
The syntax you have to use is indicated in the documentation of batch():
j = batch(fcn,N,{x1, ..., xn})
and in your case
j = batch(fcn, 1, {input})
Alternatively, you can check How do I call MATLAB from the DOS prompt?

Running a matlab program with arguments

I have a matlab file that takes in a file. I would like to run that program in the matlab shell, such as prog. I need to implement it so that it takes a number of arguments, such as "prog filename.txt 1 2 which would mean that i can use filename.txt and 1 2 as variables in my program.
Thank you!
In order to make a script accept arguments from the command line, you must first turn it into a function that will get the arguments you want, i.e if your script is named prog.m, put as the first line
function []=prog(arg1, arg2)
and add an end at the end (assuming that the file has only one function). It's very important that you call the function the same name as the file.
The next thing is that you need to make sure that the script file is located at the same place from where you call the script, or it's located at the Matlab working path, otherwise it'll not be able to recognize your script.
Finally, to execute the script you use
matlab -r "prog arg1 arg2"
which is equivalent to calling
prog(arg1,arg2)
from inside Matlab.
*- tested in Windows and Linux environments
Once your function is written in a separate file, as discussed by the other answer you can call it with a slightly more complicated setup to make it easier to catch errors etc.
There is useful advice in this thread about ensuring that Matlab doesn't launch the graphical interface and quits after finishing the script, and reports the error nicely if there is one.
For example:
matlab -nodisplay -nosplash -r "try, prog(1, 'file.txt'), catch me, fprintf('%s / %s\n',me.identifier,me.message), exit(1), end, exit(0)"
The script given to Matlab would read as follows if line spaces were added:
% Try running the script
try
prog(1, 'file.txt')
catch me
% On error, print error message and exit with failure
fprintf('%s / %s\n',me.identifier,me.message)
exit(1)
end
% Else, exit with success
exit(0)