I am running parallelized code, courtesy of the MATLAB Parallel Computing Toolbox, using the spmd command. Specifically, the code looks like this:
spmd
    out = myFunction(data, labindex);   % myFunction stands in for the actual training routine
end
Now the function uses a library (libsvm) that returns a trained classifier on each worker. During training, the library prints several debug messages to standard output, but these never show up in my terminal - I suspect this is because the workers are actually running on a cluster, so their output is not visible to me.
Is there any way to reroute the debug messages? (ideally something other than writing to a file on a shared disk)
One option may be to try the Parallel Command Window. This opens a special Command Window with one pane per lab; you'll need to run commands from the "P>>" pmode prompt in that window.
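As a rough sketch (assuming 4 workers and the default cluster profile; myFunction and data are the placeholders from the question):
pmode start 4          % open the Parallel Command Window with 4 labs
% At the "P>>" prompt inside that window, run the training so each lab's
% printed output (including libsvm's debug messages) appears in its own pane:
%   out = myFunction(data, labindex);
% When finished:
%   pmode exit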
I want to have two MATLAB windows open on the same computer. The desired scenario is as follows: MATLAB window 1 is continuously running a script that has nothing to do with MATLAB window 2. At the same time, MATLAB window 2 is running a script that continuously checks for a certain condition, and if it is met, then it will terminate the script running on MATLAB window 1, and then terminate its own script as well. I want to have two MATLAB windows instead of one since I believe it will be more time efficient for what I am trying to do. I found an interesting "KeyInject" program at http://au.mathworks.com/matlabcentral/fileexchange/40001-keyinject , but I was wondering if there is a simpler way already built into MATLAB.
Do you want simple, or a flexible, infinitely expandable version 1.0? Simple would be to trigger System A via a file created by System B.
Simple would have System B create a file, then System A would check for the file with the command
if exist(fileName, 'file')
then do your shutdown commands. On startup, System A would delete the file with
delete(fileName);
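A minimal sketch of that pattern, assuming an arbitrary flag-file name 'stop.flag':
% --- System B: signal System A to stop ---
fid = fopen('stop.flag', 'w'); fclose(fid);

% --- System A: startup and main loop ---
if exist('stop.flag', 'file'), delete('stop.flag'); end   % clear any stale flag
while true
    % ... continuously running work ...
    if exist('stop.flag', 'file')
        break;                                            % condition met: shut down
    end
    pause(1);                                             % avoid busy-waiting
end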
The second option is to use the udp command. UDP allows any data to be sent between processes, whether on the same computer or over a network. (See https://www.mathworks.com/help/instrument/udp.html for more info).
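A rough sketch using the Instrument Control Toolbox udp object (the port number and the message text are arbitrary choices for illustration):
% --- Sender (the monitoring session) ---
u = udp('127.0.0.1', 9090);
fopen(u);
fwrite(u, 'stop');
fclose(u); delete(u);

% --- Receiver (the session to be terminated) ---
r = udp('127.0.0.1', 9090, 'LocalPort', 9090);
fopen(r);
msg = char(fread(r, 4))';   % blocks until a datagram arrives (or the object times out)
fclose(r); delete(r);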
I see several ways:
Restructure to avoid this XY problem
Use (mat) files (as Hoki suggested), possibly using the parallel computing toolbox to keep everything in one MATLAB session.
Write some MEX functions that communicate with each other via a global pipe.
Write an Auto(Hot)key script.
Option 2 is probably the easiest. Take a look at events and listeners if you write OOP code; otherwise you'd have to poll inside a loop.
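If you end up polling, a timer object saves you from writing an explicit while-loop; a minimal sketch (checkCondition is a hypothetical function that tests your condition and reacts to it):
t = timer('ExecutionMode', 'fixedRate', ...       % fire repeatedly
          'Period', 1, ...                        % once per second
          'TimerFcn', @(~,~) checkCondition());   % hypothetical polling callback
start(t);
% ... later, when no longer needed:
% stop(t); delete(t);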
Option 3 is harder and way more time consuming to implement, but allows for much faster detection of the condition, and much faster data transfer between the sessions. Use only if speed is essential...but I guess that doesn't apply :)
Option 4: the AutoHotkey solution is probably the most Horrible Thing® you could do on an already Horrible Construction®, but oh what fun!! In both MATLAB sessions, you create a (hidden) figure with the name Window1 or Window2, respectively. These window names are something that AutoHotkey can easily track. If the conditions are met, you update the corresponding window name, triggering the remainder of the AutoHotkey script: press a button in the other window! If you need to transfer data between the windows: you can create basic edit boxes in both GUIs, and copy-paste the data between them. If you're on Linux: you can use Autokey for the same purpose, but by then you're basically writing Python code doing the heavy lifting, so just use Python.
Or, you know, use KeyInject. Less fun.
I very often run MATLAB programs compiled with mcc on my computer, and they execute parfor. Each program has a slow startup time, I think because the parallel worker pool is created each time (it takes about 20 seconds just to start up the parallel pool). It would be more efficient for me if the pool could remain open in the background all the time. For example, when opening a parpool in the MATLAB interface, it says that the parpool will remain open for 30 minutes, so there is no need to open a parpool for each MATLAB script. Is something like that also possible when the code is compiled, or are there other solutions?
You could increase the time for which the pool stays open. During testing, you can type
>> preferences
and choose Parallel Computing Toolbox settings in the left menu.
You can achieve the same result by adding this to the code:
p = parpool;
p.IdleTimeout = 120;   % minutes
If the pool stays open longer, you should be able to run multiple scripts without opening and closing it multiple times. I would, however, avoid leaving it open permanently.
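Within a single session, a minimal sketch that reuses an already-open pool instead of creating a fresh one each time (note that a compiled executable runs in its own process, so a pool cannot outlive it between runs):
p = gcp('nocreate');        % returns [] if no pool is currently open
if isempty(p)
    p = parpool;            % open a pool only when none exists
end
p.IdleTimeout = 120;        % keep the pool alive for 120 idle minutes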
I often call computationally intensive command-line programs from within MATLAB using the system command:
[status, result] = system(cmd_line_for_my_low_level_exe, '-echo');
where the -echo option (supposedly) echoes console output (stdout) generated by low_level_exe in the MATLAB command window.
On Linux machines this works great, with MATLAB echoing the console output in (seemingly) real-time. Users get a nice continuous update on low_level_exe's progress.
On Windows machines this is not the case. There can often be many minutes between echoes, and users sometimes get impatient and assume the code has crashed...
Is there a way to increase/control the frequency of MATLAB's -echo, or possibly another, better option entirely? (I'd prefer to stay away from mex files to maintain compatibility with Octave).
Is this actually a MATLAB issue, or just a Linux/Windows incompatibility?
Is there a way to call MATLAB functions from outside, in particular from the Windows cmd (but also the Linux terminal, Lua scripts, etc.), WITHOUT opening a new instance of MATLAB each time?
For example, in cmd:
matlab -sd myCurrentDirectory -r "function(parameters)" -nodesktop -nosplash -nojvm
opens a new instance of MATLAB relatively quickly and executes my function. Opening and closing this reduced MATLAB prompt takes about 2 seconds (without any computation) - hence more than 2 hours for 4000 executions. I'd like to avoid this, as the called function is always located in the same workspace. Can it all be done in the same instance?
I have already done some research and found the MATLAB COM Automation Server, but it seems quite complicated to me and I don't see the essential steps to make it work for my case. Any advice on that?
I'm not familiar with C/C++/C#, but I'm thinking about using Python (though only as a last resort).
Based on the not-working but well-thought-out idea of @Ilya Kobelevskiy, here is the final workaround:
function pipeConnection(numIterations, inputFile)
for i = 1:numIterations
    while exist(inputFile, 'file')
        S = load(inputFile);            % read inputFile -> inputdata
        output = myFunction(S.inputdata);
        delete(inputFile);
    end
    % Write output to file
    % Call external application to process output data
    % generate a new inputFile
end
Another convenient solution would be to compile the MATLAB function into an executable:
mcc -m myfunction
and run this .exe file from cmd:
cd myCurrentDirectory && myfunction.exe parameter1 parameter2
Be aware that the parameters are now passed as strings, so the original .m file needs to be adjusted accordingly.
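A rough sketch of that adjustment (myfunction and its parameters are the placeholders from the question; str2double is just one way to convert):
function myfunction(parameter1, parameter2)
if ischar(parameter1), parameter1 = str2double(parameter1); end   % arguments arrive as strings
if ischar(parameter2), parameter2 = str2double(parameter2); end   % when called as a compiled exe
% ... original function body, now working with numeric values ...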
Further remarks:
MATLAB (or at least the MATLAB Compiler Runtime) still needs to be installed on the system, though it does not have to be running.
I don't know how far this method is limited by the complexity of the underlying function.
The speed-up compared to the initial approach given in the question is relatively small.
Among the several methods proposed here, there is one workaround that should reduce the execution time of your multiple MATLAB calls. The idea is to run a custom function multiple times within one MATLAB session.
For example, the function myRand.m is defined as
function r = myRand(a,b)
r = a + (b-a).*rand;
In the MATLAB command window, we then generate the single-line command like this:
S = [1:5; 1:5; 101:105];
cmd_str = sprintf('B(%d) = myRand(%d,%d);', S)
This generates the command string B(1) = myRand(1,101);B(2) = myRand(2,102);B(3) = myRand(3,103);B(4) = myRand(4,104);B(5) = myRand(5,105); which is executed within a single MATLAB session with
matlab -nojvm -nodesktop -nosplash -r "copy_the_command_string_here";
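In practice you may also want the batch session to save its results and quit by itself; a hedged variant of the command string (the results file name is an arbitrary choice):
S = [1:5; 1:5; 101:105];
cmd_str = [sprintf('B(%d) = myRand(%d,%d);', S), ...
           'save(''myRand_results.mat'',''B''); exit;'];   % persist B, then close the session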
One limitation is that you need to run your 4000 function calls in a row.
I like the approach proposed by Magla, but given the constraints stated in your comment, it can be improved to still run a single function in one MATLAB session.
The idea is to pipe your inputs and outputs. For the inputs, you check whether a certain input file exists; if it does, you read your function's input from it, do the work, and write the output to another file to signal the script/function processing the results that the MATLAB function is done and is waiting for the next input.
It is very straightforward to implement using files on disk; with some effort it is probably also possible to do it through a memory disk (i.e., open the input/output files in RAM).
function pipeConnection(numIterations, inputFile, outputFile)
for i = 1:numIterations
    while ~exist(inputFile, 'file')     % MATLAB uses ~ for negation, not !
        pause(0.05);                    % wait 50 ms; MATLAB has pause, not sleep
    end
    % Read inputs (x, y, z) from inputFile
    output = YourFunction(x,y,z);
    % Write output to outputFile, go to next iteration
end
If the number of iterations is unknown when you start, you can also encode an exit condition in the input file rather than specifying the number of iterations up front.
If you're starting MATLAB from the command line with the -r option in the way you describe, then it will always start a new instance; I don't believe there's a way around this.
If you are calling MATLAB from a C/C++ application, MATLAB provides the MATLAB engine interface, which would connect to any running instance of MATLAB.
Otherwise the MATLAB Automation Server interface that you mention is the right way to go. If you're finding it complicated, I would suggest posting a separate question detailing what you've tried and what difficulties you're having.
For completeness, I'll mention that MATLAB also has an undocumented interface that can be called directly from Java - however, as it's undocumented it's very difficult to get right, and is subject to change across versions so you shouldn't rely on it.
Edit: As of R2014b, MATLAB makes available the MATLAB Engine for Python, via which you can automate MATLAB from a Python script. And as of R2016b, there is also the MATLAB Engine for Java. If anyone was previously considering the undocumented Java techniques mentioned above, this would now be the way to go.
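For completeness, newer releases also let you share a running desktop session so an engine client can attach to it by name instead of starting a fresh MATLAB; a minimal sketch from the MATLAB side ('mySharedSession' is an arbitrary name):
matlab.engine.shareEngine('mySharedSession')   % run once inside the already-running MATLAB
matlab.engine.isEngineShared                   % returns true once sharing is active
A Python engine client could then attach to that session with matlab.engine.connect_matlab('mySharedSession') rather than spawning a new instance.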
I have access to a cluster running Torque, but installing the MATLAB Distributed Computing Engine is not an option. I am wondering if it is possible to use the MPI commands in MATLAB without the extra features like distributed arrays. Is it possible to use the MATLAB lab* commands in conjunction with the mpirun commands if you don't have the Distributed Computing Engine?
If your MPI implementation is Open MPI, you can use the Poor Man's Parallel Toolbox (tm), which allows you to run many MATLAB instances in parallel on many nodes and have each of them do something different, e.g. run a different script. The key to success lies in the fact that Open MPI exports the rank of the current process in the environment variable OMPI_COMM_WORLD_RANK, and a simple shell script can be used to wrap the execution. Here is a sample:
#!/bin/bash
# Map the Open MPI rank (0-based) to a script name: rank 0 -> script001.m, rank 1 -> script002.m, ...
script_file=script$(printf "%03d" $(($OMPI_COMM_WORLD_RANK + 1))).m
matlab < "$script_file"
One would launch this as:
mpiexec -np 24 ./script.sh
This will launch 24 copies of MATLAB, each receiving input from different scripts. The first one would get commands from script001.m, the second one from script002.m, and so on.
Of course, you can always write your parallel code in C or C++, or even Fortran, and use MPI there. Then compile the code into a shared library, loadable and callable from MATLAB.