lsqcurvefit fails depending on platform - matlab

I am trying to process an extremely large dataset which requires that I do several million non-linear curve fits. I have acquired a dedicated piece of code that is designed to be used for the data I have collected, which at its heart uses the MATLAB function lsqcurvefit. All works well when I run it on my laptop, except that the fitting is too slow to be useful to me right now, which is not too surprising considering that the model function is quite complicated. To put this in perspective, my laptop can only process about 8000 fits per hour, and I have on the order of tens of millions of fits to do.
Fortunately I have access to a computing cluster at my institution, which should enable me to process this data in a more reasonable time frame. The issue that has arisen is that, despite MATLAB being cross-platform, there seems to be some significant difference between what the code does on my Windows laptop and what it does on the cluster. Despite running exactly the same code, on exactly the same data, with the same version of MATLAB, the run on the Unix cluster fails with the following error message:
Error using eig
Input to EIG must not contain NaN or Inf.
Error in trust (line 29)
[V,D] = eig(H);
Error in trdog (line 109)
[st,qpval,po,fcnt,lambda] = trust(rhs,MM,delta);
Error in snls (line 311)
[sx,snod,qp,posdef,pcgit,Z] = trdog(x,g,A,D,delta,dv,...
Error in lsqncommon (line 156)
snls(funfcn,xC,lb,ub,flags.verbosity,options,defaultopt,initVals.F,initVals.J,caller,
...
Error in lsqcurvefit (line 254)
lsqncommon(funfcn,xCurrent,lb,ub,options,defaultopt,caller,...
I can confirm that there are no infinities or NaNs in my data, which is what this error message might initially seem to suggest. I can only conclude that the different platform leads to some difference in numerical accuracy during execution, which probably causes a divide-by-zero error somewhere along the way. My question is: how can I make this code run on the cluster?
For reference, my laptop is running Windows 7 Professional 64-bit with an Intel i5 5200U 2.20GHz x4, and the cluster runs Scientific Linux 6.7 x86_64 with various Intel Xeon processors; both run MATLAB R2015b.
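In case it helps with diagnosis, a wrapper like the following sketch could report the parameter values at which the model first returns NaN or Inf before lsqcurvefit hands them to the trust-region machinery. This is only a diagnostic sketch: model, p0, xdata, ydata, lb, ub and opts are placeholders for whatever the dedicated code actually uses, and the wrapper would be saved as wrappedModel.m.
function F = wrappedModel(p, xdata)
    % Evaluate the real model (placeholder name 'model') and report the
    % parameter vector whenever it produces a non-finite output.
    F = model(p, xdata);
    if any(~isfinite(F(:)))
        warning('Model returned NaN/Inf at p = [%s]', num2str(p(:).'));
    end
end
Calling lsqcurvefit(@wrappedModel, p0, xdata, ydata, lb, ub, opts) instead of the original model should then show where the non-finite values first appear on the cluster.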

Related

Matlab out of memory error while solving ODE

I have to integrate an ODE with 8 variables in MATLAB. My simulation time is 5e9 with a time step of 0.1, but it gives a memory error. I am working with an i7 core, 2.6 GHz CPU and 8 GB RAM. How can I simulate ODEs over such a large number of time samples?
Assuming you're working on a 64-bit version of MATLAB, you might want to let MATLAB use memory right up to the limit via Preferences -> MATLAB -> Workspace -> MATLAB Array Size Limit.
If you are getting this error because you really have maxed out the memory in the system, do the following:
Make sure you're using a 64-bit OS and a 64-bit version of MATLAB.
Before you call the ODE function, manually clear (using the clear() function) variables you don't need any more (or can recreate once the function finishes); see the sketch below.
Increase the swap file of your system. It will help with larger memory consumption but might make things much slower.
You can find more tips and tricks in Resolve "Out of Memory" Errors and memory().
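As a minimal sketch of the clearing step, assuming a hypothetical right-hand-side function myODE and hypothetical large intermediates rawData and tempResults:
% Free memory held by large intermediates you no longer need.
clear rawData tempResults
% Integrate the 8-variable system; myODE is a placeholder for dy/dt = f(t,y).
y0 = zeros(8,1);                     % initial condition (example)
[t, y] = ode45(@myODE, [0 5e9], y0); % solver call shown only for context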

How do I force MATLAB to run deep learning code on the CPU instead of the GPU?

I don't have a CUDA-enabled Nvidia GPU, and I want to force MATLAB to run the code on the CPU instead of the GPU (yes, I know, it will be very, very slow). How can I do it?
As an example, let’s try to run this code on my PC without CUDA. Here is the error given by MATLAB:
There is a problem with the CUDA driver or with this GPU device. Be sure that you have a supported GPU and that the latest driver is installed.
Error in nnet.internal.cnn.SeriesNetwork/activations (line 48)
output = gpuArray(data);
Error in SeriesNetwork/activations (line 269)
YChannelFormat = predictNetwork.activations(X, layerID);
Error in DeepLearningImageClassificationExample (line 262)
trainingFeatures = activations(convnet, trainingSet, featureLayer, ...
Caused by:
The CUDA driver could not be loaded. The library name used was 'nvcuda.dll'. The error was:
The specified module could not be found.
With R2016a, the ConvNet "functionality requires the Parallel Computing Toolbox™ and a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher."
See: http://uk.mathworks.com/help/nnet/convolutional-neural-networks.html
The code example that you link to requires a GPU. As such the solution is very simple:
You need to use different code.
Your question does not mention specifically what you are trying to achieve, so it is hard to say whether you would need to create something yourself or could pick up an existing solution, but this CPU vs GPU deep learning benchmark may be an inspiration.
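As a quick sanity check before running any GPU-dependent example (assuming the Parallel Computing Toolbox is installed), you can ask MATLAB how many CUDA-capable devices it can see:
% Returns 0 when no supported CUDA device is visible to MATLAB.
if gpuDeviceCount == 0
    disp('No CUDA-capable GPU detected; GPU-only examples will not run.');
end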

Multiple sequence alignment of 12 species

I need to perform MSA (multiple sequence alignment) on nucleotide sequences of 12 wheat varieties. All these varieties have different lengths in bp (base pairs). I followed this MATLAB documentation: http://www.mathworks.in/help/bioinfo/ref/multialign.html. But when I type this:
ma = multialign(p53,tree,'ScoringMatrix',...
    {'pam150','pam200','pam250'})
showalignment(ma)
I get an error:
??? Out of memory. Type HELP MEMORY for your options.
Error in ==> profalign>affinegap at 648
F = zeros(n+1,m+1,numStates);
Error in ==> profalign at 426
[F, pointer] = affinegap(prof1,len1,prof2,len2,SM,go1,go2,ge1,ge2,wg1,wg2);
Error in ==> multialign at 655
[profs{rootInd} h1 h2] = profalign(profs{[i,rootInd]},...
Please help
This is a hard problem to debug, because it is highly dependent on your specific settings. As mentioned in the comments, MATLAB is saying that it ran out of memory. This might be because of the way you have MATLAB configured, or because your computer doesn't have enough RAM (or maybe you were using too much RAM for other things at the time). It's also possible that you simply gave it more data than it can handle; see the estimate sketched below. However, assuming the sequences aren't unreasonably long, 12 sequences should be quite manageable for a progressive alignment algorithm, which multialign appears to be.
Given all of those variables, the simplest solution is to avoid running it on your computer at all. There are websites where you can submit your data to be aligned on a server that will certainly have sufficient RAM. The most popular of these is Clustal Omega, the successor to ClustalW. These sites generally return results fairly quickly.
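To get a feel for whether the sequences themselves are simply too long, you can estimate the size of the dynamic-programming matrix that profalign tries to allocate (F = zeros(n+1,m+1,numStates) in the traceback above). A rough back-of-the-envelope sketch, assuming your sequences are in a cell array seqs and that numStates is 3 as in a standard affine-gap alignment:
% Worst-case memory estimate for the profile-alignment DP matrix.
lens = cellfun(@length, seqs);        % sequence lengths in bp (seqs is assumed)
numStates = 3;                        % assumed number of alignment states
n = max(lens); m = max(lens);         % pessimistic: the two longest profiles
bytesNeeded = 8 * (n+1) * (m+1) * numStates;   % doubles take 8 bytes each
fprintf('Worst-case DP matrix: %.2f GB\n', bytesNeeded / 2^30);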

Very slow execution of Matlab code under ubuntu

I was using MATLAB 2012a under Windows 7 and executing some intense code, intense in terms of both memory usage and processing time, and the code was working fine on Windows. Now I have changed my OS to Ubuntu 12.04 and installed MATLAB 2013a. The amount of memory used is considerably less than it was on Windows, but the time MATLAB takes to execute the same code is extremely high.
I should mention that my code contains nothing that should take such a huge amount of time, except for a call to sparse with a symbolic substitution as one of its arguments, as follows:
K = zeros(Np,Np);
for i = 1:ord
    K = K + sparse(t(1:ord,:), repmat(t(i,:),ord,1), double(subs(Kv(:,i), Arg(Kv,1,1,6), Arg(Kv,1,2,6))), Np, Np);
end
Note that Kv is a symbolic matrix and Arg is a function that provides the OLD and NEW substitution arguments; it depends on a number of global variables.
I have the feeling that I forgot to install or configure something on Ubuntu that would help accelerate the execution of MATLAB code.
Any ideas?
I had a similar problem on Windows, but I believe the solution is the same on Ubuntu LTS.
If you increase MATLAB's Java heap memory, MATLAB will consume more memory from your system, but it will be faster.
To do that, go to:
File -> Preferences -> General -> Java Heap Memory and increase it to the maximum.
The default value is 128 MB, which is too little.
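If you want to confirm what the running session actually received, one small sketch using MATLAB's built-in Java bridge is:
% Query the Java heap limit of the current MATLAB session, in MB.
rt = java.lang.Runtime.getRuntime();
fprintf('Java heap limit: %.0f MB\n', double(rt.maxMemory()) / 2^20);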
If raising the heap memory limit doesn't fix the issue, then try increasing the priority of the MATLAB process.
First start MATLAB, then run:
ps aux|grep MATLAB
In my case the result is:
comtom 9769 28.2 19.8 4360632 761808 tty2 S<l+ 14:00 1:50 /usr/local/MATLAB/MATLAB_Production_Server/R2015a/bin/glnxa64/MATLAB -desktop
Look at the first number (the PID). Then use it with the renice command to change the process priority (note that setting a negative nice value usually requires root privileges):
renice -3 -p 9769
That's it. The GUI is very slow because it's built against outdated Xorg libraries, so changing the priority helps. You may notice some tearing in GNOME's desktop effects, but MATLAB's interface will work a lot better.

Out of memory using svmtrain in Matlab

I have a set of data that I am trying to learn using SVM. For context, the data has a dimensionality of 35 and contains approximately 30,000 data points.
I have previously trained decision trees in Matlab with this dataset and it took approximately 20 seconds. Not being totally satisfied with the error rate, I decided to try SVM.
I first tried svmtrain(X,Y). After about 5 seconds, I get the following message:
??? Error using ==> svmtrain at 453
Error calculating the kernel function:
Out of memory. Type HELP MEMORY for your options.
When I looked up this error, it was suggested to me that I use the SMO method: svmtrain(X, Y, 'method', 'SMO');. After about a minute, I get this:
??? Error using ==> seqminopt>seqminoptImpl at 236
No convergence achieved within maximum number (15000) of main loop passes
Error in ==> seqminopt at 100
[alphas offset] = seqminoptImpl(data, targetLabels, ...
Error in ==> svmtrain at 437
[alpha bias] = seqminopt(training, groupIndex, ...
I tried using the other methods (LS and QP), but I get the first behaviour again: a 5-second delay and then
??? Error using ==> svmtrain at 453
Error calculating the kernel function:
Out of memory. Type HELP MEMORY for your options.
I'm starting to think that I'm doing something wrong because decision trees were so effortless to use and here I'm getting stuck on what seems like a very simple operation.
Your help is greatly appreciated.
Did you read the remarks near the end of the documentation about the algorithm's memory usage?
Try setting the method to SMO and use a kernelcachelimit value that is appropriate to the memory you have available on your machine.
During learning, the algorithm builds a double matrix of size kernelcachelimit-by-kernelcachelimit; the default value is 5000.
Otherwise, subsample your instances and use techniques like cross-validation to measure the performance of the classifier.
Here is the relevant section:
Memory Usage and Out of Memory Error
When you set 'Method' to 'QP', the svmtrain function operates on a data set containing N elements, and it creates an (N+1)-by-(N+1) matrix to find the separating hyperplane. This matrix needs at least 8*(n+1)^2 bytes of contiguous memory. If this size of contiguous memory is not available, the software displays an "out of memory" error message.
When you set 'Method' to 'SMO' (default), memory consumption is controlled by the kernelcachelimit option. The SMO algorithm stores only a submatrix of the kernel matrix, limited by the size specified by the kernelcachelimit option. However, if the number of data points exceeds the size specified by the kernelcachelimit option, the SMO algorithm slows down because it has to recalculate the kernel matrix elements.
When using svmtrain on large data sets, and you run out of memory or the optimization step is very time consuming, try either of the following:
Use a smaller number of samples and use cross-validation to test the performance of the classifier.
Set 'Method' to 'SMO', and set the kernelcachelimit option as large as your system permits.
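Putting the second suggestion into code, a minimal sketch (the kernelcachelimit value here is only an example; with 30,000 points a cache of 10000 already means a 10000-by-10000 double matrix, roughly 800 MB):
% SMO with a larger kernel cache; tune kernelcachelimit to your RAM.
svmStruct = svmtrain(X, Y, 'method', 'SMO', 'kernelcachelimit', 10000);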