I am writing a report for a class and am having some issues with the lines of an unstable plot going beyond the boundary of the graph and overlapping the title and xlabel. This is despite specifying a ylim from -2 to 2. Is there a good way to solve this issue?
Thanks!
plot(X,u(:,v0),X,u(:,v1),X,u(:,v2),X,u(:,v3),X,u(:,v4))
titlestr = sprintf('Velocity vs. Distance of %s function using %s: C=%g, imax=%g, dx=%gm, dt=%gsec',ICFType,SDType,C,imax,dx,dt);
ttl=title(titlestr);
ylabl=ylabel("u (m/s)");
xlabl=xlabel("x (m)");
ylim([-2 2])
lgnd=legend('t=0','t=1','t=2','t=3','t=4');
ttl.FontSize=18;
ylabl.FontSize=18;
xlabl.FontSize=18;
lgnd.FontSize=18;
EDIT: Minimum reproducible example
mgc=randi([-900*10^10,900*10^10], [1000,2]);
mgc=mgc*1000000;
plot(mgc(:,1),mgc(:,2))
ylim([-1,1])
This is odd. It really looks like a bug, at least partly.
The reason is probably that the angles of the lines are so narrow that MATLAB runs into rounding errors when it calculates the points to draw for very small axis limits given very large numbers. (You can see that you don't run into this problem when you don't scale the matrix mgc.)
mgc = randi([-900*10^10,900*10^10], [1000,2]);
plot(mgc(:,1),mgc(:,2))
ylim([-1,1])
but if you scale it further, you run into this problem...
mgc = randi([-900*10^10,900*10^10], [1000,2]);
plot(mgc(:,1)*1e6,mgc(:,2)*1e6)
ylim([-1,1])
While those numbers are nowhere near the largest number a double can represent (type realmax in the command window to see that it is on the order of 1e308!), limiting the plot to [-1,1] on one of the axes (note that you get the same phenomenon on the x-axis) makes MATLAB run into precision problems.
First of all, you see that it plots far fewer lines than before (in my case), although I only asked it to zoom in on the y-axis. The thing is that MATLAB does not recalculate the lines for the section; it really zooms into it (I guess that this may cause resolution errors with regard to pixels?).
Well, let's have a look at the data. Pro tip: you can get the data of a line from a MATLAB figure with this snippet:
datObj = findobj(gcf,'-property','YData','-property','XData');
X = datObj.XData;
Y = datObj.YData;
xlm = get(gca,'XLim'); % get the current x-limits
We see that it represents the original data set, which is not surprising, as you can also zoom out again.
Note that this only occurs if you have such a chaotic, jagged line. If you sort it, it does not happen.
quick fix:
Now, what happens, if we calculate the exact points for this section?
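m = diff(Y)./diff(X); % slope of each line segment
n = Y(1:end-1)-m.*X(1:end-1); % offset of each line segment
x = [(-1-n); (1-n)]./m; % x-values where each segment crosses y=-1 (row 1) and y=1 (row 2)
y = ones(size(x))./[-1 1].'; % matching y-values: -1 in row 1, 1 in row 2
% plot each segment as its own two-point line
plot(x,y)
xlim(xlm); % limit to exact same scale as before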
The different colors indicate that they are now individual lines and not a single wild chaos;)
It seems Max pretty much hit the nail on the head as to why this error occurs. Per Enrico's advice, I went ahead and submitted a bug report. MathWorks responded that they were unsure whether it was "unexpected behavior" and would look into it more shortly. They also suggested a temporary workaround (which, in my case, may be permanent).
This workaround is to put
set(gca,'ClippingStyle','rectangle');
directly after the plotting line.
Below is a modified version of the minimum reproducible example with this modification.
mgc=randi([-900*10^10,900*10^10], [1000,2]);
mgc=mgc*1000000;
plot(mgc(:,1),mgc(:,2))
set(gca,'ClippingStyle','rectangle');
ylim([-1,1])
I have a signal that looks like this :
I would like to find a way to locate the start and end of the portion of the middle.
What I did was to map the values above and below 0.5 to a constant (==> 1), and if I find 1 many times in a row, it means that it's my signal... but I guess it's not a good way! First, my "threshold" will not always be 0.5, and I am sure there is a better way to do this.
If you guys have some documentation or ideas about that...
Thank you very much.
As mentioned by the others, this is more of a DSP question and dsp.stackexchange.com will probably give you a better answer, but until then this might help:
data=csvread('acceleration.txt',1); %assumed to be a single column of samples
threshold_y=max(data)*0.5; %Thanks to GameOfThrows
threshold_x=101; %how many zeros can be between two ones to still count as continuous
addframe=50; %if you want a little bit of data before and after the active area
logic_index=data>threshold_y;
num_index=find(logic_index);
distance=diff(num_index);
gaps=[0 ; find(distance>threshold_x) ; length(num_index)]; %segment boundaries: gaps bigger than your threshold
final_index=false(length(data),1);
for i=1:length(gaps)-1 %add ones between
first=max(num_index(gaps(i)+1)-addframe,1); %clamp to the start of the data
last=min(num_index(gaps(i+1))+addframe,length(data)); %clamp to the end of the data
final_index(first:last)=true;
end
x=1:length(data); %sample index used as x-axis
plot(x,data,x,final_index);
It is basically what you described in your question, but with the addition of dealing with the zeros in between within an active area. Thanks to @GameOfThrows for the threshold idea.
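To try the idea without the data file, here is a minimal sketch on a synthetic signal. The signal shape, the variable names (maxGap, segStart, segEnd) and the burst location are assumptions for illustration only; the logic is the same threshold-plus-gap approach, reduced to returning the first and last sample of each active segment.

```matlab
% Synthetic signal (assumed for illustration): quiet, a burst in samples 301..700, quiet again.
n = 1000;
data = 0.05 * ones(n, 1);
data(301:700) = abs(sin((1:400).' / 5));   % burst that repeatedly dips below the threshold

threshold_y = 0.5 * max(data);   % amplitude threshold relative to the peak
maxGap = 101;                    % largest run of sub-threshold samples tolerated inside a segment

active = find(data > threshold_y);           % indices of samples above the threshold
breaks = find(diff(active) > maxGap);        % positions where one segment ends and the next begins
segStart = active([1; breaks + 1]);          % first sample of each segment
segEnd   = active([breaks; numel(active)]);  % last sample of each segment
```

Here the short dips inside the burst are bridged because they are much shorter than maxGap, so a single segment is reported. The sketch assumes at least one sample exceeds the threshold.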
I am using MATLAB to evaluate a triple integral using integral3 and it is running very slowly. I was wondering if there are ways to speed it up. I am guessing it's due to the fact that I set the abstol wrong, but I'm not sure how to handle it. PS: the code below runs with no syntax errors. There are a couple of things I don't know how to pick: abstol, method, etc.
clear all
syms gamma1
syms gamma2
syms z
syms v
Nt=16; sigmanoise=10^(-7.9); c3=0.129; c1=(1-c3)/2;a2=0;b2=0;
a1=0.0030; b1= 0.0030; A1= 1.5625e-04,A2=0; B1= 7.8125e-05;B2=0;
theta= 3.1623;lambda1= 4.9736e-05;lambda2=0;p1=1;p2=0; alpha1=2; alpha2=4;delta1=2/alpha1; delta2=2/alpha2;beta1=0.025; beta2=0.025;
a= gamma1^-1+gamma2^-1+2*gamma1^(-0.5)*gamma2^(-0.5);
laplacesgi=(exp(+2*pi*j.*z*a)-1)./(2*pi*j*z);
laplacesgi=matlabFunction(laplacesgi);
laplacenoi=exp(-2*pi*j.*z*theta*sigmanoise/Nt);
laplacenoi=matlabFunction(laplacenoi);
interfere= @(gamma1,gamma2,v,z)( (1 -2*c1-c3./(1+2*pi*j*z*theta*v.^(-1))).*(A1.*(v).^(delta1-1).*exp(-a1.*(v).^ (delta1./2))+B1.*(v).^(delta2-1) .*(1-exp(-b1.*(v).^ (delta2./2)))));
gscalar =@(gamma1,gamma2,z)integral(@(v)(interfere(gamma1,gamma2,v,z)),gamma2,inf);
g = @(gamma1,gamma2,z)arrayfun(gscalar,gamma1,gamma2,z);
lp= A1*(gamma1)^(delta1-1)*exp(-a1*(gamma1)^ (delta1/2))+B1*(gamma1)^(delta2-1)*(1-exp(-b1*(gamma1)^ (delta2/2)))+A2*gamma1^(delta1-1)*exp(-a2*gamma1^(delta1/2))+ B2*gamma1^(delta2-1)*(1-exp(-b2*gamma1^ (delta2/2)));%;
dk1=((2*pi*lambda1))/(beta1^2)*(1-exp(-a1*(gamma2)^(delta1/2))*(1+(gamma2)^(delta1/2)*a1))+ pi*lambda1*gamma2^(delta2)*p1^delta2-((2*pi*lambda1)/(beta1^2))*(1-exp(-b1*(gamma2)^(delta2/2))*(1+(gamma2)^(delta2/2)*b1));
dk2=((2*pi*lambda2))/(beta2^2)*(1-exp(-a2*(gamma2)^(delta1/2))*(1+(gamma2)^(delta1/2)*a2))+ pi*lambda2*gamma2^(delta2)*p2^delta2-((2*pi*lambda2)/(beta2^2))*(1-exp(-b2*(gamma2)^(delta2/2))*(1+(gamma2)^(delta2/2)*b2));
dk=dk1+dk2;
lcp= A1*(gamma2)^(delta1-1)*exp(-a1*(gamma2)^ (delta1/2))+B1*(gamma2)^(delta2-1)*(1-exp(-b1*(gamma2)^ (delta2/2)))+A2*gamma2^(delta1-1)*exp(-a2*gamma2^ (delta1/2))+ B2*gamma2^(delta2-1)*(1-exp(-b2*gamma2^(delta2/2)));%;
pdflast=lp*lcp*exp(-dk);
pdflast=matlabFunction(pdflast);
pdflast= @(gamma1,gamma2)arrayfun(pdflast,gamma1,gamma2);
gamma2min=@(gamma1)gamma1;
warning('off','MATLAB:integral:MinStepSize');
T = integral3(@(gamma1,gamma2,z)(laplacenoi(z).*laplacesgi(gamma1,gamma2,z).*pdflast(gamma1,gamma2).*exp(-g(gamma1,gamma2,z))),0,inf,@(gamma2)gamma2,inf,0.05,1000,'abstol',1e-3)
I appreciate any ideas or suggestions.
This is getting way too long for a comment, and while it doesn't really give an answer either, I think it may be helpful anyway, so I will slightly abuse the answer form for it.
Code Readability
I don't think your code as it stands fulfills the basic fundamental purpose of code: Communicating with a human being, probably yourself down the road.
I don't know if the variable names are unambiguous enough that in six months, they will still tell you exactly what is what. If they are, great. If not, you may want to improve upon them. (And yes, naming stuff is one of the hardest parts of programming, but that doesn't make it less important.)
The same holds true for comments: If you don't need comments on your formulas, more power to you. I have no idea what you are computing, so the fact that I don't understand your formulas doesn't mean much. But again, think of yourself in a few months, looking for a problem: Would you have wished for a comment such that you know if that factor is really correct or off by one?
Here's something I do know: Your formulas are simply too wide to be comprehended at once. Simple reformatting helps to see the structure better. Here's how I reformatted your code to start making heads or tails from it:
clear all
syms gamma1
syms gamma2
syms z
syms v
Nt=16;
sigmanoise=10^(-7.9);
c3=0.129;
c1=(1-c3)/2;
a2=0;
b2=0;
a1=0.0030;
b1=0.0030;
A1=1.5625e-04;
A2=0;
B1=7.8125e-05;
B2=0;
theta=3.1623;
lambda1=4.9736e-05;
lambda2=0;
p1=1;
p2=0;
alpha1=2;
alpha2=4;
delta1=2/alpha1;
delta2=2/alpha2;
beta1=0.025;
beta2=0.025;
a=gamma1^(-1)+gamma2^(-1)+2*gamma1^(-0.5)*gamma2^(-0.5);
laplacesgi=matlabFunction((exp(2*pi*1j*z*a)-1)./(2*pi*1j*z));
laplacenoi=matlabFunction(exp(-2*pi*1j*z*theta*sigmanoise/Nt));
interfere= @(gamma1,gamma2,v,z)( ...
(1 -2*c1-c3./(1+2*pi*j*z*theta*v.^(-1))).*(A1.*v.^(delta1-1).* ...
exp(-a1.*v.^(delta1./2))+B1.*v.^(delta2-1).*(1-exp(-b1.*v.^(delta2./2)))));
gscalar=@(gamma1,gamma2,z)integral(@(v)(interfere(gamma1,gamma2,v,z)),gamma2,inf);
g=@(gamma1,gamma2,z)arrayfun(gscalar,gamma1,gamma2,z);
lp=A1*gamma1^(delta1-1)*exp(-a1*gamma1^(delta1/2))+ ...
B1*gamma1^(delta2-1)*(1-exp(-b1*gamma1^(delta2/2)))+ ...
A2*gamma1^(delta1-1)*exp(-a2*gamma1^(delta1/2))+ ...
B2*gamma1^(delta2-1)*(1-exp(-b2*gamma1^(delta2/2)));
dk1=((2*pi*lambda1))/(beta1^2)*(1-exp(-a1*gamma2^(delta1/2))*(1+gamma2^(delta1/2)*a1))+ ...
pi*lambda1*gamma2^(delta2)*p1^delta2- ...
((2*pi*lambda1)/(beta1^2))*(1-exp(-b1*gamma2^(delta2/2))*(1+gamma2^(delta2/2)*b1));
dk2=((2*pi*lambda2))/(beta2^2)*(1-exp(-a2*gamma2^(delta1/2))*(1+gamma2^(delta1/2)*a2))+ ...
pi*lambda2*gamma2^(delta2)*p2^delta2- ...
((2*pi*lambda2)/(beta2^2))*(1-exp(-b2*gamma2^(delta2/2))*(1+gamma2^(delta2/2)*b2));
dk=dk1+dk2;
lcp=A1*gamma2^(delta1-1)*exp(-a1*gamma2^(delta1/2))+ ...
B1*gamma2^(delta2-1)*(1-exp(-b1*gamma2^(delta2/2)))+ ...
A2*gamma2^(delta1-1)*exp(-a2*gamma2^(delta1/2))+ ...
B2*gamma2^(delta2-1)*(1-exp(-b2*gamma2^(delta2/2)));
pdflast=matlabFunction(lp*lcp*exp(-dk));
pdflast=@(gamma1,gamma2)arrayfun(pdflast,gamma1,gamma2);
gamma2min=@(gamma1)gamma1;
T = integral3(@(gamma1,gamma2,z)( ...
laplacenoi(z).*laplacesgi(gamma1,gamma2,z).*pdflast(gamma1,gamma2).*exp(-g(gamma1,gamma2,z))), ...
0,inf,...
@(gamma2)gamma2,inf,...
0.05,1000,...
'abstol',1e-3)
A few notes on this:
MATLAB is one of the languages that require an indication that the logical line should continue after the physical line break. The indication in MATLAB is three dots.
Get rid of any and all warnings the MATLAB editor shows you. In very rare cases, by disabling the warning for this line; usually, by correcting your code. Some of these warnings may seem over-protective, but code quickly reaches the point where none of us can have enough of it in our minds to see the more subtle problems, and linting helps avoid a fair number of them, in my experience.
Consistent spacing helps, in the same way “proper” (i.e., standardized) spelling makes reading English easier: The patterns are just much more obvious.
Line breaks should in general not be done haphazardly, but emphasizing the structure of commands and formulas. In several of your formulas, I have seen symmetries between input parameters and tried to make them obvious by placing the line breaks accordingly. That helps a lot when looking for typos.
Your code has lines such as these:
pdflast=lp*lcp*exp(-dk);
pdflast=matlabFunction(pdflast);
I used to recycle variables like that, too. Over time, I learned the hard way that it helps for debugging and readability not to, especially if your values have different types, as they do here.
There are a few points I would still clean up at this point. For example, pdflast works just fine on arrays and the line pdflast= @(gamma1,gamma2)arrayfun(pdflast,gamma1,gamma2); should be deleted, and the lower bound for gamma2 in the integral3 call is a function of gamma1 and should be changed to @(gamma1)gamma1.
Does the computer/MATLAB care about any of this? Maybe something slipped in where it does, but basically: No. All of these changes are for you, and if you send your code in an SO post, for us, the readers.
(Likely) Bug: Vectorization
I think your definition of g is wrong:
g=@(gamma1,gamma2,z)arrayfun(gscalar,gamma1,gamma2,z);
The cubature (i.e., integral3) will try to call this function with non-scalar values for one or more of the parameters. Most likely, these will not all be of the same size, and even if they were, it would expect to get a 3D result, not a vector. Try calling your g that way:
>> g(1:2,1,1)
Error using arrayfun
All of the input arguments must be of the same size and shape.
Previous inputs had size 2 in dimension 2. Input #3 has size 1
Error in @(gamma1,gamma2,z)arrayfun(gscalar,gamma1,gamma2,z)
It's really a good idea to check intermediate building blocks like that. What you really need to have is an arrayfun over gamma2, something like this:
gscalar=@(gamma1,gamma2,z) ...
integral(@(v)(interfere(gamma1,gamma2,v,z)),gamma2,inf, ...
'ArrayValued',true);
g = @(gamma1,gamma2,z)arrayfun(@(gamma2)gscalar(gamma1,gamma2,z),gamma2);
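The pattern can be checked in isolation on an integrand with a known antiderivative. The sketch below is not the asker's formula: the integrand f and the reduced argument list are assumptions chosen so that the result can be compared against exp(-gamma2), and z is taken to be a scalar.

```matlab
% Hypothetical integrand with a known result: the integral of z*exp(-v)
% from gamma2 to infinity equals z*exp(-gamma2).
f = @(v, z) z .* exp(-v);

% Scalar building block: one integral per lower limit gamma2.
gscalar = @(gamma2, z) integral(@(v) f(v, z), gamma2, inf, 'ArrayValued', true);

% Vectorized wrapper: arrayfun loops over the lower limits only.
g = @(gamma2, z) arrayfun(@(g2) gscalar(g2, z), gamma2);

vals = g([0 1 2], 1);        % three lower limits, scalar z
expected = exp(-[0 1 2]);    % closed-form values for comparison
```

Calling the wrapper with a small vector like this, before handing it to integral3, is the kind of intermediate check suggested above.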
(Possible) Bug: Definition of interfere
I don't know if you tried checking interfere against any known or suspected values. (Sanity checks for formulas I just typed seem a really good idea to me.) I somehow doubt that the formula is correctly capturing your intent:
interfere=@(gamma1,gamma2,v,z)( ...
(1-2*c1-c3./(1+2*pi*1j*z*theta*v.^(-1))).*(A1.*v.^(delta1-1).* ...
exp(-a1.*v.^(delta1./2))+B1.*v.^(delta2-1).*(1-exp(-b1.*v.^(delta2./2)))));
The potential problem with this formula (apart from a somewhat inconsistent use of * vs. .* etc.) is that the values do not depend on gamma1 and gamma2 at all.
Of course, that can happen, but if you actually mean it to be the case, what is the rationale for including gamma1 in the formula in the first place?
If this is as it should be, you may need to still make the result the proper size: Right now, interfere simply ignores its first two inputs, which may trip up the integrator: interfere(1:3,1,1,1) should return a 3-element vector.
Concluding Thoughts
As you may have noticed, your question did not get a satisfying answer yet. Nor do I think in its current form it will. To get volunteers to look at your problem, you need to make it easy to understand what you are doing:
Start by simplifying your formulas. They may not be of interest to you anymore, but right now, they're just clutter.
Trim down your parameters. That is somehow part of the above.
Throw out things that are probably irrelevant. Apart from the point that you don't need (and probably don't want) an additional arrayfun around the matlabFunction results, symbolic math is likely to be irrelevant to your actual question on integral3. If you can ask your question without it, it may attract more attention.
For anything you cannot trim down, consider explaining what is happening.
Of course, in this process, for each iteration, test your code (after saying clear all or in a fresh MATLAB session!) to check if the problem is still there. If it is not, you may have found a hint where your basic problem is hiding.
For a longer discussion on the topic, see https://meta.stackexchange.com/questions/18584/how-to-ask-a-smart-question and the guides linked to within that discussion.
first a little background. I'm a psychology student so my background in coding isn't on par with you guys :-)
My problem is as follows. The most important observation is that curve fitting with two different programs gives completely different results for my parameters, although my graphs stay the same. The main program we have used to fit my longitudinal data is KaleidaGraph, which should be seen as kind of the 'gold standard'; the program I'm trying to adapt is MATLAB.
I was trying to be smart and wrote some code (a lot at least for me) and the goal of that code was the following:
1. Taking an individual longitudinal datafile
2. curve fitting this data on a non-parametric model using lsqcurvefit
3. obtaining figures and the points where f' and f'' are zero
This all worked well (woohoo :-)) but when I started comparing the function parameters both programs generate, there is a huge difference. KaleidaGraph stays close to its original starting values. MATLAB wanders off, and the parameters sometimes get larger by a factor of 1000. The graphs, however, stay more or less the same in both situations, and both fit the data well. Still, it would be lovely to know how to make the MATLAB curve fitting more 'conservative' and keep it closer to its original starting values.
validFitPersons = true(nbValidPersons,1);
for i=1:nbValidPersons
personalData = data{validPersons(i),3};
personalData = personalData(personalData(:,1)>=minAge,:);
% Fit a specific model for all valid persons
try
opts = optimoptions(@lsqcurvefit, 'Algorithm', 'levenberg-marquardt');
[personalParams,personalRes,personalResidual] = lsqcurvefit(heightModel,initialValues,personalData(:,1),personalData(:,2),[],[],opts);
catch
x=1;
end
Above is the part of the code I've written to fit the data files to a specific model.
Below is an example of a non-parametric model i use with its function parameters.
elseif strcmpi(model,'jpa2')
% y = a.*(1-1/(1+(b_1(t+e))^c_1+(b_2(t+e))^c_2+(b_3(t+e))^c_3))
heightModel = @(params,ages) abs(params(1).*(1-1./(1+(params(2).* (ages+params(8) )).^params(5) +(params(3).* (ages+params(8) )).^params(6) +(params(4) .*(ages+params(8) )).^params(7) )));
modelStrings = {'a','b1','b2','b3','c1','c2','c3','e'};
% Define initial values
if strcmpi('male',gender)
initialValues = [176.76 0.339 0.1199 0.0764 0.42287 2.818 18.52 0.4363];
else
initialValues = [161.92 0.4173 0.1354 0.090 0.540 2.87 14.281 0.3701];
end
I've tried to mimic the curve fitting process in KaleidaGraph as closely as possible. There I found that they use the Levenberg-Marquardt algorithm, which I've selected. However, results still vary and I don't have any more clues about how I can change this.
Some extra adjustments:
The idea for this code was the following:
I'm trying to compare different fitting models (they are designed for this purpose). So what I do is I have 5 models with different parameters and different starting values ( the second part of my code) and next I have the general curve fitting file. Since there are different models it would be interesting if I could put restrictions into how far my starting values could wander off.
Anyone any idea how this could be done?
Anybody willing to help a psychology student?
Cheers
This is a common issue when dealing with non-linear models.
If I were you, I would check whether you can remove some parameters from the model in order to simplify it.
If you really want to keep your solution not too far from the initial point, you can use upper bounds and lower bounds for each variable:
x = lsqcurvefit(fun,x0,xdata,ydata,lb,ub)
defines a set of lower and upper bounds on the design variables in x so that the solution is always in the range lb ≤ x ≤ ub.
Cheers
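As a minimal sketch of the bounds idea, assuming the Optimization Toolbox is available: the two-parameter model, the synthetic data and the 25% band around the starting values are all made up for illustration and are not the asker's height model. Depending on the MATLAB release, the 'levenberg-marquardt' algorithm may not accept bounds, so this sketch leaves the algorithm at its default.

```matlab
% Hypothetical two-parameter saturation model and synthetic data (illustration only).
model = @(p, t) p(1) .* (1 - exp(-p(2) .* t));
t = (0:0.5:10).';
y = model([170 0.4], t);

x0 = [160 0.5];                 % starting values
lb = x0 - 0.25 .* abs(x0);      % each parameter may move at most 25% below its start
ub = x0 + 0.25 .* abs(x0);      % ... and at most 25% above it

opts = optimoptions(@lsqcurvefit, 'Display', 'off');
p = lsqcurvefit(model, x0, t, y, lb, ub, opts);   % fitted parameters stay inside [lb, ub]
```

The width of the band is a modelling decision: too narrow and the fit is pinned to the bounds, too wide and the parameters can drift as before.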
You state:
I'm trying to compare different fitting models (they are designed for
this purpose). So what I do is I have 5 models with different
parameters and different starting values ( the second part of my code)
and next I have the general curve fitting file.
You will presumably compare the statistics from fits with different models, to see whether reductions in the fitting error are unlikely to be due to chance. You may want to rely on that comparison to pick the model that not only fits your data suitably but is also simplest (which is often referred to as the principle of parsimony).
The problem is really with the model you have shown resulting in correlated parameters and therefore overfitting, as mentioned by @David. Again, this should be resolved when you compare different models and find that some do just as well (statistically speaking) even though they involve fewer parameters.
edit
To drive the point home regarding the problem with the choice of model, here are (1) results of a trial fit using simulated data (2) the correlation matrix of the parameters in graphical form:
Note that absolute values of the correlation close to 1 indicate strongly correlated parameters, which is highly undesirable. Note also that the trend in the data is practically linear over a long portion of the dataset, which implies that 2 parameters might suffice over that stretch, so using 8 parameters to describe it seems like overkill.
I programmed in MATLAB for many years, but switched to using R exclusively in the past few years so I'm a little out of practice. I'm interviewing a candidate today who describes himself as a MATLAB expert.
What MATLAB interview questions should I ask?
Some other sites with resources for this:
"Matlab interview questions" on Wilmott
"MATLAB Questions and Answers" on GlobaleGuildLine
"Matlab Interview Questions" on CoolInterview
This is a bit subjective, but I'll bite... ;)
For someone who is a self-professed MATLAB expert, here are some of the things that I would personally expect them to be able to illustrate in an interview:
How to use the arithmetic operators for matrix or element-wise operations.
A familiarity with all the basic data types and how to convert effortlessly between them.
A complete understanding of matrix indexing and assignment, be it logical, linear, or subscripted indexing (basically, everything on this page of the documentation).
An ability to manipulate multi-dimensional arrays.
The understanding and regular usage of optimizations like preallocation and vectorization.
An understanding of how to handle file I/O for a number of different situations.
A familiarity with handle graphics and all of the basic plotting capabilities.
An intimate knowledge of the types of functions in MATLAB, in particular nested functions. Specifically, given the following function:
function fcnHandle = counter
value = 0;
function currentValue = increment
value = value+1;
currentValue = value;
end
fcnHandle = @increment;
end
They should be able to tell you what the contents of the variable output will be in the following code, without running it in MATLAB:
>> f1 = counter();
>> f2 = counter();
>> output = [f1() f1() f2() f1() f2()]; %# WHAT IS IT?!
We get several new people in the technical support department here at MathWorks. This is all post-hiring (I am not involved in the hiring), but I like to get to know people, so I give them the "Impossible and adaptive MATLAB programming challenge"
I start out with them at MATLAB and give them some .MAT file with data in it. I ask them to analyze it, without further instruction. I can very quickly get a feel for their actual experience.
http://blogs.mathworks.com/videos/2008/07/02/puzzler-data-exploration/
The actual challenge does not mean much of anything, I learn more from watching them attempt it.
Are they making scripts, functions, command line or GUI based? Do they seem to have a clear idea where they are going with it? What level of confidence do they have with what they are doing?
Are they a computer scientist or an engineer who learned to program? CS majors tend to do things like close their parentheses immediately, and other small optimizations like that. People who have been using MATLAB a while tend to capture the handles from plotting commands for later use.
How quickly do they navigate the documentation? Once I see they are going down the 'right' path then I will just change the challenge to see how quickly they can do plots, pull out submatrices etc...
I will throw out some old stuff from Project Euler. Mostly just ramp up the questions until one of us is stumped.
Floating Point Questions
Given that Matlab's main (only?) data type is the double precision floating point matrix, and that most people use floating point arithmetic -- whether they know it or not -- I'm astonished that nobody has suggested asking basic floating point questions. Here are some floating point questions of variable difficulty:
What is the range of |x|, an IEEE dp fpn?
Approximately how many IEEE dp fpns are there?
What is machine epsilon?
x = 10^22 is exactly representable as a dp fpn. What are the fpns xp
and xs just below and just above x ?
How many dp fpns are in [1,2)? How many atoms are on an edge of a
1-inch sugar cube?
Explain why sin(pi) ~= 0, but cos(pi) = -1.
Why is if abs(x1-x2) < 1e-10 then a bad convergence test?
Why is if f(a)*f(b) < 0 then a bad sign check test?
The midpoint c of the interval [a,b] may be calculated as:
c1 = (a+b)/2, or
c2 = a + (b-a)/2, or
c3 = a/2 + b/2.
Which do you prefer? Explain.
Calculate in Matlab: a = 4/3; b = a-1; c = b+b+b; e = 1-c;
Mathematically, e should be zero but Matlab gives e = 2.220446049250313e-016 = 2^(-52), machine epsilon (eps). Explain.
Given that realmin = 2.225073858507201e-308, and Matlab's u = rand gives a dp fpn uniformly distributed over the open interval (0,1):
Are the floating point numbers [2^(-400), 2^(-100), 2^(-1)]
= 3.872591914849318e-121, 7.888609052210118e-031, 5.000000000000000e-001
equally likely to be output by rand ?
Matlab's rand uses the Mersenne Twister rng which has a period of
(2^19937-1)/2, yet there are only about 2^64 dp fpns. Explain.
Find the smallest IEEE double precision fpn x, 1 < x < 2, such that x*(1/x) ~= 1.
Write a short Matlab function to search for such a number.
Answer: Alan Edelman, MIT
Would you fly in a plane whose software was written by you?
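A few of the questions above can be tried directly at the command line. The following is a minimal sketch; the variable names are arbitrary, and the midpoint case uses realmax as a deliberately extreme input.

```matlab
% sin(pi) vs. cos(pi): pi is only the double nearest to the mathematical
% constant, so sin(pi) is a tiny nonzero number, while cos(pi) rounds to -1.
s  = sin(pi);
cp = cos(pi);

% The 4/3 example: the rounding error made when storing 4/3 resurfaces as eps.
a = 4/3;
b = a - 1;
c = b + b + b;
e = 1 - c;

% Midpoint formulas at the top of the double range:
% (lo+hi)/2 overflows in the sum, the other two forms do not.
lo = realmax;
hi = realmax;
c1 = (lo + hi)/2;
c2 = lo + (hi - lo)/2;
c3 = lo/2 + hi/2;
```

Comparing e with eps, and c1 with c2 and c3, makes a good starting point for the "Explain" part of each question.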
Colin K would not hire me (and probably fire me) for saying "that
Matlab's main (only?) data type is the double precision floating
point matrix".
When Matlab started that was all the user saw, but over the years
they have added what they coyly call 'storage classes': single,
(u)int8,16,32,64, and others. But these are not really types
because you cannot do USEFUL arithmetic on them. Arithmetic on
these storage classes is so slow that they are useless as types.
Yes, they do save storage but what is the point if you can't do
anything worthwhile with them?
See my post (No. 13) here, where I show that arithmetic on int32s is 12 times slower than
double arithmetic and where MathWorkser Loren Shure says "By
default, MATLAB variables are double precision arrays. In the olden
days, these were the ONLY kind of arrays in MATLAB. Back then even
character arrays were stored as double values."
For me the biggest flaw in Matlab is its lack of proper types,
such as those available in C and Fortran.
By the way Colin, what was your answer to Question 14?
Ask questions about his expertise and experience in applying MATLAB in your domain.
Ask questions about how he would approach designing an application for implementation in MATLAB. If he refers to recent features of MATLAB, ask him to explain them, and how they are different from the older features they replace or supplement, and why they are preferable (or not).
Ask questions about his expertise with MATLAB data structures. Many of the MATLAB 'experts' I've come across are very good at writing code, but very poor at determining what are the best data structures for the job in hand. This is often a direct consequence of their being domain experts who've picked up MATLAB rather than having been trained in computerism. The result is often good code which has to compensate for the wrong data structures.
Ask questions about his experience, if any, with other languages/systems and invite him to expand upon his observations about the relative strengths and weaknesses of MATLAB.
Ask for top tips on optimising MATLAB programs. Expect the answers: vectorisation, pre-allocation, clearing unused variables, etc.
Ask about his familiarity with the MATLAB profiler, debugger and lint tools. I've recently discovered that the MATLAB 'expert' over in the corner here had never, in 10 years using the tool, found the profiler.
That should get you started.
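For reference, the first two tips look like this in practice. This is a minimal sketch with a made-up computation (squares of 1..n), not a benchmark.

```matlab
n = 1e5;

% Pre-allocation: reserve the output once instead of growing it in the loop.
sqLoop = zeros(1, n);
for k = 1:n
    sqLoop(k) = k^2;
end

% Vectorisation: the same result as a single array operation.
sqVec = (1:n).^2;
```

A candidate should be able to say why the loop without the zeros call is slower, and when the loop is nevertheless the better choice (for example, when the vectorised intermediate would not fit in memory).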
I. I think this recent SO question
on indexing is a very good question
for an "expert".
I have a 2D array, call it 'A'. I have
two other 2D arrays, call them 'ix'
and 'iy'. I would like to create an
output array whose elements are the
elements of A at the index pairs
provided by x_idx and y_idx. I can do
this with a loop as follows:
for i=1:nx
for j=1:ny
output(i,j) = A(ix(i,j),iy(i,j));
end
end
How can I do this without the loop? If
I do output = A(ix,iy), I get the
value of A over the whole range of
(ix)X(iy).
II. Basic knowledge of operators like element-wise multiplication between two matrices (.*).
III. Logical indexing - generate a random symmetric matrix with values from 0-1 and set all values above T to 0.
IV. Read a file with some properly formatted data into a matrix (importdata)
V. Here's another sweet SO question
I have three 1-d arrays where elements
are some values and I want to compare
every element in one array to all
elements in other two.
For example:
a=[2,4,6,8,12]
b=[1,3,5,9,10]
c=[3,5,8,11,15]
I want to know if there are same
values in different arrays (in this
case there are 3,5,8)
Btw, there's an excellent chance your interviewee will Google "MATLAB interview questions" and see this post :)
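For the interviewer's reference, here is one possible set of answers to items I, III and V, as a minimal sketch. The sample matrices, the threshold T = 0.5 and the variable names are assumptions for illustration; other solutions are equally valid.

```matlab
% I. Index pairs without a loop: convert (row, column) pairs to linear indices.
A  = magic(4);
ix = [1 2; 3 4];
iy = [4 3; 2 1];
output = A(sub2ind(size(A), ix, iy));

% III. Random symmetric matrix with values in [0,1], then zero everything above T.
T = 0.5;
R = rand(5);
S = (R + R.') / 2;     % averaging with the transpose keeps the matrix symmetric
S(S > T) = 0;          % logical indexing

% V. Values that occur in more than one of the three arrays.
a = [2,4,6,8,12];
b = [1,3,5,9,10];
c = [3,5,8,11,15];
common = unique([intersect(a,b), intersect(a,c), intersect(b,c)]);
```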
Possible question:
I have an array A of n R,G,B triplets. It is a 3xn matrix. I have another array B in the form 1xn which stores an index value (association to a cluster) for each triplet.
How do I plot the triplets of A in 3D space (using plot3 function), coloring each triplet according to its index in B? (The goal is to qualitatively evaluate my clustering)
Really, really good programmers who are MATLAB novices won't be able to give you an efficient (== MATLAB style) solution. However, it is a very simple problem if you do know your MATLAB.
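One possible answer, as a minimal sketch: the random data, the choice of three clusters and the lines colormap are assumptions for illustration. It draws one plot3 call per cluster so that each cluster gets its own color.

```matlab
n = 60;
A = rand(3, n);           % hypothetical R,G,B triplets, one per column
B = randi(3, 1, n);       % hypothetical cluster index for each triplet

clusters = unique(B);
colors = lines(numel(clusters));   % one distinct color per cluster

figure; hold on
for k = 1:numel(clusters)
    sel = (B == clusters(k));      % logical mask selecting this cluster's columns
    plot3(A(1,sel), A(2,sel), A(3,sel), '.', 'Color', colors(k,:));
end
hold off
view(3); grid on
xlabel('R'); ylabel('G'); zlabel('B');
```

The logical mask inside the loop is the part that tends to separate MATLAB users from programmers coming from other languages.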
Depends a bit what you want to test.
To test MATLAB fluency, there are several nice Stack Overflow questions that you could use to test e.g. array manipulations (example 1, example 2), or you could use fix-this problems like this question (I admit, I'm rather fond of that one), or look into this list for some highly MATLAB-specific stuff. If you want to be a bit mean, throw in a question like this one, where the best solution is a loop, and the typical MATLAB-way-of-thinking solution would just fill up the memory.
However, it may be more useful to ask more general programming questions that are related to your area of work and see whether they get the problem solved with MATLAB.
For example, since I do image analysis, I may ask them to design a class for loading images of different formats (a MATLAB expert should know how to do OOP, after all, it has been out for two years now), and then ask follow-ups as to how to deal with large images (I want to see a check on how much memory would be used - or maybe they know memory.m - and to hear about how MATLAB usually works with doubles), etc.