What does nlfilter do? - matlab

I am smoothing an image and some forums gave me this.
fstr = #(a) median(a(:));
smooth_img = nlfilter(A,[50 50],fstr);
Is it going to find the median of 50x50 block and move to next 50x 50 block?
I mean the block is from pixel 1 to 50, in the next iteration it goes to 51 to 100 or 1 to 50 then to 2 to 51 and so on?
Thank you.

nlfilter() is a sliding filter, so the latter is correct, i.e. 1:50, 2:51, 3:52, etc..
The function blockproc() works in a blockwise manner, i.e. 1:50, 51:100, etc.. if that is what you need

Related

How to use Matlab (for/while/if)

I drew lines at 100,100,20 intervals as shown in the picture.
How do I create code using (for) or (while)? Please let me know.
xline(100,'b-');
xline(200,'b-');
xline(220,'b-');
xline(320,'b-');
xline(420,'b-');
xline(440,'b-');
xline(540,'b-');
It it's not a constant spacing between x coordinates of your line you'll have to define a matrix then call for it's element :
x_matrix = [100 200 220 320 420 440 540];
for i = 1:length(x_matrix)
xline(x_matrix(i),'b-');
end
PS: I wouldn't recommend to use plot function in a while loop. If the while loop is executed too much times or even if matlab is stuck in it, your computer won't like...

MATLAB is too slow calculation of Spearman's rank correlation for 9-element vectors

I need to calculate the Spearman's rank correlation (using corr function) for pairs of vectors with different lengths (for example 5-element vectors to 20-element vectors). The number of pairs is usually above 300 pairs for each length. I track the progress with waitbar. I have noticed that it takes unusually very long time for 9-element pair of vectors, where for other lengths (greater and smaller) it takes very short times. Since the formula is exactly the same, the problem must have originated in MATLAB function corr.
I wrote the following code to verify that the problem is with corr function and not other calculations that I have besides 'corr', where all of that calculations (including 'corr') take place inside some 2 or 3 'for' loops. The code repeats the timing 50 times to avoid accidental results.
The result is a bar graph, confirming that it takes a long time for MATLAB to calculate Spearman's rank correlation for 9-element vectors. Since my calculations are not that heavy, this problem does not cause endless wait, it just increases the total time consumed for the whole process. Can someone tell me that what causes the problem and how to avoid it?
Times1 = zeros(20,50);
for i = 5:20
for j = 1:50
tic
A = rand(i,2);
[r,p] = corr(A(:,1),A(:,2),'type','Spearman');
Times1(i,j) = toc;
end
end
Times2 = mean(Times1,2);
bar(Times2);
xticks(1:25);
xlabel('number of elements in vectors');
ylabel('average time');
After some investigation, I think I found the root of this very interesting problem. My tests have been conducted profiling every outer iteration using the built-in Matlab profiler, as follows:
res = cell(20,1);
for i = 5:20
profile clear;
profile on -history;
for j = 1:50
uni = rand(i,2);
corr(uni(:,1),uni(:,2),'type','Spearman');
end
profile off;
p = profile('info');
res{i} = p.FunctionTable;
end
The produced output looks like this:
The first thing I noticed is that the Spearman correlation for matrices with a number of rows less than or equal to 9 is computed in a different way than for matrices with 10 or more rows. For the former, the functions being internally called by the corr function are:
Function Number of Calls
----------------------- -----------------
'factorial' 100
'tiedrank>tr' 100
'tiedrank' 100
'corr>pvalSpearman' 50
'corr>rcumsum' 50
'perms>permsr' 50
'perms' 50
'corr>spearmanExactSub' 50
'corr>corrPearson' 50
'corr>corrSpearman' 50
'corr' 50
'parseArgs' 50
'parseArgs' 50
For the latter, the functions being internally called by the corr function are:
Function Number of Calls
----------------------- -----------------
'tiedrank>tr' 100
'tiedrank' 100
'corr>AS89' 50
'corr>pvalSpearman' 50
'corr>corrPearson' 50
'corr>corrSpearman' 50
'corr' 50
'parseArgs' 50
'parseArgs' 50
Since the computation of the Spearman correlation for matrices with 10 or more rows seems to run smoothly and quickly and doesn't show any evidence of performance bottlenecks, I decided to avoid losing time investigating on this fact and I focused on the main concern: the small matrices.
I tried to understand the difference between the execution time of the whole process for a matrix with 5 rows and for one with 9 rows (the one notably showing the worst performance). This is the code I used:
res5 = res{5,1};
res5_tt = [res5.TotalTime];
res5_tt_perc = ((res5_tt ./ sum(res5_tt)) .* 100).';
res9_tt = [res{9,1}.TotalTime];
res9_tt_perc = ((res9_tt ./ sum(res9_tt)) .* 100).';
res_diff = res9_tt_perc - res5_tt_perc;
[~,res_diff_sort] = sort(res_diff,'desc');
tab = [cellstr(char(res5.FunctionName)) num2cell([res5_tt_perc res9_tt_perc res_diff])];
tab = tab(res_diff_sort,:);
tab = cell2table(tab,'VariableNames',{'Function' 'TT_M5' 'TT_M9' 'DIFF'});
And here is the result:
Function TT_M5 TT_M9 DIFF
_______________________ _________________ __________________ __________________
'corr>spearmanExactSub' 7.14799963478685 16.2879721171023 9.1399724823154
'corr>pvalSpearman' 7.98185309750143 16.3043118970503 8.32245879954885
'perms>permsr' 3.47311716905926 8.73599255035966 5.26287538130039
'perms' 4.58132952553723 8.77488502392486 4.19355549838763
'corr>corrSpearman' 15.629476293326 16.440893059217 0.811416765890929
'corr>rcumsum' 0.510550019981949 0.0152486312660671 -0.495301388715882
'factorial' 0.669357868472376 0.0163923929871943 -0.652965475485182
'parseArgs' 1.54242684137027 0.0309456171268161 -1.51148122424345
'tiedrank>tr' 2.37642998160463 0.041010720272735 -2.3354192613319
'parseArgs' 2.4288171135289 0.0486075856244615 -2.38020952790444
'corr>corrPearson' 2.49766877262937 0.0484657591710417 -2.44920301345833
'tiedrank' 3.16762535118088 0.0543584195582888 -3.11326693162259
'corr' 21.8214856092549 16.5664346332513 -5.25505097600355
Once the bottleneck was detected, I started analyzing the internal code (open corr) and I finally found the cause of the problem. Within the spearmanExactSub, this part of code is being executed (where n is the number of rows of the matrix):
n = arg1;
nfact = factorial(n);
Dperm = sum((repmat(1:n,nfact,1) - perms(1:n)).^2, 2);
A permutation is being computed on a vector whose values range from 1 to n. This is what comes into play increasing the computational complexity (and, obviously, the computational time) of the function. Other operations, like the subsequent repmat on factorial(n) of 1:n and the ones below that point, contribute to worsen the situation. Now, long story short...
factorial(5) = 120
factorial(6) = 720
factorial(7) = 5040
factorial(8) = 40320
factorial(9) = 362880
can you see the reason why, between 5 and 9, your bar graph shows an "exponentially" increasing computational time?
On a side note, there is nothing you can do to solve this problem, unless you find another implementation of the Spearman correlation that doesn't present the same bottleneck or you implement your own.

What does a function n-mod(a,m) such as 8-mod(330,8) do on MATLAB?

I really can't figure out what it does.
More specifically, I am working on image compression on Matlab and I am provided a code that looks like this:
X=imread('image1.jpg');
s=size(X); % image1.jpg has size 330 x 220
offset1 = mod(8-mod(s(1),8),8);
offset2 = mod(8-mod(s(2),8),8);
if offset1 ~= 0 || offset2 ~= 0
X(s(1)+offset1, s(2)+offset2, 3) = 0;
end
figure(1)
image(X);
axis image
axis off
Trying to figure out what that if statement does, but I have no clue what that offset1 and offset2 is referring to.
They're trying to determine whether the image size is a multiple of 8. JPEG images are always multiples of 8 in size internally because they are made from 8x8 DCT blocks. The header can specify a smaller size, in which case only the specified upper-leftmost portion is visible, and the right and bottom are trimmed.
The part 8 - mod(s(1), 8) is computing how many more bytes it would take to get to the next multiple of 8 in the X size. The outer mod(..., 8) just folds the case of "8 more bytes" back into "0 more bytes".

Detecting if values are within range of each other and taking a midpoint - MATLAB

Following on from: Detecting if any values are within a certain value of each other - MATLAB
I am currently using randi to generate a random number from which I then subtract and add a second number - generated using poissrnd:
for k=1:10
a = poissrnd(200,1);
b(k,1) = randi([1,20000]);
c(k,1:2) = [b(k,1)-a,b(k,1)+a];
end
c = sort(c);
c provides an output in this format:
823 1281
5260 5676
5372 5760
5379 5779
6808 7244
6869 7293
9203 9653
12197 12563
14411 14765
15302 15670
Which are essentially the boundaries +/- a around the point chosen in b.
I then want to set an additional variable (i.e. d = 2000) which is used as the threshold by which values are matched and then merged. The boundaries are taken into consideration for this - the output of the above value when d = 2000 would be:
1052
7456
13933
The boundaries 823-1281 are not within 2000 of any other value so the midpoint is taken - reflecting the original value. The next midpoint taken is between 5260 and 9653 because as you go along, each successive values is within 2000 of the one before it until 9653. The same logic is then applied to take the midpoint between 12197 and 15670.
Is there a quick and easy way to adapt the answer give in the linked question to deal with a 2 column format?
EDIT (in order to make it clearer):
The values held in c can be thought of as demarcating the boundaries of 'blocks' that sit on a line. Every single boundary is checked to see if anything lies within 2000 of it (the black lines).
As soon as any black line touches a red block, that entire red block is incorporated into the same merge block - in full. This is why the first midpoint value calculated is 1052 - nothing is touched by the two black lines emanating from the first two boundaries. However the next set of blocks all touch one another. This incorporates them all into the merge such that the midpoint is taken between 9653 and 5260 = 7456.
The block starting at 12197 is out of reach of it's preceding one so it remains separate. I've not shown all the blocks.
EDIT 2 #Esteban:
b =
849
1975
8336
9599
12057
12983
13193
13736
16887
18578
c =
662 1036
1764 2186
8148 8524
9386 9812
11843 12271
12809 13157
12995 13391
13543 13929
16687 17087
18361 18795
Your script then produces the result:
8980
12886
17741
When in fact it should be:
1424
8980
12886
17741
So it is just missing the first value - if no merge is occurring, the midpoint is just taken between the two values. Sometimes this seems to work - other times it doesn't.
For example here it works (when value is set to 1000 instead of 2000 as a test):
c =
2333 2789
5595 6023
6236 6664
10332 10754
11425 11865
12506 12926
12678 13114
15105 15517
15425 15797
19490 19874
result =
2561
6129
11723
15451
19682
See if this works for you -
th = 2000 %// threshold
%// Column arrays
col1 = c(:,1)
col2 = c(:,2)
%// Position of "group" shifts
grp_changes = diff([col2(1:end-1,:) col1(2:end,:)],[],2)>th
%// Start and stop positions of shifts
stops = [grp_changes ; 1]
starts = [1 ; stops(1:end-1)]
%// Finally the mean of shift positions, which is the desired output
out = floor(mean([col1(starts~=0) col2(stops~=0)],2))
Not 100% sure if it will work for all your samples... but this is the code I came up with which works with at least the data in your example:
value=2000;
indices = find(abs(c(2:end,1)-c(1:end-1,2))>value);
indices = vertcat(indices, length(c));
li = indices(1:end-1)+1;
ri = indices(2:end);
if li(1)==2
li=vertcat(1,li);
ri=vertcat(1,ri);
end
result = floor((c(ri,2)+c(li,1))/2)
it's not very clean and could surely be done in less lines, but it's easy to understand and it works, and since your c will be small, I dont see the need to further optimize this unless you will run it millions of time.

flip coin 100 times get exactly 50 Matlab

If I flip a coin 100 times, what is the probability that exactly 50 will be heads? My thoughts were to get the number of times exactly 50 appeared in the 100 coin flips out of 1000 times and divide that by 1000, the number of events.
I have to model this experiment in Matlab.
I understand that flipping a coin 100 times and retrieving the number of heads and adding a count to the number of exactly 50 heads is one event. But I do not know how to repeat that event 1000, or 10000 times.
Here is the code I have written so far:
total_flips=100;
heads=0;
tails=0;
n=0;
for z=1:1000
%tosses 100 coins
for r=1:100
%randomizes to choose 1 or 0, 0 being heads
coin=floor(2*rand(1));
if (coin==0)
heads=heads+1;
else
tails=tails+1;
end
end
if heads==50
n=n+1;
end
end
I have tried to encompass the for loop and the if statement within a for loop, but had no luck. How do I repeat it?
although your problem is solved, here comes comments on your code:
1) You set the variable total_flips=100, but you do not use it in your for-loop, where it goes from 1 to 100. It could go from 1 to total_flips
2) Omitting for-loops: although this was not your question, but your code can be optimized. You do not need a single for-loop for your problem:
repititions = 1000;
total_flips = 100;
coin_flip_matrix = floor(2*rand(total_flips, repititions)); % all coin flips: one column per repitition
num_of_heads = sum(coin_flip_matrix); % number of heads for each repitition (shaped: 1 x repitions)
n = sum(num_of_heads == 50) % how often did we hit 50?
You don't need tails at all, and you need to set heads back to zero inside the outer for z=1:1000 loop.