Can operations on submatrices (and subvectors) be vectorized?

I'm currently working on an edge detector in Octave. Coming from other programming languages like Java and Python, I'm used to iterating in for loops rather than performing operations on entire matrices. In Octave, this causes a serious performance hit, and I'm having a bit of difficulty figuring out how to vectorize my code. I have the following two pieces of code:
1)
function zc = ZeroCrossings(img, T=0.9257)
zc = zeros(size(img));
# Iterate over central positions of all 3x3 submatrices
for y = 2:rows(img) - 1
for x = 2:columns(img) - 1
ndiff = 0;
# Check all necessary pairs of elements of the submatrix (W/E, N/S, NW/SE, NE/SW)
for d = [1, 0; 0, 1; 1, 1; 1, -1]'
p1 = img(y-d(2), x-d(1));
p2 = img(y+d(2), x+d(1));
if sign(p1) != sign(p2) && abs(p1 - p2) >= T
ndiff++;
end
end
# If at least two pairs fit the requirements, these coordinates are a zero crossing
if ndiff >= 2
zc(y, x) = 1;
end
end
end
end
2)
function g = LinkGaps(img, k=5)
g = zeros(size(img));
for i = 1:rows(img)
g(i, :) = link(img(i, :), k);
end
end
function row = link(row, k)
# Find first 1
i = 1;
while i <= length(row) && row(i) == 0
i++;
end
# Iterate over gaps
while true
# Determine gap start
while i <= length(row) && row(i) == 1
i++;
end
start = i;
# Determine gap stop
while i <= length(row) && row(i) == 0
i++;
end
# If stop wasn't reached, exit loop
if i > length(row)
break
end
# If gap is short enough, fill it with 1s
if i - start <= k
row(start:i-1) = 1;
end
end
end
Both of these functions iterate over submatrices (or rows and subrows in the second case), and the first one in particular seems to be slowing down my program quite a bit.
The first function (ZeroCrossings) takes a matrix of pixels (img) and returns a binary (0/1) matrix, with 1s where zero crossings (pixels whose corresponding 3x3 neighbourhoods fit certain requirements) were found.
Its outer two for loops seem like they should be possible to vectorize somehow. I can put the body into its own function (taking the necessary submatrix as an argument), but I can't figure out how to then call this function on all submatrices, setting their corresponding (central) positions to the returned value.
Bonus points if the inner for loop can also be vectorized.
The second function (LinkGaps) takes the binary matrix output by the first one and fills in gaps in its rows (i.e. sets them to 1). A gap is defined as a series of 0s of length <= k, bounded on both sides by 1s.
I'm sure at least the outer loop (the one in LinkGaps) is vectorizable. However, the while loop in link again operates on subvectors rather than single elements, so I'm not sure how I'd go about vectorizing it.

Not a full solution, but here is an idea for how you could do the first one without any loops:
% W/E
I1 = I(2:end-1,1:end-2);
I2 = I(2:end-1,3:end );
C = (I1 .* I2 < 0) .* (abs(I1 - I2)>=T);
% N/S
I1 = I(1:end-2,2:end-1);
I2 = I(3:end, 2:end-1);
C = C + (I1 .* I2 < 0) .* (abs(I1 - I2)>=T);
% proceed similarly with NW/SE and NE/SW
% ...
% zero-crossings where count is at least 2
ZC = C>=2;
Idea: form two subimages that are appropriately shifted, check for the difference in sign (product negative) and threshold the difference. Both tests return a logical (0/1) matrix, the element-wise product does the logical and, result is a 0/1 matrix with 1 where both tests have succeeded. These matrices can be added to keep track of the counts (ndiff).
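To make the idea above concrete, here is a minimal sketch that handles all four neighbour pairs and writes the result back into a matrix of the original size. The function name ZeroCrossingsVec and the offset table offs are made up for this illustration. It keeps a short loop over the four directions (not over pixels), and it compares signs with sign(p1) ~= sign(p2) as in the question, which differs from the product test above only when one of the two pixels is exactly 0.

```matlab
function zc = ZeroCrossingsVec(img, T)
% Sketch only: vectorized over pixels, with a 4-step loop over directions.
% ZeroCrossingsVec and offs are hypothetical names, not from the question.
if nargin < 2
    T = 0.9257;                 % default threshold used in the question
end
[m, n] = size(img);
r = 2:m-1;                      % interior rows (centres of 3x3 blocks)
c = 2:n-1;                      % interior columns
offs = [0 1; 1 0; 1 1; 1 -1];   % [dy dx] for W/E, N/S, NW/SE, NE/SW
cnt = zeros(m-2, n-2);          % per-pixel count of qualifying pairs
for k = 1:size(offs, 1)
    dy = offs(k, 1);
    dx = offs(k, 2);
    p1 = img(r-dy, c-dx);       % one neighbour of every interior pixel
    p2 = img(r+dy, c+dx);       % the opposite neighbour
    cnt = cnt + (sign(p1) ~= sign(p2) & abs(p1 - p2) >= T);
end
zc = zeros(m, n);               % border stays 0, as in the original loops
zc(r, c) = cnt >= 2;
end
```

Calling ZeroCrossingsVec(img) should give the same matrix as the loop version in the question, since each shifted subimage lines up every interior pixel with one of its two opposite neighbours.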

Related

Is there a way to modify the isdag matlab function in order for it to ignore cycles of length zero?

Recently I've been tasked with programming an algorithm that optimizes job-shop scheduling problems, and I'm following an approach which uses directed graphs for this. In these directed graphs, nodes represent events and edges represent time precedence constraints between events, i.e. time inequalities. So, for example, 2 consecutive nodes A & B separated by a directed edge of length 2 which goes from A to B would represent the inequality tB-tA>=2. It follows that equality would be represented by 2 directed edges of opposite directions, one with positive length and the other one with negative length. Thus, we end up with a graph which has some cycles of length zero.
Matlab has a function called isdag which returns true if a directed graph has no cycles and false otherwise; is there a way to modify this function in order for it to ignore the cycles of length zero? If not, has anyone got any idea on how to program this? Thanks in advance!!
I also tried this, but it doesn't work. I've tried it with the adjacency matrix adjMatrix = [0, 10, 0; -9, 0, 5; 0, 0, 0]. It should return true, as it has a cycle between nodes 1 and 2 of length 10+(-9)=1, but it returns false:
function result = hasCycleWithPositiveWeight(adjMatrix)
n = size(adjMatrix,1);
visited = false(1,n);
path = zeros(1,n);
result = false;
for i = 1:n
pathStart = 1;
pathEnd = 1;
path(pathEnd) = i;
totalWeight = 0;
while pathStart <= pathEnd
node = path(pathStart);
visited(node) = true;
for j = 1:n
if adjMatrix(node,j) > 0
totalWeight = totalWeight + adjMatrix(node,j);
if visited(j)
if j == i && totalWeight > 0
result = true;
return;
end
else
pathEnd = pathEnd + 1;
path(pathEnd) = j;
end
end
end
pathStart = pathStart + 1;
totalWeight = totalWeight - adjMatrix(node, path(max(pathStart-1,1)));
visited(node) = false;
end
end
end
If you want to find the cycle between just two consecutive nodes use this:
a = [0, 10, 0; -9, 0, 5; 0, 0, 0];
b = a.';
And = (a & b);
Add = (a + b);
result = any( And .* Add, 'all');
It returns true if there is a cycle between two nodes whose length isn't 0.
Explanation:
In the graph, if the length of the edge from node 2 to node 5 is 12, we set element A(2, 5) of the adjacency matrix to 12, and if the length of the edge from node 5 to node 2 is -8, we set element A(5, 2) to -8. So there is a symmetry in the relationship between the nodes. The matrix transpose moves the entry at (2, 5) to (5, 2) and the entry at (5, 2) to (2, 5).
If we And a matrix with its transpose, a 1 in the result marks a pair of nodes that are connected in both directions, i.e. that form a cycle.
If we Add a matrix to its transpose, the result holds the sum of the two edge lengths between each pair of nodes.
If we multiply the two matrices Add and And element-wise, the result holds those pairwise sums, but the sum for any pair of nodes that doesn't form a cycle is set to 0.
Now the function any can be used to test whether the matrix has a two-node cycle whose length isn't 0.
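The steps above can be wrapped into a small helper. This is only a sketch: the name hasNonzeroTwoCycle is invented here, and like the snippet above it only looks at cycles formed by two nodes, so longer cycles are not detected. It indexes with the mask instead of multiplying, which avoids the 'all' option of any and so should also run on older MATLAB versions and on Octave.

```matlab
function tf = hasNonzeroTwoCycle(adj)
% Sketch only (hypothetical helper): true if some pair of nodes is joined
% by edges in both directions whose lengths do not sum to zero.
both = (adj ~= 0) & (adj.' ~= 0);   % pairs connected in both directions
pairSum = adj + adj.';              % length of each such two-node cycle
tf = any(pairSum(both) ~= 0);       % any two-node cycle of nonzero length?
end
```

For the matrix from the question, [0, 10, 0; -9, 0, 5; 0, 0, 0], the pair (1, 2) is connected both ways with total length 1, so the helper returns true. Replacing -9 with -10 makes that cycle have length zero, and the result becomes false.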

Optimize algorithm that generates the number of units in each binary state

TL;DR: I need to find all possible combinations of N row vectors (of size 1xB), whose row-wise sum produces the desired result vector (also of size 1xB).
I have a binary matrix (1 or 0 entries only) of size N x B where N denotes the number of units and B denotes the number of bins. Each unit, i.e., each row, of the matrix can be in one of 2^B states. That is, if B=2, the states possible are {0,0}, {0,1}, {1,0} or {1,1}. If B=3, then the possible states are {0,0,0}, {0,0,1}, {0,1,0}, {0,1,1}, {1,0,0}, {1,0,1}, {1,1,0} or {1,1,1}. Basically the binary representation of the numbers from 0 to 2^B-1.
For the matrix, I know the sum over the rows of the matrix, for example, {1,2}. This sum can be achieved through different binary matrices like [0,0;0,1;1,1] or [0,1;0,1;1,0]. The number of units in each state are {1,1,0,1} and {0,2,1,0}, respectively for each of the matrices, where the first number corresponds to the first state {0,0}, second to the second state {0,1} and so on in increasing order. My problem is to find all possible vectors of these numbers of states that satisfy a particular matrix sum.
Now to implement this in MATLAB, I used recursion and a global variable. This to me was the easiest approach, however, it takes a lot of time. The code I used is given below:
function output = getallstate()
global nState % stores all the possible vectors
global nStateRow % stores the current row of the vector
global statebin %stores the binary representation of all the possible states
nState = [];
nStateRow = 1;
nBin = 2; % number of columns or B
v = [1 2]; % should always be of the size 1 x nBin
N = 3; % number of units
statebin = de2bi(0:(2 ^ nBin - 1), nBin) == 1; % stored as logical because I use it to index later
getnstate(v, 2 ^ nBin - 1, nBin) % the main function
checkresult(v, nState, nBin) % will result in false if even one of the results is incorrect
% adjust for max number of units, because the total of each row cannot exceed this number.
output = nState(1:end-1, :); % last row is always repeated (needs to be fixed somehow)
output(:, 1) = N - sum(output(:, 2:end), 2); % the first column, that is the number of units in the all 0 state is always determined by the number of units in the other states
if any(output(:, 1) < 0)
output(output(:, 1) < 0, :) = [];
end
end
function getnstate(r, state, nBin)
global nState
global nStateRow
global statebin
if state == 0
if all(r == 0)
nStateRow = nStateRow + 1;
nState(nStateRow, :) = nState(nStateRow - 1, :);
end
else
for a = 0:min(r(statebin(state + 1, :)))
nState(nStateRow, state + 1) = a;
getnstate(r - a * statebin(state + 1, :), state - 1, nBin);
end
end
end
function allOk = checkresult(r, nState, nBin)
% just a function that checks whether the obtained vectors all result in the correct sum
allstate = de2bi(0:(2 ^ nBin - 1), nBin);
allOk = true;
for iRow = 1:size(nState, 1)
sumR = sum(bsxfun(@times, allstate, nState(iRow, :).'), 1);
allOk = allOk & isequal(sumR,r);
end
end
function b = de2bi(d, n)
d = d(:);
[~, e] = log2(max(d));
b = rem(floor(d * pow2(1-max(n, e):0)), 2);
end
The above code works fine and gives all possible states but, as is expected, it gets slower as you increase the number of columns (B) and the number of units (N). Also, it uses globals. The following are my questions:
Is there a way to generate these without using globals?
Is there a non-recursive way for this algorithm?
EDIT 1
Is there a way to do the above and still have an optimised algorithm that is faster than the current version?
EDIT 2
Added the de2bi function to remove dependency on the Communications Toolbox.
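On the first question (avoiding globals), one possible approach is to let the recursion return its rows instead of writing them into shared state. The sketch below makes several assumptions: the names countstates and recurse are invented, dec2bin is used to build the state table (most significant bit first, so state 2 is {0,1} as in the description above), and the recursion itself is still exponential, so this only removes the globals and the repeated last row and is not claimed to be faster.

```matlab
function out = countstates(v, N)
% Sketch only (hypothetical names): all vectors of per-state unit counts
% whose weighted sum of binary states equals v, for N units.
B = numel(v);
S = dec2bin(0:2^B-1, B) == '1';        % 2^B x B logical table of states
out = recurse(v, 2^B, S);              % counts for states 2..2^B
out(:, 1) = N - sum(out(:, 2:end), 2); % all-zero state takes the rest
out = out(out(:, 1) >= 0, :);          % drop rows needing more than N units
end

function rows = recurse(r, s, S)
% r: remaining sum, s: current state index (counts down to 1)
nS = size(S, 1);
if s == 1                              % only the all-zero state is left
    if all(r == 0)
        rows = zeros(1, nS);           % one valid (partial) solution
    else
        rows = zeros(0, nS);           % dead end
    end
    return
end
rows = zeros(0, nS);
for a = 0:min(r(S(s, :)))              % feasible counts for state s
    sub = recurse(r - a * S(s, :), s - 1, S);
    if ~isempty(sub)
        sub(:, s) = a;                 % record the count on the way back
    end
    rows = [rows; sub];
end
end
```

With v = [1 2] and N = 3 this returns the two count vectors given as examples in the question, [0 2 1 0] and [1 1 0 1].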

How to add random values into an empty vector repeatedly without knowing when it would stop? How to count average number of steps?

Imagine the process of forming a vector v by starting with the empty vector and then repeatedly putting a randomly chosen number from 1 to 20 on the end of v. How could you use Matlab to investigate on average how many steps it takes before v contains all numbers from 1 to 20? You can define/use as many functions or scripts as you want in your answer.
v=[];
v=zeros(1,20);
for a = 1:length(v)
v(a)=randi(20);
end
since v is now only a 1x20 vector, if there are two numbers equal, it definitely
does not have all 20 numbers from 1 to 20
for i = 1:length(v)
for j = i+1:length(v)
if v(i)==v(j)
v=[v randi(20)];
i=i+1;
break;
end
end
end
for k = 1:length(v)
for n = 1:20
if v(k)==n
v=v;
elseif v(k)~=n
a=randi(20);
v=[v a];
end
if a~=n
v=[v randi(20)];
k=k+1;
break;
end
end
end
disp('number of steps: ')
i*k
First of all, the loop generating the vector must be infinite. You can break out of the loop if your condition is met. This is how you can count how many steps you need. You cannot use a loop over 20 steps if you know you'll need more than that. I like using while true and break.
Next, your method of determining if all elements are present is O(n^2). This can be done in O(n log n) by sorting the elements, which is what unique does. It works by sorting, which, in the general case, is O(n log n) (think QuickSort). So, drawing n elements and checking after each one whether you've got them all is an O(n^2 log n) operation. This is expensive!
But we're talking about a finite set of integers here. Integers can be sorted in O(n) (look up histogram sort or radix sort). But we can do even better, because we don't even need to physically create the vector or sort its values. We can instead simply keep track of the elements we have seen in an array of length 20: In the loop, generate the next vector element, set the corresponding value in your 20-element array, and when all elements of this array are set, you have seen all values at least once. This is when you break.
My implementation of these two methods is below. The unique method takes 11s to do 10,000 repetitions of this process, and the other one only 0.37s. After 10,000 repetitions, I saw that you need about 72 steps on average to see all 20 integers.
function test
k = 10000;
tic;
n1 = 0;
for ii=1:k
n1 = n1 + method1;
end
n1 = n1 / k;
toc
disp(n1)
tic;
n2 = 0;
for ii=1:k
n2 = n2 + method2;
end
n2 = n2 / k;
toc
disp(n2)
end
function n = method1
k = 20;
v = [];
n = 1;
while true
v(end+1) = randi(k);
if numel(unique(v))==k
break;
end
n = n + 1;
end
end
function n = method2
k = 20;
h = zeros(20,1);
n = 1;
while true
h(randi(k)) = 1;
if all(h)
break;
end
n = n + 1;
end
end
Note on the timings: I use tic/toc here, but it is usually better to use timeit instead. The time difference is large enough for this to not matter all that much. But do make sure that the code that uses tic/toc is inside a function, and not copy-pasted to the command line. Timings are not representative when using tic/toc on the command line because the JIT compiler will not be used.
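As a sanity check on the simulated average of about 72 steps: this process is the classic coupon collector problem, whose expected number of draws for k equally likely values is k times the k-th harmonic number. The short script below is an addition for illustration (the variable name expectedSteps is made up) and just evaluates that formula for k = 20.

```matlab
% Expected number of draws to see all k values at least once
% (coupon collector problem): k * (1 + 1/2 + ... + 1/k).
k = 20;
expectedSteps = k * sum(1 ./ (1:k));
fprintf('%.2f\n', expectedSteps);   % prints 71.95
```

The value agrees with the estimate obtained from the 10,000 repetitions above.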
I'm not sure if I understand your question correctly, but maybe have a look at the unique() function.
if
length(unique(v)) == 20
then you have all values from 1:20 in your vector
v = [];
counter = 0;
while length(unique(v)) ~= 20
a = randi(20);
v=[v a];
counter = counter + 1;
end
the value counter should give you the number of iterations needed until v contains all values.
If you want to get the average number of iterations by trial and error, just wrap a loop around this code, run it 10000 times, and average the results from counter.

Vectorize MATLAB code

Let's say we have three m-by-n matrices of equal size: A, B, C.
Every column in C represents a time series.
A is the running maximum (over a fixed window length) of each time series in C.
B is the running minimum (over a fixed window length) of each time series in C.
Is there a way to determine T in a vectorized way?
[nrows, ncols] = size(A);
T = zeros(nrows, ncols);
for row = 2:nrows %loop over the rows (except row #1).
for col = 1:ncols %loop over the columns.
if C(row, col) > A(row-1, col)
T(row, col) = 1;
elseif C(row, col) < B(row-1, col)
T(row, col) = -1;
else
T(row, col) = T(row-1, col);
end
end
end
This is what I've come up with so far:
T = zeros(m, n);
T(C > circshift(A,1)) = 1;
T(C < circshift(B,1)) = -1;
Well, the trouble was the dependency with the ELSE part of the conditional statement. So, after a long mental work-out, here's a way I summed up to vectorize the hell-outta everything.
Now, this approach is based on mapping. We get column-wise runs or islands of 1s corresponding to the 2D mask for the ELSE part and assign them the same tags. Then, we go to the start-1 along each column of each such run and store that value. Finally, indexing into each such start-1 with those tagged numbers, which would work as mapping indices would give us all the elements that are to be set in the new output.
Here's the implementation to fulfill all those aspirations -
%// Store sizes
[m1,n1] = size(A);
%// Masks corresponding to three conditions
mask1 = C(2:m1,:) > A(1:m1-1,:);
mask2 = C(2:m1,:) < B(1:m1-1,:);
mask3 = ~(mask1 | mask2);
%// All but mask3 set values as output
out = [zeros(1,n1) ; mask1 + (-1*(~mask1 & mask2))];
%// Proceed if any element in mask3 is set
if any(mask3(:))
%// Row vectors for appending onto matrices for matching up sizes
mask_appd = false(1,n1);
row_appd = zeros(1,n1);
%// Get 2D mapped indices
df = diff([mask_appd ; mask3],[],1)==1;
cdf = cumsum(df,1);
offset = cumsum([0 max(cdf(:,1:end-1),[],1)]);
map_idx = bsxfun(@plus,cdf,offset);
map_idx(map_idx==0) = 1;
%// Extract the values to be used for setting into new places
A1 = out([df ; false(1,n1)]);
%// Map with the indices obtained earlier and set at places from mask3
newval = [row_appd ; A1(map_idx)];
mask3_appd = [mask_appd ; mask3];
out(mask3_appd) = newval(mask3_appd);
end
Doing this vectorized is rather difficult because the current row's output depends on the previous row's output. Doing vectorized operations usually means that each element should stand out on its own using some relationship that is independent of the other elements that surround it.
I don't have any input on how you would achieve this without a for loop but I can help you reduce your operations down to one instead of two. You can do the assignment vectorized per row, but I can't see how you'd do it all in one shot.
As such, try something like this instead:
[nrows, ncols] = size(A);
T = zeros(nrows, ncols);
for row = 2:nrows
out = T(row-1,:); %// Change - Make a copy of the previous row
out(C(row,:) > A(row-1,:)) = 1; %// Set those elements of C
%// in the current row that are larger
%// than the previous row of A to 1
out(C(row,:) < B(row-1,:)) = -1; %// Same logic but for B now and it's
%// less than and the value is -1 instead
T(row,:) = out; %// Assign to the output
end
I'm currently figuring out how to do this without any loops whatsoever. I'll keep you posted.
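One loop-free alternative, offered as a sketch and not as either answerer's method, is to treat the problem as a forward fill: compute the +1/-1 signal for every element at once, then let each element take the most recent nonzero signal above it in its column. The function name signalFill is invented here, and the code assumes cummax is available (MATLAB R2014b or later, or Octave).

```matlab
function T = signalFill(A, B, C)
% Sketch only (hypothetical helper): forward-fill version of the loop.
[m, n] = size(C);
S = zeros(m, n);                          % raw signal: +1, -1 or 0 (hold)
up = C(2:end, :) > A(1:end-1, :);
down = ~up & (C(2:end, :) < B(1:end-1, :));
S(2:end, :) = up - down;
rowIdx = repmat((1:m).', 1, n);           % own row number at each position
rowIdx(S == 0) = 0;                       % no new signal here
rowIdx = cummax(rowIdx, 1);               % row of the latest signal so far
rowIdx(rowIdx == 0) = 1;                  % no signal yet: row 1 holds 0
colIdx = repmat(1:n, m, 1);
T = S(sub2ind([m, n], rowIdx, colIdx));   % copy the latest signal down
end
```

This works because the ELSE branch only ever copies the value from the row above, so every held value is equal to the last +1 or -1 that was set in that column, or 0 if none has been set yet.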

Matlab recursion. Sum of all odd numbers in a vector

I am trying to implement a recursive function to add the odd numbers in a vector v.
So far this is my attempt
function result = sumOdd(v)
%sum of odd numbers in a vector v
%sumOdd(v)
n = 1;
odds = [];
if length(v) > 0
if mod(v(n),2) == 1
odds(n) = v(n);
v(n) = [];
n = n + 1;
sumOdd(v)
elseif mod(v(n),2) == 0
v(n) = [];
n = n + 1;
sumOdd(v)
end
else
disp(sum(odds))
end
end
This does not work and returns a value of zero. I am new to programming and recursion and would like to know what I'm doing wrong.
Thank you.
There is a better way to solve this in MATLAB:
function result=summOdd(v)
odd_numbers=v(mod(v,2)==1); % Use logical indexing to get all odd numbers
result=sum(odd_numbers); % Sum all odd numbers.
end
To give a recursive solution:
When implementing a recursive function, there is a pattern you should always follow. First start with the trivial case, where the recursion stops. In this case, the sum of an empty list is 0:
function result = sumOdd(v)
%sum of odd numbers in a vector v
%sumOdd(v)
if length(v) == 0
result=0;
else
%TBD
end
end
I always start this way to avoid infinite recursions when trying my code. Where the %TBD is placed you have to put your actual recursion. In this case your idea was to process the first element and put all remaining into the recursion. First write a variable s which contains 0 if the first element is even and the first element itself when it is odd. This way you can calculate the result using result=s+sumOdd(v)
function result = sumOdd(v)
%sum of odd numbers in a vector v
%sumOdd(v)
if length(v) == 0
result=0;
else
if mod(v(1),2) == 1
s=v(1);
else
s=0;
end
v(1) = [];
result=s+sumOdd(v);
end
end
Now that your code is finished, read the yellow warning the editor gives you: it tells you to replace length(v) == 0 with isempty(v).
Instead of keeping 'n', you can always delete the first element of v. Also, you need to pass 'odds' as an argument, as you're initializing it as an empty array at each time you call the function (hence the zero output).
The following example seems to do the job:
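function result = sumOdd(v,odds)
%sum of odd numbers in a vector v
%call as sumOdd(v,[])
if ~isempty(v)
    if mod(v(1),2) == 1
        odds = [odds;v(1)]; % collect the odd element
    end
    v(1) = []; % drop the first element in either case
    result = sumOdd(v,odds); % pass the result back up the call chain
else
    result = sum(odds); % nothing left: sum what was collected
end
end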