Replace values in an array in matlab without changing the original array - matlab

My question is that given an array A, how can you give another array identical to A except changing all negatives to 0 (without changing values in A)?
My way to do this is:
B = A;
B(B<0)=0
Is there any one-line command to do this and also not requiring to create another copy of A?

While this particular problem does happen to have a one-liner solution, e.g. as pointed out by Luis and Ian's suggestions, in general if you want a copy of a matrix with some operation performed on it, then the way to do it is exactly how you did it. Matlab doesn't allow chained operations or compound expressions, so you generally have no choice but to assign to a temporary variable in this manner.
However, if it makes you feel better, B=A is efficient as it will not result in any new allocated memory, unless / until B or A change later on. In other words, before the B(B<0)=0 step, B is simply a reference to A and takes no extra memory. This is just how matlab works under the hood to ensure no memory is wasted on simple aliases.
PS. There is nothing efficient about one-liners per se; in fact, you should avoid them if they lead to obscure code. It's better to have things defined over multiple lines if it makes the logic and intent of the algorithm clearer.
e.g, this is also a valid one-liner that solves your problem:
B = subsasgn(A, substruct('()',{A<0}), 0)
This is in fact the literal answer to your question (i.e. this is pretty much code that matlab will call under the hood for your commands). But is this clearer, more elegant code just because it's a one-liner? No, right?

Try
B = A.*(A>=0)
Explanation:
A>=0 - create matrix where each element is 1 if >= 0, 0 otherwise
A.*(A>=0) - multiply element-wise
B = A.*(A>=0) - Assign the above to B.

Related

Test for Duplicate Quickly in Matlab Array

I have two matrices S and T which have n columns and a row vector v of length n. By my construction, I know that S does not have any duplicates. What I'm looking for is a fast way to find out whether or not the row vector v appears as one of the rows of S. Currently I'm using the test
if min([sum(abs(S - repmat(f,size(S,1),1)),2);sum(abs(T - repmat(v,size(dS_new,1),1)),2)]) ~= 0 ....
When I first wrote it, I had a for loop testing each (I knew this would be slow, I was just making sure the whole thing worked first). I then changed this to defining a matrix diff by the two components above and then summing, but this was slightly slower than the above.
All the stuff I've found online says to use the function unique. However, this is very slow as it orders my matrix after. I don't need this, and it's a massively waste of time (it makes the process really slow). This is a bottleneck in my code -- taking nearly 90% of the run time. If anyone has any advice as how to speed this up, I'd be most appreciative!
I imagine there's a fairly straightforward way, but I'm not that experienced with Matlab (fairly, just not lots). I know how to use basic stuff, but not some of the more specialist functions.
Thanks!
To clarify following Sardar_Usama's comment, I want this to work for a matrix with any number of rows and a single vector. I'd forgotten to mention that the elements are all in the set {0,1,...,q-1}. I don't know whether that helps or not to make it faster!
You may want this:
ismember(v,S,'rows')
and replace arguments S and v to get indices of duplicates
ismember(S,v,'rows')
Or
for test if v is member of S:
any(all(bsxfun(#eq,S,v,2))
this returns logical indices of all duplicates
all(bsxfun(#eq,S,v),2)

How to create serially numbered variables in workspace? [duplicate]

This question already has answers here:
Dynamic variables matlab
(2 answers)
Closed 8 years ago.
I was wondering how I can use a for loop to create multiple matrix when given a certain number.
Such as If 3 was given I would need three matricies called: C1, C2 and C3.
k = 3
for i = 1:K
C... = [ ]
end
Not sure how to implement this.
The first thing coming in mind is the eval function mentioned by Dennis Jaheruddin, and yes, it is bad practice. So does the documentation say:
Why Avoid the eval Function?
Although the eval function is very powerful and flexible, it not
always the best solution to a programming problem. Code that calls
eval is often less efficient and more difficult to read and debug than
code that uses other functions or language constructs. For example:
MATLABĀ® compiles code the first time you run it to enhance performance for future runs. However, because code in an eval
statement can change at run time, it is not compiled.
Code within an eval statement can unexpectedly create or assign to a variable already in the current workspace, overwriting existing
data.
Concatenating strings within an eval statement is often difficult to read. Other language constructs can simplify the syntax in your
code.
The "safer" alternative is the function assignin:
The following will do exactly what you want:
letter = 'C';
numbers = 1:3;
arrayfun(#(x) assignin('base',[letter num2str(x)],[]),numbers)
I know cases where you need to create variables likes this, but in most cases it is better and more convenient to use cell arrays or structs.
The trick is to use a cell array:
k=3
C=cell(k,1)
for t=1:k
C{t}= rand(t)
end
If each matrix is of equal size, you would probably just want a three dimensional matrix rather than a cell array.
You now have one cell array,but can access the matrices like this:
C{2} %Access the second matrix.
The use of eval is almost inevitable in this case:
k = 3;
for i = 1:k
eval(sprintf('C%d = [];', i));
end;
Mind you that generating variable names for data storage rather than indexing them (numerically -- see the solution of Dennis Jaheruddin based on cell arrays -- or creating a dynamic struct with named fields that store the data) is almost always a bad idea. Unless you do so to cope up with some weird scripts that you don't want to / cannot modify that need some variables already created in the global workspace.

Matlab function argument passing

Does specifying the input parameters in a function call in terms of output of other functions affect performance?
Would the peak memory usage be affected?
Would it be better if I use temporary variables and clear them after each intermediate step has been calculated?
For ex:
g=imfill(imclearborder(imdilate(Inp_img,strel('square',5))),'holes');
or
temp1=imdilate(Inp_img,strel('square',5));
temp1=imclearborder(temp1);
g=imfill(temp1,'holes');
clear temp1
Which would be better in terms of peak memory usage and speed?
It really depends.
From the top of my head (meaning, I could be wrong):
MATLAB uses a lazy copy-on-write scheme for variable assignment. That means,
a = rand(5);
b = a;
will not create an explicit copy of a. In essence, b is just a reference. However, when you issue
b(2) = 4;
the full contents of a will be copied into a new variable, the location where b points to is changed to that new copy, and the new contents (4) is written.
The same goes for passing arguments. If you issue
c = myFcn(a, b);
and myFcn only reads data from a and b, these variables are never copied explicitly to the function's workspace. But, if it writes (or otherwise makes changes) to a or b, their contents will be copied.
So, in your particular case, I think the peak memory for
r = myFcn( [some computation] )
will be equal to or less than
T = [some computation];
r = myFcn( T );
clear T;
If myFcn makes no changes to T, there will be no difference at all (save for more hassle on your part, and the risk of forgetting the clear).
However, if myFcn changes T, a deep copy will be made, so for a moment T will be in memory twice.
The best way to find out is to profile with memory in mind:
profile -memory
This isn't an answer to the question you asked as far as the 'letter of the law' is concerned 'per se', (and apologies if I'm making assumptions) but as far as the 'spirit of the law' is concerned, I understand the implied question to be "does writing things as 'ugly' one-liners ever confer any significant optimisation benefits", to which the answer is most definitely no. Partly due to matlab's lazy evaluation, as rody pointed out above.
So I would prefer the second version, simply because it is more readable. It will not have any penalty on performance as far as I'm aware.

Vectorizing scalars/vector division

If for example I have:
Q1=4;
Q2=5;
PG=2:60
A1=Q1./sqrt(PG);
A2=Q2./sqrt(PG);
plot(PG,A1)
plot(PG,A2)
can I do sth like : ?
Q=[Q1,Q2];
A=Q./sqrt(PG);
plot(PG,A(1))
plot(PG,A(2))
or sth to avoid the A1 and A2?
A=bsxfun(#rdivide,[Q1;Q2],sqrt(PG)) will do (note the semicolon, not comma, between Q1 and Q2), but if the code in the question is your use case and you ever want anyone else to read and understand the code, I'd advise against using it.
You have to address the rows of A using A(1,:) and A(2,:) (no matter how you get to A), but you probably want to plot(PG,A) anyway.
[edit after first comment:]
rdivide is simply the name of the function usually denoted ./ in MATLAB code, applicable to arrays of the same size or a scalar and an array. bsxfun will simply apply a two-argument function to the other two arguments supplied to it in a way it considers best-fitting (to simplify a bit). arrayfun does something similar: applying a function to all elements of one array. To apply here, one would need a function having PG hard-coded inside.

Slow anonymous function

Suppose you have a loop with 50000 iterations and want to calculate mean values (scalars) from alot of matrices. This is not complete, but roughly like this:
for k=1:50000
...
mean=sum(sum(matrix))/numel(matrix); %Arithmetic mean
...
end
And now want to include different mean equations to choose from. First I tried this:
average='arithmetic'
for k=1:50000
...
switch average
case 'arithmetic'
mean=sum(sum(matrix))/numel(matrix); %Arithmetic mean
case 'geometric'
mean=prod(prod(matrix)).^(1/numel(matrix)); %Geometric mean
case 'harmonic'
mean=numel(matrix)/sum(sum(1./matrix)); %Harmonic mean
end
...
end
This is obviously alot slower than the first loop because it needs to find the matching string for every iteration which feels really unnecessary. Then I tried this:
average='arithmetic'
switch average
case 'arithmetic'
eq=#(arg)sum(sum(arg))/numel(arg); %Arithmetic mean
case 'geometric'
eq=#(arg)prod(prod(arg)).^(1/numel(arg)); %Geometric mean
case 'harmonic'
eq=#(arg)numel(arg)/sum(sum(1./arg)); %Harmonic mean
end
for k=1:50000
...
mean=eq(matrix); %Call mean equation
...
end
This is still about twice as slow as the first loop and I don't get why. The two last loops are almost similar in speed.
Am I doing something wrong here? How can I achieve the same performance as the first loop with this extra feature?
Help is very much appreciated!
Having the switch inside the loop is performing a comparison 50000 times which only needs to be performed once, something I'd advise against.
The second is a little more subtle, but it's quite probable the eq function is being dynamically looked up every iteration and possibly interpreted each time as well (not sure how MATLAB does optimisation). Your best bet for performance is probably to put the for loop inside of the switch
switch average
case 'arithmetic'
for ... end
case 'geometric'
for ... end
case 'harmonic'
for ... end
end
Well, every function, even anonymous functions, can be expected to have some amount of extra overhead involved in calling it, making them slightly slower than their single-line expression counterparts in your example. However, in this case there may be extra overhead due to the fact that functions by the name eq already exist in abundance in MATLAB, since eq is the method name of the overloaded == operator. Using the WHICH command like so:
>> which eq -all
Will show you that eq is heavily overloaded, with one existing for each of the fundamental data types and most objects.
I would try using a different name for your anonymous function handle just to see if dispatching may be a factor, although I kinda doubt it based on the function precedence order (i.e. variables always appear to take precedence). Your best solution performance-wise may be to avoid the extra function call overhead by doing something like what DavW suggests.
I would like to make one other suggestion. Many of the mathematical operations you are doing can be greatly improved to make them more efficient, specifically by making use of the function MEAN and the colon operator to reshape an entire matrix into a column vector:
result = mean(matrix(:)); %# For the arithmetic mean
result = prod(matrix(:))^(1/numel(matrix)); %# For the geometric mean
result = 1/mean(1./matrix(:)); %# For the harmonic mean
Note that I didn't use the name mean for my variable since that is already used for the built-in function, and you definitely don't want to shadow it.