Complementary array Matlab - matlab

We've got an array of values, and we would like to create another array whose values are not in the first one.
Example:
load('internet.mat')
The first column contains the values in MBs, we have thought in something like:
MB_no = setdiff(v, internet(:,1))
where v is a 0 vector whose length equals to the number of rows in internet.mat. But it just doesn't work.
So, how do we do this?

You need to specify the range of possible values to define what values are not in internet . Say the range is v = 1:10 then setdiff(v,internet(:,1)) will give you the values in 1:10 that are not in the first column of internet.

It seems as if you don't want the first column.
You can simply do:
MB_no=internet(:,2:end);

assuming internet(:,1) has only positive integers and you wish to find which are the integers in [1,...,max( internet(:,1) )] that do not appear in that range you can simply do
app = [];
app( internet(:,1) ) = 1;
MB_no = find( app == 0 );
This is somewhat like bucket sort.

Related

Most efficient way to store dictionaries in Matlab

I have a set of IDs associated with costs which is just a double value. IDs are integers and unique. Two IDs may have same costs. I stored them as:-
a=containers.Map('KeyType','uint32','ValueType','double');
a(1)=7.3
a(2)=8.4
a(3)=7.3
Now i want to find the minimum cost.
b=[];
c=values(a);
b=[b,c{:}];
cost_min=min(b);
Now i want to find all IDs associated i.e. 1 and 3 with the minimum cost i.e. 7.3. I can collect all the keys into an array and then do a for loop over this array. Is there a better way to do this entire thing in Matlab so that for loops are not required?
sparse matrix can work as a hashmap, just do this:
a= sparse(1:3,1,[7.3 8.4 7.3])
find(a == min(nonzeros(a))
There are methods which can be used on maps for this kind of operations
http://se.mathworks.com/help/matlab/ref/containers.map-class.html
The approach finding minimum values and minimum keys can be done something like this,
a=containers.Map('KeyType','uint32','ValueType','double');
a(1)=7.3;
a(3)=8.4;
a(4)=7.3;
minval = inf;
minkeys = -1;
for k = keys(a)
val = a.values(k);
val = val{1};
if (val < minval(1))
minkeys = k;
minval = val;
elseif (val == minval(1))
minkeys = [minkeys,k];
end
end
disp(minval);
disp(minkeys);
This is not efficient though and value search is clumsy for maps. This is not what they are intended for. Maps is supposed to do efficient key lookup. In case you are going to do a lot of lookups and this is what takes time, then use a map. If you need to do a lot of value searches, I would recommend that you use a matrix (or two arrays) for this instead.
idx = [1;3;4];
val = [7.3,8.3,7.3];
minval = min(val);
minidx = idx(val==minval);
disp(minval);
disp(minidx);
There is also another post with an example where it is shown how a sparse matrix can be used as a hashmap. Let the index become the key. This will take about 3 times the memory as all non-zero elements an ordinary array, but a map uses more memory than an array as well.

Find value in vector "p" that corresponds to maximum value in vector "r = f(p)"

As simple as in title. I have nx1 sized vector p. I'm interested in the maximum value of r = p/foo - floor(p/foo), with foo being a scalar, so I just call:
max_value = max(p/foo-floor(p/foo))
How can I get which value of p gave out max_value?
I thought about calling:
[max_value, max_index] = max(p/foo-floor(p/foo))
but soon I realised that max_index is pretty useless. I'm sorry asking this, real beginner here.
Having dropped the issue to pieces, I realized there's no unique corrispondence between values p and values in my related vector p/foo-floor(p/foo), so there's a logical issue rather than a language one.
However, given my input data, I know that the solution is unique. How can I fix this?
I ended up doing:
result = p(p/foo-floor(p/foo) == max(p/foo-floor(p/foo)))
Looks terrible, so if you know any other way...
Once you have the index, use it:
result = p(max_index)
You can create a new vector with your lets say "transformed" values:
p2 = (p/foo-floor(p/foo))
and then just use find to find the max values on p2:
max_index = find(p2 == max(p2))
that will return the index or indices of p2 with the max value of that operation, and finally just lookup the original value in p
p(max_index)
in 1 line, this is:
p(find((p/foo-floor(p/foo) == max((p/foo-floor(p/foo))))))
which is basically the same thing you did in the end :)

find value in a string of cell considering some margin

Suppose that I have a string of values corresponding to the height of a group of people
height_str ={'1.76000000000000';
'1.55000000000000';
'1.61000000000000';
'1.71000000000000';
'1.74000000000000';
'1.79000000000000';
'1.74000000000000';
'1.86000000000000';
'1.72000000000000';
'1.82000000000000';
'1.72000000000000';
'1.63000000000000'}
and a single height value.
height_val = 177;
I would like to find the indices of the people that are in the range height_val +- 3cm.
To find the exact match I would do like this
[idx_height,~]=find(ismember(cell2mat(height_str),height_val/100));
How can I include the matches in the previous range (174-180)?
idx_height should be = [1 5 6 7]
You can convert you strings into an numeric array (as #Divakar mentioned) by
height = str2num(char(height_str))*100; % in cm
Then just
idx_height = find(height>=height_val-3 & height<=height_val+3);
Assuming that the precision of heights stays at 0.01cm, you can use a combination of str2double and ismember for a one-liner -
idx_height = find(ismember(str2double(height_str)*100,[height_val-3:height_val+3]))
The magic with str2double is that it works directly with cell arrays to get us a numeric array without resorting to a combined effort of converting that cell array to a char array and then to a numeric array.
After the use of str2double, we can use ismember as you tried in your problem to get us the matches as a logical array, whose indices are picked up with find. That's the whole story really.
Late addition, but for binning my first choice would be to go with bsxfun and logical operations:
idx_height = find(bsxfun(#le,str2double(height_str)*100,height_val+3) & ...
bsxfun(#ge,str2double(height_str)*100,height_val-3))

Is there a quick way to assign unique text entries in an array a number?

In MatLab, I have several data vectors that are in text. For example:
speciesname = [species1 species2 species3];
genomelength = [8 10 5];
gonometype = [RNA DNA RNA];
I realise that to make a plot, arrays must be numerical. Is there a quick and easy way to assign unique entries in an array a number, for example so that RNA = 1 and DNA = 2? Note that some arrays might not be binary (i.e. have more than two options).
Thanks!
So there is a quick way to do it, but im not sure that your plots will be very intelligible if you use numbers instead of words.
You can make a unique array like this:
u = unique(gonometype);
and make a corresponding number array is just 1:length(u)
then when you go through your data the number of the current word will be:
find(u == current_name);
For your particular case you will need to utilize cells:
gonometype = {'RNA', 'DNA', 'RNA'};
u = unique(gonometype)
u =
'DNA' 'RNA'
current = 'RNA';
find(strcmp(u, current))
ans =
2

Skip multiple points of 'x' and store and variable 'newdata in matlab

I have data stored in variable data.
data =
[43.98272955 39.55809471;
-49.51656799 28.57164726;
-9.475861028 -44.31264255;
27.14884251 2.603921223;
-2.914496888 7.864022006;
4.093025860 4.816211687;
-12.11007479 5.797539648;
-1.653535904 -12.49864642;
5.978990391 1.229984916;
0.9837133282 -2.001124423;
5.674977844 6.323209942;
-9.574459589 3.696791663;
0.3410452503 -7.338955191]
but need use only data corresponding to multiple numbers of x.
Example:
if x = 3,
want store only multiple rows of 3, so
newdata = [-9.475861028 -44.31264255;
4.093025860 4.816211687;
5.978990391 1.229984916;
-9.574459589 3.696791663]
how do I do that?
P.S I would use the command textscan.
this is straightforward with indexing:
newData = data(3:3:end,:)
If I understood the question correctly:
data(x:x:length(data),:)
You could just do just scan it row by row using the mod (modulo) function to extract rows corresponding to the desired multiples. For example:
x=3;
newdata=[];
for k=1:size(data,1)
if mod(k,x)==0
newdata=[newdata; data(k,:)];
end
end