Selecting a random value from vector but excluding particular values - matlab

I thought this would be an easy one, but I cannot find a solution for it.
I have this vector called cues = ["R" "B" "C" "P" "Y" "G"];
from which I want to randomly select one value while excluding one (or two) of the values each time.
For example, I would like to draw a random value from the vector excluding "R", or, in a second condition, excluding both "R" and "Y".
I have tried randsample and randperm, but neither of them seems to offer this option.

One intuitive way to achieve this is to take your list of all values and your list of exclusions, build a list of inclusions from them, and then select from that list.
See the comments for each step:
cues= ["R" "B" "C" "P" "Y" "G"]; % All options
exclude = ["R" "Y"]; % Exclusions (could change in a loop or whatever)
include = setdiff( cues, exclude ); % Actual options without exclusions
selection = include( randi(numel(include)) ); % Random selection from options
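
Note that setdiff returns its result sorted and without duplicates. If the original order of cues matters, or several draws are needed at once, a logical mask built with ismember is one alternative. The following is only a minimal sketch; nDraws and picks are names introduced here for illustration:

```matlab
cues = ["R" "B" "C" "P" "Y" "G"];          % all options
exclude = ["R" "Y"];                        % values that must not be drawn
include = cues(~ismember(cues, exclude));   % keeps the original order of cues
nDraws = 1000;                              % hypothetical number of trials
picks = include(randi(numel(include), 1, nDraws)); % draws with replacement
```

Every element of picks comes from include, so an excluded value can never appear in it.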

It seems I found a way to do it.
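For a string vector, note that erase only deletes the matching text and leaves an empty string "" in that position, so logical indexing is the safer way to drop the element. For example:
cues_minusR = cues(cues ~= "R");
For a numeric vector (e.g., cues = [1 2 3 4 5 6]), elements can be deleted by position from a copy, so the original stays available for the next condition:
c1Col = 1;
c2Col = 2;
cues_minus1 = cues;
cues_minus1(c1Col) = []; % removes the first element
cues_minus12 = cues;
cues_minus12([c1Col, c2Col]) = []; % removes the first two elements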

Related

Scala Apriori Algorithm : if first index value is same, then Generate Three item set from two items set list

I want the following result.
If the input list is: List(List("A","B"), List("A","C"), List("B","C"), List("B","D"))
the output should be: List(List("A","B","C"), List("B","C","D"))
I think it should be done as follows: all lists that share the same first element are grouped, e.g. if the first element is "A" then the group is ("A","B","A","C").distinct = ("A","B","C").

Organising large datasets in Matlab

I have a problem I hope you can help me with.
I have imported a large dataset (200000 x 5 cell) in Matlab that has the following structure:
'Year' 'Country' 'X' 'Y' 'Value'
Columns 1 and 5 contain numeric values, while columns 2 to 4 contain strings.
I would like to arrange all this information into a variable that would have the following structure:
NewVariable{Country_1 : Country_n , Year_1 : Year_n}(Y_1 : Y_n , X_1 : X_n)
All I can think of is to loop through the whole dataset and find matches between the Country, Year, X and Y values by combining if and strcmp, but this seems like a very inefficient way of achieving what I am trying to do.
Can anyone help me out?
Thanks in advance.
As mentioned in the comments, you can use categorical arrays:
% some arbitrary data:
country = repmat('ca',10,1);
country = [country; repmat('cb',10,1)];
country = [country; repmat('cc',10,1)];
T = table(repmat((2001:2005)',6,1), cellstr(country),...
    cellstr(repmat(['x1'; 'x2'; 'x3'],10,1)),...
    cellstr(repmat(['y1'; 'y2'; 'y3'],10,1)),...
    randperm(30)', 'VariableNames', {'Year','Country','X','Y','Value'});
% convert all non-number data to categorical arrays:
T.Country = categorical(T.Country);
T.X = categorical(T.X);
T.Y = categorical(T.Y);
% here is an example for using categorical array:
newVar = T(T.Country=='cb' & T.Year==2004,:);
The table class is made for such tasks and is very convenient. Just expand the logical expression in the last line, T.Country=='cb' & T.Year==2004, to match your needs.
Tell me if this helps ;)
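
If the nested layout from the question, NewVariable{Country, Year}(Y, X), is still wanted, one possible route is to map each label to an integer index with the third output of unique and fill one matrix per country and year with accumarray. This is only a sketch under stated assumptions: the small data cell below is hypothetical, ti, ci, xi, yi and vals are illustrative names, and summing repeated (Y, X) pairs is an arbitrary choice.

```matlab
% Hypothetical miniature of the imported cell: Year, Country, X, Y, Value
data = {2001,'ca','x1','y1',10;
        2001,'ca','x2','y1',20;
        2002,'cb','x1','y2',30;
        2001,'ca','x2','y2',40};

% Map every label to an integer index (third output of unique)
[years,     ~, ti] = unique(cell2mat(data(:,1)));
[countries, ~, ci] = unique(data(:,2));
[xs,        ~, xi] = unique(data(:,3));
[ys,        ~, yi] = unique(data(:,4));
vals = cell2mat(data(:,5));

% One Y-by-X matrix per (country, year); NaN marks missing combinations
NewVariable = cell(numel(countries), numel(years));
for c = 1:numel(countries)
    for t = 1:numel(years)
        rows = (ci == c) & (ti == t);
        M = NaN(numel(ys), numel(xs));
        if any(rows)
            % Repeated (Y, X) pairs are summed here (an assumption)
            M = accumarray([yi(rows), xi(rows)], vals(rows), size(M), @sum, NaN);
        end
        NewVariable{c, t} = M;
    end
end
```

The loops run over the distinct countries and years rather than over all rows, and the string comparisons happen once inside unique instead of in nested if/strcmp checks.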

Is there a quick way to assign unique text entries in an array a number?

In MATLAB, I have several data vectors that contain text. For example:
speciesname = [species1 species2 species3];
genomelength = [8 10 5];
gonometype = [RNA DNA RNA];
I realise that to make a plot, arrays must be numerical. Is there a quick and easy way to assign unique entries in an array a number, for example so that RNA = 1 and DNA = 2? Note that some arrays might not be binary (i.e. have more than two options).
Thanks!
So there is a quick way to do it, but I'm not sure your plots will be very intelligible if you use numbers instead of words.
You can make a unique array like this:
u = unique(gonometype);
The corresponding number array is then just 1:length(u).
As you go through your data, the number of the current word is:
find(u == current_name);
For your particular case, where the entries are text, you will need to use cell arrays and strcmp:
gonometype = {'RNA', 'DNA', 'RNA'};
u = unique(gonometype)
u =
'DNA' 'RNA'
current = 'RNA';
find(strcmp(u, current))
ans =
2
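
The per-entry lookup can also be avoided: the third output of unique already holds the number of each entry. The sketch below adds the 'stable' option so that numbers follow order of first appearance (RNA = 1, DNA = 2, as in the question) rather than alphabetical order; codes is a name introduced here for illustration.

```matlab
gonometype = {'RNA', 'DNA', 'RNA'};
% u lists each distinct label once, in order of first appearance;
% codes(k) is the position in u of gonometype{k}
[u, ~, codes] = unique(gonometype, 'stable');
codes = codes(:)';   % force a row vector to match gonometype
```

Since u{codes(k)} recovers the original label, u can still be used for tick labels or a legend after plotting codes.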

Complementary array Matlab

We've got an array of values, and we would like to create another array whose values are not in the first one.
Example:
load('internet.mat')
The first column contains the values in MB. We thought of something like:
MB_no = setdiff(v, internet(:,1))
where v is a vector of zeros whose length equals the number of rows in internet. But it doesn't work.
So, how do we do this?
You need to specify the range of possible values in order to define which values are not in internet. Say the range is v = 1:10; then setdiff(v,internet(:,1)) will give you the values in 1:10 that are not in the first column of internet.
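As a small illustration of this, the sketch below uses a made-up internet matrix (the real contents of internet.mat are not known here) and assumes the candidate range runs from 1 to the largest value in the first column:

```matlab
internet = [3 10; 1 20; 7 30; 3 40];   % hypothetical stand-in for the loaded data
v = (1:max(internet(:,1)))';           % assumed range of candidate values
MB_no = setdiff(v, internet(:,1));     % candidates absent from column 1: 2, 4, 5, 6
```

A vector of zeros fails as v because setdiff compares values, and the only value in v would then be 0.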
It seems as if you don't want the first column.
You can simply do:
MB_no=internet(:,2:end);
Assuming internet(:,1) contains only positive integers, and you want to find the integers in [1,...,max( internet(:,1) )] that do not appear in that column, you can simply do:
app = [];
app( internet(:,1) ) = 1;
MB_no = find( app == 0 );
This is somewhat like bucket sort.

MATLAB: how do I return entries in a vector?

Let's say a=[5;4;3;2;1] and I want all entries > 3, so the result should be v=[5,4].
I know "find" only returns the indices, so it doesn't quite do what I want.
Any suggestions?
Include the inequality test in the index:
v = a(a>3)
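
To make the relation to find explicit, here is a short sketch; mask, idx and v2 are names introduced for illustration. The comparison produces a logical array, and indexing with it returns the matching values directly, while find turns the same mask into positions.

```matlab
a = [5;4;3;2;1];
mask = a > 3;      % logical column vector, true where the test holds
v = a(mask);       % values that pass the test: [5;4]
idx = find(mask);  % positions of those values: [1;2]
v2 = a(idx);       % same values, reached through the indices
```

Because a is a column vector, v is a column as well; use v.' if a row such as [5,4] is needed.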