How can I alter the probability of an event in MATLAB? - matlab

I have a network with N = 5 nodes. The probability that a new connection exits node "Ni" is:
P(N1) = P(N2) = P(N3) = P(N4) = P(N5) = 1/5
And the sum of all P(Ni) = 1.
which is a uniform distribution. I would like nodes N3 and N5 to have a higher probability than the rest. For example:
P(N1) = P(N2) = P(N4) = 2/15
P(N3) = P(N5) = 3/10
And the sum of all P(Ni) = 1.
The code I am using now is this:
nodes = 21;
NODES=(1:nodes);
R=randperm(nodes);
nodeSource=NODES(R(1));
nodeDestin=NODES(R(2));
Thanks.

You might want to look at randsample (from the Statistics Toolbox):
nodeSource = randsample(1:numel(P), numel(P), true, P)
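For instance, with the weights from the question (a minimal sketch; randsample draws one index weighted by P, and drawing the destination from the remaining nodes with renormalized weights is an assumption about the intended behavior):
% weights from the question: N3 and N5 are favored; they sum to 1
P = [2/15 2/15 3/10 2/15 3/10];
nodes = 1:numel(P);
% one weighted draw for the source node
nodeSource = randsample(nodes, 1, true, P);
% draw the destination from the remaining nodes, renormalizing the weights
rest = nodes(nodes ~= nodeSource);
nodeDestin = randsample(rest, 1, true, P(rest) / sum(P(rest)));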

Related

Counting class frequencies to balance classes in tensorflow

I am working on my own implementation of WaveNet in TensorFlow. I keep having an issue where the audio goes to zero when generating, and I think it might help if my classes were weighted correctly. To do this I decided to divide my cost function by the frequency with which its value occurs. Right now I am keeping a running total of the frequencies as I train. To compute them I loop over the 128 distinct values and compute the count for each one. I feel like there should be a way to do this with vector operations, but I am unsure how. Does anyone know how I can do away with the for loop?
with tf.variable_scope('training'):
    self.global_step = tf.get_variable('global_step', [], tf.int64, initializer = tf.constant_initializer(), trainable = False)
    class_count = tf.get_variable('class_count', (quantization_channels,), tf.int64, initializer = tf.constant_initializer(), trainable = False)
    total_count = tf.get_variable('total_count', [], tf.int64, initializer = tf.constant_initializer(), trainable = False)
    y_ = tf.reshape(y_, (-1,))
    y = tf.reshape(y, (-1, quantization_channels))
    counts = [0] * quantization_channels
    for i in range(quantization_channels):
        counts[i] = tf.reduce_sum(tf.cast(tf.equal(y_, i), tf.int64))
    counts = class_count + tf.pack(counts)
    total = total_count + tf.reduce_prod(tf.shape(y_, out_type = tf.int64))
    with tf.control_dependencies([tf.assign(class_count, counts), tf.assign(total_count, total)]):
        class_freq = tf.cast(counts, tf.float32) / tf.cast(total, tf.float32)
        weights = tf.gather(class_freq, y_)
    self.cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(y, y_) / (quantization_channels * weights + 1e-2))
    self.accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(y, 1), y_), tf.float32))
    opt = tf.train.AdamOptimizer(self.learning_rate)
    grads = opt.compute_gradients(self.cost)
    grads = [(tf.clip_by_value(g, -1.0, 1.0), v) for g, v in grads]
    self.train_step = opt.apply_gradients(grads, global_step = self.global_step)
Try using the tf.histogram_fixed_width function to get a distribution of class labels per batch.
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/functions_and_classes/shard4/tf.histogram_fixed_width.md
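A minimal sketch of the idea (assuming the quantization_channels and integer labels y_ from the question; histogram_fixed_width takes the values, a value range, and a bin count):
# per-batch label histogram, replacing the Python loop over classes
batch_counts = tf.histogram_fixed_width(
    tf.cast(y_, tf.float32),
    value_range=[0.0, float(quantization_channels)],
    nbins=quantization_channels)
counts = class_count + tf.cast(batch_counts, tf.int64)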

How to implement parallel-for in a 4 level nested for loop block

I have to calculate the std and mean of a large data set with respect to quite a few models. The final loop block is nested four levels deep.
This is what it looks like:
count = 1;
alpha = 0.5;
%%%Below if each individual block is to be posterior'd and then average taken
c = 1;
for i = 1:numel(writers) %no. of writers
    for j = 1:numel(test_feats{i}) %no. of images
        for k = 1:numel(gmm) %no. of models
            for n = 1:size(test_feats{i}{j},1)
                [~, scores(c)] = posterior(gmm{k}, test_feats{i}{j}(n,:));
                c = c + 1;
            end
            c = 1;
            index_kek = find(abs(scores-mean(scores)) > alpha*std(scores));
            avg = mean(scores(index_kek)); %using std instead of mean... because of... reasons
            NLL(count) = avg;
            count = count + 1;
        end
        count = 1; %reset count
        NLL_scores{i}(j,:) = NLL;
    end
    fprintf('***score for model_%d done***\n', i)
end
It works and gives the desired result, but it takes 3 days to produce the final calculation, even on my i7 processor. During processing the Task Manager shows that only 20% of the CPU is being used, so I would rather put more load on the CPU to get the result faster.
Going by the official help, if I want to make the outermost loop a parfor while keeping the rest as normal for loops, all I have to do is use integer loop limits rather than function calls such as size or numel.
So making these changes the above code will become:
count = 1;
alpha = 0.5;
%%%Below if each individual block is to be posterior'd and then average taken
c = 1;
num_writers = numel(writers);
num_images = numel(test_feats{1});
num_models = numel(gmm);
num_feats = size(test_feats{1}{1},1);
parfor i = 1:num_writers %no. of writers
    for j = 1:num_images %no. of images
        for k = 1:num_models %no. of models
            for n = 1:num_feats
                [~, scores(c)] = posterior(gmm{k}, test_feats{i}{j}(n,:));
                c = c + 1;
            end
            c = 1;
            index_kek = find(abs(scores-mean(scores)) > alpha*std(scores));
            avg = mean(scores(index_kek)); %using std instead of mean... because of... reasons
            NLL(count) = avg;
            count = count + 1;
        end
        count = 1; %reset count
        NLL_scores{i}(j,:) = NLL;
    end
    fprintf('***score for model_%d done***\n', i)
end
Is this the optimal way to implement parfor in my case? Can it be improved or optimized further?
I couldn't test this in Matlab just now, but it should be close to a working solution. It has a reduced number of loops and changes a few implementation details, but overall it might perform just as fast as (or even slower than) your earlier code.
If gmm and test_feats take up a lot of memory, then it is important that parfor is able to determine which pieces of data need to be delivered to which workers. The editor should warn you if inefficient memory access is detected. This modification is especially useful if num_writers is much smaller than the number of cores in your CPU, or if it is only slightly larger (e.g. 5 writers on 4 cores would take about as long as 8 writers).
[i_writer, i_image, i_model] = ndgrid(1:num_writers, 1:num_images, 1:num_models);
idx_combined = [i_writer(:) i_image(:) i_model(:)];
n_combined = size(idx_combined, 1);
NLL_scores = zeros(n_combined, 1);
parfor i_for = 1:n_combined
    i = idx_combined(i_for, 1);
    j = idx_combined(i_for, 2);
    k = idx_combined(i_for, 3);
    % pre-allocate
    scores = zeros(num_feats, 1);
    for i_feat = 1:num_feats
        [~, scores(i_feat)] = posterior(gmm{k}, test_feats{i}{j}(i_feat,:));
    end
    % "find" is redundant here and performs a bit slower, though the difference might be insignificant
    index_kek = abs(scores - mean(scores)) > alpha * std(scores);
    NLL_scores(i_for) = mean(scores(index_kek));
end
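If you need the results back in the original per-writer layout, the flat vector can be reshaped afterwards (a sketch assuming the ndgrid ordering above, where the writer index varies fastest and the model index slowest):
NLL_cell = cell(num_writers, 1);
for i = 1:num_writers
    % rows are images, columns are models, matching NLL_scores{i}(j,:) in the original code
    NLL_cell{i} = reshape(NLL_scores(idx_combined(:,1) == i), num_images, num_models);
end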

Matlab: binomial simulation

How do I simulate a binomial distribution with values for an investment in two stocks, acme and widget?
The number of trials is 1000.
I invest in each stock for 5 years.
This is my code. What am I doing wrong?
nyears = 5;
ntrials = 1000;
startamount = 100;
yrdeposit = 50;
acme = zeros(nyears, 1);
widget = zeros(nyears, 1);
v5 = zeros(ntrials*5, 1);
v5 = zeros(ntrials*5, 1);
%market change between -5 to 1%
marketchangeacme = (-5+(1+5)*rand(nyears,1));
marketchangewidget = (-3+(3+3)*rand(nyears,1));
acme(1) = startamount;
widget(1) = startamount;
for m = 1:numTrials
    for n = 1:nyears
        acme(n) = acme(n-1) + (yrdeposit * (marketchangeacme(n)));
        widget(n) = acme(n-1) + (yrdeposit * (marketchangewidget(n)));
        vacme5(i) = acme(j);
        vwidget5(i) = widget(j);
    end
    theMean(m) = mean(1:n*nyears);
    p = 0.5 % prob neg return
    acmedrop = (marketchangeacme < p)
    widgetdrop = (marketchangewidget < p)
end
plot(mean)
Exactly what you are trying to calculate is not clear. However, some things that are obviously wrong with the code are:
widget(n) presumably isn't a function of acme(n-1) but rather widget(n-1).
Every entry of theMean will be mean(1:nyears*nyears), which for nyears = 5 will be 13. (This is because n = nyears always at that point in the code.)
The probability of a negative return for acme is 5/6, not 0.5.
To find the locations of the negative returns you want acmedrop = (marketchangeacme < 0); not < 0.5 (nor any other probability). Similarly for widgetdrop.
You are not preallocating vacme5 nor vwidget5 (but you do preallocate v5 twice, and then never use it).
You don't create a variable called mean (and you never should), so plot(mean) will not work.
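For reference, a minimal corrected sketch along the lines above (the financial model is a guess at the intent: the percentage change compounds on the running balance, the /100 conversion and the deposit handling are assumptions, and final-year values are collected per trial):
nyears = 5;
ntrials = 1000;
startamount = 100;
yrdeposit = 50;
vacme5 = zeros(ntrials, 1);    % preallocate the final-year values
vwidget5 = zeros(ntrials, 1);
for m = 1:ntrials
    % draw fresh market changes per trial: acme in [-5,1]%, widget in [-3,3]%
    marketchangeacme = (-5 + (1+5)*rand(nyears,1)) / 100;
    marketchangewidget = (-3 + (3+3)*rand(nyears,1)) / 100;
    acme = zeros(nyears, 1);
    widget = zeros(nyears, 1);
    acme(1) = startamount;
    widget(1) = startamount;
    for n = 2:nyears           % start at 2 so that n-1 is a valid index
        acme(n) = acme(n-1)*(1 + marketchangeacme(n)) + yrdeposit;
        widget(n) = widget(n-1)*(1 + marketchangewidget(n)) + yrdeposit;
    end
    vacme5(m) = acme(nyears);
    vwidget5(m) = widget(nyears);
end
% don't shadow the built-in mean; store results under another name
finalMeans = [mean(vacme5) mean(vwidget5)];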

Matlab: Why is my plot not converging?

Would you be able to help me understand why the plot does not converge? What am I missing in the code? I am plotting the mean and variance against the number of trials. Thanks.
samplesize = 10;
trialsize = 1000;
firstvector = 1:trialsize;
vectorB = zeros(trialsize, 1);
vectorC = zeros(trialsize, 1);
for i = 1:trialsize
    v1 = rand(samplesize, 1);
    vectorB(i) = mean(v1);
    vectorC(i) = var(v1);
end
plot(firstvector, vectorB)
plot(firstvector, vectorC)
Is this what you wanted to do? Basically take the mean and var of 10 more samples each time, so the 1st iteration uses 10 samples, the next 20, the next 30, and so on for 1000 iterations?
Before, you were just taking the mean and var of 10 random samples at a time, so you were never going to converge.
samplesize = 10;
trialsize = 1000;
firstvector = 1:trialsize;
vectorB = zeros(trialsize, 1);
vectorC = zeros(trialsize, 1);
v1 = rand(trialsize*samplesize, 1);
for i = 1:trialsize
    vectorB(i) = mean(v1(1:i*samplesize));
    vectorC(i) = var(v1(1:i*samplesize));
end
subplot(2,1,1); plot(firstvector, vectorB)
subplot(2,1,2); plot(firstvector, vectorC)
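As an aside, the same running statistics can be computed without the loop; a sketch using cumulative sums, applying the sample-variance identity var = (sum(x.^2) - n*mean^2)/(n-1):
csum = cumsum(v1);
csum2 = cumsum(v1.^2);
n = (samplesize:samplesize:trialsize*samplesize)';   % sample counts 10, 20, ..., 10000
vectorB = csum(n) ./ n;                              % running means
vectorC = (csum2(n) - n .* vectorB.^2) ./ (n - 1);   % running sample variances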

Generating random numbers...Faster way?

Using Run & Time on my algorithm, I found that it is a bit slow when adding the standard deviation to the integers. First of all, I created the large integer matrix:
NumeroCestelli = 5;
lover_bound = 0;
upper_bound = 250;
steps = 10;
Alpha = 0.123;
livello = lover_bound:steps:upper_bound;
L = length(livello);
[PianoSperimentale] = combinator(L, NumeroCestelli, 'c', 'r');
for i = 1:L
    PianoSperimentale(PianoSperimentale==i) = livello(i);
end
Then I add the standard deviation (sigma = alpha * mu) and the error (of a weigher) like this:
%Standard Deviation
NumeroEsperimenti = size(PianoSperimentale,1);
PesoCestelli = randn(NumeroEsperimenti,NumeroCestelli)*Alfa;
PesoCestelli = PesoCestelli.*PianoSperimentale + PianoSperimentale;
random = randn(NumeroEsperimenti,NumeroCestelli);
PesoCestelli(PesoCestelli<0) = random(PesoCestelli<0).*(Alfa.*PianoSperimentale(PesoCestelli<0) + PianoSperimentale(PesoCestelli<0));
%Error
IncertezzaCella = 0.5*10^(-6);
Incertezza = randn(NumeroEsperimenti,NumeroCestelli)*IncertezzaCella;
PesoIncertezza = PesoCestelli.*Incertezza+PesoCestelli;
PesoIncertezza = (PesoIncertezza<0).*(-PesoIncertezza)+PesoIncertezza;
Is there a faster way?
There is not enough information for me to test it, but I bet that eliminating all the duplicate calculations will lead to a speedup. I have tried to remove some of them:
PesoCestelli = randn(NumeroEsperimenti,NumeroCestelli)*Alfa;
PesoCestelli = (1+PesoCestelli).*PianoSperimentale;
random = randn(NumeroEsperimenti,NumeroCestelli);
idx = PesoCestelli<0;
PesoCestelli(idx) = random(idx).*(1+Alfa).*PianoSperimentale(idx);
%Error
IncertezzaCella = 0.5*10^(-6);
Incertezza = randn(NumeroEsperimenti,NumeroCestelli)*IncertezzaCella;
PesoIncertezza = abs((1+Incertezza).*PesoCestelli);
Note that I reduced the last two lines to a single line.
You calculate PesoCestelli<0 a number of times. You could just calculate it once and save the value. You also create a full set of random numbers, but only use the subset of them where PesoCestelli<0. You might be able to speed things up by only creating the number of random numbers you need.
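A sketch of that last idea (assuming, as the next answer does, that Alfa is a scalar):
idx = PesoCestelli < 0;
% draw only as many normals as there are negative entries
PesoCestelli(idx) = randn(nnz(idx), 1) .* (1+Alfa) .* PianoSperimentale(idx);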
It is not clear what Alfa is, but if it is a scalar, instead of
Alfa.*PianoSperimentale(PesoCestelli<0) + PianoSperimentale(PesoCestelli<0)
it might be faster to do
(1+Alfa).*PianoSperimentale(PesoCestelli<0)