Finding 5 consecutive successes using Matlab? - matlab

I have a function which for 10 cycles finds the difference between individual sensor values and the average sensor value. The test will be done 100 times using this function. So every time cycle>10 I am forcing it to be zero so that in the 11th repetition it will restart counting from zero. Here is the code:
cycle=cycle +1;
if cycle>10
cycle=0;
end
for i=1: TotalnoOfGrids
for j=1: noOfNodes
if abs(char(Allquants{i}(j))-char(mostCommonLetters {i}))>0
if cycle>0
wrong{i}(j)=wrong{i}(j)+1;
else
wrong{i}(j)=0;
end
end
end
end
Now I need to know if the sensor performed 5 consecutive successes in the period of 10 cycles. How can I do that?
I thought of a loop but I read that it takes too much time.
Doing a search on the net I have found this SO question.
The problem is that the above function will be repeated for 100 cycles. For every 10 cycles I want to see whether there are consecutive successes, so it is being done dynamically and I am not saving the success or failure status of the sensor for the cycles. So I do not have a vector containing 1 or 0 to pass to the function used in the above reference, or as Jonas suggested.

If a loop is the easiest thing, give it a try! Just because you've read it "takes too much time" doesn't mean it really makes a difference for your case! It is true that in Matlab it often makes sense to avoid loops; but in your case, 100*20*9 (if I understand you correctly) loop iterations doesn't seem so bad yet (depending on your speed requirement).
Edit (corrected answer)
I now understand from your comments that the code you show us is surrounded by a while or for loop which is being run ~100 times, and that Allquants and mostCommonLetters probably change inside that loop. In this case my previous answer didn't work for you, since it counted successes on different sensors; this should be better now.
If I read your code correctly, the condition abs(char(Allquants{i}(j))-char(mostCommonLetters {i}))>0 tells you that a result was "wrong"; consequently,
for i=1:TotalnoOfGrids
this_cycle_successes(i,:)=char(Allquants{i})==char(mostCommonLetters{i});
end
consecutive_successes=(consecutive_successes+1).*this_cycle_successes;
would calculate how many successes you had in a row. Note you need to initialize consecutive_successes before starting your cycle loop, e.g.
consecutive_successes = zeros(9,20);
After the 10 cycles, you can check which sensors had 5 successes like this:
has5successes = consecutive_successes>=5;
Note that this is a matrix operation, so now you will get 9*20 values, as you requested in your comment. This solution wouldn't require a loop over j.
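One caveat: the counter above is reset to zero by a failure, so checking it after the 10th cycle only reports runs that are still alive at the end of the window. If a run of 5 anywhere inside the window should count, one option is to keep a running maximum as well. A minimal sketch, assuming made-up outcomes for 3 sensors over 10 cycles (outcomes, streak and best are hypothetical names, not from the original code):

```matlab
% rows = cycles, columns = sensors; 1 = success, 0 = failure (made-up data)
outcomes = [1 1 1 1 1 0 1 1 0 1;
            1 0 1 0 1 0 1 0 1 0;
            0 0 0 0 0 1 1 1 1 1]';

streak = zeros(1, 3);   % current run length per sensor
best   = zeros(1, 3);   % longest run seen so far per sensor

for c = 1:size(outcomes, 1)
    % same update rule as above: grow the run on success, reset on failure
    streak = (streak + 1) .* outcomes(c, :);
    best   = max(best, streak);
end

endsWith5 = streak >= 5;   % run of 5 still alive after the last cycle
had5      = best >= 5;     % run of 5 at any point in the window
```

With this data the first sensor has 5 successes followed by a failure, so had5 flags it while endsWith5 does not. Both streak and best would need to be reset to zero at the start of every 10-cycle window.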

Related

Periodic to incremental (data reduction)

I have been thinking about this problem for some time. I have a recording of angular data (Angle(~20000,1)) varying between 0 and 355 (a potentiometer attached to a rotary testing machine), and I want to convert it to incremental form, since I want the final total angular displacement. The main issue is that between 355 and the next 0 there are no jumps but a fast decrement (with strongly negative slope in time vs angle space). I've tried 2 ways up to now:
Calculate the Angslope=diff(Angle), extract with find the indexes j1=find(Angslope>0.2 & Angslope<0.2) to avoid the negative slopes due to the inversion of angular signal, then try to apply those indexes to the original Angle(n,1), as Angle2=Angle(j1). The trouble is the n-1 length of Angslope and the fact that somehow there is not a simple shift of my indexes of one position.
For cycles and logical, wanting to exclude data if the previous one is < the current value,etc
Angle2=zeros(size(Angle,1),1);
for i=2:size(Angle,1)
if Angle(i,1)<Angle(i-1,1)
Angle2(i,1)=NaN;
else Angle2(i,1)=Angle(i,1);
end
end
Which works well, but I don't know how to "match up" the single increment steps I obtain!
Any help or simple comment would be of great help!!
You are probably looking for the unwrap function. For this you have to convert your angles into radians, but that's not a big deal.
You can get the increments in one line:
Inc = diff(unwrap(Angle*pi/180))*180/pi;
and your total angular displacement:
Tot = sum(Inc);
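As a quick sanity check, here is a sketch on synthetic data, assuming an idealized signal that wraps at 360 in a single sample (the real recording wraps near 355 and falls back over several samples, so it may need extra care):

```matlab
% Synthetic sawtooth: 5 degrees per sample, wrapped into [0, 360)
Angle = mod((0:5:1000)', 360);

% unwrap works in radians and removes jumps larger than pi
Inc = diff(unwrap(Angle*pi/180))*180/pi;   % per-sample increments in degrees
Tot = sum(Inc);                            % total angular displacement
```

Here every increment comes back as 5 degrees and Tot is 1000 up to rounding error, even though the raw signal never exceeds 355.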

recording 'bursts' of samples at 300 samples per sec

I am recording voltage changes over a small circuit- this records mouse feeding. When the mouse is eating, the circuit voltage changes, I convert that into ones and zeroes, all is well.
BUT- I want to calculate the number and duration of 'bursts' of feeding- that is, instances of circuit closing that occur within 250 ms (75 samples) of one another. If the gap between closings is larger than 250ms I want to count it as a new 'burst'
I guess I am looking for help in asking matlab to compare the sample number of each 1 in the digital file with the sample number of the next 1 down- if the difference is more than 75, call the first 1 the end of one bout and the second one the start of another bout, classifying the difference as a gap, but if it is NOT, keep the sample number of the first 1 and compare it against the next and next and next until there is a 75-sample difference
I can compare each 1 to the next 1 down:
n=1; m=2;
for i = 1:length(bouts4)-1
if bouts4(i+1) - bouts4(i) >= 75 %250 msec gap at a sample rate of 300
boutend4(n) = bouts4(i);
boutstart4(m)= bouts4(i+1);
m = m+1;
n = n+1;
end
end
I don't really want to iterate through i for both variables though...
any ideas??
-DB
You can try the following code
time_diff = diff(bouts4);
new_feeding = time_diff >= 75; % same threshold test as in your loop
boutend4 = bouts4(new_feeding);
boutstart4 = [0; bouts4(find(new_feeding) + 1)];
That's actually not too bad. We can actually make this completely vectorized. First, let's start with two signals:
A version of your voltages untouched
A version of your voltages that is shifted in time by 1 step (i.e. it starts at time index = 2).
Now the basic algorithm is really:
Go through each element and see if the difference is above a threshold (in your case 75).
Enumerate the locations of each one in separate arrays
Now onto the code!
%// Make those signals
bout4a = bouts4(1:end-1);
bout4b = bouts4(2:end);
%// Ensure column vectors - you'll see why soon
bout4a = bout4a(:);
bout4b = bout4b(:);
% // Step #1
loc = find(bout4b - bout4a >= 75);
% // Step #2
boutend4 = [bout4a(loc); 0];
boutstart4 = [0; bout4b(loc)];
Aside:
Thanks to tail.b.lo, you can also use diff. It basically performs that difference operation with the copying of those vectors like I did before. diff basically works the same way. However, I decided not to use it so you can see how exactly your code that you wrote translates over in a vectorized way. Only way to learn, right?
Back to it!
Let's step through this slowly. The first two lines of code make those signals I was talking about: the original one (up to length(bouts4) - 1) and another one that is the same length but shifted over by one time index. Next, we use find to find those locations where the difference between the two was >= 75. After, we use these locations to access the bouts array. The ending array accesses the original array while the starting array accesses the same locations but moved over by one time index.
The reason why we need to make these two signals column vector is the way I am appending information to the starting vector. I am not sure whether your data comes in rows or columns, so to make this completely independent of orientation, I'm going to make sure that your data is in columns. This is because if I try to append a 0, if I do it to a row vector I have to use a space to denote that I'm going to the next column. If I do it for a column vector, I have to use a semi-colon to go to the next row. To completely avoid checking to see whether it's a row or column vector, I'm going to make sure that it's a column vector no matter what.
By looking at your code m=2. This means that when you start writing into this array, the first location is 0. As such, I've artificially placed a 0 at the beginning of this array and followed that up with the rest of the values.
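To make the result concrete, here is a small worked sketch on made-up sample numbers. It is a variant, not the code above: instead of padding with 0 it uses the first and last sample as the start of the first bout and the end of the last one, so that bout durations can be computed directly (gap, boutstart, boutend and duration are names chosen for this illustration):

```matlab
% Sample numbers at which the circuit closed (made-up data)
bouts4 = [10 20 30 200 210 400 410 420];
bouts4 = bouts4(:);                      % force a column vector

gap = diff(bouts4) >= 75;                % true where a new bout begins after this sample

boutstart = [bouts4(1); bouts4([false; gap])];   % first sample of each bout
boutend   = [bouts4([gap; false]); bouts4(end)]; % last sample of each bout
duration  = boutend - boutstart;                 % bout length in samples
nBouts    = numel(boutstart);
```

For this input there are three bouts, starting at samples 10, 200 and 400 and ending at 30, 210 and 420. Dividing duration by the sample rate (300 samples per second) gives the length in seconds.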
Hope this helps!

Random numbers that add to 1 with a minimum increment: Matlab

Having read carefully the previous question
Random numbers that add to 100: Matlab
I am struggling to solve a similar but slightly more complex problem.
I would like to create an array of n elements that sums to 1, however I want an added constraint that the minimum increment (or if you like number of significant figures) for each element is fixed.
For example if I want 10 numbers that sum to 1 without any constraint the following works perfectly:
num_stocks=10;
num_simulations=100000;
temp = [zeros(num_simulations,1),sort(rand(num_simulations,num_stocks-1),2),ones(num_simulations,1)];
weights = diff(temp,[],2);
I foolishly thought that by scaling this I could add the constraint as follows
num_stocks=10;
min_increment=0.001;
num_simulations=100000;
scaling=1/min_increment;
temp2 = [zeros(num_simulations,1),sort(round(rand(num_simulations,num_stocks-1)*scaling)/scaling,2),ones(num_simulations,1)];
weights2 = diff(temp2,[],2);
However though this works for small values of n & small values of increment, if for example n=1,000 & the increment is 0.1% then over a large number of trials the first and last numbers have a mean which is consistently below 0.1%.
I am sure there is a logical explanation/solution to this but I have been tearing my hair out to try & find it & wondered if anybody would be so kind as to point me in the right direction. To put the problem into context, I want to create random stock portfolios (hence the sum to 1).
Thanks in advance
Thank you for the responses so far, just to clarify (as I think my initial question was perhaps badly phrased), it is the weights that have a fixed increment of 0.1% so 0%, 0.1%, 0.2% etc.
I did try using integers initially
num_stocks=1000;
min_increment=0.001;
num_simulations=100000;
scaling=1/min_increment;
temp = [zeros(num_simulations,1),sort(randi([0 scaling],num_simulations,num_stocks-1),2),ones(num_simulations,1)*scaling];
weights = (diff(temp,[],2)/scaling);
test=mean(weights);
but this was worse, the mean for the 1st & last weights is well below 0.1%.....
Edit to reflect excellent answer by Floris & clarify
The original code I was using to solve this problem (before finding this forum) was
function x = monkey_weights_original(simulations,stocks)
stockmatrix=1:stocks;
base_weight=1/stocks;
r=randi(stocks,stocks,simulations);
x=histc(r,stockmatrix)*base_weight;
end
This runs very fast, which was important considering I want to run a total of 10,000,000 simulations, 10,000 simulations on 1,000 stocks takes just over 2 seconds with a single core & I am running the whole code on an 8 core machine using the parallel toolbox.
It also gives exactly the distribution I was looking for in terms of means, and I think that it is just as likely to get a portfolio that is 100% in 1 stock as it is to get a portfolio that is 0.1% in every stock (though I'm happy to be corrected).
My issue is that although it works for 1,000 stocks & an increment of 0.1% and I guess it works for 100 stocks & an increment of 1%, as the number of stocks decreases each pick becomes a very large percentage (in the extreme, with 2 stocks you will always get a 50/50 portfolio).
In effect I think this solution is like the binomial solution Floris suggests (but more limited)
However my question has arisen because I would like to make my approach more flexible & have the possibility of say 3 stocks & an increment of 1%, which my current code will not handle correctly, hence how I stumbled across the original question on stackoverflow
Floris's recursive approach will get to the right answer, but the speed will be a major issue considering the scale of the problem.
An example of the original research is here
http://www.huffingtonpost.com/2013/04/05/monkeys-stocks-study_n_3021285.html
I am currently working on extending it with more flexibility on portfolio weights & numbers of stock in the index, but it appears my programming & probability theory ability are a limiting factor.......
One problem I can see is that your formula allows for numbers to be zero - when the rounding operation results in two consecutive numbers to be the same after sorting. Not sure if you consider that a problem - but I suggest you think about it (it would mean your model portfolio has fewer than N stocks in it since the contribution of one of the stocks would be zero).
The other thing to note is that the probability of getting the extreme values in your distribution is half of what you want them to be: If you have uniformly distributed numbers from 0 to 1000, and you round them, the numbers that round to 0 were in the interval [0 0.5>; the ones that round to 1 came from [0.5 1.5> - twice as big. The last number (rounding to 1000) is again from a smaller interval: [999.5 1000]. Thus you will not get the first and last number as often as you think. If instead of round you use floor I think you will get the answer you expect.
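The endpoint effect can be seen without any randomness. The sketch below is only an illustration: it replaces the uniform random draws with an evenly spaced grid on the interval 0 to 1000 (variable names are made up) and counts how many points land on each integer:

```matlab
% Evenly spaced stand-ins for uniform draws: 0.005, 0.015, ..., 999.995
u = ((0:99999) + 0.5) / 100;

r = round(u);
nRoundFirst = sum(r == 0);      % fed by [0, 0.5): half-width interval
nRoundInner = sum(r == 1);      % fed by [0.5, 1.5): full-width interval
nRoundLast  = sum(r == 1000);   % fed by [999.5, 1000]: half-width interval

f = floor(u);
nFloorFirst = sum(f == 0);      % fed by [0, 1): full-width interval
nFloorLast  = sum(f == 999);    % fed by [999, 1000): full-width interval
```

With round, the two end values each collect 50 grid points while an interior value collects 100; with floor, every value from 0 to 999 collects 100.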
EDIT
I thought about this some more, and came up with a slow but (I think) accurate method for doing this. The basic idea is this:
Think in terms of integers; rather than dividing the interval 0 - 1 in steps of 0.001, divide the interval 0 - 1000 in integer steps
If we try to divide N into m intervals, the mean size of a step should be N / m; but being integer, we would expect the intervals to be binomially distributed
This suggests an algorithm in which we choose the first interval as a binomially distributed variate with mean (N/m) - call the first value v1; then divide the remaining interval N - v1 into m-1 steps; we can do so recursively.
The following code implements this:
% random integers adding up to a definite sum
function r = randomInt(n, limit)
% returns an array of n random integers
% whose sum is limit
% calls itself recursively; slow but accurate
if n>1
v = binomialRandom(limit, 1 / n);
r = [v randomInt(n-1, limit - v)];
else
r = limit;
end
function b = binomialRandom(N, p)
b = sum(rand(1,N)<p); % slow but direct
To get 10000 instances, you run this as follows:
tic
portfolio = zeros(10000, 10);
for ii = 1:10000
portfolio(ii,:) = randomInt(10, 1000);
end
toc
This ran in 3.8 seconds on a modest machine (single thread) - of course the method for obtaining a binomially distributed random variate is the thing slowing it down; there are statistical toolboxes with more efficient functions but I don't have one. If you increase the granularity (for example, by setting limit=10000) it will slow down more since you increase the number of random number samples that are generated; with limit = 10000 the above loop took 13.3 seconds to complete.
As a test, I found mean(portfolio)' and std(portfolio)' as follows (with limit=1000):
100.20 9.446
99.90 9.547
100.09 9.456
100.00 9.548
100.01 9.356
100.00 9.484
99.69 9.639
100.06 9.493
99.94 9.599
100.11 9.453
This looks like a pretty convincing "flat" distribution to me. We would expect the numbers to be binomially distributed with a mean of 100, and standard deviation of sqrt(p*(1-p)*n). In this case, p=0.1 so we expect s = 9.4868. The values I actually got were again quite close.
I realize that this is inefficient for large values of limit, and I made no attempt at efficiency. I find that clarity trumps speed when you develop something new. But for instance you could pre-compute the cumulative binomial distributions for p=1./(1:10), then do a random lookup; but if you are just going to do this once, for 100,000 instances, it will run in under a minute; unless you intend to do it many times, I wouldn't bother. But if anyone wants to improve this code I'd be happy to hear from them.
Eventually I have solved this problem!
I found a paper by 2 academics at Johns Hopkins University, "Sampling Uniformly From The Unit Simplex"
http://www.cs.cmu.edu/~nasmith/papers/smith+tromble.tr04.pdf
In the paper they outline how naive algorithms don't work, in a way very similar to woodchips' answer to the Random numbers that add to 100 question. They then go on to show that the method suggested by David Schwartz can also be slightly biased and propose a modified algorithm which appears to work.
If you want x numbers that sum to y
Sample uniformly x-1 random numbers from the range 1 to x+y-1 without replacement
Sort them
Add a zero at the beginning & x+y at the end
difference them & subtract 1 from each value
If you want to scale them as I do, then divide by y
It took me a while to realise why this works when the original approach didn't, and it comes down to the probability of getting a zero weight (as highlighted by Floris in his answer). To get a zero weight in the original version for all but the 1st or last weights, your random numbers had to have 2 values the same, but for the 1st & last ones a random number of zero or the maximum number would also result in a zero weight, which is more likely.
In the revised algorithm, zero & the maximum number are not in the set of random choices & a zero weight occurs only if you select two consecutive numbers which is equally likely for every position.
I coded it up in Matlab as follows
function weights = unbiased_monkey_weights(num_simulations,num_stocks,min_increment)
scaling=1/min_increment;
sample=NaN(num_simulations,num_stocks-1);
for i=1:num_simulations
allcomb=randperm(scaling+num_stocks-1);
sample(i,:)=allcomb(1:num_stocks-1);
end
temp = [zeros(num_simulations,1),sort(sample,2),ones(num_simulations,1)*(scaling+num_stocks)];
weights = (diff(temp,[],2)-1)/scaling;
end
Obviously the loop is a bit clunky and as I'm using the 2009 version the randperm function only allows you to generate permutations of the whole set, however despite this I can run 10,000 simulations for 1,000 numbers in 5 seconds on my clunky laptop which is fast enough.
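For readers on a newer release: randperm accepts a second argument (randperm(n,k), available since R2011b), which draws k distinct values directly, so the full permutation is not needed. A minimal sketch for a single portfolio, assuming 3 stocks and a 1% increment (the variable names are chosen for this illustration):

```matlab
num_stocks    = 3;
min_increment = 0.01;
scaling       = round(1 / min_increment);   % number of increments that make up 100%

% num_stocks-1 distinct cut points from 1 .. scaling+num_stocks-1
cuts = sort(randperm(scaling + num_stocks - 1, num_stocks - 1));

% pad with 0 and scaling+num_stocks, difference, subtract 1, rescale
weights = (diff([0, cuts, scaling + num_stocks]) - 1) / scaling;
```

Each weight is a non-negative multiple of the increment and the weights sum to 1; looping over simulations as in the function above would fill a whole matrix.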
The mean weights are now correct & as a quick test I replicated woodchips' example of generating 3 numbers that sum to 1 with the minimum increment being 0.01% & it also looks right
Thank you all for your help and I hope this solution is useful to somebody else in the future
The simple answer is to use the schemes that work well with NO minimum increment, then transform the problem. As always, be careful. Some methods do NOT yield uniform sets of numbers.
Thus, suppose I want 11 numbers that sum to 100, with a constraint of a minimum increment of 5. I would first find 11 numbers that sum to 45, with no lower bound on the samples (other than zero.) I could use a tool from the file exchange for this. Simplest is to simply sample 10 numbers in the interval [0,45]. Sort them, then find the differences.
X = diff([0,sort(rand(1,10)),1]*45);
The vector X is a sample of numbers that sums to 45. But the vector Y sums to 100, with a minimum value of 5.
Y = X + 5;
Of course, this is trivially vectorized if you wish to find multiple sets of numbers with the given constraint.
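A sketch of what the vectorized form might look like, with one set per row (numSets, n, total, minVal and slack are names introduced here for illustration):

```matlab
numSets = 4;      % how many sets to draw
n       = 11;     % numbers per set
total   = 100;    % required sum of each set
minVal  = 5;      % lower bound on every number

slack = total - n*minVal;                   % 45 left to distribute freely

% n-1 sorted cut points per row, padded with 0 and 1, scaled to the slack
cutPoints = [zeros(numSets,1), sort(rand(numSets, n-1), 2), ones(numSets,1)] * slack;
X = diff(cutPoints, [], 2);                 % each row sums to slack
Y = X + minVal;                             % each row sums to total, all entries >= minVal
```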

How can i increase speed of for loop in matlab? [closed]

I want to read the values of the pixels in the image result to compare each value with some
I use two for loops
function GrdImg= GrdLbp(VarImg,mapping,LbpImg)
tic
p=mapping.samples;
[Ysize,Xsize]=size(result);
GImg=zeros(Ysize,Xsize);
temp=[];
cnt=1;
for n=0:p-1
temp(cnt)=2^n;
temp(cnt+1)=(2^p)-1-(2^n);
cnt=cnt+2;
end
for i=1:Ysize
i
for j=1:Xsize
if isempty(find(result(i,j)==temp(:,:)))==1
GImg(i,j)=sqrtm(Vresult(i,j));
end
end
end
but it runs too slowly. Could you tell me what I can use instead of the for loops?
Thanks a lot
You didn't really give enough information to answer your question - since, as was stated in the comments, you aren't doing anything with the values in the loop right now. So let me give you a few ideas:
1) To compare all the pixels with a fixed value, and return the index of all pixels greater than 90% of the maximum:
threshold = 0.9 * max(myImage(:));
prettyBigPixels = find(myImage > threshold);
2) To set all pixels < 5% of max to zero:
threshold = 0.05 * max(myImage(:));
myImage(myImage < threshold) = 0;
In the first case, the find command returns all the indices (note - you can access a 2D matrix of MxN with a single index that goes from 1 to M*N). You can use ind2sub to convert to the individual i, j coefficients if you want to.
In the second case, putting (myImage < threshold) as the index of the matrix is called logical indexing - it is very fast, and will access only those elements that meet the criterion.
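Here is a tiny worked sketch of both cases on a made-up 3x3 image (the values, and the names row, col and cleaned, are invented for illustration):

```matlab
myImage = [100  2 50;
            95  4 60;
            10 91  3];

% Case 1: linear indices of pixels above 90% of the maximum
threshold = 0.9 * max(myImage(:));            % 90
prettyBigPixels = find(myImage > threshold);  % column-major linear indices
[row, col] = ind2sub(size(myImage), prettyBigPixels);

% Case 2: logical indexing, zero out everything below 5% of the maximum
cleaned = myImage;
cleaned(cleaned < 0.05 * max(cleaned(:))) = 0;
```

The three large pixels (100, 95 and 91) have linear indices 1, 2 and 6, which ind2sub turns into (1,1), (2,1) and (3,2). In cleaned, the values 2, 4 and 3 are replaced by 0.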
If you let us know what you're actually doing with the values found we can speed things up more; because right now, the net result of your code is that when the loop is finished, your value Temp is equal to the last element - and since you did nothing in the loop we can rewrite the whole thing as
Temp = pixel(end);
EDIT Now that you show what you are doing in your inner loop, we can optimize more. Behzad already showed how to speed up the computation of the vector temp - nothing to add there, it's the right way to do it. As for the two nested loops, which are likely the place where most time is spent, you can find all the pixels you are interested in with a single line:
pixelsOfInterest = find(~ismember(result(:), temp(:)));
This will find the index of pixels in result that do not occur in temp. You can then do
GImg(pixelsOfInterest) = sqrt(result(pixelsOfInterest));
These two lines together should replace the functionality of everything in your code from for i=1:Ysize to the last end. Note - your variables seem to be uninitialized, and change names - sometimes it's result, sometimes it's Vresult. I am not trying to debug that; just giving you a fast implementation of your inner loop.
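Put together on a toy input, the replacement might look like the sketch below. It assumes p = 3 and a made-up 2x3 result matrix, and it uses result for both the test and the square root, since the original code is ambiguous about result versus Vresult:

```matlab
p = 3;
n = 0:p-1;
temp = [2.^n, 2^p - 1 - 2.^n];        % the values 1 2 4 6 5 3

result = [0 1 7;
          2 9 4];                     % made-up pixel values
GImg = zeros(size(result));

% linear indices of pixels whose value does not occur in temp
pixelsOfInterest = find(~ismember(result(:), temp(:)));
GImg(pixelsOfInterest) = sqrt(result(pixelsOfInterest));
```

Only 0, 9 and 7 are absent from temp, so only those three positions in GImg are filled in; the rest stay zero.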
Since the question has been completely edited, I am posting a new answer rather than editing my former one.
You can improve your code in some ways:
1. instead of :
for n=0:p-1
temp(cnt)=2^n;
temp(cnt+1)=(2^p)-1-(2^n);
cnt=cnt+2;
end
use this one:
temp=zeros(1,2*p);
n=0:p-1;
temp(1:2:2*p)=2.^n; %//for odd elements
temp(2:2:2*p)=2^p-1-2.^n; %//for even elements (i supposed p>1)
2. When the code is ready for real computation rather than debugging, never let variables print to the screen, because printing takes a long time (in CPU cycles). In your code some variables, such as i, are printed on screen. Remove them or terminate them with ;.
3. You can use temp(:) instead of temp(:,:) in the last lines because temp is one-dimensional.
4. Different functions are meant for different types of variables. In this code you can use sqrt() instead of sqrtm(); it may be slightly faster.
5. The big problem in this code is your last comparison: if no element of temp is equal to the specific element of result, then do something. It is hard to improve this part without knowing the real aim of the code! You may be able to solve the problem with another algorithm that has completely different code. But if there is no other way, use it as it is (nested loops). Good luck!
It seems your image is grayscale or single-channel, because Temp=pixel(i,j) gives one number, not three.
Your question gives no further explanation, so I can think of three kinds of values that you might be comparing with:
compare with a constant number
compare with a series of numbers
compare with a two dimensional matrix of numbers
If the first or third one is what you need, the solution is very easy (in the third case, of course, the size of the matrix must be equal to the size of pixel).
Comparison with a number (c is number or two-dimensional array)
comp=pixel - c;
But if the second one is what you need, you can first reshape pixel into a one-dimensional matrix and then compare it with the series of numbers s (of course the length of this series must be equal to the product of the number of rows and columns of pixel); after the comparison you can reshape the result back to the original two-dimensional size.
Comparison with a number series s
pixel_temp = reshape(pixel,1,[]);
comp = pixel_temp - s;
pixel_compared = reshape(comp,size(pixel,1),size(pixel,2)); % reshape the result back to the original size

Problem using the find function in MATLAB

I have two arrays of data that I'm trying to amalgamate. One contains actual latencies from an experiment in the first column (e.g. 0.345, 0.455... never more than 3 decimal places), along with other data from that experiment. The other contains what is effectively a 'look up' list of latencies ranging from 0.001 to 0.500 in 0.001 increments, along with other pieces of data. Both data sets are X-by-Y doubles.
What I'm trying to do is something like...
for i = 1:length(actual_latency)
row = find(predicted_data(:,1) == actual_latency(i))
full_set(i,1:4) = [actual_latency(i) other_info(i) predicted_info(row,2) ...
predicted_info(row,3)];
end
...in order to find the relevant row in predicted_data where the look up latency corresponds to the actual latency. I then use this to create an amalgamated data set, full_set.
I figured this would be really simple, but the find function keeps failing by throwing up an empty matrix when looking for an actual latency that I know is in predicted_data(:,1) (as I've double-checked during debugging).
Moreover, if I replace find with a for loop to do the same job, I get a similar error. It doesn't appear to be systematic - using different participant data sets throws it up in different places.
Furthermore, during debugging mode, if I use find to try and find a hard-coded value of actual_latency, it doesn't always work. Sometimes yes, sometimes no.
I'm really scratching my head over this, so if anyone has any ideas about what might be going on, I'd be really grateful.
You are likely running into a problem with floating point comparisons when you do the following:
predicted_data(:,1) == actual_latency(i)
Even though your numbers appear to only have three decimal places of precision, they may still differ by very small amounts that are not being displayed, thus giving you an empty matrix since FIND can't get an exact match.
One feature of floating point numbers is that certain numbers can't be exactly represented, since they can't be written as a finite sum of powers of 2. This occurs with the numbers 0.1 and 0.001. If you repeatedly add or multiply one of these numbers you can see some unexpected behavior. Amro pointed out one example in his comment: 0.3 is not exactly equal to 3*0.1. This can also be illustrated by creating your look-up list of latencies in two different ways. You can use the normal colon syntax:
vec1 = 0.001:0.001:0.5;
Or you can use LINSPACE:
vec2 = linspace(0.001,0.5,500);
You'd think these two vectors would be equal to one another, but think again!:
>> isequal(vec1,vec2)
ans =
0 %# FALSE!
This is because the two methods create the vectors by performing successive additions or multiplications of 0.001 in different ways, giving ever so slightly different values for some entries in the vector. You can take a look at this technical solution for more details.
When comparing floating point numbers, you should therefore do your comparisons using some tolerance. For example, this finds the indices of entries in the look-up list that are within 0.0001 of your actual latency:
tolerance = 0.0001;
for i = 1:length(actual_latency)
row = find(abs(predicted_data(:,1) - actual_latency(i)) < tolerance);
...
The topic of floating point comparison is also covered in this related question.
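A short self-contained sketch of the effect and of two possible workarounds. The second one, comparing integer milliseconds, is an alternative not mentioned above; it relies on the latencies having at most 3 decimal places, as stated in the question (lookup and actual are stand-in names for the real data):

```matlab
% Exact comparison fails although both sides "look" like 0.3
a = 3*0.1;
exactMatch = (a == 0.3);              % false
tolMatch   = abs(a - 0.3) < 1e-4;     % true

% Stand-in look-up column and one actual latency
lookup = (0.001:0.001:0.5)';
actual = 0.345;

% Workaround 1: compare with a tolerance well below the 0.001 spacing
row = find(abs(lookup - actual) < 1e-4);

% Workaround 2: convert to integer milliseconds, then compare exactly
rowInt = find(round(lookup*1000) == round(actual*1000));
```

Both lookups return row 345 here. The tolerance only has to be small compared with the 0.001 spacing of the table, so that at most one row can match.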
You may try to do the following:
row = find(abs(predicted_data(:,1) - actual_latency(i)) < eps);
EPS is the floating-point relative accuracy: the distance from 1.0 to the next larger double-precision number.
Have you tried using a tolerance rather than == ?