Matlab: Find the result with accuracy to a certain decimal place with minimum iterations

I'm using a numerical integration method to approximate an integral. I need to use a minimum number of iterations that give an answer correct to 5 decimal places.
I cannot seem to find a generalised way of doing this. I've only been able to get it working for some cases.
Here is what I've tried:
% num2str(x,7) to truncate the value of x
xStr = num2str(x(n),7);
x5dp(n) = str2double(xStr);   % convert back to a truncated number
% find the difference between the values
if n > 1   % cannot index n-1 = 0
    check = x5dp(n) - x5dp(n-1);
end
This will find the first instance at which the first 5 decimal places are the same, but it doesn't take into account that changes might still occur beyond that point. That has happened: the iteration I was looking for was about 450, but it stopped at 178 because of this.
Then I tried this:
while err > errLim && n < 1000
    ...
    r = fix(x(j)*1e6)/1e1;    % to move the 6th dp to the 1st place
    x6dp = rem(r,1)*10;       % to 'isolate' the value of the 6th dp
    err = abs(x(j)-x(j-1));   % calculate the difference between the two
    % if the 6th decimal place is 5 or greater and err < 1e-6,
    % then the 6th decimal place won't change the value of the 5th any more
    if err < errLim && x6dp < 5
        err = 1;
    end
    ...
end
This works for one method and function I tested it on. However, when I pasted it into another method for another function, the iteration ends before the final result is achieved. The final 4 results are:
4.39045203734423 4.39045305948901 4.39045406364900 4.39045505024365
However, the answer I need is actually 4.39053, so this has stopped the iteration about 300 steps too early.
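For reference, here is a minimal sketch of a more conservative stopping test along the lines described above (an editorial addition, not the asker's code, and a heuristic rather than a guarantee of 5-decimal accuracy): keep iterating until the value rounded to 5 decimal places has stayed unchanged for several consecutive iterations. nextIterate and nStable are placeholders, and x(1) is assumed to hold the first iterate.
nStable = 10;        % assumed number of consecutive unchanged iterations required
stable  = 0;
n = 1;
while stable < nStable && n < 1000
    n = n + 1;
    x(n) = nextIterate(x(n-1));               % placeholder for the method's update step
    if round(x(n)*1e5) == round(x(n-1)*1e5)   % same first 5 decimal places
        stable = stable + 1;
    else
        stable = 0;
    end
end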

Related

Index when mean is constant

I am relatively new to MATLAB. I computed the consecutive (cumulative) mean of a set of 1e6 random numbers generated with a given mean and standard deviation. Initially the calculated mean fluctuates and then converges to a certain value.
I would like to know the index (e.g. the 100th position) at which the mean converges. I have no idea how to do that.
I tried using a logical operator, but I have to go through 1e6 data points, and even with that I still can't find the index.
Y_c= sigma_c * randn(n_r, 1) + mu_c; %Random number creation
Y_f=sigma_f * randn(n_r, 1) + mu_f;%Random number creation
P_u=gamma*(B*B)/2.*N_gamma+q*B.*N_q + Y_c*B.*N_c; %Calculation of Ultimate load
prog_mu=cumsum(P_u)./cumsum(ones(size(P_u))); %Progressive Cumulative Mean of system response
logical(diff(prog_mu==0)); %Find index
I suspect the issue is that the mean will never truly be constant, but will rather fluctuate around the "true mean". As such, you'll most likely never encounter a situation where the two consecutive values of the cumulative mean are identical. What you should do is determine some threshold value, below which you consider fluctuations in the mean to be approximately equal to zero, and compare the difference of the cumulative mean to that value. For instance:
epsilon = 0.01;
const_ind = find(abs(diff(prog_mu))<epsilon,1,'first');
where epsilon will be the threshold value you choose. The find command will return the index at which the variation in the cumulative mean first drops below this threshold value.
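If it helps, here is a hypothetical follow-up line (not part of the original answer) showing how to read off the mean at that point; the +1 accounts for diff shortening the vector by one element:
mu_at_conv = prog_mu(const_ind + 1);   % cumulative mean once the variation first settles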
EDIT: As was pointed out, this method may fail if the first few random numbers happen to differ by less than epsilon even though the mean has not yet converged. I would like to suggest a different approach, then.
We calculate the cumulative means, as before, like so:
prog_mu=cumsum(P_u)./cumsum(ones(size(P_u)));
We also calculate the difference in these cumulative means, as before:
df_prog_mu = diff(prog_mu);
Now, to ensure that convergence has been achieved, we find the first index at which the difference in the cumulative mean is below the threshold value epsilon and all subsequent differences are also below the threshold. To phrase this another way, we want the index just after the last position in the array where the difference is above the threshold:
conv_index = find(abs(df_prog_mu) >= epsilon, 1, 'last') + 1;
In doing so, we guarantee that the variation at that index, and at all subsequent indices, is below your predetermined threshold value.
I wouldn't imagine that the mean would suddenly become constant at a single index. Wouldn't it asymptotically approach a constant value? I would recommend a for loop to calculate the consecutive mean (it sounds like maybe you've already done this part?) like this:
avg = [];
for k = 1:length(x)
    avg(k) = mean(x(1:k));
end
Then plot the consecutive mean:
plot(avg)
hold on % this will allow us to plot more data on the same figure later
If you're trying to find the point at which the consecutive mean comes within a certain range of the true mean, try this:
Tavg = 5; % or whatever your true mean is
err = 0.01; % the range you want the consecutive mean to reach before we say that it "became constant"
inRange = avg>(Tavg-err) & avg<(Tavg+err); % gives you a binary logical array telling you which values fell within the range
q = 1000; % set this as high as you can while still getting a value for constIndex
constIndex = [];
for k = 1:length(inRange)-q
    if inRange(k) && sum(inRange(k:k+q)) == q+1   % this value and the next q are all in range
        constIndex = k;
        break   % keep the first index at which the mean stays in range
    end
end
The other answer takes a similar approach, but makes the unsafe assumption that the first value to fall within the range is the point where the function starts to converge. Any value could randomly fall within that range; we need to make sure that the following values also fall within it. In the above code, you can adjust "q" and "err" to tune your result. I would recommend double checking it by plotting:
plot(constIndex, avg(constIndex), '*')
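For what it's worth, here is a vectorized sketch of the same windowed check (an addition, not part of the original answer), assuming movsum is available (R2016a or later):
windowHits = movsum(double(inRange), [0 q]);        % in-range count over inRange(k:k+q)
constIndex = find(windowHits == q+1, 1, 'first');   % first index where all q+1 values are in range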

'Find' function working incorrectly, have tried floating point accuracy resolution

I have vertically concatenated files from my directory into a matrix that is about 60000 x 15 in size (verified).
d = dir('*.log');
n = length(d);
data = cell(1,n);
for k = 1:n
    data{k} = importdata(d(k).name);
end
total = [];
for k = 1:n
    total = [total; data{k}];
end
I am using the following 32-iteration loop and the find function to locate row numbers where the final column is an integer corresponding to the integer iteration of the loop:
for i = 1:32
    v = [];
    vn = [];
    [v,vn] = find(abs(fix(i)-fix(total)) < eps);
    g = length(v)
end
I have tried to account for floating-point accuracy by using fix on the values of i and on the values from the matrix total, taking their absolute difference and checking that it is less than a tolerance of eps (the floating-point relative accuracy); I have also tried tolerances up to 0.99.
The find function is not working correctly. It only works for certain integers (although it should be locating all of them, 1-32), and for the integers it does find, the set of matches is incomplete.
What is the problem here? If find is inadequate for this purpose, what is a suitable alternative?
You are getting a lot of zeros because you are looking not just at the 15th column of data but at the entire data matrix, so you are going to have a lot of non-integers.
Also, you're using fix on both numbers; since floating-point error can leave a value slightly above or below the desired integer, the values that are slightly below get truncated down to one integer lower than you'd expect. You should use round to round to the nearest integer instead.
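For example (a hypothetical one-liner using the asker's variable names, not part of the original answer):
lastCol = round(total(:,end));   % snap near-integer labels to exact integers
rows_i = find(lastCol == i);     % rows whose final column equals the loop index i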
Rather than using find to do this, I would use simple boolean logic to determine the value of the last column
for k = 1:32
    % Compare column 15 to the current index
    matches = abs(total(:,end) - k) < eps;
    % Do stuff with these matches
    g = sum(matches); % Count the matches
end
Depending on what you want to actually do with the data, you may be able to use the last column as an input to accumarray to perform an operation on each group.
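For instance, here is a sketch of that accumarray idea (an illustration, not part of the original answer), assuming the final column holds integer group labels 1 through 32; the per-group mean of the first column is just an example statistic:
labels      = round(total(:,end));                        % group label for each row
groupCounts = accumarray(labels, 1);                      % same counts as g above, for all groups at once
groupMeans  = accumarray(labels, total(:,1), [], @mean);  % example per-group operation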
As a side note, you can replace the first chunk of code with
d = dir('*.log');
data = cellfun(@importdata, {d.name}, 'UniformOutput', false);
total = cat(1, data{:});

(MATLAB) On drawing elements randomly from vectors and rows randomly from matrices without replacement in for loops

So, two related questions for code I'm writing for an experiment I'm building in MATLAB. Firstly, I have a 20 row column vector, called block1 (with a number in each of the 20 slots).
I want to draw a number randomly from the vector without replacement and repeat this until there are no numbers left in the vector (so 20 times total). I have the following line within a for loop (which is set to run for 20 iterations), which draws the random number from the vector and sets it equal to variable rand_num:
rand_num = randsample(block1,1,false)
It draws the random number just fine, but the problem is that the draw without replacement part doesn't work. Initially I thought this was because the vector was reset at the beginning of each iteration of the for loop. To make sure, I debugged and found that even when I stop the for loop before the first iteration finishes (and after the random number has been drawn), there are still 20 numbers in the vector (when there should be 19). Any ideas as to how I can fix this?
The second question is as follows. For the same experiment, I have a 20 by 6 matrix called pick_matrix. During each iteration of the same for loop as above, I want to pick one row at random from the matrix (including every element in that row; so 6 elements total for each row) without replacement and assign that row to variable random_row.
I figured out the code to pick a row at random and assign it to random_row, but I'm having the same problem as with the last question: the without replacement part isn't working.
randsample works fine, and exactly as expected. You just need to call it once before your loop rather than twenty times within the loop. Read the help carefully - randsample does not change anything about the input variable block1. Generally speaking (ignoring globals and such), input variables are not changed by Matlab functions, even if they are changed within a function, unless they are then explicitly output.
e.g. consider this:
function y = myfunc(x)
    x = x*2;
    y = x*3;
    disp(x)
end
Within the function, x is doubled. Say I have a variable x in my workspace with value 10. If I call y = myfunc(2), I will see "4" displayed, and y will be output as 12, not 6. However, the x in my base workspace is never changed, which is exactly the expected behaviour.
If I want x to be changed, I have to set it explicitly as an output of the function and deliberately call the function in such a way as to overwrite my variable x, e.g. [x, y] = myfunc(2) (after adding x as an output of the function).
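For example, a sketch of that modified function (renamed myfunc2 here purely for illustration):
function [x, y] = myfunc2(x)   % x is now an explicit output as well
    x = x*2;
    y = x*3;
end
Calling [x, y] = myfunc2(x) then overwrites x in the calling workspace.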
As suggested in the comments, you could use randperm instead if you're taking every number, but here is a more generic example using randsample to take a random pick of 10 from 20:
% sampling without replacement is default
rand_nums = randsample(block1, 10);
rand_ind = randperm(20,10); % 10 numbers from 1:20
for n = 1:10
    rand_num = rand_nums(n);
    rand_row = pick_matrix(rand_ind(n),:);
    % whatever you need to do with these things goes here
end

How to calculate the "rest value" of a plot?

Didn't know how to paraphrase the question well.
Function for example:
Data: https://www.dropbox.com/s/wr61qyhhf6ujvny/data.mat?dl=0
In this case how do I calculate that the rest point of this function is ~1? I have access to the vector that makes the plot.
I guess the mean is an approximation but in some cases it can be pretty bad.
Under the assumption that the "rest" point is the steady-state value in your data, and that the steady-state value occurs the majority of the time in your data, you can simply bin all of the points and use each unique value as a separate bin. The bin with the highest count should correspond to the steady-state value.
You can do this by a combination of histc and unique. Assuming your data is stored in y, do this:
%// Find all unique values in your data
bins = unique(y);
%// Find the total number of occurrences per unique value
counts = histc(y, bins);
%// Figure out which bin has the largest count
[~,max_bin] = max(counts);
%// Figure out the corresponding y value
ss_value = bins(max_bin);
ss_value contains the steady-state value of your data, corresponding to the most occurring output point with the assumptions I laid out above.
A minor caveat with the above approach is that it is not friendly to floating-point data, where values that are effectively the same can still differ beyond the first few significant digits, so unique treats them as distinct.
Here's an example of your data from point 2300 to 2320:
>> format long g;
>> y(2300:2320)
ans =
0.99995724232555
0.999957488454868
0.999957733165346
0.999957976465197
0.999958218362579
0.999958458865564
0.999958697982251
0.999958935720613
0.999959172088623
0.999959407094224
0.999959640745246
0.999959873049548
0.999960104014889
0.999960333649014
0.999960561959611
0.999960788954326
0.99996101464076
0.999961239026462
0.999961462118947
0.999961683925704
0.999961904454139
Therefore, what I'd recommend is to perhaps round so that the first 5 or so significant digits are maintained.
You can do this to your dataset before you continue:
num_digits = 5;
y_round = round(y*(10^num_digits))/(10^num_digits);
This will first multiply by 10^n, where n is the number of digits you desire, so that the decimal point is shifted over by n positions. We round this result, then divide by 10^n to bring it back to the scale it was at before. If you do this, points that agree to the first n decimal places (like the run of 0.9999... values above) all get rounded to the same value, which helps the binning above.
However, more recent versions of MATLAB have this functionality built into round, and you can just do this:
num_digits = 5;
y_round = round(y,num_digits);
Minor Note
More recent versions of MATLAB discourage the use of histc and recommend histcounts instead. The calling pattern is similar, but histcounts expects bin edges rather than exact bin values, so the handling of the last bin differs slightly; keep that in mind if you swap it in.
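If you would rather sidestep the bin-edge question entirely, here is a small sketch (an addition, not part of the original answer) that does the same counting using the third output of unique:
[bins, ~, idx] = unique(y_round);   % idx maps every sample to its unique value
counts = accumarray(idx, 1);        % occurrences of each unique value
[~, max_bin] = max(counts);
ss_value = bins(max_bin);           % most frequent value = steady-state estimate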
Using the same logic, you could also use the median. If the majority of the data fluctuates around 1, then the median has a high probability of picking out the steady-state value, so try this too:
ss_value = median(y_round);

Recording 'bursts' of samples at 300 samples per second

I am recording voltage changes over a small circuit; this records mouse feeding. When the mouse is eating, the circuit voltage changes, I convert that into ones and zeroes, and all is well.
BUT I want to calculate the number and duration of 'bursts' of feeding, that is, instances of circuit closing that occur within 250 ms (75 samples) of one another. If the gap between closings is larger than 250 ms I want to count it as a new 'burst'.
I guess I am looking for help in asking MATLAB to compare the sample number of each 1 in the digital file with the sample number of the next 1 down. If the difference is more than 75, call the first 1 the end of one bout and the second 1 the start of another bout, classifying the difference as a gap; if it is NOT, keep the sample number of the first 1 and compare it against the next, and the next, and so on until there is a 75-sample difference.
I can compare each 1 to the next 1 down:
n = 1; m = 2;
for i = 1:length(bouts4)-1
    if bouts4(i+1) - bouts4(i) >= 75   % 250 ms gap at a sample rate of 300
        boutend4(n) = bouts4(i);
        boutstart4(m) = bouts4(i+1);
        m = m+1;
        n = n+1;
    end
end
I don't really want to iterate through i for both variables though...
any ideas??
-DB
You can try the following code
time_diff = diff(bouts4);
new_feeding = time_diff > 75;
boutend4 = bouts4(new_feeding);
boutstart4 = [0; bouts4(find(new_feeding) + 1)];
That's actually not too bad. We can make this completely vectorized. First, let's start with two signals:
A version of your bouts4 vector, untouched
A version of your bouts4 vector shifted in time by 1 step (i.e. it starts at time index 2).
Now the basic algorithm is really:
Go through each element and see if the difference is above a threshold (in your case 75).
Enumerate the locations of each one in separate arrays
Now onto the code!
%// Make those signals
bouts4a = bouts4(1:end-1);
bouts4b = bouts4(2:end);
%// Ensure column vectors - you'll see why soon
bouts4 = bouts4(:);
bouts4a = bouts4a(:);
bouts4b = bouts4b(:);
%// Step #1
loc = find(bouts4b - bouts4a >= 75);
%// Step #2
boutend4 = [bouts4(loc); 0];
boutstart4 = [0; bouts4(loc + 1)];
Aside:
Thanks to tail.b.lo, you can also use diff. It performs the same difference operation as the explicit vector copying I did above. However, I decided not to use it so you can see exactly how the code you wrote translates into a vectorized form. Only way to learn, right?
Back to it!
Let's step through this slowly. The first two lines of code make those signals I was talking about: an original one (up to length(bouts4) - 1) and another one of the same length but shifted over by one time index. Next, we use find to locate the places where the gap was >= 75 samples. After that, we use these locations to access the bouts4 array. The ending array accesses the original array at those locations, while the starting array accesses the same locations shifted over by one time index.
The reason we need these to be column vectors is the way I am appending information to the starting vector. I am not sure whether your data comes in rows or columns, so to make this completely independent of orientation, I make sure the data is in columns. This matters because of how the 0 is appended: for a row vector I would have to use a space (or comma) to move to the next column, whereas for a column vector I use a semicolon to move to the next row. To avoid having to check whether it's a row or a column vector, I force everything to be a column vector no matter what.
Looking at your code, m starts at 2, which means that when you start writing into boutstart4, its first element is left as 0. As such, I've artificially placed a 0 at the beginning of this array and followed it with the rest of the values.
Hope this helps!
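As a possible follow-on (not part of either answer), assuming bouts4 holds the sample indices of the detected 1s, the burst count and durations fall out of the same gap locations:
bouts4 = bouts4(:);                               % column vector, as above
gapIdx      = find(diff(bouts4) >= 75);           % gaps of 250 ms or more
burstStarts = [bouts4(1); bouts4(gapIdx + 1)];    % first closing of each burst
burstEnds   = [bouts4(gapIdx); bouts4(end)];      % last closing of each burst
nBursts     = numel(burstStarts);                 % number of bursts
burstDurSec = (burstEnds - burstStarts) / 300;    % duration in seconds at 300 samples/sec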