I want to create a time vector, from 1e-7 to 1e-5 with a higher resolution (smaller spacing) at the end.
The standard v = logspace(-7,-5) creates a vector with logarithmically increasing spacing. If I switch the order of the limits (logspace(-5,-7)) and use flip(v), the spacing stays the same; only the order of the numbers changes.
You would need to specify an additional parameter besides the limits and the number of values: the base of the logarithm. This is equivalent to choosing where you sample the values on the logarithmic curve.
This code generates a sequence of logarithmically decreasing values in between your two limits:
lims = [1e-7,1e-5];
N = 10;
e = 10; % we'll generate linear values from 1 to e
% Generate logarithmic sequence (we need to flip for decreasing intervals)
d = flip(exp(linspace(1, e, N)));
% Map the sequence to our limits
d = (d - d(1)) / (d(end) - d(1));
d = d * (lims(2) - lims(1)) + lims(1);
d is:
1.0e-05 *
0.0100 0.6359 0.8661 0.9508 0.9820 0.9935 0.9977 0.9992 0.9998 1.0000
You could mirror the vector onto the half plane x < 0 by multiplying by -1. Then the spacing is largest for the smallest numbers and decreases along the vector, but v lies in the interval -10^-5 to -10^-7.
Move v to the desired interval by adding 10^-5 + 10^-7.
Use the flip function so that v is ordered with the smallest element first and increasing.
v = logspace(-7,-5);
v = 1E-7+1E-5-v;
v = flip(v);
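A quick sanity check (my addition, not part of the original answer) confirms that the spacing now shrinks toward the 1e-5 end:
dv = diff(v);              % gaps between consecutive time points
assert(all(diff(dv) < 0))  % the gaps decrease monotonically toward the end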
Using the MATLAB Profiler I found that this line of code creates a large bottleneck and slows down my program. w, x, y, z are all 3D matrices with the same dimensions (A x B x C), where A, B, and C are all different. Is there any way to optimize this line of code to run faster?
dt = .5;
for t = 1: tstop
w(:,:,t+1)= sum( dt*(x(:,:,t:-1:1).*(y(:,:,1:t) - .002).*z(:,:,1:t)),3);
end
If you group some terms outside the for loop, you can get up to a 2x boost:
p = dt*(y - .002).*z;
for t = 1: tstop
w(:,:,t+1)= sum( x(:,:,t:-1:1).*p(:,:,1:t), 3);
end
It is now easier to notice that we are computing convolutions of x and p along the third dimension. If that dimension C (or tstop) is large, you can try to inline or optimize those convolutions.
I would reshape the 3D matrices into 2D ones, grouping the first 2 dimensions and keeping the time dimension as the second one. Then you can try to perform row-wise convolution with conv2 (if possible, as claimed in this answer), or fft. Find below a solution with fft (and zero-padding), assuming tstop = C:
X = reshape(x, [A*B, C]); % reshape to 2D
Y = reshape(y, [A*B, C]);
Z = reshape(z, [A*B, C]);
P = dt*(Y - .002).*Z; % grouped terms
z__ = zeros(A*B, C); % zero-padding
W = real(ifft(fft([z__, X]').*fft([z__, P]'))'); % column-wise fft
W = [zeros(A*B, 1), W(:, 1:C)]; % first half
w = reshape(W, [A, B, C+1]);
The results are the same, and depending on A, B, and C, this can give you a big performance boost. Example timings with A=13, B=14, C=1155:
original: 1.026312 seconds
grouping terms: 0.509862 seconds
FFT: 0.033699 seconds
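If you want to check the equivalence yourself, something along these lines should do it (my own sketch; here w_ref stands for the output of the original loop and w for the FFT version above):
% maximum absolute difference between the loop result and the FFT result;
% it should be at round-off level (a small multiple of eps times the data scale)
max(abs(w(:) - w_ref(:)))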
What I am currently doing is computing the Euclidean distance between all elements in a vector (the elements are pixel locations in a 2D image) to see if the elements are close to each other. I create a reference vector that takes on the value of each index within the vector incrementally. The Euclidean distance between the reference vector and all the elements in the pixel-location vector is computed using the MATLAB function pdist2, and the result is applied to some conditions; however, upon running the code, this function seems to take the longest to compute (for one run, the function was called 27,245 times and contributed about 54% of the overall program's run time). Is there a more efficient method to do this and speed up the program?
[~, n] = size(xArray); %xArray and yArray are same size
%Pair the x and y coordinates of the interest pixels
pairLocations = [xArray; yArray].';
%Preallocate cells with the max amount (# of interest pixels)
p = cell(1,n);
for i = 1:n
ref = [xArray(i), yArray(i)];
d = pdist2(ref,pairLocations,'euclidean');
d = d < dTh;
d = find(d==1);
[~,k] = size(d);
if (k >= num)
p{1,i} = d;
end
end
For squared Euclidean distance, there is a trick using matrix dot product:
||a-b||² = <a-b, a-b> = ||a||² - 2<a,b> + ||b||²
Let C = [xArray; yArray] be the 2×n matrix of all locations. Then
n2 = sum(C.^2); % sq norm of coordinates
D = bsxfun(@plus, n2, n2.') - 2 * C.' * C;
Now D(ii,jj) holds the square distance between point ii and point jj.
Should run quite quickly.
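As a sketch of how this could replace the pdist2 loop from the question (my own adaptation, reusing xArray, yArray, dTh and num from your code), compare the squared distances against dTh^2 so no square root is needed:
C = [xArray; yArray];                          % 2 x n matrix of pixel locations
n = size(C, 2);
n2 = sum(C.^2);                                % squared norms, 1 x n
D = bsxfun(@plus, n2, n2.') - 2 * (C.' * C);   % D(ii,jj) = squared distance
closeEnough = D < dTh^2;                       % threshold on the squared distance
counts = sum(closeEnough, 2);                  % neighbours within dTh of each point
p = cell(1, n);
for i = 1:n
    if counts(i) >= num
        p{1,i} = find(closeEnough(i,:));       % indices of the close pixels
    end
end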
I am using the approach from this Yale page on fractals:
http://classes.yale.edu/fractals/MultiFractals/Moments/TSMoments/TSMoments.html
which is also expounded on this set of lecture slides (slide 32):
http://multiscale.emsl.pnl.gov/docs/multifractal.pdf
The idea is that you take a dataset and examine it through many histograms with an increasing number of bars, i.e. resolution. Once the resolution is high enough, some bars take the value zero. At each of these stages, we take the number of results that fall into each histogram bin (neglecting any zero-valued bins), divide it by the total size of the dataset, raise this to a power q, and sum over all the bins. This sum gives the 'partition function' for a given moment and bin size. Quoting the above linked tutorial: "Provides a selective characterization of the nonhomogeneity of the measure, positive q’s accentuating the densest regions and negative q’s the smoothest regions."
So I'm using the histogram function in Matlab, looping over bin sizes, summing over all the non-zero bin contents, and so forth. But my output array of partition functions is just a bunch of 1s. I can't see what's going wrong, can anybody else?
Data for intel, cisco, apple and others is available on the same Yale website: yale.edu/fractals/MultiFractals/Finf(a)/Finf(a).html
N.B. intel refers to the intel stock price I was originally using as the dataset.
lower = 1; %set lowest level of histogram resolution (bin size)
upper = 300; %set highest level of histogram resolution (bin size)
qlow = -20; %set lowest moment exponent
qhigh = 20; %set highest moment exponent
qstep = 0.25; %set step size between moment exponents
qn= ((qhigh-qlow)/qstep) + 1; %calculates number of steps given qlow, qhigh, qstep
qvalues= linspace(qlow, qhigh, qn); %creates a vector of q values given above parameters
m = min(intel); %find the minimum of the dataset
M = max(intel); %find the maximum of the dataset
for Q = 1:length(qvalues) %loop over moment exponents q
for k = lower:upper %loop over bin sizes
counts = hist(intel, k); %unpack all k histogram height values into 'counts'
counts(counts==0) = []; %delete all zero values in 'counts'
Zq = counts ./ length(intel);
Zq = Zq .^ qvalues(Q);
Zq = sum(Zq);
partitions(k-(lower-1), Q) = Zq; %store Zq in the kth row and the Qth column of 'partitions'
end
end
Your code seems to be generally bug-free, but I made some changes, since you perform needless repetition in the loops: I replaced the outer loop over the moment exponents with a vectorized computation, since all moment calculations can be performed simultaneously for a given histogram (and it is building the histogram that takes longest).
intel = cumsum(randn(64,1)); % <-- mock random walk
Ni =length(intel);
%figure, plot(intel)
lower = 1; %set lowest level of histogram resolution (bin size)
upper = 300; %set highest level of histogram resolution (bin size)
qlow = -20; %set lowest moment exponent
qhigh = 20; %set highest moment exponent
qstep = 0.25; %set step size between moment exponents
qn= ((qhigh-qlow)/qstep) + 1; %calculates number of steps given qlow, qhigh, qstep
qvalues= linspace(qlow, qhigh, qn); %creates a vector of q values given above parameters
m = min(intel); %find the minimum of the dataset
M = max(intel); %find the maximum of the dataset
partitions = zeros(upper-lower+1,length(qvalues));
for k = lower:upper %loop over bin sizes
% (1) Select a bin size r and partition [m,M] into intervals of size r:
% [m, m+r), [m+r, m+2r), ..., [m+kr, M], where m+kr < M <= m+(k+1)r.
% Call these bins B0, ..., Bk.
edges = linspace(m,M,k+1);
edges(end)=Inf;
% (2) For each j, 0 <= j <= k, count the number of xi that lie in bin Bj. Call this number nj. Ignore all nj that equal 0 after all the xi have been counted.
counts = histc(intel, edges); %unpack all k histogram height values into 'counts'
counts(counts==0) = []; %delete all zero values in 'counts'
% (3) Now compute the qth moment, M_r^q = (n0/N)^q + ... + (nk/N)^q, where the sum is over all nonzero nj.
% Zq = counts/Ni;
partitions(k, :) = sum( (counts/Ni) .^ qvalues, 1); %sum over the bins: the Zq value for every q goes in row k of 'partitions'
end
figure
% call loglog before hold on, otherwise the axes keep their linear scaling
loglog(1./[1:k]', partitions(:,1),'g.-')
hold on
loglog(1./[1:k]', partitions(:,80),'b.-')
loglog(1./[1:k]', partitions(:,160),'r.-')
% (4) Perform linear regressions here to get alpha(r) ....
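For step (4), here is a rough sketch of that regression (my addition; one common convention is to estimate the mass exponent tau(q) as the slope of log Zq versus the log of the bin size):
% fit log(Zq) against log(bin size) for each q; the slope estimates tau(q)
binsizes = 1 ./ (1:size(partitions,1))';        % same x-axis as the plots above
tau = zeros(1, length(qvalues));
for Q = 1:length(qvalues)
    coeffs = polyfit(log(binsizes), log(partitions(:, Q)), 1);
    tau(Q) = coeffs(1);                         % slope of the log-log fit
end
figure, plot(qvalues, tau, '.-'), xlabel('q'), ylabel('\tau(q)')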
Hi,
I would like to create a correlation matrix between the two data sets presented above that ignores any appearance of zeros (the green color in the picture above). Does anyone know the most efficient way to do this that will produce a smooth result?
Is there any correlation method that can identify the similarity point by point, so that the results have the "shape" of the original matrix?
Thank you.
Note: I do not have the MATLAB Statistics Toolbox.
2. Is there any correlation method that can identify the similarity point by point, so that the results have the "shape" of the original matrix?
Let's start with your second point because it is clearer what you want there. You want to do a point-by-point comparison of two images, say, A and B. This boils down to measuring the similarity of two scalars a and b. Let's assume that these scalars are from the interval [0, Q], where Q depends on your image format (Q == 1 and Q == 255 are common in Matlab).
Now, the simplest measure of distance is the difference d = |a - b|. You might want to normalize this to [0, 1] and also invert the values to measure similarity instead of distance. In Matlab:
S = 1 - abs(A - B) / Q;
You mentioned ignoring the zeros in the images. Well, you need to define what similarity measure you expect for a zero. One possibility is to set the similarity to zero whenever one pixel is zero:
S(A == 0 | B == 0) = 0;
You could also say that the similarity there is undefined and set the similarity to NaN:
S(A == 0 | B == 0) = nan;
Of course, you can also say that the mismatch between 10 and 11 is as bad as the mismatch between 100 and 110. In this case, you could take the distance relative to the sum a + b (known as the Bray-Curtis normalization or normalized Euclidean metric):
D = abs(A - B) ./ (A + B);
S = 1 - D / max(D(:));
You run into problems if both matrices have a zero-valued pixel at the same location. Again, there are several possibilities: you can augment the sum with a small positive value alpha (e.g. alpha = 1e-6), which prevents a division by zero: D = abs(A - B) ./ (alpha + A + B).
Another option is to ignore the infinite values in D and add your 'zero-processing' here, i.e.
D = abs(A - B) ./ (A + B);
D(A == 0 | B == 0) = nan;
S = 1 - D / max(D(:));
You see, there are plenty of possibilities.
1. I would like to create a correlation matrix [...]
You should definitely think more about this point and come up with a better description of what you want to compute. If your matrices are of size m x m, you have m^2 variables. From these you could compute a correlation matrix of size m^2 x m^2, which measures the correlation of every pixel with every other pixel. This matrix will also have its largest values on the diagonal (every pixel correlates perfectly with itself). However, I would not suggest computing the correlation matrix if you only have two realisations.
Another option is to measure the similarity of rows or columns in the two images. Then you end up with a vector 1 x m of correlation coefficients.
However, I do not know how to compute a correlation matrix of size m x m from two inputs of size m x m that has its largest values on the diagonal.
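If the row-or-column route sounds useful, here is a minimal sketch (my own) of column-wise correlation coefficients between A and B; it only needs base MATLAB, since you mentioned not having the Statistics Toolbox:
Ac = bsxfun(@minus, A, mean(A, 1));            % centre each column of A
Bc = bsxfun(@minus, B, mean(B, 1));            % centre each column of B
r  = sum(Ac .* Bc, 1) ./ ...
     sqrt(sum(Ac.^2, 1) .* sum(Bc.^2, 1));     % 1 x m vector of Pearson coefficients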
To just get a general correlation coefficient I'd use corr2. From the docs:
r = corr2(A,B)
Returns the correlation coefficient r between A and B, where A and B are matrices or vectors of the same size. r is a scalar double.
Roughly, I believe it's just calculating corr(A(:), B(:)).
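If you also want that single coefficient to ignore the zero pixels, as asked in the question, a minimal sketch (my addition) using corrcoef, which ships with base MATLAB, would be:
mask = (A ~= 0) & (B ~= 0);        % keep only locations where both images are nonzero
R = corrcoef(A(mask), B(mask));    % 2 x 2 correlation matrix
r = R(1, 2);                       % the A-vs-B correlation coefficient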
I have a function that generates a matrix of normally distributed random numbers using normrnd.
values(vvvv)= normrnd(0,0.2);
The output from round 1 is:
ans =
0.0210 0.1445 0.5171 -0.1334 0.0375 -0.0165 Inf -0.3866 -0.0878 -0.3589
The output from round 2 is:
ans =
0.0667 0.0783 0.0903 -0.0261 0.0367 -0.0952 0.1724 -0.2723 Inf Inf
The output from round 3 is:
ans =
0.4047 -0.4517 0.4459 0.0675 0.2000 -0.3328 -0.1180 -0.0556 0.0845 Inf
The function will be repeated 20 times.
Obviously the output is completely random. What I want is to add a condition.
What I need is: if any entry has a value between 0.2 and 0.3, that value will be fixed in the next rounds; only the remaining entries will be subject to change by the random draw.
I have found the rng(sd) which seeds the random number generator using the nonnegative integer sd so that rand, randi, and randn produce a predictable sequence of numbers.
How to set custom seed for pseudo-random number generator
but how do I make only certain entries of the matrix affected?
Another problem: rng does not seem to be available in MATLAB R2009.
How can I get something similar without getting into the complications of probability and statistics?
You can do this more directly than actually generating all these matrices, and it's pretty easy to do so, by thinking about the distribution of the final output.
The probability of a random variable distributed as N(0, 0.2) (mean 0, standard deviation 0.2, as in normrnd(0, 0.2)) lying between .2 and .3 is p ~= .092.
Call X the random variable giving the final value of a matrix entry after you do this n (= 20) times. Then either (a) X lies between .2 and .3 and you stopped changing that entry early, or (b) you didn't draw a number between .2 and .3 in the first n-1 draws and so you went with whatever you got on the nth draw.
The probability of (b) happening is just b = (1-p)^(n-1): the independent events of drawing outside [.2, .3], which each have probability 1-p, happened n-1 times. Therefore the probability of (a) is 1-b.
If (b) happened, you just draw a number from normrnd. If (a) happened, you need the value of a normal variable, conditional on its being between .2 and .3. One way to do this is to find the cdf values for .2 and .3, draw uniformly from the range between there, and then use the inverse cdf to get back the original number.
Code that does this:
mu = 0;
sigma = .2;
upper = .3;
lower = .2;
n = 20;
sz = 15;
cdf_upper = normcdf(upper, mu, sigma);
cdf_lower = normcdf(lower, mu, sigma);
p = cdf_upper - cdf_lower;
b = (1-p) ^ (n - 1);
results = zeros(sz, sz);
mask = rand(sz, sz) > b; % mask value 1 means case (a), 0 means case (b)
num_a = sum(mask(:));
cdf_vals = rand(num_a, 1) * p + cdf_lower;
results(mask) = norminv(cdf_vals, mu, sigma);
results(~mask) = normrnd(mu, sigma, sz^2 - num_a, 1);
If you want to simulate this directly for some reason (which is going to involve a lot of wasted effort, but apparently you don't like "the complications of statistics" -- by the way, this is probability, not statistics), you can generate the first matrix and then replace only the elements that don't fall in your desired range. For example:
mu = 0;
sigma = .2;
n = 10;
m = 10;
num_runs = 20;
lower = .2;
upper = .3;
result = normrnd(mu, sigma, n, m);
for i = 1 : (num_runs - 1)
to_replace = (result < lower) | (result > upper);
result(to_replace) = normrnd(mu, sigma, sum(to_replace(:)), 1);
end
To demonstrate that these are the same, here's a plot of the empirical CDFs from doing this for 1x1 matrices 100,000 times. (That is, I ran both functions 100k times and saved the results, then used cdfplot to plot values on the x axis vs the portion of the obtained values that are less than that value on the y axis.)
They're identical. (Indeed, a K-S test for identity of distribution gives a p-value of .71.) But the first, direct approach was a bunch faster to run.
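For reference, here is a rough sketch of that comparison (my reconstruction of what is described above, for 1x1 matrices; the variable names are mine):
mu = 0; sigma = .2; lower = .2; upper = .3; n_rounds = 20;
cdf_lower = normcdf(lower, mu, sigma);
p = normcdf(upper, mu, sigma) - cdf_lower;
b = (1 - p) ^ (n_rounds - 1);
num_trials = 1e5;
direct = zeros(num_trials, 1);
closed = zeros(num_trials, 1);
for trial = 1:num_trials
    % simulation: keep redrawing unless the value already lies in [lower, upper]
    x = normrnd(mu, sigma);
    for i = 1:(n_rounds - 1)
        if x < lower || x > upper
            x = normrnd(mu, sigma);
        end
    end
    direct(trial) = x;
    % closed form: truncated normal with probability 1-b, plain draw otherwise
    if rand > b
        closed(trial) = norminv(rand * p + cdf_lower, mu, sigma);
    else
        closed(trial) = normrnd(mu, sigma);
    end
end
cdfplot(direct), hold on, cdfplot(closed), legend('simulation', 'closed form')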