I am trying to use best-practice techniques from the computational fluid dynamics (CFD) field to analyze and visualize a velocity field.
I am given six arrays describing moving particles' positions and velocities: x, y, z and vx, vy, vz respectively.
I want to visualize the induced velocity field and calculate its properties, such as curl, divergence, and isosurfaces.
Here is a modest script of the volume visualization functions I was able to use without calling meshgrid (to avoid interpolation and the extra noise it introduces).
Ultimately, there are two things I am not sure about: first, how to wisely create a mesh grid from my 50 points in space; second, how to use CFD approaches to visualize the velocity field despite the small number of data points.
close all
rng default
t=0.1:0.1:10;
x = sin(t)';
y = cos(t)';
z = (t.^0.2)'; % parentheses needed: transpose binds tighter than .^, so t.^0.2' would leave z a row vector
vx=y;vy=x;vz=z;
figure
subplot(2,3,1);
quiver3(x,y,z,vx,vy,vz);
hold on
streamribbon({ [x y z] }, {vx},{vy},{vz});
subplot(2,3,2);
[curl_val, cav] = curl([x,y,z],[vx,vy,vz]);
surfc([x,y,z],cav);
subplot(2,3,3);
surfc([x,y,z],curl_val);
w = sqrt( vx.^2 + vy.^2 + vz.^2 );
subplot(2,3,4);
quiver3(x,y,z,vx,vy,vz);
streamtube({ [x y z] }, {w});
subplot(2,3,5);
quiver3(x,y,z,vx,vy,vz);
subplot(2,3,6);
surfc([x,y,z],[vx,vy,vz]);
When I run the above script (excluding the data generation) on real data, I get plots which aren't very informative.
I strongly suspect that the problem here is with the data, not the visualization technique. But in general, the problem is one or more of the following:
1) You do not have enough data to capture the underlying dynamics (the dynamics in space operate at a higher spatial frequency than you sampled)
2) The data is too noisy for the number of data points you collected.
3) The flow is fundamentally turbulent, and hence hoping for a nice laminar-like plot is not going to happen.
When you have problems visualizing data, the first rule of thumb is always to throw away any visualization that attempts to approximate a derivative (or gradient) in any way. The reason is that when you try to approximate a derivative with real data, noise almost always makes that estimate nonsense. For example, let's suppose we have a cosine that gets corrupted by some noise, and we try to numerically estimate the derivative from the data
figure
% Create a signal
dt = .1;
t = 0:dt:10;
x = cos(t);
% Add some noise
y = x + .5 * randn(size(x));
% Compute the first order approximation of the derivatives of the signals
dx = diff(x)/dt;
dy = diff(y)/dt;
% Plot everything
subplot(2,1,1)
plot(t,x,t,y)
axis tight
subplot(2,1,2)
plot(t(2:end),dx,t(2:end),dy)
axis tight
In the first plot, which shows the raw data, the noise doesn't look too bad, but look at the derivative estimate! Ouch... The noise is really amplified. So forget about higher-order properties of a flow, such as the curl and vorticity, which require gradients of the data.
So what can we do in cases like these? Well essentially, just look at the raw data. If there is a pattern, it will reveal itself. For instance, let's look at your raw velocity vectors from 3 different perspectives:
data = dlmread('data.csv'); % let dlmread infer the delimiter ('\s' is not a valid delimiter argument)
x = data(:,1);
y = data(:,2);
z = data(:,3);
vx = data(:,4);
vy = data(:,5);
vz = data(:,6);
close all
figure
subplot(1,3,1);
quiver3(x,y,z,vx,vy,vz);
view([1,0,0])
subplot(1,3,2);
quiver3(x,y,z,vx,vy,vz);
view([0,1,0])
subplot(1,3,3);
quiver3(x,y,z,vx,vy,vz);
view([0,0,1])
The only thing that looks even slightly structured is that last plot. However, that plot tells us that we probably also have turbulence (in addition to noise) to contend with.
Specifically, from view 3, it definitely seems like you are taking velocity measurements in a flow that is tightly hugging an object. In that case, your measurements are probably taken too close to the object, most likely inside the boundary layer. If so, you can get time-varying effects in the flow, meaning that it doesn't make sense to look at anything without also having a time component. The "nice" plots that you have in your answer are only really helpful when the flow is laminar, where we get to see these nice, consistent streamlines. If it is turbulent, then there is no discernible pattern in the flow, no matter how hard you look.
So in conclusion, I don't think you will be able to find a nice visualization for your data because either the sensors you used were too noisy, or the flow was too turbulent.
As an aside... consider what happens when we look at the raw velocity vectors from your "nice" dataset:
That, my friend, is a well-trained house pet. You have a wild mountain lion on your hands.
I have calculated the power spectrum of a signal. The steps are:
1) FFT of the time signal
2) Square of the absolute value of the FFT, divided by the signal length, i.e. the power spectrum
Now I want to convert it back into the time domain. What steps should I follow?
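In code, those steps amount to something like this (an illustrative sketch, for a signal vector x):
Xf = fft(x); % step 1: FFT of the time signal
Px = abs(Xf).^2 / length(x); % step 2: |FFT|^2 / length, i.e. the power spectrum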
The reconstruction of the original signal from the frequency-domain requires both the magnitude and the phase information. So, as you compute the power spectrum and keep only the magnitude, you no longer have all the required information to uniquely reconstruct the original signal.
In other words, we can find examples where different signals have the exact same power spectrum. In that case retrieving which one of those different signals was the original one would thus not be possible.
As a simple illustration, let's say the original signal x is:
x = [0.862209 0.43418 0.216947544 0.14497645];
For sake of argument, let's consider some other signal y, which I've specially crafted for the purpose of this example as:
y = [-0.252234 -0.0835824 -0.826926341 -0.495571572];
As shown in the following plots, those two signals might appear completely unrelated:
They do however share the same power spectrum:
N = length(x); % both signals have the same length
f = (0:N-1)/N;
Xf = fft(x,N);
Yf = fft(y,N);
hold off; plot(f, Xf.*conj(Xf)/N, 'b');
hold on; plot(f, Yf.*conj(Yf)/N, 'r:');
xlabel('Normalized frequency');
legend('Px', 'Py')
title('Power spectrum');
As a result, someone who only sees the power spectrum and doesn't know that you started with x, could very well guess that you instead started with y.
That said, the fact that those signals have the same power spectrum could tell you that those signals aren't as unrelated as you might think. In fact those signals also share the same autocorrelation function in the time-domain:
Rx = xcorr(x);
Ry = xcorr(y);
t = [0:length(Rx)-1] - length(x) + 1;
hold off; stem(t, Rx, 'bo');
hold on; stem(t, Ry, 'rx');
legend('Rxx', 'Ryy');
xlabel('lag');
title('Autocorrelation');
This is to be expected since the autocorrelation can be obtained by computing the inverse transform (with ifft) of the power spectrum. This, however, is about as much as you can recover in the time domain. Any signal with this autocorrelation function would be as good a guess as any for the original signal. If you are very motivated, you could attempt to solve the set of non-linear equations that are obtained from the definition of the autocorrelation and obtain a list of possible signals. That would still not be sufficient to tell which one was the original, and as you noticed when comparing my example x and y, there wouldn't be a whole lot to make of it.
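You can verify the ifft relation numerically with a small sketch (reusing x and N from above): zero-padding the FFT to length 2N-1 makes the circular autocorrelation computed from the power spectrum match xcorr exactly, up to a reordering of the lags.
Xf2 = fft(x, 2*N-1); % zero-padded FFT
Rc = ifft(Xf2 .* conj(Xf2)); % inverse transform of the power spectrum
Rc = [Rc(N+1:end) Rc(1:N)]; % reorder the lags to -(N-1):(N-1)
max(abs(Rc - xcorr(x))) % ~1e-16, identical up to roundoff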
The easiest way to see the non-uniqueness of the power (or amplitude) spectrum for describing the time domain signal is that both white noise and the delta function in the time domain have the same power (or amplitude) spectrum - a constant - in the frequency domain.
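A quick sketch of that fact (note that the noise spectrum is flat only on average; a single realization scatters around the constant):
Nw = 256;
d = sqrt(Nw) * [1 zeros(1, Nw-1)]; % delta function, scaled to the noise's average power
n = randn(1, Nw); % white noise
Pd = abs(fft(d)).^2 / Nw; % exactly constant: 1 in every bin
Pn = abs(fft(n)).^2 / Nw; % constant in expectation only
plot(0:Nw-1, Pd, 'b', 0:Nw-1, Pn, 'r:')
legend('P_{delta}', 'P_{noise}')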
Right now I have a 3D scatter plot with peaks that I need to find the volumes for. My data is from an image, so the x- and y-values indicate the pixel positions on the xy-plane, and the z-value is the pixel value for each pixel.
Here's my scatter plot:
scatter3(x,y,z,20,z,'filled')
I am trying to find the "volume" of the peaks of the data, as drawn below:
I've tried findpeaks(), but it gives me many local maxima without the two prominent peaks that I'm looking for. In addition, I'm really stuck on how to establish the "base" of my peaks, because my data is from a scatter plot. I've also tried the convex hull and a linear surface fit, and got this:
But I'm still stuck on how to use any of these commands to establish an automated peak "base" and volume. Please let me know if you have any ideas or code segments to help me out, because I am stumped and I can't find anything on Stack Overflow. Sorry in advance if this is really unclear! Thank you so much!
Here is a suggestion for dealing with this problem:
1) Define a threshold for z height, or define in any other way which points from the scatter are relevant (the black plane in the leftmost figure below).
2) Within the resulting points, find clusters in the X-Y plane to define the different regions to calculate. You will have to define manually how many clusters you want.
3) For each cluster, perform a Delaunay triangulation to estimate its volume.
Here is an example code for all that:
[x,y,z] = peaks(30); % some data
subplot 131
scatter3(x(:),y(:),z(:),[],z(:),'filled')
title('The original data')
th = 2.5; % set a threshold for z values
hold on
surf([-3 -3 3 3],[-4 4 -4 4],ones(4)*th,'FaceColor','k',...
'FaceAlpha',0.5)
hold off
ind = z>th; % get an index of all values of interest
X = x(ind);
Y = y(ind);
Z = z(ind);
clustNum = 3; % the number of clusters must be defined manually
T = clusterdata([X Y],clustNum);
subplot 132
gscatter(X,Y,T)
title('A look from above')
subplot 133
hold on
c = 'rgb';
for k = 1:max(T)
valid = T==k;
% calculate a triangulation of the data:
DT = delaunayTriangulation([X(valid) Y(valid) Z(valid)]);
[K,v] = convexHull(DT); % get the convex hull indices
% plot the volume:
ts = trisurf(K,DT.Points(:,1),DT.Points(:,2),DT.Points(:,3),...
'FaceColor',c(k));
text(mean(X(valid)),mean(Y(valid)),max(Z(valid))*1.3,...
num2str(v),'FontSize',12)
end
hold off
view([-45 40])
title('The volumes')
Note: this code uses functions from several toolboxes. If something does not work, first make sure that you have the relevant toolbox; there are alternatives for most of them.
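For instance, clusterdata and gscatter require the Statistics Toolbox. As one hypothetical substitution (a sketch, assuming the clusters happen to be separable by a hand-picked boundary), a manual gate can stand in for clusterdata, and base-MATLAB convhull can return the hull volume directly:
T = 1 + (X > 0); % crude manual 'clustering' by an X threshold
[K, v] = convhull(X(T==1), Y(T==1), Z(T==1)); % facets K and enclosed volume v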
If you already have a mesh, maybe you could use the process described in https://se.mathworks.com/matlabcentral/answers/277512-how-to-find-peaks-in-3d-mesh .
If not, a linear regression in the (x,z) or (y,z) plane could give you a base from which to find the peaks.
In my experience with very noisy data, selecting the peaks manually is often faster if the data set is small. Just plot every peak with its number from findpeaks() and select the ones that are relevant to you, as sketched below. Interpolating to smoother data can help in the long term (but creates problems of its own).
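A minimal sketch of that manual workflow (on a synthetic noisy 1-D slice rather than your image data):
zrow = sin(0:0.1:10) + 0.2*randn(1,101); % hypothetical row of pixel values
[pk, loc] = findpeaks(zrow);
plot(zrow); hold on
plot(loc, pk, 'rv')
labels = arrayfun(@num2str, 1:numel(pk), 'UniformOutput', false);
text(loc, pk, labels, 'VerticalAlignment', 'bottom') % number each peak
hold off
keep = find(pk > 0.8); % stand-in for your manual choice of relevant peaks
pk(keep), loc(keep)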
Another option would be to search for peaks in the (x,z) and (y,z) planes, take the amplitude of each peak over an x (or y) interval, and from there integrate every area.
I have a list of data that I am trying to fit to a polynomial and I am trying to plot the 95% confidence bands for the parameters as well (in Matlab).
If my data are x and y
f=fit(x,y,'poly2')
plot(f,x,y)
ci=confint(f,0.95);
a_ci=ci(1,:);
b_ci=ci(2,:);
I do not know how to proceed after that to get the minimum and maximum band around my data. Does anyone know how to do that?
I can see that you have the curve fitting toolbox installed, which is good, because you need it for the following code to work.
Basic fit of example data
Let's define some example data and a possible fit function. (I could also have used poly2 here, but I wanted to keep it a bit more general.)
xdata = (0:0.1:1)'; % column vector!
noise = 0.1*randn(size(xdata));
ydata = xdata.^2 + noise;
f = fittype('a*x.^2 + b');
fit1 = fit(xdata, ydata, f, 'StartPoint', [1,1])
plot(fit1, xdata, ydata)
Side note: plot() is not our usual plot function, but a method of the cfit-object fit1.
Confidence intervals of the fitted parameters
Our fit uses the data to determine the coefficients a, b of the underlying model f(x) = a*x^2 + b. You already did this, but for completeness here is how you can read out the uncertainty of the coefficients for any confidence interval. The coefficients are alphabetically ordered, which is why I can use ci(1,:) for a, and so on.
names = coeffnames(fit1) % check the coefficient order!
ci = confint(fit1, 0.95); % 2 sigma interval
a_ci = ci(1,:)
b_ci = ci(2,:)
By default, Matlab uses 2σ (0.95) confidence intervals. Some people (physicists) prefer to quote the 1σ (0.68) intervals.
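Getting the 1σ flavor is the same call with a different level:
ci68 = confint(fit1, 0.68); % 1 sigma intervals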
Confidence and Prediction Bands
It's a good habit to plot confidence bands or prediction bands around the data – especially when the coefficients are correlated! But you should take a moment to think about which one of the two you want to plot:
Prediction band: If I take a new measurement value, where would I expect it to lie? In Matlab terms, this is called the “observation band”.
Confidence band: Where do I expect the true value to lie? In Matlab terms, this is called the “functional band”.
As with the coefficients' confidence intervals, Matlab uses 2σ bands by default, and the physicists among us switch this to 1σ intervals. By its nature, the prediction band is bigger, because it combines the error of the model (the confidence band!) with the error of the measurement.
There is another distinction to make, one that I don't fully understand. Both Matlab and Wikipedia make that distinction.
Pointwise: How big is the prediction/confidence band for a single measurement/true value? In virtually all cases I can think of, this is what you would want to ask as a physicist.
Simultaneous: How big do you have to make the prediction/confidence band if you want a set of all new measurements/all prediction points to lie within the band with a given confidence?
In my personal opinion, the “simultaneous band” is not a band! For a measurement with n points, it should be n individual error bars!
The prediction/confidence distinction and the pointwise/simultaneous distinction give you a total of four options for “the” band around the plot. Matlab makes the 2σ pointwise prediction band easily accessible, but what you seem to be interested in is the 2σ pointwise confidence band. It is a bit more cumbersome to plot, because you have to specify dummy data over which to evaluate the prediction band:
x_dummy = linspace(min(xdata), max(xdata), 100);
figure(1); clf(1);
hold all
plot(xdata,ydata,'.')
plot(fit1) % by default, evaluates the fit over the current XLim
% use "functional" (confidence!) band; use "simultaneous"=off
conf1 = predint(fit1,x_dummy,0.95,'functional','off');
plot(x_dummy, conf1, 'r--')
hold off
Note that the confidence band at x=0 equals the confidence interval of the fit-coefficient b!
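To see all four band types side by side, here is a sketch (it reuses fit1, xdata, ydata, and x_dummy from above):
figure(2); clf(2); hold all
plot(xdata, ydata, '.')
opts = {'observation','off'; 'observation','on'; 'functional','off'; 'functional','on'};
styles = {'r--','g--','b--','m--'};
for k = 1:4
    ci = predint(fit1, x_dummy, 0.95, opts{k,1}, opts{k,2});
    plot(x_dummy, ci, styles{k}) % two lines (lower/upper) per band type
end
hold off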
Extrapolation
If you want to extrapolate to x-values that are not covered by the range of your data, you can evaluate the fit and the prediction/confidence band for a bigger range:
x_range = [0, 2];
x_dummy = linspace(x_range(1), x_range(2), 100);
figure(1); clf(1);
hold all
plot(xdata,ydata,'.')
xlim(x_range)
plot(fit1)
conf1 = predint(fit1,x_dummy,0.68,'functional','off');
plot(x_dummy, conf1, 'r--')
hold off
I am trying to fit a line to some data without using polyfit and polyval. I got some good help already on how to implement this and I have gotten it to work with a simple sin function. However, when applied to the function I am trying to fit, it does not work. Here is my code:
clear all
clc
lb=0.001; %lowerbound of data
ub=10; %upperbound of data
step=.1; %step-size through data
a=.03;
la=1482/120000; %1482 is speed of sound in water and 120kHz
ep1=.02;
ep2=.1;
x=lb:step:ub;
r_sq_des=0.90; %desired value of r^2 for the fit of data without noise present
i=1;
for x=lb:step:ub
G(i,1)= abs(sin((a/la)*pi*x*(sqrt(1+(1/x)^2)-1)));
N(i,1)=2*rand()-1;
Ghat(i,1)=(1+ep1*N(i,1))*G(i,1)+ep2*N(i,1);
r(i,1)=x;
i=i+1;
end
x=r;
y=G;
V=[x.^0];
Vfit=[x.^0];
for i=1:1:1000
V = [x.^i V];
c = V \ y;
Vfit = [x.^i Vfit];
yFit=Vfit*c;
plot(x,y,'o',x,yFit,'--')
drawnow
pause
end
The first two sections are just defining variables and the function. The second for loop is where I am making the fit. As you can see, I have it pause after every nth order in order to see the fit.
I changed your fit formula a bit; I got the same answers, but quickly got a warning that the matrix was singular. There is no sense in continuing past the point where the inversion is singular. Depending on what you are doing, you can usually change out variables or change domains.
This doesn't do a lot better, but it seemed to help a little bit. I increased the number of samples by a factor of 10, since the initial part of the curve didn't look sampled highly enough. I added a weighting variable, but it is set to equal weight in the code below; attempts to deweight the tail didn't help as much as I hoped.
This is probably not really a solution, but perhaps it will help by giving you a few more knobs/variables.
...
step=.01; %step-size through data
...
x=r;
y=G;
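% Change of domain (per the note above): map x to t = log(x*sqrt(1+1/x^2)),
% i.e. t = log(sqrt(x^2+1)), and fit the polynomial in t instead of x.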
t=x.*sqrt(1+x.^(-2));
t=log(t);
V=[ t.^0];
w=ones(size(t));
for i=1:1:1000
% Trying to solve for value of c
% c that
% yhat = V*c approximates y
% or y = V*c
% V'*y = V'*V * c
% c = (V'*V) \ V'*y
V = [t.^i V];
c = (V'*diag(w.^2)*V ) \ (V'*diag(w.^2)*y) ;
yFit=V*c;
subplot(211)
plot(t,y,'o',t,yFit,'--')
subplot(212)
plot(x,y,'o',x,yFit,'--')
drawnow
pause
end
It looks like more of a frequency estimation problem, and trying to fit an unknown frequency with a polynomial tends to be touch and go. Replacing the polynomial basis with a quick sin/cos basis didn't seem to do too badly:
V = [sin(t*i) cos(t*i) V];
Unless you specifically need a polynomial basis, you can apply your knowledge of the problem domain to find other potential basis functions for your fit, or to attempt to make the domain in which you are performing the fit more linear.
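For completeness, here is a minimal sketch of that substitution (it reuses t, y, and w from the code above, with the same weighted solve):
V = ones(size(t));
for i = 1:20
    V = [sin(t*i) cos(t*i) V]; % Fourier-style basis in the transformed domain
end
c = (V'*diag(w.^2)*V) \ (V'*diag(w.^2)*y);
yFit = V*c;
plot(t, y, 'o', t, yFit, '--')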
As dennis mentioned, a different set of basis functions might do better. However, you can improve the polynomial fit with QR factorisation, rather than just \ to solve the matrix equation. It is a badly conditioned problem no matter what you do, however, and using smooth basis functions won't allow you to accurately reproduce the sharp corners in the actual function.
clear all
close all
clc
lb=0.001; %lowerbound of data
ub=10; %upperbound of data
step=.1; %step-size through data
a=.03;
la=1482/120000; %1482 is speed of sound in water and 120kHz
ep1=.02;
ep2=.1;
x=logspace(log10(lb),log10(ub),100)';
r_sq_des=0.90; %desired value of r^2 for the fit of data without noise present
y=abs(sin(a/la*pi*x.*(sqrt(1+(1./x).^2)-1)));
N=2*rand(size(x))-1;
Ghat=(1+ep1*N).*y+ep2*N;
V=[x.^0];
xs=(lb:.01:ub)';
Vfit=[xs.^0];
for i=1:1:20 % could run up to length(x)-1
V = [x.^i V];
Vfit = [xs.^i Vfit];
[Q,R]=qr(V,0);
c = R\(Q'*y);
yFit=Vfit*c;
plot(x,y,'o',xs,yFit)
axis([0 10 0 1])
drawnow
pause
end
I'm trying to visualize a large amount of data (currently around 1.2 million points), distributed among 6 different types, where each type is a different data series. I'm hoping that visualizing the data will yield some insight into the space that a particular type occupies. For example, if I wanted to determine the type of pitch based on the velocity and spin of a baseball, a curveball would be one type, a knuckleball would be another type, etc.
Due to the nature of the data, there are some areas where I have a lot of data and some areas where I have virtually no data. The result is that when I plot a specific type, it's easy to see the general space that the values should span, but they don't fill it because of the missing data points. In addition, there are areas with significant overplotting, to the point where it becomes difficult to view individual points.
What I am looking for is a way to populate the areas where I am missing data and a way to reduce the clutter for the areas where I have too much data. Any ideas? Thanks so much.
Below is some code so that you can get a better sense of what I'm talking about, i.e. rotate the plot to play around with it.
x = [0:.1:.3 .5:.01:.7 1.1:.3:1.7 2:.1:2.3 3:.01:3.3 4:.3:4.5];
y = (0:.1:10) .* randn(1,length(0:.1:10));
[x, y] = meshgrid(x,y);
x = reshape(x,[],1);
y = reshape(y,[],1);
z = (x .^ 2) .* y + randn(length(x),1) .* 5;
plot3(x,y,z,'.');
xlabel('X');
ylabel('Y');
zlabel('Z');
grid on