Calculating the mean across raster layers - average

I am trying to average my precipitation data, which is an object in the form of a raster brick (the object is called "Prec"). "Prec" has 95 layers. However, the idea would be to average just the first 20 layers. For example, the first grid cell of the first layer would be averaged with the first grid cell of the second layer, and then the third layer, the fourth layer.....all the way to the 20th layer. This would be done for all grid cells (8192 cells per layer) for each layer. This is what object "Prec" looks like:
Prec
class : RasterBrick
dimensions : 64, 128, 8192, 95 (nrow, ncol, ncell, nlayers)
resolution : 2.8125, 2.789327 (x, y)
extent : -181.4062, 178.5938, -89.25846, 89.25846 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : C:/Users/Rain/Documents/My documents/All netCDF files/netcdffiles/MaxPrecIPSLIPSL-
CM5B-LRrcp85.nc
names : X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15, ...
z-value : 1, 95 (min, max)
varname : onedaymax
I tried the following:
ncfname <- "MaxPrecIPSLIPSL-CM5B-LRrcp85.nc"
Prec <- brick(ncfname,var="onedaymax")
sm <- mean(Prec[1:20], na.rm=TRUE) #Attempting to calculate the mean by isolating first 20 layers
(example: Grid cell #1 of layer #1 is averaged with Grid cell #1 of layer #2....all the way to layer
#20
sm
[1] 20.8997
While this goes through, it only returns one value, as opposed to a single layer of 8192 average values from the 20 layers. Why would this be happening?
Any assistance would be greatly appreciated!

Here is a minimal, self-contained, reproducible example:
library(raster)
b <- brick(system.file("external/rlogo.grd", package="raster"))
To get the mean for all (in this case 3) layers
m <- mean(b)
To get the mean for the first two layers
m <- mean(b[[1:2]])
Note that the double brackets [[ are used to subset layers. The single brackets [ are to extract values. Thus b[1:2] returns the values for the first two cells. See ?raster::subset

Related

N-dimensional GP Regression

I'm trying to use GPflow for a multidimensional regression. But I'm confused by the shapes of the mean and variance.
For example: A 2-dimensional input space X of shape (20,20) is supposed to be predicted. My training samples are of shape (8,2) which means 8 training samples overall for the two dimensions. The y-values are of shape (8,1) which of course means one value of the ground truth per combination of the 2 input dimensions.
If I now use model.predict_y(X) I would expect to receive a mean of shape (20,20) but obtain a shape of (20,1). Same goes for the variance. I think that this problem comes from the shape of the y-values but I have have no idea how to fix it.
bound = 3
num = 20
X = np.random.uniform(-bound, bound, (num,num))
print(X_sample.shape) # (8,2)
print(Y_sample.shape) # (8,1)
k = gpflow.kernels.RBF(input_dim=2)
m = gpflow.models.GPR(X_sample, Y_sample, kern=k)
m.likelihood.variance = sigma_n
m.compile()
gpflow.train.ScipyOptimizer().minimize(m)
mean, var = m.predict_y(X)
print(mean.shape) # (20, 1)
print(var.shape) # (20, 1)
It sounds like you may be confused between the shape of a grid of input positions and the shape of the numpy arrays: if you want to predict on a 20 x 20 grid in two dimensions, you have 400 points in total, each with 2 values. So X (the one that you pass to m.predict_y()) should have shape (400, 2). (Note that the second dimension needs to have the same shape as X_sample!)
To construct this array of shape (400,2) you can use np.meshgrid (e.g., see What is the purpose of meshgrid in Python / NumPy?).
m.predict_y(X) only predicts the marginal variance at each test point, so the returned mean and var both have shape (400,1) (same length as X). You can of course reshape them to the 20 x 20 values on your grid.
(It is also possible to compute the full covariance, for the latent f this is implemented as m.predict_f_full_cov, which for X of shape (400,2) would return a 400x400 matrix. This is relevant if you want consistent samples from the GP, but I suspect that goes well beyond this question.)
I was indeed making the mistake to not flatten the arrays which in return produced the mistake. Thank you for the fast response STJ!
Here is an example of the working code:
# Generate data
bound = 3.
x1 = np.linspace(-bound, bound, num)
x2 = np.linspace(-bound, bound, num)
x1_mesh,x2_mesh = np.meshgrid(x1, x2)
X = np.dstack([x1_mesh, x2_mesh]).reshape(-1, 2)
z = f(x1_mesh, x2_mesh) # evaluation of the function on the grid
# Draw samples from feature vectors and function by a given index
size = 2
np.random.seed(1991)
index = np.random.choice(range(len(x1)), size=(size,X.ndim), replace=False)
samples = utils.sampleFeature([x1,x2], index)
X1_sample = samples[0]
X2_sample = samples[1]
X_sample = np.column_stack((X1_sample, X2_sample))
Y_sample = utils.samplefromFunc(f=z, ind=index)
# Change noise parameter
sigma_n = 0.0
# Construct models with initial guess
k = gpflow.kernels.RBF(2,active_dims=[0,1], lengthscales=1.0,ARD=True)
m = gpflow.models.GPR(X_sample, Y_sample, kern=k)
m.likelihood.variance = sigma_n
m.compile()
#print(X.shape)
mean, var = m.predict_y(X)
mean_square = mean.reshape(x1_mesh.shape) # Shape: (num,num)
var_square = var.reshape(x1_mesh.shape) # Shape: (num,num)
# Plot mean
fig = plt.figure(figsize=(16, 12))
ax = plt.axes(projection='3d')
ax.plot_surface(x1_mesh, x2_mesh, mean_square, cmap=cm.viridis, linewidth=0.5, antialiased=True, alpha=0.8)
cbar = ax.contourf(x1_mesh, x2_mesh, mean_square, zdir='z', offset=offset, cmap=cm.viridis, antialiased=True)
ax.scatter3D(X1_sample, X2_sample, offset, marker='o',edgecolors='k', color='r', s=150)
fig.colorbar(cbar)
for t in ax.zaxis.get_major_ticks(): t.label.set_fontsize(fontsize_ticks)
ax.set_title("$\mu(x_1,x_2)$", fontsize=fontsize_title)
ax.set_xlabel("\n$x_1$", fontsize=fontsize_label)
ax.set_ylabel("\n$x_2$", fontsize=fontsize_label)
ax.set_zlabel('\n\n$\mu(x_1,x_2)$', fontsize=fontsize_label)
plt.xticks(fontsize=fontsize_ticks)
plt.yticks(fontsize=fontsize_ticks)
plt.xlim(left=-bound, right=bound)
plt.ylim(bottom=-bound, top=bound)
ax.set_zlim3d(offset,np.max(z))
which leads to (red dots are the sample points drawn from the function). Note: Code not refactored what so ever :)

Optimization with Unknown Number of Variables

Since the original problem is more complicated, the idea is described using a simple example below.
For example, suppose we want to put several router antennas somewhere in a room so that the cellphone get most signal strength on the table (received power > Pmax) while weakest signal strength on bed (received power < Pmin). What is the best (minimum) number of antennas that should be used, and where should they be placed, in order to achieve the goal.
Mathematically,
SIGNAL_STRENGTH is dependent on variable (x, y, z) and the number
of variables
. i.e. location and number of antennas.
Besides, assume
PREDICTION = f((x1, y1, z1), (x2, y2, z2), ... (xi, yi, zi), ... (xn,
yn, zn))
where n and (xi, yi, zi) are to be optimized. The goal is to minimize
cost function = ||SIGNAL_STRENGTH - PREDICTION||
I tried to use GA with mixed integer programming in Matlab to implement that. Two optimization functions are used, outer function is to optimize n, and inner optimization function optimizes (x, y, z) with given n. This method works slow and I haven't seen one result given by this method so far.
Does anyone have a more efficient way to solve this problem? Any suggestion is appreciated. Thanks in advance.
Terminology | Problem Definition
An antenna is sending at position a in R^3 with constant power. Its signal strength can be measured by some S: R^3 -> R where S has a single maximum S_0 at a and the set, constructed by S(x) > const, is simply connected, i.e. S(x) = S_0 * exp(-const * (x-a)^2).
Given a set of antennas A the resulting signal strength is the maximum of a single antenna
S_A(x) = max{S_a(x) : for all a in A} ,
which means we 'lock' on the strongest antenna, which is what cell phones do.
Let K = R^3 x R denote a space of points (position, intensity). Now concider two finite subsets POI_min and POI_max of K. We want to find the set A with the minimal amount of antennas (|A| -> min.), that satisfies
for all (x,w) in POI_min : S_A(x) < w and for all (x,w) in POI_max : S_A(x) > w .
Implication
As S(x) > const is simply connected there has to be an antenna in a sphere around the position of each element (x,w) in POI_max with radius r = max{||xi - x|| : for all xi in S(xi) = w}. Which means that if we would put an antenna at the position of (x,w), then the furthest we can go away from x and still have signal strength w is the radius r within which an actual antenna has to be positioned.
With a similar argumentation for POI_min it follows that there is no antenna within r = min{||xi - x|| : for all xi in S(xi) = w}.
Solution
Instead of solving a nonlinear optimization task we can intersect spheres to obtain the optimal solution. If k spheres around the POI_max positions intersect, we can place a single antenna in the intersection, reducing the amount of antennas needed by k-1.
However each antenna that is placed must satisfy all constraints given by the elements of POI_min. Assuming that antennas are omnidirectional and thus orientation of an antenna doesn't matter we can do (pseudocode):
min_sphere = {(x_i,r_i) : from POI_min},
spheres_to_cover = {(x_i,r_i) : from POI_max}
A = {}
while not is_empty(spheres_to_cover)
power_set_score = struct // holds score, k
PS <- costruct power set of sphere_to_cover
for i = 1:number_of_elements(PS)
k = PS[i]
if intersection(k) \ min_sphere is not empty
power_set_score[i].score = |k|
else
power_set_score[i].score = 0
end if
power_set_score[i].k = k
end for
sort(power_set_score) // sort by score, biggest first
A <- add arbitrary point in (intersection(power_set_score[1].k) \ min_sphere)
spheres_to_cover = spheres_to_cover \ power_set_score[1].k
end while
On the other hand you have just given an example problem and thus this solution may not be applicable or broad enough for your case. I did make a few assumptions. So being more specific in the question might give you an even better answer.

How to interpolate random non monotonic increasing data

So I am working on my Thesis and I need to calculate geometric characteristics of an airfoil.
To do this, I need to interpolate the horizontal and vertical coordinates of an airfoil. This is used for a tool which will calculate the geometric characteristics automatically which come from random airfoil geometry files.
Sometime the Y values of the airfoil are non monotonic. Hence, the interp1 command gives an error since some values in the Y vector are repeated.
Therefore, my question is: How do I recognize and subsequently interpolate non monotonic increasing data automatically in Matlab.
Here is a sample data set:
0.999974 0.002176
0.994846 0.002555
0.984945 0.003283
0.973279 0.004131
0.960914 0.005022
0.948350 0.005919
0.935739 0.006810
0.923111 0.007691
0.910478 0.008564
0.897850 0.009428
0.885229 0.010282
0.872617 0.011125
0.860009 0.011960
0.847406 0.012783
0.834807 0.013598
0.822210 0.014402
0.809614 0.015199
0.797021 0.015985
0.784426 0.016764
0.771830 0.017536
0.759236 0.018297
0.746639 0.019053
0.734038 0.019797
0.721440 0.020531
0.708839 0.021256
0.696240 0.021971
0.683641 0.022674
0.671048 0.023367
0.658455 0.024048
0.645865 0.024721
0.633280 0.025378
0.620699 0.026029
0.608123 0.026670
0.595552 0.027299
0.582988 0.027919
0.570436 0.028523
0.557889 0.029115
0.545349 0.029697
0.532818 0.030265
0.520296 0.030820
0.507781 0.031365
0.495276 0.031894
0.482780 0.032414
0.470292 0.032920
0.457812 0.033415
0.445340 0.033898
0.432874 0.034369
0.420416 0.034829
0.407964 0.035275
0.395519 0.035708
0.383083 0.036126
0.370651 0.036530
0.358228 0.036916
0.345814 0.037284
0.333403 0.037629
0.320995 0.037950
0.308592 0.038244
0.296191 0.038506
0.283793 0.038733
0.271398 0.038920
0.259004 0.039061
0.246612 0.039153
0.234221 0.039188
0.221833 0.039162
0.209446 0.039064
0.197067 0.038889
0.184693 0.038628
0.172330 0.038271
0.159986 0.037809
0.147685 0.037231
0.135454 0.036526
0.123360 0.035684
0.111394 0.034690
0.099596 0.033528
0.088011 0.032181
0.076685 0.030635
0.065663 0.028864
0.055015 0.026849
0.044865 0.024579
0.035426 0.022076
0.027030 0.019427
0.019970 0.016771
0.014377 0.014268
0.010159 0.012029
0.007009 0.010051
0.004650 0.008292
0.002879 0.006696
0.001578 0.005207
0.000698 0.003785
0.000198 0.002434
0.000000 0.001190
0.000000 0.000000
0.000258 -0.001992
0.000832 -0.003348
0.001858 -0.004711
0.003426 -0.005982
0.005568 -0.007173
0.008409 -0.008303
0.012185 -0.009379
0.017243 -0.010404
0.023929 -0.011326
0.032338 -0.012056
0.042155 -0.012532
0.052898 -0.012742
0.064198 -0.012720
0.075846 -0.012533
0.087736 -0.012223
0.099803 -0.011837
0.111997 -0.011398
0.124285 -0.010925
0.136634 -0.010429
0.149040 -0.009918
0.161493 -0.009400
0.173985 -0.008878
0.186517 -0.008359
0.199087 -0.007845
0.211686 -0.007340
0.224315 -0.006846
0.236968 -0.006364
0.249641 -0.005898
0.262329 -0.005451
0.275030 -0.005022
0.287738 -0.004615
0.300450 -0.004231
0.313158 -0.003870
0.325864 -0.003534
0.338565 -0.003224
0.351261 -0.002939
0.363955 -0.002680
0.376646 -0.002447
0.389333 -0.002239
0.402018 -0.002057
0.414702 -0.001899
0.427381 -0.001766
0.440057 -0.001656
0.452730 -0.001566
0.465409 -0.001496
0.478092 -0.001443
0.490780 -0.001407
0.503470 -0.001381
0.516157 -0.001369
0.528844 -0.001364
0.541527 -0.001368
0.554213 -0.001376
0.566894 -0.001386
0.579575 -0.001398
0.592254 -0.001410
0.604934 -0.001424
0.617614 -0.001434
0.630291 -0.001437
0.642967 -0.001443
0.655644 -0.001442
0.668323 -0.001439
0.681003 -0.001437
0.693683 -0.001440
0.706365 -0.001442
0.719048 -0.001444
0.731731 -0.001446
0.744416 -0.001443
0.757102 -0.001445
0.769790 -0.001444
0.782480 -0.001445
0.795173 -0.001446
0.807870 -0.001446
0.820569 -0.001446
0.833273 -0.001446
0.845984 -0.001448
0.858698 -0.001448
0.871422 -0.001451
0.884148 -0.001448
0.896868 -0.001446
0.909585 -0.001443
0.922302 -0.001445
0.935019 -0.001446
0.947730 -0.001446
0.960405 -0.001439
0.972917 -0.001437
0.984788 -0.001441
0.994843 -0.001441
1.000019 -0.001441
First column is X and the second column is Y. Notice how the last values of Y are repeated.
Maybe someone can provide me with a piece of code to do this? Or any suggestions are welcome as well.
Remember I need to automate this process.
Thanks for your time and effort I really appreciate it!
There is quick and dirty method if you do not know the exact function defining the foil profile. Split your data into 2 sets, top and bottom planes, so the 'x' data are monotonic increasing.
First I imported your data table in the variable A, then:
%// just reorganise your input in individual vectors. (this is optional but
%// if you do not do it you'll have to adjust the code below)
x = A(:,1) ;
y = A(:,2) ;
ipos = y > 0 ; %// indices of the top plane
ineg = y <= 0 ; %// indices of the bottom plane
xi = linspace(0,1,500) ; %// new Xi for interpolation
ypos = interp1( x(ipos) , y(ipos) , xi ) ; %// re-interp the top plane
yneg = interp1( x(ineg) , y(ineg) , xi ) ; %// re-interp the bottom plane
y_new = [fliplr(yneg) ypos] ; %// stiches the two half data set together
x_new = [fliplr(xi) xi] ;
%% // display
figure
plot(x,y,'o')
hold on
plot(x_new,y_new,'.r')
axis equal
As said on top, it is quick and dirty. As you can see from the detail figure, you can greatly improve the x resolution this way in the area where the profile is close to the horizontal direction, but you loose a bit of resolution at the noose of the foil where the profile is close to the vertical direction.
If it's acceptable then you're all set. If you really need the resolution at the nose, you could look at interpolating on x as above but do a very fine x grid near the noose (instead of the regular x grid I provided as example).
if your replace the xi definition above by:
xi = [linspace(0,0.01,50) linspace(0.01,1,500)] ;
You get the following near the nose:
adjust that to your needs.
To interpolate any function, there must be a function defined. When you define y=f(x), you cannot have the same x for two different values of y because then we are not talking about a function. In your example data, neither x nor y are monotonic, so anyway you slice it, you'll have two (or more) "y"s for the same "x". If you wish to interpolate, you need to divide this into two separate problems, top/bottom and define proper functions for interp1/2/n to work with, for example, slice it horizontally where x==0. In any case, you would have to provide additional info than just x or y alone, e.g.: x=0.5 and y is on top.
On the other hand, if all you want to do is to insert a few values between each x and y in your array, you can do this using finite differences:
%// transform your original xy into 3d array where x is in first slice and y in second
xy = permute(xy(85:95,:), [3,1,2]); %// 85:95 is near x=0 in your data
%// lets say you want to insert three additional points along each line between every two points on given airfoil
h = [0, 0.25, 0.5, 0.75].'; %// steps along each line - column vector
%// every interpolated h along the way between f(x(n)) and f(x(n+1)) can
%// be defined as: f(x(n) + h) = f(x(n)) + h*( f(x(n+1)) - f(x(n)) )
%// this is first order finite differences approximation in 1D. 2D is very
%// similar only with gradient (this should be common knowledge, look it up)
%// from here it's just fancy matrix play
%// 2D gradient of xy curve
gradxy = diff(xy, 1, 2); %// diff xy, first order, along the 2nd dimension, where x and y now run
h_times_gradxy = bsxfun(#times, h, gradxy); %// gradient times step size
xy_in_3d_array = bsxfun(#plus, xy(:,1:end-1,:), h_times_gradxy); %// addition of "f(x)" and there we have it, the new x and y for every step h
[x,y] = deal(xy_in_3d_array(:,:,1), xy_in_3d_array(:,:,2)); %// extract x and y from 3d matrix
xy_interp = [x(:), y(:)]; %// use Matlab's linear indexing to flatten x and y into columns
%// plot to check results
figure; ax = newplot; hold on;
plot(ax, xy(:,:,1), xy(:,:,2),'o-');
plot(ax, xy_interp(:,1), xy_interp(:,2),'+')
legend('Original','Interpolated',0);
axis tight;
grid;
%// The End
And these are the results, near x=0 for clarity of presentation:
Hope that helps.
Cheers.

Visio NURBS formula

I have trouble deciphering the individual parameters of the NURBS formula in the shapes NURBSTo entry (used for splines - curved edges). MS Visio documentation didn't help too much.
The number of parameters is variable depending on the complexity of the curve. A more simple example is:
NURBS(0.4492,3,0,1,0,-0.1875,0,1,1,-0.1875,0,1)
where I found out the start and end coordinate parameters start is 5th for X and 6th for Y. End is 9th for X and 10th for Y. But the Y coordinates are still wrong, so I suppose they should be combined with another parameter. This Java code provided the best results so far in getting the control points of the spline:
int j = 0;
for (int i = 2; i + 4 < pointsS.length; i = i + 4)
{
mxPoint currPoint = new mxPoint();
currPoint.setX(startXY.getX() + (endXY.getX() - startXY.getX()) * pointsRaw[i + 2]);
currPoint.setY(startXY.getY() - (endXY.getY() - startXY.getY()) * pointsRaw[i + 3]);
pointList.add(currPoint);
j++;
}
Just an example for a more complex spline:
NURBS(2.9857,3,1,1,0.1875,0,0,1,0.1875,-0.8954,0,1,0.1875,-1.3431,0,1,0.1875,-1.7908,0.4521,1,-0.4936,-1.7908,1.049,1,-1.1747,-1.7908,1.424,1,-1.1747,-2.1799,1.902,1,-1.1747,-2.5689,2.3742,1)
The documentation say for param 2 only "degree". I suppose it's the degree of the polynomial that is used for approximation.
The wiki page about NURBS:
http://en.wikipedia.org/wiki/Non-uniform_rational_B-spline
Of course, it doesn't speak about Visio parameters :)
Are you accounting for the effect third and fourth parameters have on how you should interpret the x and y parameters?
From MSDN (http://msdn.microsoft.com/en-us/library/office/aa224197(v=office.11).aspx):
NURBS(knotLast, degree, xType, yType, x1, y1, knot1, weight1, ...)
knotLast The last knot.
degree The spline's degree.
xType Specifies how to interpret the x input data. If xType is 0, all x input data are interpreted as a percentage of Width. If xType is 1, all x input data are interpreted as local coordinates.
yType Specifies how to interpret the y input data. If yType is 0, all y input data are interpreted as a percentage of Height. If yType is 1, all y input data are interpreted as local coordinates.
x1 An x-coordinate.
y1 A y-coordinate.
knot1 A knot on the B-spline.
weight1 A weight on the B-spline.
This may help: Visio 2003 Developer's Survival Pack by Graham Wideman
http://www.amazon.com/Visio-2003-Developers-Survival-Pack/dp/1412011124
There's an extensive section on Visio NURBS. It's only $7 for the Kindle edition.

Replot the same data point multiple times before continuing in Matlab

If I have the following code :
for t=1:length(s) % s is a struct with over 1000 entries
if s(t).BOX==0
a(t,:)=0;
elseif s(t).BOX==1
a(t,:)=100;
end
if s(t).BOX==2
b(t,:)=150;
elseif s(t).BOX==3
b(t,:)=170;
end
.
.
.
end
plot(a)
plot(b)
plot(c)
What I want to accomplish :
for n=1:length(s)
Plot the data point of a(n) at t=0, t=1, t=2
then
Plot the data point of b(n) at t=3, t=4, t=5
.
.
.
etc
So basically, each data point will be plotted at 3 values of t before moving to the next point.
How can I achieve that ?
EDIT
Something like this :
If I'm understanding you correctly, and assuming a is a vector, you could do something like
% Your for loop comes before this
nVarsToPlot = 4;
nRepeatsPerPoint = 3;
t = repmat(linspace(1, nRepeatsPerPoint * length(s), nRepeatsPerPoint * length(s))', 1, nVarsToPlot);
genMat = #(x)repmat(x(:)', nRepeatsPerPoint, 1);
aMat = genMat(a); bMat = genMat(b); cMat = genMat(c); dMat = genMat(d);
abcPlot = [aMat(:) bMat(:) cMat(:) dMat(:)];
plot(t, abcPlot);
I'm a bit unclear on exactly what values you want your t to contain, but you essentially need a vector 3 times the length of s. You can then generate the correct data matrix by copying your [Nx1] vectors (a, b, c, etc.) three times (after transposing them to row vectors) each and stacking the whole lot into matrix, then transforming it to a vector with (:) which should come out in the right order as long as the matrix is constructed correctly.