3-D interpolation using griddata - matlab

I have a problem interpolating a 3-D array with the griddata command in MATLAB. I have tried several approaches, but none of them worked. The array to be interpolated is a 182x125x35 array (longitude, latitude and depth) of model output, and I want to interpolate it onto the observation data points. Thank you for taking the time to help.
clear;
clc;
% Observation Data
t_obs = ncread('/disk1/Observation/data_from_data.nc','date_time');
lat_obs = ncread('/disk1/Observation/data_from_data.nc','latitude');
long_obs = ncread('/disk1/Observation/data_from_data.nc','longitude');
depth_obs = ncread('/disk1/Observation/data_from_data.nc','var1');
temp_obs = ncread('/disk1/Observation/data_from_data.nc','var2');
% Model Data
lat_m = ncread('/disk1/.../a.nc','GLat');
lon_m = ncread('/disk1/.../a.nc','GLong');
H = ncread('/disk1/.../a.nc', 'H');
sigma = ncread('/disk1/.../a.nc','SIGMAZZ');
dom = ncread('/disk1/.../a.nc','DOM');
% Rearrangement of date string to number for observation time
toffset1 = datenum([1990 01 01 00 00 00]) + datenum([0000 00 00 00 00 00]);
t = t_obs - 1;
time_obs = toffset1 + t;
% Selection of model file and date
t_mod1 = input('[yyyy mm dd] = ');
t_mod2 = input('[yyyy mm dd] = ');
toffset = datenum([1990 01 01 00 00 00]) + datenum([0000 00 00 00 00 00]);
t_mod3 = t_mod1+(0.00000001);
t_mod4 = t_mod2+(0.00000001);
date1 = datenum(t_mod3);
date2 = datenum(t_mod4);
tind_m1 = date1;
tind_m2 = date2;
time_mod = [tind_m1:tind_m2]';
ind1 = zeros(16887,1);
ind2 = zeros(16887,1);
% Find temporal point of parameter for temporal interpolation
for i = 1:16887
d = sort(abs(time_mod - time_obs(i)));
ind1(i) = find(abs(time_mod - time_obs(i)) == d(1));
ind2(i) = find(abs(time_mod - time_obs(i)) == d(2));
end
% Temporal and spatial interpolation and writing the output in a .nc file
c = 1;
for n = ind1(1):ind1(16887)
for m = ind2(1):ind2(16887)
for l = 1:16887
str1 = sprintf('%4.4d',n);
str2 = sprintf('%4.4d',m);
sfile1 = (['/media/.../restart.',str1,'.nc']);
sfile2 = (['/media/.../restart.',str2,'.nc']);
temp1 = ncread(sfile1,'t',[1 1 1], [182, 125, 35]);
temp2 = ncread(sfile2,'t',[1 1 1], [182, 125, 35]);
temp_new = temp1 + ((temp2 - temp1)/(m-n))*(t(l) - n);
temp_new = griddata(lon_m,lat_m,depth_m,temp_new,long_o,lat_o,depth_obs);
c = n;
str = sprintf('%4.4d', c);
ncid = netcdf.create(['/disk1/.../tempinter/restartint.',str,'.nc'],'NC_WRITE');
x = 1:182;
y = 1:125;
z = 1:35;
dimid1 = netcdf.defDim(ncid,'xdim',length(x));
dimid2 = netcdf.defDim(ncid,'ydim',length(y));
dimid3 = netcdf.defDim(ncid,'zdim',length(z));
varidt = netcdf.defVar(ncid,'temp_new','double',[dimid1 dimid2 dimid3]);
netcdf.endDef(ncid)
netcdf.putVar(ncid,varidt,temp_new);
netcdf.close(ncid);
c = c+1;
end
end
end
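For the spatial step, griddata accepts scattered 3-D sample points as vectors, so one option is to expand the model coordinates to full 3-D arrays and flatten them. The sketch below uses a small synthetic grid rather than the real files. The conversion depth = H * sigma and the sigma convention (0 at the surface, 1 at the bottom) are assumptions about the model, and lon3d, lat3d, depth3d and the query vectors are hypothetical names. Note that depth_m, long_o and lat_o are never defined in the posted code.

```matlab
% Small synthetic stand-in for the 182x125x35 model grid (assumed layout: lon x lat x level)
nx = 8; ny = 6; nz = 5;
[lon2d, lat2d] = ndgrid(linspace(30, 32, nx), linspace(40, 41, ny));
H     = 100 + 10*(lon2d - 30);     % hypothetical bottom depth in metres
sigma = linspace(0, 1, nz);        % assumed sigma levels: 0 = surface, 1 = bottom

% Expand every coordinate to a full nx-by-ny-by-nz array
lon3d   = repmat(lon2d, [1 1 nz]);
lat3d   = repmat(lat2d, [1 1 nz]);
depth3d = bsxfun(@times, H, reshape(sigma, 1, 1, nz));   % assumed: depth = H * sigma

% Synthetic temperature field, linear in the coordinates
temp3d = 20 - 0.05*depth3d + 0.5*(lon3d - 30);

% Hypothetical observation points inside the model domain
lon_q   = [30.5; 31.2];
lat_q   = [40.3; 40.7];
depth_q = [20; 50];

% Scattered 3-D linear interpolation onto the observation points
temp_q = griddata(lon3d(:), lat3d(:), depth3d(:), temp3d(:), lon_q, lat_q, depth_q);

% Because the synthetic field is linear, the result can be compared with the exact value
temp_exact = 20 - 0.05*depth_q + 0.5*(lon_q - 30);
disp([temp_q temp_exact])
```

With thousands of observation points and one field per time step, building a scatteredInterpolant once and only swapping its Values may be cheaper than calling griddata repeatedly, since the triangulation is then reused.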

Related

Creating a table from a variable inside a for loop

I am writing a for loop to calculate the value of four different variables. The first variable is M. M increases from 10^2 to 10^5,
M = [10^2,10^3,10^4,10^5];
The other three variables needed for the table are shown in the code below.
confmc
confcv
confmcSize/confcvSize
I first create a for loop to iterate through the four different values of M. I then create the table outside of the for loop.
How could I adjust the implementation so that the table displays all four values of M?
randn('state',100)
%%%%%% Problem and method parameters %%%%%%%%%
S = 5; E = 6; sigma = 0.3; r = 0.05; T = 1;
Dt = 1e-2; N = T/Dt; M = [10^2,10^3,10^4,10^5];
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for k=1:numel(M)
%%%%%%%%% Geom Asian exact mean %%%%%%%%%%%%
sigsqT= sigma^2*T*(N+1)*(2*N+1)/(6*N*N);
muT = 0.5*sigsqT + (r - 0.5*sigma^2)*T*(N+1)/(2*N);
d1 = (log(S/E) + (muT + 0.5*sigsqT))/(sqrt(sigsqT));
d2 = d1 - sqrt(sigsqT);
N1 = 0.5*(1+erf(d1/sqrt(2)));
N2 = 0.5*(1+erf(d2/sqrt(2)));
geo = exp(-r*T)*( S*exp(muT)*N1 - E*N2 );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Spath = S*cumprod(exp((r-0.5*sigma^2)*Dt+sigma*sqrt(Dt)*randn(M(k),N)),2);
% Standard Monte Carlo
arithave = mean(Spath,2);
Parith = exp(-r*T)*max(arithave-E,0); % payoffs
Pmean = mean(Parith);
Pstd = std(Parith);
confmc = [Pmean-1.96*Pstd/sqrt(M(k)), Pmean+1.96*Pstd/sqrt(M(k))];
confmcSize = [(Pmean+1.96*Pstd/sqrt(M(k)))-(Pmean-1.96*Pstd/sqrt(M(k)))];
% Control Variate
geoave = exp((1/N)*sum(log(Spath),2));
Pgeo = exp(-r*T)*max(geoave-E,0); % geo payoffs
Z = Parith + geo - Pgeo; % control variate version
Zmean = mean(Z);
Zstd = std(Z);
confcv = [Zmean-1.96*Zstd/sqrt(M(k)), Zmean+1.96*Zstd/sqrt(M(k))];
confcvSize = [(Zmean+1.96*Zstd/sqrt(M(k)))-(Zmean-1.96*Zstd/sqrt(M(k)))];
end
T = table(M,confmc,confcv,confmcSize/confcvSize)
The current code returns
T =
1×4 table
M confmc confcv Var4
_____ ____________________ ____________________ ______
1e+05 0.096756 0.1007 0.097306 0.097789 8.1622
How could I change my implementation so that all four values of M are computed?
I modified just a few things. Take a look at the following code.
randn('state',100)
%%%%%% Problem and method parameters %%%%%%%%%
S = 5; E = 6; sigma = 0.3; r = 0.05; T = 1;
Dt = 1e-2; N = T/Dt; M = [10^2,10^3,10^4,10^5];
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
confmc = zeros(numel(M), 2);
confcv = zeros(numel(M), 2);
confmcSize = zeros(numel(M), 1);
confcvSize = zeros(numel(M), 1);
for k=1:numel(M)
%%%%%%%%% Geom Asian exact mean %%%%%%%%%%%%
sigsqT= sigma^2*T*(N+1)*(2*N+1)/(6*N*N);
muT = 0.5*sigsqT + (r - 0.5*sigma^2)*T*(N+1)/(2*N);
d1 = (log(S/E) + (muT + 0.5*sigsqT))/(sqrt(sigsqT));
d2 = d1 - sqrt(sigsqT);
N1 = 0.5*(1+erf(d1/sqrt(2)));
N2 = 0.5*(1+erf(d2/sqrt(2)));
geo = exp(-r*T)*( S*exp(muT)*N1 - E*N2 );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Spath = S*cumprod(exp((r-0.5*sigma^2)*Dt+sigma*sqrt(Dt)*randn(M(k),N)),2);
% Standard Monte Carlo
arithave = mean(Spath,2);
Parith = exp(-r*T)*max(arithave-E,0); % payoffs
Pmean = mean(Parith);
Pstd = std(Parith);
confmc(k,:) = [Pmean-1.96*Pstd/sqrt(M(k)), Pmean+1.96*Pstd/sqrt(M(k))];
confmcSize(k,1) = [(Pmean+1.96*Pstd/sqrt(M(k)))-(Pmean-1.96*Pstd/sqrt(M(k)))];
% Control Variate
geoave = exp((1/N)*sum(log(Spath),2));
Pgeo = exp(-r*T)*max(geoave-E,0); % geo payoffs
Z = Parith + geo - Pgeo; % control variate version
Zmean = mean(Z);
Zstd = std(Z);
confcv(k,:) = [Zmean-1.96*Zstd/sqrt(M(k)), Zmean+1.96*Zstd/sqrt(M(k))];
confcvSize(k,1) = [(Zmean+1.96*Zstd/sqrt(M(k)))-(Zmean-1.96*Zstd/sqrt(M(k)))];
end
T = table(M',confmc,confcv,confmcSize./confcvSize)
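In short, I store each result in its own row of a preallocated matrix instead of in a scalar or a single row vector, and I pass M' so that M is a column like the other table variables. In your code, the variables (confmc, confcv, confmcSize, confcvSize) were overwritten on every iteration, so only the values for the last M survived.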
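The same pattern in isolation, as a minimal sketch with made-up numbers (the variable names ci and width are hypothetical and the values carry no meaning): preallocate one row per loop iteration, fill row k inside the loop, and build the table once at the end.

```matlab
M = [10 100 1000];

% One row per value of M, allocated before the loop
ci    = zeros(numel(M), 2);
width = zeros(numel(M), 1);

for k = 1:numel(M)
    halfWidth  = 1/sqrt(M(k));            % placeholder for the real computation
    ci(k,:)    = [-halfWidth, halfWidth]; % write into row k, do not overwrite ci
    width(k,1) = 2*halfWidth;
end

% All table variables must have the same number of rows, hence M'
T = table(M', ci, width, 'VariableNames', {'M', 'ci', 'width'})
```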

Dimensions of matrices being concatenated are not consistent using vertcat

I get this error message every time I run my code. I have searched through other questions about the same error and tried their solutions, but they have not worked. The error message is below, followed by the code.
Error using vertcat
Dimensions of matrices being concatenated are not consistent.
Error in project2 (line 147)
e = [a0;c0;d0];
N = length(z1);
if length(w)~=N, error('z and sw must be same length');
end
M = N-1;
a0 = zeros(2*M-2,(3*M));
b0 = zeros(2*M-2,1);
for i = 1:M-1
co = i;
ro = 2*(i-1)+1;
a0(ro,co) = w(i+1)^2;
a0(ro+1,co+1) = w(i+1)^2;
a0(ro,co+n) = w(i+1);
a0(ro+1,co+n+1) = w(i+1);
a0(ro,co+2*n) = 1;
a0(ro+1,co+2*n+1) = 1;
b0(ro) = z1(i+1);
b0(ro+1) = z1(i+1);
end
c0 = zeros(M-1,(3*M));
for i = 1:M-1
c0(i,i) = 2*w(i+1);
c0(i,i+1) = -2*w(i+1);
c0(i,i+n) = 1;
c0(i,i+n+1) = -1;
end
d0 = zeros(2,(3*M));
d0(1,1) = w(1)^2;
d0(1,M+1) = w(1);
d0(1,2*M+1) = 1;
d0(2,M) = w(end)^2;
d0(2,2*M) = w(end);
d0(2,end) = 1;
e = [a0;c0;d0];
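One thing that may be worth checking: the posted code indexes with n, which is never defined in the snippet. MATLAB silently grows a matrix when an assignment targets an index beyond its current size, so if n differs from M, a0 or c0 can end up wider than the preallocated 3*M columns and no longer match d0. The sketch below reproduces that effect with a hypothetical value n = M + 1; whether this is the actual cause depends on what n is in the full program.

```matlab
M = 4;
a0 = zeros(2*M-2, 3*M);       % preallocated as 6-by-12
d0 = zeros(2, 3*M);           % 2-by-12

n  = M + 1;                   % hypothetical value, n is not defined in the posted code
co = M - 1;                   % last value of co in the first loop
a0(1, co + 2*n + 1) = 1;      % column 14 > 12, so a0 silently grows to 6-by-14

disp(size(a0))
disp(size(d0))
sameWidth = size(a0, 2) == size(d0, 2);   % false, so [a0; d0] would raise the vertcat error
```

Printing size(a0), size(c0) and size(d0) just before the failing line shows directly which block has the unexpected width.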

Why do I get such a bad loss in my implementation of k-Nearest Neighbor?

I'm trying to implement k-NN in MATLAB. My data is a matrix of 214 samples (rows) with 9 attribute columns and the label in the 10th column. I want to measure the 0-1 loss using 10-fold cross-validation. I have the following code:
function q3(file)
data = knnfile(file);
loss(data(:,1:9),'KFold',data(:,10))
losses = zeros(25,3);
new_data = data;
new_data(:,10) = [];
sdd = std(new_data);
meand = mean(new_data);
for s = 1:214
for q = 1:9
new_data(s,q) = (new_data(s,q) - meand(q)) / sdd(q);
end
end
new_data = [new_data data(:,10)];
for k = 1:25
loss1 = 0;
loss2 = 0;
for j = 0:9
index = floor(214/10)*j+1;
curd1 = data([1:index-1,index+21:end],:);
curd2 = new_data([1:index-1,index+21:end],:);
for l = 0:20
c1 = knn(curd1,k,data(index+l,:));
c2 = knn(curd2,k,new_data(index+l,:));
loss1 = loss1 + (c1 ~= data(index+l,10));
loss2 = loss2 + (c2 ~= new_data(index+l,10));
end
end
losses(k,1) = k;
losses(k,2) = 100*loss1/210;
losses(k,3) = 100*loss2/210;
end
function cluster = knn(Data,k,x)
distances = zeros(193,2);
for i = 1:size(Data,1)
row = Data(i,:);
d = norm(row(1:size(row,2)-1) - x(1:size(x,2)-1));
distances(i,:) = [d row(10)];
end
distances = sortrows(distances,1);
cluster = mode(distances(1:k,2));
I'm getting a loss of 40% or more with almost no correlation to k. I'm sure something here is wrong, but I'm not sure what.
Any help would be appreciated!
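One possibility to rule out, assuming the rows of the file are ordered by label (the question does not say so): the folds above are contiguous blocks of 21 rows, so a test fold could consist mostly of a class that is then underrepresented in the training rows. A minimal sketch of assigning folds after shuffling the row order follows; perm, foldId, testRows and trainRows are hypothetical names, and the k-NN call itself is left as a comment.

```matlab
rng(0);                            % fixed seed so the split is reproducible
n      = 214;                      % number of samples, as in the question
nFolds = 10;

perm   = randperm(n);              % random ordering of the row indices
foldId = mod(0:n-1, nFolds) + 1;   % fold number 1..nFolds for each position in perm

for j = 1:nFolds
    testRows  = perm(foldId == j); % rows held out in fold j
    trainRows = perm(foldId ~= j); % remaining rows used as neighbours
    % here: classify data(testRows,:) against data(trainRows,:) and accumulate the loss
end
```

Unlike the contiguous split, this uses all 214 rows, so the loss would be divided by n rather than by 210.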

compare more than 2 proportions matlab

I have 4 groups (A, B, C, D),
each containing a different number of males and females:
male_A = 46
male_B = 241
male_C = 202
male_D = 113
female_A = 43
female_B = 134
female_C = 100
female_D = 53
How can I identify the groups that have a statistically different proportion of male and female? Suggestion using MATLAB would be appreciated...
POSSIBLE SOLUTION (PLEASE CHECK)
% 1st row: male
% 2nd row: female
cont = [46 241 202 113;
43 134 100 53]
mychi(cont)
%this function should calculate the Chi2
function mychi(cont)
cont = [cont, sum(cont,2)];
cont = [cont; sum(cont,1)];
counter = 1;
for i = 1 : size(cont,1)-1
for j = 1 : size(cont,2)-1
Observed(counter) = cont(i,j);
Expected(counter) = cont(i,end)*cont(end,j)/cont(end,end);
O_E_2(counter) = (abs(Observed(counter)-Expected(counter)).^2)/Expected(counter);
counter = counter + 1;
end
end
DOF = (size(cont,1)-2)*(size(cont,2)-2)
CHI = sum(O_E_2)
end
The CHI returned should be compared with the one for p<0.05 that can be found here
In my case
DOF =
3
CHI =
8.0746
CHI is > 0.352 so the groups have a biased number of male and female...
Not sure what comparison you are looking for, but the ratios can be obtained by
p = 0.05;
ratio_A = male_A ./ (male_A + female_A);
ratio_B = male_B ./ (male_B + female_B);
ratio_C = male_C ./ (male_C + female_C);
ratio_D = male_D ./ (male_D + female_D);
%Once you have ratios, you can perform analysis as mentioned on
%http://au.mathworks.com/help/stats/hypothesis-testing.html
Hope this helps
I suggest arranging your data in a matrix and using indexing suited to your purposes. Here is an example:
male_A = 46;
male_B = 241;
male_C = 202;
male_D = 113;
female_A = 43;
female_B = 134;
female_C = 100;
female_D = 53;
matrix = [male_A female_A;
male_B female_B;
male_C female_C;
male_D female_D];
groups = ['A', 'B', 'C', 'D'];
total = (matrix(:,1)+matrix(:,2));
male_percentage = matrix(:,1)./total*100
female_percentage = matrix(:,2)./total*100
threshold = 65; %// Example threshold 65%
male_above_threshold = groups(male_percentage>threshold)
female_above_threshold = groups(female_percentage>threshold)
maximum_male_ratio = groups(male_percentage==max(male_percentage))
maximum_female_ratio = groups(female_percentage==max(female_percentage))
In your example you would get:
male_percentage =
51.6854
64.2667
66.8874
68.0723
female_percentage =
48.3146
35.7333
33.1126
31.9277
male_above_threshold =
CD
female_above_threshold =
Empty string: 1-by-0
maximum_male_ratio =
D
maximum_female_ratio =
A
Finding out the groups that are statistically different is another problem. You should provide more information in order to do that.
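As a rough check of the chi-square calculation from the question, the statistic can also be computed without loops and turned into a p-value using only base MATLAB. This is a sketch of the overall test of independence on the 2x4 table, not a method for telling which individual groups differ. The upper-tail critical value for 3 degrees of freedom at 0.05 is about 7.815; the 0.352 quoted in the question appears to be the lower-tail value.

```matlab
% Rows: male, female; columns: groups A-D (counts from the question)
cont = [46 241 202 113;
        43 134 100  53];

rowTot   = sum(cont, 2);
colTot   = sum(cont, 1);
expected = rowTot * colTot / sum(cont(:));   % outer product gives the expected counts

CHI = sum(sum((cont - expected).^2 ./ expected));
DOF = (size(cont,1) - 1) * (size(cont,2) - 1);

% Upper-tail probability of the chi-square distribution via the
% regularized incomplete gamma function (no toolbox needed)
p = gammainc(CHI/2, DOF/2, 'upper');
fprintf('CHI = %.4f, DOF = %d, p = %.4f\n', CHI, DOF, p);
```

To see which groups drive the result, one option would be to repeat the test on pairs of columns (2x2 tables) and correct for multiple comparisons.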

How can I reduce the following code by using a loop?

num2 = xlsread('CANCER.xls','C2:C102')
[IDX,C1] = kmeans(num2,2)
num3 = xlsread('CANCER.xls','C103:C203')
[IDX,C2] = kmeans(num3,2)
num4 = xlsread('CANCER.xls','C304:C404')
[IDX,C3] = kmeans(num4,2)
num5 = xlsread('CANCER.xls','C405:C505')
[IDX,C4] = kmeans(num5,2)
num6 = xlsread('CANCER.xls','C506:C606')
[IDX,C5] = kmeans(num6,2)
num7 = xlsread('CANCER.xls','C607:C707')
[IDX,C6] = kmeans(num7,2)
num8 = xlsread('CANCER.xls','C708:C808')
[IDX,C7] = kmeans(num8,2)
num9 = xlsread('CANCER.xls','C809:C909')
[IDX,C8] = kmeans(num9,2)
num10 = xlsread('CANCER.xls','C1000:C1099')
[IDX,C9] = kmeans(num10,2)
num11= xlsread('CANCER.xls','C1100:C1199')
[IDX,C10] = kmeans(num11,2)
num12= xlsread('CANCER.xls','C1200:C1299')
[IDX,C11] = kmeans(num12,2)
num13= xlsread('CANCER.xls','C1300:C1399')
[IDX,C12] = kmeans(num13,2)
num14= xlsread('CANCER.xls','C1400:C1499')
[IDX,C13] = kmeans(num14,2)
kmns=[C1;C2;C3;C4;C5;C6;C7;C8;C9;C10;C11;C12;C13]
Try this -
%%// Start and stop row numbers
start_ind = [2 103 304 405 506 607 708 809 1000 :100: 1400];
stop_ind = [start_ind(1:8)+100 start_ind(9:end) + 99];
data = xlsread('CANCER.xls'); %%// Read data in one-go
C = zeros(2,numel(start_ind)); %%// Place holder for C values
for k1 = 1:numel(start_ind)
num = data(start_ind(k1):stop_ind(k1),3); %%// Data for the specified range
[IDX,C(:,k1)] = kmeans(num,2); %%// Do the calculations
end
kmns = reshape(C,[],1); %%// Final result
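A variation that keeps the original cell ranges, in case reading the whole sheet shifts the row numbers (the numeric output of xlsread omits leading text-only rows such as a header): build the range strings in a loop-friendly cell array with sprintf. The name ranges is hypothetical, and the xlsread/kmeans part is left commented out because it needs the file and the Statistics Toolbox.

```matlab
% Start and stop row numbers, as in the answer above
start_ind = [2 103 304 405 506 607 708 809 1000:100:1400];
stop_ind  = [start_ind(1:8)+100, start_ind(9:end)+99];

% One Excel range string per block, e.g. 'C2:C102'
ranges = arrayfun(@(a,b) sprintf('C%d:C%d', a, b), ...
                  start_ind, stop_ind, 'UniformOutput', false);
disp(ranges(:))

% Each range could then be read and clustered in turn:
% C = zeros(2, numel(ranges));
% for k1 = 1:numel(ranges)
%     num = xlsread('CANCER.xls', ranges{k1});
%     [~, C(:,k1)] = kmeans(num, 2);
% end
% kmns = reshape(C, [], 1);
```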