I am running some regression in matlab.
I want to store the varibles nicely through structures.
Here is some code:
clc;
clear;
fruit_names={'Apple','Pear','Melon'};
Predictors.Apple = rand(500,11);
Predictors.Pear = rand(500,11);
Predictors.Melon = rand(500,11);
Returns.Apple = rand(500,1);
Returns.Pear = rand(500,1);
Returns.Melon = rand(500,1);
%%
kk=1;
for d = 1:length(fruit_names)
for i = [1,2,3,4,5,6,12,24,48,60]
for jj = 1:11
K = i;
xinit=[Predictors.(fruit_names{d})(:,jj)];
yinit=Returns.(fruit_names{d});
[b,bint,r,rint,stats] = regress(yinit,xinit);
Stats.(fruit_names{d})(kk+1)=stats(1);
kk=kk+1;%to help with reporting
end
end
end
It honestly my best attempt at a simple example. It does require the Econometrics toolbox if I remember correctly.
The problem I am having is the structure Stats stores the results I need but after the first variables include some useless zeros.
I have posted and deleted an earlier question which suggests removing the (kk,:) variables, but if I do this it only contains the final results not the evolution of results through the for loop.
Its the position of the kk variable that is the problem.
It seems like you need to collect the results from the second two loops.
clc;
clear;
fruit_names={'Apple','Pear','Melon'};
Predictors.Apple = rand(500,11);
Predictors.Pear = rand(500,11);
Predictors.Melon = rand(500,11);
Returns.Apple = rand(500,1);
Returns.Pear = rand(500,1);
Returns.Melon = rand(500,1);
%%
for d = 1:length(fruit_names)
kk=1; %move the kk = 1 variable here.
for i = [1,2,3,4,5,6,12,24,48,60]
for jj = 1:11
K = i;
xinit=[Predictors.(fruit_names{d})(:,jj)];
yinit=Returns.(fruit_names{d});
[b,bint,r,rint,stats] = regress(yinit,xinit);
Stats.(fruit_names{d})(kk+1)=stats(1);
kk=kk+1;%to help with reporting
end
end
end
Enjoy your fruit!
Related
I am very new to Scilab, but so far have not been able to find an answer (either here or via google) to my question. I'm sure it's a simple solution, but I'm at a loss. I have a lot of MATLAB scripts I wrote in grad school, but now that I'm out of school, I no longer have access to MATLAB (and can't justify the cost). Scilab looked like the best open alternative. I'm trying to convert my .m files to Scilab compatible versions using mfile2sci, but when running the mfile2sci GUI, I get the error/message shown below. Attached is the original code from the M-file, in case it's relevant.
I Searched Stack Overflow and companion sites, Google, Scilab documentation.
The M-file code follows (it's a super basic MATLAB script as part of an old homework question -- I chose it as it's the shortest, most straightforward M-file I had):
Mmax = 15;
N = 20;
T = 2000;
%define upper limit for sparsity of signal
smax = 15;
mNE = zeros(smax,Mmax);
mESR= zeros(smax,Mmax);
for M = 1:Mmax
aNormErr = zeros(smax,1);
aSz = zeros(smax,1);
ESR = zeros(smax,1);
for s=1:smax % for-loop to loop script smax times
normErr = zeros(1,T);
vESR = zeros(1,T);
sz = zeros(1,T);
for t=1:T %for-loop to carry out 2000 trials per s-value
esr = 0;
A = randn(M,N); % generate random MxN matrix
[M,N] = size(A);
An = zeros(M,N); % initialize normalized matrix
for h = 1:size(A,2) % normalize columns of matrix A
V = A(:,h)/norm(A(:,h));
An(:,h) = V;
end
A = An; % replace A with its column-normalized counterpart
c = randperm(N,s); % create random support vector with s entries
x = zeros(N,1); % initialize vector x
for i = 1:size(c,2)
val = (10-1)*rand + 1;% generate interval [1,10]
neg = mod(randi(10),2); % include [-10,-1]
if neg~=0
val = -1*val;
end
x(c(i)) = val; %replace c(i)th value of x with the nonzero value
end
y = A*x; % generate measurement vector (y)
R = y;
S = []; % initialize array to store selected columns of A
indx = []; % vector to store indices of selected columns
coeff = zeros(1,s); % vector to store coefficients of approx.
stop = 10; % init. stop condition
in = 0; % index variable
esr = 0;
xhat = zeros(N,1); % intialize estimated x signal
while (stop>0.5 && size(S,2)<smax)
%MAX = abs(A(:,1)'*R);
maxV = zeros(1,N);
for i = 1:size(A,2)
maxV(i) = abs(A(:,i)'*R);
end
in = find(maxV == max(maxV));
indx = [indx in];
S = [S A(:,in)];
coeff = [coeff R'*S(:,size(S,2))]; % update coefficient vector
for w=1:size(S,2)
r = y - ((R'*S(:,w))*S(:,w)); % update residuals
if norm(r)<norm(R)
index = w;
end
R = r;
stop = norm(R); % update stop condition
end
for j=1:size(S,2) % place coefficients into xhat at correct indices
xhat(indx(j))=coeff(j);
end
nE = norm(x-xhat)/norm(x); % calculate normalized error for this estimate
%esr = 0;
indx = sort(indx);
c = sort(c);
if isequal(indx,c)
esr = esr+1;
end
end
vESR(t) = esr;
sz(t) = size(S,2);
normErr(t) = nE;
end
%avsz = sum(sz)/T;
aSz(s) = sum(sz)/T;
%aESR = sum(vESR)/T;
ESR(s) = sum(vESR)/T;
%avnormErr = sum(normErr)/T; % produce average normalized error for these run
aNormErr(s) = sum(normErr)/T; % add new avnormErr to vector of all av norm errors
end
% just put this here to view the vector
mNE(:,M) = aNormErr;
mESR(:,M) = ESR;
% had an 'end' placed here, might've been unmatched
mNE%reshape(mNE,[],Mmax)
mESR%reshape(mESR,[],Mmax)]
figure
dimx = [1 Mmax];
dimy = [1 smax];
imagesc(dimx,dimy,mESR)
colormap gray
strESR = sprintf('Average ESR, N=%d',N);
title(strESR);
xlabel('M');
ylabel('s');
strNE = sprintf('Average Normed Error, N=%d',N);
figure
imagesc(dimx,dimy,mNE)
colormap gray
title(strNE)
xlabel('M');
ylabel('s');
The command used (and results) follow:
--> mfile2sci
ans =
[]
****** Beginning of mfile2sci() session ******
File to convert: C:/Users/User/Downloads/WTF_new.m
Result file path: C:/Users/User/DOWNLO~1/
Recursive mode: OFF
Only double values used in M-file: NO
Verbose mode: 3
Generate formatted code: NO
M-file reading...
M-file reading: Done
Syntax modification...
Syntax modification: Done
File contains no instruction, no translation made...
****** End of mfile2sci() session ******
To convert the foo.m file one has to enter
mfile2sci <path>/foo.m
where stands for the path of the directoty where foo.m is. The result is written in /foo.sci
Remove the ```` at the begining of each line, the conversion will proceed normally ?. However, don't expect to obtain a working .sci file as the m2sci converter is (to me) still an experimental tool !
I was wondering if someone could point out why I keep failing Test 1 and 2 despite my plot appearing as it should be.
My goal is to plot a bacteria population using three different K values on the same plot however I am failing due to the Index Exceeding Matrix Dimensions.
I am unsure what this means and how it relates to the plot.
I appreciate any help.
Thank you.
`function [bacteria_plot] = BacteriaPop(K,BacteriaPop)
C = 100;
r = 1;
t = 0:20;
K = 1000;
Kk = 3000;
kK = 5000;
BacteriaPop = (C.*exp(r.*t))./(1+(C./K).*exp(r.*t));
BacteriaPop = (C.*exp(r.*t))./(1+(C./Kk).*exp(r.*t));
BacteriaPop = (C.*exp(r.*t))./(1+(C./kK).*exp(r.*t));
hold on
plot(t,BacteriaPop)
end
`
[![Code, Plot and Error][2]][2]
Thanks to some help I have changed my code to
function [bacteria_plot] = bacteria_growth(K,Kk,kK,Y)
C = 100;
r = 1;
t = 0:20;
K = 1000;
Kk = 3000;
kK = 5000;
y(:,1) = (C.*exp(r.*t))./(1+(C./K).*exp(r.*t));
y(:,2) = (C.*exp(r.*t))./(1+(C./Kk).*exp(r.*t));
y(:,3) = (C.*exp(r.*t))./(1+(C./kK).*exp(r.*t));
hold on
plot(t,y)
hold off
end
It is still not perfect but it passed all the tests.
Any constructive criticism to streamline this code will be welcomed.
I am computing a list of error statistics of a variable u. Whenever I put the structures in the loop, Matlab becomes superslow, while it is pretty fast with standard variables. I have heard that with Matlab 2016b one can either use loops or use vectorized notation, the speed should be the same. It looks like it does not work for structures. Do you know why? Best suggestion here? This is a minimal example ( I just used trivial "if" clauses, mine are more complex and that's the main reason why I don't vectorize):
n = 1000,
m = 1000;
uMODEL = rand(n,m);
uOBS = rand(n,m);
%
%
%
tic
ERRORS = struct('Emedio',0,'MAE',0,'sigma',0,'Emax',0,'Emin',0);
Nerr(1:max(n,1),1) = 0; %Nerr at most it contains n
fields = fieldnames(ERRORS);
for i = 1:numel(fields)
ERRORS.(fields{i}) = zeros(max(n,1),1);
end
for i=1:m
for j=1:n
if (1000>2 && 2000<343532)
diff = uMODEL(j,i)-uOBS(j,i);
ERRORS.Emedio(j) = ERRORS.Emedio(j) + diff;
ERRORS.MAE(j) = ERRORS.MAE(j) + abs(diff);
ERRORS.sigma(j) = ERRORS.sigma(j) + diff^2;
ERRORS.Emax(j) = max(ERRORS.Emax(j),diff);
ERRORS.Emin(j) = min(ERRORS.Emin(j),diff);
Nerr(n) = Nerr(n) + 1;
end
end
end
ERRORS.Emedio(:) = ERRORS.Emedio(:)./Nerr(:);
ERRORS.MAE(:) = ERRORS.MAE(:)./Nerr(:);
ERRORS.sigma(:) = sqrt(ERRORS.sigma(:)./(Nerr(:)-1));
toc
clear ERRORS
tic
%
% here instead I define variables, I fill them up and then I throw them in structures, cause loops with structures are strangely slow...
Emedio = zeros(max(n,1),1);
MAE = zeros(max(n,1),1);
sigma = zeros(max(n,1),1);
Nerr = zeros(max(n,1),1);
Emax = zeros(max(n,1),1);
Emin = zeros(max(n,1),1);
for i=1:m
for j=1:n
if (1000>2 && 2000<343532)
diff = uMODEL(j,i)-uOBS(j,i);
Emedio(j) = Emedio(j) + diff;
MAE(j) = MAE(j) + abs(diff);
sigma(j) = sigma(j) + diff^2;
Emax(j) = max(Emax(j),diff);
Emin(j) = min(Emin(j),diff);
Nerr(n) = Nerr(n) + 1;
end
end
end
Emedio(:) = Emedio(:)./Nerr(:);
MAE(:) = MAE(:)./Nerr(:);
sigma(:) = sqrt(sigma(:)./(Nerr(:)-1));
ERRORS.Emedio(:) = Emedio(:);
ERRORS.MAE(:) = MAE(:);
ERRORS.sigma(:) = sigma(:);
ERRORS.Emax(:) = Emax(:);
ERRORS.Emin(:) = Emin(:);
toc
the output is:
Elapsed time is 2.372765 seconds.
Elapsed time is 0.057719 seconds.
I have originally written the following Matlab code to find intersection between a set of Axes Aligned Bounding Boxes (AABB) and space partitions (here 8 partitions). I believe it is readable by itself, moreover, I have added some comments for even more clarity.
function [A,B] = AABBPart(bbx,it) % bbx: aabb, it: iteration
global F
IT = it+1;
n = size(bbx,1);
F = cell(n,it);
A = Part([min(bbx(:,1:3)),max(bbx(:,4:6))],it,0); % recursive partitioning
B = F; % matlab does not allow
function s = Part(bx,it,J) % output to be global
s = {};
if it < 1; return; end
s = cell(8,1);
p = bx(1:3);
q = bx(4:6);
h = 0.5*(p+q);
prt = [p,h;... % 8 sub-parts (octa)
h(1),p(2:3),q(1),h(2:3);...
p(1),h(2),p(3),h(1),q(2),h(3);...
h(1:2),p(3),q(1:2),h(3);...
p(1:2),h(1),h(1:2),q(3);...
h(1),p(2),h(3),q(1),h(2),q(3);...
p(1),h(2:3),h(1),q(2:3);...
h,q];
for j=1:8 % check for each sub-part
k = 0;
t = zeros(0,1);
for i=1:n
if all(bbx(i,1:3) <= prt(j,4:6)) && ... % interscetion test for
all(prt(j,1:3) <= bbx(i,4:6)) % every aabb and sub-parts
k = k+1;
t(k) = i;
end
end
if ~isempty(t)
s{j,1} = [t; Part(prt(j,:),it-1,j)]; % recursive call
for i=1:numel(t) % collecting the results
if isempty(F{t(i),IT-it})
F{t(i),IT-it} = [-J,j];
else
F{t(i),IT-it} = [F{t(i),IT-it}; [-J,j]];
end
end
end
end
end
end
Concerns:
In my tests, it seems that probably few intersections are missing, say, 10 or so for 1000 or more setup. So I would be glad if you could help to find out any problematic parts in the code.
I am also concerned about using global F. I prefer to get rid of it.
Any other better solution in terms of speed, will be loved.
Note that the code is complete. And you can easily try it by some following setup.
n = 10000; % in the original application, n would be millions
bbx = rand(n,6);
it = 3;
[A,B] = AABBPart(bbx,it);
This is a follow-up question to How to append an element to an array in MATLAB? That question addressed how to append an element to an array. Two approaches are discussed there:
A = [A elem] % for a row array
A = [A; elem] % for a column array
and
A(end+1) = elem;
The second approach has the obvious advantage of being compatible with both row and column arrays.
However, this question is: which of the two approaches is fastest? My intuition tells me that the second one is, but I'd like some evidence for or against that. Any idea?
The second approach (A(end+1) = elem) is faster
According to the benchmarks below (run with the timeit benchmarking function from File Exchange), the second approach (A(end+1) = elem) is faster and should therefore be preferred.
Interestingly, though, the performance gap between the two approaches is much narrower in older versions of MATLAB than it is in more recent versions.
R2008a
R2013a
Benchmark code
function benchmark
n = logspace(2, 5, 40);
% n = logspace(2, 4, 40);
tf = zeros(size(n));
tg = tf;
for k = 1 : numel(n)
x = rand(round(n(k)), 1);
f = #() append(x);
tf(k) = timeit(f);
g = #() addtoend(x);
tg(k) = timeit(g);
end
figure
hold on
plot(n, tf, 'bo')
plot(n, tg, 'ro')
hold off
xlabel('input size')
ylabel('time (s)')
leg = legend('y = [y, x(k)]', 'y(end + 1) = x(k)');
set(leg, 'Location', 'NorthWest');
end
% Approach 1: y = [y, x(k)];
function y = append(x)
y = [];
for k = 1 : numel(x);
y = [y, x(k)];
end
end
% Approach 2: y(end + 1) = x(k);
function y = addtoend(x)
y = [];
for k = 1 : numel(x);
y(end + 1) = x(k);
end
end
How about this?
function somescript
RStime = timeit(#RowSlow)
CStime = timeit(#ColSlow)
RFtime = timeit(#RowFast)
CFtime = timeit(#ColFast)
function RowSlow
rng(1)
A = zeros(1,2);
for i = 1:1e5
A = [A rand(1,1)];
end
end
function ColSlow
rng(1)
A = zeros(2,1);
for i = 1:1e5
A = [A; rand(1,1)];
end
end
function RowFast
rng(1)
A = zeros(1,2);
for i = 1:1e5
A(end+1) = rand(1,1);
end
end
function ColFast
rng(1)
A = zeros(2,1);
for i = 1:1e5
A(end+1) = rand(1,1);
end
end
end
For my machine, this yields the following timings:
RStime =
30.4064
CStime =
29.1075
RFtime =
0.3318
CFtime =
0.3351
The orientation of the vector does not seem to matter that much, but the second approach is about a factor 100 faster on my machine.
In addition to the fast growing method pointing out above (i.e., A(k+1)), you can also get a speed increase from increasing the array size by some multiple, so that allocations become less as the size increases.
On my laptop using R2014b, a conditional doubling of size results in about a factor of 6 speed increase:
>> SO
GATime =
0.0288
DWNTime =
0.0048
In a real application, the size of A would needed to be limited to the needed size or the unfilled results filtered out in some way.
The Code for the SO function is below. I note that I switched to cos(k) since, for some unknown reason, there is a large difference in performance between rand() and rand(1,1) on my machine. But I don't think this affects the outcome too much.
function [] = SO()
GATime = timeit(#GrowAlways)
DWNTime = timeit(#DoubleWhenNeeded)
end
function [] = DoubleWhenNeeded()
A = 0;
sizeA = 1;
for k = 1:1E5
if ((k+1) > sizeA)
A(2*sizeA) = 0;
sizeA = 2*sizeA;
end
A(k+1) = cos(k);
end
end
function [] = GrowAlways()
A = 0;
for k = 1:1E5
A(k+1) = cos(k);
end
end