My goal is to implement a function which performs fourier synthesis in matlab, as part of learning the language. The function implements the following expression:
y = sum(ak*exp((i*2*pi*k*t)/T)
where k is the index, ak is a vector of fourier coefficients, t is a time vector of sampled times, and T is the period of the signal.
I have tried something like this:
for counter = -N:1:N
k = y+N+1;
y(k) = ak(k)*exp((i*2*pi*k*t)/T);
% y is a vector of length 2N+1
end
However, this gives me an error that the sides do not have equal numbers of items within them. This makes sense to me, since t is a vector of arbitrary length, and thus I am trying to make y(k) equal to numerous things rather than one thing. Instead, I suspect I need to try something like:
for counter = -N:1:N
k=y+N+1;
for t = 0:1/fs:1
%sum over t elements for exponential operation
end
%sum over k elements to generate y(k)
end
However, I'm supposedly able to implement this using purely matrix multiplication. How could I do this? I've tried to wrap my head around what Matlab is doing, but honestly, it's so far from the other languages I know that I don't really have any sense of what matlab's doing under the hood. Understanding how to change between operations on matrices and operations in for loops would be profoundly helpful.
You can use kron to reach your goal without for loops, i.e., matrix representation:
y = a*exp(1j*2*pi*kron(k.',t)/T);
where a,k and t are all assumed as row-vectors
Example
N = 3;
k = -N:N;
t = 1:0.5:5;
T = 15;
a = 1:2*N+1;
y = a*exp(1j*2*pi*kron(k.',t)/T);
such that
y =
Columns 1 through 6:
19.1335 + 9.4924i 10.4721 + 10.6861i 2.0447 + 8.9911i -4.0000 + 5.1962i -6.4721 + 0.7265i -5.4611 - 2.8856i
Columns 7 through 9:
-2.1893 - 4.5489i 1.5279 - 3.9757i 4.0000 - 1.7321i
Related
I have some confusion about the terminologies and simulation of an FIR system. I shall appreciate help in rectifying my mistakes and informing what is correct.
Assuming a FIR filter with coefficient array A=[1,c2,c3,c4]. The number of elements are L so the length of the filter L but the order is L-1.
Confusion1: Is the intercept 1 considered as a coefficient? Is it always 1?
Confusion2: Is my understanding correct that for the given example the length L= 4 and order=3?
Confusion3: Mathematically, I can write it as:
where u is the input data and l starts from zero. Then to simulate the above equation I have done the following convolution. Is it correct?:
N =100; %number of data
A = [1, 0.1, -0.5, 0.62];
u = rand(1,N);
x(1) = 0.0;
x(2) = 0.0;
x(3) = 0.0;
x(4) = 0.0;
for n = 5:N
x(n) = A(1)*u(n) + A(2)*u(n-1)+ A(3)*u(n-3)+ A(4)*u(n-4);
end
Confusion1: Is the intercept 1 considered as a coefficient? Is it always 1?
Yes it is considered a coefficient, and no it isn't always 1. It is very common to include a global scaling factor in the coefficient array by multiplying all the coefficients (i.e. scaling the input or output of a filter with coefficients [1,c1,c2,c2] by K is equivalent to using a filter with coefficients [K,K*c1,K*c2,K*c3]). Also note that many FIR filter design techniques generate coefficients whose amplitude peaks near the middle of the coefficient array and taper off at the start and end.
Confusion2: Is my understanding correct that for the given example the length L= 4 and order = 3?
Yes, that is correct
Confusion3: [...] Then to simulate the above equation I have done the following convolution. Is it correct? [...]
Almost, but not quite. Here are the few things that you need to fix.
In the main for loop, applying the formula you would increment the index of A and decrement the index of u by 1 for each term, so you would actually get x(n) = A(1)*u(n) + A(2)*u(n-1)+ A(3)*u(n-2)+ A(4)*u(n-3)
You can actually start this loop at n=4
The first few outputs should still be using the formula, but dropping the terms u(n-k) for which n-k would be less than 1. So, for x(3) you'd be dropping 1 term, for x(2) you'd be dropping 2 terms and for x(1) you'd be dropping 3 terms.
The modified code would look like the following:
x(1)=A(1)*u(1);
x(2)=A(1)*u(2) + A(2)*u(1);
x(3)=A(1)*u(3) + A(2)*u(2) + A(3)*u(1);
for n = 4:N
x(n) = A(1)*u(n) + A(2)*u(n-1)+ A(3)*u(n-2)+ A(4)*u(n-3);
end
I have a 151-by-151 matrix A. It's a correlation matrix, so there are 1s on the main diagonal and repeated values above and below the main diagonal. Each row/column represents a person.
For a given integer n I will seek to reduce the size of the matrix by kicking people out, such that I am left with a n-by-n correlation matrix that minimises the total sum of the elements. In addition to obtaining the abbreviated matrix, I also need to know the row number of the people who should be booted out of the original matrix (or their column number - they'll be the same number).
As a starting point I take A = tril(A), which will remove redundant off-diagonal elements from the correlation matrix.
So, if n = 4 and we have the hypothetical 5-by-5 matrix above, it's very clear that person 5 should be kicked out of the matrix, since that person is contributing a lot of very high correlations.
It's also clear that person 1 should not be kicked out, since that person contributes a lot of negative correlations, and thus brings down the sum of the matrix elements.
I understand that sum(A(:)) will sum everything in the matrix. However, I'm very unclear about how to search for the minimum possible answer.
I noticed a similar question Finding sub-matrix with minimum elementwise sum, which has a brute force solution as the accepted answer. While that answer works fine there it's impractical for a 151-by-151 matrix.
EDIT: I had thought of iterating, but I don't think that truly minimizes the sum of elements in the reduced matrix. Below I have a 4-by-4 correlation matrix in bold, with sums of rows and columns on the edges. It's apparent that with n = 2 the optimal matrix is the 2-by-2 identity matrix involving Persons 1 and 4, but according to the iterative scheme I would have kicked out Person 1 in the first phase of iteration, and so the algorithm makes a solution that is not optimal. I wrote a program that always generated optimal solutions, and it works well when n or k are small, but when trying to make an optimal 75-by-75 matrix from a 151-by-151 matrix I realised my program would take billions of years to terminate.
I vaguely recalled that sometimes these n choose k problems can be resolved with dynamic programming approaches that avoid recomputing things, but I can't work out how to solve this, and nor did googling enlighten me.
I'm willing to sacrifice precision for speed if there's no other option, or the best program will take more than a week to generate a precise solution. However, I'm happy to let a program run for up to a week if it will generate a precise solution.
If it's not possible for a program to optimise the matrix within an reasonable timeframe, then I would accept an answer that explains why n choose k tasks of this particular sort can't be resolved within reasonable timeframes.
This is an approximate solution using a genetic algorithm.
I started with your test case:
data_points = 10; % How many data points will be generated for each person, in order to create the correlation matrix.
num_people = 25; % Number of people initially.
to_keep = 13; % Number of people to be kept in the correlation matrix.
to_drop = num_people - to_keep; % Number of people to drop from the correlation matrix.
num_comparisons = 100; % Number of times to compare the iterative and optimization techniques.
for j = 1:data_points
rand_dat(j,:) = 1 + 2.*randn(num_people,1); % Generate random data.
end
A = corr(rand_dat);
then I defined the functions you need to evolve the genetic algorithm:
function individuals = user1205901individuals(nvars, FitnessFcn, gaoptions, num_people)
individuals = zeros(num_people,gaoptions.PopulationSize);
for cnt=1:gaoptions.PopulationSize
individuals(:,cnt)=randperm(num_people);
end
individuals = individuals(1:nvars,:)';
is the individual generation function.
function fitness = user1205901fitness(ind, A)
fitness = sum(sum(A(ind,ind)));
is the fitness evaluation function
function offspring = user1205901mutations(parents, options, nvars, FitnessFcn, state, thisScore, thisPopulation, num_people)
offspring=zeros(length(parents),nvars);
for cnt=1:length(parents)
original = thisPopulation(parents(cnt),:);
extraneus = setdiff(1:num_people, original);
original(fix(rand()*nvars)+1) = extraneus(fix(rand()*(num_people-nvars))+1);
offspring(cnt,:)=original;
end
is the function to mutate an individual
function children = user1205901crossover(parents, options, nvars, FitnessFcn, unused, thisPopulation)
children=zeros(length(parents)/2,nvars);
cnt = 1;
for cnt1=1:2:length(parents)
cnt2=cnt1+1;
male = thisPopulation(parents(cnt1),:);
female = thisPopulation(parents(cnt2),:);
child = union(male, female);
child = child(randperm(length(child)));
child = child(1:nvars);
children(cnt,:)=child;
cnt = cnt + 1;
end
is the function to generate a new individual coupling two parents.
At this point you can define your problem:
gaproblem2.fitnessfcn=#(idx)user1205901fitness(idx,A)
gaproblem2.nvars = to_keep
gaproblem2.options = gaoptions()
gaproblem2.options.PopulationSize=40
gaproblem2.options.EliteCount=10
gaproblem2.options.CrossoverFraction=0.1
gaproblem2.options.StallGenLimit=inf
gaproblem2.options.CreationFcn= #(nvars,FitnessFcn,gaoptions)user1205901individuals(nvars,FitnessFcn,gaoptions,num_people)
gaproblem2.options.CrossoverFcn= #(parents,options,nvars,FitnessFcn,unused,thisPopulation)user1205901crossover(parents,options,nvars,FitnessFcn,unused,thisPopulation)
gaproblem2.options.MutationFcn=#(parents, options, nvars, FitnessFcn, state, thisScore, thisPopulation) user1205901mutations(parents, options, nvars, FitnessFcn, state, thisScore, thisPopulation, num_people)
gaproblem2.options.Vectorized='off'
open the genetic algorithm tool
gatool
from the File menu select Import Problem... and choose gaproblem2 in the window that opens.
Now, run the tool and wait for the iterations to stop.
The gatool enables you to change hundreds of parameters, so you can trade speed for precision in the selected output.
The resulting vector is the list of indices that you have to keep in the original matrix so A(garesults.x,garesults.x) is the matrix with only the desired persons.
If I have understood you problem statement, you have a N x N matrix M (which happens to be a correlation matrix), and you wish to find for integer n where 2 <= n < N, a n x n matrix m which minimises the sum over all elements of m which I denote f(m)?
In Matlab it is fairly easy and fast to obtain a sub-matrix of a matrix (see for example Removing rows and columns from matrix in Matlab), and the function f is relatively inexpensive to evaluate for n = 151. So why can't you implement an algorithm that solves this backwards dynamically in a program as below where I have sketched out the pseudocode:
function reduceM(M, n){
m = M
for (ii = N to n+1) {
for (jj = 1 to ii) {
val(jj) = f(m) where mhas column and row jj removed, f(X) being summation over all elements of X
}
JJ(ii) = jj s.t. val(jj) is smallest
m = m updated by removing column and row JJ(ii)
}
}
In the end you end up with an m of dimension n which is the solution to your problem and a vector JJ which contains the indices removed at each iteration (you should easily be able to convert these back to indices applicable to the full matrix M)
There are several approaches to finding an approximate solution (eg. quadratic programming on relaxed problem or greedy search), but finding the exact solution is an NP-hard problem.
Disclaimer: I'm not an expert on binary quadratic programming, and you may want to consult the academic literature for more sophisticated algorithms.
Mathematically equivalent formulation:
Your problem is equivalent to:
For some symmetric, positive semi-definite matrix S
minimize (over vector x) x'*S*x
subject to 0 <= x(i) <= 1 for all i
sum(x)==n
x(i) is either 1 or 0 for all i
This is a quadratic programming problem where the vector x is restricted to taking only binary values. Quadratic programming where the domain is restricted to a set of discrete values is called mixed integer quadratic programming (MIQP). The binary version is sometimes called Binary Quadratic Programming (BQP). The last restriction, that x is binary, makes the problem substantially more difficult; it destroys the problem's convexity!
Quick and dirty approach to finding an approximate answer:
If you don't need a precise solution, something to play around with might be a relaxed version of the problem: drop the binary constraint. If you drop the constraint that x(i) is either 1 or 0 for all i, then the problem becomes a trivial convex optimization problem and can be solved nearly instantaneously (eg. by Matlab's quadprog). You could try removing entries that, on the relaxed problem, quadprog assigns the lowest values in the x vector, but this does not truly solve the original problem!
Note also that the relaxed problem gives you a lower bound on the optimal value of the original problem. If your discretized version of the solution to the relaxed problem leads to a value for the objective function close to the lower bound, there may be a sense in which this ad-hoc solution can't be that far off from the true solution.
To solve the relaxed problem, you might try something like:
% k is number of observations to drop
n = size(S, 1);
Aeq = ones(1,n)
beq = n-k;
[x_relax, f_relax] = quadprog(S, zeros(n, 1), [], [], Aeq, beq, zeros(n, 1), ones(n, 1));
f_relax = f_relax * 2; % Quadprog solves .5 * x' * S * x... so mult by 2
temp = sort(x_relax);
cutoff = temp(k);
x_approx = ones(n, 1);
x_approx(x_relax <= cutoff) = 0;
f_approx = x_approx' * S * x_approx;
I'm curious how good x_approx is? This doesn't solve your problem, but it might not be horrible! Note that f_relax is a lower bound on the solution to the original problem.
Software to solve your exact problem
You should check out this link and go down to the section on Mixed Integer Quadratic Programming (MIQP). It looks to me that Gurobi can solve problems of your type. Another list of solvers is here.
Working on a suggestion from Matthew Gunn and also some advice at the Gurobi forums, I came up with the following function. It seems to work pretty well.
I will award it the answer, but if someone can come up with code that works better I'll remove the tick from this answer and place it on their answer instead.
function [ values ] = the_optimal_method( CM , num_to_keep)
%the_iterative_method Takes correlation matrix CM and number to keep, returns list of people who should be kicked out
N = size(CM,1);
clear model;
names = strseq('x',[1:N]);
model.varnames = names;
model.Q = sparse(CM); % Gurobi needs a sparse matrix as input
model.A = sparse(ones(1,N));
model.obj = zeros(1,N);
model.rhs = num_to_keep;
model.sense = '=';
model.vtype = 'B';
gurobi_write(model, 'qp.mps');
results = gurobi(model);
values = results.x;
end
i am trying to learn how to vectorise matlab loops, so im just doing a few small examples.
here is the standard loop i am trying to vectorise:
function output = moving_avg(input, N)
output = [];
for n = N:length(input) % iterate over y vector
summation = 0;
for ii = n-(N-1):n % iterate over x vector N times
summation += input(ii);
endfor
output(n) = summation/N;
endfor
endfunction
i have been able to vectorise one loop, but cant work out what to do with the second loop. here is where i have got to so far:
function output = moving_avg(input, N)
output = [];
for n = N:length(input) % iterate over y vector
output(n) = mean(input(n-(N-1):n));
endfor
endfunction
can someone help me simplify it further?
EDIT:
the input is just a one dimensional vector and probably maximum 100 data points. N is a single integer, less than the size of the input (typically probably around 5)
i don't actually intend to use it for any particular application, it was just a simple nested loop that i thought would be good to use to learn about vectorisation..
Seems like you are performing convolution operation there. So, just use conv -
output = zeros(size(input1))
output(N:end) = conv(input1,ones(1,N),'valid')./N
Please note that I have replaced the variable name input with input1, as input is already used as the name of a built-in function in MATLAB, so it's a good practice to avoid such conflicts.
Generic case: For a general case scenario, you can look into bsxfun to create such groups and then choose your operation that you intend to perform at the final stage. Here's how such a code would look like for sliding/moving average operation -
%// Create groups of indices for each sliding interval of length N
idx = bsxfun(#plus,[1:N]',[0:numel(input1)-N]) %//'
%// Index into input1 with those indices to get grouped elements from it along columns
input1_indexed = input1(idx)
%// Finally, choose the operation you intend to perform and apply along the
%// columns. In this case, you are doing average, so use mean(...,1).
output = mean(input1_indexed,1)
%// Also pre-append with zeros if intended to match up with the expected output
Matlab as a language does this type of operation poorly - you will always require an outside O(N) loop/operation involving at minimum O(K) copies which will not be worth it in performance to vectorize further because matlab is a heavy weight language. Instead, consider using the
filter function where these things are typically implemented in C which makes that type of operation nearly free.
For a sliding average, you can use cumsum to minimize the number of operations:
x = randi(10,1,10); %// example input
N = 3; %// window length
y = cumsum(x); %// compute cumulative sum of x
z = zeros(size(x)); %// initiallize result to zeros
z(N:end) = (y(N:end)-[0 y(1:end-N)])/N; %// compute order N difference of cumulative sum
I want a code the below code more efficient timewise. preferably without a loop.
arguments:
t % time values vector
t_index = c % one of the possible indices ranging from 1:length(t).
A % a MXN array where M = length(t)
B % a 1XN array
code:
m = 1;
for k = t_index:length(t)
A(k,1:(end-m+1)) = A(k,1:(end-m+1)) + B(m:end);
m = m + 1;
end
Many thanks.
I'd built from B a matrix of size NxM (call it B2), with zeros in the right places and a triangular from according to the conditions and then all you need to do is A+B2.
something like this:
N=size(A,2);
B2=zeros(size(A));
k=c:length(t);
B2(k(1):k(N),:)=hankel(B)
ans=A+B2;
Note, the fact that it is "vectorized" doesn't mean it is faster these days. Matlab's JIT makes for loops comparable and sometimes faster than built-in vectorized options.
I'm trying to use MatLab code as a way to learn math as a programmer.
So reading I'm this post about subspaces and trying to build some simple matlab functions that do it for me.
Here is how far I got:
function performSubspaceTest(subset, numArgs)
% Just a quick and dirty function to perform subspace test on a vector(subset)
%
% INPUT
% subset is the anonymous function that defines the vector
% numArgs is the the number of argument that subset takes
% Author: Lasse Nørfeldt (Norfeldt)
% Date: 2012-05-30
% License: http://creativecommons.org/licenses/by-sa/3.0/
if numArgs == 1
subspaceTest = #(subset) single(rref(subset(rand)+subset(rand))) ...
== single(rref(rand*subset(rand)));
elseif numArgs == 2
subspaceTest = #(subset) single(rref(subset(rand,rand)+subset(rand,rand))) ...
== single(rref(rand*subset(rand,rand)));
end
% rand just gives a random number. Converting to single avoids round off
% errors.
% Know that the code can crash if numArgs isn't given or bigger than 2.
outcome = subspaceTest(subset);
if outcome == true
display(['subset IS a subspace of R^' num2str(size(outcome,2))])
else
display(['subset is NOT a subspace of R^' num2str(size(outcome,2))])
end
And these are the subset that I'm testing
%% Checking for subspaces
V = #(x) [x, 3*x]
performSubspaceTest(V, 1)
A = #(x) [x, 3*x+1]
performSubspaceTest(A, 1)
B = #(x) [x, x^2, x^3]
performSubspaceTest(B, 1)
C = #(x1, x3) [x1, 0, x3, -5*x1]
performSubspaceTest(C, 2)
running the code gives me this
V =
#(x)[x,3*x]
subset IS a subspace of R^2
A =
#(x)[x,3*x+1]
subset is NOT a subspace of R^2
B =
#(x)[x,x^2,x^3]
subset is NOT a subspace of R^3
C =
#(x1,x3)[x1,0,x3,-5*x1]
subset is NOT a subspace of R^4
The C is not working (only works if it only accepts one arg).
I know that my solution for numArgs is not optimal - but it was what I could come up with at the current moment..
Are there any way to optimize this code so C will work properly and perhaps avoid the elseif statments for more than 2 args..?
PS: I couldn't seem to find a build-in matlab function that does the hole thing for me..
Here's one approach. It tests if a given function represents a linear subspace or not. Technically it is only a probabilistic test, but the chance of it failing is vanishingly small.
First, we define a nice abstraction. This higher order function takes a function as its first argument, and applies the function to every row of the matrix x. This allows us to test many arguments to func at the same time.
function y = apply(func,x)
for k = 1:size(x,1)
y(k,:) = func(x(k,:));
end
Now we write the core function. Here func is a function of one argument (presumed to be a vector in R^m) which returns a vector in R^n. We apply func to 100 randomly selected vectors in R^m to get an output matrix. If func represents a linear subspace, then the rank of the output will be less than or equal to m.
function result = isSubspace(func,m)
inputs = rand(100,m);
outputs = apply(func,inputs);
result = rank(outputs) <= m;
Here it is in action. Note that the functions take only a single argument - where you wrote c(x1,x2)=[x1,0,x2] I write c(x) = [x(1),0,x(2)], which is slightly more verbose, but has the advantage that we don't have to mess around with if statements to decide how many arguments our function has - and this works for functions that take input in R^m for any m, not just 1 or 2.
>> v = #(x) [x,3*x]
>> isSubspace(v,1)
ans =
1
>> a = #(x) [x(1),3*x(1)+1]
>> isSubspace(a,1)
ans =
0
>> c = #(x) [x(1),0,x(2),-5*x(1)]
>> isSubspace(c,2)
ans =
1
The solution of not being optimal barely scratches the surface of the problem.
I think you're doing too much at once: rref should not be used and is complicating everything. especially for numArgs greater then 1.
Think it through: [1 0 3 -5] and [3 0 3 -5] are both members of C, but their sum [4 0 6 -10] (which belongs in C) is not linear product of the multiplication of one of the previous vectors (e.g. [2 0 6 -10] ). So all the rref in the world can't fix your problem.
So what can you do instead?
you need to check if
(randn*subset(randn,randn)+randn*subset(randn,randn)))
is a member of C, which, unless I'm mistaken is a difficult problem: Conceptually you need to iterate through every element of the vector and make sure it matches the predetermined condition. Alternatively, you can try to find a set such that C(x1,x2) gives you the right answer. In this case, you can use fminsearch to solve this problem numerically and verify the returned value is within a defined tolerance:
[s,error] = fminsearch(#(x) norm(C(x(1),x(2)) - [2 0 6 -10]),[1 1])
s =
1.999996976386119 6.000035034493023
error =
3.827680714104862e-05
Edit: you need to make sure you can use negative numbers in your multiplication, so don't use rand, but use something else. I changed it to randn.