Compare common value of a timeseries - matlab

I have timeseries that don't have the same start time and I would like to find the common part of them.
EX:
a=[ 0,1,2,3,4,5,6,7]
b=[ 2,3,4,5,6,7,8,9]
c=[-1,0,1,2,3,4,5,6]
result=[2,3,4,5,6]
Is there a matlab function to do that ?
EDIT:
I found an algorithm, but it is taking for ever and it is saturating my memory to analyse 6 timeseries of 100000 points. Is the algorithm not written properly or is it the way Longest common substring problem is?

The problem is called the Longest common substring problem.
http://en.wikipedia.org/wiki/Longest_common_substring_problem
It is not really hard to implement, and you you can probably also find Matlab code online. It is important to observe that if you know how to solve for 2 time series, you know how to solve for N, because: c(x,y,z) = c(x,c(y,z))

Related

Using Gurobi to run a MIQP: how can I improve time performance?

I am using Gurobi to run a MIQP (Mixed Integer Quadratic Programming) with linear constraints in Matlab. The solver is very slow and I would like your help to understand whether I can do something about it.
These are the lines which I use to launch the problem
clear model;
clear params;
model.A=[Aineq; Aeq];
model.rhs=[bineq; beq];
model.sense=[repmat('<', size(Aineq,1),1); repmat('=', size(Aeq,1),1)];
model.Q=Q;
model.obj=c;
model.vtype=type;
model.lb=total_lb;
model.ub=total_ub;
params.MIPGap=10^(-1);
result=gurobi(model,params);
This is a screenshot of the output in the Matlab window.
Question 1: It is the first time I am trying to run a MIQP and I would like to have your advice to understand what I can do to improve performance. Let me tell what I have tried so far:
I cheated by imposing params.MIPGap=10^(-1). In this way the phase of node exploration is made shorter. What are the cons of doing this?
I have big-M coefficients and I have tied them to the smallest possible values.
I have tried setting params.ScaleFlag=2; params.ObjScale=2 but it makes things slower
I have changed params.method but it does not seem to help (unless you have some specific recommendation)
I have increase params.Threads but it does not seem to help
Question 2 (minor): Why do I get a negative objective in the root simplex log? How can the objective function be negative?
Without having the full model here, there is not much on advise to give. Tight Big-M formulations are important, but you said, you checked them already. Sometimes splitting them up might help, but this is a complex field.
What might give great benefits for some problems is using the Gurobi parameter tuning tool. So try to export your model and feed the tuning tool with it. It automatically tries different of the hundreds of tuning parameters and might give some nice results.
Regarding the question about negative objectives in the simplex logs, I can think of a couple of possible explanations. First, note that the negative objective values occur in the presence of dual infeasibilities in the dual simplex run. In such a case, I'm not sure exactly what the primal objective values correspond to. Second, if you have a MIQP with products of binaries in the objective, Gurobi may convexify the objective in a way that makes it possible for a negative objective to appear in the reformulated model even when the original model must have a nonnegative objective in any feasible solution.

faster way to add many large matrix in matlab

Say I have many (around 1000) large matrices (about 1000 by 1000) and I want to add them together element-wise. The very naive way is using a temp variable and accumulates in a loop. For example,
summ=0;
for ii=1:20
for jj=1:20
summ=summ+ rand(400);
end
end
After searching on the Internet for some while, someone said it's better to do with the help of sum(). For example,
sump=zeros(400,400,400);
count=0;
for ii=1:20
for j=1:20
count=count+1;
sump(:,:,count)=rand(400);
end
end
sum(sump,3);
However, after I tested two ways, the result is
Elapsed time is 0.780819 seconds.
Elapsed time is 1.085279 seconds.
which means the second method is even worse.
So I am just wondering if there any effective way to do addition? Assume that I am working on a computer with very large memory and a GTX 1080 (CUDA might be helpful but I don't know whether it's worthy to do so since communication also takes time.)
Thanks for your time! Any reply will be highly appreciated!.
The fastes way is to not use any loops in matlab at all.
In many cases, the internal functions of matlab all well optimized to use SIMD or other acceleration techniques.
An example for using the build in functionalities to create matrices of the desired size is X = rand(sz1,...,szN).
In your explicit case sum(rand(400,400,400),3) should give you then the fastest result.

Solving approach for a series

I am having a great trouble on finding the solution of this series.
index 1 2 3 4 5
number 0 1 5 15 35
here say first index is an exception but what is the solution for that series to pick an index & get the number. Please add your Explanation of the solving approach.
I would also like to have some extra example for solving approach of other this kind of series.
The approaches to solve a general series matching problem vary a lot, depending on the information you have about the series. You can start with reading up on time series.
For this series you can easily google it and find out they're related to the binomial coefficients like n!/(n-4)!/4! . Taking into account i, it will be something like (i+3)!/4!/(i-1)!

How can I swap a section of a row with another within an array?

I am in the process of coding a simple Genetic Algorithm (GA). There are probably countless areas where I have unnecessarily used a for loop. I would like some tips on how to be more MATLAB efficient as well as an answer to my question. As far as I can tell I have succeeded but I am not sure. The area which this code defines is single-point crossover
Here is what I have tried...
crossPoints=randi([1 24],popSize/2,1);
for popNo=2:2:popSize
isolate=chromoParent(popNo-1:popNo,crossPoints(popNo/2,1)+1:end);
isolate([1 2],:)=isolate([2 1],:);
chromoParent(popNo-1:popNo,crossPoints(popNo/2,1)+1:end)=isolate;
end
chromoChild=chromoParent;
where, 'crossPoints' is the point at which single point crossover
between two binary encoded chromosomes is required.
'popSize' is the size of the population, required by my code to
be an even number
'isolate' defines the sections of 2 rows which are required to be swapped
with each other
'chromoParent' is the initial population which is required to be
changed by single-point crossover
'chromoChild' is the resulting population
Both 'chromoParent' and 'chromoChild' are represented by an array of
size, popSize x 25 binary characters
Can you spot an error in the way I am thinking about this problem? What's the most efficient way (in computational time) to achieve the same thing? It would help if you could be as broad as possible so that I could begin applying the principles I learn here to the rest of my code.
Thank you.
Your code looks fine. If you want, you can reduce the instructions in the loop to a single line by some very simple indexing:
chromoParent( popNo-1:popNo, crossPoints(popNo/2,1)+1:end) = ...
chromoParent(popNo:-1:popNo-1,crossPoints(popNo/2,1)+1:end);
This may be marginally faster, but as with any optimization, you should profile it first (My guess is that these line contribute very little to the overall CPU time).

Merge sensor data for clustering/neural net usage

I have several datasets i.e. matrices that have a 2 columns, one with a matlab date number and a second one with a double value. Here an example set of one of them
>> S20_EavesN0x2DEAir(1:20,:)
ans =
1.0e+05 *
7.345016409722222 0.000189375000000
7.345016618055555 0.000181875000000
7.345016833333333 0.000177500000000
7.345017041666667 0.000172500000000
7.345017256944445 0.000168750000000
7.345017465277778 0.000166875000000
7.345017680555555 0.000164375000000
7.345017888888889 0.000162500000000
7.345018104166667 0.000161250000000
7.345018312500001 0.000160625000000
7.345018527777778 0.000158750000000
7.345018736111110 0.000160000000000
7.345018951388888 0.000159375000000
7.345019159722222 0.000159375000000
7.345019375000000 0.000160625000000
7.345019583333333 0.000161875000000
7.345019798611111 0.000162500000000
7.345020006944444 0.000161875000000
7.345020222222222 0.000160625000000
7.345020430555556 0.000160000000000
Now that I have those different sensor values, I need to get them together into a matrix, so that I could perform clustering, neural net and so on, the only problem is, that the sensor data was taken with slightly different timings or timestamps and there is nothing I can do about that from a data collection point of view.
My first thought was interpolation to make one sensor data set fit another one, but that seems like a messy approach and I was thinking maybe I am missing something, a toolbox or function that would enable me to do this quicker without me fiddling around. To even complicate things more, the number of sensors grew over time, therefore I am looking at different start dates as well.
Someone a good idea on how to go about this? Thanks
I think your first thought about interpolation was the correct one, at least if you plan to use NNs. Another option would be to use approaches which are designed to deal with missing data, like http://en.wikipedia.org/wiki/Dempster%E2%80%93Shafer_theory for example.
It's hard to give an answer for the clustering part, because I have no idea what you're looking for in the data.
For the neural network, beside interpolating there are at least two other methods that come to mind:
training separate networks for each matrix
feeding them all together to the same network, with a flag specifying which matrix the data is coming from, i.e. something like: input (timestamp, flag_m1, flag_m2, ..., flag_mN) => target (value) where the flag_m* columns are mutually exclusive boolean values - i.e. flag_mK is 1 iff the line comes from matrix K, 0 otherwise.
These are the only things I can safely say with the amount of information you provided.