Trying to produce exponential traffic - matlab

I'm trying to simulate an optical network algorithm in MATLAB for a homework project. Most of it is already done, but I have an issue with the diagrams I'm getting.
In the simulation I'm generating exponential traffic. However, for low lambda values (0.1) I'm getting very high packet drop rates (99%). I wrote a sample here which is very close to the testbench I'm running on my simulator.
% Run the simulation 10 times, with different lambda values
l = [1 2 3 4 5 6 7 8 9 10];
for i=l(1):l(end)
X = rand();
% In the 'real' simulation the following line defines the time
% when the next packet generation event will occur. Suppose that
% i is the current time
t_poiss = i + ceil((-log(X)/(i/10)));
distr(i)=t_poiss;
end
figure, plot(distr)
axis square
grid on;
title('Exponential test:')
The resulting image is: [figure not included: plot of distr, titled 'Exponential test:']
The diagram I'm getting in this sample is IDENTICAL to the diagram I'm getting for the drop rate/λ. So I would like to ask if I'm doing something wrong or if I'm missing something. Is this the right thing to expect?

The problem might be a numerical one. Since you are generating a random number for X, the number might be very small, say, close to zero. If X is numerically close to zero, -log(X) is going to be large, so your calculation of t_poiss will be large too. I would suggest keeping X away from zero, for example with X = max(rand(), eps), so that log(X) stays bounded while X remains in (0,1).
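To get a feel for how large the gaps normally are, it may help to draw many inter-arrival times at once instead of one per lambda. The following is only a sketch: the names lambda, nSamples and gaps are illustrative and not from the original testbench. It uses the same -log(X)/lambda formula and compares the sample mean with the theoretical mean 1/lambda of an exponential distribution.

```matlab
rng(0);                      % fixed seed so the run is repeatable
lambda   = 0.1;              % arrival rate, the low value from the question
nSamples = 1e5;              % illustrative sample count

X    = rand(nSamples, 1);    % uniform random numbers on (0,1)
gaps = -log(X) / lambda;     % inverse-transform sampling of Exp(lambda)

fprintf('mean gap    = %.2f (theory: %.2f)\n', mean(gaps), 1/lambda);
fprintf('largest gap = %.1f\n', max(gaps));
```

If the mean gap is already large compared with the time scale of the simulation, long waits between packets are expected behaviour at that lambda and not only a side effect of X being close to zero.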


Modeling an hrf time series in MATLAB

I'm attempting to model fMRI data so I can check the efficacy of an experimental design. I have been following a couple of tutorials and have a question.
I first need to model the BOLD response by convolving a stimulus input time series with a canonical haemodynamic response function (HRF). The first tutorial I checked said that one can make an HRF that is of any amplitude as long as the 'shape' of the HRF is correct so they created the following HRF in matlab:
hrf = [ 0 0 1 5 8 9.2 9 7 4 2 0 -1 -1 -0.8 -0.7 -0.5 -0.3 -0.1 0 ]
And then convolved the HRF with the stimulus by just using 'conv' so:
hrf_convolved_with_stim_time_series = conv(input,hrf);
This is very straightforward, but I want my model to eventually be as accurate as possible, so I checked a more advanced tutorial and they did the following. First they created a vector of 20 timepoints, then used the 'gampdf' function to create the HRF.
t = 1:1:20; % MEASUREMENTS
h = gampdf(t,6) + -.5*gampdf(t,10); % HRF MODEL
h = h/max(h); % SCALE HRF TO HAVE MAX AMPLITUDE OF 1
Is there a benefit to doing it this way over the simpler one? I suppose I have 3 specific questions.
The 'gampdf' help page is super short and only says the '6' and '10' in each function call represents 'A' which is a 'shape' parameter. What does this mean? It gives no other information. Why is it 6 in the first call and 10 in the second?
This question is directly related to the above one. This code is written for a situation where there is a TR = 1 and the stimulus is very short (like 1s). In my situation my TR = 2 and my stimulus is quite long (12s). I tried to adapt the above code to make a working HRF for my situation by doing the following:
t = 1:2:40; % 2s timestep with the 40 to try to equate total time to above
h = gampdf(t,6) + -.5*gampdf(t,10); % HRF MODEL
h = h/max(h); % SCALE HRF TO HAVE MAX AMPLITUDE OF 1
Because I have no idea what the 'gampdf' parameters mean (or what that line does, in all actuality) I'm not sure this gives me what I'm looking for. I essentially get out 20 values where 1-14 have SOME numeric value in them but 15-20 are all 0. I'm assuming there will be a response during the entire 12s stimulus period (first 6 TRs so values 1-6) with the appropriate rectification which could be the rest of the values but I'm not sure.
Final question. The other code does not 'scale' the HRF to have an amplitude of 1. Will that matter, ultimately?
The canonical HRF you choose is dependent upon where in the brain the BOLD signal is coming from. It would be inappropriate to choose just any HRF. Your best source of a model is going to come from a lit review. I've linked a paper discussing the merits of multiple HRF models. The methods section brings up some salient points.
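On the narrower question of what the 'gampdf' line does, a small sketch may help. Assumptions are labelled loudly: the time axis 0:TR:30, the run length nScans and the helper gpdf are made up for illustration, and gpdf just writes out the gamma density with shape a and scale 1 (the curve gampdf(t,a) returns) so that no toolbox is needed. A gamma density with shape a and scale 1 peaks at t = a-1, so the shape-6 term peaks around 5 s (the main response) and the shape-10 term around 9 s (the undershoot that gets subtracted).

```matlab
% Gamma density with shape a and scale 1, written out by hand
gpdf = @(t, a) t.^(a-1) .* exp(-t) ./ gamma(a);

TR = 2;                              % seconds per sample, as in the question
t  = 0:TR:30;                        % time axis in seconds (assumed length)
h  = gpdf(t, 6) - 0.5*gpdf(t, 10);   % response term minus undershoot term
h  = h / max(h);                     % scale so the peak is 1

stimSeconds = 12;                    % stimulus duration from the question
nScans = 40;                         % hypothetical run length in TRs
stim = zeros(1, nScans);
stim(1:stimSeconds/TR) = 1;          % boxcar: "on" for the first 6 TRs

bold = conv(stim, h);                % predicted response
bold = bold(1:nScans);               % trim to the run length
```

Because convolution is linear, rescaling h only rescales bold by the same factor, so the normalisation step changes the amplitude of the prediction and not its shape.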

Finding start point when signal becomes periodic

I am trying to find the mean of three cycles after the signal becomes periodic and reaches steady state. I have a signal that is not periodic at the beginning, but after some time it becomes periodic. I want to find the mean of the next three cycles, where each cycle has five points.
So far I have done that by opening the plot and finding the point where the signal becomes periodic, then entering that point into MATLAB to get the results. The program works fine, but I have a big problem: I have 500,000 data records, and it's impossible to open each one and find the starting point where the signal becomes periodic. Is there any way to find the starting point without opening the plot? Each case has a different starting point where the signal becomes periodic.
I used below code now
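close all; clear variables;
clc;
% NOTE: 'signal' is a placeholder name for the 1-by-M data row; load it here.
prompt = 'Enter Starting Point? ';
N = input(prompt);
% Average the means of three consecutive 5-point cycles starting at N
Result = mean([mean(signal(1,N:N+4)), mean(signal(1,N+5:N+9)), mean(signal(1,N+10:N+14))]);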
I attached sample of data, Column one is the signal and column two is the time.
https://www.dropbox.com/sh/27lebrp1lwnmm3l/AABIhN1tzUSJQjjED954Yvyka?dl=0
Thank you!
Full edit:
%inputs: time and y (the response), both vectors of the same length
ppc = 5; % points per cycle
A = zeros(ppc,1);
for i = 1:ppc
A(i) = mean(y(i:ppc:length(y)));
end
[~,b] = min(A);
possidx = fliplr(b:ppc:length(y)); %idx of lowest points, last cycle first
lowlist = y(possidx); %lowest points, last cycle first
for i = 2:length(lowlist) %start from behind
se = std(lowlist(1:i))/sqrt(i); %calculate SE for all current points
if se > 0.05 %depending on your field you might want to change it to a lower value
periodstart = time(possidx(i-1)); %lowest point of first period
break
end
end
What it does: the first loop finds which group of points is always at the bottom. So adjust ppc to 10 if you have 10 points per cycle. The points per cycle don't have to be exactly the same for each cycle; if you have a lot of cycles, it should still be reasonably accurate.
Then we add these lowest points one by one, starting from the end of the signal, and calculate the standard error. Once it is greater than 0.05, we have left the periodic part.
I took the liberty of using the standard error because it is something I know and it makes sense in this situation. I set the threshold to 0.05 because it's standard in many fields; alter it if it is different in your field.
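To check the idea before running it on 500,000 records, it can be tried on a signal whose start of periodicity is known. This is a sketch on made-up data: pattern, the flat start-up phase, startIdx and threeCycleMean are all illustrative names and values, not taken from the linked sample. It builds the index list of lowest points directly from b:ppc:numel(y), so it also works when the record length is not a multiple of ppc.

```matlab
ppc = 5;                                % points per cycle
pattern = [0 1 2 1 -2];                 % made-up cycle shape, lowest point last
y = repmat(pattern, 1, 40);             % 40 cycles = 200 samples
y(1:20) = 3;                            % made-up non-periodic start-up phase

% Which position inside a cycle is lowest on average?
A = zeros(ppc, 1);
for i = 1:ppc
    A(i) = mean(y(i:ppc:end));
end
[~, b] = min(A);

% Walk backwards over the lowest points until their standard error jumps
possidx  = fliplr(b:ppc:numel(y));      % indices of lowest points, last first
lowlist  = y(possidx);
startIdx = possidx(end);                % fallback if the threshold is never hit
for i = 2:numel(lowlist)
    se = std(lowlist(1:i)) / sqrt(i);
    if se > 0.05
        startIdx = possidx(i-1);        % lowest point of first periodic cycle
        break
    end
end

% Mean of the three cycles (15 points) that follow the detected point
threeCycleMean = mean(y(startIdx+1 : startIdx+3*ppc));
fprintf('periodic from sample %d, three-cycle mean = %.3f\n', startIdx, threeCycleMean);
```

With this abrupt start-up phase the detected point lands on the lowest sample of the first fully periodic cycle. For a slowly decaying transient the standard error grows more gradually, so the 0.05 threshold may need tuning on a few records where the start point is known.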

MATLAB Simple - Linear Predictive Coding and Energy Forecasting

I have a dataset with 274 samples (9 months) of the daily energy (watt-hours) used by a residential household. I'm not sure if I'm applying the lpc function correctly.
My code is the following:
filename='9-months.csv';
energy = csvread(filename);
C=zeros(5,1);
counter=0;
N=3;
for n=274:-1:31
w2=energy(1:n-1,1);
a=lpc(w2,N);
energy_estimated=0;
for X = 1:N
energy_estimated = energy_estimated + (-a(X+1)*energy(n-X));
end
w_real=energy(n);
error2=abs(w_real-energy_estimated);
counter=counter+1;
C(counter,1)=error2;
end
mean_error=round(mean(C));
With "n" being the sample under analysis, I use the values of the energy array from 1 to n-1 to calculate the lpc coefficients (with N=3).
After that, the calculated coefficients are applied in the inner "for" loop to calculate the estimated energy.
Finally, error2 holds the absolute error between the real energy and the estimated value.
On the example presented ( http://www.mathworks.com/help/signal/ref/lpc.html ) some filters are used. Do I need to apply any filter to it? Is my methodology correct?
Thank you very much in advance!
The lpc function seems to be used correctly, but there are a few other things about your code. I am addressing the part at the "for n" loop:
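estimates=zeros(274,1); %preallocate instead of growing inside the loop
for n=31:274 %to me it seems more logical to go forward in time
w2=energy(1:n-1,1);
a=lpc(w2,N);
%the leading 0 keeps energy(n) itself out of its own prediction
energy_estimate=filter([0 -a(2:end)],1,energy(1:n,1));
energy_estimate=energy_estimate(end); %prediction of energy(n)
estimates(n)=energy_estimate;
end
err=energy(31:274)-estimates(31:274); %'err' avoids shadowing the built-in error()
meanerror=mean(err); %you don't really round mean errors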
filter does exactly what you are trying to do with the X=1:N loop, but it performs the calculation for the entire input vector. If you just want the last value, index the result with (end) as well.
There is also no reason to calculate the error for every single value inside the loop and then add it to a vector; you can do that faster after the loop.
If you're trying to estimate future values with lpc, it could work like that, but you are implying that every value depends only on the last 3 values. Have you tried something like a polynomial approach? I would think that this would be closer to reality.
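The claim that filter and the X=1:N loop compute the same thing can be checked without the toolbox. In this sketch the coefficient vector a and the samples x are made up; a only mimics the layout of what lpc returns (a leading 1 followed by N coefficients).

```matlab
a = [1 -0.5 -0.3 -0.1];     % hypothetical coefficients in lpc layout
x = [3 5 4 6 7 5 8]';       % made-up samples
N = numel(a) - 1;           % prediction order
n = numel(x);               % index of the sample to predict

% Explicit loop: predict x(n) from the N samples before it
loopEst = 0;
for k = 1:N
    loopEst = loopEst - a(k+1) * x(n-k);
end

% Same prediction with filter; the leading 0 keeps x(n) out of the sum
filtEst = filter([0 -a(2:end)], 1, x);

fprintf('loop: %.4f, filter: %.4f\n', loopEst, filtEst(end));
```

Note that the last filter output is a prediction of the last sample in the vector passed to filter, computed from the samples before it, so the vector has to run up to index n if the prediction is meant for energy(n).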

Unreasonable [positive] log-likelihood values from matlab "fitgmdist" function

I want to fit a data set with a Gaussian mixture model. The data set contains about 120k samples, and each sample has about 130 dimensions. To do it in MATLAB, I run the following script (with 1000 clusters):
gm = fitgmdist(data, 1000, 'Options', statset('Display', 'iter'), 'RegularizationValue', 0.01);
I get the following outputs:
iter log-likelihood
1 -6.66298e+07
2 -1.87763e+07
3 -5.00384e+06
4 -1.11863e+06
5 299767
6 985834
7 1.39525e+06
8 1.70956e+06
9 1.94637e+06
The log likelihood is bigger than 0! I think it's unreasonable, and don't know why.
Could somebody help me?
First of all, it is not a problem of how large your dataset is.
Here is some code that produces similar results with a quite small dataset:
options = statset('Display', 'iter');
x = ones(5,2) + (rand(5,2)-0.5)/1000;
fitgmdist(x,1,'Options',options);
this produces
iter log-likelihood
1 64.4731
2 73.4987
3 73.4987
Of course you know that the log function (the natural logarithm) has a range from -inf to +inf. I guess your problem is that you think the input to the log (i.e. the likelihood) should be bounded by [0,1]. However, the likelihood is built from pdf values, which can be very large for a very dense dataset.
PDFs must be non-negative (and strictly positive at the data points, which is why we can take the log of them) and must integrate to 1. But they are not bounded by [0,1].
You can verify this by reducing the density in the above code
x = ones(5,2) + (rand(5,2)-0.5)/1;
fitgmdist(x,1,'Options',options);
this produces
iter log-likelihood
1 -8.99083
2 -3.06465
3 -3.06465
So, I would rather assume that your dataset contains several duplicate (or very close) values.
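The same effect can be seen without fitgmdist at all. The sketch below writes out the log of a one-dimensional normal density by hand; logNormPdf, z and the two sigma values are made up for illustration. The same points, measured in units of sigma, give a positive log-likelihood when sigma is tiny and a negative one when sigma is 1.

```matlab
% Log of a 1-D normal density, written out so no toolbox is needed
logNormPdf = @(x, mu, sigma) -log(sigma) - 0.5*log(2*pi) ...
                             - 0.5*((x - mu)./sigma).^2;

z = [-1 -0.5 0 0.5 1];      % made-up points, in units of sigma

sigmaTight = 1e-3;          % very dense data
sigmaWide  = 1;             % spread-out data

logLikTight = sum(logNormPdf(sigmaTight*z, 0, sigmaTight));
logLikWide  = sum(logNormPdf(sigmaWide*z,  0, sigmaWide));

fprintf('tight: %.2f, wide: %.2f\n', logLikTight, logLikWide);
```

The only difference between the two sums is the -log(sigma) term per point, which is large and positive when sigma is small.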

How to determine the dimensions of a subplot in Matlab?

So I am writing a function that plots matrix data from n different cells. If n is 10, it should display 10 equally spaced plots on a single figure. If n is 7, it should try to space them out as equally as possible (so 3x2 or 2x3 plots with a plot by itself).
I am able to get these graphs drawn using subplot() and plot() but I'm having a hard time finding out how to initialise the dimensions for the subplot.
The number of subplots will be changing after each run so I can't initialise it to specific dimensions.
Can anyone point me in the right direction?
I am afraid problems like this tend to be messy. Normally they need to be solved separately for different cases.
if (mod(n,2) && n<8)
% Do something
elseif (~mod(n,2) && n < 11)
% Do something else
elseif ...
....
end
The conditions are chosen a bit arbitrarily, since the specifications in the OP seemed a bit arbitrary too. You probably understand the point and can set your own conditions.
There are two reasons why I recommend this approach.
1) This makes the code simpler to write. You do not have to come up with some complicated solution which may break after some time.
2) By adding cases you can protect yourself against a runaway number of plots. If the number of plots gets too large, you typically do not want to have all plots in the same figure. It is also possible to wrap this into a function and apply it to X plots at a time in a loop. Typically you would want each iteration to be a separate figure.
It is not easy to elaborate more on this, since you have not yet specified how many cases you expect or what should happen to the last plot in case of odd numbers. Still, this may give a good hint.
Good luck!
Another simple solution would be using round and ceil on the square root:
for n=1:20
[n, round(sqrt(n))*ceil(sqrt(n)), round(sqrt(n)), ceil(sqrt(n))]
end
output:
%(n, total_plots, x, y)
1 1 1 1
2 2 1 2
3 4 2 2
4 4 2 2
5 6 2 3
6 6 2 3
7 9 3 3
8 9 3 3
9 9 3 3
10 12 3 4
Usage example:
n = 7
subplot(round(sqrt(n)), ceil(sqrt(n)), plot_nr_x) % switch first 2 params to have either a slightly longer or slightly wider subplot
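Put together, the round/ceil rule can be sketched as a full loop. The random data and the invisible figure are just placeholders for illustration; replace plot(rand(1,10)) with the real data from each cell.

```matlab
n = 7;                                  % number of plots for this run
rows = round(sqrt(n));                  % grid rows
cols = ceil(sqrt(n));                   % grid columns, rows*cols >= n

fig = figure('Visible', 'off');         % invisible only for this demo
for k = 1:n
    subplot(rows, cols, k);
    plot(rand(1, 10));                  % placeholder data
    title(sprintf('Plot %d', k));
end
```

The grid always has enough cells: when the fractional part of sqrt(n) is below 0.5 the grid is k-by-(k+1) with n <= k*(k+1), and otherwise it is the next full square.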
I ran into a very similar problem today and had a lot of trouble defining a subplot size that would fit everything. My reasoning is mostly a hack, but it can help. If you have to show at most n plots, you can think of them as a square grid of sqrt(n) * sqrt(n). To be safe we add one extra row, so the final grid is (sqrt(n) + 1) * sqrt(n). I hope this helps solve your problem.
My code has 2 nested loops: the outer loop opens a figure for each kk element, and the inner loop plots a particular graph at position x within the array.
for kk=1:length(some_file_list)
% Load data
% do some math
% get data as a cell array with things we care about in data(3,:)
array_size = size(data(3,:),2);
for x=1:array_size
% do more math and get things ready to plot matrix_A scaled by range_A
figure(kk); % open figure
grid_rows = round((sqrt(array_size)+1));
grid_cols = round(sqrt(array_size));
% plot
subplot(grid_rows, grid_cols, x);
imagesc(matrix_A,range_A); %plot in position
colormap(gray);
end
end