MATLAB: Averaging multidimensional arrays in the third dimension

I have a structure called Task, which contains the output of preprocessing by the EEG analysis toolbox FieldTrip.
This includes an array called Task.trial, as well as the header and other output data required by FieldTrip; I need to keep these so I can run visualisation code afterwards.
Task.trial is a cell array of varying length. Each participant completes 132 trials, but not all trials pass preprocessing, which leaves a varying number of trials per participant, around 120. For simplicity's sake, assume Task.trial is a 1x120 cell array.
Each cell of Task.trial contains a 66x500 double, representing EEG channels x frames (at 500 Hz).
I wish to average Task.trial across successive trials. I do not want to average across frames or across channels, so I believe I'm looking for the mean in the 3rd dimension. However, the following code:
TaskAverage = mean (Task.trial,3);
Results in the following error:
Undefined function 'sum' for input arguments of type 'cell'.
Error in mean (line 115)
y = sum(x, dim, flag)/size(x,dim);
I have read numerous questions on here about multidimensional arrays and averaging, as well as the MATLAB help documentation, but I have a very limited understanding of MATLAB's inner workings, so I cannot figure out how to fix this. Can anyone explain how to make this work?
My current alternative is to add each table of data individually (120 trials x however many participants).
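A minimal sketch of one common fix (assuming every cell really is the same 66x500 size, as described above) is to convert the cell array to a numeric array first, by concatenating the cells along a third dimension, and only then take the mean along that dimension:
% Stack the trials along dim 3: gives a 66x500x120 numeric array
TrialStack = cat(3, Task.trial{:});
% Average across trials (dim 3), leaving channels x frames: 66x500
TaskAverage = mean(TrialStack, 3);
TrialStack is just an illustrative variable name here; the key point is that mean (and the sum it calls internally) is only defined for numeric arrays, which is why it fails on the cell array directly.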

Related

Simulink: Vector summation and saving the output to workspace

I cannot solve a very simple problem in Simulink: summing two equal-size vectors and writing the result to the MATLAB workspace.
This trivial operation takes one line in MATLAB but seems to be a real problem in Simulink.
I have two vectors of the same size, e.g. 10x1, and I want to get their sum into the workspace with the same size (10x1).
I have already used the Sum block for this, and even my own function with element-wise summation, but I think the problem is that the Simulink To Workspace block always concatenates outputs along either the first or the third dimension. Hence the size of the output does not inherit the size of the inputs.
I cannot find any solution on the web; I would really appreciate your help!
Note that the vectors are saved column-wise by the To Workspace block. Did you try adding "(:)" in your code to get the result as a single column?
As far as I know, storing the data in columns (10x1) is faster than in rows (1x10). Maybe that is the reason for getting columns instead of rows.
https://www.mathworks.com/matlabcentral/answers/216512-which-is-faster-a-row-vector-or-a-column-vector-can-anyone-answer-me-please
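If the block layout can't be changed, a small post-processing sketch (the logged variable name vecsum is assumed) can recover the 10x1 shape after the run, since To Workspace in Array format logs a 1-D vector signal as T-by-10 and a true 10x1 matrix signal as 10x1xT:
% Assumed: 'vecsum' is the To Workspace variable (Array save format)
if ndims(vecsum) == 3
    finalSum = squeeze(vecsum(:, :, end));  % 10x1xT -> 10x1 at the last step
else
    finalSum = vecsum(end, :)';             % Tx10 -> transpose last row to 10x1
end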

Simulink 3D lookup table

I have a system of three nonlinear equations with eight unknowns. I'm currently setting each equation equal to a desired value and then using MATLAB's fsolve (a numerical solver) to find a solution. Instead of running fsolve in real time, I'd like to pre-compute solutions for a specific set of values to which I set the equations equal.
Pursuant to that goal, I've run the solver over a set of values and created a 3-D matrix (N x N x N), which I've attempted to load into eight Simulink 3-D lookup tables (Direct Lookup Table (n-D) blocks) so I can fetch each of the eight solved unknowns. It's my understanding that the inputs to this block should work the same way I would reference an element in my 3-D array, table(x,y,z), but I'm constantly getting Simulink table input out-of-range errors. I've confirmed the inputs are within the table size, so I'm not sure what's wrong.
This isn't the most elegant implementation, so I'm open to better solutions. Ideally, I'd like a Simulink lookup that takes three inputs and returns a vector of the eight solved unknowns, or, even better, one that does some type of linear interpolation between the three lookup values to return an approximate solution.
Thanks!
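One guess at the out-of-range errors: the Direct Lookup Table (n-D) block treats its inputs as zero-based indices by default, so one-based subscripts of the table(x,y,z) kind overrun the table at the top end. As for the interpolation idea, a minimal MATLAB sketch of the equivalent trilinear lookup (the grid and array names are assumptions, not from the question):
% Assumed: xg, yg, zg are the grid vectors used during pre-computation and
% sol is an N-by-N-by-N-by-8 array of solved unknowns at each grid point.
u_approx = zeros(8, 1);
for m = 1:8
    % Trilinearly interpolate the m-th unknown at query point (xq, yq, zq)
    u_approx(m) = interpn(xg, yg, zg, sol(:, :, :, m), xq, yq, zq, 'linear');
end
Inside Simulink, the same behaviour should be available from an n-D Lookup Table block set to linear interpolation, one per unknown.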

How to quickly/easily merge and average data in matrix in MATLAB?

I have a matrix of AirFuelRatio values at certain engine speeds and throttle positions (e.g. the AFR is 14 at 2500 rpm and 60% throttle).
The matrix is currently 25x10; the engine speed ranges from 1200-6000 rpm at intervals of 200 rpm, and the throttle from 0.1-1 at intervals of 0.1.
Say I have measured new values, e.g. an AFR of 13.5 at 2138 rpm and 74.3% throttle; how do I merge that into the matrix? The closest matrix values are 2000 or 2200 rpm and 70 or 80% throttle. Also, I don't want new data to simply replace the older data. How can I make the matrix take this value in and adjust its values to take the new value into account?
Simplified, I have the following x-axis values (top row) and 1x4 matrix (below):
2 4 6 8
14 16 18 20
I just measured an AFR value of 15.5 at 3 rpm. If you interpolated the AFR matrix you would've gotten 15, so this value is out of the ordinary.
I want the matrix to take this data in and adjust the other values to it, i.e. average everything, so that the more data I put in, the more reliable and accurate the matrix becomes. So in the simplified case the matrix would become something like:
2 4 6 8
14.3 16.3 18.2 20.1
So it averages between old and new data. I've read the documentation about concatenation, but I believe my problem can't be solved with that function.
EDIT: To clarify my question, a visual illustration: the 'matrix' keeps the same size of 5 points while a new data point is added. It takes the new data into account and adjusts the matrix accordingly. This is what I'm trying to achieve: the more scattered data I get, the more accurate the matrix becomes. (And yes, the green dot in this case would be an outlier, but it explains my case.)
Cheers
This is not a matter of a simple merge/average, and I don't think there's a quick method to do this unless you have simplifying assumptions. What you want is a statistical inference of the underlying trend. I suggest using Gaussian process regression to solve this problem. There's a great MATLAB toolbox for it by Rasmussen and Williams called GPML: http://www.gaussianprocess.org/gpml/
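A rough sketch of what that could look like with GPML (assuming the toolbox is on the path, with the measurements as (rpm, throttle) rows in X, the measured AFR values in y, and the grid points to tabulate in Xq; in GPML releases before v4, infGaussLik is called infExact):
% Squared-exponential covariance, Gaussian noise, zero mean function
meanfunc = []; covfunc = @covSEiso; likfunc = @likGauss;
hyp = struct('mean', [], 'cov', [0 0], 'lik', -1);
% Fit the hyperparameters to the scattered measurements
hyp = minimize(hyp, @gp, -100, @infGaussLik, meanfunc, covfunc, likfunc, X, y);
% Predictive mean mu (and variance s2) at the tabulated grid points
[mu, s2] = gp(hyp, @infGaussLik, meanfunc, covfunc, likfunc, X, y, Xq);
Each new measurement appended to X and y refines mu, which is exactly the "matrix becomes more accurate with more data" behaviour asked for.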
This sounds more like a data-fitting task to me. You have a set of measurements for which you wish to get the best linear fit: instead of tabulating the raw data, you find the best fit to the measured values and tabulate that. So, for example, I could create a matrix A which holds all of the recorded values. Let's start with:
A=[2,14;3,15.5;4,16;6,18;8,20];
I now need a matrix of points for the inputs to my fitting curve (which, in this instance, let's assume is linear, so is the set of values 1 and x):
B=[ones(size(A,1),1), A(:,1)];
We can find the linear fit parameters (the y-intercept and the gradient) using:
B\A(:,2)
Or, if you want the points that the line goes through for the values of x:
B*(B\A(:,2))
This results in the points:
2    14.1897
3    15.1552
4    16.1207
6    18.0517
8    19.9828
which represents the best fit line through these points.
You can manually extend this to polynomial fitting if you want, or you can use the MATLAB function polyfit. To extend the process manually you should use a revised B matrix; you can also produce only a specified set of points in the last line. The complete code would then be:
% Original measurements - could be read in from a file,
% but for this example we will set it to a matrix
% Note that not all tabulated values need to be present
A=[2,14; 3,15.5; 4,16; 5,17; 8,20];
% Now create the polynomial values of x corresponding to
% the data points. Choosing a second order polynomial...
B=[ones(size(A,1),1), A(:,1), A(:,1).^2];
% Find the polynomial coefficients for the best fit curve
coeffs=B\A(:,2);
% Now generate a table of values at specific points
% First define the x-values
tabinds = 2:2:8;
% Then generate the polynomial values of x
tabpolys=[ones(length(tabinds),1), tabinds', (tabinds').^2];
% Finally, multiply by the coefficients found
curve_table = [tabinds', tabpolys*coeffs];
% and display the results
disp(curve_table);
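For comparison, here is a sketch of the same second-order fit using the built-in polyfit and polyval (note that polyfit returns coefficients from the highest power down, the reverse of the ordering produced by the B\ approach above):
% Second-order polynomial fit to the same measurements in A
coeffs2 = polyfit(A(:,1), A(:,2), 2);
% Tabulate the fitted curve at the same x-values as before
curve_table2 = [tabinds', polyval(coeffs2, tabinds')];
disp(curve_table2);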

Using Linear Prediction Over Time Series to Determine Next K Points

I have a time series of N data points of sunspots, and I would like to predict, based on a subset of these points, the remaining points in the series and then compare the correctness.
I'm just getting introduced to linear prediction using MATLAB, so I have decided to use the following code segment within a loop so that every point outside of the training set, up to the end of the given data, has a prediction:
% x is the data; the training set is some subset of x starting from the beginning.
% 'unknown' is the number of points to predict, starting from the end of the
% training set (i.e. the difference in length of the training-set and data vectors).
% x_pred is initialised to x.
p = length(training_set);
coeffs = lpc(training_set, p);
for i = 1:unknown
    % Predict the next sample from the p previous (partly predicted) samples
    nextValue = -coeffs(2:end) * x_pred(end-unknown-1+i:-1:end-unknown-1+i-p+1)';
    x_pred(end-unknown+i) = nextValue;
end
pred_error = norm(x - x_pred)   % named to avoid shadowing the built-in error()
I have three questions regarding this:
1) Does this appropriately do what I have described? I ask because my error seems rather large (>100) when predicting over only the last 20 points of a dataset that has hundreds of points.
2) Am I interpreting the second argument of lpc correctly? Namely, that it means the 'order', i.e. the number of previous points used to predict the next point?
3) Is there a more efficient, single-line function in MATLAB that I can call to replace the loop and compute all the necessary predictions for me, given some subset of my overall data as a training set?
I tried looking through the MATLAB lpc tutorial, but it didn't seem to do the kind of prediction my needs require. I have also been using How to use aryule() in Matlab to extend a number series? as a reference.
So, after much deliberation and experimentation, I have found the above approach to be correct, and there does not appear to be any single MATLAB function to do the above work. The large errors experienced are reasonable, since I am using a linear prediction algorithm for a problem (sunspot prediction) that has inherently nonlinear behaviour.
Hope this helps anyone else out there working on something similar.
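One related shortcut worth noting, from the lpc documentation: one-step-ahead prediction over the whole signal (each sample predicted from the true previous samples, so this is not a substitute for the recursive multi-step loop above) can be done with a single filter call:
p = 12;                                    % model order (assumed for the sketch)
coeffs = lpc(x, p);
est_x = filter([0 -coeffs(2:end)], 1, x);  % one-step predicted signal
e = x - est_x;                             % prediction residual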

Using corrcoef and all p values returned as 1 or 0

I'm trying to test the correlation of 4 sets of data using the corrcoef function in MATLAB. Each one contains 50 values. Starting out, the data are in a .csv file with 4 columns of 50 values each. I've put all 4 sets into a matrix labeled y:
y=[[set1] [set2] [set3] [set4]]
So y is a matrix of 4 columns of 50 rows.
I've called the corrcoef function using
[r,p]=corrcoef(y)
When I run this code, I get a 4x4 r matrix with a diagonal of 1's and sets of identical values above and below it, which seems correct: the 1's must be the correlation of each set with itself, and the identical values above and below the diagonal are just the correlations of the same two sets repeated (i.e. set2 vs. set1 and set1 vs. set2).
However, the matrix of p-values returned seems all wrong, and I'm not sure why. I get a 4x4 matrix with a diagonal of 1's, and all the values above and below are 0. This looks incorrect because it would mean the probability of getting the perfect correlations by chance is 100%, while getting the "imperfect" correlations by chance is almost impossible.
Can anyone help show me what I'm doing wrong here? I can supply more details if needed.
Edit: just wanted to mention I'm trying to follow the instructions from the MathWorks help page:
http://www.mathworks.com/help/matlab/ref/corrcoef.html
In their example, they also show the p-values along the diagonal as all being equal to 1; can anyone tell me why that is?
Also, the p-values aren't exactly equal to 0; they're just absurdly small, like 5.9601e-10, which is what makes me feel like something is wrong.
Edit 2: I've also tried computing the correlation coefficient between two of the sets using Excel's CORREL function, and it gives me the same value for r as MATLAB does.
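For reference, a small reproduction sketch with synthetic data: when the columns share a strong common trend, tiny off-diagonal p-values are exactly what corrcoef should produce, since a small p-value means the observed correlation is very unlikely to arise by chance from uncorrelated data.
% Four columns sharing a common linear trend, so every pair correlates
rng(0);                                        % repeatable random numbers
y = (1:50)' * [1 1 1 1] * 0.1 + randn(50, 4);
[r, p] = corrcoef(y);
disp(r)   % strong positive values off the diagonal, exactly 1 on it
disp(p)   % tiny off the diagonal (significant), 1 on the diagonal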