Matlab NN inputs & output manipulation - matlab

Assume I have this matrix, A :
A=[ 25 11 2010 10 23 75
30 11 2010 11 24 45
31 12 2010 19 24 44
31 12 2010 22 27 32
1 1 2011 14 27 27
2 12 2011 15 28 30
3 12 2011 16 24 42 ];
The first 5 columns are the inputs (measured parameters) and the last column is the corresponding output. Each row is one measurement.
I want to use the Matlab neural network function newgrnn (a GRNN), or any other NN function, to train on the first 5 rows and then predict the outputs for the inputs in the remaining 2 rows. I have tried many times, but it always gives an error and the program does not run correctly. I have looked at the example in the newgrnn help, but it has only one input, while this example has 5 inputs.
My question is how to pass the inputs and the output to newgrnn. My real matrix is much larger, 26352 by 23 (22 inputs and one output); the above is only a small sample.

Since you haven't given any examples of what you've tried and what errors you get from your attempts, I'll have to give you a fairly generic answer.
Have a look at the newgrnn help file.
net = newgrnn(P,T,spread) takes three inputs,
P R-by-Q matrix of Q input vectors
T S-by-Q matrix of Q target class vectors
spread Spread of radial basis functions (default = 1.0)
So if the last column of A always holds the outputs (target vectors), the training targets are A(1:5,end) and the training inputs are A(1:5,1:end-1), i.e. "first 5 rows of A, last column" and "first 5 rows of A, all but the last column" respectively. Note that Matlab indexes with parentheses, not square brackets, and that newgrnn expects one sample per column (P is R-by-Q), whereas A has one sample per row, so both pieces need to be transposed.
Then (simply following the example in the newgrnn help file; you will have to tweak it for your own A):
net = newgrnn( A(1:5,1:end-1)', A(1:5,end)' );
% predict new values
Y = sim(net, A(6:7,1:end-1)')
I think you should also read the Matlab help file for indexing arrays and matrices.
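To make the data layout concrete, here is a minimal sketch of what a GRNN computes, written in plain Python for illustration and not using the Matlab toolbox. It assumes a GRNN can be treated as a Gaussian-kernel weighted average of the training targets; the function name grnn_predict and the exact kernel scaling are assumptions for this sketch and may differ from what newgrnn does internally.

```python
import math

# Sample matrix from the question: 5 input columns, 1 output column.
A = [
    [25, 11, 2010, 10, 23, 75],
    [30, 11, 2010, 11, 24, 45],
    [31, 12, 2010, 19, 24, 44],
    [31, 12, 2010, 22, 27, 32],
    [1, 1, 2011, 14, 27, 27],
    [2, 12, 2011, 15, 28, 30],
    [3, 12, 2011, 16, 24, 42],
]

train_x = [row[:-1] for row in A[:5]]  # first 5 rows, all but last column
train_y = [row[-1] for row in A[:5]]   # first 5 rows, last column
test_x = [row[:-1] for row in A[5:]]   # remaining 2 rows, inputs only


def grnn_predict(xs, ys, query, spread=1.0):
    """Kernel-weighted average of ys (hypothetical helper, not newgrnn)."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in xs]
    d2_min = min(d2)  # shift for numerical stability; the ratio is unchanged
    w = [math.exp(-(d - d2_min) / (2.0 * spread ** 2)) for d in d2]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)


predictions = [grnn_predict(train_x, train_y, q, spread=5.0) for q in test_x]
print(predictions)
```

Because the prediction is a weighted average, it always lies between the smallest and largest training target. With raw inputs on very different scales (day, month, year), scaling the columns first is probably worthwhile.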

Related

Clustering matrix distance between 3 time series

I have a question about the application of clustering techniques more concretely the K-means.
I have a data frame with 3 sensors (A,B,C):
time      A    B    C
8:00:00   6    10   11
8:30:00   11   17   20
9:00:00   22   22   15
9:30:00   20   22   21
10:00:00  17   26   26
10:30:00  16   45   29
11:00:00  19   43   22
11:30:00  20   32   22
...       ...  ...  ...
I want to group sensors that behave similarly.
My question is: looking at the data frame above, should I calculate the correlation between the sensors and then apply the Euclidean distance to this correlation matrix, obtaining a 3 x 3 matrix of distances?
Or should I transpose my data frame and compute the dist() matrix with the Euclidean metric directly, which also gives a 3 x 3 matrix of distances?
You have just three sensors. That means you'll need three values: d(A,B), d(B,C) and d(A,C). Any "clustering" here does not seem to make sense to me, and certainly not k-means. K-means is for points (!) in R^d for small d.
Choose any form of time series similarity that you like. Could be simply correlation, but also DTW and the like.
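As an illustration of the "simply correlation" option, the sketch below computes a correlation-based distance (1 minus the Pearson correlation) between the three sensors, in plain Python. It uses only the eight rows shown in the question, and the helper name pearson and the choice of 1 - r as the distance are assumptions for this example, not something the answer prescribes.

```python
import math

# The eight rows shown in the question, one list per sensor.
sensors = {
    "A": [6, 11, 22, 20, 17, 16, 19, 20],
    "B": [10, 17, 22, 22, 26, 45, 43, 32],
    "C": [11, 20, 15, 21, 26, 29, 22, 22],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


names = sorted(sensors)
# 3 x 3 distance table: 0 means perfectly correlated, 2 means anti-correlated.
distance = {
    (p, q): 1.0 - pearson(sensors[p], sensors[q]) for p in names for q in names
}

for p in names:
    print(p, [round(distance[(p, q)], 3) for q in names])
```

With only three sensors, the pair with the smallest off-diagonal distance is the natural "group"; no clustering algorithm is needed to read that off.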
Q1: No. Why: The correlation is not needed here.
Q2: No. Why: I'd calculate the distances differently
For the first row, R's built-in dist() function (which uses Euclidean distance by default)
dist(c(6, 10, 11))
gives you the pairwise distances between the values:
  1 2
2 4
3 5 1
Items 2 and 3 are closest to each other. That's simple.
But there is no single way to calculate the distance between a point and a group of points. There you need a linkage function (min/max/average/...)
What I would do using R's built-in kmeans() function:
Ignore the date column,
(assuming there are no NA values in any A,B,C columns)
scale the data if necessary (here they all seem to have same order of magnitude)
perform KMeans analysis on the A,B,C columns, with k = 1...n ; evaluate results
perform a final KMeans with your suitable choice of k
get the cluster assignments for each row
put them in a new column to the right of C

Indexing matrices with four indices in MATLAB

The usual way to index elements in a matrix (in MATLAB, at least) is to use two variables (i and j), so a general matrix element can be addressed as M_{i,j}. How can I do the same indexing in a matrix that has four indices, like M_{ij,kl}?
EDIT
The elements of a usual matrix A can be viewed as:
So a general element is extracted, in MATLAB, using A(n,m).
What I want to do is write a matrix that has elements that are indexed like this:
matrix2 http://bit.ly/1gHRZrR
Is there any way to do this without using cells/arrays, as pointed out in the comments of the question?
From your comment I assume you would like to extract elements with multiple (two) row and column indices. Given a matrix M = magic(5);, e.g.
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
you can indeed index M with multiple row and column indices as in M([3,4], [1,5]) which would yield a two by two matrix:
4 22
10 3

Matlab general matrix indexing for accessing several rows

Edit for clarity:
I have two matrices, p.valor (2x1000) and p.clase (1x1000). p.valor consists of random numbers ranging from -6 to 6. p.clase contains, in order, 200 ones, 200 twos and 600 threes. What I want to do is
plot p.valor using a different color/marker for each class given in p.clase, as in the following figure.
I first wrote this to find which locations in p.valor correspond to the 1s, 2s and 3s in p.clase:
%identify the locations of all 1,2 respective 3 in p.clase
f1=find(p.clase==1);
f2=find(p.clase==2);
f3=find(p.clase==3);
%define vectors in p.valor representing the locations of 1,2,3 in p.clase
x1=p.valor(f1);
x2=p.valor(f2);
x3=p.valor(f3);
There are 200 ones in p.clase, so x1 = p.valor(1:200). The problem is that each 1 (and likewise each 2 and 3) corresponds to TWO elements in p.valor, since p.valor has 2 rows. So even though p.clase, and thus x1, has only one row, I need to include both elements of every column listed in f1.
The alternatives I have tried have not been successful. Examples:
plot(x1(:,1), x1(:,2),'ro')
hold on
plot(x2(:,1),x2(:,2),'k.')
hold on
plot(x3(:,1),x3(:,2),'b+')
and
y1=p.valor(201:400);
y2=p.valor(601:800);
y3=p.valor(1401:2000);
scatter(x1,y1,'k+')
hold on
scatter(x2,y1,'b.')
hold on
scatter(x3,y1,'ro')
and
y1=p.valor(201:400);
y2=p.valor(601:800);
y3=p.valor(1401:2000);
plot(x1,y1,'k+')
hold on
plot(x2,y2,'b.')
hold on
plot(x3,y3,'ro')
My figures have the right axes, but the plotted values do not match the correct figure provided (see top of the question).
Ergo, my question is: how do I include the values in the second row of p.valor in my plot?
I hope this is clearer!
Values from both rows can be accessed simultaneously using this syntax:
X=p.value(:,findX)
Here p.value stands for the question's p.valor and findX for an index vector such as f1. The resulting X has 2 rows and length(findX) columns.
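The same idea, selecting whole columns by class label so that both rows stay paired, can be sketched in plain Python. The tiny valor and clase lists and the helper columns_for_class are made up for illustration; they only mimic the shape of the question's p.valor and p.clase.

```python
# Hypothetical miniature of the question's data: 2 rows of values, 1 row of labels.
valor = [
    [0.5, -1.2, 3.3, 4.1, -2.0, 5.9],   # first row (x values)
    [1.5, 0.2, -3.1, 2.2, -4.4, 0.0],   # second row (y values)
]
clase = [1, 1, 2, 2, 3, 3]


def columns_for_class(values, labels, wanted):
    """Keep every row, but only the columns whose label equals `wanted`."""
    cols = [j for j, label in enumerate(labels) if label == wanted]
    return [[row[j] for j in cols] for row in values]


x1 = columns_for_class(valor, clase, 1)
# x1[0] holds the x coordinates and x1[1] the y coordinates of class 1.
print(x1)
```

Each class then yields one list of x values and one list of y values, which is what a scatter plot needs.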
M = magic(5)
M =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
M2 = M(1:2, :)
M2 =
17 24 1 8 15
23 5 7 14 16
Matlab uses column-major indexing, so to get to the next row you just add 1 to the linear index. Adding 2 to an index on M2 gets you to the next column, as does adding 5 to an index on M.
E.g. M2(3) is 24. To get to the next row you add one, i.e. M2(4) returns 5. To get to the next column you add the number of rows, so M2(3 + 2) gets you 1. If you add the number of columns like you suggested, you just get gibberish.
So your method is wrong. Freude's method is 100% correct; it's much easier to use subscript indexing than linear indexing for this. I just wanted to explain why what you were trying doesn't work in Matlab (aside from the fact that X=p.value(findX findX+1000) gives you a syntax error; I assume you meant X=p.value([findX findX+1000])).
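The column-major rule can be checked with a short sketch in plain Python that converts a 1-based linear index into a 1-based (row, column) pair. The helper names linear_to_subscript and linear_index are assumptions for this example; in Matlab itself, ind2sub does this conversion.

```python
def linear_to_subscript(k, n_rows):
    """Convert a 1-based column-major linear index to a 1-based (row, col)."""
    row = (k - 1) % n_rows + 1
    col = (k - 1) // n_rows + 1
    return row, col


def linear_index(matrix, k):
    """Return the element a Matlab-style linear index k would address."""
    row, col = linear_to_subscript(k, len(matrix))
    return matrix[row - 1][col - 1]


# M2 = M(1:2, :) from the magic(5) example, stored as a list of rows.
M2 = [
    [17, 24, 1, 8, 15],
    [23, 5, 7, 14, 16],
]

print(linear_index(M2, 3))  # row 1, column 2
print(linear_index(M2, 4))  # one row down: row 2, column 2
print(linear_index(M2, 5))  # index 3 plus the number of rows: row 1, column 3
```

Adding 1 moves down a row and adding the number of rows moves right one column, which is why adding 1000 to an index into a 2-row matrix does not land on the second row.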

Finding matching rows from original dataset in an slightly reduced version of itself

I have two datasets. The original has all the labels and a description of each variable. The second is a reduced version of it, used for specific experiments, but it has none of the variable information contained in the original. So I'm trying to match the two datasets.
My question is: how can I find whether a row from the original dataset is present in the new dataset, given that a slight reduction has been performed in both matrix dimensions?
More specifically, the original dataset is a 24481 x 117 matrix and the new one is a 24188 x 97 matrix. The problem is that I have no information about which rows or columns were included in the new dataset.
What you can do is zero-pad the smaller matrix so that it matches the size of the original data, then use
find(A==B)
where A and B are the matrices.
Using the intersect function worked for me. Since the reduction was performed in both dimensions, I first look for the intersection of the first column vectors of the two matrices (assuming that at least the column order has been preserved in the reduction).
>> M = magic(5)
M =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>> X = M([2,3,5], [1,2,4,5])
X =
23 5 14 16
4 6 20 22
11 18 2 9
>> [c,xi, mi]=intersect(X(:,1),M(:,1))
mi is a column vector holding the row indices of the original matrix M whose rows are present in the reduced matrix X.
Doing the same for the first rows of the two matrices gives the column indices of M that were selected.
>> [c,xi, mi]=intersect(X(1,:),M(1,:))
A drawback of this solution is that when the first row or column of the original matrix was not selected in the new set, the intersection comes back empty and you have to move on to the next row or column of the original matrix; luckily not too far ;).
>> [c,xi, mi]=intersect(X(1,:),M(2,:))
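The matching idea can be sketched in plain Python on the same magic(5) example. This is an illustration under strong assumptions: the first column of the original survived the reduction, and the values compared are unique (true for a magic square, not necessarily for real data). The helper positions is hypothetical and, unlike Matlab's intersect, it keeps the order of the reduced matrix and does not sort.

```python
# Original matrix M = magic(5) and the reduced matrix X = M([2,3,5], [1,2,4,5]).
M = [
    [17, 24, 1, 8, 15],
    [23, 5, 7, 14, 16],
    [4, 6, 13, 20, 22],
    [10, 12, 19, 21, 3],
    [11, 18, 25, 2, 9],
]
X = [
    [23, 5, 14, 16],
    [4, 6, 20, 22],
    [11, 18, 2, 9],
]


def positions(values, reference):
    """1-based positions in `reference` of each item of `values` found there."""
    lookup = {v: i + 1 for i, v in enumerate(reference)}
    return [lookup[v] for v in values if v in lookup]


# Rows of M that appear in X, found by comparing the first columns.
row_idx = positions([r[0] for r in X], [r[0] for r in M])

# Columns of M that appear in X, found by comparing X's first row with the
# row of M it was matched to (avoids guessing which row of M to use).
col_idx = positions(X[0], M[row_idx[0] - 1])

print(row_idx, col_idx)
```

Comparing X's first row against the already matched row of M sidesteps the drawback described above, at least when the row match itself succeeds.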

calculating x2 from poisson distributed data

So I have a table of values
v:            0  1  2  3  4  5  6  7  8  9
#times obs.:  5 19 23 21 14 12  3  2  1  0
I am supposed to calculate chi-squared assuming the data fit a Poisson distribution with mean u=3.
I have to group the values >=6 into one bin.
I am unsure how to plot the Poisson distribution and, most of all, how to control what goes into which bin, if that makes sense.
I have plotted a histogram using histc before, but it was with random numbers that I normalized. The amount in each bin was set for me.
I am super new...sorry if this question sucks.
You use bar to plot a bar graph in matlab.
So this is what you do:
v=0:9;
f=[5 19 23 21 14 12 3 2 1 0];
fc=f(v<6); % copy the counts where v<6 into a new array
fc(end+1)=sum(f(v>=6)); % append the sum of the counts where v>=6 to that array
figure
bar(v(v<=6), fc);
That should do the trick...
Now you didn't actually ask about the chi squared calculation. I would urge you not to put values of v>6 all into one bin for that calculation, as it will give you a really bad result.
There is another technique: if you use the hist function, you can choose the bins - and Matlab will automatically put things that exceed the limits into the last bin. So if your observations were in the array Obs, you can do what was asked with:
h = hist(Obs, 0:6);
figure
bar(0:6, h)
The advantage is that you have the array h available (frequencies) for other calculations.
If you do instead
hist(Obs, 0:6)
Matlab will plot the graph for you in a single statement (but you don't have the values...)
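The answers above cover the binning and plotting but not the chi-squared value itself. As a hedged sketch in plain Python, the statistic can be computed by comparing the observed counts with the counts expected under a Poisson distribution with mean 3, using the v>=6 bin the question asks for. The variable names are assumptions for this example.

```python
import math

observed_raw = [5, 19, 23, 21, 14, 12, 3, 2, 1, 0]  # counts for v = 0..9
mean = 3.0
tail_start = 6  # all v >= 6 go into one bin, as the question requires

# Observed counts per bin: v = 0..5 individually, then the pooled tail.
observed = observed_raw[:tail_start] + [sum(observed_raw[tail_start:])]
n_total = sum(observed)


def poisson_pmf(k, mu):
    """Probability of exactly k events for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


# Bin probabilities: exact for v = 0..5, remaining mass for v >= 6.
probs = [poisson_pmf(k, mean) for k in range(tail_start)]
probs.append(1.0 - sum(probs))

expected = [n_total * p for p in probs]
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(observed)
print([round(e, 2) for e in expected])
print(round(chi_squared, 3))
```

Since the mean is given and not fitted from the data, the statistic would normally be compared against a chi-squared distribution with (number of bins - 1) = 6 degrees of freedom.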