I want to create a weighted adjacency matrix. Is there a good way to do this that also works with a huge data set?
I have this abc.txt file for example:
abc.txt
1 2 50
2 3 70
3 1 42
1 3 36
The result should be:
matrix=
0 50 36
0 0 70
42 0 0
How can I construct a weighted adjacency matrix, containing the weights, from an input graph file like the one shown above?
Basically, the input file has 3 columns, and the third column holds the weight of each edge.
You could also apply spconvert to the output of importdata:
matrix = full(spconvert(importdata('abc.txt')));
What you have is a sparse definition of a matrix, so using sparse is the simplest way to create it. If your matrix is sparse (mostly zeros) you may also want to stick with the sparse representation because it requires less memory. In that case, omit the last line (M=full(M)) of the code below.
S=load('abc.txt')
M=sparse(S(:,1),S(:,2),S(:,3))
M=full(M)
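As a minimal sketch of the same idea, the edge list from the question can be typed in directly instead of being loaded from abc.txt. The variable n and the explicit n-by-n size are assumptions added here so that the matrix stays square even if the highest-numbered node has no outgoing edge. Note that sparse sums the weights of duplicate (row, column) pairs, which may or may not be what you want for repeated edges.

```matlab
% Edge list from the question: [source target weight]
S = [1 2 50;
     2 3 70;
     3 1 42;
     1 3 36];

% Number of nodes = largest node id in either of the first two columns
n = max(max(S(:, 1:2)));

% Build an n-by-n sparse matrix; duplicate (i,j) pairs would be summed
M = sparse(S(:, 1), S(:, 2), S(:, 3), n, n);

% Convert to a dense matrix only if the graph is small enough
A = full(M);
disp(A)
```

For a huge graph, keeping M sparse and skipping the full call avoids allocating the n-by-n dense array.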
I have a 3D matrix in which each page/slice is independent of the others, so I would like to use find to filter the data in each page separately. However, find returns a single vector of linear indices into the array as a whole, rather than indices relative to each slice. For example:
a=rand(1,10,5);
ind=find(a<0.3);
This would return ind something like:
ind=
1 2 5 9 10 11 20 24 25 ...
I expected something like:
ind(:,:,1)=
1 2 3
ind(:,:,2)=
1 5 6 10 %based on each slice, independent to other slices
I want the indices per slice so that I can apply them to the corresponding slices of another matrix.
Can this be done without using a loop? Thanks in advance!
Use ind2sub() to convert your indices to subscripts. Something like this should work for a 3d array:
[i,j,k] = ind2sub(size(a), ind)
That said, the outputs (i, j, k), will all be the same size, that is the same size as ind. In other words, it gives one set of subscripts (i,j,k) (coordinates) for each value of a<0.3.
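It's not completely clear what output you want/expect from your question, but if you want separate subscripts for each page in a, you'll have to filter further. Here k is the page subscript, so e.g. j(k==1) gives the column subscripts found on the first page (i is the row subscript, which is always 1 for a 1x10x5 array).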
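One way to do the per-page filtering without an explicit loop is accumarray, which is a swapped-in technique here and not part of the answer above. The sketch below uses a small fixed array instead of rand so the result is easy to follow; the name perPage is made up for illustration.

```matlab
% Small fixed 1x3x2 array standing in for rand(1,10,5)
a = cat(3, [0.1 0.5 0.2], [0.9 0.9 0.1]);

% Linear indices of the matching elements, then their subscripts
ind = find(a < 0.3);
[i, j, k] = ind2sub(size(a), ind);

% Group the column subscripts j by page k into a cell array.
% Pages without any match are left as empty cells.
perPage = accumarray(k, j, [size(a, 3) 1], @(v) {sort(v).'});

disp(perPage{1})   % column indices found on page 1
disp(perPage{2})   % column indices found on page 2
```

A cell array is used because each page can have a different number of matches, so the results do not fit in a regular 3D array like the ind(:,:,1), ind(:,:,2) layout sketched in the question.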
I am trying to generate multiple matrices that together cover a full set of numbers. For example, say I want to generate 5 matrices of length 10 that cover all the numbers from 1 to 20.
So matrix one will contain half the numbers say
m1 = [1 2 3 4 5 6 7 8 9 10];
while matrix two contains
m2 = [11 12 13 14 15 16 17 18 19 20];
Although this satisfies my condition with only two matrices instead of 5, I would prefer to generate all the matrices randomly. Other than randomly generating the matrices and then checking that all values were generated, is there a more efficient way to do this?
You can do it like this:
>> l=[1:20,randi(20,1,30)];
>> vec=l(randperm(length(l)));
>> v=reshape(vec,5,10);
The first line generates an array of 50 numbers from 1 to 20. It guarantees that each such number appears at least once. The second line randomizes the order of the numbers. The third line reshapes the vector into an array of arrays (that is, a 2D matrix, where each row is one of the arrays).
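The same three steps can be written as a script with a quick coverage check at the end. This is only a sketch; the names pool and covered are made up for illustration.

```matlab
% 1..20 once each, plus 30 random extras, for 50 numbers in total
pool = [1:20, randi(20, 1, 30)];

% Shuffle, then cut into 5 rows of 10 numbers each
vec = pool(randperm(numel(pool)));
v = reshape(vec, 5, 10);

% Sanity check: every number from 1 to 20 appears somewhere in v
covered = all(ismember(1:20, v(:)));
disp(covered)
```

Because 1:20 is placed in the pool before shuffling, the check always passes, so no generate-and-retry step is needed.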
I have a problem with drawing different plots on the same graph within a 'for' loop. I hope someone can point me in the right direction.
I have a 2-D array with discrete chunks of data in among rows of zeros. My data is the following:
A=
0 0
0 0
0 0
3 9
4 10
5 11
6 12
0 0
0 0
0 0
0 0
7 9.7
8 9.8
9 9.9
0 0
0 0
A chunk of data is defined as a contiguous set of rows that is not interrupted by a [0 0] row. So in this example, the 1st chunk of data would be
3 9
4 10
5 11
6 12
And 2nd chunk is
7 9.7
8 9.8
9 9.9
The first column is x and the second column is y. I would like to plot y as a function of x (x on the horizontal axis, y on the vertical axis). I want to plot these data sets on the same graph as a scatter plot and put a line of best fit through the points of each chunk of data that I come across. In this case, I will have 2 sets of points and 2 lines of best fit (because I have 2 chunks of data). I would also like to calculate the R-squared value.
The code that I have so far is shown below:
fh1 = figure;
hold all;
ah1 = gca;
% plot graphs:
for d = 1:max_number_zeros+num_rows
    if sequence_holder(d,1)==0
        continue;
    end
    c = d;
    while sequence_holder(c,1)~=0
        plot(ah1,sequence_holder(c,1),sequence_holder(c,num_cols),'*');
        %lsline;
        c =c+1;
        continue;
    end
end
sequence_holder is the array with the data in it. I can only plot the first set of data, with no line of best fit. I tried lsline, but that didn't work.
Can anyone tell me how to
-plot both sets of data
-draw a line of best fit and get the regression coefficient?
The first part could be done in a number of ways. I would test the second column for zeroness
zerodata = A(:,2) == 0;
which will give you a logical array of ones and zeros like [1 1 1 0 1 0 0 ...]. Then you can use this to split up your input. You could look at the diff of that array and test it for positive or negative sign. Your data starts on 0 so you won't get a transition for that one, so you'd need to think of some way to deal with that or the opposite case, unless you know for certain that it will always be one way or the other. You could just test the first element, or you could insert a known value at the start of your input array.
You will then have to store your chunks. As they may vary in both length and number, you wouldn't put them into one big matrix, but you still want to be able to use a loop. I would use either a cell array, where each cell in a row contains the x or y data for a chunk, or a struct array where, say, structarray(1).x and structarray(1).y hold your data values.
Then you can iterate through your struct array and call plot on each chunk separately.
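A minimal sketch of the splitting step described above, using the array A from the question. Padding the logical vector with false on both ends is one way to handle data that starts or ends inside a chunk; the names isData, starts, stops and chunks are made up for illustration.

```matlab
A = [0 0; 0 0; 0 0; 3 9; 4 10; 5 11; 6 12; 0 0; 0 0; 0 0; 0 0; ...
     7 9.7; 8 9.8; 9 9.9; 0 0; 0 0];

% A row counts as data if it is not a [0 0] row
isData = any(A ~= 0, 2);

% Pad with false so a chunk touching the first or last row
% still produces a start and a stop transition
d = diff([false; isData; false]);
starts = find(d == 1);        % first row of each chunk
stops  = find(d == -1) - 1;   % last row of each chunk

% Store the chunks in a struct array since their lengths differ
chunks = struct('x', {}, 'y', {});
for n = 1:numel(starts)
    rows = starts(n):stops(n);
    chunks(n).x = A(rows, 1);
    chunks(n).y = A(rows, 2);
end

fprintf('%d chunks found\n', numel(chunks));
```

With hold on in effect, calling plot(chunks(n).x, chunks(n).y, '*') inside a loop over chunks would then draw each set of points on the same axes.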
As for fitting you can use the fit command. It's complex and has lots of options so you should check out the help first (type doc fit inside the console to get the inline help, which is the same as the website help in content). The short version is that you can do a simple linear fit like this
[fitobject, gof] = fit(x, y, 'poly1');
where 'poly1' specifies you want a first order polynomial (i.e. straight line) and the output arguments give you a fit object, which you can do various things with like plot or interpolate, and the second gives you a struct containing among other things the r^2 and adjusted r^2. The fitobject also contains your fit coefficients.
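Note that fit is part of the Curve Fitting Toolbox. If that toolbox is not available, a straight-line fit and its R-squared value can also be computed with polyfit and polyval from base MATLAB. The sketch below swaps in that alternative for the first chunk of the question's data; the variable names are made up for illustration.

```matlab
% First chunk from the question, as column vectors
x = [3; 4; 5; 6];
y = [9; 10; 11; 12];

% First-order polynomial fit: p(1) is the slope, p(2) the intercept
p = polyfit(x, y, 1);
yfit = polyval(p, x);

% R-squared = 1 - (residual sum of squares / total sum of squares)
ssRes = sum((y - yfit).^2);
ssTot = sum((y - mean(y)).^2);
r2 = 1 - ssRes / ssTot;

fprintf('slope %.4f, intercept %.4f, R^2 %.4f\n', p(1), p(2), r2);
```

This chunk is exactly linear, so the slope and intercept come out as 1 and 6 up to rounding, and R-squared is 1. To draw the line, plot(x, yfit) on the same axes as the scatter points.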
I have a matrix say ABC which has 60 rows and 120 columns.
Depending on another matrix X which is a 120 entry long array I would like to populate another matrix as follows:
if X(i)=1, column i is added to the matrix ABC_Copy.
if X(i)=0, column i is skipped and the loop continues.
The obvious approach is to iterate i from 1 to 120, which is the length of X and the number of columns in ABC.
How can we implement this in matlab without iterating entirely and placing each value individually?
You can use logical arrays for indexing in Matlab:
ABC_Copy = ABC(:, X==1);
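A small worked example of the same indexing, with a 4-by-4 matrix standing in for the 60-by-120 one from the question:

```matlab
ABC = magic(4);        % stand-in for the 60x120 matrix
X   = [1 0 1 0];       % keep columns 1 and 3, skip columns 2 and 4

% X==1 is a logical mask; only columns where it is true are copied
ABC_Copy = ABC(:, X == 1);

disp(ABC_Copy)
```

If X only ever contains 0 and 1, ABC(:, logical(X)) selects the same columns.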
I have two datasets. The original has all the labels and descriptions of each variable. The second is a reduced version of the original, used for specific experiments, but it has none of the information about the variables that the original contains. So, I'm trying to match the two datasets.
My question is: how can I find out whether a row from the original dataset is present in the new dataset, given that a slight data reduction has been performed in both matrix dimensions?
To be more specific, the original dataset is a 24481 x 117 matrix and the new one is a 24188 x 97 matrix. However, the problem is that I have no information about which rows or columns were or were not included in the new dataset.
What you can do is zero-pad the smaller matrix so that it matches the size of the original data. Then use
find(A==B)
where A and B are the two matrices.
Using the intersect function worked for me. Since a data reduction has been performed in both dimensions, I first look for the intersection of the first column vectors of the two matrices (assuming that at least the column order has been preserved in the reduction).
>> M = magic(5)
M =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>> X = M([2,3,5], [1,2,4,5])
X =
23 5 14 16
4 6 20 22
11 18 2 9
>> [c,xi, mi]=intersect(X(:,1),M(:,1))
mi is a column vector holding the indices of the rows of the original matrix M that are present in the reduced matrix X (ordered by the sorted common values c, not by row position).
Doing the same for the first rows of the two matrices gives an index vector of the columns that were selected from the original matrix M.
>> [c,xi, mi]=intersect(X(1,:),M(1,:))
This solution has a drawback: when the first row or column of the original matrix was not selected in the new set, the intersection comes back empty, and you have to move on to the next row or column of the original matrix until one matches. Luckily, usually not too far ;).
>> [c,xi, mi]=intersect(X(1,:),M(2,:))
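The manual stepping can be automated. The sketch below is a variation, not the exact method above: it uses ismember instead of intersect, because ismember keeps the index order of X rather than sorting by value. It assumes the values within a row are distinct enough that a row of X matches only one row of M, which holds for magic(5) but may not for real data. The names colIdx and rowIdx are made up for illustration.

```matlab
M = magic(5);
X = M([2, 3, 5], [1, 2, 4, 5]);   % reduced copy with unknown rows/columns

% Step through the rows of M until one contains every value of X(1,:)
colIdx = [];
for r = 1:size(M, 1)
    [tf, loc] = ismember(X(1, :), M(r, :));
    if all(tf)
        colIdx = loc;             % columns of M, in X's column order
        break
    end
end

% With the columns known, match whole rows of X against M
[~, rowIdx] = ismember(X, M(:, colIdx), 'rows');

disp(colIdx)
disp(rowIdx.')
```

Here the first row of M is not part of X, so the loop stops at the second row and recovers the column and row selections used to build X.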