Multiple Columns, way to select closest to a value - matlab

I'm trying to analyze data sets read from CSV files. After a file is read into MATLAB, I am left with a single variable holding only the data. The number of columns and rows changes from file to file. Is there a way to average each column and then create a variable for the column whose average is closest to a certain value? I would also like to select the columns directly before and after this middle column and create variables for them, and to create a variable for the column with the lowest average. Currently, I select the columns manually and create variables for them that way.
For example:
I have this table of numbers. (I used the same number in each column to make the averaging easy in this example.)
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
Let's say I want the column whose average is closest to 3.2.
That column would be column 3, whose average is 3. Then I would want the code to select the column before it (column 2) and the column after it (column 4), as well as the column with the lowest average (column 1).

First get the column averages (I assume the data matrix is in variable X):
Xmns = mean(X, 1); % the second argument forces column-wise averaging, even if X has only one row
Then to find the minimum, use "min":
[val,ind] = min(Xmns);
"val" holds the minimum value, "ind" the corresponding index in Xmns, which is the corresponding column.
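To find the column mean closest to a particular value, you can use min again (note that this overwrites the "val" and "ind" from the previous step, so copy them to other variables first if you need both):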
[val,ind] = min(abs(Xmns-key_val));
Now "ind" holds the index of the column whose mean is closest to "key_val". The next column is just "ind+1" and the previous one "ind-1"; just be sure to check that you are not beyond the ends of the matrix (i.e. ind may already be 1 or size(X,2)).
Also, given the column index "ind", to create a new variable with that column, you just use:
sc= X(:,ind);
and if you want to remove that column from X:
X(:,ind) = [];
and that is all.
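Putting the steps above together, here is a minimal sketch using the example table from the question. The names indMin, indMid, lowCol, midCol, prevCol and nextCol are just illustrative choices, and returning an empty matrix when there is no neighbouring column is an assumption about what you want at the edges.

```matlab
% Example data from the question: every row is 1 2 3 4 5
X = repmat(1:5, 5, 1);
key_val = 3.2;                       % target average

Xmns = mean(X, 1);                   % one average per column

% Column with the lowest average
[~, indMin] = min(Xmns);
lowCol = X(:, indMin);

% Column whose average is closest to key_val
[~, indMid] = min(abs(Xmns - key_val));
midCol = X(:, indMid);

% Neighbouring columns, guarded against the ends of the matrix
if indMid > 1
    prevCol = X(:, indMid - 1);
else
    prevCol = [];                    % assumption: no previous column exists
end
if indMid < size(X, 2)
    nextCol = X(:, indMid + 1);
else
    nextCol = [];                    % assumption: no next column exists
end
```

For the example table this gives indMid = 3 and indMin = 1, so prevCol is column 2 and nextCol is column 4. If two columns are equally close to key_val, min returns the first one.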

Related

How to create a Dataframe/ Table in netlogo?

I am wondering whether it is possible to make a table in NetLogo where the leftmost (first) column holds the who numbers of agent-x (e.g. agent-x 1, agent-x 2, ...) and the top (first) row holds the who numbers of agent-y (e.g. agent-y 23, agent-y 37).
The distances between the agents would be added to the corresponding cells. The result would be something similar to Excel: if column B contains agent-x 1 and row B contains agent-y 23, the cell for column B and row B would contain the distance between the two agents (e.g. 8 units).

Comparing, matching and combining columns of data

I need some help matching data and combining it. I currently have four columns of data in an Excel sheet, similar to the following:
Column: 1 2 3 4
U 3 A 0
W 6 B 0
R 1 C 0
T 9 D 0
... ... ... ...
Column two is a data value that corresponds to the letter in column one. What I need to do is compare column 3 with column 1 and whenever it matches copy the corresponding value from column 2 to column 4.
You might ask why I don't do this manually. I have a spreadsheet with around 100,000 rows, so this really isn't an option!
I do have access to MATLAB and have the information imported, if this would be more easily completed within that environment, please let me know.
As mentioned by @bla, in Excel a formula similar to
=IF(A1=C1,B1,0)
entered in D1 and filled down column D should serve.
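Since the question mentions that the data is already imported into MATLAB, here is a rough sketch of the same idea there. It assumes the letter columns are cell arrays of character vectors and the value column is numeric; the variable names and the sample values in col3 are made up for illustration. The first part mirrors the formula above (same-row comparison only). The second part is an alternative, in case what is wanted is a lookup of each column 3 entry anywhere in column 1.

```matlab
% Hypothetical sample data (col3 is altered so that some entries match)
col1 = {'U'; 'W'; 'R'; 'T'};
col2 = [3; 6; 1; 9];
col3 = {'A'; 'W'; 'C'; 'U'};

% Same-row comparison, equivalent to =IF(A1=C1,B1,0) filled down
sameRow = strcmp(col1, col3);
col4 = zeros(size(col2));
col4(sameRow) = col2(sameRow);

% Alternative: look up each col3 entry anywhere in col1
[found, loc] = ismember(col3, col1);   % loc is 0 where there is no match
col4Lookup = zeros(size(col2));
col4Lookup(found) = col2(loc(found));
```

With this sample data, col4 is [0; 6; 0; 0] and col4Lookup is [0; 6; 0; 3]. Both versions are vectorized, so 100,000 rows should not be a problem.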

Taking the average of one column with respect to another column

I have two columns in a .std file. I want the average of the second-column values for the rows whose first-column value lies in some range (e.g. 1.0 to 1.9). How can I program this in MATLAB?
Say, a is the name of your two column matrix. If you want to find all of the values in the first column in the range of 1.0 - 1.9 and then use those entries to find the mean in the second column you can do this:
f = find(a(:,1)>=1 & a(:,1)<=1.9)
m = mean(a(f,2))
find returns the indices of the rows whose first-column value lies within this range, and a(f,2) accesses those rows in the second column, whose mean is then taken. You can also do it in one line with logical indexing, like so:
m = mean(a((a(:,1)>=1 & a(:,1)<=1.9),2))
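To see it work end to end, here is a small self-contained sketch. The matrix a below is made-up sample data, and inRange is just an illustrative name for the logical mask.

```matlab
% Made-up two-column data: first column is the key, second the value
a = [0.5 10;
     1.0 20;
     1.5 30;
     1.9 40;
     2.3 50];

% Logical mask of rows whose first-column value lies in [1.0, 1.9]
inRange = a(:, 1) >= 1.0 & a(:, 1) <= 1.9;

% Mean of the second column over the selected rows
m = mean(a(inRange, 2));
```

Here rows 2 to 4 are selected, so m is (20 + 30 + 40) / 3 = 30. If no row falls in the range, mean of the empty selection returns NaN, so you may want to check any(inRange) first.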

finding max value in each column and row

Say you have a 2D matrix. Is there a particular algorithm, or a way to do this in Excel or MATLAB, that would find the max of each row and column such that each column and each row has only one maximal number N, and summing all N gives the largest possible sum? This matters when a row or column has a max number that is repeated, as in the simple example below:
1 2 4
3 1 4
1 2 4
the output would be
1 2 4
3 2 4
1 2 4
You are looking for a maximum-weight bipartite matching (the assignment problem) in a complete bipartite graph, where your matrix holds the edge weights. You can compute it using the Hungarian algorithm (a MATLAB implementation is available for download from File Exchange). Since the Hungarian algorithm is usually stated as a minimization and you want the maximum, negate all the numbers in your matrix and feed it to this function. You will get back two outputs: one is the negative of the maximum sum, and the other is a binary matrix with ones where the maximal elements occur in each row and column and zeros everywhere else.
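To make the problem concrete without downloading anything, here is a brute-force sketch for small square matrices. Note that this is not the Hungarian algorithm: it simply tries every permutation of columns, so it is only practical for small n (the number of permutations grows factorially). The names W, bestSum, bestCols and mask are illustrative.

```matlab
% Example matrix from the question
W = [1 2 4;
     3 1 4;
     1 2 4];
n = size(W, 1);                      % assumes W is square

P = perms(1:n);                      % each row: row i is matched to column P(k, i)
bestSum = -Inf;
bestCols = [];
for k = 1:size(P, 1)
    idx = sub2ind(size(W), 1:n, P(k, :));  % linear indices of the chosen entries
    s = sum(W(idx));
    if s > bestSum
        bestSum = s;
        bestCols = P(k, :);
    end
end

% Binary matrix with exactly one chosen entry per row and per column
mask = false(n);
mask(sub2ind(size(W), 1:n, bestCols)) = true;
```

For the example matrix the best achievable sum is 9. Two different assignments reach it, and the loop keeps whichever one it encounters first.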

Removing rows with identical first column value in matlab

I have a cell matrix of size 10000 X 3 in Matlab and I would like to remove rows with the same value in the first column.
That is, if row i and row j have the same value in the first column, I'd like to delete both rows.
I should also say that there can be more than two rows with the same value in the first column and in that case, I'd like to delete all these rows.
How do I do it?
Thanks!
You can use the functions histc, unique and logical indexing to achieve what you want. Here's a small example.
a=randi(10,5,3) %#generate a sample random matrix
a =
5 3 5
5 7 10
7 7 4
8 2 6
8 2 3
[uniqVals,uniqIndx]=unique(a(:,1)); %# get unique values and corresponding indices of the first column of a
count=histc(a(:,1),uniqVals); %# get the bin counts of the elements (i.e., find which are repeated)
b=a(uniqIndx(count==1),:)
b =
7 7 4
Only the row with the non-repeated element is selected. Since you said that you have a cell matrix, simply convert it to a matrix using cell2mat before doing this (this works provided the cells hold numeric values of the same type).
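As a variation on the same idea, here is a sketch that avoids histc (which is listed as not recommended in current MATLAB releases) by counting with accumarray instead. It uses the sample matrix shown above as fixed data; grp, counts and keep are illustrative names.

```matlab
% Sample matrix from the example above
a = [5 3  5;
     5 7 10;
     7 7  4;
     8 2  6;
     8 2  3];

% grp(i) is the position of row i's first-column value among the unique values
[~, ~, grp] = unique(a(:, 1));

% Number of occurrences of each unique value
counts = accumarray(grp(:), 1);

% Keep only the rows whose first-column value occurs exactly once
keep = counts(grp) == 1;
b = a(keep, :);
```

This gives the same single row, 7 7 4, and it keeps the surviving rows in their original order rather than in sorted order.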