How to sort parts of a table in Matlab?

I have a table in Matlab containing test data from different test persons. The test has seven video clips each with four different audio versions, which the test persons must rate on a scale from 1 to 100. Every video clip is presented twice for statistical accuracy. The test person pushes buttons 1-4 on an interface to hear the different audio versions.
My table contains the following columns (among some others that are not relevant for the question):
Test Person ID   Audio Version   Video Clip   Rating
1                1               Forest       40
1                2               Forest       60
1                3               Forest       20
1                4               Forest       100
Now, to minimize bias towards any particular button during the test, the audio versions are randomly permuted between video clips. This means the real data will look more like this (Audio Version not sorted):
Test Person ID   Audio Version   Video Clip   Rating
1                1               Forest       40
1                2               Forest       60
1                3               Forest       20
1                4               Forest       100
1                3               City         10
1                2               City         50
1                1               City         40
1                4               City         7
1                4               Inside       90
1                2               Inside       58
1                1               Inside       22
1                3               Inside       35
What I want to do is maintain the order of the video clips, so still Forest -> City -> Inside, but have the rows within each clip ordered so that the audio version is always 1, 2, 3, 4:
Test Person ID   Audio Version   Video Clip   Rating
1                1               Forest       40
1                2               Forest       60
1                3               Forest       20
1                4               Forest       100
1                1               City         40
1                2               City         50
1                3               City         10
1                4               City         7
1                1               Inside       22
1                2               Inside       58
1                3               Inside       35
1                4               Inside       90
My initial thought was to use the sortrows() function in Matlab and sort in ascending order by audio version together with the video clip. However, the video clips are presented twice at different stages of the test and I want to keep the same sequence of clips in the table, so this doesn't work. The same video clips are also presented to numerous different test persons.
I am using a pre-made function that has to have the data sorted in this way to perform statistical calculations on it. This function takes the Matlab data from a struct and puts it into the table via a for-loop row for row.
Because the data is put into the table via a for-loop, I thought there might be a way to sort only a fixed number of rows at a time, i.e. rows 1-4, 5-8, 9-12 and so on. Do you know if there is a way to sort only part of a table in Matlab?

You can do this with a single call to sortrows by creating an additional numeric column representing your fixed pattern of video presentations. This additional column will simply label each set of 4 rows with a successive integer. It can be concatenated to the beginning of the table, then you can sort by the first and third columns to get the ordering you want:
[~, index] = sortrows([table(ceil((1:size(T, 1)).'./4)) T], [1 3]);
T = T(index, :);
And the output:
T =
    ID    Audio    Video       Rating
    __    _____    ________    ______
    1     1        'Forest'    40
    1     2        'Forest'    60
    1     3        'Forest'    20
    1     4        'Forest'    100
    1     1        'City'      40
    1     2        'City'      50
    1     3        'City'      10
    1     4        'City'      7
    1     1        'Inside'    22
    1     2        'Inside'    58
    1     3        'Inside'    35
    1     4        'Inside'    90
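For reference, here is a self-contained sketch of the same idea on the sample data from the question. The column names (ID, Audio, Video, Rating) and the helper variable block are assumptions made for this illustration, and the block label is kept as a separate vector instead of being concatenated to the table:

```matlab
% Sample data from the question (column names are illustrative assumptions)
ID     = ones(12, 1);
Audio  = [1 2 3 4  3 2 1 4  4 2 1 3].';
Video  = [repmat({'Forest'}, 4, 1); repmat({'City'}, 4, 1); repmat({'Inside'}, 4, 1)];
Rating = [40 60 20 100  10 50 40 7  90 58 22 35].';
T = table(ID, Audio, Video, Rating);

% Label every block of 4 consecutive rows: 1 1 1 1 2 2 2 2 3 3 3 3
block = ceil((1:height(T)).' / 4);

% Sort by block first, then by audio version within each block
[~, index] = sortrows([block, T.Audio]);
T = T(index, :);
```

Because the block label only depends on the row position, the clip order is preserved even when the same clip appears again later in the table.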

Using reshape and sort you can do:
[~,idx] = sort(reshape(Audio_Version,4,[]));
idxtbl = bsxfun(@plus,idx,0:4:(4*size(idx,2))-1);
table2 = table1(idxtbl(:),:);
Explanation:
You can extract the audio column and reshape it to a 4-by-n matrix:
audio = reshape(Audio_Version,4,[]);
then sort columns of audio and get indexes of sorted elements:
[~,idx]=sort(audio);
Here idx represents row numbers of the sorted elements.
Convert idx to linear indexes of the whole column of the table:
idxtbl = bsxfun(@plus,idx,0:4:(4*size(idx,2))-1);
Reorder the table:
table2 = table1(idxtbl(:),:);
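As a rough check of the index arithmetic, the steps can be run on plain vectors. The sample values are taken from the question, and the variable names offsets, rows, sortedAudio and sortedRating are made up for this sketch. It uses implicit expansion instead of bsxfun, which assumes R2016b or newer:

```matlab
% Audio versions and ratings from the question, in presentation order
Audio_Version = [1 2 3 4  3 2 1 4  4 2 1 3].';
Rating        = [40 60 20 100  10 50 40 7  90 58 22 35].';

% One column per block of 4 rows; sort each column independently
[~, idx] = sort(reshape(Audio_Version, 4, []));

% Shift the within-block positions by 0, 4, 8, ... to get absolute row numbers
% (on releases before R2016b, use bsxfun(@plus, idx, offsets) instead)
offsets = 0:4:(numel(Audio_Version) - 4);
rows = idx + offsets;

% Indexing with rows(:) applies the permutation block by block
sortedAudio  = Audio_Version(rows(:));
sortedRating = Rating(rows(:));
```

This approach assumes the number of rows is an exact multiple of 4; otherwise reshape will raise an error.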

Just use sortrows on the specified part of the table, like the following (T is the table, column 2 is the audio version):
T(1:4,:) = sortrows(T(1:4,:), 2);
T(5:8,:) = sortrows(T(5:8,:), 2);
T(9:12,:) = sortrows(T(9:12,:), 2);
First sort that part of the table, then write the sorted part back in its place.
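For more than a few blocks, the same per-block sort can be written as a loop. This is only a sketch: the table name data, its column names and the block size of 4 are assumptions for the example.

```matlab
% Hypothetical example table; 'data' and its column names are assumptions
Audio  = [3 2 1 4  4 2 1 3].';
Rating = [10 50 40 7  90 58 22 35].';
data = table(Audio, Rating);

blockSize = 4;
for first = 1:blockSize:height(data)
    % Rows of the current block (the last block may be shorter)
    rows = first:min(first + blockSize - 1, height(data));
    % Sort only these rows by the Audio column and write them back in place
    data(rows, :) = sortrows(data(rows, :), 'Audio');
end
```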

Related

How to filter out bad values in a data set regarding a matrix in matlab?

I wanted to ask any keen users here how to "filter out" bad values from a very large data matrix in matlab.
e.g.: I have a MATLAB data file containing a 2-by-5000 matrix of doubles which represent x and y coordinates. How is it possible to delete all values above or below a certain limit?
or easier:
(matrix from data file)
1 2 4 134 2
3 5 5 4 2
or
1 2 4 9 2
3 5 5 234 2
After setting a certain limit and deleting the offending column:
1 2 4 2
3 5 5 2
Find the "bad" elements, e.g. A < 0 | A > 20
Find the "good" columns, e.g. ~max(A < 0 | A > 20)
Keep the "good" columns / Remove the "bad" columns, e.g. A(:, ~max(A < 0 | A > 20))
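Put together on the first example matrix from the question, the three steps might look like this. The limits 0 and 20 are the example values used above, and any is used in place of max, which gives the same result on a logical mask:

```matlab
% Example matrix from the question: 2 rows (x and y), one column per point
A = [1 2 4 134 2
     3 5 5   4 2];

lowerLimit = 0;    % assumed limits, chosen only for this example
upperLimit = 20;

bad      = A < lowerLimit | A > upperLimit;  % logical mask of out-of-range values
goodCols = ~any(bad, 1);                     % columns with no bad value in any row
filtered = A(:, goodCols);                   % drop every column that has a bad value
```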

How to replace values in specific lines and on determined conditions

I'm in trouble: I have a dataset with 128597 rows and 10 columns.
I need to change the values in column 10 from row 10276 to row 128597.
And this change has to respect some conditions, like:
If the value is between 11 and 33, the value will become 1
If the value is between 34 and 56, the value will become 5
And so on...
I tried the code below, but it didn't work:
m(10276:128597,10) > 11 & m(10276:128597,10)< 33=1;
Can anyone help me please!!! :)
I think this may help:
a=[1,2,3;2,3,4;3,2,1;2,4,1];
amask=false(size(a));
amask(2:3,3)= a(2:3,3)>3;
a(amask)=9999;
and the result is (a before and after the replacement):
a =
1 2 3
2 3 4
3 2 1
2 4 1
a =
1 2 3
2 3 9999
3 2 1
2 4 1
If your bins are consecutive, i.e. there are no gaps between the categories, you can use discretize. That way you can change all values at once and don't need an insane number of logical operators if you have a lot of bins.
% edges of the bins. As an example there are currently 3 bins ranging from
% 11 to 34, 34 to 57, 57 to 77. The upper values are included in the next bin, i.e. bin 2 goes from 34.0 to 56.999999
edges = [11 34 57 77];
% Example values you want to give to the bins. Length has to be one shorter than edges
replacementvalues = [1 5 9];
% subset of the data you want to change (rows 10276 to 128597 of column 10, as in the question)
subset = m(10276:128597,10);
% bin the subset into categories (values outside all bins give NaN)
binnumber = discretize(subset,edges);
% replace the values in subset that are inside those bins
subset(~isnan(binnumber)) = replacementvalues(binnumber(~isnan(binnumber)));
% replace the values in the original matrix with the updated subset
m(10276:128597,10) = subset;
If there are gaps between the bins, the code can be expanded by making a second set of "do not change" bins.
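To see the binning on something small, here is a sketch with made-up data. The matrix m, the row range, the column number and the names newVal, bin and inBin are all assumptions for the illustration; only the chosen column in the chosen rows is recoded, and values outside every bin are left untouched:

```matlab
% Small stand-in for the real data set (values are made up for illustration)
m = [(1:6).', [5; 11; 33; 34; 56; 80]];

rows = 3:6;              % only these rows may change
col  = 2;                % only this column is recoded
edges  = [11 34 57];     % two bins: [11,34) and [34,57]
newVal = [1 5];          % replacement value for each bin

subset = m(rows, col);
bin    = discretize(subset, edges);   % NaN where the value is outside all bins
inBin  = ~isnan(bin);

subset(inBin) = newVal(bin(inBin));   % 33 -> 1, 34 -> 5, 56 -> 5, 80 unchanged
m(rows, col)  = subset;
```

Note that rows 1 and 2 keep their values even though 11 falls inside the first bin, because they are outside the selected row range.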

Lookup from one matrix into another as you loop through in Matlab

I have a matrix which I have created which is initially a column vector
matrix1 = [1:42]'
I have another matrix which is for arguments sake a 2000-by-2 matrix called matrix2.
Column 1 of matrix2 will always be a number between 1 and 42 and is in any order.
I want to loop through matrix2 column 1 and populate matrix1 column 2 with the result from matrix2 column2 against the corresponding number - at the end of each iteration of the loop I'm going to sum up column2 of matrix2.
so in pseudo code it would be something like this:
for i = 1:length(matrix2)
look at i,1 its a "4" for example - take matrix2 column2
and populate matrix1 next to the 4
(ie column 2 in matrix 1 next to the 4 with this number)
so matrix 1 initially
1
2
3
4
5
6
matrix 2
3 100
1 250
2 200
1 80
4 40
5 50
so after one iteration matrix1 looks like this
1
2
3 100
4
5
6
after iteration 2 matrix1 looks like this
1 250
2
3 100
4
5
6
after iteration 3 matrix1 looks like this
1 250
2 200
3 100
4
5
6
As mentioned, I'll perform a calculation after each iteration, but the important thing is to populate matrix1 column 2. There will obviously be many overwrites or replacements of the numbers in matrix1 column 2, but that's fine since I'm going to perform a calculation after each iteration.
Try this:
for ii = 1:size(matrix2,1)
matrix1(matrix2(ii,1),2) = matrix2(ii,2);
end
You might need to make matrix1 2D from the start (you can just initialize the 2nd column to all 0s).
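A small runnable version with the sample data from the question could look like the following. The per-iteration calculation is not specified in the question, so a running sum of matrix1 column 2 (stored in the hypothetical variable runningTotal) is used here purely as a placeholder:

```matlab
matrix1 = [(1:6).', zeros(6, 1)];   % second column preallocated with zeros
matrix2 = [3 100; 1 250; 2 200; 1 80; 4 40; 5 50];

% Placeholder for the per-iteration calculation (an assumption for this sketch)
runningTotal = zeros(size(matrix2, 1), 1);

for ii = 1:size(matrix2, 1)
    key = matrix2(ii, 1);                    % which row of matrix1 to update
    matrix1(key, 2) = matrix2(ii, 2);        % overwrite the value stored next to key
    runningTotal(ii) = sum(matrix1(:, 2));   % state of column 2 after this iteration
end
```

This relies on column 1 of matrix1 being exactly 1, 2, 3, ... so that the key can be used directly as a row index.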

Pick x smallest elements in Matlab

I have a matrix of integer values; the x axis represents different days and the y axis represents the hour of the day. Each cell holds a number that indicates how many hours of that day match some criterion of the day currently in progress. That's why I need to calculate it for every hour and not only at the end of the time.
The whole issue is that I then have to pick the 5 best days, which are those with the lowest numbers (least matching). So basically, in the matrix it means selecting the 5 lowest numbers in a row and remembering the indexes of the columns where those minima are (I need to know on which day each occurred). Because it can be 5 different days at every point in time, sorting the whole table would make a mess.
I can make it work in a really ugly way by taking the first 5 numbers and then, whenever I find a smaller one along the way, forgetting the biggest of the 5 and remembering the column index of the new one. Yet this solution seems pretty sloppy. There has to be a better way to solve this in Matlab.
Any ideas or functions that can make my life easier?
1 1 0 1 1 1 0 0 1 1
1 2 1 2 2 1 0 1 2 2
For example, in these two rows indexed from 1-10, the first row should return columns
3, 7, 8 and two others (I don't really care which ones).
In the second row it should return columns 7,8,6,1,3.
A = randi(60,100,2);
[min_val,index] = sort(A(:,2),'ascend');
output = [A(index(1:5),1) A(index(1:5),2)];
This should help you (I guess).
Probably one of the simplest (but not the most efficient) ways is to use the sort function (which also returns the sorted indices):
>> [~,index] = sort([1 1 0 1 1 1 0 0 1 1]);
>> index(1:5)
ans =
3 7 8 1 2
>> [~,index] = sort([1 2 1 2 2 1 0 1 2 2]);
>> index(1:5)
ans =
7 1 3 6 8
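To handle every row of the matrix in one call, sort can be applied along the second dimension. A minimal sketch, using the two example rows from the question (the names k, order and cols are made up here):

```matlab
A = [1 1 0 1 1 1 0 0 1 1
     1 2 1 2 2 1 0 1 2 2];

k = 5;                        % how many of the smallest entries to keep per row
[~, order] = sort(A, 2);      % sort each row; order holds the column indices
cols = order(:, 1:k);         % columns of the k smallest values in each row
```

On R2017b or newer, mink(A, k, 2) is another option for the same task; how it orders tied values is not something this sketch relies on.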

sorting a timer in matlab

OK, it seems like a simple problem, but I am having trouble with it.
I have a timer for each data set which resets improperly, and as a result my timing gets mixed up.
Any ideas on how to correct it without losing any data?
Example
timer col ideally should be
timer , mine reads
1 3
2 4
3 5
4 6
5 1
6 2
How do I change column 2, or make a new column which reads like column 1, without changing the order of the rows which have data?
This is just an example, as my files are 86000 rows long. I also have missing timer values which I do not want to lose track of; a missing timer value implies no data for that period of time.
Thanks
EDIT: I do not want to change the other columns. Column 1 is the GPS counter, so it does not sync with the computer timer due to some other issues. I just want to change the row one such that it goes from high to low without affecting other rows, and also take care of missing points (if I did not care about missing points, a simple n=1:max would work).
Missing data in this case is indicated by a missing timer value. For example, I have 4,5,8,9 with 6,7 missing.
OK, let me try to edit again.
It's an 8600 x 80 matrix of data:
timer is one row which should go from 0 to 8600,
but the timer starts at odd times, so my data starts from the middle, let's say 3400, and in the middle of the day my timer goes to 0 and then back to 1.
My other rows are fine. I just need to plot the other sets with the timer as time.
I cannot use T = 1:length(file), as then it ignores missed time stamps (timers).
for example my data reads like
timer , mine reads
1 3
2 4
3 5
4 8
5 9
8 1
9 2
So you can see that time stamps 6,7 are missing.
If I used n=1:length(file)
I would have got
1 2 3 4 5 6 7
which is wrong
I want
1 2 3 4 5 8 9
without changing the order of the other rows, so I cannot use sort on the whole file.
I assume the following problem
data says
3 100
4 101
5 102
NaN 0
1 104
2 105
You want
1 100
2 101
3 102
NaN 0
4 104
5 105
I'd solve the problem like this:
%# create test data
data = [3 100
4 101
5 102
NaN 0
1 104
2 105];
%# find good rows (if missing data are indicated by zeros, use
%# goodRows = data(:,1) > 0; instead)
goodRows = isfinite(data(:,1));
%# count good rows
nGoodRows = sum(goodRows);
%# replace the first column with sequential numbers, but only in good rows
data(goodRows,1) = 1:nGoodRows;
data =
1 100
2 101
3 102
NaN 0
4 104
5 105
EDIT 1
Maybe I understand your question this time
data says
4 101
5 102
1 104
2 105
You want
1 4 101
2 5 102
4 1 104
5 2 105
This can be achieved the following way
%# test data
data = [4 101
5 102
1 104
2 105];
%# use sort to get the correct order of the numbers and add it to the left of data
out = [sort(data(:,1)),data]
out =
1 4 101
2 5 102
4 1 104
5 2 105
EDIT 2
Note that out is the result from the solution in EDIT 1
It seems you want to plot the data so that there is no entry for missing values. One way to do this is to make a plot with dots - there won't be a dot for missing data.
plot(out(:,1),out(:,3),'.')
If you want to plot a line that is interrupted, you have to insert NaNs into out
%# create outNaN, that has NaN-rows for missing entries
outNaN = NaN(max(out(:,1)),size(out,2));
outNaN(out(:,1),:) = out;
%# plot
plot(outNaN(:,1),outNaN(:,3))