Matlab: Sum elements in array into another array - matlab

Suppose I have an array age=[16 17 25 18 32 89 43 55] which holds the ages of a certain list of people. I also have a second array, groups=[1 1 2 1 3 2 1 4], which denotes the group each person belongs to; i.e., the person whose age is 55 is the only person in group 4, there are four people in group 1, etc.
I want to calculate the combined sum of ages in each group. That is, the result I want to get in this case is an array of 4 elements, its first entry containing the sum of ages of people belonging to group #1 (16+17+18+43), its second entry containing the sum of ages of people belonging to group #2 (25+89), etc.
I know of course how to do this with a for loop, but is it possible to do this using some variation of sum or something similar, so as to tap into MATLAB's vector optimization?

The code in @Ismail's answer is fine, but you could also try this:
>> accumarray(groups', age')
ans =
94
114
32
55
I find it hard to get an appreciation from the documentation exactly what accumarray can do in its full generality, but this is a great example of a simple usage. It's worth learning how to use it effectively, as once you've worked it out it's very powerful - and it will be a lot faster (when used on a larger example) than arrayfun.
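As a rough illustration of the generality mentioned above, here is a minimal sketch of a few other per-group reductions with accumarray on the same data. The variable names (counts, meanAge, oldest) are made up for this example; the fourth argument is a function handle applied to the values collected for each group.

```matlab
age    = [16 17 25 18 32 89 43 55];
groups = [1 1 2 1 3 2 1 4];

% accumarray expects subscripts and values as column vectors
subs = groups(:);
vals = age(:);

counts  = accumarray(subs, 1);                 % people per group:   [4; 2; 1; 1]
meanAge = accumarray(subs, vals, [], @mean);   % mean age per group: [23.5; 57; 32; 55]
oldest  = accumarray(subs, vals, [], @max);    % oldest per group:   [43; 89; 32; 55]
```

With no function handle, accumarray sums the values, which is the case used in the answer.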

You can use arrayfun and unique as follows:
arrayfun(@(x) sum(age(groups==x)), unique(groups))
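To check that the approaches in this thread agree, the sketch below runs accumarray, arrayfun and a plain for loop on the data from the question. The variable names are chosen only for this example. Note that accumarray returns a column while arrayfun over a row returns a row, hence the (:) before comparing.

```matlab
age    = [16 17 25 18 32 89 43 55];
groups = [1 1 2 1 3 2 1 4];

% accumarray: subscripts and values as columns, result is 4-by-1
sumsAccum = accumarray(groups(:), age(:));

% arrayfun over the distinct group labels, result is 1-by-4
sumsArrayfun = arrayfun(@(g) sum(age(groups == g)), unique(groups));

% plain loop, for reference
sumsLoop = zeros(max(groups), 1);
for k = 1:numel(age)
    sumsLoop(groups(k)) = sumsLoop(groups(k)) + age(k);
end

allAgree = isequal(sumsAccum, sumsArrayfun(:), sumsLoop);   % true
```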

Related

How to average and equally compare categories with different number of data elements?

I am extracting a list of categories, each of which contains a number of listed values. I am then averaging them and doing a comparison. Here is the general explanation.
Example:
Category 1 has 2 elements
Category 2 has 5 elements
Category 3 has 9 elements
Category 4 has 10 elements
Category 5 has 17 elements
Category 6 has 26 elements
Category 7 has 55 elements
Within each category, there are individual elements that contain a score. I am attempting to compare the average score for the overall category compared to another category equally.
The problem is that because each category contains a different number of elements, the averages are not directly comparable. For example, comparing Category 1 with 2 elements to Category 7 with 55 elements.
If Category 1 had 55 elements, then I could say that I am comparing it on equal terms with a category that also has 55 elements.
My first thought was to say that each category must have 10 scores to equally compare.
For Category 1, I thought about just taking the 2 scores and then adding 8 zeros to show that the category is weaker due to not having the remaining 8, while comparing against Category 7 with its strongest top 10 scores out of the 55, but I don't believe that will provide any useful result.
The same would apply to Category 2 with 5 elements: 5 zeros would be factored in to make 10.
The same would apply to Category 3 with 9 elements: 1 zero would be factored in to make 10.
What I am trying to do is find a way to compare apples to apples by knowing that each category is compared against a set limit of 10 scores to gauge which is stronger in score relative to the other categories.
Is there a process or method in which I can address this? Is there a better way to approach this?
Thank you!
We can't decide for you which aggregate function is the most appropriate for your case. Usually, people use the average or the max, like:
select category, count(1), avg(score), max(score) from scores group by category
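For readers working in MATLAB rather than SQL, the same grouped aggregates can be sketched with accumarray. The data below is hypothetical (made up for illustration), and category and score are assumed to be column vectors with one row per scored element.

```matlab
% Hypothetical data: one entry per scored element
category = [1 1 2 2 2 3 3 3 3]';
score    = [80 60 70 90 50 40 100 60 20]';

n    = accumarray(category, 1);                 % count per category
avg  = accumarray(category, score, [], @mean);  % average score per category
best = accumarray(category, score, [], @max);   % maximum score per category

% One row per category: [category, count, average, max]
summary = [(1:numel(n))' n avg best];
```

Showing the count next to the average makes it visible how many elements each average is based on, which is the concern raised in the question.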

Select every 3rd participant from a list and compute an average in MATLAB

I have 3 lists with grades ranging from 0-100, representing 3 different tests.
Each list has an equal number of indexes (representing participants).
For example, the 1st indexes in the lists (list1, list2 and list3) are the grades of the first participant in the 3 different tests.
I need to make a new group (named group1) that selects every 3rd participant, starting from the first, and then calculate the average of this group's scores.
I'll appreciate any help!!
Hopefully instead of three 'lists' you are actually using a 3 column matrix for this? e.g.
testScores = [20 48 13;
85 90 93;
54 50 56;
76 80 45
...]
From here it is trivial to select every third participant:
testScores(1:3:end, :)
and then to find the mean:
mean(testScores(1:3:end,:),2)
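Here is a small worked sketch of the selection above. The first four rows are the ones shown in the answer; the last three rows are invented so that the example has a seventh participant. Since the question does not say which average is wanted, three readings are shown.

```matlab
% Rows are participants, columns are the three tests.
% Rows 5-7 are made-up values for illustration.
testScores = [20 48  13;
              85 90  93;
              54 50  56;
              76 80  45;
              60 70  80;
              90 95 100;
              30 40  50];

group1 = testScores(1:3:end, :);     % participants 1, 4 and 7

perParticipant = mean(group1, 2);    % average over tests, per participant: [27; 67; 40]
perTest        = mean(group1, 1);    % average over participants, per test: [42 56 36]
overall        = mean(group1(:));    % single average over all selected scores
```

If the data really is in three separate vectors, [list1(:) list2(:) list3(:)] would build the same kind of matrix.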

MATLAB/OCTAVE - Branching loops? or parallel looping?

Still new to the programming game but I need a little help! I'm not exactly sure how to describe what I want to do but I'll give it my best shot. I have a set of numbers produced by an algorithm I've put together, e.g.:
....
10 10 10
11 11 11
12 1 2
13 3 4
14 12 13
15 6 7
16 5 15
17 8 9
....
Essentially what I want to do is assign these index numbers to groups. Let's say I start with the number 14 in the first column. It is going to belong to group 1, so I label it in a new column in row 14 with "1" for group one. The second and the third columns show other index numbers that are grouped with the index 14. So I use code like:
FindLHS = find(matrix(:,1)==matrix(14,2));
and
FindRHS = find(matrix(:,1)==matrix(14,3));
so clearly this will produce the results of
FindLHS = 12
FindRHS = 13
I will then proceed to label both 12 and 13 as belonging to group "1" as I did for 14
Now my problem is that I want to repeat the same procedure for both 12 and 13: finding and labelling the indices they point to, which are (1,2) and (3,4) respectively. Is there a way to repeat that code for indices 1, 2, 3 and 4 as well? The real dataset has over 5000 data points in it...
Do you understand what I mean?
Thanks
James
All you really want to do is find wherever matrix(:,1) contains one of the numbers you've already found, include the numbers in the second and third columns into your group list (presuming they aren't already there), and stop when that list stops growing, right? This may not be the most efficient way of doing it but it gives you the basic idea:
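num = 14;       % starting index (seed of the group)
oldnum = [];    % empty, so the loop body runs at least once
while numel(oldnum) ~= numel(num)
    oldnum = num;
    idx = ismember(matrix(:,1), oldnum);   % rows whose first column is already in the group
    num = unique(matrix(idx,:));           % every index that appears in those rows
end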
Output:
num =
1
2
3
4
12
13
14
Now if your first column is literally just your numbers 1 through 5000 in order, you don't need to even find the index, you can just use your number list directly.
To do this for multiple groups you would just need an outer loop that stores the information for each group, then picks out the next unused number. I'm presuming that your individual groups are consistent so that no matter which of those numbers you pick you end up with the same result - e.g. starting at 2 or 14 gives you the same result (if not, it becomes more complex).
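The outer loop described above might look like the sketch below. It rests on assumptions that are not in the original post: the 9-row matrix is invented, the first column is assumed to be 1..n in order, and, to make the result independent of the starting number, a row is pulled into the group when a member appears in any of its columns (the inner loop in the answer only checks the first column). groupId and nextGroup are names made up for this example.

```matlab
% Hypothetical data: column 1 is the row's own index,
% columns 2 and 3 are the indices grouped with it.
matrix = [1 1 1; 2 2 2; 3 3 3; 4 4 4; 5 1 2; 6 3 4; 7 5 6; 8 8 8; 9 8 8];

n = size(matrix, 1);
groupId = zeros(n, 1);        % 0 means "not yet labelled"
nextGroup = 0;

while any(groupId == 0)
    nextGroup = nextGroup + 1;
    num = find(groupId == 0, 1);     % next unused index seeds a new group
    oldnum = [];
    while numel(oldnum) ~= numel(num)
        oldnum = num;
        idx = any(ismember(matrix, oldnum), 2);   % rows mentioning a member in any column
        num = unique(matrix(idx, :));
    end
    groupId(num) = nextGroup;        % label every index found for this group
end
```

For this made-up matrix, indices 1 to 7 end up in group 1 and indices 8 and 9 in group 2.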

Accumarray how to set "fun" to use only last obs of each bin

Is there a way to let accumarray drop every observation but the last of each group?
What I had in mind was something similar:
lastobs=accumarray(bin,x,[],@(x){pick the observation with the max index in each group});
To give you an example, suppose I have the following:
bin=[1 2 3 3 3 4 4]; %#The bin where the observations should be put
x= [21 3 12 5 6 8 31]; %#The vector of observations
%#The output I would like is as follows
lastobs=[21 3 6 31];
I am actually thinking of accumarray only because I just used it to compute the mean of the observations for each bin. So any function that could do the trick would be fine for me.
Of course you can do this with accumarray. x(end) is the last observation in an array. Note that bin needs to be sorted for this to work, so if it isn't, run
[bin,sortIdx]=sort(bin);x = x(sortIdx); first.
lastobs = accumarray(bin(:),x(:),[],@(x)x(end)); %# bin, x, should be n-by-1
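As a possible variation that does not depend on bin being sorted, one could accumulate the positions instead of the values and take the largest position in each bin. This is only a sketch: it assumes every bin from 1 to max(bin) occurs at least once (an empty bin would yield position 0 and break the indexing), and the shuffled bin2 and x2 are made up for illustration.

```matlab
% Same observations as in the question, but in shuffled order
bin2 = [3 1 4 3 2 4 3];
x2   = [12 21 8 5 3 31 6];

% Largest position per bin = position of the last observation in that bin
lastIdx  = accumarray(bin2(:), (1:numel(bin2))', [], @max);   % [2; 5; 7; 6]
lastobs2 = x2(lastIdx);                                        % 21 3 6 31

% The sort-first route from the answer, for comparison
[binSorted, sortIdx] = sort(bin2);      % sort keeps the original order of ties
xSorted  = x2(sortIdx);
lastobs3 = accumarray(binSorted(:), xSorted(:), [], @(v) v(end));
```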
You already got your accumarray answer, but since you are looking for any solution that will do the job, consider the following application of unique.
Using unique with the 'legacy' option gives the index of the last occurrence of each value, as you need:
>> [~,ia] = unique(bin,'legacy')
ia =
1 2 5 7
>> lastobs = x(ia)
lastobs =
21 3 6 31
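In MATLAB releases where unique accepts the 'last' occurrence flag, the same result can be sketched without the 'legacy' option. The difference to keep in mind is that the non-legacy call returns ia as a column vector.

```matlab
bin = [1 2 3 3 3 4 4];
x   = [21 3 12 5 6 8 31];

[~, ia] = unique(bin, 'last');   % index of the last occurrence of each distinct value
lastobs = x(ia);                 % 21 3 6 31 (same orientation as x)
```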
Now, I love accumarray, as many here are aware, but I actually prefer this solution.

How to eliminate repeated data using MATLAB

I listed my data to something like this. I want to eliminate the repeated data in each row. How can I do that using MATLAB?
13 13 13 13 38 38 38
13 13 42 0 0 0 0
Expected result:
13 38
13 42
Have a look into the function unique. Check out the documentation here.
One way of operating on each of the rows of a matrix would be to call unique inside a loop for each row. Obviously, you could end up with different numbers of unique elements for each row, so you may have to store the result in a cell array.
Hope this helps.
To select unique elements from a vector, you can do:
a = unique(b, 'first');
You can find more about this function from Mathworks site docs.
Update
Building on what Amro said, you could do something like this if the top and bottom aren't guaranteed to be the same length (I'm guessing they aren't, since that seems like an unlikely event):
result = cell(size(a, 1), 1);   % one cell per row, since lengths may differ
for i = 1:size(a, 1)
    result{i} = unique(a(i, :), 'first');
end
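Applied to the data in the question, unique on the second row would also return 0, which the expected result leaves out. The sketch below assumes the zeros are only padding and drops them before calling unique; the 'stable' flag keeps values in their order of first appearance instead of sorting them.

```matlab
a = [13 13 13 13 38 38 38;
     13 13 42  0  0  0  0];

result = cell(size(a, 1), 1);
for i = 1:size(a, 1)
    row = a(i, :);
    row = row(row ~= 0);                 % assumption: zeros are padding, not data
    result{i} = unique(row, 'stable');   % keep order of first appearance
end
% result{1} is [13 38], result{2} is [13 42]
```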