Still new to the programming game, but I need a little help! I'm not exactly sure how to describe what I want to do, but I'll give it my best shot. I have a set of numbers produced by an algorithm I've put together, e.g.:
....
10 10 10
11 11 11
12 1 2
13 3 4
14 12 13
15 6 7
16 5 15
17 8 9
....
Essentially, what I want to do is assign these index numbers to groups. Let's say I start with the number 14 in the first column. It is going to belong to group 1, so I label it "1" in a new column in row 14. The second and third columns show other index numbers that are grouped with index 14. So I use code like:
FindLHS = find(matrix(:,1)==matrix(14,2));
and
FindRHS = find(matrix(:,1)==matrix(14,3));
so clearly this will produce the results of
FindLHS = 12
FindRHS = 13
I will then label both 12 and 13 as belonging to group "1", as I did for 14.
Now my problem is that I want to repeat this procedure for both 12 and 13, finding and labelling the indices they point to, namely (1,2) and (3,4). Is there a way to repeat that code for indices 1, 2, 3 and 4 as well? The real dataset has over 5000 data points in it...
Do you understand what I mean?
Thanks
James
All you really want to do is find wherever matrix(:,1) contains one of the numbers you've already found, include the numbers in the second and third columns into your group list (presuming they aren't already there), and stop when that list stops growing, right? This may not be the most efficient way of doing it but it gives you the basic idea:
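num = 14;      % starting index
oldnum = [];   % empty, so the loop runs at least once
while ~(numel(oldnum)==numel(num))
    oldnum = num;
    idx = ismember(matrix(:,1),oldnum);   % rows whose index is already in the group
    num = unique(matrix(idx,:));          % add the indices those rows point to
end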
Output:
num =
1
2
3
4
12
13
14
Now if your first column is literally just your numbers 1 through 5000 in order, you don't need to even find the index, you can just use your number list directly.
To do this for multiple groups you would just need an outer loop that stores the information for each group, then picks out the next unused number. I'm presuming that your individual groups are consistent so that no matter which of those numbers you pick you end up with the same result - e.g. starting at 2 or 14 gives you the same result (if not, it becomes more complex).
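A rough sketch of that outer loop, under some assumptions of mine: the small matrix below is made-up data, labels and g are names I picked, and the seed for each new group is the largest unlabelled index. That last choice assumes a row only ever points to smaller indices (as in the sample, where 14 points to 12 and 13), so starting from the top of a chain collects the whole chain.

```matlab
% Hypothetical 8-row table: column 1 is the index, columns 2-3 are linked indices.
matrix = [1 1 1; 2 2 2; 3 1 2; 4 4 4; 5 3 4; 6 6 6; 7 7 7; 8 6 7];

labels = zeros(size(matrix,1), 1);   % 0 means "not assigned to a group yet"
g = 0;                               % number of groups found so far
while any(labels == 0)
    g = g + 1;
    % Seed with the largest unlabelled index (assumption: links point downwards).
    start = find(labels == 0, 1, 'last');
    num = matrix(start, 1);
    oldnum = [];
    % Inner loop from the answer: grow the list until it stops growing.
    while numel(oldnum) ~= numel(num)
        oldnum = num;
        idx = ismember(matrix(:,1), oldnum);
        num = unique(matrix(idx,:));
    end
    labels(ismember(matrix(:,1), num)) = g;   % label every member of the group
end
```

With this made-up data, rows 6 to 8 end up in group 1 and rows 1 to 5 in group 2. If your links can point in both directions, the seed choice would need rethinking.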
To be generic the issue is: I need to create group means that exclude own group observations before calculating the mean.
As an example: let's say I have firms, products and product characteristics. Each firm (f=1,...,F) produces several products (i=1,...,I). I would like to create a group mean for a certain characteristic of the product i of firm f, using all products of all firms, excluding firm f product observations.
So I could have a dataset like this:
firm prod width
1 1 30
1 2 10
1 3 20
2 1 25
2 2 15
2 4 40
3 2 10
3 4 35
To reproduce the table:
firm=[1,1,1,2,2,2,3,3]
prod=[1,2,3,1,2,4,2,4]
width=[30,10,20,25,15,40,10,35]
x=[firm' prod' width']
Then I want to estimate a mean which will use values of all products of all other firms, that is excluding all firm 1 products. In this case, my grouping is at the firm level. (This mean is to be used as an instrumental variable for the width of all products in firm 1.)
So, the mean that I should find is: (25+15+40+10+35)/5=25
Then repeat the process for other firms.
firm prod width mean_desired
1 1 30 25
1 2 10 25
1 3 20 25
2 1 25
2 2 15
2 4 40
3 2 10
3 4 35
I guess my biggest difficulty is excluding the own-firm values.
This question is related to this page: Calculating group mean/medians in MATLAB where group ID is in a separate column. But there, the own group is not excluded.
p.s.: just out of curiosity if anyone works in economics, I am actually trying to construct Hausman or BLP instruments.
Here's a way that avoids loops, but may be memory-expensive. Let x denote your three-column data matrix.
m = bsxfun(@ne, x(:,1).', unique(x(:,1))); % or m = ~sparse(x(:,1), 1:size(x,1), true);
result = m*x(:,3);
result = result./sum(m,2);
This creates a zero-one matrix m such that each row of m multiplied by the width column of x (second line of code) gives the sum of other groups. m is built by comparing each entry in the firm column of x with the unique values of that column (first line). Then, dividing by the respective count of other groups (third line) gives the desired result.
If you need the results repeated as per the original firm column, use result(x(:,1)). This works as long as the firm IDs are the consecutive integers 1,...,F, as in the example.
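A different way to get the same numbers without building m, offered only as a sketch: take the overall sum and count, then subtract each firm's own sum and count. unique and accumarray are standard functions; the variable names (gid, groupSum, groupCount, otherMean, meanDesired) are mine.

```matlab
% Sample data from the question
firm  = [1 1 1 2 2 2 3 3]';
width = [30 10 20 25 15 40 10 35]';

% Map firm IDs to consecutive group numbers 1..F (works for arbitrary IDs)
[~, ~, gid] = unique(firm);
gid = gid(:);

groupSum   = accumarray(gid, width);   % sum of width within each firm
groupCount = accumarray(gid, 1);       % number of products of each firm

% Mean over all observations that do NOT belong to the firm
otherMean = (sum(width) - groupSum) ./ (numel(width) - groupCount);

% Repeat the result for every row of the original data
meanDesired = otherMean(gid);
```

For the sample data this gives 25 for firm 1, 21 for firm 2 and about 23.33 for firm 3.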
I have a 15 element array = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15];.
I was wondering if there was a command such that it would step through iterations of the array without repeating itself. In other words, since there is a chance that randperm() will create the same matrix twice, I want to step through each permutation only once and perform a calculation.
I concede that there are factorial(15) permutations, but for my purposes, these two vectors (and similar) are identical and don't need to be counted twice:
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
[15 14 13 12 11 10 9 8 7 6 5 4 3 2 1]
Thus, is there any way to step through this?
Thanks.
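I think what you are looking for is perms. randperm returns a single random permutation; you want all the permutations.
So use
my_permutations = perms(1:15);
If forward-backward is the same as backward-forward, then you might try using the top half of the list only...
my_permutation_to_use = my_permutations(1:size(my_permutations,1)/2, :);
Be aware that this depends on the row order perms returns and does not in general keep exactly one permutation from each forward/backward pair. With four elements, for instance, [4 1 2 3] and its reverse [3 2 1 4] both land in the top half.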
You could compare each permutation against all earlier ones, but that would require storing every permutation seen so far. A local decision that looks only at the current permutation is better. I recommend this simple rule:
A permutation is valid if its first element is smaller than its last element.
A permutation is redundant if its first element is larger than its last element.
For small sizes, this could simply be done with this code:
%generate all permutations
x=perms(1:10);
%select only the valid lines, remove all redundant lines
x=x(x(:,1)<x(:,end),:);
The remaining problem is that generating x for 1:15 exceeds any realistic memory limit (15! is roughly 1.3e12 rows) and would require about 100h.
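One way around the memory limit, sketched here as my own suggestion rather than anything from the answers above, is to step through the permutations one at a time in lexicographic order and apply the first-smaller-than-last rule to each. Nothing is stored except the current permutation p. The size n = 4 and the counter are only there so the sketch runs instantly.

```matlab
n = 4;            % small size for illustration; the question uses 15
p = 1:n;          % first permutation in lexicographic order
count = 0;        % number of valid (non-redundant) permutations visited
done = false;
while ~done
    if p(1) < p(end)
        count = count + 1;   % do the real calculation on p here
    end
    % Advance p to the next permutation in lexicographic order.
    i = find(p(1:end-1) < p(2:end), 1, 'last');   % last ascent
    if isempty(i)
        done = true;         % p is fully descending: that was the last one
    else
        j = find(p > p(i), 1, 'last');   % last element larger than p(i)
        p([i j]) = p([j i]);             % swap them
        p(i+1:end) = p(end:-1:i+1);      % reverse the tail
    end
end
```

This visits 12 of the 24 permutations of 1:4. It removes the memory problem only; for 1:15 the loop still has to step through all 15! permutations, so the running time stays enormous.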
Edit for clarity:
I have two matrices, p.valor (2x1000) and p.clase (1x1000). p.valor consists of random numbers spanning from -6 to 6. p.clase contains, in order, 200 1s, 200 2s and 600 3s. What I want to do is
plot p.valor using a different color/marker for each class given in p.clase, as in the following figure.
I first wrote this in order to find out which locations in p.valor correspond to the 1s, 2s and 3s in p.clase:
%identify the locations of all 1,2 respective 3 in p.clase
f1=find(p.clase==1);
f2=find(p.clase==2);
f3=find(p.clase==3);
%define vectors in p.valor representing the locations of 1,2,3 in p.clase
x1=p.valor(f1);
x2=p.valor(f2);
x3=p.valor(f3);
There are 200 ones in p.clase, so f1 is 1:200 and x1 is p.valor(1:200). The problem is that each 1 (and, respectively, each 2 and 3) corresponds to TWO elements in p.valor, since p.valor has 2 rows. So even though p.clase, and thus x1, only has one row, I need to include both elements in each of the columns listed in f1.
The different alternatives I have tried have not yet been successful. Examples:
plot(x1(:,1), x1(:,2),'ro')
hold on
plot(x2(:,1),x2(:,2),'k.')
hold on
plot(x3(:,1),x3(:,2),'b+')
and
y1=p.valor(201:400);
y2=p.valor(601:800);
y3=p.valor(1401:2000);
scatter(x1,y1,'k+')
hold on
scatter(x2,y1,'b.')
hold on
scatter(x3,y1,'ro')
and
y1=p.valor(201:400);
y2=p.valor(601:800);
y3=p.valor(1401:2000);
plot(x1,y1,'k+')
hold on
plot(x2,y2,'b.')
hold on
plot(x3,y3,'ro')
My figures have the axes right, but the plotted values do not match the correct figure provided (see top of the question).
Ergo, my question is: how do I include the values from the second row of p.valor in my plotted figure?
I hope this is clearer!
Values from both rows can be accessed simultaneously using this syntax:
X=p.valor(:,findX)
Here findX is a vector of column indices (for example f1 from the question), and the resulting X is a matrix with 2 rows and length(findX) columns.
M = magic(5)
M =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
M2 = M(1:2, :)
M2 =
17 24 1 8 15
23 5 7 14 16
Matlab uses column-major indexing. So to get to the next row, you actually just have to add 1. Adding 2 to an index on M2 here gets you to the next column, as does adding 5 to an index on M.
E.g. M2(3) is 24. To get to the next row you just add one, i.e. M2(4) returns 5. To get to the next column, add the number of rows, so M2(3 + 2) gets you 1. If you add the number of columns like you suggested, you just get gibberish.
So your method is very wrong. Freude's method is 100% correct; it's much easier to use subscript indexing than linear indexing for this. But I just wanted to explain why what you were trying doesn't work in Matlab (aside from the fact that X=p.valor(findX findX+1000) gives you a syntax error; I assume you meant X=p.valor([findX findX+1000])).
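Putting the subscript-indexing advice together, a plot along these lines might do what the question asks. This is only a sketch: the random data stands in for the real p.valor, the marker styles are copied from the first attempt in the question, and styles and cols are names I made up.

```matlab
% Hypothetical data with the same layout as in the question
p.valor = 12*rand(2, 1000) - 6;                          % 2x1000, values in [-6, 6]
p.clase = [ones(1,200), 2*ones(1,200), 3*ones(1,600)];   % 1x1000 class labels

styles = {'ro', 'k.', 'b+'};   % one marker style per class
figure;
hold on
for k = 1:3
    cols = (p.clase == k);     % logical mask selecting the columns of class k
    % Row 1 supplies the x values, row 2 the y values of the same columns.
    plot(p.valor(1, cols), p.valor(2, cols), styles{k});
end
hold off
```

The key point is that the mask selects whole columns, so each point keeps its two coordinates together. A single hold on is enough; it does not need repeating before every plot call.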
Suppose I have an array age=[16 17 25 18 32 89 43 55] which holds the ages of a certain list of people. I also have a second array, groups=[1 1 2 1 3 2 1 4], which denotes the group each person belongs to, i.e. the person whose age is 55 is the only person in group 4, there are four people in group 1, etc.
I want to calculate the combined sum of ages in each group. That is, the result I want to get in this case is an array of 4 elements, its first entry containing the sum of ages of people belonging to group #1 (16+17+18+43), its second entry containing the sum of ages of people belonging to group #2 (25+89), etc.
I know of course how to do this with a for loop, but is it possible to do this using some variation of sum or something similar, so as to tap into matlab's vector optimization?
The code in @Ismail's answer is fine, but you could also try this:
>> accumarray(groups', age')
ans =
94
114
32
55
I find it hard to get an appreciation from the documentation exactly what accumarray can do in its full generality, but this is a great example of a simple usage. It's worth learning how to use it effectively, as once you've worked it out it's very powerful - and it will be a lot faster (when used on a larger example) than arrayfun.
You can use arrayfun and unique as follows:
arrayfun(@(x) sum(age(groups==x)), unique(groups))
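To see that the two approaches agree, and to give a taste of what else accumarray can do, here is a small sketch. Passing a function handle as the fourth argument is part of accumarray's documented interface; the variable names (sums1, sums2, means) are mine.

```matlab
age    = [16 17 25 18 32 89 43 55];
groups = [1 1 2 1 3 2 1 4];

% Group sums, computed both ways
sums1 = accumarray(groups', age');
sums2 = arrayfun(@(x) sum(age(groups == x)), unique(groups))';

% Same grouping, but applying mean instead of the default sum
means = accumarray(groups', age', [], @mean);
```

Both sums1 and sums2 come out as [94; 114; 32; 55], and means is [23.5; 57; 32; 55].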
I have an array of size m = 11, and my hash function uses the division method: h(k) = k mod m
I have an integer k = 10, and I get 10 mod 11 = -1, so where should I put this key in the array? Should I put it in the slot whose index is 10?
Please help me, thanks.
EDITED: to make my question clearer, suppose for example I have integers like k = 10,22,31,4,15,28,17,88,59
Would the array look like this? Thanks
10 9 8 7 6 5 4 3 2 1 0 index
10 31 59 17 28 4 15 88 22 keys
As it's usually done, 10 mod 11 is 10, so yes, you'd normally use index 10.
Edit: To generalize: at least as it's normally defined, given two positive inputs, a modulo will always produce a non-negative result (somewhere in 0 to m-1). As such, your questions about what to do with negative results don't really make sense with respect to the normal definition.
If you really do have the possibility of getting a negative result, my immediate reaction would be to switch to some language that will produce a reasonable result. If you can't do that, then you'd probably want to move the value into the correct range by adding m to the negative number until you get a number in the range [0..m) so it fits the normal definition of mod, then use that as your index.
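As a concrete illustration, here it is in MATLAB, chosen only as an example language. Its mod function takes the sign of the divisor, so the result is never negative for a positive m, while rem takes the sign of the dividend and can go negative. The names slots, a, b and fixed are mine, and the loop is just the "add m until it is in range" idea from above.

```matlab
m = 11;
keys = [10 22 31 4 15 28 17 88 59];

% Zero-based slot for each key, h(k) = k mod m
slots = mod(keys, m);

% Behaviour on a negative input
a = mod(-1, m);   % result takes the sign of m
b = rem(-1, m);   % result takes the sign of the dividend

% Moving a negative remainder into the range [0, m) by adding m
fixed = b;
while fixed < 0
    fixed = fixed + m;
end
```

Here slots is [10 0 9 4 4 6 6 0 4], a is 10, b is -1 and fixed is 10. Note that several keys map to the same slot (4, 15 and 59 all give 4, for example), so a real table also needs some way of handling collisions, which this sketch does not cover.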