How to make an indicator function or handle for glmfit? - matlab

I have a question about creating a handle or indicator function. I have a matrix X that contains 4 explanatory variables and a last column (column 5) of ones and twos indicating whether the observation belongs to group 1 or group 2. I want to run glmfit twice: once for the observations belonging to group 1, and once for the observations belonging to group 2. I therefore need some kind of indicator so that each glmfit call only uses the observations of the specific group. Can somebody tell me how I can do this? I use the following glmfit call:
[B1, dev, stats1] = glmfit(X(:,1:4), Y, 'binomial', 'link', 'logit');

Does the following do the job?
indicator = X(:, 5);
[B1, dev1, stats1] = glmfit(X(indicator==1,1:4), Y(indicator==1,:), 'binomial', 'link', 'logit');
[B2, dev2, stats2] = glmfit(X(indicator==2,1:4), Y(indicator==2,:), 'binomial', 'link', 'logit');
In the above, X(indicator==1, 1:4) employs what is called logical indexing. It provides a submatrix of X consisting of only the rows where indicator has an entry of 1. The same logical index is applied to Y so that the responses stay aligned with the selected rows; glmfit requires X and Y to have the same number of rows.
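As a small illustration of the logical indexing described above, here is a sketch on made-up data (the values of X and Y below are hypothetical and only serve to show which rows are selected; no toolbox is needed to run it):

```matlab
% Toy data (hypothetical): 4 explanatory columns plus a group label in column 5
X = [0.1 0.2 0.3 0.4 1;
     0.5 0.6 0.7 0.8 2;
     0.9 1.0 1.1 1.2 1;
     1.3 1.4 1.5 1.6 2];
Y = [0; 1; 1; 0];        % one response per row of X

mask1 = X(:,5) == 1;     % logical column vector, true where the row is in group 1
X1 = X(mask1, 1:4);      % rows 1 and 3 of X, columns 1 to 4
Y1 = Y(mask1);           % responses belonging to the same rows
```

X1 and Y1 have the same number of rows, which is what a fitting routine such as glmfit expects.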

Related

How to neatly sum values of histogram bins, given a value matrix and an associated category association matrix (bin association)

Consider the example following below, where I have a 10x10 matrix, say A, of random values in some range, say [-5, 5]. I quantize the values of A into 8 categories, 1, ..., 8, such that an additional 10x10 matrix, say qA, describes the category association for each number in A. Finally, I produce the sums of all values assigned to each category. My question regards this final step.
myRange = 5; % values in open interval (-myRange, myRange)
A = myRange*(2*rand(10) - 1);
qA = uencode(A, 3, myRange)+1;
% (+) create "histogram" of sum of values assigned to each bin
myHistogram = zeros(8,1);
for i = 1:numel(A)
myHistogram(qA(i)) = myHistogram(qA(i)) + A(i);
end
bar(myHistogram)
Question: Is there some neater way of doing this, specifically the summation step (+) above? (Some better alternative than explicitly iterating over each element in the matrix A?)
Just as I was about to finish up and post my question I found a satisfying answer to it, however not here on SO. As self-answering is encouraged, I'll post the Q+A instead of aborting this Q posting.
Hence, based on the following Matlab Central thread, one neater solution is as follows:
myRange = 5; % values in open interval (-myRange, myRange)
A = myRange*(2*rand(10) - 1);
qA = uencode(A, 3, myRange)+1;
% or, if you don't have the Signal Processing Toolbox required for 'uencode'
% [~, ~, qA] = histcounts(A, -myRange:myRange/4:myRange);
% (+) create "histogram" of sum of values assigned to each bin
myHistogram = accumarray(qA(:), A(:), [8 1])
Possibly there are alternative or even better ways to do this, performing the quantization and the bin value summation in the same step?
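To make the accumarray step concrete, here is a minimal sketch on a handful of hand-picked numbers (vals and bins are hypothetical stand-ins for A and qA):

```matlab
vals = [1.5 -2 3 0.5 4];   % values to be summed (stand-in for A)
bins = [1 2 1 3 2];        % bin index of each value (stand-in for qA)

% Sum all values that share a bin index; the output has one row per bin
sums = accumarray(bins(:), vals(:), [3 1]);
% bin 1: 1.5 + 3 = 4.5, bin 2: -2 + 4 = 2, bin 3: 0.5
```

Passing the output size explicitly ([3 1] here, [8 1] above) keeps empty trailing bins in the result as zeros.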

Recognizing poker hands from a 2D matrix of values

I have a 1000 x 5 sorted matrix with numbers from 1-13. Each number denotes the numerical value of a playing card. The Ace has the value 1, then the numbers 2 through 10 follow, then the Jack has value 11, Queen with value 12 and King with value 13. Therefore, each row of this matrix constitutes a poker hand. I am trying to create a program that recognizes poker hands using these cards that are enumerated in this way.
For example:
A = [1 1 2 4 5; 2 3 4 5 7; 3, 3, 5, 5, 6; 8, 8, 8, 9, 9]
Therefore, in this matrix A, the first row has a pair (1,1), the second row has a high card (7), the third row has two pair ((3,3) and (5,5)) and the last one is a full house (a pair of 9s and three of a kind of 8s).
Is there a good way to do this in MATLAB?
bsxfun won't work for this situation. This is a counting problem. It's all a matter of counting what you have. Specifically, poker hands deal with counting up how much of each card you have, and figuring out the right combination of counts to get a valid hand. Here's a nice picture that shows us every possible poker hand known to man:
[Image: chart of poker hand rankings. Source: http://www.bestonlinecasino.tips]
Because we don't have the suits in your matrix, I'm going to ignore the Royal Flush, Straight Flush and the Flush scenario. Every hand you want to recognize can be chalked up to taking a histogram of each row with bins from 1 to 13, and determining if (in order of rank):
Situation #1: A high hand - all of the bins have a bin count of exactly 1
Situation #2: A pair - you have exactly 1 bin that has a count of 2
Situation #3: A two pair - you have exactly 2 bins that have a count of 2
Situation #4: A three of a kind - you have exactly 1 bin that has a count of 3
Situation #5: Straight - You don't need to compute the histogram here. Simply sort your hand, and take neighbouring differences and make sure that the difference between successive values is 1.
Situation #6: Full House - you have exactly 1 bin that has a count of 2 and you have exactly 1 bin that has a count of 3.
Situation #7: Four of a kind - you have exactly 1 bin that has a count of 4.
As such, find the histogram of your hand using histc or histcounts depending on your MATLAB version. I would also pre-sort your hand over each row to make things simpler when finding a straight. You mentioned in your post that the matrix is pre-sorted, but I'm going to assume the general case where it may not be sorted.
As such, here's some pre-processing code, given that your matrix is in A:
Asort = sort(A,2); %// Sort rowwise
diffSort = diff(Asort, 1, 2); %// Take row-wise differences
counts = histc(Asort, 1:13, 2); %// Count each row up
diffSort contains the differences between neighbouring columns in each row, and counts is an N x 13 matrix, where N is the total number of hands you're considering (1000 in your case). For each row, it tells you how many of each card value has been encountered. So all you have to do now is go through each situation and see what you have.
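As a quick sanity check of this pre-processing, here is a sketch on two hand-picked hands (the matrix hands is a hypothetical example, not part of the original data):

```matlab
hands = [8 8 8 9 9;     % a full house
         4 2 6 3 5];    % a straight, deliberately unsorted

handsSorted = sort(hands, 2);          % sort each row
d = diff(handsSorted, 1, 2);           % 2 x 4 neighbouring differences
c = histc(handsSorted, 1:13, 2);       % 2 x 13 counts per card value

isStraight = all(d == 1, 2);           % true only for the second hand
```

For the first hand, c(1,8) is 3 and c(1,9) is 2; for the second hand, every difference in d is 1.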
Let's make an ID array where it's a vector that is the same size as the number of hands you have, and the ID tells you which hand we have played. Specifically:
* ID = 1 --> High Hand
* ID = 2 --> One Pair
* ID = 3 --> Two Pairs
* ID = 4 --> Three of a Kind
* ID = 5 --> Straight
* ID = 6 --> Full House
* ID = 7 --> Four of a Kind
As such, here's what you'd do to check for each situation, after allocating out to contain our IDs:
%// To store IDs
out = zeros(size(A,1),1);
%// Variables for later
counts1 = sum(counts == 1, 2);
counts2 = sum(counts == 2, 2);
counts3 = sum(counts == 3, 2);
counts4 = sum(counts == 4, 2);
%// Situation 1 - High Hand
check = counts1 == 5;
out(check) = 1;
%// Situation 2 - One Pair
check = counts2 == 1;
out(check) = 2;
%// Situation 3 - Two Pair
check = counts2 == 2;
out(check) = 3;
%// Situation 4 - Three of a Kind
check = counts3 == 1;
out(check) = 4;
%// Situation 5 - Straight
check = all(diffSort == 1, 2);
out(check) = 5;
%// Situation 6 - Full House
check = counts2 == 1 & counts3 == 1;
out(check) = 6;
%// Situation 7 - Four of a Kind
check = counts4 == 1;
out(check) = 7;
Situation #1 basically checks to see if all of the bins that are encountered just contain 1 card. If we check for all bins that just have 1 count and we sum all of them together, we should get 5 cards.
Situation #2 checks to see if we have exactly 1 bin that contains 2 cards.
Situation #3 checks if we have 2 bins that contain 2 cards.
Situation #4 checks if we have only 1 bin that contains 3 cards.
Situation #5 checks if the neighbouring differences for each row of the sorted result are all equal to 1. This means that the entire row consists of 1 when finding neighbouring distances. Should this be the case, then we have a straight. We use all and check every row independently to see if all values are equal to 1.
Situation #6 checks to see if we have one bin that contains 2 cards and one bin that contains 3 cards.
Finally, Situation #7 checks to see if we have 1 bin that contains 4 cards.
A couple of things to note:
A straight hand is also technically a high hand given our definition, but because the straight check happens later in the pipeline, any hands that were originally assigned a high hand get assigned to be a straight... so that's OK for us.
In addition, a full house can also be a three of a kind because we're only considering the three of a kind that the full house contains. However, the later check for the full house will also include checking for a pair of cards, and so any hands that were assigned a three of a kind will become full houses eventually.
One more thing I'd like to note is that if you have an invalid poker hand, it will automatically get assigned a value of 0.
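The overwriting behaviour described in these notes can be traced on a single hand. The following is only an illustrative sketch for one hypothetical full house, written with scalar checks instead of the vectorized ones above:

```matlab
hand = [8 8 8 9 9];                 % hypothetical full house
counts = histc(hand, 1:13);         % counts per card value

id = 0;                             % 0 means "no valid hand found"
if sum(counts == 2) == 1            % one pair is present ...
    id = 2;
end
if sum(counts == 3) == 1            % ... overwritten by three of a kind ...
    id = 4;
end
if sum(counts == 2) == 1 && sum(counts == 3) == 1
    id = 6;                         % ... and finally by full house
end
```

Because the stronger hands are checked later, the last assignment wins and id ends up as 6.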
Running through your example, this is what I get:
>> out
out =
2
1
3
6
This says that the first hand is one pair, the next hand is a high card, the next hand is two pair and the last hand is a full house. As a bonus, we can actually output what the strings are for each hand:
str = {'Invalid Hand', 'High Card', 'One Pair', 'Two Pair', 'Three of a Kind', 'Straight', 'Full House', 'Four of a Kind'};
hands = str(out+1);
I've made a placeholder for the invalid hand, and if we got a legitimate hand in our vector, you simply have to add 1 to each index to access the right hand. If we don't have a good hand, it'll show you an Invalid Hand string.
We get this for the strings:
hands =
'One Pair' 'High Card' 'Two Pair' 'Full House'

Matlab - submatrix for stiffness method

In order to use the stiffness method for trusses, I need to extract certain elements from a large global stiffness matrix.
Say I have a 9 x 9 matrix K representing a three-member truss. The first 3 rows and columns correspond to the first node, the next three rows and columns to the second node, and the last three to the third node. The code has a vector zDisp that lists the nodes with zero displacement. On paper, a zero displacement of a node means you would cross out the rows and columns corresponding to that displacement, leaving you with a smaller K matrix that is easier to work with. So if the first and third nodes have zero displacement, you would be left with a 3 x 3 matrix corresponding to the intersection of the middle three rows and the middle three columns.
I thought I could accomplish this one node at a time with a function like so:
function [ B ] = deleteNode( B, node )
%deleteNode removes the rows and columns corresponding to a node that has
% zero deflection from the global stiffness matrix
% --- Problem line - this gets the first location in the matrix corresponding to the node
start = 3*node- 2;
for i = 0 : 2
B(start+i,:) = [];
B(:,start+i) = [];
end
end
So my main project would go something like
% Zero displacement nodes
zDisp = [1;
3;
];
% --- Create 9 x 9 global matrix Kg ---
% Make a copy of the global matrix
S = Kg;
for(i = 1 : length(zDisp))
S = deleteNode(S, zDisp(i));
end
This does not work because once the loop executes for node 1 and removes the first 3 rows and columns, the problem line in the function no longer works to find the correct location in the smaller matrix to find the node.
So I think this step needs to be executed all at once. I am thinking I may need to instead input which nodes are NOT zero displacement, and create a submatrix based off of that. Any tips on this? Been thinking on it awhile. Thanks all.
In your example, you want to remove rows/columns 1, 2, 3, 7, 8, and 9, so if zDisp=[1;3],
remCols=bsxfun(@plus,1:3,3*(zDisp-1))
If I understand correctly, you should just be able to first remove the rows given by remCols:
S(remCols(:),:)=[]
then remove the columns with the same indices:
S(:,remCols(:))=[]
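The idea raised in the question, selecting the degrees of freedom that are NOT fixed and extracting the submatrix in one step, can be sketched as follows. Kg here is a hypothetical placeholder matrix, and nDofPerNode, remDofs and keep are names introduced only for this illustration:

```matlab
Kg = reshape(1:81, 9, 9);      % placeholder for the 9 x 9 global stiffness matrix
zDisp = [1; 3];                % nodes with zero displacement
nDofPerNode = 3;               % assumed: three rows/columns per node

% Row/column indices that belong to the fixed nodes: [1 2 3; 7 8 9]
remDofs = bsxfun(@plus, 1:nDofPerNode, nDofPerNode*(zDisp - 1));

% Logical mask over all degrees of freedom, false for the fixed ones
keep = true(1, size(Kg, 1));
keep(remDofs(:)) = false;

% Extract rows and columns in a single indexing step
S = Kg(keep, keep);            % 3 x 3 block for the middle node
```

Since nothing is deleted one piece at a time, the index shifting that breaks deleteNode does not occur.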

MatLab: Create matrix row element if row elements of two previous matrices are identical

Sorry for the title. I could not think of something better.
I have the following problem.
I have two four-column matrices built up like this:
Property | X | Y | Z
The two matrices have different sizes, since matrix 1 has a large amount of additional rows compared to matrix 2.
What I want to do is the following:
I need to create a third matrix that only features those rows (of the large matrix) that are identical in columns X, Y and Z to rows in matrix2(the property column is always different).
I tried an if-statement but it did not really work out due to my programming syntax. Has somebody a tip?
Thank you!
I tried something like this (in this case A is the larger matrix, and I want its property column for the X, Y, Z positions that are identical to those in another matrix B; I am terrible with the MATLAB syntax):
if (A(:,2) == B(:,2) and (A(:,3) == B(:,3) and (A(:,4) == B(:,4))
newArray(:,1) = A(:,1);
end
Use ismember with the 'rows' option to find the desired rows, and then use that as an index to build the result:
ind = ismember(A(:,2:4), B(:,2:4), 'rows');
C = A(ind,:);
I have assumed that a row of A is selected if its last three columns match those of any row of B.
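A small worked example may help here. The matrices below are hypothetical; the second output of ismember (called loc here) is used only to show which row of B each match came from:

```matlab
% Columns: Property | X | Y | Z (hypothetical values)
A = [10 1 1 1;
     20 2 2 2;
     30 3 3 3;
     40 2 2 5];
B = [99 3 3 3;
     98 2 2 2];

% ind(k) is true if the X, Y, Z part of row k of A appears as a row in B
[ind, loc] = ismember(A(:,2:4), B(:,2:4), 'rows');
C = A(ind,:);                 % rows 2 and 3 of A

% Property values of the matching rows in B, in the order of C
Bprop = B(loc(ind), 1);
```

Row 4 of A is not selected because only two of its three coordinates match a row of B.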

How do I plot this? MATLAB

I have a matrix X that I want to plot using the kmeans function. What I would like: if a row has a value of 1 in column 4, it should be drawn as a square; if it has a value of 2 in column 4, it should be drawn as a +. In addition, if the row has a value of 0 in column 5 it must be blue, and if it has a value of 1 in column 5 it must be yellow.
(You don't need to use these exact colors and shapes, I just want to distinguish these.) I tried this and it did not work:
plot(X(idx==2,1),X(idx==2,2),X(:,4)==1,'k.');
Thanks!!
Based on the example on the kmeans documentation page I propose the following logic:
X = [randn(100,2)+ones(100,2);...
randn(100,2)-ones(100,2)];
opts = statset('Display','final');
% This gives a random distribution of 0s and 1s in column 5:
X(:,5) = round(rand(size(X,1),1));
[idx,ctrs] = kmeans(X(:,1:2),2,... % cluster on the two coordinate columns only
'Distance','city',...
'Replicates',5,...
'Options',opts);
hold on
plot(X(idx==1,1),X(idx==1,2),'rs','MarkerSize',12)
plot(X(idx==2,1),X(idx==2,2),'r+','MarkerSize',12)
% after plotting the results of kmeans,
% plot new symbols with a different logic on top:
plot(X(idx==1 & X(:,5)==0,1),X(idx==1 & X(:,5)==0,2),'bs','MarkerSize',12)
plot(X(idx==1 & X(:,5)==1,1),X(idx==1 & X(:,5)==1,2),'gs','MarkerSize',12)
plot(X(idx==2 & X(:,5)==0,1),X(idx==2 & X(:,5)==0,2),'b+','MarkerSize',12)
plot(X(idx==2 & X(:,5)==1,1),X(idx==2 & X(:,5)==1,2),'g+','MarkerSize',12)
The above code is a minimal working example, given that the Statistics Toolbox is available.
The key feature is the combined logic for the plotting. For example:
X(idx==1 & X(:,5)==0,1)
The mask idx==1 selects the rows assigned to cluster 1, and X(:,5)==0 selects the rows with a 0 in column 5; combining them with & keeps only the rows that satisfy both. Both masks have one entry per row of X. A nested form such as X(X(idx==1,5)==0,1) does not work, because the inner result X(idx==1,5) is shorter than X and its mask therefore no longer lines up with the rows of X. Based on the logic in the question, this should be a blue square: bs.
You have four cases, hence there are four additional plot lines.
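The four cases can also be generated in a loop over cluster and flag value. The following is a sketch on hypothetical labels (idx and flag are made-up vectors standing in for the kmeans result and for column 5); it only builds the masks and counts the rows per case, with the corresponding plot call shown as a comment:

```matlab
idx  = [1; 1; 2; 2; 1; 2];     % hypothetical cluster labels
flag = [0; 1; 0; 1; 0; 0];     % hypothetical 0/1 values, as in X(:,5)

markers = {'s', '+'};          % marker per cluster
colors  = {'b', 'g'};          % color per flag value

n = zeros(2, 2);               % n(k, f+1): number of rows in each case
for k = 1:2
    for f = 0:1
        mask = (idx == k) & (flag == f);   % one logical entry per row
        n(k, f+1) = nnz(mask);
        % plot(X(mask,1), X(mask,2), [colors{f+1} markers{k}], 'MarkerSize', 12)
    end
end
```

Every row falls into exactly one of the four masks, so the entries of n add up to the number of rows.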