I have following numpy array:
array([1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 2, 0, 1, 1, 2, 3, 3, 3, 3, 1, 1, 1, 1,
1, 3, 1, 1, 3, 0, 1, 3, 1, 2, 1, 1, 1, 1, 1, 2, 1, 2, 0, 1, 2, 0, 2,
2, 2, 1, 2, 2, 0, 2, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 3, 0, 2, 1, 1,
1, 1, 3, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 0, 2, 3,
2, 1, 1, 1, 1, 3, 1, 0])
Question: How can I create another array that encodes the data, given condition: If value = 3 or 2, then "1", else "0".
I tried:
from sklearn.preprocessing import label_binarize
label_binarize(doc_topics, classes=[3,2])[:15]
array([[0, 0],
[0, 0],
[0, 0],
[1, 0],
[0, 0],
[0, 0],
[0, 0],
[0, 0],
[0, 0],
[0, 0],
[0, 1],
[0, 0],
[0, 0],
[0, 0],
[0, 1]])
However, this seems to return a 2-D array.
Use np.where and pass your condition to mask the elements of interest to set where the condition is met to 1, 0 otherwise:
In[18]:
a = np.where((a==3) | (a == 2),1,0)
a
Out[18]:
array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0,
0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1,
0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,
0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0])
Here we compare the array with the desired values, and use the unary | to or the conditions, due to operator precedence we have to use parentheses () around the conditions.
To do this using sklearn:
In[68]:
binarizer = preprocessing.Binarizer(threshold=1)
binarizer.transform(a.reshape(1,-1))
Out[68]:
array([[0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0,
0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1,
0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,
0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0]])
This treats values above 1 as 1 and 0 otherwise, this only works for this specific data set as you want 2 and 3 to be 1, it won't work if you have other values, so the numpy method is more general
I'm trying to select the last element from a map and compare it against the other last elements in the map to say if the selected element is higher or lower.
So far I can get it to select the element from user input but cannot work out how to loop through the map and get it to compare against the other elements:
def menu(f: (String) => (String, Int)) = {
print("Number > ")
val data = f(readLine)
println(s"${data._1}: ${data._2}")
}
Here is the map which the elements are coming from:
val mapdata = Map(
"A1" -> List(9, 7, 2, 0, 0, 2, 7, 9, 1, 2, 4, 1, 9, 6, 5, 3, 2, 3, 7, 2, 8, 5, 4, 5, 1, 6, 5, 2, 4, 1),
"B2" -> List(0, 7, 6, 3, 3, 3, 1, 2, 9, 2, 9, 7, 4, 7, 3, 6, 3, 9, 5, 2, 9, 7, 3, 4, 6, 3, 4, 3, 4, 1),
"C3" -> List(8, 7, 1, 8, 0, 5, 8, 0, 5, 9, 7, 5, 3, 7, 9, 8, 1, 4, 6, 5, 6, 6, 3, 6, 8, 8, 7, 4, 0, 6),
"D4" -> List(2, 9, 5, 7, 3, 8, 6, 9, 7, 9, 0, 1, 3, 1, 3, 0, 0, 1, 3, 8, 5, 4, 0, 9, 7, 1, 4, 5, 2, 9),
"E5" -> List(2, 6, 8, 0, 3, 5, 2, 1, 5, 9, 4, 5, 3, 5, 5, 8, 8, 2, 5, 9, 3, 8, 6, 7, 8, 7, 4, 1, 2, 3),
"F6" -> List(2, 7, 5, 9, 1, 9, 2, 4, 1, 6, 3, 7, 4, 3, 4, 5, 9, 2, 2, 4, 8, 7, 9, 2, 2, 7, 9, 1, 6, 9),
"G7" -> List(6, 9, 5, 0, 8, 0, 0, 5, 8, 5, 8, 7, 1, 6, 6, 1, 5, 2, 2, 7, 9, 5, 5, 9, 1, 4, 4, 0, 2, 0),
"H8" -> List(2, 8, 8, 3, 2, 1, 1, 8, 5, 9, 0, 2, 1, 6, 9, 7, 9, 6, 7, 7, 0, 9, 5, 2, 5, 0, 2, 1, 8, 6),
"I9" -> List(2, 1, 8, 2, 4, 4, 2, 4, 9, 4, 0, 6, 9, 5, 9, 4, 9, 1, 8, 6, 3, 4, 4, 3, 7, 9, 1, 2, 6, 6)
)
for example I select H8 and its last element is 6
I then want to compare it to all last elements and say if it is higher or lower than each of the last elements.
Thanks in advance
I'm not sure if I understand you correctly. You can access last element of the list by calling last.
object A extends App {
val mapdata: Map[String, List[Int]] = Map(
"A1" -> List(9, 7, 2, 0, 0, 2, 7, 9, 1, 2, 4, 1, 9, 6, 5, 3, 2, 3, 7, 2, 8, 5, 4, 5, 1, 6, 5, 2, 4, 1),
"B2" -> List(0, 7, 6, 3, 3, 3, 1, 2, 9, 2, 9, 7, 4, 7, 3, 6, 3, 9, 5, 2, 9, 7, 3, 4, 6, 3, 4, 3, 4, 1),
"C3" -> List(8, 7, 1, 8, 0, 5, 8, 0, 5, 9, 7, 5, 3, 7, 9, 8, 1, 4, 6, 5, 6, 6, 3, 6, 8, 8, 7, 4, 0, 6),
"D4" -> List(2, 9, 5, 7, 3, 8, 6, 9, 7, 9, 0, 1, 3, 1, 3, 0, 0, 1, 3, 8, 5, 4, 0, 9, 7, 1, 4, 5, 2, 9),
"E5" -> List(2, 6, 8, 0, 3, 5, 2, 1, 5, 9, 4, 5, 3, 5, 5, 8, 8, 2, 5, 9, 3, 8, 6, 7, 8, 7, 4, 1, 2, 3),
"F6" -> List(2, 7, 5, 9, 1, 9, 2, 4, 1, 6, 3, 7, 4, 3, 4, 5, 9, 2, 2, 4, 8, 7, 9, 2, 2, 7, 9, 1, 6, 9),
"G7" -> List(6, 9, 5, 0, 8, 0, 0, 5, 8, 5, 8, 7, 1, 6, 6, 1, 5, 2, 2, 7, 9, 5, 5, 9, 1, 4, 4, 0, 2, 0),
"H8" -> List(2, 8, 8, 3, 2, 1, 1, 8, 5, 9, 0, 2, 1, 6, 9, 7, 9, 6, 7, 7, 0, 9, 5, 2, 5, 0, 2, 1, 8, 6),
"I9" -> List(2, 1, 8, 2, 4, 4, 2, 4, 9, 4, 0, 6, 9, 5, 9, 4, 9, 1, 8, 6, 3, 4, 4, 3, 7, 9, 1, 2, 6, 6)
)
val choice = "H8"
val numberToBeCompared = mapdata.getOrElse(choice, throw new RuntimeException(s"couldn't find $choice")).last
mapdata.filter(_._1 != choice)
.values
.foreach(list => {
if (list.last > numberToBeCompared)
println(s"${list.last} > $numberToBeCompared")
else
println(s"${list.last} <= $numberToBeCompared")
})
}
Result:
1 <= 6
1 <= 6
0 <= 6
3 <= 6
9 > 6
9 > 6
6 <= 6
6 <= 6
Is this what you need ?
Note that I still learn Scala so there's probably better way of doing this.
I also don't know if any list in mapdata can be empty so use Option if necessary.
// EDIT
If you want the letters you may try sth like this:
val choice = "H8"
val numberToBeCompared = mapdata.getOrElse(choice, throw new RuntimeException(s"couldn't find $choice")).last
mapdata.filter(_._1 != choice)
.foreach(entry => {
if (entry._2.last > numberToBeCompared)
println(s"${choice} > ${entry._1}")
else
println(s"${choice} <= ${entry._1}")
})
H8 <= A1
H8 <= B2
H8 <= G7
H8 <= E5
H8 > D4
H8 > F6
H8 <= C3
H8 <= I9
You just need to remember that when you iterate through mapdata you get entry. entry._1 is the key (your string), entry._2 is the list.
I have two big matrices. Matrix A which is [4144514 x 3] and matrix B which is [51962 x 17].
The first three columns of A and B have the identifiers. On matrix B the 3 columns make a unique identifier, but this might be repeated on A.
I want to merge the two matrices, in such a away that the resulting matrix A, which should be [4144514 x 20], i.e. I am merging the 17 columns of B with matrix A given the criteria on the first three columns of each matrix.
This is the loop I am doing:
for i=1:size(B,1)
aux = sum(A(:,1)==B(i,1) & A(:,2)==B(i,2) & A(:,3) == B(i,3));
A(A(:,1)==B(i,1) & A(:,2)==B(i,2) & A(:,3) == B(i,3),4:20) = repmat(B(i,:),aux,1);
end
The variable aux tells me how many lines in A match the criteria from B.
The second line of the loop, creates a matrix with the columns of B repeated for aux number of times, and puts them on the correct place in A. However this is extremely inefficient.
Let me give you a toy example below with smaller matrices. I will have A to be [108 x 3] and B to be [27 x 17].
What I am looking to do is the following:
A = [100, 1, 2000 ;100, 1, 2000 ;100, 1, 2000 ;100, 1, 2000 ;100, 2, 2000 ;100, 2, 2000 ;100, 2, 2000 ;100, 2, 2000 ;100, 3, 2000 ;100, 3, 2000 ;100, 3, 2000 ;100, 3, 2000 ;100, 1, 2001 ;100, 1, 2001 ;100, 1, 2001 ;100, 1, 2001 ;100, 2, 2001 ;100, 2, 2001 ;100, 2, 2001 ;100, 2, 2001 ;100, 3, 2001 ;100, 3, 2001 ;100, 3, 2001 ;100, 3, 2001 ;100, 1, 2002 ;100, 1, 2002 ;100, 1, 2002 ;100, 1, 2002 ;100, 2, 2002 ;100, 2, 2002 ;100, 2, 2002 ;100, 2, 2002 ;100, 3, 2002 ;100, 3, 2002 ;100, 3, 2002 ;100, 3, 2002 ;101, 1, 2000 ;101, 1, 2000 ;101, 1, 2000 ;101, 1, 2000 ;101, 2, 2000 ;101, 2, 2000 ;101, 2, 2000 ;101, 2, 2000 ;101, 3, 2000 ;101, 3, 2000 ;101, 3, 2000 ;101, 3, 2000 ;101, 1, 2001 ;101, 1, 2001 ;101, 1, 2001 ;101, 1, 2001 ;101, 2, 2001 ;101, 2, 2001 ;101, 2, 2001 ;101, 2, 2001 ;101, 3, 2001 ;101, 3, 2001 ;101, 3, 2001 ;101, 3, 2001 ;101, 1, 2002 ;101, 1, 2002 ;101, 1, 2002 ;101, 1, 2002 ;101, 2, 2002 ;101, 2, 2002 ;101, 2, 2002 ;101, 2, 2002 ;101, 3, 2002 ;101, 3, 2002 ;101, 3, 2002 ;101, 3, 2002 ;103, 1, 2000 ;103, 1, 2000 ;103, 1, 2000 ;103, 1, 2000 ;103, 2, 2000 ;103, 2, 2000 ;103, 2, 2000 ;103, 2, 2000 ;103, 3, 2000 ;103, 3, 2000 ;103, 3, 2000 ;103, 3, 2000 ;103, 1, 2001 ;103, 1, 2001 ;103, 1, 2001 ;103, 1, 2001 ;103, 2, 2001 ;103, 2, 2001 ;103, 2, 2001 ;103, 2, 2001 ;103, 3, 2001 ;103, 3, 2001 ;103, 3, 2001 ;103, 3, 2001 ;103, 1, 2002 ;103, 1, 2002 ;103, 1, 2002 ;103, 1, 2002 ;103, 2, 2002 ;103, 2, 2002 ;103, 2, 2002 ;103, 2, 2002 ;103, 3, 2002 ;103, 3, 2002 ;103, 3, 2002 ;103, 3, 2002];
B = [100, 1, 2000, 8, 7, 9, 10, 1, 2, 9, 2, 1, 3, 3, 3, 9, 7; 100, 2, 2000, 8, 2, 7, 2, 7, 5, 5, 9, 2, 7, 1, 2, 6, 4; 100, 3, 2000, 8, 8, 7, 3, 2, 8, 1, 10, 9, 8, 6, 1, 5, 7; 100, 1, 2001, 10, 10, 1, 7, 2, 5, 5, 8, 6, 5, 3, 6, 6, 4; 100, 2, 2001, 6, 7, 3, 1, 5, 3, 9, 9, 3, 8, 1, 6, 4, 4; 100, 3, 2001, 1, 5, 7, 1, 5, 4, 2, 10, 5, 4, 6, 5, 1, 10; 100, 1, 2002, 7, 4, 6, 4, 7, 8, 3, 7, 7, 8, 2, 1, 6, 2; 100, 2, 2002, 8, 7, 8, 10, 2, 10, 8, 4, 7, 5, 10, 4, 4, 2; 100, 3, 2002, 4, 9, 4, 10, 2, 4, 2, 1, 4, 10, 9, 2, 6, 9; 101, 1, 2000, 5, 9, 5, 3, 10, 1, 4, 2, 10, 2, 6, 8, 5, 4; 101, 2, 2000, 6, 1, 8, 10, 10, 7, 4, 6, 5, 2, 8, 3, 2, 1; 101, 3, 2000, 7, 7, 5, 6, 2, 8, 8, 8, 1, 6, 1, 1, 9, 7; 101, 1, 2001, 8, 4, 5, 5, 8, 7, 2, 2, 9, 8, 1, 4, 2, 1; 101, 2, 2001, 3, 7, 10, 4, 9, 9, 1, 1, 10, 7, 6, 5, 10, 9; 101, 3, 2001, 10, 8, 2, 4, 6, 1, 6, 4, 8, 10, 7, 9, 4, 7; 101, 1, 2002, 6, 9, 3, 2, 10, 8, 5, 2, 6, 9, 1, 3, 6, 6; 101, 2, 2002, 8, 3, 8, 4, 4, 3, 4, 2, 4, 7, 2, 10, 3, 3; 101, 3, 2002, 10, 2, 10, 10, 6, 5, 3, 5, 10, 1, 3, 4, 8, 5; 103, 1, 2000, 8, 3, 9, 9, 4, 3, 3, 9, 7, 7, 6, 5, 2, 6; 103, 2, 2000, 6, 7, 5, 5, 7, 10, 5, 3, 5, 4, 7, 8, 9, 7; 103, 3, 2000, 3, 4, 9, 10, 3, 10, 7, 2, 10, 3, 3, 3, 6, 6; 103, 1, 2001, 7, 7, 1, 7, 10, 7, 10, 7, 9, 8, 4, 7, 6, 2; 103, 2, 2001, 5, 5, 4, 3, 7, 7, 6, 5, 2, 5, 5, 6, 6, 5; 103, 3, 2001, 6, 3, 9, 9, 2, 10, 10, 10, 10, 7, 10, 9, 9, 8; 103, 1, 2002, 5, 10, 2, 8, 6, 5, 7, 6, 4, 3, 6, 8, 7, 4; 103, 2, 2002, 10, 7, 6, 3, 10, 4, 5, 5, 1, 3, 1, 9, 1, 5; 103, 3, 2002, 2, 1, 5, 5, 2, 8, 6, 2, 6, 6, 10, 1, 4, 9];
for i=1:size(B,1)
aux = sum(A(:,1)==B(i,1) & A(:,2)==B(i,2) & A(:,3) == B(i,3));
A(A(:,1)==B(i,1) & A(:,2)==B(i,2) & A(:,3) == B(i,3),4:20) = repmat(B(i,:),aux,1);
end
With this smaller example the code runs very fast. But as soon as the matrices get as big as the ones I have, it takes ages.
Is there any faster way to do this?
A very simple task to vectorize, ismember does all the work.
%your example
A = [100, 1, 2000 ;100, 1, 2000 ;100, 1, 2000 ;100, 1, 2000 ;100, 2, 2000 ;100, 2, 2000 ;100, 2, 2000 ;100, 2, 2000 ;100, 3, 2000 ;100, 3, 2000 ;100, 3, 2000 ;100, 3, 2000 ;100, 1, 2001 ;100, 1, 2001 ;100, 1, 2001 ;100, 1, 2001 ;100, 2, 2001 ;100, 2, 2001 ;100, 2, 2001 ;100, 2, 2001 ;100, 3, 2001 ;100, 3, 2001 ;100, 3, 2001 ;100, 3, 2001 ;100, 1, 2002 ;100, 1, 2002 ;100, 1, 2002 ;100, 1, 2002 ;100, 2, 2002 ;100, 2, 2002 ;100, 2, 2002 ;100, 2, 2002 ;100, 3, 2002 ;100, 3, 2002 ;100, 3, 2002 ;100, 3, 2002 ;101, 1, 2000 ;101, 1, 2000 ;101, 1, 2000 ;101, 1, 2000 ;101, 2, 2000 ;101, 2, 2000 ;101, 2, 2000 ;101, 2, 2000 ;101, 3, 2000 ;101, 3, 2000 ;101, 3, 2000 ;101, 3, 2000 ;101, 1, 2001 ;101, 1, 2001 ;101, 1, 2001 ;101, 1, 2001 ;101, 2, 2001 ;101, 2, 2001 ;101, 2, 2001 ;101, 2, 2001 ;101, 3, 2001 ;101, 3, 2001 ;101, 3, 2001 ;101, 3, 2001 ;101, 1, 2002 ;101, 1, 2002 ;101, 1, 2002 ;101, 1, 2002 ;101, 2, 2002 ;101, 2, 2002 ;101, 2, 2002 ;101, 2, 2002 ;101, 3, 2002 ;101, 3, 2002 ;101, 3, 2002 ;101, 3, 2002 ;103, 1, 2000 ;103, 1, 2000 ;103, 1, 2000 ;103, 1, 2000 ;103, 2, 2000 ;103, 2, 2000 ;103, 2, 2000 ;103, 2, 2000 ;103, 3, 2000 ;103, 3, 2000 ;103, 3, 2000 ;103, 3, 2000 ;103, 1, 2001 ;103, 1, 2001 ;103, 1, 2001 ;103, 1, 2001 ;103, 2, 2001 ;103, 2, 2001 ;103, 2, 2001 ;103, 2, 2001 ;103, 3, 2001 ;103, 3, 2001 ;103, 3, 2001 ;103, 3, 2001 ;103, 1, 2002 ;103, 1, 2002 ;103, 1, 2002 ;103, 1, 2002 ;103, 2, 2002 ;103, 2, 2002 ;103, 2, 2002 ;103, 2, 2002 ;103, 3, 2002 ;103, 3, 2002 ;103, 3, 2002 ;103, 3, 2002];
B = [100, 1, 2000, 8, 7, 9, 10, 1, 2, 9, 2, 1, 3, 3, 3, 9, 7; 100, 2, 2000, 8, 2, 7, 2, 7, 5, 5, 9, 2, 7, 1, 2, 6, 4; 100, 3, 2000, 8, 8, 7, 3, 2, 8, 1, 10, 9, 8, 6, 1, 5, 7; 100, 1, 2001, 10, 10, 1, 7, 2, 5, 5, 8, 6, 5, 3, 6, 6, 4; 100, 2, 2001, 6, 7, 3, 1, 5, 3, 9, 9, 3, 8, 1, 6, 4, 4; 100, 3, 2001, 1, 5, 7, 1, 5, 4, 2, 10, 5, 4, 6, 5, 1, 10; 100, 1, 2002, 7, 4, 6, 4, 7, 8, 3, 7, 7, 8, 2, 1, 6, 2; 100, 2, 2002, 8, 7, 8, 10, 2, 10, 8, 4, 7, 5, 10, 4, 4, 2; 100, 3, 2002, 4, 9, 4, 10, 2, 4, 2, 1, 4, 10, 9, 2, 6, 9; 101, 1, 2000, 5, 9, 5, 3, 10, 1, 4, 2, 10, 2, 6, 8, 5, 4; 101, 2, 2000, 6, 1, 8, 10, 10, 7, 4, 6, 5, 2, 8, 3, 2, 1; 101, 3, 2000, 7, 7, 5, 6, 2, 8, 8, 8, 1, 6, 1, 1, 9, 7; 101, 1, 2001, 8, 4, 5, 5, 8, 7, 2, 2, 9, 8, 1, 4, 2, 1; 101, 2, 2001, 3, 7, 10, 4, 9, 9, 1, 1, 10, 7, 6, 5, 10, 9; 101, 3, 2001, 10, 8, 2, 4, 6, 1, 6, 4, 8, 10, 7, 9, 4, 7; 101, 1, 2002, 6, 9, 3, 2, 10, 8, 5, 2, 6, 9, 1, 3, 6, 6; 101, 2, 2002, 8, 3, 8, 4, 4, 3, 4, 2, 4, 7, 2, 10, 3, 3; 101, 3, 2002, 10, 2, 10, 10, 6, 5, 3, 5, 10, 1, 3, 4, 8, 5; 103, 1, 2000, 8, 3, 9, 9, 4, 3, 3, 9, 7, 7, 6, 5, 2, 6; 103, 2, 2000, 6, 7, 5, 5, 7, 10, 5, 3, 5, 4, 7, 8, 9, 7; 103, 3, 2000, 3, 4, 9, 10, 3, 10, 7, 2, 10, 3, 3, 3, 6, 6; 103, 1, 2001, 7, 7, 1, 7, 10, 7, 10, 7, 9, 8, 4, 7, 6, 2; 103, 2, 2001, 5, 5, 4, 3, 7, 7, 6, 5, 2, 5, 5, 6, 6, 5; 103, 3, 2001, 6, 3, 9, 9, 2, 10, 10, 10, 10, 7, 10, 9, 9, 8; 103, 1, 2002, 5, 10, 2, 8, 6, 5, 7, 6, 4, 3, 6, 8, 7, 4; 103, 2, 2002, 10, 7, 6, 3, 10, 4, 5, 5, 1, 3, 1, 9, 1, 5; 103, 3, 2002, 2, 1, 5, 5, 2, 8, 6, 2, 6, 6, 10, 1, 4, 9];
%reference code, changed output to C to preserve input
C=A;
for i=1:size(B,1)
aux = sum(A(:,1)==B(i,1) & A(:,2)==B(i,2) & A(:,3) == B(i,3));
C(A(:,1)==B(i,1) & A(:,2)==B(i,2) & A(:,3) == B(i,3),4:20) = repmat(B(i,:),aux,1);
end
%vectorized version
[~,lia]=ismember(A(:,1:3),B(:,1:3),'rows');
C2=nan(numel(lia),size(B,2)+3);
C2(lia>0,:)=B(lia(lia>0),[1,2,3,1:end]);
toc;
tic;
%simplified vectorized version, assuming there is no need to duplicate the first three rows:
[~,lia]=ismember(A(:,1:3),B(:,1:3),'rows');
C3=nan(numel(lia),size(B,2));
C3(lia>0,:)=B(lia(lia>0),:);
toc;
Comparing the performance with some example data close to your real data (same data was to large):
n=100000 %4144514
m=10000 %51962
B=rand(10000,17);
B=unique(B,'rows');
A=B(randi([1 size(B,1)],100000,1),1:3);
Decreased the execution time from 14s to less 0.05s
I conducted an analysis using a logit model and now want to do the same using a probit model. Can anyone please turn this winbugs logit model into a winbugs probit model?
model
{
for (i in 1:n) {
# Linear regression on logit
logit(p[i]) <- alpha + b.sex*sex[i] + b.age*age[i]
# Likelihood function for each data point
frac[i] ~ dbern(p[i])
}
alpha ~ dnorm(0.0,1.0E-4) # Prior for intercept
b.sex ~ dnorm(0.0,1.0E-4) # Prior for slope of sex
b.age ~ dnorm(0.0,1.0E-4) # Prior for slope of age
}
Data
list(sex=c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1,
1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0,
0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1,
0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1),
age= c(69, 57, 61, 60, 69, 74, 63, 68, 64, 53, 60, 58, 79, 56, 53, 74, 56, 76, 72,
56, 66, 52, 77, 70, 69, 76, 72, 53, 69, 59, 73, 77, 55, 77, 68, 62, 56, 68, 70, 60,
65, 55, 64, 75, 60, 67, 61, 69, 75, 68, 72, 71, 54, 52, 54, 50, 75, 59, 65, 60, 60,
57, 51, 51, 63, 57, 80, 52, 65, 72, 80, 73, 76, 79, 66, 51, 76, 75, 66, 75, 78, 70,
67, 51, 70, 71, 71, 74, 74, 60, 58, 55, 61, 65, 52, 68, 75, 52, 53, 70),
frac=c(1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0,
1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1,
1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1,
1, 0, 1, 1, 0, 0, 1, 0, 0, 1),
n=100)
Initial Values
list(alpha=0, b.sex=1, b.age=1)
WinBUGS accepts multiple types of link functions (see page 15 in the WinBUGS manual). For a probit model, change your linear regression equation to:
probit(p[i]) <- alpha + b.sex*sex[i] + b.age*age[i]
I would recommend you center the age variable, otherwise you may well run into some convergence problems, so something like:
probit(p[i]) <- alpha + b.sex*sex[i] + b.age*(age[i] - mean(age[]))
Alternatively, for a probit model (if the probit functions gives you some trap errors) you could use the phi standard normal cdf function:
p[i] <- phi(alpha + b.sex*sex[i] + b.age*(age[i] - mean(age[])))