2 Models in JAGS - kind of 'non-trivial' case - winbugs

I am trying to build a GARCH(1,1) model in JAGS; for simplicity, let's assume that the mean equation follows an AR(1) process. The goal is a single JAGS model that joins the AR(1) and GARCH(1,1) processes.
For now I can only achieve the results by building two separate JAGS models (simplified here for clarity of presentation).
The first JAGS model estimates the parameters of the AR(1) process:
modelstring="
model {
  for (i in 2:n) {
    y[i] ~ dnorm(alpha0 + alpha1*y[i-1], 1)
  }
  alpha0 ~ dnorm(alpha0.mean, alpha0.prec)
  alpha1 ~ dunif(-1, 1)
}
"
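For context, output1 below is the MCMC output from fitting this model. A minimal sketch of how that could be obtained with rjags (the data-list names and hyperparameter values here are assumptions, not taken from the original post):
library(rjags)
# Assumed setup: y is the observed series, n its length; hyperparameters chosen arbitrarily
jags.data <- list(y = y, n = length(y), alpha0.mean = 0, alpha0.prec = 0.0001)
jm <- jags.model(textConnection(modelstring), data = jags.data, n.chains = 2)
update(jm, 1000)                                          # burn-in
output1 <- coda.samples(jm, c("alpha0", "alpha1"), n.iter = 5000)
summary(output1)$statistics                               # rows: alpha0, alpha1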
Having the parameter estimates, I generate the fitted data of the AR(1) process, obtain the residuals, and compute variances over a rolling window:
ALPHA0=summary(output1)$statistics[1]
ALPHA1=summary(output1)$statistics[2]
y_hat=ALPHA0+ALPHA1*y[1:(dim(DATA)[1])]
eps=y-y_hat
window=30
VAR=rep(NA, dim(DATA)[1]-window)
for (i in 1:length(VAR)){
VAR[i]=var(eps[i:(i+window)])
}
The next block is the GARCH(1,1) process in JAGS:
modelstring="
model {
  for (i in 2:n) {
    Var[i] ~ dnorm(beta0 + beta1*Var[i-1] + beta2*eps[i-1]^2, 1)
  }
  beta0 ~ dnorm(beta0.mean, beta0.prec)
  beta1 ~ dunif(0, 1)
  beta2 ~ dnorm(0, 1-beta1)
}
"
How do I join two processes that are dependent?
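For reference, one possible way to join the two processes in a single model (a sketch only, untested; the nodes mu, h, eps and the initial variance h1 are names introduced here, not from the original post) is to let the GARCH(1,1) recursion run inside the same likelihood loop as the AR(1) mean equation:
modelstring="
model {
  # initial conditions for the recursions (simplified: eps[1] is forced to 0)
  mu[1]  <- y[1]
  eps[1] <- y[1] - mu[1]
  h[1]   <- h1                      # supplied in the data, or given its own prior
  for (i in 2:n) {
    mu[i]  <- alpha0 + alpha1*y[i-1]                           # AR(1) mean equation
    h[i]   <- beta0 + beta1*h[i-1] + beta2*pow(eps[i-1], 2)    # GARCH(1,1) variance
    y[i]   ~ dnorm(mu[i], 1/h[i])                              # JAGS precision = 1/variance
    eps[i] <- y[i] - mu[i]                                     # residual feeds the next h
  }
  alpha0 ~ dnorm(alpha0.mean, alpha0.prec)
  alpha1 ~ dunif(-1, 1)
  beta0  ~ dunif(0, 10)
  beta1  ~ dunif(0, 1)
  beta2  ~ dunif(0, 1 - beta1)      # keeps beta1 + beta2 < 1 (stationarity)
}
"
Here the residuals and the conditional variance are deterministic functions of the data and the parameters, so the AR(1) and GARCH(1,1) parameters are estimated jointly in one pass rather than in two separate models.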

Related

P-adjustment (FDR) on Hierarchical Clustering On Principal Components (HCPC) in R

I'm working right now with Hierarchical Clustering on Principal Components (HCPC). At the end of the analysis, p-values are computed by the HCPC function.
I searched but couldn't find any function that adjusts these p-values for FDR together with HCPC. It's really important to avoid any junk data in my multivariate set. So my question is: how can I run the p-value adjustment together with HCPC?
This is what I'm doing right now:
#install.packages(c("FactoMineR", "factoextra", "missMDA"))
library(ggplot2)
library(factoextra)
library(FactoMineR)
library(missMDA)
library(data.table)
MyData <- fread('https://drive.google.com/open?id=1y1YbIXtUssEBqmMSEbiQGcoV5j2Bz31k')
row.names(MyData) <- MyData$ID
MyData[1] <- NULL
Mydata_frame <- data.frame(MyData)
# Compute PCA with ncp = 3 (Variate based on the cluster number)
Mydata_frame.pca <- PCA(Mydata_frame, ncp = 2, graph = FALSE)
# Compute hierarchical clustering on principal components
Mydata.hcpc <- HCPC(Mydata_frame.pca, graph = FALSE)
Mydata.hcpc$desc.var$quanti
                             v.test  Mean in category  Overall mean   sd in category  Overall sd      p.value
CD8RAnegDRpos             12.965378      -0.059993483  -0.3760962775      0.46726224  0.53192037 1.922798e-38
TregRAnegDRpos            12.892725       0.489753272   0.1381306362      0.46877083  0.59502553 4.946490e-38
mTregCCR6pos197neg195neg  12.829277       1.107851623   0.6495813704      0.48972987  0.77933283 1.124088e-37
CD8posCCR6neg183neg194neg 12.667318       1.741757598   1.1735140264      0.45260338  0.97870842 8.972977e-37
mTregCCR6neg197neg195neg  12.109074       1.044905184   0.6408258230      0.51417779  0.72804665 9.455537e-34
CD8CD8posCD4neg           11.306215       0.724115486   0.4320918842      0.49823677  0.56351333 1.222504e-29
CD8posCCR6pos183pos194neg 11.226390      -0.239967805  -0.4982954123      0.49454619  0.50203520 3.025904e-29
TconvRAnegDRpos           11.011114      -0.296585038  -0.5279707475      0.44863446  0.45846770 3.378002e-28
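One possible approach (a sketch, not tested against this data; the helper adjust_fdr and the object quanti_adj are names introduced here): the p-values live in the desc.var$quanti output, so they can be adjusted afterwards with base R's p.adjust() using the Benjamini-Hochberg (FDR) method. Depending on the FactoMineR version and the number of clusters, $quanti is either a single matrix like the one printed above or a list with one such matrix per cluster:
# Sketch: append an FDR-adjusted p-value column to the HCPC variable description
adjust_fdr <- function(m) cbind(m, p.adj = p.adjust(m[, "p.value"], method = "BH"))

quanti <- Mydata.hcpc$desc.var$quanti
if (is.matrix(quanti)) {
  quanti_adj <- adjust_fdr(quanti)            # single matrix of descriptors
} else {
  quanti_adj <- lapply(quanti, adjust_fdr)    # one matrix per cluster
}
quanti_adj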

How to check via callbacks if alpha is decreasing? + How to load all cores during training?

I'm training doc2vec and using callbacks to try to see whether alpha is decreasing over training time, using this code:
import os
import multiprocessing

from gensim.models.callbacks import CallbackAny2Vec
from gensim.models.doc2vec import Doc2Vec
from gensim.test.utils import get_tmpfile


class EpochSaver(CallbackAny2Vec):
    '''Callback to save model after each epoch.'''

    def __init__(self, path_prefix):
        self.path_prefix = path_prefix
        self.epoch = 0
        os.makedirs(self.path_prefix, exist_ok=True)

    def on_epoch_end(self, model):
        savepath = get_tmpfile(
            '{}_epoch{}.model'.format(self.path_prefix, self.epoch)
        )
        model.save(savepath)
        print(
            "Model alpha: {}".format(model.alpha),
            "Model min_alpha: {}".format(model.min_alpha),
            "Epoch saved: {}".format(self.epoch + 1),
            "Start next epoch"
        )
        self.epoch += 1


def train():
    workers = multiprocessing.cpu_count() * 4
    model = Doc2Vec(
        DocIter(),
        vec_size=600, alpha=0.03, min_alpha=0.00025, epochs=20,
        min_count=10, dm=1, hs=1, negative=0, workers=workers,
        callbacks=[EpochSaver("./checkpoints")]
    )
    print(
        "HS", model.hs, "Negative", model.negative, "Epochs",
        model.epochs, "Workers: ", model.workers,
        "Model alpha: {}".format(model.alpha)
    )
And while training I see that alpha is not changing over time. On each callback I see alpha = 0.03.
Is it possible to check whether alpha is decreasing? Or is it really not decreasing at all during training?
One more question:
How can I benefit from all my cores while training doc2vec?
As we can see, each core is not loaded to more than about 30%.
The model.alpha property only holds the initially-configured starting-alpha – it's not updated to the effective learning-rate through training.
So, even if the value is being decreased properly (and I expect that it is), you wouldn't see it in the logging you've added.
Separate observations about your code:
in gensim versions at least through 3.5.0, maximum training throughput is most often reached with some value for workers between 3 and the number of cores – but usually not the full number of cores (if it's higher than 12) or larger. So workers=multiprocessing.cpu_count()*4 is likely going to be much slower than what you could achieve with a lower number.
if your corpus is large enough to support 600-dimensional vectors, and to justify discarding words with fewer than min_count=10 examples, negative sampling may work faster and get better results than the hs mode. (The pattern in published work seems to be to prefer negative sampling with larger corpora.)

3-layered neural network doesn't learn properly

So, I'm trying to implement a neural network with 3 layers in Python; however, I am not the brightest person, so anything with more than 2 layers is kind of difficult for me. The problem with this one is that it gets stuck at .5 and does not learn; I have no actual clue where it went wrong. Thank you to anyone with the patience to explain the error to me. (I hope the code makes sense.)
import numpy as np

def sigmoid(x):
    return 1/(1+np.exp(-x))

def reduce(x):
    return x*(1-x)

l0=[np.array([1,1,0,0]),
    np.array([1,0,1,0]),
    np.array([1,1,1,0]),
    np.array([0,1,0,1]),
    np.array([0,0,1,0]),
   ]
output=[0,1,1,0,1]

syn0=np.random.random((4,4))
syn1=np.random.random((4,1))

for justanumber in range(1000):
    for i in range(len(l0)):
        l1=sigmoid(np.dot(l0[i],syn0))
        l2=sigmoid(np.dot(l1,syn1))
        l2_err=output[i]-l2
        l2_delta=reduce(l2_err)
        l1_err=syn1*l2_delta
        l1_delta=reduce(l1_err)
        syn1=syn1.T
        syn1+=l0[i].T*l2_delta
        syn1=syn1.T
        syn0=syn0.T
        syn0+=l0[i].T*l1_delta
        syn0=syn0.T

print l2
PS: I know that it might be a piece of trash as a script, but that is why I asked for assistance.
Your computations are not fully correct. For example, reduce is called on l1_err and l2_err, where it should be called on l1 and l2.
You are performing stochastic gradient descent. With so few parameters it oscillates hugely; use full-batch gradient descent in this case.
The bias units are not present (although technically you can still learn without bias).
I tried to rewrite your code with minimal changes. I have commented your lines to show the changes.
#!/usr/bin/python3
import matplotlib.pyplot as plt
import numpy as np

def sigmoid(x):
    return 1/(1+np.exp(-x))

def reduce(x):
    return x*(1-x)

l0=np.array ([np.array([1,1,0,0]),
              np.array([1,0,1,0]),
              np.array([1,1,1,0]),
              np.array([0,1,0,1]),
              np.array([0,0,1,0]),
             ]);
output=np.array ([[0],[1],[1],[0],[1]]);

syn0=np.random.random((4,4))
syn1=np.random.random((4,1))

final_err = list ();
gamma = 0.05
maxiter = 100000
for justanumber in range(maxiter):
    syn0_del = np.zeros_like (syn0);
    syn1_del = np.zeros_like (syn1);
    l2_err_sum = 0;
    for i in range(len(l0)):
        this_data = l0[i,np.newaxis];
        l1=sigmoid(np.matmul(this_data,syn0))[:]
        l2=sigmoid(np.matmul(l1,syn1))[:]
        l2_err=(output[i,:]-l2[:])
        #l2_delta=reduce(l2_err)
        l2_delta=np.dot (reduce(l2), l2_err)
        l1_err=np.dot (syn1, l2_delta)
        #l1_delta=reduce(l1_err)
        l1_delta=np.dot(reduce(l1), l1_err)
        # Accumulate gradient for this point for layer 1
        syn1_del += np.matmul(l2_delta, l1).T;
        #syn1=syn1.T
        #syn1+=l1.T*l2_delta
        #syn1=syn1.T
        # Accumulate gradient for this point for layer 0
        syn0_del += np.matmul(l1_delta, this_data).T;
        #syn0=syn0.T
        #syn0-=l0[i,:].T*l1_delta
        #syn0=syn0.T
        # The error for this datapoint. Mean sum of squares
        l2_err_sum += np.mean (l2_err ** 2);
    l2_err_sum /= l0.shape[0]; # Mean sum of squares
    syn0 += gamma * syn0_del;
    syn1 += gamma * syn1_del;
    print ("iter: ", justanumber, "error: ", l2_err_sum);
    final_err.append (l2_err_sum);

# Predicting
l1=sigmoid(np.matmul(l0,syn0))[:]  # 1 x d * d x 4 = 1 x 4
l2=sigmoid(np.matmul(l1,syn1))[:]  # 1 x 4 * 4 x 1 = 1 x 1
print ("Predicted: \n", l2)
print ("Actual: \n", output)
plt.plot (np.array (final_err));
plt.show ();
The output I get is:
Predicted:
[[0.05214011]
[0.97596354]
[0.97499515]
[0.03771324]
[0.97624119]]
Actual:
[[0]
[1]
[1]
[0]
[1]]
Therefore the network was able to predict all the toy training examples. (Note that with real data you would not want to fit the training data this closely, as that leads to overfitting.) You may get slightly different results, as the weight initialisations differ between runs. Also, as a rule of thumb, initialise the weights between [-0.01, +0.01] when you are not working on a specific problem where you know a better initialisation.
Here is the convergence plot.
Note that you do not need to iterate over each example; instead you can do the matrix multiplication all at once, which is much faster. Also, the above code does not have bias units. Make sure you include bias units when you re-implement the code.
I would recommend going through Raul Rojas' Neural Networks: A Systematic Introduction, Chapters 4, 6 and 7. Chapter 7 will tell you how to implement deeper networks in a simple way.

Translating glmer (binomial) into jags to include a correlated random effect (time)

Context:
I have a 12-item risk assessment where individuals are given a rating from 0-4 (4 being the highest risk). The risk assessment can be done multiple times for each individual (max = 19, but most have fewer than 5 measurements).
The baseline level of risk varies by individual, so I am looking for a random-intercepts model, but I also need to reflect the dynamic nature of the risk, i.e. adding 'time' as a random coefficient.
The outcome is binary:
further offending (FO.bin), which occurs at the measurement level. This means I am essentially looking at what dynamic changes have occurred within one or more of the 12 items, and how they have contributed to the individual committing a further offence in the period between the measurements.
Ultimately, what I am looking to do is predict whether an individual will offend in the future, based on the assessment history of others who share the same characteristics, contextual factors, and factors which may change over time.
Goal:
I wish to add to my 'basic' model by adding time-varying (level 1) and time-invariant (level 2) predictors:
Time varying include dummy variables around the criminal justice process such as non-compliance, going to court and spending time in custody.   These are reflected as being an 'event' which has occurred in the period between assessments
Time invariant includes dummy variables such as being female, being non-White, and continuous predictors such as age at time of first offence
I've managed to set this up OK using lme4 and have some potentially interesting results from adding the level 1 and level 2 predictors, including where there are interactions and cross-level interactions. However, the complexity of the enhanced models is throwing up all kinds of warning messages, including ones about failing to converge. I therefore feel it would be appropriate to switch to a Bayesian framework using rjags, so that I can feel more confident about my findings.
The Problem:
Basically it is one of translation.  This is my 'basic' model based on time and the 12 items in the risk assessment using lme4:
Basic_Model1 <- glmer(BinaryResponse ~ item1 + item2 + item3 + ... + item12 + time + (1+time|individual), data=data, family=binomial)
This is my attempt to translate this into a BUGS model:
# the number of Risk Assessments = 552
N <-nrow(data)                                                            
# number of Individuals (individual previously specified) = 88
J <- length(unique(Individual))                                           
# the 12 items (previously specified)
Z <- cbind(item1, item2, item3, item4, ... item12)                        
# number of columns = number of predictors, will increase as model enhanced
K <- ncol(Z)                                                              
## Store all data needed for the model in a list
jags.data1 <- list(y = FO.bin, Individual =Individual, time=time, Z=Z, N=N, J=J, K=K)                   
model1 <- function() {
    for (i in 1:N) {
    y[i] ~ dbern(p[i])
    logit(p[i]) <- a[Individual[i]] + b*time[i]
  }
 
  for (j in 1:J) {
    a[j] ~ dnorm(a.hat[j],tau.a)
    a.hat[j]<-mu.a + inprod(g[],Z[j,])
  }
  b ~ dnorm(0,.0001)
  tau.a<-pow(sigma.a,-2)
  sigma.a ~ dunif(0,100)
 
  mu.a ~ dnorm (0,.0001)
  for(k in 1:K) {
    g[k]~dnorm(0,.0001)
  }
}
write.model(model1, "Model_1.bug")
Looking at the output, my gut feeling is that I've not added the varying coefficient for time and that what I have done so far is only the equivalent of
Basic_Model2 <- glmer(BinaryResponse ~ item1 + item2 + item3 + ... + item12 + time + (1|individual), data=data, family=binomial)
How do I tweak my BUGS code to reflect time as a varying coefficient, i.e. Basic_Model1?
Based on the examples I have managed to find, I know that I need to make an additional specification in the J loop so that I can monitor the U[j], and that the second part of the logit statement involving time needs to change, but it's got to the point where I can't see the wood for the trees!
I'm hoping that someone with a lot more expertise than me can point me in the right direction. Ultimately I am looking to expand the model by adding additional level 1 and level 2 predictors. Having looked at these using lme4, I anticipate having to specify interactions and cross-level interactions, so I am looking for an approach which is flexible enough to expand in this way. I'm very new to coding, so please be gentle with me!
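For reference, a minimal sketch (untested) of how a varying time coefficient could be written in the same style as the BUGS code above; it assumes independent random intercepts and slopes, rather than the correlated pair that (1+time|individual) implies, and the names model1b, mu.b, tau.b and sigma.b are introduced here:
model1b <- function() {
  for (i in 1:N) {
    y[i] ~ dbern(p[i])
    # individual-specific intercept a[j] and slope b[j] on time
    logit(p[i]) <- a[Individual[i]] + b[Individual[i]]*time[i]
  }
  for (j in 1:J) {
    a[j] ~ dnorm(a.hat[j], tau.a)
    a.hat[j] <- mu.a + inprod(g[], Z[j,])
    b[j] ~ dnorm(mu.b, tau.b)     # random slope; no intercept-slope correlation here
  }
  mu.a ~ dnorm(0, .0001)
  mu.b ~ dnorm(0, .0001)
  tau.a <- pow(sigma.a, -2)
  tau.b <- pow(sigma.b, -2)
  sigma.a ~ dunif(0, 100)
  sigma.b ~ dunif(0, 100)
  for (k in 1:K) {
    g[k] ~ dnorm(0, .0001)
  }
}
To match glmer's correlated random effects exactly, a[j] and b[j] would instead be drawn jointly from dmnorm with a Wishart prior on the precision matrix.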
For that kind of case you can use an autoregressive Gaussian (CAR) model for time. As your tag is winbugs (or openbugs), you can use the function car.normal as follows. This code needs to be adapted to your dataset!
Data
y should be a matrix with individuals in rows and time in columns. If you do not have the same number of time points for each individual, just put NA values.
You also need to define the parameters of the temporal process, i.e. the neighbourhood matrix with its weights. I am sorry, but I do not totally remember how to create it... For an autoregressive process of order one, this should be something like:
jags.data1 <- list(
  # Total number of neighbour relations (here, for 8 time points)
  sumNumNeigh.tm = 14,
  # Adjacency vector (I do not remember exactly how it is built)
  adj.tm = c(2, 1, 3, 2, 4, 3, 5, 4, 6, 5, 7, 6, 8, 7),
  # Number of neighbours of each time point, for order 1
  num.tm = c(1, 2, 2, 2, 2, 2, 2, 1),
  # Matrix of data: individuals x time
  y = FO.bin,
  # Your other parameters
  Individual = Individual, Z = Z, N = N, J = J, K = K)
Model
model1 <- function() {
  for (i in 1:N) {
    for (t in 1:T) {
      y[i,t] ~ dbern(p[i,t])
      # logit(p[i]) <- a[Individual[i]] + b*time[i]
      logit(p[i,t]) <- a[Individual[i]] + b*U[t]
    }
  }
  # intrinsic CAR prior on temporal random effects
  U[1:T] ~ car.normal(adj.tm[], weights.tm[], num.tm[], prec.nu)
  for (k in 1:sumNumNeigh.tm) { weights.tm[k] <- 1 }
  # prior on precision of temporal random effects
  prec.nu ~ dgamma(0.5, 0.0005)
  # conditional variance of temporal random effects
  sigma2.nu <- 1/prec.nu
  for (j in 1:J) {
    a[j] ~ dnorm(a.hat[j], tau.a)
    a.hat[j] <- mu.a + inprod(g[], Z[j,])
  }
  b ~ dnorm(0, .0001)
  tau.a <- pow(sigma.a, -2)
  sigma.a ~ dunif(0, 100)
  mu.a ~ dnorm(0, .0001)
  for (k in 1:K) {
    g[k] ~ dnorm(0, .0001)
  }
}
For your information, with JAGS you would need to code the CAR model yourself, using dmnorm.
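As a rough alternative to coding the full CAR structure with dmnorm, a first-order random walk prior on the temporal effects can be written directly in JAGS. This is a sketch only: it would replace the car.normal line and the weights.tm loop in the model above, keep the existing prec.nu prior, and assumes T (the number of time points) is supplied in the data:
# Sketch: first-order random walk prior on U[1:T], playing the role of car.normal
U[1] ~ dnorm(0, 0.001)            # vague prior on the first time point
for (t in 2:T) {
  U[t] ~ dnorm(U[t-1], prec.nu)   # each effect centred on the previous one
}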

multiple definitions of stochastic node binomial mixture model

I know that this question has been asked before and I have looked at the answers, but they don't seem to apply to my case. I've written a binomial mixture model in WinBUGS that includes a covariate. The model is syntactically correct and the data load fine, but I get the error 'multiple definitions of node lambda[1]'. Lambda is not defined within the data, and it's subscripted to provide an estimate for every year there is data.
My code is the following:
model {
  #Priors
  for (k in 1:7) {
    lambda[k]~dnorm(0,0.01)
    p[k]~dunif(0,1)
  }
  alpha0~dunif(-10,10)
  alpha1~dunif(-10,10)
  beta0~dunif(-10,10)
  beta1~dunif(-10,10)
  #Likelihood
  #Ecological model for true abundance (process model)
  for (k in 1:7) { #Loop over years
    lambda[k]<-exp(alpha.lam[k])
    for (i in 1:R) { #Loop over R sites
      N[i,k]~dpois(lambda[k]) #Abundance
      log(lambda[k])<-alpha0+alpha1*well[i,k]
      #Observation model for replicated counts
      for (j in 1:T) { #Loop over repeated counts within a year
        y[i,j,k] ~dbin(p[k],N[i,k]) #Detection
        p[k]<-exp(lp[k])/(1+exp(lp[k]))
        lp[k]<-beta0+beta1*well[i,k]
        #Assess model fit using Chi-squared discrepancy
        #Compute fit statistic "E" for observed data
        eval[i,j,k]<-p[k]*N[i,k] #Expected values
        E[i,j,k]<-pow((y[i,j,k]-eval[i,j,k]),2)/(eval[i,j,k]+0.5)
        #Generate replicate data and compute fit stats for them
        y.new[i,j,k]~dbin(p[k],N[i,k])
        E.new[i,j,k]<-pow((y.new[i,j,k]-eval[i,j,k]),2)/(eval[i,j,k]+0.5)
      }#j
    }#i
  }#k
  #Derived and other quantities
  for(k in 1:7) {
    totalN[k]<-sum(N[,k]) #Total pop. size across all sites
    mean.abundance[k]<-exp(alpha.lam[k])
  }
  fit<-sum(E[,,])
  fit.new<-sum(E.new[,,])
}
#Data
list(R=669,T=3)
Can anyone tell me why I'm getting this error? Many thanks in advance.
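The error usually means the same node is assigned more than once: in the model above, lambda[k] is given a dnorm prior, set to exp(alpha.lam[k]), and then redefined through log(lambda[k]) <- ... inside the site loop (so it is written R times per year), and p[k] has the same problem. A minimal sketch of one possible restructuring (untested; the fit-statistic part is omitted for brevity) indexes lambda and p by site as well as year and keeps priors only on the regression coefficients:
model {
  #Priors only on the regression coefficients; lambda and p become logical nodes
  alpha0~dunif(-10,10)
  alpha1~dunif(-10,10)
  beta0~dunif(-10,10)
  beta1~dunif(-10,10)
  for (k in 1:7) { #Loop over years
    for (i in 1:R) { #Loop over R sites
      log(lambda[i,k])<-alpha0+alpha1*well[i,k]  #defined exactly once per site-year
      N[i,k]~dpois(lambda[i,k]) #Abundance
      logit(p[i,k])<-beta0+beta1*well[i,k]       #likewise for detection
      for (j in 1:T) { #Loop over repeated counts within a year
        y[i,j,k]~dbin(p[i,k],N[i,k]) #Detection
      }#j
    }#i
    totalN[k]<-sum(N[,k]) #Total pop. size across all sites
  }#k
}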