Hierarchical Bayes with bayesm: two second-levels

Currently I am implementing a Hierarchical Bayes model with panel data for I stores over T weeks, where my dependent variable is sales of a brand. I aim to relate store characteristics to the explanatory variables I am using, but I wish to have two separate second levels. My model looks like this:
y = intercept + alpha * X + beta * W + error term,
where I would like to introduce the second levels:
alpha = lambda1 * Z + error
beta = lambda2 * Z + error,
where Z contains the store characteristics. However, as far as I know, I can only use one second level with rhierLinearModel in bayesm.
Does anybody know how I can specify the model or adjust the code to obtain two second-levels?
Thank you very much in advance!

Related

Why do the results from the joint_tests function (emmeans package) not show one of the interactions of the model?

I fit a GLMMadaptive model (I am building a resource selection function) and I am using the joint_tests function (emmeans package) to compute joint tests of the terms in the model. The problem is that one of the interactions does not appear in the results.
The model is:
mod.hinc <- mixed_model(fixed = Used ~ scale(ndvi) * season * vegfactor +
                            scale(ndvi^2) + scale(distance^2) + scale(distance) * season,
                        random = ~ 1 | id, data = hin.c,
                        family = binomial(link = "logit"))
After running the model I run the joint_tests function:
install.packages("emmeans")
library(emmeans)
joint_tests(mod.hinc)
And this is the result:
joint_tests(mod.hinc)
 model term            df1 df2 F.ratio p.value
 ndvi                    1 Inf  36.465  <.0001
 season                  3 Inf  22.265  <.0001
 vegfactor               4 Inf   4.548  0.0011
 distance                1 Inf  33.939  <.0001
 ndvi:season             3 Inf  13.826  <.0001
 ndvi:vegfactor          4 Inf   8.500  <.0001
 season:vegfactor       12 Inf   6.544  <.0001
 ndvi:season:vegfactor  12 Inf   5.165  <.0001
I cannot find the reason why the interaction scale(distance)*season does not appear in the results.
Any help on this issue is welcome. I can provide more details about the model if required.
Thank you very much in advance.
Juan
The short answer is that distance:season is not shown because it came up with zero d.f. for the associated interaction contrasts. You could verify this by running joint_tests(mod.hinc, show0df = TRUE).
Why it has 0 d.f. is less clear. However, that is not the only problem here. You have to be extremely careful with numeric predictors when using joint_tests(); it does not do a model ANOVA; instead, as documented, it constructs a reference grid from the fitted model and performs joint tests of interaction contrasts related to the predictors. With numeric predictors, the results depend on the reference grid used.
In this particular instance, the model includes quadratic effects of ndvi and distance; however, the default reference grid is constructed using the range of the covariates -- only two distinct values. Thus, we can pick up the effects of the overall linear trends, but not the curvature effects implied by the quadratic terms. That's why only 1 d.f. is tested for each of those main effects, when there are really 2 d.f. in the effects of ndvi and distance. In order to capture all of those effects, we need at least three distinct values of these covariates in the reference grid. One way (not the only way) to accomplish that is to reduce the covariates to their means, plus or minus 1 SD -- which can be accomplished via this code:
meanpm1sd <- function(x)
    c(mean(x) - sd(x), mean(x), mean(x) + sd(x))

joint_tests(mod.hinc, cov.reduce = meanpm1sd)
This will yield a different set of joint tests that likely will include 2-d.f. tests of ndvi and distance. But I don't know if you will still have some interactions missing due to zero-d.f. dimensionalities.
You can look directly at the estimates being tested in detail if you have any questions about what those effects are. For example, for season:distance,
### construct the needed reference grid once and for all
RG <- ref_grid(mod.hinc, cov.reduce = meanpm1sd)
EMM <- emmeans(RG, ~ season * distance)
CON <- contrast(EMM, interaction = "consec")
EMM ### see estimates
CON ### see interaction contrasts
test(CON, joint = TRUE)
I hope this helps shed some light on what is going on.

Dynamic force transmission simulation

I have been working on this all day but I haven't figured it out yet. So I thought I may as well ask on here and see if someone can help.
The problem is as follows:
               ----------
F_input(t) --> |        | --> F_output(t)
               ----------
Given a sample with a known length, density, and spring constant (or young's modulus), find the 'output' force against time when a known variable force is applied at the 'input'.
My current solution can already discretise the sample into finite elements, but I am struggling to figure out how the force should transmit, given that the transmission speed in the material itself changes with the applied force (using the equation c = sqrt(force*area/density)).
If someone could point me to a solution or any other helpful resources, it would be highly appreciated.
A method for applying damping to the system would also be helpful but I should be able to figure out that part myself. (losses to the environment via sound or internal heating)
I will remodel the problem in the following way:
                ___                    ___
F_input(t) --> |___|--/\/\/\/\/\/\/\--|___|
At time t=0 the system is in equilibrium, the distance between the two objects is L, the mass of the left one (object 1) is m1 and the mass of the right one (object 2) is m2.
                      ___                       ___
F_input(t) --> |<-x->|___|--/\/\/\/\/\/--|<-y->|___|
During the application of the force F_input(t), at time t > 0, denote by x the oriented distance of the position of object 1 from its original position at time t=0. Similarly, at time t > 0, denote by y the oriented distance of the position of object 2 from its original position at time t=0 (see the diagram above). Then the system is subject to the following system of ordinary differential equations:
x'' = -(k/m1) * x + (k/m1) * y + F_input(t)/m1
y'' =  (k/m2) * x - (k/m2) * y
When you solve it, you get the change of x and y with time, i.e. you get two functions x = x(t), y = y(t). Then, the output force is
F_output(t) = m2 * y''(t)
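For illustration, here is a minimal Matlab sketch of integrating this system with ode45; the parameter values and the step input force are assumptions, not part of the original problem:
% Two-mass/spring system from above; all parameter values are assumptions.
m1 = 1; m2 = 1; k = 100;                 % masses and spring constant
F_input = @(t) 10*(t > 0.1);             % example input force (a step)
% State vector z = [x; x'; y; y']
odefun = @(t,z) [z(2);
                 -(k/m1)*z(1) + (k/m1)*z(3) + F_input(t)/m1;
                 z(4);
                 (k/m2)*z(1) - (k/m2)*z(3)];
[t, z] = ode45(odefun, [0 2], [0; 0; 0; 0]);
F_output = k*(z(:,1) - z(:,3));          % m2*y'' = k*(x - y), from the second ODE
plot(t, F_output)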
The problem isn't well defined at all. For starters, for F_out to exist there must be some constraint that it obeys; otherwise, the system will have more unknowns than equations.
The discretization will lead you to a system like
M*xpp = -K*x + F
with m=ρ*A*Δx and k=E*A/Δx
But to solve this system with n equations, you either need to know F_in and F_out, or prescribe the motion of one of the nodes, like x_n = 0, which would lead to xpp_n = 0
As far as damping goes, you usually employ proportional damping, with a damping matrix proportional to the stiffness matrix, D = α*K, which multiplies the vector of velocities.
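As an illustration, here is a minimal Matlab sketch of that discretized system, assuming a bar fixed at the right end (x_n = 0) via a wall spring and time-stepped with a semi-implicit Euler scheme; all parameter values are assumptions:
% Discretized bar M*xpp = -K*x + F; illustrative (assumed) steel-like values.
n = 50; rho = 7800; A = 1e-4; E = 2e11; Lbar = 1;
dx = Lbar/n;
m = rho*A*dx;                % lumped nodal mass, m = rho*A*dx
k = E*A/dx;                  % element stiffness, k = E*A/dx
K = k*(diag(2*ones(n,1)) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1));
K(1,1) = k;                  % free left end; node n stays tied to the wall
M = m*eye(n);
Dmat = 1e-5*K;               % proportional damping, D = alpha*K with assumed alpha
F_in = @(t) 100*sin(2*pi*1e3*t);   % example input force
dt = 1e-6; x = zeros(n,1); v = zeros(n,1);
for t = 0:dt:1e-3            % semi-implicit Euler time stepping
    F = zeros(n,1); F(1) = F_in(t);
    a = M \ (F - K*x - Dmat*v);
    v = v + dt*a;
    x = x + dt*v;
end
F_out = k*x(n)               % force in the spring connecting node n to the wall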

Regarding Time scale issue in Netlogo

I am a new user of NetLogo. I have a system of reactions (converted to ordinary differential equations), which can be solved using Matlab. I want to develop the same model in NetLogo (for comparison with the Matlab results). I am confused about time/ticks, because NetLogo uses "ticks" to increment time, whereas Matlab uses time in seconds. How do I convert my Matlab seconds to a number of ticks? Can anyone help me write the code? The model is:
A + B ---> C (with rate constant k1 = 1e-6)
2A+ C ---> D (with rate constant k2 = 3e-7)
A + E ---> F (with rate constant k3 = 2e-5)
Initial values are A = B = C = 500, D = E = F = 10
Initial time t=0 sec and final time t=6 sec
I have a general comment first: NetLogo is intended for agent-based modelling (ABM). ABM has multiple entities with different characteristics interacting in some way, and it is not really an appropriate methodology for solving ODEs. If your goal is simply to build your model in something other than Matlab for comparison, rather than specifically requiring NetLogo, I can recommend Vensim as more appropriate. Having said that, you can build the model you want in NetLogo; it is just very awkward.
NetLogo handles time discretely rather than continuously. You can have any number of ticks per second (I would suggest 10, so the final time of 6 seconds is 60 ticks). You will need to convert your equations into a discrete form, so your rates would be something like k1-discrete = k1 / 10. You may have precision problems with very small numbers.
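To make the conversion concrete, here is a minimal sketch (in Matlab, since the goal is comparison with the Matlab results) of the discrete update each tick would perform, assuming mass-action kinetics and 10 ticks per second; in NetLogo the same arithmetic would sit inside the go procedure:
% Forward-Euler discretization: one loop iteration = one NetLogo tick.
k1 = 1e-6; k2 = 3e-7; k3 = 2e-5;   % rate constants from the question
dt = 0.1;                          % seconds per tick (10 ticks per second)
A = 500; B = 500; C = 500; D = 10; E = 10; F = 10;
for tick = 1:60                    % 60 ticks = 6 seconds
    r1 = k1*A*B;                   % A + B  -> C   (mass-action rates assumed)
    r2 = k2*A^2*C;                 % 2A + C -> D
    r3 = k3*A*E;                   % A + E  -> F
    A = A + dt*(-r1 - 2*r2 - r3);
    B = B + dt*(-r1);
    C = C + dt*(r1 - r2);
    D = D + dt*r2;
    E = E + dt*(-r3);
    F = F + dt*r3;
end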

Trying to find the point when two functions differ from each other a 5% with Matlab

As I've written in the title, I'm trying to find the exact distance (dimensionless distance in this case) at which two functions start to differ from each other by 5% on the Y-axis. The two functions intersect at the value of 1 on the X-axis, and I need to find the described distance before the intersection, not after (i.e., it must be less than 1). I've written Matlab code so you can see the shape of the functions, followed by the calculations I'm trying to make work; they don't, and I don't know why: "Explicit solution could not be found".
I don't know if I explained it clearly. Please let me know if you need a more detailed explanation.
I hope you can throw some light on this issue.
Thank you so much in advance.
r=0:0.001:1.2;
ro=0.335;
rt=r./ro;
De=0.3534;
k=2.8552;
B=(2*k/De)^0.5;
Fm=2.*De.*B.*ro.*(1-exp(B.*ro.*(1-rt))).*exp(B.*ro.*(1-rt));
A=5;
b=2.2347;
C=167.4692;
Ftt=(C.*(exp(-b.*rt).*((b.^6.*rt.^5)./120 + (b.^5.*rt.^4)./24 + (b.^4.*rt.^3)./6 + (b.^3.*rt.^2)./2 + b.^2.*rt + b) - b.*exp(-b.*rt).*((b.^6.*rt.^6)./720 + (b.^5.*rt.^5)./120 + (b.^4.*rt.^4)./24 + (b.^3.*rt.^3)./6 + (b.^2.*rt.^2)./2 + b.*rt + 1)))./rt.^6 - (6.*C.*(exp(-b.*rt).*((b.^6.*rt.^6)./720 + (b.^5.*rt.^5)./120 + (b.^4.*rt.^4)./24 + (b.^3.*rt.^3)./6 + (b.^2.*rt.^2)./2 + b.*rt + 1) - 1))./rt.^7 - A.*b.*exp(-b.*rt);
plot(rt,-Fm,'red')
axis([0 2 -1 3])
xlabel('Dimensionless distance')
ylabel('Force, -dU/dr')
hold on
plot(rt,-Ftt,'green')
clear rt
syms rt
%assume(0<rt<1)
r1=solve((Fm-Ftt)/Ftt==0.05,rt)
r2=solve((Ftt-Fm)/Fm==0.05,rt)
Welcome to the crux of floating point data. The reason is that, for the values of r you are providing, the exact solution of 0.05 may lie between two of the values in your r array, so you won't be able to get an exact solution. Also, FWIW, your equation may never generate a solution of 0.05, which is why you're getting that error too. Either way, doing an explicit solve on floating point data is never recommended, unless you know very well how your data are shaped and what values you expect from the function you're applying the data to.
As such, it's always recommended that you find the nearest value that satisfies your condition. You should do something like this:
[~,ind] = min(abs((Fm-Ftt)./Ftt - 0.05));
r1 = r(ind);
The first line finds the location in your r array that comes nearest to satisfying the 5% criterion. The next line then gives you the value in your r array at that location. You can do the same for r2:
[~,ind2] = min(abs((Ftt-Fm)./Fm - 0.05));
r2 = r(ind2);
What the above code is basically doing is finding the point in your r array where the difference between your data and 5% is closest to 0 -- in other words, the point in your r array where the relation above is as close to 5% as possible.
If you want to improve this, you can always change the step size of r... perhaps make it 0.00001 or something. However, the smaller the step size, the larger your array and you'll eventually run out of memory!
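Alternatively, instead of shrinking the step size, you can refine the grid-based estimate by interpolation. A sketch (this assumes the relative difference actually crosses 5% near the grid estimate, and reuses ind from above plus the numeric rt from before the clear rt line):
% Refine r1 beyond the grid resolution without enlarging the array.
rel = (Fm - Ftt)./Ftt - 0.05;        % zero exactly at the 5% point
ok = isfinite(rel);                  % drop the rt = 0 point (division by zero there)
r1_refined = fzero(@(x) interp1(rt(ok), rel(ok), x, 'pchip'), rt(ind))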

How to generate random matlab vector with these constraints

I'm having trouble creating a random vector V in Matlab, subject to the following set of constraints (given parameters N, D, L, and theta):
The vector V must be N units long
The elements must have an average of theta
No 2 successive elements may differ by more than +/-10
D == sum(L*cosd(V-theta))
I'm having the most problems with the last one. Any ideas?
Edit
Solutions in other languages or equation form are equally acceptable. Matlab is just a convenient prototyping tool for me, but the final algorithm will be in java.
Edit
From the comments and initial answers I want to add some clarifications and initial thoughts.
I am not seeking a 'truly random' solution from any standard distribution. I want a pseudo randomly generated sequence of values that satisfy the constraints given a parameter set.
The system I'm trying to approximate is a chain of N links of link length L where the end of the chain is D away from the other end in the direction of theta.
My initial insight here is that theta can be removed from consideration until the end, since (2) in essence adds theta to every element of a 0 mean vector V (shifting the mean to theta) and (4) simply removes that mean again. So, if you can find a solution for theta=0, the problem is solved for all theta.
As requested, here is a reasonable range of parameters (not hard constraints, but typical values):
5<N<200
3<D<150
L==1
0 < theta < 360
I would start by creating a "valid" vector. That should be possible -- say, calculate it so that every entry has the same value.
Once you have that vector, I would apply some transformations to "shuffle" it. "Rejection sampling" is the keyword: if a shuffle would violate one of your rules, you just don't do it.
As transformations I come up with:
switch two entries
modify the value of one entry, and modify a second one to keep the 4th condition (theoretically you could just shuffle two until the condition is fulfilled, but the chance of that happening is quite low)
But maybe you can find some more.
Do this reasonably often and you get a "valid" random vector. Theoretically you should be able to reach all valid vectors; practically, you could construct several "start" vectors so it won't take that long.
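A minimal Matlab sketch of that idea, assuming theta = 0 (using the questioner's observation that theta can be removed), a symmetric ramp as the "valid" start vector, and a tolerance on condition (4); all parameter values and the tolerance are illustrative:
N = 50; L = 1; D = 30; tol = 0.05;             % assumed parameters
ramp = @(b) linspace(-b, b, N)';               % mean is 0 by symmetry
b = fzero(@(b) sum(L*cosd(ramp(b))) - D, 90);  % tune amplitude so (4) holds
V = ramp(b);                                   % "valid" start vector
for it = 1:1e5
    i = randi(N); j = randi(N);
    d = 2*randn();
    W = V; W(i) = W(i) + d; W(j) = W(j) - d;   % paired move preserves the mean exactly
    if all(abs(diff(W)) <= 10) && abs(sum(L*cosd(W)) - D) < tol
        V = W;                                 % accept; otherwise reject the shuffle
    end                                        % tighter tol => fewer accepted moves
end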
Here's a way of doing it. It is clear that not all combinations of theta, N, L and D are valid. It is also clear that you're trying to simulate random objects that are quite complex. You will probably have a hard time showing anything useful with respect to these vectors.
The series you're trying to simulate seems similar to a Wiener process. So I started with that; you can start with anything that is random yet reasonable. I then use that as a starting point for an optimization that tries to satisfy (2), (3) and (4). The closer your initial value is to a valid vector (satisfying all your conditions), the better the convergence.
function series = generate_series(D, L, N, theta)
    s(1) = theta;                     % Wiener-like initial path
    for i = 2:N
        s(i) = s(i-1) + randn(1,1);
    end
    f = @(x) objective(x, D, L, N, theta);
    q = optimset('Display','iter','TolFun',1e-10,'MaxFunEvals',Inf,'MaxIter',Inf);
    [sf, val] = fminunc(f, s, q);
    val
    series = sf;

function value = objective(s, D, L, N, theta)
    a = abs(mean(s) - theta);               % condition 2: mean equals theta
    b = abs(D - sum(L*cosd(s - theta)));    % condition 4 (cosd, matching the stated constraint)
    c = 0;
    for i = 2:N
        u = abs(s(i) - s(i-1));
        if u > 10                           % condition 3: successive-difference limit
            c = c + u;
        end
    end
    value = a^2 + b^2 + c^2;
It seems like you're trying to simulate something very complex/strange (a path of a given curvature?), see questions by other commenters. Still you will have to use your domain knowledge to connect D and L with a reasonable mu and sigma for the Wiener to act as initialization.
So based on your new requirements, it seems like what you're actually looking for is an ordered list of random angles, with a maximum change of 10 degrees between successive angles (which I first convert to radians), such that the number of links, the link length, and the distance and direction from start to end are all as specified?
Simulate an initial guess. It will not satisfy the D and theta constraints (i.e. the specified D and the specified theta):
angles = zeros(N, 1);
for link = 2:N
    % (2*rand() - 1) spans [-1 1], so each step stays within +/-10 degrees
    angles(link) = angles(link - 1) + (2*rand() - 1)*(10*pi/180);
end
Use a genetic algorithm (or another optimization routine) to adjust the angles based on the following cost function:
dx = sum(L*cos(angles));
dy = sum(L*sin(angles));
D = sqrt(dx^2 + dy^2);
theta = atan2(dy, dx);
the cost is now just the difference between the vector given by my D and theta above and the vector given by the specified D and theta (i.e. the inputs).
You will still have to enforce the max change of 10 degrees rule; perhaps violations should just make the cost function enormous? Perhaps there is a cleaner way to specify sequence constraints in optimization algorithms (I don't know how).
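To make that concrete, here is a sketch of such a penalized cost function; the weight 1e3 and the names chain_cost, D_target and theta_target are illustrative, not part of the original answer:
function c = chain_cost(angles, L, D_target, theta_target)
    % end-to-end vector implied by the current angles (radians)
    dx = sum(L*cos(angles));
    dy = sum(L*sin(angles));
    % difference from the specified end-to-end vector
    err = [dx - D_target*cos(theta_target); dy - D_target*sin(theta_target)];
    % large penalty when successive angles change by more than 10 degrees
    pen = sum(max(abs(diff(angles)) - 10*pi/180, 0).^2);
    c = sum(err.^2) + 1e3*pen;
With the Global Optimization Toolbox, this could then be minimized via, e.g., ga(@(a) chain_cost(a, L, D_target, theta_target), N).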
I feel like if you can find the right optimization with the right parameters this should be able to simulate your problem.
You don't give us a lot of detail to work with, so I'll assume the following:
random numbers are to be drawn from [-127+theta +127-theta]
all random numbers will be drawn from a uniform distribution
all random numbers will be of type int8
Then, for the first 3 requirements, you can use this:
N = 1e4;
theta = 40;
diffVal = 10;
% draw from a range that cannot overflow int8 after adding theta
g = @() randi([intmin('int8')+theta intmax('int8')-theta], 'int8') + theta;
V = [g(); zeros(N-1, 1, 'int8')];
for ii = 2:N
    V(ii) = g();
    while abs(V(ii) - V(ii-1)) > diffVal   % re-draw while rule 3 is violated
        V(ii) = g();
    end
end
Inline the anonymous function for more speed.
Now, the last requirement,
D == sum(L*cos(V-theta))
is a bit of a strange one... cos(V-theta) is a specific way to re-scale the data to the [-1 +1] interval, which the multiplication with L will then scale to [-L +L]. On first sight, you'd expect the sum to average out to 0.
And indeed it does: the expected value of cos(x) when x is a random variable from a uniform distribution over a full period [0 2*pi] is 0; it is the expected value of |cos(x)| that equals 2/pi (see here for example). Ignoring for the moment the fact that our limits are different from [0 2*pi], the expected value of sum(L*cos(V-theta)) is 0, with random fluctuations around it whose standard deviation is L*sqrt(N/2).
How you can force this to equal some other constant D is beyond me... can you perhaps elaborate on that a bit more?
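A quick Monte Carlo check of that expectation argument (purely illustrative):
N = 1e4; L = 1;
V = 360*rand(N,1);                   % uniform over a full period, in degrees
S = sum(L*cosd(V));                  % one realization of the sum
fprintf('sum = %.1f (std of the sum is L*sqrt(N/2) = %.1f)\n', S, L*sqrt(N/2));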