'load data' issue in winbugs (bayesian hierarchical)


I have a hierarchical linear model in WinBUGS. The data are longitudinal and fall into three categories (red = 1, blue = 2, white = 3).
k (total observations) = 280
The structure of the data is as follows:
N[] T[] logs[] logp[] cat[] rank[]
1 1 4.2 8.9 1 1
1 2 4.2 8.1 1 2
1 3 3.5 9.2 1 1
2 1 4.1 7.5 1 2
2 2 4.5 6.5 1 2
3 1 5.1 6.6 2 4
3 2 6.2 6.8 3 7
#N = school
#T = time
#logs = log(score)
#logp = log(average hours of inputs)
#rank = rank of school
#cat = section red, section blue, section white in school
My model is syntactically correct, but when I try to load the data, I get the error 'expected square bracket at the end]'.
model {
  # N brands
  # T time periods
  for (k in 1:K) {
    for (i in 1:N) {
      for (t in 1:T) {
        logs[k,i,t] ~ dnorm(mu[k,i,t], tau)
        mu[k,i,t] <- bcons + bprice*(logp[t] - logpricebar)
                     + brank[cat[t]]*(rank[t] - rankbar)
      }
    }
  }
  # C categories
  for (c in 1:C) {
    brank[c] ~ dnorm(beta, taub)
  }
  # priors
  bcons ~ dnorm(0,1.0E-6)
  bprice ~ dnorm(0,1.0E-6)
  bad ~ dnorm(0,1.0E-6)
  beta ~ dnorm(0,1.0E-6)
  tau ~ dgamma(0.001,0.001)
  taub ~ dgamma(0.001,0.001)
}
I follow the standard process for loading data: I select N and then press 'load data' in the dialogue box.
Can someone help me figure out the issue here?

Related

How to filter data with starting and ending conditions?

I'm trying to filter my data based on two conditions dependent on sequential dates.
I am looking for values below 2 for 5+ sequential dates,
with a "cushion period" of values 2 to 5 for up to 3 sequential days.
It would look something like this (sorry for the terrible excel attempt here):
Day 1 to Day 10 would be included and day 11 would not be. Days 6 to 8 would be considered the "cushion period." I hope this makes sense!!
Right now, I am able to get only the cushion period (in the reprex), but I can't figure out how to add the starting and ending condition, so that values under 2 for 5 sequential dates are included (the 5 days could be broken up with the cushion period in between, but I feel like this might complicate things).
Any help would be GREATLY appreciated!
For my reprex (below), the dates that would be included in the final df are in blue (dates from 1/1/2000 to 1/9/2000, and 1/22/2000 to 1/30/2000) and the dates in grey would not be.
Reprex:
library("dplyr")
#Goal: include all values with values of 2 or less for 5 consecutive days and allow for a "cushion" period of values of 2 to 5 for up to 3 days
data <- data.frame(Date = c("2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07", "2000-01-08", "2000-01-09", "2000-01-10", "2000-01-11", "2000-01-12", "2000-01-13", "2000-01-14", "2000-01-15", "2000-01-16", "2000-01-17", "2000-01-18", "2000-01-19", "2000-01-20", "2000-01-21", "2000-01-22", "2000-01-23", "2000-01-24", "2000-01-25", "2000-01-26", "2000-01-27", "2000-01-28", "2000-01-29", "2000-01-30"),
                   Value = c(2,3,4,5,2,2,1,0,1,8,7,9,4,5,2,3,4,5,7,2,6,0,2,1,2,0,3,4,0,1))
head(data)
#Goal: values should include dates from 1/1/2000 to 1/9/2000, and 1/22/2000 to 1/30/2000
#I am able to subset the "cushion period" but I'm not sure how to add the starting and ending conditions for it
attempt1 <- data %>%
  group_by(group_id = as.integer(gl(n(),3,n()))) %>%
  filter(Value <= 5 & Value >=3) %>%
  ungroup() %>%
  select(-group_id)
head(attempt1)

If I understand correctly, you need to keep runs of consecutive values that are below or equal to 5, provided the run contains at least 5 consecutive values below or equal to 2. Here's a way to do that, with some explanation:
library(dplyr)
data %>%
  mutate(under_three = Value <= 2) %>%
  # under_three = TRUE if Value is below or equal to 2
  group_by(rl_two = data.table::rleid(Value <= 2)) %>%
  # group by runs of consecutive rows that share the same under_three value
  mutate(big = n() >= 5 & all(under_three)) %>%
  # big = TRUE if the run has 5 or more consecutive values that are below or equal to 2
  group_by(rl_five = data.table::rleid(Value <= 5)) %>%
  # drop the rl_two grouping and group by rl_five, i.e. runs of consecutive values that are below or equal to 5
  filter(any(big))
  # keep the rl_five groups that contain at least one big = TRUE; remove the other groups
Output (after piping the filtered result through ungroup() and select(Date, Value)):
Date Value
1 2000-01-01 2
2 2000-01-02 3
3 2000-01-03 4
4 2000-01-04 5
5 2000-01-05 2
6 2000-01-06 2
7 2000-01-07 1
8 2000-01-08 0
9 2000-01-09 1
10 2000-01-22 0
11 2000-01-23 2
12 2000-01-24 1
13 2000-01-25 2
14 2000-01-26 0
15 2000-01-27 3
16 2000-01-28 4
17 2000-01-29 0
18 2000-01-30 1
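The same run-length idea can be sketched outside R. The following is a minimal Python illustration, not part of the original answer; the function name `keep_runs` and its parameters are hypothetical. It splits the series into runs of values at most 5 and keeps a run only if it contains at least 5 consecutive values at most 2.

```python
from itertools import groupby

def keep_runs(values, low=2, high=5, min_low_run=5):
    """Return 0-based indices of values that sit in a run of values <= high
    containing at least min_low_run consecutive values <= low."""
    keep = []
    pos = 0
    for within_high, run in groupby(values, key=lambda v: v <= high):
        run = list(run)
        if within_high:
            # Longest stretch of consecutive values <= low inside this run.
            longest = max(
                (len(list(g))
                 for is_low, g in groupby(run, key=lambda v: v <= low)
                 if is_low),
                default=0,
            )
            if longest >= min_low_run:
                keep.extend(range(pos, pos + len(run)))
        pos += len(run)
    return keep

# Values from the reprex, one per day starting 2000-01-01.
values = [2, 3, 4, 5, 2, 2, 1, 0, 1, 8, 7, 9, 4, 5, 2,
          3, 4, 5, 7, 2, 6, 0, 2, 1, 2, 0, 3, 4, 0, 1]
kept_days = [i + 1 for i in keep_runs(values)]
print(kept_days)  # days 1 to 9 and 22 to 30
```

Like the dplyr version, this sketch does not enforce the "up to 3 days" limit on the cushion period; it only requires the cushion values to stay at or below 5.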

Table sort by month

I have a table in MATLAB with attributes in the first three columns and data from the fourth column onwards. I was trying to sort the entire table based on the first three columns. However, one of the columns (Column C) contains months ('January', 'February' ...etc). The sortrows function would only let me choose 'ascend' or 'descend' but not a custom option to sort by month. Any help would be greatly appreciated. Below is the code I used.
sortrows(Table, {'Column A','Column B','Column C'} , {'ascend' , 'ascend' , '???' } )

As @AnonSubmitter85 suggested, the best thing you can do is to convert your month names to numeric values from 1 (January) to 12 (December), as follows:
c = {
7 1 'February';
1 0 'April';
2 1 'December';
2 1 'January';
5 1 'January';
};
t = cell2table(c,'VariableNames',{'ColumnA' 'ColumnB' 'ColumnC'});
t.ColumnC = month(datenum(t.ColumnC,'mmmm'));
ColumnC can then be sorted with a standard criterion like the other columns (ascending in this example):
t = sortrows(t,{'ColumnA' 'ColumnB' 'ColumnC'},{'ascend', 'ascend', 'ascend'});
If, for whatever reason, you are forced to keep your months as text, you can use a workaround: sort a clone of the table using the approach described above, then apply the resulting row indices to the original table:
c = {
7 1 'February';
1 0 'April';
2 1 'December';
2 1 'January';
5 1 'January';
};
t_original = cell2table(c,'VariableNames',{'ColumnA' 'ColumnB' 'ColumnC'});
t_clone = t_original;
t_clone.ColumnC = month(datenum(t_clone.ColumnC,'mmmm'));
[~,idx] = sortrows(t_clone,{'ColumnA' 'ColumnB' 'ColumnC'},{'ascend', 'ascend', 'ascend'});
t_original = t_original(idx,:);
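For comparison, here is a minimal Python sketch of the same idea, included only as an illustration: the month names stay as text in the rows, and a numeric month index is used purely as the sort key. The names `MONTHS`, `MONTH_NUMBER` and `rows` are made up for this example.

```python
# Full English month names in calendar order (hard-coded so the example
# does not depend on the system locale).
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
MONTH_NUMBER = {name: i + 1 for i, name in enumerate(MONTHS)}

# Same rows as the MATLAB example: (ColumnA, ColumnB, ColumnC).
rows = [
    (7, 1, "February"),
    (1, 0, "April"),
    (2, 1, "December"),
    (2, 1, "January"),
    (5, 1, "January"),
]

# Sort by ColumnA, then ColumnB, then the month number of ColumnC.
sorted_rows = sorted(rows, key=lambda r: (r[0], r[1], MONTH_NUMBER[r[2]]))
for row in sorted_rows:
    print(row)
```

The two (2, 1, ...) rows come out as January before December, which a plain alphabetical sort would reverse.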

flatten matlab table by key

I have a large table whose entries are
KEY_A,KEY_B,VAL
where KEY_A and KEY_B are finite sets of keys. For argument's sake, we'll have 4 different KEY_B values and 4 different KEY_A values. An example table:
KEY_A KEY_B KEY_C
_____ _____ _________
1 1 0.45054
1 2 0.083821
1 3 0.22898
1 4 0.91334
2 1 0.15238
2 2 0.82582
2 3 0.53834
2 4 0.99613
3 1 0.078176
3 2 0.44268
3 3 0.10665
3 4 0.9619
4 1 0.0046342
4 2 0.77491
4 3 0.8173
4 4 0.86869
4 5 1
I want to elegantly flatten the table into
KEY_A KEY_B_1 KEY_B_2 KEY_B_3 KEY_B_4 KEY_B_5
_____ _________ ________ _______ _______ _______
1 0.45054 0.083821 0.22898 0.91334 -1
2 0.15238 0.82582 0.53834 0.99613 -1
3 0.078176 0.44268 0.10665 0.9619 -1
4 0.0046342 0.77491 0.8173 0.86869 1
I'd like to be able to handle missing B values (set them to a default like -1), but I think that once I have an elegant way to do the basic flattening, such things will fall into place.
The actual table has millions of records, so I do want to use a vectorized call.
The line I've got (which doesn't handle the invalid 5) is:
cell2mat(arrayfun(@(x)[x,testtable{testtable.KEY_A==x,3}'],unique(testtable{:,1}),'UniformOutput',false))
But:
it doesn't output a different table;
if there are missing keys in the table, it doesn't handle that.
I would think that this isn't that uncommon an activity... has anyone done something like this before?

If the input table is T, then you could try this for the given case -
KEY_B_ =-1.*ones(max(T.KEY_A),max(T.KEY_B))
KEY_B_(sub2ind(size(KEY_B_),T.KEY_A,T.KEY_B)) = T.KEY_C
T1 = array2table(KEY_B_)
Output for the edited input -
T1 =
KEY_B_1 KEY_B_2 KEY_B_3 KEY_B_4 KEY_B_5
_________ ________ _______ _______ _______
0.45054 0.083821 0.22898 0.91334 -1
0.15238 0.82582 0.53834 0.99613 -1
0.078176 0.44268 0.10665 0.9619 -1
0.0046342 0.77491 0.8173 0.86869 1
Edit by MadScienceDreams: This answer led me to write the following function, which will smash together pretty much any table based on the input keys. Enjoy!
function [ OT ] = flatten_table( T,primary_keys,secondary_keys,value_key,default_value )
%FLATTEN_TABLE Pivot a long table into one row per primary key.
%   Each distinct secondary key value becomes a column; combinations that
%   are missing from T are filled with default_value (NaN if omitted).
    if nargin < 5
        default_value = {NaN};
    end
    % Accept plain values as well as cell arrays for every argument.
    if ~iscell(default_value)
        default_value={default_value};
    end
    if ~iscell(primary_keys)
        primary_keys={primary_keys};
    end
    if ~iscell(secondary_keys)
        secondary_keys={secondary_keys};
    end
    if ~iscell(value_key)
        value_key={value_key};
    end
    % Row index of each record in the output.
    primary_key_values = unique(T(:,primary_keys));
    num_primary = size(primary_key_values,1);
    [~,primary_key_map] = ismember(T(:,primary_keys),primary_key_values);
    % Column index of each record in the output.
    secondary_key_values = unique(T(:,secondary_keys));
    num_secondary = size(secondary_key_values,1);
    [~,secondary_key_map] = ismember(T(:,secondary_keys),secondary_key_values);
    try
        values = num2cell(T{:,value_key},2);
    catch
        values = num2cell(table2cell(T(:,value_key)),2);
    end
    if (~iscell(values))
        values=num2cell(values);
    end
    % Start from a grid of defaults, then scatter the values into place.
    OT=repmat(default_value,num_primary,num_secondary);
    OT(sub2ind(size(OT),primary_key_map,secondary_key_map)) = values;
    % Build column names such as KEY_B_1, KEY_B_2, ...
    label_array = num2cell(cellfun(@(x,y)[x '_' mat2str(y)],...
        repmat(secondary_keys,size(secondary_key_values,1),1),...
        table2cell(secondary_key_values),'UniformOutput',false),1);
    label_array = strcat(label_array{:});
    OT = [primary_key_values,cell2table(OT,'VariableNames',label_array)];
end
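The core of both answers is "allocate a grid of defaults, then scatter each value to its (row, column) position". A minimal Python sketch of that step is below, as an illustration only; the function name `flatten` and the shortened sample records are assumptions, not part of the original answers.

```python
def flatten(records, default=-1):
    """Pivot (key_a, key_b, value) records into one row per key_a.

    Returns the sorted list of key_b values (the column order) and a dict
    mapping each key_a to its row; missing combinations hold `default`.
    """
    a_keys = sorted({a for a, _, _ in records})
    b_keys = sorted({b for _, b, _ in records})
    column = {b: j for j, b in enumerate(b_keys)}
    # Grid of defaults, one row per key_a.
    rows = {a: [default] * len(b_keys) for a in a_keys}
    # Scatter each value into its row and column.
    for a, b, value in records:
        rows[a][column[b]] = value
    return b_keys, rows

# A shortened version of the question's table; only key_a = 4 has key_b = 5.
records = [
    (1, 1, 0.45054), (1, 2, 0.083821),
    (4, 1, 0.0046342), (4, 2, 0.77491), (4, 5, 1),
]
b_keys, rows = flatten(records)
print(b_keys)
for key_a, row in rows.items():
    print(key_a, row)
```

This makes a single pass over the records, so it scales linearly with the table size, though a plain Python loop is not vectorized in the sense the question asks for.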

Coffeescript - Improve algorithm for increasing grouping

The code below works, but I am wondering if there is a better way that uses some of the features of CoffeeScript that I am unfamiliar with.
The problem is this: I need to page items, but the page size increases each time.
If I take the number 20, for example, it would create the following pages:
1 - 3
4 - 7
8 - 15
16 - 20
I have the following test and code which does pass:
module 'Remainder',
  setup: ->
    @remainder = 20
test 'splits remainder incrementally', ->
  parts = @remainder.increasingSplit()
  equal parts[0], '1 - 3', ''
  equal parts[1], '4 - 7', ''
  equal parts[2], '8 - 15', ''
  equal parts[3], '16 - 20', ''
Number.prototype.increasingSplit = ->
  start = 1
  nextSplit = 3
  parts = []
  finished = false
  while !finished
    if nextSplit > @
      parts.push "#{start} - #{@}"
      break
    parts.push "#{start} - #{nextSplit}"
    start = nextSplit + 1
    nextSplit = nextSplit * 2 + 1
  parts

Without changing the algorithm too much, you can try this:
Number::increasingSplit = ->
  start = 1
  nextSplit = 3
  parts = []
  while start <= @
    parts.push "#{start} - #{Math.min nextSplit, @}"
    start = nextSplit + 1
    nextSplit = nextSplit * 2 + 1
  parts
The changes were:
replacing .prototype with ::,
removing the finished variable (which served no purpose, because the loop always exited through the break) along with the break itself, and changing the loop condition to start <= @,
using a single parts.push <part>, with the minimum of nextSplit and @ as the upper bound.
Also, I'd advise against extending the Number prototype in this case. Extending the prototypes of primitive types can sometimes cause weird problems, like:
Number::isFour = -> @ is 4
console.log 4.isFour() # -> false
That happens because inside that function @ will be a Number object instead of a primitive number, so the === 4 comparison always fails. That would not happen if you defined isFour as a standalone function:
isFour = (n) -> n is 4
console.log isFour 4 # -> true
So, I'd prefer this version of increasingSplit:
increasingSplit = (n) ->
  start = 1
  nextSplit = 3
  parts = []
  while start <= n
    parts.push "#{start} - #{Math.min nextSplit, n}"
    start = nextSplit + 1
    nextSplit = nextSplit * 2 + 1
  parts
Finally, if you don't mind recursion, you can go with a more FP-style algorithm :)
increasingSplit = (n, start = 1, nextSplit = 3) ->
  if start > n
    []
  else
    part = "#{start} - #{Math.min nextSplit, n}"
    rest = increasingSplit n, nextSplit + 1, nextSplit * 2 + 1
    [part, rest...]
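To check the loop logic independently of CoffeeScript, here is a minimal Python sketch of the standalone version; the name `increasing_split` is simply the Python-style spelling chosen for this illustration.

```python
def increasing_split(n):
    """Split 1..n into pages whose upper bounds are 3, 7, 15, 31, ..."""
    start = 1
    next_split = 3
    parts = []
    while start <= n:
        # Clamp the final page to n.
        parts.append(f"{start} - {min(next_split, n)}")
        start = next_split + 1
        next_split = next_split * 2 + 1
    return parts

pages = increasing_split(20)
print(pages)  # ['1 - 3', '4 - 7', '8 - 15', '16 - 20']
```

When n falls exactly on a boundary (for example 7), the loop stops without emitting an empty trailing page, because start then becomes 8 and fails the start <= n check.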

Calculations with Real Numbers, Verilog HDL

I noticed that Verilog rounds my real-number results to integers. For example, when I look at the simulator, it shows the result of 17/2 as 9. What should I do? Is there any way to define something like: output real reg [11:0] output_value? Or is it something that has to be done in the simulator settings?
Simulation only (no synthesis). Example:
x is defined as a signed input and output_value as an output reg.
output_value = ((x >>> 1) + x) + 5;
If x = +1, then the output value has to be 13/2 = 6.5.
However, when I simulate, I see output_value = 6.

Code would help, but I suspect you're not dividing reals at all. 17 and 2 are integers, so a simple expression like that will do integer division.
17 / 2 = 8 (not 9, always rounds towards 0)
17.0 / 2.0 = 8.5
In your second case
output_value = ((x >>> 1) + x) + 5
If x is 1, x >>> 1 is 0, not 0.5 because you've just gone off the bottom of the word.
output_value = ((1 >>> 1) + 1) + 5 = 0 + 1 + 5 = 6
There's nothing special about Verilog here. This is true for the majority of languages.
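As a quick illustration of that last point, the same arithmetic can be reproduced in Python (used here only as a convenient stand-in, not as Verilog). One caveat: Python's `//` rounds toward negative infinity, whereas Verilog integer division truncates toward zero; the two agree for the positive operands used here.

```python
x = 1

int_div = 17 // 2               # integer operands, integer division
real_div = 17.0 / 2.0           # real operands keep the fraction
shifted = x >> 1                # the low bit is shifted out, leaving 0

int_result = ((x >> 1) + x) + 5    # what the integer expression computes
real_result = x / 2.0 + x + 5      # what the asker expected

print(int_div, real_div, shifted, int_result, real_result)
# prints: 8 8.5 0 6 6.5
```

Getting 6.5 requires the halving to be done in real arithmetic; a right shift on an integer can only discard the fractional bit.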
