Include Entry node in AQL Graph traversal - nosql

I'm using AQL to traverse graphs. This is my current statement:
FOR v, e, p IN 1..1 ANY 'Bridges/1004'
GRAPH 'S_Graph'
FILTER not (p.vertices[1].IID != 'null' AND p.vertices[1].cls_name == "Bridge")
OR p.vertices[1].cls_name == "Node"
RETURN v
The result contains the documents connected to my entry document Bridges/1004, but not the entry document itself.
How is it possible to include the Entry-Document in the Query-Result?

Just change the traversal depth from 1..1 to 0..1 and that should include the initial node.
FOR v, e, p IN 0..1 ANY 'Bridges/1004'
GRAPH 'S_Graph'
FILTER not (p.vertices[1].IID != 'null' AND p.vertices[1].cls_name == "Bridge")
OR p.vertices[1].cls_name == "Node"
RETURN v
Also note that if you return the path (p) in your original query, it includes every node on the path, including the original node.

Related

Huffman Encoding: How to find the path?

I have a tree with the most frequent letters at the top, and I am trying to find the path to a node so I can turn a message into binary. In this program, if I go left I add 0 to path, and if I go right I add 1 to path, until I find the node. But that only works if I go straight to the desired node, which is not possible. The only thing I could think of is removing the last character of path if a node has no children, but that does not work if a node has grandchildren. Can someone help me with how to approach this? Thanks!
// global variables
String path;
int mark;

// encodes a message to binary
String encoder(char data) {
    path = "";
    mark = 0;
    findPath(root, data);
    return path;
}

// finds the path to a node
void findPath(TNode node, char data) {
    if(node.data == data) {
        mark = 1;
        return;
    }
    if(mark==0 && node.left != null) {
        path += 0;
        findPath(node.left, data);
    }
    if(mark==0 && node.right != null) {
        path += 1;
        findPath(node.right, data);
    }
    if(mark==0 && node.left == null || node.right == null) {
        path = path.substring(0, path.length() - 1);
    }
}
This is not how Huffman coding works. Not at all. You assign a code to every symbol, and the codes can have different lengths.
An algorithm to determine the codes: Take the two symbols with the lowest frequency, say a and b. Combine them into one symbol x whose frequency is the sum of the two. When we are finished, the code for a is the code for x plus a zero, and the code for b is the code for x plus a one. Then look again for the two symbols with the lowest frequency, combine them, and so on. When you are down to two symbols, give them the codes 0 and 1, then work backwards to find the codes for all the other symbols.
Example: a, b, c, d with frequencies 3, 47, 2 and 109. Combine a and c to x with frequency 5. Combine x and b to y with frequency 52. Then d = code 0 and y = code 1, so x = code 10 and b = code 11, and finally a = code 100 and c = code 101.
You would not encode messages using the tree. You should instead traverse the entire tree once recursively, generating all of the codes for all of the symbols, and make a table of the symbols and their associated codes. Then you use that table to encode your messages.
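A minimal Python sketch of the table-based approach described above, assuming single-character symbols and a dict of frequencies. The names build_huffman_tree, build_code_table and encode are illustrative and do not come from the original posts. Which branch gets 0 and which gets 1 is arbitrary, so the exact bits may differ from the worked example, but the code lengths should agree.

```python
import heapq
from itertools import count


def build_huffman_tree(freqs):
    """Repeatedly combine the two lowest-frequency entries into one node."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # An internal node is a (left, right) tuple; a leaf is the symbol itself.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    return heap[0][2]


def build_code_table(node, prefix="", table=None):
    """Traverse the whole tree once, recording the code of every symbol."""
    if table is None:
        table = {}
    if isinstance(node, tuple):
        build_code_table(node[0], prefix + "0", table)
        build_code_table(node[1], prefix + "1", table)
    else:
        table[node] = prefix or "0"  # a single-symbol alphabet still needs one bit
    return table


def encode(message, table):
    """Encode a message by table lookup, without touching the tree again."""
    return "".join(table[ch] for ch in message)


table = build_code_table(build_huffman_tree({"a": 3, "b": 47, "c": 2, "d": 109}))
print(table)
print(encode("dab", table))
```

The tree is only needed while building the table (and later for decoding); encoding a message is then a plain lookup per character.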

Applying an operation on each element of nested list

I have a complex nested list (depth can be >2 also):
p:((`g;`d1`d2);(`r;enlist `e1);(`r;enlist `p1))
How can I add an element to each element of the nested list while retaining the original structure? For example, adding `h to each element of p should give the following:
((`g`h;(`d1`h;`d2`h));(`r`h;enlist `e1`h);(`r`h;enlist `p1`h))
I tried this, but it doesn't give what I want:
q)p,\:`h
((`g;`d1`d2;`h);(`r;enlist `e1;`h);(`r;enlist `p1;`h))
q)raze[p],\:`h
(`g`h;`d1`d2`h;`r`h;`e1`h;`r`h;`p1`h)
You can use .z.s to recursively go through the nested list and only append `h to lists of symbols:
q){$[0=type x;.z.s'[x];x,\:`h]}p
`g`h `d1`h `d2`h
`r`h ,`e1`h
`r`h ,`p1`h
For this function I have assumed that your nested lists contain only symbols. It checks the type of the list: if it is not a mixed list, it appends `h to each element; if it is a mixed list, it passes each element of that list back into the function separately to check again.
Although not recursive (and so it requires some knowledge about the shape of your nested list), a more conventional approach would be:
q).[p;2#(::);,';`h]
`g`h `d1`h `d2`h
`r`h ,`e1`h
`r`h ,`p1`h
Thomas has already answered the question, but in case you want to apply an operation other than append, you can use the following:
q)f:{`$ "_" sv string x,y}
q){[o;a;e] $[-11<>type e; .z.s [o;a] each e; o[e;a]] }[f;`h] each p
`g_h `d1_h`d2_h
`r_h ,`e1_h
`r_h ,`p1_h
Or, when f is defined as the append operation:
q)f:{x,y}
q){[o;a;e] $[-11<>type e; .z.s [o;a] each e; o[e;a]] }[f;`h] each p
`g`h `d1`h `d2`h
`r`h ,`e1`h
`r`h ,`p1`h
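For readers less familiar with q, the recursive idea can be sketched in Python. This is only an analogy under stated assumptions: Python lists stand in for q lists, strings stand in for symbols, and apply_to_leaves is a hypothetical name. The function recurses into anything that is a list and applies the operation to everything else, so the nesting is preserved.

```python
def apply_to_leaves(op, extra, x):
    """Apply op(atom, extra) to every non-list element, keeping the nesting."""
    if isinstance(x, list):
        return [apply_to_leaves(op, extra, item) for item in x]
    return op(x, extra)


p = [["g", ["d1", "d2"]], ["r", ["e1"]], ["r", ["p1"]]]

# Append: every atom becomes a two-element list, like `g becoming `g`h in q.
appended = apply_to_leaves(lambda a, b: [a, b], "h", p)

# Join with an underscore, like the f defined with sv in the answer above.
joined = apply_to_leaves(lambda a, b: f"{a}_{b}", "h", p)

print(appended)
print(joined)
```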

Traversing a tree and assigning a subtree in Julia

I am trying to manipulate a tree in Julia. The tree is created as an object. All I want is to substitute one of the branches with another one. I can do it manually, but I cannot do it with a recursive function.
mutable struct ILeaf
majority::Any # +1 when prediction is correct
values::Vector # num_of_samples
indicies::Any # holds the index of training samples
end
mutable struct INode
featid::Integer
featval::Any
left::Union{ILeaf,INode}
right::Union{ILeaf,INode}
end
ILeafOrNode = Union{ILeaf,INode}
My function for changing the tree is below (tree is the original one; using lr_stack, I want to replace one of its branches with subtree):
function traverse_and_assign(tree, subtree, lr_stack) # by using Global LR_stack
    if top(lr_stack) == 0
        tree = subtree
    elseif top(lr_stack) == :LEFT
        pop!(lr_stack)
        return traverse_and_assign(tree.left, subtree, lr_stack)
    else # right otherwise
        pop!(lr_stack)
        return traverse_and_assign(tree.right, lr_stack)
    end
end
What happens is that I cannot change the original tree.
On the other hand :
tree.left.left = subtree
works perfectly fine.
What is wrong with my code? Do I have to write a macro for this?
B.R.
edit#1
In order to generate data :
n, m = 10^3, 5 ;
features = randn(n, m);
labels = rand(1:2, n);
edit#2
use 100 samples for training the decision tree :
base_learner = build_iterative_tree(labels, features, [1:20;])
then give other samples one by one :
i = 21
feature = features[21, :], label = labels[21]
gtree_stack, lr_stack = enter_iterate_on_tree(base_learner, feature[:], i, label[1])
get the indices of incorrect samples
ids = subtree_ids(gtree_stack)
build the subtree:
subtree = build_iterative_tree(l, f, ids)
update the original tree(base_learner):
traverse_and_assign(base_learner, subtree, lr_stack)
A minimal working example (MWE) is still missing, but maybe I can help with one problem without it.
In Julia, values are bound to variables, and function parameters are new variables. Let's test what that means:
function test_assign!(tree, subtree)
tree = subtree
return tree
end
a = 4;
b = 5;
test_assign!(a, b) # return 5
show(a) # 4 ! a is not changed!
What happened? The value 4 was bound to tree and the value 5 was bound to subtree.
Then subtree's value (5) was bound to tree.
And nothing else! That means a is still bound to 4.
How could we change a? This will work:
mutable struct SimplifiedNode
featid::Integer
end
function test_assign!(tree, subtree)
tree.featid = subtree.featid
end
a = SimplifiedNode(4)
b = SimplifiedNode(5)
test_assign!(a, b)
show(a) # SimplifiedNode(5)
Why? What happened?
The value of a (which is something like a pointer to a mutable struct) is bound to tree, and the value of b is bound to subtree.
So a and tree are bound to the same structure! That means that if we change that structure, a is bound to the changed structure.
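The same binding rules hold in Python, so the consequence for the tree problem can be sketched there. This is a Python analogy, not Julia code, and it rests on assumptions: Node, rebind_only and assign_subtree are hypothetical names, and the path is a non-empty list of "left"/"right" steps read from the root. The idea is to stop one step before the target and assign to the parent's field, because that mutates the shared structure instead of rebinding a local name.

```python
class Node:
    def __init__(self, name, left=None, right=None):
        self.name = name
        self.left = left
        self.right = right


def rebind_only(tree, subtree):
    # Rebinds the local name `tree`; the caller's tree is untouched.
    tree = subtree
    return tree


def assign_subtree(tree, subtree, path):
    """Replace the branch reached by `path` (non-empty) with `subtree`."""
    parent = tree
    for step in path[:-1]:
        parent = getattr(parent, step)
    # Mutating the parent's field changes the structure the caller also sees.
    setattr(parent, path[-1], subtree)
    return tree


root = Node("root", Node("l", Node("ll")), Node("r"))
new_branch = Node("new")

rebind_only(root.left, new_branch)
print(root.left.name)        # still the old branch

assign_subtree(root, new_branch, ["left", "left"])
print(root.left.left.name)   # now the new branch
```

This mirrors why tree.left.left = subtree works in the question while tree = subtree inside the function does not.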

Where in the sequence of a Probabilistic Suffix Tree does "e" occur?

In my data, missing values (*) occur only on the right side of the sequences. That means that no sequence starts with * and no sequence has any other markers after *. Despite this, the PST (Probabilistic Suffix Tree) seems to predict a 90% chance of starting with a *. Here's my code:
# Load libraries
library(RCurl)
library(TraMineR)
library(PST)
# Get data
x <- getURL("https://gist.githubusercontent.com/aronlindberg/08228977353bf6dc2edb3ec121f54a29/raw/c2539d06771317c5f4c8d3a2052a73fc485a09c6/challenge_level.csv")
data <- read.csv(text = x)
# Load and transform data
data <- read.table("thread_level.csv", sep = ",", header = F, stringsAsFactors = F)
# Create sequence object
data.seq <- seqdef(data[2:nrow(data),2:ncol(data)], missing = NA, right= NA, nr = "*")
# Make a tree
S1 <- pstree(data.seq, ymin = 0.05, L = 6, lik = TRUE, with.missing = TRUE)
# Look at first state
cmine(S1, pmin = 0, state = "N3", l = 1)
This generates:
[>] context: e
EX FA I1 I2 I3 N1 N2 N3 NR
S1 0.006821066 0.01107234 0.01218274 0.01208756 0.006821066 0.002569797 0.003299492 0.001554569 0.0161802
QU TR *
S1 0.01126269 0.006440355 0.9097081
How can the probability for * be 0.9097081 at the very beginning of the sequence, meaning after context e?
Does it mean that the context can appear anywhere inside a sequence, and that e denotes an arbitrary starting point somewhere inside a sequence?
A PST is a representation of a variable length Markov model (VLMC). Like a classical Markov model, a VLMC is assumed to be homogeneous (or stationary), meaning that the conditional probabilities of the outcome given the context are the same at each position in the sequence. In other words, the context can appear anywhere in the sequence. Indeed, the search for contexts is done by exploring a tree that is assumed to apply anywhere in the sequences.
In your example, for l=1 (l is 1 + the length of the context), you look only for 0-length context, i.e., the only possible context is the empty sequence e. Your condition pmin=0, state=N3 (have a probability greater than 0 for N3) is equivalent to no condition at all. So you get the overall probability to observe each state. Because your sequences (with the missing states) are all of the same length, you would get the same results using TraMineR with
seqmeant(data.seq, with.missing=TRUE)/max(seqlength(data.seq))
To get the distribution at the first position, you can use TraMineR and look at the first column of the table of cross-sectional distributions at the successive positions returned by
seqstatd(data.seq, with.missing=TRUE)
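To make the difference concrete, here is a small Python sketch on made-up toy sequences (not the data from the question). overall_distribution and positional_distribution are hypothetical helper names, not functions from TraMineR or PST. With right-padded missing states, * can dominate the overall distribution, which is what the empty context reports, while never occurring at the first position.

```python
from collections import Counter


def overall_distribution(seqs):
    """Share of each state over all positions of all sequences."""
    counts = Counter(state for seq in seqs for state in seq)
    total = sum(counts.values())
    return {state: c / total for state, c in counts.items()}


def positional_distribution(seqs, pos):
    """Share of each state at one position across sequences."""
    counts = Counter(seq[pos] for seq in seqs)
    return {state: c / len(seqs) for state, c in counts.items()}


# Toy data: equal-length sequences, missing states (*) only on the right.
seqs = [
    ["N3", "QU"] + ["*"] * 8,
    ["EX"] + ["*"] * 9,
]

print(overall_distribution(seqs))        # "*" has the largest share here
print(positional_distribution(seqs, 0))  # "*" does not appear at position 0
```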
Hope this helps.

Explanation of merge function

I saw some pseudocode in some old exam and I can't really figure out what it's doing.
Can anyone explain it to me?
A and B are BSTs.
Foo(A,B)
    if A = NULL
        return B
    if B != NULL
        if value[A] > value[B]
            return Foo(B,A)
        left[B] <- Foo(right[A],left[B])
        right[A] <- B
    return A
This is a binary search tree merge routine. If either A or B is null (representing an empty tree), it returns the other. Otherwise, it makes sure that the root of A is less than the root of B; if the roots are in the wrong order, it recurses with the arguments swapped. Then, it recursively merges the right subtree of A and the left subtree of B, and attaches the result as the left subtree of B. Finally, it attaches B as the new right subtree of A and returns A.
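The steps above can be traced with a direct Python transliteration of the pseudocode. This is a sketch under assumptions: Node, foo and in_order are illustrative names, and None plays the role of NULL. The sketch makes no claim that the result is a valid search tree for arbitrary inputs; the usage example picks trees where every key of A is smaller than every key of B.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def foo(a, b):
    """Transliteration of Foo(A, B) from the pseudocode."""
    if a is None:
        return b
    if b is not None:
        if a.value > b.value:
            return foo(b, a)             # make sure the smaller root comes first
        b.left = foo(a.right, b.left)    # merge A's right subtree with B's left
        a.right = b                      # hang B under A as its right subtree
    return a


def in_order(node):
    """In-order list of values, used to inspect the merged tree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


a = Node(1, right=Node(2))
b = Node(5, left=Node(4))
merged = foo(a, b)
print(in_order(merged))
```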