ArangoDB: Differences between filter and expandFilter in Foxx traversal

In the traversal config options, there are two settings that seem to do the same thing: filter and expandFilter. Is there any difference between them?

While filter is used to limit which vertices end up in the result of a traversal, expandFilter can exclude certain edges from the traversal.
filter: vertex filter function. The function signature is function (config, vertex, path). It may return one of the following values:
undefined: vertex will be included in the result and connected edges will be traversed
"exclude": vertex will not be included in the result and connected edges will be traversed
"prune": vertex will be included in the result but connected edges will not be traversed
[ "prune", "exclude" ]: vertex will not be included in the result and connected edges will not be returned
expandFilter: filter function applied on each edge/vertex combination determined by the expander. The function signature is function (config, vertex, edge, path). The function should return true if the edge/vertex combination should be processed, and false if it should be ignored.
This is documented in the ArangoDB Manual.
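As an illustration, the two hooks could be combined in one traversal config like this; this is only a sketch, in which the graph name "myGraph", the vertex attribute isActive, the edge attribute hidden, and startVertex are all made up (the module path is the 2.x one; in 3.x it is "@arangodb/graph/traversal"):

var traversal = require("org/arangodb/graph/traversal");

var config = {
  datasource: traversal.generalGraphDatasourceFactory("myGraph"),
  strategy: "depthfirst",
  expander: traversal.outboundExpander,
  // vertex filter: drop inactive vertices from the result,
  // but keep traversing through their edges
  filter: function (config, vertex, path) {
    if (!vertex.isActive) {
      return "exclude";
    }
    // returning undefined includes the vertex and keeps traversing
  },
  // edge filter: never follow edges flagged as hidden
  expandFilter: function (config, vertex, edge, path) {
    return edge.hidden !== true;
  },
  visitor: function (config, result, vertex, path) {
    result.vertices.push(vertex._key);
  }
};

var result = { vertices: [] };
var traverser = new traversal.Traverser(config);
traverser.traverse(result, startVertex);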


Networkx create graph from adjacency matrix without edge weights

When I call G = nx.convert_matrix.from_numpy_array(A, create_using=nx.DiGraph), where A is a 0-1 adjacency matrix, the resulting graph automatically contains edge weights of 1.0 for each edge. How can I prevent this attribute from being added?
I realize I can write
for _, _, d in G.edges(data=True):
    d.clear()
but I would prefer if the attributes were not added in the first place.
There is no way to do that with native networkx functions. This is how you can do it:
# build the graph manually so that no 'weight' attribute is ever attached
G = nx.empty_graph(0, nx.DiGraph)
G.add_nodes_from(range(A.shape[0]))
G.add_edges_from((int(i), int(j)) for i, j in zip(*A.nonzero()))
This is essentially how the nx.convert_matrix.from_numpy_array function is implemented internally; however, I stripped out all of the input checks, so be careful with this. Additional details can be found in the networkx source for that function.
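A quick end-to-end check of the snippet above (the small example matrix is made up for illustration):

import networkx as nx
import numpy as np

# 0-1 adjacency matrix of a 3-cycle
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])

G = nx.empty_graph(0, nx.DiGraph)
G.add_nodes_from(range(A.shape[0]))
G.add_edges_from((int(i), int(j)) for i, j in zip(*A.nonzero()))

print(G.edges(data=True))  # [(0, 1, {}), (1, 2, {}), (2, 0, {})]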

Get nodes from a graph condensation

I have an adjacency matrix adj and a cell array nodeNames that contains names that will be given to the graph G that will be constructed from adj.
So I use G = digraph(adj,nodeNames); and I get the following graph:
Now, I want to find the strongly connected components in G and do a graph condensation, so I use the following:
C = condensation(G);
p2 = plot(C);
and get this result:
So I have 6 strongly connected components, but my problem is that I lost the node names. I want to get something like:
Is there any way to get the node names in the result of the condensation?
I think the official documentation can take you to the right point:
Output Arguments
C - Condensation Graph
Condensation graph, returned as a digraph object. C is a directed acyclic graph (DAG), and is topologically sorted. The node numbers in C correspond to the bin numbers returned by conncomp.
Let's take a look at conncomp:
conncomp(G) returns the connected components of graph G as bins. The bin numbers indicate which component each node in the graph belongs to.
Look at the examples... I think that if you use conncomp on your graph before using the condensation function, you will be able to rebuild your node names on your new graph with a little effort.
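A minimal sketch of that idea, assuming the digraph G from the question (concatenating the member names of each component into one label is just one possible choice):

bins = conncomp(G);              % bin number of each node of G
C = condensation(G);             % node k of C corresponds to bin k
% collect the original node names that fall into each bin
labels = cell(1, max(bins));
for k = 1:max(bins)
    labels{k} = strjoin(G.Nodes.Name(bins == k), ',');
end
p2 = plot(C, 'NodeLabel', labels);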

Define Nonconstant Boundary Conditions "partially" on an edge - MATLAB

I need to define Dirichlet and Neumann conditions in a plane stress problem only on part of one of the edges of a plate.
MATLAB's help for defining nonconstant boundary conditions indicates that functions must be specified like this:
applyBoundaryCondition(model,'edge',1,'r',@myrfun);
applyBoundaryCondition(model,'face',2,'g',@mygfun,'q',@myqfun);
applyBoundaryCondition(model,'edge',[3,4],'u',@myufun,'EquationIndex',[2,3]);
Moreover, it says that each function must have the following syntax:
function bcMatrix = myfun(region,state)
and finally that region is a structure containing the fields region.x (x-coordinates of the points), region.y (y-coordinates of the points), etc. If there are Neumann conditions (which is my case), then solvers pass the following data in the region structure: region.nx (x-component of the normal vector at the evaluation points), etc. My questions are:
From where can I get the structure region?
How can I pass the argument that boundary conditions apply in one part of one of the edges?
Thanks!!
@Oliver,
1) You don't need to build the region structure yourself; the solver constructs it and passes it to your function. You only need to write functions that are capable of using it. Since the boundary condition generally depends on the location, you will need region.x and region.y.
2) You can use region.x and region.y to make the boundary conditions depend on the location. This is one way of applying them "partially" if the type of boundary condition is the same along the edge. Otherwise, you will have to define the split in the boundary explicitly, which probably has to happen while defining the geometry of the problem.
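A minimal sketch of the location-dependent approach for a Neumann ('g') condition that is only active where x > 0.5 on the edge (the threshold, the traction value, and the two-equation plane stress setup are assumptions of this sketch):

function bcMatrix = mygfun(region, state)
% one column per evaluation point, one row per equation (u, v)
bcMatrix = zeros(2, numel(region.x));
% apply a traction only on the part of the edge with x > 0.5
active = region.x > 0.5;
bcMatrix(2, active) = -1e3;   % e.g. a downward traction
end

% applied to edge 1 with:
% applyBoundaryCondition(model,'edge',1,'g',@mygfun);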

How to modify vertex data when calling the mapTriplets method in Graphx of Spark

The mapTriplets operation in GraphX of Spark can transform the triplets into some other form, as the definition describes:
def mapTriplets[ED2](map: EdgeTriplet[VD, ED] => ED2): Graph[VD, ED2]
My data is a sparse bipartite graph, and the vertex data at both ends of an edge will be updated during each iteration. For example, given an edge (srcAttr, dstAttr, attr), the vertex attributes srcAttr and dstAttr will be modified according to attr. Therefore, what I need is to get all (srcAttr, dstAttr, attr) combinations, and use attr to update the vertices.
GraphX provides the mapTriplets method, which can transform all (srcAttr, dstAttr, attr) combinations, but I cannot figure out how to modify the vertices when executing this method.
So, is there any strategy that can modify the vertices while traversing all edges?
I cannot figure out how to modify vertex when executing this method
Because it is simply not possible. First of all, GraphX data structures, like other distributed data structures in Spark, are immutable. Moreover, mapTriplets is designed to transform edges, not vertices.
is there any strategy that can modify the vertices when traversing all edges?
If you want to transform vertices using edge data, then aggregateMessages should give you what you want. It takes two functions:
one from EdgeContext to Unit, which can be used to send messages to the source and/or destination vertices,
a second one which reduces the messages arriving at each vertex,
and it returns a VertexRDD which can be further used to construct a new graph, as sketched below.
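A minimal sketch of that pattern, assuming Double vertex and edge attributes (the update rule, adding the sum of incident edge attributes to each vertex, is made up for illustration):

import org.apache.spark.graphx._

// graph: Graph[Double, Double], built elsewhere
val messages: VertexRDD[Double] = graph.aggregateMessages[Double](
  // send the edge attribute to both endpoints
  ctx => { ctx.sendToSrc(ctx.attr); ctx.sendToDst(ctx.attr) },
  // reduce messages arriving at the same vertex
  (a, b) => a + b
)

// build a new graph with updated vertex attributes
val updated: Graph[Double, Double] =
  graph.outerJoinVertices(messages) { (id, oldAttr, msgOpt) =>
    oldAttr + msgOpt.getOrElse(0.0)
  }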

Cannot get clustering output Mahout

I am running kmeans in Mahout and as an output I get folders clusters-x, clusters-x-final and clusteredPoints.
If I understood correctly, clusters-x are the centroid locations at each iteration, clusters-x-final are the final centroid locations, and clusteredPoints should be the points being clustered, with a cluster id and a weight which represents the probability of belonging to the cluster (depending on the distance between the point and its centroid). On the other hand, clusters-x and clusters-x-final contain the cluster centroids, the number of elements, the feature values of the centroid, and the radius of the cluster (the distance between the centroid and its farthest point).
How do I examine these outputs?
I used the cluster dumper successfully for clusters-x and clusters-x-final from the terminal, but when I used it on clusteredPoints, I got an empty file. What seems to be the problem?
And how can I get these values from code? I mean, the centroid values and the points belonging to clusters?
For clusteredPoints I used IntWritable as the key and WeightedPropertyVectorWritable as the value in a while loop, but it skips the loop as if there were no elements in clusteredPoints.
This is even stranger given that the file I get with the cluster dumper is also empty.
What could be the problem?
Any help would be greatly appreciated!
I believe your interpretation of the data is correct (I've only been working with Mahout for ~3 weeks, so someone more seasoned should probably weigh in on this).
As far as linking points back to the input that created them, I've used NamedVector, where the name is the key for the vector. When you read one of the generated points files (clusteredPoints), you can convert each row (point vector) back into a NamedVector and retrieve the name using .getName().
Update in response to comment
When you initially read your data into Mahout, you convert it into a collection of vectors which you then write to a file (points) for use in the clustering algorithms later. Mahout gives you several Vector types to use, but it also gives you access to a Vector wrapper class called NamedVector which will allow you to identify each vector.
For example, you could create each NamedVector as follows:
NamedVector nVec = new NamedVector(
    new SequentialAccessSparseVector(vectorDimensions),
    vectorName
);
Then you write your collection of NamedVectors to file with something like:
SequenceFile.Writer writer = new SequenceFile.Writer(...);
VectorWritable writable = new VectorWritable();
// the next two lines will be in a loop, but I'm omitting it for clarity
writable.set(nVec);
writer.append(new Text(nVec.getName()), writable);
You can now use this file as input to one of the clustering algorithms.
After you have run one of the clustering algorithms with your points file, it will have generated yet another points file, but in a directory named clusteredPoints.
You can then read in this points file and extract the name you associated with each vector. It'll look something like this:
IntWritable clusterId = new IntWritable();
WeightedPropertyVectorWritable vector = new WeightedPropertyVectorWritable();
while (reader.next(clusterId, vector))
{
    NamedVector nVec = (NamedVector) vector.getVector();
    // you now have access to the original name using nVec.getName()
}
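For completeness, the reader used in the loop above can be opened like this (the output path is made up for illustration; the part file inside clusteredPoints is typically named part-m-0, but check your output directory):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
SequenceFile.Reader reader = new SequenceFile.Reader(
    fs, new Path("output/clusteredPoints/part-m-0"), conf);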
Check the parameter named clusterClassificationThreshold; it should be 0. You can check this mailing-list thread: http://mail-archives.apache.org/mod_mbox/mahout-user/201211.mbox/%3C50B62629.5020700@windwardsolutions.com%3E
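As a hedged sketch of where that parameter goes when k-means is driven from code (assuming Mahout 0.7+, where KMeansDriver.run takes the threshold as an argument; the paths and numeric parameters below are made up, and the 0.0 threshold is the relevant part):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.mahout.clustering.kmeans.KMeansDriver;

KMeansDriver.run(
    new Configuration(),
    new Path("points"),            // input vectors
    new Path("initial-clusters"),  // initial centroids
    new Path("output"),
    0.001,                         // convergence delta
    10,                            // max iterations
    true,                          // runClustering: emit clusteredPoints
    0.0,                           // clusterClassificationThreshold
    false                          // runSequential
);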