I have the following case:
(p:p1)
V1 ---E1---\
\
V2 (with properties)
/
V1 ---E1---/
(p:p2)
So two vertices of class V1 are connected to another Vertex of type V2, both edges are of the same class E1 which has the property "p".
I would like a query that:
Targets a V1 instance (by rid)
Reads all V2 records (including all properties) that are connected to the selected V1 by the edge class E1
Also includes the property p from the connecting E1-edge in the returned documents.
I have tried with:
SELECT *,in('E1')[p] as p FROM (SELECT EXPAND(out('E1')) FROM <V1-rid>) UNWIND p
But that would of course give 2 results since the projection returns values from both the E1 edges. I only want to inlcude p form the edge between my selected V1 and V2.
I have some other connections to V2 as well which need to be included, but I wanted to specify the problem more precise to reduce the question complexity and I think (hope) they won't interfere with the answer.
Update
Clarification:
I would like the result to be whole V2-records with appended properties "p", so if V2 has properties v2p1:, v2p2, the resulting records should look something like:
{
"v2p1": <value>,
"v2p2": <value>,
"p": <value>
}
Where the last "p" value is from the edge and the other properties are from the actual V2 records.
You could use this query
SELECT FROM (
MATCH {CLASS:v1, AS:v1, WHERE: (#rid=#29:0)}.outE('e1'){AS:e1}.inV('e1'){AS:v2} RETURN v1, e1.p, v2
)
I hope that is clear enough.
try this:
select *,traversedEdge(-1).p as p from(traverse outE('E1'),inV('E1') from #21:0) where #class='V2'
Related
AKS = 1.17.9
Prometheus = 2.16.0
kube-state-metrics = 1.8.0
My use case: I want to alert when 1 of my persistent volumes are not in a "Bound" phase and only when this falls within a predefined set of namespaces.
This got me to my first attempt at joining Prometheus metrics - so, please bear with me : )
I opted to use the following to obtain the pv phase:
kube_persistentvolume_status_phase{phase="Bound",job="kube-state-metrics"}
Renders:
kube_persistentvolume_status_phase{instance="10.147.5.110:8080",job="kube-state-metrics",persistentvolume="pvc-33197ae6-d42a-777e-b8ca-efbd66a8750d",phase="Bound"} 1
kube_persistentvolume_status_phase{instance="10.147.5.110:8080",job="kube-state-metrics",persistentvolume="pvc-165d5006-erd4-481e-8acc-eed4a04a3bce",phase="Bound"} 1
This worked well, except for the fact that it does not include the namespace.
So I managed to determine the persistentvolumeclaim namespaces with this:
kube_persistentvolumeclaim_info{namespace=~"monitoring|vault"}
Renders:
kube_persistentvolumeclaim_info{instance="10.147.5.110:8080",job="kube-state-metrics",namespace="vault",persistentvolumeclaim="vault-file",storageclass="default",volumename="pvc-33197ae6-d42a-777e-b8ca-efbd66a8750d"} 1
kube_persistentvolumeclaim_info{instance="10.147.5.110:8080",job="kube-state-metrics",namespace="monitoring",persistentvolumeclaim="prometheus-prometheus-db-prometheus-prometheus-0",storageclass="default",volumename="pvc-165d5006-erd4-481e-8acc-eed4a04a3bce"} 1
So my idea was to join these sets with the matching values in the following fields:
(kube_persistentvolume_status_phase)persistentvolume
on
(kube_persistentvolumeclaim_info)volumename
BUT, if I understood it correctly you are only able to join two metrics sets on labels that match exactly (text and their values). I hence opted for the "instance" and "job" labels as these were common on both sides and matching.
kube_persistentvolume_status_phase{phase!="Bound",job="kube-state-metrics"} * on(instance,job) group_left(namespace) kube_persistentvolumeclaim_info{namespace=~"monitoring|vault"}
Renders:
Error executing query: found duplicate series for the match group {instance="10.147.5.110:8080" , job="kube-state-metrics"} on the right hand-side of the operation: [{__name__="kube_persistentvolumeclaim_info", instance="10.147.5.110:8080", job="kube-state-metrics", namespace="monitoring", persistentvolumeclaim="alertmanager-prometheusam-db-alertmanager-prometheusam-0", storageclass="default", volumename="pvc-b8406fb8-3262-7777-8da8-151815e05d75"}, {__name__="kube_persistentvolumeclaim_info", instance="10.147.5.110:8080", job="kube-state-metrics", namespace="vault", persistentvolumeclaim="vault-file", storageclass="default", volumename="pvc-33197ae6-d42a-777e-b8ca-efbd66a8750d"}];many-to-many matching not allowed: matching labels must be unique on one side
So in all fairness, the query does communicate well on what the problem is - so I attempted to solve this with the "ignoring" option - attempting to keep only the matching labels and values (instance and job) and "excluding/ignoring" the non-matching ones on both sides. This did not work either - resulting in a parsing error. Which in turn nudged me to take a step back and reassess what I am doing.
I am just a bit concerned that I am perhaps barking up the wrong tree here.
My question is: Is this at all possible and if so how? or is there perhaps another, more prudent way to achieve this?
Thanks in advance!
I have a map where the KEYS (although of type string) are meant to be LINKS (e.g. "#1:1", "#1:2") and I want to somehow be able to convert those string keys to LINKS and directly use them in queries.
E.g.
select * from CONVERT_TO_LINK( myMap.keys()[0] )
How can I do such a conversion?
Thanks
JS Function:
Test Code:
create class v1 extends v
create property v1.myMap embeddedmap
create vertex v set name="pippo"
create vertex v1 set myMap={"#9:8":"something"}
select expand(link) from (select toLink(myMap.keys()[0]) as link from v1)
Result:
Hope this helps.
Why not using select * from myMap.keys()[0].convert('LINK') built-in method?
Background: I'm trying to implement a time-series versioned DB using this approach, using gremlin (tinkerpop v3).
I want to get the latest state node (in red) for a given identity node (in blue) (linked by a 'state' edge which contains a timestamp range), but I want to return a single aggregated object which contains the id (cid) from the identity node and all the properties from the state node, but I don't want to have to list them explicitly.
(8640000000000000 is my way of indicating no 'to' date - i.e. the edge is current - slightly different from the image shown).
I've got this far:
:> g.V().hasLabel('product').
as('cid').
outE('state').
has('to', 8640000000000000).
inV().
as('name').
as('price').
select('cid', 'name','price').
by('cid').
by('name').
by('price')
=>{cid=1, name="Cheese", price=2.50}
=>{cid=2, name="Ham", price=5.00}
but as you can see I have to list out the properties of the 'state' node - in the example above the name and price properties of a product. But this will apply to any domain object so I don't want to have to list the properties all the time. I could run a query before this to get the properties but I don't think I should need to run 2 queries, and have the overhead of 2 round trips. I've looked at 'aggregate', 'union', 'fold' etc but nothing seems to do this.
Any ideas?
===================
Edit:
Based on Daniel's answer (which doesn't quite do what I want ATM) I'm going to use his example graph. In the 'modernGraph' people-create->software. If I run:
> g.V().hasLabel('person').valueMap()
==>[name:[marko], age:[29]]
==>[name:[vadas], age:[27]]
==>[name:[josh], age:[32]]
==>[name:[peter], age:[35]]
then the results are a list of entities's with the properties. What I want is, on the assumption that a person can only create one piece of software ever (although hopefully we will see how this could be opened up later for lists of software created), to include the created software 'language' property into the returned entity to get:
> <run some query here>
==>[name:[marko], age:[29], lang:[java]]
==>[name:[vadas], age:[27], lang:[java]]
==>[name:[josh], age:[32], lang:[java]]
==>[name:[peter], age:[35], lang:[java]]
At the moment the best suggestion so far comes up with the following:
> g.V().hasLabel('person').union(identity(), out("created")).valueMap().unfold().group().by {it.getKey()}.by {it.getValue()}
==>[name:[marko, lop, lop, lop, vadas, josh, ripple, peter], lang:[java, java, java, java], age:[29, 27, 32, 35]]
I hope that's clearer. If not please let me know.
Since you didn't provide I sample graph, I'll use TinkerPop's toy graph to show how it's done.
Assume you want to merge marko and lop:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).valueMap()
==>[name:[marko],age:[29]]
gremlin> g.V(1).out("created").valueMap()
==>[name:[lop],lang:[java]]
Note, that there are two name properties and in theory you won't be able to predict which name makes it into your merged result; however that doesn't seem to be an issue in your graph.
Get the properties for both vertices:
gremlin> g.V(1).union(identity(), out("created")).valueMap()
==>[name:[marko],age:[29]]
==>[name:[lop],lang:[java]]
Merge them:
gremlin> g.V(1).union(identity(), out("created")).valueMap().
unfold().group().by(select(keys)).by(select(values))
==>[name:[lop],lang:[java],age:[29]]
UPDATE
Thank you for the added sample output. That makes it a lot easier to come up with a solution (although I think your output contains errors; vadas didn't create anything).
gremlin> g.V().hasLabel("person").
filter(outE("created")).map(
union(valueMap(),
outE("created").limit(1).inV().valueMap("lang")).
unfold().group().by {it.getKey()}.by {it.getValue()})
==>[name:[marko], lang:[java], age:[29]]
==>[name:[josh], lang:[java], age:[32]]
==>[name:[peter], lang:[java], age:[35]]
Merging edge and vertex properties using gremlin java DSL:
g.V().has('User', 'id', userDbId).outE(Edges.TWEETS)
.union(__.identity().valueMap(), __.inV().valueMap())
.unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
.map(v -> converter.toTweet((Map) v.get())).toList();
Thanks for the answer by Daniel Kuppitz and youhans it has given me a basic idea on the solution of the issue. But later I found out that the solution is not working for multiple rows. It is required to have local step for handling multiple rows. The modified gremlin query will look like:
g.V()
.local(
__.union(__.valueMap(), __.outE().inV().valueMap())
.unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
)
This will limit the scope of union and group by to a single row.
If you can work with custom DSL ,create custom DSL with java like this one.
public default GraphTraversal<S, LinkedHashMap> unpackMaps(){
GraphTraversal<S, LinkedHashMap> it = map(x -> {
LinkedHashMap mapSource = (LinkedHashMap) x.get();
LinkedHashMap mapDest = new LinkedHashMap();
mapSource.keySet().stream().forEach(key->{
Object obj = mapSource.get(key);
if (obj instanceof LinkedHashMap) {
LinkedHashMap childMap = (LinkedHashMap) obj;
childMap.keySet().iterator().forEachRemaining( key_child ->
mapDest.put(key_child,childMap.get(key_child)
));
} else
mapDest.put(key,obj);
});
return mapDest;
});
return it;
}
and use it freely like
g.V().as("s")
.valueMap().as("value_map_0")
.select("s").outE("INFO1").inV().valueMap().as("value_map_1")
.select("s").outE("INFO2").inV().valueMap().as("value_map_2")
.select("s").outE("INFO3").inV().valueMap().as("value_map_3")
.select("s").local(__.outE("INFO1").count()).as("value_1")
.select("s").outE("INFO1").inV().value("name").as("value_2")
.project("val_map1","val_map2","val_map3","val1","val2")
.by(__.select("value_map_1"))
.by(__.select("value_map_2"))
.by(__.select("value_1"))
.by(__.select("value_2"))
.unpackMaps()
results to rows with
map1_val1, map1_val2,.... ,map2_va1, map2_val2....,value1, value2
This can handle mix of values and valueMaps in a natural gremlin way.
I need to query Mongo for a field where the value is in a list, and then print that value plus a few others. However the list is about 4000 values long, so it is in a file. How could I do that? So something like
db.table.find({field1:{$eq:"abc"},"field2.subvalue":{$in:list.txt}},{"_id":1,"field2.subvalue":1,"field3":1})
Maybe I'm just not using the correct search terms but I can't find anything where you query from values in a file when I search Google.
Thanks in advance!
I second Sorens answer, try loading them into simple JSON format. for example.
file name : testfile
f1 f2 f3 f4 f5 -- Field
v1 v2 v3 v4 v5 --- Row one
x1 x2 x3 x4 x5 --- row two
your collection would look like,
collection name : testCollection
{
f1:v2,f2:v2,f3:v3,f4:v4,f5:v5 <--- These are documents
}
{
f1:x1,f2:x2,f3:x3,f4:x4,f5:v5
}
And so on and then query from MongoDB / program
Please dont do this manually for 4k fields , that's a nightmare :).
Google around for some java or python code and have that do it for you.
Good luck
Let's say I have a triple store full of video metadata. A video can have an arbitrary selection of pre-defined tags, like this:
v1 ex:hasTag A.
v2 ex:hasTag B.
v3 ex:hasTag A;
ex:hasTag B.
So for the sake of this example, there are only the two pre-defined tags A and B.
Now I would like to get an overview of which video has which tags attached, in a matrix like this (or similar):
A B
v1 true false
v2 false true
v3 true true
However, I do not know how to achieve this. First of all, I would need a boolean test, e.g. something like isTrue(?video ex:hasTag A) or whatever it looks like. Even then, I do not know where to put it. If I put the previous statement in the WHERE clause, the query result only contains the videos having the tag A, of course.
Coming from SQL, I can imagine I need to use subqueries in some way, but these seem not to be standardized at the moment. I also saw the FILTER keyword, but I don't feel that is what I want.
I'm currently at a loss but always willing to learn. Any help is appreciated.
Subqueries are in SPARQL 1.1 and that is frozen now. There several complete implementations already.
As well as nested-SELECT subqueries, SPARQL 1.1 has the EXISTS and NOT EXISTS functions for testing whether a pattern matches or not. It#'s a form of subquery. You can use it in a FILTER or if you really want to return true/false:
BIND(EXISTS{?video ex:hasTag "A"} AS ?a)
or even:
SELECT ?video (EXISTS{?video ex:hasTag "A"} AS ?a) (EXISTS{?video ex:hasTag "B"} AS ?b)
{
?video a ex:Video .
}
Try starting with something like this:
SELECT ?video ?tagA ?tagB {
?video a ex:Video .
OPTIONAL { ?video ex:hasTag ?tagA . FILTER (?tagA = "A") }
OPTIONAL { ?video ex:hasTab ?tagB . FILTER (?tagB = "B") }
}
That will you roughly what you want. If you really must have boolean values rather than simple checking for unbound vals use:
SELECT ?video (bound(?tagA) as ?a) (bound(?tagB) as ?b) { ... }
P.S. Subqueries are (very nearly, proposed recommendation at time of writing) standardised in SPARQL 1.1.