How to query a Titan graph with Gremlin queries based on integer properties over REST

I have multiple nodes in a Titan graph server with integer properties, and I want to query the graph based on those properties. The server is configured with REST, so I'm querying the graph this way:
titan-server:8182/gremlin=Query
(e.g. Query could be: g.V().hasLabel("Person"))
I want to fetch all Person vertices with age = 30 (just an example).
This can be done in the Gremlin console (socket based) as follows:
g.V().hasLabel("Person").has("age",30);
But this doesn't work as a REST query; it gives empty results (even though there is a vertex with age = 30):
titan-server:8182/gremlin=g.V().hasLabel("Person").has("age",30);
I didn't find any docs on the internet for Gremlin over REST.
Thank you for your help.

I managed to get the REST API to work by doing the following. First, as specified here, make sure to change the channelizer in the gremlin-server.yaml config to:
channelizer: org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer
Then try POSTing the following body:
{
  "gremlin" : "g.V().hasLabel(x).has(y,z)",
  "bindings" : {
    "x" : "Person",
    "y" : "age",
    "z" : 30
  }
}
More info on the REST API can be found here
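For anyone scripting this, here is a minimal Python sketch of the same POST, using only the standard library. The base URL reuses the host name from the question and the helper name build_gremlin_request is made up for illustration; adjust both to your setup. Note that the binding z stays a JSON number, so it is compared against the integer property as an integer rather than as the string "30".

```python
import json
import urllib.request


def build_gremlin_request(base_url, script, bindings):
    """Build (but do not send) a POST request for Gremlin Server's HTTP endpoint."""
    body = json.dumps({"gremlin": script, "bindings": bindings}).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Host name taken from the question (assumption: plain HTTP on port 8182).
request = build_gremlin_request(
    "http://titan-server:8182",
    "g.V().hasLabel(x).has(y,z)",
    {"x": "Person", "y": "age", "z": 30},
)

# Sending needs a reachable server, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the script text fixed and passing the values as bindings also avoids having to URL-encode quotes and parentheses by hand.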

Related

MongoDB Java API - How do I perform a Group By operation?

I'm struggling to find an answer to what is probably a simple question, and my question on the MongoDB forum has gone unanswered for days.
I'm trying to perform a Group By operation in the MongoDB Java Aggregation API and failing miserably.
I have this section of old GMongo code that performs the Group By:
[$group: [_id : [shop_id: '$shop.id'], // this "_id" expression isn't an Accumulator operation
count: [$sum: 1],
n : [$first: '$shop.n'],
dn : [$first: '$shop.dn']]]
And I'm trying to convert it to the new Java API like so:
// Mongo DB Java Sync Driver API query
final List<Bson> aggregationPipeline = asList(
    Aggregates.match(Filters.and(Filters.gte("date.d", fromDate), Filters.lte("date.d", toDate))),
    Aggregates.match(Filters.eq("product.cl", clientId)),
    Aggregates.match(Filters.in("platform.id", platformIds)),
    Aggregates.match(Filters.exists("shop.inf." + clientId, true)),
    Aggregates.match(Filters.exists("shop.cl_pt." + clientId, false)),
    Aggregates.match(Filters.eq("product.a", true)),
    Aggregates.group(null,
        asList(
            // _id : [shop_id: '$shop.id'] // how to replicate this Group By line in Java API?
            Accumulators.sum("count",1),
            Accumulators.first("n","$shop.n"),
            Accumulators.first("dn","$shop.dn")
        )
    )
);
I can replicate the logic of the last three lines of the Group statement by using Accumulators (sum, first etc.), but the very first line, this one:
_id : [shop_id: '$shop.id']
is what is confusing me. I can't figure out how to replicate it in the Java API, as it's not an Accumulator operation; it looks more like an expression that I can't find any documentation on.
Can anyone help? This one issue has held me up for a couple of days now.
Any clarification is much appreciated.
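A hedged note for readers with the same problem: the _id line is the grouping key, not an accumulator. In the Java driver the first argument of Aggregates.group is that _id expression, so passing null puts every document into a single group; something like new Document("shop_id", "$shop.id") in its place should give the compound key (untested against the asker's data). The stage the GMongo code describes, written as a plain Python dict so the shape is easy to see (the function name is only for illustration):

```python
def group_by_shop_stage():
    """Return the $group stage from the question as a plain pipeline-stage dict."""
    return {
        "$group": {
            # Grouping key: one output document per distinct shop.id value.
            "_id": {"shop_id": "$shop.id"},
            # Accumulators, evaluated once per group.
            "count": {"$sum": 1},
            "n": {"$first": "$shop.n"},
            "dn": {"$first": "$shop.dn"},
        }
    }


stage = group_by_shop_stage()
```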

Querying player data in GameSparks returns 0 results

I want to get all players in the GameSparks data that have an email field. I can run the query in Data Explorer, but when I try it in Cloud Code it does not work.
In GameSparks documentation I found this:
var results = Spark.metaCollection('metatest').find({"metatest1" : {"$gt" : 1}});
I replaced it with this:
var results = Spark.metaCollection('player').find({"email":{"$exists":"true"}});
When I try to get the count of 'results', it returns 0.
What am I doing wrong? Is it impossible to access the MongoDB entries for any of the System Collections defined by GameSparks (like 'player', 'challengeInstance' etc.)?
You need to use systemCollection instead of metaCollection.

Titan: How to efficiently get the maximum value of a Long property?

If I want to retrieve the maximum value of a Long property across vertices of a given type, I run:
graph.traversal().V().has("type","myType").values("myProperty").max().next()
This is really slow, as it has to load all vertices to find the maximum value. Is there a faster way?
Would any indexing help? I believe composite indexes won't help, but is there a way to do it using a mixed index with an Elasticsearch back end?
Creating a mixed index on a numeric value with Titan results in Elasticsearch indexing the property correctly. Our case is similar to yours: we want all our vertices ordered by a property DEGREE from max to min, so we currently do the following for DEGREE:
TitanGraph titanGraph = TitanFactory.open("titan-cassandra-es.properties");
TitanManagement management = titanGraph.openManagement();
PropertyKey degreeKey = management.makePropertyKey("DEGREE").dataType(Long.class).make();
management.buildIndex("byDegree", Vertex.class)
    .addKey(degreeKey)
    .buildMixedIndex("search");
// Persist the schema and index definition
management.commit();
We are currently having issues getting Titan to traverse this quickly (for some reason it can create the index but struggles to use it for certain queries), but we can query Elasticsearch directly:
curl -XGET 'localhost:9200/titan/byDegree/_search?size=80' -d '
{
  "sort" : [
    { "DEGREE" : {"order" : "desc"}}
  ],
  "query" : {
    "match_all" : {}
  }
}'
The answer is returned extremely quickly, so for now we create the index with Titan but query Elasticsearch directly.
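Adapted to the question (just the maximum), the same idea is to sort descending and ask for a single hit. A minimal Python sketch that only builds the search body; it assumes the Elasticsearch field is named after the Titan property, as DEGREE is in our index, and build_max_query is a made-up helper name:

```python
import json


def build_max_query(field, size=1):
    """Build a search body whose first hit carries the largest value of `field`."""
    return {
        "size": size,
        # Descending sort puts the maximum first.
        "sort": [{field: {"order": "desc"}}],
        "query": {"match_all": {}},
    }


# "myProperty" is the property name used in the question.
body = json.dumps(build_max_query("myProperty"))
```

The resulting body can be sent to the same _search endpoint as in the curl call above.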
Short answer: Elasticsearch can do what is needed with numeric ranges very easily; the problem, on our side at least, seems to be getting Titan to use these indices fully. However, the traversal you are trying to execute is simpler than ours (you just want the max), so you may not encounter these issues and may be able to stick with Titan traversals entirely.
Edit:
I have recently confirmed that Elasticsearch and Titan can fulfill your needs (as they do mine). Just be wary of how you create your indices. Titan will be able to execute your query quickly as long as the type key in your mixed index is set to a String match, not a Text match.

Upsert an embedded array at specific position - will my work-around work in production?

I'm storing time series in MongoDB and the structure is as follows:
{
"_id" : ObjectId("5128e567df6232180e00fa7d"),
"values" : [563.424, 520.231, 529.658, 540.459, 544.271, 512.641, 579.591, 613.878, 627.708, 636.239, 672.883, 658.895, 646.44, 619.644, 623.543, 600.527, 619.431, 596.184, 604.073, 596.556, 590.898, 559.334, 568.09, 568.563],
"day" : 20110628,
}
The values array holds one value for each hour, so the position is important: position 0 = first hour, 1 = second hour and so on.
Updating the value of a specific hour is quite easy. For example, to update the 7th hour of the day I do this:
db.timeseries.update({day:20130203},{$set : {"values.6" : 482.65}})
My problem is that I would like to use upsert, like this:
db.timeseries.update({day:20130203},{$set : {"values.6" : 482.65}}, {upsert : true})
But if the document does not exist, MongoDB will create an embedded document instead of an embedded array, like this:
{
"_id" : ObjectId("5128e567df6232180e00fa7d"),
"values" : {"6" : 482.65},
"day" : 20130203,
}
There is a ticket to add a feature that solves this issue here, but meanwhile I have come up with a work-around for my case.
What I do is first create a unique index on the day field. Then, whenever I want to upsert an hourly volume, I run these two commands:
db.timeseries.insert({day:20130203, values : []}); // Will be rejected if it exists
db.timeseries.update({day:20130203},{$set : {"values.6" : 482.65}});
The first statement tries to create a new document; thanks to the unique index, the insert is rejected if a document for that day already exists. If not, a document with an empty embedded array in the values field is created. This ensures that the update will work.
Result:
{
"_id" : ObjectId("5128e567df6232180e00fa7d"),
"values" : [null,null,null,null,null,null,482.65],
"day" : 20130203,
}
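To make the difference between the two outcomes concrete, here is a small Python emulation of what $set does with a path like "values.6". It is a simplified model of the behaviour described above (two-part paths only), not driver code, and set_dotted is a made-up helper:

```python
def set_dotted(doc, path, value):
    """Emulate MongoDB's $set for a two-part path such as "values.6"."""
    field, key = path.split(".", 1)
    if field not in doc:
        # Field missing: an embedded document is created, not an array.
        doc[field] = {key: value}
    elif isinstance(doc[field], list):
        index = int(key)
        array = doc[field]
        # Pad with None (null) up to the target position, then set it.
        array.extend([None] * (index + 1 - len(array)))
        array[index] = value
    else:
        doc[field][key] = value
    return doc


# Plain upsert on a missing document: "values" becomes an embedded document.
upserted = set_dotted({"day": 20130203}, "values.6", 482.65)

# Work-around: the empty array exists first, so the position is filled in.
worked_around = set_dotted({"day": 20130203, "values": []}, "values.6", 482.65)
```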
And here is my question:
In production, when several commands like this run simultaneously, can I be sure that my update command will be executed after my insert command? Note that I want to run both commands in unsafe mode; that is, I will not wait for any response from the server.
(It would also be interesting to hear comments about my work-around from a performance perspective.)
Generally yes, there is a way to ensure that two requests from a client use the same connection. By using the same connection you force a strict order of execution on the server.
The way to accomplish this differs between drivers.
For the Asynchronous Java Driver you can create a "Serialized" MongoClient from the initial MongoClient instance, and it will ensure that all requests use a single connection.
The 10gen Java driver will automatically (via a ThreadLocal) try to use the same connection. You can also give the driver a hint via the DB.requestStart()/DB.requestDone() methods that a group of commands needs to be pipelined.
A similar start-request/end-request pair exists in most of the 10gen drivers. As another example, the PyMongo driver's mongo_client has a start_request()/end_request() pair.
From a performance point of view, one access to the database is better than two. Can't you use $push instead of $set for updating the values field?

How to query item's metadata from Couchbase?

I put some data into Couchbase 1.8.1 and can get it back successfully. But I want to query its metadata, such as expiration and att_reason (non-json or json). Some documentation lists the metadata in JSON format, for example:
{
"_id" : "contact_475",
"_rev" : "1-AB9087AD0977F089",
"_bin" : "...",
"$flags" : 0,
"$expiration" : 0,
"name" : "Fred Bloggs",
}
How can I query item's metadata?
As Pavel said, the most common way to access metadata in Couchbase (2.0) is using Views.
You can also use the internal TAP protocol :
http://www.couchbase.com/wiki/display/couchbase/TAP+Protocol
Could you give us more information about your use case and why you need to access meta/expiration? And why can't you use views (the recommended way to do it)?
Regards
Tug
The easiest way is issuing an HTTP request to:
http://serveraddress:8091/couchBase/default/contact_475
The response should contain an X-Couchbase-Meta header with the metadata. More information is here: http://xmeblog.blogspot.co.il/2013/08/couchbase-how-to-retrieve-key.html
If you want to see the meta data in a Couchbase query you can do something like this:
SELECT meta(b).* FROM bucket b
You can also see both meta data and all other data in a query by doing something like this:
SELECT meta(b).*, * FROM bucket b
If you want to query only the metadata using N1QL, you can run the query below; it returns all metadata about each document:
select meta(bucket_name) from bucket_name
If you want to get this information from Sync Gateway instead, it is returned with every GET request made through the REST API; the REST API also includes some filtering on this metadata.