How to remove an edge connection to a vertex in OrientDB? - orientdb

I created a HasAddress edge between a User vertex and an Address vertex.
If I remove the HasAddress edge, the User vertex still shows the HasAddress connection, just empty.
Is there any way of removing it? Is this just a GUI thing?
This doesn't seem to work: UPDATE User REMOVE HasAddress

It's not strictly a GUI thing, but you can ignore it.
When you create an edge and connect it to a vertex, OrientDB creates a collection of links (a RIDBAG) as a property of the vertex. When you delete edges, the edge pointer is removed from the collection, but the collection itself is not removed.
If you really don't like that, you can run:
UPDATE User REMOVE in_HasAddress
/* or out_HasAddress if you want to remove the outgoing edges collection */
but PLEASE make sure that the collections are EMPTY; otherwise you will break the graph consistency (you are using the document API to manipulate the graph).
My advice is to avoid it in general.

Related

Optimistic locking & aggregate root's internal entity

Let's assume I have the aggregate root Picture with an internal entity Shape. Picture contains a list of shapes.
Shape will remain an internal entity of the Picture aggregate because Picture defines some rules among multiple Shape instances. Let's say you can't assign a new Shape when the Picture is read-only, and a Picture may not contain two Shapes of the same color. Having defined these rules, the aggregate root, knowing about all of its Shapes, can consistently verify them.
To avoid breaking the Law of Demeter, I always access the Shape through the Picture.
My question is related to optimistic locking with aggregate versioning. If I update the color of a Shape through the Picture aggregate root, am I increasing the version of the aggregate root (Picture) or only of the Shape?
My assumption is only of the Shape, because the opposite would prevent parallel updates of multiple Shapes of one Picture.
But what if, during the update of the Shape, the Picture was set to read-only mode?
Thanks for any advice.
Every time an Aggregate mutates, it should increase its version number when using an optimistic locking mechanism. An Aggregate mutates when its Aggregate root or any of its nested Entities mutates. When a conflict occurs, it means that an earlier, faster state mutation has already been committed and cannot be rolled back. It also means that the later state mutation was based on stale data and must be re-executed.
However, this conflict should be transparently retried by the framework by re-executing the command (load, execute, persist). The Aggregate should not care about this situation; the domain logic should stay the same. In other words, in case of a conflict the client should not even notice: the HTTP response (or whatever) should be the same, maybe a little slower.
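A minimal sketch of that load, execute, persist loop, using the Picture example from the question. The names InMemoryRepository, ConcurrencyError and execute_with_retry are made up for illustration and do not come from any particular framework; the repository is an in-memory stand-in for a real store.

```python
from dataclasses import dataclass, field


class ConcurrencyError(Exception):
    """The stored version differs from the version the command was based on."""


@dataclass
class Picture:
    shapes: dict = field(default_factory=dict)  # shape id -> color
    read_only: bool = False
    version: int = 0  # one version for the whole aggregate

    def change_shape_color(self, shape_id, color):
        # Rules from the question, checked by the aggregate root.
        if self.read_only:
            raise ValueError("picture is read-only")
        if color in self.shapes.values():
            raise ValueError("picture already has a shape of that color")
        self.shapes[shape_id] = color


class InMemoryRepository:
    """Hypothetical stand-in for a real store with a version check on save."""

    def __init__(self, picture):
        self._stored = picture

    def load(self):
        p = self._stored
        return Picture(dict(p.shapes), p.read_only, p.version)

    def save(self, picture, expected_version):
        if self._stored.version != expected_version:
            raise ConcurrencyError()
        picture.version = expected_version + 1
        self._stored = Picture(dict(picture.shapes), picture.read_only, picture.version)


def execute_with_retry(repo, command, max_attempts=3):
    """Re-run the command against fresh state whenever the save conflicts."""
    for _ in range(max_attempts):
        picture = repo.load()
        loaded_version = picture.version
        command(picture)  # domain logic, unaware of retries
        try:
            repo.save(picture, loaded_version)
            return picture.version
        except ConcurrencyError:
            continue  # someone else committed first: reload and try again
    raise ConcurrencyError("gave up after %d attempts" % max_attempts)


repo = InMemoryRepository(Picture({"s1": "red", "s2": "blue"}))
new_version = execute_with_retry(repo, lambda p: p.change_shape_color("s1", "green"))
```

Because the retry reloads the aggregate, a command that lost the race to a "set read-only" change is re-executed against the read-only Picture and rejected by the same rule, with no special handling in the domain code.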
My question is related to optimistic locking with aggregate versioning. If I update the color of a Shape through the Picture aggregate root, am I increasing the version of the aggregate root (Picture) or only of the Shape?
You are increasing the version of the root. Specifically, you are changing the aggregate root from one that "points to" version:4 of Shape to one that points to version:5.
It's somewhat similar to how git handles file changes. You edited the file, which means that the file name that used to point to blob:1 now points to blob:2. But "file" is just a name in the tree, so we need to change the tree from one that says { file -> blob:1 } to a tree that says { file -> blob:2 }, and so on all the way up to the root.
Repeating the same idea another way, any fixed version of the aggregate is "immutable" -- I should be able to look at version:4 all day, and not be affected by the changes that you are making to the Shape, which means your changes need to happen in a new version.
As a clarification: it's weird.
The aggregate is, as a data pattern, a single graph of relations that changes atomically, to ensure that the invariant is maintained. But "objects" want to encapsulate their own state. So we take something that is a single tree, and break it into pieces that are individually managed by an object, and then stitch them all back together again to create a single new tree.
The version number relates to the aggregate, as it's the aggregate whose state changes when a shape changes colour. I'm not sure why this would prevent parallel updating, as long as the updates don't actually conflict.
What I mean by that is: let's say our aggregate is at version 3. It contains a red, a yellow and a blue triangle. Two commands are issued in parallel: one to change the red triangle to green and another to change the blue one to purple. Both commands are issued against version 3, so a concurrency error will be detected. But assuming you are using events, you can look back at the events committed since version 3, see that they don't conflict, and therefore allow the process to go through.
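A rough sketch of that check, assuming an event-sourced aggregate whose version equals the number of committed events. The ShapeColorChanged event, the EventStream class and the conflicts rule are hypothetical; which events count as conflicting is a domain decision, and the rule below is only one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeColorChanged:
    shape_id: str
    new_color: str


def conflicts(new_event, committed_event):
    # Assumed rule: two changes conflict if they touch the same shape or
    # would give two shapes the same color.
    return (new_event.shape_id == committed_event.shape_id
            or new_event.new_color == committed_event.new_color)


class EventStream:
    def __init__(self, events=()):
        self.events = list(events)

    @property
    def version(self):
        return len(self.events)

    def append(self, event, expected_version):
        # Events committed after the version the command was based on.
        missed = self.events[expected_version:]
        if any(conflicts(event, e) for e in missed):
            raise RuntimeError("conflicting change already committed")
        self.events.append(event)
        return self.version


stream = EventStream([
    ShapeColorChanged("t1", "red"),
    ShapeColorChanged("t2", "yellow"),
    ShapeColorChanged("t3", "blue"),
])
# Both commands were issued against version 3.
v_after_green = stream.append(ShapeColorChanged("t1", "green"), expected_version=3)
v_after_purple = stream.append(ShapeColorChanged("t3", "purple"), expected_version=3)
```

The second append sees one missed event, finds that it concerns a different triangle and a different color, and is accepted even though its expected version is stale.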
I have a blog post which goes into this in a lot more detail. You can find it here: Handling Concurrency Conflicts in a CQRS and Event Sourced System
I hope that helps.

Database schema for a tinder like app

I have a database of millions of objects. Every day I will present 3 selected objects to each user, and, as with Tinder, they can swipe left to say they don't like an object or swipe right to say they like it.
I select the objects based on their location (those closest to the user are selected first) and also on a few user settings.
I am using MongoDB.
Now the problem: how do I design the database so that it can quickly provide a daily selection of objects to show to the user (skipping all the objects they have already swiped)?
Well, considering you have chosen MongoDB, you will have to maintain multiple collections. One is your main collection; in addition, you will have to maintain user-specific collections which hold user data, say the ids of the documents the user has swiped. Then, when you want to fetch data, you might want to do a $setDifference aggregation. $setDifference does this:
Takes two sets and returns an array containing the elements that only
exist in the first set; i.e. performs a relative complement of the
second set relative to the first.
How performant this is depends on the size of your sets and the overall scale.
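To make the semantics concrete, here is a plain-Python illustration of the relative complement described above. It does not talk to MongoDB; the function name and the sample ids are invented for the example, and unlike the aggregation operator it deliberately keeps the input order so that a "closest first" ranking survives.

```python
def set_difference(first, second):
    """Elements of `first` that do not appear in `second`, without duplicates."""
    excluded = set(second)
    seen = set()
    result = []
    for item in first:
        if item not in excluded and item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Candidate ids, already sorted by distance to the user (assumed data).
candidates = ["obj1", "obj2", "obj3", "obj4", "obj5"]
already_swiped = ["obj2", "obj4"]

todays_selection = set_difference(candidates, already_swiped)[:3]
```

The cost grows with the size of both lists, which is why this approach gets expensive once a user has swiped a large number of objects.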
EDIT
I agree with your comment that this is not a scalable solution.
Solution 2:
One solution I could think of is to use a graph-based solution, like Neo4j. You could represent all your 1M objects and all your users as nodes, and have relationships between each user and the objects they have swiped. Your query would then return a list of all objects the user is not connected to.
Graphs are hard to shard, which brings up scaling challenges, and graph-based solutions generally perform best when the graph fits in memory. So the feasibility of this solution depends on your setup.
Solution 3:
Use MySQL. Have two tables: an objects table and a (uid, viewed_object) mapping table. A join would solve your problem. Joins work well for a long time, until you hit a certain scale, so I don't think this is a bad starting point.
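A small sketch of the two-table idea. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the table names, column names and sample rows are assumptions, and ordering by id stands in for the real distance ranking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE viewed (
        uid INTEGER,
        object_id INTEGER,
        PRIMARY KEY (uid, object_id)
    );
    INSERT INTO objects VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
    INSERT INTO viewed VALUES (42, 1), (42, 3), (7, 2);
""")


def unseen_objects(conn, uid, limit=3):
    """Return ids of objects the given user has not swiped yet (anti-join)."""
    rows = conn.execute(
        """
        SELECT o.id
        FROM objects AS o
        LEFT JOIN viewed AS v
               ON v.object_id = o.id AND v.uid = ?
        WHERE v.object_id IS NULL   -- keep only rows with no matching swipe
        ORDER BY o.id
        LIMIT ?
        """,
        (uid, limit),
    ).fetchall()
    return [row[0] for row in rows]


selection_for_42 = unseen_objects(conn, 42)  # [2, 4, 5]
```

The composite primary key on (uid, object_id) doubles as the index the join needs, so the lookup per candidate object stays cheap.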
Solution 4:
Use Bloom filters. Your problem eventually boils down to a set membership problem: given a set of ids, check whether each one is part of another set. A Bloom filter is a probabilistic data structure which answers set membership queries. Bloom filters are very small and very efficient. But they are probabilistic: false negatives never happen, but false positives can. So that's a trade-off. See this post for how it's used: http://blog.vawter.com/2016/03/17/Using-Bloomfilters-to-Avoid-Repetition/
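For illustration, a minimal Bloom filter in pure Python. This is a sketch, not a tuned implementation: the bit-array size, the number of hash functions and the use of SHA-256 with a numeric prefix to derive several hash positions are all arbitrary choices made for the example.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly seen", False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


swiped = BloomFilter()
for object_id in ["obj1", "obj2", "obj3"]:
    swiped.add(object_id)

already_seen = "obj2" in swiped  # always True for an added id
```

In this use case a false positive only means that an object the user never swiped is occasionally skipped, which is usually acceptable; an object that was swiped is never shown again.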
I'll update the answer if I can think of something else.

traverse a model from one orientdb instance and duplicate result into another orientDb instance

I have an OrientDB instance where all my vertices and relations are stored. I am trying to find a node via a searchable property, get that node, and traverse the complete tree associated with it. I get the result chunk (made up of all vertices and edges connected to my search node).
Now I would like to push the result (all vertices with their edges) to another OrientDB instance.
Please suggest the best way to do this, and how.
I am able to use the TinkerPop graph APIs to get the traversal result, but I am not sure how to push this dataset to another OrientDB instance. I want to maintain connectedness in the second instance too.

OrientDB: How to use traverse to get edges?

I am trying to use traverse to explore multiple orders of edges away from a specific starting node. For example, using the Grateful Dead graph, I call this command:
traverse bothE('followed_by') from #15:8 while $depth<3
I expect this to get two orders of edges. However, all the edges are ones that include the starting node. If instead I use both('followed_by') it appears to visit all the desired vertices, but it doesn't report the edges. What should I do?
The in edge on the #15:8 record is called followed_by, and the out edges are sung_by, written_by and followed_by, so you can't use the followed_by name and also get the out edges, even if you use both in your query.
This one should do it:
traverse bothE() from #15:8 while $depth<3

Storing edges in OrientDb in ordered way

I am working with the OrientDB graph API using the Java API and Gremlin Pipeline. Is there a way to specify a storage order for edges based on an attribute? I know we can create a custom edge type and define an index on the attribute by which we want to retrieve.
I also had a look at the tutorial on the OrientDB website:
http://orientdb.com/docs/last/Graph-Database-Tinkerpop.html#ordered-edges
It mentions that edges can be retrieved in an ordered way, but not how the order is determined. So I would like to know:
What is the default storage order? And will fetching in this order give me edges in LIFO order?
How can we store edges in a custom order, i.e. in the order in which we want them to be fetched?
The underlying type used is a List, so the order is the insertion order. To change it, get the edge list, work on it, and then call vertex.save(), where vertex is cast to OrientVertex.