Storing edges in OrientDB in an ordered way - orientdb

I am working with the OrientDB graph API using the Java API and Gremlin Pipeline. Is there a way to specify the storage order for edges based on an attribute? I know we can create a custom edge type and define an index on the attribute by which we want to retrieve.
I also had a look at the tutorial on the OrientDB website:
http://orientdb.com/docs/last/Graph-Database-Tinkerpop.html#ordered-edges
There they mention that edges can be retrieved in an ordered way, but they don't mention how the order is determined. So I would like to know:
What is the default storage order? And will fetching in this order give me edges in LIFO order?
How can we store based on a custom order, i.e. store in the order in which we want the edges to be fetched?

The underlying type used is a List, so the order is the insertion order. To change it, get the edge list, work on it, and then call vertex.save(), where vertex is cast to OrientVertex.
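To illustrate the point about list-backed storage, here is a minimal in-memory sketch in Python. It is not the OrientDB API: the Edge and Vertex classes, the weight attribute, and add_edge are hypothetical stand-ins that only model "a list keeps insertion order until you reorder it and write it back".

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    target: str
    weight: int  # hypothetical attribute we want to order by


@dataclass
class Vertex:
    # Hypothetical stand-in for a vertex whose out-edges live in a list.
    out_edges: list = field(default_factory=list)

    def add_edge(self, edge):
        # Appending keeps insertion order, so iteration is first-in, first-out.
        self.out_edges.append(edge)


v = Vertex()
for target, weight in [("a", 3), ("b", 1), ("c", 2)]:
    v.add_edge(Edge(target, weight))

# Default order: the order in which the edges were inserted.
insertion_order = [e.target for e in v.out_edges]

# Custom order: sort the list by the chosen attribute. In OrientDB this is
# the point where the modified list would be persisted with save().
v.out_edges.sort(key=lambda e: e.weight)
custom_order = [e.target for e in v.out_edges]
```

Under this model, reading the list front to back yields the oldest edge first, so LIFO retrieval would mean iterating it in reverse.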

Related

In ObjectBox for Flutter, is there a way to compare two properties?

I'm new to using ObjectBox, so I've been experimenting with its query system to familiarize myself with it. One of the queries I've been unable to write is one comparing two properties. Ignoring the errors they throw, these are some examples of what I'm looking to do:
// Get objects where first number is bigger than second number
boxA.query(ObjectA_.firstNumber.greaterThan(ObjectA_.secondNumber))
// Get parent objects where one of its children has a specific value from the parent
parentBox.query().linkMany(ParentObject_.children, ChildObject_.name.equals(ParentObject_.favoriteChild));
I know based on this question that it's possible in Java using filters, but I also know that query filters are not in ObjectBox for Dart. One of the workarounds I've been testing is querying for one property, getting the values, and using each value to query for the second property. But that becomes unsustainable with even moderately sized amounts of data.
If anyone knows of a "proper" way to do this without the use of Java filters, that would be appreciated. Otherwise, if there's a more performant workaround than the one I came up with, that would be great too.
There is no query filter API for Dart in ObjectBox, because Dart already has the where API.
E.g. for a result list write results.where((a) => a.firstNumber >= a.secondNumber).
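The same post-query filtering idea, sketched in Python rather than Dart so it can be run standalone. The ObjectA and Parent classes and both helper functions are hypothetical; they only mirror the two comparisons from the question and assume the rows have already been fetched from the box.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectA:
    first_number: int
    second_number: int


@dataclass
class Parent:
    favorite_child: str
    children: list = field(default_factory=list)  # child names, for brevity


def first_at_least_second(results):
    # Compare two properties of the same object, as in the where() example.
    return [a for a in results if a.first_number >= a.second_number]


def parents_with_favorite_present(parents):
    # Keep parents that have a child whose name equals their favorite_child.
    return [p for p in parents if p.favorite_child in p.children]
```

Because this runs on the client after the query, it is worth narrowing the query with whatever conditions the database can evaluate before applying the in-memory filter.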

Flutter/Supabase stream filters - why the one field filter limit?

Using the "flutter_supabase" package, I've been trying to add a dynamically filtered stream to my Flutter app, and have found that an exception is thrown if more than one filter is applied.
Why the one field limit? Is there any way around this?
In my case, I want to apply two filter fields to the stream, and the exact fields are applied dynamically based on user selections.
Supabase has advised that this is a limit set by their real-time streaming layer.
Workarounds:
a). Build logical views to represent the different filters
b). Add a 'where' clause to the stream results and filter at the client end
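Workaround b) can be sketched roughly as follows. This is a language-neutral illustration in Python, not Supabase client code: apply_filters and the row dictionaries are hypothetical, and the rows are assumed to be whatever the stream has already delivered.

```python
def apply_filters(rows, filters):
    """Yield only the rows that match every field/value pair in filters.

    `filters` is built at runtime from the user's selections, so any
    number of fields can be combined on the client side.
    """
    for row in rows:
        if all(row.get(key) == value for key, value in filters.items()):
            yield row


# Example: two filter fields chosen dynamically by the user.
rows = [
    {"status": "open", "owner": "ann"},
    {"status": "open", "owner": "bob"},
    {"status": "closed", "owner": "ann"},
]
matching = list(apply_filters(rows, {"status": "open", "owner": "ann"}))
```

The trade-off is that unfiltered rows still travel over the wire, so applying the single server-side filter to the most selective field first keeps the client-side work small.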

Database schema for a tinder like app

I have a database of millions of objects. Every day I will present 3 selected objects to each user, and as with Tinder they can swipe left to say they don't like an object or swipe right to say they like it.
I select the objects based on their location (those closest to the user are selected first) and also based on a few user settings.
I am using MongoDB.
Now the problem: how do I design the database so that it can quickly provide a daily selection of objects to show to the end user, skipping all the objects he has already swiped?
Well, considering you have made your choice of using MongoDB, you will have to maintain multiple collections. One is your main collection, and you will have to maintain user specific collections which hold user data, say the document ids the user has swiped. Then, when you want to fetch data, you might want to do a setDifference aggregation. SetDifference does this:
Takes two sets and returns an array containing the elements that only
exist in the first set; i.e. performs a relative complement of the
second set relative to the first.
Now how performant this is would depend on the size of your sets and the overall scale.
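As a rough illustration of the relative-complement idea (not MongoDB code), here is a small Python sketch. The function name unseen, the id strings, and the limit of 3 are assumptions taken from the question's "3 objects per day". Unlike a pure set operation, this version keeps the candidates' order so that the closest objects stay first.

```python
def unseen(candidate_ids, swiped_ids, limit=3):
    """Return up to `limit` candidates the user has not swiped yet.

    candidate_ids is assumed to be pre-sorted (e.g. nearest first);
    swiped_ids is the per-user collection of already swiped ids.
    """
    swiped = set(swiped_ids)  # O(1) membership checks
    picked = []
    for cid in candidate_ids:
        if cid not in swiped:
            picked.append(cid)
            if len(picked) == limit:
                break
    return picked


todays_pick = unseen(["o1", "o2", "o3", "o4", "o5", "o6"], ["o2", "o4"])
```

The cost grows with the number of ids the user has swiped, which is the scaling concern raised above.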
EDIT
I agree with your comment that this is not a scalable solution.
Solution 2:
One solution I could think of is to use a graph based solution, like Neo4j. You could represent all your 1M objects and all your user objects as nodes and have relationships between users and objects that he has swiped. Your query would be to return a list of all objects the user is not connected to.
Graphs are hard to shard, which brings up scaling challenges. Graph-based solutions tend to perform best when the entire graph fits in memory. So the feasibility of this solution depends on you.
Solution 3:
Use MySQL. Have 2 tables, one being the objects table and the other being a (uid, viewed_object) mapping. A join would solve your problem. Joins work well for a long time, until you hit a certain scale. So I don't think it is a bad starting point.
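A minimal sketch of the two-table join, using Python's built-in sqlite3 instead of MySQL so that it runs without a server. The table and column names (objects, viewed, uid, object_id) and the sample rows are assumptions for illustration; the LEFT JOIN ... IS NULL pattern itself is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE viewed (
    uid INTEGER,
    object_id INTEGER,
    PRIMARY KEY (uid, object_id)
);
INSERT INTO objects VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
INSERT INTO viewed VALUES (7, 1), (7, 3), (8, 2);
""")


def unseen_objects(conn, uid, limit=3):
    # Anti-join: keep objects with no matching row in `viewed` for this user.
    rows = conn.execute(
        """
        SELECT o.id
        FROM objects o
        LEFT JOIN viewed v ON v.object_id = o.id AND v.uid = ?
        WHERE v.object_id IS NULL
        ORDER BY o.id
        LIMIT ?
        """,
        (uid, limit),
    )
    return [row[0] for row in rows]
```

In a real schema the ORDER BY would use distance to the user rather than id, and the composite primary key on (uid, object_id) is what keeps the lookup cheap.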
Solution 4:
Use Bloom filters. Your problem eventually boils down to a set membership problem: given a set of ids, check whether each one is part of another set. A Bloom filter is a probabilistic data structure that answers set membership queries. Bloom filters are very small and very efficient. The catch is that they are probabilistic: false negatives never happen, but false positives can. So that's the trade-off. Check this out for how it's used: http://blog.vawter.com/2016/03/17/Using-Bloomfilters-to-Avoid-Repetition/
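For readers who have not seen one, a toy Bloom filter might look like the sketch below. The class, its default sizes, and the SHA-256 based hashing scheme are illustrative assumptions, not taken from the linked post; a production setup would size the bit array from the expected number of swipes and the acceptable false positive rate.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
for object_id in ["obj1", "obj2"]:
    seen.add(object_id)
```

For the swipe use case, a false positive only means an unseen object is occasionally skipped, which is usually acceptable when there are millions of candidates.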
I'll update the answer if I can think of something else.

What is the best way to retrieve information in a graph through the has step

I'm using Titan graph DB with the TinkerPop plugin. What is the best way to retrieve a vertex using the has step?
Assuming employeeId is a unique attribute which has a unique vertex centric index defined.
Is it through the label, i.e.
g.V().has(label,'employee').has('employeeId','emp123')
g.V().has('employee','employeeId','emp123')
or is it better to retrieve a vertex based on unique properties directly, i.e.
g.V().has('employeeId','emp123')
Which of the two is the quicker and better way?
First you have 2 options to create the index:
mgmt.buildIndex('byEmployeeId', Vertex.class).addKey(employeeId).buildCompositeIndex()
mgmt.buildIndex('byEmployeeId', Vertex.class).addKey(employeeId).indexOnly(employee).buildCompositeIndex()
For option 1 it doesn't really matter which query you're going to use. For option 2 it's mandatory to use g.V().has('employee','employeeId','emp123').
Note that g.V().hasLabel('employee').has('employeeId','emp123') will NOT select all employees first. Titan is smart enough to first apply the filter conditions that can leverage an index.
One more thing I want to point out is this: the whole point of indexOnly() is to allow sharing properties between different types of vertices. So instead of calling the property employeeId, you could call it uuid and also use it for employers, companies, etc.:
mgmt.buildIndex('employeeById', Vertex.class).addKey(uuid).indexOnly(employee).buildCompositeIndex()
mgmt.buildIndex('employerById', Vertex.class).addKey(uuid).indexOnly(employer).buildCompositeIndex()
mgmt.buildIndex('companyById', Vertex.class).addKey(uuid).indexOnly(company).buildCompositeIndex()
Your queries will then always have this pattern: g.V().has('<label>','<prop-key>','<prop-value>'). This is in fact the only way to go in DSE Graph, since we completely got rid of global indexes that span all types of vertices. At first I really didn't like this decision, but I have since come to agree that this is so much cleaner.
The second option g.V().has('employeeId','emp123') is better as long as the property employeeId has been indexed for better performance.
This is because each step in a Gremlin traversal acts as a filter. So when you say:
g.V().has(label,'employee').has('employeeId','emp123')
You first go to all the vertices with the label employee and then from the employee vertices you find emp123.
With g.V().has('employeeId','emp123') a composite index allows you to go directly to the correct vertex.
Edit:
As Daniel has pointed out in his answer, Titan is actually smart enough to not visit all employees and leverages the index immediately. So in this case it appears there is little difference between the traversals. I personally favour using direct global indices without labels (i.e. the traversal without the label step), but that is just a preference when using Titan; I like to keep steps and filters to a minimum.

Storing the order of embedded documents in a separated array

I have a set of objects that the user can sort arbitrarily. I would like my client to remember the sorting of the set of objects so that when the user visits the page again, the ordering he/she chose is preserved. However, the client-side framework should also be able to quickly look up the objects, based on that ordering, from whatever array/hashmap they are stored in. What is the most efficient way of doing this?
The best way I have found for doing this is to use a separate array that stores the IDs of the objects in the particular order I want. From there, I can access the objects I want by converting the array of objects to a hashmap keyed by ID using Underscore.js.
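The approach can be sketched as follows, in Python rather than JavaScript so that it is self-contained. The object shape, the id values, and the variable names are hypothetical; the dictionary comprehension plays the role that Underscore's _.indexBy would play on the client.

```python
# Objects as they come back from storage, in no particular order.
objects = [
    {"id": "a", "title": "Alpha"},
    {"id": "b", "title": "Beta"},
    {"id": "c", "title": "Gamma"},
]

# The user's chosen ordering, persisted separately as an array of IDs.
order = ["c", "a", "b"]

# Hashmap keyed by ID for O(1) lookups.
by_id = {obj["id"]: obj for obj in objects}

# Rebuild the display list by walking the ID array.
ordered = [by_id[obj_id] for obj_id in order]
titles = [obj["title"] for obj in ordered]
```

Reordering then only rewrites the small ID array, while the objects themselves stay untouched.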