Possible to do TSP with mongodb? - mongodb

I'm working an a map app where I can add waypoints along specific routes. I need to pull my waypoints out in order obviously so I can get directions from A-D in the correct order.
I've read a bit about geoJSON in mongodb but I'm curious if theres a way to query my data so that my points come out ordered by how close they are together rather than the order I've put them in.
Basically what I'm asking...Is there a way to do a "Traveling Salesman query" so that my waypoints are ordered in the smartest order?

I think the short answer is no. You're going to need to add an order key to your waypoints. This is a pretty standard pattern for any nav system with waypoints. At the very least you need to know which point is first and which is last so that you can then solve the TSP.

Related

Mapbox: Is there a way to retrieve all coordinates for a specific feature ID?

I'm building a feature where I need to extract all coordinates of a selecetd road/path in Mapbox when it's clicked on. I've attempted to use the queryRenderedFeatures method, but it seems the result list is fragmented. By "fragmented" I mean that if you have a road or path which is clearly just one long path/road when rendered on the map, it often consists up of 4-5-6 or more features, and you cannot really work out from the feature collection how they're supposed to be connected (in order)
I then tried to use the Tilequery API, but it doesn't return any coordinates for LineStrings.
Is there any API - server or client side - in Mapbox, where you can provide an ID of a feature and retrieve the all coordinates for a road or path?
Thanks in advance :-)
I think you're really asking: "is there a way to access complete LineString features for data in Mapbox's tilesets", to which the answer is, no, not really - other than trying to reassemble them in the way you have tried.
For your own data, you could host it using Mapbox's Datasets, rather than Tilesets.

Do I have to loop through each 'page' of orders to get all orders in one WooComerce REST Api query?

I've built a KNIME workflow that helps me analyse (sales) data from numerous channels. In the past I used to export all orders manually and use an XSLX or CSV reader but I want to do it via WooCommerce's REST API to reduce manual labor.
I would like to be able to receive all orders up until now from a single query. So far, I only get as many as the # I fill in for &per_page=X. But if I fill in like 1000, it gives an error. This + my common sense give me the feeling I'm thinking the wrong way!
If it is not possible, is looping through all pages the second best thing?
I've managed to connect to the api via basic auth. The following query returns orders, but only 10:
I've tried increasing the number per_page but I do not think this is the right way to get all orders in one table.
https://XXXX.nl/wp-json/wc/v3/orders?consumer_key=XXXX&consumer_secret=XXXX
My current mindset would like to be able to receive all orders up until now from a single query. But it personally also feels like that this is not the common way to do it. Is looping through all pages the second best thing?
Thanks in advance for your responses. I am more of a data analist than a data engineer or scientist and I hope your answers will help me towards my goal of being more of a scientist :)
It's possible by passing params "per_page" with the request
per_page integer Maximum number of items to be returned in result set. Default is 10.
Try -1 as the value
https://woocommerce.github.io/woocommerce-rest-api-docs/?php#list-all-orders

Database schema for a tinder like app

I have a database of million of Objects (simply say lot of objects). Everyday i will present to my users 3 selected objects, and like with tinder they can swipe left to say they don't like or swipe right to say they like it.
I select each objects based on their location (more closest to the user are selected first) and also based on few user settings.
I m under mongoDB.
now the problem, how to implement the database in the way it's can provide fastly everyday a selection of object to show to the end user (and skip all the object he already swipe).
Well, considering you have made your choice of using MongoDB, you will have to maintain multiple collections. One is your main collection, and you will have to maintain user specific collections which hold user data, say the document ids the user has swiped. Then, when you want to fetch data, you might want to do a setDifference aggregation. SetDifference does this:
Takes two sets and returns an array containing the elements that only
exist in the first set; i.e. performs a relative complement of the
second set relative to the first.
Now how performant this is would depend on the size of your sets and the overall scale.
EDIT
I agree with your comment that this is not a scalable solution.
Solution 2:
One solution I could think of is to use a graph based solution, like Neo4j. You could represent all your 1M objects and all your user objects as nodes and have relationships between users and objects that he has swiped. Your query would be to return a list of all objects the user is not connected to.
You cannot shard a graph, which brings up scaling challenges. Graph based solutions require that the entire graph be in memory. So the feasibility of this solution depends on you.
Solution 3:
Use MySQL. Have 2 tables, one being the objects table and the other being (uid-viewed_object) mapping. A join would solve your problem. Joins work well for the longest time, till you hit a scale. So I don't think is a bad starting point.
Solution 4:
Use Bloom filters. Your problem eventually boils down to a set membership problem. Give a set of ids, check if its part of another set. A Bloom filter is a probabilistic data structure which answers set membership. They are super small and super efficient. But ya, its probabilistic though, false negatives will never happen, but false positives can. So thats a trade off. Check out this for how its used : http://blog.vawter.com/2016/03/17/Using-Bloomfilters-to-Avoid-Repetition/
Ill update the answer if I can think of something else.

what is the best way to retrive information in a graph through has Step

I'm using titan graph db with tinkerpop plugin. What is the best way to retrieve a vertex using has step?
Assuming employeeId is a unique attribute which has a unique vertex centric index defined.
Is it through label
i.e g.V().has(label,'employee').has('employeeId','emp123')
g.V().has('employee','employeeId','emp123')
(or)
is it better to retrieve a vertex based on Unique properties directly?
i.e g.V().has('employeeId','emp123')
Which one of the two is the quickest and better way?
First you have 2 options to create the index:
mgmt.buildIndex('byEmployeeId', Vertex.class).addKey(employeeId).buildCompositeIndex()
mgmt.buildIndex('byEmployeeId', Vertex.class).addKey(employeeId).indexOnly(employee).buildCompositeIndex()
For option 1 it doesn't really matter which query you're going to use. For option 2 it's mandatory to use g.V().has('employee','employeeId','emp123').
Note that g.V().hasLabel('employee').has('employeeId','emp123') will NOT select all employees first. Titan is smart enough to apply those filter conditions, that can leverage an index, first.
One more thing I want to point out is this: The whole point of indexOnly() is to allow to share properties between different types of vertices. So instead of calling the property employeeId, you could call it uuid and also use it for employers, companies, etc:
mgmt.buildIndex('employeeById', Vertex.class).addKey(uuid).indexOnly(employee).buildCompositeIndex()
mgmt.buildIndex('employerById', Vertex.class).addKey(uuid).indexOnly(employer).buildCompositeIndex()
mgmt.buildIndex('companyById', Vertex.class).addKey(uuid).indexOnly(company).buildCompositeIndex()
Your queries will then always have this pattern: g.V().has('<label>','<prop-key>','<prop-value>'). This is in fact the only way to go in DSE Graph, since we got completely rid of global indexes that span across all types of vertices. At first I really didn't like this decision, but meanwhile I have to agree that this is so much cleaner.
The second option g.V().has('employeeId','emp123') is better as long as the property employeeId has been indexed for better performance.
This is because each step in a gremlin traversal acts a filter. So when you say:
g.V().has(label,'employee').has('employeeId','emp123')
You first go to all the vertices with the label employee and then from the employee vertices you find emp123.
With g.V().has('employeeId','emp123') a composite index allows you to go directly to the correct vertex.
Edit:
As Daniel has pointed out in his answer, Titan is actually smart enough to not visit all employees and leverages the index immediately. So in this case it appears there is little difference between the traversals. I personally favour using direct global indices without labels (i.e. the first traversal) but that is just a preference when using Titan, I like to keep steps and filters to a minimum.

Database and item orders (general)

I'm right now experimenting with a nodejs based experimental app, where I will be putting in a list of books and it will be posted on a forum automatically every x minutes.
Now my question is about order of these things posted.
I use mongodb (not sure if this changes the question or not) and I just add a new entry for every item to be posted. Normally, things are posted in the exact order I add them.
However, for the web interface of this experimental thing, I made a re-ordering interaction where I can simply drag and drop elements to reorder them.
My question is: how can I reflect this change to the database?
Or more in general terms, how can I order stuff in general, in databases?
For instance if I drag the 1000th item to 1st order, everything below needs to be edited (in db) between 1 and 1000 the entries. This does not seem like a valid and proper solution to me.
Any enlightenment is appreciated.
An elegant way might be lexicographic sorting. Introduce a String attribute for each item. Make the initial length of the values large enough to accomodate the estimated number of items. E.g., if you expect 1000 items, let the keys be baa, bab, bac, ... bba, bbb, bbc, ...
Then, when an item is moved from where it is to another place between two items, assign a value to the sorting attribute of the moved item that is somewhere equidistant (lexicographically) to those items. So to move an item between dei and dej, give it the value deim. To move an item between fadd and fado, give it the value fadi.
Keys starting with a were initially not used to leave space for elements that get dragged before the first one. Never use the key a, as it will be impossible to move an element before this one.
Of course, the characters used may vary according to the sort order provided by the database.
This solution should work fine as long as elements are not reordered extremely frequently. In a worst case scenario, this may lead to longer and longer attribute values. But if the movements are somewhat equally distributed, the length of values should stay reasonable.