How to create a join query between two entities in FIWARE Orion?

I have two entities, A (postings) and B (users), with a many-to-one relationship, and I want to do an SQL-like join to list posts and users together.
I tried the following query, but it only returns an empty array.
I could not find any documentation on this; all the tutorials only refer to a single id.
https://{{orion}}/v2/entities?q=Post.createdBy==User.id&options=count,keyValues&limit=100
I also tried this variant, with the same result:
https://{{orion}}/v2/entities?q=createdBy==id&options=count,keyValues&limit=100
I also checked manually that at least one matching entity exists:
https://{{orion}}/v2/entities?q=createdBy==urn:ngsi-ld:User:78bc5cac26d1b3abe27cb2cf94b1015a3710958f
This query above returns an entity.
| Post | | User |
|------------------| |--------------|
| ID | | ID |
| Type | | Type |
| Message | | Username |
| CreatedBy: UserId|
In Orion, the createdBy attribute is stored as:
"createdBy": {
"type": "Relationship",
"value": "urn:ngsi-ld:User:78bc5cac26d1b3abe27cb2cf94b1015a3710958f",
"metadata": {}
},

The short answer is that you cannot do that kind of join operation using the NGSI API provided by Orion.
The NGSI API is oriented to context management, that is, retrieving and updating context information in the form of entities and their attributes. It is not designed to be a SQL-style language, which would involve a lot of complexity. Orion is a context broker, not a relational database ;)
Fortunately, that kind of join can be done by the client, by issuing several queries to Orion.
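A minimal sketch of such a client-side join in Python, assuming the entity types are named Post and User (as in the table in the question) and that the entities are fetched with options=keyValues. The helper names fetch_entities and join_posts_with_users are made up for illustration, and paging beyond limit is not handled.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_entities(base_url, entity_type, limit=100):
    """Fetch entities of one type from Orion in keyValues form.

    Hypothetical helper: base_url is the Orion root, e.g. the value
    behind the {{orion}} placeholder used in the question.
    """
    query = urlencode({"type": entity_type, "options": "keyValues", "limit": limit})
    with urlopen(f"{base_url}/v2/entities?{query}") as resp:
        return json.load(resp)


def join_posts_with_users(posts, users):
    """Pair each post with the user whose id equals the post's createdBy."""
    users_by_id = {user["id"]: user for user in users}
    return [
        {"post": post, "user": users_by_id.get(post.get("createdBy"))}
        for post in posts
    ]


# Usage (requires a reachable Orion instance, so it is left commented out):
# posts = fetch_entities("https://orion.example", "Post")
# users = fetch_entities("https://orion.example", "User")
# joined = join_posts_with_users(posts, users)
```

Indexing the users by id first keeps the join to two requests and a single pass over the posts. A post whose createdBy does not match any fetched user is paired with None.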

Is there a better way to return a table that has a one-to-many relationship

I want my data in a better format so that my front-end application can be more efficient. I am using Postgres and have a table with a one-to-many relationship; after I join my tables, I get multiple rows of mostly repeated data just to show the different 'many' values. Is there a way to put the 'many' values into an array so that I don't get so many extra rows? If not, what would be the most efficient way to process the data returned from a one-to-many relationship in the front-end?
Example:
"name" | "attributes"
"Shirt" | "Medium"
"Shirt" | "Red"
Would it be possible to instead get a result like this so that it is easier to process on the front-end:
"name" | "attributes"
"Shirt" | ["Medium", "Red"]
You are looking for the aggregate functions array_agg or string_agg.
select name, array_agg(attribute) attributes
from example
group by name;
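If the grouping has to happen on the client instead (the second half of the question), the same idea can be sketched in a few lines. This is only an illustration in Python, assuming the rows arrive as (name, attribute) pairs; group_attributes is a made-up helper name.

```python
from collections import defaultdict


def group_attributes(rows):
    """Collect the 'many' values of each name into one list, keeping row order."""
    grouped = defaultdict(list)
    for name, attribute in rows:
        grouped[name].append(attribute)
    return dict(grouped)


rows = [("Shirt", "Medium"), ("Shirt", "Red")]
print(group_attributes(rows))  # {'Shirt': ['Medium', 'Red']}
```

Doing the aggregation in the database with array_agg is usually preferable, since the repeated columns are then never sent over the wire.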

Using NSPredicate to refer other NSPredicate rules

Let's say I have a Core Data database for NSPredicate rules.
enum PredicateType: Int {
case beginswith
case endswith
case contains
}
My database looks like this:
+------+-----------+
| Type | Content |
+------+-----------+
| 0 | Hello |
| 1 | end |
| 2 | somevalue |
| 0 | end |
+------+-----------+
I have the content "This is end". How can I query Core Data to check whether any rule is satisfied by this content? It should find the second entry in the table
+------+-----------+
| Type | Content |
+------+-----------+
| 1 | end |
+------+-----------+
but shouldn't find
+------+-----------+
| Type | Content |
+------+-----------+
| 0 | end |
+------+-----------+
because in this sentence "end" is not at the beginning.
Currently I fetch all the values, create a predicate from each Content and Type, and query the database again, which I believe is a big overhead.
The way you are doing it now is correct. You first need to build your predicates (which in your case is a fairly complex operation that also requires fetching) and then run each predicate to see which one matches.
I wouldn't be so quick to assume that there is a huge overhead with this. If your data set is small (<300) I suspect there would be no problem at all. If you are experiencing problems then (and only then!) you should start optimizing.
If you see that the app is running too slowly, use Instruments to see where the issue is. There are two places where I could see performance issues: 1) fetching all the predicates from the database and 2) running all of the predicates.
If you want to make the fetching faster, I would recommend using an NSFetchedResultsController. While it is generally used to keep data in sync with a table view, it can be used for any data that you want to keep current at all times. With the controller you do a single fetch, and then it monitors Core Data and keeps itself up to date. Then, when you need all of the predicates, instead of doing a fetch you simply access the controller's fetchedObjects property.
If you find that running all the predicates takes a long time, you can speed up beginsWith and endsWith with a clever use of binary search. You keep two arrays of custom predicate objects, one sorted alphabetically and the other with all the reversed strings sorted alphabetically. To find which string the content begins with, use indexOfObject:inSortedRange:options:usingComparator: to find the relevant objects. I don't know how you can improve contains. You could see whether running string methods on the objects is faster than NSPredicate methods. You could also try running the predicates concurrently on a background thread.
Again, you shouldn't do any of this unless you find that you need to. If your data set is small, then the way you are doing it now is fine.
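The binary-search idea can be sketched as follows. This is an illustration in Python rather than Swift, and it is one possible reading of the suggestion: every prefix of the content is looked up in the sorted rule list, and suffixes are handled by reversing both the rules and the content. The function names are made up.

```python
from bisect import bisect_left


def matching_prefixes(sorted_rules, content):
    """Return every rule in sorted_rules that content begins with."""
    matches = []
    for length in range(1, len(content) + 1):
        candidate = content[:length]
        # Binary search for an exact match of this prefix.
        i = bisect_left(sorted_rules, candidate)
        if i < len(sorted_rules) and sorted_rules[i] == candidate:
            matches.append(candidate)
    return matches


def matching_suffixes(sorted_reversed_rules, content):
    """Return every rule that content ends with, given the rules reversed and sorted."""
    hits = matching_prefixes(sorted_reversed_rules, content[::-1])
    return [hit[::-1] for hit in hits]


# Rules from the question's table: type 0 (beginswith) and type 1 (endswith).
begins_rules = sorted(["Hello", "end"])
ends_rules_reversed = sorted(rule[::-1] for rule in ["end"])

content = "This is end"
print(matching_prefixes(begins_rules, content))         # []
print(matching_suffixes(ends_rules_reversed, content))  # ['end']
```

Each lookup costs one binary search per prefix of the content, so the work grows with the length of the content and the logarithm of the number of rules instead of with the number of rules itself.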

How to get multiselect results containing both properties and lists via JPA 2.1 in Java?

The goal is to get the firstName, lastName, and the list of todos for an employee by using multiselect:
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Tuple> q = cb.createTupleQuery();
Root<Employee> emp = q.from(Employee.class);
q.multiselect(
    emp.get("firstName").alias("fname"),
    emp.get("lastName").alias("lname"),
    emp.get("toDoList").alias("toDoList")
).where(cb.equal(emp.get("id"), 12345));
List<Tuple> tuples = em.createQuery(q).getResultList();
Iterator<Tuple> iter = tuples.iterator();
while (iter.hasNext()) {
    Tuple t = iter.next();
    //t.get("fname"); // returns String
    //t.get("lname"); // returns String
    //t.get("toDoList"); // returns String
    //?????
    //...
}
The toDoList is just a simple list of Strings. Let's assume employee 12345 has 4 todos. This means I get the following result set:
---------------------------------
| firstName | lastName | toDo |
---------------------------------
| John | Doe | sleep |
---------------------------------
| John | Doe | eat |
---------------------------------
| John | Doe | play |
---------------------------------
| John | Doe | laugh |
---------------------------------
Now I'm looking for a smart way of creating an instance of ONE Employee and setting its firstName, lastName, and toDoList ONCE:
Employee employee = new Employee();
employee.setFirstName(...);
employee.setLastName(...);
employee.setToDoList(...);
What would be the best approach? Things would get even more complicated if I added further relationships (e.g. favoriteSongs).
Unfortunately, you would have to write code for every combination of your multiselect call to get the data out of the tuple list, and that code can get very dirty.
There are major disadvantages to fetching relationships with one query:
the Java code for reading the data gets very dirty
one query sounds like good performance at first, but it retrieves duplicate data for every fetched row
So I actually believe that executing separate queries is the better approach in many cases (as in mine). In other words, I will avoid messy code and use single selects instead. The resulting code for reading the projection's data is much easier to maintain and much more readable. Here are some important links I found:
SQL for JPQL with join fetch
A user may want every relationship loaded, but join fetching every
relationship, in particular every ToMany relationships will lead to a
huge join (outer joins at that), fetching a huge amount of duplicate
data.
Fetch Join in JPQL
2. Fetch Join in JPQL
And the main disadvantage is that we need to write additional code
which executes the query. But it gets even worse, if the entity has
multiple relations and we need to initialize different relations for
different use cases. In this case we need to write a query for every
required combination of fetch joined relations. This can become quite
messy.
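To make the duplication concrete, here is a small sketch (in Python, not JPA) of what a single joined result looks like once a second to-many relationship such as favoriteSongs is added, and what folding it back into one record involves. The song titles and the fold_employee_rows helper are made up for illustration.

```python
def fold_employee_rows(rows):
    """Collapse joined rows (first, last, todo, song) into one record.

    Repeated values are dropped, which also means a to-do that genuinely
    appears twice in the list would be collapsed into one entry.
    """
    first_name, last_name = rows[0][0], rows[0][1]
    todos, songs = [], []
    for _, _, todo, song in rows:
        if todo not in todos:
            todos.append(todo)
        if song not in songs:
            songs.append(song)
    return {
        "firstName": first_name,
        "lastName": last_name,
        "toDoList": todos,
        "favoriteSongs": songs,
    }


todos = ["sleep", "eat", "play", "laugh"]
songs = ["song A", "song B"]  # hypothetical values
# Joining both relationships in one query yields their cross product.
rows = [("John", "Doe", todo, song) for todo in todos for song in songs]

print(len(rows))  # 8 rows to carry 4 to-dos and 2 songs
print(fold_employee_rows(rows)["toDoList"])  # ['sleep', 'eat', 'play', 'laugh']
```

With separate queries, each list would come back exactly once and no de-duplication step would be needed.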

Postgres array fields: find where array contains value

Currently I have a table schema that looks like this:
| id | visitor_ids | name |
|----|-------------|----------------|
| 1 | {abc,def} | Chris Houghton |
| 2 | {ghi} | Matt Quinn |
The visitor_ids are all GUIDs, I've just shortened them for simplicity.
A user can have multiple visitor ids, hence the array type.
I have a GIN index created on the visitor_ids field.
I want to be able to lookup users by a visitor id. Currently we're doing this:
SELECT *
FROM users
WHERE visitor_ids && array['abc'];
The above works, but it's really really slow at scale - it takes around 45ms which is ~700x slower than a lookup by the primary key. (Even with the GIN index)
Surely there's got to be a more efficient way of doing this? I've looked around and wasn't able to find anything.
Possible solutions I can think of could be:
The current query is just bad and needs improving
Using a separate user_visitor_ids table
Something smart with special indexes
Help appreciated :)
I tried the second solution - 700x faster. Bingo.
I feel like this is an unsolved problem, however. What's the point of adding arrays to Postgres when the performance is so bad, even with indexes?
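A sketch of the separate-table layout, using Python's built-in sqlite3 module as a stand-in for Postgres so that it runs anywhere. The column names and the assumption that each visitor id belongs to exactly one user (hence the primary key on visitor_id) are mine, not from the question, and nothing here reproduces the timings reported above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per visitor id; the primary key gives an index for the lookup.
    CREATE TABLE user_visitor_ids (
        visitor_id TEXT PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    """
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Chris Houghton"), (2, "Matt Quinn")],
)
conn.executemany(
    "INSERT INTO user_visitor_ids VALUES (?, ?)",
    [("abc", 1), ("def", 1), ("ghi", 2)],
)


def find_user_by_visitor_id(conn, visitor_id):
    """Return (id, name) of the user owning visitor_id, or None if unknown."""
    return conn.execute(
        "SELECT u.id, u.name FROM users u "
        "JOIN user_visitor_ids v ON v.user_id = u.id "
        "WHERE v.visitor_id = ?",
        (visitor_id,),
    ).fetchone()


print(find_user_by_visitor_id(conn, "def"))  # (1, 'Chris Houghton')
```

The lookup becomes an ordinary indexed equality match followed by a primary-key join, which is the access pattern the question compares against.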

Neo4j: MERGE creates duplicate nodes

My database model has users and MAC addresses. A user can have multiple MAC addresses, but a MAC can only belong to one user. If some user sets his MAC and that MAC is already linked to another user, the existing relationship is removed and a new relationship is created between the new owner and that MAC. In other words, a MAC moves between users.
This is a particular instance of the Cypher query I'm using to assign MAC addresses:
MATCH (new:User { Id: 2 })
MERGE (mac:MacAddress { Value: "D857EFEF1CF6" })
WITH new, mac
OPTIONAL MATCH ()-[oldr:MAC_ADDRESS]->(mac)
DELETE oldr
MERGE (new)-[:MAC_ADDRESS]->(mac)
The query runs fine in my tests, but in production, for some strange reason it sometimes creates duplicate MacAddress nodes (and a new relationship between the user and each of those nodes). That is, a particular user can have multiple MacAddress nodes with the same Value.
I can tell they are different nodes because they have different node IDs. I'm also sure the Values are exactly the same because I can do a collect(distinct mac.Value) on them and the result is a collection with one element. The query above is the only one in the code that creates MacAddress nodes.
I'm using Neo4j 2.1.2. What's going on here?
Thanks,
Jan
Are you sure this is the entirety of the queries you're running? MERGE has this really common pitfall where it merges everything that you give it. So here's what people expect:
neo4j-sh (?)$ MERGE (mac:MacAddress { Value: "D857EFEF1CF6" });
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 1
Properties set: 1
Labels added: 1
1650 ms
neo4j-sh (?)$ MERGE (mac:MacAddress { Value: "D857EFEF1CF6" });
+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
17 ms
neo4j-sh (?)$ match (mac:MacAddress { Value: "D857EFEF1CF6" }) return count(mac);
+------------+
| count(mac) |
+------------+
| 1 |
+------------+
1 row
200 ms
So far, so good. That's what we expect. Now watch this:
neo4j-sh (?)$ MERGE (mac:MacAddress { Value: "D857EFEF1CF6" })-[r:foo]->(b:SomeNode {label: "Foo!"});
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 2
Relationships created: 1
Properties set: 2
Labels added: 2
178 ms
neo4j-sh (?)$ match (mac:MacAddress { Value: "D857EFEF1CF6" }) return count(mac);
+------------+
| count(mac) |
+------------+
| 2 |
+------------+
1 row
2 ms
Wait, WTF happened here? We specified only the same MAC address again, why is a duplicate created?
The documentation on MERGE specifies that "MERGE will not partially use existing patterns — it’s all or nothing. If partial matches are needed, this can be accomplished by splitting a pattern up into multiple MERGE clauses". So because when we run this path MERGE the whole path doesn't already exist, it creates everything in it, including a duplicate mac address node.
There are frequently questions about duplicated nodes created by MERGE, and 99 times out of 100, this is what's going on.
This is the response I got back from Neo4j's support (emphasis mine):
I got some feedback from our team already, and it's currently known that this can happen in the absence of a constraint. MERGE is effectively MATCH or CREATE - and those two steps are run independently within the transaction. Given concurrent execution, and the "read committed" isolation level, there's a race condition between the two.
The team have had some discussion on how to provide a higher guarantee in the face of concurrency, and do have it noted as a feature request for consideration.
Meanwhile, they've assured me that using a constraint will provide the uniqueness you're looking for.
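The race described in that reply can be illustrated with a toy model. The Python below is not how Neo4j is implemented; ToyGraph and interleaved_merge are made-up names, and the two "transactions" are interleaved by hand so that the outcome is deterministic. The fallback after a constraint violation is a simplification of whatever the database actually does.

```python
class ToyGraph:
    """Toy in-memory node store, only for illustrating the race."""

    def __init__(self, unique_values=False):
        self.nodes = []  # list of (node_id, value)
        self.unique_values = unique_values

    def match(self, value):
        return [node for node in self.nodes if node[1] == value]

    def create(self, value):
        # With the constraint enabled, a duplicate value is rejected.
        if self.unique_values and self.match(value):
            raise ValueError(f"constraint violation: {value!r} already exists")
        node = (len(self.nodes), value)
        self.nodes.append(node)
        return node


def interleaved_merge(graph, value):
    """Two 'transactions' both run the MATCH step before either runs CREATE."""
    seen_by_tx1 = graph.match(value)
    seen_by_tx2 = graph.match(value)
    results = []
    for seen in (seen_by_tx1, seen_by_tx2):
        if seen:
            results.append(seen[0])
            continue
        try:
            results.append(graph.create(value))
        except ValueError:
            # Simplified fallback: reuse the node the other transaction created.
            results.append(graph.match(value)[0])
    return results


plain = ToyGraph()
interleaved_merge(plain, "D857EFEF1CF6")
print(len(plain.nodes))  # 2: both transactions saw nothing and both created

constrained = ToyGraph(unique_values=True)
interleaved_merge(constrained, "D857EFEF1CF6")
print(len(constrained.nodes))  # 1: the second create is rejected
```

In the model without the constraint, both transactions see an empty match and both create a node, which mirrors the duplicates observed in production. With the constraint, only one node can exist.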