Neo4J - How to merge nodes with the same attribute value (name) but different node labels - merge

I currently have duplicate name values across different node labels and I want to merge them. The issue is that every question I've found online assumes that BOTH the attribute name and the labels are the same. The code I've executed to query the instances I'm referring to is:
MATCH (a)-[r:FEATURED_IN]->(b) WHERE a.name = b.name AND id(a) <> id(b)
So that means that 'a' is featured in 'b', obviously 'a' and 'b' refer to the different node labels but the values are the same. How can I perform a merge to ensure that the 'b' node is deleted and only the 'a' node is returned? I know I could do this manually but there are so many instances of this that I would like to find a quick fix.
Thanks in advance.

You can collect all names in the nodes and combine them. Then UNWIND (it is like a For loop) and return distinct name.
MATCH (a)-[r:SYN_OF]->(b) WHERE a.name = b.name AND id(a) <> id(b)
WITH collect(distinct a.name) + collect(distinct b.name) as names
UNWIND names as name
RETURN distinct name
Result:
╒═══════════╕
│"name" │
╞═══════════╡
│"Same Name"│
└───────────┘

Related

neo4j creating random empty nodes when merging

I'm trying to create a new node with label C and relationships from a-->c and b-->c, but if and only if the whole pattern a-->c,b-->c does exist.
a and b already exist (merged before the rest of the query).
The below query is a portion of the query I want to write to accomplish this.
However, it creates a random empty node devoid of properties and labels and attaches the relationship to that node instead. This shouldn't be possible and is certainty not what I want. How do I stop that from happening?
merge (a: A {id: 1})
merge (b: B {id:1})
with *
call {with a, b
match (a)-[:is_required]->(dummy:C), (a)-[:is_required]->(b)
with count(*) as cnt
where cnt = 0
merge (temp: Temporary {id: 12948125})
merge (a)-[:is_required]->(temp)
return temp
}
return *
Thanks
I think there are a couple of problems here:
There are restrictions on how you can use variables introduced with WITH in a sub-query. This article helps to explain them https://neo4j.com/developer/kb/conditional-cypher-execution/
I think you may be expecting the WHERE to introduce conditional flow like IF does in other languages. WHERE is a filter (maybe FILTER would have been a better choice of keyword than WHERE). In this case you are filtering out 'cnt's where they are 0, but then never reference cnt again, so the merge (temp: Temporary {id: 12948125}) and merge (a)-[:is_required]->(temp) always get executed. The trouble is, due to the above restrictions on using variables inside sub-queries, the (a) node you are trying to reference doesn't exist, it's not the one in the outer query. Neo4j then just creates an empty node, with no properties or labels and links it to the :Temporary node - this is completely valid and why you are getting empty nodes.
This query should result in what you intend:
merge (a: A {id: 1})
merge (b: B {id:1})
with *
// Check if a is connected to b or :C (can't use a again otherwise we'd overwrite it)
optional match(x:A {id: 1}) where exists((a)-[:is_required]->(:C)) or exists((a)-[:is_required]->(b))
with *, count(x) as cnt
// use a case to 'fool' foreach into creating the extra :Temporary node required if a is not related to b or :C
foreach ( i in case when cnt = 0 then [1] else [] end |
merge (temp: Temporary {id: 12948125})
merge (a)-[:is_required]->(temp)
)
with *
// Fetch the :Temporary node if it was created
optional match (a)-[:is_required]->(t:Temporary)
return *
There are apoc procedures you could use to perform conditional query execution (they are mentioned in the linked article). You could also play around with looking for a path from (a) and check its length, rather than introduce a new MATCH and the variable x then checking for the existance of related nodes.
If anyone is having the same problem, the answer is that the Neo4j browser is display nonexistent nodes. The query executes fine…

Postgres: Query Values in nested jsonb-structure with unknown keys

I am quite new in working with psql.
Goal is to get values from a nested jsonb-structure where the last key has so many different characteristics it is not possible to query them explicitely.
The jsonb-structure in any row is as follows:
TABLE_Products
{'products':[{'product1':['TYPE'], 'product2':['TYPE2','TYPE3'], 'productN':['TYPE_N']}]}
I want to get the values (TYPE1, etc.) assigned to each product-key (product1, etc.). The product-keys are the unknown, because of too many different names.
My work so far achieves to pull out a tuple for each key:value-pair on the last level. To illustrate this here you can see my code and the results from the previously described structure.
My Code:
select url, jsonb_each(pro)
from (
select id , jsonb_array_elements(data #> '{products}') as pro
from TABLE_Products
where data is not null
) z
My result:
("product2","[""TYPE2""]")
("product2","[""TYPE3""]")
My questions:
Is there a way to split this tuple on two columns?
Or how can I query the values kind of 'unsupervised', so without knowing the exact names of 'product1 ... n'

How to distinct dupulicate nodes on orientdb?

I executed code below.
select expand(both()) from data
where value = '123456'
There are two nodes (node_A, node_B) whose value equals '123456'
and both of them connected with node_C and node_D.
Finally it returned node_C and node_D for several times
but I just want each of them for one time.
I tried to use distinct() around "expand(both())" but failed.
How can I distinct dupulicate nodes?
select expand(distinct(#rid)) from (select expand(both()) from data
where value = '123456')

CASE in JOIN not working PostgreSQL

I got the following tables:
Teams
Matches
I want to get an output like:
matches.semana | teams.nom_equipo | teams.nom_equipo | Winner
1 AMERICA CRUZ AZUL AMERICA
1 SANTOS MORELIA MORELIA
1 LEON CHIVAS LEON
The columns teams.nom_equipo reference to matches.num_eqpo_lo & to matches.num_eqpo_v and at the same time they reference to the column teams.nom_equipo to get the name of each team based on their id
Edit: I have used the following:
SELECT m.semana, t_loc.nom_equipo AS LOCAL, t_vis.nom_equipo AS VISITANTE,
CASE WHEN m.goles_loc > m.goles_vis THEN 'home'
WHEN m.goles_vis > m.goles_loc THEN 'visitor'
ELSE 'tie'
END AS Vencedor
FROM matches AS m
JOIN teams AS t_loc ON (m.num_eqpo_loc = t_loc.num_eqpo)
JOIN teams AS t_vis ON (m.num_eqpo_vis = t_vis.num_eqpo)
ORDER BY m.semana;
But as you can see from the table Matches in row #5 from the goles_loc column (home team) & goles_vis (visitor) column, they have 2 vs 2 (number of goals - home vs visitor) being a tie but and when I run the code I get something that is not a tie:
Matches' score
Resultset from Select:
I also noticed that since the row #5 the names of both teams in the matches are not correct (both visitor & home team).
So, the Select brings correct data but in other order different than the original order (referring to the order from the table matches)
The order from the second week must be:
matches.semana | teams.nom_equipo | teams.nom_equipo | Winner
5 2 CRUZ AZUL TOLUCA TIE
6 2 MORELIA LEON LEON
7 2 CHIVAS SANTOS TIE
Row 8 from the Resultset must be Row # 5 and so on.
Any help would be really thanked!
When doing a SELECT which includes null for a column, that's the value it will always be, so winner in your case will never be populated.
Something like this is probably more along the lines of what you want:
SELECT m.semana, t_loc.nom_equipo AS loc_equipo, t_vis.nom_equipo AS vis_equipo,
CASE WHEN m.goles_loc - m.goles_vis > 0 THEN t_loc.nom_equipo
WHEN m.goles_vis - m.goles_loc > 0 THEN t_vis.nom_equipo
ELSE NULL
END AS winner
FROM matches AS m
JOIN teams AS t_loc ON (m.nom_eqpo_loc = t.num_eqpo)
JOIN teams AS t_vis ON (m.nom_eqpo_vis = t.num_eqpo)
ORDER BY m.semana;
Untested, but this should provide the general approach. Basically, you JOIN to the teams table twice, but using different conditions, and then you need to calculate the scores. I'm using NULL to indicate a tie, here.
Edit in response to comment from OP:
It's the same table -- teams -- but the JOINs produce different results, because the query uses different JOIN conditions in each JOIN.
The first JOIN, for t_loc, compares m.nom_eqpo_loc to t.num_eqpo. This means it gets the teams rows for the home team.
The second JOIN, for t_vis, compares m.nom_eqpo_vis to t.num_eqpo. This means it gets the teams rows for the visting team.
Therefore, in the CASE statement, t_loc refers to the home team, while t_vis refers to the visting one, enabling both to be used in the CASE statement, enabling the correct name to be found for winning.
Edit in response to follow-up comment from OP:
My original query was sorting by m.semana, which means other columns can appear in any order (essentially whichever Postgres feels is most efficient).
If you want the resulting table to be sorted exactly the same way as the matches table, then use the same ORDER BY tuple in its ORDER BY.
So, the ORDER BY clause would then become:
ORDER BY m.semana, m.nom_eqpo_loc, m.nom_eqpo_vis
Basically, the matches table PRIMARY KEY tuple.

Merge neo4j relationships into one while returning the result if certain condition satisfies

My use case is:
I have to return whole graph in result but the condition is
If there are more than 1 relationship in between two particular nodes in the same direction then I have to just merge it into 1 relationship. For ex: Lets say there are two nodes 'm' and 'n' and there are 3 relations in between these nodes say r1, r2, r3 (in the same direction) then when I get the result after firing cypher query I should get only 1 relation in between 'n' and 'm'.
I need to perform some operations on top of it like the resultant relation that we got from merging all the relations should contain the properties and their values that I want to retain. Actually I will retain all the properties of any one of the relations that are merging depending upon the timestamp field that is one of the properties in relation.
Note : I have same properties throughout all my relations (The number of properties and name of properties are same across all relations. Values may differ for sure)
Any help would be appreciated. Thanks in advance.
You mean something like this?
Delete all except the first
MATCH (a)-[r]->(b)
WITH a,b,type(r) as type, collect(r) as rels
FOREACH (r in rels[1..] | DELETE r)
Ordering by timestamp first
MATCH (a)-[r]->(b)
WITH a,r,b
ORDER BY r.timestamp DESC
WITH a,b,type(r) as type, collect(r) as rels
FOREACH (r in rels[1..] | DELETE r)
If you want to do all those operations virtually just on query results you'd do them in your programming language of choice.