Print hierarchical vertices in a graph [closed] - graphic

Closed 4 years ago.
How can I print all the parents (ancestors) of a given node?

Graham's answer is okay, but uses expensive path computations, which are not really required for your use-case.
This is how I've set up your graph:
g = TinkerGraph.open().traversal()
g.addV().property(id, 1).as("v1").
addV().property(id, 2).as("v2").
addV().property(id, 3).as("v3").
addV().property(id, 4).as("v4").
addV().property(id, 5).as("v5").
addV().property(id, 6).as("v6").
addV().property(id, 7).as("v7").
addV().property(id, 8).as("v8").
addE("black").from("v2").to("v1").
addE("black").from("v7").to("v2").
addE("black").from("v8").to("v7").
addE("orange").from("v8").to("v7").
addE("black").from("v3").to("v2").
addE("black").from("v6").to("v3").
addE("black").from("v4").to("v3").
addE("black").from("v5").to("v4").
addE("orange").from("v5").to("v4").iterate()
Now, to get all ancestors, all you need is this:
gremlin> g.V(5).repeat(out().dedup()).emit()
==>v[4]
==>v[3]
==>v[2]
==>v[1]
Likewise you won't need path computations to determine the maximum depth:
gremlin> g.V(5).emit().repeat(out().dedup()).count()
==>5
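For readers without a Gremlin console at hand, the same idea (follow outgoing edges repeatedly, skip vertices already seen, and emit each one as it is reached) can be sketched in plain Python. This is only an illustration under assumptions: the `PARENTS` map and the `ancestors` function are hypothetical names, and the duplicate black/orange edges are collapsed into a single entry per vertex pair.

```python
from collections import deque

# Edges point from child to parent, mirroring the graph built above.
# Hypothetical in-memory stand-in for the TinkerGraph data.
PARENTS = {
    1: [],
    2: [1],
    3: [2],
    4: [3],
    5: [4],
    6: [3],
    7: [2],
    8: [7],
}


def ancestors(start, parents=PARENTS):
    """Return every vertex reachable via outgoing edges, nearest first."""
    seen = set()
    order = []
    queue = deque(parents[start])
    while queue:
        vertex = queue.popleft()
        if vertex in seen:  # plays the role of dedup()
            continue
        seen.add(vertex)
        order.append(vertex)  # plays the role of emit()
        queue.extend(parents[vertex])
    return order


print(ancestors(5))
print(ancestors(8))
```

Counting the start vertex plus its ancestors gives the same number as the count() query above, but only because the branch from vertex 5 is a simple chain.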

Here's a sample graph that I think matches the branch you want to traverse.
graph=TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
g=graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
v1 = graph.addVertex(T.label, "vtx", T.id, 1, "name", "alpha")
==>v[1]
v2 = graph.addVertex(T.label, "vtx", T.id, 2, "name", "beta")
==>v[2]
v3 = graph.addVertex(T.label, "vtx", T.id, 3, "name", "gamma")
==>v[3]
v4 = graph.addVertex(T.label, "vtx", T.id, 4, "name", "delta")
==>v[4]
v5 = graph.addVertex(T.label, "vtx", T.id, 5, "name", "epsilon")
==>v[5]
v5.addEdge("parent", v4, T.id, 101)
==>e[101][5-parent->4]
v4.addEdge("parent", v3, T.id, 102)
==>e[102][4-parent->3]
v3.addEdge("parent", v2, T.id, 103)
==>e[103][3-parent->2]
v2.addEdge("parent", v1, T.id, 104)
==>e[104][2-parent->1]
A repeat() loop taking outbound 'parent' edges will traverse 'up' the hierarchy. In the following example, we've told it to start at v[5] and take two 'hops':
g.V(5).repeat(out('parent')).times(2).path()
==>[v[5],v[4],v[3]]
In practice you probably want to keep traversing (i.e. repeating) until you hit some terminating condition which will depend on what you want to achieve.
You might know you want to stop at a particular vertex, e.g. if you know the root of the hierarchy is v[1], or has a particular property:
g.V(5).repeat(out('parent')).until(hasId(1)).path()
==>[v[5],v[4],v[3],v[2],v[1]]
Or you may want to traverse until you hit the top of the hierarchy (i.e. until there are no more outgoing edges):
g.V(5).repeat(out('parent')).until(outE().count().is(0)).path()
==>[v[5],v[4],v[3],v[2],v[1]]
To get the max depth from the root to the furthest leaf node, let's first add some more vertices to the graph so that it matches yours.
v7 = graph.addVertex(T.label, "vtx", T.id, 7, "name", "eta")
==>v[7]
v8 = graph.addVertex(T.label, "vtx", T.id, 8, "name", "theta")
==>v[8]
v6 = graph.addVertex(T.label, "vtx", T.id, 6, "name", "zeta")
==>v[6]
v8.addEdge("parent", v7, T.id, 105)
==>e[105][8-parent->7]
v7.addEdge("parent", v2, T.id, 106)
==>e[106][7-parent->2]
v6.addEdge("parent", v3, T.id, 107)
==>e[107][6-parent->3]
A traversal from the root toward the leaf vertices will now result in 3 paths:
g.V(1).repeat(__.in('parent')).until(inE().count().is(0)).path()
==>[v[1],v[2],v[3],v[6]]
==>[v[1],v[2],v[7],v[8]]
==>[v[1],v[2],v[3],v[4],v[5]]
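But you only want the length of the longest path. The query below takes the last path emitted, which in the output above is the longest one, and counts its vertices: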
g.V(1).repeat(__.in('parent')).until(inE().count().is(0)).path().
......1> tail(1).unfold().count()
==>5
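As a cross-check outside Gremlin, the longest root-to-leaf path can also be computed without relying on the order in which paths are emitted. The sketch below is plain Python under assumptions: `PARENT_EDGES`, `children_of`, and `max_depth` are hypothetical names, the edge list copies the 'parent' edges created above, and the hierarchy is assumed to contain no cycles.

```python
# Child -> parent pairs copied from the 'parent' edges created above.
PARENT_EDGES = [(5, 4), (4, 3), (3, 2), (2, 1), (8, 7), (7, 2), (6, 3)]


def children_of(edges):
    """Invert child -> parent edges into a parent -> children map."""
    children = {}
    for child, parent in edges:
        children.setdefault(parent, []).append(child)
    return children


def max_depth(root, children):
    """Number of vertices on the longest path from root down to a leaf."""
    kids = children.get(root, [])
    if not kids:
        return 1  # a leaf counts as a path of one vertex
    return 1 + max(max_depth(kid, children) for kid in kids)


CHILDREN = children_of(PARENT_EDGES)
print(max_depth(1, CHILDREN))
```

Like the Gremlin query, this counts vertices rather than edges, so the chain from v[1] down to v[5] yields 5.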
Hope that helps,
Graham

Add sorting to this select [closed]

Closed 3 days ago.
PostgreSQL 15.2
select
current.id as currentId,
current.date as currentDate,
current.position as currentPosition,
current.relevant_url as currentRelevantUrl,
current.restrictions_probability as currentRestrictionsProbability,
current.url_changed as currentUrlChanged,
current.identifier_id as currentIdentifierId,
current.phrase_id as phraseId,
previous.id as previousId,
previous.date as previousDate,
previous.position as previousPosition,
previous.relevant_url as previousRelevantUrl,
previous.restrictions_probability as previousRestrictionsProbability,
previous.url_changed as previousUrlChanged,
previous.identifier_id as previousIdentifierId,
semantics_clusters.id as clusterId,
semantics_clusters.name as clusterName,
site_pages.id as pageId,
site_pages.url as pageUrl,
site_pages.site_id as siteId,
site_sites.url as siteUrl,
semantics_core_phrases.frequency as frequency
from (
select
id,
date,
position,
relevant_url,
restrictions_probability,
url_changed,
identifier_id,
phrase_id
from overoptimisation
where identifier_id = 1) current
left join (
select
id,
date,
position,
relevant_url,
restrictions_probability,
url_changed,
identifier_id,
phrase_id
from overoptimisation
where identifier_id = 1) previous
on current.phrase_id = previous.phrase_id
inner join semantics_core_phrases on current.phrase_id = semantics_core_phrases.id
inner join semantics_clusters on semantics_core_phrases.cluster_id = semantics_clusters.id
inner join site_pages on semantics_clusters.page_id = site_pages.id
inner join site_sites on site_pages.site_id = site_sites.id
This select works. But I have failed to add sorting.
How can I add order by for these fields?
siteId
pageId
clusterId
frequency
I don't see any problem with adding an ORDER BY clause.
Your SQL is already valid, and it stays valid if you add an ORDER BY clause at the end:
select
current.id as currentId,
current.date as currentDate,
current.position as currentPosition,
current.relevant_url as currentRelevantUrl,
current.restrictions_probability as currentRestrictionsProbability,
current.url_changed as currentUrlChanged,
current.identifier_id as currentIdentifierId,
current.phrase_id as phraseId,
previous.id as previousId,
previous.date as previousDate,
previous.position as previousPosition,
previous.relevant_url as previousRelevantUrl,
previous.restrictions_probability as previousRestrictionsProbability,
previous.url_changed as previousUrlChanged,
previous.identifier_id as previousIdentifierId,
semantics_clusters.id as clusterId,
semantics_clusters.name as clusterName,
site_pages.id as pageId,
site_pages.url as pageUrl,
site_pages.site_id as siteId,
site_sites.url as siteUrl,
semantics_core_phrases.frequency as frequency
from
(
select
id,
date,
position,
relevant_url,
restrictions_probability,
url_changed,
identifier_id,
phrase_id
from
overoptimisation
where
identifier_id = 1
) current
left join (
select
id,
date,
position,
relevant_url,
restrictions_probability,
url_changed,
identifier_id,
phrase_id
from
overoptimisation
where
identifier_id = 1
) previous on current.phrase_id = previous.phrase_id
inner join semantics_core_phrases on current.phrase_id = semantics_core_phrases.id
inner join semantics_clusters on semantics_core_phrases.cluster_id = semantics_clusters.id
inner join site_pages on semantics_clusters.page_id = site_pages.id
inner join site_sites on site_pages.site_id = site_sites.id
order by
siteId
You can order by pageId, clusterId, and frequency in the same way, separating the columns with commas:
order by siteId, pageId, clusterId, frequency
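A minimal sketch of sorting by several output aliases at once follows. It is an illustration under assumptions rather than the asker's real schema: it runs on SQLite through Python's sqlite3 module instead of PostgreSQL 15.2, keeps only two of the tables, and fills them with made-up rows. Referring to output column aliases in ORDER BY works the same way in both databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical versions of two tables from the question.
conn.executescript("""
    create table site_pages (id integer primary key, site_id integer);
    create table semantics_clusters (id integer primary key, page_id integer);
    insert into site_pages values (10, 2), (11, 1), (12, 1);
    insert into semantics_clusters values
        (100, 10), (101, 12), (102, 11), (103, 11);
""")

rows = conn.execute("""
    select
        site_pages.site_id as siteId,
        site_pages.id as pageId,
        semantics_clusters.id as clusterId
    from semantics_clusters
    inner join site_pages on semantics_clusters.page_id = site_pages.id
    order by siteId, pageId, clusterId
""").fetchall()

print(rows)
```

Rows are sorted by the first alias, ties are broken by the second, and so on. Appending desc after any alias reverses the direction for that column only.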

How to merge two vertex result using only one edge in orientdb

I'm currently working with OrientDB, and I'm having a hard time merging two rows (two results) into one. I just want to merge the two vertices connected by an edge. I tried unionall, but it doesn't work for me. Please help.
I have already tried unionall, unwind, and bothV(), but none of them worked.
person and company are vertex classes.
is_working is an edge class (from person, to company).
I want to merge the results of the two vertices.
Example:
select expand(bothV()) from is_working where in = '13:3'
I just want to get all users who work at a specific company.
Expected result:
{name: "Randolf", gender: "Male", company_name:"Name of the company"},
{name: "Jefferson", gender: "Male", company_name:"Name of the company"}
I have already tried the code below:
select person, company.*
from (select person, in('is_working') as company
from(select expand(out('is_working'))
from #13:2)
unwind company)
select expand($all) let
#a = (select expand(in('is_working') from company where #rid = '13:2'),
#b = (select expand(in('is_working').out('is_working')) from company where #rid = '13:2'),
#all = unionall(#a, #b)
There is no error, but it doesn't show any result.
When I tried "select expand(bothV()) from is_working where in = '13:3'"
there is a result, but it's not merged.
By the way, 13:2 and 13:3 are the RIDs of my companies.
Assuming the following setup (for simplicity, everything is carried out in the active-orient console):
> V.create_class :company, :person
# INFO->CREATE CLASS company EXTENDS V
# INFO->CREATE CLASS person EXTENDS V
=> [Company, Person]
> E.create_class :is_working
# INFO->CREATE CLASS is_working EXTENDS E
=> IS_WORKING
> c = Company.create name: 'C'
# INFO->CREATE VERTEX company set name = 'C'
> c.assign via: IS_WORKING, vertex: Person.create( name: 'A')
# INFO->CREATE VERTEX person set name = 'A'
# INFO->CREATE EDGE is_working from #41:0 to #49:0
> Person.create( name: 'B').assign( via: IS_WORKING, vertex: c)
# INFO->CREATE VERTEX person set name = 'B'
# INFO->CREATE EDGE is_working from #50:0 to #41:0
you get (active-orient specific):
c.edges.to_human
=> ["<IS_WORKING[#58:0] :.: 50:0->{ }->41:0>", "<IS_WORKING[#57:0] :.: 41:0->{ }->49:0>"]
which, I guess, is basically your setup.
Then
> c.nodes( :out, via: IS_WORKING).to_human
# INFO->select outE('is_working').in from #41:0
=> ["<Person[49:0]: in: {IS_WORKING=>1}, name : A>"]
> c.nodes( :in, via: IS_WORKING).to_human
# INFO->select inE('is_working').out from #41:0
=> ["<Person[50:0]: out: {IS_WORKING=>1}, name : B>"]
> c.nodes( :both, via: IS_WORKING).to_human
# INFO->select both('is_working') from #41:0
=> ["<Person[49:0]: in: {IS_WORKING=>1}, name : A>",
"<Person[50:0]: out: {IS_WORKING=>1}, name : B>"]
I guess the last query covers your case.
The expanded version:
> c.nodes( :both, via: IS_WORKING, expand: true)
# INFO->select expand ( both('is_working') ) from #41:0

SELECT attribute with MIN value from set of attribute instances that are part of specific row?

I know my question is probably vague and hard to understand at first glance; I spent 30 minutes trying to think of a proper title. My database knowledge is very limited, so I have a hard time expressing myself properly.
It is part of a school assignment I'm currently doing, where the following is what I'm trying to achieve:
and an ER diagram I made of the system:
What I'm trying to accomplish is to select the quantity (current stock) of the component within each computer_system that has the lowest quantity. That way, the dataset I print out can state exactly how many of each computer_system the store is able to sell, based on the lowest current quantity of any component the computer_system consists of.
This is the query I am currently working with, but I've run into multiple problems, such as quantity being ambiguous and other errors every time I try a fix. I have consulted a dozen classmates, without luck.
SELECT
computer_system.NAME,
cpu.name as cpu,
gpu.name as gpu,
board.name as mainboard,
pccase.name as pc_case,
ram.name as ram,
component.quantity as qty,
(cpu.price *1.3+ board.price*1.3 + pccase.price*1.3 + ram.price*1.3 + gpu.price*1.3) as computer_system_price
FROM computer_system, component
join component cpu on cpu.id = computer_system.cpu
join component gpu on gpu.id = computer_system.gpu
join component board on board.id = computer_system.mainboard
join component pccase on pccase.id = computer_system.pc_case
join component ram on ram.id = computer_system.ram
JOIN component qty ON qty.quantity = (SELECT MIN(component.quantity) FROM component WHERE component.id IN
(computer_system.pc_case,
computer_system.mainboard,
computer_system.cpu,
computer_system.gpu,
computer_system.ram))
The following code fixed it for me:
SELECT computer_system.name AS "System name",
cpu.name AS "CPU",
gpu.name AS "GPU",
pc_case.name AS "Case",
mainboard.name AS "Mainboard",
ram.name AS "RAM",
FLOOR((cpu.price + mainboard.price + pc_case.price + ram.price + coalesce(gpu.price, 0))*1.3/100)*100+99 AS "System price",
SYSTEM.maxamount
FROM computer_system
join (SELECT name,
id,
price
FROM component) AS cpu
ON cpu.id = computer_system.cpu
left join (SELECT name,
id,
price
FROM component) AS gpu
ON coalesce(gpu.id,0) = computer_system.gpu
join (SELECT name,
id,
price
FROM component) AS pc_case
ON pc_case.id = computer_system.pc_case
join (SELECT name,
id,
price
FROM component) AS mainboard
ON mainboard.id = computer_system.mainboard
join (SELECT name,
id,
price
FROM component) AS ram
ON ram.id = computer_system.ram
join (SELECT computer_system.name,
Min(component.quantity) AS maxamount
FROM computer_system,
component
WHERE computer_system.cpu = component.id
OR computer_system.gpu = component.id
OR computer_system.mainboard = component.id
OR computer_system.pc_case = component.id
OR computer_system.ram = component.id
GROUP BY computer_system.name) AS SYSTEM
ON SYSTEM.name = computer_system.name;
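The core of the fix, taking the minimum stock over every component a system references, can be tried in isolation. The sketch below is a reduced illustration under assumptions: it runs on SQLite through Python's sqlite3 module, the sample rows are made up, and the join uses IN over the five foreign-key columns instead of the chain of OR conditions (the two forms select the same rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the columns needed for the stock check.
conn.executescript("""
    create table component (id integer primary key, name text, quantity integer);
    create table computer_system (
        name text, cpu integer, gpu integer,
        mainboard integer, pc_case integer, ram integer
    );
    insert into component values
        (1, 'cpu-a', 7), (2, 'gpu-a', 3), (3, 'board-a', 9),
        (4, 'case-a', 5), (5, 'ram-a', 12), (6, 'cpu-b', 2);
    insert into computer_system values
        ('Office', 1, null, 3, 4, 5),
        ('Gamer', 6, 2, 3, 4, 5);
""")

rows = conn.execute("""
    select cs.name, min(c.quantity) as maxamount
    from computer_system cs
    join component c
      on c.id in (cs.cpu, cs.gpu, cs.mainboard, cs.pc_case, cs.ram)
    group by cs.name
    order by cs.name
""").fetchall()

print(rows)
```

A system without a GPU (a NULL in the gpu column) simply contributes no row for that slot, so its minimum is taken over the remaining components.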

Entity framework: cross join causes OutOfMemoryException

I've got a table with 1.5 million records on SQL Server 2008. There's a varchar column 'ReferenzNummer' that is indexed.
The following query, executed in SQL Server Management Studio, works and is fast:
SELECT v1.Id, v2.Id FROM Vorpapier as v1 cross join Vorpapier as v2
WHERE v1.ReferenzNummer LIKE '7bd48e26-58d9-4c31-a755-a15500bce4c4'
AND v2.ReferenzNummer LIKE '7bd4%'
(I know the query does not make much sense like this, there will be more constraints, but that's not important for the moment)
Now I'd like to execute a query like this from Entity Framework 5.0. My LINQ looks like this:
var result = (from v1 in vorpapierRepository.DbSet
from v2 in vorpapierRepository.DbSet
where v1.ReferenzNummer == "7bd48e26-58d9-4c31-a755-a15500bce4c4" &&
v2.ReferenzNummer.StartsWith("7bd4")
select new { V1 = v1.Id, V2 = v2.Id })
.Take(10)
.ToList();
This tries to load the whole table into memory, leading to an OutOfMemoryException after some time. I've tried to move the WHERE parts, with no success:
var result = (from v1 in vorpapierRepository.DbSet.Where(v => v.ReferenzNummer == "7bd48e26-58d9-4c31-a755-a15500bce4c4")
from v2 in vorpapierRepository.DbSet.Where(v => v.ReferenzNummer.StartsWith("7bd4"))
select new { V1 = v1.Id, V2 = v2.Id })
.Take(10)
.ToList();
Is it possible to tell Entity Framework to create a cross join statement, like the one I've written myself?
UPDATE 1
The EF generated SQL looks like this (for both queries)
SELECT [Extent1].[Id] AS [Id],
[Extent1].[VorpapierArtId] AS [VorpapierArtId],
[Extent1].[ReferenzNummer] AS [ReferenzNummer],
[Extent1].[IsImported] AS [IsImported],
[Extent1].[DwhVorpapierId] AS [DwhVorpapierId],
[Extent1].[Datenbasis_Id] AS [Datenbasis_Id]
FROM [dbo].[Vorpapier] AS [Extent1]
UPDATE 2
When I change the LINQ query and join the table with itself on the field DatenbasisId (which is not exactly what I want, but it might work), EF creates a join:
var result = (from v1 in vorpapierRepository.DbSet
join v2 in vorpapierRepository.DbSet
on v1.DatenbasisId equals v2.DatenbasisId
where v1.ReferenzNummer == "7bd48e26-58d9-4c31-a755-a15500bce4c4" && v2.ReferenzNummer.StartsWith("7bd4")
select new { V1 = v1.Id, V2 = v2.Id })
.Take(10)
.ToList();
The resulting SQL query looks like this. It works and is fast enough.
SELECT TOP (10) 1 AS [C1],
[Extent1].[Id] AS [Id],
[Extent2].[Id] AS [Id1]
FROM [dbo].[Vorpapier] AS [Extent1]
INNER JOIN [dbo].[Vorpapier] AS [Extent2]
ON ([Extent1].[Datenbasis_Id] = [Extent2].[Datenbasis_Id])
OR (([Extent1].[Datenbasis_Id] IS NULL)
AND ([Extent2].[Datenbasis_Id] IS NULL))
WHERE (N'7bd48e26-58d9-4c31-a755-a15500bce4c4' = [Extent1].[ReferenzNummer])
AND ([Extent2].[ReferenzNummer] LIKE N'7bd4%')
I still don't see why EF doesn't create the cross join in the original query. Is it simply not supported?
If you use a join in the LINQ statement, the join is translated to SQL and executed by SQL Server. Here are some examples of the join operator in LINQ: http://code.msdn.microsoft.com/LINQ-Join-Operators-dabef4e9

PostgreSQL view embedded if-statements

In my database, stores that have no rating from Reseller Ratings still have an entry, but with -1.0 instead of a number between 0.0 and 10.0. The following query results in -10.00 showing up in my view for those stores. Instead, I would like either nothing or a - to show up in its place, but I'm not very comfortable with implementing embedded if-statements in my view. Here is my current view.
CREATE VIEW myview AS
SELECT co_url_name AS company_url, score_combined AS stella_score, trunc(score*10, 2) AS bizrate_score,
(SELECT trunc("lifetimeRating"*10, 2)) AS resellerRating_score
FROM ss_profile_co AS s LEFT OUTER JOIN "resellerRatings_ratings" AS r
ON s.id = r.company_id
LEFT OUTER JOIN (SELECT * FROM bizrate_bizrate_ratings WHERE score_name = 'Overall rating') AS b
ON s.id = b.fk_co_id
ORDER BY co_url_name ASC;
The line (SELECT trunc("lifetimeRating"*10, 2)) AS resellerRating_score is the one that returns the negative numbers (or, for valid entries, will return a score between 0.00 and 100.00).
Obviously, I could simply remove the entries that cause this from the database, but this is half a learning experience, and half out of my hands anyway.
I appreciate the help!
EDIT: I attempted an embedded IF but, not surprisingly, got an error.
IF (SELECT trunc("lifetimeRating"*10, 2)) = -10.00 THEN NULL ELSE (SELECT trunc("lifetimeRating"*10, 2)) AS resellerRating_score
EDIT2: Figured it out. Line in question is as follows:
(SELECT trunc("lifetimeRating"*10, 2) WHERE trunc("lifetimeRating"*10, 2) > 0) AS resellerrating_score
/foreveralone
Could look like this:
CREATE VIEW myview AS
SELECT co_url_name AS company_url
,score_combined AS stella_score
,trunc(score * 10, 2) AS bizrate_score
,CASE WHEN "lifetimeRating" < 0
THEN NULL
ELSE trunc("lifetimeRating" * 10, 2)
END AS resellerRating_score
FROM ss_profile_co s
LEFT JOIN "resellerRatings_ratings" r ON r.company_id = s.id
LEFT JOIN bizrate_bizrate_ratings b ON b.score_name = 'Overall rating'
AND b.fk_co_id = s.id
ORDER BY co_url_name;
Major points:
Concerning your core question: the sub-select without a FROM clause serves no purpose. I simplified that and used a CASE expression instead.
I also simplified your LEFT JOIN to bizrate_bizrate_ratings. No sub-select is necessary there either. I pulled the WHERE clause up into the JOIN condition. Simpler and faster.
I would advise against mixed-case identifiers, so you never have to use double quotes. (This probably makes @Daniel's comment invalid, because lifetimerating != "lifetimeRating".)
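The CASE expression can be tried on its own with a few made-up rows. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module rather than PostgreSQL, the ratings table and its lowercase column names are hypothetical, and trunc() is left out because it is not available in every SQLite build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; -1.0 marks "no rating", as in the question.
conn.executescript("""
    create table ratings (company text, lifetime_rating real);
    insert into ratings values
        ('acme', 8.5), ('globex', -1.0), ('initech', 0.0);
""")

rows = conn.execute("""
    select company,
           case when lifetime_rating < 0
                then null
                else lifetime_rating * 10
           end as reseller_rating_score
    from ratings
    order by company
""").fetchall()

print(rows)
```

The unrated store comes back as NULL (None in Python), while a genuine score of 0.0 is kept. The "> 0" filter from EDIT2 would turn that genuine zero into NULL as well.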