Selecting parent records by filtering multiple fields of collection of links - orientdb

I have been trying to figure out this for a couple of days know but I can't come up with a query that gives me the correct results. The essence of the task is that I am trying to retrieve all the nodes of a graph that have children with attributes that satisfy multiple constraints. The issue I have is that a node may have multiple linked nodes and when I try to apply criteria to restrict which nodes must be returned by the query the criteria need to be imposed against sets of nodes instead of individual nodes.
Let me explain the problem in more detail through an example. Here is a sample schema of companies and locations. Each company can have multiple locations.
create class company extends V;
create property company.name STRING;
create class location extends V;
create property location.name STRING;
create property location.type INTEGER;
create property location.inactive STRING;
Let me now create a couple of records to illustrate the problem I have.
create vertex company set name = 'Company1';
create vertex location set name = 'Location1', type = 5;
create vertex location set name = 'Location2', type = 7;
create edge from (select from company where name = 'Company1') to (select from location where name in ['Location1', 'Location2']);
create vertex company set name = 'Company2';
create vertex location set name = 'Location3', type = 6;
create vertex location set name = 'Location4', type = 5, inactive = 'Y';
create edge from (select from company where name = 'Company2') to (select from location where name in ['Location3','Location4']);
I want to retrieve all companies that either don't have a location of type 5 or have a location of type 5 that is inactive (inactive = 'Y'). The query that I tried initially is shown below. It doesn't work because the $loc.type is evaluated against a collection of values instead of a individual record so the is null is not applied against the individual field 'inactive' of each location record but against the collection of values of the field 'inactive' for each parent record. I tried sub-queries, the set function, append and so on but I can't get it to work.
select from company let loc = out() where $loc.type not contains 5 or ($loc.type contains 5 and $loc.type is null)

You can try with this query:
select expand($c)
let $a = ( select expand(out) from E where out.#class = "company" and in.#class="location" and in.type = 5 and in.inactive = "Y" ),
$b = ( select from company where 5 not in out("E").type ),
$c = unionAll($a,$b)
UPDATE
I have created this graph
You can use this query
select expand($f)
let $a = ( select from E where out.#class = "company" and in.#class="location" ),
$b = ( select expand(out) from $a where in.type = 5 and in.inactive = "Y"),
$c = ( select expand(out) from $a where in.type = 5 and in.inactive is null),
$d = difference($b,$c),
$e = ( select from company where 5 not in out("E").type ),
$f = unionAll($d,$e)
Hope it helps.

Try this query:
select expand($d) from company
let $a=(select from company where out().type <> 5 and name contains $parent.current.name),
$b=(select from company where out().type contains 5 and name=$parent.current.name),
$c=(select from company where out().inactive contains "Y" and name=$parent.current.name),
$d=unionall($a,intersect($b,$c))
Hope it helps,
Regards,
Michela

Related

Writing subqueries in orientDB

This is the query i'm using right now :
INSERT INTO details SET name = "INITIALIZE",actionMap ={"1":12:1,"2":12:2};
Here 12:1,12:2 are rid's from another table.I'm facing a lot of problem's hardcoding these rid values.In order to avoid this i'd like to add the query like this
INSERT INTO details SET name = "INITIALIZE",actionMap ={"1":(select #rid from action where start is not null),"2":(select #rid from action where stop is not null)};
I'm getting this exception:
com.orientechnologies.orient.core.exception.OValidationException: The
field 'details.actionMap' has been declared as LINKMAP but the
value is not a record or a record-id
So how can i change my query to help my case.
This indeed can be done in a more gracefull way using batches.
This is just to create the object you want
INSERT INTO details SET name = "INITIALIZE"
We will turn this into
let $inserted = INSERT INTO details SET name = "INITIALIZE"
And add the edges you would like to add:
CREATE EDGE actionMap FROM $inserted TO (SELECT FROM action WHERE start is not null )
CREATE EDGE actionMap FROM $inserted TO (SELECT FROM action WHERE stop is not null )
So the entire batch you would have to run is
let $inserted = INSERT INTO details SET name = "INITIALIZE"
CREATE EDGE actionMap FROM $inserted TO (SELECT FROM action WHERE start is not null )
CREATE EDGE actionMap FROM $inserted TO (SELECT FROM action WHERE stop is not null )
If you have any more questions about this feel free to ask.
Adapted from this question
let $a1 = SELECT FROM action WHERE start IS NOT null
let $a2 = SELECT FROM action WHERE stop IS NOT null
INSERT INTO details SET name = "INITIALIZE", actionMap = {"1": $a1[0], "2": $a2[0]}
This is original answer and is for LINKLIST
Your error states:
The field 'details.actionMap' has been declared as LINKMAP but the value is not a record or a record-id
You are trying to store a value into a field that is supposed to store references
According to orientDB docs
You store references like this:
INSERT INTO Profiles SET name = 'Luca', friends = [#10:3, #10:4]
or with SELECT sub-querie:
INSERT INTO Diver SET name = 'Luca', buddy = (SELECT FROM Diver
WHERE name = 'Marko')
That will make your code to be:
INSERT INTO details SET name = "INITIALIZE", actionMap =[(SELECT FROM action WHERE start IS NOT null),(SELECT FROM action WHERE stop IS NOT null)];
Extra tip:
If you crate your actions in the same time, then you can add details and those 2 action with one query:
INSERT INTO details SET name = "INITIALIZE", actionMap = [(INSERT INTO yourActionTable SET yourActionField = 'yourFirstAction'), (INSERT INTO yourActionTable SET yourActionField = 'YourSecondAction')]

using let blocks in orientdb select statements

Hopefully a quick one - I'm struggling to make a let statement work.
I have a database of People vertexes. The vertexes have ident fields and name fields. This query returns one row - a person named Bob.
select from Person where ident = 1
I want to return all rows with the same name as this person. There are two Bobs in the data (as proof, the following query returns two rows):
select from Person where name = 'Bob'
I think all of the following queries should return those same two rows, but they all return 0 rows. They all involve different ways of using a let statement. Can anyone see what I'm doing wrong?
select name from Person
let $tmp = (select from Person where ident = 1)
where name = $tmp.name
select name from Person
let $tmp = (select name from Person where ident = 1)
where name = $tmp
select name from Person
let $tmp = 'Bob'
where name = $tmp
$tmp will be a list of records, so you are asking to compare a string to a list, and it isn't working. You could do the following;
select name from Person
let $tmp = (select from Person where ident = 1)
where name = first($tmp).name
That wont return Person records though (only rows of names). Limiting the $tmp query will also improve the performance slightly. So the following is better;
select from Person
let $tmp = (select from Person where ident = 1 limit 1)
where name = first($tmp).name
But actually, using the let clause in this manner is not good. The docs say
The LET block contains the list of context variables to assign each
time a record is evaluated.
So it is* (*should be) better to rearrange your query entirely;
select expand($persons_with_name) let
$person_with_ident = first((select from Person where ident = 1 limit 1)),
$persons_with_name = (select from Person where name = $parent.$person_with_ident.name)
You can look if name is contained in $tmp
select name from Person
let $tmp = (select name from Person where ident = 1)
where name in $tmp
or if $tmp contains name
select name from Person
let $tmp = (select name from Person where ident = 1)
where $tmp contains name

How to write a query that filters traversed elements

I'm struggling with the following query. For a family tree database, I have a vertex 'Person' and a lightweight edge 'Child', so the edge would go out of a parent and into a child (ie 'child-of'). From a person, I need to get their siblings who share the exact same parents.
I can get all of a persons siblings fairly easy, as follows;
SELECT
FROM (
TRAVERSE out_Child
FROM (
SELECT expand(in_Child)
FROM #11:3
)
WHILE $depth <= 1
)
WHERE $depth = 1
So this gets the parents of the person in question, then gets all the children of the parents. The results might look like the following
#rid in_Child
#11:2 #11:0
#11:3 #11:0, #11:1
#11:4 #11:0, #11:1
#11:5 #11:1
I need to filter these results though, as I only want records that have the exact same parents as #11:3. So in this instance, the query should only return #11:3 and #11:4. If the query were for #11:5, it should return #11:5 only. So basically, the in_Child fields must be the same.
I've tried all sorts of queries such as the following, but the query either doesnt run or doesnt filter.
SELECT
FROM (
SELECT
FROM (
TRAVERSE out_Child
FROM (
SELECT expand(in_Child)
FROM #11:3
)
WHILE $depth <= 1
)
WHERE $depth = 1
)
LET $testinChild = (SELECT expand(in_Child) FROM #11:3)
WHERE in_Child CONTAINSALL $testinChild
Ultimately I would prefer to not do any sub-queries, but if it's required then so be it. I Also tried to use traversedElement(0) function, but it only returns the first record traversed (ie #11:0, but not #11:1), so it can't be used.
Update;
If you copy-paste the following into orientdb console (change the password etc to suit your setup), you will have the same dataset described above.
create database remote:localhost/persondb root pass memory graph
alter database custom useLightweightEdges=true
create class Person extends V
create property Person.name string
create class Child extends E
create vertex Person set name = "Father"
create vertex Person set name = "Mother"
create vertex Person set name = "Child of father only"
create edge Child from #11:0 to #11:2
create vertex Person set name = "Child of father+mother #1"
create edge Child from #11:0 to #11:3
create edge Child from #11:1 to #11:3
create vertex Person set name = "Child of father+mother #2"
create edge Child from #11:0 to #11:4
create edge Child from #11:1 to #11:4
create vertex Person set name = "Child of mother only"
create edge Child from #11:1 to #11:5
Okay, I've found some solutions.
First of all, the way I used CONTAINSALL in the question is not correct, as pointed out to me here. CONTAINSALL does not check that all the items on the 'right' are in the 'left', but actually loops over each item in the 'left' and uses that item in the expression on the 'right'. SO WHERE in_Child CONTAINSALL (sex = 'Male) will filter for records where all of the in_Child records are only Male (ie no females). It's basically checking that in_Child[0:n].sex = 'Male'.
So I tried this query;
SELECT
FROM (
SELECT
FROM (
TRAVERSE
out('Child')
FROM (
SELECT
expand(in('Child'))
FROM
#11:3
)
WHILE
$depth <= 1
)
WHERE
$depth = 1
)
WHERE
(SELECT expand(in('Child')) from #11:3) CONTAINSALL (#rid in $current.in_Child)
I think OrientDB might have a bug here. The above query return #11:2, #11:3 and #11:4, which doesn't make sense to me. I changed this query slightly...
SELECT
FROM (
SELECT
FROM (
TRAVERSE
out('Child')
FROM (
SELECT
expand(in('Child'))
FROM
#11:3
)
WHILE
$depth <= 1
)
WHERE
$depth = 1
)
LET
$parents = (SELECT expand(in('Child')) from #11:3)
WHERE
$parents CONTAINSALL (#rid in $current.in_Child)
This works better. The above query correctly returns #11:3 and #11:4, but a query on #11:2 or #11:5 also incorrectly includes both #11:3 and #11:4. This makes sense, because it checking the parent rids of eg #11:2 (which is only 1) is in the parents of the rest, which they are. So I added a check to ensure they had the same amount of parents.
SELECT
FROM (
SELECT
FROM (
TRAVERSE
out('Child')
FROM (
SELECT
expand(in('Child'))
FROM
#11:3
)
WHILE
$depth <= 1
)
WHERE
$depth = 1
)
LET
$parents = (SELECT expand(in('Child')) from #11:3)
WHERE
$parents CONTAINSALL (#rid in $current.in_Child)
AND
$parents.size() = in('Child').size()
Now the query is working correctly for all instances. However, I still wasn't happy with this query. I abandonned the use of CONTAINSALL and eventually came up with the following...
SELECT
FROM (
SELECT
FROM (
TRAVERSE
out('Child')
FROM (
SELECT
expand(in('Child'))
FROM
#11:3
)
WHILE
$depth <= 1
)
WHERE
$depth = 1
)
LET
$parents = (SELECT expand(in('Child')) from #11:3)
WHERE
in_Child.asSet() = $parents.asSet()
This appears the best/safest, and is the one I will use.
UPDATE for dynamic number of parents :
SELECT
distinct(#rid)
FROM
(SELECT
expand(intersect)
FROM
(SELECT
in('Child').out('Child') as intersect
FROM
#17:2))
WHERE
in('Child').size() = $parentCount.size[0]
LET $parentCount = (SELECT
in('Child').size() as size
FROM
#17:2)

How to join the tables in linq to sql?

Table1 :
userid name address
1 venkat srinagr
2 venkatesh sainagar
Table2:
id userid lat lon
1 1 14.000 15.000
2 2 14.3526 15.3698
by passing "venkat" as parameter then need to pull all matching records and his userid,name,lat,lon.
in above table1 "venkat" contains in both rows then need to pull 2 records.how to get userid,name,lat,lon for all matching rows..
for sigle record i am able to get.but there are multiple rows how to get please tell me....
var result = from p in cxt.Table2
where p.Table1.Name.Contains(name)
select new
{
p.Users.User_Id,p.Users.Name,p.Latitude,p.Longitude
};
Im sure someone will say this is not the most effective way but this is how i would do it.
string InputString = "venkat";
var tab =(from a in db.tablea
from b in db.tableb
where a.userid == b.userid && a.name == InputString
select new
{
UserID = a.userid,
Username = a.name,
Latitude = b.lat,
Longditude = b.lon
}).FirstOrDefault();
FirstOrDefault() is only if you want to force only one output or null,
if you want a collection of some sort, then just remove it.

Metadata about a column in SQL Server 2008 R2?

I'm trying to figure out a way to store metadata about a column without repeating myself.
I'm currently working on a generic dimension loading SSIS package that will handle all my dimensions. It currently does :
Create a temporary table identical to the given table name in parameters (this is a generic stored procedure that receive the table name as parameter, and then do : select top 0 * into ##[INSERT ORIGINAL TABLE NAME HERE] from [INSERT ORIGINAL TABLE NAME HERE]).
==> Here we insert custom code for this particular dimension that will first query the data from a datasource and get my delta, then transform the data and finally loads it into my temporary table.
Merge the temporary table into my original table with a T-SQL MERGE, taking care of type1 and type2 fields accordingly.
My problem right now is that I have to maintain a table with all the fields in it to store a metadata to tell my scripts if this particular field is type1 or type2... this is nonsense, I can get the same data (minus type1/type2) from sys.columns/sys.types.
I was ultimately thinking about renaming my fields to include their type in it, such as :
FirstName_T2, LastName_T2, Sex_T1 (well, I know this can be type2, let's not fall into that debate here).
What do you guyz would do with that? My solution (using a table with that metadata) is currently in place and working, but it's obvious that repeating myself from the systables to a custom table is nonsense, just for a simple type1/type2 info.
UPDATE: I also thought about creating user defined types like varchar => t1_varchar, t2_varchar, etc. This sounds like something a bit sluggy too...
Everything you need should already be in INFORMATION_SCHEMA.COLUMNS
I can't follow your thinking of not using provided tables/views...
Edit: As scarpacci mentioned, this somewhat portable if needed.
I know this is bad, but I will post an answer to my own question... Thanks to GBN for the help tho!
I am now storing "flags" in the "description" field of my columns. I, for example, can store a flag this way : "TYPE_2_DATA".
Then, I use this query to get the flag back for each and every column :
select columns.name as [column_name]
,types.name as [type_name]
,extended_properties.value as [column_flags]
from sys.columns
inner join sys.types
on columns.system_type_id = types.system_type_id
left join sys.extended_properties
on extended_properties.major_id = columns.object_id
and extended_properties.minor_id = columns.column_id
and extended_properties.name = 'MS_Description'
where object_id = ( select id from sys.sysobjects where name = 'DimDivision' )
and is_identity = 0
order by column_id
Now I can store metadata about columns without having to create a separate table. I use what's already in place and I don't repeat myself. I'm not sure this is the best possible solution yet, but it works and is far better than duplicating information.
In the future, I will be able to use this field to store more metadata, where as : "TYPE_2_DATA|ANOTHER_FLAG|ETC|OH BOY!".
UPDATE :
I now store the information in separate extended properties. You can manage extended properties using sp_addextendedproperty and sp_updateextendedproperty stored procedures. I have created a simple store procedure that help me to update those values regardless if they currently exist or not :
create procedure [dbo].[UpdateSCDType]
#tablename nvarchar(50),
#fieldname nvarchar(50),
#scdtype char(1),
#dbschema nvarchar(25) = 'dbo'
as
begin
declare #already_exists int;
if ( #scdtype = '1' or #scdtype = '2' )
begin
select #already_exists = count(1)
from sys.columns
inner join sys.extended_properties
on extended_properties.major_id = columns.object_id
and extended_properties.minor_id = columns.column_id
and extended_properties.name = 'ScdType'
where object_id = (select sysobjects.id from sys.sysobjects where sysobjects.name = #tablename)
and columns.name = #fieldname
if ( #already_exists = 0 )
begin
exec sys.sp_addextendedproperty
#name = N'Scd_Type',
#value = #scdtype,
#level0type = N'SCHEMA',
#level0name = #dbschema,
#level1type = N'TABLE',
#level1name = #tablename,
#level2type = N'COLUMN',
#level2name = #fieldname
end
else
begin
exec sys.sp_updateextendedproperty
#name = N'Scd_Type',
#value = #scdtype,
#level0type = N'SCHEMA',
#level0name = #dbschema,
#level1type = N'TABLE',
#level1name = #tablename,
#level2type = N'COLUMN',
#level2name = #fieldname
end
end
end
Thanks again