OrientDB - Multiple records showing after re-applying edges

I have a problem. When I create the edges for the first time, the number of records in the output is correct. But when I add another record to the class and create the edge again, I get duplicate records. Here is what I am doing.
create class Country extends V
create class Immigrant extends V
create class comesFrom extends E
create property Country.c_id integer
create property Country.c_name String
create property Immigrant.i_id integer
create property Immigrant.i_name String
create property Immigrant.i_country Integer
insert into Country(c_id, c_name) values (1, 'USA')
insert into Country(c_id, c_name) values (2, 'UK')
insert into Country(c_id, c_name) values (3,'PAK')
insert into Immigrant(i_id, i_name,i_country) values (1, 'John',1)
insert into Immigrant(i_id, i_name,i_country) values (2, 'Graham',2)
insert into Immigrant(i_id, i_name,i_country) values (3, 'Ali',3)
create edge comesFrom from (select from Immigrant where i_country = 1) to (select from Country where c_id = 1)
create edge comesFrom from (select from Immigrant where i_country = 2) to (select from Country where c_id = 2)
create edge comesFrom from (select from Immigrant where i_country = 3) to (select from Country where c_id = 3)
select i_id, i_name, out('comesFrom').c_id as c_id, out('comesFrom').c_name as c_name from Immigrant unwind c_id, c_name
I get the result as below.
Click here to view image of correct records
Then I add another record to the class Immigrant.
insert into Immigrant(i_id, i_name,i_country) values (4, 'James',2)
And create the edge again. Please note that the new immigrant belongs to an already existing country.
create edge comesFrom from (select from Immigrant where i_country = 2) to (select from Country where c_id = 2)
I run the same query as below.
select i_id, i_name, out('comesFrom').c_id as c_id, out('comesFrom').c_name as c_name from Immigrant unwind c_id, c_name
Now I get multiple records as below.
Click here to view image of incorrect records.
What am I doing wrong?
Thank you!

The problem is this command:
create edge comesFrom from (select from Immigrant where i_country = 2)
to (select from Country where c_id = 2)
Because if you execute only this part:
select from Immigrant where i_country = 2
You can see that there are 2 results: Graham and James.
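So the command creates an edge from each of the two Immigrants (Graham and James) to the Country. Graham already had an edge from the first run, so he now has two, and the query returns duplicate rows for him.
To avoid this problem, select only the new Immigrant when you create the edge, for example by name.
I have added a few screenshots so you can see the difference.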
The problem: http://i.stack.imgur.com/NptOt.png
Your Solution: http://i.stack.imgur.com/PfU6c.png
My Solution: http://i.stack.imgur.com/WEEal.png
Regards
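To make the duplication concrete, here is a small Python model of the behaviour described above. It is only a sketch, not OrientDB code: it assumes that create edge with two subqueries adds one edge for every (source, target) pair, and the helper name edges_after is made up for illustration. In OrientDB itself, the fix suggested above would presumably look like create edge comesFrom from (select from Immigrant where i_name = 'James') to (select from Country where c_id = 2).

```python
COUNTRIES = {1: "USA", 2: "UK", 3: "PAK"}


def edges_after(second_run_filter):
    """Replay the question's steps and return the list of (immigrant, country) edges.

    second_run_filter picks the source vertices for the edge command that is
    run after James has been inserted (hypothetical helper, not an OrientDB API).
    """
    immigrants = [
        {"i_name": "John", "i_country": 1},
        {"i_name": "Graham", "i_country": 2},
        {"i_name": "Ali", "i_country": 3},
    ]
    edges = []

    def create_edge(source_filter, c_id):
        # Assumed model of "create edge ... from (subquery) to (subquery)":
        # one new edge per matching source vertex, with no duplicate check.
        for immigrant in immigrants:
            if source_filter(immigrant):
                edges.append((immigrant["i_name"], COUNTRIES[c_id]))

    # First run: one command per country, as in the question.
    for c_id in COUNTRIES:
        create_edge(lambda imm, c=c_id: imm["i_country"] == c, c_id)

    # A new immigrant arrives, then the edge command is run again.
    immigrants.append({"i_name": "James", "i_country": 2})
    create_edge(second_run_filter, 2)
    return edges


# Re-running the country-wide command also matches Graham again.
duplicated = edges_after(lambda imm: imm["i_country"] == 2)
# Selecting only the new immigrant by name adds exactly one edge.
fixed = edges_after(lambda imm: imm["i_name"] == "James")

print(duplicated.count(("Graham", "UK")), fixed.count(("Graham", "UK")))
```

In this model the country-wide rerun leaves Graham with two edges to UK, while the name-based command leaves every immigrant with exactly one.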

Related

OrientDB - How to create edge on imported data

I understand that in a graph DB, relations are created after insertion and, instead of field-to-field relations, they are record-to-record relations. If my understanding is correct, then what do I need to do when I import a million records from a CSV file and need to relate them to records in the other table?
Is it not possible to define relations (edges) at design time, before insertion, so that whenever a record is inserted, the relation is already there?
My database is detailed as below.
create class Country extends V
create class Immigrant extends V
create class comesFrom extends E
create property Country.c_id integer
create property Country.c_name String
create property Immigrant.i_id integer
create property Immigrant.i_name String
create property Immigrant.i_country Integer
If I create the edges manually, it will be like this.
insert into Country(c_id, c_name) values (1, 'USA')
insert into Country(c_id, c_name) values (2, 'UK')
insert into Country(c_id, c_name) values (3,'PAK')
insert into Immigrant(i_id, i_name,i_country) values (1, 'John',1)
insert into Immigrant(i_id, i_name,i_country) values (2, 'Graham',2)
insert into Immigrant(i_id, i_name,i_country) values (3, 'Ali',3)
create edge comesFrom from (select from Immigrant where i_country = 1) to (select from Country where c_id = 1)
create edge comesFrom from (select from Immigrant where i_country = 2) to (select from Country where c_id = 2)
create edge comesFrom from (select from Immigrant where i_country = 3) to (select from Country where c_id = 3)
You can use OrientDB ETL to do that. For more information look at this example: http://orientdb.com/docs/last/Import-from-CSV-to-a-Graph.html

Converting Traditional IF EXIST UPDATE ELSE INSERT into MERGE is not working?

I am going to use MERGE to insert into or update a table depending on whether the row exists or not. This is my query:
declare @t table
(
id int,
name varchar(10)
)
insert into @t values(1,'a')
MERGE INTO @t t1
USING (SELECT id FROM @t WHERE ID = 2) t2 ON (t1.id = t2.id)
WHEN MATCHED THEN
UPDATE SET name = 'd', id = 3
WHEN NOT MATCHED THEN
INSERT (id, name)
VALUES (2, 'b');
select * from @t;
The result is,
id name
1 a
I think it should be,
id name
1 a
2 b
You have your USING part slightly messed up, that's where to put what you want to match against (although in this case you're only using id)
declare @t table
(
id int,
name varchar(10)
)
insert into @t values(1,'a')
MERGE INTO @t t1
USING (SELECT 2, 'b') AS t2 (id, name) ON (t1.id = t2.id)
WHEN MATCHED THEN
UPDATE SET name = 'd', id = 3
WHEN NOT MATCHED THEN
INSERT (id, name)
VALUES (2, 'b');
select * from @t;
As Mikhail pointed out, your query in the USING clause doesn't contain any rows.
If you want to do an upsert, put the new data into the USING clause:
MERGE INTO @t t1
USING (SELECT 2 as id, 'b' as name) t2 ON (t1.id = t2.id) --This no longer has an artificial dependency on @t
WHEN MATCHED THEN
UPDATE SET name = t2.name
WHEN NOT MATCHED THEN
INSERT (id, name)
VALUES (t2.id, t2.name);
This query won't return anything:
SELECT id FROM @t WHERE ID = 2
Because there are no rows in the table with ID = 2, the source is empty and there is nothing to merge into the table.
Besides, in the MATCHED clause you are updating the ID column on which you are joining the tables; I think that is forbidden.
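The empty source is easy to check outside SQL Server too. The following is a rough sketch using Python's built-in sqlite3 module, not T-SQL: SQLite has no MERGE, so the upsert is done with INSERT ... ON CONFLICT instead (this assumes the bundled SQLite is 3.24 or newer). The table name t and the helper upsert are placeholders for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A primary key on id is needed so that ON CONFLICT(id) has something to hit.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO t (id, name) VALUES (1, 'a')")

# This is the source query from the question's USING clause.
# It reads from the target itself, so it cannot return a row that is not there yet.
source_rows = conn.execute("SELECT id FROM t WHERE id = 2").fetchall()
print(source_rows)  # an empty source means neither branch has a row to act on


def upsert(connection, row_id, name):
    """Insert the row, or update the name if the id already exists.

    The new data is passed in directly instead of being selected from the target.
    """
    connection.execute(
        "INSERT INTO t (id, name) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET name = excluded.name",
        (row_id, name),
    )


upsert(conn, 2, "b")  # id 2 does not exist yet: inserted
upsert(conn, 1, "z")  # id 1 exists: name updated
rows = conn.execute("SELECT id, name FROM t ORDER BY id").fetchall()
print(rows)
```

The point is the same as in the answers above: the new values have to come from outside the target table, otherwise there is nothing to match or insert.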
For each DML operation you have to commit (which marks the end of a successful transaction); only then will you be able to see the latest data.
For example :
GO
BEGIN TRANSACTION;
GO
DELETE FROM HumanResources.JobCandidate
WHERE JobCandidateID = 13;
GO
COMMIT TRANSACTION;
GO

Select value from an enumerated list in PostgreSQL

I want to select from an enumeration that is not in the database.
E.g. SELECT id FROM my_table returns values like 1, 2, 3
I want to display 1 -> 'chocolate', 2 -> 'coconut', 3 -> 'pizza' etc. SELECT CASE works but is too complicated and hard to overview for many values. I think of something like
SELECT id, array['chocolate','coconut','pizza'][id] FROM my_table
But I couldn't succeed with arrays. Is there an easy solution? So this is a simple query, not a plpgsql script or something like that.
with food (fid, name) as (
values
(1, 'chocolate'),
(2, 'coconut'),
(3, 'pizza')
)
select t.id, f.name
from my_table t
join food f on f.fid = t.id;
or without a CTE (but using the same idea):
select t.id, f.name
from my_table t
join (
values
(1, 'chocolate'),
(2, 'coconut'),
(3, 'pizza')
) f (fid, name) on f.fid = t.id;
This is the correct syntax:
SELECT id, (array['chocolate','coconut','pizza'])[id] FROM my_table
But you should create a referenced table with those values.
What about creating another table that enumerates all the cases, and joining to it?
CREATE TABLE table_case
(
case_id bigserial NOT NULL,
case_name character varying,
CONSTRAINT table_case_pkey PRIMARY KEY (case_id)
)
WITH (
OIDS=FALSE
);
and when you select from your table:
SELECT id, case_name FROM my_table
inner join table_case on case_id = my_table.id;

Copy content in TSQL

I need to copy content from one table to itself and related tables... Let me schematize the problem. Let's say I have two tables:
Order
    OrderID : int
    CustomerID : int
    OrderName : nvarchar(32)
OrderItem
    OrderItemID : int
    OrderID : int
    Quantity : int
With the PK being autoincremental.
Let's say I want to duplicate the content of one customer to another. How do I do that efficiently?
The problem is the PKs. I would need to map the OrderID values from the original set of data to the copy in order to create proper references in OrderItem. If I just do a select-insert, I won't be able to create that map.
Suggestions?
For duplicating one parent and many children with identities as the keys, I think the OUTPUT clause can make things pretty clean (SqlFiddle here):
-- Make a duplicate of parent 1, including children
-- Setup some test data
create table Parents (
ID int not null primary key identity
, Col1 varchar(10) not null
, Col2 varchar(10) not null
)
insert into Parents (Col1, Col2) select 'A', 'B'
insert into Parents (Col1, Col2) select 'C', 'D'
insert into Parents (Col1, Col2) select 'E', 'F'
create table Children (
ID int not null primary key identity
, ParentID int not null references Parents (ID)
, Col1 varchar(10) not null
, Col2 varchar(10) not null
)
insert into Children (ParentID, Col1, Col2) select 1, 'g', 'h'
insert into Children (ParentID, Col1, Col2) select 1, 'i', 'j'
insert into Children (ParentID, Col1, Col2) select 2, 'k', 'l'
insert into Children (ParentID, Col1, Col2) select 3, 'm', 'n'
-- Get one parent to copy
declare @oldID int = 1
-- Create a place to store new ParentID
declare @newID table (
ID int not null primary key
)
-- Create new parent
insert into Parents (Col1, Col2)
output inserted.ID into @newID -- Capturing the new ParentID
select Col1, Col2
from Parents
where ID = @oldID -- Only one parent
-- Create new children using the new ParentID
insert into Children (ParentID, Col1, Col2)
select n.ID, c.Col1, c.Col2
from Children c
cross join @newID n
where c.ParentID = @oldID -- Only one parent
-- Show some output
select * from Parents
select * from Children
Do you have to have the primary keys from table A as primary keys in table B? If not, you can do a select statement with an insert into. Primary keys are usually ints generated from an ever-increasing seed (identity). Working around this and inserting the same key values programmatically has the disadvantage that someone may take them for a distinct key set on this table rather than a relationship or foreign key value.
You can select primary keys for inserts into other tables, just not into the same table, unless you turn on SET IDENTITY_INSERT for it. Do not do this unless you know what it does; you can create more problems than it's worth if you don't understand the ramifications.
I would just do the ole:
insert into TableB
select *
from TableA
where (criteria)
Simple example (this assumes SQL Server 2008 or higher). My bad, I did not notice which T-SQL version you are on. I'm not sure whether this will run on Oracle or MySQL.
declare @Order Table ( OrderID int identity primary key, person varchar(8));
insert into @Order values ('Brett'),('John'),('Peter');
declare @OrderItem Table (orderItemID int identity primary key, OrderID int, OrderInfo varchar(16));
insert into @OrderItem
select
OrderID -- I can insert a primary key just fine
, person + 'Stuff'
from @Order
select *
from @Order
Select *
from @OrderItem
Add an extra helper column to Order called OldOrderID
Copy all of the Orders from @OldCustomerID to @NewCustomerID
Copy all of the OrderItems using the OldOrderID column to help make the relation
Remove the extra helper column from Order
ALTER TABLE [Order] ADD OldOrderID INT NULL
INSERT INTO [Order] (CustomerID, OrderName, OldOrderID)
SELECT @NewCustomerID, OrderName, OrderID
FROM [Order]
WHERE CustomerID = @OldCustomerID
INSERT INTO OrderItem (OrderID, Quantity)
SELECT o.OrderID, i.Quantity
FROM [Order] o INNER JOIN OrderItem i ON o.OldOrderID = i.OrderID
WHERE o.CustomerID = @NewCustomerID
UPDATE [Order] SET OldOrderID = null WHERE OldOrderID IS NOT NULL
ALTER TABLE [Order] DROP COLUMN OldOrderID
IF the OrderName is unique per customer, you could simply do:
INSERT INTO [Order] ([CustomerID], [OrderName])
SELECT
2 AS [CustomerID],
[OrderName]
FROM [Order]
WHERE [CustomerID] = 1
INSERT INTO [OrderItem] ([OrderID], [Quantity])
SELECT
[o2].[OrderID],
[oi1].[Quantity]
FROM [OrderItem] [oi1]
INNER JOIN [Order] [o1] ON [oi1].[OrderID] = [o1].[OrderID]
INNER JOIN [Order] [o2] ON [o1].[OrderName] = [o2].[OrderName]
WHERE [o1].[CustomerID] = 1 AND [o2].[CustomerID] = 2
Otherwise, you will have to use a temporary table or alter the existing Order table like @LastCoder suggested.
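For what it's worth, the old-to-new OrderID map that the question asks about can also be built row by row on the client side. This is a minimal sketch in Python with the built-in sqlite3 module rather than T-SQL, so AUTOINCREMENT stands in for IDENTITY and cursor.lastrowid stands in for capturing the generated key. The function name copy_customer_orders and the sample rows are made up for illustration, and a loop like this will be slower than a set-based statement on large data.

```python
import sqlite3


def copy_customer_orders(conn, old_customer, new_customer):
    """Copy all orders (and their items) from one customer to another.

    Returns a dict mapping each original OrderID to the OrderID of its copy.
    """
    id_map = {}
    orders = conn.execute(
        'SELECT OrderID, OrderName FROM "Order" WHERE CustomerID = ? ORDER BY OrderID',
        (old_customer,),
    ).fetchall()
    for old_id, name in orders:
        cur = conn.execute(
            'INSERT INTO "Order" (CustomerID, OrderName) VALUES (?, ?)',
            (new_customer, name),
        )
        id_map[old_id] = cur.lastrowid  # key generated for the copied order

    # Copy the items, pointing each copy at the new parent order.
    for old_id, new_id in id_map.items():
        conn.execute(
            "INSERT INTO OrderItem (OrderID, Quantity) "
            "SELECT ?, Quantity FROM OrderItem WHERE OrderID = ? ORDER BY OrderItemID",
            (new_id, old_id),
        )
    return id_map


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "Order" (
        OrderID INTEGER PRIMARY KEY AUTOINCREMENT,
        CustomerID INTEGER,
        OrderName TEXT
    );
    CREATE TABLE OrderItem (
        OrderItemID INTEGER PRIMARY KEY AUTOINCREMENT,
        OrderID INTEGER REFERENCES "Order"(OrderID),
        Quantity INTEGER
    );
    INSERT INTO "Order" (CustomerID, OrderName) VALUES (1, 'first'), (1, 'second');
    INSERT INTO OrderItem (OrderID, Quantity) VALUES (1, 5), (1, 7), (2, 3);
    """
)

id_map = copy_customer_orders(conn, old_customer=1, new_customer=2)
copied = conn.execute(
    'SELECT o.OrderName, i.Quantity FROM "Order" o '
    "JOIN OrderItem i ON i.OrderID = o.OrderID "
    "WHERE o.CustomerID = 2 ORDER BY o.OrderID, i.OrderItemID"
).fetchall()
print(id_map, copied)
```

Unlike the OrderName join above, this does not need the names to be unique per customer, because the map is keyed on the original OrderID.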

Getting those records that have a match of all records in another table

Within the realm of this problem I have 3 entities:
User
Position
License
Then I have two relational (many-to-many) tables:
PositionLicense - this one connects Position with License ie. which licenses are required for a particular position
UserLicense - this one connects User with License ie. which licenses a particular user has. But with an additional complexity: user licenses have validity date range (ValidFrom and ValidTo)
The problem
These are input variables:
UserID that identifies a particular User
RangeFrom defines the lower date range limit
RangeTo defines the upper date range limit
What I need to get? For a particular user (and date range) I need to get a list of positions that this particular user can work at. The problem is that user must have at least all licenses required by every matching position.
I'm having huge problems writing a SQL query to get this list.
If at all possible I would like to do this using a single SQL query (can have additional CTEs of course). If you can convince me that doing it in several queries would be more efficient I'm willing to listen in.
Some workable data
Copy and run this script. 3 users, 3 positions, 6 licenses. Mark and John should have a match but not Jane.
create table [User] (
UserID int identity not null
primary key,
Name nvarchar(100) not null
)
go
create table Position (
PositionID int identity not null
primary key,
Name nvarchar(100) not null
)
go
create table License (
LicenseID int identity not null
primary key,
Name nvarchar(100) not null
)
go
create table UserLicense (
UserID int not null
references [User](UserID),
LicenseID int not null
references License(LicenseID),
ValidFrom date not null,
ValidTo date not null,
check (ValidFrom < ValidTo),
primary key (UserID, LicenseID)
)
go
create table PositionLicense (
PositionID int not null
references Position(PositionID),
LicenseID int not null
references License(LicenseID),
primary key (PositionID, LicenseID)
)
go
insert [User] (Name) values ('Mark the mechanic');
insert [User] (Name) values ('John the pilot');
insert [User] (Name) values ('Jane only has arts PhD but not medical.');
insert Position (Name) values ('Mechanic');
insert Position (Name) values ('Pilot');
insert Position (Name) values ('Doctor');
insert License (Name) values ('Mecha');
insert License (Name) values ('Flying');
insert License (Name) values ('Medicine');
insert License (Name) values ('PhD');
insert License (Name) values ('Phycho');
insert License (Name) values ('Arts');
insert PositionLicense (PositionID, LicenseID) values (1, 1);
insert PositionLicense (PositionID, LicenseID) values (2, 2);
insert PositionLicense (PositionID, LicenseID) values (2, 5);
insert PositionLicense (PositionID, LicenseID) values (3, 3);
insert PositionLicense (PositionID, LicenseID) values (3, 4);
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (1, 1, '20110101', '20120101');
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (2, 2, '20110101', '20120101');
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (2, 5, '20110101', '20120101');
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (3, 4, '20110101', '20120101');
insert UserLicense (UserID, LicenseID, ValidFrom, ValidTo) values (3, 6, '20110101', '20120101');
Resulting solution
I've set up my resulting solution based on the accepted answer, which provides the simplest solution to this problem. If you'd like to play with the query, just hit edit/clone (whether you're logged in or not). What can be changed:
three variables:
two variables to set the date range (@From and @To)
user ID (@User)
you can toggle the commented code in the first CTE to switch between fully overlapping and partially overlapping user licenses.
This makes a number of assumptions (ignores presence of time in the datetime columns, assumes fairly obvious primary keys) and skips the joins to pull in user name, position details, and the like. (And you implied that the user had to hold all the licenses for the full period specified, right?)
SELECT pl.PositionId
from PositionLicense pl
left outer join (-- All licenses the user holds for the entirety of the specified date range
select LicenseId
from UserLicense
where UserId = @UserId
and ValidFrom <= @RangeFrom
and ValidTo >= @RangeTo) li
on li.LicenseId = pl.LicenseId
group by pl.PositionId
-- Where all licenses required by position are held by user
having count(pl.LicenseId) = count(li.LicenseId)
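The counting idea can be tried against the question's sample data. Below is a sketch in Python using the built-in sqlite3 module rather than SQL Server, so dates are stored as 'YYYYMMDD' text and the parameters are named placeholders. It assumes a license counts only if its validity covers the whole requested range, and the function name positions_for is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PositionLicense (PositionID INTEGER, LicenseID INTEGER);
    CREATE TABLE UserLicense (
        UserID INTEGER, LicenseID INTEGER, ValidFrom TEXT, ValidTo TEXT
    );
    -- Same rows as the question's script (1 Mechanic, 2 Pilot, 3 Doctor).
    INSERT INTO PositionLicense VALUES (1, 1), (2, 2), (2, 5), (3, 3), (3, 4);
    INSERT INTO UserLicense VALUES
        (1, 1, '20110101', '20120101'),
        (2, 2, '20110101', '20120101'),
        (2, 5, '20110101', '20120101'),
        (3, 4, '20110101', '20120101'),
        (3, 6, '20110101', '20120101');
    """
)

QUERY = """
SELECT pl.PositionID
FROM PositionLicense pl
LEFT JOIN (
    -- Licenses the user holds for the whole requested range (assumed rule).
    SELECT LicenseID
    FROM UserLicense
    WHERE UserID = :user
      AND ValidFrom <= :range_from
      AND ValidTo >= :range_to
) li ON li.LicenseID = pl.LicenseID
GROUP BY pl.PositionID
-- COUNT skips NULLs, so the counts match only when every required license is held.
HAVING COUNT(pl.LicenseID) = COUNT(li.LicenseID)
ORDER BY pl.PositionID
"""


def positions_for(user_id, range_from, range_to):
    """Return the PositionIDs whose required licenses the user holds in full."""
    params = {"user": user_id, "range_from": range_from, "range_to": range_to}
    return [row[0] for row in conn.execute(QUERY, params)]


mark = positions_for(1, "20110201", "20110301")
john = positions_for(2, "20110201", "20110301")
jane = positions_for(3, "20110201", "20110301")
print(mark, john, jane)
```

With this data Mark matches the Mechanic position, John matches Pilot, and Jane matches nothing, which is what the question expects.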
No data so I can't debug or test it, but this or something very close to it should do the trick.
Select ...
From [User] As U
Cross Join Position As P
Where Exists (
Select 1
From PositionLicense As PL1
Join UserLicense As UL1
On UL1.LicenseId = PL1.LicenseId
And UL1.ValidFrom <= @RangeTo
And UL1.ValidTo >= @RangeFrom
Where PL1.PositionId = P.Id
And UL1.UserId = U.Id
Except
Select 1
From PositionLicense As PL2
Left Join UserLicense As UL2
On UL2.LicenseId = PL2.LicenseId
And UL2.ValidFrom <= @RangeTo
And UL2.ValidTo >= @RangeFrom
And UL2.UserId = U.Id
Where PL2.PositionId = P.Id
And UL2.UserId Is Null
)
If the requirement is that you want users and positions that are valid across the entire range, that is trickier:
With Calendar As
(
Select @RangeFrom As [Date]
Union All
Select DateAdd(d, 1, [Date])
From Calendar
Where [Date] < @RangeTo
)
Select ...
From [User] As U
Cross Join Position As P
Where Exists (
Select 1
From UserLicense As UL1
Join PositionLicense As PL1
On PL1.LicenseId = UL1.LicenseId
Where UL1.UserId = U.Id
And PL1.PositionId = P.Id
And UL1.ValidFrom <= @RangeTo
And UL1.ValidTo >= @RangeFrom
Except
Select 1
From Calendar As C1
Cross Join [User] As U1
Cross Join PositionLicense As PL1
Where U1.Id = U.Id
And PL1.PositionId = P.Id
And Not Exists (
Select 1
From UserLicense As UL2
Where UL2.LicenseId = PL1.LicenseId
And UL2.UserId = U1.Id
And C1.Date Between UL2.ValidFrom And UL2.ValidTo
)
)
Option ( MaxRecursion 0 );
Runnable Version Here
WITH PositionRequirements AS (
SELECT p.PositionID, COUNT(*) AS LicenseCt
FROM #Position AS p
INNER JOIN #PositionLicense AS posl
ON posl.PositionID = p.PositionID
GROUP BY p.PositionID
)
,Satisfied AS (
SELECT u.UserID, posl.PositionID, COUNT(*) AS LicenseCt
FROM #User AS u
INNER JOIN #UserLicense AS perl
ON perl.UserID = u.UserID
-- AND @Date BETWEEN perl.ValidFrom AND perl.ValidTo
AND '20110101' BETWEEN perl.ValidFrom AND perl.ValidTo
INNER JOIN #PositionLicense AS posl
ON posl.LicenseID = perl.LicenseID
-- WHERE u.UserID = @UserID -- Not strictly necessary, we can go over all people
GROUP BY u.UserID, posl.PositionID
)
SELECT PositionRequirements.PositionID, Satisfied.UserID
FROM PositionRequirements
INNER JOIN Satisfied
ON Satisfied.PositionID = PositionRequirements.PositionID
AND PositionRequirements.LicenseCt = Satisfied.LicenseCt
You could probably turn this into an inline table-valued function parameterized on effective date.