OrientDB - How to create edge on imported data

I understand that in a graph DB, relations are created after insertion and that they are record-to-record relations instead of field-to-field relations. If my understanding is correct, what do I need to do when I import a million records from a CSV file and need to relate them to records in another class?
Is it not possible to define relations (edges) at design time, before insertion, so that whenever a record is inserted its relation is already there?
My database is detailed as below.
create class Country extends V
create class Immigrant extends V
create class comesFrom extends E
create property Country.c_id integer
create property Country.c_name String
create property Immigrant.i_id integer
create property Immigrant.i_name String
create property Immigrant.i_country Integer
If I create the edges manually, it will be like this.
insert into Country(c_id, c_name) values (1, 'USA')
insert into Country(c_id, c_name) values (2, 'UK')
insert into Country(c_id, c_name) values (3,'PAK')
insert into Immigrant(i_id, i_name,i_country) values (1, 'John',1)
insert into Immigrant(i_id, i_name,i_country) values (2, 'Graham',2)
insert into Immigrant(i_id, i_name,i_country) values (3, 'Ali',3)
create edge comesFrom from (select from Immigrant where i_country = 1) to (select from Country where c_id = 1)
create edge comesFrom from (select from Immigrant where i_country = 2) to (select from Country where c_id = 2)
create edge comesFrom from (select from Immigrant where i_country = 3) to (select from Country where c_id = 3)

You can use OrientDB ETL to do that. For more information look at this example: http://orientdb.com/docs/last/Import-from-CSV-to-a-Graph.html
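If you would rather stay with plain SQL after a bulk import, one option is to generate one CREATE EDGE statement per distinct country id, following the manual pattern from the question. Here is a minimal sketch in Python, assuming the immigrants CSV has a header row with an i_country column. The helper name edge_statements and the inline sample CSV are made up for illustration, and the generated statements still have to be run through the OrientDB console or a driver.

```python
import csv
import io

def edge_statements(csv_text):
    """Build one CREATE EDGE statement per distinct i_country value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # int() also keeps arbitrary text out of the generated statement.
    country_ids = sorted({int(row["i_country"]) for row in reader})
    return [
        "create edge comesFrom "
        f"from (select from Immigrant where i_country = {cid}) "
        f"to (select from Country where c_id = {cid})"
        for cid in country_ids
    ]

# Hypothetical sample data mirroring the rows in the question.
sample = "i_id,i_name,i_country\n1,John,1\n2,Graham,2\n3,Ali,3\n4,James,2\n"
statements = edge_statements(sample)
for statement in statements:
    print(statement)
```

Each statement links every immigrant of one country in a single pass, so it should be run once, after all rows are loaded. Running the same statement again adds a second edge for immigrants that are already linked.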

Related

postgres constraint on one table according to data in another table

I have a database with 2 tables, foo and foo_trash.
Both have the same structure with an id (primary key), and a title. foo_trash gets populated with data copied from foo with a statement like this:
INSERT INTO foo_trash (SELECT * FROM foo WHERE id = 253)
I would like to add a constraint on the table foo_trash so that no rows may be inserted into foo_trash if the same pair id and title is not present in foo.
How do I write that?
Given the table foo:
create table foo (
id int,
title varchar(50),
primary key (id, title)
);
Define the table foo_trash to reference the two columns you mentioned:
create table foo_trash (
id int primary key,
title varchar(50),
FOREIGN KEY (id, title) REFERENCES foo (id, title)
);
Now you can insert data into foo:
insert into foo values (1, 'title1');
insert into foo values (2, 'title2');
insert into foo values (3, 'title3');
insert into foo values (253, 'title253');
If you try to insert a row into foo_trash that doesn't exist in foo, you will receive an error:
insert into foo_trash values (4, 'title4');
Output:
ERROR: insert or update on table "foo_trash" violates foreign key constraint "foo_trash_id_title_fkey"
DETAIL: Key (id, title)=(4, title4) is not present in table "foo".
You can insert a row in foo_trash that exists in foo:
insert into foo_trash values (3, 'title3');
And you can do your insert into foo_trash as select from foo successfully, assuming that id exists:
INSERT INTO foo_trash (SELECT * FROM foo WHERE id = 253);
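The behaviour can be tried out without a PostgreSQL server. The sketch below uses SQLite through Python's sqlite3 module as a stand-in (an assumption, not what the question uses); the composite foreign key works the same way there, except that SQLite only enforces it after PRAGMA foreign_keys = ON and reports a shorter error message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("create table foo (id int, title varchar(50), primary key (id, title))")
conn.execute(
    "create table foo_trash ("
    " id int primary key,"
    " title varchar(50),"
    " foreign key (id, title) references foo (id, title))"
)
conn.executemany(
    "insert into foo values (?, ?)",
    [(1, "title1"), (3, "title3"), (253, "title253")],
)

# Copying an existing row is accepted.
conn.execute("insert into foo_trash select * from foo where id = 253")

# A pair that is not present in foo is rejected.
try:
    conn.execute("insert into foo_trash values (4, 'title4')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

trash_rows = conn.execute("select id, title from foo_trash").fetchall()
print(rejected, trash_rows)
```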

Query matching property in another table given a comma-separated string in JSONB

I would like to look up a property in another table B, where the source is part of a comma-separated string inside a JSONB column of table A.
create table option
(
optionid bigint not null primary key,
attributevalues jsonb default '{}'::jsonb
);
create table district
(
districtid bigint not null primary key,
uid varchar(11) not null,
name varchar(230) not null unique
);
INSERT into option values (1, '{"value": "N8UXIAycxy3,uVwyu3R4nZG,fuja8k8PCFO,y0eUmlYp7ey", "attribute": {"id": "K54wAf6EX0s"}}'::jsonb);
INSERT INTO district (districtid, uid, name) VALUES
(1, 'N8UXIAycxy3', 'district1'),
(2, 'uVwyu3R4nZG', 'district2'),
(3, 'fuja8k8PCFO', 'district3'),
(4, 'y0eUmlYp7ey', 'district4');
I can get all the items split by , but how do I "join" to look up the name (e.g. N8UXIAycxy3 --> district1)?
I tried to "join" in a traditional sense but this will not work as the district_uid is not accessible for the query as such:
SELECT UNNEST(STRING_TO_ARRAY(co.attributevalues #>> '{"K54wAf6EX0s", "value"}', ',')) AS district_uid
FROM option o
JOIN district d on district_uid = d.uid;
I would like to have the query result: district1,district2,district3,district4. Is this possible or do I need a loop?
You need to convert the comma-separated string, i.e. attributevalues->>'value', to an array and unnest it in the FROM clause, so that each uid becomes a row you can join on:
select name
from option
cross join unnest(string_to_array(attributevalues->>'value', ',')) as district_uid
join district on uid = district_uid
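This returns one row per district. To get the single comma-separated value asked for, select string_agg(name, ',') instead of name.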
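To see why the split has to happen before the join, here is the same lookup spelled out in plain Python. This only illustrates the order of operations and is not how PostgreSQL executes the query; the variable names are made up for the sketch.

```python
import json

# The JSONB value from the option row in the question.
attributevalues = json.loads(
    '{"value": "N8UXIAycxy3,uVwyu3R4nZG,fuja8k8PCFO,y0eUmlYp7ey",'
    ' "attribute": {"id": "K54wAf6EX0s"}}'
)

# The district table, keyed by uid.
district_by_uid = {
    "N8UXIAycxy3": "district1",
    "uVwyu3R4nZG": "district2",
    "fuja8k8PCFO": "district3",
    "y0eUmlYp7ey": "district4",
}

# Step 1: string_to_array + unnest, one uid per "row".
uids = attributevalues["value"].split(",")

# Step 2: inner join on uid; unknown uids are dropped.
names = [district_by_uid[uid] for uid in uids if uid in district_by_uid]

# Step 3: string_agg(name, ',').
result = ",".join(names)
print(result)  # district1,district2,district3,district4
```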

OrientDB - Multiple records showing after re-applying edges

I have a problem. When I create the edges the first time, the number of records in the output is correct. But when I add another record to the class and create the edge again, I get multiple records. Here is what I am doing.
create class Country extends V
create class Immigrant extends V
create class comesFrom extends E
create property Country.c_id integer
create property Country.c_name String
create property Immigrant.i_id integer
create property Immigrant.i_name String
create property Immigrant.i_country Integer
insert into Country(c_id, c_name) values (1, 'USA')
insert into Country(c_id, c_name) values (2, 'UK')
insert into Country(c_id, c_name) values (3,'PAK')
insert into Immigrant(i_id, i_name,i_country) values (1, 'John',1)
insert into Immigrant(i_id, i_name,i_country) values (2, 'Graham',2)
insert into Immigrant(i_id, i_name,i_country) values (3, 'Ali',3)
create edge comesFrom from (select from Immigrant where i_country = 1) to (select from Country where c_id = 1)
create edge comesFrom from (select from Immigrant where i_country = 2) to (select from Country where c_id = 2)
create edge comesFrom from (select from Immigrant where i_country = 3) to (select from Country where c_id = 3)
select i_id, i_name, out('comesFrom').c_id as c_id, out('comesFrom').c_name as c_name from Immigrant unwind c_id, c_name
I get the result as below.
[Image: correct records]
Then I add another record to the class Immigrant.
insert into Immigrant(i_id, i_name,i_country) values (4, 'James',2)
And create the edge again. Please note that the new immigrant belongs to an already existing country.
create edge comesFrom from (select from Immigrant where i_country = 2) to (select from Country where c_id = 2)
I run the same query as below.
select i_id, i_name, out('comesFrom').c_id as c_id, out('comesFrom').c_name as c_name from Immigrant unwind c_id, c_name
Now I get multiple records as below.
[Image: incorrect records]
What am I doing wrong?
Thank you!
The problem is this command:
create edge comesFrom from (select from Immigrant where i_country = 2)
to (select from Country where c_id = 2)
Because if you execute only this part:
select from Immigrant where i_country = 2
You can see that there are 2 results: Graham and James.
So the command creates an edge from each of the two immigrants (Graham and James) to the country. Graham already had an edge from the first run, so he now has two, which is why he appears multiple times in the output.
To avoid this, create the edge only for the new immigrant, for example by filtering on the immigrant's name.
I have attached a few screenshots to make this clearer.
The problem: http://i.stack.imgur.com/NptOt.png
Your Solution: http://i.stack.imgur.com/PfU6c.png
My Solution: http://i.stack.imgur.com/WEEal.png
Regards
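The effect is easy to reproduce outside the database. The following Python sketch only simulates what the repeated command does (it does not talk to OrientDB); run_scenario and the only_unlinked flag are names invented for the illustration. With the guard switched on, immigrants that already have an edge are skipped, which corresponds to narrowing the FROM subquery to the new immigrant.

```python
def run_scenario(only_unlinked):
    # (i_id, i_country) pairs: John, Graham, Ali
    immigrants = [(1, 1), (2, 2), (3, 3)]
    edges = []  # (i_id, c_id) pairs standing in for comesFrom edges

    def create_edges(country_id):
        """Mimic: create edge from (immigrants of country) to (country)."""
        for i_id, i_country in immigrants:
            if i_country != country_id:
                continue
            if only_unlinked and any(src == i_id for src, _ in edges):
                continue  # this immigrant already has a comesFrom edge
            edges.append((i_id, country_id))

    for cid in (1, 2, 3):
        create_edges(cid)       # the initial three commands
    immigrants.append((4, 2))   # James, also from country 2
    create_edges(2)             # the repeated command from the question
    return edges

naive = run_scenario(only_unlinked=False)
guarded = run_scenario(only_unlinked=True)

# Number of edges Graham (i_id 2) ends up with in each case.
print(naive.count((2, 2)), guarded.count((2, 2)))  # 2 1
```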

Select value from an enumerated list in PostgreSQL

I want to select from an enumeration that is not in the database.
E.g. SELECT id FROM my_table returns values like 1, 2, 3
I want to display 1 -> 'chocolate', 2 -> 'coconut', 3 -> 'pizza' etc. SELECT CASE works but is too complicated and hard to read for many values. I am thinking of something like
SELECT id, array['chocolate','coconut','pizza'][id] FROM my_table
But I couldn't succeed with arrays. Is there an easy solution? So this is a simple query, not a plpgsql script or something like that.
with food (fid, name) as (
values
(1, 'chocolate'),
(2, 'coconut'),
(3, 'pizza')
)
select t.id, f.name
from my_table t
join food f on f.fid = t.id;
or without a CTE (but using the same idea):
select t.id, f.name
from my_table t
join (
values
(1, 'chocolate'),
(2, 'coconut'),
(3, 'pizza')
) f (fid, name) on f.fid = t.id;
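For the array approach, the array expression needs parentheses around it before it can be subscripted. This is the correct syntax: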
SELECT id, (array['chocolate','coconut','pizza'])[id] FROM my_table
But you should create a lookup table with those values and reference it.
What about creating another table that enumerates all the cases and joining to it?
CREATE TABLE table_case
(
case_id bigserial NOT NULL,
case_name character varying,
CONSTRAINT table_case_pkey PRIMARY KEY (case_id)
)
WITH (
OIDS=FALSE
);
and when you select from your table:
SELECT id, case_name FROM my_table
inner join table_case on case_id = my_table.id;
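The VALUES-based join from the first answer can be checked quickly without a PostgreSQL server. The sketch below runs it on SQLite through Python's sqlite3 module as a stand-in (an assumption; SQLite accepts the same CTE syntax here). The extra id 4 is added only to show what happens to values that have no label.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (id integer)")
conn.executemany("insert into my_table values (?)", [(1,), (2,), (3,), (4,)])

rows = conn.execute(
    """
    with food (fid, name) as (
        values (1, 'chocolate'), (2, 'coconut'), (3, 'pizza')
    )
    select t.id, f.name
    from my_table t
    join food f on f.fid = t.id
    order by t.id
    """
).fetchall()
print(rows)
```

Because this is an inner join, id 4 has no matching label and is left out of the result. Use a left join instead if unmapped ids should still appear, with a NULL name.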

Using views for access control in PostgreSQL

I have a schema of tables whose contents basically boil down to:
A set of users
A set of object groups
An access control list (acl) indicating what users have access to what groups
A set of objects, each of which belongs to exactly one group.
I want to create a simple application that supports access control. I'm thinking views would be a good approach here.
Suppose I have the following database initialization:
/* Database definition */
BEGIN;
CREATE SCHEMA foo;
CREATE TABLE foo.users (
id SERIAL PRIMARY KEY,
name TEXT
);
CREATE TABLE foo.groups (
id SERIAL PRIMARY KEY,
name TEXT
);
CREATE TABLE foo.acl (
user_ INT REFERENCES foo.users,
group_ INT REFERENCES foo.groups
);
CREATE TABLE foo.objects (
id SERIAL PRIMARY KEY,
group_ INT REFERENCES foo.groups,
name TEXT,
data TEXT
);
/* Sample data */
-- Create groups A and B
INSERT INTO foo.groups VALUES (1, 'A');
INSERT INTO foo.groups VALUES (2, 'B');
-- Create objects belonging to group A
INSERT INTO foo.objects VALUES (1, 1, 'object in A', 'apples');
INSERT INTO foo.objects VALUES (2, 1, 'another object in A', 'asparagus');
-- Create objects belonging to group B
INSERT INTO foo.objects VALUES (3, 2, 'object in B', 'bananas');
INSERT INTO foo.objects VALUES (4, 2, 'object in B', 'blueberries');
-- Create users
INSERT INTO foo.users VALUES (1, 'alice');
INSERT INTO foo.users VALUES (2, 'amy');
INSERT INTO foo.users VALUES (3, 'billy');
INSERT INTO foo.users VALUES (4, 'bob');
INSERT INTO foo.users VALUES (5, 'caitlin');
INSERT INTO foo.users VALUES (6, 'charlie');
-- alice and amy can access group A
INSERT INTO foo.acl VALUES (1, 1);
INSERT INTO foo.acl VALUES (2, 1);
-- billy and bob can access group B
INSERT INTO foo.acl VALUES (3, 2);
INSERT INTO foo.acl VALUES (4, 2);
-- caitlin and charlie can access groups A and B
INSERT INTO foo.acl VALUES (5, 1);
INSERT INTO foo.acl VALUES (5, 2);
INSERT INTO foo.acl VALUES (6, 1);
INSERT INTO foo.acl VALUES (6, 2);
COMMIT;
My idea is to use views that mirror the database, but restrict content to only that which the current user (ascertained by my PHP script) may access (here I'll just use the user 'bob'). Suppose I run this at the beginning of every PostgreSQL session (meaning every time someone accesses a page on my site):
BEGIN;
CREATE TEMPORARY VIEW users AS
SELECT * FROM foo.users
WHERE name='bob';
CREATE TEMPORARY VIEW acl AS
SELECT acl.* FROM foo.acl, users
WHERE acl.user_=users.id;
CREATE TEMPORARY VIEW groups AS
SELECT groups.* FROM foo.groups, acl
WHERE groups.id=acl.group_;
CREATE TEMPORARY VIEW objects AS
SELECT objects.* FROM foo.objects, groups
WHERE objects.group_=groups.id;
COMMIT;
My question is, is this a good approach? Do these CREATE TEMPORARY VIEW statements produce significant overhead, especially compared to a couple simple queries?
Also, is there a way to make these views permanent in my database definition, then bind a value to the user name per session? This way, it doesn't have to create all these views every time a user loads a page.
Several problems with this approach:
One user web session is not the same thing as one database session. With this sort of setup, multiple users sharing a database connection would fail instantly.
Management overhead creating/destroying the views.
Instead, I would recommend something like the following view:
CREATE VIEW AllowedObjects AS
SELECT objects.*, users.name AS alloweduser
FROM objects
INNER JOIN groups ON groups.id = objects.group_
INNER JOIN acl ON acl.group_ = groups.id
INNER JOIN users ON users.id = acl.user_
Then, everywhere you select objects:
SELECT * FROM AllowedObjects
WHERE alloweduser='bob'
This assumes Bob can only have one ACL joining him to a particular group, otherwise a DISTINCT would be necessary.
This could be abstracted to a slightly less complex view that could be used to make it easier to check permissions for UPDATE and DELETE:
CREATE VIEW AllowedUserGroup AS
SELECT groups.id AS allowedgroup, users.name AS alloweduser
FROM groups
INNER JOIN acl ON acl.group_ = groups.id
INNER JOIN users ON users.id = acl.user_
This provides a flattened view of which users are in which groups, which you can check against the objects table during an UPDATE/DELETE:
UPDATE objects SET data='bar' WHERE id=42 AND EXISTS
(SELECT NULL FROM AllowedUserGroup
WHERE alloweduser='bob' AND allowedgroup = objects.group_)
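A permanent view filtered by a bound parameter can be tried end to end with a small script. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption), drops the foo schema prefix, and loads only a subset of the sample data. The helper visible_objects is a made-up name for what the PHP page would do on each request.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table users (id integer primary key, name text);
    create table groups (id integer primary key, name text);
    create table acl (user_ int references users, group_ int references groups);
    create table objects (id integer primary key, group_ int references groups,
                          name text, data text);

    insert into groups values (1, 'A'), (2, 'B');
    insert into objects values (1, 1, 'object in A', 'apples'),
                               (3, 2, 'object in B', 'bananas');
    insert into users values (4, 'bob'), (5, 'caitlin');
    insert into acl values (4, 2), (5, 1), (5, 2);

    -- Created once, as part of the database definition.
    create view AllowedObjects as
    select objects.*, users.name as alloweduser
    from objects
    inner join groups on groups.id = objects.group_
    inner join acl on acl.group_ = groups.id
    inner join users on users.id = acl.user_;
    """
)

def visible_objects(user_name):
    """Return the data column of every object the given user may see."""
    rows = conn.execute(
        "select data from AllowedObjects where alloweduser = ? order by id",
        (user_name,),
    )
    return [data for (data,) in rows]

print(visible_objects("bob"))      # ['bananas']
print(visible_objects("caitlin"))  # ['apples', 'bananas']
```

The user name is passed as a query parameter on every request, so nothing has to be created or dropped per session. The comparison is case-sensitive, so the value has to match the stored name exactly.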