orientdb - error on import csv - admin credentials

I tried to import an edge via OETL with OrientDB 2.2.4/2.2.5/2.2.6. In all versions the error is the same. If I use version 2.1 the error doesn't occur.
My JSON file is:
{
"config": {
"log": "info",
"parallel": false
},
"source": {
"file": {
"path": "/opt/orientdb/csvs_1milhao/metodo03/a10a.csv"
}
},
"extractor": {
"row": {
}
},
"transformers": [{
"csv": {
"separator": ",",
"columnsOnFirstLine": true,
"columns": ["psq_id_from:integer",
"pro_id_to:integer",
"ordem:integer"]
}
},
{
"command": {
"command": "create edge PUBLICOU from (SELECT FROM index:Pesquisador.psq_id WHERE key = ${input.psq_id_from}) to (SELECT FROM index:Producao.pro_id where key = ${input.pro_id_to})",
"output": "edge"
}
}],
"loader": {
"orientdb": {
"dbURL": "remote:localhost/dbUmMilhaoM03",
"dbUser": "admin",
"dbPassword": "admin",
"dbURL": "remote:localhost/dbUmMilhaoM03",
"dbType": "graph",
"standardElementConstraints": false,
"batchCommit": 1000,
"classes": [{
"name": "PUBLICOU",
"extends": "E"
}]
}
}
}
When I execute the oetl command, the result is:
root#teste:/opt/orientdb_226/bin# ./oetl.sh /opt/orientdb_226/scripts_orientdb/Db1Milhao/metodo03/a10a_psq_publicou_pro.json >> log_m03
Exception in thread "main" com.orientechnologies.orient.core.exception.OConfigurationException: Error on creating ETL processor
at com.orientechnologies.orient.etl.OETLProcessor.parse(OETLProcessor.java:225)
at com.orientechnologies.orient.etl.OETLProcessor.parse(OETLProcessor.java:176)
at com.orientechnologies.orient.etl.OETLProcessor.parseConfigAndParameters(OETLProcessor.java:144)
at com.orientechnologies.orient.etl.OETLProcessor.main(OETLProcessor.java:108)
Caused by: com.orientechnologies.orient.etl.loader.OLoaderException: unable to manage remote db without server admin credentials
at com.orientechnologies.orient.etl.loader.OOrientDBLoader.manageRemoteDatabase(OOrientDBLoader.java:447)
at com.orientechnologies.orient.etl.loader.OOrientDBLoader.configure(OOrientDBLoader.java:391)
at com.orientechnologies.orient.etl.OETLProcessor.configureComponent(OETLProcessor.java:448)
at com.orientechnologies.orient.etl.OETLProcessor.configureLoader(OETLProcessor.java:262)
at com.orientechnologies.orient.etl.OETLProcessor.parse(OETLProcessor.java:209)
... 3 more
When I execute with OrientDB 2.1, the result is:
Exception in thread "main" com.orientechnologies.orient.etl.OETLProcessHaltedException: com.orientechnologies.orient.core.exception.OCommandExecutionException: Source vertex '#-1:-1' not exists
But the indexes exist:
Name Type Class Properties Engine Actions
Atuacao.atu_id UNIQUE Atuacao [atu_id] SBTREE
dictionary DICTIONARY [undefined] SBTREE
Instituicao.ins_id UNIQUE Instituicao [ins_id] SBTREE
ORole.name UNIQUE ORole [name] SBTREE
OUser.name UNIQUE OUser [name] SBTREE
Pais.pai_id UNIQUE Pais [pai_id] SBTREE
Pesquisador.psq_id UNIQUE Pesquisador [psq_id] SBTREE
Producao.pro_id UNIQUE Producao [pro_id] SBTREE
Publicacao.pub_id UNIQUE Publicacao [pub_id] SBTREE
TipoPublicacao.tpu_id UNIQUE TipoPublicacao [tpu_id] SBTREE
Is this an OrientDB bug?

Try this as your command:
"command": "create edge PUBLICOU from (SELECT expand(rid) FROM index:Pesquisador.psq_id WHERE key = ${input.psq_id_from}) to (SELECT expand(rid) FROM index:Producao.pro_id where key = ${input.pro_id_to})"
This should work because when you select from an index, the RID of the matching record is returned in the rid property, so it has to be expanded to get the actual record.
Or even better, you can select directly from the class instead of the index:
create edge PUBLICOU from (SELECT FROM Pesquisador WHERE psq_id = ${input.psq_id_from}) to (SELECT FROM Producao where pro_id = ${input.pro_id_to})
Queried this way, it uses the indexes as well.
Ivan
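As for the OConfigurationException on 2.2.x itself: the stack trace says the loader cannot manage a remote database without server admin credentials. Since 2.2 the ETL loader expects server-level credentials alongside dbUser/dbPassword whenever dbURL is remote. A minimal sketch of the loader section, assuming the default root server account (replace the password with your own):
"loader": {
"orientdb": {
"dbURL": "remote:localhost/dbUmMilhaoM03",
"dbUser": "admin",
"dbPassword": "admin",
"serverUser": "root",
"serverPassword": "your_root_password",
"dbType": "graph"
}
}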

Related

Can I add an index to a table in Tilda for performance?

We created a simple table in Tilda and want to create an index. How do we do that?
{ "name": "Test"
,"description": "The table to keep track tests amd owners"
,"columns":[
{ "name": "id" , "type": "STRING" , "nullable": false, "size": 128, "invariant": true, "description": "The name of the test" }
,{ "name": "name" , "type": "STRING" , "nullable": false, "size": 128, "invariant": true, "description": "The name of the test" }
,{ "name": "test_fk" , "sameAs":"Test.refnum" , "nullable": false, "invariant": true, "description": "The name of the test" }
]
,"primary": { "autogen": true }
,"foreign": [
{ "name":"Test", "srcColumns":["test_fk"], "destObject":"Test" }
]
}
You can see documentation details about Index creation in table definitions in the Wiki:
https://github.com/CapsicoHealth/Tilda/wiki/Tilda-JSON-Syntax%3A-Object-Indices
// Within an Object definition, you can define any number of indices (or none).
"indices":[
// A unique index that exists in the database over colA and colB
{ "name":"Blah1", "columns":["colA", "ColB"], "db":true }
// A regular (non-unique) index that overlaps with the previous definition
// but only exists for application needs (no database definition).
,{ "name":"Blah2", "columns":["colA"] , "orderBy":["colB"], "db":false}
// A regular index with only an orderBy definition.
,{ "name":"Blah3", "orderBy":["colB"], "db":true }
]
The markup above shows the JSON to create a variety of indices with comments explaining what each attribute does.
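Applied to the Test table from the question, you would add an indices array to its definition, for example (a sketch; the index name and the choice of the name column are assumptions):
,"indices":[
{ "name":"Test_Name", "columns":["name"], "db":true }
]
With "db":true, Tilda generates the index in the database along with the table.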

Loopback indexes - how to specify different index types in model definition?

In Loopback (v3), when defining indexes in my model.json files, how do I specify different types of indexes (such as a BRIN)? Also, how do I specify index conditions (such as if I want to create a partial index)? I'm using postgres for the database, if that's relevant.
You can configure the index type via the type field:
{
"name": "MyModel",
"properties": {
// ...
},
"indexes": {
"myindex": {
"columns": "name, email",
"type": "BRIN",
// ...
}
}
}
I am afraid LoopBack does not support index conditions (partial indexes) yet. Feel free to open a new issue in https://github.com/strongloop/loopback-connector-postgresql/issues.
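In the meantime, a workaround is to create the partial index directly in Postgres after auto-migration, outside of LoopBack's knowledge (hypothetical table and column names):
-- partial index: only rows with active = true are indexed
CREATE INDEX mymodel_name_partial_idx ON mymodel (name) WHERE active = true;
Just remember that LoopBack will not recreate it if you drop and re-migrate the table.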
I was trying to add this in LB4. It's pretty straightforward there (and should be the same for LB3 as well, I hope):
@model({
name: 'tablename',
settings: {
indexes: {
idx_tablename: {
columnA : '',
columnB : '',
columnC: ''
}
}
}
})
Once the build is done, an index named idx_tablename covering the 3 columns will be created.
In PostgreSQL with LoopBack 3 you can specify a multi-column index like this.
The following LoopBack JSON creates an index in Postgres where the fields message and type are unique together:
{
"name": "notification",
"base": "PersistedModel",
"idInjection": true,
"options": {
"validateUpsert": true
},
"properties": {
"message": {
"type": "string",
"required": true
},
"type": {
"type": "string",
"required": true
},
"seen": {
"type": "boolean",
"required": true,
"default": false
}
},
"validations": [],
"relations": {},
"acls": [],
"methods": {},
"indexes": {
"message_type_index": {
"keys": "message, type",
"options": {"unique": true}
}
}
}

Copying 7 column table to 6 column table

I'm porting SQL Server Integration Services packages to Azure Data Factory.
I have two tables (Table 1 and Table 2) which live on different servers. One has seven columns, the other six. I followed the example at https://learn.microsoft.com/en-us/azure/data-factory/data-factory-map-columns
Table 1 DDL:
CREATE TABLE dbo.Table1
(
zonename nvarchar(max),
propertyname nvarchar(max),
basePropertyid int,
dfp_ad_unit_id bigint,
MomentType nvarchar(200),
OperatingSystemName nvarchar(50)
)
Table 2 DDL
CREATE TABLE dbo.Table2
(
ZoneID int IDENTITY,
ZoneName nvarchar(max),
propertyName nvarchar(max),
BasePropertyID int,
dfp_ad_unit_id bigint,
MomentType nvarchar(200),
OperatingSystemName nvarchar(50)
)
In ADF, I define Table 1 as:
{
"$schema": "http://datafactories.schema.management.azure.com/schemas/2015-09-01/Microsoft.DataFactory.Table.json",
"name": "Table1",
"properties": {
"type": "AzureSqlTable",
"linkedServiceName": "PlatformX",
"structure": [
{ "name": "zonename" },
{ "name": "propertyname" },
{ "name": "basePropertyid" },
{ "name": "dfp_ad_unit_id" },
{ "name": "MomentType" },
{ "name": "OperatingSystemName" }
],
"external": true,
"typeProperties": {
"tableName": "Platform.Zone"
},
"availability": {
"frequency": "Day",
"interval": 1
}
}
}
In ADF I define Table 2 as:
{
"$schema": "http://datafactories.schema.management.azure.com/schemas/2015-09-01/Microsoft.DataFactory.Table.json",
"name": "Table2",
"properties": {
"type": "SqlServerTable",
"linkedServiceName": "BrixDW",
"structure": [
{ "name": "ZoneID" },
{ "name": "ZoneName" },
{ "name": "propertyName" },
{ "name": "BasePropertyID" },
{ "name": "dfp_ad_unit_id" },
{ "name": "MomentType" },
{ "name": "OperatingSystemName" }
],
"external": true,
"typeProperties": {
"tableName": "staging.DimZone"
},
"availability": {
"frequency": "Day",
"interval": 1
}
}
}
As you can see, Table2 has an identity column, which will be populated automatically.
This should be a simple Copy activity:
{
"$schema": "http://datafactories.schema.management.azure.com/schemas/2015-09-01/Microsoft.DataFactory.Pipeline.json",
"name": "Copy_Table1_to_Table2",
"properties": {
"description": "Copy_Table1_to_Table2",
"activities": [
{
"name": "Copy_Table1_to_Table2",
"type": "Copy",
"inputs": [
{ "name": "Table1" }
],
"outputs": [
{
"name": "Table2"
}
],
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": "select * from dbo.Table1"
},
"sink": {
"type": "SqlSink"
},
"translator": {
"type": "TabularTranslator",
"columnMappings": "zonename: ZoneName, propertyname: propertyName, basePropertyid: BasePropertyID, dfp_ad_unit_id: dfp_ad_unit_id, MomentType: MomentType, OperatingSystemName: OperatingSystemName"
}
},
"policy": {
"concurrency": 1,
"executionPriorityOrder": "OldestFirst",
"retry": 3,
"timeout": "01:00:00"
},
"scheduler": {
"frequency": "Day",
"interval": 1
}
}
],
"start": "2017-07-23T00:00:00Z",
"end": "2020-07-19T00:00:00Z"
}
}
I figured that by not mapping ZoneID, it would just be ignored, but ADF gives me the following error:
Copy activity encountered a user error: GatewayNodeName=APP1250S,ErrorCode=UserErrorInvalidColumnMappingColumnCountMismatch,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Invalid column mapping provided to copy activity: 'zonename: ZoneName, propertyname: propertyName, basePropertyid: BasePropertyID, dfp_ad_unit_id: dfp_ad_unit_id, MomentType: MomentType, OperatingSystemName: OperatingSystemName', Detailed message: Different column count between target structure and column mapping. Target column count:7, Column mapping count:6. Check column mapping in table definition.,Source=Microsoft.DataTransfer.Common,'
In a nutshell, I'm trying to copy a 7-column table to a 6-column table and Data Factory doesn't like it. How can I accomplish this task?
I realize this is an old question, but I ran into this issue just now. My problem was that I initially generated the destination/sink table, created a pipeline, and then added a column.
Despite clearing and reimporting the schemas, the pipeline would throw the above error whenever it was triggered. I made sure the new column (which has a default on it) was deselected in the mappings, so it would only use the default value. The error was still thrown.
The only way I managed to get things to work was by completely recreating the pipeline from scratch. It's almost as if the old mappings were retained somewhere in the metadata.
I had the exact same issue and solved it by going into the Azure dataset and removing the identity column, then making sure I had the same number of columns in the source and target (sink). After doing this, the copy adds the records and the identity in the table just works as expected. I did not have to modify the physical table in SQL, only the dataset for the table in Azure.
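In other words, the structure array of the Table2 dataset is trimmed to the six mapped columns, leaving the identity column out entirely (a sketch based on the dataset in the question):
"structure": [
{ "name": "ZoneName" },
{ "name": "propertyName" },
{ "name": "BasePropertyID" },
{ "name": "dfp_ad_unit_id" },
{ "name": "MomentType" },
{ "name": "OperatingSystemName" }
]
With the target column count now matching, the six-entry columnMappings is accepted.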
One option would be to create a view over the 7-column table which does not include the identity column and insert into that view.
CREATE VIEW bulkLoad.Table2
AS
SELECT
ZoneName,
propertyName,
BasePropertyID,
dfp_ad_unit_id,
MomentType,
OperatingSystemName
FROM dbo.Table2;
GO
I can do some digging and see if some trick is possible with the column mapping but that should unblock you.
HTH
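If you go the view route, the sink dataset would then point at the view instead of the base table; only tableName changes (the bulkLoad schema name comes from the view above):
"typeProperties": {
"tableName": "bulkLoad.Table2"
}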
I was told by MSFT support to just remove the identity column from the table definition. It seems to have worked.

OrientDB: missing "half edges"

I'm still playing with OrientDB.
Now I'm trying the schema functionalities, which look awesome :-)
I have two data files: joinA.txt and joinB.txt, which I used to populate a database with the following schema (the content of the two files is at the end of the post):
CREATE CLASS Employee EXTENDS V;
CREATE PROPERTY Employee.eid Integer;
CREATE PROPERTY Employee.name String;
CREATE PROPERTY Employee.eage Short;
CREATE INDEX Employee.eid unique_hash_index;
CREATE CLASS ExtendedProfile EXTENDS V;
CREATE CLASS XYZProfile EXTENDS ExtendedProfile;
CREATE PROPERTY XYZProfile.textual String;
-- SameAs can only connect Employees to ExtendedProfile
CREATE CLASS SameAs EXTENDS E; -- same employee across many tables
CREATE PROPERTY SameAs.out LINK ExtendedProfile;
CREATE PROPERTY SameAs.In LINK Employee;
The JSONs I gave to the ETL tool are, for JoinA:
{
"source": { "file": {"path": "the_path"}},
"extractor": {"csv": {
"separator": " ",
"columns": [
"eid:Integer",
"name:String",
"eage:Short"
]
}
},
"transformers": [
{"vertex": {"class": "Employee", "skipDuplicates": true}}
]
,"loader": {
"orientdb": {
"dbURL": "plocal:thepath",
"dbType": "graph",
"useLightweightEdges": false
}
}
}
and for JoinB:
{
"source": { "file": {"path": "thepath"}},
"extractor": {"csv": {
"separator": " ",
"columnsOnFirstLine": false,
"quote": "\"",
"columns": [
"id:String",
"textual:String"
]
}
},
"transformers": [
{"vertex": {"class": "XYZProfile", "skipDuplicates": true}},
{ "edge": { "class": "SameAs",
"direction": "out",
"joinFieldName": "id",
"lookup":"Employee.eid",
"unresolvedLinkAction":"ERROR"}},
],
"loader": {
"orientdb": {
"dbURL": "path",
"dbUser": "root",
"dbPassword": "pwd",
"dbType": "graph",
"useLightweightEdges": false}
}
}
Now, the problem is that when I run select expand(both()) from Employee I get the edges in the out_SameAs column, while select expand(both()) from XYZProfile returns nothing.
This is weird, since the first query told me that the class pointed to by the edges is XYZProfile.
Does anybody know what's wrong with my example?
Cheers,
Alberto
JoinA:
1 A 10
2 B 14
3 C 22
JoinB:
1 i0
1 i1
2 i2
Check your JSON file; I think there is an error in it. You forgot to put [] at the beginning and end of the file.
It was actually my fault.
The line CREATE PROPERTY SameAs.In LINK Employee; was the problem: In should have been all lowercased, as pointed out here.
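For reference, the working property definitions; the link properties of an edge class must be named out and in, both lowercase:
CREATE PROPERTY SameAs.out LINK ExtendedProfile;
CREATE PROPERTY SameAs.in LINK Employee;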

OrientDB: Create Edges with subselect

I have a simple tree structure in a MySQL table (id, parentId) with about 3 million vertices and wanted to import this into a OrientDB Graph database. The ETL importer imports the vertices smoothly, but can't create edges (NullPointerException). The ETL does not even work on a plain database with the given examples in the documentation (http://orientdb.com/docs/last/Import-a-tree-structure.html throws the same exception), so I just imported the vertices and wanted to create the edges manually.
I have a Vertex class (Address) with two properties (id, parentId) and I want to create the Edges between these Vertices (parentId -> id). Is there a simple way to do this instead of inserting the edges in a loop? Something like in SQL
INSERT INTO E (out, in) VALUES (SELECT parentId, id FROM Address)
Since edges are only supposed to be created with CREATE EDGE, I guess OrientDB does not support such an operation by default. But maybe there is a workaround to create these 3 million edges?
I found it is easy to create a link between the two records:
CREATE LINK parentLink TYPE LINK FROM Address.parentId TO Address.Id
However, I cannot create edges in such a way. I tried working with variables:
CREATE EDGE isParentOf FROM (SELECT FROM Address) TO (SELECT FROM Address WHERE id = $current.parentId)
But that does not work.
Have you tried this ETL JSON?
{
"config": {"log": "debug", "parallel": true },
"extractor" : {
"jdbc": { "driver": "oracle.jdbc.driver.OracleDriver",
"url": "jdbc:oracle:thin:hostname/db",
"userName": "username",
"userPassword": "password",
"query": "select id, A.parentId from Address a where rownum<2" }
},
"transformers": [`enter code here`
{ "vertex": { "class": "Address" }},
{ "edge": { "class": "isParentOf",
"joinFieldName": "parentId",
"lookup": "Address.Id",
"direction": "in",
"skipDuplicates":true
}
}
],
"loader": {
"orientdb": {
"dbURL": "remote:server/db",
"dbUser": "user",
"dbPassword": "passwd!",
"dbType": "graph",
"classes": [
{"name": "Address", "extends": "V"},
{"name": "isParentOf", "extends": "E"}
], "indexes": [
{"class":"Address", "fields":["ID:string"], "type":"UNIQUE" }
]
}
}
}
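Assuming the config above is saved as tree_edges.json (a hypothetical name), you would run it with the ETL script shipped with OrientDB:
./oetl.sh /path/to/tree_edges.json
Note that "direction": "in" makes each edge point from the looked-up parent (matched via the lookup on Address.Id against the row's parentId) to the vertex currently being loaded, so it reads parent isParentOf child; the unique index declared in the loader's indexes section is what keeps those 3 million lookups fast.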