OrientDB distributed replica on embedded server

We are setting up a distributed OrientDB database on an embedded server (we are using OrientDB v2.2.31). We would like to have a master-replica configuration, but we have encountered some issues in doing that.
We have set up the default-distributed-db-config.json file in the following way, both for the master and for the replica:
{
"autoDeploy": true,
"hotAlignment": true,
"executionMode": "asynchronous",
"readQuorum": 1,
"writeQuorum": 1,
"failureAvailableNodesLessQuorum": false,
"readYourWrites": true,
"newNodeStrategy" : "static",
"servers": {
"orientdb_master": "master",
"orientdb_replica1": "replica"
},
"clusters": {
"internal": {
},
"index": {
},
"*": {
"servers": ["<NEW_NODE>"]
}
}
}
"orientdb_master" and "orientdb_replica1" are the hostnames associated to the the master and slave server, respectively.
We start the master server first and then the other server: the connection between them takes place without problems, but the server that should be the replica is actually another master (and so, we have a multi-master configuration).
How can we specify that the second server is a replica? There are other parameters that it is necessary to set?
Thanks in advance

Instead of orientdb_replica1 (the hostname), you should use the node name you assigned at startup. You can find it in config/orientdb-server-config.xml.
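As a hedged sketch (the placeholders below stand for whatever nodeName values your two servers actually declare): the node name is typically set by the nodeName parameter of the distributed (Hazelcast) plugin handler in config/orientdb-server-config.xml, and the keys of the servers section in default-distributed-db-config.json should match those values, e.g.:
"servers": {
"<master-node-name>": "master",
"<replica-node-name>": "replica"
},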

Related

import metadata from RDBMS into Apache Atlas

I am learning Atlas and trying to find a way to import metadata from an RDBMS such as SQL Server or PostgreSQL.
Could somebody provide references or steps for doing this?
I am using Atlas in Docker with built-in HBase and Solr. The intention is to import metadata from AWS RDS.
Update 1
To rephrase my question: can we import metadata directly from RDS SQL Server or PostgreSQL without importing the actual data into Hive (Hadoop)?
Any comments or answers are appreciated. Thank you!
AFAIK, Atlas works on top of the Hive metastore.
Below is the AWS documentation on how to do it in AWS EMR while creating the cluster itself: Metadata classification, lineage, and discovery using Apache Atlas on Amazon EMR
Here is a Cloudera source from the Sqoop standpoint.
From that Cloudera source (the "Populate metadata repository from RDBMS in Apache Atlas" question on the Cloudera community):
1) You create the new types in Atlas. For example, in the case of Oracle: an Oracle table type, column type, etc.
2) Create a script or process that pulls the metadata from the source metadata store.
3) Once you have the metadata you want to store in Atlas, your process would create the associated Atlas entities, based on the new types, using the Java API or JSON representations through the REST API directly. If you wanted to, you could add lineage to that as you store the new entities.
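As a rough sketch of steps 2) and 3), everything below is an assumption rather than a fixed recipe: the Atlas endpoint and credentials, the custom type name rdbms_table (something you would have registered in step 1), and a MySQL-compatible RDS source reachable with pymysql.
import requests
import pymysql  # assumption: MySQL-compatible source; use the driver that matches your RDS engine

ATLAS = "http://localhost:21000/api/atlas/v2"   # placeholder Atlas endpoint
AUTH = ("admin", "admin")                       # placeholder credentials

# 2) pull table metadata from the source's own metadata store (information_schema here)
conn = pymysql.connect(host="my-rds-host", user="user", password="pwd",
                       database="information_schema")
with conn.cursor() as cur:
    cur.execute("SELECT table_schema, table_name FROM tables WHERE table_schema = %s", ("mydb",))
    tables = cur.fetchall()

# 3) turn each table into an Atlas entity of the custom type registered in step 1
entities = [{
    "typeName": "rdbms_table",   # placeholder: whatever custom type you created
    "attributes": {
        "name": table_name,
        "qualifiedName": f"{schema}.{table_name}@my-rds-host",
    },
} for schema, table_name in tables]

requests.post(f"{ATLAS}/entity/bulk", json={"entities": entities}, auth=AUTH).raise_for_status()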
The documentation below has step-by-step details on how to use Sqoop to move data from any RDBMS to Hive.
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_data-access/content/using_sqoop_to_move_...
You can refer to this as well: http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_literal_sqoop_import_all_tables_literal
To get the metadata of all this Sqoop-imported data into Atlas, make sure the configurations below are set properly.
http://atlas.incubator.apache.org/Bridge-Sqoop.html
Please note the above configuration step is not needed if your cluster configuration is managed by Ambari.
Using the REST API is one good way to get MySQL metadata into the Atlas catalog.
Another way is to use Spark with Hive support enabled (Spark reads MySQL via JDBC and writes into Hive), or to use Sqoop.
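A minimal sketch of that Spark route, assuming the MySQL JDBC driver is on the Spark classpath; the JDBC URL, table names and credentials are placeholders:
from pyspark.sql import SparkSession

# enableHiveSupport() lets Spark write managed tables into the Hive metastore,
# which is where Atlas (via the Hive hook) picks up the metadata
spark = (SparkSession.builder
         .appName("mysql-to-hive")
         .enableHiveSupport()
         .getOrCreate())

# read a MySQL table over JDBC
df = (spark.read.format("jdbc")
      .option("url", "jdbc:mysql://my-rds-host:3306/mydb")
      .option("dbtable", "customers")
      .option("user", "user")
      .option("password", "pwd")
      .load())

# write it into Hive
df.write.mode("overwrite").saveAsTable("mydb_hive.customers")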
To help create RDBMS-related instances, DBs, tables, and columns, I have created a GitHub repository.
It contains a template that can help you understand how to add RDBMS or MySQL entities to Atlas:
https://github.com/vettrivikas/Apche-Atlas-for-RDBMS
We can use the REST API to create a type and then send data to it.
For example, let's say I have a dashboard and a visualization on it. I can create a type definition (a POST to {servername}:{port}/api/atlas/v2/types/typedefs) and then push data to it:
{
"entityDefs": [
{
"superTypes": [
"DataSet"
],
"name": "Dashboard",
"description": "The definition of a Dashboard",
"attributeDefs": [
{
"name": "name",
"typeName": "string",
"isOptional": true,
"cardinality": "SINGLE",
"valuesMinCount": -1,
"valuesMaxCount": 1,
"isUnique": false,
"isIndexable": false,
"includeInNotification": false,
"searchWeight": -1
},
{
"name": "childDataset",
"typeName": "array<Visualization>",
"isOptional": true,
"cardinality": "SET",
"valuesMinCount": 0,
"valuesMaxCount": 2147483647,
"isUnique": false,
"isIndexable": false,
"includeInNotification": false,
"searchWeight": -1
}
]
},
{
"superTypes": [
"DataSet"
],
"name": "Visualization",
"description": "The definition of a Dashboard",
"attributeDefs": [
{
"name": "name",
"typeName": "string",
"isOptional": true,
"cardinality": "SINGLE",
"valuesMinCount": -1,
"valuesMaxCount": 1,
"isUnique": false,
"isIndexable": false,
"includeInNotification": false,
"searchWeight": -1
},
{
"name": "parentDataset",
"typeName": "array<Dashboard>",
"isOptional": true,
"cardinality": "SET",
"valuesMinCount": 0,
"valuesMaxCount": 2147483647,
"isUnique": false,
"isIndexable": false,
"includeInNotification": false,
"searchWeight": -1
}
]
}
],
"relationshipDefs": [
{
"category": "RELATIONSHIP",
"name": "dashboards_visualization_assignment",
"description": "The relationship between a Dashboard and a Visualization",
"relationshipCategory": "ASSOCIATION",
"attributeDefs": [],
"propagateTags": "NONE",
"endDef1": {
"type": "Dashboard",
"name": "childDataset",
"isContainer": false,
"cardinality": "SET",
"isLegacyAttribute": false
},
"endDef2": {
"type": "Visualization",
"name": "parentDataset",
"isContainer": false,
"cardinality": "SET",
"isLegacyAttribute": false
}
}
]
}
Then you can simply add data using a REST call to {servername}:{port}/api/atlas/v2/entity/bulk:
{
"entities": [
{
"typeName": "Dashboard",
"guid": -1000,
"createdBy": "admin",
"attributes": {
"name": "sample dashboard",
"childDataset": [
{
"guid": "-200",
"typeName": "Visualization"
}
]
}
}
],
"referredEntities": {
"-200": {
"guid": "-200",
"typeName": "Visualization",
"attributes": {
"qualifiedName": "bar-chart"
}
}
}
}
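A hedged sketch of making those two REST calls, assuming Atlas at localhost:21000 with default admin credentials, and that the two JSON documents above are saved as typedefs.json and entities.json (all of these are placeholders):
import json
import requests

ATLAS = "http://localhost:21000/api/atlas/v2"   # placeholder for {servername}:{port}
AUTH = ("admin", "admin")                       # placeholder credentials

# register the Dashboard/Visualization type definitions first
with open("typedefs.json") as f:
    requests.post(f"{ATLAS}/types/typedefs", json=json.load(f), auth=AUTH).raise_for_status()

# then create the entities in bulk
with open("entities.json") as f:
    requests.post(f"{ATLAS}/entity/bulk", json=json.load(f), auth=AUTH).raise_for_status()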
Now look for the entities in Atlas.
(Screenshot: the Dashboard entity shown in the Atlas UI.)

Strapi EADDRNOTAVAIL error while deploying on Dokku

I am trying to deploy Strapi on a Dokku instance on a DigitalOcean droplet. I originally ran into some issues connecting to the Mongo database, but after some trial and error and a lot of review of these docs and this issue, I was able to get it to stop complaining about the Mongo connection. Here is my final config/environments/production/database.json:
{
"defaultConnection": "default",
"connections": {
"default": {
"connector": "mongoose",
"settings": {
"client": "mongo",
"uri": "${process.env.MONGO_URL}",
"database": "${process.env.DATABASE_NAME}",
"username": "${process.env.DATABASE_USERNAME}",
"password": "${process.env.DATABASE_PASSWORD}",
"port": "${process.env.DATABASE_PORT || 27017}"
},
"options": {
"authenticationDatabase": "${process.env.DATABASE_AUTHENTICATION_DATABASE || ''}",
"useUnifiedTopology": "${process.env.USE_UNIFIED_TOPOLOGY || false}",
"ssl": "${process.env.DATABASE_SSL || false}"
}
}
}
}
Here is my config/environments/production/server.json
{
"host": "${process.env.HOST || '0.0.0.0'}",
"port": "${process.env.PORT || 1337}",
"production": true,
"proxy": {
"enabled": false
},
"cron": {
"enabled": false
},
"admin": {
"autoOpen": false
}
}
I believe the original issue was that I was accidentally using the PORT variable for the database instead of the DATABASE_PORT variable.
However, now that I have that worked out I am getting this error:
error Error: listen EADDRNOTAVAIL: address not available <my-host-ip>:5000
I thought maybe a wrong port was being cached somewhere, but regardless of what I do, I can't seem to get it to work. Do I need to enable SSL and then add a Let's Encrypt cert to my domain? Am I using the wrong ports? Should I set a proxy in the server.json?
PS: I am using Dokku Mongo. I didn't think that would be an issue, considering the dynos don't go to sleep like they would on Heroku. Is that an incorrect assumption?
Also, there are other apps running on the droplet. Maybe it's a proxy problem?

Couchbase Sync Gateway doesn't sync between Couchbase Lite and Couchbase Server

I have a very big problem. I'm building an app at the moment. When I start the app with the Android emulator, it works fine. I can save some data and it shows it back to me, so the data is saved locally (Couchbase Lite).
I work with the Ionic framework.
Now I want to sync between Couchbase Server and Couchbase Lite.
I use the Sync Gateway, but it doesn't work.
Below you can see my sync-gateway-config.json and my log.
Can someone help me please?
{
"interface": ":4984",
"adminInterface": "0.0.0.0:4985",
"log": ["*"],
"databases": {
"syncdb": {
"server": "http://127.0.0.1:8091",
"bucket": "sync_gateway",
"username": "sync_gateway",
"password": "********",
"sync":
function (doc) {
channel (doc.channels);
},
"users": {
"GUEST": {
"disabled": true,
"admin_channels": ["public"]
},
"Administrator": {
"disabled": false,
"password": "**********",
"admin_channels": ["*"]
}
}
}
}
}
Log
The Android emulator uses a special address to connect to the machine it runs on. Set your app to connect to Sync Gateway via 10.0.2.2. See this Stack Overflow post for more information.
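If the app talks to Couchbase Lite through its local REST listener (a common setup for Ionic apps; this is an assumption, and "mydb" below is a placeholder for the local database name), the replication request body POSTed to the listener's _replicate endpoint would then point at that special address:
{
"source": "mydb",
"target": "http://10.0.2.2:4984/syncdb",
"continuous": true
}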
You can add the properties "import_docs": "continuous" and "enable_shared_bucket_access": true to your database config, inside the syncdb braces, to see if it makes a difference:
"syncdb": {
"import_docs": "continuous",
"enable_shared_bucket_access": true,
"server": "http://127.0.0.1:8091",
........
}

Sharding in orientdb distributed database

I am creating a distributed database in OrientDB 2.2.6 with 3 nodes, namely master1, master2 and master3. I modified the hazelcast.xml and orientdb-server-config.xml files on each of the nodes. I used a common default-distributed-db-config.json on all 3 nodes, which looks as shown below.
{
"autoDeploy": true,
"readQuorum": 1,
"writeQuorum": "majority",
"executionMode": "undefined",
"readYourWrites": true,
"failureAvailableNodesLessQuorum": false,
"servers": {
"*": "master"
},
"clusters": {
"internal": {
},
"address": {
"owner" : "master1",
"servers": [ "master1" ]
},
"address_1": {
"owner" : "master1",
"servers" : [ "master1" ]
},
"ip": {
"owner" : "master2",
"servers" : [ "master2" ]
},
"ip_1": {
"owner" : "master2",
"servers" : [ "master2" ]
},
"id": {
"owner" : "master3",
"servers" : [ "master3" ]
},
"id_1": {
"owner" : "master3",
"servers" : [ "master3" ]
},
"*": {
"servers": [ "<NEW_NODE>" ]
}
}
}
Then I started the distributed server on the master1 machine, then on master2 and master3 in that order, and let them synchronize the default DB. Then I created a database and three classes (Address, IP, ID), with their properties and indexes, on the master1 machine. As specified in the default-distributed-db-config.json file, the Address class has two clusters residing on the master1 machine, and the IP class has two clusters residing on the master2 machine.
When I insert values into the Address class, as expected they go into the master1 machine's clusters, following the round-robin strategy. But when I insert values for IP from the master2 machine, a new cluster is created on master1 and the values are inserted into that new cluster. Basically, all the values end up on the master1 machine. When I do LIST CLUSTERS, the clusters on the master2 and master3 machines are empty.
So I could not distribute the data across the three nodes; basically everything is stored on a single machine. How do I shard the data? Is there any issue with the way I am trying to insert the data?
Thanks
In current OrientDB releases, write operations (create/update/delete) are not forwarded; only the reads are. For this reason, the client should be connected to the server that handles the cluster you want your data written to.
Usually this isn't a problem, because a local cluster is selected, but writing to a specific cluster on a remote server is not supported yet.
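For example, a hedged sketch with the pyorient driver (the database name, credentials and the address field are placeholders): since the "ip" cluster is owned by master2 in the configuration above, you would connect to master2 directly and target that cluster explicitly.
import pyorient

# connect to the node that owns the "ip" cluster (master2 in the config above)
client = pyorient.OrientDB("master2", 2424)
client.db_open("mydb", "admin", "admin")

# write into one of master2's clusters explicitly
client.command("INSERT INTO CLUSTER:ip SET address = '10.0.0.1'")
client.db_close()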

I try to use OrientDB in distributed mode. How can I configure on which nodes a specific database will be located?

For example: I have three nodes in the same cluster ("node1", "node2" and "node3"). These nodes have identical hazelcast.xml configuration files. I want the database "DB_1" to be placed only on the first and second nodes, and the database "DB_2" to be placed only on the second and third nodes.
I modified the file "default-distributed-db-config.json" on the first and second nodes:
{
"autoDeploy": true,
"hotAlignment": false,
"executionMode": "undefined",
"readQuorum": 1,
"writeQuorum": 2,
"failureAvailableNodesLessQuorum": false,
"readYourWrites": true,
"servers": {
"*": "master"
},
"clusters": {
"internal": {
},
"index": {
},
"*": {
"servers": ["node1","node2"]
}
}
}
I modified the file "default-distributed-db-config.json" on the third node:
{
"autoDeploy": true,
"hotAlignment": false,
"executionMode": "undefined",
"readQuorum": 1,
"writeQuorum": 2,
"failureAvailableNodesLessQuorum": false,
"readYourWrites": true,
"servers": {
"*": "master"
},
"clusters": {
"internal": {
},
"index": {
},
"*": {
"servers": ["node3"]
}
}
}
I modified the file "distributed-config.json" in the database directory "DB_1" on the first and second nodes.
I removed from it all occurrences of:
<NEW_NODE>
and everywhere wrote only the names of the first and second nodes:
"*":{"#type":"d","#version":0,"servers":["node1","node2"]},"orole_node2":{"#type":"d","#version":0,"servers":["node2","node1"]},"e_node2":{"#type":"d","#version":0,"servers":["node2","node1"]},"ouser_node2":{"#type":"d","#version":0,"servers":["node2","node1"]},"oschedule_node2":{"#type":"d","#version":0,"servers":["node2","node1"]},"orids_node2":{"#type":"d","#version":0,"servers":["node2","node1"]},"v_node2":{"#type":"d","#version":0,"servers":["node2","node1"]},"ofunction_node2":{"#type":"d","#version":0,"servers":["node2","node1"]}}
But nonetheless, if I start the third node, the database "DB_1" starts replicating to the third node too.
(OrientDB v2.1.13)
I don't think it is possible; every node of the same cluster has to be synchronized with the others.