Connect Neo4j to an existing PostgreSQL database - postgresql

I'm a new Neo4j user and I have played around with the webadmin interface of Neo4j to create small databases and simple queries in Cypher. Now I want to use Neo4j to build a graph from my existing database. It's a PostgreSQL database with millions of entries sharing the same structure (Neo4j is very well suited to representing this data). My question is: how do I import this data? What is the easiest way to do that? I already saw that Cypher recognizes CSV files, but do I have to create a CSV file with my data, or is there another way to import it? Thank you for your help. Sam

One option is to export your Postgres data to CSV and use LOAD CSV to import it into the graph.
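For example, a minimal sketch of the CSV route, driven here from Python with the psycopg2 package and the official neo4j driver; the table name, columns, file path, and connection details are placeholders, and the CSV is assumed to land in Neo4j's import directory (here /var/lib/neo4j/import):

```python
import psycopg2                      # PostgreSQL client
from neo4j import GraphDatabase      # official Neo4j Bolt driver

# 1. Export the table to a CSV file that Neo4j can read.
pg = psycopg2.connect("dbname=mydb user=postgres password=secret")
with pg.cursor() as cur, open("/var/lib/neo4j/import/persons.csv", "w") as f:
    cur.copy_expert("COPY (SELECT id, name FROM persons) TO STDOUT WITH CSV HEADER", f)
pg.close()

# 2. Load the CSV into the graph with LOAD CSV.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    session.run("""
        LOAD CSV WITH HEADERS FROM 'file:///persons.csv' AS row
        MERGE (p:Person {id: toInteger(row.id)})
        SET p.name = row.name
    """)
driver.close()
```

The same LOAD CSV statement can of course also be pasted into the Neo4j browser or shell once the file is in place.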
Another way is to write a script in a language of your choice (I'd vote for Groovy here) that connects to Postgres via JDBC, connects to Neo4j, and applies the business logic to transform the data between the two.
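Here is what that could look like, sketched in Python rather than Groovy (psycopg2 standing in for JDBC, plus the official neo4j driver); the table, columns, batch size, and connection details are all placeholders:

```python
import psycopg2
from neo4j import GraphDatabase

BATCH_SIZE = 1000  # rows sent to Neo4j per Cypher statement

pg = psycopg2.connect("dbname=mydb user=postgres password=secret")
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with pg.cursor(name="stream") as cur, driver.session() as session:
    # A named (server-side) cursor streams millions of rows without
    # pulling them all into memory at once.
    cur.execute("SELECT id, name FROM persons")
    while True:
        rows = cur.fetchmany(BATCH_SIZE)
        if not rows:
            break
        # Send each batch as a single Cypher statement via UNWIND.
        session.run(
            "UNWIND $rows AS row "
            "MERGE (p:Person {id: row[0]}) "
            "SET p.name = row[1]",
            rows=[list(r) for r in rows],
        )

driver.close()
pg.close()
```

Creating an index or uniqueness constraint on :Person(id) beforehand keeps the MERGE fast on millions of rows.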
A third option is to use an ETL tool like Talend. It basically does the same as your custom script, but provides a point & click interface to define the transformation; see http://neo4j.com/blog/fun-with-music-neo4j-and-talend/ for more details.

Related

Initiating schema in PostgreSQL RDS?

I'm trying to build an app using Node/Express and RDS PostgreSQL on the back-end to get some more experience with these two technologies. More specifically, I'm looking to build this using the node-postgres package and without the aid of an ORM. I currently have a .sql file in my app that contains the desired schema.
What would be considered "best practice" when implementing a schema for the first time? For example, is it considered better to import a schema via the command line, use something like pgAdmin, or throw a bunch of "CREATE TABLEs" into queries through node-postgres?
Thanks in advance for the help!

Import data from Postgres to Cassandra

I need to import data from Postgres to Cassandra using open source technologies only.
Can anyone please outline the steps I need to take?
As per instructions, I have to refrain from using DataStax software as it comes with a license.
Steps I have already tried:
Exported one table from Postgres in CSV format and imported it into HDFS (using Sqoop). (If I take this approach, do I need to use MapReduce after this?)
Tried to import the CSV file into Cassandra using cqlsh; however, I got this error:
Cassandra: Unable to import null value from csv
I am trying several methods, but I am unable to find a solid approach of attack.
Can any of you please provide the steps required for the whole process? I believe there are many people who have already done this.
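One possible approach, sketched below, skips the CSV file entirely and copies rows straight across with open-source Python drivers (psycopg2 for Postgres and the Apache-licensed cassandra-driver package, which is separate from the commercial DataStax products); the keyspace, table, and column names are placeholders, and the target table must already exist in Cassandra:

```python
import psycopg2                        # open-source PostgreSQL driver
from cassandra.cluster import Cluster  # open-source Cassandra driver

pg = psycopg2.connect("dbname=mydb user=postgres password=secret")
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("my_keyspace")

# Prepared statement reused for every row; unlike cqlsh's COPY, the driver
# accepts Python None for missing values, which avoids the
# "unable to import null value from csv" problem.
insert = session.prepare("INSERT INTO users (id, name, email) VALUES (?, ?, ?)")

with pg.cursor(name="stream") as cur:  # server-side cursor for large tables
    cur.execute("SELECT id, name, email FROM users")
    for row in cur:
        session.execute(insert, row)

cluster.shutdown()
pg.close()
```

If the only goal is to get the data from Postgres into Cassandra, the Sqoop/HDFS/MapReduce detour is not required.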

Importing AccessDB and Oracle directly into MongoDB

I am receiving .dmp and .mdb files from a customer & need to get that data into MongoDB.
Is there any way to import these file types directly into Mongo?
The goal is to programmatically ingest these into Mongo in any way I can. The only rule is that the customer will not change their method of data delivery, meaning I'm stuck with the .dmp and .mdb files as a source.
Any assistance would be greatly appreciated.
Here are a few options/ideas:
Convert mdb to csv, then use mongoimport --type csv to import into MongoDB.
Use an ETL tool, e.g. Pentaho, Informatica, etc. This will give you much more flexibility for doing any necessary transformation/conversion of data.
Write a custom ETL tool, using libraries that know how to read mdb and dmp files.
You don't mention how you plan to use this data, how many tables are in the database, and how normalized the tables are. Depending on the specifics of your use case, it's very possible that loading the data from Access "as is" will not be a good choice since normalized schemas are not a good fit for MongoDB and MongoDB does not natively support joins. This is where an ETL tool can help, by extracting the source data and transforming it into an appropriate JSON structure.
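A rough sketch of options 1 and 3 combined for the .mdb side, assuming the open-source mdbtools utilities (mdb-tables, mdb-export) are installed and using pymongo for the load; the file path and database name are placeholders, and the Oracle .dmp files are not covered here since they generally need Oracle's own import tools to read:

```python
import csv
import io
import subprocess
from pymongo import MongoClient

MDB_FILE = "customer.mdb"                        # placeholder path
db = MongoClient("mongodb://localhost:27017")["customer_data"]

# List the tables inside the Access file.
tables = subprocess.run(
    ["mdb-tables", "-1", MDB_FILE],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

for table in filter(None, tables):
    # Dump one table as CSV and stream it straight into MongoDB,
    # one collection per Access table.
    dump = subprocess.run(
        ["mdb-export", MDB_FILE, table],
        capture_output=True, text=True, check=True,
    ).stdout
    docs = list(csv.DictReader(io.StringIO(dump)))
    if docs:
        db[table].insert_many(docs)
```

Everything arrives as strings this way; any type conversion, or restructuring into a less normalized document shape as discussed above, would go between the DictReader and insert_many.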
MongoDB has released ODBC drivers; these can connect MS Access directly to MongoDB through ODBC. Voila!

Migrating a schema from one database to another

As part of a requirement, I need to migrate a schema from an existing database to a new schema in a different database. Some of it is already done, and now I need to compare the two schemas and make changes in the new schema based on the gaps found.
I am not using a tool, and I was trying to understand some details using the SYSCAT catalog views but could not get much success.
Any pointer on what is the best way to solve this?
Regards,
Ramakant
A tool really is the best way to solve this – IBM Data Studio is free and can compare schemas between databases.
Assuming you are using DB2 for Linux/UNIX/Windows, you can do a rudimentary compare by looking at selected columns in SYSCAT.TABLES and SYSCAT.COLUMNS (for table definitions), and SYSCAT.INDEXES (for indexes). Exporting this data to files and using diff may be the easiest method. However, doing this for more complex structures (tables with range or database partitioning, foreign keys, etc) will become very complex very quickly as this information is spread across a lot of different system catalog tables.
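As a rough illustration of that export-and-diff idea (the asker mentions JDBC below; the same thing in Python with IBM's ibm_db package looks like this, with the connection strings and schema name as placeholders and only column definitions covered):

```python
import ibm_db  # IBM's driver for DB2 for Linux/UNIX/Windows

def dump_columns(dsn, schema, outfile):
    """Write one line per column so two dumps can be compared with diff."""
    conn = ibm_db.connect(dsn, "", "")
    stmt = ibm_db.prepare(
        conn,
        "SELECT TABNAME, COLNAME, TYPENAME, LENGTH, SCALE, NULLS "
        "FROM SYSCAT.COLUMNS WHERE TABSCHEMA = ? "
        "ORDER BY TABNAME, COLNO",
    )
    ibm_db.execute(stmt, (schema,))
    with open(outfile, "w") as f:
        row = ibm_db.fetch_assoc(stmt)
        while row:
            f.write("{TABNAME}.{COLNAME} {TYPENAME}({LENGTH},{SCALE}) "
                    "NULLS={NULLS}\n".format(**row))
            row = ibm_db.fetch_assoc(stmt)
    ibm_db.close(conn)

# Dump both schemas, then run: diff old_schema.txt new_schema.txt
dump_columns("DATABASE=olddb;HOSTNAME=host1;PORT=50000;PROTOCOL=TCPIP;UID=user;PWD=pass;",
             "MYSCHEMA", "old_schema.txt")
dump_columns("DATABASE=newdb;HOSTNAME=host2;PORT=50000;PROTOCOL=TCPIP;UID=user;PWD=pass;",
             "MYSCHEMA", "new_schema.txt")
```

The same pattern extends to SYSCAT.TABLES and SYSCAT.INDEXES, though as noted it gets complex quickly for partitioning, foreign keys, and so on.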
An alternative method would be to extract DDL using the db2look utility. However, you can't specify the order that db2look outputs objects (db2look extracts DDL based on the objects' CREATE_TIME), so you can't extract DDL for an entire schema into a file and expect to use diff to compare. You would need to extract DDL into a separate file for each table.
Use SchemaCrawler for IBM DB2, a free open-source tool designed to produce text output that is meant to be diffed. You can get very detailed information about your schema, including view and stored procedure definitions. All of the information that you need will be output in a single file, and it can be compared very easily using a standard diff tool.
Sualeh Fatehi, SchemaCrawler
Unfortunately, as per company policy, I cannot use these tools at this point in time. So I am writing a program using JDBC to get the details and do the comparison.

Download HTTP data with a Postgres stored procedure

I am wondering if there is a way to import data from an HTTP source from within a PostgreSQL function.
I am porting an old system that harvests data from a website. Rather than maintaining a separate set of files to manage the downloading of the data, I was hoping to put the import routines directly into stored procedures.
I do know how to import data with COPY, but that requires the data to already be available locally. Is there a way to download the data with PL/pgSQL? Am I out to lunch?
Related: How to import CSV file data into a PostgreSQL table?
Depending on what you're after, the Postgres extension www_fdw might work for you: http://pgxn.org/dist/www_fdw/
If you want to download custom data over HTTP, then PostgreSQL's extensive support for different procedural languages might be handy. Here is an example of connecting to the Google Translate service from a Postgres function written in Python:
https://wiki.postgresql.org/wiki/Google_Translate
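In the same spirit, a minimal sketch of an HTTP download inside the database, assuming the plpython3u extension is available (it is untrusted, so creating the function requires superuser rights); the function name and usage are illustrative only:

```sql
CREATE EXTENSION IF NOT EXISTS plpython3u;

-- Fetch a URL and return its body as text; a later INSERT ... SELECT
-- (or another function) can then parse and store the rows.
CREATE OR REPLACE FUNCTION fetch_url(url text) RETURNS text AS $$
# Python standard library HTTP client, running inside the database
import urllib.request
with urllib.request.urlopen(url) as resp:
    return resp.read().decode("utf-8")
$$ LANGUAGE plpython3u;

-- Example call:
-- SELECT fetch_url('https://example.com/data.csv');
```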