Can't map postgres to Neo4j graph using Neo4j ETL tool? - postgresql

I am trying to map my CloudQuery database (Postgres) to Neo4j using the Neo4j ETL Tool.
The project was created with version 4.4.5; Neo4j Desktop version 1.4.15.
I am on macOS Monterey 12.5 (21G72), Apple M1 Pro.
For Postgres (where CloudQuery is pouring the data) I am using the postgres:13.7-alpine3.16 Docker image.
I selected the RDBMS instance (the connection succeeded) and my project. In main.log I get:
[2022-07-28 14:27:13.192] [info] Executing '/Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/distributions/java/zulu11.54.25-ca-jdk11.0.14.1/bin/java, -cp, /Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/graphApps/_global/neo4j-etl-ui/dist/neo4j-etl.jar, org.neo4j.etl.rdbms.Support, jdbc:postgresql://localhost:5432/postgres?ssl=false, postgres, pass'
[2022-07-28 14:27:13.844] [info] Process [87889] exit with code '1', signal 'null'
[2022-07-28 14:27:43.264] [info] Online check request: https://dist.neo4j.org/neo4j-desktop/win/latest.yml
[2022-07-28 14:27:43.372] [info] Online check response: 200 version: 1.4.15
[2022-07-28 14:28:23.273] [info] Online check request: https://dist.neo4j.org/neo4j-desktop/win/latest.yml
[2022-07-28 14:28:23.382] [info] Online check response: 200 version: 1.4.15
[2022-07-28 14:29:03.278] [info] Online check request: https://dist.neo4j.org/neo4j-desktop/win/latest.yml
[2022-07-28 14:29:03.386] [info] Online check response: 200 version: 1.4.15
[2022-07-28 14:29:39.909] [info] Executing '/Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/distributions/java/zulu11.54.25-ca-jdk11.0.14.1/bin/java, -cp, /Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/graphApps/_global/neo4j-etl-ui/dist/neo4j-etl.jar, org.neo4j.etl.rdbms.Support, jdbc:postgresql://localhost:5432/postgres?ssl=false, postgres, pass'
[2022-07-28 14:29:40.771] [info] Process [88036] exit with code '0', signal 'null'
[2022-07-28 14:29:43.282] [info] Online check request: https://dist.neo4j.org/neo4j-desktop/win/latest.yml
[2022-07-28 14:29:43.338] [info] Online check response: 200 version: 1.4.15
[2022-07-28 14:29:44.217] [info] Executing '/Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/distributions/java/zulu11.54.25-ca-jdk11.0.14.1/bin/java, -cp, /Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/graphApps/_global/neo4j-etl-ui/dist/neo4j-etl.jar, org.neo4j.etl.NeoIntegrationCli, generate-metadata-mapping, --rdbms:url, jdbc:postgresql://localhost:5432/postgres?ssl=false, --rdbms:password, pass, --rdbms:user, postgres, --output-mapping-file, /var/folders/nl/301vjtr53s92b4wr66tchrjr0000gn/T/postgresql_postgres_mapping.json'
[2022-07-28 14:30:04.214] [info] Process [88039] exit with code '0', signal 'null'
[2022-07-28 14:30:04.216] [info] Executing '/Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/distributions/java/zulu11.54.25-ca-jdk11.0.14.1/bin/java, -cp, /Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/graphApps/_global/neo4j-etl-ui/dist/neo4j-etl.jar, org.neo4j.etl.util.FileUtils, readfile, /var/folders/nl/301vjtr53s92b4wr66tchrjr0000gn/T/postgresql_postgres_mapping.json'
[2022-07-28 14:30:04.616] [info] Process [88058] exit with code '143', signal 'null'
[2022-07-28 14:30:23.288] [info] Online check request: https://dist.neo4j.org/neo4j-desktop/win/latest.yml
[2022-07-28 14:30:23.596] [info] Online check response: 200 version: 1.4.15
[2022-07-28 14:31:03.292] [info] Online check request: https://dist.neo4j.org/neo4j-desktop/win/latest.yml
[2022-07-28 14:31:03.344] [info] Online check response: 200 version: 1.4.15
[2022-07-28 14:31:43.296] [info] Online check request: https://dist.neo4j.org/neo4j-desktop/win/latest.yml
[2022-07-28 14:31:43.459] [info] Online check response: 200 version: 1.4.15
It seems I do get the mapping file, but the mapping process runs for ages without finishing.
What should I do to make it work?
Can I upload the mapping file manually? How can I force it to do the mapping? Is it normal for it to take this long?
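For reference, the mapping step Desktop runs boils down to a single command (everything below is copied from the 14:29:44 log entry above, except the output path, which I assume can be any writable location), so presumably it could be replayed by hand in a terminal to watch its output directly and keep the resulting JSON:
"/Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/distributions/java/zulu11.54.25-ca-jdk11.0.14.1/bin/java" \
  -cp "/Users/stevesolun/Library/Application Support/Neo4j Desktop/Application/graphApps/_global/neo4j-etl-ui/dist/neo4j-etl.jar" \
  org.neo4j.etl.NeoIntegrationCli generate-metadata-mapping \
  --rdbms:url "jdbc:postgresql://localhost:5432/postgres?ssl=false" \
  --rdbms:user postgres --rdbms:password pass \
  --output-mapping-file /tmp/postgresql_postgres_mapping.json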

Related

MongooseIM 3.6.0 Postgres connection issue

Hi, I am new to MongooseIM. I am planning to set up MongooseIM locally with a Postgres database connection. I have installed MongooseIM 3.6.0 on an Ubuntu 14.04 machine, created the database in Postgres, and added the schema. Then I made the following changes in the mongooseim.cfg file:
{outgoing_pools, [
{rdbms, global, default, [{workers, 1}], [{server, {psql, "localhost", 5432, "mongooseim", "postgres", "password"}}]}
]}.
And this one:
{rdbms_server_type, pgsql}.
These are the only changes I have made to the default config file. When I restart the server, it gives the error below, even though the Postgres server is running and the user credentials work.
2020-06-05 18:17:46.815 [info] <0.249.0> msg: "Starting reporters with []\n", options: []
2020-06-05 18:17:47.034 [notice] <0.130.0>#lager_file_backend:143 Changed loglevel of /var/log/mongooseim/ejabberd.log to info
2020-06-05 18:17:47.136 [info] <0.43.0> Application mnesia exited with reason: stopped
2020-06-05 18:17:47.621 [error] <0.593.0>#mongoose_rdbms_psql:connect CRASH REPORT Process <0.593.0> with 0 neighbours crashed with reason: call to undefined function mongoose_rdbms_psql:connect({psql,"server",5432,"mongooseim","postgres","password"}, 5000)
2020-06-05 18:17:47.621 [error] <0.592.0>#mongoose_rdbms_psql:connect Supervisor 'wpool_pool-mongoose_wpool$rdbms$global$default-process-sup' had child 'wpool_pool-mongoose_wpool$rdbms$global$default-1' started with wpool_process:start_link('wpool_pool-mongoose_wpool$rdbms$global$default-1', mongoose_rdbms, [{server,{psql,"server",5432,"mongooseim","postgres","password"}}], [{queue_manager,'wpool_pool-mongoose_wpool$rdbms$global$default-queue-manager'},{time_checker,'wpool_pool-mongoose_wpool$rdbms$global$default-time-checker'},...]) at undefined exit with reason call to undefined function mongoose_rdbms_psql:connect({psql,"server",5432,"mongooseim","postgres","password"}, 5000) in context start_error
2020-06-05 18:17:47.622 [error] <0.589.0> Supervisor 'mongoose_wpool$rdbms$global$default' had child 'wpool_pool-mongoose_wpool$rdbms$global$default-process-sup' started with wpool_process_sup:start_link('mongoose_wpool$rdbms$global$default', 'wpool_pool-mongoose_wpool$rdbms$global$default-process-sup', [{queue_manager,'wpool_pool-mongoose_wpool$rdbms$global$default-queue-manager'},{time_checker,'wpool_pool-mongoose_wpool$rdbms$global$default-time-checker'},...]) at undefined exit with reason {shutdown,{failed_to_start_child,'wpool_pool-mongoose_wpool$rdbms$global$default-1',{undef,[{mongoose_rdbms_psql,connect,[{psql,"server",5432,"mongooseim","postgres","password"},5000],[]},{mongoose_rdbms,connect,4,[{file,"/root/deb/mongooseim/_build/prod/lib/mongooseim/src/rdbms/mongoose_rdbms.erl"},{line,668}]},{mongoose_rdbms,init,1,[{file,"/root/deb/mongooseim/_build/prod/lib/mongooseim/src/rdbms/mongoose_rdbms.erl"},{line,431}]},{wpool_process,init,1,[{file,"/root/deb/mongooseim/_build/..."},...]},...]}}} in context start_error
2020-06-05 18:17:47.622 [error] <0.583.0>#mongoose_wpool_mgr:handle_call:105 Pool not started: {error,{{shutdown,{failed_to_start_child,'wpool_pool-mongoose_wpool$rdbms$global$default-process-sup',{shutdown,{failed_to_start_child,'wpool_pool-mongoose_wpool$rdbms$global$default-1',{undef,[{mongoose_rdbms_psql,connect,[{psql,"server",5432,"mongooseim","postgres","password"},5000],[]},{mongoose_rdbms,connect,4,[{file,"/root/deb/mongooseim/_build/prod/lib/mongooseim/src/rdbms/mongoose_rdbms.erl"},{line,668}]},{mongoose_rdbms,init,1,[{file,"/root/deb/mongooseim/_build/prod/lib/mongooseim/src/rdbms/mongoose_rdbms.erl"},{line,431}]},{wpool_process,init,1,[{file,"/root/deb/mongooseim/_build/default/lib/worker_pool/src/wpool_process.erl"},{line,85}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,374}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}}}}},{child,undefined,'mongoose_wpool$rdbms$global$default',{wpool,start_pool,['mongoose_wpool$rdbms$global$default',[{worker,{mongoose_rdbms,[{server,{psql,"server",5432,"mongooseim","postgres","password"}}]}},{pool_sup_shutdown,infinity},{workers,1}]]},temporary,infinity,supervisor,[wpool]}}}
2020-06-05 18:17:47.678 [warning] <0.615.0>#service_mongoose_system_metrics:report_transparency:129 We are gathering the MongooseIM system's metrics to analyse the trends and needs of our users, improve MongooseIM, and know where to focus our efforts. For more info on how to customise, read, enable, and disable these metrics visit:
- MongooseIM docs -
https://mongooseim.readthedocs.io/en/latest/operation-and-maintenance/System-Metrics-Privacy-Policy/
- MongooseIM GitHub page - https://github.com/esl/MongooseIM
The last sent report is also written to a file /var/log/mongooseim/system_metrics_report.json
2020-06-05 18:17:48.404 [warning] <0.1289.0>#nkpacket_stun:get_stun_servers:231 Current NAT is changing ports!
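One detail in the crash report stands out: the undef error points at a module named mongoose_rdbms_psql, while MongooseIM's Postgres driver module is mongoose_rdbms_pgsql. That suggests the server tuple should use the pgsql atom (as rdbms_server_type already does) rather than psql. A sketch of the corrected pool entry, unverified but otherwise identical to the config above:
{outgoing_pools, [
  %% pgsql (not psql), so MongooseIM resolves the existing driver module
  {rdbms, global, default, [{workers, 1}],
    [{server, {pgsql, "localhost", 5432, "mongooseim", "postgres", "password"}}]}
]}.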

OSRM Extract Silently Fails

I'm trying to set up an OSRM backend server using Docker. The host machine is a MacBook Pro with 32GB RAM running OS 10.14 (Mojave).
Having downloaded the england-latest.osm.pbf file, I've tried to start the process with the following command:
docker run -t -v $(pwd):/data osrm/osrm-backend osrm-extract -p /opt/car.lua /data/england-latest.osm.pbf
The output is as follows:
[info] Parsed 0 location-dependent features with 0 GeoJSON polygons
[info] Using script /opt/car.lua
[info] Input file: england-latest.osm.pbf
[info] Profile: car.lua
[info] Threads: 6
[info] Parsing in progress..
[info] input file generated by osmium/1.8.0
[info] timestamp: 2018-11-02T21:15:02Z
[info] Using profile api version 4
[info] Found 3 turn restriction tags:
[info] motorcar
[info] motor_vehicle
[info] vehicle
[info] Parse relations ...
[info] Parse ways and nodes ...
[info] Using profile api version 4
[info] Using profile api version 4
[info] Using profile api version 4
[info] Using profile api version 4
[info] Using profile api version 4
At this point the process silently fails and doesn't write any of the files you'd expect. If I use the greater-london-latest.osm.pbf file then everything works fine, so I'm guessing it's some sort of memory constraint. How do I go about identifying the problem and fixing it?
After running into the same issue on my Mac, I solved it by increasing the memory available to Docker. Look for the whale icon in the menu bar, click it, then select Preferences > Advanced. Move the slider to a higher amount, press Apply & Restart, then try running again.
https://docs.docker.com/docker-for-mac/#memory
I could run within the 2GB default limit with a 415MB file (Canada > British Columbia) but nothing larger. Given the England file is 422MB, you can probably increase to 4GB and have it work. (Trying to run for all of Canada, a 2.4GB file, needed >8GB of RAM.)
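If you want to verify the limit from the command line first (a quick check, not part of the original answer), docker info reports the VM's total memory:
# prints the memory available to the Docker VM, in bytes
docker info --format '{{.MemTotal}}'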

ScalaTest: Suites are being executed but no test cases are actually running?

I am using the latest version of Play Framework, along with the following test dependencies from my build.sbt file:
"org.scalatest" %% "scalatest" % "3.0.0",
"org.scalatestplus.play" % "scalatestplus-play_2.11" % "2.0.0-M1"
I have a base specification that all of my test cases extend, and I return a Future[Assertion] from each of my test clauses. It looks like this:
trait BaseSpec extends AsyncWordSpec with TestSuite with OneServerPerSuite with MustMatchers with ParallelTestExecution
An example spec looks like this:
"PUT /v1/user/create" should {
"create a new user" in {
wsClient
.url(s"http://localhost:${port}/v1/user")
.put(Json.obj(
"name" -> "username",
"email" -> "email",
"password" -> "hunter12"
)).map { response => response.status must equal(201) }
}
}
I decided to rewrite my current tests using the AsyncWordSpec provided by the newer version of ScalaTest, but when I run the test suite, this is the output that I get:
[info] UserControllerSpec:
[info] PUT /v1/user/create
[info] application - ApplicationTimer demo: Starting application at 2016-11-13T01:29:12.161Z.
[info] application - ApplicationTimer demo: Stopping application at 2016-11-13T01:29:12.416Z after 1s.
[info] application - ApplicationTimer demo: Stopping application at 2016-11-13T01:29:12.438Z after 0s.
[info] application - ApplicationTimer demo: Stopping application at 2016-11-13T01:29:12.716Z after 0s.
[info] application - ApplicationTimer demo: Stopping application at 2016-11-13T01:29:13.022Z after 1s.
[info] ScalaTest
[info] Run completed in 13 seconds, 540 milliseconds.
[info] Total number of tests run: 0
[info] Suites: completed 4, aborted 0
[info] Tests: succeeded 0, failed 0, canceled 0, ignored 0, pending 0
[info] No tests were executed.
[info] Passed: Total 0, Failed 0, Errors 0, Passed 0
[success] Total time: 20 s, completed Nov 12, 2016 8:29:13 PM
All of my test classes are found, built, and seemingly run by the test runner when invoking sbt test. I have also tried using the IDEA test runner, and it reports Empty Test Suite under each one of my test classes. I have exhaustively attempted to RTFM but I cannot see what I am doing wrong. The synchronous versions of my tests are running totally fine.
EDIT 1: A friend suggested trying whenReady() { /* clause */ } on my Future[WSResponse], but this too has failed.
I was having the same problem while using a test suite with multiple traits. I got it to work by removing all the other traits except for AsyncFlatSpec. I will add them back one at a time as I need them.
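For reference, a minimal sketch of what that stripped-down starting point might look like with the trait names from the question (OneServerPerSuite and ParallelTestExecution deliberately left off, to be re-added one at a time):
import org.scalatest.{AsyncWordSpec, MustMatchers}

// Start from the async style trait plus matchers only; re-introduce the
// other mixins one at a time to find the one that hides the tests.
trait BaseSpec extends AsyncWordSpec with MustMatchers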

Prediction.io - pio train fails

I'm using the Elasticsearch + HBase version of PredictionIO from the sphereio/docker-predictionio Docker image, and the universal recommendation template, template-scala-parallel-universal-recommendation.
pio-start-all and pio status work fine, and the eventserver is perfectly functional. I have created an app and imported a few hundred events to start with.
However, after doing pio build on the template, pio train fails, giving a couple of javax.naming.NameNotFoundException warnings. pio.log contains nothing else either.
Here's my engine.json:
{
  "comment": " This config file uses default settings for all but the required values see README.md for docs",
  "id": "default",
  "description": "Default settings",
  "engineFactory": "com.test.RecommendationEngine",
  "datasource": {
    "params": {
      "name": "sample-handmade-data.txt",
      "appName": "testapp",
      "eventNames": ["START"]
    }
  },
  "sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "spark.executor.memory": "4g",
    "es.index.auto.create": "true"
  },
  "algorithms": [{
    "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
    "name": "ur",
    "params": {
      "appName": "testapp",
      "indexName": "urindex",
      "typeName": "items",
      "comment": "must have data for the first event or the model will not build, other events are optional",
      "eventNames": ["START"]
    }
  }]
}
And the pio train output:
[INFO] [Console$] Using existing engine manifest JSON at /PredictionIO-0.9.6/engines/universal-recommendation/manifest.json
[INFO] [Runner$] Submission command: /PredictionIO-0.9.6/vendors/spark-1.5.1-bin-hadoop2.6/bin/spark-submit --class io.prediction.workflow.CreateWorkflow --jars file:/PredictionIO-0.9.6/engines/universal-recommendation/target/scala-2.10/template-scala-parallel-universal-recommendation-assembly-0.2.3-deps.jar,file:/PredictionIO-0.9.6/engines/universal-recommendation/target/scala-2.10/template-scala-parallel-universal-recommendation_2.10-0.2.3.jar --files file:/PredictionIO-0.9.6/conf/log4j.properties,file:/PredictionIO-0.9.6/vendors/hbase-1.0.0/conf/hbase-site.xml --driver-class-path /PredictionIO-0.9.6/conf:/PredictionIO-0.9.6/vendors/hbase-1.0.0/conf file:/PredictionIO-0.9.6/lib/pio-assembly-0.9.6.jar --engine-id FYOHZGlAmUH2xAYWNmQFIf9Jls201WVr --engine-version a892fe59be15dcf27a17f07fb76135a967309fda --engine-variant file:/PredictionIO-0.9.6/engines/universal-recommendation/engine.json --verbosity 0 --json-extractor Both --env PIO_STORAGE_SOURCES_HBASE_TYPE=hbase,PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_VERSION=0.9.6,PIO_FS_BASEDIR=/root/.pio_store,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost,PIO_STORAGE_SOURCES_HBASE_HOME=/PredictionIO-0.9.6/vendors/hbase-1.0.0,PIO_HOME=/PredictionIO-0.9.6,PIO_FS_ENGINESDIR=/root/.pio_store/engines,PIO_STORAGE_SOURCES_LOCALFS_PATH=/root/.pio_store/models,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=predictionio,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/PredictionIO-0.9.6/vendors/elasticsearch-1.4.4,PIO_FS_TMPDIR=/root/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE,PIO_CONF_DIR=/PredictionIO-0.9.6/conf,PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(testapp,List(START)))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver#172.17.0.2:42582]
[WARN] [MetricsSystem] Using default name DAGScheduler for source because spark.app.id is not set.
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.test.DataSource#75bd28d
[INFO] [Engine$] Preparator: com.test.Preparator#13278a41
[INFO] [Engine$] AlgorithmList: List(com.test.URAlgorithm#2365ea38)
[INFO] [Engine$] Data sanity check is on.
[WARN] [TableInputFormatBase] Cannot resolve the host name for 9a94fb2890b3/172.17.0.2 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '2.0.17.172.in-addr.arpa'
[INFO] [Engine$] com.test.TrainingData does not support data sanity check. Skipping check.
[WARN] [TableInputFormatBase] Cannot resolve the host name for 9a94fb2890b3/172.17.0.2 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '2.0.17.172.in-addr.arpa'
There is one way to sort out this problem: use Google DNS when starting your Docker container.
--dns=8.8.8.8
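For example, a sketch with the image from the question (any other flags you normally pass are omitted here):
# Google's public DNS lets the container resolve the reverse lookups HBase attempts
docker run --dns=8.8.8.8 -it sphereio/docker-predictionio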

Changing the Default Play Framework HTTP Port (without Using System Property)

There's an oft-asked question about changing the HTTP port to which a Play application will bind. James Ward's answer is generally accepted as the most complete, but it involves overriding the default by setting an http.port system property. However, is it possible to change this default without having to manually add it to the run command at development time, tweak the environment, or package an override in a runtime configuration?
This can be accomplished by setting the playDefaultPort key, as follows:
import PlayKeys._
playDefaultPort := 9123
Afterwards, you'll be able to run and testProd without needing to remember the desired port.
This works in both development:
$ sbt run
[info] Loading project definition from /Users/michaelahlers/Projects/MyApp/project
[info] Set current project to MyApp (in build file:/Users/michaelahlers/Projects/MyApp/)
--- (Running the application, auto-reloading is enabled) ---
[info] p.c.s.NettyServer - Listening for HTTP on /0:0:0:0:0:0:0:0:9123
(Server started, use Ctrl+D to stop and go back to the console...)
And production modes:
$ sbt testProd
[info] Loading project definition from /Users/michaelahlers/Projects/MyApp/project
[info] Set current project to MyApp (in build file:/Users/michaelahlers/Projects/MyApp/)
[info] Packaging /Users/michaelahlers/Projects/MyApp/target/scala-2.11/MyApp_2.11-1.0.0-SNAPSHOT-web-assets.jar ...
[info] Done packaging.
(Starting server. Type Ctrl+D to exit logs, the server will remain in background)
2016-04-08 13:09:45,594 [info] a.e.s.Slf4jLogger - Slf4jLogger started
2016-04-08 13:09:45,655 [info] play.api.Play - Application started (Prod)
2016-04-08 13:09:45,767 [info] p.c.s.NettyServer - Listening for HTTP on /0:0:0:0:0:0:0:0:9123
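For completeness, a sketch of where the setting might sit in a full build.sbt (assuming the PlayScala plugin is enabled on the project):
import PlayKeys._

lazy val root = (project in file("."))
  .enablePlugins(PlayScala)
  .settings(
    playDefaultPort := 9123 // picked up by run and testProd alike
  )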