I am trying to import about 5 records from a MySQL table "employees" into HDFS, creating a Hive table as part of the Sqoop import command. But the query runs for a long time and I never see any result.
The Hadoop services are started and running, and I can also create tables in Hive manually.
sqoop import --connect jdbc:mysql://localhost:3306/DBName \
--username UserID --password Pwd --split-by EMPLOYEE_ID \
-e "select EMPLOYEE_ID,FIRST_NAME from employees where EMPLOYEE_ID <=105 and $CONDITIONS" \
--target-dir /user/hive/warehouse/mysqldb.db/employees \
--fields-terminated-by "," \
--hive-import \
--create-hive-table \
--hive-table mysqldb.employees -m 2
What is going on here?
I have a table Films with data in my local Postgres database (managed through pgAdmin), and I want to import this data into the same Films table in a Heroku Postgres database. I am using SQLAlchemy and Flask. I have read here (https://devcenter.heroku.com/articles/heroku-postgres-import-export) that it can be done through the console, and I even tried it, but without any success. My fallback plan is to write all the data from Films into a CSV file and then copy it from that CSV file into Films on the Heroku Postgres database, but is there a better way to do what I want? If there is, please give me an understandable example. I would be very appreciative.
P.S. I tried to create a table dump: pg_dump -Fc --no-acl --no-owner -h localhost -U oleksiy -t films --data-only fm_bot > table.dump
But I don't understand the next step, "Generate a signed URL using the aws console": aws s3 presign s3://your-bucket-address/your-object
What is "your-bucket-address", and what is "your-object"? And the main question: is that what I need?
I have a Postgres database schema on machine A, and I want to copy the whole schema to machine B using DBeaver. How can I do that?
You should take a backup using pg_dump. To dump a single schema from a remote server, run the following:
pg_dump -h host_address -U username -n schema_name -Fp database_name > dump_file_path.sql
This creates a plain SQL dump of the selected schema (-n picks the schema to dump; -s would mean "schema only, no data", and -Fc would produce a custom-format archive that psql cannot read). Then you can use psql or DBeaver to import and create the schema in your server B or database B.
To do it with psql, run the following:
psql "sslmode=disable dbname=_db_name_ user=_user_ hostaddr=_host_" < exported_sql_file_path
I'm trying to change the last column of the Hive table (which is of type STRING in Hive) to the PostgreSQL type date. Below is the command:
sqoop export \
--connect jdbc:postgresql://192.168.11.1:5432/test \
--username test \
--password test_password \
--table posgres_table \
--hcatalog-database hive_db \
--hcatalog-table hive_table
I have tried using this, but the column in Postgres is still empty:
-map-column-hive batch_date=date
-map-column-hive works only for Sqoop import (i.e., while fetching data from an RDBMS into HDFS/Hive).
You just need to get your Hive STRING data into a proper date format; then it should work.
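For example, if batch_date is stored as dd-MM-yyyy strings, something like this rewrites it in place as yyyy-MM-dd before the export (a sketch; the source pattern and the other column names col1, col2 are assumptions):
# normalize batch_date so PostgreSQL can cast the string to date
hive -e "
INSERT OVERWRITE TABLE hive_db.hive_table
SELECT col1, col2,
       from_unixtime(unix_timestamp(batch_date, 'dd-MM-yyyy'), 'yyyy-MM-dd') AS batch_date
FROM hive_db.hive_table;"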
Internally, sqoop export creates statements like
INSERT INTO posgres_table...
You can verify this by manually running an INSERT INTO posgres_table VALUES (...) statement via the JDBC driver or any client like pgAdmin, SQuirreL SQL, etc.
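For instance, something along these lines (the column list and values are invented for illustration):
# try the same style of INSERT that sqoop export issues
psql -h 192.168.11.1 -p 5432 -U test -d test \
  -c "INSERT INTO posgres_table (id, batch_date) VALUES (1, '2019-01-15');"
If the value you pass for batch_date is not in a format PostgreSQL accepts for date, this INSERT fails the same way the export does.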
I tried to import all tables using Sqoop into one of the directories, but one of the tables has no primary key. This is the command I executed:
sqoop import-all-tables --connect "jdbc:mysql://quickstart.cloudera/retail_db" \
--username=retail_dba \
--password=cloudera \
--warehouse-dir /user/cloudera/sqoop_import/
I am getting the following error:
Error during import: No primary key could be found for table
departments_export. Please specify one with --split-by or perform a
sequential import with '-m 1'.
After reading
sqoop import without primary key in RDBMS
I understood that we can use --split-by for a single-table import. Is there a way I can specify --split-by for the import-all-tables command? And is there a way to use more than one mapper for a multi-table import when a table has no primary key?
You need to use --autoreset-to-one-mapper:
Tables without a primary key will be imported with a single mapper, and tables with a primary key with the default number of mappers (4, if not specified in the sqoop command).
As @JaimeCr said, you can't use --split-by with import-all-tables, but here is a quote from the Sqoop user guide in the context of the error you got:
If a table does not have a primary key defined and the --split-by <col> is not provided, then import will fail unless the number of mappers is explicitly set to one with the --num-mappers 1 or -m 1 option or the --autoreset-to-one-mapper option is used.
The option --autoreset-to-one-mapper is typically used with the import-all-tables tool to automatically handle tables without a primary key in a schema.
sqoop import-all-tables --connect "jdbc:mysql://quickstart.cloudera/retail_db" \
--username=retail_dba \
--password=cloudera \
--autoreset-to-one-mapper \
--warehouse-dir /user/cloudera/sqoop_import/
I am trying to insert into a PostgreSQL DB with a Sqoop command:
sqoop export --connect jdbc:postgresql://10.11.12.13:1234/db --table table1 --username user1 --password pass1 --export-dir /hivetables/table/ --fields-terminated-by '|' --lines-terminated-by '\n' -- --schema schema
It works fine if there is no primary-key constraint, but I want to insert new records and update old records simultaneously.
I have tried:
--update-key primary_key - this updates only the rows whose primary keys exist in both DBs (Hive and PostgreSQL); no insertion.
--update-mode allowinsert - this only does the inserts.
--update-key primary_key --update-mode allowinsert - this gives the error:
ERROR tool.ExportTool: Error during export: Mixed update/insert is not supported against the
target database yet
Can anyone help me write a Sqoop command that both inserts and updates the data in PostgreSQL?
According to my internet search, it is not possible to perform both insert and update directly against a PostgreSQL DB with Sqoop. Instead, you can create a stored procedure/function in PostgreSQL and send the data through it:
sqoop export --connect <url> --call <upsert proc> --export-dir /results/bar_data
The stored procedure/function should perform both the update and the insert.
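On PostgreSQL 9.5+ such a function can use INSERT ... ON CONFLICT; a sketch against the table1 from the question (the column list and types are assumptions):
psql "host=10.11.12.13 port=1234 dbname=db user=user1" <<'SQL'
CREATE OR REPLACE FUNCTION upsert_table1(p_id integer, p_name text)
RETURNS void AS $$
BEGIN
    -- insert the row, or update it when the primary key already exists
    INSERT INTO table1 (id, name)
    VALUES (p_id, p_name)
    ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;
END;
$$ LANGUAGE plpgsql;
SQL
Sqoop then calls the function once per exported row:
sqoop export --connect jdbc:postgresql://10.11.12.13:1234/db --username user1 --password pass1 --call upsert_table1 --export-dir /hivetables/table/ --fields-terminated-by '|'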
Link 1 - https://issues.apache.org/jira/browse/SQOOP-1270
Link 2 - PLPGSQL-UPSERT-EXAMPLE