How to export data to a CSV file in ODI 12c

I have a table in the database and I need to export its data, filtered on a condition of < current date, to a CSV file using ODI 12c. Can I get the steps please?

Please follow the steps below to export data from the DB to a file using a condition.
1) Create a new Package.
2) Under Toolbox navigate to Files --> click on OdiSqlUnload, then click on the package canvas to drop the step.
3) Enter the input parameters in the Properties section:
Target File: the path where the file needs to be created
JDBC Driver: oracle.jdbc.OracleDriver
JDBC URL: the JDBC URL which you use to connect to the DB
User: user name for the DB you are connecting to
Pwd: password for that user
File Format: Delimited
Field Separator: , (Comma)
SQL Query: the query whose result set is written to the file; your date condition goes in its WHERE clause. Example: select * from emp where rownum<10
4) Leave all the remaining parameters as they are.
5) Save the Package and execute the step with the desired context. If you are creating the file locally, do not use an agent during execution. (A sketch of the equivalent query-to-file logic follows these steps.)
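For reference, OdiSqlUnload simply runs the SQL Query and writes the result set to the delimited Target File. A minimal Python sketch of that same logic, assuming a hypothetical EMP table with a LOAD_DATE column and the python-oracledb driver (table, column, and connection details are placeholders, not part of the original answer):

import csv
import oracledb  # assumed driver; any DB-API 2.0 driver works the same way

# Placeholder connection details -- use the same credentials as in the ODI step
conn = oracledb.connect(user="scott", password="tiger", dsn="dbhost:1521/orclpdb")

# The "< current date" condition the question asks about goes in the WHERE clause,
# exactly as it would in the OdiSqlUnload "SQL Query" parameter.
query = "SELECT * FROM emp WHERE load_date < SYSDATE"

with conn.cursor() as cur, open("/tmp/emp_export.csv", "w", newline="") as f:
    cur.execute(query)
    writer = csv.writer(f)                                 # comma = Field Separator
    writer.writerow([col[0] for col in cur.description])   # header row
    writer.writerows(cur)                                   # data rows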

Related

Export a PostGIS table to GeoPackage using the DB Manager in QGIS

I am working in QGIS through the DB Manager by running some spatial queries. After I create new tables with the queries I need to export the tables as new geopackages.
I've tried using the Export to Vector File option inside the DB Manager, but I get the following error message:
Error 2 Creation of data source failed (OGR error:
sqlite3_open(/Users/xxx/Documents/xxx/xxx/xxx/xxxx/xx/new_geopackage_layer.gpkg)failed:
unable to open database file)
I've read a couple of posts that said I needed to create an empty geopackage first and then export the table and save it inside the geopackage, but that did not work either. When I try to save inside an existing geopackage, I get an error saying:
"geopackage.gpkg already exists. Do you want to replace it? A file or
folder with the same name already exists in the folder xxx Replacing
it will overwrite its current contents."
If I choose to overwrite then I get a second error message saying:
" Error 1 Unable to create the datasource.
/Users/xxx/Documents/xxx/xxx/xxx/xxx/xxxx/new_geopackage.gpkg exists
and overwrite flag is false."
All I want is to be able to run spatial queries inside QGIS and be able to export the tables created with the queries as geopackages.
It seems that as of now I won't be able to do this from inside QGIS, but will instead need to use the ogr2ogr command to export to any file type.
Any help would be really appreciated.
Thank you
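For reference, a minimal sketch of the ogr2ogr route mentioned above, wrapped in Python so it can be scripted; the PostGIS connection string, output path, and table name are assumptions to adapt:

import subprocess

# Placeholder PostGIS connection string -- adjust host, database, and credentials
pg_conn = "PG:host=localhost dbname=mydb user=postgres password=secret"

# Export one PostGIS table (e.g. the result table created by the spatial query)
# into a new GeoPackage; -f GPKG selects the GeoPackage driver.
subprocess.run(
    [
        "ogr2ogr",
        "-f", "GPKG",
        "new_geopackage_layer.gpkg",   # output file (illustrative path)
        pg_conn,
        "my_query_result_table",       # table created by the spatial query
    ],
    check=True,
)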

How do I read a Lake database in Azure Synapse in a PySpark notebook

Hi, I created a database in Azure Synapse Studio and I can see the database and table in there. Now I have created a notebook where I have added the required libraries, but I am unable to read the table with the code below. Can anyone tell me what I am doing wrong here?
My database name is Utilities_66_Demo. It gives me the error:
AnalysisException: Path does not exist:
abfss://users@stcdmsynapsedev01.dfs.core.windows.net/Utilities_66_Demo.parquet
Where should I take the path from? I tried to follow the MS article. If I click on edit Database, I get this:
%%pyspark
df = spark.read.load('abfss://users@stcdmsynapsedev01.dfs.core.windows.net/Utilities_66_Demo.parquet', format='parquet')
display(df.limit(10))
Trying to access the created Lake Database table:
Selected Azure Synapse Analytics:
I select my workspace and no table is shown in the dropdown.
I select Edit and enter my DB name and table name, and it says "Invalid details".
Now I select Azure Dedicated Synapse Pool from the Linked Service; I get no option to select in SQL Pool or Table, and without a SQL Pool I am unable to create a linked service just by entering the table name:
You can go directly to your ADLS, right-click the parquet file and select Properties. There you will be able to find the ABFSS path, which is in the format:
abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<path>
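As a rough sketch (the container, storage account, folder, and table names below are placeholders, not taken from the question), the two usual ways to read the data from a Synapse notebook are the corrected abfss path, or plain Spark SQL for a Lake database table registered in the same workspace:

%%pyspark
# Option 1: read the underlying parquet file directly, using the ABFSS path
# copied from the file's Properties in ADLS (note the '@', not '#').
df = spark.read.load(
    'abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<folder>/<file>.parquet',
    format='parquet'
)

# Option 2: a Lake database in the same Synapse workspace is shared with the
# Spark metastore, so its tables can usually be queried by name.
df = spark.sql("SELECT * FROM Utilities_66_Demo.<table_name>")

display(df.limit(10))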

dashDB and DB2 Load operation

I am currently trying to use a dashDB database with the db2cli utility and ODBC (values are from Connect/Connection Information on the dashDB web console). At the moment I can run SELECT or INSERT statements perfectly and fetch data from custom tables which I have created, thanks to the command:
db2cli execsql -connstring "DRIVER={IBM DB2 ODBC DRIVER - IBMDBCL1}; DATABASE=BLUDB; HOSTNAME=yp-dashdb-small-01-lon02.services.eu-gb.bluemix.net; PORT=50000; PROTOCOL=TCPIP; UID=xxxxxx; PWD=xxxxxx" -inputsql /tmp/input.sql
Now I am trying to do a DB2 LOAD operation through the db2cli utility, but I don't know how to proceed or even if it is possible to do so.
The aim is to import data from a file without cataloging the DB2 dashDB database on my side, but only through ODBC. Does someone know if this kind of operation is possible (with db2cli or another utility)?
The latest API version referenced from the Db2 on Cloud (formerly dashDB) dashboard is available here. It requires you to first call the /auth/tokens endpoint to generate an auth token based on your Bluemix credentials, which is then used to authorize the API calls.
I've published recently a npm module - db2-rest-client - to simplify the usage of these operations. For example, to load data from a .csv file you can use the following commands:
# install the module globally
npm i db2-rest-client -g
# call the load job
export DB_USERID='<USERID>'
export DB_PASSWORD='<PASSWORD>'
export DB_URI='https://<HOSTNAME>/dbapi/v3'
export DEBUG=db2-rest-client:cli
db2-rest-client load --file=mydata.csv --table='MY_TABLE' --schema='MY_SCHEMA'
For the load job, a test on Bluemix dedicated with a 70 MB source file and about 4 million rows took about 4 minutes to load. There are also other CLI options, such as executing an export statement, running comma-separated statements, and uploading files.
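If you prefer calling the REST API directly rather than through the npm module, a rough sketch of the token step with Python requests is below. The /dbapi/v3 base path and the /auth/tokens endpoint come from the answer above; the JSON field names (userid, password, token), the Bearer header, and the /schemas call are assumptions to verify against the API reference:

import requests

BASE_URL = "https://<HOSTNAME>/dbapi/v3"  # same base URI as DB_URI above

# Step 1: exchange credentials for an auth token (field names are assumptions).
resp = requests.post(
    BASE_URL + "/auth/tokens",
    json={"userid": "<USERID>", "password": "<PASSWORD>"},
)
resp.raise_for_status()
token = resp.json()["token"]

# Step 2: pass the token on subsequent calls (endpoint shown only as an
# illustration -- check the API reference for the exact load/export routes).
headers = {"Authorization": "Bearer " + token}
print(requests.get(BASE_URL + "/schemas", headers=headers).json())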
This is not possible. LOAD is not an SQL statement, therefore it cannot be executed via an SQL interface such as ODBC, only using the DB2 CLP, which in turn requires a cataloged database.
ADMIN_CMD() can be invoked via an SQL interface, however, it requires that the input file be on the server -- it won't work with a file stored on your workstation.
If JDBC is an option, you could use the CLPPlus IMPORT command.
You can try loading data using REST API.
Example:
curl --user dashXXX:XXXXXX -H "Content-Type: multipart/form-data" -X POST -F loadFile1=#"/home/yogesh/Downloads/datasets/order_details_0.csv" "https://yp-dashdb-small-01-lon02.services.eu-gb.bluemix.net:8443/dashdb-api/load/local/del/dashXXX.ORDER_DETAILS?hasHeaderRow=true&timestampFormat=YYYY-MM-DD%20HH:MM:SS.U"
I have used the REST API and have not seen any size limitations. In version 1.11 of dashDB Local (the warehouse DB), external tables have been included. As long as the file is on the container it can be loaded. Also, DB2 LOAD locks the table until the load is finished, whereas an external table load won't.
There are a number of ways to get data into Db2 Warehouse on Cloud. From a command line you can use the Lift CLI (https://lift.ng.bluemix.net/), which provides the best performance for large data sets.
You can also use EXTERNAL TABLEs (https://www.ibm.com/support/knowledgecenter/ean/SS6NHC/com.ibm.swg.im.dashdb.sql.ref.doc/doc/r_create_ext_table.html), which are also high performance and have lots of options.
This is a quick example using a local file (not on the server), hence the REMOTESOURCE YES option:
db2 "create table foo(i int)"
echo "1" > /tmp/foo.csv
db2 "insert into foo select * from external '/tmp/foo.csv' using (REMOTESOURCE YES)"
db2 "select * from foo"
I
-----------
1
1 record(s) selected.
For large files, you can use gzip, either on the fly:
db2 "insert into foo select * from external '/tmp/foo.csv' using (REMOTESOURCE GZIP)"
or from gzipped files:
gzip /tmp/foo.csv
db2 "insert into foo select * from external '/tmp/foo.csv.gz' using (REMOTESOURCE YES)"

Pass a parameter to Pentaho Kettle over kettleTransFromFile in Pentaho CDE

I am using Pentaho CE biserver-ce-4.8.0 stable version. I want to create a dashboard which fetches data from MongoDB, so I created a .ktr file in Data Integration which connects to MongoDB and fetches the data. After that I used the .ktr file in my CDE dashboard datasource; below is part of the .ktr file:
<hostname>localhost</hostname>
<port>27017</port>
<use_all_replica_members>N</use_all_replica_members>
<db_name>${db_name}</db_name>
<fields_name/>
<collection>test</collection>
<json_field_name>json</json_field_name>
<json_query/>
<auth_user/>
<auth_password>Encrypted </auth_password>
<auth_kerberos>N</auth_kerberos>
<connect_timeout/>
<socket_timeout/>
<read_preference>primary</read_preference>
<output_json>Y</output_json>
<query_is_pipeline>N</query_is_pipeline>
<execute_for_each_row>N</execute_for_each_row>
${db_name} is my parameter, and I want to pass it through the URL. But when I passed db_name in the URL and read that URL parameter, I got the value; however, my .ktr file did not understand the parameter, and so it created a database in MongoDB literally named ${db_name}. So how do I pass the parameter to the .ktr file in Pentaho CDE?
After going through the Pentaho Data Integration 4 Cookbook I found the solution to my question. I solved my problem in the following way.
1> First create a transformation using PDI and add a MongoDB input step to that transformation file.
2> Click on Edit -> Settings, select Parameters and add a parameter named db_name.
3> In the MongoDB input set the database name as ${db_name}, set the collection name, and save the transformation file.
4> Now log in to the Pentaho BI server and create a new CDE dashboard.
5> Go to the datasource panel, select kettle query, add a kettle transformation file and point it to the .ktr file created above; in Variables set the arg as db_name and leave the value blank.
6> In the same datasource, under Parameters, set the name as db_name and the value as the database name which you want to pass to the .ktr file; in my case the DB name is demo.
7> Set the above ktr in the component panel and it works fine.

How do you import a spreadsheet into DB2?

I have a CSV file, but this could apply to any txt, data, or xls (xlsx) file. I have exported the data from one source and I want to import it into a DB2 table.
I first tried the Data Tools Plugin (DTP) in Eclipse Helios (3.6.3) by right-clicking on the table and selecting: Data > Load...
But I got this error:
Loading "myschema"."mytable"... com.ibm.db2.jcc.am.SqlException:
[jcc][10103][10941][4.14.113] Method executeQuery cannot be used for
update. ERRORCODE=-4476, SQLSTATE=null Data loading failed.
Then I tried Eclipse SQL Explorer on Eclipse Juno, but it does not support data import.
How do I get past this error so I can import?
You can import a CSV file directly into DB2 via the IMPORT or LOAD command, even with XML or BLOB as part of the data to import.
The procedure depends on the structure of the file you are going to import. You will probably need to modify the default behaviour of these commands; DB2 has many options to adapt them to the input file.
For more information, see:
The IMPORT command: http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.admin.cmd.doc/doc/r0008304.html
The LOAD command: http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.admin.cmd.doc/doc/r0008305.html
I think your question was more oriented towards how to use Eclipse to import data into DB2 from a CSV file. However, as I said, you can do that directly via DB2.
If you are going to import a file like the following one, the only thing you need is access to a DB2 client.
data.txt
1,"Andres","2013-05-18"
2,"Tom","2011-04-16"
3,"Jessica","2002-03-09"
You import it with:
db2 import from data.txt of del insert into test
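If you only have a driver-level connection (as in the Eclipse/JDBC setup in the question) and cannot run the db2 CLP, a minimal Python sketch of the same import done with plain parameterized INSERTs is shown below. It assumes the ibm_db driver and the three-column test table from the example above; connection values are placeholders, and for large files IMPORT/LOAD remains the better tool:

import csv
import ibm_db

# Placeholder connection string -- the same values you would use for ODBC/JDBC
conn = ibm_db.connect(
    "DATABASE=mydb;HOSTNAME=myhost;PORT=50000;PROTOCOL=TCPIP;UID=user;PWD=secret;",
    "", ""
)

# Insert each row of data.txt (id, name, date) into the test table
stmt = ibm_db.prepare(conn, "INSERT INTO test VALUES (?, ?, ?)")
with open("data.txt", newline="") as f:
    for row in csv.reader(f):
        ibm_db.execute(stmt, tuple(row))

ibm_db.close(conn)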
I solved this by installing Eclipse Juno (4.2) and Data Tools Plugin (DTP) 1.10.2.
Now Data > Load... will work fine. This is the new message I get:
Data loading was successful. 142 row(s) loaded. 135 row(s) could not
be loaded.
com.ibm.db2.jcc.am.go: DB2 SQL Error: SQLCODE=-407, SQLSTATE=23502,
SQLERRMC= , DRIVER=4.7.85 One or more values could not be set in the
following column(s): USER_TIME, USER_DATE
FYI for the entire process I was using this:
DB2 driver: /opt/IBM/db2/V9.7/java
With jar files: db2jcc4.jar, db2jcc_license_cisuz.jar
Driver Class: com.ibm.db2.jcc.DB2Driver
You can import using DB2 "Control Center" *
Right-click the table and select "Import".
Then specify the CSV file and the message file.
The message file is important because, in the case of a failed upload, you can find the cause of the error in it.
* Control Center is now deprecated in favor of "Data Studio"
From the db2 console, try this:
Import from 'yourcommaseparatedfile.csv' of del insert into "SCHEMA"."TABLE"
Regards =)
db2 'import from /users/n0sdsds/test.csv of del insert into ENTPRISE.tmp_x'