I have a question about the external table concept in PostgreSQL. I'm using the Enterprise version (EDB).
CREATE OR REPLACE DIRECTORY bdump AS '/u01/app/oracle/admin/SID/bdump/';
DROP TABLE alert_log;
CREATE TABLE alert_log (
  line VARCHAR2(4000)
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY bdump
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    BADFILE bdump:'read_alert_%a_%p.bad'
    LOGFILE bdump:'read_alert_%a_%p.log'
    FIELDS TERMINATED BY '~'
    MISSING FIELD VALUES ARE NULL
    (
      line CHAR(4000)
    )
  )
  LOCATION ('alert_SID.log')
)
/
I'm converting the above code to PostgreSQL. I created the file_fdw extension (foreign tables) and used the OPTIONS clause to some extent. I also want to know whether we can specify BADFILE and DISCARDFILE in Postgres, and what the alternative to ALL_EXTERNAL_TABLES is in PostgreSQL. Your help is highly appreciated.
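For reference, a minimal sketch of such a file_fdw conversion (the server name log_server is a placeholder, and the file must be readable by the database server process; file_fdw has no BADFILE or DISCARDFILE counterpart, so malformed lines simply raise an error, and pg_foreign_table / information_schema.foreign_tables are the nearest catalog views to ALL_EXTERNAL_TABLES):

CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER log_server FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE alert_log (
  line text
) SERVER log_server OPTIONS (
  filename '/u01/app/oracle/admin/SID/bdump/alert_SID.log',
  format 'text'  -- one row per line; tab is the text-format delimiter, so it must not appear in the data
);

-- Closest equivalents to ALL_EXTERNAL_TABLES:
SELECT * FROM information_schema.foreign_tables;
SELECT * FROM pg_foreign_table;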
The source for the Informatica Cloud is of type DB2 for i CDC. A few tables contain # in their column names, and if a mapping is run with a column name containing #, the mapping fails.
Example: there is an Employee table with the column First#Name.
To eliminate the # from the column name, I tried using a SQL Override to alias the column.
I used a SELECT statement that contains a column list for the same table. A sample SQL statement for it:
Select First#Name as First_Name,
Last#Name as Last_Name,
...
FROM Employee;
But the column name is still being fetched with the # symbol, and this is breaking the mapping.
Is there any way to alias the # to _ in the column name?
You have a few options:
You can enclose the column names in double quotes, "col_name":
Select "First#Name" as First_Name
"Last#Name" as Last_Name
.
Employee;
If this doesn't solve the issue, do not specify any SQL override and connect only the required columns. Informatica should automatically build the SQL and fetch the data.
If the above two don't work, then you need to change some settings in DB2 so that it can handle the special character in the column names. I have not tested this, so I cannot guarantee it.
Select First#Name AS First_Name,Last#Name AS Last_Name,Column3,Column4,Column5,Column6,Column7,Column8 FROM Employee;
The SQL query should be written so that there are no spaces except where required, and the entire query should be on a single line.
Remove all the spaces (except the ones mentioned below) and all newline characters from the query. That will solve the issue.
Where spaces may remain:
After the SELECT keyword
Before the FROM keyword
After the FROM keyword
Before and after the AS keyword when aliasing the columns that have special characters (here two spaces are used, one before the AS keyword and one after it)
Is there a way to COPY the CSV file data directly into a JSON or JSONb array?
Example:
CREATE TABLE mytable (
id serial PRIMARY KEY,
info jsonb -- or json
);
COPY mytable(info) FROM '/tmp/myfile.csv' CSV HEADER;
NOTE: each CSV line is to be mapped to a JSON array. It is a normal CSV.
Normal CSV (no embedded JSON)... /tmp/myfile.csv =
a,b,c
100,Mum,Dad
200,Hello,Bye
The correct COPY command must be equivalent to the usual COPY below.
Usual COPY (ugly but works fine):
CREATE TEMPORARY TABLE temp1 (
a int, b text, c text
);
COPY temp1(a,b,c) FROM '/tmp/myfile.csv' CSV HEADER;
INSERT INTO mytable(info) SELECT json_build_array(a,b,c) FROM temp1;
It is ugly because:
it needs a priori knowledge about the fields, and a preceding CREATE TABLE for them;
for "big data" it needs a big temporary table, wasting CPU, disk and my time (the table mytable has CHECK and UNIQUE constraints for each line);
it needs more than one SQL command.
Perfect solution!
No need to know all the CSV columns, only to extract what you know.
Run CREATE EXTENSION plpythonu; in SQL: if the command produces an error like "could not open extension control file ... No such file", you need to install the PL/Python extra packages. On standard Ubuntu (16.04 LTS) this is simple: apt install postgresql-contrib postgresql-plpython.
CREATE FUNCTION get_csvfile(
    file text,
    delim_char char(1) = ',',
    quote_char char(1) = '"')
RETURNS SETOF text[] STABLE LANGUAGE plpythonu AS $$
import csv
# Each CSV record comes back as one text[] row;
# csv.reader takes care of quoting and escaping (Python 2, hence 'rb').
return csv.reader(
    open(file, 'rb'),
    quotechar=quote_char,
    delimiter=delim_char,
    skipinitialspace=True,
    escapechar='\\'
)
$$;
INSERT INTO mytable(info)
SELECT jsonb_build_array(c[1],c[2],c[3])
FROM get_csvfile('/tmp/myfile1.csv') c;
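For example (a hypothetical variation), since each row comes back as a text[], you can pick out only the columns you know:

INSERT INTO mytable(info)
SELECT jsonb_build_array(c[1], c[3])
FROM get_csvfile('/tmp/myfile1.csv') c;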
The get_csvfile() function is defined above. Python's csv.reader is very reliable (!).
Not tested with very big CSVs, but Python is expected to do the job.
PostgreSQL workaround
It is not a perfect solution, but it solves the main problem, that is, the
"... big temporary table, so lost CPU, disk and my time" issue.
This is the way we do it, a workaround with file_fdw!
Adopt conventions of your own to avoid file-copy and file-permission confusion, e.g. a standard file path for CSVs such as /tmp/pg_myPrj_file.csv.
Initialise your database or SQL script with the magic extension,
CREATE EXTENSION file_fdw;
CREATE SERVER files FOREIGN DATA WRAPPER file_fdw;
For each CSV file, myNewData.csv:
3.1. Make a symbolic link (or an scp remote copy) for your new file: ln -sf $PWD/myNewData.csv /tmp/pg_socKer_file.csv
3.2. Configure file_fdw for your new table (suppose mytable):
CREATE FOREIGN TABLE temp1 (a int, b text, c text)
SERVER files OPTIONS (
filename '/tmp/pg_socKer_file.csv',
format 'csv',
header 'true'
);
PS: if you hit some permission problem after running the SQL script with psql, change the owner of the link with sudo chown -h postgres:postgres /tmp/pg_socKer_file.csv.
3.3. Use the file_fdw table as the source (suppose populating mytable):
INSERT INTO mytable(info)
SELECT json_build_array(a,b,c) FROM temp1;
Thanks to #JosMac (and his tutorial)!
NOTE: if there were a STDIN way to do this (does one exist?), it would be easy, avoiding permission problems and the use of absolute paths. See this answer/discussion.
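For what it's worth, a client-side sketch using psql's \copy meta-command, which streams the file over the client connection and so sidesteps the server-side permission problem (at the price of bringing the temporary table back):

CREATE TEMPORARY TABLE temp1 (a int, b text, c text);
\copy temp1(a,b,c) FROM 'myNewData.csv' CSV HEADER
INSERT INTO mytable(info) SELECT jsonb_build_array(a,b,c) FROM temp1;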
I've used the psql module to create a new database using the following syntax:
CREATE DATABASE fish
I can open the database. However, when I try to create tables, I get a syntax error from this statement:
CREATE TABLE salmon;
This is the error message:
ERROR: syntax error at or near ";"
LINE 1: CREATE TABLE species;
I've checked a lot of online PostgreSQL resources and they haven't been of much help. To the best of my knowledge, I haven't messed up the syntax. Thanks.
You can use this syntax for an empty table:
create table salmon();
You must create at least one column in the table:
CREATE TABLE salmon (column_name data_type, ...);
Postgres CREATE TABLE documentation: https://www.postgresql.org/docs/9.1/static/sql-createtable.html
You can't create an empty table - it must have at least one column. E.g.:
CREATE TABLE salmon (name VARCHAR(10));
psql is not a module. Please read https://www.postgresql.org/docs/current/static/app-psql.html
You don't open a database - you connect to it:
Establishes a new connection to a PostgreSQL server
https://www.postgresql.org/docs/current/static/sql-createtable.html
{ column_name | ( expression ) }
Either a column list (create table a (a int);) or an expression (create table b as select now() time_column) is an obligatory part.
For example, there is a table named 'testtable' that has the following columns: testint (integer) and testtext (varchar(30)).
What I want to do is pretty much something like this:
INSERT INTO testtable VALUES(15, CONTENT_OF_FILE('file'));
While reading the PostgreSQL documentation, all I could find is the COPY TO/FROM command, but that one applies to whole tables, not single columns.
So, what shall I do?
If this SQL code is executed dynamically from your programming language, use the means of that language to read the file, and execute a plain INSERT statement.
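For instance, a minimal sketch of that first approach in Python with the psycopg2 driver (the file name and connection string are placeholders):

import psycopg2

# Read the file on the client side...
with open('file') as f:
    content = f.read()

conn = psycopg2.connect('dbname=mydb')  # placeholder connection string
with conn, conn.cursor() as cur:
    # ...and pass the content as an ordinary query parameter.
    cur.execute("INSERT INTO testtable VALUES (%s, %s)", (15, content))
conn.close()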
However, if this SQL code is meant to be executed via the psql command line tool, you can use the following construct:
\set content `cat file`
INSERT INTO testtable VALUES(15, :'content');
Note that this syntax is specific to psql and makes use of the cat shell command.
It is explained in detail in the PostgreSQL manual:
psql / SQL Interpolation
psql / Meta-Commands
If I understand your question correctly, you could read the single string(s) into a temp table and use that for the insert:
DROP SCHEMA str CASCADE;
CREATE SCHEMA str;
SET search_path='str';
CREATE TABLE strings
( string_id INTEGER PRIMARY KEY
, the_string varchar
);
CREATE TEMP TABLE string_only
( the_string varchar
);
COPY string_only(the_string)
FROM '/tmp/string'
;
INSERT INTO strings(string_id,the_string)
SELECT 5, t.the_string
FROM string_only t
;
SELECT * FROM strings;
Result:
NOTICE: drop cascades to table str.strings
DROP SCHEMA
CREATE SCHEMA
SET
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "strings_pkey" for table "strings"
CREATE TABLE
CREATE TABLE
COPY 1
INSERT 0 1
string_id | the_string
-----------+---------------------
5 | this is the content
(1 row)
Please note that the file is "seen" by the server as the server sees the filesystem. The "current directory" from that point of view is probably $PGDATA, but you should assume nothing and specify the complete pathname, which should be reachable and readable by the server. That is why I used '/tmp', which is unsafe (but an excellent rendez-vous point ;-).
I am using the SEQUENCE keyword in a SQL*Loader control file to generate primary keys. But for a special scenario I would like to use an Oracle sequence in the control file. The Oracle documentation for SQL*Loader doesn't mention anything about it. Does SQL*Loader support it?
I have managed to load without using the dummy value by switching the sequence to be the last column, as in:
LOAD DATA
INFILE 'data.csv'
APPEND INTO TABLE my_data
FIELDS TERMINATED BY ','
(
name char,
ID "MY_SEQUENCE.NEXTVAL"
)
and data.csv would be like:
"dave"
"carol"
"tim"
"sue"
I have successfully used a sequence from my Oracle 10g database to populate a primary key field during an sqlldr run:
Here is my data.ctl:
LOAD DATA
INFILE 'data.csv'
APPEND INTO TABLE my_data
FIELDS TERMINATED BY ','
(
ID "MY_SEQUENCE.NEXTVAL",
name char
)
and my data.csv:
-1, "dave"
-1, "carol"
-1, "tim"
-1, "sue"
For some reason you have to put a dummy value in the CSV file, even though you'd think that sqlldr would just figure out that you wanted to use a sequence.
I don't think so, but you can assign the sequence via a BEFORE INSERT trigger, unless this is a direct-path load.
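A minimal sketch of that trigger approach, reusing the my_data table and MY_SEQUENCE names from the answers above (note that a direct-path load bypasses triggers entirely):

CREATE OR REPLACE TRIGGER my_data_bi
BEFORE INSERT ON my_data
FOR EACH ROW
WHEN (new.id IS NULL)
BEGIN
  -- Oracle 11g or later; older versions need
  -- SELECT my_sequence.NEXTVAL INTO :new.id FROM dual;
  :new.id := my_sequence.NEXTVAL;
END;
/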