I have the following text file aatest.txt:
09/25/2019 | 1234.5
10/01/2018 | 6789.0
that I would like to convert into zztest.txt:
2019-09-25 | 1234.5
2018-10-01 | 6789.0
My Postgres script is:
CREATE TABLE documents (tdate TEXT, val NUMERIC);
COPY documents FROM 'aatest.txt' WITH CSV DELIMITER '|';
SELECT TO_DATE(tdate, 'mm/dd/yyyy');
COPY documents TO 'zztest.txt' WITH CSV DELIMITER '|';
However I am getting the following error message:
ERROR: column "tdate" does not exist
What am I doing wrong? Thank you!
Your SELECT has no FROM clause, so it can't reference any columns. In any case, you need to put that SELECT into the COPY statement:
CREATE TABLE documents (tdate TEXT, val NUMERIC);
COPY documents FROM 'aatest.txt' WITH CSV DELIMITER '|';
COPY (SELECT TO_CHAR(TO_DATE(tdate, 'mm/dd/yyyy'), 'yyyy-mm-dd'), val FROM documents)
TO 'zztest.txt' WITH CSV DELIMITER '|';
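The same conversion the inner SELECT performs can be sanity-checked outside the database, for example with Python's datetime:

```python
# Re-create the mm/dd/yyyy -> yyyy-mm-dd conversion done by
# TO_CHAR(TO_DATE(...)) in plain Python.
from datetime import datetime

def reformat(us_date: str) -> str:
    """Parse a mm/dd/yyyy string and render it as ISO yyyy-mm-dd."""
    return datetime.strptime(us_date, "%m/%d/%Y").strftime("%Y-%m-%d")

print(reformat("09/25/2019"))  # -> 2019-09-25
print(reformat("10/01/2018"))  # -> 2018-10-01
```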
I have several CSVs with varying field names that I am copying into a Postgres database from an s3 data source. There are quite a few of them that contain empty strings, "" which I would like to convert to NULLs at import. When I attempt to copy I get an error along the lines of this (same issue for other data types, integer, etc.):
psycopg2.errors.InvalidDatetimeFormat: invalid input syntax for type date: ""
I have tried using FORCE_NULL (field1, field2, field3), and this works for me, except that I would like to do FORCE_NULL (*) and apply it to all of the columns, since I am bringing in a lot of fields that I'd like this applied to.
Is this available?
This is an example of my csv:
"ABC","tgif","123","","XyZ"
Using psycopg2 Copy functions. In this case copy_expert:
cat empty_str.csv
1, ,3,07/22/2022
2,test,4,
3,dog,,07/23/2022
create table empty_str_test(id integer, str_fld varchar, int_fld integer, date_fld date);
import psycopg2

con = psycopg2.connect("dbname=test user=postgres host=localhost port=5432")
cur = con.cursor()
with open("empty_str.csv") as csv_file:
    # In CSV format, unquoted empty strings are read as NULL.
    cur.copy_expert("COPY empty_str_test FROM STDIN WITH csv", csv_file)
con.commit()
select * from empty_str_test ;
id | str_fld | int_fld | date_fld
----+---------+---------+------------
1 | | 3 | 2022-07-22
2 | test | 4 |
3 | dog | | 2022-07-23
From here COPY:
NULL
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings. This option is not allowed when using binary format.
copy_expert allows you to specify the CSV format. If you use copy_from, it will use the text format.
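As to the original FORCE_NULL (*) wish: PostgreSQL 17 added the FORCE_NULL * shorthand; on older versions, one workaround is to generate the column list yourself. A minimal sketch, assuming you already have the column names (table and column names here are placeholders):

```python
# Build a COPY ... FORCE_NULL (col1, col2, ...) statement covering every
# column, as a stand-in for FORCE_NULL (*) on pre-17 servers.
columns = ["field1", "field2", "field3"]  # hypothetical column names
force_null = ", ".join(columns)
sql = (
    "COPY my_table FROM STDIN "
    f"WITH (FORMAT csv, FORCE_NULL ({force_null}))"
)
print(sql)
```

The resulting string can then be passed to copy_expert as in the example above.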
I have the following csv file delimited by comma
col1,col2,col3
123,ABC,"DEF,EFG"
456,XYZ,"CFD,FGG"
I am creating an external table in databricks using
spark.sql(f'''
CREATE TABLE table_name USING CSV
OPTIONS (
header="true",
delimiter=",",
inferSchema="true",
path="/mnt/csvfile/abc.csv"
)
''')
Because of the comma inside a column, the CSV wraps the value in quotes. How do I handle the quotes to get the output below in the table?
col1 col2 col3
123 ABC DEF,EFG
456 XYZ CFD,FGG
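For what it's worth, this is standard CSV quoting: any RFC 4180-style parser treats the quoted value as a single field, as Python's csv module shows:

```python
# Quoted fields containing commas parse as single fields under
# standard CSV quoting rules.
import csv
import io

data = 'col1,col2,col3\n123,ABC,"DEF,EFG"\n456,XYZ,"CFD,FGG"\n'
rows = list(csv.reader(io.StringIO(data)))
print(rows[1])  # -> ['123', 'ABC', 'DEF,EFG']
print(rows[2])  # -> ['456', 'XYZ', 'CFD,FGG']
```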
I have a ton of CSV files that I'm trying to import into Postgres. The CSV data is all quoted regardless of what the data type is. Here's an example:
"3971","14","34419","","","","","6/25/2010 9:07:02 PM","70.21.238.46 "
The first 4 columns are supposed to be integers. Postgres handles the cast from the string "3971" to the integer 3971 correctly, but it pukes at the empty string in the 4th column.
PG::InvalidTextRepresentation: ERROR: invalid input syntax for type integer: ""
This is the command I'm using:
copy "mytable" from '/path/to/file.csv' with delimiter ',' NULL as '' csv header
Is there a proper way to tell Postgres to treat empty strings as null?
Use the FORCE_NULL option. Since I'm working in psql and using a file that the server user can't reach, I use \copy, but the principle is the same:
create table csv_test(col1 integer, col2 integer);
cat csv_test.csv
"1",""
"","2"
\copy csv_test from '/home/aklaver/csv_test.csv' with (format 'csv', force_null (col1, col2));
COPY 2
select * from csv_test ;
col1 | col2
------+------
1 | NULL
NULL | 2
I am reading a CSV file through a named pipe. In the CSV file the field2 column is blank, and it needs to be inserted into a table column as NULL. The table column is of type integer, but when I try to run the ingest
I get an error that says 'field2 cannot be converted to the value type: integer'.
Here is my below code
mkfifo mypipe
tail -n +2 myfile.csv > mypipe &
db2 "INGEST FROM FILE mypipe
FORMAT DELIMITED
(
$field1 CHAR(9),
$field2 INTEGER EXTERNAL,
$field3 CHAR(32)
)
INSERT INTO my_table
VALUES($field1, $field2, $field3)"
In the above code, $field2 will be blank. The value does not get inserted into my_table as NULL when the field is blank in the CSV.
Sample input csv data as shown below
Subject_Name,Student_ID,STATUS
Maths,,COMPLETED
Physics,,PENDING
Computers,,PENDING
I want the data to be ingested in the table like below
Subject_Name|Student_id|STATUS |
------------|----------|---------|
Maths |NULL |COMPLETED|
------------|----------|---------|
Physics |NULL |PENDING |
------------|----------|---------|
Computers |NULL |PENDING |
------------|----------|---------|
Can anyone suggest a way to resolve this issue?
I am trying to copy data spooled from Oracle into PostgreSQL in CSV format.
I am getting the error below while doing the copy:
ERROR: invalid input syntax for type timestamp: "20-MAR-17
08.30.41.453267 AM"
I tried setting the date style to DMY on Postgres but it did not work. I can load the data if I convert it to YMD format, but then I would have to change numerous fields across almost 50 TB of data.
Can someone please help me with this?
badmin=# copy downloaded_file from '/export/home/dbadmin/postgresql/TESTPGDB/scripts/FACTSET_IDS_2_V1.DOWNLOADED_FILE.csv'
with delimiter ',';
ERROR: invalid input syntax for type timestamp:
"20-MAR-17 08.30.41.453267 AM" CONTEXT: COPY downloaded_file, line 1,
column DOWNLOAD_TIME: "20-MAR-17 08.30.41.453267 AM"
Let's assume your main table has following columns and datatypes.
\d downloaded_file
Column | Type
--------+----------------------------
id | integer
txt | text
tstamp | timestamp without time zone
Now, rather than copying directly into the table, create a temporary table with the same columns but with all text datatypes.
create temporary table downloaded_file_tmp ( id text, txt text, tstamp text);
Now, copy the contents of the file into this temp table.
The file looks like this.
$cat f.csv
1,'TEXT1','20-MAR-17 08.30.41.453267 AM'
Copying from file to temp table.
\copy downloaded_file_tmp from 'f.csv' with delimiter ',' CSV;
Copying from temp table to main table.
INSERT INTO downloaded_file
(id,
txt,
tstamp)
SELECT id :: INT,
       txt,
       TO_TIMESTAMP(tstamp, 'dd-mon-yy hh.mi.ss.US AM')
FROM downloaded_file_tmp;
Notice the format specifier US, which represents microseconds (000000-999999).
knayak=# select * from downloaded_file;
id | txt | tstamp
----+---------+----------------------------
1 | 'TEXT1' | 2017-03-20 08:30:41.453267
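The same parse can be sanity-checked outside Postgres, for example with Python's strptime, where %f plays the role of the US specifier:

```python
# Parse the Oracle-style timestamp string; %f consumes the six
# microsecond digits and %p handles the AM/PM marker.
from datetime import datetime

ts = datetime.strptime("20-MAR-17 08.30.41.453267 AM",
                       "%d-%b-%y %I.%M.%S.%f %p")
print(ts.isoformat(sep=" "))  # -> 2017-03-20 08:30:41.453267
```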