Importing a CSV file into PostgreSQL using pgAdmin 4 - postgresql

I'm trying to import a CSV file into my PostgreSQL database but I get this error:
ERROR: invalid input syntax for integer: "id;date;time;latitude;longitude"
CONTEXT: COPY test, line 1, column id: "id;date;time;latitude;longitude"
My CSV file is simple:
id;date;time;latitude;longitude
12980;2015-10-22;14:13:44.1430000;59,86411203;17,64274849
The table is created with the following code:
CREATE TABLE kordinater.test
(
id integer NOT NULL,
date date,
"time" time without time zone,
latitude real,
longitude real
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE kordinater.test
OWNER to postgres;

You can use the Import/Export option for this task.
Right-click on your table
Select the "Import/Export" option
Provide the proper options (Format: csv, Delimiter: ';', Header: yes)
Click the OK button
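Under the hood this is roughly equivalent to a COPY. A minimal sketch, assuming the file sits at the hypothetical path used in the answer below:
-- Sketch only: the source file is semicolon-delimited, so the delimiter must be ';'.
COPY kordinater.test(id, date, "time", latitude, longitude)
FROM 'C:\tmp\yourfile.csv'
DELIMITER ';' CSV HEADER;
-- The comma decimal separators in latitude/longitude will still be rejected by the
-- real columns; the temporary-table answer further down handles that conversion.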

You should try this, it must work:
COPY kordinater.test(id,date,time,latitude,longitude)
FROM 'C:\tmp\yourfile.csv' DELIMITER ',' CSV HEADER;
Your CSV must be separated by commas, not semicolons; or try changing the id column type to bigint.

I believe the quickest way to overcome this issue is to create an intermediary temporary table, so that you can import your data and cast the coordinates as you please.
Create a similar temporary table with the problematic columns as text:
CREATE TEMPORARY TABLE tmp
(
id integer,
date date,
time time without time zone,
latitude text,
longitude text
);
And import your file using COPY:
COPY tmp FROM '/path/to/file.csv' DELIMITER ';' CSV HEADER;
Once you have your data in the tmp table, you can cast the coordinates and insert them into the test table with this command:
INSERT INTO test (id, date, time, latitude, longitude)
SELECT id, date, time, replace(latitude,',','.')::numeric, replace(longitude,',','.')::numeric from tmp;
One more thing:
Since you're working with geographic coordinates, I sincerely recommend taking a look at PostGIS. It is quite easy to install and makes your life much easier once you start doing calculations with geospatial data.
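For instance, a minimal sketch of what that could look like once the data is loaded (the geom column name is my own choice):
-- Enable PostGIS and build point geometries from the imported coordinates.
CREATE EXTENSION IF NOT EXISTS postgis;
ALTER TABLE kordinater.test ADD COLUMN geom geometry(Point, 4326);
UPDATE kordinater.test
SET geom = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);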

Related

value too long for type character varying(512) -- why can't the data be imported?

The maximum size of limited character types (e.g. varchar(n)) in Postgres is 10485760.
description on max length of postgresql's varchar
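If you want to confirm that limit yourself, a quick sketch (the exact error text may vary by version):
-- Asking for one character more than the maximum is rejected:
CREATE TABLE too_long_varchar (c varchar(10485761));
-- ERROR:  length for type varchar cannot exceed 10485760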
Please download the file for testing and extract it to /tmp/2019q4; we only use pre.txt to import data.
sample data
Enter your psql and create a database:
postgres=# create database edgar;
postgres=# \c edgar;
Create the table according to the webpage:
fields in pre table definitions
edgar=# create table pre(
id serial ,
adsh varchar(20),
report numeric(6,0),
line numeric(6,0),
stmt varchar(2),
inpth boolean,
rfile char(1),
tag varchar(256),
version varchar(20),
plabel varchar(512),
negating boolean
);
CREATE TABLE
Try to import data:
edgar=# \copy pre(adsh,report,line,stmt,inpth,rfile,tag,version,plabel,negating) from '/tmp/2019q4/pre.txt' with delimiter E'\t' csv header;
We analyse the error info:
ERROR: value too long for type character varying(512)
CONTEXT: COPY pre, line 1005798, column plabel: "LIABILITIES AND STOCKHOLDERS EQUITY 0
0001493152-19-017173 2 11 BS 0 H LiabilitiesAndStockholdersEqu..."
Time: 1481.566 ms (00:01.482)
1. The size I set for the field is just 512, far less than 10485760.
2. The content of line 1005798 is not the same as in the error info:
0001654954-19-012748 6 20 EQ 0 H ReclassificationAdjustmentRelatingToAvailableforsaleSecuritiesNetOfTaxEffect 0001654954-19-012748 Reclassification adjustment relating to available-for-sale securities, net of tax effect" 0
Now I drop the previous table, change the plabel field to text, and re-create it:
edgar=# drop table pre;
DROP TABLE
Time: 22.763 ms
edgar=# create table pre(
id serial ,
adsh varchar(20),
report numeric(6,0),
line numeric(6,0),
stmt varchar(2),
inpth boolean,
rfile char(1),
tag varchar(256),
version varchar(20),
plabel text,
negating boolean
);
CREATE TABLE
Time: 81.895 ms
Import the same data with the same copy command:
edgar=# \copy pre(adsh,report,line,stmt,inpth,rfile,tag,version,plabel,negating) from '/tmp/2019q4/pre.txt' with delimiter E'\t' csv header;
COPY 275079
Time: 2964.898 ms (00:02.965)
edgar=#
There is no error info in the psql console. Let me check the raw data '/tmp/2019q4/pre.txt', which contains 1043000 lines:
wc -l /tmp/2019q4/pre.txt
1043000 /tmp/2019q4/pre.txt
There are 1043000 lines, so how many rows were imported?
edgar=# select count(*) from pre;
count
--------
275079
(1 row)
Why was so little data imported, with no error info?
The sample data you provided is obviously not the data you are really loading. It does still show the same error, but of course the line numbers and markers are different.
That file occasionally has double quote marks where there should be single quote marks (apostrophes). Because you are using CSV mode, these stray double quotes will start multi-line strings, which span all the way until the next stray double quote mark. That is why you have fewer rows of data than lines of input, because some of the data values are giant multiline strings.
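You can see this in the rows that did load; a quick check (a sketch) for those giant values:
-- Mis-parsed rows show up as abnormally long plabel values:
SELECT adsh, length(plabel) AS plabel_len
FROM pre
ORDER BY length(plabel) DESC
LIMIT 5;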
Since your data clearly isn't CSV, you probably shouldn't be using \copy in CSV format. It loads fine in text format as long as you specify "header", although that option didn't become available in text format until v15. For versions before that, you could manually remove the header line, or use PROGRAM to skip the header, like FROM PROGRAM 'tail +2 /tmp/pre.txt'. Alternatively, you could keep using CSV format but choose a different quote character, one that never shows up in your data, such as with (delimiter E'\t', format csv, header, quote E'\b').
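Spelled out, the two alternatives would look roughly like this (sketches; adjust paths and column list to your setup):
-- Text format with a header line (the header option for text format needs v15+):
\copy pre(adsh,report,line,stmt,inpth,rfile,tag,version,plabel,negating) from '/tmp/2019q4/pre.txt' with (format text, delimiter E'\t', header)
-- Or keep CSV format but use a quote character that never appears in the data:
\copy pre(adsh,report,line,stmt,inpth,rfile,tag,version,plabel,negating) from '/tmp/2019q4/pre.txt' with (format csv, delimiter E'\t', header, quote E'\b')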

stl_load_errors returning invalid timestamp format I can't figure out

I'm trying to use the copy function to create a table in Redshift. I've set up this particular field that keeps failing as a standard timestamp in my schema, because I don't know why it would be anything else. But when I run this statement:
copy sample_table
from 's3://aws-bucket/data_push_2018-10-05.txt'
credentials 'aws_access_key_id=XXXXXXXXXXXXXXXXXXXX;aws_secret_access_key=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/XXX'
dateformat 'auto'
ignoreheader 1;
It keeps returning this error: Invalid timestamp format or value [YYYY-MM-DD HH24:MI:SS]
raw_field_value: "2018-08-29 15:04:52"
raw_line: 12039752|311525|"67daf211abbe11e8b0010a28385dd2bc"|98953|"2018-08-20"|"2018-11-30"|"active"|"risk"|||||||"sample"|15750|0|"2018-08-29 15:04:52"|"2018-08-29 16:05:01"
There is a very similar table in our database (that I did not make) which has this field as a timestamp, with values identical to 2018-08-29 15:04:52, so what is happening when I run this that's causing the issue?
Your COPY command seems OK, but it looks like you are missing the FORMAT AS CSV QUOTE AS '"' and DELIMITER AS '|' parameters; with those it should work.
Here I'm using some sample data and a command to prove my case. To keep things simple, I made the table minimal, but it covers all your data points.
create table sample_table(
salesid integer not null,
category varchar(100),
created_at timestamp,
update_at timestamp );
Here is the sample data, test_file.csv:
12039752|"67daf211abbe11e8b0010a28385dd2bc"|"2018-08-29 11:04:52"|"2018-08-29 14:05:01"
12039754|"67daf211abbe11e8b0010a2838cccddbc"|"2018-08-29 15:04:52"|"2018-08-29 16:05:01"
12039755|"67daf211abbe11e8b0010a28385ff2bc"|"2018-08-29 12:04:52"|"2018-08-29 13:05:01"
12039756|"67daf211abbe11e8b0010a28385bb2bc |"2018-08-29 10:04:52"|"2018-08-29 15:05:01"
Here is the COPY command:
COPY sample_table FROM 's3://path/to/csv/test_file.csv' CREDENTIALS 'aws_access_key_id=XXXXXXXXXXX;aws_secret_access_key=XXXXXXXXX' FORMAT as CSV QUOTE AS '"' DELIMITER AS '|';
It returns:
INFO: Load into table 'sample_table' completed, 4 record(s) loaded successfully.
COPY
This command works fine, but if there are more issues with your data you could try the MAXERROR option as well.
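For example, to let the load tolerate up to 10 bad rows (a sketch reusing the command above; the threshold is arbitrary):
COPY sample_table FROM 's3://path/to/csv/test_file.csv' CREDENTIALS 'aws_access_key_id=XXXXXXXXXXX;aws_secret_access_key=XXXXXXXXX' FORMAT as CSV QUOTE AS '"' DELIMITER AS '|' MAXERROR AS 10;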
Hope it answers your question.

Can I get Unix timestamp automatically converted to a TIMESTAMP column when importing from CSV to a PostgreSQL database?

This is basically a duplicate of this with s/mysql/postgresql/g.
I created a table that has a TIMESTAMP column named timestamp, and I am trying to import data from CSV files whose rows carry Unix timestamps.
However, when I try to COPY the file into the table, I get errors to the tune of
2:1: conversion failed: "1394755260" to timestamp
3:1: conversion failed: "1394755320" to timestamp
4:1: conversion failed: "1394755800" to timestamp
5:1: conversion failed: "1394755920" to timestamp
Obviously this works if I set the column to be INT.
In the MySQL variant, I solved it with a trick like
LOAD DATA LOCAL INFILE 'file.csv'
INTO TABLE raw_data
fields terminated by ','
lines terminated by '\n'
IGNORE 1 LINES
(@timestamp, other_column)
SET timestamp = FROM_UNIXTIME(@timestamp),
third_column = 'SomeSpecialValue'
;
Note two things: I can map the @timestamp variable from the CSV file through a function to turn it into a proper DATETIME, and I can set extra columns to certain values (this is necessary because I have more columns in the database than in the CSV).
I'm switching to postgresql because mysql lacks some functions that would make my life much easier with the queries I need to write.
Is there a way of configuring the table so that the conversion happens automatically?
I think you could accomplish this by importing the data as-is, creating a second column with the converted timestamp and then using a trigger to make sure any time a row is inserted it populates the new column:
alter table raw_table
add column time_stamp timestamp;
CREATE OR REPLACE FUNCTION raw_table_insert()
RETURNS trigger AS
$BODY$
BEGIN
NEW.time_stamp = timestamp 'epoch' + NEW.unix_time_stamp * interval '1 second';
return NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
CREATE TRIGGER insert_raw_table_trigger
BEFORE INSERT
ON raw_table
FOR EACH ROW
EXECUTE PROCEDURE raw_table_insert();
If the timestamp column can be modified, then you will want to make sure the trigger applies to updates as well.
Alternatively, you can create a view that generates the timestamp on the fly, but the advantages/disadvantages depend on how often you search on the column, how large the table is going to be and how much DML you expect.
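A minimal sketch of that view alternative, reusing the conversion expression from the trigger (the view name is mine; if you go this route, skip the added time_stamp column above):
-- Compute the timestamp on the fly instead of storing it:
CREATE VIEW raw_table_with_ts AS
SELECT r.*,
       timestamp 'epoch' + r.unix_time_stamp * interval '1 second' AS time_stamp
FROM raw_table r;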

How to store string spaces as null in numeric column

I want to load records from my local txt file into a PostgreSQL table.
I have created the following table:
create table player_info
(
Name varchar(20),
City varchar(30),
State varchar(30),
DateOfTour date,
pay numeric(5),
flag char
)
And my local txt file contains the following data:
John|Mumbai| |20170203|55555|Y
David|Mumbai| |20170305| |N
Romcy|Mumbai| |20170405|55555|N
Gotry|Mumbai| |20170708| |Y
I am just executing this:
copy player_info (Name,
City,
State,
DateOfTour,
pay_id,
flag)
from local 'D:\sample_player_info.txt'
delimiter '|' null as ''
exceptions 'D:\Logs\player_info'
What I want is: for my numeric column, if the field is 3 spaces, then I have to insert NULL as pay; otherwise the 5-digit number.
pay is a column in my table whose datatype is numeric.
Is this correct, or is it possible to do this?
You cannot store strings in a numeric column, at all. 3 spaces is a string, so it cannot be stored in the column pay as that is defined as numeric.
A common approach to this conundrum is to create a staging table which uses less precise data types in the column definitions. Import the source data into the staging table. Then process that data so that it can be reliably added to the final table. e.g. in the staging table set a column called pay_str to NULL where pay_str = ' ' (or perhaps LIKE ' %')
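A sketch of that staging approach in plain PostgreSQL (the staging table name is mine; NULLIF turns the trimmed blank into NULL before the cast):
CREATE TABLE player_info_staging
(
Name varchar(20),
City varchar(30),
State varchar(30),
DateOfTour date,
pay_str text,
flag char
);
-- In psql, load the pipe-delimited file (forward slashes avoid escaping issues):
\copy player_info_staging from 'D:/sample_player_info.txt' with (format text, delimiter '|')
-- Then move the rows into the final table, converting blanks to NULL:
INSERT INTO player_info (Name, City, State, DateOfTour, pay, flag)
SELECT Name, City, State, DateOfTour,
       NULLIF(trim(pay_str), '')::numeric,
       flag
FROM player_info_staging;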

ERROR: extra data after last expected column in postgres table

I am working on a project where I need to create a new table, then import data from a CSV. I've read many similar questions ("extra data after last expected column") and answers on StackOverflow, but I still haven't found the culprit.
CREATE TABLE colleges2014_15 (
unitid integer,
intsnm text,
city text,
stabbr text,
zip_clean char,
control integer,
latitude float,
longitude float,
tutionfee_in float,
tuitionfee_out float,
pctpell float,
inc_pct_lo float,
dep_stat_pct_ind float,
dep_debt_mdn float,
ind_debt_mdn float,
pell_debt_mdn float,
ugds_men float,
ubds_women float,
locale integer,
PRIMARY KEY(unitid)
);
The table is created successfully with the 19 different columns. Then I try to import the data into the new table.
COPY colleges2014_15(
unitid,
intsnm,
city,
stabbr,
zip_clean,
control,
latitude,
longitude,
tutionfee_in,
tuitionfee_out,
pctpell,
inc_pct_lo,
dep_stat_pct_ind,
dep_debt_mdn,
ind_debt_mdn,
pell_debt_mdn,
ugds_men,
ubds_women,
locale
)
FROM '/Users/compose/Downloads/CollegeScorecard_Raw_Data x/MERGED2014_15_cleaned.csv' CSV HEADER
;
And I get the error message. I've done the following in the CSV:
Made sure it's saved as UTF-8 CSV (working on a Mac)
Already cleaned out all commas in every row
Cleaned out all NULL values
Confirmed that all the data types (integer, float, text, etc.) are correct
I've tried to simply COPY only the first column, unitid; it failed. I've tried importing only the second column (intsnm) and it failed with the same error.
The full error message when trying to COPY over all 19 columns is as follows:
An error occurred when executing the SQL command: COPY
colleges2014_15( unitid, intsnm, city, stabbr, zip_clean,
control, latitude, longitude, tutionfee_in, tuitionfee_out,
pctpell, inc_pct_...
ERROR: extra data after last expected column Where: COPY
colleges2014_15, line 2: "100654,Alabama A & M
University,Normal,AL,35762,35762,1,34.783368,-86.568502,9096,16596,0.7356,0.651..."
1 statement failed.
Execution time: 0.03s
The full error message when trying to copy simply the first column only is:
An error occurred when executing the SQL command: COPY
colleges2014_15( unitid ) FROM
'/Users/compose/Downloads/CollegeScorecard_Raw_Data
x/MERGED2014_15_cleaned.csv' CSV HEADER
ERROR: extra data after last expected column Where: COPY
colleges2014_15, line 2: "100654,Alabama A & M
University,Normal,AL,35762,35762,1,34.783368,-86.568502,9096,16596,0.7356,0.651..."
1 statement failed.
Execution time: 0.01s
Hugely appreciate any help.
It took me a while to figure out what was wrong when searching on this error, so I have posted my problem to help others. My issue was inexperience with pgAdmin:
pgAdmin requires the table to be created with its columns before the data is imported. I had expected the headers from the .csv file to be used; most other packages I have used work this way.
If you are working with a GIS system using PostGIS there is an easy solution. I am using QGIS 3.4, with Postgres and PostGIS installed.
In QGIS
Select Database menu option
Select DBManager
On left - choose location for table
Select Import Layer/File
On the next window select the following
Input - choose file
Table - enter table name
OK
Your CSV has an extra ZIP column (notice 35762 appears twice in the error line) which your table and COPY statement omit.
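One hedged way to fix it, assuming the extra field is the raw zip sitting next to zip_clean in the file (the column name zip is my own):
-- Add the missing column, then list it in COPY in the position it occupies in the
-- CSV (next to zip_clean, judging by the duplicated 35762 in the error line).
ALTER TABLE colleges2014_15 ADD COLUMN zip text;
COPY colleges2014_15(
unitid, intsnm, city, stabbr, zip, zip_clean, control,
latitude, longitude, tutionfee_in, tuitionfee_out, pctpell,
inc_pct_lo, dep_stat_pct_ind, dep_debt_mdn, ind_debt_mdn,
pell_debt_mdn, ugds_men, ubds_women, locale
)
FROM '/Users/compose/Downloads/CollegeScorecard_Raw_Data x/MERGED2014_15_cleaned.csv' CSV HEADER;
-- Note: zip_clean is declared as plain char (i.e. char(1)), so a 5-digit value will
-- still be rejected there; widening it to text may be needed as well.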