Importing csv file using COPY FROM on Mac - postgresql

Using the query editor in the pgAdmin4 app, I would like to import data from a csv file into a table. My code is as follows:
CREATE DATABASE gps_tracking_db
ENCODING = 'UTF8'
TEMPLATE = template0
LC_COLLATE = 'C'
LC_CTYPE = 'C';
CREATE SCHEMA main;
COMMENT ON SCHEMA main IS 'Schema that stores all the GPS tracking core data.';
CREATE TABLE main.gps_data(
gps_data_id serial,
gps_sensors_code character varying,
line_no integer,
utc_date date,
utc_time time without time zone,
lmt_date date,
lmt_time time without time zone,
ecef_x integer,
ecef_y integer,
ecef_z integer,
latitude double precision,
longitude double precision,
height double precision,
dop double precision,
nav character varying(2),
validated character varying(3),
sats_used integer,
ch01_sat_id integer,
ch01_sat_cnr integer,
ch02_sat_id integer,
ch02_sat_cnr integer,
ch03_sat_id integer,
ch03_sat_cnr integer,
ch04_sat_id integer,
ch04_sat_cnr integer,
ch05_sat_id integer,
ch05_sat_cnr integer,
ch06_sat_id integer,
ch06_sat_cnr integer,
ch07_sat_id integer,
ch07_sat_cnr integer,
ch08_sat_id integer,
ch08_sat_cnr integer,
ch09_sat_id integer,
ch09_sat_cnr integer,
ch10_sat_id integer,
ch10_sat_cnr integer,
ch11_sat_id integer,
ch11_sat_cnr integer,
ch12_sat_id integer,
ch12_sat_cnr integer,
main_vol double precision,
bu_vol double precision,
temp double precision,
easting integer,
northing integer,
remarks character varying
);
COMMENT ON TABLE main.gps_data
IS 'Table that stores raw data as they come from the sensors (plus the ID of
the sensor).';
ALTER TABLE main.gps_data
ADD CONSTRAINT gps_data_pkey
PRIMARY KEY(gps_data_id);
ALTER TABLE main.gps_data
ADD COLUMN insert_timestamp timestamp with time zone
DEFAULT now();
ALTER TABLE main.gps_data
ADD CONSTRAINT unique_gps_data_record
UNIQUE(gps_sensors_code, line_no); /*what does line_no mean?*/
COPY main.gps_data(
gps_sensors_code, line_no, utc_date, utc_time, lmt_date, lmt_time, ecef_x,
ecef_y, ecef_z, latitude, longitude, height, dop, nav, validated, sats_used,
ch01_sat_id, ch01_sat_cnr, ch02_sat_id, ch02_sat_cnr, ch03_sat_id,
ch03_sat_cnr, ch04_sat_id, ch04_sat_cnr, ch05_sat_id, ch05_sat_cnr,
ch06_sat_id, ch06_sat_cnr, ch07_sat_id, ch07_sat_cnr, ch08_sat_id,
ch08_sat_cnr, ch09_sat_id, ch09_sat_cnr, ch10_sat_id, ch10_sat_cnr,
ch11_sat_id, ch11_sat_cnr, ch12_sat_id, ch12_sat_cnr, main_vol, bu_vol,
temp, easting, northing, remarks)
FROM
'/Users/CDDEP/Downloads⁩/Urbano 2014/⁩tracking_db⁩/data⁩/sensors_data⁩/GSM01438.csv'
WITH (FORMAT csv, HEADER, DELIMITER ';')
However, when I run the COPY FROM command, the following error message is returned:
ERROR: could not open file "/Users/CDDEP/Downloads⁩/Urbano 2014/⁩tracking_db⁩/data⁩/sensors_data⁩/GSM01438.csv" for reading: No such file or directory
HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy.
SQL state: 58P01
I wonder if the error is due to a formatting issue with the Mac filepath or something else.

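Make sure the file named in the error message actually exists on the machine where the PostgreSQL server runs; COPY FROM reads the file from the server's file system.
Alternatively, in psql, replace COPY main.gps_data with \copy main.gps_data to use the client-side facility mentioned in the hint.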
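As a rough sketch of the client-side route, the import could look like this when run from psql rather than from the pgAdmin query editor. \copy is a psql meta-command and has to be written on a single line. The path below is retyped by hand and is an assumption about the intended location: the path as pasted in the question appears to contain invisible Unicode formatting characters after some directory names, and if those are really part of the query, the server would look for a path that does not exist.

```sql
-- Run in psql, not in the pgAdmin query editor. The file is read by the client.
-- Assumption: the path was retyped without any invisible characters.
\copy main.gps_data(gps_sensors_code, line_no, utc_date, utc_time, lmt_date, lmt_time, ecef_x, ecef_y, ecef_z, latitude, longitude, height, dop, nav, validated, sats_used, ch01_sat_id, ch01_sat_cnr, ch02_sat_id, ch02_sat_cnr, ch03_sat_id, ch03_sat_cnr, ch04_sat_id, ch04_sat_cnr, ch05_sat_id, ch05_sat_cnr, ch06_sat_id, ch06_sat_cnr, ch07_sat_id, ch07_sat_cnr, ch08_sat_id, ch08_sat_cnr, ch09_sat_id, ch09_sat_cnr, ch10_sat_id, ch10_sat_cnr, ch11_sat_id, ch11_sat_cnr, ch12_sat_id, ch12_sat_cnr, main_vol, bu_vol, temp, easting, northing, remarks) FROM '/Users/CDDEP/Downloads/Urbano 2014/tracking_db/data/sensors_data/GSM01438.csv' WITH (FORMAT csv, HEADER, DELIMITER ';')
```

Retyping the path by hand instead of pasting it is a cheap way to rule out hidden characters before changing anything else.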

Related

Can't import CSV files in pgAdmin 4

I want to import CSV data into PostgreSQL via pgAdmin 4, but I get this error:
ERROR: invalid input syntax for type integer: ""
CONTEXT: COPY films, line 1, column gross: ""
I understand the error: in line 1 the gross column is empty, and some other columns also contain empty values. My question is how to import a CSV file that contains such empty values. I have searched Google but have not found a case similar to mine.
CREATE TABLE public.films
(
id int,
title varchar,
release_year float4,
country varchar,
duration float4,
language varchar,
certification varchar,
gross int,
budget int
);
I also tried the code below, but it failed:
CREATE TABLE public.films
(
id int,
title varchar,
release_year float4 null,
country varchar null,
duration float4 null,
language varchar null,
certification varchar null,
gross float4 null,
budget float4 null
);
I've searched on Google and on the Stack Overflow forums. I hope someone can help solve my problem.
There is no difference between the two table definitions. A column accepts NULL by default.
The issue is not a NULL value but an empty string:
select ''::integer;
ERROR: invalid input syntax for type integer: ""
LINE 1: select ''::integer;
select null::integer;
int4
------
NULL
Create a staging table that has data type of varchar for the fields that are now integer. Load the data into that table. Then modify the empty string data that will be integer using something like:
update table set gross = nullif(trim(gross), '');
Then move the data to the production table.
This is not a pgAdmin4 issue; it is a data issue. I worked in psql because it is easier to follow:
CREATE TABLE public.films_text
(
id varchar,
title varchar,
release_year varchar,
country varchar,
duration varchar,
language varchar,
certification varchar,
gross varchar,
budget varchar
);
\copy films_text from '~/Downloads/films.csv' with csv
COPY 4968
CREATE TABLE public.films
(
id int,
title varchar,
release_year float4,
country varchar,
duration float4,
language varchar,
certification varchar,
gross int,
budget int
);
-- Below done because of this value 12215500000 in budget column
alter table films alter COLUMN budget type int8;
INSERT INTO films
SELECT
id::int,
title,
nullif(trim(release_year), '')::real,
country,
nullif(trim(duration), '')::real,
language,
certification,
nullif(trim(gross), '')::float,
nullif(trim(budget), '')::float
FROM
films_text;
INSERT 0 4968
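If the empty values appear in the file as quoted empty strings (""), which the error text suggests but does not prove, a staging table may not be needed. The sketch below uses COPY's FORCE_NULL option (available from PostgreSQL 9.4) to read quoted empty strings as NULL for the listed columns. The file path is the one used in the answer above. Unlike the nullif(trim(...)) approach, this would not catch values that consist only of spaces.

```sql
-- budget is widened first because of the large value mentioned above.
ALTER TABLE films ALTER COLUMN budget TYPE int8;

-- Assumption: the empty values in the file are written as "" (quoted empty strings).
-- FORCE_NULL matches those columns against the null string even when quoted.
\copy films FROM '~/Downloads/films.csv' WITH (FORMAT csv, FORCE_NULL (release_year, duration, gross, budget))
```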
It worked for me:
https://learnsql.com/blog/how-to-import-csv-to-postgresql/
A small workaround, but it works:
I created a table.
I added headers to the CSV file.
Right-click the newly created table -> Import/Export Data, select the CSV file to upload, go to the second tab, select Header, and it should work.

Cannot read CSV file with pgAdmin

I want to read a CSV file named "tripdata" that is on my desktop. I wrote the code below, but I always get this error:
ERROR: invalid input syntax for integer: "NULL"
CONTEXT: COPY tripdata, line 4, column birth_year: "NULL"
SQL state: 22P02
I do not know what the problem is. I have read other CSV files the same way.
CREATE TABLE public."tripdata" (tripduration integer,
starttime timestamp,
stoptime timestamp,
start_station_id integer,
start_station_name varchar(100),
start_station_latitude float,
start_station_longituder float,
end_station_id integer,
end_station_name varchar(100),
end_station_latitude float,
end_station_longituder float,
bikeid integer,
usertime varchar(100),
birth_year integer,
gender varchar(100));
SELECT * FROM public."tripdata";
COPY public."tripdata" FROM 'C:\Users\Pc\Desktop\tripdata.csv' DELIMITER ',' CSV HEADER;
select * from tripdata;
I believe you will have to tell COPY what NULL is.
https://www.postgresql.org/docs/10/sql-copy.html
NULL
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format.
So in your case:
COPY ... NULL AS 'NULL';
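Spelled out against the command from the question, that suggestion might look like the sketch below. It uses the parenthesized option syntax, and the path and table name are taken from the question as-is.

```sql
-- Treat the literal text NULL in the file as a SQL NULL for every column.
COPY public."tripdata"
FROM 'C:\Users\Pc\Desktop\tripdata.csv'
WITH (FORMAT csv, HEADER, DELIMITER ',', NULL 'NULL');
```

Note that this replaces the default CSV null string, so unquoted empty fields would then be read as empty strings rather than NULL.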

crosstab function in postgresql causes invalid memory alloc request size

PostgreSQL 9.5.10, RAM = 8GB
I have a table with three columns (ID, category, anzahl (= count)). The table has around 132 million rows. There are 58 unique values in the category column, i.e. 58 different categories.
Similar to the example demonstrated in "PostgreSQL Crosstab Query", I want to create a pivot with the ID and the 58 categories as columns (59 columns in all), with the rows populated with the respective count values. Below is the query:
select * into sde.demographie100m_transposed
from crosstab(
'select gitter_id_100m, category, anzahl
from sde.demographie100m_3col
order by 1,2',
'select distinct category from sde.demographie100m_3col order by 1'
)
AS ct
("gitter_id_100m" text,
"INSGESAMT_Einheiten insgesamt" integer,
"ALTER_10JG_10 - 19" integer,
"ALTER_10JG_20 - 29" integer,
"ALTER_10JG_30 - 39" integer,
"ALTER_10JG_40 - 49" integer,
"ALTER_10JG_50 - 59" integer,
"ALTER_10JG_60 - 69" integer,
"ALTER_10JG_70 - 79" integer,
"ALTER_10JG_80 und älter" integer,
"ALTER_10JG_Unter 10" integer,
"ALTER_KURZ_18 - 29" integer,
"ALTER_KURZ_30 - 49" integer,
"ALTER_KURZ_50 - 64" integer,
"ALTER_KURZ_65 und älter" integer,
"ALTER_KURZ_Unter 18" integer,
"FAMSTND_AUSF_Eingetr. Lebenspartner/-in verstorben" integer,
"FAMSTND_AUSF_Eingetr. Lebenspartnerschaft" integer,
"FAMSTND_AUSF_Eingetr. Lebenspartnerschaft aufgehoben" integer,
"FAMSTND_AUSF_Geschieden" integer,
"FAMSTND_AUSF_Ledig" integer,
"FAMSTND_AUSF_Ohne Angabe" integer,
"FAMSTND_AUSF_Verheiratet" integer,
"FAMSTND_AUSF_Verwitwet" integer,
"GEBURTLAND_GRP_Deutschland" integer,
"GEBURTLAND_GRP_EU27-Land" integer,
"GEBURTLAND_GRP_Sonstige" integer,
"GEBURTLAND_GRP_Sonstige Welt" integer,
"GEBURTLAND_GRP_Sonstiges Europa" integer,
"GESCHLECHT_Männlich" integer,
"GESCHLECHT_Weiblich" integer,
"RELIGION_KURZ_Evangelische Kirche (öffentlich-rechtlich)" integer,
"RELIGION_KURZ_Römisch-katholische Kirche (öffentlich-rechtlich)" integer,
"RELIGION_KURZ_Sonstige, keine, ohne Angabe" integer,
"STAATSANGE_GRP_Deutschland" integer,
"STAATSANGE_GRP_EU27-Land" integer,
"STAATSANGE_GRP_Sonstige" integer,
"STAATSANGE_GRP_Sonstige Welt" integer,
"STAATSANGE_GRP_Sonstiges Europa" integer,
"STAATSANGE_HLND_Bosnien und Herzegowina" integer,
"STAATSANGE_HLND_Deutschland" integer,
"STAATSANGE_HLND_Griechenland" integer,
"STAATSANGE_HLND_Italien" integer,
"STAATSANGE_HLND_Kasachstan" integer,
"STAATSANGE_HLND_Kroatien" integer,
"STAATSANGE_HLND_Niederlande" integer,
"STAATSANGE_HLND_Österreich" integer,
"STAATSANGE_HLND_Polen" integer,
"STAATSANGE_HLND_Rumänien" integer,
"STAATSANGE_HLND_Russische Föderation" integer,
"STAATSANGE_HLND_Sonstige" integer,
"STAATSANGE_HLND_Türkei" integer,
"STAATSANGE_HLND_Ukraine" integer,
"STAATSANGE_KURZ_Ausland" integer,
"STAATSANGE_KURZ_Deutschland" integer,
"STAATZHL_Eine Staatsangehörigkeit" integer,
"STAATZHL_Mehrere Staatsangehörigkeiten, deutsch und ausländisch" integer,
"STAATZHL_Mehrere Staatsangehörigkeiten, nur ausländisch" integer,
"STAATZHL_Nicht bekannt" integer
);
But it results in the error below:
ERROR: invalid memory alloc request size 1073741824
SQL Status:XX000
Kontext:SQL statement "select gitter_id_100m, category, anzahl
from sde.demographie100m_3col
order by 1,2"
Try the canonical form instead:
SELECT gitter_id_100m,
SUM(CASE when category='INSGESAMT_Einheiten insgesamt' then anzahl END) AS "INSGESAMT_Einheiten insgesamt",
SUM(CASE when category='ALTER_10JG_10 - 19' then anzahl END) AS "ALTER_10JG_10 - 19",
...etc...
FROM sde.demographie100m_3col
GROUP BY 1
ORDER BY 1; -- remove the ORDER BY if you can do without it.
Presumably that form would be much easier (than crosstab) for the server to spill to disk if necessary as opposed to generating the entire result in memory.
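Typing 58 SUM(CASE ...) lines by hand is error-prone, so one option is to let the database generate them. The query below is only a sketch: it builds the select list as text from the distinct categories, using format() with %L to quote each category as a string literal and %I to quote it as a column alias. The result would still have to be pasted into the SELECT ... GROUP BY query above.

```sql
-- Generates one "SUM(CASE ...) AS ..." line per distinct category.
-- Scanning 132M rows for DISTINCT values may take a while.
SELECT string_agg(
         format('SUM(CASE WHEN category = %L THEN anzahl END) AS %I',
                category, category),
         E',\n' ORDER BY category
       ) AS select_list
FROM (SELECT DISTINCT category
      FROM sde.demographie100m_3col) AS c;
```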
You may also use a SQL cursor to retrieve the result in chunks. In some cases it can help a lot with the memory consumption, both client-side and server-side.
Client-side code to use a cursor:
BEGIN; -- open transaction
DECLARE mycursor CURSOR FOR SELECT ... rest of the query;
FETCH mycursor; -- retrieve 1 row
-- repeat FETCH mycursor until no more rows are returned
CLOSE mycursor;
COMMIT;
There's also a dynamic_pivot function on github that can be used to automate the above (creates the pivot query and returns a cursor to it), but I'm not sure how its implementation would behave performance-wise with 132M rows.
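To make the cursor idea a little more concrete, here is a minimal sketch that combines it with the SUM(CASE ...) form and fetches in batches instead of one row at a time. The cursor name pivot_cur and the batch size of 10000 are arbitrary choices for illustration, and only two of the 58 categories are written out.

```sql
BEGIN;
DECLARE pivot_cur NO SCROLL CURSOR FOR
    SELECT gitter_id_100m,
           SUM(CASE WHEN category = 'INSGESAMT_Einheiten insgesamt' THEN anzahl END)
               AS "INSGESAMT_Einheiten insgesamt",
           SUM(CASE WHEN category = 'ALTER_10JG_10 - 19' THEN anzahl END)
               AS "ALTER_10JG_10 - 19"
           -- the remaining categories would follow the same pattern
    FROM sde.demographie100m_3col
    GROUP BY 1;
FETCH FORWARD 10000 FROM pivot_cur;  -- repeat until no rows are returned
CLOSE pivot_cur;
COMMIT;
```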

`ERROR: value too long for type character(2)` when running `\copy`

I set up a table like this.
CREATE TABLE IF NOT EXISTS details (
CountyCode CHAR(3) NOT NULL,
VoterID CHAR(10) NOT NULL UNIQUE,
NameLast TEXT,
NameSuffix TEXT,
NameFirst TEXT,
NameMiddle TEXT,
PublicRecordExemption CHAR(1),
ResidenceAddressLine1 TEXT,
ResidenceAddressLine2 TEXT,
ResidenceCity TEXT,
ResidenceState TEXT,
ResidenceZipcode TEXT,
MailingAddressLine1 TEXT,
MailingAddressLine2 TEXT,
MailingAddressLine3 TEXT,
MailingCity TEXT,
MailingState CHAR(2),
MailingZipcode TEXT,
MailingCountry TEXT,
Gender CHAR(1),
Race CHAR(1),
BirthDate CHAR(10),
RegistrationDate CHAR(10),
PartyAffiliation CHAR(3),
Precinct CHAR(6),
PrecinctGroup CHAR(3),
PrecinctSplit CHAR(6),
PrecinctSuffix CHAR(3),
VoterStatus CHAR(3),
CongressionalDistrict CHAR(3),
HouseDistrict CHAR(3),
SenateDistrict CHAR(3),
CountyCommissionDistrict CHAR(3),
SchoolBoardDistrict CHAR(2),
DaytimeAreaCode CHAR(3),
DaytimePhoneNumber CHAR(7),
DaytimePhoneExtension CHAR(4),
Emailaddress TEXT
);
I ran this command to import data from tab-delimited file Detail.txt.
\copy details FROM Detail.txt;
After a few seconds, the command line console spits out this error.
ERROR: value too long for type character(2)
CONTEXT: COPY details, line 449121, column mailingstate: "273707216".
Here's line 449121 copied into a pastebin.
The error indicates that psql tries to read the value 273707216 into the mailingstate column, which is two characters long. I thought psql would instead read NC into that column.
Why does PSQL read this line wrong?
I think the problem resides in these fields of that row.
Hand Dr \ Melbourne
The backslash in that data is immediately followed by a tab. In COPY's text format a backslash escapes the character that follows it, so the tab is probably treated as literal data rather than as a field delimiter. That "eats" a column and shifts the remaining fields, which puts the mailingZipcode value into the mailingState column.
Try removing that backslash and re-importing the row.
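If editing the file is not practical, another possibility is to read it in CSV mode, where a backslash is ordinary data rather than an escape character. The sketch below rests on an assumption that would need checking against the real file: that the backspace character never occurs in the data, so using it as QUOTE effectively switches off CSV quoting.

```sql
-- Sketch only: tab-delimited input read in CSV mode so "\" is not special.
-- Assumption: E'\b' (backspace) does not appear anywhere in Detail.txt.
\copy details FROM 'Detail.txt' WITH (FORMAT csv, DELIMITER E'\t', QUOTE E'\b')
```

One behavioral difference to keep in mind: in CSV mode an empty field is read as NULL by default, whereas in text format it is read as an empty string.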

PostgreSQL COPY from CSV with delimiter "|"

Can anyone help me solve this problem?
I'd like to fill a table in a Postgres database with data from a CSV file that uses "|" as the delimiter. When I try to use the COPY command (or Import), I get this error:
ERROR: extra data after last expected column
CONTEXT: COPY twitter, line 2: ""Sono da Via Martignacco
http://t.co/NUC6MP0z|"<a href=""http://foursquare.com"" rel=""nofollow"">f..."
The first 2 lines of CSV:
txt|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
Sono da Via Martignacco http://t.co/NUC6MP0z|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
For this data I have created a table "twitter" in Postgres:
CREATE TABLE public.twitter
(
txt character varying(255),
source character varying(255),
ulang character varying(255),
coords geometry(Point,4326),
tweettime_wtz character varying(255),
country integer,
userid integer NOT NULL,
in_reply_user_id character varying(255),
in_reply_status_id character varying(255),
uname character varying(255),
ucreationdate character varying(255),
utimezone character varying(255),
followers_count integer,
friends_count integer,
x_coords numeric,
y_coords numeric,
CONSTRAINT id PRIMARY KEY (userid)
)
WITH (
OIDS=FALSE
);
ALTER TABLE public.twitter
OWNER TO postgres;
Any ideas, guys?
The destination table contains 16 columns, but your file contains 17.
It seems that the id field is the one that is missing.
Try defining your table as:
CREATE TABLE public.twitter
(
txt character varying(255),
source character varying(255),
ulang character varying(255),
coords geometry(Point,4326),
tweettime_wtz character varying(255),
country integer,
id character varying,
userid integer NOT NULL,
in_reply_user_id character varying(255),
in_reply_status_id character varying(255),
uname character varying(255),
ucreationdate character varying(255),
utimezone character varying(255),
followers_count integer,
friends_count integer,
x_coords numeric,
y_coords numeric,
CONSTRAINT twitter_pk PRIMARY KEY (userid)
)
WITH (
OIDS=FALSE
);
Change the data type of the id field as you need it.
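With the table matching the file's 17 fields, the import itself could be sketched as follows. The file path is a placeholder, not something given in the question. In CSV format the double quote is the default quote character and a doubled quote inside a quoted field stands for a literal quote, so in principle quoted fields such as "foursquare" should be readable without editing the file.

```sql
-- '/path/to/twitter.csv' is a hypothetical path; substitute the real location.
-- Assumes the table has one column per field in the file, in the same order.
COPY public.twitter
FROM '/path/to/twitter.csv'
WITH (FORMAT csv, HEADER, DELIMITER '|');
```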
My solution:
So the problem was in my CSV file: it contained quote characters that were invisible to me. I did not see them when I opened the CSV in Excel, where the lines looked like this:
txt|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
Sono da Via Martignacco http://t.co/NUC6MP0z|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
But when I opened CSV in notepad I saw it differently:
"txt"|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
"Sono da Via Martignacco http://t.co/NUC6MP0z"|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
"
So I had to delete all the quotes (in Notepad, saving the file as CSV) so that the text became:
txt|source|ulang|coords|tweettime_wtz|country|id|userid|in_reply_user_id|in_reply_status_id|uname|ucreationdate|utimezone|followers_count|friends_count|x_coords|y_coords
Sono da Via Martignacco http://t.co/NUC6MP0z|<a href=http://foursquare.com rel=nofollow>foursquare</a>|it|0101000020E6100000191CA9E7726F2A4026C1E1269F094740|2012-05-13 10:00:45+02|112|201582743333777411|35445264|||toffo93|2009-04-26 11:00:03|Rome|1044|198|13.21767353|46.07516943
Only after this was I able to use the Import tool in pgAdmin without any problem!