PostgreSQL COPY from CSV with delimiter "|"

PostgreSQL COPY from CSV with delimiter "|" - postgresql

please, can anyone help me to solve this problem?
I'd like to create a table in Postgres database with data from CSV file with delimiter "|", while trying to use the command COPY (or Import) I get this error:
ERROR: extra data after last expected column
CONTEXT: COPY twitter, line 2: ""Sono da Via Martignacco
http://t.co/NUC6MP0z|"<a href=""http://foursquare.com"" rel=""nofollow"">f..."
The first 2 lines of CSV:
txt|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
Sono da Via Martignacco http://t.co/NUC6MP0z|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
For this data I have created in Postgres a table "Twitter"
CREATE TABLE public.twitter
(
txt character varying(255),
source character varying(255),
ulang character varying(255),
coords geometry(Point,4326),
tweettime_wtz character varying(255),
country integer,
userid integer NOT NULL,
in_reply_user_id character varying(255),
in_reply_status_id character varying(255),
uname character varying(255),
ucreationdate character varying(255),
utimezone character varying(255),
followers_count integer,
friends_count integer,
x_coords numeric,
y_coords numeric,
CONSTRAINT id PRIMARY KEY (userid)
)
WITH (
OIDS=FALSE
);
ALTER TABLE public.twitter
OWNER TO postgres;
Any ideas, guys?

The destination table contain 16 column, but your file contain have 17 column.
It seems to be the id field who is missing.
try to set you table as:
CREATE TABLE public.twitter
(
txt character varying(255),
source character varying(255),
ulang character varying(255),
coords geometry(Point,4326),
tweettime_wtz character varying(255),
country integer,
id character varying,
userid integer NOT NULL,
in_reply_user_id character varying(255),
in_reply_status_id character varying(255),
uname character varying(255),
ucreationdate character varying(255),
utimezone character varying(255),
followers_count integer,
friends_count integer,
x_coords numeric,
y_coords numeric,
CONSTRAINT twitter_pk PRIMARY KEY (userid)
)
WITH (
OIDS=FALSE
);
Change the data type of the id field as you need it.

My solution:
So the problem was in my CSV file: it has had invisible signs of quotes. I haven't seen them when I opened CSV in Excel, I saw the lines in this way:
txt|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
Sono da Via Martignacco http://t.co/NUC6MP0z|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
But when I opened CSV in notepad I saw it differently:
"txt"|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
"Sono da Via Martignacco http://t.co/NUC6MP0z"|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
"
So i should delete all quotes (in Notepad and saving the file as CSV), so that the text became:
txt|source|ulang|coords|tweettime_wtz|country|id|userid|in_reply_user_id|in_reply_status_id|uname|ucreationdate|utimezone|followers_count|friends_count|x_coords|y_coords
Sono da Via Martignacco http://t.co/NUC6MP0z|<a href=http://foursquare.com rel=nofollow>foursquare</a>|it|0101000020E6100000191CA9E7726F2A4026C1E1269F094740|2012-05-13 10:00:45+02|112|201582743333777411|35445264|||toffo93|2009-04-26 11:00:03|Rome|1044|198|13.21767353|46.07516943
Only after this I was able to use Import tool in pgAdmin without any problem!

Related

Importing csv file using COPY FROM on Mac

Using the query editor in the pgAdmin4 app, I would like to import data from a csv file into a table. My code is as follows:
CREATE DATABASE gps_tracking_db
ENCODING = 'UTF8'
TEMPLATE = template0
LC_COLLATE = 'C'
LC_CTYPE = 'C';
CREATE SCHEMA main;
COMMENT ON SCHEMA main IS 'Schema that stores all the GPS tracking core data.';
CREATE TABLE main.gps_data(
gps_data_id serial,
gps_sensors_code character varying,
line_no integer,
utc_date date,
utc_time time without time zone,
lmt_date date,
lmt_time time without time zone,
ecef_x integer,
ecef_y integer,
ecef_z integer,
latitude double precision,
longitude double precision,
height double precision,
dop double precision,
nav character varying(2),
validated character varying(3),
sats_used integer,
ch01_sat_id integer,
ch01_sat_cnr integer,
ch02_sat_id integer,
ch02_sat_cnr integer,
ch03_sat_id integer,
ch03_sat_cnr integer,
ch04_sat_id integer,
ch04_sat_cnr integer,
ch05_sat_id integer,
ch05_sat_cnr integer,
ch06_sat_id integer,
ch06_sat_cnr integer,
ch07_sat_id integer,
ch07_sat_cnr integer,
ch08_sat_id integer,
ch08_sat_cnr integer,
ch09_sat_id integer,
ch09_sat_cnr integer,
ch10_sat_id integer,
ch10_sat_cnr integer,
ch11_sat_id integer,
ch11_sat_cnr integer,
ch12_sat_id integer,
ch12_sat_cnr integer,
main_vol double precision,
bu_vol double precision,
temp double precision,
easting integer,
northing integer,
remarks character varying
);
COMMENT ON TABLE main.gps_data
IS 'Table that stores raw data as they come from the sensors (plus the ID of
the sensor).';
ALTER TABLE main.gps_data
ADD CONSTRAINT gps_data_pkey
PRIMARY KEY(gps_data_id);
ALTER TABLE main.gps_data
ADD COLUMN insert_timestamp timestamp with time zone
DEFAULT now();
ALTER TABLE main.gps_data
ADD CONSTRAINT unique_gps_data_record
UNIQUE(gps_sensors_code, line_no); /*what does line_no mean?*/
COPY main.gps_data(
gps_sensors_code, line_no, utc_date, utc_time, lmt_date, lmt_time, ecef_x,
ecef_y, ecef_z, latitude, longitude, height, dop, nav, validated, sats_used,
ch01_sat_id, ch01_sat_cnr, ch02_sat_id, ch02_sat_cnr, ch03_sat_id,
ch03_sat_cnr, ch04_sat_id, ch04_sat_cnr, ch05_sat_id, ch05_sat_cnr,
ch06_sat_id, ch06_sat_cnr, ch07_sat_id, ch07_sat_cnr, ch08_sat_id,
ch08_sat_cnr, ch09_sat_id, ch09_sat_cnr, ch10_sat_id, ch10_sat_cnr,
ch11_sat_id, ch11_sat_cnr, ch12_sat_id, ch12_sat_cnr, main_vol, bu_vol,
temp, easting, northing, remarks)
FROM
'/Users/CDDEP/Downloads⁩/Urbano 2014/⁩tracking_db⁩/data⁩/sensors_data⁩/GSM01438.csv'
WITH (FORMAT csv, HEADER, DELIMITER ';')
However, when I run the CREATE FROM command, the following error message is returned:
ERROR: could not open file "/Users/CDDEP/Downloads⁩/Urbano 2014/⁩tracking_db⁩/data⁩/sensors_data⁩/GSM01438.csv" for reading: No such file or directory
HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy.
SQL state: 58P01
I wonder if the error is due to a formatting issue with the Mac filepath or something else.

Make sure the file /Users/CDDEP/Downloads⁩/Urbano 2014/⁩tracking_db⁩/data⁩/sensors_data⁩/GSM01438.csv does exists
Replace COPY main.gps_data with \COPY main.gps_data to use the client side utility

Postgres - how to bulk insert table with foreign keys

I am looking to do a bulk insert into my postgreSQL database.
database is not yet live
postgreSQL 13
I have a temporary staging table which I bulk inserted data
TABLE public.temp_inverter_location
(
id integer ,
inverter_num_in_sld integer,
lift_requirements character varying,
geo_location_id integer NOT NULL (foreign key references geo_location.id),
location_name character varying,
project_info_id integer NOT NULL (foreign key references project_info.id)
)
I am trying to populate the two foreign key columns temp_inverter_location.geo_location_id and temp_inverter_location.project_info_id.
The two referenced tables are referenced by their id columns:
geo_location
CREATE TABLE public.geo_location
(
id integer,
country character varying(50) COLLATE pg_catalog."default",
region character varying(50) COLLATE pg_catalog."default",
city character varying(100) COLLATE pg_catalog."default",
location_name character varying COLLATE pg_catalog."default",
)
and
project_info
CREATE TABLE public.project_info
(
id integer
operation_name character varying,
project_num character varying(10),
grafana_site_num character varying(10)
)
I want to populate the correct foreign keys into the columns temp_inverter_location.geo_location_id and temp_inverter_location.project_info_id.
I am trying to use INSERT INTO SELECT to populate temp_inverter_location.geo_location_id with a JOIN that matches geo_location.location_name and temp_inverter_location.name.
I have tried this query however inverter_location.geo_location_id remains blank:
INSERT INTO temp_inverter_location(geo_location_id) SELECT geo_location.id FROM geo_location INNER JOIN temp_inverter_location ON geo_location.location_name=temp_inverter_location.location_name;
Please let me know if more info is needed, thanks!

I was able to resolve this issue using update referencing another table.
Basically, I updated the geo_location_id column using
UPDATE temp_inverter_location SET geo_location_id = geo_location.id FROM geo_location WHERE geo_location.location_name = temp_inverter_location.location_name;
and updated the project_info_id using
UPDATE load_table SET project_info_id = project_info.id FROM project_info WHERE project_info.operation_name = load_table.location_name;
It seems to have worked.

POSTGRESQL PG/PGSQL- Function with params

I'm having some issues making a function in Postgresql, I have this function:
CREATE OR REPLACE FUNCTION public.isp_ticket(_cr integer, _grupo character varying(255), _numero integer, _descripcion text, _resumen character varying(255), _fechaaper timestamp with time zone, _fechacierr timestamp with time zone, _tipo smallint, _apellidousuarioafectado character varying(255), _apellidosolicitante character varying(255), _tenant character varying(255), _metodoreportado character varying(100), _prioridad smallint, _sla character varying(255), _categoria character varying(255), _estado character varying(255), _herramienta_id integer, _asignado character varying(255), _nombresolicitante character varying(255), _nombreusuarioafectado character varying(255))
RETURNS void AS $$
BEGIN
CASE
WHEN _asignado = '' AND _close_date = '' AND _sla = ''
THEN INSERT INTO public.website_ticket(cr, grupo, numero, descripcion, resumen, fechaaper, tipo, apellidousuarioafectado, apellidosolicitante, tenant, metodoreportado, prioridad, categoria, estado, herramienta_id, nombresolicitante, nombreusuarioafectado) VALUES (_cr, _grupo, _numero, _descripcion, _resumen, _fechaaper, _tipo, _apellidousuarioafectado, _apellidosolicitante, _tenant, _metodoreportado, _prioridad, _categoria, estado, _herramienta_id, _nombresolicitante, _nombreusuarioafectado);
WHEN _asignado = '' AND _close_date = ''
THEN INSERT INTO public.website_ticket(cr, grupo, numero, descripcion, resumen, fechaaper, tipo, apellidousuarioafectado, apellidosolicitante, tenant, metodoreportado, prioridad, sla, categoria, estado, herramienta_id, nombresolicitante, nombreusuarioafectado) VALUES (_cr, _grupo, _numero, _descripcion, _resumen, _fechaaper, _tipo, _apellidousuarioafectado, _apellidosolicitante, _tenant, _metodoreportado, _prioridad, _sla, _categoria, _estado, _herramienta_id, _nombresolicitante, _nombreusuarioafectado);
WHEN new_close_date = ''
THEN INSERT INTO public.website_ticket(cr, grupo, numero, descripcion, resumen, fechaaper, tipo, apellidousuarioafectado, apellidosolicitante, tenant, metodoreportado, prioridad, sla, categoria, estado, herramienta_id, asignado,nombresolicitante, nombreusuarioafectado)
VALUES (_cr, _grupo, _numero, _descripcion, _resumen, _fechaaper, _tipo, _apellidousuarioafectado, _apellidosolicitante, _tenant, _metodoreportado, _prioridad, _sla, _categoria, _estado, _herramienta_id, _asignado, _nombresolicitante, _nombreusuarioafectado);
ELSE
UPDATE public.website_ticket SET fechacierr = _fechacierr WHERE numero = _numero;
END CASE;
END;
$$ LANGUAGE plpgsql;
and when I try to use the function doing this:
SELECT public.isp_ticket(924266,
'EUS_Zona V Region',
512294,
'Nombre: Gisselle Espinoza Contreras\nCorreo: gespinoza#bancoripley.cl
\nAnexo: 6221\nUbicación: Valparaiso\nPais: Chile\nMotivo: Usuario indica
que su computador se apagó repentinamente. Se pudo entrar a windows después
de un buen rato, pero no puede ingresar a las aplicaciones que se conecten a
red.\n\nDirección: Plaza Victoria 1646 - Piso 1 - Banco',
'Valparaiso // Computador con problemas de conexión.',
'2018-01-23 15:17:51',
'',
1,
'Espinoza Contreras',
'Espinoza Contreras',
'Ripley',
'Telephone',
3,
'',
'Ripley.Hardware.Desktop.Falla',
'Open',
1,
'',
'Gissel Rose Marie',
'Gissel Rose Marie')
I tried to CAST every value, and it didn't work either, always appear the same error:
ERROR: no existe la función public.isp_ticket(integer, character varying, integer, text, character varying, timestamp with time zone, unknown, integer, character varying, character varying, character varying, character varying, integer, unknown, character varying, character varying, integer, unknown, character varying, character varying)
LINE 1: SELECT public.isp_ticket(
^
SQL state: 42883
Character: 8
I need help how can I fix it?
Forwards thanks everyone!!!

Parameter #7, _fechacierr should be timestamp with time zone. You can not pass '', change it to null (and cast it to timestamp with time zone) if you need empty value.
And it's worth to read how PostgreSQL finds specific function to call, especially:
If any input arguments are unknown, check the type categories accepted
at those argument positions by the remaining candidates. At each
position, select the string category if any candidate accepts that
category. (This bias towards string is appropriate since an
unknown-type literal looks like a string.) Otherwise, if all the
remaining candidates accept the same type category, select that
category; otherwise fail because the correct choice cannot be deduced
without more clues. Now discard candidates that do not accept the
selected type category. Furthermore, if any candidate accepts a
preferred type in that category, discard candidates that accept
non-preferred types for that argument. Keep all candidates if none
survive these tests. If only one candidate remains, use it; else
continue to the next step.

ERROR: extra data after last expected column - COPY

When I try to import the data with delimiter | I receive the error:
ERROR: extra data after last expected column
I am able to load the data if I remove double quote or single quote from the filed which have issue in the below sample data but my requirement is I need all data without removing any.
This is my copy command:
COPY public.dimingredient FROM '/Users//Downloads/archive1/test.txt'
DELIMITER '|' NULL AS '' CSV HEADER ESCAPE AS '"' ;
My table:
public.dimingredient
(
dr_id integer NOT NULL,
dr_loadtime timestamp(6) without time zone NOT NULL,
dr_start timestamp(6) without time zone NOT NULL,
dr_end timestamp(6) without time zone NOT NULL,
dr_current boolean NOT NULL,
casnumber character varying(100) COLLATE pg_catalog."default" NOT NULL,
ingredientname character varying(300) COLLATE pg_catalog."default" NOT NULL,
matchingstrategy character varying(21) COLLATE pg_catalog."default",
percentofconfidence double precision,
disclosurestatus character varying(42) COLLATE pg_catalog."default",
issand character varying(1) COLLATE pg_catalog."default",
sandmeshsize character varying(20) COLLATE pg_catalog."default",
sandquality character varying(20) COLLATE pg_catalog."default",
isresincoated character varying(1) COLLATE pg_catalog."default",
isartificial character varying(1) COLLATE pg_catalog."default",
CONSTRAINT dimingredient_pkey PRIMARY KEY (dr_id)
)
my data:
5144|2016-07-01 13:34:25.1001891|1900-01-01 00:00:00.0000000|9999-12-31 23:59:59.9999999|True|93834|"9-octadecenamide,n,n-bis(2-hydroxyethyl)-, (9z)"|"NO CAS MATCH FOUND"||Disclosed|||||
5145|2016-07-01 13:34:25.1001891|1900-01-01 00:00:00.0000000|9999-12-31 23:59:59.9999999|True|93834|"9-octadecenamide,n,n-bis-2(hydroxy-ethyl)-,(z)""|"NO CAS MATCH FOUND"||Disclosed|||||

Omitting the empty line in your dample data, I get a different error message with 9.6, to wit:
ERROR: unterminated CSV quoted field
CONTEXT: COPY dimingredient, line 3: "5145|2016-07-01 13:34:25.1001891|1900-01-01 00:00:00.0000000|9999-12-31 23:59:59.9999999|True|93834|..."
Strangely enough, that error message has been there since CSV COPY was introduced in version 8.0, so I wonder how your data are different from the data you show above.
The error message is easily explained: There is an odd number of quotation characters (") in the second line.
Since two doubled quotes in a quoted string are interpreted as a single double quote (" is escaped as ""), the fields in the second line are:
5145
2016-07-01 13:34:25.1001891
1900-01-01 00:00:00.0000000
9999-12-31 23:59:59.9999999
True
93834
9-octadecenamide,n,n-bis-2(hydroxy-ethyl)-,(z)"|NO CAS MATCH FOUND||Disclosed|||||
... and then COPY hits the end of file while parsing a quoted string. Hence the error.
The solution is to use an even number of " characters per field.
If you need a " character in a field, either choose a different QUOTE or quote the field and double the ".

Cascade Delete is not work

I am not able to delete record from parent table of PostGres DB.. Any one of you can get me an idea on this.
-- Table: tbl_patient
-- DROP TABLE tbl_patient;
CREATE TABLE tbl_patient
(
patient_id bigserial NOT NULL,
date_of_birth date NOT NULL,
fathers_name character varying(255) NOT NULL,
first_name character varying(255) NOT NULL,
last_name character varying(255),
marital_status character varying(255),
mobile_number character varying(255) NOT NULL,
occupation character varying(255),
phone_number character varying(255),
pregnancy_status character varying(255),
sex character varying(255) NOT NULL,
CONSTRAINT tbl_patient_pkey PRIMARY KEY (patient_id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE tbl_patient
OWNER TO postgres;
-- Table: tbl_address
CREATE TABLE tbl_address
(
address_id bigserial NOT NULL,
address_line_1 character varying(255) NOT NULL,
address_line_2 character varying(255),
city character varying(255),
country character varying(255),
district character varying(255) NOT NULL,
pincode character varying(255) NOT NULL,
state character varying(255),
street character varying(255),
patient_id bigint,
CONSTRAINT tbl_address_pkey PRIMARY KEY (address_id),
CONSTRAINT fk_slc6pgeimmox5buka8bydy6c4 FOREIGN KEY (patient_id)
REFERENCES tbl_patient (patient_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
ALTER TABLE tbl_address
OWNER TO postgres;
//-------------------------------------------------------------------
When I put this command
DELETE FROM tbl_patient
WHERE patient_id = 1;
I got this error below
ERROR: update or delete on table "tbl_patient" violates foreign key
constraint "fk_slc6pgeimmox5buka8bydy6c4" on table "tbl_address" SQL
state: 23503 Detail: Key (patient_id)=(1) is still referenced from
table "tbl_address".

You write DELETE NO ACTION and you wanting actions:) Just need to change to
REFERENCES tbl_patient (patient_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE
NO ACTION means that server wont do anything with referenced rows if they exists. Since they exists and you specified also MATCH SIMPLE to one-column-foreign key then PostgreSQL cannot perform deletion because of that referenced rows.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

PostgreSQL COPY from CSV with delimiter "|" - postgresql

Related

Importing csv file using COPY FROM on Mac

Postgres - how to bulk insert table with foreign keys

POSTGRESQL PG/PGSQL- Function with params

ERROR: extra data after last expected column - COPY

Cascade Delete is not work

Categories

Resources