ERROR: extra data after last expected column - COPY - postgresql

When I try to import the data with delimiter | I receive the error:
ERROR: extra data after last expected column
I am able to load the data if I remove the double or single quotes from the field that causes the issue in the sample data below, but my requirement is to load all the data without removing anything.
This is my copy command:
COPY public.dimingredient FROM '/Users//Downloads/archive1/test.txt'
DELIMITER '|' NULL AS '' CSV HEADER ESCAPE AS '"' ;
My table:
public.dimingredient
(
dr_id integer NOT NULL,
dr_loadtime timestamp(6) without time zone NOT NULL,
dr_start timestamp(6) without time zone NOT NULL,
dr_end timestamp(6) without time zone NOT NULL,
dr_current boolean NOT NULL,
casnumber character varying(100) COLLATE pg_catalog."default" NOT NULL,
ingredientname character varying(300) COLLATE pg_catalog."default" NOT NULL,
matchingstrategy character varying(21) COLLATE pg_catalog."default",
percentofconfidence double precision,
disclosurestatus character varying(42) COLLATE pg_catalog."default",
issand character varying(1) COLLATE pg_catalog."default",
sandmeshsize character varying(20) COLLATE pg_catalog."default",
sandquality character varying(20) COLLATE pg_catalog."default",
isresincoated character varying(1) COLLATE pg_catalog."default",
isartificial character varying(1) COLLATE pg_catalog."default",
CONSTRAINT dimingredient_pkey PRIMARY KEY (dr_id)
)
my data:
5144|2016-07-01 13:34:25.1001891|1900-01-01 00:00:00.0000000|9999-12-31 23:59:59.9999999|True|93834|"9-octadecenamide,n,n-bis(2-hydroxyethyl)-, (9z)"|"NO CAS MATCH FOUND"||Disclosed|||||
5145|2016-07-01 13:34:25.1001891|1900-01-01 00:00:00.0000000|9999-12-31 23:59:59.9999999|True|93834|"9-octadecenamide,n,n-bis-2(hydroxy-ethyl)-,(z)""|"NO CAS MATCH FOUND"||Disclosed|||||

Omitting the empty line in your sample data, I get a different error message with 9.6, to wit:
ERROR: unterminated CSV quoted field
CONTEXT: COPY dimingredient, line 3: "5145|2016-07-01 13:34:25.1001891|1900-01-01 00:00:00.0000000|9999-12-31 23:59:59.9999999|True|93834|..."
Strangely enough, that error message has been there since CSV COPY was introduced in version 8.0, so I wonder how your data are different from the data you show above.
The error message is easily explained: There is an odd number of quotation characters (") in the second line.
Since two doubled quotes in a quoted string are interpreted as a single double quote (" is escaped as ""), the fields in the second line are:
5145
2016-07-01 13:34:25.1001891
1900-01-01 00:00:00.0000000
9999-12-31 23:59:59.9999999
True
93834
9-octadecenamide,n,n-bis-2(hydroxy-ethyl)-,(z)"|NO CAS MATCH FOUND||Disclosed|||||
... and then COPY hits the end of file while parsing a quoted string. Hence the error.
The solution is to use an even number of " characters per field.
If you need a " character in a field, either choose a different QUOTE or quote the field and double the ".
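For example, assuming the stray trailing quote really belongs to the ingredient name, the second data line parses cleanly once the embedded " is doubled inside the quoted field (the field then holds the literal value 9-octadecenamide,n,n-bis-2(hydroxy-ethyl)-,(z)"):
5145|2016-07-01 13:34:25.1001891|1900-01-01 00:00:00.0000000|9999-12-31 23:59:59.9999999|True|93834|"9-octadecenamide,n,n-bis-2(hydroxy-ethyl)-,(z)"""|"NO CAS MATCH FOUND"||Disclosed|||||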

Is there any solution for errors like the one I experienced when uploading a CSV file?

So I have already written code like this:
-- SCHEMA: Portofolio2
-- DROP SCHEMA IF EXISTS "Portofolio2" ;
CREATE SCHEMA IF NOT EXISTS "Portofolio2"
AUTHORIZATION postgres;
-- Table: Portofolio2.vidgames
-- DROP TABLE IF EXISTS "Portofolio2"."vidgames";
CREATE TABLE IF NOT EXISTS "Portofolio2"."vidgames"
(
index character varying COLLATE pg_catalog."default",
Rank character varying COLLATE pg_catalog."default",
Game_Title character varying COLLATE pg_catalog."default",
Platform character varying COLLATE pg_catalog."default",
Year character varying COLLATE pg_catalog."default",
Genre character varying COLLATE pg_catalog."default",
Publisher character varying COLLATE pg_catalog."default",
North_America character varying COLLATE pg_catalog."default",
Europe character varying COLLATE pg_catalog."default",
Japan character varying COLLATE pg_catalog."default",
Rest_of_World character varying COLLATE pg_catalog."default",
Global character varying COLLATE pg_catalog."default",
Review character varying COLLATE pg_catalog."default"
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE IF EXISTS "Portofolio2"."vidgames"
OWNER to postgres;
Copy "Portofolio2"."vidgames" ("index","rank","game_title","platform","year","genre","publisher","north_america","europe","japan","rest_of_world","global","review") from 'C:\Users\Admin\Downloads\Portofolio\Video Games Sales\Video Games Sales.csv' WITH DELIMITER ',' CSV HEADER QUOTE '''' ;
But this error occurs:
NOTICE: schema "Portofolio2" already exists, skipping
NOTICE: relation "vidgames" already exists, skipping
ERROR: extra data after last expected column
CONTEXT: COPY vidgames, line 1261: ""1259,1260,""WarioWare, Inc.: Mega MicroGame$"",GBA,2003.0,Puzzle,Nintendo,0.4,0.11,0.7,0.02,1.23,76..."
SQL state: 22P04
Can anyone explain why?
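A likely culprit, judging from the error context: the file quotes its fields with double quotes, but the command declares QUOTE '''' (a single quote), so the comma inside "WarioWare, Inc.: Mega MicroGame$" is read as a column separator and produces an extra column. A sketch of the same command with the quote character matching the file (an untested assumption about the file's format):
COPY "Portofolio2"."vidgames" ("index","rank","game_title","platform","year","genre","publisher","north_america","europe","japan","rest_of_world","global","review")
FROM 'C:\Users\Admin\Downloads\Portofolio\Video Games Sales\Video Games Sales.csv'
WITH DELIMITER ',' CSV HEADER QUOTE '"' ;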

type "hstore" is only a shell

I am trying to set up automatic audit logging in Postgres using triggers and trigger functions. For this I want to create the table logged_actions in the audit schema. When I run the following query:
CREATE TABLE IF NOT EXISTS audit.logged_actions
(
event_id bigint NOT NULL DEFAULT nextval('audit.logged_actions_event_id_seq'::regclass),
schema_name text COLLATE pg_catalog."default" NOT NULL,
table_name text COLLATE pg_catalog."default" NOT NULL,
relid oid NOT NULL,
session_user_name text COLLATE pg_catalog."default",
action_tstamp_tx timestamp with time zone NOT NULL,
action_tstamp_stm timestamp with time zone NOT NULL,
action_tstamp_clk timestamp with time zone NOT NULL,
transaction_id bigint,
application_name text COLLATE pg_catalog."default",
client_addr inet,
client_port integer,
client_query text COLLATE pg_catalog."default",
action text COLLATE pg_catalog."default" NOT NULL,
row_data hstore,
changed_fields hstore,
statement_only boolean NOT NULL,
CONSTRAINT logged_actions_pkey PRIMARY KEY (event_id),
CONSTRAINT logged_actions_action_check CHECK (action = ANY (ARRAY['I'::text, 'D'::text, 'U'::text, 'T'::text]))
)
I have already created the extension "hstore", but the query is not executed and an error message appears stating:
ERROR: type "hstore" is only a shell
LINE 17: row_data hstore
That's a cryptic way of saying the hstore extension isn't loaded. You need to create extension hstore before you can use it.
Note that jsonb more-or-less makes hstore obsolete.
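For reference, that command is the following; it must be run in the same database where audit.logged_actions will be created:
CREATE EXTENSION IF NOT EXISTS hstore;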

Postgres - how to bulk insert table with foreign keys

I am looking to do a bulk insert into my PostgreSQL database.
The database is not yet live.
PostgreSQL 13
I have a temporary staging table into which I bulk inserted data:
CREATE TABLE public.temp_inverter_location
(
id integer,
inverter_num_in_sld integer,
lift_requirements character varying,
geo_location_id integer NOT NULL REFERENCES public.geo_location (id),
location_name character varying,
project_info_id integer NOT NULL REFERENCES public.project_info (id)
)
I am trying to populate the two foreign key columns temp_inverter_location.geo_location_id and temp_inverter_location.project_info_id.
The two referenced tables are referenced by their id columns:
geo_location
CREATE TABLE public.geo_location
(
id integer,
country character varying(50) COLLATE pg_catalog."default",
region character varying(50) COLLATE pg_catalog."default",
city character varying(100) COLLATE pg_catalog."default",
location_name character varying COLLATE pg_catalog."default"
)
and
project_info
CREATE TABLE public.project_info
(
id integer,
operation_name character varying,
project_num character varying(10),
grafana_site_num character varying(10)
)
I want to populate the correct foreign keys into the columns temp_inverter_location.geo_location_id and temp_inverter_location.project_info_id.
I am trying to use INSERT INTO ... SELECT to populate temp_inverter_location.geo_location_id with a JOIN that matches geo_location.location_name and temp_inverter_location.location_name.
I have tried this query, however temp_inverter_location.geo_location_id remains blank:
INSERT INTO temp_inverter_location(geo_location_id) SELECT geo_location.id FROM geo_location INNER JOIN temp_inverter_location ON geo_location.location_name=temp_inverter_location.location_name;
Please let me know if more info is needed, thanks!
I was able to resolve this issue using an UPDATE referencing another table (the INSERT had been adding new rows instead of filling the columns of the existing ones).
Basically, I updated the geo_location_id column using
UPDATE temp_inverter_location SET geo_location_id = geo_location.id FROM geo_location WHERE geo_location.location_name = temp_inverter_location.location_name;
and updated the project_info_id using
UPDATE load_table SET project_info_id = project_info.id FROM project_info WHERE project_info.operation_name = load_table.location_name;
It seems to have worked.
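One hypothetical sanity check after running the updates (not part of the original answer): look for staging rows whose names found no match, since those keys will still be NULL:
SELECT location_name FROM temp_inverter_location WHERE geo_location_id IS NULL OR project_info_id IS NULL;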

How can I update a row in PostgreSQL?

I have a table named person and it looks like this:
CREATE TABLE public.person
(
"idPerson" integer NOT NULL DEFAULT nextval('idperson'::regclass),
fname character varying(45) COLLATE pg_catalog."default",
lname character varying(45) COLLATE pg_catalog."default",
sex character(1) COLLATE pg_catalog."default",
dateofbirth date,
address character varying(75) COLLATE pg_catalog."default",
city character varying(45) COLLATE pg_catalog."default",
country character varying(45) COLLATE pg_catalog."default",
CONSTRAINT "idPerson_PK" PRIMARY KEY ("idPerson")
)
I want to perform an update through a function: my_update_function(fname, lname, sex, dateofbirth, address, city, country). My problem here is that I don't want a specific condition; instead I want to just call the function like this, for example:
SELECT my_update_function('Jamie','Phillips','F','1973-03-08','Santina Island 108','Okhotsk','Russia')
and have it update the row with idPerson 57, changing only the column that is different (fname in this case).
This is what I did:
UPDATE person
SET fname=my_update_function.fname , lname=my_update_function.lname
,sex=my_update_function.sex , dateofbirth=my_update_function.dateofbirth
,address=my_update_function.address , city=my_update_function.city
,country=my_update_function.country
WHERE person.fname='Karissa';
I updated my table, but the problem is that I had to hard-code the specific fname 'Karissa' inside the function; instead I want it to be done automatically.
How can I do something like that?
Thank you.
Maybe you can try something like this:
execute 'UPDATE person
SET fname='''||my_update_function.fname||''', lname='''||my_update_function.lname||'''
,sex='''||my_update_function.sex||''', dateofbirth='''||my_update_function.dateofbirth||'''
,address='''||my_update_function.address||''', city='''||my_update_function.city||'''
,country='''||my_update_function.country||'''
WHERE person.fname='''||my_update_function.fname||'''';
or (better, because it is not vulnerable to SQL injection)
PREPARE query AS UPDATE person SET fname=$1, lname=$2, sex=$3, dateofbirth=$4, address=$5, city=$6, country=$7 WHERE person.fname=$8;
EXECUTE query(fname, lname, sex, dateofbirth, address, city, country, fname);
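For what it's worth, inside a PL/pgSQL function a plain parameterized UPDATE needs no dynamic SQL at all, since function parameters are bound as values. A minimal sketch under that assumption (the p_-prefixed parameter names are my own):
CREATE OR REPLACE FUNCTION my_update_function(
    p_fname varchar, p_lname varchar, p_sex character,
    p_dateofbirth date, p_address varchar, p_city varchar, p_country varchar)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    -- parameters are bound values, so this is not vulnerable to SQL injection
    UPDATE person
    SET lname = p_lname, sex = p_sex, dateofbirth = p_dateofbirth,
        address = p_address, city = p_city, country = p_country
    WHERE person.fname = p_fname;
END;
$$;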

PostgreSQL COPY from CSV with delimiter "|"

Please, can anyone help me solve this problem?
I'd like to create a table in a Postgres database with data from a CSV file with delimiter "|". While trying to use the COPY command (or Import), I get this error:
ERROR: extra data after last expected column
CONTEXT: COPY twitter, line 2: ""Sono da Via Martignacco
http://t.co/NUC6MP0z|"<a href=""http://foursquare.com"" rel=""nofollow"">f..."
The first 2 lines of CSV:
txt|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
Sono da Via Martignacco http://t.co/NUC6MP0z|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
For this data I have created in Postgres a table "twitter":
CREATE TABLE public.twitter
(
txt character varying(255),
source character varying(255),
ulang character varying(255),
coords geometry(Point,4326),
tweettime_wtz character varying(255),
country integer,
userid integer NOT NULL,
in_reply_user_id character varying(255),
in_reply_status_id character varying(255),
uname character varying(255),
ucreationdate character varying(255),
utimezone character varying(255),
followers_count integer,
friends_count integer,
x_coords numeric,
y_coords numeric,
CONSTRAINT id PRIMARY KEY (userid)
)
WITH (
OIDS=FALSE
);
ALTER TABLE public.twitter
OWNER TO postgres;
Any ideas, guys?
The destination table contains 16 columns, but your file contains 17.
It seems to be the id field that is missing.
Try to set up your table as:
CREATE TABLE public.twitter
(
txt character varying(255),
source character varying(255),
ulang character varying(255),
coords geometry(Point,4326),
tweettime_wtz character varying(255),
country integer,
id character varying,
userid integer NOT NULL,
in_reply_user_id character varying(255),
in_reply_status_id character varying(255),
uname character varying(255),
ucreationdate character varying(255),
utimezone character varying(255),
followers_count integer,
friends_count integer,
x_coords numeric,
y_coords numeric,
CONSTRAINT twitter_pk PRIMARY KEY (userid)
)
WITH (
OIDS=FALSE
);
Change the data type of the id field as needed.
My solution:
So the problem was in my CSV file: it contained quote characters that I hadn't noticed. I didn't see them when I opened the CSV in Excel; there I saw the lines this way:
txt|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
Sono da Via Martignacco http://t.co/NUC6MP0z|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
But when I opened the CSV in Notepad I saw it differently:
"txt"|"source"|"ulang"|"coords"|"tweettime_wtz"|"country"|"id"|"userid"|"in_reply_user_id"|"in_reply_status_id"|"uname"|"ucreationdate"|"utimezone"|"followers_count"|"friends_count"|"x_coords"|"y_coords"
"Sono da Via Martignacco http://t.co/NUC6MP0z"|"foursquare"|"it"|"0101000020E6100000191CA9E7726F2A4026C1E1269F094740"|"2012-05-13 10:00:45+02"|112|201582743333777411|35445264|""|""|"toffo93"|"2009-04-26 11:00:03"|"Rome"|1044|198|13.21767353|46.07516943
"
So I had to delete all the quotes (in Notepad, saving the file as CSV), so that the text became:
txt|source|ulang|coords|tweettime_wtz|country|id|userid|in_reply_user_id|in_reply_status_id|uname|ucreationdate|utimezone|followers_count|friends_count|x_coords|y_coords
Sono da Via Martignacco http://t.co/NUC6MP0z|<a href=http://foursquare.com rel=nofollow>foursquare</a>|it|0101000020E6100000191CA9E7726F2A4026C1E1269F094740|2012-05-13 10:00:45+02|112|201582743333777411|35445264|||toffo93|2009-04-26 11:00:03|Rome|1044|198|13.21767353|46.07516943
Only after this was I able to use the Import tool in pgAdmin without any problem!
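As an aside, stripping the quotes by hand should not normally be necessary: declaring the quote character lets COPY consume them, along the lines of this sketch (hypothetical file path; assumes the table already has the extra id column from the answer above):
COPY public.twitter FROM '/path/to/tweets.csv'
WITH (FORMAT csv, HEADER, DELIMITER '|', QUOTE '"', NULL '');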