How to import data from csv to postgresql table that has an auto_increment first column or field - postgresql

Hi, I am trying to import data into a PostgreSQL table from a CSV file. How do I exclude the first column, since it is an identity column that increments automatically when data is inserted?
The error I get is shown in the attached screenshot.

If you are using a script to add or modify data, you should just skip that column in the script (i.e. not write it in the INSERT statement); but if you do so, you should also modify your CSV to delete the column (the data, if any, and its separator, usually a comma), since the number of fields would otherwise no longer match.
Judging by your screenshot, I suppose you are using pgAdmin or a similar GUI. In pgAdmin, if you right-click on the table and select the Import/Export Data... option, a window opens where you should select "Import" and then, in the upper menu of that window, click on "Columns" and exclude "ID" or any other auto-increment column. This solution also requires you to remove that column from the CSV.
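For reference, the equivalent at the SQL level is a COPY with an explicit column list that leaves out the auto-increment column (the table and column names below are hypothetical):
-- "id" is not listed, so PostgreSQL fills it from the identity sequence
COPY mytable (name, city) FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER);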

Related

Basic questions about Cloud SQL

I'm trying to populate a Cloud SQL database using a Cloud Storage bucket, but I'm getting some errors. The CSV has the headers (or column names) as the first row and does not have all the columns (some columns in the database can be null, so I'm loading only the data I need for now).
The database is PostgreSQL, and this is the first database in GCP I'm trying to configure, so I'm a little bit confused.
Does it matter if the CSV file has the column names?
Does the order of the columns in the CSV file matter? (I guess it does if not all of them are present in the CSV.)
The PK of the table is a serial number, which I'm not including in the CSV file. Do I need to include the PK as well? I mean, because it's a serial number it should be auto-assigned, right?
Sorry for the noob questions and thanks in advance :)
This is all covered by the COPY documentation.
It matters in that you will have to specify the HEADER option so that the first line is skipped:
[...] on input, the first line is ignored.
The order matters, and if the CSV file does not contain all the columns in the same order as the table, you have to specify them with COPY:
COPY mytable (col1, col2, col4, ...) FROM '/dir/afile' WITH (...);
Same as above: if you omit a table column in the column list, it will be filled with the default value, which in this case is the autogenerated number.
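Putting the pieces together, a minimal sketch (the table, column, and file names are assumptions):
-- Serial PK is omitted from both the CSV and the column list
CREATE TABLE mytable (
    id   serial PRIMARY KEY,  -- auto-assigned, not present in the CSV
    col1 text,
    col2 text,
    col4 integer
);
-- HEADER skips the first line; unlisted columns get their defaults
COPY mytable (col1, col2, col4) FROM '/dir/afile' WITH (FORMAT csv, HEADER);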

Preconfigure column types when using DataGrip to import from CSV

I'm using DataGrip 2016.3 to connect to a PostgreSQL server.
When I right-click and choose Import From File, DataGrip makes assumptions about the type associated with each column (see the image for what DataGrip defaults to in the import dialog). I'd like to specify that column A is VARCHAR(50), column B is INT, column C is DATE, and so on. I will be uploading similar files multiple times, and I'd like to avoid having to specify my types each time I import. Is there a way to save and select configurations of columns A, B, and C's types?
The best way I've found to accomplish this is to create the destination table first, then right-click on the table in the Database tool window to launch the "Import Data from File..." wizard.
This is also great if you want the destination table to contain additional columns, such as an auto-incrementing id column.
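For example, pre-creating the destination table along these lines pins the types before the import (the table and column names are hypothetical, based on the columns described above):
CREATE TABLE my_import (
    id       serial PRIMARY KEY, -- extra auto-incrementing column, not in the CSV
    column_a varchar(50),
    column_b integer,
    column_c date
);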

Can I import CSV data into a table without knowing the columns of the CSV?

I have a CSV file file.csv.
In Postgres, I have made a table named grants:
CREATE TABLE grants
(
)
WITH (
OIDS=FALSE
);
ALTER TABLE grants
OWNER TO postgres;
I want to import file.csv data without having to specify columns in Postgres.
But if I run COPY grants FROM '/PATH/TO/grants.csv' CSV HEADER;, I get this error: ERROR: extra data after last expected column.
How do I import the CSV data without having to specify columns and types?
The error is expected: you created a table with no columns, and COPY tries to import data into a table whose structure matches the file.
So you have to create a table corresponding to your CSV file before executing the COPY command.
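For instance, if the first line of file.csv were id,title,amount, a matching table could be created first (these column names and types are assumptions for illustration):
CREATE TABLE grants (
    id     integer,
    title  text,
    amount numeric
);
-- Now the column count and order match the file
COPY grants FROM '/PATH/TO/grants.csv' CSV HEADER;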
I discovered pgfutter:
"Import CSV and JSON into PostgreSQL the easy way. This small tool abstract all the hassles and swearing you normally have to deal with when you just want to dump some data into the database"
Perhaps a solution ...
The best method for me was to convert the CSV to a pandas dataframe and then follow
https://github.com/sp-anna-jones/data_science/wiki/Importing-pandas-dataframe-to-postgres
No, it is not possible using the COPY command:
If a list of columns is specified, COPY will only copy the data in the
specified columns to or from the file. If there are any columns in the
table that are not in the column list, COPY FROM will insert the
default values for those columns.
COPY does not create columns for you.

PostgreSQL leaving headers blank while importing CSV

I'm trying to import a CSV file with column names "Zip Code", "2010 Population", "Land-Sq-Mi" and "Density per Sq Mile" into my test table, which is named derp--that's why I have the drop statement at the beginning, so I don't replicate any rows and can start clean in each iteration.
Code is as follows:
DROP TABLE derp;
CREATE TABLE public.derp("Zip Code" varchar, "2010 Population" integer, "Land-Sq-Mi" numeric, "Density Per Sq Mile" numeric);
COPY derp("Zip Code", "2010 Population", "Land-Sq-Mi", "Density Per Sq Mile")
FROM '/home/michael/PycharmProjects/cmsDataProject/Zipcode-ZCTA-Population-Density-And-Area-Unsorted.csv'
DELIMITER ','
CSV HEADER;
This does a fine job of importing the actual data, but it leaves the column headers blank in the pgadmin III data view. I looked at the source file in Nano--the headers are there, and if they weren't the query would have thrown a syntax error telling me that there was no relation for the column I was trying to import into.
Any ideas about what I'm doing wrong?
Edit: I would like pgadmin III data view to display the header names, and possibly a way to verify that the columns are actually named even if they aren't being imported and not displayed. To reiterate, every row after the headers is intact and in view, just the header row is blank.
Edit 2: When I CREATE TABLE public.derp(); and then manually add the columns, they show correctly in the data view. Something about the multi-line query statement was causing the breakage.
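For the record, a sketch of that workaround in SQL (the columns could equally be added through the pgAdmin GUI):
CREATE TABLE public.derp();
ALTER TABLE public.derp ADD COLUMN "Zip Code" varchar;
ALTER TABLE public.derp ADD COLUMN "2010 Population" integer;
ALTER TABLE public.derp ADD COLUMN "Land-Sq-Mi" numeric;
ALTER TABLE public.derp ADD COLUMN "Density Per Sq Mile" numeric;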
So pgadmin is not showing the column names but it's showing the data?
If you open a table in pgadmin, then alter the table, but keep the table window open, it seems to lose the column names.
Close the window with the table. Click the tables icon in the pgadmin tree view and refresh the tables, and reopen the table window.

Using IMPORT instead of LOAD in DB2

I wanted to prepare a load utility to load data into a DB2 table. The table has columns defined with the GENERATED ALWAYS attribute, so I am not able to re-load data that was previously exported from the table.
Is it possible to use IMPORT for tables having columns with GENERATED ALWAYS set?
Steps I did:
1. db2 "export to tbl.txt of del modified by coldel| select * from <schema.table> where col=value"
2. db2 "delete from <schema.table> where col=value"
3. db2 "import from tbl.txt of del modified by coldel| allow write access warningcount1 insert into <schema.table>"
The GENERATED ALWAYS columns have new values after the import. Is it possible to use IMPORT so that these columns keep their old values?
Appreciate the assistance.
Thanks,
Mathew Liju
What you are asking is not possible: with IMPORT you can't override columns that are GENERATED ALWAYS. As @Peter Miehle suggests, you could alter the table to specify that the column is GENERATED BY DEFAULT, but this may break other applications.
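A sketch of that alteration, assuming the generated column is named ID (DB2 for LUW syntax):
db2 "alter table <schema.table> alter column id set generated by default"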
Your question's title implies that you don't want to use the LOAD utility (although you don't mention it in the actual question). However, LOAD is the only way to write data into the table while maintaining the values for the generated columns as they exist in the file:
db2 "load from tbl.txt of del modified by generatedoverride insert into schema.table"
If you do this, be aware that:
1. DB2 does not check whether there are conflicts with existing rows in the table. You would need to define a unique index on the column(s) in question to resolve this; that would cause DB2 to delete the conflicting rows you just loaded during the DELETE phase of the load.
2. If your generated column(s) use IDENTITY, make sure that you alter the column so that future generated values do not conflict with the rows you just inserted into the table, as in the sketch below.
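A sketch of that adjustment, assuming the identity column is named ID and that 1000 is above the highest loaded value (both are assumptions):
db2 "alter table <schema.table> alter column id restart with 1000"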
Maybe you can drop the "generation" attribute from the column and add it back after importing, with the appropriate values.
@Ian Bjorhovde has given you the options.
IMPORT actually does INSERTs in the background, i.e. it first prepares an INSERT statement with parameter markers and then uses the values in the input file for those markers.
In an SQL snapshot you will see the INSERT statement that is used.
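To illustrate, for a hypothetical three-column table the prepared statement would look something like:
INSERT INTO <schema.table> (col1, col2, col3) VALUES (?, ?, ?)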
Anything that is not possible in an INSERT statement isn't possible with IMPORT (more or less).