PostgreSQL COPY command: generate primary key id

I have a CSV file with two columns: city and zipcode. I want to be able to copy this file into a PostgreSQL table using the copy command and at the same time auto generate the id value.
The table has the following columns: id, city, and zipcode.
My CSV file has only: city and zipcode.

The COPY command should do that all by itself if your table uses a serial column for the id:
If there are any columns in the table that are not in the column list, COPY FROM will insert the default values for those columns.
So you should be able to say:
copy table_name(city, zipcode) from ...
and the id will be generated as usual. If you don't have a serial column for id (or a manually attached sequence), then you could hook up a sequence by hand, do your COPY, and then detach the sequence.
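For example, a minimal sketch (the table name, serial column, and file path here are assumptions, not from the question):

CREATE TABLE table_name (
    id      serial PRIMARY KEY,
    city    text,
    zipcode text
);

-- id is omitted from the column list, so each row gets the next sequence value.
COPY table_name (city, zipcode) FROM '/path/to/cities.csv' (FORMAT csv);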

PostgreSQL id column not defined

I am new to PostgreSQL and I am working with this database.
I imported a file, and I am trying to get rows with a certain ID. But the ID is not defined, as you can see in this picture:
So how do I access this ID? I want to use an SQL command like this:
SELECT * from table_name WHERE ID = 1;
If any order of rows is ok for you, just add a row number according to the current arbitrary sort order:
CREATE SEQUENCE tbl_tbl_id_seq;
ALTER TABLE tbl ADD COLUMN tbl_id integer DEFAULT nextval('tbl_tbl_id_seq');
The new default value is filled in automatically in the process. You might want to run VACUUM FULL ANALYZE tbl to remove bloat and update statistics for the query planner afterwards. And possibly make the column your new PRIMARY KEY ...
To make it a fully fledged serial column:
ALTER SEQUENCE tbl_tbl_id_seq OWNED BY tbl.tbl_id;
See:
Creating a PostgreSQL sequence to a field (which is not the ID of the record)
What you see are just row numbers that pgAdmin displays; they are not actually stored in the database.
If you want an artificial numeric primary key for the table, you'll have to create it explicitly.
For example:
CREATE TABLE mydata (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    obec text NOT NULL,
    datum timestamp with time zone NOT NULL,
    ...
);
Then to copy the data from a CSV file, you would run
COPY mydata (obec, datum, ...) FROM '/path/to/csvfile' (FORMAT 'csv');
Then the id column is automatically filled.

How to store the folder name while copying data from an S3 bucket to a Redshift table

I am trying to load data from an S3 bucket into a Redshift table. There is a source id column in the table, and I want to store in it the name of the folder the source file came from.
I have multiple folders in the S3 bucket, and in each folder I have one file. I load all the files into the same table with the Redshift COPY command, so to identify which folder each row came from, I need to store the folder name along with the data in the Redshift table; I have a separate column in the table, source id, for this.
Can anybody help me?
If you are using the Redshift COPY command, then you have no choice but to import each folder separately (e.g. into a temp table) and then set the column manually to the name of the folder you loaded. Repeat for each folder.
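A minimal sketch of that per-folder process (the staging and target table names, column names, bucket, and IAM role below are all hypothetical):

CREATE TEMP TABLE staging (col1 text, col2 text);

COPY staging
FROM 's3://my-bucket/folder1/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS CSV;

-- Tag every staged row with the folder it came from.
INSERT INTO target_table (source_id, col1, col2)
SELECT 'folder1', col1, col2 FROM staging;

DROP TABLE staging;
-- Repeat the three steps for folder2, folder3, ...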
Another option is to use Redshift Spectrum and create an external table that maps your folders as partitions.
First you create your base table, like this:
create external table spectrum.sales_part(
    salesid integer,
    listid integer,
    sellerid integer,
    buyerid integer,
    eventid integer,
    dateid smallint,
    qtysold smallint,
    pricepaid decimal(8,2),
    commission decimal(8,2),
    saletime timestamp)
partitioned by (saledate date)
row format delimited
fields terminated by '|'
stored as textfile
location 's3://awssampledbuswest2/tickit/spectrum/sales_partition/'
table properties ('numRows'='172000');
Then you add partitions to it, like this:
alter table spectrum.sales_part
add partition(saledate='2008-01-01')
location 's3://awssampledbuswest2/tickit/spectrum/sales_partition/saledate=2008-01/';
alter table spectrum.sales_part
add partition(saledate='2008-02-01')
location 's3://awssampledbuswest2/tickit/spectrum/sales_partition/saledate=2008-02/';
alter table spectrum.sales_part
add partition(saledate='2008-03-01')
location 's3://awssampledbuswest2/tickit/spectrum/sales_partition/saledate=2008-03/';
Once you have that set up as an external table, you can use standard SQL against it; for example, you can run your queries directly against that table or copy it into a permanent Redshift table using CTAS.
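As a sketch, the partition column behaves like any ordinary column, so a CTAS can persist it alongside the data (the local table name here is an assumption):

create table sales_local as
select saledate, salesid, listid, sellerid, qtysold, pricepaid
from spectrum.sales_part;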
Here is a link to the documentation
https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-external-tables.html

How to clone or copy records in the same table in Postgres?

How do I clone or copy records within the same table in PostgreSQL by creating a temporary table?
I am trying to create clones of records in the same table with a changed name (which is part of the composite key of that table).
You can do it all in one INSERT combined with a SELECT.
i.e. say you have the following table definition and data populated in it:
create table original
(
    id serial,
    name text,
    location text
);
INSERT INTO original (name, location)
VALUES ('joe', 'London'),
('james', 'Munich');
And then you can INSERT doing the kind of switch you're talking about without using a TEMP TABLE, like this:
INSERT INTO original (name, location)
SELECT 'john', location
FROM original
WHERE name = 'joe';
Here's an sqlfiddle.
This should also be faster (although for tiny data sets probably not hugely so in absolute time terms), since it's doing only one INSERT and SELECT as opposed to an extra SELECT and CREATE TABLE plus an UPDATE.
Did a bit of research and came up with this logic:
Create temp table
Copy records into it
Update the records in temp table
Copy it back to original table
CREATE TEMP TABLE temporary AS SELECT * FROM original WHERE name = 'joe';
UPDATE temporary SET name = 'john' WHERE name = 'joe';
INSERT INTO original SELECT * FROM temporary WHERE name = 'john';
Was wondering if there was any shorter way to do it.

Import column from file with additional fixed fields

Can I somehow import a column or columns from a file, where I specify one or more fields held fixed for all rows?
For example:
CREATE TABLE users(userid int PRIMARY KEY, fname text, lname text);
COPY users (userid,fname) from 'users.txt';
but where lname is assumed to be 'SMITH' for all the rows in users.txt?
My actual setting is more complex, where the field I want to supply for all rows is part of the PRIMARY KEY.
Possibly something of this nature:
COPY users (userid,fname,'smith' as lname) from 'users.txt';
Since I can't find a native solution to this in Cassandra, my solution was to perform a preparation step with Perl so the file contained all the relevant columns prior to calling COPY. This works fine, although I would prefer an answer that avoided this intermediate step.
e.g. adding a column with 'Smith' for every row to users.txt and calling:
COPY users (userid,fname,lname) from 'users.txt';
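For comparison, if the table lived in PostgreSQL rather than Cassandra, one native way to avoid editing the file would be a staging table plus an INSERT ... SELECT with a constant (a sketch; the staging table name is an assumption):

CREATE TEMP TABLE users_stage (userid int, fname text);
COPY users_stage (userid, fname) FROM 'users.txt';
INSERT INTO users (userid, fname, lname)
SELECT userid, fname, 'SMITH' FROM users_stage;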

How can I copy an IDENTITY field?

I’d like to update some parameters for a table, such as the dist and sort key. In order to do so, I’ve renamed the old version of the table and recreated it with the new parameters (these cannot be changed once a table has been created).
I need to preserve the id field from the old table, which is an IDENTITY field. If I try the following query however, I get an error:
insert into edw.my_table_new select * from edw.my_table_old;
ERROR: cannot set an identity column to a value [SQL State=0A000]
How can I keep the same id from the old table?
You can't INSERT data setting the IDENTITY columns, but you can load data from S3 using the COPY command.
First you will need to create a dump of the source table with UNLOAD.
Then simply use COPY with the EXPLICIT_IDS parameter, as described in Loading default column values:
If an IDENTITY column is included in the column list, the EXPLICIT_IDS option must also be specified in the COPY command, or the COPY command will fail. Similarly, if an IDENTITY column is omitted from the column list, and the EXPLICIT_IDS option is specified, the COPY operation will fail.
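A minimal sketch of that round trip, using the table names from the question but a hypothetical bucket and IAM role:

UNLOAD ('select * from edw.my_table_old')
TO 's3://my-bucket/my_table_dump/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
DELIMITER '|' ALLOWOVERWRITE;

-- EXPLICIT_IDS tells COPY to load the dumped id values instead of generating new ones.
COPY edw.my_table_new
FROM 's3://my-bucket/my_table_dump/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
DELIMITER '|' EXPLICIT_IDS;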
You can explicitly specify the columns, and ignore the identity column:
insert into existing_table (col1, col2) select col1, col2 from another_table;
Use ALTER TABLE APPEND twice, the first time with IGNOREEXTRA and the second time with FILLTARGET.
If the target table contains columns that don't exist in the source table, include FILLTARGET. The command fills the extra columns in the source table with either the default column value or IDENTITY value, if one was defined, or NULL.
It moves the data from one table to the other extremely quickly; it took me 4 seconds for a 1 GB table on a dc1.large node.
Appends rows to a target table by moving data from an existing source table.
...
ALTER TABLE APPEND is usually much faster than a similar CREATE TABLE AS or INSERT INTO operation because data is moved, not duplicated.
This is faster and simpler than UNLOAD + COPY with EXPLICIT_IDS.
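A sketch with the table names from the question (FILLTARGET fills target columns that are missing from the source):

ALTER TABLE edw.my_table_new APPEND FROM edw.my_table_old FILLTARGET;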