update content of column from another schema.table.column (PostgreSQL)

I have 2 schemas in the same database (postgresql).
schema1
schema2
Each schema has a users table with a mail column.
How can I copy the content of the mail column in schema1.users to the mail column in schema2.users for all rows?
I tried:
update schema1.users
set mail=(select mail from schema2.users);
but it didn't work.

You can do an UPDATE joining the tables. Assuming both of your tables have matching IDs, it'll look like this:
UPDATE schema1.users a
SET mail=b.mail
FROM schema2.users b
WHERE a.id=b.id
What I'm doing is joining the tables and updating mail on schema1.users for every matching id.
EDIT: I just read that you actually wanted to update the mail column of schema2.users. The query will be this one:
UPDATE schema2.users a
SET mail=b.mail
FROM schema1.users b
WHERE a.id=b.id

You can join the two tables. I joined on user, but I don't know what the table layout looks like.
update schema2.users
set mail=s1.mail
from schema1.users as s1
where users.user = s1.user

Not an elegant solution, but this is what I am doing from the command line in a script to copy a specific table from one schema to a specific table in a different schema:
set search_path to schema1;
\copy (select field1, nextfield, morefields from table_in_schema1) TO 'export.sql' delimiter ',' ;
set search_path to schema2;
\copy table_name_in_schema2(matchingfield1, field_that_matches_next, another_field_forPosition2) FROM 'export.sql' DELIMITER ',' ;
I have several tables that I need to export to a single table in the new schema, so I have this scripted: a loop reads a file containing the table names to import into schema two and then sends them to the database using the command below, which will load around 30 tables into the table in the second schema.
psql -d alltg -c "\i bulk_import.sql"

Related

Is there a way to treat a csv like a table to match keys and import data to appropriate rows in postgres?

We have multiple data sources, not ordered and possibly without matching records, that we're trying to merge into a DB table. There is a column common to both that we'd like to match on to merge the records. I'm looking for a command that will do something like:
if column1.table = column1.csvfile then update table set column2.table = column2.csvfile WHERE column1.table = column1.csvfile
Scanning through each row of the CSV.
COPY assumes that your data is in order.
file_fdw is made precisely for this requirement.
Define a foreign table on the CSV file, then you can query it like a regular table.
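For example, something along these lines (the server name, file path and column definitions are illustrative and would need to match your CSV):
-- Expose the CSV file as a foreign table via file_fdw.
CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE csv_import (
    column1 text,
    column2 text
) SERVER csv_files
OPTIONS (filename '/path/to/data.csv', format 'csv', header 'true');
-- The CSV can now be joined like any other table:
UPDATE table1
SET column2 = csv_import.column2
FROM csv_import
WHERE table1.column1 = csv_import.column1;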
An easy way to do this is to create a temporary table (let's call it table2) with a structure that matches your CSV file, and COPY the file into it. Then you can run a simple update:
UPDATE table1
SET column2 = table2.column2
FROM table2
WHERE table1.column1 = table2.column1;
And then drop table2 when you're done.
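For completeness, the temporary-table part might look like this (the file path and column types are assumptions about your CSV):
-- Temporary table whose structure matches the CSV.
CREATE TEMPORARY TABLE table2 (column1 text, column2 text);
-- \copy runs client-side in psql, so the file only has to be readable where psql runs.
\copy table2 FROM 'data.csv' CSV HEADER
-- ...run the UPDATE shown above, then clean up:
DROP TABLE table2;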

create (or copy) table schema using postgres_fdw or dblink

I have many tables in different databases and want to bring them all together into one database.
It seems like I have to create a foreign table in the target database (where I want to merge them all) matching the schema of each source table.
I am sure there is a way to automate this (by the way, I am going to use the psql command), but I do not know where to start.
What I have found so far is that I can use
select * from information_schema.columns
where table_schema = 'public' and table_name = 'mytable'
I added a more detailed explanation:
1. I want to copy tables from other databases.
2. The tables have the same column names and data types.
3. Using postgres_fdw, I needed to set up the field names and data types for each table (the table names are also the same).
4. Then I want to union the tables that have the same name into one single table.
5. For that, I am going to add a prefix to each table; for instance, mytable in db1, db2 and db3 becomes db1_mytable, db2_mytable, db3_mytable in my local database.
Thanks to Albe's comment, I managed that, and now I need to figure out how to do the 4th step using the psql command.
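As a starting point, postgres_fdw's IMPORT FOREIGN SCHEMA (PostgreSQL 9.5+) can generate the foreign-table definitions automatically, and the 4th step then becomes an ordinary CREATE TABLE ... AS with UNION ALL. A sketch with illustrative server and schema names; it keeps each source database in its own local schema instead of using a table-name prefix, and assumes the foreign servers and user mappings already exist:
-- pull in the table definitions instead of typing them by hand
create schema db1;
import foreign schema public limit to (mytable) from server db1_server into db1;
create schema db2;
import foreign schema public limit to (mytable) from server db2_server into db2;
-- 4th step: union the identically structured tables into one local table
create table public.mytable_all as
select * from db1.mytable
union all
select * from db2.mytable;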

Workaround in Redshift for "ADD COLUMN IF NOT EXISTS"

I'm trying to execute an S3 copy operation via Spark-Redshift and I'm looking to modify the Redshift table structure before running the copy command in order to add any missing columns (they should all be VARCHAR).
What I'm able to do is send an SQL query before running the copy, so ideally I would have liked to ALTER TABLE ADD COLUMN IF NOT EXISTS column_name VARCHAR(256). Unfortunately, Redshift does not offer support for ADD COLUMN IF NOT EXISTS, so I'm currently looking for a workaround.
I've tried to query the pg_table_def table to check for the existence of the column, and that works, but I'm not sure how to chain that with an ALTER TABLE statement. Here's the current state of my query; I'm open to any suggestions for accomplishing the above.
select
case when count(*) < 1 then ALTER TABLE tbl { ADD COLUMN 'test_col' VARCHAR(256) }
else 'ok'
end
from pg_table_def where schemaname = 'schema' and tablename = 'tbl' and pg_table_def.column = 'test_col'
Also, I've seen this question: Redshift: add column if not exists; however, the accepted answer doesn't mention how to actually achieve this.
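One hedged possibility, assuming your cluster version supports Redshift stored procedures: wrap the pg_table_def check and the ALTER TABLE in a procedure and CALL it before the copy. The schema, table and column names below are the placeholders from the question, and note that pg_table_def only lists schemas on the search_path:
CREATE OR REPLACE PROCEDURE add_test_col_if_missing()
AS $$
BEGIN
    -- Only add the column when pg_table_def does not already list it.
    IF NOT EXISTS (
        SELECT 1
        FROM pg_table_def
        WHERE schemaname = 'schema'
          AND tablename = 'tbl'
          AND "column" = 'test_col')
    THEN
        EXECUTE 'ALTER TABLE schema.tbl ADD COLUMN test_col VARCHAR(256)';
    END IF;
END;
$$ LANGUAGE plpgsql;
CALL add_test_col_if_missing();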

Update BigQuery Table Schema

I have a table already in BQ that is populated with data. I want to rename the headings (update the schema) of the table. I'm using the command-line tool.
I presume it's something along the lines of this?
bq update --schema:Col1:STRING,Col2:STRING....... data_set.Table_Name
But I'm getting
FATAL Flags parsing error: Unknown command line flag 'schema:Col1:STRING,Col2:STRING.....'
What am I missing?
As Mosha says, renaming columns is not supported via the API, but you could run a query that scans the whole table and overwrites it.
bq query --nouse_legacy_sql \
--destination_table p:d.table \
--replace \
'SELECT * EXCEPT(col1,col2), col1 AS newcol1, col2 AS newcol2 FROM `p.d.table`'
Warning: This overwrites the table. But that's what you wanted anyways.
BigQuery now supports renaming columns via a SQL query:
ALTER TABLE [IF EXISTS] table_name
RENAME COLUMN [IF EXISTS] column_to_column[, ...]
column_to_column :=
old_column_name TO new_column_name
https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#alter_table_rename_column_statement
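For example (dataset, table and column names are placeholders):
ALTER TABLE mydataset.mytable
RENAME COLUMN col1 TO newcol1;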
The correct syntax on the command line would be
bq update --schema col1:STRING,col2:STRING dataset.table
However, renaming fields is not a supported schema change - you will get an error message saying
Provided Schema does not match table
You can only add new fields or relax existing fields (i.e. from REQUIRED to NULLABLE).

Import and overwrite duplicate rows

I'm importing some rows to my postgres database like so:
psql -U postgres import_test < 1432798324_data
Where import_test is my database and 1432798324_data is just a plain-text file formatted like:
COPY cars FROM stdin;
<row data>
<row data>
...
\.
COPY drivers FROM stdin;
<row data>
<row data>
...
\.
(I got the format for this plain text file from the answer here).
This method works fine when I'm importing into a blank database. However, if the database isn't blank and during the import any duplicate rows are found I get an error:
ERROR: duplicate key value violates unique constraint "car_pkey"
Is there any way I could modify my import command to force an overwrite if duplicates are found? In other words, if I'm importing a row and there's already a row with that id, I want my new row to overwrite it.
You can import into a temporary table. Then you can delete rows that were already there before you copy over the new data:
create temporary table import_drivers as select * from drivers limit 0;
copy import_drivers from stdin;
begin transaction;
delete from drivers
where id in
(
select id
from import_drivers
);
insert into drivers
select *
from import_drivers;
commit transaction;
One way to deal with this when you are constantly doing a bulk import (let's say daily) is to use table partitioning.
You would just add a time field to your cars and drivers tables. The time field is the time when you do the import. Your primary key will have to change for both tables to a two-tuple of your existing primary key and the time field.
Once you are done, you just drop the older tables (if you are using a daily scheme, you would drop the previous day's) or alternatively use max(time_field) in your queries.
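A sketch of that variant for the drivers table (the constraint name and the exact layout are assumptions):
-- record when each row was imported and make it part of the key
alter table drivers add column imported_at timestamptz not null default now();
alter table drivers drop constraint drivers_pkey;
alter table drivers add primary key (id, imported_at);
-- queries then read only the most recent import
select *
from drivers
where imported_at = (select max(imported_at) from drivers);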