Postgres Alter Column Datatype & Update Values - postgresql

I am new to writing postgres queries, I am stuck at a problem where the price is a string having $ prefix, I want to change the data type of column price to float and update the values by removing the $ prefix. Can someone help me do that?
bootcamp=# SELECT * FROM car;
id | make | model | price
----+---------+---------------------+-----------
1 | Ford | Explorer Sport Trac | $92075.96
2 | GMC | Safari | $81521.80
3 | Mercury | Grand Marquis | $64391.84
(3 rows)
bootcamp=# \d car
Table "public.car"
Column | Type | Collation | Nullable | Default
--------+-----------------------+-----------+----------+---------------------------------
id | bigint | | not null | nextval('car_id_seq'::regclass)
make | character varying(50) | | not null |
model | character varying(50) | | not null |
price | character varying(50) | | not null |
Thanks

You can cleanup the string while altering the table:
alter table car
alter column price type numeric using substr(price, 2)::numeric;

First you have to disable safe update mode to update without WHERE clause:
SET SQL_SAFE_UPDATES=0;
Then remove '$' from all rows:
UPDATE car SET price = replace(price, '$', '');
Then change the column type:
ALTER TABLE car ALTER COLUMN price TYPE your_desired_type;
If you want to enable safe update mode again:
SET SQL_SAFE_UPDATES=1;

Related

Bulk update datatype of a column in all relevant tables

An example of some tables with the column I want to change.
+--------------------------------------+------------------+------+
| ?column? | column_name | data_type |
|--------------------------------------+------------------+------|
| x.articles | article_id | bigint |
| x.supplier_articles | article_id | bigint |
| x.purchase_order_details | article_id | bigint |
| y.scheme_articles | article_id | integer |
....
There are some 50 tables that have the column.
I want to change the article_id column from a numeric data type to a textual data type. It is found across several tables. Is there anyway to update them all at once ? Information schema is readonly so I cannot do an update on it. Other than writing inidividual alter statements for all the tables, is there a better way to do it ?

Postgres: jsonb to jsonb[]?

Let's say I have this table:
ams=# \d player
Table "public.player"
Column | Type | Collation | Nullable | Default
-------------+--------------------------+-----------+----------+-------------------
id | integer | | not null |
created | timestamp with time zone | | not null | CURRENT_TIMESTAMP
player_info | jsonb | | not null |
And then I have this:
ams=# \d report
Table "public.report"
Column | Type | Collation | Nullable | Default
---------+--------------------------+-----------+----------+---------
id | integer | | not null |
created | timestamp with time zone | | not null |
data | jsonb[] | | not null |
How can I take the player_info from all the rows in the player table and insert that into a single row in the report table (into the data jsonb[] field)? My attempts with jsonb_agg() return a jsonb, and I can't for the life of me figure out how to go from jsonb to jsonb[]. Any pointers would be very much appreciated! Thanks in advance.
If you plainly want to copy the values, just treat it like any other data type, and use ARRAY_AGG.
SELECT ARRAY_AGG(player_info)
FROM player
WHERE id IN (...)
should return something of type json[].
Since jsonb[] is an array at the type level in PostgreSQL vs. a json array, use array_agg() instead of jsonb_agg().
insert into report
select 1 as id, now() as created, array_agg(player_info)
from player
;

Is it safe to use inheritance for meta-data migration?

Is it safe to create a template schema for metadata so that other schemas which contain data can inherit from it?
Advantage: Migrations will be seamless for multi tenant scenarios.
Disadvantages: ?
Example:
hello=# CREATE SCHEMA template;
hello=# CREATE TABLE template.cities (
name text,
population real,
altitude int;
hello=# CREATE SCHEMA us;
hello=# CREATE TABLE us.cities () INHERITS (template.cities);
hello=# \d us.cities;
Table "us.cities"
Column | Type | Collation | Nullable | Default
------------+--------------+-----------+----------+---------
name | text | | |
population | real | | |
altitude | integer | | |
Inherits: template.cities
hello=# CREATE SCHEMA eu;
hello=# CREATE TABLE eu.cities () INHERITS (template.cities);
hello=# \d eu.cities;
Table "eu.cities"
Column | Type | Collation | Nullable | Default
------------+--------------+-----------+----------+---------
name | text | | |
population | real | | |
altitude | integer | | |
Inherits: template.cities
hello=# ALTER TABLE cities ADD COLUMN state varchar(30);
hello=# \d us.cities;
Table "us.cities"
Column | Type | Collation | Nullable | Default
------------+-----------------------+-----------+----------+---------
name | text | | |
population | real | | |
altitude | integer | | |
state | character varying(30) | | |
Inherits: template.cities
I think inheritance is a good approach to this problem.
I can think of two down sides:
It is possible to create additional columns to the inheritance children. If you control DDL, you can probably prevent that.
You still have to create and modify indexes on all inheritance children individually.
If you are using PostgreSQL v11 or later, you could prevent both problems by using partitioning. The individual tables would then be partitions of the “template” table. This way, you can create indexes centrally by creating a partitioned index on the template table. The disadvantage (that may make this solution impossible) is that you need a partitioning key column in the table.

GIST index creation too slow on PostgreSQL

I have a database in PostgreSQL with the following structure:
Column | Type | Collation | Nullable | Default
-------------+-----------------------+-----------+----------+------------------------------------------------
vessel_hash | integer | | not null | nextval('samplecol_vessel_hash_seq'::regclass)
status | character varying(50) | | |
station | character varying(50) | | |
speed | character varying(10) | | |
longitude | numeric(12,8) | | |
latitude | numeric(12,8) | | |
course | character varying(50) | | |
heading | character varying(50) | | |
timestamp | character varying(50) | | |
the_geom | geometry | | |
Check constraints:
"enforce_dims_the_geom" CHECK (st_ndims(the_geom) = 2)
"enforce_geotype_geom" CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL)
"enforce_srid_the_geom" CHECK (st_srid(the_geom) = 4326)
The database contains ~146.000.000 records and the size of table that contains the data is:
public | samplecol | table | postgres | 31 GB |
I try to create a GIST index on the geometry field the_geom with this command:
create index samplecol_the_geom_gist on samplecol using gist (the_geom);
but takes too long. It runs 2 hours already.
Based on this question Slow indexing of 300GB Postgis table
Ask Question, before index creation I execute in psql console:
ALTER SYSTEM SET maintenance_work_mem = '1GB';
ALTER SYSTEM
SELCT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
But index creation takes too long. Does anyone know why? An how to fix this?
I am afraid you'll have to sit it out.
Apart from high maintenance_work_mem, there is not really a tuning option.
Increasing max_wal_size will help somewhat, since you will get fewer checkpoints.
If you can't live with an ACCESS EXCLUSIVE lock for that long, try CREATE INDEX CONCURRENTLY, which will be even slower, but won't block concurrent database activity.

Postgres varchar column giving error "invalid input syntax for integer"

I'm trying to use a INSERT INTO ... SELECT statement to copy columns from one table into another table but I am getting an error message:
gis=> INSERT INTO places (SELECT 0 AS osm_id, 0 AS code, 'country' AS fclass, pop_est::numeric(10,0) AS population, name, geom FROM countries);
ERROR: invalid input syntax for integer: "country"
LINE 1: ...NSERT INTO places (SELECT 0 AS osm_id, 0 AS code, 'country' ...
The SELECT statement by itself is giving a result like I expect:
gis=> SELECT 0 AS osm_id, 0 AS code, 'country' AS fclass, pop_est::numeric(10,0) AS population, name, geom FROM countries LIMIT 1;
osm_id | code | fclass | population | name | geom
--------+------+---------+------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
0 | 0 | country | 103065 | Aruba | 0106000000010000000103000000010000000A000000333333338B7951C0C8CCCCCC6CE7284033333333537951C03033333393D82840CCCCCCCC4C7C51C06066666686E0284000000000448051C00000000040002940333333333B8451C0C8CCCCCC0C18294099999999418351C030333333B3312940333333333F8251C0C8CCCCCC6C3A294000000000487E51C000000000A0222940333333335B7A51C00000000000F62840333333338B7951C0C8CCCCCC6CE72840
(1 row)
But somehow it looks like it's getting confused thinking that the fclass column should be an integer when, in fact, it is actually a character varying(20)
gis=> \d+ places
Unlogged table "public.places"
Column | Type | Modifiers | Storage | Stats target | Description
------------+------------------------+------------------------------------------------------+----------+--------------+-------------
gid | integer | not null default nextval('places_gid_seq'::regclass) | plain | |
osm_id | bigint | | plain | |
code | smallint | | plain | |
fclass | character varying(20) | | extended | |
population | numeric(10,0) | | main | |
name | character varying(100) | | extended | |
geom | geometry | | main | |
Indexes:
"places_pkey" PRIMARY KEY, btree (gid)
"places_geom" gist (geom)
I've tried casting all of the columns to their exact types they need to be for the destination table but that doesn't seem to have any effect.
All of the other instances of this error message I can find online appear to be people trying to use empty strings as an integer which isn't relevant here because I'm selecting a constant string as fclass.
You need to specify the column names you are inserting into:
INSERT INTO places (osm_id, code, fclass, population, name, geom) SELECT ...
Without specifying them individually, it is assumed that all columns are to be inserted into - including gid, which you want to have auto-populate. So, 'country' is actually being inserted into code by your current INSERT statement.