Postgres: truncate values of jsonb field

I need to move data from PostgreSQL's jsonb to Redshift's SUPER types.
However, some of the values from the jsonb are higher than the 65535 limit on VARCHAR values that exists on Redshift (and SUPER types also for that matter).
Is there a way to select from the jsonb while truncating its values, e.g. by applying a map-like function (truncating to a maximum number of characters) to each value in the jsonb?
Bonus: Can this be done recursively?
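One possible approach (a sketch, not a tested answer): rebuild the document with PostgreSQL's jsonb functions, truncating each string with left(). For the recursive bonus, a small plpgsql function can walk objects and arrays and truncate every string leaf. The function name, the table events, and the column payload below are all hypothetical placeholders:
create or replace function jsonb_truncate_values(j jsonb, max_len int)
returns jsonb language plpgsql immutable as $$
begin
  return case jsonb_typeof(j)
    -- rebuild objects, recursing into each value
    when 'object' then
      (select coalesce(jsonb_object_agg(key, jsonb_truncate_values(value, max_len)), '{}'::jsonb)
       from jsonb_each(j) as kv(key, value))
    -- rebuild arrays, recursing into each element
    when 'array' then
      (select coalesce(jsonb_agg(jsonb_truncate_values(elem, max_len)), '[]'::jsonb)
       from jsonb_array_elements(j) as elem)
    -- truncate string leaves; numbers, booleans and nulls pass through
    when 'string' then to_jsonb(left(j #>> '{}', max_len))
    else j
  end;
end;
$$;
select jsonb_truncate_values(payload, 65535) from events;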

Related

Alter a Column from INTEGER To BIGINT

In my database I have several fields of INTEGER type. I need to change some of them to BIGINT.
So my question is, can I just use the following command?
ALTER TABLE MyTable ALTER COLUMN MyIntegerColumn TYPE BIGINT;
Will the contained data be converted correctly? After the conversion, is this column a "real" BIGINT column?
I know this is not possible if there are constraints on this column (trigger, foreign key, ...). But if there are no constraints, is it possible to do it this way?
Or is it better to convert it via a helper column:
MyIntegerColumn -> MyIntegerColumnBac -> MyBigIntColumn
When you execute
ALTER TABLE MyTable ALTER COLUMN MyIntegerColumn TYPE BIGINT;
Firebird will not convert existing data from INTEGER to BIGINT, instead it will create a new format version for the table.
When inserting new rows or updating existing rows, the value will be stored as a BIGINT, but when reading, Firebird will convert 'old' rows on the fly from INTEGER to BIGINT. This happens transparently for you as the user. This is to prevent having to rewrite all existing rows, which could be costly (I/O, garbage collection of old row versions, etc.).
So please, do use ALTER TABLE .. ALTER COLUMN; do not do MyIntegerColumn -> MyIntegerColumnBac -> MyBigIntColumn. There are some exceptions to this rule, e.g. (potentially) lossy character set transformations are better done that way to prevent transliteration errors on select if a character does not exist in the new character set, or changing a (var)char column to be shorter (which can't be done with ALTER COLUMN).
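For those exceptional cases, the helper-column route might look like this (a sketch; the column name and the new length are placeholders):
-- add the new, shorter column
ALTER TABLE MyTable ADD MyCharColumnNew VARCHAR(20);
-- copy the data, truncating explicitly
UPDATE MyTable SET MyCharColumnNew = SUBSTRING(MyCharColumn FROM 1 FOR 20);
-- drop the old column and rename the new one into its place
ALTER TABLE MyTable DROP MyCharColumn;
ALTER TABLE MyTable ALTER COLUMN MyCharColumnNew TO MyCharColumn;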
To be a little more specific: when a row is written in the database it contains a format version (aka version count) of that row. The format version points to a description of a row (datatypes, etc) how Firebird should read that row. An alter table will create a new format version, and that format will be applied when writing new rows or updating existing rows. When reading an old row, Firebird will apply necessary transformation to present that row as the new format (for example adding new columns with their default values, transforming a data type of a column).
These format versions are also a reason why the number of ALTER TABLEs is restricted: if you apply more than 255 ALTER TABLEs to a single table, you must back up and restore the database before further changes are allowed to that table (the format version is a single byte).

PostgreSQL: autoincrement for varchar type field

I'm switching from MongoDB to PostgreSQL and was wondering how I can implement the same concept MongoDB uses to uniquely identify each row by its MongoId.
After migration, the already existing unique fields in our database are stored as a character type. I am looking for minimal source code changes.
So, is there any way in PostgreSQL to generate an auto-incrementing unique ID for each insert into a table?
The closest thing to MongoDB's ObjectId in PostgreSQL is the uuid type. Note that ObjectId has only 12 bytes, while UUIDs have 128 bits (16 bytes).
You can convert your existing IDs by appending (or prepending) e.g. '00000000' to them.
alter table some_table
alter id_column
type uuid
using (id_column || '00000000')::uuid;
It would be best, though, if you could do this while migrating the schema + data. If you can't do it during the migration, you need to update your IDs (while they are still varchars; this way the referenced columns will propagate the change), drop the foreign keys, do the alter type, and then re-apply the foreign keys.
You can generate various UUIDs (for default values of the column) with the uuid-ossp module.
create extension "uuid-ossp";
alter table some_table
alter id_column
set default uuid_generate_v4();
Use a sequence as a default for the column:
create sequence some_id_sequence
start with 100000
owned by some_table.id_column;
The start with value should be bigger than your current maximum number.
Then use that sequence as a default for your column:
alter table some_table
alter id_column set default nextval('some_id_sequence')::text;
The better solution would be to change the column to an integer column. Storing numbers in a text (or varchar) column is a really bad idea.
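If you do later decide to make that change, the conversion could look roughly like this (a sketch, assuming every remaining ID is a numeric string):
alter table some_table
  alter id_column drop default,
  alter id_column type bigint using id_column::bigint,
  alter id_column set default nextval('some_id_sequence');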

pandas read_sql converts column names to lower case - is there a workaround?

related: pandas read_sql drops dot in column names
I use pandas.read_sql to create a data frame from an SQL query against a Postgres database.
Some column aliases/names use mixed case, and I want that to propagate to the data frame.
However, pandas (or the underlying engine, SQLAlchemy as far as I know) returns only lower-case field names.
Is there a workaround?
(Besides using a lookup table and fixing the values afterwards.)
Postgres normalizes unquoted column names to lower case. If you have such a table:
create table foo ("Id" integer, "PointInTime" timestamp);
PostgreSQL will obey the case, but you will have to quote the column names whenever you reference them:
select "Id", "PointInTime" from foo;
A better solution is to add column aliases, e.g.:
select name as "Name", value as "Value" from parameters;
And Postgres will return properly cased column names. If the problem lies in SQLAlchemy or pandas, though, this will not suffice.

Change data type varchar to timestamp along with null values in PostgreSQL

I have a varchar column where most rows are empty and a few contain timestamp strings. How can I convert that column to the timestamp data type in PostgreSQL?
You need a USING clause to turn the empty strings into null.
ALTER TABLE ...
ALTER COLUMN mycol
TYPE timestamp
USING (...conversion expression...)
Without seeing the input data it's hard to say exactly what that expression must be, but it probably involves nullif or CASE expressions, the to_timestamp function, and/or a cast to timestamp.
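For example, if the 'empty' rows are empty strings and the rest are already in a format PostgreSQL can parse, a minimal sketch (table and column names are placeholders) would be:
ALTER TABLE mytable
ALTER COLUMN mycol
TYPE timestamp
USING nullif(trim(mycol), '')::timestamp;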

PostgreSQL indexes for hstore boolean attributes

I have an hstore column called extras in which I have defined many attributes, some of them boolean, and I would like to index some of them, for example extras->'delivered'. What would be the best way to index these attributes?
If you answer, could you tell me whether your technique also applies to decimal or other types?
Thanks.
Indexing individual keys in a hstore field
The current hstore version doesn't have typed values. All values are text. So you can't directly define a "boolean" index on an hstore value. You can, however, cast the value to boolean and index the cast expression.
CREATE INDEX sometable_extras_delivered_bool
ON sometable ( ((extras->'delivered')::boolean) );
Only queries that use the expression (extras->'delivered')::boolean will benefit from the index. If the index expression uses a cast, the query expression must too.
This b-tree index on a hstore field will be less efficient to create and maintain than a b-tree index of a boolean col directly in the table. It'll be much the same to query.
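For example, a query of this shape can use the index above:
SELECT *
FROM sometable
WHERE (extras->'delivered')::boolean;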
Indexing all keys in a hstore field
If you want a general purpose index that indexes all hstore keys, you can only index them as text. There's no support for value typing in hstore in PostgreSQL 9.3.
See indexes on hstore.
This is useful when you don't know in advance which keys you need to index.
(Users on later versions of PostgreSQL, pre-release at the time of writing, with the json-compatible hstore version 2 will find that their hstore supports typed values.)
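As a sketch (using the same hypothetical table as above), a GIN index over the whole column supports containment and key-existence operators for all keys at once:
CREATE INDEX sometable_extras_gin
ON sometable USING gin (extras);
-- can serve queries like:
SELECT * FROM sometable WHERE extras @> 'delivered=>true';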
Reconsider your data model
Frankly, if you're creating indexes on fields in a hstore that you treat as boolean, then consider re-thinking your data model. You are quite likely better off having this boolean as a normal field of the table that contains the hstore.
You can store typed values in json, but you don't get the GIN / GiST index support that's available for hstore. This will improve in 9.4 or 9.5, with hstore 2 adding support for typed, nested, indexable hstores and a new json representation being built on top of that.
Partial indexes
For booleans you may also want to consider partial index expressions where the boolean is a predicate on another index, instead of the actual indexed column. E.g:
CREATE INDEX sometable_ids_delivered ON sometable(id) WHERE (delivered);
or, for the hstore field:
CREATE INDEX sometable_ids_delivered ON sometable(id) WHERE ((extras->'delivered')::boolean);
Exactly what's best depends on your queries.