I am trying get the PII data from postgreSQL table. But I can't display the raw data.
How to Pseudonymize the data while fetching(select) it from postgreSQL database?
You can always create (pseudo)anonymized views for tables, and select from those. As for anonymization techniques, that depends on data, you can use regex replaces and md5 very easily in postgres.
Related
I have two databases - Cloudant and IBM Db2. I have a table in each of these databases that hold static data that is only read from and never updated. These were created a long time ago and I'm not sure if they are used today so I wish to do a clean-up.
I want to determine if these tables or rows from these tables, are still being read from.
Is there a way to record the read timestamp (or at least know if it is simply accessed like a dirty bit) on a row of the table when it is read from?
OR
Record the read timestamp of the entire table (if any record from it is accessed)?
There is SYSCAT.TABLES.LASTUSED system catalog column in Db2 for DML statements on whole table.
There is no way to track each table row read access.
I tried searching for it but couldn't find out
What is the best way to copy data from Redshift to Postgresql Database ?
using Talend job/any other tool/code ,etc
anyhow i want to transfer data from Redshift to PostgreSQL database
also,you can use any third party database tool if it has similar kind of functionality.
Also,as far as I know,we can do so using AWS Data Migration Service,but not sure our source db and destination db matches that criteria or not
Can anyone please suggest something better ?
The way I do it is with a Postgres Foreign Data Wrapper and dblink,
This way, the redshift table is available directly within Postgres.
Follow the instructions here to set it up https://aws.amazon.com/blogs/big-data/join-amazon-redshift-and-amazon-rds-postgresql-with-dblink/
The important part of that link is this code:
CREATE EXTENSION postgres_fdw;
CREATE EXTENSION dblink;
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host '<amazon_redshift _ip>', port '<port>', dbname '<database_name>', sslmode 'require');
CREATE USER MAPPING FOR <rds_postgresql_username>
SERVER foreign_server
OPTIONS (user '<amazon_redshift_username>', password '<password>');
For my use case I then set up a postgres materialised view with indexes based upon that.
create materialized view if not exists your_new_view as
SELECT some,
columns,
etc
FROM dblink('foreign_server'::text, '
<the redshift sql>
'::text) t1(some bigint, columns bigint, etc character varying(50));
create unique index if not exists index1
on your_new_view (some);
create index if not exists index2
on your_new_view (columns);
Then on a regular basis I run (on postgres)
REFRESH MATERIALIZED VIEW your_new_view;
or
REFRESH MATERIALIZED VIEW CONCURRENTLY your_new_view;
In the past, I managed to transfer data from one PostgreSQL database to another by doing a pg_dump and piping the output as an SQL command to the second instance.
Amazon Redshift is based on PostgreSQL, so this method should work, too.
You can control whether pg_dump should include the DDL to create tables, or whether it should just load the data (--data-only).
See: PostgreSQL: Documentation: 8.0: pg_dump
I followed the manual on: https://docs.timescale.com/v1.0/using-timescaledb/backup
When I dump it into a binary file everything work out as expected (can restore it easily).
However, when I dump it into plain text SQL, insertions to hyper tables will be created. Is that possible to create INSERTION to the table itself?
Say I have an 'Auto' table with columns of id,brand,speed
and with only one row: 1,Opel,170
dumping into SQL will result like this:
INSERT INTO _timescaledb_catalog.hypertable VALUES ...
INSERT INTO _timescaledb_internal._hyper_382_8930_chunk VALUES (1, 'Opel',170);
What I need is this (and let TS do the work in the background):
INSERT INTO Auto VALUES (1,'Opel',170);
Is that possible somehow? (I know I can exclude tables from pg_dump but that wouldn't create the needed insertion)
Beatrice. Unfortunately, pg_dump will dump commands that mirror the underlying implementation of Timescale. For example, _hyper_382_8930_chunk is a chunk underlying the auto hypertable that you have.
Might I ask why you don't want pg_dump to behave this way? The SQL file that Postgres creates on a dump is intended to be used by pg_restore. So as long as you dump and restore and see correct state, there is no problem with dump/restore.
Perhaps you are asking a different question?
Is there a way to calculate a checksum of UUIDs in PostgreSQL?
I have a table in PostgreSQL, and in another PostgreSQL database I have a similar table, with (perhaps) the same data. The key is a UUID.
What I want to do is calculate a checksum of all the UUIDs in each table, so I can compare the checksums for both tables. I could read all those keys to my client program and perform the calculations there, but I would prefer to do it on the server. Ideally with a single, simple query. Is there a way to do this?
SELECT md5(string_agg(my_uuid_column::text, '' ORDER BY my_uuid_column)) FROM my_table
I have to transfer data from old database to new database where table name and column name is different.
Can it be done with DOS command or any other solution?
One is POSTGRESQL and old is MYSQL.
My concern is table name and column names are different, column number is same.
Thank you
I do not know the postgresql part but for sql server you can use sqlcmd.exe to export data as text format with or w/o column names
Please check
http://msdn.microsoft.com/en-us/library/ms162773.aspx