Data masking using PostgreSQL at Amazon RDS, how to?

I was looking into this:
https://postgres.ai/docs/database-lab/masking
and was searching for a way to mask my data dynamically on Amazon RDS for PostgreSQL, but couldn't find any real solution.
Has anyone solved this without resorting to server/backend masking?

If you want to retrieve read-only masked data, I'd suggest using PostgreSQL views. One way to achieve that:
1. Create a new schema, for example "masked".
2. Create a view inside the masked schema with the same name as the table whose data you want to mask dynamically. The masking rules go in this view's definition.
3. Create a new role that will read the masked data and change its search_path so the masked schema is searched first: SET search_path TO '$user', masked, public
This way, when that role queries the table by its unqualified name, PostgreSQL looks in the masked schema first. If it finds a view with that name there, the query returns the masked data produced by the view rather than the raw table rows.
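The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the table public.customers, its columns, the masking expressions, and the role name masked_reader are all made up for the example and are not from the question.

```sql
-- Hypothetical table holding the sensitive data.
CREATE TABLE public.customers (
    id    integer PRIMARY KEY,
    name  text,
    email text
);

-- Step 1: a schema that holds only the masked views.
CREATE SCHEMA masked;

-- Step 2: a view with the same name as the table; the masking rules
-- (assumed here: keep the first letter of the name and the e-mail domain)
-- live in the select list.
CREATE VIEW masked.customers AS
SELECT id,
       left(name, 1) || '***'              AS name,
       '***@' || split_part(email, '@', 2) AS email
FROM public.customers;

-- Step 3: a hypothetical read-only role that resolves unqualified
-- names in the masked schema before public.
CREATE ROLE masked_reader LOGIN PASSWORD 'change-me';
GRANT USAGE ON SCHEMA masked TO masked_reader;
GRANT SELECT ON masked.customers TO masked_reader;
ALTER ROLE masked_reader SET search_path TO "$user", masked, public;
```

For this to actually hide anything, the role must not hold SELECT on public.customers itself; otherwise it can bypass the view by writing the schema-qualified table name. The view runs with its owner's privileges, so the reader only needs access to the view.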

Related

PostgreSQL 9.5 Update virtual column in view

I am using PostgreSQL for GIS purposes with PostGIS and QGIS, but I think I might find more information here than on gis.stackexchange.com since my question is not directly GIS related.
I am using views to display data tailored to the specific needs of my users, so that they have access to the data as it is in the DB, but only the part they need. I added some rules to my views to make them "updatable" in QGIS so that users can work on them directly.
Display in QGIS is based on attributes. Since different people may access the same data at the same moment, they might want to show or hide some of it according to their own needs (for map printing, for instance). So I am looking for a way to give each view its own display setting, and I thought about simply adding a "virtual" column in the view definition, for example a boolean like this:
CREATE VIEW view1 AS
SELECT oid, column1, True as display from table1;
But I would like my users to be able to change the value of this column, simply to make objects appear on or disappear from the canvas (with rule-based styling that takes this parameter into account). Obviously it doesn't work directly, since the update would conflict with the definition of the view.
Does anyone have any idea how to achieve that? Maybe a materialized view (but I quite like the dynamic nature of a regular view)?
Thanks.

If the unique ID field is read from the table (i.e. it is not dynamically created via the row_number() trick commonly used with spatial views in QGIS), you could create a visibility manager table and use it in the view. You could have one table by view or one for all of them. Something similar to:
create table visibility_manager (oid bigint, visibility_status boolean, viewname text);
CREATE VIEW view1 AS
SELECT table1.oid, column1, coalesce(visibility_status,true) as display
from table1
left outer join visibility_manager
on (table1.oid = visibility_manager.oid and visibility_manager.viewname = 'view1');
see it in action:
http://rextester.com/OZPN1777
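To toggle an object's visibility with this design, a user (or a rule on the view) would write to visibility_manager rather than to the view column. A minimal sketch, assuming one row per (oid, viewname) pair; the unique index and the id 42 are assumptions added for the example, and ON CONFLICT requires PostgreSQL 9.5 or later:

```sql
-- Assumption: at most one status row per object and view.
-- This index is not part of the answer above.
CREATE UNIQUE INDEX visibility_manager_oid_viewname_idx
    ON visibility_manager (oid, viewname);

-- Hide object 42 in view1 (42 is a made-up id).
-- Inserts the status row, or updates it if one already exists.
INSERT INTO visibility_manager (oid, visibility_status, viewname)
VALUES (42, false, 'view1')
ON CONFLICT (oid, viewname)
DO UPDATE SET visibility_status = EXCLUDED.visibility_status;

-- Show it again by removing the override; the view's COALESCE
-- then falls back to true.
DELETE FROM visibility_manager
WHERE oid = 42 AND viewname = 'view1';
```

One caveat: on PostgreSQL releases before 12, oid is a system column name and cannot be used for a user-defined column, so on those versions the column in visibility_manager would need a different name.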

How to update PostgreSQL full text search field when relational data changes

I have the following strategy for full text search in my web app, which uses PostgreSQL for relational data storage. As an example I will take the invoices table.
In each searchable table I have one additional field, ALTER TABLE invoices ADD COLUMN tsv tsvector, on which the full text search query is done like this: ... WHERE tsv @@ to_tsquery('query:*') ...
On every full text search table I have set an update trigger that updates the tsv field on every change of the record. The update concatenates the data from different fields into the tsv field, sets the right weights, etc.
The data that goes into the tsv field can also be relational data from other tables. For example, the invoices table has a client_id field, but since I want to search invoices by client name as well, I also include the clients.client_name data in the invoices.tsv field.
My question is: what is the best strategy to keep the relational data in the tsv vectors in sync? In the above scenario, if a client name changes, I would need to update the tsv field for every invoice of that client.
Should I set up a cron job that would do this every night? It could also be done with triggers, but since my database schema is very large I am afraid it might get out of control if I have triggers all over the place.

If you add the client's name to the tsv field you will end up with more complexity. You might want to look into materialized views, as mentioned in this article. The trade-off is fast results against the need to refresh the view periodically. As of Postgres 9.4 you can refresh a materialized view concurrently.
Another thing you could do is create an update trigger on the clients table, so that whenever a client is updated the related data in the invoices table is updated as well.
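A minimal sketch of the trigger option, under assumptions: clients has a primary key column id that invoices.client_id refers to, and invoices already carries the per-row update trigger that rebuilds tsv, as described in the question. The function and trigger names are hypothetical.

```sql
-- Hypothetical trigger function: when a client's name changes,
-- touch that client's invoices so their existing update trigger
-- recomputes tsv with the new name.
CREATE OR REPLACE FUNCTION clients_touch_invoices() RETURNS trigger AS $$
BEGIN
    -- Assigning a column to itself still counts as an UPDATE,
    -- so the row-level trigger on invoices fires for each row.
    UPDATE invoices
    SET client_id = client_id
    WHERE client_id = NEW.id;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

-- Fire only when client_name actually changes.
CREATE TRIGGER clients_name_changed
AFTER UPDATE OF client_name ON clients
FOR EACH ROW
WHEN (OLD.client_name IS DISTINCT FROM NEW.client_name)
EXECUTE PROCEDURE clients_touch_invoices();
```

The WHEN clause keeps the cost down: invoices are rewritten only for a real name change, not for every update to a client row.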

Changing a DB View dynamically according the current user-group

We are currently digging into Amazon Redshift and testing different functionalities.
One of our basic requirements is that we will define different user groups which in turn will be granted access to different views.
One way to go about this would be to implement one view separately for each user group. However, since we have a lot of user groups that share almost exactly the same need for information, I'm looking for a way to implement this more dynamically in Redshift.
For instance, let's say I have a user group called users_london and another one called users_berlin. Both will have access to a view called v_employee_master_data which contains the columns employee_name, employee_job_title and employee_city.
Both groups share the same scope of information with one exception - the column employee_city.
In essence, the view should be pre-filtered for a certain value in the column employee_city according to the currently logged-in user-group.
In SQL - something like this:
For the usergroup users_london:
SELECT * FROM v_employee_master_data WHERE employee_city = 'London';
For the usergroup users_berlin:
SELECT * FROM v_employee_master_data WHERE employee_city = 'Berlin';
Now to make the connection back to Amazon Redshift. Does the underlying DB runtime provide an out-of-the-box functionality to somehow catch the currently logged user-group as a form of global variable and alter the SQL-statement according to the value of that variable?

It is possible. Get the current user:
select current_user;
Find which group it belongs to:
select groname from pg_group where current_user_id = any(grolist);
Extract the city and capitalize it:
select initcap(substring(groname from 'users_(.*)')) from pg_group where current_user_id = any(grolist);
Now you have the city based on the "user", so just inject it into the view:
... WHERE employee_city = initcap(substring(groname from 'users_(.*)')) ...

Is there any risks involved with changing a table's schema

I want to know if there are any risks involved in changing a table's schema, for example from dbo. to xyz. or vice versa.
Would like to hear your views on this.

The first thing that crossed my mind is views, stored procedures, etc. that contain a query like
SELECT *
FROM [sch].[tbl]
Once you move the table to a new schema, you will get: Invalid object name 'oldSchema.tableName'.
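Before moving the table, it may help to list the objects that reference it. A rough T-SQL sketch, assuming SQL Server and using the placeholder names dbo and tbl for the current schema and table:

```sql
-- Lists views, procedures, functions, etc. whose definitions
-- reference a table named tbl, either as dbo.tbl or unqualified.
SELECT OBJECT_SCHEMA_NAME(d.referencing_id) AS referencing_schema,
       OBJECT_NAME(d.referencing_id)        AS referencing_object,
       d.referenced_schema_name,
       d.referenced_entity_name
FROM sys.sql_expression_dependencies AS d
WHERE d.referenced_entity_name = 'tbl'
  AND (d.referenced_schema_name = 'dbo'
       OR d.referenced_schema_name IS NULL);  -- NULL: reference was not schema-qualified
```

This catalog view does not see references built in dynamic SQL or issued from application code, so an empty result is not proof that nothing will break.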

postgresql request over several schema

I have a database in which every user has their own schema.
Is there a way to query a table in every schema?
Something like select id, name from *.simulation doesn't work...
Thank you for your help !

No, you will need to write a function - either a server side function or a client side function in whatever language you're using - that executes the query once for each schema.
You could also create a VIEW that does UNION ALL between all the schemas, but that's going to be a lot of work to maintain if your schemas are dynamically added and removed.
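A server-side function along these lines might look like the sketch below. It assumes every relevant schema has a table named simulation with id and name columns (as in the question); the integer and text column types and the function name all_simulations are assumptions for the example.

```sql
-- Hypothetical function: runs the same query once per schema that
-- contains a "simulation" table and returns the combined rows.
CREATE OR REPLACE FUNCTION all_simulations()
RETURNS TABLE (schema_name text, id integer, name text) AS $$
DECLARE
    s text;
BEGIN
    FOR s IN
        SELECT table_schema
        FROM information_schema.tables
        WHERE table_name = 'simulation'
          AND table_type = 'BASE TABLE'
    LOOP
        -- %L quotes the schema name as a literal, %I as an identifier.
        -- The casts make the result match the declared column types.
        RETURN QUERY EXECUTE format(
            'SELECT %L::text, id::integer, name::text FROM %I.simulation',
            s, s);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- Usage:
SELECT * FROM all_simulations();
```

Because the schema list is read from information_schema on every call, schemas that are added or removed later are picked up without redefining anything, which is the maintenance problem the UNION ALL view would have.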

Yes, you can: use SET search_path TO ... to point to all the schemas. If you don't know the names of all the schemas, wrap it in a function that first selects all schemas and then sets the entire search_path.
http://www.postgresql.org/docs/current/interactive/sql-set.html
