How to use security policy packages on Databricks - pyspark

I am trying to create data security policies on user tables in Databricks. I have previously implemented this task on SQL Server with the queries below:
CREATE FUNCTION [test].[mailfunction](@useremail AS nvarchar(100))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN SELECT 1 AS mailfunction_result WHERE @useremail = SUSER_SNAME()
GO
CREATE SECURITY POLICY [mailfunctionSecurityPolicy]
ADD FILTER PREDICATE [test].[mailfunction]([useremail]) ON
test.users WITH (STATE = OFF);
I am trying to implement this on Databricks. I created the function, but I am not able to create a SECURITY POLICY on Databricks.
I need to create the function, or a workaround for CREATE FUNCTION in Databricks, and achieve role-based access control on my table as we achieved on the SQL Server side.
Please also suggest some reference code for implementing role-based access, row- and column-level security, and data masking in Databricks.

Right now there is no exactly equivalent functionality, but it's coming in the near future: you can watch the latest Databricks quarterly roadmap webinar to get more details about the upcoming functionality for RBAC & ABAC.
But right now you can use dynamic views over the tables to implement row-level access control and data masking. For this you can use the current_user and is_member functions to perform checks, like this (example from the docs):
CREATE VIEW sales_redacted AS
SELECT
  user_id,
  CASE WHEN is_member('auditors') THEN email
       ELSE 'REDACTED'
  END AS email,
  country,
  product,
  total
FROM sales_raw
And you can use user/group names from the data itself; it is not necessary to hard-code group names in the is_member call. You can see an example in the following answer.
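For instance, here is a minimal sketch of a row-filtering view in the spirit of the SQL Server predicate above, reusing the question's test.users table and useremail column (the admins group is an assumption):
CREATE VIEW users_rls AS
SELECT *
FROM test.users
-- each row is visible only to the user whose email it stores,
-- or to members of an (assumed) admins group
WHERE useremail = current_user()
   OR is_member('admins');
Grant users SELECT on the view rather than on the underlying table, so the filter cannot be bypassed.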

Related

How to design security policies for a following system including counters in postgres/supabase if postgres functions are used?

I am unsure how to design security policies for a following system including counters in postgres/supabase. My database includes two tables:
Users:
uuid | name | follower_counter
------------------------------
xyz  | tobi | 1

Following-Relationship:
follower | following
---------------------
uuid_1   | uuid_2
Once a user follows a different user, I would like to use a postgres function/transaction to
1. insert a new following-follower relationship, and
2. update the followed user's counter:
BEGIN;
SELECT create_follower_relationship(follower_id, following_id);
SELECT increment_counter_of_followed_person(following_id);
COMMIT;
The constraint should be that the users table (e.g. the name column) can only be altered by the user owning the row. However, the follower_counter should be open to changes from users who start following that user.
What is the best security policy design here? Should I add column security, or should I move the counters to a different table?
Do I have to pass parameters to the "block transaction" to ensure that the update and insert functions are called with the needed rights? With which rights should I call the block function?
It might be better to take a different approach to solve this problem. Instead of having a column dedicated to counting the followers, I would recommend actually counting the number of followers when you query the users. Since you already have the Following-Relationship table, we just need to count the rows within that table where following or follower is the querying user.
When you have a counter, it might be hard to keep the counter accurate. You have to make sure the number gets decremented when someone unfollows. What if someone blocks a user? What if a user was deleted? There could be a lot of situations that could throw off the counter.
If you count the number of followings/followers on the fly, you don't need to worry about those situations at all.
Now, the obvious concern you might have with this approach is performance, but you should not worry too much about it. Postgres is a powerful database that has been battle-tested for decades, and with a proper index in place, it can easily perform these queries on the fly.
The easiest way of doing this in Supabase would be to create a view like the following. Once you create a view, you can query it from your Supabase client just like a typical table!
create or replace view profiles as
select
  uuid,
  name,
  (select count(*) from following_relationship where following = users.uuid) as follower_count,
  (select count(*) from following_relationship where follower = users.uuid) as following_count
from users;
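And for the ownership constraint from the question, here is a minimal row-level-security sketch, assuming Supabase's auth.uid() helper and the column names above:
alter table users enable row level security;

-- only the owner may update their own row (e.g. the name column)
create policy users_update_own_row on users
  for update using (auth.uid() = uuid);

alter table following_relationship enable row level security;

-- users may create follow relationships only as themselves
create policy follows_insert_as_self on following_relationship
  for insert with check (auth.uid() = follower);
With the counter gone from users, you no longer need to let other users write to that table at all.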

Using multiple schemas in dbt Cloud based on postgresql instance

I am using dbt Cloud to complete the tutorial and learn more about the tool. I'm using PostgreSQL because I don't have access to the paid tools that are more heavily supported.
I have orders and customers in the jaffle_shop schema and payment in the stripe schema. I have the following staging table for payment:
with payment as (
    select
        orderid as order_id,
        amount
    from dbt.stripe.payment
)
select * from payment
When I try to do the simple pull to test:
with payments as (
    select * from {{ ref('stg_payments') }}
)
select * from payments
I get an error. When I compile, it keeps insisting on going back to the jaffle_shop default schema, even though I've explicitly specified the stripe schema as above (compiled output below, with the wrong schema):
with payments as (
    select * from "dbt"."jaffle_shop"."stg_payments"
)
select * from payments
limit 500
/* limit added automatically by dbt cloud */
Is there something I should do differently to make it go to the correct schema? Or is this a limitation of dbt Cloud and PostgreSQL? Thank you.
You need to add a config to the stg_payments model to tell dbt to build that model in another schema. See the Docs for Custom Schemas
So in stg_payments.sql:
{{ config(schema='stripe') }}
...
Then, when you use {{ ref('stg_payments') }}, dbt will compile that to use the stripe custom schema. (Note, however, that dbt will prepend the target schema name to stripe to build the custom schema name, e.g. jaffle_shop_stripe, so that dbt can create models in multiple dev and prod environments.)
You can override this behavior (as noted in the docs), but I wouldn't recommend it, since you will lose the ability to develop in multiple environments.
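For reference, here is a sketch of that override (the macros/generate_schema_name.sql pattern from the dbt docs, shown only for completeness given the caveat above):
{% macro generate_schema_name(custom_schema_name, node) -%}
    {#- use the custom schema verbatim instead of prepending target.schema -#}
    {%- if custom_schema_name is none -%}
        {{ target.schema }}
    {%- else -%}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}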
Lastly, if you have a source table that is loaded into a specific schema (and not a model that is created/managed by dbt), you should reference that in your dbt models using the source() macro, instead of ref() -- the source config will allow you to specify the exact schema where the source data lives
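For example, assuming you declare a stripe source with a payment table in a sources .yml file (pointing at database dbt, schema stripe), the staging model becomes:
with payment as (
    select
        orderid as order_id,
        amount
    -- resolves to "dbt"."stripe"."payment" regardless of the target schema
    from {{ source('stripe', 'payment') }}
)
select * from payment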

Does DB2 database have feature to add Sensitive Data Indicator to an Object?

SQL Server provides a feature to add a sensitivity indicator to columns/objects to identify what kind of data is stored in that column.
CREATE TABLE STUDENT (SNAME VARCHAR(1000));

ADD SENSITIVITY CLASSIFICATION TO
dbo.STUDENT.SNAME
WITH ( LABEL = 'Highly Confidential', INFORMATION_TYPE = 'Financial', RANK = CRITICAL );
Then we can fetch this information with the following query:
SELECT * FROM sys.sensitivity_classifications
Does DB2 have any feature similar to this?
SQL Server documentation: SQLServer_Documention_For_Sensitive_Data_Indicator
Db2 has the security feature of Label-Based Access Control (LBAC). You can define security labels and policies and later assign them to data, and then define access control rules based on those labels. For example, inserting a row with a security label:
INSERT INTO student VALUES ('Henrik', SECLABEL_BY_NAME('Highly Confidential', 'Financial') )
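Here SECLABEL_BY_NAME takes the security policy name first and the label name second. A minimal setup sketch, with all component, policy, and label names assumed for illustration:
-- define a label component and a policy that uses it
CREATE SECURITY LABEL COMPONENT level
    ARRAY ['Highly Confidential', 'Confidential', 'Public'];

CREATE SECURITY POLICY data_policy
    COMPONENTS level
    WITH DB2LBACRULES;

-- define a label within the policy
CREATE SECURITY LABEL data_policy.financial
    COMPONENT level 'Highly Confidential';

-- protect the table and add a row security label column
ALTER TABLE student ADD SECURITY POLICY data_policy;
ALTER TABLE student ADD COLUMN tag DB2SECURITYLABEL;

-- allow a user to read and write rows carrying this label
GRANT SECURITY LABEL data_policy.financial
    TO USER henrik FOR ALL ACCESS;
Reads and writes are then restricted by comparing the user's granted label to each row's label under the DB2LBACRULES rule set.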

IBM DB2 Timetravel logging based on some criteria

I have been searching for a way to enable time travel (system-period temporal tables) on a certain table in DB2 but not capture all the updates done, only the updates done by some specific user.
I wanted to know if this is at all possible with DB2 time travel, and how to achieve it.
It's not possible with DB2 temporal tables.
Instead, alter the temporal table to add a user column maintained by the system. Db2 for iSeries column shown:
EMP_CHANGE_USER VARCHAR(18) GENERATED ALWAYS AS (USER)
The new column will automatically carry over to the history table of the temporal table. You can report on the history table and filter on EMP_CHANGE_USER.
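A sketch of that alter plus a report query over the history table (the employee and employee_history table names are assumed for illustration):
-- add the system-maintained user column to the temporal table;
-- it is populated automatically on every insert/update
ALTER TABLE employee
    ADD COLUMN emp_change_user VARCHAR(18) GENERATED ALWAYS AS (USER);

-- report how many changes each user made, from the history table
SELECT emp_change_user, COUNT(*) AS changes
FROM employee_history
GROUP BY emp_change_user;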
Note: in real life, don't single out users. You can give management a report that lists all users, and management can filter it down to individuals. Programmers should not single out users for reporting and logging.

SELECT WHERE CASE statement

I am working on a project in which a user's access to records is restricted based on the user's User Group. I have created a global variable, $usr_sec_group, and I want to add to the WHERE clause of the SELECT statement in several applications a CASE statement that applies a different filter depending on the value of $usr_sec_group. I am a relative "newbie" with regard to MySQL, and my attempts at writing such a statement haven't worked. Here is the basic logic:
SELECT
    field1,
    field2,
    etc
FROM
    Organizations
CASE $user_sec_group
    WHEN 1 THEN 'filter_statement_1'
    WHEN 2 THEN 'filter_statement_2'
    WHEN 3 THEN 'filter_statement_3'
    ELSE 'filter_statement_else'
END CASE
ORDER BY
    field1
The 'filter_statements' could be any valid filter, such as
oName >= 'a' AND oName < 'g'
I am assuming that the problem is a relatively simple matter of syntax, but so far I haven't been able to write a CASE statement that works.
I will be grateful for some guidance!
Best regards,
Eric
Your attempted solution will not work: it's not just a question of syntax; you would have to use dynamic SQL. And even if you used dynamic SQL, it is not a good way to manage access permissions.
A better way is to create specific views at various levels of access and then grant appropriate access to specific users and revoke access for others:
GRANT SELECT ON MyDatabase.viewABC
TO 'someuser'@'somehost';
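For example, here is a sketch with one view per user group (view, user, and filter names are assumed for illustration):
-- a view exposing only the rows that group 1 may see
CREATE VIEW OrganizationsGroup1 AS
    SELECT field1, field2
    FROM Organizations
    WHERE oName >= 'a' AND oName < 'g';

-- group 1 users get SELECT on the view, not on the base table
GRANT SELECT ON MyDatabase.OrganizationsGroup1 TO 'group1_user'@'somehost';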
See:
The Grant/Revoke Command
An introduction to MySQL permissions
How to grant multiple users privileges; MySQL