Postgres unique constraint on groups of rows - postgresql

I'm using PostgreSQL 10.12.
I have labeled entities. Some are standard and some are not. Standard entities are shared among all users, whereas non-standard entities are user-owned. So let's say I have a table Entity with a text column label, and a column user_id which is null for standard entities.
CREATE TABLE Entity
(
    id      uuid NOT NULL PRIMARY KEY,
    user_id integer,
    label   text NOT NULL
)
Here are my constraints: two non-standard entities belonging to different users can have the same label. Standard entity labels are unique, and the entities of a given user have unique labels. The hard part: a label must be unique within the group formed by the standard entities plus a given user's entities.
I'm using SQLAlchemy; here are the constraints I've written so far:
__table_args__ = (
    UniqueConstraint("label", "user_id", name="_entity_label_user_uc"),
    db.Index(
        "_entity_standard_label_uc",
        label,
        user_id.is_(None),
        unique=True,
        postgresql_where=(user_id.is_(None)),
    ),
)
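For reference, those SQLAlchemy declarations correspond roughly to the following DDL (a sketch, not the exact statements SQLAlchemy emits):
-- Rough SQL equivalent of the constraints above
ALTER TABLE entity
    ADD CONSTRAINT _entity_label_user_uc UNIQUE (label, user_id);

CREATE UNIQUE INDEX _entity_standard_label_uc
    ON entity (label, (user_id IS NULL))
    WHERE user_id IS NULL;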
My problem with these constraints is that they do not guarantee that a user entity won't have a standard entity's label.
Example:
+----+---------+------------+
| id | user_id | label      |
+----+---------+------------+
| 1  | null    | std_ent    |
| 2  | 42      | user_ent_1 |
| 3  | 42      | user_ent_2 |
| 4  | 43      | user_ent_1 |
+----+---------+------------+
This is a valid table. I want to make sure that it is no longer possible to create an entity with label std_ent, that user 42 cannot create another entity with label user_ent_1 or user_ent_2, and that user 43 cannot create another entity with label user_ent_1.
With my current constraints, it is still possible for users 42 and 43 to create an entity with label std_ent, which is what I want to fix.
Any idea?

If your unique constraint(s) are doing their job of preventing users from entering duplicate labels for their own "user entities", then you can prevent them from entering the label of a "standard entity" by adding a trigger.
You create a function …
CREATE OR REPLACE FUNCTION public.std_label_check()
  RETURNS trigger
  LANGUAGE plpgsql
AS $function$
begin
  if exists(
      select * from entity
      where label = new.label and user_id is null) then
    raise exception '"%" is already a standard entity', new.label;
  end if;
  return new;
end;
$function$
;
… and then attach it as a trigger to the table
CREATE TRIGGER entity_std_label_check
  BEFORE INSERT
  ON public.entity FOR EACH ROW
  EXECUTE PROCEDURE std_label_check();
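A quick sanity check against the sample data (the UUID generator is illustrative; gen_random_uuid() assumes the pgcrypto extension is available on Postgres 10):
-- With row 1 ('std_ent', user_id = NULL) already in the table,
-- this insert should now be rejected by the trigger:
INSERT INTO entity (id, user_id, label)
VALUES (gen_random_uuid(), 42, 'std_ent');
-- ERROR:  "std_ent" is already a standard entity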

Related

How do I update a value within a table that has a constraint?

Example Table: id_rel
id | other_id
---+---------
 1 | 123
 2 | 456
 3 | 123
There is a unique constraint on the columns id and other_id. The table is a relation table. I'd like to update all '123' values to '456', a value which already exists in the table. I've tried something as simple as:
UPDATE id_rel
SET other_id = 456
WHERE other_id = 123;
When I try the above I get a message like the following error:
ERROR: duplicate key value violates unique constraint "id_rel" Detail: Key (id, other_id)=(1, 456) already exists.
How can I change these values without having to remove the constraints and basically rebuild the table?
The key "456" as an unique constraint and this constraint allready exist for a another record.
You have to merge the two record or delete the one who occupy the constraint value
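A minimal sketch of the delete-then-update route, assuming the unique constraint covers (id, other_id):
-- Remove the rows whose update would collide with an existing (id, other_id) pair...
DELETE FROM id_rel r
WHERE r.other_id = 123
  AND EXISTS (
      SELECT 1 FROM id_rel d
      WHERE d.id = r.id AND d.other_id = 456
  );

-- ...then the remaining rows can be updated without violating the constraint.
UPDATE id_rel
SET other_id = 456
WHERE other_id = 123;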

Postgresql query results to depend on few rows of same table

I'm working on an application, and we're using Postgres as our DB. I don't have a lot of experience with SQL at all, and now I've encountered a problem that I can't find an answer to.
So here's the problem:
We have privacy settings stored in a separate table, and the accessibility of each row of data depends on a few rows of this privacy table.
Basically, the structure of the privacy table is:
entityId | entityType | privacyId | privacyType | allow | deletedAt
---------+------------+-----------+-------------+-------+-----------
       5 | user       |         6 | user        | f     |            -- example entry
       5 | user       |         1 | user_all    | t     |
In short, these settings mean that user id 5 allows everybody access to his data except user id 6.
So I get the available data with a query like:
SELECT <some_relevant_fields> FROM <table>
JOIN <join>
WHERE
(privacy."privacyId"=6 AND privacy."privacyType"='user' AND privacy.allow=true)
OR (
(privacy."privacyType"='user_all' AND privacy."deletedAt" IS NOT NULL)
AND
(privacy."privacyType"='user' AND privacy."privacyId"=6 AND privacy.allow!=false)
);
I know this query is incorrect in this form, but I want you to get an idea of what I'm trying to achieve.
So it must check for a row with the user's type/id and allow=true, OR check that user_all is not deleted (the deletedAt field is null) and there is no row restricting access to this user with allow=false.
But it seems like Postgres is chaining all the expressions, so it overrides privacy."privacyType"='user_all' with 'user' at the end of the expression, and either returns no results or returns data even if the user is "blocked", because a 'user_all' row exists.
Is there a way to write the WHERE clause so it returns a result when the AND expressions are true for two different rows? For example, in the code above: (privacy."privacyType"='user_all' AND privacy."deletedAt" IS NOT NULL) is true for one row AND (privacy."privacyType"='user' AND privacy."privacyId"=6 AND privacy.allow!=false) is true for another. Or maybe check for the absence of a row with these values?
Is this what you want?
select <some_fields> from <table> where
privacyType='user_all' AND deletedAt IS NOT NULL
union
select <some_fields> from <table> where
privacyType='user' AND privacyId=6 AND allow<>'f';
You LEFT JOIN the table with itself and find which elements don't have a match using the WHERE clause.
SELECT p1.*
FROM privacy p1
LEFT JOIN privacy p2
    ON p1."entityId" = p2."entityId"
    AND p1."privacyType" = 'user_all'
    AND p1."deletedAt" IS NULL
    AND p2."privacyType" = 'user'
    AND p2."privacyId" = 6
    AND p2.allow != false
WHERE
    p2."privacyId" IS NOT NULL
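Since the question also asks about checking for the absence of a restricting row, an anti-join via NOT EXISTS is another way to express this (a sketch, assuming a NULL "deletedAt" means the user_all row is still active):
SELECT p1.*
FROM privacy p1
WHERE p1."privacyType" = 'user_all'
  AND p1."deletedAt" IS NULL
  AND NOT EXISTS (
      SELECT 1
      FROM privacy p2
      WHERE p2."entityId" = p1."entityId"
        AND p2."privacyType" = 'user'
        AND p2."privacyId" = 6
        AND p2.allow = false
  );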

How to fill a nullable integer column and convert it into a serial primary key in Postgresql?

My table contains an integer column (gid) which is nullable:
 gid | value
-----+-------
   0 | a
     | b
   1 | c
   2 | d
     | e
Now I would like to change the gid column into a SERIAL primary key column. That means filling up the empty slots with new integers. The existing integers must remain in place. So the result should look like:
 gid | value
-----+-------
   0 | a
   3 | b
   1 | c
   2 | d
   4 | e
I just can't figure out the right SQL command for doing the transformation. A code sample would be appreciated...
A serial is "just" a column that takes its default value from a sequence.
Assuming your table is named n1000 then the following will do what you want.
The first thing you need to do is to create that sequence:
create sequence n1000_gid_seq;
Then you need to make that the "default" for the column:
alter table n1000 alter column gid set default nextval('n1000_gid_seq');
To truly create a "serial" you also need to tell the sequence that it is associated with the column:
alter sequence n1000_gid_seq owned by n1000.gid;
Then you need to advance the sequence so that the next value doesn't collide with the existing values:
select setval('n1000_gid_seq', (select max(gid) from n1000), true);
And finally you need to update the missing values in the table:
update n1000
set gid = nextval('n1000_gid_seq')
where gid is null;
Once this is done, you can define the column as the PK:
alter table n1000
add constraint pk_n1000
primary key (gid);
And of course if you have turned off autocommit you need to commit all this.
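Putting it all together in one transaction (same assumptions as above: the table is named n1000 and gid is currently a plain nullable integer):
BEGIN;

CREATE SEQUENCE n1000_gid_seq;

ALTER TABLE n1000 ALTER COLUMN gid SET DEFAULT nextval('n1000_gid_seq');
ALTER SEQUENCE n1000_gid_seq OWNED BY n1000.gid;

SELECT setval('n1000_gid_seq', (SELECT max(gid) FROM n1000), true);

UPDATE n1000
SET gid = nextval('n1000_gid_seq')
WHERE gid IS NULL;

ALTER TABLE n1000
    ADD CONSTRAINT pk_n1000
    PRIMARY KEY (gid);

COMMIT;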

Why SELECT with WHERE clause returns 0 rows on Cassandra's table? (should return 2 rows)

I created a minimal example of a users TABLE on a Cassandra 2.0.9 database. I can use SELECT to select all its rows, but I do not understand why adding my WHERE clause (on an indexed column) returns 0 rows.
(I also do not get why the CONTAINS statement causes an error here, as presented below, but let's assume this is not my primary concern.)
DROP TABLE IF EXISTS users;
CREATE TABLE users
(
    KEY varchar PRIMARY KEY,
    password varchar,
    gender varchar,
    session_token varchar,
    state varchar,
    birth_year bigint
);
INSERT INTO users (KEY, gender, password) VALUES ('jessie', 'f', 'avlrenfls');
INSERT INTO users (KEY, gender, password) VALUES ('kate', 'f', '897q7rggg');
INSERT INTO users (KEY, gender, password) VALUES ('mike', 'm', 'mike123');
CREATE INDEX ON users (gender);
DESCRIBE TABLE users;
Output:
CREATE TABLE users (
key text,
birth_year bigint,
gender text,
password text,
session_token text,
state text,
PRIMARY KEY ((key))
) WITH
bloom_filter_fp_chance=0.010000 AND
caching='KEYS_ONLY' AND
comment='' AND
dclocal_read_repair_chance=0.100000 AND
gc_grace_seconds=864000 AND
index_interval=128 AND
read_repair_chance=0.000000 AND
replicate_on_write='true' AND
populate_io_cache_on_flush='false' AND
default_time_to_live=0 AND
speculative_retry='99.0PERCENTILE' AND
memtable_flush_period_in_ms=0 AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'LZ4Compressor'};
CREATE INDEX users_gender_idx ON users (gender);
This SELECT works OK
SELECT * FROM users;
 key    | birth_year | gender | password  | session_token | state
--------+------------+--------+-----------+---------------+-------
 kate   |       null |      f | 897q7rggg |          null |  null
 jessie |       null |      f | avlrenfls |          null |  null
 mike   |       null |      m |   mike123 |          null |  null
And this does not:
SELECT * FROM users WHERE gender = 'f';
(0 rows)
This also fails:
SELECT * FROM users WHERE gender CONTAINS 'f';
Bad Request: line 1:33 no viable alternative at input 'CONTAINS'
It sounds like your index may have become corrupt. Try rebuilding it. Run this from a command prompt:
nodetool rebuild_index yourKeyspaceName users users_gender_idx
However, the larger issue here is that secondary indexes are known to perform poorly. Some have even identified their use as an anti-pattern. DataStax has a document designed to guide you in appropriate use of secondary indexes. And this is definitely not one of them.
creating an index on an extremely low-cardinality column, such as a boolean column, does not make sense. Each value in the index becomes a single row in the index, resulting in a huge row for all the false values, for example. Indexing a multitude of indexed columns having foo = true and foo = false is not useful.
While gender may not be a boolean column, it has the same cardinality. A secondary index on this column is a terrible idea.
If querying by gender is something you really need to do, then you may need to find a different way to model or partition your data. For instance, PRIMARY KEY (state, gender, key) will allow you to query gender by state.
SELECT * FROM users WHERE state='WI' and gender='f';
That would return all female users from the state of Wisconsin. Of course, that would mean you would also have to query all states individually. But the bottom line, is that Cassandra does not handle queries for low cardinality keys/indexes well, so you have to be creative in how you solve these types of problems.
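A sketch of that alternative model (the table name users_by_state is hypothetical; the columns are copied from the original users table):
-- Partition by state, cluster by gender and key, so the query below hits a single partition.
CREATE TABLE users_by_state (
    state text,
    gender text,
    key text,
    password text,
    session_token text,
    birth_year bigint,
    PRIMARY KEY ((state), gender, key)
);

SELECT * FROM users_by_state WHERE state = 'WI' AND gender = 'f';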

Create view with fields from another table as column headers

I've got two tables that I'd like to combine into a view. The first table contains the structure:
Template Table
componentID | title
======================
1000 | blue
1001 | red
1002 | orange
The second table contains the actual data that will be stored, and the columns reference the ID of the first table:
Data Table
id | field1000 | field1001 | field1002
======================================
1 | navy | ruby | vermilion
2 | midnight | crimson | amber
What I'd like to get as a result in a view:
Combined Table/View?
id | blue | red | orange
=================================
1 | navy | ruby | vermilion
2 | midnight | crimson | amber
Is this possible? I've been trying to get it to work with pivot tables, but I'm getting hung up on how to use the titles as the columns for the data.
Ok, I went a bit overboard with this one but this will do what you want. This procedure will combine all fields with the proper data table columns, and does not need to know nor care how many columns there are in the data tables.
It does not use cursors, but due to the possibility of many template tables, it does use Dynamic SQL to generate the Select statement for the final return.
The only caveat is that it's not a view but a stored procedure, because that allows passing in a variable for the data table you ultimately want to select from.
The assumptions:
The template table is static
There is one template table for all data tables
All fields in any data tables must be unique *
All data tables have a PK/identity field with the word 'id' in it that must be ignored
All fields in the data tables have a corresponding title in the template table
All fields in the data table are prefixed with the word 'field' and all of the reference ID's in the template table correspond to those field names with 'field' removed, based on your example
*- It can of course be improved by modifying the template table schema to also have a field for the data table that the field title belongs to, for example, which would remove this assumption #3.
The process:
First we need a mapping of the field names, reference IDs, and column titles. We do this with a table variable and get our info from syscolumns. Then, we update our temp table to get the titles from the TemplateTable table.
Then, we need to build a dynamic Select list from the DataTable (which is a parameter in the SP and therefore requires some dynamic SQL to execute). My preferred method of doing this is by having a bit column in my source table that I can update, something like 'IsCompleted', and then using a regular While loop to get through each row. Inside the While loop, all we do is grab the current "TitleReference" from our temporary table variable, and append to the select list the real field name from syscolumns (from first step above).
Finally, we execute the dynamic SQL statement which has a Select, and when this is inside a stored procedure that is executed, the result is returned as the result of the stored procedure.
The Full Working Code
Create Procedure usp_CombineTables
(
    @DataTableName varchar(50)
)
As

-- Test
-- Exec usp_CombineTables 'DataTable'

-- Set up our variables
Declare @DataTableIdFieldName varchar(50), -- The ID field of the data table, dynamic
    @IsCompleted bit, -- Used by While loop to know when to exit
    @CurrentTitleReference int, -- Used in While loop as the ID from TemplateTable that relates to the real data field name and the desired title
    @CurrentDataFieldName varchar(50), -- Used in While loop for the current actual field name in the data table
    @CurrentTitle varchar(50), -- Used in While loop for the desired field name in the resulting table of the stored proc
    @DynamicSelectQuery varchar(2000) -- Stores the SQL query that is dynamically built and executed for the final result; can increase value if needed

-- Use table variable to correlate the datatable columns, titles, and references
Declare @TitleReferences Table (
    TitleReference int,
    DataTableColumnName varchar(50),
    Title varchar(50),
    Completed bit default 0
)

-- Get the info from syscolumns about our datatable; assumes that all of the field names are prefixed with the word 'field' which needs to be removed
Insert Into @TitleReferences (
    TitleReference,
    DataTableColumnName
)
Select
    Replace(name, 'field', '') As TitleReference,
    name As DataTableColumnName
From syscolumns
Where id = OBJECT_ID(@DataTableName)
    And name Not Like '%id%' -- assumes DataTable will always have a PK with 'id' in it, need to ignore/remove

-- Get the titles -- assumes only one template table for all data tables; all data fields across tables must be unique
Update @TitleReferences
Set Title = t.Title From TemplateTable As t
Where TitleReference = t.ComponentID

-- Get the ID field of the data table
Set @DataTableIdFieldName = (
    Select name From syscolumns
    Where id = OBJECT_ID(@DataTableName)
    And name Like '%id%')

-- Build a dynamic SQL query to select from the datatable with the right column names
Set @DynamicSelectQuery = 'Select ' + @DataTableIdFieldName + ', ' -- start with the ID

Set @IsCompleted = 0
While (@IsCompleted = 0)
Begin
    -- Retrieve the field name and title from the current row based on title reference
    Set @CurrentTitleReference = (Select Top 1 TitleReference From @TitleReferences Where Completed = 0)
    Set @CurrentDataFieldName = (Select DataTableColumnName From @TitleReferences Where TitleReference = @CurrentTitleReference)
    Set @CurrentTitle = (Select Title From @TitleReferences Where TitleReference = @CurrentTitleReference)

    -- Append the next select field to the dynamic query
    Set @DynamicSelectQuery = @DynamicSelectQuery +
        @CurrentDataFieldName + ' As ' + QuoteName(@CurrentTitle)

    -- Set up to move past current record in next iteration
    Update @TitleReferences Set Completed = 1 Where TitleReference = @CurrentTitleReference

    -- Exit loop or add comma for next field
    If (Select Count(Completed) From @TitleReferences Where Completed = 0) = 0
    Begin
        Set @IsCompleted = 1
    End
    Else
    Begin
        -- Add comma to select field for next column
        Set @DynamicSelectQuery = @DynamicSelectQuery + ','
    End
End

-- Now the column list is built, just add the table and exec
Set @DynamicSelectQuery = @DynamicSelectQuery +
    ' From ' + @DataTableName

Exec(@DynamicSelectQuery)
The Result
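Running Exec usp_CombineTables 'DataTable' against the sample data should return the combined view asked for in the question:
id | blue     | red     | orange
---+----------+---------+-----------
 1 | navy     | ruby    | vermilion
 2 | midnight | crimson | amber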
Hope this helps, it was fun writing it!
Something along these lines:
DECLARE @f0 VARCHAR(50) = (SELECT title FROM template WHERE componentID = 1000)
DECLARE @f1 VARCHAR(50) = (SELECT title FROM template WHERE componentID = 1001)
DECLARE @f2 VARCHAR(50) = (SELECT title FROM template WHERE componentID = 1002)
DECLARE @sql NVARCHAR(MAX) =
    'SELECT field1000 AS ' + QUOTENAME(@f0) + ', field1001 AS ' + QUOTENAME(@f1) + ', field1002 AS ' + QUOTENAME(@f2) + ' FROM data'
EXEC sp_executesql @sql
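With the sample Template Table above, the generated statement should come out roughly as:
SELECT field1000 AS [blue], field1001 AS [red], field1002 AS [orange] FROM data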