Postgres JSONB unique constraint - postgresql

I have a table as following table.
create table person {
firstname varchar,
lastname varchar,
person_info jsonb,
..
}
I already have unique constraints on firstname + lastname. I recently identify there is always something different in person_info jsonb. I want to uniquely identify by person_info jsonb.
Should I add person_info as part of unique constraints firstname + lastname + person_info ? Is there any performance impact with such implementation ? I heard JSONB is not good for index when number of data increases.
I am thinking to use store person_info hashvalue in different field and combine this new hashvalue field as part of unique index.
I would appreciate if I get some help from expert on this.

This seems like a wrong idea.
A primary key should be immutable and uniquely identify a table row.
Names are not good for that, because
different people can have the same name
names can change
This is probably why you are tempted to add additional information to truly identify each individual row.
Unless you have some immutable attribute that uniquely identifies each person (such as the social security nubmer), you should generate an artificial primary key for the table:
ALTER TABLE person
ADD id bigint
GENERATED ALWAYS AS IDENTITY
PRIMARY KEY;
Indexing a jsonb is possible, but you will get problems with long values since index entries are limited in size, and you will get an error if you exceed the limit.
I recommend that any attribute that you might want to index is not stored in a jsonb, but as a regular table column.

JSONB indexing IMHO refers to the ability to index fields inside the binary JSON rather than the whole block. Be aware also that key ordering is not kept! So if you can obtain two different hashes for two json with the exact same data but different ordering. Instead, if you can find which json fields gives you uniqueness, than you can use directly those for indexing.
Try also to look at this page

Related

Is it bad for columns in composite keys to have mismatching types?

Problem:
I'd like to make a composite primary key from columns id and user_id for a postgres database table. Column user_id is a foreign key with an integer type, whereas id is a string. Will this cause a conflict because the types are different?
Edit: Also, are there combinations of types that would cause problems?
Context:
I obviously should match the type of the User.id field for its foreign key. And, the id for my table will be derived from a uuid to prevent data leaks. So I would prefer not to change the types of either field I want in this table.
Research:
I am using sqlalchemy. Their documentation mentions how to create a composite primary key, but it doesn't discuss dealing with different types for each column.
No, this won't be a problem.
Your question seems to indicate that you think, the values of the indexed columns are somehow concatenated and then stored in the index as a single value. This is not the case. Each column value is stored independently but together. Similar to the way the column values are stored in the actual table.

Is there efficient difference between varchar and int as PK

Could somebody tell is it good idea use varchar as PK. I mean is it less efficient or equal to int/uuid?
In example: car VIN I want to use it as PK but I'm not sure as good it will be indexed or work as FK or maybe there is some pitfalls.
It depends on which kind of data you are going to store.
In some cases (I would say in most cases) it is better to use integer-based primary keys:
for instance, bigint needs only 8 bytes, varchar can require more space. For this reason, a varchar comparison is often more costly than a bigint comparison.
while joining tables it would be more efficient to join them using integer-based values rather that strings
an integer-based key as a unique key is more appropriate for table relations. For instance, if you are going to store this primary key in another tables as a separate column. Again, varchar will require more space in other table too (see p.1).
This post on stackexchange compares non-integer types of primary keys on a particular example.

Composite key with user-supplied string column, foreign keys

Let's say I have the following table
TABLE subgroups (
group_id t_group_id NOT NULL REFERENCES groups(group_id),
subgroup_name t_subgroup_name NOT NULL,
more attributes ...
)
subgroup_name is UNIQUE to a group(group_id).
A group can have many subgroups.
The subgroup_names are user-supplied. (I would like to avoid using a subgroup_id column. subgroup_name has meaning in the model and is more than just a label, I am providing a list of predetermined names but allow a user to add his owns for flexibility).
This table has 2 levels of referencing child tables containing subgroup attributes (with many-to-one relations);
I would like to have a PRIMARY KEY on (group_id, upper(trim(subgroup_name)));
From what I know, postgres doesn't allow to use PRIMARY KEY/UNIQUE on a function.
IIRC, the relational model also requires columns to be used as stored.
CREATE UNIQUE INDEX ON subgroups (group_id, upper(trim(subgroup_name))); doesn't solve my problem
as other tables in my model will have FOREIGN KEYs pointing to those two columns.
I see two options.
Option A)
Store a cleaned up subgroup name in subgroup_name
Add an extra column called subgroup_name_raw that would contained the uncleaned string
Option B)
Create both a UNIQUE INDEX and PRIMARY KEY on my key pair. (seems like a huge waste)
Any insights?
Note: I'm using Postgres 9.2
Actually you can do a UNIQUE constraint on the output of a function. You can't do it in the table definition though. What you need to do is create a unique index after. So something like:
CREATE UNIQUE INDEX subgroups_ukey2 ON subgroups(group_id, upper(trim(subgroup_name)));
PostgreSQL has a number of absolutely amazing indexing capabilities, and the ability to create unique (and partial unique) indexes on function output is quite underrated.

Setting constraint for two unique fields in PostgreSQL

I'm new to postgres. I wonder, what is a PostgreSQL way to set a constraint for a couple of unique values (so that each pair would be unique). Should I create an INDEX for bar and baz fields?
CREATE UNIQUE INDEX foo ON table_name(bar, baz);
If not, what is a right way to do that? Thanks in advance.
If each field needs to be unique unto itself, then create unique indexes on each field. If they need to be unique in combination only, then create a single unique index across both fields.
Don't forget to set each field NOT NULL if it should be. NULLs are never unique, so something like this can happen:
create table test (a int, b int);
create unique index test_a_b_unq on test (a,b);
insert into test values (NULL,1);
insert into test values (NULL,1);
and get no error. Because the two NULLs are not unique.
You can do what you are already thinking of: create a unique constraint on both fields. This way, a unique index will be created behind the scenes, and you will get the behavior you need. Plus, that information can be picked up by information_schema to do some metadata inferring if necessary on the fact that both need to be unique. I would recommend this option. You can also use triggers for this, but a unique constraint is way better for this specific requirement.

How does 'UQ' function if another field is 'PK' in MySQL workbench?

I'm creating a schema for a database using MySQL's workbench. One of my tables contains fields for a personId, as well as a national id number if they have one (which they may not).
The personId field is the one used as a unique identifier throughout the schema, so I've ticked the "PK" and "NN" options for it. Now I'd like to be able to ensure that the system won't allow a new insert with a different personId if it has the same national id as an entity that already exists. However, national ids are not primary keys and may in fact be null.
I've been looking at the 'UQ' option, but I can't find clear documentation on what it actually does. I'm worried it'll create the numbers automatically when I actually want them to be inserted by a user or left null. Does anyone know?
UQ tags a field as a unique key. This enforces uniqueness in a given field, except for NULLs. This is exactly what I need for my national id field.
From http://dev.mysql.com/doc/refman/5.5/en/create-table.html :
A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. For all engines, a UNIQUE index permits multiple NULL values for columns that can contain NULL.