Table-wide constraint in Postgres - postgresql

I'm rather new to Postgres. Is there any way I can write a constraint for a table that checks ALL character fields and makes sure there are no leading or trailing characters IF there is any value in the field?
This way I don't have to itemize each and every character field when I write the constraint.
Thanks!

No, you cannot write such a constraint, as far as I am aware.
What you could do is to create a DOMAIN that has the check function and then make all of your table columns of that domain type. Assuming that the characters you refer to are spaces:
CREATE DOMAIN varchar_no_spaces AS varchar
CHECK (left(VALUE, 1) <> ' ' AND right(VALUE, 1) <> ' ');
There are many variations on this CHECK expression, including regular expression and using different or multiple characters. See the string functions for more options.
Then:
CREATE TABLE mytable (
f1 varchar_no_spaces,
...
);
Effectively you delegate the constraint check to the level of the domain.
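A quick sanity check of the domain (table and values are illustrative):

```sql
INSERT INTO mytable (f1) VALUES ('abc');    -- accepted
INSERT INTO mytable (f1) VALUES (' abc');   -- rejected: leading space
INSERT INTO mytable (f1) VALUES ('abc ');   -- rejected: trailing space
INSERT INTO mytable (f1) VALUES (NULL);     -- accepted: NULL makes the CHECK unknown, which passes
```

A regular-expression variant of the same CHECK would be CHECK (VALUE !~ '^ | $'), i.e. reject values that start or end with a space.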

Related

Alphanumeric sorting without any pattern on the strings [duplicate]

I've got a Postgres ORDER BY issue with the following table:
em_code name
EM001 AAA
EM999 BBB
EM1000 CCC
To insert a new record into the table, I:
- select the last record with SELECT * FROM employees ORDER BY em_code DESC
- strip the alphabetic prefix from em_code using a regexp and store it in ec_alpha
- cast the remaining part to an integer, ec_num
- increment ec_num by one
- pad with sufficient zeros and prefix with ec_alpha again
When em_code reaches EM1000, the above algorithm fails.
The first step returns EM999 instead of EM1000, so the algorithm generates EM1000 as the new em_code again, breaking the unique key constraint.
Any idea how to select EM1000?
Since Postgres 10, it is possible to specify an ICU collation that sorts strings with embedded numbers naturally:
https://www.postgresql.org/docs/10/collation.html
-- First create a collation with numeric sorting
CREATE COLLATION numeric (provider = icu, locale = 'en@colNumeric=yes');
-- Alter table to use the collation
ALTER TABLE "employees" ALTER COLUMN "em_code" type TEXT COLLATE numeric;
Now just query as you would otherwise.
SELECT * FROM employees ORDER BY em_code
On my data, I get results in this order (note that it also sorts foreign numerals):
Value
0
0001
001
1
06
6
13
۱۳
14
One approach you can take is to create a naturalsort function for this. Here's an example, written by Postgres legend RhodiumToad.
create or replace function naturalsort(text)
returns bytea language sql immutable strict as $f$
select string_agg(convert_to(coalesce(r[2], length(length(r[1])::text) || length(r[1])::text || r[1]), 'SQL_ASCII'),'\x00')
from regexp_matches($1, '0*([0-9]+)|([^0-9]+)', 'g') r;
$f$;
Source: http://www.rhodiumtoad.org.uk/junk/naturalsort.sql
To use it simply call the function in your order by:
SELECT * FROM employees ORDER BY naturalsort(em_code) DESC
The reason for the failure is that em_code sorts alphabetically (instead of numerically, as you would want it): '1' sorts before '9', so 'EM1000' comes before 'EM999'.
You could solve it like this:
SELECT * FROM employees
ORDER BY substring(em_code, 3)::int DESC;
It would be more efficient to drop the redundant 'EM' from your em_code - if you can - and save an integer number to begin with.
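If you can make that change, a minimal sketch (assumes Postgres 10+ for the identity column; the names and the 3-digit pad are illustrative):

```sql
CREATE TABLE employees (
    em_num integer GENERATED ALWAYS AS IDENTITY,  -- store only the number
    name   text
);

-- Build the display code on the way out; greatest() keeps lpad()
-- from truncating numbers wider than the 3-digit pad:
SELECT 'EM' || lpad(em_num::text, greatest(3, length(em_num::text)), '0') AS em_code,
       name
FROM   employees
ORDER  BY em_num DESC;
```

Sorting is then plain integer ordering, and the EM999 → EM1000 problem disappears.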
Answer to question in comment
To strip any and all non-digits from a string:
SELECT regexp_replace(em_code, E'\\D','','g')
FROM employees;
\D is the regular expression class-shorthand for "non-digits".
'g' as 4th parameter is the "globally" switch to apply the replacement to every occurrence in the string, not just the first.
After replacing every non-digit with the empty string, only digits remain.
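For example, with one of the codes from the question:

```sql
SELECT regexp_replace('EM1000', '\D', '', 'g');        -- '1000'
SELECT regexp_replace('EM1000', '\D', '', 'g')::int;   -- 1000
```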
This always comes up in questions and in my own development and I finally tired of tricky ways of doing this. I finally broke down and implemented it as a PostgreSQL extension:
https://github.com/Bjond/pg_natural_sort_order
It's free to use, MIT license.
Basically it just normalizes the numerics (zero pre-pending numerics) within strings such that you can create an index column for full-speed sorting au naturel. The readme explains.
The advantage is you can have a trigger do the work and not your application code. It will be calculated at machine-speed on the PostgreSQL server and migrations adding columns become simple and fast.
You can use just this:
ORDER BY length(substring(em_code FROM '[0-9]+')), em_code
I wrote about this in detail in this related question:
Humanized or natural number sorting of mixed word-and-number strings
(I'm posting this answer as a useful cross-reference only, so it's community wiki).
I came up with something slightly different.
The basic idea is to create an array of tuples (integer, string) and then order by these. The magic number 2147483647 is int32_max, used so that strings are sorted after numbers.
ORDER BY ARRAY(
SELECT ROW(
CAST(COALESCE(NULLIF(match[1], ''), '2147483647') AS INTEGER),
match[2]
)
FROM REGEXP_MATCHES(col_to_sort_by, '(\d*)|(\D*)', 'g')
AS match
)
I thought of another way of doing this that uses less DB storage than padding and saves time compared with calculating on the fly.
https://stackoverflow.com/a/47522040/935122
I've also put it on GitHub
https://github.com/ccsalway/dbNaturalSort
The following solution is a combination of various ideas presented in another question, as well as some ideas from the classic solution:
create function natsort(s text) returns text immutable language sql as $$
select string_agg(r[1] || E'\x01' || lpad(r[2], 20, '0'), '')
from regexp_matches(s, '(\D*)(\d*)', 'g') r;
$$;
The design goals of this function were simplicity and pure string operations (no custom types and no arrays), so it can easily be used as a drop-in solution, and is trivial to be indexed over.
Note: If you expect numbers with more than 20 digits, you'll have to replace the hard-coded maximum length 20 in the function with a suitable larger length. Note that this will directly affect the length of the resulting strings, so don't make that value larger than needed.
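Usage is then a drop-in replacement in ORDER BY, and the function can back an expression index (the index name is illustrative):

```sql
CREATE INDEX employees_natsort_idx ON employees (natsort(em_code));

SELECT * FROM employees ORDER BY natsort(em_code);
```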

Postgres CHAR check constraint not evaluating correctly

My goal is to have a column which accepts only letters and underscores in Postgres.
For some reason, though, I'm not managing to get Postgres to enforce the regex correctly.
Things I tried:
Produces column which won't accept strings with digits
action_type CHAR(100) check (action_type ~ '^(?:[^[:digit:]]*)$')
Produces column which won't accept anything
action_type CHAR(100) check (action_type ~ '^(?:[^[:digit:] ]*)$')
Produces column which won't accept anything
action_type CHAR(100) check (action_type ~ '^([[:alpha:]_]*)$'),
I have tried multiple variations of the above, as well as SIMILAR TO instead of ~. In my experience the column either accepts everything or nothing, depending on the given constraint.
I'm running this on the timescaledb docker image locally which is running PostgreSQL 12.5.
Any help would be greatly appreciated as I am at my wits end.
Try this:
CREATE TABLE your_table(
action_type text
);
ALTER TABLE your_table ADD CONSTRAINT check_letters
CHECK (action_type ~ '^[a-z_]*$');
This should only allow the characters a-z and the underscore _ in the action_type column. Note that the column is declared text rather than CHAR(100): CHAR blank-pads stored values with trailing spaces, which is why your anchored regexes never matched anything.
Also note that this is case sensitive; if you need a case-insensitive match, use ~* instead of ~.
It also allows the empty string; if you don't want that, use '^[a-z_]+$' instead.
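A quick check of the constraint against a few values:

```sql
INSERT INTO your_table VALUES ('valid_name');   -- accepted
INSERT INTO your_table VALUES ('Invalid');      -- rejected: uppercase (use ~* to allow)
INSERT INTO your_table VALUES ('abc123');       -- rejected: digits
INSERT INTO your_table VALUES ('');             -- accepted with *, rejected with +
```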

How to correctly setup CHECK constraint with multiple LIKE statements?

I am trying to enforce a simple rule for inserting values of bank_account column:
- bank account can consist of only digits
- bank account can have one hyphen '-' or zero hyphens
- bank account can have one slash '/' or zero slashes
I have this check constraint:
alter table someTable
add constraint chk_bank check
(
(bank_account not like '%-%-%')
and
(bank_account not like '%/%/%')
and
(bank_account not like '%[^0123456789\-/]%')
)
And I have these bank_account numbers (they are fictional):
12-4414424434/0987
987654321/9999
NULL
41-101010101010/0011
500501502503/7410
NULL
60-6000070000/1234
7987-42516/7845
NULL
12-12121212/2121
When enabling the constraint I get this error:
The ALTER TABLE statement conflicted with the CHECK constraint "chk_bank".
The conflict occurred in database "x", table "someTable", column 'bank_account'.
I tried some select queries but I can't find the wrong numbers.
Is my check constraint written wrong? If so, how should I change it to match my requirements?
Does check constraint ignore NULL values or are these a problem?
According to the documentation, there is no escape character by default. You must use the escape clause to signify that the backslash is the escape character:
...and bank_account not like '%[^0123456789\-/]%' escape '\'...
The easy way to check the logic is to select the conditions individually, e.g.:
select bank_account,
case when bank_account not like '%-%-%' then 1 else 0 end as CheckHyphens,
case when bank_account not like '%/%/%' then 1 else 0 end as CheckSlashes,
case when bank_account not like '%[^0123456789\-/]%' then 1 else 0 end as CheckIllegalCharacters,
case when bank_account not like '%[^0123456789\-/]%' escape '\' then 1 else 0 end as CheckIllegalCharactersWithEscape
from YourTable;
It becomes clear that your last condition is failing. Adding an escape clause corrects the pattern.
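Putting it together, the corrected constraint (unchanged except for the ESCAPE clause on the last condition) is:

```sql
alter table someTable
add constraint chk_bank check
(
    (bank_account not like '%-%-%')
    and
    (bank_account not like '%/%/%')
    and
    (bank_account not like '%[^0123456789\-/]%' escape '\')
)
```

As for the NULL values: a CHECK constraint is satisfied when its condition evaluates to UNKNOWN, so the NULL rows are not the problem.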

T-SQL Insert null if ' ' input is empty

My web-application allows for token replacements and therefore my SQL INSERT query looks something like this:
INSERT INTO mytable (Col1, Col2, Col3)
VALUES ('[Col1Value]', '[Col2Value]', '[Col3Value]')
The web app is replacing whats inside the brackets [Col1Value] with the input entered into the form field.
Problem is, when an input field is left empty, this query still inserts '' (an empty string), so the field is not considered NULL.
I'm trying to use SQL Server's default value/binding feature so that all columns that are NULL get a default value of '--'. But because my query inserts a blank '' instead of NULL, SQL Server never falls back to the default, and the column shows blank rather than my desired '--'.
Any ideas how I can make sure '' is inserted as NULL rather than a space or empty string, so that SQL Server replaces it with my default value of '--'?
There is no easy way around this...
How are you inserting the values? If you build these statements by literal text replacement, you are stumbling into the dangerous fields of SQL injection... Use parameters!
One approach might be an INSERT through a stored procedure, another an INSTEAD OF trigger, and a third uses the fact that LEN() does not count trailing blanks:
SELECT LEN('') --returns 0
,LEN(' ') --returns 0 too
You can use this in an expression like this:
CASE WHEN LEN(@YourInputValue) = 0 THEN NULL ELSE @YourInputValue END
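Equivalently, NULLIF expresses the same thing a little more compactly (column and variable names are illustrative):

```sql
INSERT INTO mytable (Col1, Col2, Col3)
VALUES (NULLIF(@Col1Value, ''),
        NULLIF(@Col2Value, ''),
        NULLIF(@Col3Value, ''));
```

Note that NULLIF(@v, '') also catches ' ', because T-SQL string comparison ignores trailing spaces.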

TSQL Prefixing String Literal on Insert - Any Value to This, or Redundant?

I just inherited a project that has code similar to the following (rather simple) example:
DECLARE #Demo TABLE
(
Quantity INT,
Symbol NVARCHAR(10)
)
INSERT INTO #Demo (Quantity, Symbol)
SELECT 127, N'IBM'
My interest is with the N before the string literal.
I understand that the prefix N is to specify encoding (in this case, Unicode). But since the select is just for inserting into a field that is clearly already Unicode, wouldn't this value be automatically upcast?
I've run the code without the N and it appears to work, but am I missing something that the previous programmer intended? Or was the N an oversight on his/her part?
I expect behavior similar to when I pass an int to a decimal field (auto-upcast). Can I get rid of those Ns?
Your test is not really valid. Try something like a Chinese character instead; I remember that if you don't prefix the literal with N, the correct character will not be inserted.
In this example, the first one shows a question mark while the bottom one shows the correct character:
select '作'
select N'作'
A better example, even here the output is not the same
declare @v nvarchar(50), @v2 nvarchar(50)
select @v = '作', @v2 = N'作'
select @v, @v2
Since what you have looks like a stock table, why are you using Unicode at all? Are there even ticker symbols that need Unicode? I have never seen any, and this includes ISINs, CUSIPs and SEDOLs.
Yes, SQL Server will automatically convert (widen) varchar to nvarchar, so you can remove the N in this case. Of course, if you're specifying a string literal with characters that can't be represented in the database's default collation, then you need it.
It's like you can suffix a number with "L" in C et al to indicate it's a long literal instead of an int. Writing N'IBM' is either being precise or a slave to habit, depending on your point of view.
One trap for the unwary: nvarchar doesn't get automatically converted to varchar, and this can be an issue if your application is all Unicode and your database isn't. For example, we had this with the jTDS JDBC driver, which bound all parameter values as nvarchar, resulting in statements effectively like this:
select * from purchase where purchase_reference = N'AB1234'
(where purchase_reference was a varchar column)
Since the automatic conversions are only one way, that became:
select * from purchase where CONVERT(NVARCHAR, purchase_reference) = N'AB1234'
and therefore the index of purchase_reference wasn't used.
By contrast, the reverse is fine: if purchase_reference was an nvarchar, and an application passed in a varchar parameter, then the rewritten query:
select * from purchase where purchase_reference = CONVERT(NVARCHAR, 'AB1234')
would be fine. In the end we had to disable binding parameters as Unicode, hence causing a raft of i18n problems that were considered less serious.