I've got a Postgres ORDER BY issue with the following table:
em_code name
EM001 AAA
EM999 BBB
EM1000 CCC
To insert a new record to the table,
I select the last record with SELECT * FROM employees ORDER BY em_code DESC
Strip alphabets from em_code usiging reg exp and store in ec_alpha
Cast the remating part to integer ec_num
Increment by one ec_num++
Pad with sufficient zeors and prefix ec_alpha again
When em_code reaches EM1000, the above algorithm fails.
First step will return EM999 instead EM1000 and it will again generate EM1000 as new em_code, breaking the unique key constraint.
Any idea how to select EM1000?
Since Postgres 9.6, it is possible to specify a collation which will sort columns with numbers naturally.
https://www.postgresql.org/docs/10/collation.html
-- First create a collation with numeric sorting
CREATE COLLATION numeric (provider = icu, locale = 'en#colNumeric=yes');
-- Alter table to use the collation
ALTER TABLE "employees" ALTER COLUMN "em_code" type TEXT COLLATE numeric;
Now just query as you would otherwise.
SELECT * FROM employees ORDER BY em_code
On my data, I get results in this order (note that it also sorts foreign numerals):
Value
0
0001
001
1
06
6
13
۱۳
14
One approach you can take is to create a naturalsort function for this. Here's an example, written by Postgres legend RhodiumToad.
create or replace function naturalsort(text)
returns bytea language sql immutable strict as $f$
select string_agg(convert_to(coalesce(r[2], length(length(r[1])::text) || length(r[1])::text || r[1]), 'SQL_ASCII'),'\x00')
from regexp_matches($1, '0*([0-9]+)|([^0-9]+)', 'g') r;
$f$;
Source: http://www.rhodiumtoad.org.uk/junk/naturalsort.sql
To use it simply call the function in your order by:
SELECT * FROM employees ORDER BY naturalsort(em_code) DESC
The reason is that the string sorts alphabetically (instead of numerically like you would want it) and 1 sorts before 9.
You could solve it like this:
SELECT * FROM employees
ORDER BY substring(em_code, 3)::int DESC;
It would be more efficient to drop the redundant 'EM' from your em_code - if you can - and save an integer number to begin with.
Answer to question in comment
To strip any and all non-digits from a string:
SELECT regexp_replace(em_code, E'\\D','','g')
FROM employees;
\D is the regular expression class-shorthand for "non-digits".
'g' as 4th parameter is the "globally" switch to apply the replacement to every occurrence in the string, not just the first.
After replacing every non-digit with the empty string, only digits remain.
This always comes up in questions and in my own development and I finally tired of tricky ways of doing this. I finally broke down and implemented it as a PostgreSQL extension:
https://github.com/Bjond/pg_natural_sort_order
It's free to use, MIT license.
Basically it just normalizes the numerics (zero pre-pending numerics) within strings such that you can create an index column for full-speed sorting au naturel. The readme explains.
The advantage is you can have a trigger do the work and not your application code. It will be calculated at machine-speed on the PostgreSQL server and migrations adding columns become simple and fast.
you can use just this line
"ORDER BY length(substring(em_code FROM '[0-9]+')), em_code"
I wrote about this in detail in this related question:
Humanized or natural number sorting of mixed word-and-number strings
(I'm posting this answer as a useful cross-reference only, so it's community wiki).
I came up with something slightly different.
The basic idea is to create an array of tuples (integer, string) and then order by these. The magic number 2147483647 is int32_max, used so that strings are sorted after numbers.
ORDER BY ARRAY(
SELECT ROW(
CAST(COALESCE(NULLIF(match[1], ''), '2147483647') AS INTEGER),
match[2]
)
FROM REGEXP_MATCHES(col_to_sort_by, '(\d*)|(\D*)', 'g')
AS match
)
I thought about another way of doing this that uses less db storage than padding and saves time than calculating on the fly.
https://stackoverflow.com/a/47522040/935122
I've also put it on GitHub
https://github.com/ccsalway/dbNaturalSort
The following solution is a combination of various ideas presented in another question, as well as some ideas from the classic solution:
create function natsort(s text) returns text immutable language sql as $$
select string_agg(r[1] || E'\x01' || lpad(r[2], 20, '0'), '')
from regexp_matches(s, '(\D*)(\d*)', 'g') r;
$$;
The design goals of this function were simplicity and pure string operations (no custom types and no arrays), so it can easily be used as a drop-in solution, and is trivial to be indexed over.
Note: If you expect numbers with more than 20 digits, you'll have to replace the hard-coded maximum length 20 in the function with a suitable larger length. Note that this will directly affect the length of the resulting strings, so don't make that value larger than needed.
I have to produce a dynamically generated T-SQL script that inserts records into various tables. I've done a bunch of searching and testing but can't seem to find the path I'm looking for.
I know that the following is valid SQL:
INSERT INTO [MyTable] ( [Col1], [Col2], [Col3] )
SELECT N'Val1', N'Val2', N'Val3';
But, is it at all possible to write something akin to this:
INSERT INTO [MyTable]
SELECT [Col1] = N'Val1', [Col2] = N'Val2', [Col3] = N'Val3';
By having the columns in the select statement, I'm able to do it all at once vs writing 2 separate lines. Obviously my idea doesn't work, I'm trying to figure out whether something similar is possible or I need to stick with the first one.
Much appreciated.
Best practice for insert statements is to specify the columns list in the insert clause, and for very good reasons:
It's far more readable. You know exactly what value goes into what column.
You don't have to provide values to nullable \ default valued columns.
You're not bound to the order of the columns in the table.
In case a column is added to the table, your insert statement might not break (It will if the newly added column is not nullable and doesn't have a default value).
In some cases, SQL Server demands you specify the columns list explicitly, like when identity_insert is set to on.
And in any case, the column names or aliases in the select clause of the insert...select statement does not have any effect as to what target columns the value column should go to. values are directed to target based only on their location in the statement.
I would like to know how I can insert regular expression in a table column in a PostgreSQl table.
For example I have column called "rule" in a table where I need to store the expression ^[0-9]+$. I tried:
insert into rule_master(rule)
values('^[0-9]+$') where rule_id='7'
But I am getting error syntax near where is wrong. I tried this with and with out single quotes. Please suggest me a solution.
It appears you want to UPDATE an existing record. In that case you should do:
UPDATE rule_master
SET rule = '^[0-9]+$'
WHERE rule_id = '7';
But if this is indeed a new record and you want to INSERT that regex with the value of "rule_id" then do:
INSERT INTO rule_master(rule_id, rule)
VALUES ('7', '^[0-9]+$');
I have data in DB2 then i want to insert that data to SQL.
The DB2 data that i had is like :
select char('AAA ') as test from Table_1
But then, when i select in SQL after doing insert, the data become like this.
select test from Table_1
result :
test
------
AAA
why Space character read into box character. How do I fix this so that the space character is read into.
Or is there a setting I need to change? or do I have to use a parameter?
I used AS400 and datastage.
Thank you.
Datastage appends pad characters so you know that there are spaces there. The pad character is 0x00 (NUL) by default and that's what you're seeing.
Research the APT_STRING_PADCHAR environment variable; you can set it to something else if you want.
The 0x00 characters are not actually in your database. The short answer is, you can safely ignore it.
When you said:
select char('AAA ') as test from Table_1
You were not actually showing any data from the table. Instead you were showing an expression casting a constant AAA as a character value, and giving that result column the name test which coincidentally seems to be the name of a column in the table, although that coincidence doesn't matter here.
Then your 2nd statement does show the contents of the database column.
select test from Table_1
Find out what the hexadecimal value actually is.
I am looking to extract Sybase datatype for all the columns in a table. When I try to achieve this using $sth->{TYPE}, I get a numeric version of the datatype (i.e. instead of sybase datatype varchar, I get 0).
From the DBD::Sybase documentation, I noticed that SYBTYPE attribute of syb_describe function might be able to produce what I am looking for. But it seems that my understanding is not proper. SYBTYPE also prints datatype in numeric form only.
Is there any way to fetch the textual representation of actual Sybase datatype (instead of the number)?
It sounds like you wish to reverse engineer the create table definition. Here is an SQL script you can use for Sybase or SQL Server tables.
select c.name,
"type(size)"=case
when t.name in ("char", "varchar") then
t.name + "(" + rtrim(convert(char(3), c.length)) + ")"
else t.name
end,
"null"=case
when convert(bit, (c.status & 8)) = 0 then "NOT NULL"
else "NULL"
end
from syscolumns c, systypes t
where c.id = object_id("my_table_name")
and c.usertype *= t.usertype
order by c.colid
go
Note: This could still be edited with a nawk script to create a real SQL schema file.
The nawk script would strip the header, add "create table my_table_name", add commas, strip the footer and add a "go".
Good SQL, good night!
I found a workaround (Note: This does not answer the question though):
What I did was simply joined the sysobjects, systypes and syscolumns system tables.