Postgresql: Concatenating regexp_split_to_table into a single string - postgresql

In Postgresql, I need to take an ASCII string like CNR39XL and turn it into a number like 21317392311 (ASCII values - 65, but digits left as is). This is to take some logic in our software and push it down into the databse level. I’m partway there with this:
ourdb=> SELECT CASE
ourdb-> WHEN ASCII(c) < 65 THEN c::integer
ourdb-> ELSE ASCII(c)-65
ourdb-> END
ourdb-> FROM regexp_split_to_table('CNR39XL','') s(c);
case
------
2
13
17
3
9
23
11
(7 rows)
However, I can't figure out how to take that final set of numbers and convert it into the string. What am I looking for?

Use string_agg
SELECT CAST(
string_agg(
CAST (CASE ... END AS text),
''
)
AS numeric)
FROM ...

Related

Padding Fields With White Space

I have the following piece of code in my SELECT statement -
SELECT convert(varchar (24),ra.Reference)
If a result is - R0_2, so 4 characters, how do you go about padding the trailing space (to the right) with the remaining 20 characters to make up 24?
Similar in that if I have a figure of say 18.00 what I want is to add a # to the front, which I know I can achieve with a CONCAT.
However this field I want to be 16 characters and any leading space to be filled with white space, so this example would look like -
'xxxxxxxxxx#18.00' (where x is a blank space)
Thank you for any advice.
One trick you can use is to just concatenate to the string an amount of padding which is guaranteed to fill the missing spaces. For the case of a string 24 characters long, in your first example, we can concatenate 24 spaces to the end of that string. Then, take the first 24 characters from the left, and the resulting string should be right padded by spaces. Similar logic applies to the other case.
First query:
SELECT LEFT(CONVERT(varchar(24), ra.Reference) + ' ', 24)
FROM yourTable
Second query:
SELECT RIGHT(' ' + '#' + CONVERT(varchar(16), ra.TotalValue), 16)
FROM yourTable
You could also use REPLICATE to accurately pad based on the length of text for each cell to ensure it's always 24 characters:
DECLARE #Test1 VARCHAR(24) = 'Test',
#Test2 VARCHAR(24) = 'Longer String'
SELECT CONCAT(#Test1, REPLICATE(N' ', 24 - LEN(#Test1))),
CONCAT(#Test2, REPLICATE(N' ', 24 - LEN(#Test2)))
And for the #....
DECLARE #Number DECIMAL(4,2) = 18.00
SELECT CONCAT(REPLICATE(' ', 15 - LEN(CONVERT(VARCHAR(16), #Number))), '#',#Number)
I used 15 here despite it being 16 characters to account for the addition of the #

How to replace captured group with evaluated expression (adding an integer value to capture group)

I need to convert some strings with this format:
B12F34
to something like that:
Building 12 - Floor 34
but I have to add a value, say 10, to the second capture group so the new string would be as:
Building 12 - Floor 44
I can use this postgres sentence to get almost everything done, but I don't know how to add the value to the second capture group.
SELECT regexp_replace('B12F34', 'B(\d+)F(\d+)', 'Building \1 - Floor \2', 'g');
I have been searching for a way to add a value to \2 but all I have found is that I can use 'E'-modifier, and then \1 and \2 need to be \\1 and \\2:
SELECT regexp_replace('B12F34', 'B(\d+)F(\d+)', E'Building \\1 - Floor \\2', 'g')
I need some sentence like this one:
SELECT regexp_replace('B12F34', 'B(\d+)F(\d+)', E'Building \\1 - Floor \\2+10', 'g')
to get ........ Floor 44 instead of ........ Floor 34+10
You can not do this in regexp alone because regexp does not support math on captured groups even if they are all numeric characters. So you have to get the group that represents the floor number, do the math and splice it back in:
SELECT regexp_replace('B12F34', 'B(\d+)F(\d+)', 'Building \1 - Floor ') ||
((regexp_matches('B12F34', '[0-9]+$'))[1]::int + 10)::text;
Not very efficient because of the two regexp calls. Another option is to just get the two numbers in a sub-query and assemble the string in the main query:
SELECT format('Building %L - Floor %L', m.num[1], (m.num[2])::int + 10)
FROM (
SELECT regexp_matches('B12F34', '[0-9]+', 'g') AS num) m;

Removing decimal places

I have a set of decimal values, but I need to remove the values after the decimal if they are zero.
17.00
23.50
100.00
512.79
become
17
23.50
100
512.79
Currently, I convert to a string and replace out the trailing .00 - Is there a better method?
REPLACE(CAST(amount as varchar(15)), '.00', '')
This sounds like it is purely a data presentation problem. As such you should let the receiving application or reporting software take care of the formatting.
You could try converting the .00s to datatype int. That would truncate the decimals. However, as all the values appear in one column they will have to have the same type. Making everything an int would ruin your rows with actual decimal places.
As a SQL solution to a presentation problem, I think what you have is OK.
I would advice you to compare the raw decimal value with itself floored. Example code:
declare
#Val1 decimal(9,2) = 17.00,
#Val2 decimal(9,2) = 23.50;
select
case when FLOOR ( #Val1 ) = #Val1
then cast( cast(#Val1 as int) as varchar)
else cast(#Val1 as varchar) end,
case when FLOOR ( #Val2 ) = #Val2
then cast( cast(#Val2 as int) as varchar)
else cast(#Val2 as varchar) end
-------------
17 | 23.50

Extracting values from non-standard markup strings in PostgreSQL

Unfortunately, I have a table like the following:
DROP TABLE IF EXISTS my_list;
CREATE TABLE my_list (index int PRIMARY KEY, mystring text, status text);
INSERT INTO my_list
(index, mystring, status) VALUES
(12, '', 'D'),
(14, '[id] 5', 'A'),
(15, '[id] 12[num] 03952145815', 'C'),
(16, '[id] 314[num] 03952145815[name] Sweet', 'E'),
(19, '[id] 01211[num] 03952145815[name] Home[oth] Alabama', 'B');
Is there any trick to get out number of [id] as integer from the mystring text shown above? As though I ran the following query:
SELECT index, extract_id_function(mystring), status FROM my_list;
and got results like:
12 0 D
14 5 A
15 12 C
16 314 E
19 1211 B
Preferably with only simple string functions and if not regular expression will be fine.
If I understand correctly, you have a rather unconventional markup format where [id] is followed by a space, then a series of digits that represents a numeric identifier. There is no closing tag, the next non-numeric field ends the ID.
If so, you're going to be able to do this with non-regexp string ops, but only quite badly. What you'd really need is the SQL equivalent of strtol, which consumes input up to the first non-digit and just returns that. A cast to integer will not do that, it'll report an error if it sees non-numeric garbage after the number. (As it happens I just wrote a C extension that exposes strtol for decoding hex values, but I'm guessing you don't want to use C extensions if you don't even want regex...)
It can be done with string ops if you make the simplifying assumption that an [id] nnnn tag always ends with either end of string or another tag, so it's always [ at the end of the number. We also assume that you're only interested in the first [id] if multiple appear in a string. That way you can write something like the following horrible monstrosity:
select
"index",
case
when next_tag_idx > 0 then substring(cut_id from 0 for next_tag_idx)
else cut_id
end AS "my_id",
"status"
from (
select
position('[' in cut_id) AS next_tag_idx,
*
from (
select
case
when id_offset = 0 then null
else substring(mystring from id_offset + 4)
end AS cut_id,
*
from (
select
position('[id] ' in mystring) AS id_offset,
*
from my_list
) x
) y
) z;
(If anybody ever actually uses that query for anything, kittens will fall from the sky and splat upon the pavement, wailing in horror all the way down).
Or you can be sensible and just use a regular expression for this kind of string processing, in which case your query (assuming you only want the first [id]) is:
regress=> SELECT
"index",
coalesce((SELECT (regexp_matches(mystring, '\[id\]\s?(\d+)'))[1])::integer, 0) AS my_id,
status
FROM my_list;
index | my_id | status
-------+----------------+--------
12 | 0 | D
14 | 5 | A
15 | 12 | C
16 | 314 | E
19 | 01211 | B
(5 rows)
Update: If you're having issues with unicode handling in regex, upgrade to Pg 9.2. See https://stackoverflow.com/a/14293924/398670

how does one extract octets from an inet value in postgres sql?

I would like to convert an inet formatted IPv4 address into the integer components.
For example, turn '101.255.30.40' into oct1=101, oct2=255, oct3=30, and oct4=40.
There are regex expressions that should do this if I cast the inet as a varchar, but that seems inelegant. Is there a 1-line function for returning the nth octet of an inet?
select inet_to_octet('101.255.30.40', 4) as temp; -- returns temp=40?
I finally got an excellent answer from a co-worker...
For some flavors of sql, use "split_part" along with host(inet) to get the text field.
select split_part(host('101.255.30.40'::inet), '.', 1);
select split_part(host('101.255.30.40'::inet), '.', 2);
select split_part(host('101.255.30.40'::inet), '.', 3);
select split_part(host('101.255.30.40'::inet), '.', 4);
results in
101
255
30
40
If you want to get trickier and handle IPv6, use a mask to speed up the operation along with case statements to get the IP version:
select
(case
when family('101.255.30.40'::inet) = 4 then split_part(host(broadcast(set_masklen('101.255.30.40'::inet, 32))), '.', 4)::varchar
when family('101.255.30.40'::inet) = 6 then split_part(host(broadcast(set_masklen('101.255.30.40'::inet, 64))), ':', 4)::varchar
else null end)::varchar as octet4;
select
(case
when family('2604:8f00:4:80b0:3925:c69c:458:3f7b'::inet) = 4 then split_part(host(broadcast(set_masklen('2604:8f00:4:80b0:3925:c69c:458:3f7b'::inet, 32))), '.', 4)::varchar
when family('2604:8f00:4:80b0:3925:c69c:458:3f7b'::inet) = 6 then split_part(host(broadcast(set_masklen('2604:8f00:4:80b0:3925:c69c:458:3f7b'::inet, 64))), ':', 4)::varchar
else null end)::varchar as octet4;
results in
40
80b0
you can then add in a hex-to-int conversion into the case statement if you want to cast the IPv6 as a number instead.
Unless you want to try to contribute the function to the inet datatype, you'll need to rely on a string based version. Maybe put something like this (but with some error checking) into an SQL function for easy access?:
CREATE OR REPLACE FUNCTION extract_octet(inet, integer) RETURNS integer AS $$
SELECT ((regexp_split_to_array(host($1), E'\\.'))[$2])::int;
$$ LANGUAGE SQL;
select extract_octet(inet '192.26.22.2', 2)
Output: 26
Here are a couple of one-liners for the separate octets of an IPv4 address:
SELECT substring(host('1.2.3.4'::inet) FROM '^([0-9]+)\.[0-9]+\.[0-9]+\.[0-9]+$');
will return 1
SELECT substring(host('1.2.3.4'::inet) FROM '^[0-9]+\.([0-9]+)\.[0-9]+\.[0-9]+$');
will return 2
SELECT substring(host('1.2.3.4'::inet) FROM '^[0-9]+\.[0-9]+\.([0-9]+)\.[0-9]+$');
will return 3
SELECT substring(host('1.2.3.4'::inet) FROM '^[0-9]+\.[0-9]+\.[0-9]+\.([0-9]+)$');
will return 4