Are there different flavors of null in a PostgreSQL plpgsql variable of type jsonb? - postgresql

Following up on a question I asked here...
@Bergi provided a helpful fiddle which produces the following table...
+------------------+--------+------+--------+
| val              | ->     | ->>  | ->> :: |
+------------------+--------+------+--------+
| {"prop": "null"} | "null" | null | null   |
| {"prop": null}   | null * |      | **     |
| {"prop": 42}     | 42     | 42   | 42     |
| {"prop": []}     | []     | []   | []     |
| {"prop": {}}     | {}     | {}   | {}     |
| {}               |        |      |        |
+------------------+--------+------+--------+
The fiddle itself is clearer than this table: some of the nulls it returns are rendered in grey and some in black, and the difference is clearly meaningful. Note in particular the * and ** cells.
Now note two snippets from the code in my original post...
SELECT (_main_jsonb->'i_am_null') INTO _sub_jsonb; -- *
SELECT (_main_jsonb->>'i_am_null')::jsonb INTO _sub_jsonb; -- **
So my first question is: what precisely is being stored in _sub_jsonb in the * case? I understand that _sub_jsonb is being assigned through a different sequence of casts, but I do not understand what is actually stored in _sub_jsonb in the first case. Clearly something different... a null that's not a null, somehow.
My second question is a re-ask of best practices, because I think I gleaned from a passing comment from Bergi (" Commenting in the first line demonstrates why casting string fields back to jsonb is a bad idea. ") that I should probably stick with this (below) approach to discovering nulls, as best practice:
DO
$$
DECLARE
    _main_jsonb jsonb = '{"i_am_null": null}';
    _sub_jsonb jsonb;
BEGIN
    SELECT (_main_jsonb->'i_am_null') INTO _sub_jsonb;
    IF _sub_jsonb = 'null'::jsonb THEN
        RAISE INFO 'Best practice to identify a jsonb null. I think.';
    END IF;
END;
$$
Did I read your guidance rightly, Bergi?
Thx again, Bergi and Schwern!
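For reference, here is a minimal sketch of how the two cases can be told apart, assuming PostgreSQL 9.4 or later (where jsonb and jsonb_typeof() are available). The column aliases are made up for illustration. The -> operator returns a jsonb value whose JSON type is null, which is not the same thing as an SQL NULL. The ->> operator returns an SQL NULL for a JSON null, and casting that to jsonb leaves it an SQL NULL:

```sql
-- All column aliases below are illustrative names, not part of any API.
SELECT
    j -> 'i_am_null'                    AS arrow_value,          -- jsonb holding JSON null
    (j -> 'i_am_null') IS NULL          AS arrow_is_sql_null,    -- false
    jsonb_typeof(j -> 'i_am_null')      AS arrow_json_type,      -- 'null'
    (j ->> 'i_am_null')::jsonb IS NULL  AS cast_is_sql_null,     -- true
    (j -> 'no_such_key') IS NULL        AS missing_is_sql_null   -- true
FROM (SELECT '{"i_am_null": null}'::jsonb AS j) AS t;
```

On that reading, comparing against 'null'::jsonb or checking jsonb_typeof(...) = 'null' detects a JSON null that is actually present, while IS NULL on the result of -> indicates that the key is missing.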

Related

Select rows where column matches any IP address in inet[] array

I'm trying to display rows that have at least one value in the inet[] type column.
I don't know a better way, so the easiest approach seemed to be something like the query below. However, it also returns rows containing {}, which I guess counts as empty for the inet[] type but not as null from the perspective of the IS NOT NULL test?
peering_manager=# select asn,name,potential_internet_exchange_peering_sessions from peering_autonomoussystem where potential_internet_exchange_peering_sessions is not null order by potential_internet_exchange_peering_sessions limit 1;
asn | name | potential_internet_exchange_peering_sessions
------+---------------------------------+----------------------------------------------
6128 | Cablevision Systems Corporation | {}
(1 row)
peering_manager=#
So trying to dig a little more into it, I think maybe if I can try to match the existence of any valid IP address in the inet[] column, that would work, however I'm getting an error and I don't understand what it's referring to or how to resolve it to achieve the desired results:
peering_manager=# select asn,name,potential_internet_exchange_peering_sessions from peering_autonomoussystem where potential_internet_exchange_peering_sessions << inet '0.0.0.0/0';
ERROR: operator does not exist: inet[] << inet
LINE 1: ...here potential_internet_exchange_peering_sessions << inet '0...
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
peering_manager=#
Maybe it's saying that the << operator is invalid for the inet[] type or that the << operator is an invalid operation when trying to query an inet type from a value stored as an inet[] type? Or something else?
In any event, I'm kind of lost. Maybe there's a better way to do this?
Here's the table, and a sample of the data set I'm working with.
peering_manager=# \d peering_autonomoussystem;
Table "public.peering_autonomoussystem"
Column | Type | Modifiers
----------------------------------------------+--------------------------+-----------------------------------------------------------------------
id | integer | not null default nextval('peering_autonomoussystem_id_seq'::regclass)
asn | bigint | not null
name | character varying(128) | not null
comment | text | not null
ipv6_max_prefixes | integer | not null
ipv4_max_prefixes | integer | not null
updated | timestamp with time zone |
irr_as_set | character varying(255) |
ipv4_max_prefixes_peeringdb_sync | boolean | not null
ipv6_max_prefixes_peeringdb_sync | boolean | not null
irr_as_set_peeringdb_sync | boolean | not null
created | timestamp with time zone |
potential_internet_exchange_peering_sessions | inet[] | not null
contact_email | character varying(254) | not null
contact_name | character varying(50) | not null
contact_phone | character varying(20) | not null
Indexes:
"peering_autonomoussystem_pkey" PRIMARY KEY, btree (id)
"peering_autonomoussystem_asn_ec0373c4_uniq" UNIQUE CONSTRAINT, btree (asn)
Check constraints:
"peering_autonomoussystem_ipv4_max_prefixes_check" CHECK (ipv4_max_prefixes >= 0)
"peering_autonomoussystem_ipv6_max_prefixes_check" CHECK (ipv6_max_prefixes >= 0)
Referenced by:
TABLE "peering_directpeeringsession" CONSTRAINT "peering_directpeerin_autonomous_system_id_691dbc97_fk_peering_a" FOREIGN KEY (autonomous_system_id) REFERENCES peering_autonomoussystem(id) DEFERRABLE INITIALLY DEFERRED
TABLE "peering_internetexchangepeeringsession" CONSTRAINT "peering_peeringsessi_autonomous_system_id_9ffc404f_fk_peering_a" FOREIGN KEY (autonomous_system_id) REFERENCES peering_autonomoussystem(id) DEFERRABLE INITIALLY DEFERRED
peering_manager=#
peering_manager=# select asn,name,potential_internet_exchange_peering_sessions from peering_autonomoussystem limit 7;
asn | name | potential_internet_exchange_peering_sessions
-------+---------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
37662 | WIOCC | {2001:504:1::a503:7662:1,198.32.160.70}
38001 | NewMedia Express Pte Ltd | {2001:504:16::9471,206.81.81.204}
46562 | Total Server Solutions | {2001:504:1::a504:6562:1,198.32.160.12,2001:504:16::b5e2,206.81.81.81,2001:504:1a::35:21,206.108.35.21,2001:504:2d::18:80,198.179.18.80,2001:504:36::b5e2:0:1,206.82.104.156}
55081 | 24Shells Inc | {2001:504:1::a505:5081:1,198.32.160.135}
62887 | Whitesky Communications | {2001:504:16::f5a7,206.81.81.209}
2603 | NORDUnet | {2001:504:1::a500:2603:1,198.32.160.21}
6128 | Cablevision Systems Corporation | {}
(7 rows)
You can use array_length(). On empty arrays or nulls this returns NULL.
...
WHERE array_length(potential_internet_exchange_peering_sessions, 1) IS NOT NULL
...
It is better to compare the array length with an integer:
...
WHERE array_length(potential_internet_exchange_peering_sessions, 1) > 0
...
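As a rough sketch of both ideas in the question (filtering out empty arrays, and matching on the addresses themselves), the following uses a hypothetical stand-in table named peers with a sessions column rather than the real peering_autonomoussystem schema. It assumes PostgreSQL 9.4 or later for cardinality(). The << operator is defined for inet << inet, not for inet[] << inet, so the array is expanded with unnest() before the containment test:

```sql
-- Hypothetical stand-in table and column names, not the asker's real schema.
CREATE TEMP TABLE peers (asn bigint, sessions inet[] NOT NULL);
INSERT INTO peers VALUES
    (37662, '{2001:504:1::a503:7662:1,198.32.160.70}'),
    (6128,  '{}');

-- Rows whose array has at least one element.
SELECT asn
FROM peers
WHERE cardinality(sessions) > 0;

-- Rows where at least one element is an IPv4 address inside 0.0.0.0/0.
SELECT p.asn
FROM peers AS p
WHERE EXISTS (
    SELECT 1
    FROM unnest(p.sessions) AS ip
    WHERE ip << inet '0.0.0.0/0'
);
```

The second query is only needed when the match should depend on the address values; for "has any value at all", the length or cardinality check is enough.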

Querying partial value from a field - SQL SERVER 2008

I need to return only a portion of the value in a given field.
Example:
A given field returns something like 'AB-1X3.4567' but the desired value is only the '1X3.4567' portion. So for this example I need to remove anything that precedes the pattern of
[0-9,A-Z][0-9,A-Z][0-9,A-Z][.][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z].
What query could I write to do this?
Using stuff() and patindex():
create table t (val varchar(32))
insert into t values
 ('AB-1X3.4567') -- given example
,('1X3.4567AB-1X3.4567') -- extra junk on the end
,('1X3.4567') -- Goldilocks: just right
,('X3.4567') -- too short
,('AB-1X#.4567') -- # is not [0-9A-Z]
select
    val
  , str = stuff(val,1,patindex('%[0-9A-Z][0-9A-Z][0-9A-Z][.][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z]%',val)-1,'')
from t
rextester demo: http://rextester.com/ITUJ68634
returns:
+---------------------+---------------------+
| val | str |
+---------------------+---------------------+
| AB-1X3.4567 | 1X3.4567 |
| 1X3.4567AB-1X3.4567 | 1X3.4567AB-1X3.4567 |
| 1X3.4567 | 1X3.4567 |
| X3.4567 | NULL |
| AB-1X#.4567 | NULL |
+---------------------+---------------------+
Your pattern describes anything of the form XXX.XXXX, where X is any single digit or letter. If the unwanted prefix is always three characters long, as in the example, we can use RIGHT() and LEN():
DECLARE @value VARCHAR(4000)='AB-1X3.4567'
SELECT RIGHT(@value,LEN(@value) - 3)
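If trailing junk should be dropped as well, one possible variation on the stuff()/patindex() answer is to take exactly eight characters starting at the match. This is a sketch in T-SQL under the assumption that the wanted value is always eight characters long (XXX.XXXX); the variable names are made up for illustration. NULLIF turns a "no match" result of 0 into NULL so that SUBSTRING returns NULL instead of a wrong slice:

```sql
-- @value and @pattern are illustrative names.
DECLARE @value varchar(32) = '1X3.4567AB-1X3.4567';
DECLARE @pattern varchar(100) = '%[0-9A-Z][0-9A-Z][0-9A-Z][.][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z]%';

-- PATINDEX returns the 1-based start of the first match, or 0 when there is none.
SELECT SUBSTRING(@value, NULLIF(PATINDEX(@pattern, @value), 0), 8) AS str;
```

Note that under a case-insensitive collation, [0-9A-Z] also matches lowercase letters.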

Column-wise autocomplete

I have a table in a PostgreSQL database with four columns that contain increasingly more detailed information (think state->city->street->number), along with a column where everything is concatenated according to some simple formatting rules. Example:
| kommun | trakt | block | enhet | beteckning |
| Mora | Gislövs Läge | 9 | 16 | Mora Gislövs Läge 9:16 |
| Mora | Gisslaved | * | 8 | Mora Gisslaved 8 |
| Mora | Gisslaved | * | 9 | Mora Gisslaved 9 |
| Lilla Edet | Sanda | GA | 1 | Lilla Edet Sanda GA:1 |
A web service uses this table to implement a word-wise autocomplete, where the user gets input suggestions as they drill down. An input of mora gis will result in
["Mora Gislövs", "Mora Gisslaved"]
Currently, this is done by splitting the concatenated column by word in this query:
select distinct trim(substring(beteckning from '(^(\S+\s?){NUMPARTS})')) as bet
from beteckning_ac
where upper(beteckning) like upper('mora gis%')
order by bet
Where NUMPARTS is the number of words in the input - 2 in this case.
Now I want the autocomplete to be done column-wise rather than word-wise, so mora gis would now result in this instead:
["Mora Gislövs Läge", "Mora Gisslaved"]
Since the first two columns can contain an arbitrary number of words, I can no longer use the input to determine how many columns to include in the response. Is there a way to do this, or have I maybe gone about this autocomplete business all wrong?
CREATE OR REPLACE FUNCTION get_auto(text)
-- $1 is your input
RETURNS setof text
LANGUAGE plpgsql
AS $function$
declare
    NUMPARTS int := array_length(regexp_split_to_array($1, ' '), 1);
begin
    return query
    select
        case
            when (NUMPARTS = 1) then kommun
            when (NUMPARTS = 2) then kommun||' '||trakt
            when (NUMPARTS = 3) then kommun||' '||trakt||' '||block
            when (NUMPARTS = 4) then kommun||' '||trakt||' '||block||' '||enhet
            -- alter if you want to
        end
    from
        auto_complete -- your table name here
    where
        beteckning like $1||'%';
end;
$function$;
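Because kommun and trakt can themselves contain several words, counting the words in the input does not tell you which column the user is currently typing in. A possible alternative, sketched here on made-up sample rows mirroring the question's table, is to return the shortest column-wise concatenation that the input is still a prefix of. The literal 'mora gis' stands in for the user input, and the block/enhet formatting rules are not reproduced: anything past trakt simply falls back to the full beteckning.

```sql
-- Sample data copied from the question; column types are assumed to be text.
CREATE TEMP TABLE beteckning_ac (
    kommun text, trakt text, block text, enhet text, beteckning text
);
INSERT INTO beteckning_ac VALUES
    ('Mora', 'Gislövs Läge', '9', '16', 'Mora Gislövs Läge 9:16'),
    ('Mora', 'Gisslaved', '*', '8', 'Mora Gisslaved 8'),
    ('Mora', 'Gisslaved', '*', '9', 'Mora Gisslaved 9'),
    ('Lilla Edet', 'Sanda', 'GA', '1', 'Lilla Edet Sanda GA:1');

SELECT DISTINCT
    CASE
        -- Input is still inside the first column: suggest kommun only.
        WHEN upper(kommun) LIKE upper('mora gis') || '%'
            THEN kommun
        -- Input has moved into the second column: suggest kommun + trakt.
        WHEN upper(kommun || ' ' || trakt) LIKE upper('mora gis') || '%'
            THEN kommun || ' ' || trakt
        -- Anything longer: fall back to the full formatted value.
        ELSE beteckning
    END AS bet
FROM beteckning_ac
WHERE upper(beteckning) LIKE upper('mora gis') || '%'
ORDER BY bet;
```

In a real service the input would be passed as a parameter, and any % or _ characters in it would need escaping before being used in LIKE.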

Update intermediate result

EDIT
As requested, a little background on what I want to achieve. I have a table that I want to query, but I don't want to change the table itself. The result of the SELECT query (what I called the 'intermediate table') then needs to be cleaned up a bit. For example, certain cells of certain rows need to be swapped and some strings need to be trimmed. Of course this could all be done as postprocessing in, e.g., Python, but I was hoping to do all of this with one query statement.
Being new to PostgreSQL, I want to update the intermediate table that results from a SELECT statement. So I basically want to edit the resulting table from a SELECT statement in one query, without having to store the intermediate result.
I've tried the following 'with clause':
with result as (
    select
        a
    from
        b
)
update result as r
set
    a = 'd'
...but that results in ERROR: relation "result" does not exist, while the following does work:
with result as (
    select
        a
    from
        b
)
select
    *
from
    result
As I said, I'm new to PostgreSQL, so it is entirely possible that I'm using the wrong approach.
Depending on the complexity of the transformations you want to perform, you might be able to munge it into the SELECT, which would let you get away with a single query:
WITH foo AS (SELECT lower(name), freq, cumfreq, rank, vec FROM names WHERE name LIKE 'G%')
SELECT ... FROM foo WHERE ...
Or, for more or less unlimited manipulation options, you could create a temp table that will disappear at the end of the current transaction. That doesn't get the job done in a single query, but it does get it all done on the database server, which might still be worthwhile.
db=# BEGIN;
BEGIN
db=# CREATE TEMP TABLE foo ON COMMIT DROP AS SELECT * FROM names WHERE name LIKE 'G%';
SELECT 4677
db=# SELECT * FROM foo LIMIT 5;
name | freq | cumfreq | rank | vec
----------+-------+---------+------+-----------------------
GREEN | 0.183 | 11.403 | 35 | 'KRN':1 'green':1
GONZALEZ | 0.166 | 11.915 | 38 | 'KNSL':1 'gonzalez':1
GRAY | 0.106 | 15.921 | 69 | 'KR':1 'gray':1
GONZALES | 0.087 | 18.318 | 94 | 'KNSL':1 'gonzales':1
GRIFFIN | 0.084 | 18.659 | 98 | 'KRFN':1 'griffin':1
(5 rows)
db=# UPDATE foo SET name = lower(name);
UPDATE 4677
db=# SELECT * FROM foo LIMIT 5;
name | freq | cumfreq | rank | vec
--------+-------+---------+-------+---------------------
grube | 0.002 | 67.691 | 7333 | 'KRP':1 'grube':1
gasper | 0.001 | 69.999 | 9027 | 'KSPR':1 'gasper':1
gori | 0.000 | 81.360 | 28946 | 'KR':1 'gori':1
goeltz | 0.000 | 85.471 | 47269 | 'KLTS':1 'goeltz':1
gani | 0.000 | 86.202 | 51743 | 'KN':1 'gani':1
(5 rows)
db=# COMMIT;
COMMIT
db=# SELECT * FROM foo;
ERROR: relation "foo" does not exist
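To make the first suggestion (folding the clean-up into the SELECT) a bit more concrete, here is a minimal sketch of swapping two cells on selected rows and trimming strings on the rest, without touching the underlying table. The table b and its columns a and c, as well as the 'swap_me' marker, are hypothetical stand-ins for whatever condition identifies the rows to fix:

```sql
-- Hypothetical sample table; only the technique matters here.
CREATE TEMP TABLE b (a text, c text);
INSERT INTO b VALUES
    ('  x ', 'y'),
    ('swap_me', 'z');

WITH result AS (
    SELECT a, c FROM b
)
SELECT
    -- On marked rows, take the other column's value; otherwise trim.
    CASE WHEN a = 'swap_me' THEN c ELSE trim(a) END AS a,
    CASE WHEN a = 'swap_me' THEN a ELSE c END       AS c
FROM result;
```

Both CASE expressions read the original values of a and c from the CTE, so the swap does not need a temporary variable.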

Syntax error in a simple UPDATE query

I have this query (PostgreSQL 9.1):
=> update tbp set super_answer = null where packet_id = 18;
ERROR: syntax error at or near "="
I don't get it. I'm really out of words here.
Table "public.tbp"
Column | Type | Modifiers
--------------+------------------------+-----------
id | bigint | not null
super_answer | bigint |
packet_id | bigint |
It turned out I had copied an invisible Unicode whitespace character (a non-breaking space, \xa0) into the statement, and Postgres didn't like it.
In a Python console:
>>> u'update "tbp" set "super_answer"=null where "packet_id" = 18'
u'update "tbp" set\xa0"super_answer"=null where "packet_id" = 18'
Life can be strange sometimes.
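If no Python console is at hand, a similar check can be done in SQL by pasting the suspicious statement into a string literal. This is a sketch that assumes a UTF8 database, where chr(160) is the non-breaking space; the E'...' literal below reproduces the statement from the question with \u00A0 after "set", and the aliases are made up:

```sql
-- q holds the pasted statement; nbsp_position and cleaned are illustrative aliases.
SELECT
    strpos(q, chr(160))        AS nbsp_position,  -- 0 would mean no non-breaking space
    replace(q, chr(160), ' ')  AS cleaned         -- same text with ordinary spaces
FROM (
    SELECT E'update "tbp" set\u00A0"super_answer"=null where "packet_id" = 18'::text AS q
) AS t;
```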