How to find an arbitrary key within a postgres jsonb object? - postgresql

There are several operators in postgres for getting elements at a certain path in jsonb.
But how could I retrieve all the values that have a key of 'foo', if I don't know where in the whole object structure they will appear?
I saw there is a regex matches function which would return me matching regexes, but the object keyed off foo could be arbitrarily complex, so tough to come up with a regex that would pull the whole object out neatly?
Thanks for your help

SELECT jsonb_column->'foo'
FROM table
[WHERE jsonb_column ? 'foo'] -- ignore values without key "foo"

Related

How to splay a column of type dictionary

I have a column in my table whose values that are dictionaries. The type in the meta of that column is " ".
I want to know how to splay this table. When I try to splay it, I get a type error. I am aware only vectors can be splayed, however, I have seen a table where a column holds dictionaries splayed before, so I know it's possible, but I am not sure how it is done.
Dictionaries are only supported in kdb version >3.6.
If you are running 3.6/4.0, double check you are enumerating the table for splay.
`:path/to/table set .Q.en[`:hdb;table]
If <3.6 json string is a good alternative although not recommended on large tables as .j.k is slow.

How to efficiently index fields with an identical (and long) prefix in PostgreSQL?

I’m working with identifiers in a rather unusual format: every single ID has the same prefix and the prefix consists of as many as 25 characters. The only thing that is unique is the last part of the ID string and it has a variable length of up to ten characters:
ID
----------------------------------
lorem:ipsum:dolor:sit:amet:12345
lorem:ipsum:dolor:sit:amet:abcd123
lorem:ipsum:dolor:sit:amet:efg1
I’m looking for advice on the best strategy around indexing and matching this kind of ID string in PostgreSQL.
One approach I have considered is basically cutting these long prefixes out and only storing the unique suffix in the table column.
Another option that comes to mind is only indexing the suffix:
CREATE INDEX ON books (substring(book_id FROM 26));
I don’t think this is the best idea though as you would need to remember to always strip out the prefix when querying the table. If you forgot to do it and had a WHERE book_id = '<full ID here>' filter, the index would basically be ignored by the planner.
Most times I always create an integer type ID for my tables if even I have one unique string type of field. Recommendation for you is a good idea, I must view all your queries in DB. If you are recently using substring(book_id FROM 26) after the where statement, this is the best way to create expression index (function-based index). Basically, you need to check table joining conditions, which fields are used in the joining processes, and which fields are used after WHERE statements in your queries. After then you can prepare the best plan for creating indexes. If on the process of table joining you are using last part unique characters on the ID field then this is the best way to extract unique last characters and store this in additional fields or create expression index using the function for extracting unique characters.

variable substitution querying postgres jsonb object

Hi I am querying a JSONb object where the search key depends on the value of another key/value pair. Consider the following example:
select
'{"a":"b","b":2}'::jsonb->'a',--b,
('{"a":"b","b":2}'::jsonb->'a')::text,--"b"
'{"a":"b","b":2}'::jsonb->('{"a":"b","b":2}'::jsonb->'a')::text--null, desired output is 2
Somehow I need to dereference the "b" to a json search path like 'b'
Any suggestions will be greatly appreciated
You should tu use ->> operator instead ->. There was unwanted double quoting:
select
'{"a":"b","b":2}'::jsonb->'a',--b,
('{"a":"b","b":2}'::jsonb->>'a')::text,--"b"
'{"a":"b","b":2}'::jsonb->('{"a":"b","b":2}'::jsonb->>'a')::text;

PostgreSql Queries treats Int as string datatypes

I store the following rows in my table ('DataScreen') under a JSONB column ('Results')
{"Id":11,"Product":"Google Chrome","Handle":3091,"Description":"Google Chrome"}
{"Id":111,"Product":"Microsoft Sql","Handle":3092,"Description":"Microsoft Sql"}
{"Id":22,"Product":"Microsoft OneNote","Handle":3093,"Description":"Microsoft OneNote"}
{"Id":222,"Product":"Microsoft OneDrive","Handle":3094,"Description":"Microsoft OneDrive"}
Here, In this JSON objects "Id" amd "Handle" are integer properties and other being string properties.
When I query my table like below
Select Results->>'Id' From DataScreen
order by Results->>'Id' ASC
I get the improper results because PostgreSql treats everything as a text column and hence does the ordering according to the text, and not as integer.
Hence it gives the result as
11,111,22,222
instead of
11,22,111,222.
I don't want to use explicit casting to retrieve like below
Select Results->>'Id' From DataScreen order by CAST(Results->>'Id' AS INT) ASC
because I will not be sure of the datatype of the column due to the fact that JSON structure will be dynamic and the keys and values may change next time. and Hence could happen the same with another JSON that has Integer and string keys.
I want something so that Integers in Json structure of JSONB column are treated as integers only and not as texts (string).
How do I write my query so that Id And Handle are retrieved as Integer Values and not as strings , without explicit casting?
I think your assumtions about the id field don't make sense. You said,
(a) Either id contains integers only or
(b) it contains strings and integers.
I'd say,
If (a) then numerical ordering is correct.
If (b) then lexical ordering is correct.
But if (a) for some time and then (b) then the correct order changes, too. And that doesn't make sense. Imagine:
For the current database you expect the order 11,22,111,222. Then you add a row
{"Id":"aa","Product":"Microsoft OneDrive","Handle":3095,"Description":"Microsoft OneDrive"}
and suddenly the correct order of the other rows changes to 11,111,22,222,aa. That sudden change is what bothers me.
So I would either expect a lexical ordering ab intio, or restrict my id field to integers and use explicit casting.
Every other option I can think of is just not practical. You could, for example, create a custom < and > implementation for your id field which results in 11,111,22,222,aa. ("Order all integers by numerical value and all strings by lexical order and put all integers before the strings").
But that is a lot of work (it involves a custom data type, a custom cast function and a custom operator function) and yields some counterintuitive results, e.g. 11,111,22,222,0a,1a,2a,aa (note the position of 0a and so on. They come after 222).
Hope, that helps ;)
If Id always integer you can cast it in select part and just use ORDER BY 1:
select (Results->>'Id')::int From DataScreen order by 1 ASC

How to alter Postgres table data based on its contents?

This is probably a super simple question, but I'm struggling to come up with the right keywords to find it on Google.
I have a Postgres table that has among its contents a column of type text named content_type. That stores what type of entry is stored in that row.
There are only about 5 different types, and I decided I want to change one of them to display as something else in my application (I had been directly displaying these).
It struck me that it's funny that my view is being dictated by my database model, and I decided I would convert the types being stored in my database as strings into integers, and enumerate the possible types in my application with constants that convert them into their display names. That way, if I ever got the urge to change any category names again, I could just change it with one alteration of a constant. I also have the hunch that storing integers might be somewhat more efficient than storing text in the database.
First, a quick threshold question of, is this a good idea? Any feedback or anything I missed?
Second, and my main question, what's the Postgres command I could enter to make an alteration like this? I'm thinking I could start by renaming the old content_type column to old_content_type and then creating a new integer column content_type. However, what command would look at a row's old_content_type and fill in the new content_type column based off of that?
If you're finding that you need to change the display values, then yes, it's probably a good idea not to store them in a database. Integers are also more efficient to store and search, but I really wouldn't worry about it unless you've got millions of rows.
You just need to run an update to populate your new column:
update table_name set content_type = (case when old_content_type = 'a' then 1
when old_content_type = 'b' then 2 else 3 end);
If you're on Postgres 8.4 then using an enum type instead of a plain integer might be a good idea.
Ideally you'd have these fields referring to a table containing the definitions of type. This should be via a foreign key constraint. This way you know that your database is clean and has no invalid values (i.e. referential integrity).
There are many ways to handle this:
Having a table for each field that can contain a number of values (i.e. like an enum) is the most obvious - but it breaks down when you have a table that requires many attributes.
You can use the Entity-attribute-value model, but beware that this is too easy to abuse and cause problems when things grow.
You can use, or refer to my implementation solution PET (Parameter Enumeration Tables). This is a half way house between between 1 & 2.