Removing keys/values from a JSONB object in PostgreSQL - postgresql

I am trying to adapt the Audit trigger to use JSONB instead of hstore. This function stores all inserts/updates/deletes in a separate table.
The trigger has 2 interesting fields: row_data (contains the OLD.* values) and changed_fields (contains only the modified fields/values).
I have trouble converting this part.
In the original function, we have the following code:
audit_row.row_data = hstore(OLD.*);
audit_row.changed_fields = (hstore(NEW.*) - audit_row.row_data);
In my implementation, row_data and changed_fields are of type JSONB
According to the documentation, the "-" operator matches on keys only, and I obviously need to match on both key AND value.
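For illustration, the jsonb "-" operator deletes by key regardless of the value:
select '{"field1": "1", "field2": "a string"}'::jsonb - 'field2';
 ?column?
-----------------
 {"field1": "1"}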
As an example, if the OLD value is this:
select jsonb_object('{field1,1,field2,a string,field3,TRUE}');
jsonb_object
---------------------------------------------------------
{"field1": "1", "field2": "a string", "field3": "TRUE"}
and only field2 was updated, I need to see this:
?column?
------------------------
{"field2": "a string"}
which I would get with this query:
select jsonb_object('{field1,1,field2,a string,field3,TRUE}')
#- '{field1}'
#- '{field3}';
Is there an elegant way to do this (like how it's done with hstore), or should I keep the hstore implementation and convert changed_fields to JSONB (with everything seen as text)?
I could also loop over all fields in NEW and add them to changed_fields if a match could not be found, but how do I do this inside a function?
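For reference, a minimal sketch of that looping approach inside the PL/pgSQL trigger function (my own sketch, assuming row_data and changed_fields are jsonb): jsonb_each() unnests the NEW record, and the @> containment operator keeps only the key/value pairs that are missing or different in OLD:
audit_row.row_data = to_jsonb(OLD);
audit_row.changed_fields = (
    -- keep NEW's pairs whose exact key/value combination is absent from OLD
    select coalesce(jsonb_object_agg(key, value), '{}'::jsonb)
    from jsonb_each(to_jsonb(NEW))
    where not audit_row.row_data @> jsonb_build_object(key, value)
);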

How to update a jsonb column with a replaced value in pgAdmin?

I have a PostgreSQL table called files which includes a jsonb column called formats. While some rows are [null], others have objects with this structure:
{
  "thumbnail": {
    "ext": ".jpg",
    "url": "https://some-url.com/image01.jpg",
    "name": "image01.jpg",
    //...other properties
  }
}
For every row I want to update the thumbnail.url and replace some-url with other-url.
I'm far from being an expert in PostgreSQL (or any other DB for that matter), and after some reading I tried to run the following query in pgAdmin:
UPDATE files
SET formats = jsonb_set(formats, '{thumbnail.url}', REPLACE('{thumbnail.url}', 'some-url', 'other-url'))
And I received this error: function jsonb_set(jsonb, unknown, text) does not exist
I tried jsonb_set(formats::jsonb, ...), and tried to target '{thumbnail}' instead of '{thumbnail.url}' - always the same error.
What am I doing wrong? Or does pgAdmin really not support this function? How can I do such an update with the pgAdmin query tool?
We can use ->> to get the text value of url and then run the replacement on that.
Because the url field of your JSON is a string, we need to wrap the replaced value in double quotes (") before casting it back to JSONB.
jsonb_set(target jsonb, path text[], new_value jsonb [, create_missing boolean])
UPDATE files
SET formats = jsonb_set(
    formats,
    '{thumbnail,url}',
    CONCAT('"', REPLACE(formats -> 'thumbnail' ->> 'url', 'some-url', 'other-url'), '"')::JSONB
);
The second parameter of jsonb_set() must be an array with one array element for each "path" element. So the second parameter should be '{thumbnail,url}' or, more explicitly: array['thumbnail', 'url']
And the third parameter must be a jsonb value, but replace() returns text, so you need e.g. to_jsonb() to convert the result of the replace() to a jsonb value.
And as D-Shih pointed out, you need to extract the old value using ->>. But to get the URL you need to "navigate" to it: formats -> 'thumbnail' ->> 'url'
I would also add a WHERE clause so that you only update rows that actually contain a URL.
UPDATE files
SET formats = jsonb_set(formats,
                        '{thumbnail,url}',
                        to_jsonb(replace(formats -> 'thumbnail' ->> 'url', 'some-url', 'other-url'))
              )
WHERE (formats -> 'thumbnail') ? 'url';
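If you want to preview the change before updating anything, the same expressions work in a plain SELECT (my own suggestion):
select formats -> 'thumbnail' ->> 'url' as old_url,
       replace(formats -> 'thumbnail' ->> 'url', 'some-url', 'other-url') as new_url
from files
where (formats -> 'thumbnail') ? 'url';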

postgres SQL: getting rid of NA while migrating data from csv file

I am migrating data from a "csv" file into a newly created table named fortune500. The code is shown below:
CREATE TABLE "fortune500"(
    "id" SERIAL,
    "rank" INTEGER,
    "title" VARCHAR PRIMARY KEY,
    "name" VARCHAR,
    "ticker" CHAR(5),
    "url" VARCHAR,
    "hq" VARCHAR,
    "sector" VARCHAR,
    "industry" VARCHAR,
    "employees" INTEGER,
    "revenues" INTEGER,
    "revenues_change" REAL,
    "profits" NUMERIC,
    "profits_change" REAL,
    "assets" NUMERIC,
    "equity" NUMERIC
);
Then I wanted to migrate data from a csv file using the below code:
COPY "fortune500"("rank", "title", "name", "ticker", "url", "hq", "sector", "industry", "employees",
"revenues", "revenues_change", "profits", "profits_change", "assets", "equity")
FROM 'C:\Users\Yasser A.RahmAN\Desktop\SQL for Business Analytics\fortune.csv'
DELIMITER ','
CSV HEADER;
But I got the below error message due to NA values in one of the columns.
ERROR: invalid input syntax for type real: "NA"
CONTEXT: COPY fortune500, line 12, column profits_change: "NA"
SQL state: 22P02
So how can I get rid of 'NA' values while migrating the data?
Consider using a staging table without restrictive data types, and do your transformations and insert into the final table after the data has been loaded into staging. This is known as the ELT (Extract - Load - Transform) approach. You could also use an external tool to implement an ETL process and do the transformation in that tool before the data reaches your database.
In your case, an ELT approach would be to first create a table with all text columns, load that table, and then insert into your final table, casting the text values into the appropriate types - either filtering out the values that cannot be cast, or inserting NULL (or maybe 0) where the cast can't be made, depending on your requirements. For example, you'd filter out rows where profits_change = 'NA' (or better, keep only rows WHERE profits_change ~ '^-?\d+(\.\d+)?$' to check for a numeric value), or you'd insert NULL or 0:
CASE
    WHEN profits_change ~ '^-?\d+(\.\d+)?$'
        THEN profits_change::real
    ELSE NULL -- or 0, depending on what you need
END
You'd perform this kind of validation for all fields.
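To make that concrete, here is a reduced sketch with just two of the columns (the staging table name and file path are illustrative, and it assumes a CSV trimmed to these two columns; the real load would list all fifteen):
CREATE TABLE "fortune500_staging"(
    "title" VARCHAR,
    "profits_change" VARCHAR -- text on purpose, so 'NA' loads without error
);

COPY "fortune500_staging"("title", "profits_change")
FROM 'C:\path\to\fortune.csv' -- illustrative path
DELIMITER ',' CSV HEADER;

INSERT INTO "fortune500"("title", "profits_change")
SELECT "title",
       CASE
           WHEN "profits_change" ~ '^-?\d+(\.\d+)?$'
               THEN "profits_change"::real
           ELSE NULL
       END
FROM "fortune500_staging";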
Alternatively, if it's a one-off thing, just edit your CSV before importing.

Unable to make generated column in postgresql for Json data

I'm trying out generated columns with Postgres 12. I need to create a table with a generated column populated from JSON data; I will receive a "name" field as a key there. However, while doing so I got the below error:
postgres=# create table json_tab2 (data jsonb ,
postgres(# "json_tab2.pname" text generated always as (data ->> "name" ) stored
postgres(# );
ERROR: column "name" does not exist
LINE 2: ...on_tab2.pname" text generated always as (data ->> "name" ) ...
After this, I tried to alter an existing table instead - since that one already has values in its JSON data for the generated column, it should be able to identify "name" now. This time I ran the below:
postgres=# alter table json_tab add column Pname text generated always as (data ->> "name") stored
;
ERROR: column "name" does not exist
However, "name" has value here:
data
-------------------------------------------------
{"age": 31, "city": "New York", "name": "John"}
I'm unable to understand what I'm doing wrong here.
The right-hand side of the ->> operator should be a value. In SQL, double quotes denote an identifier, so data ->> "name" makes Postgres look for a column called name, which is exactly what the error says. Since the key is a string, you need to surround it with single quotes ('):
create table json_tab2 (
  data jsonb,
  pname text generated always as (data ->> 'name') stored
  -- Here ---------------------------------^----^
);
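The same fix works for the ALTER TABLE attempt on the existing table:
alter table json_tab add column pname text generated always as (data ->> 'name') stored;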

How to modify or remove a specific JSON object from JSON array stored in jsonb column type in PostgreSQL using where clause?

In my Postgres database, one of the table columns has the jsonb datatype. In that column, I am storing a JSON array. Now, I want to remove or modify a specific JSON object inside the array.
My JSON array looks like:
[
  {
    "ModuleId": 1,
    "ModuleName": "XYZ"
  },
  {
    "ModuleId": 2,
    "ModuleName": "ABC"
  }
]
Now, I want to perform two operations:
1. How can I remove the JSON object from the above array having ModuleId as 1?
2. How can I modify a JSON object, i.e. change the ModuleName to 'CBA' for the object whose ModuleId is 1?
Is there a way to perform these queries directly on the JSON array?
Note: Postgres version is 12.0
Both problems require unnesting and aggregating back the (modified) JSON elements. For both problems I would create a function to make that easier to use.
create function remove_element(p_value jsonb, p_to_remove jsonb)
  returns jsonb
as
$$
  -- unnest the array, drop elements containing p_to_remove, aggregate back
  select jsonb_agg(t.element order by t.idx)
  from jsonb_array_elements(p_value) with ordinality as t(element, idx)
  where not t.element @> p_to_remove;
$$
language sql
immutable;
The function can be used like this, e.g. in an UPDATE statement:
update the_table
set the_column = remove_element(the_column, '{"ModuleId": 1}')
where ...
For the second problem a similar function comes in handy.
create function change_value(p_value jsonb, p_what jsonb, p_new jsonb)
  returns jsonb
as
$$
  select jsonb_agg(
           case
             -- || merges p_new into matching elements, overwriting existing keys
             when t.element @> p_what then t.element || p_new
             else t.element
           end order by t.idx)
  from jsonb_array_elements(p_value) with ordinality as t(element, idx);
$$
language sql
immutable;
The || operator will overwrite an existing key, so this effectively replaces the old name with the new name.
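A quick demonstration of that merge behaviour on a single element:
select '{"ModuleId": 1, "ModuleName": "XYZ"}'::jsonb || '{"ModuleName": "CBA"}'::jsonb;
 ?column?
--------------------------------------
 {"ModuleId": 1, "ModuleName": "CBA"}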
The function can be used like this:
update the_table
set the_column = change_value(the_column, '{"ModuleId": 1}', '{"ModuleName": "CBA"}')
where ...;
I think passing the JSON values is a bit more flexible than hardcoding the keys, which would make the use of the functions very limited. The first function can also be used to remove array elements by comparing multiple keys.
If you don't want to create the functions, replace the function call with the SELECT from the function body, as sketched below.
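For example, the removal from the first function written inline (my own expansion of the function body):
update the_table
set the_column = (
    select jsonb_agg(t.element order by t.idx)
    from jsonb_array_elements(the_column) with ordinality as t(element, idx)
    where not t.element @> '{"ModuleId": 1}'::jsonb
)
where ...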
For both of your cases, consider using a subquery that determines the index of the array element whose ModuleId key equals 1.
For the first case, use the #- operator:
WITH s AS
(
  -- WITH ORDINALITY is 1-based, jsonb paths are 0-based, hence idx-1
  SELECT ('{'||idx-1||'}')::text[] AS path, jsdata
  FROM tab
  CROSS JOIN jsonb_array_elements(jsdata)
  WITH ORDINALITY arr(j, idx)
  WHERE j ->> 'ModuleId' = '1'
)
UPDATE tab
SET jsdata = s.jsdata #- path
FROM s;
and for the second case, use the jsonb_set() function with the path coming from the subquery:
WITH s AS
(
  -- path targets the ModuleName key of the matched element
  SELECT ('{'||idx-1||',ModuleName}')::text[] AS path
  FROM tab
  CROSS JOIN jsonb_array_elements(jsdata)
  WITH ORDINALITY arr(j, idx)
  WHERE j ->> 'ModuleId' = '1'
)
UPDATE tab
SET jsdata = jsonb_set(jsdata, s.path, '"CBA"', false)
FROM s;

querying JSONB with array fields

If I have a jsonb column called value with fields such as:
{"id": "5e367554-bf4e-4057-8089-a3a43c9470c0",
"tags": ["principal", "reversal", "interest"],,, etc}
how would I find all the records containing given tags, e.g.:
if given ["reversal", "interest"],
it should find all records with either "reversal" or "interest" or both.
My experimentation got me to this abomination so far:
select value from account_balance_updated
where value @> '{}'::jsonb and value ->> 'tags' LIKE '%"principal"%';
Of course this is completely wrong and inefficient.
Assuming you are using PG 9.4+, you can use the jsonb_array_elements() function:
SELECT DISTINCT abu.*
FROM account_balance_updated abu,
     jsonb_array_elements(abu.value -> 'tags') t
WHERE t.value <@ '["reversal", "interest"]'::jsonb;
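As a side note (my own check), jsonb containment allows matching a single scalar against a top-level array, which is what the WHERE clause above relies on:
select '"reversal"'::jsonb <@ '["reversal", "interest"]'::jsonb;
 ?column?
----------
 t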
As it turned out, you can use the cool jsonb operators described here:
https://www.postgresql.org/docs/9.5/static/functions-json.html
so the original query doesn't have to change much:
select value from account_balance_updated
where value @> '{}'::jsonb and value -> 'tags' ?| array['reversal', 'interest'];
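A quick check of the ?| operator ("do any of these strings exist as top-level keys or array elements?"):
select '["principal", "reversal", "interest"]'::jsonb ?| array['reversal', 'interest'];
 ?column?
----------
 t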
in my case I also needed to escape the ? (writing ??|) because I am using a so-called "prepared statement", where you pass a query string and parameters to JDBC and the question marks act as placeholders for the parameters:
https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html