With PostgREST, convert a column to and from an external encoding in the API - postgresql

We are using PostgREST to automatically generate a REST API for a Postgres database. Our primary keys have an external representation that's different from how we store them internally. For simplicity's sake lets pretend the ids are stored as integers but we represent them as hexadecimal strings outwardly.
It's simple enough to get PostgREST to convert to the external representation for read operations:
CREATE DOMAIN hexid AS bigint;
CREATE TABLE fruits (
fruit_id hexid PRIMARY KEY,
name text
);
CREATE OR REPLACE VIEW api_fruits AS
SELECT to_hex(fruit_id) as fruit_id, name FROM fruits;
INSERT INTO fruits(fruit_id, name) VALUES('51955', 'avocado');
PostgREST generates the expected representation when we GET api_fruits:
[
{
"fruit_id": "caf3",
"name": "avocado"
}
]
But that's about as far as we get with this solution. It's a one way transformation so we won't be able to POST/PATCH records this way. The way PostgREST works is to transform such requests into equivalent INSERT and UPDATE statements. But this view with its custom formatting is not updatable. This is what would happen if we tried:
ERROR: cannot insert into column "fruit_id" of view "api_fruits"
DETAIL: View columns that are not columns of their base relation are not updatable.
STATEMENT: WITH pgrst_source AS (WITH pgrst_payload AS (SELECT $1::json AS json_data), pgrst_body AS ( SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val FROM pgrst_payload) INSERT INTO "api_x"."api_fruits"("fruit_id", "name") SELECT "fruit_id", "name" FROM json_populate_recordset (null::"api_x"."api_fruits", (SELECT val FROM pgrst_body)) _ RETURNING "api_x"."api_fruits".*) SELECT '' AS total_result_set, pg_catalog.count(_postgrest_t) AS page_total, CASE WHEN pg_catalog.count(_postgrest_t) = 1 THEN coalesce((
WITH data AS (SELECT row_to_json(_) AS row FROM pgrst_source AS _ LIMIT 1)
SELECT array_agg(json_data.key || '=eq.' || json_data.value)
FROM data CROSS JOIN json_each_text(data.row) AS json_data
WHERE json_data.key IN ('')
), array[]::text[]) ELSE array[]::text[] END AS header, '' AS body, nullif(current_setting('response.headers', true), '') AS response_headers, nullif(current_setting('response.status', true), '') AS response_status FROM (SELECT * FROM pgrst_source) _postgrest_t
We can't INSERT into "View columns that are not columns of their base relation".
The obvious workaround is to serve fruit_id as a straight column, just an integer. With some post and preprocessing at the nginx level we can hex encode it there (and hex decode incoming ids). I'm wondering if we can do better than that though. For large API operations, re-encoding the JSON will use a lot of memory and CPU time and it seems so unnecessary.
It would have been great to be able to use a custom CREATE CAST to take the incoming hexadecimal strings and turn them back into integers, something like this:
CREATE CAST (json AS hexid) WITH FUNCTION json_to_hexid AS ASSIGNMENT;
But alas custom casts are ignored on CREATE DOMAIN types. And we can't make a true custom column type because our cloud Postgres host (Google Cloud SQL) doesn't allow custom extensions.
It feels like some combination of INSTEAD OF triggers or rules could work. But when using query parameters to filter results using query parameters (e.g. select a fruit by id), I don't think there's an appropriate trigger to use. INSTEAD OF doesn't work for straight SELECT does it?
For example I've tested doing something like this to take care of INSERT and allow POST with PostgREST. It works:
CREATE OR REPLACE FUNCTION api_fruits_insert()
RETURNS trigger AS
$$
BEGIN
INSERT INTO fruits(fruit_id, name) VALUES (('x' || lpad(NEW.fruit_id, 16, '0'))::bit(64)::bigint::hexid, NEW.name);
RETURN NEW;
END
$$ LANGUAGE 'plpgsql';
CREATE TRIGGER api_fruits_insert
INSTEAD OF INSERT
ON api_fruits
FOR EACH ROW
EXECUTE PROCEDURE api_fruits_insert();
The trouble is in the WHERE clause. Let's PATCH api_fruits?fruit_id=in.(7b,caf3) with {"name": "pear"}. This works out of the box since the name column is updatable but look at the query:
WITH pgrst_source AS (WITH pgrst_payload AS (SELECT $1::json AS json_data), pgrst_body AS ( SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val FROM pgrst_payload) UPDATE "api_x"."api_fruits" SET "name" = _."name" FROM (SELECT * FROM json_populate_recordset (null::"api_x"."api_fruits" , (SELECT val FROM pgrst_body) )) _ WHERE "api_x"."api_fruits"."fruit_id" = ANY ($2) RETURNING 1) SELECT '' AS total_result_set, pg_catalog.count(_postgrest_t) AS page_total, array[]::text[] AS header, '' AS body, nullif(current_setting('response.headers', true), '') AS response_headers, nullif(current_setting('response.status', true), '') AS response_status FROM (SELECT * FROM pgrst_source) _postgrest_t
DETAIL: parameters: $1 = '{
"name": "pear"
}', $2 = '{7b,caf3}'
So we have essentially UPDATE api_fruits SET name='berry' WHERE fruit_id IN ('7b', 'caf3');. Surprisingly this works but it's a full table scan so Postgres can evaluate to_hex(fruit_id) for each row looking for matches. The same happens if we try to GET a record by fruit_id. How would we rewrite the WHERE clauses?
It really feels like some combination of just the right Postgres and PostgREST features should be able to get us to a point where it's all happening in Postgres without nginx's help and without excessive complexity. Any ideas?

Related

Can postgreSQL OnConflict combine with JSON obejcts?

I wanted to perform a conditional insert in PostgreSQL. Something like:
INSERT INTO {TABLE_NAME} (user_id, data) values ('{user_id}', '{data}')
WHERE not exists(select 1 from files where user_id='{user_id}' and data->'userType'='Type1')
Unfortunately, insert and where does not cooperate in PostGreSQL. What could be a suitable syntax for my query? I was considering ON CONFLICT, but couldn't find the syntax for using it with JSON object. (Data in the example)
Is it possible?
Rewrite the VALUES part to a SELECT, then you can use a WHERE condition:
INSERT INTO { TABLE_NAME } ( user_id, data )
SELECT
user_id,
data
FROM
( VALUES ( '{user_id}', '{data}' ) ) sub ( user_id, data )
WHERE
NOT EXISTS (
SELECT 1
FROM files
WHERE user_id = '{user_id}'
AND data -> 'userType' = 'Type1'
);
But, there is NO guarantee that the WHERE condition works! Another transaction that has not been committed yet, is invisible to this query. This could lead to data quality issues.
You can use INSERT ... SELECT ... WHERE ....
INSERT INTO elbat
(user_id,
data)
SELECT 'abc',
'xyz'
WHERE NOT EXISTS (SELECT *
FROM files
WHERE user_id = 'abc'
AND data->>'userType' = 'Type1')
And it looks like you're creating the query in a host language. Don't use string concatenation or interpolation for getting the values in it. That's error prone and makes your application vulnerable to SQL injection attacks. Look up how to use parameterized queries in your host language. Very likely for the table name parameters cannot be used. You need some other method of either whitelisting the names or properly quoting them.

How to use dynamic regex to match value in Postgres

SUMMARY: I've two tables I want to derive info out of: family_values (family_name, item_regex) and product_ids (product_id) to be able to update the property family_name in a third.
Here the plan is to grab a json array from the small family_values table and use the column value item_regex to do a test match against the product_id for every row in product_ids.
MORE DETAILS: Importing static data from CSV to table of orders. But, in evaluating cost of goods and market value I'm needing to continuously determine family from a prefix regex (item_regex from family_values) match on the product_id.
On the client this looks like this:
const families = {
FOOBAR: 'Big Ogre',
FOOBA: 'Wood Elf',
FOO: 'Valkyrie'
};
// And to find family, and subsequently COGs and Market Value:
const findFamily = product_id => Object.keys(families).find(f => new RegExp('^' + f).test(product_id));
This is a huge hit for the client so I made a family_values table in PG to include a representative: family_name, item_regex, cogs, market_value.
Then, the product_ids has a list of only the products the app cares about (out of millions). This is actually used with an insert trigger 'on before' to ignore any CSV entries that aren't in the product_ids view. So, I guess after that the product_ids view could be taken out of the equation because the orders, after inserting readonly data, has its own matching product_id. It does NOT have family_name, so I still have the issue of determining that client-side.
PSUEDO CODE: update family column of orders with family_name from family_values regex match against orders.product_id
OR update the product_ids table with a new family column and use that with the existing on insert trigger (used to left pad zeros and normalize data right now). Now I'm thinking this may be just an update as suggested, but not real good with regex in PG. I'm a PG novice.
PROBLEM: But, I'm having a hangup in doing what I thought would be like a JS Array Find operation. The family_values have been sorted on the item_regex so that the most strict match would be on top, and therefor found first.
For example, with sorting we have:
family_values_array = [
{"family_name": "Big Ogre", "item_regex": "FOOBAR"},
{"family_name": "Wood Elf", "item_regex": "FOOBA"},
{"family_name": "Valkyrie", "item_regex": "FOO"}]
So, that the comparison of product_id of ^FOOBA would yield family "Wood Elf".
SOLUTION:
The solution I finally came about using was simply using concat to write out the front-anchored regex. It was so simple in the end. The key line I was missing is:
select * into family_value_row from iol.family_values
where lvl3_id = product_row.lvl3_id and product_row.product_id
like concat(item_regex, '%') limit 1;
Whole function:
create or replace function iol.populate_families () returns void as $$
declare
product_row record;
family_value_row record;
begin
for product_row in
select product_id, lvl3_id from iol.products
loop
-- family_name is what we want after finding the BEST match fr a product_id against item_regex
select * into family_value_row from iol.family_values
where lvl3_id = product_row.lvl3_id and product_row.product_id like concat(item_regex, '%') limit 1;
-- update family_name and value columns
update iol.products set
family_name = family_value_row.family_name,
cog_cents = family_value_row.cog_cents,
market_value_cents = family_value_row.market_value_cents
where product_id = product_row.product_id;
end loop;
end;
$$
LANGUAGE plpgsql;
Use concat as updated above:
select * into family_value_row from iol.family_values
where lvl3_id = product_row.lvl3_id and product_row.product_id
like concat(item_regex, '%') limit 1;

Extract all the values in jsonb into a row

I'm using postgresql 11, I have a jsonb which represent a row of that table, it's look like
{"userid":"test","rolename":"Root","loginerror":0,"email":"superadmin#ae.com",...,"thirdpartyauthenticationkey":{}}
is there any method that I could gather all the "values" of the jsonb into a string which is separated by ',' and without the keys?
The string I want to obtain with the jsonb above is like
(test, Root, 0, superadmin#ae.com, ..., {})
I need to keep the ORDER of those values as what their keys were in the jsonb. Could I do that with postgresql?
You can use the jsonb_populate_record function (assuming your json data does match the users table). This will force the text value to match the order of your users table:
Schema (PostgreSQL v13)
CREATE TABLE users (
userid text,
rolename text,
loginerror int,
email text,
thirdpartyauthenticationkey json
)
Query #1
WITH d(js) AS (
VALUES
('{"userid":"test", "rolename":"Root", "loginerror":0, "email":"superadmin#ae.com", "thirdpartyauthenticationkey":{}}'::jsonb),
('{"userid":"other", "rolename":"User", "loginerror":324, "email":"nope#ae.com", "thirdpartyauthenticationkey":{}}'::jsonb)
)
SELECT jsonb_populate_record(null::users, js),
jsonb_populate_record(null::users, js)::text AS record_as_text,
pg_typeof(jsonb_populate_record(null::users, js)::text)
FROM d
;
jsonb_populate_record
record_as_text
pg_typeof
(test,Root,0,superadmin#ae.com,{})
(test,Root,0,superadmin#ae.com,{})
text
(other,User,324,nope#ae.com,{})
(other,User,324,nope#ae.com,{})
text
Note that if you're building this string to insert it back into postgresql then you don't need to do that, since the result of jsonb_populate_record will match your table:
Query #2
WITH d(js) AS (
VALUES
('{"userid":"test", "rolename":"Root", "loginerror":0, "email":"superadmin#ae.com", "thirdpartyauthenticationkey":{}}'::jsonb),
('{"userid":"other", "rolename":"User", "loginerror":324, "email":"nope#ae.com", "thirdpartyauthenticationkey":{}}'::jsonb)
)
INSERT INTO users
SELECT (jsonb_populate_record(null::users, js)).*
FROM d;
There are no results to be displayed.
Query #3
SELECT * FROM users;
userid
rolename
loginerror
email
thirdpartyauthenticationkey
test
Root
0
superadmin#ae.com
[object Object]
other
User
324
nope#ae.com
[object Object]
View on DB Fiddle
You can use jsonb_each_text() to get a set of a text representation of the elements, string_agg() to aggregate them in a comma separated string and concat() to put that in parenthesis.
SELECT concat('(', string_agg(value, ', '), ')')
FROM jsonb_each_text('{"userid":"test","rolename":"Root","loginerror":0,"email":"superadmin#ae.com","thirdpartyauthenticationkey":{}}'::jsonb) jet (key,
value);
db<>fiddle
You didn't provide DDL and DML of a (the) table the JSON may reside in (if it does, that isn't clear from your question). The demonstration above therefore only uses the JSON you showed as a scalar. If you have indeed a table you need to CROSS JOIN LATERAL and GROUP BY some key.
Edit:
If you need to be sure the order is retained and you don't have that defined in a table's structure as #Marth's answer assumes, then you can of course extract every value manually in the order you need them.
SELECT concat('(',
concat_ws(', ',
j->>'userid',
j->>'rolename',
j->>'loginerror',
j->>'email',
j->>'thirdpartyauthenticationkey'),
')')
FROM (VALUES ('{"userid":"test","rolename":"Root","loginerror":0,"email":"superadmin#ae.com","thirdpartyauthenticationkey":{}}'::jsonb)) v (j);
db<>fiddle

How to modify or remove a specific JSON object from JSON array stored in jsonb column type in PostgreSQL using where clause?

In my Postgres database, I have one of the table columns having jsonb datatype. In that column, I am storing the JSON array. Now, I want to remove or modify a specific JSON object inside the array.
My JSON array looks like
[
{
"ModuleId": 1,
"ModuleName": "XYZ"
},
{
"ModuleId": 2,
"ModuleName": "ABC"
}
]
Now, I want to perform two operations:
How can I remove the JSON object from the above array having ModuleId as 1?
How can I modify the JSON object i.e. change the ModuleName as 'CBA' whose ModuleId is 1?
Is there a way through which I could perform queried directly on JSON array?
Note: Postgres version is 12.0
Both problems require unnesting and aggregating back the (modified) JSON elements. For both problems I would create a function to make that easier to use.
create function remove_element(p_value jsonb, p_to_remove jsonb)
returns jsonb
as
$$
select jsonb_agg(t.element order by t.idx)
from jsonb_array_elements(p_value) with ordinality as t(element, idx)
where not t.element #> p_to_remove;
$$
language sql
immutable;
The function can be used like this, e.g. in an UPDATE statement:
update the_table
set the_column = remove_element(the_column, '{"ModuleId": 1}')
where ...
For the second problem a similar function comes in handy.
create function change_value(p_value jsonb, p_what jsonb, p_new jsonb)
returns jsonb
as
$$
select jsonb_agg(
case
when t.element #> p_what then t.element||p_new
else t.element
end order by t.idx)
from jsonb_array_elements(p_value) with ordinality as t(element, idx);
$$
language sql
immutable;
The || operator will overwrite an existing key, so this effectively replaces the old name with the new name.
You can use it like this:
update the_table
set the_column = change_value(the_column, '{"ModuleId": 1}', '{"ModuleName": "CBA"}')
where ...;
I think passing the JSON values is a bit more flexible then hardcoding the keys which makes the use of the function very limited. The first function could also be used to remove array elements by comparing multiple keys.
If you don't want to create the functions, replace the function call with the select from the functions.
For your both cases, consider using a subquery including dynamic logic to determine the index of the element which contains the value for ModuleId key equal to 1.
For the First case, use #- operator :
WITH s AS
(
SELECT ('{'||idx-1||'}')::text[] AS path, jsdata
FROM tab
CROSS JOIN jsonb_array_elements(jsdata)
WITH ORDINALITY arr(j,idx)
WHERE j->>'ModuleId'='1'
)
UPDATE tab
SET jsdata = s.jsdata #- path
FROM s
and for the second case, use jsonb_set() function with path coming from thesubquery :
WITH s AS
(
SELECT ('{'||idx-1||',ModuleId}')::text[] AS path
FROM tab
CROSS JOIN jsonb_array_elements(jsdata)
WITH ORDINALITY arr(j,idx)
WHERE j->>'ModuleId'='1'
)
UPDATE tab
SET jsdata = jsonb_set(jsdata,s.path,'"CBA"',false)
FROM s
Demo

How to split array in json using json_query?

I've got a column in a table that's a json. It contains only values without keys like
Now I'm trying to split the data from the json and create new table using every index of each array as new entry like
I've already tried
SELECT JSON_QUERY(abc) as 'Type', Id as 'ValueId' from Table FOR JSON AUTO
Is there any way to handle splitting given that some arrays might be empty and look like
[]
?
A fairly simply approach would be to use outer apply with openjson.
First, create and populate sample table (Please save us this step in your future questions):
DECLARE #T AS TABLE
(
Id int,
Value nvarchar(20)
)
INSERT INTO #T VALUES
(1, '[10]'),
(2, '[20, 200]'),
(3, '[]'),
(4, '')
The query:
SELECT Id, JsonValues.Value
FROM #T As t
OUTER APPLY
OPENJSON( Value ) As JsonValues
WHERE ISJSON(t.Value) = 1
Results:
Id Value
1 10
2 20
2 200
3 NULL
Note the ISJSON condition in the where clause will prevent exceptions in case the Value column contains anything other than a valid json (an empty array [] is still considered valid for this purpose).
If you don't want to return a row where the json array is empty, use cross apply instead of outer apply.
Your own code calling for FOR JSON AUTO tries to create JSON out of tabular data. But what you really needs seems to be the opposite direction: You want to transform JSON to a result set, a derived table. This is done by OPENJSON.
Your JSON seems to be a very minimalistic array.
You can try something along this.
DECLARE #json NVARCHAR(MAX) =N'[1,2,3]';
SELECT * FROM OPENJSON(#json);
The result returns the zero-based ordinal position in key, the actual value in value and a (very limited) type-enum.
Hint: If you want to use this against a table's column you must use APPLY, something along
SELECT *
FROM YourTable t
OUTER APPLY OPENJSON(t.TheJsonColumn);