As context: I am creating a bucket of keys with empty documents so that I can quickly check whether an ID exists just by checking key existence, rather than by checking values. In the cluster I have two buckets, source-bucket and new-bucket. The documents in source-bucket are of the form:
ID: {
ID: ...,
type: ...
}
You can move the contents of the source bucket to the new bucket using this query:
INSERT INTO `new-bucket` (KEY k, VALUE v) SELECT meta(v).id AS k FROM `source-bucket` as v
Is there a way to copy over just the key? Something along the lines of this (although this example doesn't work):
INSERT INTO `new-bucket` (KEY k, VALUE v) values (SELECT meta().id FROM `source-bucket`, NULL)
I guess I'm not familiar enough with the N1QL syntax to understand how to construct a query like this. Let me know if you have an answer. If this is a duplicate, feel free to point to the answer.
If you need an empty object, use {}.
CREATE PRIMARY INDEX ON `source-bucket`;
INSERT INTO `new-bucket` (KEY k, VALUE {})
SELECT meta(b).id AS k FROM `source-bucket` as b
NOTE: the document value can be an empty object or any other data type. The following are all valid.
INSERT INTO default VALUES ("k01", {"a":1});
INSERT INTO default VALUES ("k02", {});
INSERT INTO default VALUES ("k03", 1);
INSERT INTO default VALUES ("k04", "aa");
INSERT INTO default VALUES ("k05", true);
INSERT INTO default VALUES ("k06", ["aa"]);
INSERT INTO default VALUES ("k07", NULL);
I'm trying to convert each row in a jsonb column to a type that I've defined, and I can't quite seem to get there.
I have an app that scrapes articles from The Guardian Open Platform and dumps the responses (as jsonb) in an ingestion table, into a column called 'body'. Other columns are a sequential ID, and a timestamp extracted from the response payload that helps my app only scrape new data.
I'd like to move the response dump data into a properly-defined table, and as I know the schema of the response, I've defined a type (my_type).
I've been referring to section 9.16, JSON Functions and Operators, in the Postgres docs. I can get a single record as my type:
select * from jsonb_populate_record(null::my_type, (select body from data_ingestion limit 1));
produces
id         | type         | sectionId          | ...
-----------+--------------+--------------------+-----
example_id | example_type | example_section_id | ...

(abbreviated for concision)
If I remove the limit, I get an error, which makes sense: the subquery would be providing multiple rows to jsonb_populate_record which only expects one.
I can get it to do multiple rows, but the result isn't broken into columns:
select jsonb_populate_record(null::my_type, body) from reviews_ingestion limit 3;
produces:
jsonb_populate_record
(example_id_1,example_type_1,example_section_id_1,...)
(example_id_2,example_type_2,example_section_id_2,...)
(example_id_3,example_type_3,example_section_id_3,...)
This is a bit odd; I would have expected to see column names, since that is, after all, the point of providing the type.
I'm aware I can do this by using Postgres JSON querying functionality, e.g.
select
body -> 'id' as id,
body -> 'type' as type,
body -> 'sectionId' as section_id,
...
from reviews_ingestion;
This works, but it seems quite inelegant, and I lose the data types.
I've also considered aggregating all rows in the body column into a JSON array, so as to be able to supply it to jsonb_populate_recordset, but this seems a rather silly approach, and unlikely to be performant.
Is there a way to achieve what I want, using Postgres functions?
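For reference, the aggregation idea mentioned above would look roughly like this (table, column, and type names taken from the question); the answer below avoids building the intermediate array:
select r.*
from jsonb_populate_recordset(
       null::my_type,
       (select jsonb_agg(body) from reviews_ingestion)
     ) as r;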
Maybe you need this, to break the my_type record into columns:
select (jsonb_populate_record(null::my_type, body)).*
from reviews_ingestion
limit 3;
-- or whatever other query clauses here
i.e. select all from these my_type records. All column names and types are in place.
Here is an illustration. My custom type is delmet, and the CTE t loosely mimics data_ingestion.
create type delmet as (x integer, y text, z boolean);
with t(i, j, k) as
(
values
(1, '{"x":10, "y":"Nope", "z":true}'::jsonb, 'cats'),
(2, '{"x":11, "y":"Yep", "z":false}', 'dogs'),
(3, '{"x":12, "y":null, "z":true}', 'parrots')
)
select i, (jsonb_populate_record(null::delmet, j)).*, k
from t;
Result:
 i | x  |  y   |   z   |    k
---+----+------+-------+---------
 1 | 10 | Nope | true  | cats
 2 | 11 | Yep  | false | dogs
 3 | 12 |      | true  | parrots
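A side note that is not part of the original answer: the (function(...)).* form can end up evaluating the function once per output column (a long-standing gotcha, at least on older Postgres versions). A LATERAL call in the FROM clause returns the same columns with a single evaluation per row; a sketch using the names from the question:
select r.*
from reviews_ingestion i
cross join lateral jsonb_populate_record(null::my_type, i.body) as r;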
I've got a column in a table that holds JSON. It contains only values without keys, like
Now I'm trying to split the data out of the JSON and create a new table, using every index of each array as a new entry, like
I've already tried
SELECT JSON_QUERY(abc) as 'Type', Id as 'ValueId' from Table FOR JSON AUTO
Is there any way to handle the splitting, given that some arrays might be empty and look like
[]
?
A fairly simple approach would be to use OUTER APPLY with OPENJSON.
First, create and populate a sample table (please save us this step in your future questions):
DECLARE @T AS TABLE
(
Id int,
Value nvarchar(20)
)
INSERT INTO @T VALUES
(1, '[10]'),
(2, '[20, 200]'),
(3, '[]'),
(4, '')
The query:
SELECT Id, JsonValues.Value
FROM @T As t
OUTER APPLY
OPENJSON( Value ) As JsonValues
WHERE ISJSON(t.Value) = 1
Results:
Id Value
1 10
2 20
2 200
3 NULL
Note that the ISJSON condition in the WHERE clause prevents exceptions in case the Value column contains anything other than valid JSON (an empty array [] is still considered valid for this purpose).
If you don't want to return a row where the json array is empty, use cross apply instead of outer apply.
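For completeness, a sketch of that CROSS APPLY variant against the same sample table; rows 3 ('[]') and 4 ('') then produce no output rows at all:
SELECT Id, JsonValues.Value
FROM @T As t
CROSS APPLY
OPENJSON( Value ) As JsonValues
WHERE ISJSON(t.Value) = 1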
Your own code, with FOR JSON AUTO, tries to create JSON out of tabular data. But what you really need seems to be the opposite direction: you want to transform JSON into a result set, a derived table. This is done with OPENJSON.
Your JSON seems to be a very minimalistic array.
You can try something along these lines:
DECLARE @json NVARCHAR(MAX) = N'[1,2,3]';
SELECT * FROM OPENJSON(@json);
The result returns the zero-based ordinal position in key, the actual value in value and a (very limited) type-enum.
Hint: If you want to use this against a table's column, you must use APPLY, something along these lines:
SELECT *
FROM YourTable t
OUTER APPLY OPENJSON(t.TheJsonColumn);
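Tying this to the question, a hedged sketch using the column names from the attempted query above (abc holding the JSON array, Id the row identifier; the table name is bracketed because TABLE is a reserved word):
SELECT t.Id AS ValueId, j.[value] AS [Type]
FROM [Table] AS t
OUTER APPLY OPENJSON(t.abc) AS j
WHERE ISJSON(t.abc) = 1;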
I need to update a jsonb column called "verticals"; the array of values it holds looks like HOM, BFB, etc. There are no keys in the array.
Table: Product(verticals jsonb, code int)
A sample value stored in the "verticals" column is
[HOM,rst,NLF,WELSAK,HTL,TRV,EVCU,GRT]
I need to update the value 'HOM' to 'XXX' in the "verticals" column where code = 1.
My expected output is
[XXX,rst,NLF,WELSAK,HTL,TRV,EVCU,GRT]
Because you chose to store your data in a de-normalized way, updating it is more complicated than it has to be.
You need to first unnest the array (essentially normalizing the data), replace the values, then aggregate them back and update the column:
update product p
set verticals = t.verticals
from (
select p2.code, jsonb_agg(case when x.v = 'HOM' then 'XXX' else x.v end order by idx) as verticals
from product p2, jsonb_array_elements_text(p2.verticals) with ordinality as x(v,idx)
where p2.code = 1
group by p2.code
) t
where p.code = t.code;
This assumes that product.code is a primary (or unique) key!
Online example: http://rextester.com/KZQ65481
If the order of the array elements is not important, this gets easier:
update product
set verticals = (verticals - 'HOM')||'["XXX"]'
where code = 1;
This removes the element 'HOM' from the array (regardless of its position) and then appends 'XXX' to the end of the array.
You should use jsonb_set(target jsonb, path text[], new_value jsonb[, create_missing boolean]) and array_position() OR array_replace(anyarray, anyelement, anyelement)
https://www.postgresql.org/docs/9.5/static/functions-json.html
https://www.postgresql.org/docs/10/static/functions-array.html
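A hedged sketch of that jsonb_set idea: the element index is looked up with WITH ORDINALITY (which is 1-based, while jsonb paths are 0-based), and the table and column names come from the question. It assumes a single 'HOM' element per array:
update product
set verticals = jsonb_set(
      verticals,
      array[(
        select (x.idx - 1)::text
        from jsonb_array_elements_text(verticals) with ordinality as x(v, idx)
        where x.v = 'HOM'
      )],
      '"XXX"'
    )
where code = 1
  and verticals ? 'HOM';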
I have the following table:
CREATE TABLE scoped_data(
owner_id text,
scope text,
key text,
data json,
PRIMARY KEY (owner_id, scope, key)
);
As part of each transaction we will potentially be inserting data for multiple scopes. Given that this table has the potential to grow very quickly, I would like not to store data if it is NULL or an empty JSON object.
An upsert felt like the idiomatic approach to this. The following is within the context of a PL/pgSQL function:
WITH upserts AS (
INSERT INTO scoped_data (owner_id, scope, key, data)
VALUES
(p_owner_id, 'broad', p_broad_key, p_broad_data),
(p_owner_id, 'narrow', p_narrow_key, p_narrow_data),
-- etc.
ON CONFLICT (owner_id, scope, key)
DO UPDATE SET data = scoped_data.data || COALESCE(EXCLUDED.data, '{}')
RETURNING scope, data
)
SELECT json_object_agg(u.scope, u.data)
FROM upserts u
INTO v_all_scoped_data;
I include the RETURNING because I would like the up-to-date version of each scope's data available in a variable for subsequent use; therefore I need the RETURNING to return something even if logically no data has been updated.
For example (all for key = 1 and scope = 'narrow'):
1. data = '{}' => v_scoped_data = {}, no data for key = 1 in scoped_data.
2. data = '{"some":"data"}' => v_scoped_data = { "narrow": { "some": "data" } }, data present in scoped_data.
3. data = '{}' => v_scoped_data = { "narrow": { "some": "data" } }, data from 2 remains unaffected.
4. data = '{"more":"stuff"}' => v_scoped_data = { "narrow": { "some": "data", "more": "stuff" } }. Updated data stored in the table.
I initially added a trigger BEFORE INSERT ON scoped_data which did the following:
IF NULLIF(NEW.data, '{}') IS NULL THEN
RETURN NULL;
END IF;
RETURN NEW;
This worked fine for preventing the insertion of new records, but the issue was that the trigger also suppressed subsequent inserts to existing rows: no INSERT happened, so there was no ON CONFLICT, and therefore nothing was returned by the RETURNING.
A couple of approaches I've considered, both of which feel inelegant or like they should be unnecessary:
Add a CHECK constraint to scoped_data.data: CHECK(NULLIF(data, '{}') IS NOT NULL), allow the insert and catch the exception in the PL/pgSQL code.
DELETE in an AFTER INSERT trigger if the data field was NULL or empty.
Am I going about this in the right way? Am I trying to coerce this logic into an upsert when there is a better way? Might explicit INSERTs and UPDATEs be a more logical fit?
I am using Postgres 9.6.
I would go with the BEFORE trigger ON INSERT to prevent unnecessary inserts and updates.
To return the values even when the operation is not performed, you can UNION ALL your query with a query on scoped_data that returns the original row, ORDER the results so that any new row sorts first (introduce an artificial column in both queries), and use LIMIT 1 to get the correct result.
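A minimal sketch of that idea for a single scope, assuming the BEFORE trigger from the question is in place and reusing the parameter and column names from the question; priority is the artificial ordering column, and because data-modifying CTEs read the pre-statement snapshot, the fallback branch sees the original row:
WITH ins AS (
  INSERT INTO scoped_data (owner_id, scope, key, data)
  VALUES (p_owner_id, 'narrow', p_narrow_key, p_narrow_data)
  ON CONFLICT (owner_id, scope, key)
  DO UPDATE SET data = scoped_data.data || COALESCE(EXCLUDED.data, '{}')
  RETURNING scope, data
)
SELECT t.scope, t.data  -- inside the function this would go INTO a variable, as in the question
FROM (
  SELECT scope, data, 1 AS priority FROM ins
  UNION ALL
  SELECT scope, data, 2 AS priority
  FROM scoped_data
  WHERE owner_id = p_owner_id AND scope = 'narrow' AND key = p_narrow_key
  ORDER BY priority
  LIMIT 1
) t;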
According to the ZF documentation, when using fetchAssoc() the first column in the result set must contain unique values, or else rows with duplicate values in the first column will overwrite previous data.
I don't want this, I want my array to be indexed 0,1,2,3... I don't need rows to be unique because I won't modify them and won't save them back to the DB.
According to the ZF documentation, fetchAll() (when using the default fetch mode, which is in fact FETCH_ASSOC) is equivalent to fetchAssoc(). BUT IT'S NOT.
I've used the print_r() function to reveal the truth.
print_r($db->fetchAll('select col1, col2 from table'));
prints
Array
(
[0] => Array
(
[col1] => 1
[col2] => 2
)
)
So:
fetchAll() is what I wanted.
There's a bug in the ZF documentation.
From http://framework.zend.com/manual/1.11/en/zend.db.adapter.html
The fetchAssoc() method returns data in an array of associative arrays, regardless of what value you have set for the fetch mode, **using the first column as the array index**.
So if you put
$result = $db->fetchAssoc(
'SELECT some_column, other_column FROM table'
);
you'll have as a result an array indexed by the value of some_column, like this:
$result[$someColumnValue]['other_column']