Convert text column to jsonb in Postgres

I have a column products in a table test which is of type text, in the format below:
[{"is_bulk_product": false, "rate": 0, "subtotal": 7.17, "qty": 2, "tax": 0.90}]
It is an array with nested dictionary values. When I try to alter the column using this:
alter table test alter COLUMN products type jsonb using products::jsonb;
I get the error below:
ERROR: 22P02: invalid input syntax for type json
DETAIL: Character with value 0x09 must be escaped.
CONTEXT: JSON data, line 1: ...some_id": 2613, "qty": 2, "upc": "1234...
LOCATION: json_lex_string, json.c:789
Time: 57000.237 ms (00:57.000)
How can we make sure the JSON is valid before altering the column?
Thank you.

The JSON string you posted is valid, so this SQL executes without an exception:
select '[{"is_bulk_product": false, "rate": 0, "subtotal": 7.17, "qty": 2, "tax": 0.90}]'::jsonb
The table probably contains malformed JSON in other records. You can check this first by selecting the data, for example:
select products::jsonb from test;
Also note the syntax: you cast the products column to jsonb, not test; test is your table name:
alter table test
alter COLUMN products type jsonb using products::jsonb;
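If some rows contain invalid JSON (here, the error points at a literal tab, 0x09, inside a string value, which strict JSON requires to be escaped as \t), you can locate them before running the ALTER. A minimal sketch, using a hypothetical helper function that attempts the cast and catches the parse error:
-- Hypothetical helper: true if the text parses as JSON, false otherwise.
CREATE OR REPLACE FUNCTION is_valid_json(p text)
RETURNS boolean LANGUAGE plpgsql IMMUTABLE AS $$
BEGIN
  PERFORM p::jsonb;      -- attempt the cast
  RETURN true;
EXCEPTION WHEN others THEN
  RETURN false;          -- any parse error means invalid JSON
END;
$$;
-- List the rows that would make the ALTER TABLE fail.
SELECT ctid, products FROM test WHERE NOT is_valid_json(products);
Once those rows are repaired (or the offending control characters escaped), the ALTER TABLE should go through.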

Related

PostgreSQL update a jsonb column multiple times

Consider the following:
create table query(id integer, query_definition jsonb);
create table query_item(path text[], id integer);
insert into query (id, query_definition)
values
(100, '{"columns":[{"type":"integer","field":"id"},{"type":"str","field":"firstname"},{"type":"str","field":"lastname"}]}'::jsonb),
(101, '{"columns":[{"type":"integer","field":"id"},{"type":"str","field":"firstname"}]}'::jsonb);
insert into query_item(path, id) values
('{columns,0,type}'::text[], 100),
('{columns,1,type}'::text[], 100),
('{columns,2,type}'::text[], 100),
('{columns,0,type}'::text[], 101),
('{columns,1,type}'::text[], 101);
I have a query table which has a jsonb column named query_definition.
The jsonb value looks like the following:
{
  "columns": [
    {
      "type": "integer",
      "field": "id"
    },
    {
      "type": "str",
      "field": "firstname"
    },
    {
      "type": "str",
      "field": "lastname"
    }
  ]
}
In order to replace all "type": "..." with "type": "string", I've built the query_item table which contains the following data:
path            |id |
----------------+---+
{columns,0,type}|100|
{columns,1,type}|100|
{columns,2,type}|100|
{columns,0,type}|101|
{columns,1,type}|101|
path holds each path from the JSON root down to a "type" entry; id is the corresponding query's id.
I came up with the following SQL statement to do what I want:
update query q
set query_definition = jsonb_set(q.query_definition, query_item.path, ('"string"')::jsonb, false)
from query_item
where q.id = query_item.id
But it only partially works: for each id the update applies just the first matching query_item row and skips the rest (only the 1st and 4th rows of the query_item table take effect).
I know I could write a FOR loop, but that requires a plpgsql context and I'd rather avoid it.
Is there a way to do it with a single update statement?
I've read in this topic that it's possible with strings, but I couldn't figure out how to adapt that mechanism to jsonb.
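Not the accepted answer, but one single-statement sketch: if the goal is simply to force every "type" to "string", you can rebuild the columns array with jsonb_agg and skip the query_item table entirely (this assumes every query_definition actually has a columns array):
-- Rebuild "columns", overriding "type" in each element; || merges objects,
-- with the right-hand side winning on duplicate keys.
UPDATE query q
SET query_definition = jsonb_set(
  q.query_definition,
  '{columns}',
  (SELECT jsonb_agg(elem || '{"type": "string"}'::jsonb)
   FROM jsonb_array_elements(q.query_definition -> 'columns') AS elem)
);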

Column value becomes NULL when creating a Hive table from a BSON file

I created a Hive (3.1.2) table from a BSON file dumped from MongoDB (4.0). After creating the table, I selected a couple of entries; however, some of the values are NULL.
I tried printing the rows from the BSON file using Python, and the values printed correctly, so they are not missing. Any clue on how to troubleshoot further?
SQL to create the Hive table:
CREATE EXTERNAL TABLE `tmp_test_status`(
`id` string COMMENT 'frame_id',
`createdAt` INT,
`updatedAt` string,
`task` string)
row format serde 'com.mongodb.hadoop.hive.BSONSerDe'
with serdeproperties('mongo.columns.mapping'='{"id":"_id"}')
stored as inputformat 'com.mongodb.hadoop.mapred.BSONFileInputFormat'
outputformat 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat'
LOCATION
'oss://data-warehouse/hive/warehouse/data.db/tmp_test_status';
===========================================
Data I printed with the Python bson lib:
{'_id': '00003a02-280d-4e59-8483-a0143e0a3359', 'createdAt': '1557999191951', 'updatedAt': '1557999191951', 'task': 'lane', '__v': 0}
===========================================
Data I selected from the Hive table:
00003a02-280d-4e59-8483-a0143e0a3359 NULL NULL lane
093e72ae-206b-4112-ac28-5ba38f9485d0 NULL NULL lane
093ebe41-183c-47b4-ab25-93336875ae10 NULL NULL lane
093ec16b-ba1d-4ddc-90bc-9981342e8071 NULL NULL lane
I found the answer myself: BSON attribute names are case-sensitive, but Hive column names are not. If an attribute name contains upper-case letters in the BSON file, Hive returns NULL for it when queried. Simply mapping the attribute names in the SerDe properties worked for me:
with serdeproperties('mongo.columns.mapping'='{"id":"_id", "createdAt": "createdAt", "updatedAt": "updatedAt", "reLabeled1" : "reLabeled1", "isValid": "isValid"}')
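For completeness, a sketch of the corrected DDL with the mapping applied to the columns declared in the question; createdAt is assumed to be string here, since the sample data shows a string value (which would overflow INT anyway):
CREATE EXTERNAL TABLE `tmp_test_status`(
`id` string COMMENT 'frame_id',
`createdAt` string, -- assumption: the BSON sample holds a string timestamp
`updatedAt` string,
`task` string)
row format serde 'com.mongodb.hadoop.hive.BSONSerDe'
with serdeproperties('mongo.columns.mapping'='{"id":"_id", "createdAt":"createdAt", "updatedAt":"updatedAt"}')
stored as inputformat 'com.mongodb.hadoop.mapred.BSONFileInputFormat'
outputformat 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat'
LOCATION 'oss://data-warehouse/hive/warehouse/data.db/tmp_test_status';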

Update jsonb array value stored as text in Postgres

Postgres Version: 9.5.0
I have a database table where one of the columns is stored as text that represents a JSON value. The JSON value is an array of dictionaries, e.g.:
[{"picture": "XXX", "image_hash": null, "name": "test", "video": null, "link": "http://www.google.com", "table_id": 356}, ..]
I am trying to update the value associated with the key table_id for the 1st array element only. Here is the query I ran:
update table1 set "json_column" = jsonb_set("json_column", "{0, table_id}", null, false) where id = 1;
I keep running into this error: ERROR: column "{0, table_id}" does not exist
Could someone please help me understand how this can be fixed?
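Two things are going wrong here. First, double quotes make "{0, table_id}" an identifier, so Postgres looks for a column of that name, which is exactly the error reported; the path must be a single-quoted text-array literal. Second, passing SQL NULL as the new value makes jsonb_set return NULL, wiping the whole column; to store a JSON null, pass 'null'::jsonb. And since the column is stored as text, it must be cast in and back out. A sketch using the names from the question:
update table1
set json_column = jsonb_set(
  json_column::jsonb, -- the column is text, so cast to jsonb first
  '{0,table_id}',     -- path as a single-quoted text[] literal
  'null'::jsonb,      -- JSON null, not SQL NULL
  false
)::text               -- cast back, since the column type is text
where id = 1;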

How do I convert text to jsonb

What is the proper way to convert any text (or varchar) to the jsonb type in Postgres (version 9.6)?
For example, here I am using two methods and I am getting different results:
Method 1:
dev=# select '[{"field":15,"operator":0,"value":"1"},{"field":15,"operator":0,"value":"2"},55]'::jsonb;
jsonb
----------------------------------------------------------------------------------------------
[{"field": 15, "value": "1", "operator": 0}, {"field": 15, "value": "2", "operator": 0}, 55]
(1 row)
Method 2, which doesn't produce the desired result:
dev=# select to_jsonb('[{"field":15,"operator":0,"value":"1"},{"field":15,"operator":0,"value":"2"},55]'::text);
to_jsonb
----------------------------------------------------------------------------------------------------
"[{\"field\":15,\"operator\":0,\"value\":\"1\"},{\"field\":15,\"operator\":0,\"value\":\"2\"},55]"
(1 row)
dev=#
Here, the value was converted to a string, not an array.
Why doesn't the second method create an array?
According to the Postgres documentation:
to_jsonb(anyelement)
Returns the value as json or jsonb. Arrays and composites are
converted (recursively) to arrays and objects; otherwise, if there is
a cast from the type to json, the cast function will be used to
perform the conversion; otherwise, a scalar value is produced. For any
scalar type other than a number, a Boolean, or a null value, the text
representation will be used, in such a fashion that it is a valid json
or jsonb value.
IMHO, since you are providing a JSON-formatted string, you should use the first method.
to_json('Fred said "Hi."'::text) --> "Fred said \"Hi.\""
If you try to extract elements from the result of to_jsonb(text), you'll get the following error:
select *
from jsonb_array_elements_text(to_jsonb('[{"field":15,"operator":0,"value":"1"},{"field":15,"operator":0,"value":"2"},55]'::text));
cannot extract elements from a scalar
But if you cast it to json first:
select *
from jsonb_array_elements_text(to_jsonb('[{"field":15,"operator":0,"value":"1"},{"field":15,"operator":0,"value":"2"},55]'::json));
+--------------------------------------------+
| value |
+--------------------------------------------+
| {"field": 15, "value": "1", "operator": 0} |
+--------------------------------------------+
| {"field": 15, "value": "2", "operator": 0} |
+--------------------------------------------+
| 55 |
+--------------------------------------------+
If your text is already valid JSON, you can simply cast it to json/jsonb explicitly:
select '{"a":"b"}'::jsonb
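To see the difference at a glance, jsonb_typeof makes it explicit: the cast parses the text as JSON, while to_jsonb wraps the whole text as a single JSON scalar:
select jsonb_typeof('[1,2,3]'::jsonb);          -- 'array': the text is parsed as JSON
select jsonb_typeof(to_jsonb('[1,2,3]'::text)); -- 'string': the text becomes one JSON string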
Atomic type conversion and CSV-to-JSONb
A typical parsing problem in open-data applications is converting a CSV (or CSV-like) text, line by line, into JSONB with correct (atomic) datatypes. The datatypes can be given in SQL jargon ('int', 'text', 'float', etc.) or JSON jargon ('string', 'number'):
CREATE FUNCTION csv_to_jsonb(
  p_info text,              -- the CSV line
  coltypes_sql text[],      -- the datatype list
  rgx_sep text DEFAULT '\|' -- CSV separator, as a regular expression
) RETURNS jsonb AS $f$
  SELECT to_jsonb(a) FROM (
    SELECT array_agg(CASE
      WHEN tp IN ('int','integer','smallint','bigint') THEN to_jsonb(p::bigint)
      WHEN tp IN ('number','numeric','float','double') THEN to_jsonb(p::numeric)
      WHEN tp='boolean' THEN to_jsonb(p::boolean)
      WHEN tp IN ('json','jsonb','object','array') THEN p::jsonb
      ELSE to_jsonb(p)  -- default: keep the field as a JSON string
    END ORDER BY i) a   -- ORDER BY i keeps the fields in CSV order
    FROM regexp_split_to_table(p_info, rgx_sep) WITH ORDINALITY t1(p,i)
    INNER JOIN unnest(coltypes_sql) WITH ORDINALITY t2(tp,j)
      ON i=j
  ) t
$f$ language SQL immutable;
-- Example:
SELECT csv_to_jsonb(
'123|foo bar|1.2|true|99999|{"x":123,"y":"foo"}',
array['int','text','float','boolean','bigint','object']
);
-- results [123, "foo bar", 1.2, true, 99999, {"x": 123, "y": "foo"}]
-- that is: number, string, number, boolean, number, object

Inserting NULL value to column of JSON type by multiple row insert query

rows = [{"id": 1, "json_column": [{"key": "value"}, {"key2": "value2"}]}, {"id": 2, "json_column": None}]
insert_query = table.insert().values(rows)
connection.execute(insert_query)
Doing this inserts the JSON string "null" into the row where id=2, rather than a SQL NULL.
Is there any way to properly do a multi-row insert where the value of some JSON columns is NULL?
The issue was a bug and has been fixed by the SQLAlchemy project maintainer.
Details here: https://groups.google.com/forum/#!topic/sqlalchemy/Bu4lJ18Gsa8
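For reference, the distinction the bug blurred is visible in plain SQL: a json column can hold either a SQL NULL or the JSON null value, and the two behave differently. A small illustration with a hypothetical table t:
-- Hypothetical table for illustration.
CREATE TABLE t (id integer, json_column json);
INSERT INTO t VALUES (1, NULL);          -- SQL NULL
INSERT INTO t VALUES (2, 'null'::json);  -- JSON null: a real (non-NULL) json value
SELECT id, json_column IS NULL AS is_sql_null FROM t;
-- id | is_sql_null
--  1 | t
--  2 | f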