Consider the following:
create table query(id integer, query_definition jsonb);
create table query_item(path text[], id integer);
insert into query (id, query_definition)
values
(100, '{"columns":[{"type":"integer","field":"id"},{"type":"str","field":"firstname"},{"type":"str","field":"lastname"}]}'::jsonb),
(101, '{"columns":[{"type":"integer","field":"id"},{"type":"str","field":"firstname"}]}'::jsonb);
insert into query_item(path, id) values
('{columns,0,type}'::text[], 100),
('{columns,1,type}'::text[], 100),
('{columns,2,type}'::text[], 100),
('{columns,0,type}'::text[], 101),
('{columns,1,type}'::text[], 101);
I have a query table which has a jsonb column named query_definition.
The jsonb value looks like the following:
{
  "columns": [
    {
      "type": "integer",
      "field": "id"
    },
    {
      "type": "str",
      "field": "firstname"
    },
    {
      "type": "str",
      "field": "lastname"
    }
  ]
}
In order to replace all "type": "..." with "type": "string", I've built the query_item table which contains the following data:
path             | id
-----------------+----
{columns,0,type} | 100
{columns,1,type} | 100
{columns,2,type} | 100
{columns,0,type} | 101
{columns,1,type} | 101
Each path leads from the JSON root to a "type" entry, and id is the corresponding query's id.
I came up with the following SQL statement to do what I want:
update query q
set query_definition = jsonb_set(q.query_definition, query_item.path, ('"string"')::jsonb, false)
from query_item
where q.id = query_item.id
But it only partially works: each query row gets just the first matching path applied, and the others are skipped (only the 1st and 4th rows of the query_item table take effect).
I know I could write a loop, but that requires a PL/pgSQL context, which I'd rather avoid.
Is there a way to do it with a single UPDATE statement?
I've read in this topic that it's possible to do this with strings, but I couldn't figure out how to adapt that mechanism to jsonb.
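For what it's worth, here is a minimal single-statement sketch, assuming every "type" entry under "columns" should become "string": instead of applying jsonb_set once per path, rebuild the whole "columns" array in one pass (this sidesteps the query_item table entirely, and assumes each row has a non-empty "columns" array):
update query q
set query_definition = jsonb_set(
    q.query_definition,
    '{columns}',
    -- rebuild the array, overwriting "type" in every element
    (select jsonb_agg(jsonb_set(col, '{type}', '"string"', false))
     from jsonb_array_elements(q.query_definition -> 'columns') as col)
);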
I created a Hive (3.1.2) table from a BSON file dumped from MongoDB (4.0). After creating the table, I selected a couple of entries from it; however, some of their values are NULL.
I printed the same rows from the BSON file using Python, and the values show up correctly, so they are not missing. Any clue on how to troubleshoot this further?
SQL to create the Hive table:
CREATE EXTERNAL TABLE `tmp_test_status`(
`id` string COMMENT 'frame_id',
`createdAt` INT,
`updatedAt` string,
`task` string)
row format serde 'com.mongodb.hadoop.hive.BSONSerDe'
with serdeproperties('mongo.columns.mapping'='{"id":"_id"}')
stored as inputformat 'com.mongodb.hadoop.mapred.BSONFileInputFormat'
outputformat 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat'
LOCATION
'oss://data-warehouse/hive/warehouse/data.db/tmp_test_status';
===========================================
Data printed with the Python bson library:
{'_id': '00003a02-280d-4e59-8483-a0143e0a3359', 'createdAt': '1557999191951', 'updatedAt': '1557999191951', 'task': 'lane', '__v': 0}
===========================================
Data selected from the Hive table:
00003a02-280d-4e59-8483-a0143e0a3359 NULL NULL lane
093e72ae-206b-4112-ac28-5ba38f9485d0 NULL NULL lane
093ebe41-183c-47b4-ab25-93336875ae10 NULL NULL lane
093ec16b-ba1d-4ddc-90bc-9981342e8071 NULL NULL lane
I found the answer myself: attribute names in the BSON file are case sensitive, but Hive column names are not. If an attribute name contains upper-case letters in the BSON file, Hive returns NULL for it when querying. Simply mapping the attribute names in the SerDe properties worked for me:
with serdeproperties('mongo.columns.mapping'='{"id":"_id", "createdAt": "createdAt", "updatedAt": "updatedAt", "reLabeled1" : "reLabeled1", "isValid": "isValid"}')
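Put together, the corrected DDL would look something like this (a sketch based on the table definition above; only the camelCase columns that actually exist in this table are mapped):
CREATE EXTERNAL TABLE `tmp_test_status`(
`id` string COMMENT 'frame_id',
`createdAt` INT,
`updatedAt` string,
`task` string)
row format serde 'com.mongodb.hadoop.hive.BSONSerDe'
with serdeproperties('mongo.columns.mapping'='{"id":"_id", "createdAt": "createdAt", "updatedAt": "updatedAt"}')
stored as inputformat 'com.mongodb.hadoop.mapred.BSONFileInputFormat'
outputformat 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat'
LOCATION
'oss://data-warehouse/hive/warehouse/data.db/tmp_test_status';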
Postgres Version: 9.5.0
I have a database table where one of the columns is stored as text that represents a json value. The json value is an array of dictionaries e.x.:
[{"picture": "XXX", "image_hash": null, "name": "test", "video": null, "link": "http://www.google.com", "table_id": 356}, ..]
I am trying to update the value associated with the key table_id for the 1st array element only. Here is the query I ran:
update table1 set "json_column" = jsonb_set("json_column", "{0, table_id}", null, false) where id = 1;
I keep running into the error - ERROR: column "{0, table_id}" does not exist
Could someone please help me understand how this can be fixed?
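The immediate problem is the quoting: in Postgres, double quotes denote identifiers, so "{0, table_id}" is parsed as a column name, hence the error. The path must be a single-quoted text-array literal. Two further points: to store a JSON null you need 'null'::jsonb (a bare SQL NULL makes jsonb_set return NULL and would wipe the whole column), and since the column is stored as text it has to be cast to jsonb and back. A corrected sketch, using the table and column names from the question:
update table1
set json_column = jsonb_set(json_column::jsonb, '{0,table_id}', 'null'::jsonb, false)::text
where id = 1;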
What is the proper way to convert any text (or varchar) value to the jsonb type in Postgres (version 9.6)?
For example, here I am using two methods and I am getting different results:
Method 1:
dev=# select '[{"field":15,"operator":0,"value":"1"},{"field":15,"operator":0,"value":"2"},55]'::jsonb;
jsonb
----------------------------------------------------------------------------------------------
[{"field": 15, "value": "1", "operator": 0}, {"field": 15, "value": "2", "operator": 0}, 55]
(1 row)
Method 2, which doesn't produce the desired result:
dev=# select to_jsonb('[{"field":15,"operator":0,"value":"1"},{"field":15,"operator":0,"value":"2"},55]'::text);
to_jsonb
----------------------------------------------------------------------------------------------------
"[{\"field\":15,\"operator\":0,\"value\":\"1\"},{\"field\":15,\"operator\":0,\"value\":\"2\"},55]"
(1 row)
dev=#
Here the value was converted to a string, not an array.
Why doesn't the second method create an array?
According to Postgres documentation:
to_jsonb(anyelement)
Returns the value as json or jsonb. Arrays and composites are
converted (recursively) to arrays and objects; otherwise, if there is
a cast from the type to json, the cast function will be used to
perform the conversion; otherwise, a scalar value is produced. For any
scalar type other than a number, a Boolean, or a null value, the text
representation will be used, in such a fashion that it is a valid json
or jsonb value.
IMHO, since you are providing a JSON-formatted string, you should use the first method.
to_json('Fred said "Hi."'::text) --> "Fred said \"Hi.\""
If you try to extract the array elements using to_jsonb(text), you'll get the following error:
select *
from jsonb_array_elements_text(to_jsonb('[{"field":15,"operator":0,"value":"1"},{"field":15,"operator":0,"value":"2"},55]'::text));
cannot extract elements from a scalar
But if you first cast the text to json:
select *
from jsonb_array_elements_text(to_jsonb('[{"field":15,"operator":0,"value":"1"},{"field":15,"operator":0,"value":"2"},55]'::json));
+--------------------------------------------+
| value |
+--------------------------------------------+
| {"field": 15, "value": "1", "operator": 0} |
+--------------------------------------------+
| {"field": 15, "value": "2", "operator": 0} |
+--------------------------------------------+
| 55 |
+--------------------------------------------+
If your text is just JSON-formatted text, you can explicitly cast it to json/jsonb like this:
select '{"a":"b"}'::jsonb
Atomic type conversion and CSV-to-JSONb
A typical parsing problem in open-data applications is converting a CSV (or CSV-like) line of text into JSONB with correct (atomic) datatypes. The datatypes can be expressed in SQL jargon ('int', 'text', 'float', etc.) or JSON jargon ('string', 'number'):
CREATE FUNCTION csv_to_jsonb(
  p_info text,              -- the CSV line
  coltypes_sql text[],      -- the datatype list
  rgx_sep text DEFAULT '\|' -- CSV separator, as a regular expression
) RETURNS jsonb AS $f$
  SELECT to_jsonb(a) FROM (
    SELECT array_agg(CASE   -- cast each field to its declared type
      WHEN tp IN ('int','integer','smallint','bigint') THEN to_jsonb(p::bigint)
      WHEN tp IN ('number','numeric','float','double') THEN to_jsonb(p::numeric)
      WHEN tp='boolean' THEN to_jsonb(p::boolean)
      WHEN tp IN ('json','jsonb','object','array') THEN p::jsonb
      ELSE to_jsonb(p)      -- anything else becomes a JSON string
    END) a
    FROM regexp_split_to_table(p_info, rgx_sep) WITH ORDINALITY t1(p,i)
    INNER JOIN unnest(coltypes_sql) WITH ORDINALITY t2(tp,j)
      ON i = j              -- pair the i-th field with the i-th type
  ) t
$f$ LANGUAGE SQL IMMUTABLE;
-- Example:
SELECT csv_to_jsonb(
'123|foo bar|1.2|true|99999|{"x":123,"y":"foo"}',
array['int','text','float','boolean','bigint','object']
);
-- results [123, "foo bar", 1.2, true, 99999, {"x": 123, "y": "foo"}]
-- that is: number, string, number, boolean, number, object
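And the same function with JSON-jargon type names (a quick check; note that 'string' simply falls through to the ELSE branch):
SELECT csv_to_jsonb(
  'abc|3.14|false',
  array['string','number','boolean']
);
-- results ["abc", 3.14, false]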
rows = [{"id": 1, "json_value": [{"key": "value"}, {"key2": "value2"}]}, {"id": 2, "json_value": None}]
insert_query = table.insert().values(rows)
connection.execute(insert_query)
Doing this inserts the JSON value null for the row where id=2, rather than a SQL NULL.
Is there any way to properly do a multi-row insert where the value of some JSON columns is SQL NULL?
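For context, the underlying distinction is between a JSON null value and a SQL NULL, which are different things at the database level. A sketch against a hypothetical table t with a jsonb column json_value:
-- JSON null: the column holds the jsonb value 'null'
insert into t (id, json_value) values (2, 'null'::jsonb);
-- SQL NULL: the column holds no value at all
insert into t (id, json_value) values (2, NULL);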
The issue was a bug and has been fixed by the SQLAlchemy project maintainer.
Details here: https://groups.google.com/forum/#!topic/sqlalchemy/Bu4lJ18Gsa8