Postgres, where in on a json object - postgresql

I've been looking around and can't seem to find anything that is helping me understand how I can achieve the following. (Bear in mind I have simplified this to the problem I'm having and I am only storing simple JSON objects in this field)
Given I have a table "test" defined
CREATE TABLE test (
id int primary key
, features jsonb
)
And some test data
 id | features
----+-----------------------
  1 | {"country": "Sweden"}
  2 | {"country": "Denmark"}
  3 | {"country": "Norway"}
I've been trying to filter on the JSONB column "features". I can do this easily with one value
SELECT *
FROM test
WHERE features @> '{"country": "Sweden"}'
But I've been having trouble working out how I could filter by multiple values succinctly. I can do this
SELECT *
FROM test
WHERE features @> '{"country": "Sweden"}'
OR features @> '{"country": "Norway"}'
But I have been wondering if there would be an equivalent to WHERE IN ($1, $2, ...) for JSONB columns.
I suspect that I will likely need to stick with the WHERE... OR... but would like to know if there is another way to achieve this.

You can use jsonb->>'field_name' to extract a field as text, then use any operator compatible with the text type:
SELECT *
FROM test
WHERE features->>'country' = 'Sweden'
SELECT *
FROM test
WHERE features->>'country' in ('Sweden', 'Norway')
You can also work with jsonb directly, as follows:
jsonb->'field_name' extracts the field as jsonb, then you can use any operator compatible with jsonb:
SELECT *
FROM test
WHERE features->'country' ?| array['Sweden', 'Norway']
See docs for more details
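If you want an IN-like form but prefer to keep the containment operator from the question, one more option (just a sketch, assuming the flat one-key objects shown above) is to compare against an array of jsonb values with ANY:
SELECT *
FROM test
WHERE features @> ANY (ARRAY['{"country": "Sweden"}', '{"country": "Norway"}']::jsonb[])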

You can extract the country value and then use a regular IN condition:
select *
from test
where features ->> 'country' in ('Sweden', 'Norway')
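A side note on indexing, in case the table grows large (a rough sketch, with hypothetical index names): the containment and existence forms (@>, ?|) can use a GIN index on the whole column, while the features->>'country' comparison needs an expression index.
-- GIN index on the column; supports @>, ?, ?| and ?& with the default jsonb_ops opclass
CREATE INDEX test_features_gin ON test USING gin (features);
-- expression index; supports equality/IN on the extracted text
CREATE INDEX test_features_country ON test ((features->>'country'));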

Related

What is the equivalent of SQL subquery in MongoDB?

Is there a way to combine two finds in MongoDB similar to the SQL subqueries?
What would be the equivalent of something like:
SELECT * FROM TABLE1 WHERE name = (SELECT name FROM TABLE2 WHERE surname = 'Smith');
I need to get a uuid from one collection, searching by email, and then use it to filter in another. I would like, if possible, to do it with just one find, instead of getting the uuid, storing it in a variable and then searching a second time using the variable...
Here are the two that I want combined somehow:
db.getCollection('person').find({email:'perry.goodwin@yahoo.com'}).sort({_id:-1});
db.getCollection('case').find({applicantUuid:'4a17e96c-caf9-4d78-a853-73e190005c63'});

Using a list as replacement for singular patterns in regexp_replace

I have a table that I need to delete random words/characters out of. To do this, I have been using a regexp_replace function with the addition of multiple patterns. An example is below:
select regexp_replace(combined,'\y(NAME|001|CONTAINERS:|MT|COUNT|PCE|KG|PACKAGE)\y','', 'g')
as description, id from export_final;
However, in the full list there are around 70 different patterns that I replace out of the description. As you can imagine, the code is very cluttered. This leads me to my question: is there a way to put these patterns into another table and then use that table to check the descriptions?
Of course. Populate your desired 'other' table with the patterns you need. Then create a CTE that uses the string_agg function to build the regex. Example:
create table exclude_list( pattern_word text);
insert into exclude_list(pattern_word)
values('NAME'),('001'),('CONTAINERS:'),('MT'),('COUNT'),('PCE'),('KG'),('PACKAGE');
with exclude as
( select '\y(' || string_agg(pattern_word,'|') || ')\y' regex from exclude_list )
-- CTE simulates actual table to provide test data
, export_final (id,combined) as (values (0,'This row 001 NAME Main PACKAGE has COUNT 3 units'),(1,'But single package can hold 6 KG'))
select regexp_replace(combined,regex,'', 'g')
as description, id
from export_final cross join exclude;
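Against the real export_final table (dropping the CTE that only simulates test data above), the same idea boils down to roughly:
with exclude as
( select '\y(' || string_agg(pattern_word,'|') || ')\y' as regex from exclude_list )
select regexp_replace(combined, regex, '', 'g') as description, id
from export_final cross join exclude;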

Casting rows to arrays in PostgreSQL

I need to query a table as in
SELECT *
FROM table_schema.table_name
only each row needs to be a TEXT[], with array values corresponding to the column values cast to TEXT, in the same order as in SELECT *. So, assuming the table has columns a, b and c, I need the result to look like
SELECT ARRAY[a::TEXT, b::TEXT, c::TEXT]
FROM table_schema.table_name
only it shouldn't explicitly list columns by name. Ideally it should look like
SELECT as_text_array(a)
FROM table_schema.table_name AS a
The best I came up with looks ugly and relies on the "hstore" extension
WITH columnz AS ( -- get ordered column name array
SELECT array_agg(attname::TEXT ORDER BY attnum) AS column_name_array
FROM pg_attribute
WHERE attrelid = 'table_schema.table_name'::regclass AND attnum > 0 AND NOT attisdropped
)
SELECT hstore(a)->(SELECT column_name_array FROM columnz)
FROM table_schema.table_name AS a
I have a feeling there must be a simpler way to achieve that
UPDATE 1
Another query that achieves the same result, but is arguably as ugly and inefficient as the first one, is inspired by the answer by @bspates. It may be even less efficient, but doesn't rely on extensions
SELECT r.text_array
FROM table_schema.table_name AS a
INNER JOIN LATERAL ( -- parse the ROW::TEXT representation of a row
SELECT array_agg(COALESCE(replace(val[1], '""', '"'), NULLIF(val[2], ''))) AS text_array
FROM regexp_matches(a::text, -- parse double-quoted and simple values separated by commas
'(?<=\A\(|,) (?: "( (?:[^"]|"")* )" | ([^,"]*) ) (?=,|\)\Z)', 'xg') AS t(val)
) AS r ON TRUE
It is still far from ideal
UPDATE 2
I tested all 3 options available at the moment:
Using JSON. It doesn't rely on any extensions, it is short to write, easy to understand, and the speed is OK.
Using hstore. This alternative is the fastest (>10 times faster than the JSON approach on a 100K dataset) but requires an extension. hstore in general is a very handy extension to have, though.
Using regex to parse the TEXT representation of a ROW. This option is really slow.
A somewhat ugly hack is to convert the row to a JSON value, then unnest the values and aggregate it back to an array:
select array(select (json_each_text(to_json(t))).value) as row_value
from some_table t
Which is to some extent the same as your hstore hack.
If the order of the columns is important, then json_each_text() with ordinality can be used to keep it:
select array(select val
from json_each_text(to_json(t)) with ordinality as t(k,val,idx)
order by idx)
from the_table t
The easiest (read: hackiest) way I can think of is to convert to a string first and then parse that string into an array. Like so:
SELECT string_to_array(table_name::text, ',') FROM table_name
BUT depending on the size and type of the data in the table, this could perform very badly.
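Besides speed, it can also give wrong results: the text form of a row keeps its parentheses and quoting, and any value containing a comma gets split. A quick illustration with hypothetical inline values:
select string_to_array(t::text, ',')
from (values ('a,b', 'c')) as t(x, y);
-- the row's text form is ("a,b",c), so splitting on ',' yields three elements
-- and leaves stray quotes and parentheses behind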

How to combine DISTINCT and ORDER BY in array_agg of jsonb values in PostgreSQL

Note: I am using the latest version of Postgres (9.4)
I am trying to write a query which does a simple join of 2 tables, and groups by the primary key of the first table, and does an array_agg of several fields in the 2nd table which I want returned as an object. The array needs to be sorted by a combination of 2 fields in the json objects, and also uniquified.
So far, I have come up with the following:
SELECT
zoo.id,
ARRAY_AGG(
DISTINCT ROW_TO_JSON((
SELECT x
FROM (
SELECT animals.type, animals.name
) x
))::JSONB
-- ORDER BY animals.type, animals.name
)
FROM zoo
JOIN animals ON animals.zooId = zoo.id
GROUP BY zoo.id;
This results in one row for each zoo, with an aggregated array of jsonb objects, one for each animal, without duplicates.
However, I can't seem to figure out how to also sort this by the parameters in the commented out part of the code.
If I take out the distinct, I can ORDER BY original fields, which works great, but then I have duplicates.
If you use row_to_json() you will lose the column names unless you put in a row that is typed. If you "manually" build the jsonb object with json_build_object() using explicit names then you get them back:
SELECT zoo.id, array_agg(za.jb) AS animals
FROM zoo
JOIN (
SELECT DISTINCT ON (zooId, "type", "name")
zooId, json_build_object('animal_type', "type", 'animal_name', "name")::jsonb AS jb
FROM animals
ORDER BY zooId, jb->>'animal_type', jb->>'animal_name'
-- ORDER BY zooId, "type", "name" is far more efficient
) AS za ON za.zooId = zoo.id
GROUP BY zoo.id;
You can ORDER BY the elements of a jsonb object, as shown above, but (as far as I know) you cannot use DISTINCT on a jsonb object. In your case this would be rather inefficient anyway (first building all the jsonb objects, then throwing out duplicates) and at the aggregate level it is plain impossible with standard SQL. You can achieve the same result, however, by applying the DISTINCT clause before building the jsonb object.
Also, avoid use of SQL key words like "type" and standard data types like "name" as column names. Both are non-reserved keywords so you can use them in their proper contexts, but practically speaking your commands could get really confusing. You could, for instance, have a schema, with a table, a column in that table, and a data type each called "type" and then you could get this:
SELECT type::type FROM type.type WHERE type = something;
While PostgreSQL will graciously accept this, it is plain confusing at best and prone to error in all sorts of more complex situations. You can get a long way by double-quoting any key words, but they are best just avoided as identifiers.

Dynamic number of fields in table

I have a problem with T-SQL. I have a number of tables; each table contains a different number of fields with different names.
I need to dynamically take all these tables, read all records, and turn each record into a string where each value is separated by commas, and then do something with this string.
I think I need to use CURSORS, but I can't FETCH them without knowing the exact number of fields and their names and types. Maybe I can create a table variable with a dynamic number of fields?
Thanks a lot!
Makarov Artem.
I would repurpose one of the many T-SQL scripts written to generate INSERT statements. They do exactly what you require. Namely:
Reverse engineer a given table to determine column names and types
Generate a delimited string of values
The most complete example I've found is here
But just a simple Google search for "INSERT STATEMENT GENERATOR" will yield several examples that you can repurpose to fit your needs.
Best of luck!
SELECT
ORDINAL_POSITION
,COLUMN_NAME
,DATA_TYPE
,CHARACTER_MAXIMUM_LENGTH
,IS_NULLABLE
,COLUMN_DEFAULT
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_NAME = 'MYTABLE'
ORDER BY
ORDINAL_POSITION ASC;
from http://weblogs.sqlteam.com/joew/archive/2008/04/27/60574.aspx
Perhaps you can do something with this.
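Building on that metadata query, here is one rough sketch of assembling the comma-separated string per row (it assumes a newer SQL Server with STRING_AGG, dynamic SQL via sp_executesql, and MYTABLE as a placeholder table name):
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- build "ISNULL(CAST([col] AS nvarchar(max)), '')" for every column, joined with + ',' +
SELECT @cols = STRING_AGG(
    CONVERT(nvarchar(max), 'ISNULL(CAST(' + QUOTENAME(COLUMN_NAME) + ' AS nvarchar(max)), '''')'),
    ' + '','' + ') WITHIN GROUP (ORDER BY ORDINAL_POSITION)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'MYTABLE';

SET @sql = N'SELECT ' + @cols + N' AS C FROM MYTABLE;';
EXEC sp_executesql @sql;  -- one comma-separated string per row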
select T2.X.query('for $i in *
return concat(data($i), ",")'
).value('.', 'nvarchar(max)') as C
from (
select *
from YourTable
for xml path('Row'),elements xsinil, type
) as T1(X)
cross apply T1.X.nodes('/Row') T2(X)
It will give you one row for each row in YourTable with each value in YourTable separated by a comma in the column C.
This builds an XML for the entire table and then parses that XML. Might get you into trouble if you have tables with a lot of rows.
BTW: I saw from a comment that you can "use only pure SQL". I really don't think this qualifies as "pure SQL" :).