Elixir Ecto - PostgreSQL jsonb Functions - postgresql

I am in the process of converting a Ruby on Rails API over to Elixir and Phoenix. In my Postgres database, I have a table with a jsonb column type. One of the keys in the json is an array of colors. For example:
{"id": 12312312, "colors": ["Red", "Blue", "White"]}
What I am trying to do from Ecto is query my table for all records that contain the colors Red or Blue. Essentially, recreate this query:
select * from mytable where data->'colors' ?| array['Red', 'Blue']
I'm having some difficulties constructing this query with Ecto. Here is what I have:
Note: "value" will be a pipe delimited list of colors
def with_colors(query, value) do
colors = value
|> String.split("|")
|> Enum.map(fn(x) -> "'#{x}'" end)
|> Enum.join(", ")
# colors should look like "'Red', 'Blue'"
from c in query,
where: fragment("data->'colors' \\?| array[?]", ^colors))
end
This is currently not working as expected. I am having issues with the replacement question mark, as it seems to wrap additional quotes around my field. What is the proper way to do this use fragment? Or maybe there is a better way?
I'm going to run into this problem again because I'm also going to have to recreate this query:
select * from mytable where data->'colors' #> '["Red", "Blue"]'

I have found a solution to my problem.
def with_colors(query, value) do
colors = value
|> String.split("|")
from c in query,
where: fragment("data->'colors' \\?| ?", ^colors))
end

Related

Ecto - casting an array of strings to integers in a fragment

In an Ecto query I'm running against Postgres (13.6), I need to interpolate a list of ids into a fragment - this is generally not something I have a problem with, but in this case, the list of ids is being received as a list of strings that need to be cast to integers (or, more specifically, BIGINT). The query that I think that I need is as follows, which the troublesome bit being ANY(ARRAY?::BIGINT[]):
ModelA
|> where(
[ma],
fragment(
"EXISTS (SELECT * FROM model_b mb WHERE mb.a_id = ? AND mb.c_id = ANY(ARRAY?::BIGINT[]))",
a.id,
^c_ids
)
)
where c_ids would be a list like ["1449441579", "2345556834"]
However, when I run this, I get the error
(Postgrex.Error) ERROR 42703 (undefined_column) column "array$4" does not exist
referring to the generated SQL
ANY(ARRAY$4::BIGINT[])
Of course, I could convert the array of c_ids to integers beforehand in my app code, but I'd like to see if I can get it to cast in the query itself.
Writing the fragment in straight SQL works out just fine:
SELECT * FROM model_b mb WHERE mb.a_id = 1 AND mb.c_id = ANY(ARRAY['1449441579', '2345556834']::BIGINT[]);
What is the idiomatic way to get this kind of array casting to work in an Ecto fragment? Many thanks.
Just to codify my comment, I would do the integer conversion before the query. You can use the dynamic macro to support IN queries:
import Ecto.Query
alias YourApp.Repo
alias YourApp.SomeSchema, as: ModelA
strs = ["1", "2", "3"]
ids = Enum.map(strs, fn x -> String.to_integer(x) end)
conditions = dynamic([tbl], tbl.id in ^ids)
Repo.all(
from ma in ModelA,
where: ^conditions
)
|> IO.inspect()
I did find a post that led me to a potential solution here - https://elixirforum.com/t/interpolating-lists-into-ecto-query-fragment-1/16690/7 - however, I'm not convinced that this is optimal for me.
By using Enum.join and converting my initial array of strings into a string and then allowing Postgres to cast that string to an array of bigints, it worked out:
ModelA
|> where(
[ma],
fragment(
"EXISTS (SELECT * FROM model_b mb WHERE mb.a_id = ? AND mb.c_id = ANY(ARRAY?::BIGINT[]))",
a.id,
^Enum.join(c_ids, ",")
)
)
So I'm posting it here for reference and because it does "work", but I feel sort of silly doing this, because I specifically was trying to avoid processing the list before passing it to PG... so yeah I'd still really like to hear about the "right" way to do this...

In Elixir with Postgres, how can I have the database return the enum values which are NOT in use?

I have an EctoEnum.Postgres:
# #see: https://en.wikipedia.org/wiki/ISO_4217
defmodule PricingEngine.Pricing.CurrencyEnum do
#options [
:AED,
:AFN,
# snip...
:ZWL
]
use EctoEnum.Postgres,
type: :currency,
enums: #options
def values, do: #options
end
This enum has been included in our Postgres database
We also have a structure:
defmodule PricingEngine.Pricing.Currency do
use Ecto.Schema
import Ecto.Changeset
schema "currencies" do
field(:currency, PricingEngine.Pricing.CurrencyEnum)
timestamps()
end
#doc false
def changeset(currency, attrs) do
currency
|> cast(attrs, [:currency])
|> validate_required([:currency])
|> unique_constraint(:currency)
end
end
We can currently successfully use the following functions to figure out which currencies are active/used:
def active_currency_isos do
Repo.all(select(Currency, [record], record.currency))
end
defdelegate all_currency_isos,
to: CurrencyEnum,
as: :values
def inactive_currency_iso do
Pricing.all_currency_isos() -- Pricing.active_currency_isos()
end
This works, but I'm led to believe this could be more efficient if we just asked the database for this information.
Any idea(s) how to do this?
If you want to get a list of all the used enums you should just do a distinct on the currency field. This uses the Postgres DISTINCT ON operator:
from(c in Currency,
distinct: c.currency,
select: c.currency
)
This will query the table, unique by the currency column, and return only the currency column values. You should get an array of all of the enums that exist in the table.
There are some efficiency concerns with doing it this way which could be mitigated by materialized views, lookup tables, in-memory cache etc. However, if your data set isn't extremely large, you should be able to use this for a while.
Edit:
Per the response, I will show how to get the unused enums.
There are 2 ways to do this.
Pure SQL
This query will get all of the used ones and do a difference from the entire set of available enums. The operator we use to do this is EXCEPT and you can get a list of all available enums with enum_range. I will use unnest to turn the array of enumerated types into individual rows:
SELECT unnest(enum_range(NULL::currency)) AS unused_enums
EXCEPT (
SELECT DISTINCT ON (c.name) c.name
FROM currencies c
)
You can execute this raw SQL in Ecto by doing this:
Ecto.Adapters.SQL.query!(MyApp.Repo, "SELECT unnest(...", [])
From this you'll get a Postgresx.Result that you'll have to get the values out of:
result
|> Map.get(:rows, [])
|> List.flatten()
|> Enum.map(&String.to_existing_atom/1)
I'm not really sure of a way to code this query up in pure Ecto, but let me know if you figure it out.
In Code
You can do the first query that I posted before with distinct then do a difference in the code.
query = from(c in Currency,
distinct: c.currency,
select: c.currency
)
CurrencyEnum.__enums__() -- Repo.all(query)
Either way is probably negligible in terms of performance so it's up to you.

Deep search within jsonb field PostgreSQL

A sample of my data looks something like this:
{"city": "NY",
"skills": [
{"soft_skills": "Analysis"},
{"soft_skills": "Procrastination"},
{"soft_skills": "Presentation"}
],
"areas_of_training": [
{"areas of training": "Visio"},
{"areas of training": "Office"},
{"areas of training": "Risk Assesment"}
]}
I would like to run a query to find users with soft_skills Analysis and maybe run another one to find users whose area of training is Visio and Risk Assesment
My column type is jsonb. How can I implement a search query on these deeply nested objects? A query on level one for city works using SELECT * FROM mydata WHERE content::json->>'city'='NY';
How can I also run a match using the LIKE keyword or string matching for deeply nested values?
1)
SELECT * FROM mydata
WHERE content->'skills' #> '[{"soft_skills": "Analysis"}]';
2)
SELECT * FROM mydata
WHERE content->'areas_of_training' #> '[{"areas of training": "Visio"},{"areas of training": "Risk Assesment"}]';
About JSON(B) operators
PS: And be ready for extremely slow queries. I highly recommend to think about data normalization.
Update for LIKE
For your example data it could be:
SELECT * FROM mydata
WHERE EXISTS (
SELECT *
FROM jsonb_array_elements(content->'areas_of_training') as a
WHERE a->>'areas of training' ilike '%vi%');
But query highly depending on the actual JSON structure.
Use json_array_elements() to get values of nested elements, examples:
select d.*
from mydata d,
json_array_elements(content->'skills')
where value->>'soft_skills' ilike '%analysis%';
select d.*
from mydata d,
json_array_elements(content->'areas_of_training')
where value->>'areas of training' ~* 'visio|office';
It is possible that the query yields duplicate rows, so it is reasonable to use select distinct on (id), where id is a primary key.
Note that the function json_array_elements() is costly and you cannot use indexes in contrary to Abelisto's solution. However you have to use it if you want to have an access to values of nested json elements.

querying JSONB with array fields

If I have a jsonb column called value with fields such as:
{"id": "5e367554-bf4e-4057-8089-a3a43c9470c0",
"tags": ["principal", "reversal", "interest"],,, etc}
how would I find all the records containing given tags, e.g:
if given: ["reversal", "interest"]
it should find all records with either "reversal" or "interest" or both.
My experimentation got me to this abomination so far:
select value from account_balance_updated
where value #> '{}' :: jsonb and value->>'tags' LIKE '%"principal"%';
of course this is completely wrong and inefficient
Assuming you are using PG 9.4+, you can use the jsonb_array_elements() function:
SELECT DISTINCT abu.*
FROM account_balance_updated abu,
jsonb_array_elements(abu.value->'tags') t
WHERE t.value <# '["reversal", "interest"]'::jsonb;
As it turned out you can use cool jsonb operators described here:
https://www.postgresql.org/docs/9.5/static/functions-json.html
so original query doesn't have to change much:
select value from account_balance_updated
where value #> '{}' :: jsonb and value->'tags' ?| array['reversal', 'interest'];
in my case I also needed to escape the ? (??|) because I am using so called "prepared statement" where you pass query string and parameters to jdbc and question marks are like placeholders for params:
https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html

Select from any of multiple values from a Postgres field

I've got a table that resembles the following:
WORD WEIGHT WORDTYPE
a 0.3 common
the 0.3 common
gray 1.2 colors
steeple 2 object
I need to pull the weights for several different words out of the database at once. I could do:
SELECT * FROM word_weight WHERE WORD = 'a' OR WORD = 'steeple' OR WORD='the';
but it feels ugly and the code to generate the query is obnoxious. I'm hoping that there's a way I can do something like (pseudocode):
SELECT * FROM word_weight WHERE WORD = 'a','the';
You are describing the functionality of the in clause.
select * from word_weight where word in ('a', 'steeple', 'the');
If you want to pass the whole list in a single parameter, use array datatype:
SELECT *
FROM word_weight
WHERE word = ANY('{a,steeple,the}'); -- or ANY('{a,steeple,the}'::TEXT[]) to make explicit array conversion
If you are not sure about the value and even not sure whether the field will be an empty string or even null then,
.where("column_1 ILIKE ANY(ARRAY['','%abc%','%xyz%']) OR column_1 IS NULL")
Above query will cover all possibility.