Sequelize how to use aggregate function on Postgres JSONB column - postgresql

I have created a table with a JSONB column named "data".
A sample value of that column is:
[{"field_id": 1, "value": 10}, {"field_id": 2, "value": "some string"}]
Now there are multiple rows like this.
What I want:
I want to use an aggregate function on the "data" column so that I get
the sum of all values where field_id = 1;
the average of all values where field_id = 1.
I have searched a lot on Google but haven't been able to find a proper solution.
Sometimes it says "Field doesn't exist" and sometimes "from clause missing".
I tried referring to it as data.value, then data -> value, and lastly data ->> value,
but nothing works.
Please let me know the solution if anyone knows one.
Thanks in advance.

Your attributes should be something like this, so you instruct it to run the function on a specific value (note the cast: ->> extracts text, and sum()/avg() need a numeric):
attributes: [
[sequelize.fn('sum', sequelize.literal("(data->>'value')::numeric")), 'json_sum'],
[sequelize.fn('avg', sequelize.literal("(data->>'value')::numeric")), 'json_avg']
]
Then in WHERE, you reference field_id in a similar way, using literal() (again, ->> yields text, so compare against a string):
where: sequelize.literal("data->>'field_id' = '1'")
Your example also included a string for the value of "value", which of course won't sum. But once the basic Sequelize setup works on a clean set of data, you can enhance the WHERE clause to test for numeric "value" data; there are good examples here: Postgres query to check a string is a number
Hopefully this gets you close. In my experience with Sequelize + Postgres, it helps to run the program in a way that lets you see the queries it generates, e.g. in a terminal with streaming output. On the way to a working statement, you'll either create objects Sequelize doesn't like, or Sequelize will create bad queries Postgres doesn't like. If the query looks close, take it into pgAdmin for further work, then try to reproduce your adjustments in Sequelize. Good luck!
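To sanity-check what the generated query should compute, here is a plain-JavaScript mirror of that sum/avg over the question's sample shape (no database involved; the second row and its numbers are invented for illustration):

```javascript
// Each element mirrors one row's parsed JSONB "data" column
const rows = [
  { data: [{ field_id: 1, value: 10 }, { field_id: 2, value: 'some string' }] },
  { data: [{ field_id: 1, value: 20 }, { field_id: 2, value: 'another string' }] },
];

// Collect the numeric values for field_id = 1 across all rows
const values = rows
  .flatMap((row) => row.data)
  .filter((entry) => entry.field_id === 1)
  .map((entry) => Number(entry.value)); // plays the role of the ::numeric cast

const sum = values.reduce((total, v) => total + v, 0);
const avg = sum / values.length;

console.log(sum, avg); // 30 15
```

If the SQL that Sequelize emits doesn't produce numbers like these on equivalent data, the problem is in the query, not in the ORM wiring.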

Related

Redshift Spectrum table doesn't recognize array

I ran a crawler on a JSON S3 file to update an existing external table.
Once it finished, I checked SVL_S3LOG to see the structure of the external table and saw it was updated: I have a new column with type Array<int>, as expected.
When I tried to execute select * on the external table, I got this error: "Invalid operation: Nested tables do not support '*' in the SELECT clause.;"
So I tried to spell out the select statement with all the column names:
select name, date, books.... (books is the Array<int> column)
from external_table_a1
and got this error:
Invalid operation: column "books" does not exist in external_table_a1;"
I also checked the table external_table_a1 under AWS Glue and saw that the column "books" is recognized and has the type Array<int>.
Can someone explain why my simple query is wrong?
What am I missing?
Querying JSON data is a bit of a hassle with Redshift: when parsing is enabled (e.g. using the appropriate SerDe configuration), the JSON is stored as a SUPER type. In your case that's the Array<int>.
The AWS documentation on querying semistructured data seems pretty straightforward, mentioning that PartiQL uses "dotted notation and array subscript for path navigation when accessing nested data". This doesn't work for me, although I can't find any reason for that in their SUPER limitations documentation.
Solution 1
What I have to do is set the flags set json_serialization_enable to true; and set json_serialization_parse_nested_strings to true;, which serialize the SUPER type as JSON (i.e. back to JSON). I can then use JSON functions to query the data. Unnesting the data gets even stranger, because on SUPER types you can only use the unnest syntax select item from table as t, t.items as item. I genuinely don't think this is the intended way to query and unnest SUPER objects, but it's the only approach that worked for me.
This is described in an older "Amazon Redshift Developer Guide".
Solution 2
When you write a query, Redshift tries to fit the output into one of the basic column data types. If the result of your query does not match any of those types, Redshift will not process the query. Hence, in order to convert a SUPER column to a compatible type, you have to unnest it (using the rather peculiar Redshift unnest syntax).
For me this works in certain cases, but I'm not always able to properly index arrays, nor can I access the array index (using the my_table.array_column as array_entry at array_index syntax).

Postgres with Knex - query where an array contains a particular value

Surprising as it is, I haven't been able to figure out how to use Knex to return the rows whose array column contains a particular value. I see in the Postgres docs that it's possible.
in sql
SELECT ... FROM fdt WHERE c1 IN (1, 2, 3)
but I can't find how this would be done with Knex. One suggestion online was to use whereIn.
So I tried:
knex.from('pin_table').select('*').whereIn('likes_recieved', id)
.whereIn('likes_sent', id)
(I also tried whereIn with the array and search parameter switched, like .whereIn(id, 'likes_recieved'),)
but I keep getting back a syntax error:
"select * from "pin_table" where "10" in $1 and "10" in $2 - syntax error at or near "$1"
Can anyone tell me how to do this with knex or knex.raw?
Thanks
This depends a bit on whether you're using the ARRAY type, or whether likes_sent is built from a subquery, etc. If it's an array, you're probably going to need ANY:
knex.from('pin_table').whereRaw('? = ANY(likes_received)', id);
That's a best guess without seeing your schema. Usually I'd be tempted to use a separate table rather than the Postgres array:
knex('pin_table as p')
.join('likes as l', 'p.id', 'l.pin_id')
.where('l.user_id', id);
See https://stackoverflow.com/a/20220450/122643 for a discussion of potential pitfalls and advantages of Postgres arrays.

Why Doesn't Knex Use Table Column Names on the Resulting Row Objects?

I'm making a query with Knex (PostgreSQL):
knex('foos')
.select(columnsThatAreAMixOfFooAndBarColumns)
.leftJoin('bars', 'foos.bar_id', 'bars.id')
.where({ 'foos.id': id })
When I get back results, I don't get [{foo_id: 1, bar_id: 2, .... Instead I get [{ id: 1, ....
What's so bizarre is that Knex is building the right query: if I log it, I'll see SELECT "foos"."id", "bars"."id", .... But for some strange reason I can't figure out, Knex merges all the resulting columns together, so I lose bars.id entirely and foos.id gets renamed to id.
Can anyone explain what's going on, and how I can fix it? I've tried adding returning, using columns instead of select, and otherwise changing how I make the query, but it doesn't help; again, the query Knex builds isn't the problem: it's the results.
EDIT: I re-ran the exact same query using knex.raw(query) instead ... and got the exact same (id, not foos_id) result. So it's definitely not how the query is made that's the problem: I'm somehow doing something that makes Knex build the results in a way that ignores the qualified column names.
Well, as far as I can tell (and it's difficult to tell, because the documentation is almost exclusively about query building, with almost nothing about the results), the answer is "this is just how Knex works". It seems like it would make more sense for the library to return the qualified names, but for whatever reason it doesn't.
Instead, if you want to JOIN tables in Knex, it seems you have to alias (i.e. foos.id AS foos_id) every column that shares a name :(
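For what it's worth, the collapse happens when each row is turned into a plain object keyed by the column name the server reports; with two columns both named id, the later one wins. A minimal plain-JS sketch of that collision (the field names and values are illustrative, not Knex internals):

```javascript
// Column names as the server reports them for
//   SELECT "foos"."id", "bars"."id" ... -- both come back as just "id"
const fieldNames = ['id', 'id'];
const rowValues = [1, 2]; // foos.id = 1, bars.id = 2

// Build the row object the way a driver does: later duplicates overwrite
const row = {};
fieldNames.forEach((name, i) => {
  row[name] = rowValues[i];
});

console.log(row); // { id: 2 } -- foos.id is silently gone
```

Aliasing each shared column, e.g. .select('foos.id as foo_id', 'bars.id as bar_id'), keeps the keys distinct.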

Using bookshelf.js to query JSON column in a PostgreSQL database

I have a column in the users table of my Postgres database named email_subscription_prefs, which stores some information in JSON format. It has an array_length of 1.
Sample Data:
[{"marketing":true,"transactional":true,"collaboration":true,"legal":true,"account":true}]
Issue:
I am trying to use the bookshelf.js ORM to query and search all records in this table based on the value of the marketing key, specifically when its value is true.
Here is an edited snippet of my code showing how I'm trying to implement this query with bookshelf.js:
return new User()
.query(function (qb) {
qb.where(function () {
this.where('domicile', 'USA').orWhere('domicile', null)
})
qb.whereRaw('cast(email_subscription_prefs->>? as boolean) = ?', ['marketing', true])
qb.limit(100)
})
Can someone tell me what I'm doing wrong in the qb.whereRaw statement, where I'm trying to query the JSON column email_subscription_prefs?
The code returns nothing, while there are several thousand records in the users table.
Thanks in advance.
Your sample data has an array of objects instead of a single JSON object:
[
{
"marketing":true,
"transactional":true,
"collaboration":true,
"legal":true,
"account":true
}
]
so it looks like you are trying to reference email_subscription_prefs->>'marketing', which is not found in the array.
To fetch the marketing attribute of the first item in the array, you should use:
email_subscription_prefs->0->>'marketing'
If that is not the problem, then you would need to add some example data from your DB to be able to tell what the problem is. Your current description doesn't describe the queried table well enough.
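A plain-JavaScript analogue may make the path semantics clearer (the object below just stands in for the stored column value; no Bookshelf or database involved): ->0 indexes into the array first, then ->>'marketing' extracts the key.

```javascript
// The stored column value: an array wrapping one preferences object
const emailSubscriptionPrefs = [
  { marketing: true, transactional: true, collaboration: true,
    legal: true, account: true }
];

// email_subscription_prefs->>'marketing': key lookup on an array -> no match
const wrongPath = emailSubscriptionPrefs['marketing']; // undefined (SQL: NULL)

// email_subscription_prefs->0->>'marketing': index first, then key
const rightPath = emailSubscriptionPrefs[0]['marketing'];

console.log(wrongPath, rightPath); // undefined true
```

This mirrors why the original whereRaw matched nothing: every row's ->>'marketing' lookup evaluated to NULL, so the cast-to-boolean comparison never held.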

FsSql Not working when Parameterizing Columns

Using F#, FsSql, and Postgres
So I'm using this function
let getSqlParameter value =
let uniqueKey = Guid.NewGuid().ToString("N")
let key = (sprintf "#%s" uniqueKey)
(key,Sql.Parameter.make(key,value))
to get a parameter for anything I pass in dynamically,
which I then append to a query, so I get something like this:
select * from (select * from mytable) as innerQuery where #a29c575b69bb4629a9971dac2808b445 LIKE '%#9e3485fdf99249e5ad6adb6405f5f5ca%'
Then I take a collection of these and pass them off:
Sql.asyncExecReader connectionManager query parameters
The problem I'm having is that when I don't run this through my parameterization engine, it works fine. When I do, it doesn't work; it just returns empty sets.
The only thing I can think of is that the column names can't be parameterized. This is a problem because they're coming from the client. Is there a way to do this?
Okay, so the answer here is that you can't parameterize column names, as far as I can tell.
What I ended up doing was creating a whitelist of acceptable column names and then comparing what comes in against the whitelist. If a name doesn't exist there, I drop it.
By far a sub-optimal solution. I really wish there were a way to do this.
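The whitelist idea is language-agnostic; here is a minimal sketch of it in JavaScript (the column names and the filter shape are hypothetical, not taken from the original F# code):

```javascript
// Hypothetical whitelist of column names the client may filter on
const ALLOWED_COLUMNS = new Set(['first_name', 'last_name', 'email']);

// Keep only filters whose column name is on the whitelist;
// the VALUES still go through normal query parameterization.
function sanitizeFilters(filters) {
  return filters.filter(({ column }) => ALLOWED_COLUMNS.has(column));
}

const incoming = [
  { column: 'email', value: '%@example.com' },
  { column: '1; DROP TABLE users; --', value: 'x' }, // dropped by the whitelist
];

console.log(sanitizeFilters(incoming)); // only the 'email' filter survives
```

The key point is that identifiers are validated against a fixed set, while values remain bound as parameters; the database never sees a client-supplied column name that wasn't pre-approved.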