Surprising as it is, I haven't been able to figure out how to use knex to return rows where an array column contains a particular value. I see in the Postgres docs that it's possible.
In SQL:
SELECT ... FROM fdt WHERE c1 IN (1, 2, 3)
but I can't find how this would be done with knex. One suggestion online was to use whereIn, so I tried:
knex.from('pin_table').select('*').whereIn('likes_recieved', id)
.whereIn('likes_sent', id)
(I also tried whereIn with the array and search parameters switched, like .whereIn(id, 'likes_recieved').)
But I keep getting back a syntax error:
select * from "pin_table" where "10" in $1 and "10" in $2 - syntax error at or near "$1"
Can anyone tell me how to do this with knex or knex.raw?
Thanks
This depends a bit on whether you're using the ARRAY type, or if likes_sent is built from a subquery etc. If it's an array, you're probably going to need to use ANY:
knex.from('pin_table').whereRaw('? = ANY(likes_received)', id);
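For reference, that should produce SQL along these lines (a sketch, assuming id is 10 as in your error message and likes_received is an integer[] column):
-- = ANY(...) is true if any element of the array equals the left-hand value
select * from pin_table where 10 = ANY(likes_received);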
That's a best guess without seeing your schema. Usually I'd be tempted to use a separate table rather than the Postgres array:
knex('pin_table as p')
.join('likes as l', 'p.id', 'l.pin_id')
.where('l.user_id', id);
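The separate table would look something like this (a hypothetical sketch; the names and types are guesses at your schema):
-- One row per like, instead of an array column on pin_table
CREATE TABLE likes (
    pin_id  integer NOT NULL REFERENCES pin_table (id),
    user_id integer NOT NULL,
    PRIMARY KEY (pin_id, user_id)
);
Each like is then its own row, so it can be indexed, joined, and protected by ordinary constraints.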
See https://stackoverflow.com/a/20220450/122643 for a discussion of potential pitfalls and advantages of Postgres arrays.
I ran a crawler on a JSON S3 file to update an existing external table.
Once it finished, I checked SVL_S3LOG to see the structure of the external table and saw it was updated: I have a new column with the Array<int> type, as expected.
When I tried to execute select * on the external table, I got this error: "Invalid operation: Nested tables do not support '*' in the SELECT clause.;"
So I tried spelling out the select statement with all the column names:
select name, date, books.... (books is the Array<int> type)
from external_table_a1
and got this error:
"Invalid operation: column "books" does not exist in external_table_a1;"
I have also checked the table external_table_a1 under AWS Glue and saw that the column "books" is recognized and has the type Array<int>.
Can someone explain why my simple query is wrong?
What am I missing?
Querying JSON data is a bit of a hassle with Redshift: when parsing is enabled (e.g. using the appropriate SerDe configuration), the JSON is stored as a SUPER type. In your case that's the Array<int>.
The AWS documentation on Querying semistructured data seems pretty straightforward, mentioning that PartiQL uses "dotted notation and array subscript for path navigation when accessing nested data". This doesn't work for me, though, and I can't find any reason for that in their SUPER Limitations documentation.
Solution 1
What I had to do was set the flags set json_serialization_enable to true; and set json_serialization_parse_nested_strings to true;, which serialize the SUPER type as JSON (i.e. back to JSON). I could then use the JSON functions to query the data. Unnesting gets even stranger, because on SUPER types you can only use the unnest syntax select item from table as t, t.items as item. I genuinely don't think this is the intended way to query and unnest SUPER objects, but it's the only approach that worked for me.
They describe it in an older "Amazon Redshift Developer Guide".
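Concretely, Solution 1 looks something like this (a sketch against the table above; the two SET flags are real session settings, but whether the JSON functions then apply directly to your column is an assumption based on what worked for me):
-- Serialize nested/SUPER data as JSON strings in query results
set json_serialization_enable to true;
set json_serialization_parse_nested_strings to true;

-- select * now works, with books returned as a JSON string
select * from external_table_a1;

-- JSON functions can then pick the string apart, e.g. the first element
select name, json_extract_array_element_text(books, 0) as first_book
from external_table_a1;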
Solution 2
When you write a query, Redshift tries to fit the output into one of its basic column data types. If the result of your query does not match any of those types, Redshift will not process the query. Hence, to convert a SUPER to a compatible type, you have to unnest it (using the rather peculiar Redshift unnest syntax).
For me this works in certain cases, but I'm not always able to properly index arrays, nor can I access the array index (using the my_table.array_column as array_entry at array_index syntax).
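For the table above, that unnest syntax would look roughly like this (a sketch; the at clause for the element index is the one mentioned just above):
-- Each element of the books array becomes its own row
select t.name, b as book_id
from external_table_a1 as t, t.books as b;

-- The same, keeping each element's position in the array
select t.name, b as book_id, idx
from external_table_a1 as t, t.books as b at idx;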
I have created a table with a JSONB column called "data".
And the sample value of that column is
[{"field_id": 1, "value": 10}, {"field_id": 2, "value": "some string"}]
Now there are multiple rows like this.
What do I want?
I want to use aggregate functions on the "data" column such that I get:
the sum of all values where field_id = 1;
the average of all values where field_id = 1.
I have searched a lot on Google but haven't been able to find a proper solution.
Sometimes it says "Field doesn't exist" and sometimes it says "from clause missing".
I tried referring to it as data.value, then data -> value, and lastly data ->> value.
But nothing is working.
Please let me know the solution if anyone knows one.
Thanks in advance.
Your attributes should be something like this, so you instruct it to run the function on a specific value:
attributes: [
  // ->> returns text, so cast to numeric before aggregating
  [sequelize.fn('sum', sequelize.literal("(data->>'value')::numeric")), 'json_sum'],
  [sequelize.fn('avg', sequelize.literal("(data->>'value')::numeric")), 'json_avg']
]
Then in WHERE, you reference field_id in a similar way, using literal():
where: sequelize.literal("data->>'field_id' = 1")
Your example also includes a string for the value of "value", which of course won't work. But if the basic Sequelize setup works on a good set of data, you can enhance the WHERE clause to test for numeric "value" data; there are good examples here: Postgres query to check a string is a number
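For reference, since "data" is a JSON array, a fully general version of the SQL has to unnest it first; a sketch (the table name my_table is hypothetical):
-- Expand the array, keep the field_id = 1 entries, aggregate their values
select sum((elem->>'value')::numeric) as json_sum,
       avg((elem->>'value')::numeric) as json_avg
from my_table,
     jsonb_array_elements(data) as elem
where elem->>'field_id' = '1';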
Hopefully this gets you close. In my experience with Sequelize + Postgres, it helps to run the program in such a way that you see what queries it creates, like in a terminal where the output is streaming. On the way to a working statement, you'll either create objects which Sequelize doesn't like, or Sequelize will create bad queries which Postgres doesn't like. If the query looks close, take it into pgAdmin for further work, then try to reproduce your adjustments in Sequelize. Good luck!
Oracle has the ability to do bulk inserts by passing arrays as bind variables. The database then does a separate row insert for each member of the array:
http://www.oracle.com/technetwork/issue-archive/2009/09-sep/o59odpnet-085168.html
Thus if I have an array:
string[] arr = { "1", "2", "3" };
And I pass this as a bind to my SQL:
insert into my_table(my_col) values (:arr)
I end up with 3 rows in the table.
Is there a way to do this in PostgreSQL w/o modifying the SQL? (i.e. I don't want to use the copy command, an explicit multirow insert, etc)
The nearest thing you can use is:
insert into my_table(my_col) SELECT unnest(:arr)
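For example, with an inline array literal standing in for the bind variable (values taken from the question):
-- unnest() expands the array into one row per element, so this inserts 3 rows
insert into my_table(my_col)
select unnest(array[1, 2, 3]);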
PgJDBC supports COPY, and that's about your best option. I know it's not what you want, and it's frustrating that you have to use a different row representation, but it's about the best you'll get.
That said, you will find that if you prepare a statement then addBatch and executeBatch, you'll get pretty solid performance. Sufficiently so that it's not usually worth caring about using COPY. See Statement.executeBatch. You can create "array bind" on top of that with a trivial function that's a few lines long. It's not as good as server-side array binding, but it'll do pretty well.
No, you cannot do that in PostgreSQL.
You'll either have to use a multi-row INSERT or a COPY statement.
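For completeness, the multi-row INSERT form (the one the question wanted to avoid) is just:
-- One statement, three rows
insert into my_table(my_col) values (1), (2), (3);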
I'm not sure which language you're targeting, but in Java, for example, this is possible using Connection.createArrayOf().
Related question / answer:
error setting java String[] to postgres prepared statement
I need to execute some query that returns (SELECTs) a PostgreSQL array, but the documentation of SOCI's PostgreSQL backend doesn't mention anything about arrays.
If I just try to put it into a soci::rowset, it treats it as a string column and returns a string like "{1, 2, 3}", which I would hate to parse. Is there any way to get SOCI to handle that data type automatically, either with soci::into or via a soci::rowset? Or do I have to resort to JOINing against the array to get separate rows in the result set?
I'm using SOCI 3.2 and PostgreSQL 9.3.
I have a Postgres table with more than 8 million rows. Given the following two ways of doing the same query via DBD::Pg, I get wildly different results.
$q .= '%';
## query 1
my $sql = qq{
SELECT a, b, c
FROM t
WHERE Lower( a ) LIKE '$q'
};
my $sth1 = $dbh->prepare($sql);
$sth1->execute();
## query 2
my $sth2 = $dbh->prepare(qq{
SELECT a, b, c
FROM t
WHERE Lower( a ) LIKE ?
});
$sth2->execute($q);
Query 2 is at least an order of magnitude slower than query 1... it seems like it is not using the index, while query 1 is.
Would love to hear why.
With LIKE expressions, b-tree indexes can only be used if the search pattern is left-anchored, i.e. it starts with a literal prefix rather than a wildcard. More details in the manual.
Thanks to @evil otto for the link. This link goes to the current version.
Your first query provides this essential information at prepare time, so the query planner can use a matching index.
Your second query does not provide any information about the pattern at prepare time, so the query planner cannot use any indexes.
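You can see the difference with EXPLAIN; a sketch (the expression index is assumed, and it needs text_pattern_ops unless the database runs in the C locale; newer PostgreSQL versions may also replan with the actual value after a few executions):
-- Expression index matching the Lower(a) predicate
create index t_lower_a_idx on t (lower(a) text_pattern_ops);

-- Literal pattern: the planner can prove it is left-anchored and use the index
explain select a, b, c from t where lower(a) like 'foo%';

-- Bind parameter: the pattern is unknown at prepare time
prepare stmt(text) as select a, b, c from t where lower(a) like $1;
explain execute stmt('foo%');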
I suspect that in the first case the query compiler/optimizer detects that the clause is a constant, and can build an optimal query plan. In the second it has to compile a more generic query because the bound variable can be anything at run-time.
Are you running both test cases from the same file, using the same $dbh object?
I think the reason for the increased speed in the second case is that you are using a prepared statement which is already parsed (but maybe I'm wrong :)).
Ahh, I see - I will drop out after this comment since I don't know Perl. But I would trust that the editor is correct in highlighting the $q as a constant. I'm guessing that you need to concatenate the value into the string, rather than just directly referencing the variable. So, since Perl uses . for string concatenation, my guess is to use something like:
my $sql = qq{
SELECT a, b, c
FROM t
WHERE Lower( a ) LIKE '} . $q . qq{'};
(Note: unless the language is tightly integrated with the database, such as Oracle/PLSQL, you usually have to build a completely valid SQL string before submitting it to the database, instead of expecting the driver to interpolate/substitute the value of the variable.)
I would again suggest that you get the COUNT() of the statements, to make sure that you are comparing apples to apples.
I don't know Postgres at all, but I think that in line 7 (WHERE Lower( a ) LIKE '$q'), $q is actually a constant. It looks like your editor thinks so too, since it is highlighted in red. You probably still need to use the ? for the variable.
To test, do a COUNT(*), and make sure they match - I could be way offbase.