Parameterizing a query with a varying amount of WHERE clause conditionals - postgresql

Let's imagine we have a table containing columns for 'color' and for 'size'. I have a list with color-size combinations (e.g. [(red, small), (blue, medium)]), of which the length is unknown. The table should be filtered based on this list, so that the result contains only the rows where the combinations apply.
A query based on the example would look like this:
SELECT * FROM items WHERE ((color = 'red' AND size = 'small') OR (color = 'blue' and size = 'medium'));
Parameterizing this query directly wouldn't work, of course, since the number of combinations varies.
Is there a way to achieve this using parameterized queries like the ones used in node-postgres? The only solution I can think of is string interpolation, which doesn't appear to be safe.

This looks like a good scenario for the IN operator:
select * from items where
(color, size) in (('red','small'), ('blue','medium'))
and it can be parameterized using arrays:
select * from items where
(color, size) in (
select unnest (array['red','blue']), unnest(array['small','medium']))
The first array holds the colors, the second the sizes. Two unnest calls in the same SELECT list are expanded in parallel, which produces the pairs, so both arrays should have the same number of elements.
The arrays can then be passed to the query as parameters.
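A minimal sketch of how the list of pairs could be turned into the two array parameters described above. The helper names are hypothetical, and the query text is an assumption: it uses the multi-argument form of unnest in the FROM clause (PostgreSQL 9.4 and later) rather than two unnest calls in the SELECT list. The %s placeholders follow the psycopg convention; node-postgres would use $1 and $2 instead.

```python
from typing import Iterable, List, Tuple

# Assumed query text: unnest(a, b) in FROM zips the two arrays into rows,
# so each (c, s) row is one color-size combination.
PAIR_FILTER_SQL = """
SELECT *
FROM items
WHERE (color, size) IN (
    SELECT c, s
    FROM unnest(%s::text[], %s::text[]) AS t(c, s)
)
"""


def split_pairs(pairs: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split [(color, size), ...] into two parallel lists of equal length."""
    colors: List[str] = []
    sizes: List[str] = []
    for color, size in pairs:
        colors.append(color)
        sizes.append(size)
    return colors, sizes


def build_pair_filter(pairs: Iterable[Tuple[str, str]]):
    """Return (sql, params); the SQL text is the same for any number of pairs."""
    colors, sizes = split_pairs(pairs)
    return PAIR_FILTER_SQL, (colors, sizes)


sql, params = build_pair_filter([("red", "small"), ("blue", "medium")])
print(params)  # (['red', 'blue'], ['small', 'medium'])
```

Because the pairs travel as two array values, the statement always has exactly two placeholders, and no user data is interpolated into the SQL string.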

Related

How to search from multiple values in single field (Bus Stops in GTFS)?

In the GTFS data for Denver, the BUS_STOPS table stores multiple comma-separated values in the ROUTES column:
28, 19, 44, 10, 32
I'm selecting BUS_ROUTES that are within a distance of a school.
But selecting the corresponding stops along those routes means finding a bus stop (e.g. one that serves bus 44) in the list I described above, and I'm not sure how to do this.
The comment below helped me figure out selecting 1 route by its value would look like this:
select * from BUS_STOPS where ROUTES like '% 44,%';
...which returns records that contain 44 in one of the listed values.
So how would I replace the static value of 44 to be the value of the ROUTES field in the BUS_ROUTES table?
The BUS_ROUTES table looks like this:
...and the BUS_STOPS table looks like this:
I'm using PostgreSQL to query the GTFS data.
Convert the string to an array, then use an array comparison:
select *
from BUS_STOPS
where '44' = any(string_to_array(routes,','))
This can also be used as a join condition:
select *
from BUS_STOPS s
join bus_routes r on string_to_array(s.routes,',') @> string_to_array(r.routes,',')
The @> is the "contains" operator and tests whether the left array (bus_stops.routes) contains all elements of the right array (bus_routes.routes). Another option would be to use the overlaps operator &&. It's not clear to me what exactly you want.
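To make the difference between "contains" (@>) and "overlaps" (&&) concrete, here is a small sketch that mimics both with Python sets. The function names are hypothetical. Note that splitting the sample value "28, 19, 44, 10, 32" on a bare comma leaves a leading space on most entries, so the sketch trims each entry; in SQL the values would presumably need the same treatment (for example by splitting on ', ') before comparing.

```python
from typing import Set


def parse_routes(routes: str) -> Set[str]:
    """Split a comma-separated ROUTES value and trim whitespace around entries."""
    return {part.strip() for part in routes.split(",") if part.strip()}


def contains(left: Set[str], right: Set[str]) -> bool:
    """Like @>: every element of right is also in left."""
    return right <= left


def overlaps(left: Set[str], right: Set[str]) -> bool:
    """Like &&: left and right share at least one element."""
    return not left.isdisjoint(right)


stop_routes = parse_routes("28, 19, 44, 10, 32")

print(contains(stop_routes, parse_routes("44")))      # True
print(contains(stop_routes, parse_routes("44, 99")))  # False
print(overlaps(stop_routes, parse_routes("44, 99")))  # True
```

If a bus route row lists a single route number, both operators give the same answer; they only differ when the right-hand side has several values.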

Is it possible to concatenate one result set onto another in a single query?

I have a table of Verticals which have names, except one of them is called 'Other'. My task is to return a list of all Verticals, sorted in alpha order, except with 'Other' at the end. I have done it with two queries, like this:
String sqlMost = "SELECT * from core.verticals WHERE name != 'Other' order by name";
String sqlOther = "SELECT * from core.verticals WHERE name = 'Other'";
and then appended the second result in my code. Is there a way to do this in a single query, without modifying the table? I tried using UNION
(select * from core.verticals where name != 'Other' order by name)
UNION (select * from core.verticals where name = 'Other');
but the result was not ordered at all. I don't think the second query is going to hurt my execution time all that much, but I'm kind of curious if nothing else.
UNION ALL is the usual way to request a simple concatenation; without ALL an implicit DISTINCT is applied to the combined results, which often causes a sort. However, UNION ALL isn't required to preserve the order of the individual sub-results as a simple concatenation would; you'd need to ORDER the overall UNION ALL expression to lock down the order.
Another option would be to compute an integer order-override column like CASE WHEN name = 'Other' THEN 2 ELSE 1 END, and ORDER BY that column followed by name, avoiding the UNION entirely.
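The order-override idea can be tried out with a throwaway in-memory SQLite database. This is only a sketch: the table is simplified to a single name column, the sample names are made up, and SQLite stands in for whatever database actually holds core.verticals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE verticals (name TEXT)")
conn.executemany(
    "INSERT INTO verticals (name) VALUES (?)",
    [("Retail",), ("Other",), ("Automotive",), ("Travel",)],  # made-up sample rows
)

# Sort key 1 for ordinary rows and 2 for 'Other', then alphabetical within each group.
ORDERED_SQL = """
SELECT name
FROM verticals
ORDER BY CASE WHEN name = 'Other' THEN 2 ELSE 1 END, name
"""

names = [row[0] for row in conn.execute(ORDERED_SQL)]
print(names)  # ['Automotive', 'Retail', 'Travel', 'Other']
```

The CASE expression never appears in the result; it only acts as the leading sort key.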

Conditional OR in the SQL Server Join – Multi-Value Parameters

I have an SSRS report with 4 parameters, two of which are multi-value parameters (@material and @color, using the VARCHAR(MAX) datatype in SQL Server 2008 R2). I am using a split function to break each comma-separated value into rows:
SELECT *
FROM MyView
WHERE height > 200
AND width > 100
AND (
material IN (SELECT Item FROM [dbo].[MySplitFunction] (@material, ',')) OR
color IN (SELECT Item FROM [dbo].[MySplitFunction] (@color, ','))
)
(The code above would return 50 records)
The problem with this approach is that these two multi-value parameters hold around 1,500 different colors and materials, which degrades performance. Sometimes it takes more than 40 minutes to return the results (the view has around 600,000 rows).
I tried a different approach where I used a temp table and used it in the JOIN instead of the WHERE clause:
SELECT Item
INTO #TempTable
FROM [dbo].[MySplitFunction] (@material, ',')
SELECT *
FROM MyView
INNER JOIN #TempTable ON MyView.Item = #TempTable.Item
WHERE height > 200
AND width > 100
AND material IN (SELECT Item FROM [dbo].[MySplitFunction] (@material, ','))
(The code above would return 7 records only, but the performance is much better)
My question is: how can I return the same number of records (50 rows) using the second approach, by adding the other parameter, @color, and allowing the OR condition? In the SSRS report, the user can multi-select these two parameters, and the query should return rows where @material = values OR @color = values.
I am open to any other approach as long as it speeds up the query and allows the OR condition for the two multi-value parameters (@material, @color).
Thanks!
Something like the following might do the trick. I'm not sure I have the syntax precisely right, and it wants further testing and analysis that I can't do without the proper structures and data...
SELECT *
from MyView
where height > 200
and width > 100
and (exists (select Item
from dbo.MySplitFunction(@material, ',')
where Item = material)
or exists (select Item
from dbo.MySplitFunction(@color, ',')
where Item = color)
)
This performs two correlated subqueries against the results of the split function. EXISTS checks are generally faster than IN lookups in these situations. The bit of syntax that worries me is "and (exists": that parenthesis opens the OR clause, and combined with EXISTS it looks a bit wonky.
I think it should do what you want, but testing is definitely called for.
I mistrust that or clause. To get rid of it, try this and see what happens:
SELECT * -- Better with specific columns
from MyView
where height > 200
and width > 100
and exists (select Item
from dbo.MySplitFunction(@material, ',')
where Item = material)
UNION select *
from MyView
where height > 200
and width > 100
and exists (select Item
from dbo.MySplitFunction(@color, ',')
where Item = color)
This runs and combines two queries, removing all duplicates -- pretty much the same as the OR clause would.
The next thing to check would be table sizes and indexes. You're filtering results on (only!) the columns height, width, material, and color; if the table is huge, an appropriate index would help here.
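The claim that the UNION form returns the same rows as the OR form can be checked on toy data. The sketch below is built on assumptions: it uses an in-memory SQLite database instead of SQL Server, a plain table named my_view in place of MyView, and two small lookup tables (wanted_material, wanted_color) standing in for the output of MySplitFunction. All sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_view (id INTEGER PRIMARY KEY, height INT, width INT,
                      material TEXT, color TEXT);
INSERT INTO my_view (height, width, material, color) VALUES
  (250, 150, 'oak',  'red'),    -- matches material and color
  (250, 150, 'pine', 'blue'),   -- matches color only
  (250, 150, 'oak',  'green'),  -- matches material only
  (250, 150, 'pine', 'green'),  -- matches neither
  (100, 150, 'oak',  'red');    -- fails the height filter
-- Stand-ins for the rows the split function would return.
CREATE TABLE wanted_material (item TEXT PRIMARY KEY);
CREATE TABLE wanted_color (item TEXT PRIMARY KEY);
INSERT INTO wanted_material VALUES ('oak');
INSERT INTO wanted_color VALUES ('red'), ('blue');
""")

OR_SQL = """
SELECT id FROM my_view v
WHERE height > 200 AND width > 100
  AND (EXISTS (SELECT 1 FROM wanted_material m WHERE m.item = v.material)
    OR EXISTS (SELECT 1 FROM wanted_color c WHERE c.item = v.color))
"""

UNION_SQL = """
SELECT id FROM my_view v
WHERE height > 200 AND width > 100
  AND EXISTS (SELECT 1 FROM wanted_material m WHERE m.item = v.material)
UNION
SELECT id FROM my_view v
WHERE height > 200 AND width > 100
  AND EXISTS (SELECT 1 FROM wanted_color c WHERE c.item = v.color)
"""

or_ids = sorted(row[0] for row in conn.execute(OR_SQL))
union_ids = sorted(row[0] for row in conn.execute(UNION_SQL))
print(or_ids, union_ids)  # [1, 2, 3] [1, 2, 3]
```

One caveat: UNION also collapses rows that are identical in the source itself, so the two forms only agree exactly when the selected columns make every row distinct, as the id column does here.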

Select from any of multiple values from a Postgres field

I've got a table that resembles the following:
WORD WEIGHT WORDTYPE
a 0.3 common
the 0.3 common
gray 1.2 colors
steeple 2 object
I need to pull the weights for several different words out of the database at once. I could do:
SELECT * FROM word_weight WHERE WORD = 'a' OR WORD = 'steeple' OR WORD='the';
but it feels ugly and the code to generate the query is obnoxious. I'm hoping that there's a way I can do something like (pseudocode):
SELECT * FROM word_weight WHERE WORD = 'a','the';
You are describing the functionality of the in clause.
select * from word_weight where word in ('a', 'steeple', 'the');
If you want to pass the whole list in a single parameter, use array datatype:
SELECT *
FROM word_weight
WHERE word = ANY('{a,steeple,the}'); -- or ANY('{a,steeple,the}'::TEXT[]) to make explicit array conversion
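For comparison, here is a sketch of the IN-list variant with bound parameters, run against an in-memory SQLite database. SQLite has no array type, so this example swaps in a different technique from the ANY form above: it generates one ? placeholder per value, and only the placeholders (never the values) are formatted into the SQL text. With a PostgreSQL driver, the ANY form would instead take the whole list as a single parameter. The helper name fetch_weights is hypothetical.

```python
import sqlite3
from typing import List, Sequence, Tuple


def fetch_weights(conn: sqlite3.Connection, words: Sequence[str]) -> List[Tuple[str, float]]:
    """Return (word, weight) rows for the given words, using bound parameters."""
    words = list(words)
    if not words:
        return []  # "IN ()" is not valid SQL, so short-circuit on an empty list
    placeholders = ", ".join("?" for _ in words)
    sql = (
        "SELECT word, weight FROM word_weight "
        f"WHERE word IN ({placeholders}) ORDER BY word"
    )
    return conn.execute(sql, words).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE word_weight (word TEXT, weight REAL, wordtype TEXT)")
conn.executemany(
    "INSERT INTO word_weight VALUES (?, ?, ?)",
    [("a", 0.3, "common"), ("the", 0.3, "common"),
     ("gray", 1.2, "colors"), ("steeple", 2.0, "object")],
)

rows = fetch_weights(conn, ["a", "steeple", "the"])
print(rows)  # [('a', 0.3), ('steeple', 2.0), ('the', 0.3)]
```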
If you are not sure about the value, or whether the field might be an empty string or even NULL, then:
.where("column_1 ILIKE ANY(ARRAY['','%abc%','%xyz%']) OR column_1 IS NULL")
The above query covers all of those possibilities.

Complex SphinxQL Query

I'm trying to write a SphinxQL query that would replicate the following MySQL in a Sphinx RT index:
SELECT id FROM table WHERE colA LIKE 'valA' AND (colB = valB OR colC = valC OR ... colX = valX ... OR colY LIKE 'valY' .. OR colZ LIKE 'valZ')
As you can see, I'm trying to get all the rows where one string column matches a certain value AND at least one of a list of other conditions matches, with a mix of string and integer columns and values.
This is what I've gotten so far in SphinxQL:
SELECT id, (intColA = intValA OR intColB = intValB ...) as intCheck FROM rt_index WHERE MATCH('@requiredMatch = requiredValue');
The problem I'm running into is in matching all of the potential optional string values. The best possible query (if multiple MATCH statements were allowed and they were allowed as expressions) would be something like
SELECT id, (intColA = intValA OR MATCH('@checkColA valA|valB') OR ...) as optionalMatches FROM rt_index WHERE optionalMatches = 1 AND MATCH('@requireCol requiredVal')
I can see a potential way to do this with CRC32 string conversions and MVA attributes, but these aren't supported with RT indexes and I REALLY would prefer not to switch away from them.
One way would be to simply convert all your columns to normal fields. Then you can put all this logic inside the MATCH(..), i.e. without using attributes.
Yes, you can only have one MATCH per query.
Otherwise, yes, you could use the CRC trick to turn string attributes into integer ones, so they can be used for filtering.
Not sure why you would need MVA, but they are now supported in RT indexes in 2.0.2.