Invalid length parameter passed to the LEFT or SUBSTRING function UNION ALL - tsql

I have two queries that use the SUBSTRING function within a CASE statement like so:
CASE
WHEN Answer.ChoiceTitle = 'Neither Likely or Unlikely'
THEN 'Neither Likely nor Unlikely'
WHEN Answer.ChoiceTitle LIKE '[1-5]%'
THEN SUBSTRING(Answer.ChoiceTitle, 3, LEN(Answer.ChoiceTitle) - 2)
ELSE Answer.ChoiceTitle
END AS Recommendation
Both queries run perfectly fine when run separately but when I try to combine both result sets with a UNION ALL I get the error message:
Invalid length parameter passed to the LEFT or SUBSTRING function
Whilst trying to figure out why this error is occurring I added the below to each statement and the UNION ALL now works perfectly fine.
MIN(LEN(Answer.ChoiceTitle)) OVER() AS MinLength
Why would I be getting this error?
Execution Plans
Planned Execution Plan with UNION ALL - https://www.brentozar.com/pastetheplan/?id=rksFnuLS-
Actual Execution Plan of first statement - https://www.brentozar.com/pastetheplan/?id=r1Z-pO8HW
Actual Execution Plan of second statement - https://www.brentozar.com/pastetheplan/?id=rkCTh_IBb

This is most likely causing your error: LEN(Answer.ChoiceTitle) - 2
When that evaluates to less than 0, it will throw an error.
Try this instead:
CASE
WHEN Answer.ChoiceTitle = 'Neither Likely or Unlikely'
THEN 'Neither Likely nor Unlikely'
WHEN Answer.ChoiceTitle LIKE '[1-5]%' and LEN(Answer.ChoiceTitle) > 2
THEN SUBSTRING(Answer.ChoiceTitle, 3, LEN(Answer.ChoiceTitle) - 2)
ELSE Answer.ChoiceTitle
END AS Recommendation
Since you are just getting rid of the first two characters, you could use stuff() instead like so:
CASE
WHEN Answer.ChoiceTitle = 'Neither Likely or Unlikely'
THEN 'Neither Likely nor Unlikely'
WHEN Answer.ChoiceTitle LIKE '[1-5]%'
THEN stuff(Answer.ChoiceTitle,1,2,'')
ELSE Answer.ChoiceTitle
END AS Recommendation
This will give you an empty string if the length is less than 3, otherwise it will remove the first two characters of Answer.ChoiceTitle.
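To make the difference between the two expressions concrete, here is a minimal Python model of the relevant T-SQL behaviour. The helper names tsql_substring and tsql_stuff are hypothetical, and the model only covers the cases discussed here (1-based positions, SUBSTRING raising on a negative length, STUFF returning NULL when the start position is past the end of the string). Treat it as a sketch, not a full emulation.

```python
def tsql_substring(s, start, length):
    """Rough model of T-SQL SUBSTRING(s, start, length) for start >= 1."""
    if length < 0:
        raise ValueError(
            "Invalid length parameter passed to the LEFT or SUBSTRING function"
        )
    return s[start - 1:start - 1 + length]


def tsql_stuff(s, start, length, replacement):
    """Rough model of T-SQL STUFF(s, start, length, replacement)."""
    # STUFF yields NULL (None here) for a non-positive start, a negative
    # length, or a start position beyond the end of the string.
    if start <= 0 or length < 0 or start > len(s):
        return None
    return s[:start - 1] + replacement + s[start - 1 + length:]


# Hypothetical sample titles: a normal one and a one-character one.
for title in ["1 Very Likely", "5"]:
    print(repr(title), "-> stuff:", repr(tsql_stuff(title, 1, 2, "")))
    try:
        print("   substring:", repr(tsql_substring(title, 3, len(title) - 2)))
    except ValueError as exc:
        print("   substring error:", exc)
```

For the one-character title, LEN - 2 is -1, so the SUBSTRING model raises while the STUFF model returns an empty string.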
As for why the combined query with UNION ALL throws an error when the individual queries do not:
I see this difference in the execution plans:
Hash Match > (Question & Survey nested loop) & (Compute Scalar > Answer) {Bottom right of execution plan without error}
vs
Hash Match > (Bitmap > Parallelism > Question) & (Compute Scalar > Answer) {Bottom right of execution plan with error}
The nested loop version may be filtering out the rows that cause the error before the hash match, thus avoiding the error.
Using option (maxdop 1) to prevent parallelism also avoids the error on the query that currently throws it (confirmed).
This comes down to when the scalar expression is evaluated for the rows in the Answer table: before or after the rows you don't want are filtered out.
The estimated cost of the UNION ALL version is higher and exceeds the cost threshold for parallelism, so that plan goes parallel. Run alone, each query costs less and does not go parallel (or at least not in the same way), which is why you don't see the error there.
So basically the parallel plan is running the substring() sooner than your other plans, before the rows that throw errors are filtered out.
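The effect of evaluation order can be illustrated with a small Python sketch. The data, the wanted_questions set, and the function names are all hypothetical; the only point is that the same expression and the same filter either succeed or fail depending on which runs first.

```python
def strip_prefix(title):
    """Stand-in for SUBSTRING(title, 3, LEN(title) - 2); fails on short titles."""
    length = len(title) - 2
    if length < 0:
        raise ValueError("Invalid length parameter")
    return title[2:2 + length]


# Hypothetical (question_id, choice_title) rows; question 2 has a short title.
answers = [(1, "1 Very Likely"), (2, "5"), (3, "3 Neutral")]
# Hypothetical set of question ids that survive the join.
wanted_questions = {1, 3}


def filter_then_compute(rows):
    # Filter first: the short title never reaches the expression.
    return [strip_prefix(t) for q, t in rows if q in wanted_questions]


def compute_then_filter(rows):
    # Expression evaluated for every row before the filter is applied.
    computed = [(q, strip_prefix(t)) for q, t in rows]
    return [t for q, t in computed if q in wanted_questions]


print(filter_then_compute(answers))
try:
    compute_then_filter(answers)
except ValueError as exc:
    print("compute-then-filter failed:", exc)
```

Both functions would return the same rows if the expression could not fail, which is why the optimizer is free to choose either order.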

Powershell Index math

I have a couple scenarios where a series of arbitrary string values have a specific priority sequence. For example:
forgo > log > assert
exception > failure > error > warning > alert
And I need to evaluate an arbitrary number of scenarios that each resolve to one of those values, while maintaining a running status. Using the simpler example: if every evaluation is assert, the final running status is assert. If a single evaluation is log and the rest are assert, the final running status is log. And a single forgo means the final status is forgo, no matter what the mix of individual results was. I want to pass in a human-readable running status and individual status, do the math to determine the new running index, then return the human-readable status.
So, I have this, and it works.
$statusIndex = @('assert', 'log', 'forgo')
$runningStatus = 'forgo'
$individualStatus = 'log'
$runningStatusIndex = $statusIndex.indexof($runningStatus)
$individualStatusIndex = $statusIndex.indexof($individualStatus)
if ($individualStatusIndex -gt $runningStatusIndex) {
$runningStatusIndex = $individualStatusIndex
}
$runningStatus = $statusIndex[$runningStatusIndex]
But this feels like something that happens often enough that there may be a more "native" way to do it: some built-in PowerShell functionality that handles the same thing more elegantly and in less code.
Is my intuition correct, and is there a native way? Or perhaps a more elegant approach than what I have here?
Put your terms in an array, in precedence order:
$statusIndex = @('forgo', 'log', 'assert')
Aggregate all your status values and remove any duplicates:
$statusValues = 'assert', 'assert', 'log', 'assert'
$statusValueSet = $statusValues |Sort -Unique
Now use the .Where() extension method to select only the first matching term from the precedence list:
$overallStatus = $statusIndex.Where({$_ -in $statusValueSet}, 'First')
Value of $overallStatus should now be 'log' as expected.
3-4 lines of code instead of 9-10 :)
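For comparison, the same "first term in precedence order that actually occurred" idea can be sketched in Python. The function name overall_status is hypothetical and this is only an illustration of the technique, not a PowerShell replacement.

```python
def overall_status(precedence, values):
    """Return the first entry of `precedence` that occurs in `values`, or None."""
    seen = set(values)  # de-duplicate, like Sort -Unique
    return next((status for status in precedence if status in seen), None)


precedence = ["forgo", "log", "assert"]
print(overall_status(precedence, ["assert", "assert", "log", "assert"]))  # log
```

Because the precedence list is scanned from highest to lowest priority, no index arithmetic is needed.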

Gremlin: Capturing both (or all) the elements in a union() with as() - to then select() them later

I can't work out how to capture/select the elements of a union in Gremlin. In this example, I'm trying to use as() to capture 'a' or 'b' and be able to tell (easily) which of the two it was that hit in select() step.
Attempt 1:
g.V().has('property', 'value').union(
out().has('propertyA', 'valueA').as_('a'),
out().has('propertyB', 'valueB').as_('b')
).select('a','b')
This gives no results, because we're trying to select both values, and they never both capture at the same time in the same result.
Attempt 2:
g.V().has('property', 'value').union(
out().has('propertyA', 'valueA'),
out().has('propertyB', 'valueB')
).as_('a_or_b').select('a_or_b')
This solves the no results problem, but doesn't let me work out which element of the union - was it 'a' or was it 'b'? - captured (without some post-processing).
Ideally, I want a result like {a: [v100], b: []} - if 'a' captured.
Note: this is a toy example. In the end it will need generalising and the 'a' and 'b' union elements might be arbitrarily complex.
Do you have to use union() for some reason? It seems like you just need to project() your results:
g.V().has('property','value').
filter(out().or(has('propertyA','valueA'),has('propertyB','valueB'))).
project('a','b').
by(out().has('propertyA','valueA').fold()).
by(out().has('propertyB','valueB').fold())
That should give you the result you desired, but I'm not sure if I have the full context of what you're doing.
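To show the shape of the result without a graph server, here is a plain-Python model of what the project()/by(...fold()) traversal computes. The vertex ids, the vertices and out_edges dictionaries, and the out_having helper are all hypothetical; this does not use any Gremlin API.

```python
# Hypothetical in-memory graph: vertex id -> properties, plus outgoing edges.
vertices = {
    "v1": {"property": "value"},
    "v100": {"propertyA": "valueA"},
    "v200": {"propertyB": "valueB"},
    "v2": {"property": "value"},
    "v300": {"other": "x"},
}
out_edges = {"v1": ["v100"], "v2": ["v300"]}


def out_having(vertex, key, value):
    """Model of out().has(key, value).fold(): list of matching neighbours."""
    return [n for n in out_edges.get(vertex, []) if vertices[n].get(key) == value]


results = []
for v, props in vertices.items():
    if props.get("property") != "value":
        continue
    row = {
        "a": out_having(v, "propertyA", "valueA"),
        "b": out_having(v, "propertyB", "valueB"),
    }
    # Model of the filter(...) step: keep only start vertices with a hit.
    if row["a"] or row["b"]:
        results.append(row)

print(results)  # [{'a': ['v100'], 'b': []}]
```

Each key always appears in the row, with an empty list when that branch did not match, which is what makes it easy to tell which branch hit.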

How to insert similar value into multiple locations of a psycopg2 query statement using dict? [duplicate]

I have a Python script that runs a pgSQL file through SQLAlchemy's connection.execute function. Here's the block of code in Python:
results = pg_conn.execute(sql_cmd, beg_date = datetime.date(2015,4,1), end_date = datetime.date(2015,4,30))
And here's one of the areas where the variable gets inputted in my SQL:
WHERE
( dv.date >= %(beg_date)s AND
dv.date <= %(end_date)s)
When I run this, I get a cryptic Python error:
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) argument formats can't be mixed
…followed by a huge dump of the offending SQL query. I've run this exact code with the same variable convention before. Why isn't it working this time?
I encountered a similar issue as Nikhil. I have a query with LIKE clauses which worked until I modified it to include a bind variable, at which point I received the following error:
DatabaseError: Execution failed on sql '...': argument formats can't be mixed
The solution is not to give up on the LIKE clause. It would be pretty crazy if psycopg2 simply didn't permit LIKE clauses. Rather, we can escape the literal % with %%. For example, the following query:
SELECT *
FROM people
WHERE start_date > %(beg_date)s
AND name LIKE 'John%';
would need to be modified to:
SELECT *
FROM people
WHERE start_date > %(beg_date)s
AND name LIKE 'John%%';
More details in the psycopg2 docs: http://initd.org/psycopg/docs/usage.html#passing-parameters-to-sql-queries
As it turned out, I had used a SQL LIKE operator in the new SQL query, and the % wildcard was interfering with Python's parameter formatting. For instance:
dv.device LIKE 'iPhone%' or
dv.device LIKE '%Phone'
Another answer offered a way to un-escape and re-escape, which I felt would add unnecessary complexity to otherwise simple code. Instead, I used pgSQL's ability to handle regex to modify the SQL query itself. This changed the above portion of the query to:
dv.device ~ E'iPhone.*' or
dv.device ~ E'.*Phone$'
So for others: you may need to change your LIKE operators to regex '~' to get it to work. Just remember that it'll be WAY slower for large queries. (More info here.)
For me, it turned out I had a % in a SQL comment:
/* Any future change in the testing size will not require
a change here... even if we do a 100% test
*/
This works fine:
/* Any future change in the testing size will not require
a change here... even if we do a 100pct test
*/

Using a Position Function in a Case Function - PostgreSQL

I'm new to SQL and Postgres, so hopefully this isn't too hard to figure out for all of you.
I'm trying to use a Position function within a CASE statement, but I keep getting the error
"ERROR: Syntax error at or near ""Project"". LINE 2: CASE WHEN position('(' IN "Project") >0 THEN".
I've used this position function before and it worked fine, so I'm confused what the problem is here. I've also tried the table name, such as "xyztable.Project" and "Project" - both without quotation marks.
Here is the entire statement:
SELECT "Project",
CASE WHEN postion('(' IN "Project") >0 THEN
substring("Project",position('(' IN "Project")+1,position(')' IN "Project")-2)
CASE WHEN postion('('IN "Project") IS NULL THEN
"Project"
END
FROM "2015Budget";
As I haven't gotten past the second line of this statement, if anyone sees anything else that would prevent this statement from running correctly, please feel free to point it out.
New Statement:
SELECT "Project",
CASE
WHEN position('(' IN "Project") >0 THEN
substring("Project",position('(' IN "Project")+1,position(')' IN "Project")-2)
WHEN position('('IN "Project") IS NULL THEN
"Project"
END
FROM "2015Budget";
Thank you for your help!!
The error is due to a simple typo - postion instead of position.
You would generally get a much more comprehensible error message in situations like this (e.g. "function postion(text,text) does not exist"). However, the use of function-specific keywords as argument separators (as mandated by the SQL standard) makes this case much more difficult for the parser to cope with.
After fixing this, you'll run into another error. Note that the general form of a multi-branch CASE expression is:
CASE
WHEN <condition1> THEN <value1>
WHEN <condition2> THEN <value2>
...
END
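As a rough illustration of the multi-branch form, here is one possible reading of the query's intent in Python: return the text between the parentheses if there is any, otherwise the project name unchanged. The function names and the fallback branch are assumptions, not part of the original query. Note that SQL position() returns 0 rather than NULL when the substring is absent, and that the third argument of substring() is a length, not an end position.

```python
def position(needle, haystack):
    """Model of SQL position(needle IN haystack): 1-based index, 0 if absent."""
    return haystack.find(needle) + 1


def project_label(project):
    """Model of: CASE WHEN <has parentheses> THEN <inner text> ELSE project END."""
    start = position("(", project)
    end = position(")", project)
    if start > 0 and end > start:
        # Equivalent to substring(project, start + 1, end - start - 1).
        return project[start:end - 1]
    return project  # assumed ELSE branch


print(project_label("Road (North)"))  # North
print(project_label("Bridge"))        # Bridge
```

Without an ELSE (or a matching WHEN), a CASE expression yields NULL for rows that match no branch.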

Erlang mnesia equivalent of "select * from Tb"

I'm a total Erlang noob and I just want to see what's in a particular table I have. I want to just "select *" from a particular table to start with. The examples I'm seeing, such as those in the official documentation, all have column restrictions, which I don't really want. I don't really know how to form the MatchHead or Guard to match anything (aka "*").
A very simple primer on how to just get everything out of a table would be much appreciated!
For example, you can use qlc:
F = fun() ->
Q = qlc:q([R || R <- mnesia:table(foo)]),
qlc:e(Q)
end,
mnesia:transaction(F).
The simplest way to do it is probably mnesia:dirty_match_object:
mnesia:dirty_match_object(foo, #foo{_ = '_'}).
That is, match everything in the table foo that is a foo record, regardless of the values of the fields (every field is '_', i.e. wildcard). Note that since it uses record construction syntax, it will only work in a module where you have included the record definition, or in the shell after evaluating rr(my_module) to make the record definition available.
(I expected mnesia:dirty_match_object(foo, '_') to work, but that fails with a bad_type error.)
To do it with select, call it like this:
mnesia:dirty_select(foo, [{'_', [], ['$_']}]).
Here, MatchHead is '_', i.e. match anything. The guards are [], an empty list, i.e. no extra limitations. The result spec is ['$_'], i.e. return the entire record. For more information about match specs, see the match specifications chapter of the ERTS user guide.
If an expression is too deep and gets printed with ... in the shell, you can ask the shell to print the entire thing by evaluating rp(EXPRESSION). EXPRESSION can either be the function call once again, or v(-1) for the value returned by the previous expression, or v(42) for the value returned by the expression preceded by the shell prompt 42>.