Not equal operator in pyspark giving wrong results - pyspark

I am looking for this particular record in an array and it find the rows as below:
xarr.filter(xarr["orderid"] == 27952740).count()
This gives 67,272 rows which is the right answer.
Then I do this. Assign all Non-zero values to another array:
xarr2 = xarr.filter(xarr["orderid"] != 0)
Now in the resulting array xarr2, I am trying to locate the record as follows:
xarr2.filter(xarr2["orderid"] == 27952740).count()
This one returns zero records. Why is this behavior? Any idea?

The data type of orderid is String. Changing != 0to != '0' gave the correct results.

Related

GEN_NO_GENERATABLE_NOTIF

I want to choose index from list, so the element[index] complies my condition.
MyList[index].num==0
I tried the code bellow:
gen DescIdx2Choose keeping {
it < MyList.size();
MyList[it].num==0;//I tried a few way of read_only
};
How can I do it without using all_indices?
Thanks
Since you generating DescIdx2Choose then MyList will be input to the problem.
Therefore,
If seeking for the first index (if exists) then using random generation isn't required. Use the procedural code "first_index" as user3467290 suggested which is much more efficient:
var fidx := MyList.first_index(.num == 0);
if ( fidx != UNDEF ) {
DescIdx2Choose = fidx;
} else {
// error handling
};
If there are multiple indices and it is required to choose a random one, the most efficient way would be using "all_indices" as Thorsten suggested:
gen DescIdx2Choose keeping {
it in read_only( MyList.all_indices(.num == 0) );
};
The reason is the random generator doesn't need to read all possible values of "MyList.num" only a shorter list of valid indices.
This should do it, but MyList must fulfill the condition otherwise you get a contradiction. Pseudo-method first_index() is not bi-directional which is what we need here.
gen DescIdx2Choose keeping {
MyList.first_index(.num == 0) == it;
};
Maybe i missed something in the question, but If you always want the index of the first element that its num == 0, then why use constraints? can assign DescIdx2Choose == MyList.first_index(.num == 0).
To ensure that there is at least one such element, can constrain MyList.has(.num == 0).
Do you have additional constraints on DescIdx2Choose ?

Calculate value depending on multiple conditions

I have Grp_ids feild which i need to assign "Pass" or "Fail" based on the results of measure_ids
For example : If one of the measure_ids in the group is failed, i want the whole group to be failed.
Thank you.
Define Grp_Result as MIN([Measure_Results] = "Pass")
That will be True if Measure_Results = "Pass" for every data row for your selected dimensions (Grp_Ids and Measure_is in your example). If you can have null values in your Measure_Results column and if you want to treat those nulls as a Fail, then would use instead MIN(IFNULL([Measure_Results] = "Pass"), False)
This works because Tableau treats True as greater than False. So MAX() holds if is satisfied for EVERY data row, and MAX() holds if is satisfied for ANY data row.
You can simplify a bit further by using boolean values in place of strings to begin with to represent whether a record passed or failed.
This should do the trick -
IIF({ FIXED [Grp_ids]: SUM(IIF([Measure_Results] = 'Pass',0,1)) }>0,'Fail','Pass')
Sum the failures by GRP_ID. If the sum is greater than 0 (GRP_ID contains a Fail) then Fail the entire GRP_ID.

Comparing a value from a variable to values from the given column to check if value exists or not in MongoDB

Suppose I have vehicle plate numbers in a given table "plates" with name of the column as "plate".
I'm taking a plate number as input and keeping it in a variable num1 and then check whether the plate number is present in the column or not. I tried using
db.plates.find( { plate: { $eq: num1 }
but it is not returning any output.
Can anyone suggest a way to achieve the desired result?

Strange object value instead of float by using mapReduce in mongodb with Doctrine

I use mongo query for calculating sum price for every item.
My query looks like so
$queryBuilder = new Query\Builder($this, $documentName);
$queryBuilder->field('created')->gte($startDate);
$queryBuilder->field('is_test_value')->notEqual(true);
..........
$queryBuilder->map('function() {emit(this.item, this.price)}');
$queryBuilder->reduce('function(item, valuesPrices) {
return {sum: Array.sum(valuesPrices)}
}');
And this works, no problem. But I found that in some cases (approximately 20 cases from 200 results) I have strange result in field sum - instead of sum value I see construction like
[objectObject]444444444444444
4 - is price for item.
I tried to replace reduce block to block like this:
var sum = 0;
for (var i = 0; i < valuesPrices.length; i++) {
sum += parseFloat(valuesPrices[i]);
}
return {sum: sum}
In that case I see NAN value.
I suspected that some data in field price was inserted incorrectly (not as float, but as string, object etc). I tried execute my query from mongo cli and I see that all price values are integer.
It's not "strange" at all. You "broke the rules" and now you are paying for it.
"MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key."
The primary rule of mapReduce (as cited ) is that you must return exactly the same structure from the "reducer" as you do from the "mapper". This is because the "reducer" can actually run several times for the same "key". This is how mapReduce processes large lists.
You fix this by just returning a singular value, just like you did in the emit:
return Array.sum(values);
And then there will not be a problem. Adding an object key to that makes the data inconsistent, and thus you get an error when the "reduced" result gets fed back into the "reducer" again.

Summing Only Visible Rows in SSRS

I'm trying to sum only the visible rows for a report and I know the format is:
=Sum( iif( <use the condition of the Visibility.Hidden expression>, 0, Fields!A.Value))
In my report, I've set row visbility to:
=IIF(CInt(Fields!EM_ET.Value)=1 Or CInt(Fields!EM_ET.Value)= 2,True,False)
Not exactly sure what I'm missing, but when I use this as an expression:
=Sum(IIF(CInt(Fields!EM_ET.Value)=1 Or CInt(Fields!EM_ET.Value)= 2,True,False),0,Fields!EM_ET.Value)
I get this following error: Value expression for textrun'FTD1.Paragraph[0].TextRuns[0]' has a scope argument that is not valid for aggregate function.
You are giving the True/False as output to the SUM() from your expression.You need to change your expression as,
=Sum(IIF(CInt(Fields!EM_ET.Value) = 1 Or CInt(Fields!EM_ET.Value)= 2,0,Fields!EM_ET.Value))
Thanks coder of code.
I am getting some error message if i am using '0', so i replaced with nothing it's working. I thought it would be helpful.
=SUM(IIF(ISNOTHING(Fields!value1.Value) OR ISNOTHING(Fields!value2.Value),NOTHING,Fields!value1.Value))
I was having similar issues. I know this question has a marked answer, however it isn't entirely correct.
The following contains the code from the "marked correct" answer:
=Sum(IIF(CInt(Fields!EM_ET.Value) = 1 Or CInt(Fields!EM_ET.Value) = 2,0,Fields!EM_ET.Value))
The following snipit of code contains the code from above, with a slight tweak:
=Sum(IIF(CInt(Fields!EM_ET.Value) = 1 Or CInt(Fields!EM_ET.Value) = 2,NOTHING,Fields!EM_ET.Value))
By changing the "0" to "NOTHING" this will resolve any lingering issues with your formula. The formula I was having issues with was returning "#Error" in the cell that was supposed to return the sum of the visible cells in the column within the row group.
My formula originally looked like this:
=IIF(Sum(Fields!VCBitType.Value) <> 0 OR (Parameters!Details.Value = "Detail"), Sum(IIF((Fields!VCBitType.Value = 0) and (Parameters!Details.Value = "Dangerous"),0,Fields!VC.Value)), "")
I altered my formula after seeing Hari's comment above:
=IIF(SUM(Fields!VCBitType.Value) <> 0 OR (Parameters!Details.Value = "Detail"), SUM(IIF((Fields!VCBitType.Value = 0) and (Parameters!Details.Value = "Dangerous"),NOTHING, Fields!VC.Value)), "")
I wanted to set the cells value to "" (empty) rather than hide the cell entirely or set that cell's value to "0" in my report (due to the fact that not displaying the cell entirely made the report look really funky). Because I did this, using a "0" in my formula conflicted with the value of the cell which was "" (empty), causing the formula to break. The only change I made was to alter "0" to "Nothing", this resolved the issue I was having with the cell's formula.