Unexpected error in reduce - pyspark

While finding the max value with reduce in pyspark, I am getting the unexpected result below.
agg.reduce(lambda a,b : a if a > b else b )
and my sample data is
(u'2013-10-17', 80325.0)
(u'2014-01-01', 68521.0)
(u'2013-11-10', 83691.0)
(u'2013-11-14', 149289.0)
(u'2013-11-18', 94756.0)
(u'2014-01-30', 126171.0)
and result is
(u'2014-07-24', 97088.0)
It should give a result greater than 94756.
Thanks
sPradeep

Tuples are compared element by element, so your lambda compares the date strings first. Compare the second value of each tuple instead, like this:
agg.reduce(lambda a,b : a if a[1] > b[1] else b )

Just use max with key:
rdd.max(key=lambda x: x[1])
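The difference can be reproduced without Spark. The sketch below runs the same lambdas through plain Python's functools.reduce and built-in max on the sample rows from the question. Using these as stand-ins for the RDD methods is an assumption made for illustration, and the variable names are hypothetical.

```python
from functools import reduce

# Sample rows from the question: (date string, amount).
rows = [
    ("2013-10-17", 80325.0),
    ("2014-01-01", 68521.0),
    ("2013-11-10", 83691.0),
    ("2013-11-14", 149289.0),
    ("2013-11-18", 94756.0),
    ("2014-01-30", 126171.0),
]

# Whole-tuple comparison looks at the date first, so the latest date wins.
by_tuple = reduce(lambda a, b: a if a > b else b, rows)

# Comparing only the second field picks the largest amount.
by_amount = reduce(lambda a, b: a if a[1] > b[1] else b, rows)

# max with a key function gives the same answer as the second reduce.
by_key = max(rows, key=lambda x: x[1])

print(by_tuple)
print(by_amount)
print(by_key)
```

On this sample the first reduce returns the row with the latest date, while the other two return the row with the largest amount.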

SSP Algorithm minimal subset of length k

Suppose S is a set of t elements modulo n. There are 2^t subsets in total, of all lengths. I am looking for a PARI/GP program that finds the smallest subset U (in terms of length) of distinct elements such that the sum of all elements in U is 0 modulo n. It is easy to write a program that searches by brute force, but brute force becomes infeasible as t and n get larger, so I would appreciate help writing a program that doesn't use brute force to solve this instance of the subset sum problem.
Dynamic approach:
def isSubsetSum(st, n, sm):
    # subset[i][j] is True if some subset of
    # st[0..i-1] has sum equal to j.
    # Build each row separately: [[...]] * (n+1) would alias one row.
    subset = [[False] * (sm + 1) for _ in range(n + 1)]
    # If sum is 0, then answer is true
    for i in range(0, n + 1):
        subset[i][0] = True
    # If sum is not 0 and set is empty,
    # then answer is false
    for i in range(1, sm + 1):
        subset[0][i] = False
    # Fill the subset table in bottom-up manner
    for i in range(1, n + 1):
        for j in range(1, sm + 1):
            if j < st[i-1]:
                subset[i][j] = subset[i-1][j]
            else:
                subset[i][j] = subset[i-1][j] or subset[i-1][j-st[i-1]]
    """uncomment this code to print table
    for i in range(0, n+1):
        for j in range(0, sm+1):
            print(subset[i][j], end="")
        print(" ")"""
    return subset[n][sm]
I got this code from here; I don't know whether it works.
function getSummingItems(a,t){
return a.reduce((h,n) => Object.keys(h)
.reduceRight((m,k) => +k+n <= t ? (m[+k+n] = m[+k+n] ? m[+k+n].concat(m[k].map(sa => sa.concat(n)))
: m[k].map(sa => sa.concat(n)),m)
: m, h), {0:[[]]})[t];
}
var arr = Array(20).fill().map((_,i) => i+1), // [1,2,..,20]
tgt = 42,
res = [];
console.time("test");
res = getSummingItems(arr,tgt);
console.timeEnd("test");
console.log("found",res.length,"subsequences summing to",tgt);
console.log(JSON.stringify(res));
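Neither snippet above returns a smallest subset whose sum is 0 modulo n: the first only reports whether an exact sum is reachable, and the second lists every subsequence that hits a target. One possible non-brute-force approach is a dynamic program over residues, sketched here in Python rather than PARI/GP. The function name smallest_zero_subset and its arguments are hypothetical, and the sketch assumes a non-empty subset is wanted and that n is small enough to keep one entry per residue.

```python
def smallest_zero_subset(values, n):
    """Return a shortest non-empty subset of values with sum 0 mod n, or None."""
    # best[r] holds a shortest subset found so far whose sum is r modulo n.
    best = {}
    for v in values:
        new_best = dict(best)
        # Extend the empty subset and every recorded subset by v.
        # Reading from the old table means v is used at most once per subset.
        for r, subset in [(0, [])] + list(best.items()):
            key = (r + v) % n
            candidate = subset + [v]
            if key not in new_best or len(candidate) < len(new_best[key]):
                new_best[key] = candidate
        best = new_best
    return best.get(0)


print(smallest_zero_subset([3, 5, 7, 11], 10))
print(smallest_zero_subset([1, 2], 5))
```

The table never has more than n entries, so the loop does on the order of t*n updates instead of visiting all 2^t subsets. Copying the stored lists adds some overhead that a production version might avoid by storing back-pointers instead.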

q/KDB - nprev function to get all the previous n elements

I am struggling to write an nprev function in KDB; the xprev function returns the nth previous element, but I need all of the previous n elements relative to the current element.
q)t:([] i:1+til 26; s:.Q.a)
q)update xp:xprev[3;]s,p:prev s from t
Any help is greatly appreciated.
You can achieve the desired result by applying prev repeatedly and flipping the result:
q)n:3
q)select flip 1_prev\[n;s] from t
s
-----
"   "
"a  "
"ba "
"cba"
"dcb"
"edc"
..
If n is much smaller than the rows count, this will be faster than some of the more straightforward solutions.
The xprev function basically looks like this:
xprev1:{y til[count y]-x} //readable xprev
We can tweak it to get all n elements:
nprev:{y til[count y]-\:1+til x}
Using nprev in the query:
q)update np: nprev[3;s] , xp1:xprev1[3;s] , xp: xprev[3;s], p:prev[s] from t
i s np    xp1 xp p
------------------
1 a "   "
2 b "a  "        a
3 c "ba "        b
4 d "cba" a   a  c
5 e "dcb" b   b  d
6 f "edc" c   c  e
The k equivalent of nprev (@y is the type of y, so atoms signal 'rank):
k)nprev:{$[0h>@y;'`rank;y(!#y)-\:1+!x]}
and similarly nnext would look like:
k)nnext:{$[0h>@y;'`rank;y(!#y)+\:1+!x]}
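For readers who do not know q, the index arithmetic behind nprev can be sketched in Python. This is only an illustration of the idea (for each position i, look up positions i-1, i-2, ..., i-n and pad anything before the start); the function name and the fill parameter are hypothetical and this is not how q evaluates the expression.

```python
def nprev(n, seq, fill=" "):
    """For each position in seq, join the n previous items, nearest first."""
    out = []
    for i in range(len(seq)):
        # Positions i-1, i-2, ..., i-n; negative ones fall before the start.
        idxs = [i - k for k in range(1, n + 1)]
        out.append("".join(seq[j] if j >= 0 else fill for j in idxs))
    return out


print(nprev(3, "abcdef"))
```

With n set to 3 and the first six letters of the alphabet, the result matches the first six rows of the np column shown above.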

Input argument "b" is undefined

I am new to MATLAB and have searched everywhere. I am writing a function, and I cannot understand why this error is occurring: "Input argument "b" is undefined." Should I initialise b = 0, even though it is a parameter that comes from the console input? My code:
function f = evenorodd( b )
%UNTITLED2 Summary of this function goes here
%zohaib
% Detailed explanation goes here
%f = b;%2;
f = [0 0];
f = rem(b,2);
if f == 0
disp(b+ 'is even')
else
disp(b+ 'is odd')
end
console:
??? Input argument "b" is undefined.
Error in ==> evenorodd at 6
f = rem(b,2);
From what I see, this is what you are trying to do:
function f = evenorodd( b )
f = rem(b,2);
if f == 0
fprintf('%i is even\n', b)
else
fprintf('%i is odd\n', b)
end
=======================
>> evenorodd(2);
2 is even
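There is no need to initialize f as [0 0].
In MATLAB, you can't concatenate a number and a string with the + operator; use fprintf instead.
The function evenorodd above takes one argument (an integer) and returns 0 or 1. Call it with that argument, as in evenorodd(2); running it without one leaves b undefined and produces the error you saw.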

Difference in function behavior when called standalone or inside a query in q

I have found a strange issue in q, a possible bug I suppose.
I have defined a simple function that returns a float, given a date as input:
give_dummy:{[the_date]
/// give_dummy[2013.05.10] // <- if u wanna test
:$[ the_date > 2013.01.01 ; 0.001 ; 0.002] ;
}
It works without problems if called stand-alone:
q)give_dummy[2013.05.10]
0.001
Nevertheless, if I try to call it in a query I get an error:
q)select give_dummy[date] from tab where sym = sec, i >= first_i , i < 4000
'type
If I simplify the function to just return the input date (identity function), it works in the query.
If I simplify the function to just return a float, without comparing the dates, it works in the query.
The problem arises when I USE the input date to compare it in the if-statement:
$[ the_date > 2013.01.01 ; 0.001 ; 0.002]
The same happens if I redefine the function to take a float as input instead of a date, and then pass the price as input in the query:
give_dummy:{[the_price]
/// give_dummy[12] // <- if u wanna test
:$[ the_price > 20 ; 0.001 ; 0.002] ;
}
q) give_dummy[12]
0.002
q)select give_dummy[price] from tab where sym = sec, i >= first_i , i < 4000
'type
Do you have any idea of why this happens?
I tried everything.
Thanks
Marco
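$ is the scalar conditional, so its test must be a single boolean. Inside a query, date is a whole column (a vector), so the comparison yields a boolean vector and $ signals 'type. You need to either apply the function to each element:
select give_dummy each date from tab where ...
Or use the vector conditional ? inside the function:
give_dummy:{[the_date] :?[ the_date > 2013.01.01 ; 0.001 ; 0.002]; }
select give_dummy[date] from tab where ...
See here for more details: http://code.kx.com/q4m3/10_Execution_Control/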
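The scalar versus vector distinction can be illustrated with a rough analogy in plain Python. This is not q, and the names below are hypothetical: a function built around a scalar test fails when handed a whole column, while mapping it over the items (like each) or choosing element by element (like the vector conditional) works.

```python
def give_dummy(price):
    # Scalar test: only meaningful for a single value.
    return 0.001 if price > 20 else 0.002


def give_dummy_vec(prices):
    # Element-wise choice, playing the role of a vector conditional.
    return [0.001 if p > 20 else 0.002 for p in prices]


prices = [12, 25, 30]  # hypothetical price column

try:
    give_dummy(prices)  # comparing a list with an int is not defined
    scalar_on_column = "ok"
except TypeError:
    scalar_on_column = "type error"

each_result = [give_dummy(p) for p in prices]  # apply to each item
vec_result = give_dummy_vec(prices)

print(scalar_on_column, each_result, vec_result)
```

Both working variants give the same values; in q the vector form is usually preferred on large columns because it avoids calling the function once per row.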

Matlab function calling basic

I'm new to Matlab and am learning the basic syntax.
I've written the file GetBin.m:
function res = GetBin(num_bin, bin, val)
if val >= bin(num_bin - 1)
res = num_bin;
else
for i = (num_bin - 1) : 1
if val < bin(i)
res = i;
end
end
end
and I call it with:
num_bin = 5;
bin = [48.4,96.8,145.2,193.6]; % bin stands for the intermediate borders, so there are 5 bins
fea_val = GetBin(num_bin,bin,fea(1,1)) % fea is a pre-defined 280x4096 matrix
It returns error:
Error in GetBin (line 2)
if val >= bin(num_bin - 1)
Output argument "res" (and maybe others) not assigned during call to
"/Users/mac/Documents/MATLAB/GetBin.m>GetBin".
Could anybody tell me what's wrong here? Thanks.
You need to ensure that every possible path through your code assigns a value to res.
In your case that does not happen, because you have a loop:
for i = (num_bin - 1) : 1
...
end
Since num_bin - 1 is greater than 1 and the default step is +1, that loop never iterates (so it never assigns a value to res). You need to explicitly specify that it's a decrementing loop:
for i = (num_bin - 1) : -1 : 1
...
end
For more info, see the documentation on the colon operator.
for i = (num_bin - 1) : -1 : 1
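The same pitfall exists in other languages with start/stop ranges. As a sketch only, here is a Python translation of GetBin (the name get_bin and the 0-based index shifts are assumptions of the translation, not part of the MATLAB code). A range from a larger start to a smaller stop with the default step is empty, so the step has to be given explicitly.

```python
def get_bin(num_bin, borders, val):
    """Return the 1-based bin number of val, given the inner bin borders."""
    # borders is 0-based here, so MATLAB's bin(k) becomes borders[k - 1].
    if val >= borders[num_bin - 2]:
        return num_bin
    res = None
    # Count down from num_bin - 1 to 1; the step of -1 must be explicit.
    for i in range(num_bin - 1, 0, -1):
        if val < borders[i - 1]:
            res = i
    return res


borders = [48.4, 96.8, 145.2, 193.6]
print(list(range(4, 1)))         # default step +1: nothing to iterate over
print(list(range(4, 0, -1)))     # explicit step -1
print(get_bin(5, borders, 100.0))
```

Initialising res before the loop also guarantees that every path returns something, which is the first point made above.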