PySpark and Spark not working in Apache Hue - pyspark

I want to know the cause.
OS: Ubuntu 20.04
Hue version: 4.10.0
Livy version: 0.8.0
Spark version: 3.3.0
Hadoop version: 3.3.4
Hive version: 3.1.3
After installing Livy to use PySpark, I verified with curl that PySpark works.
In Apache Hue -> PySpark, print(1+1) also runs fine.
However, the code below, and all other PySpark commands, do not work:
import random

NUM_SAMPLES = 100000

def sample(p):
    x, y = random.random(), random.random()
    return 1 if x*x + y*y < 1 else 0

count = sc.parallelize(range(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)
print(count)
When the above sample program is executed, the following error message is displayed.
[07/Oct/2022 15:24:36 +0900] decorators ERROR Error running fetch_result_data
Traceback (most recent call last):
  File "/home/hue/hue-4.10.0/desktop/libs/notebook/src/notebook/decorators.py", line 119, in wrapper
    return f(*args, **kwargs)
  File "/home/hue/hue-4.10.0/desktop/libs/notebook/src/notebook/api.py", line 329, in fetch_result_data
    response = _fetch_result_data(request, notebook, snippet, operation_id, rows=rows, start_over=start_over)
  File "/home/hue/hue-4.10.0/desktop/libs/notebook/src/notebook/api.py", line 339, in _fetch_result_data
    'result': get_api(request, snippet).fetch_result(notebook, snippet, rows, start_over)
  File "/home/hue/hue-4.10.0/desktop/libs/notebook/src/notebook/connectors/spark_shell.py", line 235, in fetch_result
    raise QueryError(msg)
notebook.connectors.base.QueryError: Traceback (most recent call last):
  File "/tmp/3309927620969108702", line 223, in execute
    code = compile(mod, '<stdin>', 'exec')
TypeError: required field "type_ignores" missing from Module
And when I run curl as below, I get the same error:
curl localhost:8998/sessions/11/statements -X POST -H 'Content-Type: application/json' -d '{"code":"import random\n\nNUM_SAMPLES = 100000\n\ndef sample(p):\n x, y = random.random(), random.random()\n return 1 if x*x + y*y < 1 else 0\n\ncount = sc.parallelize(range(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)\nprint(count)"}'
livy@bigdata:~$ curl localhost:8998/sessions/11/statements/1
{"id":1,"code":"import random\n\nNUM_SAMPLES = 100000\n\ndef sample(p):\n x, y = random.random(), random.random()\n return 1 if x*x + y*y < 1 else 0\n\ncount = sc.parallelize(range(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)\nprint(count)","state":"available","output":{"status":"error","execution_count":1,"ename":"TypeError","evalue":"required field \"type_ignores\" missing from Module","traceback":["Traceback (most recent call last):\n"," File \"/tmp/7630604275391098405\", line 223, in execute\n code = compile(mod, '<stdin>', 'exec')\n","TypeError: required field \"type_ignores\" missing from Module\n"]},"progress":1.0,"started":1665132825039,"completed":1665132825041}

Related

TypeError: __init__() got an unexpected keyword argument 'n_features_to_select' - feature selection using forward selection

I am trying to do feature selection using the forward selection method.
I tried previously answered questions but didn't find a proper solution. My code is as follows:
def forward_selection_rf(data, target, number_of_features=14):
    # adapt the number of features to select: if the requested number
    # is greater than the features available, go for 75% of the
    # features instead
    if number_of_features > len(data.columns):
        print("SFS: Wanted " + str(number_of_features) + " from " + str(len(data.columns)) + " features. Sanitizing to 75%")
        number_of_features = 0.75
    # Sequential Forward Selection (sfs)
    sfs1 = sfs(RandomForestClassifier(
                   n_estimators=70,
                   criterion='gini',
                   max_depth=15,
                   min_samples_split=2,
                   min_samples_leaf=1,
                   min_weight_fraction_leaf=0.0,
                   max_features='auto',
                   max_leaf_nodes=None,
                   min_impurity_decrease=0.0,
                   min_impurity_split=None,
                   bootstrap=True,
                   oob_score=False,
                   n_jobs=-1,
                   random_state=0,
                   verbose=0,
                   warm_start=False,
                   class_weight='balanced'
               ),
               n_features_to_select=14,
               direction='forward',
               scoring='roc_auc',
               cv=5,
               n_jobs=3)
    sfs1.fit(data, target)
    return sfs1
The interpreter gives a runtime error as follows:
forward_selection_rf(X, y, number_of_features=14)
Traceback (most recent call last):
  File "C:\Users\drash\AppData\Local\Temp\ipykernel_37980\1091017691.py", line 1, in <module>
    forward_selection_rf(X, y, number_of_features=14)
  File "C:\Users\drash\OneDrive\Desktop\Howto Health\untitled3.py", line 102, in forward_selection_rf
TypeError: __init__() got an unexpected keyword argument 'n_features_to_select'
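One likely cause, offered as an assumption since the thread never shows what sfs is imported as: n_features_to_select and direction are constructor arguments of scikit-learn's SequentialFeatureSelector, which only exists from scikit-learn 0.24 onward, while mlxtend's SequentialFeatureSelector takes k_features and forward=True instead. Either an older scikit-learn or the mlxtend class would reject the keyword exactly as shown. A sketch of the call assuming a recent scikit-learn:

# Sketch assuming scikit-learn >= 0.24, where SequentialFeatureSelector
# accepts n_features_to_select / direction / scoring / cv / n_jobs.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector

sfs1 = SequentialFeatureSelector(
    RandomForestClassifier(n_estimators=70, random_state=0, n_jobs=-1),
    n_features_to_select=14,
    direction='forward',
    scoring='roc_auc',
    cv=5,
    n_jobs=3,
)
# sfs1.fit(data, target) as in the question's function

Checking sklearn.__version__ (or pinning scikit-learn>=0.24) should confirm or rule this out.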

Incorrect results when using cufftCallbackLoadR for R2C transformation using cupy

Since the callback feature isn't documented beyond [this sample](https://docs.cupy.dev/en/latest/user_guide/fft.html#fft-callbacks), I might just be holding it wrong. I tried to modify that example to do a real-to-complex transform, as attached below.
When run, it gives this output:
Traceback (most recent call last):
File "/workspace/test/CB-R2C-2D.py", line 28, in <module>
assert cp.allclose(b, c)
AssertionError
C2R works properly but R2C does not. Is it supported? It's not clearly documented whether all possible FFT combinations are supported.
Here is the modified R2C code:
#!/usr/bin/env python3
import cupy as cp

# a load callback that overwrites every loaded input value with 1
code = r'''
__device__ cufftReal CB_ConvertInputR(
    void *dataIn,
    size_t offset,
    void *callerInfo,
    void *sharedPtr)
{
    cufftReal x;
    x = 1.;
    return x;
}

__device__ cufftCallbackLoadR d_loadCallbackPtr = CB_ConvertInputR;
'''

a = cp.random.random((64, 128, 128)).astype(cp.float64)

# this fftn call uses the callback
with cp.fft.config.set_cufft_callbacks(cb_load=code):
    b = cp.fft.rfftn(a, axes=(1, 2))

# this one does not
c = cp.fft.rfftn(cp.ones(shape=a.shape, dtype=cp.float64), axes=(1, 2))

# the results should agree
assert cp.allclose(b, c)
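One plausible explanation, offered as an assumption rather than something stated in the docs: the input array is float64, so the real-to-complex transform runs in double precision (cuFFT's D2Z), and that path expects a load callback declared as cufftCallbackLoadD returning cufftDoubleReal; cufftCallbackLoadR/cufftReal belong to the single-precision R2C path, so the type mismatch could silently corrupt the loaded values. A sketch of the double-precision variant of the same callback:

# Double-precision load callback: cufftCallbackLoadD / cufftDoubleReal are
# the D2Z (float64 real-to-complex) counterparts of cufftCallbackLoadR /
# cufftReal used above.
code_d = r'''
__device__ cufftDoubleReal CB_ConvertInputD(
    void *dataIn,
    size_t offset,
    void *callerInfo,
    void *sharedPtr)
{
    return 1.;
}

__device__ cufftCallbackLoadD d_loadCallbackPtr = CB_ConvertInputD;
'''

with cp.fft.config.set_cufft_callbacks(cb_load=code_d):
    b = cp.fft.rfftn(a, axes=(1, 2))

Alternatively, casting the input to cp.float32 would make the original cufftReal callback consistent with the transform's precision.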

_vode.error: failed in processing argument list for call-back f

I am trying to solve a series of ODEs with the scipy integrate.ode module. What does this error message mean?
create_cb_arglist: Failed to build argument list (siz) with enough arguments (tot-opt) required by user-supplied function (siz,tot,opt=6,7,0).
Traceback (most recent call last):
  File "D:/DeepSpillModel/api.py", line 49, in <module>
    model.solver(start_time, end_time)
  File "D:\DeepSpillModel\far_field.py", line 17, in solver
    far_model.simulate(self.parcels, self.initial_location, start_time,
  File "D:\DeepSpillModel\single_parcel_model.py", line 15, in simulate
    self.t, self.y = calculate_underwater(self.profile, parcel, t0, y0, diff_factor, self.p, delta_t_sub)
  File "D:\DeepSpillModel\single_parcel_model.py", line 44, in calculate_underwater
    r.integrate(t[-1] + delta_t, step=True)
  File "D:\Miniconda3\envs\gnome\lib\site-packages\scipy\integrate\_ode.py", line 433, in integrate
    self._y, self.t = mth(self.f, self.jac or (lambda: None),
  File "D:\Miniconda3\envs\gnome\lib\site-packages\scipy\integrate\_ode.py", line 1024, in step
    r = self.run(*args)
  File "D:\Miniconda3\envs\gnome\lib\site-packages\scipy\integrate\_ode.py", line 1009, in run
    y1, t, istate = self.runner(*args)
_vode.error: failed in processing argument list for call-back f.
Process finished with exit code 1
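Reading the create_cb_arglist line, and this is an interpretation since the thread never shows the f being passed in: siz,tot,opt=6,7,0 means the user-supplied callback f requires 7 arguments while the integrator could only assemble 6. With the 'vode' integrator, f is called as f(t, y, *f_args), and any parameters beyond t and y must be registered with set_f_params; a missing set_f_params call, or an f whose signature doesn't match, produces exactly this kind of argument-list failure. A minimal working sketch with hypothetical names:

import numpy as np
from scipy.integrate import ode

def f(t, y, k):  # t first, then y, then any extra parameters
    return -k * y

r = ode(f).set_integrator('vode', method='bdf')
r.set_initial_value(np.array([1.0]), 0.0)
r.set_f_params(0.5)  # supplies k; omitting this breaks the callback arg list
r.integrate(1.0)
print(r.y)  # decayed value at t = 1

Comparing the signature of the f handed to ode() in single_parcel_model.py against the arguments given to set_f_params would be the first thing to check.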

Sympy .all_coeffs() returned list is not readable by scipy

I have a question about the data type of the result returned by Sympy's Poly.all_coeffs(). I started using Sympy only recently.
My Sympy transfer function is the following:
Then I run this code:
n,d = fraction(Gs)
num = Poly(n,s)
den = Poly(d,s)
num_c = num.all_coeffs()
den_c = den.all_coeffs()
I get:
Then I run this code:
from scipy import signal
import numpy as np
import matplotlib.pyplot as plt

#nu = [5000000.0]
#de = [4.99, 509000.0]
nu = num_c
de = den_c
sys = signal.lti(nu, de)
w, mag, phase = signal.bode(sys)
plt.plot(w/(2*np.pi), mag)
and the result is:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-131-fb960684259c> in <module>
4 nu = num_c
5 de = den_c
----> 6 sys = signal.lti(nu, de)
But if I instead use the commented-out 'nu' and 'de' plain Python lists, the program works. So what is wrong here?
Why did you show only a bit of the error? Why not the full message, maybe even the full traceback?
In [60]: sys = signal.lti(num_c, den_c)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-60-21f71ecd8884> in <module>
----> 1 sys = signal.lti(num_c, den_c)
/usr/local/lib/python3.6/dist-packages/scipy/signal/ltisys.py in __init__(self, *system, **kwargs)
590 self._den = None
591
--> 592 self.num, self.den = normalize(*system)
593
594 def __repr__(self):
/usr/local/lib/python3.6/dist-packages/scipy/signal/filter_design.py in normalize(b, a)
1609 leading_zeros = 0
1610 for col in num.T:
-> 1611 if np.allclose(col, 0, atol=1e-14):
1612 leading_zeros += 1
1613 else:
<__array_function__ internals> in allclose(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/numpy/core/numeric.py in allclose(a, b, rtol, atol, equal_nan)
2169
2170 """
-> 2171 res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
2172 return bool(res)
2173
<__array_function__ internals> in isclose(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/numpy/core/numeric.py in isclose(a, b, rtol, atol, equal_nan)
2267 y = array(y, dtype=dt, copy=False, subok=True)
2268
-> 2269 xfin = isfinite(x)
2270 yfin = isfinite(y)
2271 if all(xfin) and all(yfin):
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Now look at the elements of the num_c list (same for den_c):
In [55]: num_c[0]
Out[55]: 500000.000000000
In [56]: type(_)
Out[56]: sympy.core.numbers.Float
The scipy code is doing numpy testing on the inputs, so it first turns the lists into arrays:
In [61]: np.array(num_c)
Out[61]: array([500000.000000000], dtype=object)
This array contains sympy objects, which numpy can't cast to float under the 'safe' casting rule. But an explicit astype uses 'unsafe' casting by default:
In [63]: np.array(num_c).astype(float)
Out[63]: array([500000.])
So let's convert both lists into valid numpy float arrays:
In [64]: sys = signal.lti(np.array(num_c).astype(float), np.array(den_c).astype(float))
In [65]: sys
Out[65]:
TransferFunctionContinuous(
array([100200.4008016]),
array([1.00000000e+00, 1.02004008e+05]),
dt: None
)
Conversion in a list comprehension also works:
sys = signal.lti([float(i) for i in num_c],[float(i) for i in den_c])
You likely need to convert sympy objects to floats / lists of floats.

Jython 2.5 isdigit

I am trying to add an isdigit() check to the program so that I can verify what the user enters is valid. This is what I have so far. But when I run it and enter a character, say "f", it crashes and gives me the error posted below the code. Any ideas?
def mirrorHorizontal(source):
    userMirrorPoint = requestString("Enter a mirror point from 0 to halfway through the picture.")  # asks user for an input
    while (int(userMirrorPoint) < 0 or int(userMirrorPoint) > (int(getHeight(source) - 1)//2)) or not(userMirrorPoint.isdigit()):
        userMirrorPoint = requestString("Enter a mirror point from 0 to halfway through the picture.")
    height = getHeight(source)
    mirrorPoint = int(userMirrorPoint)
    for x in range(0, getWidth(source)):
        for y in range(0, mirrorPoint):
            topPixel = getPixel(source, x, y)
            bottomPixel = getPixel(source, x, height-y-1)
            color = getColor(topPixel)
            setColor(bottomPixel, color)
The error was: f
Inappropriate argument value (of correct type).
An error occurred attempting to pass an argument to a function.
Please check line 182 of /Volumes/FLASHDRIVE2/College/Spring 16'/Programs - CPS 201/PA5Sikorski.py
isdigit() itself behaves correctly in the Jython 2.7.0 version I have locally:
>>> '1'.isdigit()
True
>>> ''.isdigit()
False
>>> 'A'.isdigit()
False
>>> 'A2'.isdigit()
False
>>> '2'.isdigit()
True
>>> '22321'.isdigit()
True
Try breaking your big expression up, as typecasting to integers will throw errors for non-numeric strings. This is true across Python versions.
>>> int('b')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'b'
>>> int('2')
2
You probably want to be careful about the order of the parts of that long expression (this or that or ...): or evaluates left to right, so the int() conversions run before the isdigit() check ever gets a chance to reject non-numeric input. Breaking the expression up, as in the sketch below, would also make it more readable.
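A minimal sketch of that reordering, reusing the question's JES helpers (requestString, getHeight) and leaving the bounds logic unchanged:

# Check isdigit() first: `or` short-circuits left to right, so the int()
# conversions only run once the string is known to be numeric.
while (not userMirrorPoint.isdigit()
       or int(userMirrorPoint) < 0
       or int(userMirrorPoint) > (getHeight(source) - 1) // 2):
    userMirrorPoint = requestString("Enter a mirror point from 0 to halfway through the picture.")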