Cannot cast ListType[tuple(float64 x 2)] to list(tuple(float64 x 2)) in numba

Hello, I am trying to use a typed List in numba v0.46.0:
>>> from numba.typed import List
>>> from numba import types
>>> mylist = List.empty_list(item_type=types.Tuple((types.f8, types.f8)))
>>> mylist2 = List.empty_list(item_type=types.List(dtype=types.Tuple((types.f8, types.f8))))
>>> mylist2.append(mylist)
but I get the following error. How can I fix it?
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/numba/typed/typedlist.py", line 223, in append
    _append(self, item)
  File "/usr/local/lib/python3.7/site-packages/numba/dispatcher.py", line 401, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/usr/local/lib/python3.7/site-packages/numba/dispatcher.py", line 344, in error_rewrite
    reraise(type(e), e, None)
  File "/usr/local/lib/python3.7/site-packages/numba/six.py", line 668, in reraise
    raise value.with_traceback(tb)
numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Internal error at .
Failed in nopython mode pipeline (step: nopython mode backend)
Cannot cast ListType[tuple(float64 x 2)] to list(tuple(float64 x 2)): %".24" = load {i8*, i8*}, {i8*, i8*}* %"item"

File "../../usr/local/lib/python3.7/site-packages/numba/listobject.py", line 434:
def impl(l, item):
    casteditem = _cast(item, itemty)

The following should work; the item type of the outer list has to be declared as types.ListType (the typed-list type), not types.List (the old reflected-list type), which is why the cast from ListType to list fails:
mylist2 = List.empty_list(item_type=types.ListType(itemty=types.Tuple((types.f8, types.f8))))
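For reference, a minimal end-to-end sketch of that fix (assuming numba 0.46+ and its typed-list API); the items of the outer typed list are themselves typed lists, so their type is spelled with types.ListType:
>>> from numba import types
>>> from numba.typed import List
>>> itemty = types.Tuple((types.f8, types.f8))         # element type of the inner lists
>>> mylist = List.empty_list(itemty)                   # typed list of (f8, f8) tuples
>>> mylist.append((1.0, 2.0))
>>> mylist2 = List.empty_list(types.ListType(itemty))  # outer list holds typed lists
>>> mylist2.append(mylist)                             # should now be accepted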

Related

pycairo error: “no current point defined” when trying to use TeeSurface

I'm trying to use a cairo.TeeSurface (pycairo version '1.21.0') to redirect input to multiple surfaces, but I cannot make it work. Here is my code:
>>> import cairo
>>> s1 = cairo.ImageSurface(cairo.FORMAT_ARGB32, 500,500)
>>> s2 = cairo.ImageSurface(cairo.FORMAT_ARGB32, 500,500)
>>> s3 = cairo.ImageSurface(cairo.FORMAT_ARGB32, 500,500)
>>> s = cairo.TeeSurface(s1)
>>> s.add(s2)
>>> s.add(s3)
>>> c = cairo.Context(s)
After the above initialization, if I try to draw a line I get an error:
>>> c.rel_line_to(100,100)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
cairo.Error: no current point defined
I tried to set a point explicitly, but I get the same error:
>>> c.move_to(10,10)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
cairo.Error: no current point defined
I must have misunderstood something; any clue would be appreciated.
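Not an answer, but a minimal isolation check might help, using only standard pycairo calls: the same move_to/rel_line_to sequence on a context over a plain ImageSurface should succeed, which would narrow the problem down to TeeSurface rather than the drawing calls themselves:
>>> import cairo
>>> s1 = cairo.ImageSurface(cairo.FORMAT_ARGB32, 500, 500)
>>> c = cairo.Context(s1)     # context on a plain surface, no TeeSurface involved
>>> c.move_to(10, 10)         # establishes the current point
>>> c.rel_line_to(100, 100)   # relative line from the current point
>>> c.stroke()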

show() raises an error after applying a pandas UDF to a DataFrame

I am having trouble making this trial code work. The final line df.select(plus_one(col("x"))).show() fails; I also tried saving the result in a variable (vardf = df.select(plus_one(col("x"))) followed by vardf.show()), and that fails too.
import pyspark
import pandas as pd
from typing import Iterator
from pyspark.sql.functions import col, pandas_udf, struct
spark = pyspark.sql.SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("WARN")
pdf = pd.DataFrame([1, 2, 3], columns=["x"])
df = spark.createDataFrame(pdf)
df.show()
@pandas_udf("long")
def plus_one(batch_iter: Iterator[pd.Series]) -> Iterator[pd.Series]:
    for s in batch_iter:
        yield s + 1
df.select(plus_one(col("x"))).show()
Error message (parts of it):
File "C:\bigdatasetup\anaconda3\envs\pyspark-env\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
exec(code, globals, locals)
File "c:\bigdatasetup\dataanalysiswithpythonandpyspark-trunk\code\ch09\untitled0.py", line 24, in
df.select(plus_one(col("x"))).show()
File "C:\bigdatasetup\anaconda3\envs\pyspark-env\lib\site-packages\pyspark\sql\dataframe.py", line 494, in show
print(self._jdf.showString(n, 20, vertical))
File "C:\bigdatasetup\anaconda3\envs\pyspark-env\lib\site-packages\py4j\java_gateway.py", line 1321, in call
return_value = get_return_value(
File "C:\bigdatasetup\anaconda3\envs\pyspark-env\lib\site-packages\pyspark\sql\utils.py", line 117, in deco
raise converted from None
PythonException:
An exception was thrown from the Python worker. Please see the stack trace below.
...
...
ERROR 2022-04-21 09:48:24,423 7608 org.apache.spark.scheduler.TaskSetManager [task-result-getter-0] Task 0 in stage 3.0 failed 1 times; aborting job
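One way to narrow this down (a sketch, not a confirmed fix; plus_one_simple is a name invented here): the plain Series-to-Series form of pandas_udf goes through the same Arrow-based Python worker with less machinery, so if it fails with the same PythonException the problem is likely in the worker environment (for example the pyarrow/pandas installation) rather than in the iterator UDF itself:
import pandas as pd
from pyspark.sql.functions import col, pandas_udf

@pandas_udf("long")
def plus_one_simple(s: pd.Series) -> pd.Series:
    # plain Series-to-Series pandas UDF, no iterator involved
    return s + 1

df.select(plus_one_simple(col("x"))).show()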

How to find expected value of np.array using scipy.stats?

I am trying to get the expected value of a NumPy array, but I run into a problem when I pass my array into the function. Here is an example of what is happening:
import numpy as np
from scipy import stats

a = np.ones(10)
stats.rv_continuous.expect(args=a)
I get this error:
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
stats.rv_continuous.expect(args=a)
TypeError: expect() missing 1 required positional argument: 'self'
If I try stats.rv_continuous.expect(a), I get this error:
'numpy.ndarray' object has no attribute '_argcheck'
Can someone tell me how to get scipy.stats to work with an array?
Update:
Following bob's comment, I changed the code to:
st=stats.rv_continuous()
ev = st.expect(args=signal_array)
print(ev)
where signal_array is a NumPy array. However, I now get this error:
Traceback (most recent call last):
File "C:\Users\...\OneDrive\Área de Trabalho\TickingClock\Main.py", line 35, in <module>
ev = st.expect(args=signal_array)
File "C:\Users\...\AppData\Local\Programs\Python\Python39\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 2738, in expect
vals = integrate.quad(fun, lb, ub, **kwds)[0] / invfac
File "C:\Users\...\AppData\Local\Programs\Python\Python39\lib\site-packages\scipy\integrate\quadpack.py", line 351, in quad
retval = _quad(func, a, b, args, full_output, epsabs, epsrel, limit,
File "C:\Users\...\AppData\Local\Programs\Python\Python39\lib\site-packages\scipy\integrate\quadpack.py", line 465, in _quad
return _quadpack._qagie(func,bound,infbounds,args,full_output,epsabs,epsrel,limit)
File "C:\Users\...\AppData\Local\Programs\Python\Python39\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 2722, in fun
return x * self.pdf(x, *args, **lockwds)
File "C:\Users\...\AppData\Local\Programs\Python\Python39\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1866, in pdf
args, loc, scale = self._parse_args(*args, **kwds)
TypeError: _parse_args() got multiple values for argument 'loc'
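For context, a hedged sketch of how expect() is normally used (this assumes the goal is either a plain sample average or an expectation under a known distribution): expect() integrates a function against a distribution's pdf, so it is called on a concrete distribution such as scipy.stats.norm rather than on the rv_continuous base class, and it does not take a data array; the average of an array of samples is just np.mean:
import numpy as np
from scipy import stats

a = np.ones(10)

# sample average of the array itself
print(np.mean(a))                                           # 1.0

# expectation E[g(X)] under a concrete distribution
print(stats.norm(loc=0, scale=1).expect(lambda x: x**2))    # approximately 1.0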

RuntimeError using NetworkX on example code

Following the examples on https://networkx.github.io/documentation/stable/reference/drawing.html, I tried the following code:
import networkx as nx
G = nx.complete_graph(5)
A = nx.nx_agraph.to_agraph(G)
H = nx.nx_agraph.from_agraph(A)
I get a RuntimeError as follows:
H = nx.nx_agraph.from_agraph(A)
Traceback (most recent call last):
File "/home/nom/anaconda3/envs/wcats/lib/python3.7/site-packages/pygraphviz/agraph.py", line 1750, in iteritems
ah = gv.agnxtattr(self.handle, self.type, ah)
StopIteration: agnxtattr
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<ipython-input-10-19c378da806e>", line 1, in <module>
H = nx.nx_agraph.from_agraph(A)
File "/home/nom/anaconda3/envs/wcats/lib/python3.7/site-packages/networkx/drawing/nx_agraph.py", line 85, in from_agraph
N.graph.update(A.graph_attr)
File "/home/nom/anaconda3/envs/wcats/lib/python3.7/site-packages/pygraphviz/agraph.py", line 1740, in keys
return list(self.__iter__())
File "/home/nom/anaconda3/envs/wcats/lib/python3.7/site-packages/pygraphviz/agraph.py", line 1743, in __iter__
for (k, v) in self.iteritems():
RuntimeError: generator raised StopIteration
This error is so basic that I suspect there's a problem with the package itself. Any suggestions on how I can try to troubleshoot this one?
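A hedged troubleshooting sketch: the "generator raised StopIteration" wording is what Python 3.7 produces (PEP 479) when a StopIteration escapes a generator, and the failing frame is pygraphviz's iteritems, so checking the installed pygraphviz version and trying a newer release is a reasonable first step (whether an upgrade actually fixes it is an assumption here):
>>> import pygraphviz
>>> pygraphviz.__version__   # if this is an old release, try: pip install --upgrade pygraphviz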

Pyspark 'tzinfo' error when using the Cassandra connector

I'm reading from Cassandra using
a = sc.cassandraTable("my_keyspace", "my_table").select("timestamp", "value")
and then want to convert it to a dataframe:
a.toDF()
and the schema is correctly inferred:
DataFrame[timestamp: timestamp, value: double]
but then when materializing the dataframe I get the following error:
Py4JJavaError: An error occurred while calling o89372.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 285.0 failed 4 times, most recent failure: Lost task 0.3 in stage 285.0 (TID 5243, kepler8.cern.ch): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/opt/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/worker.py", line 111, in main
process()
File "/opt/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/worker.py", line 106, in process
serializer.dump_stream(func(split_index, iterator), outfile)
File "/opt/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
vs = list(itertools.islice(iterator, batch))
File "/opt/spark-1.6.0-bin-hadoop2.6/python/pyspark/sql/types.py", line 541, in toInternal
return tuple(f.toInternal(v) for f, v in zip(self.fields, obj))
File "/opt/spark-1.6.0-bin-hadoop2.6/python/pyspark/sql/types.py", line 541, in <genexpr>
return tuple(f.toInternal(v) for f, v in zip(self.fields, obj))
File "/opt/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/sql/types.py", line 435, in toInternal
return self.dataType.toInternal(obj)
File "/opt/spark-1.6.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/sql/types.py", line 190, in toInternal
seconds = (calendar.timegm(dt.utctimetuple()) if dt.tzinfo
AttributeError: 'str' object has no attribute 'tzinfo'
which sounds like a string has been given to pyspark.sql.types.TimestampType.
How could I debug this further?
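One way to check that hypothesis, sketched with only the RDD API already used in the question (the inspection is illustrative and depends on how the Cassandra connector represents rows): pull a single raw record before calling toDF() and look at how the timestamp field comes back from Cassandra:
>>> sample = a.first()   # one raw record, before any DataFrame conversion
>>> sample               # if the timestamp prints as a quoted string, the connector is returning str, not datetime
>>> type(sample)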