JSON error while using Pandas output format - alpha-vantage

I am using the alpha_vantage TimeSeries API as shown below:
-----------------------------------------code------------------------------------
import pandas as pd
from alpha_vantage.timeseries import TimeSeries
from alpha_vantage.techindicators import TechIndicators
from matplotlib.pyplot import figure
import matplotlib.pyplot as plt
from pprint import pprint

# my key
key = 'mykey'
ts = TimeSeries(key, output_format='pandas')

def processMyBatch(batch, FD):
    for i in batch:
        df, meta_data = ts.get_quote_endpoint(i)
        FD = FD.append(df)
    return FD

# main code...
for i in batches:
    DF2 = processMyBatch(i, DF)
    DF = DF2
While the API worked for a few symbols (see the log below), somewhere in the middle of going through the list of symbols I suddenly got the following JSONDecodeError, even though I am using output_format='pandas'. Could you please throw some light on why this error occurred?
Thank you.
================error===============
/opt/scripts
starting now. fileName is: /mnt/NAS/Documents/../../../dailyquote2020-03-03.xlsx
completed the batch: ['AAPL', 'ABBV', 'AMZN', 'BAC', 'BNDX']
Waiting to honor API requirement: for 1 min
Waited: 65 sec
completed the batch: ['C', 'CNQ', 'CTSH', 'EEMV', 'FBGRX']
Waiting to honor API requirement: for 1 min
Waited: 65 sec
completed the batch: ['FDVV', 'FFNOX', 'FSMEX', 'FXAIX', 'GE']
Waiting to honor API requirement: for 1 min
Waited: 65 sec
Traceback (most recent call last):
File "getQuotes.py", line 55, in <module>
DF2=processMyBatch(i, DF)
File "getQuotes.py", line 29, in processMyBatch
df, meta_data = ts.get_quote_endpoint(i)
File "/home/username/.local/lib/python3.6/site-packages/alpha_vantage/alphavantage.py", line 174, in _format_wrapper
self, *args, **kwargs)
File "/home/username/.local/lib/python3.6/site-packages/alpha_vantage/alphavantage.py", line 159, in _call_wrapper
return self._handle_api_call(url), data_key, meta_data_key
File "/home/username/.local/lib/python3.6/site-packages/alpha_vantage/alphavantage.py", line 287, in _handle_api_call
json_response = response.json()
File "/home/username/.local/lib/python3.6/site-packages/requests/models.py", line 898, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 518, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Update (added on 3/4/2020):
..
..
completed the batch: ['FDVV', 'FFNOX', 'FSMEX', 'FXAIX', 'GE']
Waiting to honor API requirement: for 1 min
Waited: 65 sec
completed the batch: ['GOOGL', 'IGEB', 'IJH', 'IJR', 'IMTB']
Waiting to honor API requirement: for 1 min
Waited: 65 sec
Traceback (most recent call last):
File "getQuotes.py", line 55, in <module>
DF2=processMyBatch(i, DF)
..
..

Well, I was getting an error like that today, and it turns out the Alpha Vantage site is down! When the service is unavailable it presumably returns something other than JSON, which is why the client fails at the JSON-decoding step regardless of the pandas output format.
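If you want the batch loop to survive that case instead of crashing mid-run, one option is a small retry wrapper around the quote call. This is only a sketch reusing the ts object from the question; the retry count and wait time are arbitrary, and it simply re-raises after the last attempt:
import time

def get_quote_with_retry(ts, symbol, retries=3, wait_sec=65):
    """Fetch a quote, retrying when the service returns a non-JSON page."""
    for attempt in range(retries):
        try:
            return ts.get_quote_endpoint(symbol)
        except ValueError:  # simplejson/json JSONDecodeError subclasses ValueError
            if attempt == retries - 1:
                raise
            time.sleep(wait_sec)

Inside processMyBatch you would then call df, meta_data = get_quote_with_retry(ts, i) instead of calling the endpoint directly.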

Related

Exporting trained instrument in DDSP-VST

I bought Colab Pro+, uploaded my own instrument to Google Drive, and started the training module. After about 30 minutes it finished, but it gives me an error saying the instrument is not found.
This is the error output:
Exporting model...
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/train_util.py", line 165, in get_latest_operative_config
restore_dir, prefix='operative_config-', suffix='.gin')
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/train_util.py", line 106, in get_latest_file
f'No files found matching the pattern '{search_pattern}'.')
FileNotFoundError: No files found matching the pattern '/content/gdrive/MyDrive/My/operative_config-*.gin'.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/ddsp_export", line 8, in
sys.exit(console_entry_point())
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/ddsp_export.py", line 364, in console_entry_point
app.run(main)
File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/ddsp_export.py", line 333, in main
export_impulse_response(model_path, save_dir, FLAGS.reverb_sample_rate)
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/ddsp_export.py", line 272, in export_impulse_response
ddsp.training.inference.parse_operative_config(model_path)
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/inference.py", line 41, in parse_operative_config
operative_config = train_util.get_latest_operative_config(ckpt_dir)
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/train_util.py", line 168, in get_latest_operative_config
os.path.dirname(restore_dir), prefix='operative_config-', suffix='.gin')
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/train_util.py", line 106, in get_latest_file
f'No files found matching the pattern '{search_pattern}'.')
FileNotFoundError: No files found matching the pattern '/content/gdrive/MyDrive/operative_config-*.gin'.
Export complete! Zipping /content/gdrive/MyDrive/My Instrument/ddsp-training-2022-07-18-0955/My_Instrument to /content/gdrive/MyDrive/My Instrument/ddsp-training-2022-07-18-0955/My_Instrument.zip
/bin/bash: line 0: cd: too many arguments
Zipping Complete! Downloading... My_Instrument.zip
You can also find your model at /content/gdrive/MyDrive/My Instrument/ddsp-training-2022-07-18-0955/My_Instrument
FileNotFoundError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/ipyfilechooser/filechooser.py in _on_select_click(self, _b)
315 if self._callback is not None:
316 try:
--> 317 self._callback(self)
318 except TypeError:
319 # Support previous behaviour of not passing self
3 frames
/usr/local/lib/python3.7/dist-packages/google/colab/files.py in download(filename)
187 raise OSError(msg)
188 else:
--> 189 raise FileNotFoundError(msg) # pylint: disable=undefined-variable
190
191 comm_manager = _IPython.get_ipython().kernel.comm_manager
FileNotFoundError: Cannot find file: /content/gdrive/MyDrive/My Instrument/ddsp-training-2022-07-18-0955/My_Instrument.zip
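Two things in the log seem worth checking before re-running the export: the exporter first searched /content/gdrive/MyDrive/My (the model path apparently got cut at the space in "My Instrument"), and the shell step failed with "cd: too many arguments" for the same reason. A small, purely illustrative check (the MODEL_DIR path is an assumption taken from the log):
import glob
import os

# Hypothetical sanity check: confirm that training actually produced an
# operative_config-*.gin file where the exporter will look for it.
MODEL_DIR = "/content/gdrive/MyDrive/My Instrument/ddsp-training-2022-07-18-0955/My_Instrument"

gin_files = glob.glob(os.path.join(MODEL_DIR, "operative_config-*.gin"))
print("operative configs found:", gin_files)

If the file exists but the path contains spaces, renaming the Drive folder to something space-free (for example My_Instrument) may be worth trying, since both failures above look path-related.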

show() raises an error after applying a pandas UDF to a DataFrame

I am having trouble making this trial code work. The final line df.select(plus_one(col("x"))).show() doesn't work; I also tried saving the result in a variable (vardf = df.select(plus_one(col("x"))) followed by vardf.show()), and that fails too.
import pyspark
import pandas as pd
from typing import Iterator
from pyspark.sql.functions import col, pandas_udf, struct

spark = pyspark.sql.SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("WARN")

pdf = pd.DataFrame([1, 2, 3], columns=["x"])
df = spark.createDataFrame(pdf)
df.show()

@pandas_udf("long")
def plus_one(batch_iter: Iterator[pd.Series]) -> Iterator[pd.Series]:
    for s in batch_iter:
        yield s + 1

df.select(plus_one(col("x"))).show()
Error message (parts of it):
File "C:\bigdatasetup\anaconda3\envs\pyspark-env\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
exec(code, globals, locals)
File "c:\bigdatasetup\dataanalysiswithpythonandpyspark-trunk\code\ch09\untitled0.py", line 24, in
df.select(plus_one(col("x"))).show()
File "C:\bigdatasetup\anaconda3\envs\pyspark-env\lib\site-packages\pyspark\sql\dataframe.py", line 494, in show
print(self._jdf.showString(n, 20, vertical))
File "C:\bigdatasetup\anaconda3\envs\pyspark-env\lib\site-packages\py4j\java_gateway.py", line 1321, in call
return_value = get_return_value(
File "C:\bigdatasetup\anaconda3\envs\pyspark-env\lib\site-packages\pyspark\sql\utils.py", line 117, in deco
raise converted from None
PythonException:
An exception was thrown from the Python worker. Please see the stack trace below.
...
...
ERROR 2022-04-21 09:48:24,423 7608 org.apache.spark.scheduler.TaskSetManager [task-result-getter-0] Task 0 in stage 3.0 failed 1 times; aborting job
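As a debugging step, it may help to first try the simpler Series-to-Series form of the UDF in the same environment; if that also dies in the Python worker, the problem is more likely the Arrow/pandas setup on the worker than the iterator UDF itself. A minimal sketch under that assumption:
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, pandas_udf

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(pd.DataFrame({"x": [1, 2, 3]}))

@pandas_udf("long")
def plus_one_simple(s: pd.Series) -> pd.Series:
    # Plain Series-to-Series pandas UDF, no iterator involved.
    return s + 1

df.select(plus_one_simple(col("x")).alias("x_plus_one")).show()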

Databricks: Cannot display the predicted output using an MLflow registered model

I have created a model using the diabetes dataset for prediction. I trained, evaluated, logged, and registered it as a new model in MLflow. Now I am trying to load the registered model and predict on new data. Although I was able to predict the results, I am not able to display them: when I try to display them using .show() or display(), it throws an error. What is the cause of the error, and how do I display the results?
Note: I programmed this in pure PySpark, and all the MLflow operations were done on Databricks.
Code:
model_details = mlflow.tracking.MlflowClient().get_latest_versions('model1',stages=['staging'])[0]
model = mlflow.pyfunc.spark_udf(spark,model_details.source)
input_df = sdf.drop('progression')
columns = list(map(lambda c: f"{c}", input_df.columns))
df = input_df.withColumn("progression", model(*columns))
df.show(truncate=False)
Error:
PythonException: An exception was thrown from a UDF: 'Exception: Java gateway process exited before sending its port number'. Full traceback below:
PythonException Traceback (most recent call last)
<command-1343735193245452> in <module>
34 df = input_df.withColumn("progression", model(*columns))
35
---> 36 df.show(truncate=False)
/databricks/spark/python/pyspark/sql/dataframe.py in show(self, n, truncate, vertical)
441 print(self._jdf.showString(n, 20, vertical))
442 else:
--> 443 print(self._jdf.showString(n, int(truncate), vertical))
444
445 def __repr__(self):
/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
1303 answer = self.gateway_client.send_command(command)
1304 return_value = get_return_value(
-> 1305 answer, self.gateway_client, self.target_id, self.name)
1306
1307 for temp_arg in temp_args:
/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
131 # Hide where the exception came from that shows a non-Pythonic
132 # JVM exception message.
--> 133 raise_from(converted)
134 else:
135 raise
/databricks/spark/python/pyspark/sql/utils.py in raise_from(e)
PythonException: An exception was thrown from a UDF: 'Exception: Java gateway process exited before sending its port number'. Full traceback below:
Traceback (most recent call last):
File "/databricks/spark/python/pyspark/worker.py", line 654, in main
process()
File "/databricks/spark/python/pyspark/worker.py", line 646, in process
serializer.dump_stream(out_iter, outfile)
File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 281, in dump_stream
timely_flush_timeout_ms=self.timely_flush_timeout_ms)
File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 97, in dump_stream
for batch in iterator:
File "/databricks/spark/python/pyspark/sql/pandas/serializers.py", line 271, in init_stream_yield_batches
for series in iterator:
File "/databricks/spark/python/pyspark/worker.py", line 467, in mapper
result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
File "/databricks/spark/python/pyspark/worker.py", line 467, in <genexpr>
result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
File "/databricks/spark/python/pyspark/worker.py", line 111, in <lambda>
verify_result_type(f(*a)), len(a[0])), arrow_return_type)
File "/databricks/spark/python/pyspark/util.py", line 109, in wrapper
return f(*args, **kwargs)
File "/databricks/python/lib/python3.7/site-packages/mlflow/pyfunc/__init__.py", line 827, in predict
model = SparkModelCache.get_or_load(archive_path)
File "/databricks/python/lib/python3.7/site-packages/mlflow/pyfunc/spark_model_cache.py", line 64, in get_or_load
SparkModelCache._models[archive_path] = load_pyfunc(temp_dir)
File "/databricks/python/lib/python3.7/site-packages/mlflow/utils/annotations.py", line 43, in deprecated_func
return func(*args, **kwargs)
File "/databricks/python/lib/python3.7/site-packages/mlflow/pyfunc/__init__.py", line 693, in load_pyfunc
return load_model(model_uri, suppress_warnings)
File "/databricks/python/lib/python3.7/site-packages/mlflow/pyfunc/__init__.py", line 667, in load_model
model_impl = importlib.import_module(conf[MAIN])._load_pyfunc(data_path)
File "/databricks/python/lib/python3.7/site-packages/mlflow/spark.py", line 707, in _load_pyfunc
.master("local[1]")
File "/databricks/spark/python/pyspark/sql/session.py", line 189, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/databricks/spark/python/pyspark/context.py", line 384, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/databricks/spark/python/pyspark/context.py", line 134, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "/databricks/spark/python/pyspark/context.py", line 333, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
File "/databricks/spark/python/pyspark/java_gateway.py", line 105, in launch_gateway
raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
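Not a definitive answer, but the traceback shows the pyfunc UDF trying to start its own local SparkContext inside the Python worker (mlflow/spark.py, .master("local[1]")), and that is the step that dies with the "Java gateway" error. If the registered model was logged with the Spark MLlib flavor, one way to sidestep the UDF path is to load it on the driver and call transform directly; a sketch under that assumption, reusing model_details and sdf from the question:
import mlflow.spark

# Load the Spark-flavor model on the driver instead of wrapping it in a UDF.
spark_model = mlflow.spark.load_model(model_details.source)

input_df = sdf.drop('progression')
predictions = spark_model.transform(input_df)
predictions.show(truncate=False)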

using boto3 in a python3 virtual env in AWS Lambda

I am trying to use Python 3.4 and boto3 to walk an S3 bucket and publish some file locations to an RDS instance. The part of this effort I am having trouble with is using boto3. My lambda function looks like the following:
import subprocess

def lambda_handler(event, context):
    args = ("venv/bin/python3.4", "run.py")
    popen = subprocess.Popen(args, stdout=subprocess.PIPE)
    popen.wait()
    output = popen.stdout.read()
    print(output)
and in my run.py file I have these lines:
import boto3
s3c = boto3.client('s3')
which cause an exception. The rest of run.py is not relevant to this question, however; to keep this post concise, I've found that the error can be reproduced by executing this lambda function:
import subprocess

def lambda_handler(event, context):
    args = ("python3.4", "-c", "import boto3; print(boto3.client('s3'))")
    popen = subprocess.Popen(args, stdout=subprocess.PIPE)
    popen.wait()
    output = popen.stdout.read()
    print(output)
My logstream reports the error:
Event Data
START RequestId: 2b65421a-664d-11e6-81db-974c7c09d283 Version: $LATEST
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/var/runtime/boto3/__init__.py", line 79, in client
return _get_default_session().client(*args, **kwargs)
File "/var/runtime/boto3/session.py", line 250, in client
aws_session_token=aws_session_token, config=config)
File "/var/runtime/botocore/session.py", line 818, in create_client
client_config=config, api_version=api_version)
File "/var/runtime/botocore/client.py", line 63, in create_client
cls = self._create_client_class(service_name, service_model)
File "/var/runtime/botocore/client.py", line 85, in _create_client_class
base_classes=bases)
File "/var/runtime/botocore/hooks.py", line 227, in emit
return self._emit(event_name, kwargs)
File "/var/runtime/botocore/hooks.py", line 210, in _emit
response = handler(**kwargs)
File "/var/runtime/boto3/utils.py", line 61, in _handler
module = import_module(module)
File "/var/runtime/boto3/utils.py", line 52, in import_module
__import__(name)
File "/var/runtime/boto3/s3/inject.py", line 13, in <module>
from boto3.s3.transfer import S3Transfer
File "/var/runtime/boto3/s3/transfer.py", line 135, in <module>
from concurrent import futures
File "/var/runtime/concurrent/futures/__init__.py", line 8, in <module>
from concurrent.futures._base import (FIRST_COMPLETED,
File "/var/runtime/concurrent/futures/_base.py", line 357
raise type(self._exception), self._exception, self._traceback
^
SyntaxError: invalid syntax
END RequestId: 2b65421a-664d-11e6-81db-974c7c09d283
REPORT RequestId: 2b65421a-664d-11e6-81db-974c7c09d283 Duration: 2673.45 ms Billed Duration: 2700 ms Memory Size: 1024 MB Max Memory Used: 61 MB
I need to use boto3 downstream of run.py. Any ideas on how to resolve this are much appreciated. Thanks!
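One observation that may help: the SyntaxError comes from /var/runtime/concurrent/futures/_base.py, which is the Python 2 "futures" backport bundled with the runtime, so the python3.4 subprocess appears to be importing boto3's dependencies from /var/runtime instead of from the virtualenv. A sketch of one possible workaround, pointing the subprocess at the venv's own site-packages (the path is an assumption based on a typical venv layout):
import os
import subprocess

def lambda_handler(event, context):
    env = dict(os.environ)
    # Hypothetical path; put the venv's packages ahead of /var/runtime so the
    # Python 3 interpreter does not pick up the Python 2 futures backport.
    env["PYTHONPATH"] = "venv/lib/python3.4/site-packages"

    args = ("venv/bin/python3.4", "run.py")
    popen = subprocess.Popen(args, stdout=subprocess.PIPE, env=env)
    popen.wait()
    print(popen.stdout.read())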

Pandas HDF5 store unicode error on select query

I have unicode data as read from this file:
Mdt,Doccompra,OrgC,Cen,NumP,Criadopor,Dtcriacao,Fornecedor,P,Fun
400,8751215432,2581,,1,MIGRAÇÃO,01.10.2004,75852214,,TD
400,5464282154,9874,,1,MIGRAÇÃO,01.10.2004,78995411,,FO
I have two problems:
When I try to query this unicode data I get a UnicodeDecodeError:
Traceback (most recent call last):
File "<ipython-input-1-4423dceb2b1d>", line 1, in <module>
runfile('C:/Users/u5en/Documents/SAP/Programação/Problema HDF.py', wdir='C:/Users/u5en/Documents/SAP/Programação')
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 580, in runfile
execfile(filename, namespace)
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 48, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/u5en/Documents/SAP/Programação/Problema HDF.py", line 15, in <module>
store.select("EKKA", "columns=['Mdt', 'Fornecedor']")
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 665, in select
return it.get_result()
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 1359, in get_result
results = self.func(self.start, self.stop, where)
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 658, in func
columns=columns, **kwargs)
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 3968, in read
if not self.read_axes(where=where, **kwargs):
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 3201, in read_axes
a.convert(values, nan_rep=self.nan_rep, encoding=self.encoding)
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 2058, in convert
self.data, nan_rep=nan_rep, encoding=encoding)
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 4359, in _unconvert_string_array
data = f(data)
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 1700, in __call__
return self._vectorize_call(func=func, args=vargs)
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 1769, in _vectorize_call
outputs = ufunc(*inputs)
File "C:\Users\u5en\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 4358, in <lambda>
f = np.vectorize(lambda x: x.decode(encoding), otypes=[np.object])
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 7: unexpected end of data
How can I store and query my unicode data in hdf5?
I have many tables with column names I do not know beforehand and which are not proper PyTables names (NaturalNameWarning). I would like the user to be able to query on these columns, so how can I query them when their names prevent me? I see this used to have no easy fix, so if that is still the case I will just remove the offending characters from the headings.
import csv
import pandas as pd
dados = pd.read_csv("EKKA - Cópia.csv")
print(dados)
store= pd.HDFStore('teste.h5' , encoding="utf-8")
store.append("EKKA", dados, format="table", data_columns=True)
store.select("EKKA", "columns=['Mdt', 'Fornecedor']")
store.close()
Would I be better off doing this in sqlite?
Environment:
Windows 7 64bit
Pandas 0.15.2
NumPy 1.9.2
So under Python 2.7 on Windows 7 with pandas 0.15.2, everything worked as expected with no encoding necessary. However, on Python 3.4 the following worked for me. Apparently some characters are not representable in 'utf-8'; 'latin1' encoding usually solves these issues. Note that I had to read the CSV in the first place with this encoding.
>>> df = pd.read_csv('../../test.csv',encoding='latin1')
>>> df
Mdt Doccompra OrgC Cen NumP Criadopor Dtcriacao Fornecedor P Fun
0 400 8751215432 2581 NaN 1 MIGRAÇ\xc3O 01.10.2004 75852214 NaN TD
1 400 5464282154 9874 NaN 1 MIGRAÇ\xc3O 01.10.2004 78995411 NaN FO
Further, the encoding must be specified not when opening the store, but on the append/put calls:
>>> df.to_hdf('test.h5','df',format='table',mode='w',data_columns=True,encoding='latin1')
>>> pd.read_hdf('test.h5','df')
Mdt Doccompra OrgC Cen NumP Criadopor Dtcriacao Fornecedor P Fun
0 400 8751215432 2581 NaN 1 MIGRAÇ\xc3O 01.10.2004 75852214 NaN TD
1 400 5464282154 9874 NaN 1 MIGRAÇ\xc3O 01.10.2004 78995411 NaN FO
Once it is written encoded, it is not necessary to specify the encoding when reading.
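To tie this back to the select in the question, a short sketch of the full round trip with the latin1 encoding (file and column names taken from the question; the columns= selection assumes a table-format store written with data_columns=True):
import pandas as pd

dados = pd.read_csv("EKKA - Cópia.csv", encoding="latin1")
dados.to_hdf("teste.h5", "EKKA", format="table", mode="w",
             data_columns=True, encoding="latin1")

# Column selection on read; no encoding argument is needed here.
subset = pd.read_hdf("teste.h5", "EKKA", columns=["Mdt", "Fornecedor"])
print(subset)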