ERROR: blob.download_to_filename returns an empty file and raises an error - google-cloud-storage

I'm trying to download the *.tflite model on Google Cloud Storage to my Raspberry Pi 3B+ using the following code:
import os

# `response` is the result of the AutoML model-export operation
export_metadata = response.metadata
export_directory = export_metadata.export_model_details.output_info.gcs_output_directory
model_dir_remote = export_directory + remote_model_filename  # file path on Google Cloud Storage
model_dir = os.path.join("models", model_filename)  # local file path to store the model
blob = bucket.blob(model_dir_remote)
blob.download_to_filename(model_dir)
However, this returns an empty file in my local target directory and raises an error:
# ERROR: google.resumable_media.common.InvalidResponse: ('Request failed with status code', 404,
# 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
# ERROR: google.api_core.exceptions.NotFound: 404
# GET https://storage.googleapis.com/download/storage/v1/b/ao-example/o/
# gs%3A%2F%2Fao-example%2Fmodel-export%2Ficn%2Fedgetpu_tflite-Test_Model12-2020-11-16T07%3A54%3A27.187Z%2Fedgetpu_model.tflite?alt=media:
# ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
I have granted the required permissions to the service account. What confuses me is that when I use the gsutil command, it works:
gsutil cp gs://ao-example/model-export/icn/edgetpu_model.tflite models/
Has anyone encountered the same problem? Is there an error in my code? Your help will be greatly appreciated!
I used the following code:
from google.cloud import storage
from google.cloud import automl
from google.cloud.storage import Blob
client = storage.Client(project="semesterproject-294707")
bucket_name = 'ao-example'
bucket = client.get_bucket(bucket_name)
model_dir_remote = "gs://ao-example/model-export/icn/edgetpu_tflite-Test_Model13-2020-11-18T15:03:42.620Z/edgetpu_model.tflite"
blob = Blob(model_dir_remote, bucket)
with open("models/edgetpu_model13.tflite", "wb") as file_obj:
    blob.download_to_file(file_obj)
This raises the same error and returns an empty file as well... Still, I can use the gsutil cp command to download the file...
(Edited on 06/12/2020)
The info of the generated model:
export_model_details {
  output_info {
    gcs_output_directory: "gs://ao-example/model-export/icn/edgetpu_tflite-gc14-2020-12-06T14:43:18.772911Z/"
  }
}
model_gcs_path: 'gs://ao-example/model-export/icn/edgetpu_tflite-gc14-2020-12-06T14:43:18.772911Z/edgetpu_model.tflite'
model_local_path: 'models/edgetpu_model.tflite'
It still encounters the error:
google.api_core.exceptions.NotFound: 404 GET https://storage.googleapis.com/download/storage/v1/b/ao-example/o/gs%3A%2F%2Fao-example%2Fmodel-export%2Ficn%2Fedgetpu_tflite-gc14-2020-12-06T14%3A43%3A18.772911Z%2Fedgetpu_model.tflite?alt=media: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
Still, when I use the gsutil cp command, it works:
gsutil cp model_gcs_path model_local_path
(Edited on 12/12/2020)
Soni Sol's method works! Thanks!!

I believe you should use something like this:
from google.cloud import storage
from google.cloud.storage import Blob

client = storage.Client(project="my-project")
bucket = client.get_bucket("my-bucket")
blob = Blob("file_on_gcs", bucket)
with open("/tmp/file_on_premise", "wb") as file_obj:
    blob.download_to_file(file_obj)
Blobs / Objects

When using the client libraries, the gs:// prefix is not needed; also, the name of the bucket was being passed twice. On This Question they had a similar issue and it was corrected.
Please try the following code:
from google.cloud import storage
from google.cloud import automl
from google.cloud.storage import Blob
client = storage.Client(project="semesterproject-294707")
bucket_name = 'ao-example'
bucket = client.get_bucket(bucket_name)
model_dir_remote = "model-export/icn/edgetpu_tflite-Test_Model13-2020-11-18T15:03:42.620Z/edgetpu_model.tflite"
blob = Blob(model_dir_remote, bucket)
with open("models/edgetpu_model13.tflite", "wb") as file_obj:
    blob.download_to_file(file_obj)
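If you want to keep working with the full gs:// URI that AutoML returns (like the model_gcs_path in the question), a small helper can split it into bucket and object names before calling the client library. This is just a minimal sketch of that idea; download_gcs_uri is a hypothetical helper name and the project id is copied from the question:

from urllib.parse import urlparse
from google.cloud import storage

def download_gcs_uri(gcs_uri, local_path, project="semesterproject-294707"):
    """Split a gs://bucket/object URI into bucket and blob names, then download."""
    parsed = urlparse(gcs_uri)           # netloc -> bucket name, path -> "/object/name"
    bucket_name = parsed.netloc
    blob_name = parsed.path.lstrip("/")  # blob names must not start with a slash or gs://
    client = storage.Client(project=project)
    bucket = client.get_bucket(bucket_name)
    bucket.blob(blob_name).download_to_filename(local_path)

# e.g. download_gcs_uri(model_gcs_path, "models/edgetpu_model.tflite")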

Related

"Dag Seems to be missing" error in a Cloud Composer Airflow Dynamic DAG

I have a dynamic Airflow DAG in Google Cloud Composer that gets created, is listed in the web server, and runs (backfill) without error.
However, there are issues:
- When clicking on the DAG in the web UI, it says "DAG seems to be missing"
- Can't see the Graph view/Tree view because of the error above
- Can't manually trigger the DAG because of the error above
I've been trying to fix this for a couple of days... any hint will be helpful. Thank you!
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator
from google.cloud import storage
from airflow.models import Variable
import json
args = {
    'owner': 'xxx',
    'start_date': '2020-11-5',
    'provide_context': True
}

dag = DAG(
    dag_id='dynamic',
    default_args=args
)

def return_bucket_files(bucket_name='xxxxx', **kwargs):
    client = storage.Client()
    bucket = client.get_bucket(bucket_name)
    blobs = bucket.list_blobs()
    file_list = [blob.name for blob in blobs]
    return file_list

def dynamic_gcs_to_gbq_etl(file, **kwargs):
    mapping = json.loads(Variable.get("xxxxx"))
    database = mapping[0][file]
    table = mapping[1][file]
    task = GoogleCloudStorageToBigQueryOperator(
        task_id=f'gcs_load_{file}_to_gbq',
        bucket='xxxxxxx',
        source_objects=[f'{file}'],
        destination_project_dataset_table=f'xxx.{database}.{table}',
        write_disposition="WRITE_TRUNCATE",
        autodetect=True,
        skip_leading_rows=1,
        source_format='CSV',
        dag=dag)
    return task

start_task = DummyOperator(
    task_id='start',
    dag=dag
)

end_task = DummyOperator(
    task_id='end',
    dag=dag)

push_bucket_files = PythonOperator(
    task_id="return_bucket_files",
    provide_context=True,
    python_callable=return_bucket_files,
    dag=dag)

for file in return_bucket_files():
    gcs_load_task = dynamic_gcs_to_gbq_etl(file)
    start_task >> push_bucket_files >> gcs_load_task >> end_task
This issue means that the Web Server is failing to fill in the DAG bag on its side - this problem is most likely not with your DAG specifically.
My suggestion right now would be to try restarting the web server (for example, by installing a dummy PyPI package, which forces the web server to restart).
Similar issues are reported in this post as well as here.

IBM Cloud-Watson NLC - TypeError: __init__() got an unexpected keyword argument 'iam_apikey'

I am currently trying to deploy an application from a repo (https://github.com/IBM/nlc-icd10-classifier#run-locally), but it gives me this error:
Traceback (most recent call last):
  File "app.py", line 34, in <module>
    iam_apikey=nlc_iam_apikey
TypeError: __init__() got an unexpected keyword argument 'iam_apikey'
I am on Python 3.6.8
app.py:
load_dotenv(os.path.join(os.path.dirname(__file__), ".env"))
nlc_username = os.environ.get("NATURAL_LANGUAGE_CLASSIFIER_USERNAME")
nlc_password = os.environ.get("NATURAL_LANGUAGE_CLASSIFIER_PASSWORD")
nlc_iam_apikey = os.environ.get("NATURAL_LANGUAGE_CLASSIFIER_IAM_APIKEY")
classifier_id = os.environ.get("CLASSIFIER_ID")

# Use provided credentials from environment or pull from IBM Cloud VCAP
if nlc_iam_apikey != "placeholder":
    NLC_SERVICE = NaturalLanguageClassifierV1(
        iam_apikey=nlc_iam_apikey
    )
elif nlc_username != "placeholder":
    NLC_SERVICE = NaturalLanguageClassifierV1(
        username=nlc_username,
        password=nlc_password
    )
.env:
CLASSIFIER_ID=<add_NLC_classifier_id>
#NATURAL_LANGUAGE_CLASSIFIER_USERNAME=<add_NLC_username>
#NATURAL_LANGUAGE_CLASSIFIER_PASSWORD=<add_NLC_password>
NATURAL_LANGUAGE_CLASSIFIER_IAM_APIKEY="placeholderapikeyforstackoverflolw"
It seems that you ran into an issue with the Watson SDK. Recently, with V4, they introduced a breaking change which I found in their release notes. There is a new, more abstract authentication mechanism that caters to different authentication types. You would need to slightly change the code for how NLC is initialized.
This is from the migration instructions:
For example, to pass an IAM API key:
Before
from ibm_watson import MyService

service = MyService(
    iam_apikey='{apikey}',
    url='{url}'
)
After (V4.0)
from ibm_watson import MyService
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator('{apikey}')
service = MyService(
    authenticator=authenticator
)
service.set_service_url('{url}')
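Applied to the NLC initialization from the question's app.py, it would look roughly like the sketch below. The service URL is only an example value; use the one from your own service credentials:

import os
from ibm_watson import NaturalLanguageClassifierV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# API key read from the environment, as in the question's app.py
nlc_iam_apikey = os.environ.get("NATURAL_LANGUAGE_CLASSIFIER_IAM_APIKEY")

authenticator = IAMAuthenticator(nlc_iam_apikey)
NLC_SERVICE = NaturalLanguageClassifierV1(authenticator=authenticator)
# Example URL only; copy the URL from your service credentials in IBM Cloud
NLC_SERVICE.set_service_url("https://api.us-south.natural-language-classifier.watson.cloud.ibm.com")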

Using input function with remote files in snakemake

I want to use a function to read input file paths from a dataframe and send them to my snakemake rule. I also have a helper function to select the remote from which to pull the files.
from snakemake.remote.GS import RemoteProvider as GSRemoteProvider
from snakemake.remote.SFTP import RemoteProvider as SFTPRemoteProvider
from os.path import join
import pandas as pd
configfile: "config.yaml"
units = pd.read_csv(config["units"]).set_index(["library", "unit"], drop=False)
TMP= join('data', 'tmp')
def access_remote(local_path):
    """Connects to remote as defined in config file"""
    provider = config['provider']
    if provider == 'GS':
        GS = GSRemoteProvider()
        remote_path = GS.remote(join("gs://" + config['bucket'], local_path))
    elif provider == 'SFTP':
        SFTP = SFTPRemoteProvider(
            username=config['user'],
            private_key=config['ssh_key']
        )
        remote_path = SFTP.remote(
            config['host'] + ":22" + join(base_path, local_path)
        )
    else:
        remote_path = local_path
    return remote_path

def get_fastqs(wc):
    """
    Get fastq files (units) of a particular library - sample
    combination from the unit sheet.
    """
    fqs = units.loc[
        (units.library == wc.library) &
        (units.libtype == wc.libtype),
        "fq1"
    ]
    return {
        "r1": list(map(access_remote, fqs.fq1.values)),
    }

# Combine all fastq files from the same sample / library type combination
rule combine_units:
    input: unpack(get_fastqs)
    output:
        r1 = join(TMP, "reads", "{library}_{libtype}.end1.fq.gz")
    threads: 12
    run:
        shell("cat {i1} > {o1}".format(i1=input['r1'], o1=output['r1']))
My config file contains the bucket name and provider, which are passed to the function. This works as expected when simply running snakemake.
However, I would like to use the Kubernetes integration, which requires passing the provider and bucket name on the command line. But when I run:
snakemake -n --kubernetes --default-remote-provider GS --default-remote-prefix bucket-name
I get this error:
ERROR :: MissingInputException in line 19 of Snakefile:
Missing input files for rule combine_units:
bucket-name/['bucket-name/lib1-unit1.end1.fastq.gz', 'bucket-name/lib1-unit2.end1.fastq.gz', 'bucket-name/lib1-unit3.end1.fastq.gz']
The bucket name is applied twice: once mapped correctly to each element, and once in front of the whole list, which gets converted to a string. Did I miss something? Is there a good way to work around this?

programmatically export grafana dashboard data

I have a visual in Grafana. I can manually go to the menu, click export, and export the time series data as JSON. This works great. Is there a way I can script that in Python? Is there some API I can hit that will return the JSON of a visual?
I was googling around and it looks like I can use the API to create dashboards/visuals and administer them, but I'm not sure how to use the API to export the data.
Here's a Python script to export the dashboard JSON, not the presented data. Tested on Python 2.7:
#!/usr/bin/env python
"""Grafana dashboard exporter"""
import json
import os
import requests
HOST = 'http://localhost:3000'
API_KEY = os.environ["grafana_api_key"]
DIR = 'exported-dashboards/'
def main():
    headers = {'Authorization': 'Bearer %s' % (API_KEY,)}
    response = requests.get('%s/api/search?query=&' % (HOST,), headers=headers)
    response.raise_for_status()
    dashboards = response.json()
    if not os.path.exists(DIR):
        os.makedirs(DIR)
    for d in dashboards:
        print("Saving: " + d['title'])
        response = requests.get('%s/api/dashboards/%s' % (HOST, d['uri']), headers=headers)
        data = response.json()['dashboard']
        dash = json.dumps(data, sort_keys=True, indent=4, separators=(',', ': '))
        name = data['title'].replace(' ', '_').replace('/', '_').replace(':', '').replace('[', '').replace(']', '')
        tmp = open(DIR + name + '.json', 'w')
        tmp.write(dash)
        tmp.write('\n')
        tmp.close()

if __name__ == '__main__':
    main()
Usage:
You should first create an API key in Grafana and then run:
grafana_api_key=my-key python export-dash.py
Credit: This is a simplified version of https://github.com/percona/grafana-dashboards/blob/master/misc/export-dash.py
See http://docs.grafana.org/http_api/data_source/#data-source-proxy-calls
Visit your browser console (network tab) and you will see how it works there.
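If what you need is the time series behind a panel rather than the dashboard JSON, the proxy endpoint can be scripted too. Below is a minimal sketch that assumes a Prometheus data source with internal id 1; the query, time range, and step are placeholders you would copy from the panel's request in the network tab:

import os
import requests

HOST = "http://localhost:3000"
API_KEY = os.environ["grafana_api_key"]
headers = {"Authorization": "Bearer %s" % API_KEY}

# Proxy a query_range call through Grafana to the data source with id 1 (assumed Prometheus).
params = {
    "query": "up",                    # placeholder PromQL query
    "start": "2020-11-01T00:00:00Z",  # placeholder time range
    "end": "2020-11-02T00:00:00Z",
    "step": "60s",
}
response = requests.get("%s/api/datasources/proxy/1/api/v1/query_range" % HOST,
                        headers=headers, params=params)
response.raise_for_status()
print(response.json())                # raw series as returned by the data source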
You could also use this Go client: https://github.com/netsage-project/grafana-dashboard-manager. Its purpose is not exactly what you are looking for, but it is possible to reuse its code.

Problem connecting the facebookads library to extract data from Facebook with the Marketing API using Python

I want to get info about an ad campaign. I started with this code to get the campaign name, and I get this error:
Traceback (most recent call last):
  File "C:/Users/win7/PycharmProjects/API_Facebook/dd.py", line 2, in <module>
    from facebookads.adobjects.adaccount import AdAccount
  File "C:\Users\win7\AppData\Local\Programs\Python\Python37-32\lib\site-packages\facebookads\adobjects\adaccount.py", line 1582
    def get_insights(self, fields=None, params=None, async=False, batch=None, pending=False):
                                                         ^
SyntaxError: invalid syntax
What might be the reason? And if possible, could you give code examples of how I can get more info about a campaign?
If you're using Python 3.7, rename async to async_, since async became a reserved keyword in Python 3.7.
import os, re

path = r"path facebookads"  # path to the installed facebookads package
python_files = []

# Collect every .py file under the package directory
for dirpath, dirnames, filenames in os.walk(path):
    for filename in filenames:
        if filename.endswith(".py"):
            python_files.append(os.path.join(dirpath, filename))

# Rename the reserved keyword: async -> async_ (word boundary so async_ is not renamed again)
for python_file in python_files:
    with open(python_file, "r") as f:
        text = f.read()
    revised_text = re.sub(r"\basync\b", "async_", text)
    with open(python_file, "w") as f:
        f.write(revised_text)
They updated and renamed the library; it's now facebook_ads, and the async argument was renamed to is_async.
Try updating facebookads:
$ pip install --upgrade facebookads
I'm using facebookads==2.11.4.
More info: https://pypi.org/project/facebookads/
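Once the SyntaxError is gone, a minimal sketch for pulling basic campaign info with facebookads could look like this. The app credentials, access token, ad account id, and chosen fields are placeholders you would replace with your own:

from facebookads.api import FacebookAdsApi
from facebookads.adobjects.adaccount import AdAccount
from facebookads.adobjects.campaign import Campaign

# Placeholder credentials; use your own app id, secret, and access token
FacebookAdsApi.init(app_id="<APP_ID>", app_secret="<APP_SECRET>", access_token="<ACCESS_TOKEN>")

account = AdAccount("act_<AD_ACCOUNT_ID>")
campaigns = account.get_campaigns(fields=[
    Campaign.Field.name,
    Campaign.Field.status,
    Campaign.Field.objective,
])
for campaign in campaigns:
    print(campaign[Campaign.Field.name], campaign[Campaign.Field.status])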