Running the blenderbot-3B model locally does not give the same results as the Inference API - chatbot

I tried the facebook/blenderbot-3B model using the Hosted Inference API and it works pretty well (https://huggingface.co/facebook/blenderbot-3B). Now I have tried to use it locally with the Python script shown below. The generated responses are much worse than those from the Inference API and do not make sense most of the time.
Is different code used for the Inference API, or did I make a mistake?
from transformers import (TFAutoModelForCausalLM, AutoTokenizer, BlenderbotTokenizer,
                          TFBlenderbotForConditionalGeneration, TFT5ForConditionalGeneration,
                          BlenderbotForConditionalGeneration)
import tensorflow as tf
import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"

chat_bots = {
    'BlenderBot': [BlenderbotTokenizer.from_pretrained("hyunwoongko/blenderbot-9B"),
                   BlenderbotForConditionalGeneration.from_pretrained("hyunwoongko/blenderbot-9B").to(device)],
}
key = 'BlenderBot'
tokenizer, model = chat_bots[key]

for step in range(100):
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt').to(device)
    if step > 0:
        bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1)
    else:
        bot_input_ids = new_user_input_ids
    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id).to(device)
    print("Bot: ", tokenizer.batch_decode(chat_history_ids, skip_special_tokens=True)[0])


Function in pytest file works only with hard coded values

I have the test_dss.py file below, which is used with pytest:
import dataikuapi
import pytest


def setup_list():
    client = dataikuapi.DSSClient("{DSS_URL}", "{APY_KEY}")
    client._session.verify = False
    project = client.get_project("{DSS_PROJECT}")
    # Check that there is at least one scenario TEST_XXXXX & that all test scenarios pass
    scenarios = project.list_scenarios()
    scenarios_filter = [obj for obj in scenarios if obj["name"].startswith("TEST")]
    return scenarios_filter


def test_check_scenario_exist():
    assert len(setup_list()) > 0, "You need at least one test scenario (name starts with 'TEST_')"


@pytest.mark.parametrize("scenario", setup_list())
def test_scenario_run(scenario, params):
    client = dataikuapi.DSSClient(params['host'], params['api'])
    client._session.verify = False
    project = client.get_project(params['project'])
    scenario_id = scenario["id"]
    print("Executing scenario ", scenario["name"])
    scenario_result = project.get_scenario(scenario_id).run_and_wait()
    assert scenario_result.get_details()["scenarioRun"]["result"]["outcome"] == "SUCCESS", \
        "test " + scenario["name"] + " failed"
My issue is with the setup_list function, which only works with hard-coded values for {DSS_URL}, {APY_KEY}, and {DSS_PROJECT}. I'm not able to use params or another method the way I do in test_scenario_run.
Any idea how I can pass params to this function as well?
The parameters in the mark.parametrize marker are read at load time, when the information about the config parameters is not yet available. You therefore have to parametrize the test at runtime, when you have access to the configuration.
This can be done in pytest_generate_tests (which can live in your test module):
@pytest.hookimpl
def pytest_generate_tests(metafunc):
    if "scenario" in metafunc.fixturenames:
        host = metafunc.config.getoption('--host')
        api = metafunc.config.getoption('--api')
        project = metafunc.config.getoption('--project')
        metafunc.parametrize("scenario", setup_list(host, api, project))
This implies that your setup_list function takes these parameters:
def setup_list(host, api, project):
    client = dataikuapi.DSSClient(host, api)
    client._session.verify = False
    project = client.get_project(project)
    ...
And your test just looks like this (without the parametrize marker, as the parametrization is now done in pytest_generate_tests):
def test_scenario_run(scenario, params):
    scenario_id = scenario["id"]
    ...
The parametrization is now done at run-time, so it behaves the same as if you had placed a parametrize marker in the test.
And the other test, which exercises setup_list, now also has to use the params fixture to get the needed arguments:
def test_check_scenario_exist(params):
    assert len(setup_list(params["host"], params["api"], params["project"])) > 0, \
        "You need at least ..."

PowerPoint REST API?

I'm looking for a REST API for generating PowerPoint slides... does anyone have any suggestions?
I realize this wouldn't be too difficult to implement, but we're trying to avoid building non-core functionality we could get from a third party.
Basically, we want to send a JSON blob and get back the generated slide.
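To make the requirement concrete, the flow we have in mind looks roughly like this (the endpoint and payload are purely hypothetical and only illustrate the JSON-in, slide-out idea):

import requests

# Hypothetical third-party endpoint: POST a JSON description, get a .pptx back.
resp = requests.post(
    "https://slides.example.com/v1/slides",
    json={"title": "Q3 results", "bullets": ["Revenue up 12%", "Churn down 3%"]},
)
with open("generated.pptx", "wb") as f:
    f.write(resp.content)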
I think this is a related question:
Is there an API to make a MS Office 365 Powerpoint presentation programmatically?
Thanks for your help!
You can try using Aspose.Slides Cloud for your purposes. This product provides REST-based APIs for many programming languages (C#, Java, PHP, Ruby, Python, Node.js, C++, Go, Perl, Swift), platforms and environments. With this product, you can use both Aspose file storages and third-party storages. Docker containers can also be used for working with Aspose.Slides Cloud.
This product does not support adding content to presentations from JSON, but it provides many features for generating content. It also provides features for adding content to presentations from HTML and PDF documents. The following Python sample code shows, for example, how to add a WordArt object to a new presentation.
import asposeslidescloud
from asposeslidescloud.apis.slides_api import SlidesApi
from asposeslidescloud.models.slide_export_format import SlideExportFormat
from asposeslidescloud.models.shape import Shape
from asposeslidescloud.models.fill_format import FillFormat
from asposeslidescloud.models.line_format import LineFormat
from asposeslidescloud.models.text_frame_format import TextFrameFormat
from asposeslidescloud.models.three_d_format import ThreeDFormat
from asposeslidescloud.models.shape_bevel import ShapeBevel
from asposeslidescloud.models.light_rig import LightRig
from asposeslidescloud.models.camera import Camera
slides_api = SlidesApi(None, "my_client_id", "my_client_secret")
file_name = "example.pptx"
slide_index = 1
dto = Shape()
dto.shape_type = "Rectangle"
dto.x = 100
dto.y = 100
dto.height = 100
dto.width = 200
dto.text = "Sample text"
dto.fill_format = FillFormat()
dto.fill_format.type = "NoFill"
dto.line_format = LineFormat()
dto.line_format.fill_format = FillFormat()
dto.line_format.fill_format.type = "NoFill"
text_frame_format = TextFrameFormat()
text_frame_format.transform = "ArchUpPour"
three_d_format = ThreeDFormat()
bevel_bottom = ShapeBevel()
bevel_bottom.bevel_type = "Circle"
bevel_bottom.height = 3.5
bevel_bottom.width = 3.5
three_d_format.bevel_bottom = bevel_bottom
bevel_top = ShapeBevel()
bevel_top.bevel_type = "Circle"
bevel_top.height = 4
bevel_top.width = 4
three_d_format.bevel_top = bevel_top
three_d_format.extrusion_color = "#FF008000"
three_d_format.extrusion_height = 6
three_d_format.contour_color = "#FF25353D"
three_d_format.contour_width = 1.5
three_d_format.depth = 3
three_d_format.material = "Plastic"
light_rig = LightRig()
light_rig.light_type = "Balanced"
light_rig.direction = "Top"
light_rig.x_rotation = 0
light_rig.y_rotation = 0
light_rig.z_rotation = 40
three_d_format.light_rig = light_rig
camera = Camera()
camera.camera_type = "PerspectiveContrastingRightFacing"
three_d_format.camera = camera
text_frame_format.three_d_format = three_d_format
dto.text_frame_format = text_frame_format
# Create the WordArt object and download the presentation.
slides_api.create_presentation(file_name)
slides_api.create_shape(file_name, slide_index, dto)
file_path = slides_api.download_file(file_name)
The resulting WordArt object appears in the output presentation.
This is a paid product, but you can make 150 free API calls per month for evaluating the API and processing presentations.
I work as a Support Developer at Aspose.

Gcloud ai-platform, can't create model with own prediction-class

I'm trying to follow the AI Platform tutorial to upload a model and a custom prediction routine, but one part fails and I don't understand why.
My prediction class is the same as in their tutorial:
%%writefile predictor.py
import os
import pickle

import numpy as np
from sklearn.datasets import load_iris
from sklearn.externals import joblib


class MyPredictor(object):
    def __init__(self, model, preprocessor):
        self._model = model
        self._preprocessor = preprocessor
        self._class_names = load_iris().target_names

    def predict(self, instances, **kwargs):
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        if kwargs.get('probabilities'):
            probabilities = self._model.predict_proba(preprocessed_inputs)
            return probabilities.tolist()
        else:
            outputs = self._model.predict(preprocessed_inputs)
            return [self._class_names[class_num] for class_num in outputs]

    @classmethod
    def from_path(cls, model_dir):
        model_path = os.path.join(model_dir, 'model.joblib')
        model = joblib.load(model_path)
        preprocessor_path = os.path.join(model_dir, 'preprocessor.pkl')
        with open(preprocessor_path, 'rb') as f:
            preprocessor = pickle.load(f)
        return cls(model, preprocessor)
The command I use to create my model version in the cloud is:
! gcloud beta ai-platform versions create {VERSION_NAME} \
--model {MODEL_NAME} \
--runtime-version 1.13 \
--python-version 3.5 \
--origin gs://{BUCKET_NAME}/custom_prediction_routine_tutorial/model/ \
--package-uris gs://{BUCKET_NAME}/custom_prediction_routine_tutorial/my_custom_code-0.1.tar.gz \
--prediction-class predictor.MyPredictor
But I end up with such an odd error:
ERROR: (gcloud.beta.ai-platform.versions.create) Bad model detected with error: "Failed to load model: Unexpected error when loading the model: 'ascii' codec can't decode byte 0xf9 in position 2: ordinal not in range(128) (Error code: 0)"
The thing is that when I run the same command without
--prediction-class predictor.MyPredictor
it works fine.
Does anyone know the reason for this? I thought model.joblib might have an encoding problem, but when I load it myself there is nothing wrong.
I've found the solution.
In the tutorial, they use pickle to save the preprocessor object and joblib to save the model.
You need to use joblib to save both, and then upload them to Google Storage.
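A minimal sketch of that change, assuming the tutorial's training code has already produced the model and preprocessor objects and keeping the same file names as in the predictor above:

from sklearn.externals import joblib  # same joblib import the predictor uses

# Save both artifacts with joblib instead of mixing joblib and pickle.
joblib.dump(model, 'model.joblib')
joblib.dump(preprocessor, 'preprocessor.pkl')

# MyPredictor.from_path should then load the preprocessor with joblib as well:
#     preprocessor = joblib.load(preprocessor_path)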

Deploying Keras model to Google Cloud ML for serving predictions

I need to understand how to deploy models on Google Cloud ML. My first task is to deploy a very simple text classifier on the service. I do it in the following steps (this could perhaps be shortened to fewer steps; if so, feel free to let me know):
Define the model using Keras and export to YAML
Load up YAML and export as a Tensorflow SavedModel
Upload model to Google Cloud Storage
Deploy model from storage to Google Cloud ML
Set the uploaded model version as the default on the models website.
Run model with a sample input
I've finally made steps 1-5 work, but now I get the strange error below when running the model. Can anyone help? Details on the steps are below. Hopefully, this can also help others who are stuck on one of the previous steps. My model works fine locally.
I've seen Deploying Keras Models via Google Cloud ML and Export a basic Tensorflow model to Google Cloud ML, but they seem to be stuck on other steps of the process.
Error
Prediction failed: Exception during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="In[0] is not a matrix
[[Node: MatMul = MatMul[T=DT_FLOAT, _output_shapes=[[-1,64]], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Mean, softmax_W/read)]]")
Step 1
# import necessary classes from Keras..
model_input = Input(shape=(maxlen,), dtype='int32')
embed = Embedding(input_dim=nb_tokens,
                  output_dim=256,
                  mask_zero=False,
                  input_length=maxlen,
                  name='embedding')
x = embed(model_input)
x = GlobalAveragePooling1D()(x)
outputs = [Dense(nb_classes, activation='softmax', name='softmax')(x)]
model = Model(input=[model_input], output=outputs, name="fasttext")
# export to YAML..
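The elided YAML export is essentially this (a sketch using the old Keras model.to_yaml() API; the file name is an assumption, and weights are not exported here since Step 2 does not load them):

# Serialize only the architecture to YAML; Step 2 rebuilds the model from it.
yaml_string = model.to_yaml()
with open('fasttext.yaml', 'w') as f:
    f.write(yaml_string)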
Step 2
from __future__ import print_function

import sys
import os

import tensorflow as tf
from tensorflow.contrib.session_bundle import exporter
import keras
from keras import backend as K
from keras.models import model_from_config, model_from_yaml
from optparse import OptionParser

EXPORT_VERSION = 1  # for us to keep track of different model versions (integer)


def export_model(model_def, model_weights, export_path):
    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)
        K.set_learning_phase(0)  # all new operations will be in test mode from now on

        yaml_file = open(model_def, 'r')
        yaml_string = yaml_file.read()
        yaml_file.close()
        model = model_from_yaml(yaml_string)

        # force initialization
        model.compile(loss='categorical_crossentropy',
                      optimizer='adam')
        Wsave = model.get_weights()
        model.set_weights(Wsave)
        # weights are not loaded as I'm just testing, not really deploying
        # model.load_weights(model_weights)

        print(model.input)
        print(model.output)

        pred_node_names = output_node_names = 'Softmax:0'
        num_output = 1

        export_path_base = export_path
        export_path = os.path.join(
            tf.compat.as_bytes(export_path_base),
            tf.compat.as_bytes('initial'))
        builder = tf.saved_model.builder.SavedModelBuilder(export_path)

        # Build the signature_def_map.
        x = model.input
        y = model.output

        values, indices = tf.nn.top_k(y, 5)
        table = tf.contrib.lookup.index_to_string_table_from_tensor(tf.constant([str(i) for i in xrange(5)]))
        prediction_classes = table.lookup(tf.to_int64(indices))

        classification_inputs = tf.saved_model.utils.build_tensor_info(model.input)
        classification_outputs_classes = tf.saved_model.utils.build_tensor_info(prediction_classes)
        classification_outputs_scores = tf.saved_model.utils.build_tensor_info(values)

        classification_signature = (
            tf.saved_model.signature_def_utils.build_signature_def(
                inputs={tf.saved_model.signature_constants.CLASSIFY_INPUTS: classification_inputs},
                outputs={tf.saved_model.signature_constants.CLASSIFY_OUTPUT_CLASSES: classification_outputs_classes,
                         tf.saved_model.signature_constants.CLASSIFY_OUTPUT_SCORES: classification_outputs_scores},
                method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME))

        tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
        tensor_info_y = tf.saved_model.utils.build_tensor_info(y)

        prediction_signature = (tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'images': tensor_info_x},
            outputs={'scores': tensor_info_y},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

        legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
        builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map={'predict_images': prediction_signature,
                               tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: classification_signature},
            legacy_init_op=legacy_init_op)

        builder.save()
        print('Done exporting!')
        raise SystemExit


if __name__ == '__main__':
    usage = "usage: %prog [options] arg"
    parser = OptionParser(usage)
    (options, args) = parser.parse_args()
    if len(args) < 3:
        raise ValueError("Too few arguments!")

    model_def = args[0]
    model_weights = args[1]
    export_path = args[2]
    export_model(model_def, model_weights, export_path)
Step 3
gsutil cp -r fasttext_cloud/ gs://quiet-notch-xyz.appspot.com
Step 4
from __future__ import print_function

from oauth2client.client import GoogleCredentials
from googleapiclient import discovery
from googleapiclient import errors
import time

projectID = 'projects/{}'.format('quiet-notch-xyz')
modelName = 'fasttext'
modelID = '{}/models/{}'.format(projectID, modelName)
versionName = 'Initial'
versionDescription = 'Initial release.'
trainedModelLocation = 'gs://quiet-notch-xyz.appspot.com/fasttext/'

credentials = GoogleCredentials.get_application_default()
ml = discovery.build('ml', 'v1', credentials=credentials)

# Create a dictionary with the fields from the request body.
requestDict = {'name': modelName, 'description': 'Online predictions.'}

# Create a request to call projects.models.create.
request = ml.projects().models().create(parent=projectID, body=requestDict)

# Make the call.
try:
    response = request.execute()
except errors.HttpError as err:
    # Something went wrong, print out some information.
    print('There was an error creating the model.' +
          ' Check the details:')
    print(err._get_reason())
    # Clear the response for next time.
    response = None
    raise

time.sleep(10)

requestDict = {'name': versionName,
               'description': versionDescription,
               'deploymentUri': trainedModelLocation}

# Create a request to call projects.models.versions.create
request = ml.projects().models().versions().create(parent=modelID,
                                                   body=requestDict)

# Make the call.
try:
    print("Creating model setup..", end=' ')
    response = request.execute()
    # Get the operation name.
    operationID = response['name']
    print('Done.')
except errors.HttpError as err:
    # Something went wrong, print out some information.
    print('There was an error creating the version.' +
          ' Check the details:')
    print(err._get_reason())
    raise

done = False
request = ml.projects().operations().get(name=operationID)
print("Adding model from storage..", end=' ')

while not done:
    response = None

    # Wait for 10 seconds.
    time.sleep(10)

    # Make the next call.
    try:
        response = request.execute()
        # Check for finish.
        done = True  # response.get('done', False)
    except errors.HttpError as err:
        # Something went wrong, print out some information.
        print('There was an error getting the operation.' +
              ' Check the details:')
        print(err._get_reason())
        done = True
        raise

print("Done.")
Step 5
Use website.
Step 6
def predict_json(instances, project='quiet-notch-xyz', model='fasttext', version=None):
    """Send json data to a deployed model for prediction.

    Args:
        project (str): project where the Cloud ML Engine Model is deployed.
        model (str): model name.
        instances ([Mapping[str: Any]]): Keys should be the names of Tensors
            your deployed model expects as inputs. Values should be datatypes
            convertible to Tensors, or (potentially nested) lists of datatypes
            convertible to tensors.
        version: str, version of the model to target.
    Returns:
        Mapping[str: any]: dictionary of prediction results defined by the
            model.
    """
    # Create the ML Engine service object.
    # To authenticate set the environment variable
    # GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file>
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)

    if version is not None:
        name += '/versions/{}'.format(version)

    response = service.projects().predict(
        name=name,
        body={'instances': instances}
    ).execute()

    if 'error' in response:
        raise RuntimeError(response['error'])

    return response['predictions']
Then run the function with a test input: predict_json({'inputs': [[18, 87, 13, 589, 0]]})
There is now a sample demonstrating the use of Keras on CloudML engine, including prediction. You can find the sample here:
https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census/keras
I would suggest comparing your code to that code.
Some additional suggestions that will still be relevant:
CloudML Engine currently only supports using a single signature (the default signature). Looking at your code, I think prediction_signature is more likely to lead to success, but you haven't made that the default signature. I suggest the following:
builder.add_meta_graph_and_variables(
    sess, [tf.saved_model.tag_constants.SERVING],
    signature_def_map={tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature},
    legacy_init_op=legacy_init_op)
If you are deploying to the service, then you would invoke prediction like so:
predict_json({'images':[[18, 87, 13, 589, 0]]})
If you are testing locally using gcloud ml-engine local predict --json-instances, the input data is slightly different (it matches that of the batch prediction service). Each newline-separated line looks like this (showing a file with two lines):
{"images": [[18, 87, 13, 589, 0]]}
{"images": [[21, 85, 13, 100, 1]]}
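A full local invocation would then look roughly like this (the SavedModel directory and instances file name are assumptions):

gcloud ml-engine local predict \
    --model-dir=fasttext_cloud/initial \
    --json-instances=instances.json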
I don't actually know enough about the shape of model.x to ensure the data being sent is correct for your model.
By way of explanation, it may be insightful to consider the difference between the Classification and Prediction methods in SavedModel. One difference is that tensorflow_serving is based on gRPC, which is strongly typed, so Classification provides a strongly typed signature that most classifiers can use; you can then reuse the same client with any classifier.
That's not overly useful when using JSON since JSON isn't strongly typed.
One other difference is that, when using tensorflow_serving, Prediction accepts column-based inputs (a map from feature name to every value for that feature in the whole batch) whereas Classification accepts row based inputs (each input instance/example is a row).
CloudML abstracts that away a bit and always requires row-based inputs (a list of instances). Even though we only officially support Prediction, Classification should work as well.
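To make that concrete, here is a tiny illustration of the two input layouts for a batch of two instances (the feature name is borrowed from the signature above):

# Column-based (tensorflow_serving Prediction): one entry per feature,
# holding that feature's values for the whole batch.
column_based = {"images": [[18, 87, 13, 589, 0], [21, 85, 13, 100, 1]]}

# Row-based (Classification, and what CloudML always expects): a list of
# instances, each carrying its own feature values.
row_based = {"instances": [{"images": [18, 87, 13, 589, 0]},
                           {"images": [21, 85, 13, 100, 1]}]}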

How to build a REST hierarchy using spyne

I am trying to build a REST web service using spyne. So far I have been able to use ComplexModel to represent my resources. Something very basic, like this (borrowed from the examples):
class Meta(ComplexModel):
    version = Unicode
    description = Unicode


class ExampleService(ServiceBase):
    @srpc(_returns=Meta)
    def get_meta():
        m = Meta()
        m.version = "2.0"
        m.description = "Meta complex class example"
        return m


application = Application([ExampleService],
                          tns='sur.factory.webservices',
                          in_protocol=HttpRpc(validator='soft'),
                          out_protocol=JsonDocument()
                          )

if __name__ == '__main__':
    wsgi_app = WsgiApplication(application)
    server = make_server('0.0.0.0', 8000, wsgi_app)
    server.serve_forever()
To call it, I use curl -v "http://example.com:8000/get_meta" and I get what I expect.
But what if I would like to access a hierarchy of resources, like http://example.com:8000/resourceA/get_meta?
Thanks for your time!
Two options: Static and dynamic. Here's the static one:
from spyne.util.wsgi_wrapper import WsgiMounter

app1 = Application([SomeService, ...
app2 = Application([SomeOtherService, ...

wsgi_app = WsgiMounter({
    'resourceA': app1,
    'resourceB': app2,
})
This works today. Note that you can stack WsgiMounters.
As for the dynamic one, you should use HttpPattern(). I consider this still experimental as I don't like the implementation, but this works with 2.10.x, werkzeug, pyparsing<2 and WsgiApplication:
class ExampleService(ServiceBase):
    @rpc(Unicode, _returns=Meta, _patterns=[HttpPattern("/<resource>/get_meta")])
    def get_meta(ctx, resource):
        m = Meta()
        m.version = "2.0"
        m.description = "Meta complex class example with resource %s" % resource
        return m
Don't forget to turn on validation and put some restrictions on the resource type to prevent DoS attacks, TypeErrors, and the like. I'd do:
ResourceType = Unicode(24, min_len=3, nullable=False,
                       pattern="[a-zA-Z0-9]+", type_name="ResourceType")
Note that you can also match http verbs with HttpPattern. e.g.
HttpPattern("/<resource>/get_meta", verb='GET')
or
HttpPattern("/<resource>/get_meta", verb='(PUT|PATCH)')
Don't use host matching; as of 2.10, it's broken.
Also, as this bit of Spyne is marked as experimental, its API can change at any time.
I hope this helps.