Batch processing with Nvidia's TensorRT - pycuda

I converted the trained model to ONNX format and then created the TensorRT engine file from the ONNX model. I used the snippet below to do this:
import pycuda.driver as cuda
import pycuda.autoinit
import numpy as np
import tensorrt as trt

# logger to capture errors, warnings, and other information during the build and inference phases
TRT_LOGGER = trt.Logger()

def build_engine(onnx_file_path):
    # initialize TensorRT engine and parse ONNX model
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network()
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # parse ONNX
    with open(onnx_file_path, 'rb') as model:
        print('Beginning ONNX file parsing')
        parser.parse(model.read())
    print('Completed parsing of ONNX file')

    # allow TensorRT to use up to 1GB of GPU memory for tactic selection
    builder.max_workspace_size = 1 << 30
    # we have only one image in batch
    builder.max_batch_size = 1
    # use FP16 mode if possible
    if builder.platform_has_fast_fp16:
        builder.fp16_mode = True

    # generate TensorRT engine optimized for the target platform
    print('Building an engine...')
    engine = builder.build_cuda_engine(network)
    context = engine.create_execution_context()
    print("Completed creating Engine")
    return engine, context
# build the engine, then get sizes of input and output and allocate the memory
# required for the input data and for the output data
# (preprocess_image, postprocess and the torch import come from the rest of my script)
engine, context = build_engine(onnx_file_path)
for binding in engine:
    if engine.binding_is_input(binding):  # we expect only one input
        input_shape = engine.get_binding_shape(binding)
        input_size = trt.volume(input_shape) * engine.max_batch_size * np.dtype(np.float32).itemsize  # in bytes
        device_input = cuda.mem_alloc(input_size)
    else:  # and one output
        output_shape = engine.get_binding_shape(binding)
        # create page-locked memory buffers (i.e. won't be swapped to disk)
        host_output = cuda.pagelocked_empty(trt.volume(output_shape) * engine.max_batch_size, dtype=np.float32)
        device_output = cuda.mem_alloc(host_output.nbytes)
stream = cuda.Stream()

# preprocess input data
host_input = np.array(preprocess_image("turkish_coffee.jpg").numpy(), dtype=np.float32, order='C')
cuda.memcpy_htod_async(device_input, host_input, stream)

# run inference
context.execute_async(bindings=[int(device_input), int(device_output)], stream_handle=stream.handle)
cuda.memcpy_dtoh_async(host_output, device_output, stream)
stream.synchronize()

# postprocess results
output_data = torch.Tensor(host_output).reshape(engine.max_batch_size, output_shape[0])
postprocess(output_data)
The above code works correctly for a batch size of one, but I want to run it with a batch size greater than one. For this, one thing that needs to change is:
builder.max_batch_size = 1
What else do I have to change so it works correctly for a batch size greater than one? In my opinion, one thing I have to change is from sync to async, right?:
stream.synchronize()
How can I solve the problem for a batch size greater than one?
My system:
torch:1.2.0
torchvision:0.4.0
albumentations:0.4.5
onnx:1.4.1
opencv-python:4.2.0.34
cuda:10.0
ubuntu:18.04
tensorrt: 5.x/6.x
Another solution is to use optimization profiles in TRT 7.x, but I want to know how I can solve this problem with versions 5/6. Is it possible?
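For reference, a minimal sketch of the changes I believe the implicit-batch API of TensorRT 5/6 would need (BATCH_SIZE and image_paths are illustrative names, not from the code above; untested):

BATCH_SIZE = 4  # illustrative value
image_paths = ["img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg"]  # illustrative

# build time: let the builder optimize for the largest batch that will be used
builder.max_batch_size = BATCH_SIZE

# allocation: every buffer must hold BATCH_SIZE samples instead of one
input_size = trt.volume(input_shape) * BATCH_SIZE * np.dtype(np.float32).itemsize
device_input = cuda.mem_alloc(input_size)
host_output = cuda.pagelocked_empty(trt.volume(output_shape) * BATCH_SIZE, dtype=np.float32)
device_output = cuda.mem_alloc(host_output.nbytes)

# input: stack BATCH_SIZE preprocessed images into one contiguous array
batch = np.stack([preprocess_image(p).numpy() for p in image_paths], axis=0)
host_input = np.ascontiguousarray(batch, dtype=np.float32)
cuda.memcpy_htod_async(device_input, host_input, stream)

# inference: the implicit-batch execute_async takes the runtime batch size
context.execute_async(batch_size=BATCH_SIZE,
                      bindings=[int(device_input), int(device_output)],
                      stream_handle=stream.handle)
cuda.memcpy_dtoh_async(host_output, device_output, stream)
stream.synchronize()  # this can stay; it just waits until the whole batch is done

output_data = torch.Tensor(host_output).reshape(BATCH_SIZE, -1)

If that is right, the synchronize call itself would not need to become async; the batching is handled by the buffer sizes and the batch_size argument.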

Related

How can we fix this AttributeError: 'bytes' object has no attribute 'map'?

I'm trying to run this code on an AWS EMR instance to get features from images with a transfer-learning model, but I get this error: AttributeError: 'bytes' object has no attribute 'map'. The error points at these lines:
input = np.stack(content_series.map(preprocess))
preds = model.predict(input)
Here is the whole code:
def preprocess(content):
    img = Image.open(io.BytesIO(content)).resize([224, 224])
    arr = img_to_array(img)
    return preprocess_input(arr)

def featurize_series(model, content_series):
    """
    Featurize a pd.Series of raw images using the input model.
    :return: a pd.Series of image features
    """
    input = np.stack(content_series.map(preprocess))
    preds = model.predict(input)
    output = [p.flatten() for p in preds]
    return pd.Series(output)

@pandas_udf('array<float>', PandasUDFType.SCALAR_ITER)
def featurize_udf(content_series_iter):
    '''
    This method is a Scalar Iterator pandas UDF wrapping our featurization function.
    The decorator specifies that this returns a Spark DataFrame column of type ArrayType(FloatType).

    :param content_series_iter: This argument is an iterator over batches of data, where each batch
                                is a pandas Series of image data.
    '''
    # With Scalar Iterator pandas UDFs, we can load the model once and then re-use it
    # for multiple data batches. This amortizes the overhead of loading big models.
    model = model_fn()
    for content_series in content_series_iter:
        yield featurize_series(model, content_series)
Do you have any suggestions? Thanks.
I tried to change the instance config.
The error is occurring because the content_series.map method is trying to apply the preprocess function to each element of content_series, which is assumed to be a pandas Series. However, the error message indicates that content_series is a 'bytes' object, which does not have a 'map' attribute.
It appears that the content_series_iter input to the featurize_udf function is an iterator over batches of image data, where each batch is a pandas Series of image data. To resolve the error, you'll need to modify the featurize_udf function to convert the 'bytes' objects in each batch to pandas Series before applying the preprocess function:
@pandas_udf('array<float>', PandasUDFType.SCALAR_ITER)
def featurize_udf(content_series_iter):
    model = model_fn()
    for content_series in content_series_iter:
        content_series = pd.Series(content_series.tolist())
        yield featurize_series(model, content_series)
This should resolve the AttributeError and allow you to run the code successfully on the AWS EMR instance.
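In case it helps, a minimal sketch of how the UDF would then be applied (the DataFrame name images_df and its path/content columns are my assumptions, e.g. from Spark 3.x's binaryFile reader; they are not from the original post):

from pyspark.sql.functions import col

# images_df is assumed to have a binary 'content' column, e.g.
# images_df = spark.read.format("binaryFile").load("s3://bucket/images/*.jpg")
features_df = images_df.select(
    col("path"),
    featurize_udf(col("content")).alias("features")  # one array<float> per image
)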

Calculate descriptors with RDkit

I am trying to calculate all the descriptors (both 2D/3D) for a list of molecules with RDKit in Python. When I run:
MolecularDescriptorCalculator.CalcDescriptors(mol, simplelist)
it returns:
AttributeError: 'Mol' object has no attribute 'simpleList'
To calculate all the RDKit descriptors, you can use the following code:
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors
import numpy as np

descriptor_names = list(rdMolDescriptors.Properties.GetAvailableProperties())
get_descriptors = rdMolDescriptors.Properties(descriptor_names)

# Calculate descriptors using SMILES strings
def smi_to_descriptors(smile):
    mol = Chem.MolFromSmiles(smile)
    descriptors = []
    if mol:
        descriptors = np.array(get_descriptors.ComputeProperties(mol))
    return descriptors

If the SMILES are in a pandas DataFrame:
dataset['descriptors'] = dataset.SMILES.apply(smi_to_descriptors)
Looks like you are using the API slightly wrong: you need to initialize the MolecularDescriptorCalculator class first with the list of descriptors you require.
from rdkit.ML.Descriptors.MoleculeDescriptors import MolecularDescriptorCalculator

simplelist = ['TPSA']  # add the names of the required descriptors to this list
calculator = MolecularDescriptorCalculator(simplelist)
descriptors = calculator.CalcDescriptors(mol)
print(descriptors)
[Out]:
(21.259999999999998,)
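As a variation on the answer above, a short sketch (the molecule is illustrative) that feeds MolecularDescriptorCalculator every 2D descriptor name RDKit registers in Descriptors.descList:

from rdkit import Chem
from rdkit.Chem import Descriptors
from rdkit.ML.Descriptors.MoleculeDescriptors import MolecularDescriptorCalculator

# every registered 2D descriptor name, e.g. 'MolWt', 'TPSA', ...
all_2d_names = [name for name, _ in Descriptors.descList]
calculator = MolecularDescriptorCalculator(all_2d_names)

mol = Chem.MolFromSmiles('CCO')  # illustrative molecule
values = calculator.CalcDescriptors(mol)  # tuple, one value per descriptor name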

training a RNN in Pytorch

I want to have an RNN model and teach it to learn generating "ihello" from "hihell". I am new to PyTorch and am following the instructions in a video to write the code.
I have written two Python files named train.py and model.py.
This is model.py:
#----------------- model to teach the rnn hihell -> ihello
#----------------- OUR MODEL ---------------------
import torch
import torch.nn as nn
from torch import autograd

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size, batch_first=True)

    def forward(self, x, hidden):
        # Reshape input to (batch_size, sequence_length, input_size)
        x = x.view(batch_size, sequence_length, input_size)
        # Propagate input through RNN
        # Input: (batch, seq_len, input_size)
        out, hidden = self.rnn(x, hidden)
        out = out.view(-1, num_classes)
        return hidden, out

    def init_hidden(self):
        # Initialize hidden and cell states
        # (num_layers*num_directions, batch, hidden_size)
        return autograd.Variable(torch.zeros(num_layers, batch_size, hidden_size))
and this is train.py:
"""----------------------train for teach rnn to hihell to ihello--------------------------"""
#----------------- DATA PREPARATION ---------------------
#Import
import torch
import torch.nn as nn
from torch import autograd
from model import Model
import sys
idx2char=['h','i','e','l','o']
#Teach hihell->ihello
x_data=[0,1,0,2,3,3]#hihell
y_data=[1,0,2,3,3,4]#ihello
one_hot_lookup=[[1,0,0,0,0],#0
[0,1,0,0,0],#1
[0,0,1,0,0],#2
[0,0,0,1,0],#3
[0,0,0,0,1]]#4
x_one_hot=[one_hot_lookup[x] for x in x_data]
inputs=autograd.Variable(torch.Tensor(x_one_hot))
labels=autograd.Variable(torch.LongTensor(y_data))
""" ----------- Parameters Initialization------------"""
num_classes = 5
input_size = 5 # one hot size
hidden_size = 5 # output from LSTM to directly predict onr-hot
batch_size = 1 # one sequence
sequence_length = 1 # let's do one by one
num_layers = 1 # one layer RNN
"""----------------- LOSS AND TRAINING ---------------------"""
#Instantiate RNN model
model=Model()
#Set loss and optimizer function
#CrossEntropyLoss=LogSoftmax+NLLLOSS
criterion=torch.nn.CrossEntropyLoss()
optimizer=torch.optim.Adam(model.parameters(),lr=0.1)
"""----------------Train the model-------------------"""
for epoch in range(100):
optimizer.zero_grad()
loss=0
hidden=model.init_hidden()
sys.stdout.write("Predicted String:")
for input,label in zip(inputs,labels):
#print(input.size(),label.size())
hidden,output=model(input,hidden)
val,idx=output.max(1)
sys.stdout.write(idx2char[idx.data[0]])
loss+=criterion(output,label)
print(",epoch:%d,loss:%1.3f"%(epoch+1,loss.data[0]))
loss.backward()
optimizer.step()
When I run train.py, I receive this error:
self.rnn=nn.RNN(input_size=input_size,hidden_size=hidden_size,batch_first=True)
NameError: name 'input_size' is not defined
I don't know why I receive this error, because I have input_size=5 in the lines above. Could anybody help me? Thanks.
The scope of the variables defined in train.py (num_classes, input_size, ...) is train.py itself. They are only visible in this file; model.py is oblivious to them.
I suggest including these arguments in the constructor:
class Model(nn.Module):
    def __init__(self, hidden_size, input_size):
        # same as before

and then calling the Model as:
model = Model(hidden_size, input_size)
Similarly, for other variables that you defined in train.py (and want to use in model.py), you have to pass them as arguments either to their respective functions or to the constructor and store them as attributes, as in the sketch below.
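A minimal sketch of what that refactored model.py could look like (the attribute names simply mirror the globals from train.py; this is one way to wire it up, not the only one):

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes,
                 num_layers, batch_size, sequence_length):
        super(Model, self).__init__()
        # store the hyperparameters as attributes instead of relying on globals
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_classes = num_classes
        self.num_layers = num_layers
        self.batch_size = batch_size
        self.sequence_length = sequence_length
        self.rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size,
                          batch_first=True)

    def forward(self, x, hidden):
        x = x.view(self.batch_size, self.sequence_length, self.input_size)
        out, hidden = self.rnn(x, hidden)
        out = out.view(-1, self.num_classes)
        return hidden, out

    def init_hidden(self):
        return torch.zeros(self.num_layers, self.batch_size, self.hidden_size)

train.py would then create it as model = Model(input_size, hidden_size, num_classes, num_layers, batch_size, sequence_length).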

pyspark - how to cross validate several ML algorithms

I want to be able to choose the best-fitting algorithm with its best params.
How can I do it in one go, without creating separate pipelines for each algorithm and without the cross-validation checking params that are not relevant for a specific algorithm?
I.e., I want to check how logistic regression performs against random forest.
My code is:
lr = LogisticRegression().setFamily("multinomial")

# Chain indexer and tree in a Pipeline
pipeline = Pipeline(stages=[labelIndexer, labelIndexer2, assembler, lr, labelconverter])

paramGrid = ParamGridBuilder() \
    .addGrid(lr.regParam, [0.1, 0.3, 0.01]) \
    .addGrid(lr.elasticNetParam, [0.1, 0.8, 0.01]) \
    .addGrid(lr.maxIter, [10, 20, 25]) \
    .build()

crossval = CrossValidator(estimator=pipeline,
                          estimatorParamMaps=paramGrid,
                          evaluator=RegressionEvaluator(),
                          numFolds=2)  # use 3+ folds in practice

# Train model. This also runs the indexer.
model = crossval.fit(trainingData)
I've written a quick and dirty workaround in Python/PySpark. It is a bit primitive (it doesn't have a corresponding Scala class) and I think it lacks the save/load capabilities, but it might be a starting point for your case. Eventually it might become new functionality in Spark; it would be nice to have.
The idea is to have a special pipeline stage that acts as a switch between different objects and maintains a dictionary to refer to them by strings. The user can enable one or another by name. They can be Estimators, Transformers, or a mix of both; the user is responsible for keeping the pipeline coherent (doing things that make sense, at her own risk). The parameter with the name of the enabled stage can be included in the grid to be cross-validated.
from pyspark.ml.wrapper import JavaEstimator
from pyspark.ml.base import Estimator, Transformer
from pyspark.ml.param import Param, Params, TypeConverters

class PipelineStageChooser(JavaEstimator):
    selectedStage = Param(Params._dummy(), "selectedStage",
                          "key of the selected stage in the dict",
                          typeConverter=TypeConverters.toString)
    stagesDict = None
    _paramMap = {}

    def __init__(self, stagesDict, selectedStage):
        super(PipelineStageChooser, self).__init__()
        self.stagesDict = stagesDict

        if selectedStage not in self.stagesDict.keys():
            raise KeyError("selected stage {0} not found in stagesDict".format(selectedStage))

        if isinstance(self.stagesDict[selectedStage], Transformer):
            self.fittedSelectedStage = self.stagesDict[selectedStage]

        for stage in stagesDict.values():
            if not (isinstance(stage, Estimator) or isinstance(stage, Transformer)):
                raise TypeError("Cannot recognize a pipeline stage of type %s." % type(stage))

        self._set(selectedStage=selectedStage)
        self._java_obj = None

    def fit(self, dataset, params=None):
        selectedStage_str = self.getOrDefault(self.selectedStage)
        if isinstance(self.stagesDict[selectedStage_str], Estimator):
            return self.stagesDict[selectedStage_str].fit(dataset, params=params)
        elif isinstance(self.stagesDict[selectedStage_str], Transformer):
            return self.stagesDict[selectedStage_str]
Use example:
count_vectorizer = CountVectorizer()  # set params
hashing_tf = HashingTF()  # set params

chooser = PipelineStageChooser(stagesDict={"count_vectorizer": count_vectorizer,
                                           "hashing_tf": hashing_tf},
                               selectedStage="count_vectorizer")

pipeline = Pipeline(stages=[chooser])

# Test which among CountVectorizer or HashingTF works better to create features
# Could be used as well to decide between different ML algorithms
paramGrid = ParamGridBuilder() \
    .addGrid(chooser.selectedStage, ["count_vectorizer", "hashing_tf"]) \
    .build()
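To map this back to the question (logistic regression vs. random forest), a hedged sketch of how the chooser might be dropped into the asker's pipeline; the stage keys and the evaluator are my assumptions, and labelIndexer, assembler, etc. are carried over from the question:

from pyspark.ml.classification import LogisticRegression, RandomForestClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

lr = LogisticRegression().setFamily("multinomial")
rf = RandomForestClassifier()

chooser = PipelineStageChooser(stagesDict={"lr": lr, "rf": rf},
                               selectedStage="lr")
pipeline = Pipeline(stages=[labelIndexer, labelIndexer2, assembler, chooser, labelconverter])

# cross-validate over the algorithm choice itself; note that
# algorithm-specific params would still apply to only one of the stages
paramGrid = ParamGridBuilder() \
    .addGrid(chooser.selectedStage, ["lr", "rf"]) \
    .build()

crossval = CrossValidator(estimator=pipeline,
                          estimatorParamMaps=paramGrid,
                          evaluator=MulticlassClassificationEvaluator(),
                          numFolds=3)
model = crossval.fit(trainingData)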

PySpark: Error "Cannot pickle standard input" on function map

I'm trying to learn to use PySpark.
I'm using spark-2.2.0 with Python 3.
I'm facing a problem now and I can't find where it came from.
My project is to adapt an algorithm written by a data scientist so that it runs distributed. The code below is what I have to use to extract the features from images, and I have to adapt it to extract features with pyspark.
import json
import sys

# Dependencies can be installed by running:
# pip install keras tensorflow h5py pillow

# Run script as:
# ./extract-features.py images/*.jpg

from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np

def main():
    # Load model VGG16 as described in https://arxiv.org/abs/1409.1556
    # This is going to take some time...
    base_model = VGG16(weights='imagenet')
    # Model will produce the output of the 'fc2' layer which is the penultimate neural network layer
    # (see the paper above for more details)
    model = Model(input=base_model.input, output=base_model.get_layer('fc2').output)

    # For each image, extract the representation
    for image_path in sys.argv[1:]:
        features = extract_features(model, image_path)
        with open(image_path + ".json", "w") as out:
            json.dump(features, out)

def extract_features(model, image_path):
    img = image.load_img(image_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    features = model.predict(x)
    return features.tolist()[0]

if __name__ == "__main__":
    main()
I have written the beginning of the code:
rdd = sc.binaryFiles(PathImages)
base_model = VGG16(weights='imagenet')
model = Model(input=base_model.input, output=base_model.get_layer('fc2').output)
rdd2 = rdd.map(lambda x: (x[0], extract_features(model, x[0][5:])))
rdd2.collect()[0]
When I try to extract the features, there is an error:
~/Code/spark-2.2.0-bin-hadoop2.7/python/pyspark/cloudpickle.py in
save_file(self, obj)
623 return self.save_reduce(getattr, (sys,'stderr'), obj=obj)
624 if obj is sys.stdin:
--> 625 raise pickle.PicklingError("Cannot pickle standard input")
626 if hasattr(obj, 'isatty') and obj.isatty():
627 raise pickle.PicklingError("Cannot pickle files that map to tty objects")
PicklingError: Cannot pickle standard input
I tried multiple things, and here is my first result. I know that the error comes from the line below in the extract_features method:
features = model.predict(x)
and when I try to run this line outside of a map function or pyspark, it works fine.
I think the problem comes from the "model" object and its serialization with pyspark.
Maybe I'm not using a good way to distribute this with pyspark, and if you have any clue to help me, I will take it.
Thanks in advance.
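A common workaround (a sketch under the assumption that the Keras model simply cannot be pickled from the driver, not a verified fix for this exact setup) is to build the model inside each partition with mapPartitions, so it is created on the executors instead of being captured in the closure:

def extract_features_partition(records):
    # the model is built here, on the executor, so it is never pickled
    from keras.applications.vgg16 import VGG16
    from keras.models import Model
    base_model = VGG16(weights='imagenet')
    model = Model(input=base_model.input,
                  output=base_model.get_layer('fc2').output)
    for path, _content in records:
        # path[5:] strips the 'file:' prefix, as in the original snippet
        yield path, extract_features(model, path[5:])

rdd = sc.binaryFiles(PathImages)
rdd2 = rdd.mapPartitions(extract_features_partition)

This loads the weights once per partition rather than once per image, which is usually an acceptable trade-off.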