How to solve RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0

I trained a Densenet161 model and saved it with
torch.save(model_ft.state_dict(),'/content/drive/My Drive/Stanford40/densenet161.pth')
followed by this:
model = models.densenet161(pretrained=False,num_classes=11)
model_ft.classifier=nn.Linear(2208,11)
model.load_state_dict(torch.load('/content/drive/My Drive/Stanford40/densenet161.pth'))
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model=model.to(device)
Then, when I proceed to the model evaluation:
test_tuple=datasets.ImageFolder('/content/drive/My Drive/Stanford40/body/valid',transform=data_transforms['valid'])
test_dataloader=torch.utils.data.DataLoader(test_tuple,batch_size=1,shuffle=True)
class_names=test_tuple.classes
i=0
length=dataset_sizes['valid']
y_true=torch.zeros([length,1])
y_pred=torch.zeros([length,1])
for inputs, labels in test_dataloader:
    model_ft.eval()
    inputs=inputs.to(device)
    outputs=model_ft(inputs)
    y_true[i][0]=labels
    maxvalues,indices = torch.max(outputs, 1)
    y_pred[i][0]=indices
    i=i+1
and I face the RuntimeError quoted in the title (the screenshot is not reproduced here).
When I check whether my model was moved to the device with this code:
next(model.parameters()).is_cuda
The result is True.
How can I modify the code to get rid of this error?
My model training code can be found in my earlier question "How to solve TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first".

You have moved model to the GPU, but not model_ft. The runtime error occurs at outputs = model_ft(inputs). Could it be a case of mixed-up variable names? A sketch of the fix follows.
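A minimal sketch of the corrected evaluation, assuming the checkpoint path, data_transforms, and test_dataloader from the question; the key change is that a single variable, model, is used for loading, device placement, and the forward pass:

import torch
from torchvision import models

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Build the architecture, load the saved weights, and move everything to one device.
model = models.densenet161(pretrained=False, num_classes=11)
model.load_state_dict(torch.load('/content/drive/My Drive/Stanford40/densenet161.pth'))
model = model.to(device)
model.eval()  # set eval mode once, outside the loop

with torch.no_grad():  # no gradients needed for evaluation
    for inputs, labels in test_dataloader:
        inputs = inputs.to(device)   # inputs now live on the same device as the weights
        outputs = model(inputs)      # use model, not the stale model_ft
        maxvalues, indices = torch.max(outputs, 1)

If you prefer to keep the model_ft name from training, that works too, as long as the model you call in the loop is the one you actually moved to the device.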

Related

Displaying Data as a String in Simulink RT Display Port

My issue involves using the RS-232 Simulink RT blocks.
A model is uploaded to the target PC (xPC) and it transmits and receives data from a variable frequency drive (VFD) that controls a motor. The issue arises on the receiving end when I take data and try to send that data to a display block in my model as a string. Code would be helpful here:
disp = uint8(zeros(1,24));
display = uint8(zeros(1,length(disp)));
cmd = 0;
status = stat_lb;
%% Start-Up
% Initialization Period
if (status == 0 || status == 1)
    cmd = 0;
    msg = uint8('Start up');
    display = [msg uint8(zeros(1, length(disp) - length(msg)))];
end
...
%Multiple status cases with unique displays.
...
disp = display
So, here the cmd portion functions as expected. As noted above, I want to display the display string on a display block in my Simulink model. As you can see, though, it is of type uint8, so I need to convert it to type string; however, when I pass it through either the ascii2str Simulink block or just place it in the function call (e.g. display = ascii2str(display)) I get the following error message:
Executing the 'CheckData' command produced the following error: Invalid parameter/value pair arguments
My thought is that this has something to do with the fact that I am using MEX and this function (ascii2str) is not supported. Anyway, I am wondering if anyone knows why I receive this error and whether there is anything I can do to resolve it.
Oh, and one last thing: I can get the display to work if I just remove the ascii2str; however, the only problem with this is that the display is in uint8 form and not really helpful. So, if there is any other way that I can decode the uint8 to a string I am all ears.
Thanks!
I have found that there is no support for this feature in Simulink RT. One option is to use external functions, but for my application it was better to simply output a number and keep a table in the simulation explaining what each number means.

Output shape of mlmodel NNClassifier is Multiarray, VNClassificationObservation not working?

I need help building and deploying, in Xcode, a Core ML model generated from GCP.
The app on my iPhone opens up and I can take a picture, but the model gets stuck at 'classifying...'
This was initially due to the input image size, which I was able to fix using coremltools (I changed it to 224x224), but it looks like I need a dictionary output, while the .mlmodel that I have produces a multiarray (float32) output. Also, GCP provided two files: a label.txt file and the .mlmodel.
So, I have two questions:
How do I leverage the label.txt file during the classification/Xcode build process?
My error happens at
guard let results = request.results as? [VNClassificationObservation] else {
    fatalError("Model failed to load image")
}
Can I change my mlmodel output from multiarray to a dictionary with labels to suit VNClassificationObservation, or can VNCoreMLFeatureValueObservation be used in some way with a multiarray output? I tried it, but the app on the iPhone gets stuck.
Not sure how to use the label file in Xcode. Any help is much appreciated. I have spent a day researching online.
You will only get VNClassificationObservation when the model is a classifier. If you're getting an MLMultiArray as output, then your model is NOT a classifier according to Core ML.
It's possible to change your model into a classifier using coremltools. You need to write a Python script that (see the sketch after this list):
loads the mlmodel
assigns the layers from model._spec.neuralNetwork to model._spec.neuralNetworkClassifier
adds two outputs, one for the winning class label & one for the dictionary with the probabilities for all class labels
fills in the class labels
saves the mlmodel
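A minimal sketch of such a script, following the steps above. The file names and the softmax layer name are assumptions for illustration, and the exact spec fields can differ between coremltools versions, so treat this as a starting point rather than a drop-in solution:

import copy
import coremltools

spec = coremltools.utils.load_spec("model.mlmodel")  # path is an assumption

# Class labels, one per line, from the label.txt that GCP provided.
with open("label.txt") as f:
    labels = [line.strip() for line in f if line.strip()]

# Copy the layers and preprocessing out of neuralNetwork first; mutating
# neuralNetworkClassifier switches the spec's oneof and clears neuralNetwork.
layers = [copy.deepcopy(layer) for layer in spec.neuralNetwork.layers]
preprocessing = [copy.deepcopy(p) for p in spec.neuralNetwork.preprocessing]

clf = spec.neuralNetworkClassifier
clf.layers.extend(layers)
clf.preprocessing.extend(preprocessing)

# Fill in the class labels and point at the layer that produces the probabilities.
clf.stringClassLabels.vector.extend(labels)
clf.labelProbabilityLayerName = "softmax"  # assumption: the name of your softmax layer

# Replace the multiarray output with the two classifier outputs.
del spec.description.output[:]

label_out = spec.description.output.add()
label_out.name = "classLabel"
label_out.type.stringType.SetInParent()

probs_out = spec.description.output.add()
probs_out.name = "classLabelProbs"
probs_out.type.dictionaryType.stringKeyType.SetInParent()

spec.description.predictedFeatureName = "classLabel"
spec.description.predictedProbabilitiesName = "classLabelProbs"

coremltools.utils.save_spec(spec, "model_classifier.mlmodel")

Once the model is a proper classifier, Vision hands back VNClassificationObservation instances, so the guard in your Swift code should succeed.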

Slowdown on tensorflow convolutional network with custom parameter update

I'm trying to implement a custom parameter update on a convolutional network, but every mini-batch executes slower and slower.
I realize that there's no need to go through this trouble with a fixed learning rate, but I plan to update this later.
I call this in a loop where the feed_dict is the mini_batch.
sess.run(layered_optimizer(cost,.1,1),feed_dict = feed_dict)
where
def layered_optimizer(cost, base_rate, rate_multiplier):
    gradients = tf.gradients(cost, [*weights, *biases])
    print(gradients)
    # update parameters based on gradients: var = var - gradient * base_rate * multiplier
    for i in range(len(weights)-1):
        weights[i].assign(tf.subtract(weights[i], tf.multiply(gradients[i], base_rate * rate_multiplier)))
        biases[i].assign(tf.subtract(biases[i], tf.multiply(gradients[len(weights)+i], base_rate * rate_multiplier)))
    return(cost)
I'm not sure if this has to do with the problem, but after trying to run the code a second time I get the following errors and have to restart:
could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
What happens is that every time this gets called
gradients = tf.gradients(cost, [*weights, *biases])
new gradient ops are added to the graph, so the graph keeps growing: every sess.run has more to process and memory fills up. Build the gradient and update ops once, outside the loop, and only run them inside it; a sketch follows.
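A minimal sketch of that restructuring, in TF1 style, assuming the weights, biases, cost, sess, and feed_dict from the question. tf.gradients and the assignments are built once, and tf.group bundles the assign ops so that fetching the returned op actually executes the updates (in the original code the assign ops are created but never run, since only cost is returned):

import tensorflow as tf

def build_layered_optimizer(cost, base_rate, rate_multiplier):
    # Graph construction: gradient and update ops are created exactly once.
    gradients = tf.gradients(cost, [*weights, *biases])
    scale = base_rate * rate_multiplier
    update_ops = []
    for i in range(len(weights)):  # covers every layer, unlike range(len(weights)-1)
        update_ops.append(weights[i].assign_sub(gradients[i] * scale))
        update_ops.append(biases[i].assign_sub(gradients[len(weights) + i] * scale))
    return tf.group(*update_ops)

train_op = build_layered_optimizer(cost, .1, 1)

# Inside the training loop, only run the pre-built op:
# sess.run(train_op, feed_dict=feed_dict)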

Simulink-Simulation with parfor (Parallel Computing)

I asked a question today about parallel computing with MATLAB/Simulink. Since my earlier question is a bit messy and contains a lot of code that doesn't really belong to the problem, here is a cleaner version.
My problem is this:
I want to run a simulation in a parfor loop, while my Simulink model uses the "From Workspace" block to bring the needed data from the workspace into the simulation. For some reason it doesn't work.
My code looks as follows:
load DemoData
path = pwd;
apool = gcp('nocreate');
if isempty(apool)
    apool = parpool('local');
end
parfor k = 1 : 2
    load_system(strcat(path,'\DemoMDL'))
    set_param('DemoMDL/Mask', 'DataInput', 'DemoData')
    SimOut(k) = sim('DemoMDL')
end
delete(apool);
My Simulink model is not reproduced here; it reads the data in through the "From Workspace" block, as described above. The DemoData file is just a zeros(100,20) matrix, serving as example data.
Now, if I run the script, the following error message occurs:
Errors
Error using DemoScript (line 9)
Error evaluating parameter 'DataInput' in 'DemoMDL/Mask'
Caused by:
    Error using parallel_function>make_general_channel/channel_general (line 907)
    Error evaluating parameter 'DataInput' in 'DemoMDL/Mask'
    Error using parallel_function>make_general_channel/channel_general (line 907)
    Undefined function or variable 'DemoData'.
Do you have an idea why this happens?
The strange thing is that if I access 'DemoData' directly inside the parfor loop, it works. For example, with this code:
load DemoData
path = pwd;
apool = gcp('nocreate');
if isempty(apool)
    apool = parpool('local');
end
parfor k = 1 : 2
    load_system(strcat(path,'\DemoMDL'))
    set_param('DemoMDL/Mask', 'DataInput', 'DemoData')
    fprintf(num2str(DemoData))
end
delete(apool);
That's my output when I display the data without simulating:
>> DemoScript
00000000000000000 .....
Thanks a lot. The original question, with a lot more (unnecessary) detail, is here:
EarlierQuestion
I suspect the issue is that when MATLAB pre-processes the parfor loop to determine which variables need to be passed to the workers, it does not know what DemoData is. In your first example it is just a string, so no data gets sent over. In your second example the loop body references the variable explicitly, and hence MATLAB does pass it over.
You could try either using the Model Workspace, or perhaps just inserting the line
DemoData = DemoData;
in the parfor loop code.
Your error occurs because the workers did not have access to DemoData in the client workspace.
When running parallel simulations with Simulink, it is easier to manage workspace data if you move it to the model workspace; each worker can then access this data from its model workspace. You can load a MAT file or write MATLAB code to initialize the data in the model workspace, which you can reach through the Simulink model menu View->Model Explorer->Model Workspace.
Also see the documentation at http://www.mathworks.com/help/simulink/ug/running-parallel-simulations.html, which talks about "Resolving workspace access issues".
You can also move the line
load DemoData
to within the parfor loop. Doing this, you ensure that the data is available in each worker's base workspace, which is accessible to the model, instead of only in the client workspace.

Any reason why these instances could be misclassified?

I started off with two files training & testing.
Then using libsvm I scaled both those files to training.scale and testing.scale
Then, using grid.py (part of libsvm), I ran training.scale and received some cross-validation values:
C = 512
gamma = 0.03125
5-fold cross validation = 66.8421
Then, running svm-train with the parameters found by grid.py on training.scale, I got a new file called training.scale.model.
I then ran svm-predict, which produced a new file called testing.predict, and got an accuracy of 60.8333%.
Finally, comparing testing and testing.predict showed that there were 47/120 misclassifications.
Link to the code: https://drive.google.com/folderview?id=0BxzgP5V6RPQHekRjZXdFYW9GX0U&usp=sharing
The real question: is there any reason why these misclassifications occur?
PS. I apologise for the bad format of this question, been up for too long
I am guessing you are new to machine learning. The results you've got are perfectly reasonable.
Why do these misclassifications occur? The features you've used don't separate your classes well, and a 66% cross-validation score should have given you the hint. Even by plain hit-or-miss guessing you'd get about 50% accuracy (assuming two balanced classes), and the feature set you used only improved on that by about another 16 points. Try exploring new features.
I'm assuming your data set is clean.