We are trying to use the TensorFlow Face Mesh model within our iOS app. Model details: https://drive.google.com/file/d/1VFC_wIpw4O7xBOiTgUldl79d9LA-LsnA/view.
I followed the TF official tutorial for setting up the model: https://firebase.google.com/docs/ml-kit/ios/use-custom-models and also printed the model's input and output using the Python script in the tutorial and got this:
INPUT
[ 1 192 192 3]
<class 'numpy.float32'>
OUTPUT
[ 1 1 1 1404]
<class 'numpy.float32'>
At this point, I'm pretty lost trying to understand what those numbers mean and how to pass the input image and get the output face mesh points using the model Interpreter. Here's my Swift code so far:
let coreMLDelegate = CoreMLDelegate()
var interpreter: Interpreter

// Core ML delegate will only be created for devices with Neural Engine
if coreMLDelegate != nil {
    interpreter = try Interpreter(modelPath: modelPath,
                                  delegates: [coreMLDelegate!])
} else {
    interpreter = try Interpreter(modelPath: modelPath)
}
Any help will be highly appreciated!
What those numbers mean completely depends on the model you're using. It's unrelated to both TensorFlow and Core ML.
The output is a 1x1x1x1404 tensor, which basically means you get a list of 1404 numbers. How to interpret those numbers depends on what the model was designed to do.
If you didn't design the model yourself, you'll have to find documentation for it.
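To connect the shapes to code: [1 192 192 3] means the interpreter expects a single 192x192 RGB image as float32 values, and [1 1 1 1404] means you get back 1404 floats, which for this particular Face Mesh model is presumably 468 landmarks with 3 coordinates each, but verify that against the model's own documentation. A minimal sketch of feeding the TensorFlow Lite Swift interpreter might look like this; the rgbData(from:width:height:) helper is purely hypothetical and stands for whatever resizing/normalization your preprocessing does:

import TensorFlowLite
import UIKit

func runFaceMesh(on image: UIImage, with interpreter: Interpreter) throws -> [Float32] {
    try interpreter.allocateTensors()

    // Hypothetical helper: resize the photo to 192x192, drop alpha, scale pixels
    // to the range the model expects, and return the bytes of a [1, 192, 192, 3]
    // Float32 buffer.
    let inputData: Data = rgbData(from: image, width: 192, height: 192)

    // Copy the image bytes into the first input tensor and run inference.
    try interpreter.copy(inputData, toInputAt: 0)
    try interpreter.invoke()

    // Read the [1, 1, 1, 1404] output back as a flat Float32 array.
    let outputTensor = try interpreter.output(at: 0)
    return outputTensor.data.withUnsafeBytes { Array($0.bindMemory(to: Float32.self)) }
}

If the 468x3 interpretation holds, the first landmark would be the first three values of the returned array, and so on.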
I have a simple model in OpenModelica. I generated C code from it (using the FMU export functionality) and I want to adjust things so I can make a parametrised study of one parameter and print the output. In other words, when I run the executable I want to pass it an argument containing the value of a certain parameter, and print the value of some output of the model to a file.
Is that possible?
Yes, that is possible.
You mentioned FMU export, so I'll stick to that; it is probably the easiest way.
The binaries inside an FMU are compiled libraries that contain everything needed to simulate your exported model.
These libraries are usually handled by an FMU importing tool, e.g. OMSimulator (part of OpenModelica) or FMPy.
I wouldn't recommend changing the generated C code directly. It is difficult and you don't need to do it.
In your FMU import tool you can usually change values of variables and parameters via the fmi2SetXXX functions before running a simulation. Most tools wrap that interface call into some function to change values; have a look at the documentation of your tool.
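For example, FMPy wraps that in the start_values argument of its simulate_fmu function, so a single run with a modified parameter is one call. A sketch with placeholder FMU and variable names; adjust them to your model:

from fmpy import simulate_fmu

# Placeholder file and variable names; replace with your FMU and parameter.
result = simulate_fmu('model.fmu',
                      start_values={'a': 2.0},  # parameter value for this run
                      output=['x'])             # variables to record
print(result['time'], result['x'])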
Now let's look at a complete example. I'm using OpenModelica to export helloWorld and OMSimulator from Python, but check your tool and use whatever you prefer.
model helloWorld
  parameter Real a = 1;
  Real x(start=1, fixed=true);
equation
  der(x) = a*x;
end helloWorld;
Export as 2.0 Model Exchange FMU in OMEdit with File->Export->FMU or right-click on your model and Export->FMU.
You can find your export settings in Tools->Options->FMI.
Now I have helloWorld.fmu and can simulate it with OMSimulator. I want to change the value of parameter helloWorld.a to 2.0 and use setReal to do so.
>>> from OMSimulator import OMSimulator
>>> oms = OMSimulator()
>>> oms.newModel("helloWorld")
>>> oms.addSystem("helloWorld.root", oms.system_sc)
>>> oms.addSubModel("helloWorld.root.fmu", "helloWorld.fmu")
>>> oms.instantiate("helloWorld")
>>> oms.setReal("helloWorld.root.fmu.a", 2.0)
>>> oms.initialize("helloWorld")
info: maximum step size for 'helloWorld.root': 0.001000
info: Result file: helloWorld_res.mat (bufferSize=10)
>>> oms.simulate("helloWorld")
>>> oms.terminate("helloWorld")
info: Final Statistics for 'helloWorld.root':
NumSteps = 1001 NumRhsEvals = 1002 NumLinSolvSetups = 51
NumNonlinSolvIters = 1001 NumNonlinSolvConvFails = 0 NumErrTestFails = 0
>>> oms.delete("helloWorld")
Now add a loop to this to iterate over different values of your parameter and do whatever post-processing you want.
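Wrapped in a loop, a sweep over several values of a could look roughly like this (same calls as the session above; setResultFile is used here to give each run its own result file, so check that your OMSimulator version provides it):

from OMSimulator import OMSimulator

oms = OMSimulator()

for a in [0.5, 1.0, 2.0, 4.0]:
    oms.newModel("helloWorld")
    oms.addSystem("helloWorld.root", oms.system_sc)
    oms.addSubModel("helloWorld.root.fmu", "helloWorld.fmu")
    # One result file per parameter value
    oms.setResultFile("helloWorld", "helloWorld_a_{}.mat".format(a))
    oms.instantiate("helloWorld")
    oms.setReal("helloWorld.root.fmu.a", a)   # parameter value for this run
    oms.initialize("helloWorld")
    oms.simulate("helloWorld")
    oms.terminate("helloWorld")
    oms.delete("helloWorld")

Each iteration then leaves behind its own .mat file that you can post-process however you like.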
I am working on a project where I am using Ionic and TensorFlow for machine learning. I have converted my TensorFlow model to a TensorFlow.js model and put the model.json file and the shard files in the assets folder of my Ionic app. I want to use Capacitor to access the camera and allow users to take photos. The photos will then be passed to the TensorFlow.js model in assets to get and display a prediction for that user.
Here is my typescript code:
import { Component, OnInit, ViewChild, ElementRef, Renderer2 } from '@angular/core';
import { Plugins, CameraResultType, CameraSource } from '@capacitor/core';
import { DomSanitizer, SafeResourceUrl } from '@angular/platform-browser';
import { Platform } from '@ionic/angular';
import * as tf from '@tensorflow/tfjs';
import { rendererTypeName } from '@angular/compiler';
import { Base64 } from '@ionic-native/base64/ngx';
import { defineCustomElements } from '@ionic/pwa-elements/loader';

const { Camera } = Plugins;

@Component({
  selector: 'app-predict',
  templateUrl: './predict.page.html',
  styleUrls: ['./predict.page.scss'],
})
export class PredictPage {
  linearModel: tf.Sequential;
  prediction: any;
  InputTaken: any;
  ctx: CanvasRenderingContext2D;
  pos = { x: 0, y: 0 };
  canvasElement: any;
  photo: SafeResourceUrl;
  model: tf.LayersModel;

  constructor(public el: ElementRef, public renderer: Renderer2, public platform: Platform, private base64: Base64,
              private sanitizer: DomSanitizer) {}

  async takePicture() {
    const image = await Camera.getPhoto({
      quality: 90,
      allowEditing: true,
      resultType: CameraResultType.DataUrl,
      source: CameraSource.Camera });
    const model = await tf.loadLayersModel('src/app/assets/model.json');
    this.photo = this.sanitizer.bypassSecurityTrustResourceUrl(image.base64String);
    defineCustomElements(window);
    const pred = await tf.tidy(() => {
      // Make and format the predictions
      const output = this.model.predict((this.photo)) as any;
      // Save predictions on the component
      this.prediction = Array.from(output.dataSync());
    });
  }
}
In this code, I have imported the necessary tools. Then I have my constructor and a takePicture() function. In takePicture(), I have included the functionality for the user to take a picture. However, I am having trouble passing the picture to the TensorFlow.js model to get a prediction. I pass the picture to the model in this line of code:
const output = this.model.predict((this.photo)) as any;
However, I am getting an error stating that:
Argument of type 'SafeResourceUrl' is not assignable to parameter of type 'Tensor | Tensor[]'.\n Type 'SafeResourceUrl' is missing the following properties from type 'Tensor[]': length, pop, push, concat, and 26 more.
It would be appreciated if I could receive some guidance regarding this topic.
The model is expecting you to pass in a Tensor input, but you're passing it an image format that isn't part of the tfjs ecosystem. You should first convert this.photo to a Tensor or, perhaps easier, convert image.base64String to a tensor.
If you are running under Node, try this code:
// Move the base64 image into a buffer
const b = Buffer.from(image.base64String, 'base64')
// Decode the buffer into a tensor
const imageTensor = tf.node.decodeImage(b)
// Compute the output
const output = this.model.predict(imageTensor) as any;
Other conversion solutions here: convert base64 image to tensor
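Since an Ionic/Capacitor app runs in a webview rather than Node, a browser-side conversion is probably the better fit here. A rough sketch, assuming the model expects a 224x224 RGB input (adjust the size and normalization to whatever your model was trained on):

import * as tf from '@tensorflow/tfjs';

async function dataUrlToTensor(dataUrl: string): Promise<tf.Tensor4D> {
  // Load the data URL into an HTMLImageElement
  const img = new Image();
  img.src = dataUrl;
  await new Promise<void>((resolve, reject) => {
    img.onload = () => resolve();
    img.onerror = () => reject(new Error('could not load image'));
  });

  return tf.tidy(() => {
    const pixels = tf.browser.fromPixels(img);                      // [h, w, 3]
    const resized = tf.image.resizeBilinear(pixels, [224, 224]);    // match the model's input size
    return resized.toFloat().div(255).expandDims(0) as tf.Tensor4D; // [1, 224, 224, 3]
  });
}

Inside takePicture() you could then do something like:

const input = await dataUrlToTensor(image.dataUrl); // dataUrl is what CameraResultType.DataUrl returns
const output = this.model.predict(input) as tf.Tensor;
this.prediction = Array.from(output.dataSync());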
I have generated an LSTM model for audio classification using Keras with TensorFlow as the backend. Upon conversion to a .mlmodel using coremltools I am running into issues, as you can see here. The dimensions are very different from what is expected.
I used this as my base in Xcode, in Swift.
This snippet in particular is what I believe is giving me trouble:
do {
    let request = try SNClassifySoundRequest(mlModel: soundClassifier.model)
    try analyzer.add(request, withObserver: resultsObserver)
} catch {
    print("Unable to prepare request: \(error.localizedDescription)")
    return
}
Running this model gives me the following error:
Invalid model, inputDescriptions.count = 5
Unable to prepare request: Invalid model, inputDescriptions.count = 5
Even though when I build the model I see what is expected in the spec:
description {
  input {
    name: "audioSamples"
    shortDescription: "Audio from microphone"
    type {
      multiArrayType {
        shape: 13
        dataType: DOUBLE
      }
    }
  }
I am trying to incorporate this post into my code but I am not sure how to adapt it to my needs. Any advice is greatly appreciated. I can see that MLMultiArray is the key to my question, but I am unsure how to put the proper data into it and how to pass it to an SNClassifySoundRequest.
keras == 2.3.1
coremltools == 3.3
When you use SNClassifySoundRequest, your model needs to have a certain structure. I don't know the exact details off the top of my head, but I think it needs to be a pipeline where the first model is a built-in model that converts the audio to spectrograms.
If you trained your model with Keras, it's most likely not compatible with the requirements of SNClassifySoundRequest.
The good news is that you don't need SNClassifySoundRequest to run your model. Simply call soundClassifier.prediction(...) on the model.
Note that you need to pass in the input but also the hidden states of the LSTM layers. Core ML will not automatically manage the LSTM state for you (unlike Keras).
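To make that concrete, here is a rough sketch of calling the generated model class directly with an MLMultiArray. The audioSamples name comes from your spec; the state input/output names (lstm_h_in, lstm_c_in, lstm_h_out, lstm_c_out), the classLabel output and the exact generated call are placeholders, so check the auto-generated soundClassifier class (or the spec's inputDescriptions) for the real ones:

import CoreML

// Pack one frame of 13 MFCC features into the [13] MLMultiArray the model expects.
// (The MFCC extraction itself is not shown here.)
func makeInput(from mfcc: [Double]) throws -> MLMultiArray {
    let array = try MLMultiArray(shape: [13], dataType: .double)
    for (i, value) in mfcc.enumerated() {
        array[i] = NSNumber(value: value)
    }
    return array
}

// Core ML does not manage the LSTM state for you, so keep it between calls.
var hState: MLMultiArray?
var cState: MLMultiArray?

func classify(frame mfcc: [Double]) throws {
    let input = try makeInput(from: mfcc)
    // Placeholder argument labels; use the ones in your generated soundClassifier class.
    let output = try soundClassifier.prediction(audioSamples: input,
                                                lstm_h_in: hState,
                                                lstm_c_in: cState)
    // Feed the returned state back in on the next call.
    hState = output.lstm_h_out
    cState = output.lstm_c_out
    print(output.classLabel) // placeholder output name
}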
I have installed caffe, uncommenting
WITH_PYTHON_LAYER=1
in 'Makefile.config'
When I use a python data layer in my net.prototxt, it says
Unknown layer type: Python
To cross-check it in the Python interface, I tried
import caffe
from caffe import layers as L
L.Python()
This seems to work; no error is raised.
Where is the problem?
You can find out what layer types caffe has in Python simply by examining caffe.layer_type_list(). For example, if you actually have a "Python" layer, then
list(caffe.layer_type_list()).index('Python')
should return the index of its name in the layer types list.
As for L.Python() - this caffe.NetSpec() interface is used to programmatically write a net prototxt, and at the writing stage layer types are not checked. You can actually write whatever layer you want:
L.YouDontThinkTheyNameALayerLikeThis()
is totally fine. Even converting it to prototxt:
print "{}".format(L.YouDontThinkTheyNameALayerLikeThis().to_proto())
actually results in this:
layer {
name: "YouDontThinkTheyNameALayerLikeThis1"
type: "YouDontThinkTheyNameALayerLikeThis"
top: "YouDontThinkTheyNameALayerLikeThis1"
}
You'll get an error message once you try to run this "net" using caffe...
I'm reading DICOM series using ITK and converting them to VTK for visualization purposes. Even though I manage to visualize the DICOM series in 3 different windows with 3 different orientations (XY, XZ and YZ), I can't even click on the windows. When I click or try to change the slice viewed, my code gives an access violation error. I'm using vtkImageViewer2 to visualize the slices. A file called itkVTKImageExportBase.cxx is opened when I try to find out what the error is. The lines referred to are:
void VTKImageExportBase::UpdateInformationCallbackFunction(void* userData)
{
  static_cast<VTKImageExportBase*>(userData)->UpdateInformationCallback();
}
My code is as follows:
typedef itk::VTKImageExport< ImageType > ExportFilterType;
ExportFilterType::Pointer itkExporter = ExportFilterType::New();
itkExporter->SetInput( reader->GetOutput() );
// Create the vtkImageImport and connect it to the itk::VTKImageExport instance.
vtkImageImport* vtkImporter = vtkImageImport::New();
ConnectPipelines(itkExporter, vtkImporter);
pViewerXY->SetInput(vtkImporter->GetOutput());
pViewerXY->SetSlice(3);
pViewerXY->SetSliceOrientationToXY();
pViewerXY->SetupInteractor(m_pVTKWindow_1);
pViewerXY->UpdateDisplayExtent();
m_pVTKWindow_1->AddObserver(vtkCommand::KeyPressEvent, m_pVTKWindow_1_CallbackCommand);
m_pVTKWindow_1->Update();
pViewerXZ->SetInput (vtkImporter->GetOutput());
pViewerXZ->SetSliceOrientationToXZ();
pViewerXZ->SetupInteractor(m_pVTKWindow_2);
pViewerXZ->UpdateDisplayExtent();
m_pVTKWindow_2->AddObserver(vtkCommand::KeyPressEvent, m_pVTKWindow_2_CallbackCommand);
m_pVTKWindow_2->Update();
pViewerYZ->SetInput (vtkImporter->GetOutput());
pViewerYZ->SetSliceOrientationToYZ();
pViewerYZ->SetupInteractor(m_pVTKWindow_3);
pViewerYZ->UpdateDisplayExtent();
m_pVTKWindow_3->AddObserver(vtkCommand::KeyPressEvent, m_pVTKWindow_3_CallbackCommand);
m_pVTKWindow_3->Update();
The pViewerXX objects are vtkImageViewer2 instances, whereas the m_pVTKWindow_X objects are wxVTK widgets used with the wxWidgets GUI package.
Optional: My exporter and importer are as given below:
template <typename ITK_Exporter, typename VTK_Importer>
void ConnectPipelines(ITK_Exporter exporter, VTK_Importer* importer)
{
importer->SetUpdateInformationCallback(exporter->GetUpdateInformationCallback());
importer->SetPipelineModifiedCallback(exporter->GetPipelineModifiedCallback());
importer->SetWholeExtentCallback(exporter->GetWholeExtentCallback());
importer->SetSpacingCallback(exporter->GetSpacingCallback());
importer->SetOriginCallback(exporter->GetOriginCallback());
importer->SetScalarTypeCallback(exporter->GetScalarTypeCallback());
importer->SetNumberOfComponentsCallback(exporter->GetNumberOfComponentsCallback());
importer->SetPropagateUpdateExtentCallback(exporter->GetPropagateUpdateExtentCallback());
importer->SetUpdateDataCallback(exporter->GetUpdateDataCallback());
importer->SetDataExtentCallback(exporter->GetDataExtentCallback());
importer->SetBufferPointerCallback(exporter->GetBufferPointerCallback());
importer->SetCallbackUserData(exporter->GetCallbackUserData());
}
/**
* This function will connect the given vtkImageExport filter to
* the given itk::VTKImageImport filter.
*/
template <typename VTK_Exporter, typename ITK_Importer>
void ConnectPipelines(VTK_Exporter* exporter, ITK_Importer importer)
{
importer->SetUpdateInformationCallback(exporter->GetUpdateInformationCallback());
importer->SetPipelineModifiedCallback(exporter->GetPipelineModifiedCallback());
importer->SetWholeExtentCallback(exporter->GetWholeExtentCallback());
importer->SetSpacingCallback(exporter->GetSpacingCallback());
importer->SetOriginCallback(exporter->GetOriginCallback());
importer->SetScalarTypeCallback(exporter->GetScalarTypeCallback());
importer->SetNumberOfComponentsCallback(exporter->GetNumberOfComponentsCallback());
importer->SetPropagateUpdateExtentCallback(exporter->GetPropagateUpdateExtentCallback());
importer->SetUpdateDataCallback(exporter->GetUpdateDataCallback());
importer->SetDataExtentCallback(exporter->GetDataExtentCallback());
importer->SetBufferPointerCallback(exporter->GetBufferPointerCallback());
importer->SetCallbackUserData(exporter->GetCallbackUserData());
}
I don't have an exact answer to your question, but have you considered using one of the common medical image processing frameworks? There are a couple of them like MITK (mitk.org) or Slicer3D (slicer.org). They do a great job of linking together ITK, VTK and a sophisticated GUI Framework like QT (in the MITK case).
I have worked a long time in medical image processing and used MITK extensively. In my opinion, using a medical image processing framework really helps you focus on your real image processing problems, rather than trying to build processing/visualization pipelines for different types of visualizations.
If you look at InsightApplications, there are two methods:
The same as you tried, which is here, or
This one, which creates a pipeline object that can be connected on both sides. We're actually using this, and it works out quite well for us. You can copy this class into your code and use it.
There are some interesting usage examples in there too. Take a look at them and see if you can modify anything for your requirement.
Classes in the ITKVtkGlue module can be used to convert an ITK image to a VTK image in a pipeline. See the module's tests for examples of how the classes are applied.
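For reference, the ITKVtkGlue route is much shorter than the manual exporter/importer pair. A rough sketch, reusing ImageType, reader and m_pVTKWindow_1 from the question (use SetInputData() instead of SetInput() on VTK 6 and newer):

#include "itkImageToVTKImageFilter.h"
#include "vtkImageViewer2.h"

typedef itk::ImageToVTKImageFilter<ImageType> ConnectorType;

ConnectorType::Pointer connector = ConnectorType::New();
connector->SetInput(reader->GetOutput());
connector->Update(); // run the ITK side before handing the data to VTK

vtkImageViewer2* pViewerXY = vtkImageViewer2::New();
pViewerXY->SetInput(connector->GetOutput()); // SetInputData() on VTK 6+
pViewerXY->SetSliceOrientationToXY();
pViewerXY->SetupInteractor(m_pVTKWindow_1);
pViewerXY->UpdateDisplayExtent();

Keep the connector alive for as long as the viewers use its output; if it goes out of scope, the VTK side ends up pointing at freed ITK memory, which can also surface as an access violation like the one you are seeing.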