Is there a common format for neural networks - neural-network

Different teams use different libraries to train and run neural networks (caffe, torch, theano...). This makes sharing difficult: each library has its own format to store networks and you have to install a new library each time you want to test other teams' work.
I am looking for solutions to make this less tedious:
- Is there a preferred (shared?) format to store neural networks?
- Is there a service or library that can help handle different types of networks / or transform one type into another?
Thank you!

Is there a preferred (shared?) format to store neural networks?
Each library / framework has its own serialization, e.g. Caffe uses Protocol Buffers, Torch has a built-in serialization scheme and Theano objects can be serialized with pickle.
In some cases like OverFeat or darknet the weights and biases are stored on-disk in binary format via plain fwrite-s of the corresponding float(or double) contiguous arrays (see this answer for more details). Note that this does not cover the architecture of the network / model which has to be known or represented separately (like declared explicitly at load time).
Also: a library like libccv stores the structure and the weights in a SQLite database.
Is there a service or library that can help handle different types of networks / or transform one type into another?
I don't think there is a single (meta) library that claims to do so. But it exists distinct projects that provide convenient converters.
Some examples (non exhaustive):
Caffe -> Torch: https://github.com/szagoruyko/loadcaffe
Torch -> Caffe: https://github.com/facebook/fb-caffe-exts
Caffe -> TensorFlow: https://github.com/ethereon/caffe-tensorflow
--
UPDATE (2017-09): two noticeable initiatives are:
(1) the ONNX format (a.k.a. Open Neural Network Exchange):
[...] a standard for representing deep learning models that enables models to be transferred between frameworks
See these blog posts.
(2) the CoreML format introduced by Apple:
[...] a public file format (.mlmodel) for a broad set of ML methods [...] Models in this format can be directly integrated into apps through Xcode.

ONNX (Open Neural Network Exchange)
ONNX is an open-source AI ecosystem that provides a common format for neural networks.
It helps converting a deep learning model to another.
Generally, model conversion takes weeks/months without ONNX:
ONNX offers simpler and faster conversion process:
For all supported conversions, see here.
It makes deployment easier, models stored in a much preferable way:
In the above image, ONNX acts as a data ingestion layer, transforms each input model to the same format. Otherwise, all models would be like a bunch of puzzle pieces that do not fit each other.
How to Use ONNX - Keras Conversion Example
Let's say you have your Keras model and you want to transform it to ONNX:
model = load_model("keras_model.hdf5") # h5 is also OK!
onnx_model = keras2onnx.convert_keras(model, model.name)
onnx_model_file = 'output_model.onnx'
onnx.save_model(onnx_model, onnx_model_file)
Then load & run saved model :
onnx_model = onnx.load_model('output_model.onnx')
content = onnx_model.SerializeToString()
sess = onnxruntime.InferenceSession(content)
x = x if isinstance(x, list) else [x]
feed = dict([(input.name, x[n]) for n, input in enumerate(sess.get_inputs())])
# Do inference
pred_onnx = sess.run(None, feed)
This example uses keras2onnx to convert the Keras model and onnxruntime to do inference.
Note: There are also lots of pre-trained models in the ONNX format. Check this out!
References:
1. https://towardsdatascience.com/onnx-made-easy-957e60d16e94
2. https://blog.codecentric.de/en/2019/08/portability-deep-learning-frameworks-onnx/
3. http://on-demand.gputechconf.com/gtc/2018/presentation/s8818-onnx-interoperable-deep-learning-presented-by-facebook.pdf
4. https://devblogs.nvidia.com/tensorrt-3-faster-tensorflow-inference/

Related

what machine learning algorithm is used while creating a text classifier model using createML?

I'm creating a text classification model for sentiment analysis, I would like to know what machine learning algorithm is used by createML here?
You can set the algorithm used by the classifier or specify the language which you would like to classify in the parameters of the initializer. See the developer documentation in Xcode under CreateML > MLTextClassifier > MLTextClassifier.ModelParameters
init(validation: MLTextClassifier.ModelParameters.ValidationData, algorithm: MLTextClassifier.ModelAlgorithmType, language: NLLanguage?)
under the parameter algorithm set the algorithm you would like to use.
In developer documentation under the enum MLTextClassifier.ModelAlgorithmType you will find what is available. For example: case crf(revision: Int?) is a conditional random field model algorithm.

How to create a "Denoising Autoencoder" in Matlab?

I know Matlab has the function TrainAutoencoder(input, settings) to create and train an autoencoder. The result is capable of running the two functions of "Encode" and "Decode".
But this is only applicable to the case of normal autoencoders. What if you want to have a denoising autoencoder? I searched and found some sample codes, where they used the "Network" function to convert the autoencoder to a normal network and then Train(network, noisyInput, smoothOutput)like a denoising autoencoder.
But there are multiple missing parts:
How to use this new network object to "encode" new data points? it doesn't support the encode().
How to get the "latent" variables to the features, out of this "network'?
I appreciate if anyone could help me resolve this issue.
Thanks,
-Moein
At present (2019a), MATALAB does not permit users to add layers manually in autoencoder. If you want to build up your own, you will have start from the scratch by using layers provided by MATLAB;
In order to to use TrainNetwork(...) to train your model, you will have you find out a way to insert your data into an object called imDatastore. The difficulty for autoencoder's data is that there is NO label, which is required by imDatastore, hence you will have to find out a smart way to avoid it--essentially you are to deal with a so-called OCC (One Class Classification) problem.
https://www.mathworks.com/help/matlab/ref/matlab.io.datastore.imagedatastore.html
Use activations(...) to dump outputs from intermediate (hidden) layers
https://www.mathworks.com/help/deeplearning/ref/activations.html?searchHighlight=activations&s_tid=doc_srchtitle
I swang between using MATLAB and Python (Keras) for deep learning for a couple of weeks, eventually I chose the latter, albeit I am a long-term and loyal user to MATLAB and a rookie to Python. My two cents are that there are too many restrictions in the former regarding deep learning.
Good luck.:-)
If you 'simulation' means prediction/inference, simply use activations(...) to dump outputs from any intermediate (hidden) layers as I mentioned earlier so that you can check them.
Another way is that you construct an identical network but with the encoding part only, copy your trained parameters into it, and feed your simulated signals.

Object detection for a single object only

I have been working with object detection. But these methods consist of very deep neural networks and require lots of memory to store the trained models. E.g. I once tried to train a Mask R-CNN model, and the weights take 200 MB.
However, my focus is on detecting a single object only. So, I guess these methods are not suitable. Are there any object detection method that can do this job with a low memory requirement?
You can try SSD or faster RCNN they are easily available in Tensorflow object detection API
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
here you can get pre-trained models and config file
you can select your model by taking look on speed and mAP(accuracy) column as per your requirement.
Following mukul's answer, I specifically recommend you check out SSDLite-MobileNetV2.
It's a lite-weight model, which is still enough expressive for good results.
Especially when you're restricting yourself to a single class, as you can see in the example of FaceSSD-MobileNetV2 as in here (Note however this is vanilla SSD).
So you can simply Take the pre-trained model of SSDLite-MobileNetV2 with the corresponding config file, and modify it for a single class.
This means changing num_classes to 1, modifying the label_map.pbtxt, and of course - preparing the dataset with the single class you want.
If you want a more robust model, but which has no pre-trained mode, you can use an FPN version.
Checkout this config file, which is with MobileNetV1, and modify it for your needs (e.g. switching to MobileNetV2, switching to use_depthwise, etc).
On one hand, there's no detection pre-trained model, but on the other the detection head is shared over all (relevant) scales, so it's somewhat easier to train.
So simply fine-tune it from the corresponding classification checkpoint from here.

Spark-mllib retraining saved models

I am trying to make a classification with spark-mllib, especially using RandomForestModel.
I have taken a look on this example from spark (RandomForestClassificationExample.scala), but I need a somewhat expanded approach.
I need to be able to train a model, save the model for future usage, but also to be able to load it and train further. Like, extend the dataset and train again.
I completely understand the need to export and import a model for future usage.
Unfortunately, training "further" isn't possible with Spark nor does it make sense. Thus it's recommended to retrain the model with the data from use to train the first model + new data.
Your first training values/metrics don't have much sense anymore if you want to add more data (e.g features, intercept, coefficients, etc.)
I hope that this answers your question.
You may need to look for some reinforcement learning technique instead of Random Forest if you want to use the old model and retrain it with new data.
That I know, there's deeplearning4j that implements deep reinforcement learning algorithms on top of Spark (and Hadoop).
If you only need to save JavaRDD[Object], you can do (in Java)
model.saveAsObjectFile()
Values will be writter out using Java Serialization. Then, to read your data you do:
JavaRDD[Object] model = jsc.objectFile(pathOfYourModel)
Be careful, object files are not available in Python. But you could use saveAsPickleFile() to write your model and pickleFile() to read it.

Deep-learning for mapping large binary input

this question may come as being too broad, but I will try to make every sub-topic to be as specific as possible.
My setting:
Large binary input (2-4 KB per sample) (no images)
Large binary output of the same size
My target: Using Deep Learning to find a mapping function from my binary input to the binary output.
I have already generated a large training set (> 1'000'000 samples), and can easily generate more.
In my (admittedly limited) knowledge of Neural networks and deep learning, my plan was to build a network with 2000 or 4000 input nodes, the same number of output nodes and try different amounts of hidden layers.
Then train the network on my data set (waiting several weeks if necessary), and checking whether there is a correlation between in- and output.
Would it be better to input my binary data as single bits into the net, or as larger entities (like 16 bits at a time, etc)?
For bit-by-bit input:
I have tried "Neural Designer", but the software crashes when I try to load my data set (even on small ones with 6 rows), and I had to edit the project save files to set Input and Target properties. And then it crashes again.
I have tried OpenNN, but it tries to allocate a matrix of size (hidden_layers * input nodes) ^ 2, which, of course, fails (sorry, no 117GB of RAM available).
Is there a suitable open-source framework available for this kind of
binary mapping function regression? Do I have to implement my own?
Is Deep learning the right approach?
Has anyone experience with these kind of tasks?
Sadly, I could not find any papers on deep learning + binary mapping.
I will gladly add further information, if requested.
Thank you for providing guidance to a noob.
You have a dataset containing pairs of binary valued vectors, with a max length of 4,000 bits. You want to create a mapping function between the pairs. On the surface, that doesn't seem unreasonable - imagine a 64x64 image with binary pixels – this only contains 4,096 bits of data and is well within the reach of modern neural networks.
As your dealing with binary values, then a multi-layered Restricted Boltzmann Machine would seem like a good choice. How many layers you add to the network really depends on the level of abstraction in the data.
You don’t mention the source of the data, but I assume you expect there to be a decent correlation. Assuming the location of each bit is arbitrary and is independent of its near neighbours, I would rule out a convolutional neural network.
A good open source framework to experiment with is Torch - a scientific computing framework with wide support for machine learning algorithms. It has the added benefit of utilising your GPU to speed up processing thanks to its CUDA implementation. This would hopefully avoid you waiting several weeks for a result.
If you provide more background, then maybe we can home in on a solution…