Is there a web service that performs object recognition using an artificial neural network? - neural-network

I'd like to have a program call a web service that performs object recognition using a neural network. Is there a web service that can do this?

There is the "Google Prediction API" available as a web service - although there is little information on what models they use.

Google has a service called "Search by Image". It's not just a TinEye clone; it will also suggest the type of object in the image you upload or provide a URL for. It's not clear whether they use a neural network, although Google researchers are of course known to work on machine learning (like the recent stacked autoencoder that learned to recognize cats, human faces, bodies, etc.).
http://www.google.com/insidesearch/features/images/searchbyimage.html
Their description from help:
How it works
Google uses computer vision techniques to match your image to other images in the Google Images index and additional image collections. From those matches, we try to generate an accurate "best guess" text description of your image, as well as find other images that have the same content as your search image. Your search results page can show results for that text description as well as related images.
Here's an example where it correctly classified this image as "1 cat".

Check out StrongSteam:
StrongSteam is an AppStore of artificial intelligence and data mining APIs to let you pull interesting information out of images, video and audio.

Related

Does the MLflow platform (or any other you can suggest) fit as an experiment management tool for image-processing CNNs on the NVIDIA TAO framework?

We would like to make sure that the MLflow experiment management platform fits our needs and workflow.
We work with image-processing CNNs such as YOLO, U-Net, and RetinaNet, based on the NVIDIA TAO framework.
What we actually need is a tool that gathers in one place (in a clear way that is comfortable for comparison) at least the following three things for each experiment:
a- the typical metaparameters, chosen by the user, that were used to train the network (such as batches, subdivisions, max batches, etc.)
b- a link to the dataset the network was trained on, located on our cloud storage (such as OneDrive, Google Drive, or Google Cloud), or a list of filenames, or a link to a file storage cloud or online drive suggested by the MLflow service, if there is such a thing
c- a result of running the trained network - the number of detected objects
Thus the question is:
Does MLflow fit our needs?
If not, I'll be glad if anyone could suggest a relevant alternative.
Thank you
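For what it's worth, the three requirements do map onto MLflow's standard tracking API. Here is a minimal sketch (the run name, tag name, URL, and values are placeholders, not something MLflow prescribes):

    import mlflow

    with mlflow.start_run(run_name="yolo-tao-run"):  # placeholder run name
        # a) metaparameters chosen by the user
        mlflow.log_param("batch_size", 16)
        mlflow.log_param("max_batches", 6000)

        # b) a link to the dataset on your cloud storage
        mlflow.set_tag("dataset_url", "https://drive.example/our-dataset")  # placeholder URL

        # c) the result of running the trained network
        mlflow.log_metric("num_detected_objects", 42)

By default this writes to a local mlruns directory; pointing it at a tracking server gives you the side-by-side comparison UI.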
I use Comet.ml and it addresses all 3 of your points.
With Comet, parameter tracking is as easy as calling the experiment.log_parameter function. You can also use the diff tool to compare two experiments by their hyperparameters, and even group experiments by hyperparameters!
Comet has the concept of artifacts. You can upload your dataset as an artifact and version it. You can also have remote artifacts!
Comet has a feature called the Image Panel. It allows users to visualize model performance on the data across different experiment runs. For the object detection use case, use experiment.log_image to log images on which you have drawn your model's predicted bounding boxes. You will then see in the Image Panel how each experiment draws its predictions, side by side.
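For illustration, here is a rough sketch of how those three points might look with the Comet Python SDK (the project name, parameter names, and paths are placeholders; check the current Comet docs for exact signatures):

    from comet_ml import Artifact, Experiment

    # Point a: parameter tracking (API key is read from config/env).
    experiment = Experiment(project_name="tao-detection")  # placeholder project name
    experiment.log_parameter("batch_size", 16)
    experiment.log_parameter("max_batches", 6000)

    # Point b: version the training dataset as an artifact.
    artifact = Artifact("training-set", artifact_type="dataset")
    artifact.add("datasets/train/")  # placeholder local path; remote artifacts also exist
    experiment.log_artifact(artifact)

    # Point c: log detection results and annotated images.
    experiment.log_metric("num_detected_objects", 42)
    experiment.log_image("predictions/frame_001.jpg", name="predictions")

    experiment.end()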

iOS Facial Recognition Continuous Learning

I was tasked to find the best way to create a facial recognition feature on an app with machine learning. This feature will be used to clock employees into the app. The feature will support...
multiple users per device.
continuous training (when the mlmodel recognizes someone, the app will send the newly taken images to the model on the back-end and retrain it on them)
new classes (when a new user comes along and wants to use the app, the app will take pictures of them and send those images to the model training program on the back-end, which will train the mlmodel to recognize the new user)
distribution of newly updated models to other devices in the same store, so they will recognize the employees as well
What I've tried:
I've tinkered with on-device training and k-NN. But from what I understand, on-device training will not work for this, because on-device training models can only have up to 10 classes, and k-NN isn't giving very accurate results... at all.
Manual training and retraining with Create ML. This is when I...
1. train a model with Create ML on my Mac
2. download the model to the app with URLSession
3. add a new user in the app, or take updated pictures of existing users
4. send the images of the new/updated user to Create ML on my Mac
5. create a whole new model from all the images I've ever taken of all the users
6. repeat steps 2-5 forever
This works just fine, but doing it over and over is unbelievably expensive, time-consuming, and unfeasible for the number of users the app will eventually have.
I'm still very new to machine learning, and I feel like I'm going about this the wrong way. I would like to know whether anyone knows a better/more efficient method of continuous learning, so that the model remembers what it has learned previously and I can just add new classes or images to it with Create ML... or if someone could point me in the right direction.
Take a look at Turi Create - also from Apple: https://github.com/apple/turicreate
It can do everything Create ML does, but it is in Python and programmable, so you could automate the whole process on your back-end. If you know how to do it in Create ML, you will find Turi Create easy to pick up.
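As a rough illustration, an automated back-end retraining job might look like the following Turi Create sketch (the folder layout and model file name are assumptions for the example, not part of the original answer):

    import turicreate as tc

    # Load all employee images; assumes one sub-folder per person,
    # e.g. faces/alice/*.jpg, faces/bob/*.jpg (placeholder layout).
    data = tc.image_analysis.load_images("faces/", with_path=True)
    data["label"] = data["path"].apply(lambda p: p.split("/")[-2])

    # Retrain a classifier on the full image set.
    model = tc.image_classifier.create(data, target="label")

    # Export to Core ML; the app can then fetch this file with URLSession.
    model.export_coreml("EmployeeFaces.mlmodel")

Adding a new employee then just means dropping a new sub-folder of photos and re-running the script, which removes the manual Create ML steps from the loop.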
To get accurate results you should look into more powerful machine learning models. Here is an example of a really powerful face recognition model: https://github.com/davidsandberg/facenet.
Now, the next question becomes how you would integrate your app with this new model. This is really up to you, but I would recommend checking out a few back-end alternatives, such as AWS services (EC2 compute servers, SageMaker, API Gateway, etc.), to run and coordinate the inferences. One benefit of doing this is that your app would mainly be a front-end, making it lightweight and scalable across different and older iOS platforms and devices. More importantly, it gives you extra room to do more sophisticated things in the future, whereas with Core ML you will be mainly limited to on-device computational power and to Swift.
However, leveraging cloud services also has cons, such as the learning curve (learning AWS services) and potential privacy issues.
This is just one of the ways; there are many other similar cloud providers, like Google, IBM, and Azure. Without knowing your timeline, budget, and technical expertise, I can only give you these options; the rest of the choice is yours to make.
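To make the embedding idea concrete, here is a minimal sketch of recognition by embedding distance. It uses the face_recognition Python library as a stand-in for FaceNet (the library choice, file names, and threshold are illustrative assumptions, not from the original answer):

    import face_recognition

    # Compute a 128-d embedding for one known employee photo.
    known_image = face_recognition.load_image_file("alice_reference.jpg")
    known_encoding = face_recognition.face_encodings(known_image)[0]

    # Embed a new clock-in snapshot the same way.
    probe_image = face_recognition.load_image_file("clock_in_snapshot.jpg")
    probe_encodings = face_recognition.face_encodings(probe_image)

    # Adding a new employee only means storing one more reference
    # embedding; no full model retraining is required.
    for encoding in probe_encodings:
        distance = face_recognition.face_distance([known_encoding], encoding)[0]
        if distance < 0.6:  # commonly used default threshold for this library
            print("Matched Alice (distance %.2f)" % distance)

This is exactly why embedding-based models sidestep the retrain-per-new-class problem described in the question.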

Can IBM Cloud Watson recognize the same person across multiple images?

I want to create a gallery service that clusters images based on different characteristics, chief among them being faces matched across multiple images.
I've been considering IBM Cloud for this, but I can't find a definitive yes-or-no answer as to whether Watson supports face recognition (on top of detection), so that the same person is identified across multiple images, like AWS Rekognition and the Azure Cognitive Services Face API do.
The concrete scenario I want to implement is this: given photos A.jpg and B.jpg, Watson should be able to tell that A.jpg has a face corresponding to person X, and that B.jpg has another face that looks similar to the one in A.jpg. Ideally, it should do this automatically and give me face ID values for each detected face.
Has anyone tackled this with Watson before? Is it doable in a simple manner without much code or ML techniques on top of the vanilla Watson face detection?
I have used Watson to do basic face detection from the CLI. Are you wanting to recognize a particular individual after training on images of that individual? Could you clarify your question? Here is what I can answer: if you have a Watson API key, you can run this, for example, in a terminal:
curl -X POST --form "images_file=@path/to/image.jpg" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/detect_faces?api_key={your_api_key}&version=2016-05-20"
That will detect the face in the photo and give other categorical information about the person, such as age and sex. If the person is famous, it will give further identity information.

recognize a picture in a photo

My server receives photos from the client. The server is a web server in my case, but I don't mention any specific technology because I can choose any free technology that provides a solution. The photos are snapshots from the video streamed from a web camera.
On some of the photos the server receives, there is a colored picture (always the same picture, which the server has in advance) on a whitish background (a wall).
Other photos may include any subjects on any backgrounds.
I can't control the light in the room where the pictures are taken (it could be darker or lighter in different photos).
When the picture is in the photo, the whole picture is included (not just part of it).
When the picture is in the photo, it takes up a very significant part of the photo (i.e., the picture is photographed close to the wall).
The picture in the photo could be a bit inclined/declined - let's say by not more than 10°.
On the server side, I should be able to decide (with a certain level of confidence) whether the picture is in the photo.
I am looking for a quick and dirty solution for now (it's just a POC). The library and the technology should be free.
I thought of using a neural network. In this case I could even train the network offline, and once I have it tuned, I could use it on the client side with JavaScript (the calculations shouldn't take a lot of time) without passing the photos to the server (that would be perfect).
Is there any ready solution for this problem?
Thanks a lot!
I think answers to this question: Looking for an Image Comparison/Pattern Recognition Library would be a good start.
I would certainly not constrain myself to neural networks. You will need some kind of classifier, but I think it would be good to start by thinking about how to extract features from the images. It may turn out to be a simple problem: e.g., distinguishing between a homogeneous whitish/grayish image (a wall) and a much more heterogeneous image - that is, you compute just one feature, heterogeneity, and decide based on that, as in the sketch below. In that case you might not even need a special image recognition library.
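A quick-and-dirty version of that single-feature idea might look like this Python sketch (using pixel-intensity standard deviation as the "heterogeneity" measure and an arbitrary threshold - both are illustrative assumptions you would tune on real snapshots):

    from PIL import Image
    import numpy as np

    def looks_like_the_picture(photo_path, threshold=40.0):
        # Convert to grayscale and measure how varied the pixel values are.
        gray = np.asarray(Image.open(photo_path).convert("L"), dtype=float)
        heterogeneity = gray.std()
        # A bare whitish wall is nearly uniform (low std); a photo that
        # contains the colored picture is much more varied (high std).
        return heterogeneity > threshold

    print(looks_like_the_picture("snapshot.jpg"))

Working in grayscale also makes the decision less sensitive to the uncontrolled lighting mentioned in the question.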

transcribe a phone recording

There is a certain organization that periodically provides information in the form of a recorded message on a "hotline". Is there any open source solution (or set of components that could be "wired" together) that would allow me to present this information in text form on a web page?
Since it's the really easy part, I'm going to assume you can fetch the audio from the "hotline", i.e. you have direct access to the actual audio samples.
The hard part is transcribing the audio. You can start by having a look at Wikipedia and follow the links from there. One solution you could use would be CMU Sphinx. Google and other related search tools such as Google Scholar are likely to become your close friends :)
While there are a number of voice recognition engines available, their accuracy is far from perfect.
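As a starting point, here is a minimal sketch of offline transcription with CMU Sphinx through the older pocketsphinx-python bindings (the bindings and file name are my assumption - the original answer only names CMU Sphinx; the recording is assumed to be a 16 kHz, 16-bit mono WAV, which the default decoder expects):

    from pocketsphinx import AudioFile

    # Decode the recording with the default US-English acoustic model.
    for phrase in AudioFile(audio_file="hotline_message.wav"):
        print(phrase)  # each phrase is a decoded segment of text

The decoded text could then be written into the web page by whatever server-side stack you choose, with the accuracy caveat above kept in mind.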