Can IBM Cloud Watson recognize the same person across multiple images? - ibm-cloud

I want to create a gallery service that clusters images based on different characteristics, chief among them being faces matched across multiple images.
I've been considering IBM Cloud for this, but I can't find a definitive yes-or-no answer on whether Watson supports face recognition (on top of detection), so that the same person is identified across multiple images, the way AWS Rekognition and the Azure Cognitive Services Face API do.
The concrete scenario I want to implement is this: given photos A.jpg and B.jpg, Watson should be able to tell that A.jpg has a face corresponding to person X, and that B.jpg has another face that looks similar to the one in A.jpg. Ideally, it should do this automatically and give me face id values for each detected face.
Has anyone tackled this with Watson before? Is it doable in a simple manner, without much extra code or ML techniques on top of the vanilla Watson face detection?

I have used Watson to do basic face detection from the CLI. Are you wanting to recognize a particular individual after training on images of that individual? Could you clarify your question? Here is what I can answer: if you have a Watson API key, you can run this, for example, in a terminal:
curl -X POST --form "images_file=@path/to/image.jpg" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/detect_faces?api_key={your_api_key}&version=2016-05-20"
That will detect the faces in the photo and give categorical information about each person, such as estimated age and gender. And if they are famous, it will give further identity information.
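If you prefer calling the endpoint from a script, here is a minimal Python sketch of the same request (assuming the requests package is installed; the response fields follow the v3 detect_faces format of that era, so double-check against the docs):

# Minimal sketch: POST an image to detect_faces and print what comes back.
import requests

url = "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/detect_faces"
params = {"api_key": "your_api_key", "version": "2016-05-20"}  # placeholders

with open("path/to/image.jpg", "rb") as f:
    resp = requests.post(url, params=params, files={"images_file": f})

for image in resp.json().get("images", []):
    for face in image.get("faces", []):
        # age and gender are estimates; identity only appears for celebrities
        print(face.get("age"), face.get("gender"), face.get("identity"))

Note that nothing in this response is a stable face id, which is exactly the gap the question asks about.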

Related

Does the MLflow platform (or any other you can suggest) fit as an experiment management tool for image-processing CNNs on the NVIDIA TAO framework?

We would like to make sure that the MLflow experiment management platform fits our needs and workflow.
We work with image-processing CNNs like YOLO, UNet, and RetinaNet based on the NVIDIA TAO framework.
What we actually need is a tool that gathers in one place (in a clear way that is comfortable for comparison) at least the following three things for each experiment:
a- the user-chosen meta-parameters that were used to train a network (such as batches, subdivisions, max batches, etc.)
b- a link to the dataset the network was trained on, located on our cloud storage (such as OneDrive, Google Drive, or Google Cloud), or a list of filenames, or a link to a file storage cloud or online drive offered by the MLflow service, if such a thing exists
c- a result of running the trained network: the number of detected objects
Thus the question is:
Does MLflow fit our needs?
If not, I'll be glad if anyone could suggest a relevant alternative.
Thank you
I use Comet.ml and it addresses all 3 of your points.
With Comet, parameter tracking is as easy as calling the experiment.log_parameter function. You can also use the diff tool to compare two experiments by their hyperparameters, and even group experiments by hyperparameters!
Comet has the concept of artifacts. You can upload your dataset as an artifact and version it. You can also have remote artifacts!
Comet has a feature called the Image Panel. This allows users to visualize model performance on the data across different experiment runs. For the object detection use case, use experiment.log_image to log images on which you have drawn your model's predicted bounding boxes. You will then see in the Image Panel how each experiment draws its predictions, side by side.
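Here is a minimal sketch covering your three points with the comet_ml Python package (assuming COMET_API_KEY is set in the environment; the project name, parameter values, and dataset URI are placeholders):

# Minimal sketch of points a, b and c with Comet.
from comet_ml import Experiment, Artifact

experiment = Experiment(project_name="yolo-tao")  # placeholder project name

# a) log the training meta-parameters you chose
experiment.log_parameter("batch_size", 16)
experiment.log_parameter("subdivisions", 8)
experiment.log_parameter("max_batches", 6000)

# b) reference the dataset: a remote artifact points at files that stay
# on your own cloud storage, so nothing has to be re-uploaded
artifact = Artifact(name="training-set", artifact_type="dataset")
artifact.add_remote("gs://your-bucket/dataset/")  # placeholder URI
experiment.log_artifact(artifact)

# c) log the result: the detection count as a metric, plus an image with
# predicted bounding boxes drawn on it for the Image Panel
experiment.log_metric("detected_objects", 42)
experiment.log_image("predictions.jpg", name="val-predictions")

experiment.end()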

How to detect more than one intent with IBM Watson Assistant?

Can the IBM Watson Conversation / Assistant service detect more than one intention in a single sentence?
Example of input:
play music and turn on the light
Intent 1 is #Turn_on
Intent 2 is #Play
==> the answer must cover both intents simultaneously: music played and light turned on
If so, how can I do that?
Yes, Watson Assistant returns all detected intents with their associated confidence. See here for the API definition. The response returned by Watson Assistant contains an array of the intents recognized in the user input, sorted in descending order of confidence.
The docs have an example of how to deal with multiple intents and their confidence values. Also be aware of the alternate_intents setting, which allows even more intents with lower confidence to be returned.
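As a minimal sketch with the ibm-watson Python SDK (the API key, service URL, and workspace id below are placeholders for your own credentials):

# Minimal sketch: fetch all intents for one utterance, sorted by confidence.
from ibm_watson import AssistantV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

assistant = AssistantV1(
    version="2019-02-28",
    authenticator=IAMAuthenticator("your-apikey"),  # placeholder
)
assistant.set_service_url("https://gateway.watsonplatform.net/assistant/api")

response = assistant.message(
    workspace_id="your-workspace-id",  # placeholder
    input={"text": "play music and turn on the light"},
    alternate_intents=True,  # also return lower-confidence intents
).get_result()

for intent in response["intents"]:
    print(intent["intent"], intent["confidence"])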
While @data_henrik is correct about how to get the other intents, that doesn't mean the second intent actually corresponds to a second request.
Take the following example graph, where we map the intents versus confidence that comes back:
Here you can clearly see that there are two intents in the person's question.
Now look at this one:
You can clearly see that there is only one intent.
So how do you solve this? There are a couple of ways.
You can check whether the first and second intents fall within a certain percentage of each other. This is the easiest to detect, but trickier when it comes to coding the selection of two different intents. It can get messy, and you will sometimes get false positives.
At the application layer you can run K-Means on the intent results (see the sketch after this list). K-Means allows you to group intents into buckets, so you create two buckets (K=2), and if there is more than one intent in the first bucket, you have a compound question. I wrote about this, with a sample, on my site.
There is a new feature you can play with in Beta called "Disambiguation". This allows you to flag dialog nodes with a clarifying question to ask. Then, if two intents are found, it will say "Did you mean? ...." and the user can select.
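Here is a rough sketch of the K-Means idea (assuming scikit-learn and numpy are available; the confidence values are made up for illustration):

# Minimal sketch: bucket intent confidences with K-Means (K=2) and flag
# compound questions when the top bucket holds more than one intent.
import numpy as np
from sklearn.cluster import KMeans

# confidences as returned by Watson, already sorted descending
confidences = np.array([0.85, 0.80, 0.10, 0.05, 0.03]).reshape(-1, 1)

kmeans = KMeans(n_clusters=2, n_init=10).fit(confidences)
top_bucket = kmeans.labels_[0]  # bucket of the highest-confidence intent

n_top = int(np.sum(kmeans.labels_ == top_bucket))
if n_top > 1:
    print(f"Compound question: {n_top} intents in the top bucket")
else:
    print("Single intent")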
Is this disambiguation feature available in non-production environments, in Beta?

Image recognition services [closed]

I'm currently making a mobile application. I want to find a service to which I can upload images of objects I define: airplane, computer, ... Then, when users use the app, they take a picture of an object I already defined in the service, and the service tells them information about the object, such as: Akai's computer, Akai's laptop, ...
I wonder if there is any image recognition service that provides a database for inputting sample images, together with their information, that would help me achieve this.
Thank you,
There is an important tradeoff here at play. There are two scenarios:
You have relatively few categories (objects for which the user can take an image) and multiple example images for each category. Here you have plenty of options from the realm of machine learning (neural network frameworks like Caffe or TensorFlow). But if you want things to work with a relatively small number of examples (you should still have at least tens per category), the easiest way is to use an external API like vize.it, where you can set up the categories via a web interface and have the image recognizer hosted externally and accessed via a REST API.
You have many categories and just one or a few examples for each category. I'm personally not aware of any pre-made solution to such a problem. My approach would be to use a pre-trained convolutional neural network to process the images, taking the hidden representation near the top of the network (very much like what is used on the image side of automated image captioning - example code), and then train a classifier that takes a pair of images processed this way and outputs a scalar in [0,1] representing how close the images are. I have experimented with that approach for comparing sentences and it works pretty well, but I expect you will need a big dataset.
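A minimal sketch of the embedding part of that approach (assuming PyTorch and a recent torchvision are installed; plain cosine similarity stands in here for the trained pairwise classifier described above):

# Minimal sketch: embed two images with a pre-trained CNN and compare them.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# pre-trained network with the final classification layer removed,
# so it outputs a hidden representation instead of class scores
resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
resnet.fc = torch.nn.Identity()
resnet.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(path):
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return resnet(img).squeeze(0)

# closeness of the two images in feature space, in [-1, 1]
a, b = embed("A.jpg"), embed("B.jpg")
print(float(torch.nn.functional.cosine_similarity(a, b, dim=0)))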
Disclaimer: I'm a co-author of vize.it.
when users use the app, they take a picture of an object I already defined in the service, and the service tells them information about the object, such as: Akai's computer, Akai's laptop, ...
Since your user is trying to identify an instance of an object and retrieve metadata about it, the Watson Visual Recognition service's similarity search feature may be a good fit. It is a beta feature, which is free for the time being.
You can add photos with associated metadata (like the string "Akai's computer") into a collection, which indexes the images by their visual appearance. You can then query the collection with the "find_similar" method to retrieve the image ids and metadata of the most visually similar images.
Here is a demo: https://similarity-search-demo.mybluemix.net/ That page also includes a link to the API reference. Watson VR also includes custom classifier training, which you might find interesting.
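A rough sketch of a find_similar query over REST (the endpoint path follows the v3 pattern used elsewhere in this thread and should be checked against the API reference linked above; the API key and collection id are placeholders):

# Rough sketch: query a similarity-search collection for look-alike images.
import requests

base = "https://gateway-a.watsonplatform.net/visual-recognition/api/v3"
params = {"api_key": "your_api_key", "version": "2016-05-20"}  # placeholders
collection_id = "your_collection_id"  # placeholder

with open("query.jpg", "rb") as f:
    resp = requests.post(
        f"{base}/collections/{collection_id}/find_similar",
        params=params,
        files={"image_file": f},
    )

for img in resp.json().get("similar_images", []):
    print(img.get("image_id"), img.get("score"), img.get("metadata"))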

Custom datasets in Watson Q&A service

Is there a way to create a custom dataset in Watson for use with services such as Question and Answer?
I tried the service using the 'healthcare' dataset and it was very limited. I could ask it any of the questions suggested by the IBM team (e.g. "What is HIV?") and get satisfactory results, but straying from that list produced unreliable results. For example, I asked it 'How can I lower my blood sugar?' and none of the results even mentioned blood sugar. This makes me wonder how in-depth the healthcare dataset is, and whether there is a way we can add to it or create new datasets.
While the service is in BETA, there is no way to bring a custom corpus (dataset). This feature is being planned and should be available soon.

Is there a web service that performs object recognition using an artificial neural network?

I'd like to have a program call a web service that performs object recognition using a neural network. Is there a web service that can do this?
There is the "Google Prediction API" available as a web service - although there is little information on what models they use.
Google has a service called "Search by Image". It's not just a TinEye clone; it will give a suggestion for the type of object in the image you upload or provide a URL for. It's not clear whether or not they use a neural network, although Google researchers have of course been known to work on machine learning (like the recent stacked autoencoder that classified cats, human faces, bodies, etc.).
http://www.google.com/insidesearch/features/images/searchbyimage.html
Their description from help:
How it works
Google uses computer vision techniques to match your image to other images in the Google Images index and additional image collections. From those matches, we try to generate an accurate "best guess" text description of your image, as well as find other images that have the same content as your search image. Your search results page can show results for that text description as well as related images.
Here's an example where it correctly classified this image as "1 cat"
Check out StrongSteam:
StrongSteam is an AppStore of artificial intelligence and
data mining APIs to let you pull interesting information
out of images, video and audio.