Custom datasets in Watson Q&A service - ibm-cloud

Is there a way to create a custom dataset in Watson for use with services such as Question and Answer?
I tried the service using the 'healthcare' dataset and it was very limited. I could ask it any of the questions that were suggested by the IBM team (ex. What is HIV?) and get satisfactory results but straying from that list produced unreliable results. For example I asked it 'How can I lower my blood sugar' and none of the results even mentioned blood sugar. This makes me wonder how in-depth the healthcare dataset is and if there is a way we can add to it or create new datasets.

While the service is in BETA, there is no way to bring a custom corpus (dataset). This feature is being planned and should be available soon.

Related

aws-personalize: can I get recommendations on items not seen in training based on item features?

I am considering using AWS Personalize, or a similar managed recommendation service.
My question is whether it is possible to get recommendations/rankings for items that were not seen in the training data, based on item features. I see that AWS Personalize does have an item feature dataset, but the documentation for the ranking recipe specifically says that items not in the training data are added at the end of any ranking. Of course, new items have no interaction data, so any recipe/algorithm that relies solely on interaction data is not relevant for my case.
So: can I use AWS Personalize for my use case, and if so how? If not, do you know of any recommender service that can handle it?
Yes. There are specific Amazon Personalize recipes designed to support cold starting items where a cold item is one without behavioral data in the interactions dataset but with item metadata in the items dataset.
The User-Personalization recipe supports cold starting items through a feature called exploration. You control how much exploration (i.e., recommending cold items) is done with the explorationWeight inference hyperparameter when creating a Personalize campaign or batch inference job. See this blog post for details.
Exploration also applies to domain recommenders for the Top picks for you VOD recommender and Recommended for you e-commerce recommender. You specify the explorationWeight when creating a recommender.
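As a rough illustration of where explorationWeight goes for a campaign, here is a minimal Python sketch that only builds the request arguments. It assumes the boto3 `create_campaign` call takes the exploration settings as strings under `campaignConfig.itemExplorationConfig`; the campaign name, ARN and the `build_campaign_request` helper are hypothetical placeholders, so check the current API reference before relying on it.

```python
def build_campaign_request(name, solution_version_arn, exploration_weight=0.3):
    """Build keyword arguments for a Personalize create_campaign call.

    exploration_weight: 0 means no exploration, 1 means maximum exploration.
    """
    if not 0.0 <= exploration_weight <= 1.0:
        raise ValueError("exploration_weight must be between 0 and 1")
    return {
        "name": name,
        "solutionVersionArn": solution_version_arn,
        "minProvisionedTPS": 1,
        "campaignConfig": {
            # Assumption: the exploration settings are passed as strings.
            "itemExplorationConfig": {
                "explorationWeight": str(exploration_weight),
            }
        },
    }


request = build_campaign_request(
    "cold-start-demo",  # hypothetical campaign name
    "arn:aws:personalize:us-east-1:123456789012:solution/demo/version-id",  # placeholder ARN
    exploration_weight=0.5,
)

# With AWS credentials configured, the request could then be sent with:
# import boto3
# boto3.client("personalize").create_campaign(**request)
```

A higher weight means more cold items are mixed into the recommendations, so it is usually worth starting low and adjusting.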
The Similar-Items recipe supports the related items use case and looks to balance recommending similar items based on behavioral data and thematic similarity between items. You currently cannot control the weighting with this recipe, though. See this blog post for details. The More like X VOD recommender provides similar functionality.

Creating a simple Q&A Chatbot with Amazon Lex with Predefined questions and answers

I am doing some research on potential options for building a chatbot and am currently evaluating Amazon Lex. The requirements for the bot are quite simple: a user can ask where to find something, and the bot will tell them where in a document they will find the answer. All of these questions and answers have already been captured manually, so we can easily have an Excel sheet with questions and answers.
Is there some way to input these pre-defined questions and responses into Lex? From my research I am having a hard time finding any info on something this basic. It won't really require any back and forth between the user and bot (for example, User: 'I need to order flowers' Bot: 'What kind of flowers?' etc.).
I have seen some info on incorporating Kendra, but I don't think the requirement is sophisticated enough to warrant using it.
Ideally I would love to just hardcode it and say this is a question, and this is the response that should be given. Maybe this use case does not need something as powerful as Lex?
Lex can solve your problem at a fraction of the cost of Kendra.
Having said that, Kendra would be easier to work with when compared to Lex.
If you've got some Python capabilities, I would recommend you take a look at the ExcelLexBot repo on GitHub. It is a serverless application that reads input from an Excel spreadsheet to build up a basic Lex bot for you.
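If the "just hardcode it" route from the question turns out to be enough, a plain lookup with some fuzzy matching may already cover it. The sketch below is not Lex and not the ExcelLexBot code; it is a minimal standard-library illustration, and the questions, answers and function names are made up for the example.

```python
import difflib

# Hypothetical question/answer pairs, e.g. exported from the Excel sheet.
FAQ = {
    "where do i find the refund policy": "See section 4.2 of the customer handbook.",
    "where do i find the installation steps": "See chapter 2 of the setup guide.",
}


def normalize(text):
    """Lowercase and drop punctuation so small variations still match."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def answer(question, cutoff=0.6):
    """Return the stored answer for the closest known question, or None."""
    matches = difflib.get_close_matches(
        normalize(question), list(FAQ), n=1, cutoff=cutoff
    )
    return FAQ[matches[0]] if matches else None


print(answer("Where do I find the refund policy?"))
```

The cutoff controls how tolerant the matching is; once users start phrasing things in ways string similarity cannot catch, a service like Lex becomes worth its setup cost.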

Activiti and Drools ... is one enough?

I have been asked to start exploring the Activiti tool for a client demo.
The demo will also have JBoss Drools with which Activiti will be integrated.
I am new to both of these tools and business process world, so excuse me if the question is dumb.
The question is why do you need Drools? Isn't Activiti enough for the job?
Both of them have conditional elements so why do you need Activiti on top of drools?
This question doesn't quite fit the purpose of StackOverflow, so don't be surprised if you get a few flags. But I'll try to give a short answer.
Activiti is a workflow engine, Drools is a business rules engine. They serve two different purposes.
Workflow engines are useful when you have a flow of actions of different actors that need to be controlled programmatically.
Rules engines are useful when you have business rules for executing some task automatically that you want to describe in a declarative way.
Both purposes are orthogonal to each other, meaning that the problem you have to solve may require none, just one, or both of them.
Imagine a workflow where a customer reports an incident, some experts have to work on it, and finally a bill gets produced, but no heavy algorithms are behind those tasks. That might be supported by a workflow engine without a rules engine.
Imagine a complex price model for a product, like cars having all sorts of special features that may be ordered. (Hifi speakers cost 400 €, except if the executive version of the car is ordered, where they only cost 200 € if ordered in combination with smartphone adapter...) Here a rules engine may be useful, although nobody talked about a workflow, so no workflow engine is needed.
Imagine the first example (incident workflow) together with a complex billing scheme. Here both tools may be used.
I wonder why these two types of tools are in some places described as perfectly fitting together. (Maybe this kind of claim motivated your question.) They serve two different purposes, and whether you need them both depends on the problem you have to solve.
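To make the "declarative rules" idea from the car pricing example a bit more concrete, here is a minimal Python sketch. It is not Drools syntax, and it assumes one reading of the example: speakers cost 200 € only when the executive version and the smartphone adapter are ordered together, otherwise 400 €. The `Rule` class and the order layout are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    applies: Callable[[dict], bool]  # condition evaluated against an order
    price: int                       # speaker price in euros if the rule fires


# Rules are data: they are checked top to bottom and the first match wins.
SPEAKER_RULES = [
    Rule(
        "executive bundle",
        lambda o: o["executive"] and "smartphone adapter" in o["options"],
        200,
    ),
    Rule("default", lambda o: True, 400),
]


def speaker_price(order):
    """Return the speaker price decided by the first matching rule."""
    for rule in SPEAKER_RULES:
        if rule.applies(order):
            return rule.price
    raise LookupError("no rule matched")


print(speaker_price({"executive": True, "options": {"smartphone adapter"}}))
```

The point is that the pricing policy lives in the rule list rather than in control flow, which is what a rules engine gives you at scale. Nothing here says who does what and in which order, which is the workflow engine's job.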

MongoDB + Neo4J vs OrientDB vs ArangoDB [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I am currently in the design phase of an MMO browser game. The game will include tilemaps for some real time locations (so tile data for each cell) and a general world map. The game engine I prefer uses MongoDB for persistent world data.
I will also implement a shipping simulation (which I explain in more detail below), which is basically a Dijkstra module. I decided to use a graph database, hoping it would make things easier, and found Neo4j as it is quite popular.
I was happy with the MongoDB + Neo4J setup but then noticed OrientDB, which apparently acts like both MongoDB and Neo4J (best of both worlds?); they even have VS pages for MongoDB and Neo4J.
The point is, I heard some horror stories of MongoDB losing data (though I am not sure it still does) and I can't afford that. As for Neo4J, I am not a big fan of the 12K€ per year "startup friendly" cost, although I'll probably not have a DB of millions of vertices. OrientDB seems a viable option, as there may also be some opportunities in using a single database solution.
In that case, a logical move might be jumping to OrientDB, but it has a small community and, to be honest, I didn't find many reviews of it. MongoDB and Neo4J are popular, widely used tools; I am concerned that OrientDB might be an adventure.
My first question would be if you have any experience/opinion regarding these databases.
My second question would be which graph database is better for a shipping simulation. The database is expected to calculate the cheapest route from any vertex to any vertex and traverse it (classic Dijkstra). But it also has to change weights depending on situations like "country B has an embargo on country A, so any item originating from country A can't pass through B" or "there is a flood in region XYZ, so no land transport is possible", etc. The database is also expected to cache results. I expect no more than 1000 vertices but many edges.
Thanks in advance and apologies in advance if questions are a bit ambiguous
PS: I added ArangoDB to the title but, to be honest, haven't had much chance to take a look.
Late edit as of 18-Apr-2016: After evaluating the responses to my questions and development strategies, I decided to use ArangoDB, as their roadmap is more promising for me and they are apparently not trying to add tons of half-baked hype features.
Disclaimer: I am the author and owner of OrientDB.
As a developer, in general, I don't like companies that hide costs, let you play with their technology for a while and, as soon as you are tied to it, start asking for money. Once you have invested months developing an application that uses a non-standard language or API, you're screwed: pay, or migrate the application at huge cost.
OrientDB is FREE for any usage, even commercial. Furthermore, OrientDB supports standards like SQL (with extensions), and the main Java API is TinkerPop Blueprints, the "JDBC" standard for graph databases. OrientDB also supports Gremlin.
The OrientDB project is growing every day with new contributors and users. The Community Group (a free channel for asking for support) is the most active community in the GraphDB market.
If you have doubts about which GraphDB to use, my suggestion is to pick the one closest to your needs, but then use standards as much as you can. That way an eventual switch would have a low impact.
It sounds as if your use case is exactly what ArangoDB is designed for: you seem to need different data models (documents and graphs) in the same application and might even want to mix them in a single query. This is where a multi-model database such as ArangoDB shines.
If MongoDB has served you well so far, then you will immediately feel comfortable with ArangoDB, since it is very similar in look and feel. Additionally, you can model graphs by storing your vertices in one (or multiple) collections, and your edges in one or more so-called "edge-collections". This means that individual edges are simply documents in their own right and can hold arbitrary JSON data. The database then offers traversals, customizable with JavaScript to match any needs you might have.
For your variations of the queries, you could for example add attributes about these embargos to your vertices and program the queries/traversals to take these into account.
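To illustrate the idea of attributes on vertices and edges steering the traversal, here is a minimal, database-independent Python sketch of Dijkstra with embargo and flood filters. It is not ArangoDB's traversal API; the toy graph, the attribute names and the choice to model a flooded region as a set of vertices are all assumptions made for the example. Skipping an edge is equivalent to giving it infinite weight.

```python
import heapq

# Hypothetical toy world: each vertex has a country, each edge a cost and a mode.
COUNTRY = {"P1": "A", "P2": "B", "P3": "C", "P4": "C"}
EDGES = {
    "P1": [("P2", 1, "land"), ("P3", 5, "sea")],
    "P2": [("P4", 1, "land")],
    "P3": [("P4", 1, "land")],
    "P4": [],
}


def cheapest_route(start, goal, origin_country,
                   embargoes=frozenset(), flooded=frozenset()):
    """Dijkstra that skips embargoed vertices and flooded land edges.

    embargoes: set of (blocking_country, origin_country) pairs.
    flooded:   set of vertices where land transport is impossible.
    Returns (cost, path) or None if no route exists.
    """
    queue = [(0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, weight, mode in EDGES[node]:
            # Items from origin_country may not pass through embargoing countries.
            if (COUNTRY[nxt], origin_country) in embargoes:
                continue
            # No land transport into or out of a flooded vertex.
            if mode == "land" and (node in flooded or nxt in flooded):
                continue
            heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return None


print(cheapest_route("P1", "P4", origin_country="A"))
print(cheapest_route("P1", "P4", origin_country="A", embargoes={("B", "A")}))
```

With around 1000 vertices a search like this is cheap, so results could be cached per combination of origin country and active restrictions and dropped whenever a restriction changes.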
The ArangoDB database is licensed under the Apache 2 license, and community as well as professional support is readily available.
If you have any more specific questions do not hesitate to ask in the google group
https://groups.google.com/forum/#!forum/arangodb
or contact
hackers (at) arangodb.org
directly.
Neo4j's pricing is actually quite flexible, so don't be put off by the prices on the website.
You can also get started with the community edition or personal edition for a long time.
The Neo4j community is very active and helpful and quickly provides support and help for your questions. I think that's the biggest plus besides performance and convenience.
In general using a graph model
Regarding your use-case:
Neo4j is used exactly for this route calculation scenario by one of the largest logistic companies in the world where it routes up to 4000 packages per second across the country.
And it is used in other game engines, like here at GameSys for game economy simulation and in another one for the routing (not in earth coordinates but in game-world-coordinates using Neo4j-Spatial).
I'm curious why you have so few nodes. Are those like transport portals? I also wonder where you store the details and the dynamics of the routes (like the criteria you mentioned). Are they coming from the outside, from the in-memory state of the game engine?
You should probably share some more details about your model and the concrete use-case.
And it might help to know that both Emil, one of the founders of Neo4j and I are old time players of multi user dungeons (MUDs), so it is definitely a use-case close to our heart :)

Business Rules Engine - Discrete Choice Modeling

Greetings,
I am currently in search of a framework that could be used in the development of a system that will find the best option based on a series of responses provided by a user, in a closed survey format.
Our company offers several service plans, and the idea behind this system is that the user can respond to questions (text format), and these answers can be mapped to the service plan that best meets the customer's needs. Each service plan has several attributes, and these attributes change over time, so we are looking for a flexible solution.
Would a Business Rule Engine be an adequate framework for this type of problem?
thank you!
You could do this with a rules engine.
However, what you are actually building is a survey. There is a lot of survey software that allows you to define conditional logic and branching within the survey.
It would probably be cheaper and faster for you to use survey software.
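If you do end up building it yourself, the mapping from answers to plans can be kept as data, so that changing plan attributes does not require code changes. A minimal Python sketch, with entirely hypothetical plans, attributes and survey questions:

```python
# Hypothetical plans; in practice these would live in a table that changes over time.
PLANS = {
    "basic":   {"max_users": 5,   "support": "email", "monthly_price": 10},
    "premium": {"max_users": 100, "support": "phone", "monthly_price": 50},
}


def matching_plans(answers):
    """Return the names of plans that satisfy every answer, cheapest first."""
    fits = [
        name for name, plan in PLANS.items()
        if plan["max_users"] >= answers["users"]
        and (not answers["needs_phone_support"] or plan["support"] == "phone")
    ]
    return sorted(fits, key=lambda name: PLANS[name]["monthly_price"])


print(matching_plans({"users": 3, "needs_phone_support": True}))
```

Once the conditions grow beyond a handful of checks like these, a rules engine or survey software with branching logic starts to pay off.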