Can an AIML chatbot support unlimited queries from users? - chatbot

I want to make a chatbot with AIML to answer questions from users, but I was wondering if there is any limitation, because I think the chatbot will need to support an unlimited number of queries: we don't know how big this project will become.

Humans have a limit to the number of clients they can handle at once. Chatbots have no such constraint: they can handle as many queries as required.
You can use this kind of asset from the Unity Asset Store to implement and customise the chatbot quickly.
Do not worry about query limits; that can easily be scaled up.
You can use the free A.L.I.C.E. AIML set as well. It includes a knowledge base of approximately 41,000 categories. Here's an example of one of them:
<category>
  <pattern>WHAT ARE YOU</pattern>
  <template>
    <think><set name="topic">Me</set></think>
    I am the latest result in artificial intelligence,
    which can reproduce the capabilities of the human brain
    with greater speed and accuracy.
  </template>
</category>
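If it helps, here is a minimal sketch of loading an AIML set and answering queries with the python-aiml package (the package choice and the alice/ directory are my assumptions; any AIML interpreter works the same way):

# Minimal sketch: load AIML categories, then answer queries.
# Assumes the python-aiml package and a local "alice/" folder of .aiml files.
import glob
import aiml

kernel = aiml.Kernel()

# Learn every category file; the free A.L.I.C.E. set is just a folder of .aiml files.
for path in glob.glob("alice/*.aiml"):
    kernel.learn(path)

# Each respond() call is independent, so the bot answers as many queries
# as the hosting process can accept.
print(kernel.respond("WHAT ARE YOU"))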

I run Mitsuku, which is a very popular AIML chatbot. The website version alone deals with over a million interactions every month with no issues at all. It has over 350,000 categories and can deal with thousands of concurrent users.
It's hosted at www.pandorabots.com

Related

Creating a database of many products

I am currently creating an inventory app for iPhone, using Parse, for companies to keep track of all of their tools, supplies, and inventory. When a user/company adds a new item to their database, I'd like to give them the option to search a pre-made database of items. For example, a construction company adding a simple DeWalt drill battery to their inventory would search the pre-made database for "Dewalt #DC9096 18V XRP 2.4A Battery", or an office would search for pencils by brand/serial number/name.
I am looking for a simple way to make a database, or even a table, containing multiple brands' products, including their prices, product specifications, website for ordering more, company website, warranty phone number, etc. I have considered parsing all of the retail websites for the information, but I don't know the legalities behind it, and if the websites change then I'd need to update my code. If there is ANY (easier/better) way to do this, then assistance or direction would be great!
Thanks always
I would not go down the route of trying to parse websites; that will be a huge pain in the neck and impossible to maintain unless you have extensive resources (and, as you mention, it probably violates most sites' terms of service anyway). Your best bet would be to hook into existing product databases via an API, such as Google's Search API for Shopping, or maybe Amazon's API. Here's where you can start if you wanted to use Google:
https://developers.google.com/shopping-search/
Hopefully that gets you going in the right direction.
Edit: Here's a list of a lot more shopping APIs that could be good options:
http://www.programmableweb.com/apis/directory/1?apicat=Shopping
If you did find yourself needing to parse many different vendor websites (we'd call this "screen scraping") and you have the legal right to do so, you should use a tool like SelectorGadget to get your XPaths; it's much faster, easier and less error-prone than doing it by hand.
If you're doing more than a couple of websites, though, you'll probably find that you have to update the scraping rules pretty often; it definitely won't be a set-and-forget operation.
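To make the scraping idea concrete, here is a rough sketch (the URL and the XPath below are placeholders, not a real vendor site; you'd paste in whatever XPath SelectorGadget gives you):

# Hypothetical screen-scraping sketch; assumes you have the legal right to scrape.
import requests
from lxml import html

PRODUCT_URL = "https://example.com/products/dewalt-dc9096"   # placeholder URL
PRICE_XPATH = '//span[@class="product-price"]/text()'        # placeholder XPath from SelectorGadget

page = requests.get(PRODUCT_URL, timeout=10)
tree = html.fromstring(page.content)

prices = tree.xpath(PRICE_XPATH)
print(prices[0].strip() if prices else "price not found")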

Recommendation engine using google-prediction-api?

On Google's Prediction API page, it says we can use it for recommendation of webpages/products...
Can someone please show me how? For example:
I have 500,000 members' purchase history
I have 2,000,000 products in 200 different categories
User-X has just signed up; I asked him 15 'like'/'dislike' product questions (to learn the user's taste)
Now, I want to suggest/recommend to user-X a list (e.g. 500) of products which he is most likely willing to purchase.
Thanks a lot
If you are not specifically tied to the Google API for whatever reason, explore using Mahout. This is a basic use case for Mahout's recommendation mining.
https://cwiki.apache.org/MAHOUT/itembased-collaborative-filtering.html
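This is not Mahout's API, just a toy sketch of the item-based collaborative filtering idea it implements: score unseen products by their similarity to products the user already liked. All numbers here are made up.

# Toy item-based collaborative filtering over a tiny user-item matrix.
import numpy as np

# Rows = users, columns = products; 1 = purchased/liked, 0 = not.
ratings = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
], dtype=float)

# Cosine similarity between product columns.
norms = np.linalg.norm(ratings, axis=0)
norms[norms == 0] = 1.0
item_sim = (ratings.T @ ratings) / np.outer(norms, norms)

# User-X's like/dislike answers become a preference vector (shortened here).
user_x = np.array([1, 0, 0, 1], dtype=float)
scores = item_sim @ user_x
scores[user_x > 0] = -np.inf           # don't re-recommend items he already rated
print(np.argsort(scores)[::-1][:2])    # indices of the top-2 candidate products

At real scale (2,000,000 products, 500,000 users) you would let Mahout run this over Hadoop rather than hold the whole matrix in memory.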
The Google Prediction API, as currently implemented, is great for classifying data into a discrete set of categories; however, as noted in the documentation:
Avoid having a high ratio of categories to training data in categorical models. Try to have at least a few dozen examples for each category, minimum. For really good predictions, a few hundred examples per category is recommended.
The Prediction API's classification doesn't work well when the ratio of categories to examples is high, and in the example you sketched the relationship is one-to-one, because you are trying to find the user whose liked-product list is most similar to the user of interest (to find a set of promising products to recommend). In this model, each user is a unique category.

Geolocation APIs: SimpleGeo vs CityGrid vs PublicEarth vs Twitter vs Foursquare vs Loopt vs Fwix. How to retrieve venue/location information?

We need to display meta information (e.g, address, name) on our site for various venues like bars, restaurants, and theaters.
Ideally, users would type in the name of a venue, along with zip code, and we present the closest matches.
Which APIs have people used for similar geolocation purposes? What are the pros and cons of each?
Our basic research yielded a few options (listed in title and below). We're curious to hear how others have deployed these APIs and which ones are ultimately in use.
Fwix API: http://developers.fwix.com/
Zumigo
Does Facebook plan on offering a Places API eventually that could accomplish this?
Thanks!
Facebook Places is based on Factual. You can use Factual's API which is pretty good (and still free, I think?)
http://www.factual.com/topic/local
You can also use unauthenticated Foursquare as a straight places database. The data is of uneven quality since it's crowdsourced, but I find it generally good. It's free up to a certain API limit, but I think the paid tier is negotiated.
https://developer.foursquare.com/
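For what it's worth, a userless venue search looked roughly like this when I used it (endpoint and parameters are from memory of the v2 API, so check the current docs before relying on them):

# Rough sketch of an unauthenticated (userless) Foursquare v2 venue search.
import requests

params = {
    "client_id": "YOUR_CLIENT_ID",       # placeholder credentials
    "client_secret": "YOUR_CLIENT_SECRET",
    "v": "20120609",                     # API version date
    "near": "10001",                     # zip code typed by the user
    "query": "pizza",                    # venue name typed by the user
    "limit": 10,
}
resp = requests.get("https://api.foursquare.com/v2/venues/search", params=params)
for venue in resp.json()["response"]["venues"]:
    print(venue["name"], venue["location"].get("address", ""))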
I briefly looked at Google Places but didn't like it because of all the restrictions on how you have to display results (Google wants their ad revenue).
It's been a long time since this question was asked but a quick update on answers for other people.
This post, right now at least, will not go into great detail about each service but merely lists them:
http://wiki.developer.factual.com/w/page/12298852/start
http://developer.yp.com
http://www.yelp.com/developers/documentation
https://developer.foursquare.com/
http://code.google.com/apis/maps/documentation/places/
http://developers.facebook.com/docs/reference/api/
https://simplegeo.com/docs/api-endpoints/simplegeo-context
http://www.citygridmedia.com/developer/
http://fwix.com/developer_tools
http://localeze.com/
They each have their pros and cons (e.g. Google Places only allows 20 results per query; Foursquare and Facebook Places have semi-unreliable results), which are explained in a bit more detail, although not entirely, in the following link: http://www.quora.com/What-are-the-pros-and-cons-of-each-Places-API
For my own project I ended up deciding to go with Factual's API, since there are no restrictions on what you do with the data (one of the only ToS' that I've read in its entirety). Factual has a pretty reliable API through which, as a user of the API, you may update, modify, or flag rows of the data. Facebook Places bases its data on Factual's, just another fact to shed some perspective.
Hope I can be of help to any future searchers.
This is not a complete answer, because I haven't compared the given geolocation APIs, but there is also the Google Places API, which solves a similar problem to the other APIs.
One thing about SimpleGeo: the Location API of SimpleGeo mainly supports US (and Canada?) based locations. The last time I checked, my home country, Germany, doesn't have many known locations.
Comparison between places data APIs is tough to keep up to date, with the fast pace of the space and with acquisitions like SimpleGeo and HyperPublic changing the landscape quickly.
So I'll just throw in CityGrid's perspective as of February 2012. CityGrid provides 18M US places, allowing up to 10M requests per month for developers (publishers) at no charge.
You can search using a wide range of "what" and "where" (cities, neighborhoods, zip codes, metro areas, addresses, intersections) searches, including lat/long. We have rich data for each place, including images, videos, reviews, offers, etc.
CityGrid also has a developer revenue-sharing program where we'll pay you to display some places, as well as a large mobile and web advertising network.
You can also query places via the CityGrid API using Factual, Foursquare, and other providers' place and venue IDs. We aggregate data from several places data providers through our system.
Website: http://developer.citygridmedia.com/

Resources for Scantron Cognition Enterprise?

I am using Scantron Cognition Enterprise at work to capture data from scanned forms. Building these forms is tedious at best; it would be nice to have a library of pre-built objects to use. Unfortunately, documentation and online resources are scarce.
Does anyone have any pointers to find some resources for this tool?
Hey Jason, believe it or not, Scantron is STILL the standard, but this is not the Scantron you probably remember. Although OMR (bubble) forms are still used extensively in education, there are a lot more advanced technologies available to be added to them today.
Concerning Cognition, I looked through the available tags and these would fit:
"document-imaging" - Cognition is a document imaging product and can feed images and index values into most commercially available document storage applications
"OCR" - Optical Character Recognition, or reading machine print.
"ICR" - Intelligent Character Recognition - reading hand writing, usually in a constrained print format (one letter per box like a credt card application.
"datacollection" - the key purpose of Cognition is data collection.
However, there is not a tag for "OMR" - Optical Mark Recognition, or reading bubble choices, similar to the basic Scantron forms of the past. Also, I could not find one for "Key From Image", another purpose that Cognition is used for.
I am a Cognition user as well as someone who markets it, and I know that there are a large number of users in North America. Many corporations that use Cognition use it for sensitive HR functions and so might not have their usage of it posted in a searchable format. Many other organizations use it for safety inspections, insurance data entry, and also for testing and surveys - basically anywhere you have a large number of paper forms and need all of the data quickly entered into a database.
Many users are using Cognition for sensitive applications and so are not likely to share, but I can share a few I have; you could also contact your Scantron rep, who might have something they could share as well. I have some decent ICR fields built for name, e-mail, address, etc. The ICR fields are best when you build in your own dictionary or database look-ups. The OMR fields are the hard ones to build, but I have a few of these as well. The easiest way to share these is to send you a form that already has the field built into it. You can build your own lookups from txt, xls or db files.

Is there a well known classifier library?

I'm crawling data from the internet, without classifying it.
Is there such a library you could recommend?
EDIT
I'm crawling jobs from other websites, and I need to group them into different industries.
To sort unlabelled data into groups, you want clustering, not classification. The most complete machine learning library is the Java-based Weka. You'll probably want to start by extracting text from the web pages (remove script and style elements completely, strip other tags), and then running the text through the StringToWordVector filter before performing clustering.
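If Java/Weka isn't a hard requirement, the same extract-text, vectorize, cluster pipeline looks roughly like this in Python with scikit-learn (my substitution, not Weka's API):

# Cluster job postings by text similarity: TF-IDF vectors + k-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

job_posts = [
    "Senior Java developer for banking platform",
    "Registered nurse needed at city hospital",
    "Frontend engineer, React and TypeScript",
    "ICU nurse, night shifts, full time",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(job_posts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(labels)  # postings with the same label fall into the same group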
My current employer developed a system to categorize web pages. There were not any useful libraries that we could find, so we had to build our own. We do not license ours out.
I can give you some hints. Spam analyzers classify email into Junk or Not Junk. You can use the same tools, such as Bayesian filters, CRM-114, etc., to do your own classification on any text, including web pages.
You will have to watch the results of these very carefully and give them a lot of human feedback. You can often find keyword sets that will score very well for you. Finding those keyword sets will take time and effort and it will change some over time.
You will have to write code to divide web pages into topic sections because most pages are not all one thing. There are ad frames, navigation and other things.
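As a sketch of the Bayesian approach above (using scikit-learn's naive Bayes as a stand-in; CRM-114 or any other text classifier follows the same train-on-labelled-examples pattern, and the training strings here are invented):

# Train a small naive Bayes classifier on hand-labelled page text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_texts = [
    "hiring software engineer python backend",
    "construction site foreman wanted",
    "java developer for fintech startup",
    "experienced crane operator needed",
]
train_labels = ["tech", "construction", "tech", "construction"]

vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(train_texts), train_labels)

new_page = ["looking for a frontend javascript developer"]
print(model.predict(vectorizer.transform(new_page)))  # expected: ['tech']

The human-feedback loop described above then amounts to adding each corrected label back into the training set and retraining.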