Where can I find a lot of data to feed an artificial intelligence? - email

I want to code an artificial intelligence. To teach her the language I can use Wikipedia offline, but for teaching her communication I need other sources. Do you know of any large data sources that fit this task and are freely available? For example chat logs, emails, forum content, or something similar?

for teaching her communication I need other sources.
Some ideas:
Search for video transcripts on YouTube (you may need to edit them for quality).
Search for transcripts of political debates in your country (they may be available for free on the internet).
Search for dialogue from theater plays in the public domain.
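Whichever source you pick, the raw text usually has to be turned into pairs of "something said" and "the reply to it" before it is useful for teaching conversation. Here is a minimal sketch of that step, assuming the transcript uses a "SPEAKER: text" layout (real transcripts vary, so the pattern is only a starting point; the names parse_turns and to_exchanges are made up for this example):

```python
import re

# Matches lines such as "ALICE: How are you today?" (speaker label, colon, utterance).
# The "SPEAKER: text" layout is an assumption; adjust the pattern to your source.
TURN = re.compile(r"^\s*([A-Z][A-Za-z .'-]{0,40}):\s*(.+?)\s*$")


def parse_turns(text):
    """Return (speaker, utterance) tuples, skipping lines without a speaker label."""
    turns = []
    for line in text.splitlines():
        match = TURN.match(line)
        if match:
            turns.append((match.group(1).strip(), match.group(2)))
    return turns


def to_exchanges(turns):
    """Pair each utterance with the next one whenever the speaker changes."""
    pairs = []
    for (speaker_a, said), (speaker_b, reply) in zip(turns, turns[1:]):
        if speaker_a != speaker_b:
            pairs.append((said, reply))
    return pairs


sample = """ALICE: How are you today?
BOB: Fine, thanks. And you?
[They sit.]
ALICE: Can't complain."""

exchanges = to_exchanges(parse_turns(sample))
```

Stage directions and other unlabeled lines are simply dropped here; for YouTube captions, which often have no speaker labels at all, you would need a different splitting rule.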

Related

transcribe a phone recording

There is a certain organization that periodically provides information in the form of a recorded message on a "hotline". Is there any open source solution (or set of components that could be "wired" together) that would allow me to present this information in text form on a web page?
Since it's the really easy part, I'm going to assume you can fetch the audio from the "hotline", i.e. you have direct access to the actual audio samples.
The hard part is transcribing the audio. You can start by having a look at Wikipedia and follow the links from there. One solution you could use would be CMU Sphinx. Google and other related search tools such as Google Scholar are likely to become your close friends :)
While there are a number of voice recognition engines available, their accuracy is far from perfect.
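To show how the pieces could be wired together, here is a rough sketch of the last step: taking whatever text the recognizer produces and turning it into a static web page. The transcribe argument is a placeholder for the engine you end up choosing (it is not a CMU Sphinx API), and the function names and page layout are assumptions made for this example. Because recognition accuracy is imperfect, the page labels itself as a machine transcription.

```python
import html
from datetime import datetime, timezone


def render_transcript_page(transcript, title="Hotline update", fetched_at=None):
    """Build a minimal static HTML page from plain transcript text."""
    fetched_at = fetched_at or datetime.now(timezone.utc)
    # Blank lines in the transcript separate paragraphs.
    paragraphs = [p.strip() for p in transcript.split("\n\n") if p.strip()]
    body = "\n".join("<p>{}</p>".format(html.escape(p)) for p in paragraphs)
    safe_title = html.escape(title)
    stamp = fetched_at.strftime("%Y-%m-%d %H:%M")
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>' + safe_title + "</title></head>\n"
        "<body><h1>" + safe_title + "</h1>\n"
        "<p><em>Machine transcription, may contain errors. Fetched " + stamp + " UTC.</em></p>\n"
        + body + "\n</body></html>\n"
    )


def publish(audio_bytes, transcribe, out_path):
    """Run a caller-supplied transcriber over the audio and write the page to disk.

    `transcribe` is hypothetical: any callable that takes raw audio bytes
    and returns text, for example a thin wrapper around a speech engine.
    """
    text = transcribe(audio_bytes)
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(render_transcript_page(text))
```

Escaping the text with html.escape matters because the recognizer output is untrusted input as far as the web page is concerned.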

DNN CMS training

What's the best way to start training an end user in a CMS like DotNetNuke?
The end users will want to add, edit, and delete their own content. They will need to install modules and understand how everything works.
Should I create a manual? Is there a way to plan some training?
Any ideas?
Edit: the end users are VERY IT-illiterate; they struggled to even understand the rich text editor. I need to train them on how to use the Form and List module and the HTML module for editing content. They want a document of some sort, which is really old school.
PD24, for what most customers do, it usually only takes 5-10 minutes of training. I usually create a couple of videos with Jing, which is a free screen and audio recording tool. I record myself with a voice-over as I create a page, edit text, add photos, and add modules. Then I send them the links so they can refer back if they ever need a reminder.
Works great! (Boooo to manuals: no one reads those and they take a lot of time to make!)
Also, DNNcreative is probably too detailed for your client; it's a good resource for DNN implementers.
We have a variety of videos in the video library on DotNetNuke.com; you could point users to those for specific topics.
We (DotNetNuke Corp) also provide custom training solutions. We could develop a custom training program for your client that fits the scope of your project and delivery requirements. If you want more info, feel free to email me at training@dnncorp.com.
Have a look at www.dnncreative.com; they have some awesome tutorials for developers and users.

Resources for Scantron Cognition Enterprise?

I am using Scantron Cognition Enterprise at work to capture data from scanned forms. Building these forms is tedious at best, especially when it would be nice to have a library of pre-built objects to use. Unfortunately, documentation and on-line resources are scarce.
Does anyone have any pointers to find some resources for this tool?
Hey Jason, believe it or not, Scantron is STILL the standard, but this is not the Scantron you probably remember. Although OMR (bubble) forms are still used extensively in education, there are a lot more advanced technologies available to be added to them today.
Concerning Cognition, I looked through the available tags and these would fit:
"document-imaging" - Cognition is a document imaging product and can feed images and index values into most commercially available document storage applications
"OCR" - Optical Character Recognition, or reading machine print.
"ICR" - Intelligent Character Recognition, or reading handwriting, usually in a constrained print format (one letter per box, like a credit card application).
"datacollection" - the key purpose of Cognition is data collection.
However, there is not a tag for "OMR" - Optical Mark Recognition, or reading bubble choices, similar to the basic Scantron forms of the past. Also, I could not find one for "Key From Image", another purpose that Cognition is used for.
I am a Cognition user as well as someone who markets it, and I know that there are a large number of users in North America. Many corporations that use Cognition use it for sensitive HR functions and so might not have their usage of it posted in a searchable format. Many other organizations use it for safety inspections, insurance data entry, and also for testing and surveys: basically anywhere you have a large number of paper forms and need all of the data quickly entered into a database.
Many users are using Cognition for sensitive applications and so are not likely to share, but I can share a few forms I have. You could also contact your Scantron rep, who might have something to share as well. I have some decent ICR fields built for name, e-mail, address, etc. The ICR fields work best when you build in your own dictionary or database look-ups. The OMR fields are the hard ones to build, but I have a few of these as well. The easiest way to share these is to send you the form that already has the field built into it. You can build your own lookups from txt, xls or db files.
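To illustrate why a dictionary look-up helps an ICR field, here is a generic sketch of the idea outside Cognition itself. It is not Cognition's API or file format: the function names, the one-value-per-line txt layout, and the 0.8 similarity cutoff are all assumptions made for this example. A recognized string is snapped to the closest known value, and anything that is not close enough is left for manual review.

```python
import difflib


def load_lookup(path):
    """Read one valid value per line from a plain-text lookup file."""
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def correct_field(raw, valid_values, cutoff=0.8):
    """Snap a recognized string to the closest known value.

    Returns None when nothing is similar enough, so the field can be
    routed to a person instead of being silently "corrected".
    """
    by_key = {value.casefold(): value for value in valid_values}
    matches = difflib.get_close_matches(
        raw.strip().casefold(), list(by_key), n=1, cutoff=cutoff
    )
    return by_key[matches[0]] if matches else None


cities = ["Springfield", "Shelbyville", "Capital City"]
fixed = correct_field("Springfie1d", cities)  # a "1" misread in place of "l"
unknown = correct_field("Qwerty", cities)
```

The smaller and more specific the lookup list is (city names for a city field, not a general dictionary), the more misreads it can resolve safely.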

Is there an off the shelf CMS that can be used as a back end for smartphone travel guide apps?

I'm wondering if there's an off the shelf CMS available that is similar to something like Mobile Roadie, i.e. one that will allow you to create multiple versions of one application? I'm looking to develop some mobile travel guides for iPhone/Android/Blackberry etc., and rather than get a CMS built, I'd like to see if there's something out there that is similar to WordPress in that it will allow us to input text, images, Google Maps details, phone numbers, email addresses and potentially some audio/video content.
If anyone knows of anything, I'd love to hear about it. Also, if you have any ideas regarding pricing, that would be extremely helpful! Thanks in advance for your assistance.
The chances of you finding something "off the shelf" diminish as your requirements get more specific. You want something for a limited and specific target audience (iPhone, Android, Blackberry) that can deliver many different types of very specific content (addresses, maps, text, images, video).
From my experience of building a CMS for one of the world's most famous travel guides, I can say your chances are slim indeed. The technical requirements of managing this type of information are huuuuuge!
But hell, I could be wrong! I hope you find something that solves your problem and you make the world a better place!
PS: Maybe you should simplify your requirements and build from there? Good luck. :)
I just dropped a reply on this question:
How to setup a CMS as a backend for iPhone app
You could look at this blog for a Drupal showcase:
http://drupal.org/node/900630
and at this WordPress plugin:
http://wordpress.org/extend/plugins/json-api/
Personally I am trying with tikiwiki.org but I am not sure yet if it is right.
Cheers
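The general pattern behind these suggestions is that the CMS exposes its content as JSON and the app turns it into its own records. A minimal sketch of the consuming side follows, written in Python for brevity. The payload shape (a "places" list with name, description, lat, lng and phone keys) is invented for this example and is not the schema of the Drupal or WordPress modules linked above; check the actual output of whichever CMS you pick.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Place:
    name: str
    description: str
    latitude: float
    longitude: float
    phone: Optional[str] = None


def parse_places(payload):
    """Convert a JSON document of guide entries into Place records.

    Entries without a name or usable coordinates are skipped, because
    they cannot be shown on a map.
    """
    places = []
    for item in json.loads(payload).get("places", []):
        try:
            places.append(Place(
                name=item["name"],
                description=item.get("description", ""),
                latitude=float(item["lat"]),
                longitude=float(item["lng"]),
                phone=item.get("phone"),
            ))
        except (KeyError, TypeError, ValueError):
            continue
    return places


# Hypothetical payload, as a CMS endpoint might return it.
sample = """{"places": [
  {"name": "Old Harbour", "description": "Fish market", "lat": "53.54", "lng": 9.99, "phone": "555-0100"},
  {"name": "Mystery Cafe", "description": "No coordinates yet"}
]}"""

places = parse_places(sample)
```

Keeping the parsing tolerant like this lets editors save half-finished entries in the CMS without breaking the app.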
We created a very flexible CMS called StorageRoom which we built specifically for mobile apps.
You could easily let users manage locations with maps and additional fields.

Anyone have a link to a technical discussion of anything akin to the Facebook news feed system?

I'm looking for a presentation, PDF, blog post, or whitepaper discussing the technical details of how to filter down and display massive amounts of information for individual users in an intelligent (possibly machine learning) kind of way. I've had coworkers hear presentations on the Facebook news feed but I can't find anything published anywhere that goes into the dirty details. Searches seem to just turn up the controversy of the system. Maybe I'm not searching for the right keywords...
@AlexCuse I'm trying to build something similar to Facebook's system. I have large amounts of data and I need to filter it down to something manageable to present to the user. I cannot use another website due to the scale of what I've got to work at. Also I just want a technical discussion of how to implement it, not examples of people who have an implementation.
Are you looking for something along the lines of distributed pub/sub with content-based filtering? If so, you may want to look into Siena and some of the associated papers, such as "Design and Evaluation of a Wide-Area Event Notification Service".
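In case the term is unfamiliar: in content-based pub/sub, a subscription is a set of constraints on event attributes rather than a topic name, and an event is delivered to every subscriber with a filter it satisfies. Here is a toy, single-process sketch of that matching rule. It is not Siena's API and leaves out everything that makes the distributed version hard (routing between brokers, covering relations between filters); the Broker class and the (attribute, operator, value) tuple format are assumptions made for this example.

```python
import operator
from collections import defaultdict

# Comparison operators a constraint may use.
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


class Broker:
    def __init__(self):
        # subscriber id -> list of filters; each filter is a list of constraints
        self._filters = defaultdict(list)

    def subscribe(self, subscriber, constraints):
        """Register a filter: a list of (attribute, op, value) that must all hold."""
        self._filters[subscriber].append(list(constraints))

    @staticmethod
    def _matches(constraints, event):
        for attribute, op, value in constraints:
            if attribute not in event:
                return False
            try:
                if not OPS[op](event[attribute], value):
                    return False
            except TypeError:
                return False  # incomparable types count as a non-match
        return True

    def publish(self, event):
        """Return the sorted ids of subscribers with at least one matching filter."""
        return sorted(
            subscriber
            for subscriber, filters in self._filters.items()
            if any(self._matches(f, event) for f in filters)
        )


broker = Broker()
broker.subscribe("alice", [("type", "=", "photo"), ("likes", ">", 10)])
broker.subscribe("bob", [("author", "=", "carol")])

popular = broker.publish({"type": "photo", "author": "carol", "likes": 25})
quiet = broker.publish({"type": "photo", "author": "dave", "likes": 3})
```

This linear scan over all filters is only meant to show the semantics; at the scale described in the question, the filters would need to be indexed and the matching spread across machines.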