Open Source collaborative filtering frameworks

I was wondering whether there are any open source frameworks that would help me add the following type of functionality to my website:
1) If I am viewing a particular product, I would like to see what other products may be interesting to me. This information may be deduced by calculating, for example, what other people in my region (or sharing any other characteristic of my profile) bought in addition to the product that I am viewing. Kind of like what Amazon.com does.
2) Deduce relationships between people based on their profile, interaction with one another on the website (via commenting on one another's posts, for example), use of the website in terms of areas most navigated, products bought in common, etc.
I am not looking for an open source website with this functionality, but something like an object model into which I can feed information about users and their use of the site, including rules about relationships, and then at a later point ask it the questions described in (1) and (2) above.
Any pointers to white papers / general information about best approaches to do this, or any related links will really help too.

(I am the developer of Taste, which is now part of Apache Mahout)
1) You're really asking for two things here:
a) Recommend items I might like
b) Favor items that are similar to the thing I am currently looking at.
Indeed, Mahout Taste is all about answering a). Everything it does supports systems like this. Take a look at the documentation to get started, and ask any questions to mahout-user#apache.org.
For 1b) in particular, Mahout has two answers:
If you are only interested in what items are similar to the current item, you would be interested in the ItemSimilarity abstraction in Mahout (org.apache.mahout.cf.taste.similarity.ItemSimilarity) and its implementations, like PearsonCorrelationSimilarity. Based on a set of user-item ratings, this could tell you an estimated similarity between any two items. You'd then just pick the most similar items. In fact, look at the TopItems class in Mahout which can just figure this for you quickly.
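To illustrate the idea behind item similarity, here is a rough plain-Python sketch. It is not Mahout's API: the toy `RATINGS` data and the names `item_pearson` and `most_similar_items` are made up for this example, and it simply assumes ratings are kept as a user-to-item dictionary.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
RATINGS = {
    "alice": {"book": 5.0, "lamp": 3.0, "mug": 4.0},
    "bob":   {"book": 4.0, "lamp": 2.0, "mug": 5.0},
    "carol": {"book": 1.0, "lamp": 5.0, "mug": 2.0},
    "dave":  {"book": 2.0, "lamp": 4.0},
}

def item_pearson(ratings, item_a, item_b):
    """Pearson correlation over users who rated both items; 0.0 if undefined."""
    pairs = [(r[item_a], r[item_b]) for r in ratings.values()
             if item_a in r and item_b in r]
    n = len(pairs)
    if n < 2:
        return 0.0
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def most_similar_items(ratings, target, how_many=2):
    """Return (item, similarity) pairs, most similar to `target` first."""
    items = {i for r in ratings.values() for i in r} - {target}
    scored = [(item_pearson(ratings, target, i), i) for i in items]
    scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken by item name
    return [(item, score) for score, item in scored[:how_many]]
```

Calling `most_similar_items(RATINGS, "book")` on this toy data ranks "mug" ahead of "lamp", because the users who liked the book also tended to like the mug.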
But also, you can combine a) and b) by computing recommendations, then applying a Rescorer implementation which then favors items that are similar to the currently-viewed item.
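The rescoring idea can be sketched in a few lines of plain Python. This is only an illustration of the concept, not Mahout's Rescorer interface: the function name `rescore`, the `weight` parameter and the multiplicative boost are all assumptions made for the example.

```python
def rescore(recommendations, similarity_to_current, weight=1.0):
    """Boost each estimated preference by the item's similarity to the
    currently viewed item, then re-sort.

    recommendations: list of (item, estimated_preference)
    similarity_to_current: dict item -> similarity to the viewed item
    """
    rescored = []
    for item, score in recommendations:
        # Negative or unknown similarity gives no boost (assumed policy).
        boost = max(0.0, similarity_to_current.get(item, 0.0))
        rescored.append((item, score * (1.0 + weight * boost)))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

For example, with recommendations `[("lamp", 4.5), ("mug", 4.0), ("pen", 3.0)]` and similarities `{"mug": 0.8, "lamp": -0.2}`, the mug moves to the top because it resembles the item being viewed.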
2) Yes, likewise, you would be interested in the UserSimilarity abstraction, its implementations, etc. This would deduce similarities based on item ratings. Mahout, however, does not help you deduce these ratings by, say, looking at user behavior. This is domain-specific and up to you.
Sound confusing? Read the docs and feel free to follow up on mahout-user#apache.org, where I can tell you more.
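Since turning behavior into ratings is up to you, here is one possible sketch in plain Python (not Mahout code). The event weights are invented for illustration, as are the names `implicit_ratings` and `cosine_user_similarity`; cosine similarity is used here just as one simple choice of user similarity.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical weights: how strongly each kind of event signals interest.
EVENT_WEIGHTS = {"view": 1.0, "comment": 2.0, "purchase": 5.0}

def implicit_ratings(events):
    """Turn (user, item, event_kind) tuples into user -> {item: score}."""
    ratings = defaultdict(lambda: defaultdict(float))
    for user, item, kind in events:
        ratings[user][item] += EVENT_WEIGHTS[kind]
    return {user: dict(items) for user, items in ratings.items()}

def cosine_user_similarity(ratings, user_a, user_b):
    """Cosine similarity between two users' implicit rating vectors."""
    a = ratings.get(user_a, {})
    b = ratings.get(user_b, {})
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The resulting scores could then be fed to whatever similarity or recommender component you use in place of explicit ratings.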

I am researching the same topic, as I'm working on a project to help people decide how to vote on California's complicated ballot measures. Here are some open-source collaborative filtering engines that I've found:
Vogoo (PHP)
acts_as_recommendable (Ruby on Rails)
Mahout (formerly Taste) (Java)
There's also a good overview of these engines here.

There are also the Duine framework and OpenSlopeOne.
But in my opinion, Mahout is still the best.
You can find a survey about Open Source Recommender Systems here:
http://girlincomputerscience.blogspot.com.br/2012/11/open-source-recommendation-systems.html
Hope it helps!

You can find a List of Recommender Systems here

Related

What would be a good "CMS" for me to use?

I'm looking for some sort of CMS system to implement here in terms of "documentation" system.
Now, I'm not too sure which system(s) would suit my needs best, so I thought I'd come here and type up my requirements so you could help me narrow down the different options.
One important note to make is that I'm not looking at a system where I can store certain documents (word, pdf, whatever). Rather at a system where I can type the "documentation"-text in some sort of post (like a blog).
Requirements:
- Multilanguage support
- Tagging
- Decent search support (tags, groupings, categories)
- Version-control of posts/articles
- Possibility of exporting post(s) to a pdf file
- Support for multi-user (usergroup X can only see those posts, usergroup Y can see others, etc...)
I know these are some strange requirements when they're all combined, and I reckon most of you would say that I'd have to develop something like this in-house rather than find a decent working product out there (open source if possible).
Nonetheless, I thought I'd at least ask the opinion of y'all.
Regards,
Tim
You definitely need a Wiki. It doesn't matter if it's for developers or end-users, it meets all of your criteria.
Things like exporting to PDF might not come standard, but that could be accomplished using a plugin. I've used a few wikis in the past: MediaWiki, OpenWiki, TWiki and currently ScrewTurn Wiki. They all have their pros and cons, and they all work on different systems (Apache, IIS, SQL-based, file-based, etc.).
I would suggest doing a little comparison to decide which wiki you like best. Whichever you pick will meet your needs.
Here is a comparison chart, hope this helps
good luck
-D
If you want to export to PDF with hi-resolution images, the following answer may help:
CMS and store hi-resolution images in generated pdf

Requirements for a game

I'm writing an iPhone game and I am trying to write some requirements documents. I have never written requirements before, so I got the book Software Requirements. I have not finished it yet, but I foresee some issues, as this book is targeted towards a business. I am the only person involved with this game, and I feel the main purpose of the requirements document should be to nail down as many conceptual ideas of how the game works as I can before I am deep into design or construction. Does anyone have suggestions on how I should lay this out? Should I still try to mimic the template provided in the book where it makes sense, or, since I am both the sole developer and product owner, should I just stick to game concepts?
You're right that traditional SRS documents don't really fit games documentation all that well. Games instead have a general Game Design Document. It's usually created before any work on the game begins, and it's often edited as the development process goes to keep straight the intended end-result and specifics of the game.
While business software requirements documents are like contracts between a client and developer on what to produce, game design docs are more often specifications from the designer to the artists and programmers on what exactly they need to develop.
There is no specific layout to use. But you should consider who you're writing the document for. Is it for a class, for yourself, for peers after the project is done? The level of detail and the kind of things you include will be different depending on your audience. The format itself is very flexible, as long as it's coherent.
Brenda Brathwaite has a good blog entry on this subject which you might find helpful.
There is a semi-recent article from gamedev.net on the subject as well.
[Poor Jacob, you just read a book on the topic, and, collectively, the SO community writes another one for you, along with extra links, and probably with diverging views ;-) ]
Although I'm not familiar with the book you mention in the question, I think the following suggestion may help you take the all-too-important question of requirements seriously, but also relax a bit about it.
Being a "team of one", it is particularly important, and somewhat paradoxical, that you go through the effort of formalizing the requirements. However, rather than putting too much emphasis on the form, you may find an Agile approach to development (and hence to requirements gathering) more appropriate. With regard to requirements, one of the main advantages of this approach is flexibility, i.e. the understanding that while they should be formalized (with limited time/effort), requirements should be allowed to change (within limits) as part of an iterative process towards production of the target product.
In very broad terms, this generally goes as follows:
- write "user stories"; these are individual "cards" (yes, physical cards, say 4 inches by 5 inches, are good, for you can then move them around, sort them, etc.)
- each story describes a particular feature of the application, here the game, from the end-user's perspective. You can/should start all cards with "As a user, I need the game to..." then follow with a particular feature, for example "... show my high score on the same page as the global high-scores are kept" [because ... here optional reasons for why the user may want this feature].
- review each story and assign a rough estimate of the time involved in implementing it
- review each story and assign a priority level (the scale may vary, but something simple like "Must have from Version 1.0", "Should eventually be in there, for sure", "Would be nice to have" and "Maybe nice to have...")
- organize releases on the basis of what you can do within, say, 2 or 3 weeks maximum. If a particular feature would take too long, schedule it for a later release.
- implement the features assigned to the current release
- iterate through this release cycle, reviewing the requirements as you go, since the relative importance of features, and also the need for new features, may become evident with the insight provided by using the [incomplete/imperfect] intermediate releases.
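If you would rather keep the cards in a file than on paper, the estimate/priority/release steps above can be sketched roughly as follows. Everything here is an assumption for illustration: the `Story` fields, the numeric priority scale (1 = must have) and the simple rule of filling each release in priority order until the time budget is used up.

```python
from dataclasses import dataclass

@dataclass
class Story:
    text: str             # "As a user, I need the game to..."
    estimate_days: float  # rough implementation estimate
    priority: int         # 1 = must have, larger = less urgent (assumed scale)

def plan_releases(stories, days_per_release=10):
    """Fill releases in priority order; start a new release when the
    next story would exceed the time budget of the current one."""
    releases, current, used = [], [], 0.0
    for story in sorted(stories, key=lambda s: s.priority):
        if current and used + story.estimate_days > days_per_release:
            releases.append(current)
            current, used = [], 0.0
        current.append(story)
        used += story.estimate_days
    if current:
        releases.append(current)
    return releases
```

Re-running the plan after each release, with updated estimates and priorities, mirrors the iteration step in the list.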
Books like the one you describe are aimed at a different audience, but there is value in the general concepts presented. Fully developed requirements documents are not as common as you might think. Don't let anyone make you think that you are a 'bad developer' for not having the most detailed requirements.
Requirements docs might be more important if you need to communicate the requirements with a co-developer.
If you are the sole developer, I would strongly recommend that you spend your efforts on the design and implementation of the game rather than on requirements. If you have a good idea of what you want to build, then let it flow as you build it.
Documentation can help you. The question is what is going to be most beneficial. Maybe design decisions are more critical than requirements for you but not for others. You'll maybe want to have a list of things that people have requested or ideas that you think of but cannot implement straight away. Sometimes a whiteboard can be handy for sketching out things, it's not just a tool for collaboration with other people.
Here's just a general approach...
Solidify the concept...write it in plain English first (ex: The game is a first person shooter. You kill zombies and hunt for treasure.)
Get a paper pad and pencil and draw out the general flow of the game and the main screens the users will encounter...main menu, options screen, help, etc. Make sure it makes sense.
Go to a site like mockingbird and create the detail wireframes for your screens...
Print these out and do some paper prototyping...i.e. put the printout in front of you and 'click' on a button...then bring up the appropriate screen...then click on another button, etc.
Once that makes sense, you can try to start coding your game.
Personally, I believe you should do this your own way. The most commonly available templates will not match your requirements. They might be suitable for a common commercial server application, but not for a game. And since iPhone gaming is a new trend, you may have to look at it from a different perspective. You may not be able to fill a document with standard requirements, and you may have a different set of new types of requirements.
Just a suggestion... Sign up with Google Sites, and create a private site with documentation of the game, requirements, technical aspects, work log, etc... You can share it with select people, and it always keeps edit history.
I like it better than a Wiki because it is more structured, and just plain simple to use.

Where can I learn about recommendation systems?

I'd like to play around with building a recommendations system, and by that I mean an algorithm that looks at preferences and/or reviews posted by a user and then makes recommendations for them, similar to what netflix or amazon use.
What are some good resources for learning how to write something like this? Where should I start?
Check out the Wikipedia page on the Netflix Prize and its discussion forum. Also, the somewhat related 2009 GitHub Contest is a good source for full source code on a number of different recommendation engines. And obviously there's also the Wikipedia page on the topic itself, which has some decent links.
If you start writing your own, you'll want to use a corpus. I'd actually recommend using the Netflix Prize's data set. Just carve the data set into two pieces. Train on the first piece and score your algorithm on the second piece.
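A minimal sketch of that train-and-score loop in plain Python, assuming the ratings are available as `(user, item, rating)` tuples. The function names, the 80/20 split and the item-mean baseline are choices made for this example (they are not part of the Netflix Prize tooling); RMSE is used as the score, which is the metric the Netflix Prize was judged on.

```python
import random
from math import sqrt

def split_ratings(ratings, test_fraction=0.2, seed=42):
    """Shuffle and carve the ratings into (train, test) pieces."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

def item_mean_predictor(train):
    """Baseline: predict each item's mean training rating,
    falling back to the global mean for unseen items."""
    totals, counts = {}, {}
    for _user, item, rating in train:
        totals[item] = totals.get(item, 0.0) + rating
        counts[item] = counts.get(item, 0) + 1
    global_mean = sum(totals.values()) / sum(counts.values())

    def predict(_user, item):
        if item in counts:
            return totals[item] / counts[item]
        return global_mean

    return predict

def rmse(predict, test):
    """Root mean squared error of `predict` on held-out ratings."""
    squared = [(predict(user, item) - rating) ** 2 for user, item, rating in test]
    return sqrt(sum(squared) / len(squared))
```

Typical use would be `train, test = split_ratings(data)` followed by `rmse(item_mean_predictor(train), test)`; any smarter algorithm should beat that baseline number.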
Addendum: A somewhat related and scary application of this sort of thing is predicting demographic information: a user's gender, age, household income, IQ, sexual orientation, etc. You could probably do most of these attributes with the Netflix Prize dataset with a fairly high degree of accuracy. Fortunately everyone in that dataset is just a number.
Take a look at pysuggest, a Python library that implements a variety of recommendation algorithms for collaborative filtering (the technique used by Amazon.com).

Largest possible group of friends in common?

I'm trying to come up with the largest possible group of friends that would theoretically get along with each other, i.e., each person in the group should know at least 50% of the other people in the group.
I'm trying to come up with an algorithm for this that doesn't take ridiculously long; Facebook's API/cross-server talk is pretty slow as is.
I was thinking I could start with the friend that has the most mutual friends with me first, and then add people to the group one by one. But who would I choose next?
Just interested in the theory, no code is necessary.
Edit: When I said "theory", what I really meant was: what's the next logical step, in plain English :) I was hoping I could code this up in an afternoon, but I guess this is a bit more complicated than I anticipated, and I'm not sure I want to spend weeks delving into heavy graph theory. Nevertheless, maybe someone else will find this interesting.
MIT did some work on social graphing a while back. Although it used mobile phone data, the clustering algorithms and other systems should still apply, even though they are constructed using different inputs and criteria.
There is more MIT chatter about social graphing going on at the moment. Definitely the place to look for technical pointers on this kind of thing.
Whilst the problem of graph enumeration from a given node to its edges is NP-complete for most useful problems, the application of the graph traversal and the wealth of information available might help you make this more efficient:
For any node (profile) N, you could data-scrape using Google or something similar to find the associated outgoing edges. This means that you can harness a cache of the pages and Google's search technology to mitigate having to traverse the edges yourself.
Social profiles contain tons of metadata. Developing a statistical analysis method for working out the likelihood of A knowing B without a direct path might be useful. After all, friends have a) similar locations and b) similar interests.
Other data, seemingly irrelevant, can provide a means for locating people likely to know each other, and then you can double-check the edges. Things such as chatter on boards about a band or gig, or people mentioning "cat fight" when Kate smacked Mary in the mouth.
The data just needs looking at in the right way, in the same way MIT looked at geographical statistics to determine relationships through phones.
Good Luck
There is an algorithm called SCAN. With some precalculation, it can cluster a network at a good speed.
You can find information about the algorithm here: SCAN: A Structural Clustering Algorithm for Networks
This is more "broad", but see if it helps to get ideas.
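As a plain-English next step for the 50% rule in the question: start with everyone, repeatedly drop the person who knows the fewest people still in the group, and stop as soon as everyone left knows at least half of the others. A rough sketch, assuming the friend lists have already been fetched into a dictionary (the name `mutual_group` and the data layout are made up here). This is a greedy heuristic, so it is not guaranteed to find the largest possible group.

```python
def mutual_group(friends, threshold=0.5):
    """Greedy peeling heuristic.

    friends: dict person -> set of people they know (assumed symmetric,
    and a person is not listed as their own friend).
    Returns a set in which everyone knows at least `threshold` of the others.
    """
    group = set(friends)
    while group:
        # How many people inside the current group does each member know?
        known = {p: len(friends[p] & group) for p in group}
        needed = threshold * (len(group) - 1)
        # Sorting first makes tie-breaking deterministic.
        worst = min(sorted(group), key=lambda p: known[p])
        if known[worst] >= needed:
            return group  # even the least-connected member qualifies
        group.remove(worst)
    return group
```

Because it only needs the friend lists once, all the slow API calls can be done up front and the peeling itself runs locally.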

What Makes a Great Functional Specification Great?

What qualities made it so great, and what made it stand out compared to the not-so-great specs that you've had to deal with? Or, if you've never worked with a good functional spec before, what sort of things would you expect in a great spec?
Sorry this is obviously subjective but I'm creating a functional spec (not my first) and it just occurred to me that I may get some good ideas from the bright folks on SO!
The Project Aardvark specs from Joel on Software are the best I've come across so far. Each screen is defined very well, with pictures. The main features of the software are described, as well as some technical details.
Sadly the specs I've received personally aren't that brilliant. Usually they are just a bulleted list of features they expect from each section of the system, and they expect you to work out all the details. Which is fine, I guess. However, I'm writing a game design document for an RPG game I'm working on as a personal project, and I think the specs I'm writing are very well written. I've divided the game into Sections such as
Characters
Weapons & Armor
Levels
Map
Physics
and so on, and described each section in terms of gameplay as well as some technical details. It's very easy to work through.
I also highly recommend reading the Painless Functional Specs Series from Joel on Software for anyone interested in writing better specs.
IMHO, a key quality should be that the functional spec specifies the "what" in great detail but not the "how". That way, the requestor (marketing?) gets the look & feel and feature set that they want, but the implementation is left to those who know it best -- the developers.
Obviously, the specification should be complete, consistent and comprehensible. IMO it should also be well-organized, in that it keeps all requirements for a specific part of the product together. I've more than once read specifications where requirements for some module were scattered throughout the whole document, e.g. the general description is in chapter 4, but additional requirements can be found in clauses in chapters 2, 5, 7 and appendix B. To work with such a specification, I first have to create a cross-reference map of requirements to modules.
A good spec should state what the application is supposed to do, in a clear manner.
This seems obvious, but the stuff I usually get is often very vague. Apparently it's not very easy for people to express what they want on paper, IF they even know what they want.