Apache Mahout Advices? - recommendation-engine

Have you implemented an Apache Mahout recommendation engine? Any advice you can share? Do you know of any other sites that use Mahout?
Thanks!

You will get the best information about Mahout on the Mahout user mailing lists:
http://mahout.apache.org/mailinglists.html

The most useful brief, single page for me was the Mahout First-Timer FAQ. Based on what was written there, I decided to start with a non-distributed recommender instead of leaping into one of the Hadoop-based ones.
I also recommend the book Mahout in Action, which helped me a lot. Whether it's worth buying or not is discussed in this question, but it seems to get pretty positive reviews from most people (which I agree with).

Related

Getting started with Lift

I want to learn Lift. Unfortunately, all the documentation I have tried is either obsolete, unreadable, incorrect, or a combination of the above. I tried the following:
Simply Lift. Some things from the book that I tried led to errors.
Exploring Lift. The structure of the book is very bad. It's hard to read and try out code in the wild at the same time.
Lift in Action. The same as the previous one, but you need to pay for it.
P.S. I've seen similar questions. Most of them were asked a long time ago. Has the situation improved since they were written?
P.P.S. Are there any other type-safe Scala web frameworks? (Don't suggest Play 2.0. It's not type-safe, and I don't see any reason to create it in Scala.)
It is unfortunately true that the state of Lift documentation is uneven at best and there are huge gaping holes.
However, the Lift community is just full of awesomely helpful people.
My recommendation is not to play around, but rather to try and implement something. If you get stuck, ask specific, direct questions about what you're trying to do, how you're doing it and why it isn't working.
So far, though I would wish for better documentation, I've been able to get every answer that I needed either through Google or on the Lift mailing list - though I expect I might ask more questions here in the future.
The Lift documentation is not its strong point. The philosophy is more "try and ask if you have any problem". Here are a few tips:
Assembla
One resource that is really useful is http://www.assembla.com/wiki/show/liftweb. There are lots of examples, so you can progressively learn how it works.
Mailing List
Otherwise, you can always use the mailing list if you have specific questions, although in my opinion it is hard to search it quickly for a problem that someone has already encountered. http://groups.google.com/group/liftweb
Stack Overflow
Finally, a small community is present on Stack Overflow, so feel free to ask here. This is a good way of looking for answers and creating documentation at the same time.
Source code
Don't hesitate to explore the source code and the scaladoc if you have specific questions/doubts about the behavior of a function, they are often short and even sometimes commented! http://scala-tools.org/mvnsites/liftweb-2.4-M4/#package
Have a look at the Lift Cookbook: http://cookbook.liftweb.net/
"Simply Lift. Some things from the book I tried lead to errors."
What exact type of errors did you have? Have you tried following it with the "Simply Lift" examples that you can download from GitHub? https://github.com/dpp/simply_lift
The only errors I had were related to my lack of experience with SBT, but that's another story.
I started with Lift mostly from that source (Simply Lift + examples), and in combination with its great community and Google (ChrisJamesC has listed the main links nicely) it was quite okay for me.
I would suggest working through all the examples given in the "Simply Lift" tutorial, or at least working through them until you feel comfortable enough to jump right "in medias res" and try something by yourself. That was the best way of learning Lift for me.
Also, whenever you get stuck somewhere and can't find a solution on the web, your questions will be welcomed and answered on the Lift Google Group (https://groups.google.com/forum/?fromgroups=#!forum/liftweb). David Pollak is very often right there to answer your questions directly, so I have only words of praise for this framework's community and Lift's creator.
P.S. Lift's documentation could be better organized, and some things could certainly be better explained, but IMHO that was a small price to pay to enjoy such a beautiful framework. The learning curve is steeper than with Play, especially in the beginning, but after I "survived" the very first week it was almost impossible for me to give up all of its advantages and original concepts (Lift's "Seven Things") and switch to another framework.

Is there a central site/page for "advanced Scala" topics?

Despite having read "Programming in Scala" several times, I still often find important Scala constructs that were not explained in the book, like
@uncheckedVariance
@specialized
and other strange constructs like
new { ... } // No class name!
and so on.
I find this rather frustrating, considering that the book was written by the Scala "inventor" himself, and others.
I tried to read the language specification, but it's made for academics, rather than practicing programmers. It made my head spin.
Is there a website for "Everything "Programming in Scala" Didn't Tell You" ?
There was the daily-scala blog, but it died over a year ago.
Currently, we're working on a central documentation site for scala-lang.org. We're hoping that this solves a lot of the documentation issues that new users face. More details on this effort can be found at http://heather.miller.am/blog/2011/07/improving-scala-documentation/, but in summary...
Believe it or not, the Scala team has produced a lot of documents that simply aren't in HTML or are otherwise difficult to find, such as Martin's new Collections API, his document on Arrays, or Adriaan's on Type Constructor Inference.
One goal of such a site is to collect all of this documentation in one place, in a searchable, organized, and easy-to-navigate format.
Another goal is to collect excellent community documentation out there, and to put it in the same place as well. For that, we are actively looking for quality (article/overview-like) material with maintainers. Examples include the Scala Style Guide, and Daniel Spiewak's Scala for Java Refugees.
Yet another goal is to make it easy for contributors to participate, so the site is built from RST source, which will live in a documentation-only GitHub repo at https://github.com/scala/scala-docs.
So, in short, something better is on its way, and contributors are very welcome to participate.
EDIT: http://docs.scala-lang.org is now live.
Several documents considered to be rather detailed or even obscure are already available. These include all "Scala Improvement Proposals" (the proposals produced when new language features are suggested, which are usually very detailed and written by the implementers themselves). Also available are the entire glossary from Programming in Scala and Scala cheatsheets, amongst many other documents. The bottom line of the site is to be community-focused and contribution-friendly: free and totally open. Suggested topics to cover are also welcome.
Take a look at scalaz and the typelevel libraries (shapeless, spire, etc.); they rely on many advanced features of Scala.
*scalaz was for a time part of typelevel, but that is no longer the case.
Josh Suereth's book goes a little beyond the usual. It doesn't go as far as I'd like, but I'm not his core audience. Still, there's a lot of good stuff in there.
http://www.manning.com/suereth
Scala IRC: irc://irc.freenode.net/scala
Scala forum: http://scala-forum.org/
Blogs: Just look at http://planetscala.com/
Programming Scala (Wampler, Payne): http://ofps.oreilly.com/titles/9780596155957/
Programming in Scala (Odersky, Venners, Spoon) - good but Scala 2.8: http://www.artima.com/pins1ed/
The new documentation page is online:
http://docs.scala-lang.org/
I've kept a library of advanced Scala resources, primarily talks and blog posts. It's updated pretty regularly as I find new, interesting content.
Happy to add new links to it if anyone has recommendations.
Try reading the SBT source: https://github.com/harrah/xsbt/wiki
It's a good exercise. Also check out the book 'Scala in Depth' by Joshua D. Suereth: http://www.manning.com/suereth/
I believe there are a lot of good answers here, but I'd like to share my experience. I have been coding Scala for 2 years (not as my full-time job) and have been getting progressively better at it. My project is 97% Scala, and I have been able to do most of it with:
Programming Scala
The scala-user list
Stack Overflow
These cover most of what you need for the "user" side of Scala, meaning everything required to create a working application. However, if you want to write more complex code or create powerful typed libraries, you definitely need more.
If you want to go beyond the basics and are prepared to delve deeply into the type system and libraries, these are the alternatives I use:
Use the community; Scala enthusiasts are really nice. I have worked with folks from Specs, Scalaz and Lift.
IRC is really good, and some of the core contributors to the big libraries frequently show up.
Jump to source code, but don't try to understand everything. Scala type system can be daunting, however you normally don't need to understand 100% of it to use it.
If you really need to get into the nitty gritty details, hit the language specs, development list, and get to know the key people.
However you can really be very effective in Scala without needing to understand every single bit of the language.

Good Scala introductory article/video to whet the appetite

What are some good online articles or videos you've seen that would be most likely to get a developer interested in Scala? I'm looking for an introduction that is brief & to the point that dives right into example code, and would leave a developer who does not know Scala wanting to learn more about it.
Try in this order:
Pragmatic Real-World Scala - This video shows off all kinds of things that would make a Java developer drool.
Programming In Scala - This is simply a great general-purpose programming book. In addition to being a gentle, clear introduction to the language, it's also a fantastic introduction to functional programming concepts and language design. Even if you hate Scala, this book will make you a better programmer.
Scala For Java Refugees - Very well-written mostly gentle introduction to major Scala concepts.
Another tour of Scala - A Java-centric breakdown of fundamental Scala features.
I went to this talk, and it was excellent. I can't tell whether it is still there due to our internet restrictions; if it's not, I'll delete this post.
http://powerhost.powerstream.net/008/00102/100203Scala.wmv
I'd go straight to the horse's mouth, the Scala website itself: Code Examples.
http://www.escalatesoft.com/screencasts
Escalate software is in the process of creating a series of screencasts for Scala information sharing and training purposes. The first available screencasts are provided here for free and cover the new features of Scala 2.8. In the longer term we will create training materials in the form of these videos along with supporting material that will be for sale from this and other sources as well.
http://blog.jaoo.dk/2009/03/09/an-introduction-to-the-scala-programming-language-by-bill-venners/
Take a look at the following presentation by Jonas Bonér (a well-known figure in the Scala community, responsible for the Akka actor concurrency framework). I'm sure this will whet the appetite for Scala.
http://www.infoq.com/presentations/Scala-Jonas-Boner
A German introduction, which may be useful for you: http://www.rheinjug.de/videos/gse.lectures.app/Player.html#Scala
I would recommend Chapter 1. Zero to Sixty: Introducing Scala of the Programming Scala book by Dean Wampler and Alex Payne. The rest of the book is also great. The book is freely available online.
EDIT
I recently bought and read the Atomic Scala book by Bruce Eckel and Dianne Marsh. This is the best book I have read so far for anyone wanting to learn Scala.

Please suggest direction for my small scala project

As a 6-month academic project in college, my 3 friends and I are going to implement "Distributed Caching" in the Scala language.
We are new to both of these concepts and this is our first project, so I would be really happy if you could provide some direction.
I am currently learning Scala.
Please let me know which particular features of the language I should learn for this project.
Are there any online resources for learning distributed caching?
Thanks in advance.
You could have a look at Terracotta and especially at its uses in implementing Distributed Caching. You could have a look at the source code of the open source edition of Terracotta. Also, you could even consider Terracotta as your framework for building the distributed cache. I don't have any personal experience in using Terracotta with Scala, but it has been done.
Features of the language... Try starting with the Programming in Scala book. It's a very good resource. If you want to do any concurrency you will have to be proficient in using Actors. I would recommend having a look over all the features of Scala. Each one has its uses and you will need to know at least a bit of them to recognise situations in which to use their power. :)
-- Flaviu Cipcigan
You might want to look at the project Velocity page.
There is also an article on MSDN about distributed caching in general.
I'm not sure, but I think the Akka project might already be doing what you're looking for (and a whole lot more). Perhaps you can take inspiration from that.

Open Source collaborative filtering frameworks

I was wondering if there are any open source frameworks that will help me add the following type of functionality to my website:
1) If I am viewing a particular product, I would like to see what other products may be interesting to me. This information may be deduced by calculating for example what other people in my region (or any other characteristic of my profile) bought in addition to the product that I am viewing. Kind of like what Amazon.com does.
2) Deduce relationships between people based on their profile, interaction with one another on the website (via commenting on one another´s posts for example), use of the website in terms of areas most navigated, products bought in common etc.
I am not looking for an open source website with this functionality, but for something like an object model into which I can feed information about users and their use of the site, including rules about relationships, and then at a later point ask it the questions described in (1) and (2) above.
Any pointers to white papers / general information about best approaches to do this, or any related links will really help too.
(I am the developer of Taste, which is now part of Apache Mahout)
1) You're really asking for two things here:
a) Recommend items I might like
b) Favor items that are similar to the thing I am currently looking at.
Indeed, Mahout Taste is all about answering a). Everything it does supports systems like this. Take a look at the documentation to get started, and send any questions to mahout-user@apache.org.
For 1b) in particular, Mahout has two answers:
If you are only interested in what items are similar to the current item, you would be interested in the ItemSimilarity abstraction in Mahout (org.apache.mahout.cf.taste.similarity.ItemSimilarity) and its implementations, like PearsonCorrelationSimilarity. Based on a set of user-item ratings, this could tell you an estimated similarity between any two items. You'd then just pick the most similar items. In fact, look at the TopItems class in Mahout which can just figure this for you quickly.
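To make the item-similarity idea concrete, here is a minimal, self-contained Java sketch. It deliberately does not use Mahout's classes: `ItemSimilaritySketch`, `pearson`, and `mostSimilar` are hypothetical names chosen for illustration, and the in-memory map of item to user ratings is an assumed data layout. It computes a Pearson correlation over the users who rated both items, then picks the most similar items.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; this is NOT Mahout's implementation or API.
final class ItemSimilaritySketch {

    /** Pearson correlation of two items, using only users who rated both. */
    public static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        List<double[]> pairs = new ArrayList<>();
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                pairs.add(new double[] {e.getValue(), other});
            }
        }
        int n = pairs.size();
        if (n < 2) {
            return Double.NaN; // too little overlap to say anything
        }
        double meanX = 0.0, meanY = 0.0;
        for (double[] p : pairs) {
            meanX += p[0];
            meanY += p[1];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (double[] p : pairs) {
            double dx = p[0] - meanX;
            double dy = p[1] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        if (sxx == 0.0 || syy == 0.0) {
            return Double.NaN; // constant ratings: correlation undefined
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /** Returns up to howMany item IDs, most similar to itemId first. */
    public static List<Long> mostSimilar(
            Map<Long, Map<Long, Double>> ratingsByItem, long itemId, int howMany) {
        Map<Long, Double> target = ratingsByItem.get(itemId);
        if (target == null) {
            return Collections.emptyList();
        }
        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> e : ratingsByItem.entrySet()) {
            if (e.getKey() == itemId) {
                continue; // skip the item itself
            }
            double s = pearson(target, e.getValue());
            if (!Double.isNaN(s)) {
                scores.put(e.getKey(), s);
            }
        }
        List<Long> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((Long id) -> scores.get(id)).reversed());
        return ids.subList(0, Math.min(howMany, ids.size()));
    }

    public static void main(String[] args) {
        Map<Long, Map<Long, Double>> ratings = new HashMap<>();
        ratings.put(1L, Map.of(10L, 1.0, 11L, 2.0, 12L, 3.0));
        ratings.put(2L, Map.of(10L, 2.0, 11L, 4.0, 12L, 6.0));
        ratings.put(3L, Map.of(10L, 3.0, 11L, 2.0, 12L, 1.0));
        System.out.println(mostSimilar(ratings, 1L, 2));
    }
}
```

In a real system you would let the library do this work; the sketch is only meant to show what "similarity based on a set of user-item ratings" amounts to.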
But also, you can combine a) and b) by computing recommendations, then applying a Rescorer implementation which then favors items that are similar to the currently-viewed item.
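The rescoring idea can be sketched in plain Java as follows. This is an illustration under stated assumptions, not Mahout code: the `ItemRescorer` interface is a hypothetical stand-in for a rescoring hook, and the multiplicative boost with a `weight` parameter is just one arbitrary way to favor items similar to the one being viewed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names and the boost formula are assumptions.
final class RescoreSketch {

    /** Hypothetical stand-in for a rescoring hook; not Mahout's actual interface. */
    interface ItemRescorer {
        double rescore(long itemId, double originalScore);
    }

    /**
     * Builds a rescorer that boosts a score in proportion to the item's
     * similarity to the currently viewed item. Items with no known
     * similarity keep their original score.
     */
    public static ItemRescorer favorSimilarTo(
            Map<Long, Double> similarityToCurrent, double weight) {
        return (itemId, score) ->
                score * (1.0 + weight * similarityToCurrent.getOrDefault(itemId, 0.0));
    }

    /** Applies the rescorer and returns item IDs ordered by new score, best first. */
    public static List<Long> rerank(
            Map<Long, Double> recommendations, ItemRescorer rescorer) {
        Map<Long, Double> rescored = new HashMap<>();
        for (Map.Entry<Long, Double> e : recommendations.entrySet()) {
            rescored.put(e.getKey(), rescorer.rescore(e.getKey(), e.getValue()));
        }
        List<Long> ids = new ArrayList<>(rescored.keySet());
        ids.sort((x, y) -> Double.compare(rescored.get(y), rescored.get(x)));
        return ids;
    }

    public static void main(String[] args) {
        // Estimated preference per recommended item (made-up numbers).
        Map<Long, Double> recommendations = Map.of(1L, 4.0, 2L, 3.5, 3L, 3.0);
        // Similarity of each item to the one being viewed (made-up numbers).
        Map<Long, Double> similarity = Map.of(2L, 0.1, 3L, 0.9);
        System.out.println(rerank(recommendations, favorSimilarTo(similarity, 0.5)));
    }
}
```

The design choice here is to adjust scores after recommendations are computed, so the recommender itself stays unaware of what the user is currently viewing.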
2) Yes, likewise, you would be interested in the UserSimilarity abstraction, its implementations, etc. This would deduce similarities based on item ratings. Mahout, however, does not help you deduce these ratings by, say, looking at user behavior. This is domain-specific and up to you.
Sounds confusing? Read the docs and feel free to follow up on mahout-user@apache.org, where I can tell you more.
I am researching the same topic, as I'm working on a project to help people decide how to vote on California's complicated ballot measures. Here are some open-source collaborative filtering engines that I've found:
Vogoo (PHP)
acts_as_recommendable (Ruby on Rails)
Mahout (formerly Taste) (Java)
There's also a good overview of these engines here.
There are also the Duine framework and OpenSlopeOne.
But in my opinion, Mahout is still the best.
You can find a survey about Open Source Recommender Systems here:
http://girlincomputerscience.blogspot.com.br/2012/11/open-source-recommendation-systems.html
Hope it helps!
You can find a List of Recommender Systems here