Survey on recommender system design approaches for MOOC providers? - recommendation-engine

I have been searching academic literature and other online resources, but was unable to find high-level design approaches for how providers such as Coursera surface "Personalized for You" courses. LinkedIn Learning did have a succinct writeup, but that was the extent of my findings. Have others found relatively recent papers etc. discussing how these providers approach the problem of surfacing relevant material via collaborative and/or content based filtering?

Related

Scientific approach to evaluating software

I'm currently in school and have been tasked with objectively evaluating a piece of software (Atlassian's Jira platform). I'm having trouble staying objective. For example, saying that the platform is "easy to use" reflects my opinion of it rather than any evidence. So I'm curious: do you know of any scientific method for evaluating software or services? I've already run a survey asking users how they use Jira and what they think of the platform, but I feel this is not enough; I would like some numbers that point to how good or bad the software is.
The first thing to mention is that scientific work is always collective work. Keep in mind that others might already have done scientific work you can use, so either build a small team or look for well-founded published work, through the internet or through contacts at universities if you have them.
If no existing results fit, you have to create the knowledge yourself. In that case a mathematically grounded decision will help, and a decision table can be the basis for a scientific decision. A decision table requires a set of possible decisions and a set of factors to consider, each with a specific weight; it covers both the analysis and the synthesis. After you have created the decision table, discuss it with a critical team until the team agrees on the results (and consider offering them to the public).
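As a minimal sketch of how such a weighted decision table can be computed (the criteria, weights, and scores below are hypothetical placeholders, not measured values):

    # Minimal weighted decision-table sketch. All criteria, weights and
    # scores are hypothetical -- replace them with data from your survey
    # or from published studies.
    criteria = {            # factor -> weight (weights sum to 1.0)
        "usability":     0.4,
        "performance":   0.3,
        "documentation": 0.3,
    }

    # Candidate decisions, each scored 1-5 on every factor.
    candidates = {
        "Jira":        {"usability": 3, "performance": 4, "documentation": 4},
        "Alternative": {"usability": 4, "performance": 3, "documentation": 2},
    }

    for name, scores in candidates.items():
        total = sum(criteria[f] * s for f, s in scores.items())
        print(f"{name}: weighted score = {total:.2f}")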

Can Apache NiFi be used as an application server?

I'm an application developer; I mainly develop and maintain enterprise applications, like ERP and HCM systems. After being in the field for many years, I started feeling that the way business systems are developed is not quite right. After years of maintenance and enhancement by hundreds of developers, the system keeps getting bigger and more complex. In the end, it becomes impossible to make big changes in the system, because the logic is all tangled together like spaghetti. Developers are afraid of causing severe customer issues.
Recently I found the flow-based programming (FBP) paradigm invented by J. Paul Morrison, and I find it really interesting. I very much approve of the idea of doing application development by drawing diagrams visually. As we all know, to develop a business system we start by drawing a business flow diagram. Why can't the business flow diagram just be the system itself?
Naturally, I tried to find FBP implementations, and NiFi is the one that the FBP inventor recommends. I haven't dug very deep into NiFi yet.
After just watching some introduction videos and reading the documentation, I find that most of the time the NiFi experts are talking about using NiFi for IoT systems, real-time streaming, and that kind of thing. It seems that NiFi is not related to business systems.
I'm looking forward to someone clarifying my doubts: is NiFi suitable for building business transactional systems?
Apache NiFi is definitely used for many "business logic" systems, especially taking on the role of handling extract/transform/load logic (ETL). While not strictly an ETL tool, NiFi can facilitate data routing and simple event processing in a number of scenarios. The "Powered By NiFi" page lists some public use cases of NiFi, and many are for "business systems" that do not relate to IoT.
Sorry I didn't see your question before - your comments are interesting. I am surprised that you say that NiFi is the FBP software that I recommend - I do list it as a product that is closer to the "classical" FBP philosophy than what we call "FBP-like" or "FBP-inspired" systems, and I assume it is one of the few FBP products that are in the marketplace - unlike my work, which is all public domain. The terms "FBP-like" and "FBP-inspired" are actually thanks to Joe Witt, the developer of NiFi. I try to describe the difference between "classical" FBP and "FBP-like" in my article at https://jpaulm.github.io/fbp/noflo.html . With all due respect to Joe, I find NiFi a bit over-complex, although his data packets are immutable, which has certain advantages.
For a complete suite that takes you from a diagram to actual running code, I would suggest you start with the FBP diagramming tool, https://github.com/jpaulm/drawfbp , generate a JavaFBP network using https://github.com/jpaulm/javafbp , and run! Both of these tools, as well as others on https://github.com/jpaulm/ , are open source. My colleague, Bob Corrick, and I are working on a tutorial which you may find helpful: https://github.com/jpaulm/fbp-tutorial-filter-file .

Where can I learn about recommendation systems?

I'd like to play around with building a recommendation system, and by that I mean an algorithm that looks at preferences and/or reviews posted by a user and then makes recommendations for them, similar to what Netflix or Amazon use.
What are some good resources for learning how to write something like this? Where should I start?
Check out the Wikipedia page on the Netflix Prize and its discussion forum. Also, the somewhat related 2009 GitHub Contest is a good source for full source code on a number of different recommendation engines. And obviously there's also the Wikipedia page on the topic itself, which has some decent links.
If you start writing your own, you'll want to use a corpus. I'd actually recommend using the Netflix Prize's data set. Just carve the data set into two pieces. Train on the first piece and score your algorithm on the second piece.
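A minimal sketch of that procedure, assuming a toy list of (user, item, rating) triples in place of the real Netflix data and a trivial mean-rating baseline in place of a real algorithm:

    import random
    from collections import defaultdict

    # Toy stand-in for a ratings corpus: (user_id, item_id, rating) triples.
    ratings = [(u, i, random.randint(1, 5)) for u in range(100) for i in range(20)]

    # Carve the data set into two pieces.
    random.shuffle(ratings)
    split = int(0.8 * len(ratings))
    train, test = ratings[:split], ratings[split:]

    # "Train": predict each item's mean rating (a baseline, not a real model).
    sums, counts = defaultdict(float), defaultdict(int)
    for _, item, r in train:
        sums[item] += r
        counts[item] += 1
    mean_rating = {item: sums[item] / counts[item] for item in sums}

    # Score on the held-out piece with RMSE (the Netflix Prize metric).
    errors = [(mean_rating.get(item, 3.0) - r) ** 2 for _, item, r in test]
    print("baseline RMSE:", (sum(errors) / len(errors)) ** 0.5)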
Addendum: A somewhat related and scary application of this sort of thing is predicting demographic information: a user's gender, age, household income, IQ, sexual orientation, etc. You could probably infer most of these attributes from the Netflix Prize dataset with a fairly high degree of accuracy. Fortunately, everyone in that dataset is just a number.
Take a look at pysuggest, a Python library that implements a variety of recommendation algorithms for collaborative filtering (the approach used by Amazon.com).
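Independently of any particular library, the item-based collaborative filtering idea Amazon popularized can be sketched in a few lines of numpy (the rating matrix here is a made-up toy, and this is not pysuggest's actual API):

    import numpy as np

    # User x item rating matrix (0 = unrated); toy data for illustration.
    R = np.array([
        [5, 4, 0, 1],
        [4, 5, 1, 0],
        [1, 0, 5, 4],
        [0, 1, 4, 5],
    ], dtype=float)

    # Cosine similarity between item columns.
    norms = np.linalg.norm(R, axis=0)
    sim = (R.T @ R) / (np.outer(norms, norms) + 1e-9)

    # Predict user 0's rating for item 2 as a similarity-weighted average
    # of the items that user has already rated.
    user, target = 0, 2
    rated = R[user] > 0
    pred = sim[target, rated] @ R[user, rated] / (sim[target, rated].sum() + 1e-9)
    print(f"predicted rating: {pred:.2f}")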

What are some good resources for learning about Artificial Neural Networks? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
I'm really interested in Artificial Neural Networks, but I'm looking for a place to start.
What resources are out there and what is a good starting project?
First of all, give up any notion that artificial neural networks have anything to do with the brain beyond a passing similarity to networks of biological neurons. Learning biology won't help you effectively apply neural networks; learning linear algebra, calculus, and probability theory will. You should at the very least make yourself familiar with basic differentiation of functions, the chain rule, partial derivatives (the gradient, the Jacobian, and the Hessian), and matrix multiplication and diagonalization.
Really, what you are doing when you train a network is optimizing a large, multidimensional function (minimizing your error measure with respect to each of the weights in the network), so an investigation of techniques for nonlinear numerical optimization may prove instructive. This is a widely studied problem with a large body of literature outside of neural networks, and there are plenty of lecture notes on numerical optimization available on the web. To start, most people use simple gradient descent, but this can be much slower and less effective than more nuanced methods such as conjugate gradient or quasi-Newton approaches like L-BFGS.
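As a minimal illustration of that idea, here is plain gradient descent minimizing a toy two-weight error function (the function and learning rate are invented for the example):

    import numpy as np

    # Toy "error surface": E(w) = (w0 - 3)^2 + (w1 + 1)^2, minimized at (3, -1).
    def error(w):
        return (w[0] - 3) ** 2 + (w[1] + 1) ** 2

    def gradient(w):
        return np.array([2 * (w[0] - 3), 2 * (w[1] + 1)])

    w = np.zeros(2)            # initial weights
    lr = 0.1                   # learning rate (step size)
    for step in range(100):
        w -= lr * gradient(w)  # step downhill along the negative gradient

    print(w, error(w))         # w approaches (3, -1), error approaches 0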
Once you've got the basic ideas down you can start to experiment with different "squashing" functions in your hidden layer, adding various kinds of regularization, and various tweaks to make learning go faster. See this paper for a comprehensive list of "best practices".
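For reference, the usual squashing functions and a simple L2 regularization penalty can be sketched as follows (ReLU arrived later than the others but is now a common default):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))   # squashes to (0, 1)

    def tanh(x):
        return np.tanh(x)                 # squashes to (-1, 1)

    def relu(x):
        return np.maximum(0.0, x)         # not a squasher, but often learns faster

    def l2_penalty(weights, lam=1e-3):
        # A common regularizer: add lam * ||w||^2 to the error so that
        # large weights are penalized and the network generalizes better.
        return lam * np.sum(weights ** 2)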
One of the best books on the subject is Chris Bishop's Neural Networks for Pattern Recognition. It's fairly old by this stage but is still an excellent resource, and you can often find used copies online for about $30. The neural network chapter in his newer book, Pattern Recognition and Machine Learning, is also quite comprehensive. For a particularly good implementation-centric tutorial, see this one on CodeProject.com which implements a clever sort of network called a convolutional network, which constrains connectivity in such a way as to make it very good at learning to classify visual patterns.
Support vector machines and other kernel methods have become quite popular because you can apply them without knowing what the hell you're doing and often get acceptable results. Neural networks, on the other hand, are huge optimization problems which require careful tuning, although they're still preferable for lots of problems, particularly large scale problems in domains like computer vision.
I'd highly recommend this excellent series by Anoop Madhusudanan on Code Project.
He takes you through the fundamentals of how they work in an easy-to-understand way and shows you how to use his BrainNet library to create your own.
Here is an example of neural net programming:
http://www.codeproject.com/KB/recipes/neural_dot_net.aspx
You can start reading here:
http://web.archive.org/web/20071025010456/http://www.geocities.com/CapeCanaveral/Lab/3765/neural.html
For my part, I took a course on the subject and worked through some of the literature.
Neural networks are kind of déclassé these days. Support vector machines and kernel methods are better for more classes of problems than backpropagation. Neural networks and genetic algorithms capture the imagination of people who don't know much about modern machine learning, but they are not state of the art.
If you want to learn more about AI and machine learning, I recommend reading Stuart Russell and Peter Norvig's Artificial Intelligence: A Modern Approach. It's a broad survey of AI covering lots of modern techniques. It goes over the history and older techniques too, and will give you a more complete grounding in the basics of AI and machine learning.
Neural networks are pretty easy, though, especially if you use a genetic algorithm to determine the weights rather than proper backpropagation.
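A toy sketch of that approach, evolving the three weights of a single threshold neuron until it computes AND (population size, mutation scale, and so on are arbitrary choices):

    import random

    CASES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

    def fitness(w):
        # Count correct outputs: the neuron fires if w0*x0 + w1*x1 + bias > 0.
        return sum(int(w[0]*x0 + w[1]*x1 + w[2] > 0) == y for (x0, x1), y in CASES)

    population = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
    for generation in range(100):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == len(CASES):
            break
        # Keep the best half; refill with mutated copies of the survivors.
        survivors = population[:10]
        children = [[g + random.gauss(0, 0.3) for g in random.choice(survivors)]
                    for _ in range(10)]
        population = survivors + children

    print(population[0], fitness(population[0]))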
I second dwf's recommendation of Neural Networks for Pattern Recognition by Chris Bishop, although it's perhaps not a starter text. Norvig or an online tutorial (with code in Matlab!) would probably be a gentler introduction.
A good starter project would be OCR (Optical Character Recognition). You can scan in pages of text and feed each character through the network in order to perform classification. (You would have to train the network first of course!).
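As a sketch of how small such a starter project can be today, here is a one-hidden-layer network classifying scikit-learn's bundled 8x8 digit images, standing in for your own scanned characters:

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # 8x8 grayscale digit images, flattened to 64 features each.
    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.25, random_state=0)

    # One hidden layer of 32 units, trained with backpropagation.
    net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
    net.fit(X_train, y_train)
    print("test accuracy:", net.score(X_test, y_test))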
Raul Rojas' book is a very good start (it's also free). Haykin's book, third edition, is also very well explained, although it is a large volume.
I can recommend where not to start. I bought An Introduction to Neural Networks by Kevin Gurney which has good reviews on Amazon and claims to be a "highly accessible introduction to one of the most important topics in cognitive and computer science". Personally, I would not recommend this book as a start. I can comprehend only about 10% of it, but maybe it's just me (English is not my native language). I'm going to look into other options from this thread.
http://www.ai-junkie.com/ann/evolved/nnt1.html is a clear introduction to multi-layer perceptrons, although it does not describe the backpropagation algorithm.
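To fill that gap, here is a minimal sketch of backpropagation itself: a 2-3-1 sigmoid network learning XOR, with the chain-rule updates written out by hand (layer sizes, learning rate, and iteration count are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1 = rng.normal(0, 1, (2, 3)); b1 = np.zeros((1, 3))
    W2 = rng.normal(0, 1, (3, 1)); b2 = np.zeros((1, 1))
    sigmoid = lambda z: 1 / (1 + np.exp(-z))

    for _ in range(10000):
        # Forward pass.
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: chain rule, output layer first.
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        # Gradient-descent weight updates.
        W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(0, keepdims=True)
        W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(0, keepdims=True)

    print(out.round(2).ravel())  # approaches [0, 1, 1, 0]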
You can also have a look at generation5.org, which provides a lot of articles about AI in general and has some great texts about neural networks.
If you don't mind spending money, The Handbook of Brain Theory and Neural Networks is very good. It contains 287 articles covering research in many disciplines. It starts with an introduction and theory and then highlights paths through the articles to best cover your interests.
As for a first project, Kohonen maps are interesting for categorization: find hidden relationships in your music collection, build a smart robot, or solve the Netflix Prize.
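A minimal Kohonen map (self-organizing map) sketch in numpy, clustering random 3-D points (think RGB colours) onto an 8x8 grid; the decay schedules here are illustrative guesses, not tuned values:

    import numpy as np

    rng = np.random.default_rng(0)
    grid_w, grid_h, dim = 8, 8, 3             # 8x8 map of 3-D weight vectors
    weights = rng.random((grid_w * grid_h, dim))
    coords = np.array([(i, j) for i in range(grid_w) for j in range(grid_h)])
    data = rng.random((500, dim))             # e.g. 500 random colours

    for t in range(2000):
        lr = 0.5 * (1 - t / 2000)             # decaying learning rate
        radius = 4.0 * (1 - t / 2000) + 0.5   # decaying neighbourhood radius
        x = data[rng.integers(len(data))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best matching unit
        dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        pull = np.exp(-dist2 / (2 * radius ** 2))
        weights += lr * pull[:, None] * (x - weights)      # drag neighbourhood toward x

    # Nearby units now hold similar vectors: the map has organized itself.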
I think a good starting point is always Wikipedia. There you'll find useful links to documentation and to projects that use neural nets, too.
Two books that were used during my studies:
Introductory course: An Introduction to Neural Computing by Igor Aleksander and Helen Morton.
Advanced course: Neurocomputing by Robert Hecht-Nielsen
I found Fausett's Fundamentals of Neural Networks a straightforward and easy-to-get-into introductory textbook.
I found the textbook "Computational Intelligence" to be incredibly helpful.
Programming Collective Intelligence discusses this in the context of Search and Ranking algorithms. Also, in the code available here (in ch.4), the concepts discussed in the book are illustrated in a Python example.
I agree with the other people who said that studying biology is not a good starting point, because there is a lot of irrelevant information in biology. You do not need to understand how a neuron works to recreate its functionality; you only need to simulate its actions. I recommend How To Create A Mind by Ray Kurzweil; it goes into the aspects of biology that are relevant for computational models (creating a simulated neuron by combining several inputs and firing once a threshold is reached) but ignores the irrelevant details, like how the neuron actually adds those inputs together. (You will just use + and an inequality to compare to a threshold, for example.)
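That "+ and an inequality" really is the whole simulated neuron. As a sketch:

    def neuron(inputs, weights, threshold):
        # Combine several inputs and fire once the threshold is reached.
        return sum(w * x for w, x in zip(weights, inputs)) >= threshold

    # A neuron that fires only when both inputs are active (logical AND).
    print(neuron([1, 1], [0.6, 0.6], 1.0))  # True
    print(neuron([1, 0], [0.6, 0.6], 1.0))  # False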
I should also point out that the book isn't really about "creating a mind"; it only focuses on hierarchical pattern recognition / the neocortex. The general theme has been talked about since the 1980s, I believe, so there are plenty of older books that probably contain slightly dated forms of the same information. I have read older documents stating that the vision system, for example, is a multi-layered pattern recognizer. He contends that this applies to the entire neocortex. Also, take his "predictions" with a grain of salt: his hardware estimates are probably pretty accurate, but I think he underestimates how complicated simple tasks can be (e.g. driving a car). Granted, he has seen a lot of progress (and been part of some of it), but I still think he is over-optimistic. There is a big difference between an AI car being able to drive a mile successfully 90% of the time and the 99.9+% that a human can do. I don't expect any AI to be truly out-driving me for at least 20 years... (I don't count BMW's track cars that need to be "trained" on the actual course, as they aren't really playing the same game.)
If you already have a basic idea of what AI is and how it can be modeled, you may be better off skipping to something more technical.
If you want to quickly learn about applications of some neural network concepts on a real simulator, there is a great online book (now a wiki) called Computational Cognitive Neuroscience at http://grey.colorado.edu/CompCogNeuro/index.php/CCNBook/Main
The book is used at schools as a textbook, and takes you through lots of different brain areas, from individual neurons all the way to higher-order executive functioning.
In addition, each section is augmented with homework "projects" that are already done for you. Just download, follow the steps, and simulate everything that the chapter talked about. The software they use, Emergent, is a little finicky but incredibly robust: it's the product of more than 10 years of work, I believe.
I went through it in an undergrad class this past semester, and it was great. It walks you through everything step by step.

What is model driven development good for?

Microsoft, of Cairo fame, is working on Oslo, a new modeling platform. Bob Muglia, Senior Vice President of Microsoft Server & Tools Business, states that the benefits of modeling have always been clear.
In simple, practical terms, what are the clear benefits that Oslo bestows upon its users?
In theory, there are a few benefits:
The people with the business knowledge can create the software models so you're less likely to lose anything in translation.
When non-technical stakeholders create models, it forces them to "think like a developer". They see that what they considered obvious and easy is actually difficult once you formalize it.
It's more efficient. Business people have business knowledge and technical people have technical knowledge so, why not let each group design a system in their area of expertise? No more games of telephone as business experts re-explain what they mean to a developer. Developers are no longer distracted by cryptic business needs. They can focus on the interaction between highly technical systems.
In practice, it's a lot trickier:
Models are hard and that's that. Just because you push model creation to a different group doesn't mean you get foolproof models. Software development is all about modeling so developers are used to it. You may actually lose efficiency as a second group comes to grips with formalizing their understanding of a business need.
Model driven dev is tightly linked to OO concepts. OO is good for a lot of things, but not everything. What happens if what you really need falls outside the abilities of your modeling tool?
In my experience, the division between business and technical people is artificial. The most effective people are technical-minded business people or business-minded technical people. They make things happen. If you separate business tasks from technical tasks, you ruin the opportunity for cross-training and cross-thinking.
I think modeling is just about the next abstraction level. Once it is established it will lead to higher productivity.
MDSD today, mostly in the form of code generation, saves time. Duplicating working patterns across different parts of your software and writing only the real business code manually boosts productivity a little, but more importantly it leads to better software quality and a cleaner architecture.
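As a toy sketch of that kind of code generation (the "model" and the template are hypothetical), a declarative description is expanded into the repetitive code nobody wants to write by hand:

    # A declarative "model" of an entity, expanded into boilerplate code.
    model = {"entity": "Customer", "fields": ["id", "name", "email"]}

    template = """class {entity}:
        def __init__(self, {args}):
    {assignments}
    """

    code = template.format(
        entity=model["entity"],
        args=", ".join(model["fields"]),
        assignments="\n".join(f"        self.{f} = {f}" for f in model["fields"]),
    )
    print(code)  # a generated class, ready to write to a .py file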
I think the short answer is research projects!
A good place to start, though, if you're keen to look into it more, is Doug Purdy's PDC talk "A Lap Around Oslo", which you can see here. He explains how Oslo "captures the essence of the code without the ceremony"... whatever that means.
HTH.