Largest possible group of friends in common? - facebook

I'm trying to come up with the largest possible group of friends that would theoretically get along with each other, i.e., each person in the group should know at least 50% of the other people in the group.
I'm trying to come up with an algorithm for this that doesn't take ridiculously long; Facebook's API/cross-server talk is pretty slow as is.
I was thinking I could start with the friend that has the most mutual friends with me first, and then add people to the group one by one. But who would I choose next?
Just interested in the theory, no code is necessary.
Edit: When I said "theory", what I really meant what's the next logical step in plain english :) I was hoping I could code this up in an afternoon, but I guess this is a bit more complicated than I anticipated, and I'm not sure I want to spend weeks delving into heavy graph theory. Nevertheless, maybe someone else will find this interesting.

MIT did some work on social graphing a while back. Although it used mobile phone data, the clustering algorithms and other systems should still apply, even though they are constructed using different inputs and criteria.
There is more MIT chatter about social graphing going on at the moment. Definitely the place to look for technical pointers on this kind of thing.
Whilst the problem of graph enumeration from a given node to it's edges is NP complete for most useful problems ... the application of the graph traversal and the wealth of information might help you make this more efficient:
For any node (profile) N, you could data-scrape using Google or something to find associated edges out. This means that you can harness a cache of the pages and Googles search technology to mitigate having to traverse the edges yourself.
Social profiles contain tons of meta-data. Developing a statistical analysis method for working out the likelyhood of A knowing B without a direct path might be useful. Afterall friends have a) similar locations and b) similar interests
Other data, seemingly irrelevant can provide a means for locating people likely to know eachother and then you can double check the edges. Things such as chatter on boards about a band or gig, or people mentioning "cat fight" when Kate smacked Mary in the mouth.
The data just needs looking at in the right way, in the same way MIT looked at geographical statistics to determine relationships through phones.
Good Luck

There is an Algorithm called SCAN-Algorithm with some precalculations the algorithm can cluster a network in a good speed.
You can find informations about the algorithm here: SCAN: A Structural Clustering Algorithm for Networks

This is more "broad", but see if it helps to get ideas.

Related

Where can I learn about recommendation systems?

I'd like to play around with building a recommendations system, and by that I mean an algorithm that looks at preferences and/or reviews posted by a user and then makes recommendations for them, similar to what netflix or amazon use.
What are some good resources for learning how to write something like this? Where should I start?
Check out the Wikipedia page on the Netflix Prize and its discussion forum. Also, the somewhat related 2009 GitHub Contest is a good source for full source code on a number of different recommendation engines. And obviously there's also the Wikipedia page on the topic itself, which has some decent links.
If you start writing your own, you'll want to use a corpus. I'd actually recommend using the Netflix Prize's data set. Just carve the data set into two pieces. Train on the first piece and score your algorithm on the second piece.
Addenda: A somewhat related and scary application of this sort of thing is predicting demographic information: a user's gender, age, household income, IQ, sexual orientation, etc. You could probably do most of these attributes with the Netflix Prize dataset with a fairly high degree of accuracy. Fortunately everyone in that dataset is just a number.
Take a look at pysuggest a Python library that implements a variety of recommendation algorithms for collaborative filtering (which is used by Amazon.com).

What content have you made/seen made using procedural techniques

I was looking at some study i have to do in the future to do with procedural generation techniques and i was wondering what type of content you have:
Developed
Helped Develop
Seen implemented
Tried to develop
and what methods/techniques/procedures you used to develop it.
If you feel generous maybe you can even go into specifics of it such as data structures ad algorithms you have used to develop it.
If this needs to be put as community wiki because it is not me asking for a problem to be solved just let me know.
This is not a homework thread because it is a research unit that i'm not taking yet ;)
Introversion software, the makers of the games Defcon, Uplink and Darwinia (among others) have started working on a game about a year ago which extensively uses PCG for city generation, here is a video of their work, and you can read more about it on the development diary of the game (start from the first part at the bottom of the page!).
This immediately got me extremely interested, and seeing the potential for games I immediately started researching the technology. I have amassed a folder of 18 PDFs about the subject (research papers, SIGGRAPH presentations, etc). Here, I uploaded it for you.
The main approach is to use L-Systems, however, I never got around to understanding enough of that to make something out of this. I tried other, less successful approaches like using Voronois, recursively splitting a rectangular area into more smaller areas and shifting the boundaries a little to obtain a bit of randomness and polygon division.
The last method I had gotten from Mike's Code Blog's posts (here and here). The screenshots shown on his blog make me drool, it is my biggest programmer's dream to ever get something that looks like that. I emailed him to ask how he did it, and here is the relevant part of his reply, I'm sure he wouldn't mind me posting this here:
L-Systems is definitely one way to go, but that isn't what I'm doing. The basis of my method is polygon subdivision. I start with a simple polygon that represents the entire area of the city. Then, I split it (roughly) in half, and then split those two polygons, etc. until I get down to city-block size. At that point, the edges of all my polygons represent roads. I then use the same subdivision method to break the blocks down into building-size lots.
The devil is in the details, of course, but that is the basic method.
I for one still haven't managed to fully implement a solution of which I'm satisfied of, but it remains one of, if not my single biggest programmer's dream to ever achieve something like this.
Here are a few of the leaders in procedurally generated terrain (and to a lesser extent foliage). If you don't get a detailed answer here regarding methods and techniques, you might want to look in / ask in their forums. I have seen some discussions of techniques there.
TerraGen 2
World Builder
World Machine
Natural Graphics
Noone mentioned the demoscene that ONLY use procedural stuff?
So, go search for Werkkzeug, Kkrieger, MilkyTracker to start. Also you can visit the site pouet and see the wonder of well done procedural videos (yes, procedural videoclips! With music and graphics, all procedural!)
Allegorithmic's products are used in actual shipping titles. These guys focus on texture generation (both offline and at runtime).
They have some very pretty screenshots and demos.

Open Source collaborative filtering frameworks

I was wondering if there exists any open source frameworks that will help me include the following type of functionality to my website:
1) If I am viewing a particular product, I would like to see what other products may be interesting to me. This information may be deduced by calculating for example what other people in my region (or any other characteristic of my profile) bought in addition to the product that I am viewing. Kind of like what Amazon.com does.
2) Deduce relationships between people based on their profile, interaction with one another on the website (via commenting on one anotherĀ“s posts for example), use of the website in terms of areas most navigated, products bought in common etc.
I am not looking for a open source website with this functionality, but something like an object model into which I can feed information about users and their use of the site including rules about relationships and then at a later point ask it questions described in (1) and (2) above.
Any pointers to white papers / general information about best approaches to do this, or any related links will really help too.
(I am the developer of Taste, which is now part of Apache Mahout)
1) You're really asking for two things here:
a) Recommend items I might like
b) Favor items that are similar to the thing I am currently looking at.
Indeed, Mahout Taste is all about answering a). Everything it does supports systems like this. Take a look at the documentation to get started, and ask any questions to mahout-user#apache.org.
For 1b) in particular, Mahout has two answers:
If you are only interested in what items are similar to the current item, you would be interested in the ItemSimilarity abstraction in Mahout (org.apache.mahout.cf.taste.similarity.ItemSimilarity) and its implementations, like PearsonCorrelationSimilarity. Based on a set of user-item ratings, this could tell you an estimated similarity between any two items. You'd then just pick the most similar items. In fact, look at the TopItems class in Mahout which can just figure this for you quickly.
But also, you can combine a) and b) by computing recommendations, then applying a Rescorer implementation which then favors items that are similar to the currently-viewed item.
2) Yes likewise, you would be interesting the UserSimilarity abstraction, implementations, etc. This would deduce similarities based on item ratings. Mahout however does not help you deduce these ratings by, say, looking at user behavior. This is domain-specific and up to you.
Sound confusing -- read the docs and feel free to follow up on mahout-user#apache.org where I can tell you more.
I am researching the same topic, as I'm working on a project to help people decide how to vote on California's complicated ballot measures. Here are some open-source collaborative filtering engines that I've found:
Vogoo (PHP)
acts_as_recommendable (Ruby on Rails)
Mahout (formerly Taste) (Java)
There's also a good overview of these engines here.
There are also the Duine framework and OpenSlopeOne.
But in my opinion, Mahout is still the best.
You can find a survey about Open Source Recommender Systems here:
http://girlincomputerscience.blogspot.com.br/2012/11/open-source-recommendation-systems.html
Hope it helps!
You can find a List of Recommender Systems here

Group games to teach computer programming (either functional or imperative)

(See end for summary of updated question.)
I want to convey to groups of people (kids or adults) how a computer program written in a high-level language works, and what the relationship is of that program to the computer as a consumer device as they know it (a TV-like box that "does" typing and "internet").
I want to do it without computers. Not because I don't have them, but because I want a fun, physical activity that involves people the way acting, dance, music, sports, and capture-the-flag are fun.
I have read Teaching beginner programming, without computers here on stackoverflow; its reference to Computer Science Unplugged comes closest, but most of the activities there are either too complex, require too many props, or focus on specific computer science concepts.
I have also read Games that teach Programming Fundamentals but almost nothing matched my description in my first paragraph above.
And just for good measure, I have read Should functional programming be taught before imperative programming? so I am open to activities to teach either of those.
Keep in mind these requirements, some of which are subjective:
physical
no props (or very few)
fun
involves as many of the senses as possible
simulates the experience of writing a program and running it on a computer
no computers anywhere in the picture
is a game (competitive or cooperative)
It occurs to me that one source of material might be those team-building games that companies send you on. But those are designed for team-building, not teaching what writing and running a computer program is. But maybe you get the idea. Another way of looking at this question is to suggest what search terms I should use to find more answers -- though I usually can pick good search terms, an implicit "or" of "computers" and "games" will not find what I want because that combination is reserved for something totally different.
Update:
Thanks for responses so far!
I have now clarified that I'm interested in simulating the operation of a high-level-language program rather than either how the machine operates (1's and 0's) or specific concepts
With that clarification, you will be able to say specifically whether your game suggestion or game found teaches about functional or about imperative programming
With that clarification, please also respond to the part about games to teach the relationship of a computer program to the computer. What needs to be taught is that other consumer devices that physically look similar do not have "programs" -- why?
Your direct answers are much appreciated; if you can also find more ready-to-use sources beyond Computer Science Unplugged that will be great too
See my comments on answers so far, all of which are made in the spirit of thanks for what you've written, and not meant to be critical in any way.
Fundamentally, computers only do a few, very simple things:
They can do basic math,
They can move data from one place to another,
They can loop, and
They can make simple decisions.
The power of computers lies in the fact that they can do these simple things millions of times per second.
At the physical game level, I believe this is about all you can teach. Beyond that, I believe computer simulations and/or multimedia presentations are required (or, at the very least, a whiteboard).
1. Human Bubble Sort
Just test the Human Bubble Sort => ask a group of people - I'd recommend from min. 4 to max. infinite :-) - to sort themselves on the Bubble Sort principle, based on the alphabetical order of their family name.
Example : https://www.youtube.com/watch?v=8QD-R_MfDsQ
Works for kids and grownups.
2. Human Frenzy Robot
With physical people, paper sheets, and arrows + symbols written on them, reproduce the principle of the Frenzy Robot in real life. Look for "lightbot" on Google - I cannot post more than two links yet. I've just created my account to answer here :-)
3. Primo
For very young kids (after 4 years old), I really like Primo, a programmable small toy you put in motion on a grid => http://www.primotoys.com/
You could demonstrate thread locking by having two teams competing to get two halves of a key that opens the door to some reward (sweets for kids etc.). Each team grabs half the key each and then neither can open the door. If they cooperate then they both get the reward.
This might be a bit advanced - not sure now having re-read it.
It really was fun in CS Class: The Living Turing Machine.
You need:
Some place to place the formal rules of the machine, in the beginning it's pure chaos :-D
Humans:
a. A bunch of people that stand in line and simulate the linear memory, you just need a way to distinguish between 'ones' and 'zeros'. We did this by standing in the foreground or in the background, but I could also imagine other ways...
b. One person for every state of the machine
c. A 'reading head' which moves left or right on the memory.
Now you just need sample programs, start simply, for example with inverting a pattern. Then go on to more complex programs like increment/decrement.
For inspiration : an example of how physical people can materialize the Bubble Sort algorithm through dance => https://www.youtube.com/watch?v=lyZQPjUT5B4

Neural Networks or Human-computer interaction

I will be entering my third year of university in my next academic year, once I've finished my placement year as a web developer, and I would like to hear some opinions on the two modules in the Title.
I'm interested in both, however I want to pick one that will be relevant to my career and that I can apply to systems I develop.
I'm doing an Internet Computing degree, it covers web development, networking, database work and programming. Though I have had myself set on becoming a web developer I'm not so sure about that any more so am trying not to limit myself to that area of development.
I know HCI would help me as a web developer, but do you think it's worth it? Do you think Neural Network knowledge could help me realistically in a system I write in the future?
Thanks.
EDIT:
I thought it would be useful to follow-up with what I decided to do and how it's worked out.
I picked Artificial Neural Networks over HCI, and I've really enjoyed it. Having a peek into cognitive science and machine learning has ignited my interest for the subject area, and I will be hoping to take on a postgraduate project a few years from now when I can afford it.
I have got a job which I am starting after my final exams (which are in a few days) and I was indeed asked if I had done a module in HCI or similar. It didn't seem to matter, as it isn't a front-end developer position!
I would recommend taking the module if you have it as an option, as well as any module consisting of biological computation, it will open up more doors should you want to go onto postgraduate research in the future.
The worthiness depends on three factors:
How familiar are you with the topic already?
How good is the course/class you want to take?
What are your interested in more?
Especially for HCI, there is a broad range of "common sense" information you would also easily obtain from reading a good book or a wider range of articles about it also published on the internet. On the other hand, there indeed exist many deeper insights mostly obtained by Psychology studies. If the course is done right, you can indeed learn a lot about the topic and the real considerations to use for developing an interface.
For Neural Networks, one has to say that this is a typical hype topic. It would be mainly interesting in what application domain the course wants to deal with neural networks. You can be quite sure that you won't program or use any neural networks for web development. On the other hand, if the course is done right, this could be a good opportunity for you to broaden your knowledge. Especially, deepening your understanding about the theory of computer science. This highly depends on how the course is laid out, though.
HCI is a topic which helps your career as a web developer, but only if you feel incompetent in that topic (then it is a must) or it is done very well. Neural Networks is a topic which has more potential of being really interesting hardcore computer science stuff, where you indeed learn a better understanding about something. If you are interested in NN, you should not pass the opportunity to get an education which is not narrowly concentrated on the domain of web development -- and, after all, perhaps find more interest in other stuff (it is always good to know other directions you would perhaps like to go into for the future).
Neural networks sound cool until you read the fine print:
In modern software implementations of
artificial neural networks the
approach inspired by biology has more
or less been abandoned for a more
practical approach based on statistics
and signal processing.
This is something that has mystified me for years. Here you have an amazingly complex and powerful control system (real-world biological neural networks), and an academic discipline that appears to be about modeling these systems in software but that has in reality abandoned that activity.
If you're doing web development, your time is probably better spent in the HCI course.
Go with what interests you the most. The HCI stuff will be much easier to pick up later as needed, you'll likely never get another chance to learn about neural networks!
For prospective employers (at least the good ones!) you need to show a passion and excitement about what you do. I'd sooner hire someone who can enthusiastically talk about neural networks than someone who has an extra credit in HCI.
Unless you want to do the research end of the world, ie, get a Masters/PhD, go HCI.
I studied Neural Computation at University when I studied AI. I now run my own company. The number of times since I studied that I have used my NN skills equals zero. I'm glad I did it, as it was quite fascinating, but I would have found HCI much more useful from the position I'm at now. I think that you'd pick up a lot more insight from an HCI course relevant to the software industry, but if you think you experience should be more on the esoteric/almost arty side of development, go for NN.
Which sounds like more fun? Or, equivalently, which will you work harder at? Pick that one.
Did two courses in NN and some other AI-courses - its fun to poke round with that stuff and I actually managed to implement the stuff in some of the things I've done like face-recognition, and it's useful in some other areas to if you wanna plot your lab data etc. I have never used the NN:s in my web development career though I am sure it could be used for something however what it all really boils down to is to find a client or employee willing pay for it when you can just take the straight path. So I would rather read book about it if I wasn't that hardcore about it.
Fundamental Neural Networks doesn't take to much knowledge in math, and was what I used in my first course.
as a programmer to be you need the knowledge of neural network. if parallel processing is the way to go in hardware then future programmers must be knowledgable in neural network. don't forget that NN works better with noise or imprecise data but other systems may not. Note that most data we use for analysis are sample data which is a fraction of the whole and you could imagine if some in the sample are way off. so you need knowledge of NN if you want to last in computer programming field.