Trying to get Facebook/Twitter/MySpace statuses and other data for statistics

I was wondering if anyone knows how to gather data from millions of people around the globe via these social networks in order to compute statistics. I need this for a project, and I do not need to know who actually posted the information (statuses, comments, profile details, etc.), so as not to break any data privacy laws.
I need to know things like how many people commented about Obama today and what their sex was (female or male), and things like that.
Is that possible in any way?
Thanks a million

I think you're asking if there are any resources to mine for social data.
Your best bet is to check out the Twitter or Facebook APIs. Variables like age, sex, location will probably be far more difficult to ascertain than raw status info, but it can be done.
For Twitter, I would recommend using the Twitter streaming API and filtering for specific keywords.
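As a rough illustration of the keyword-filtering idea, here is a minimal sketch that counts how many statuses mention each keyword. It does not call any real Twitter API: the `statuses` input, the `text` field name, and the function `count_keyword_mentions` are all assumptions for illustration, standing in for whatever the streaming API delivers.

```python
from collections import Counter
from typing import Iterable, List


def count_keyword_mentions(statuses: Iterable[dict], keywords: List[str]) -> Counter:
    """Count how many statuses mention each keyword (case-insensitive substring match)."""
    lowered = [keyword.lower() for keyword in keywords]
    counts = Counter()
    for status in statuses:
        # "text" is an assumed field name for the status body.
        text = status.get("text", "").lower()
        for keyword in lowered:
            if keyword in text:
                counts[keyword] += 1
    return counts


# Hypothetical sample data standing in for a stream of statuses.
sample = [
    {"text": "Obama spoke today"},
    {"text": "Lunch was great"},
    {"text": "Watching the OBAMA speech"},
]
mention_counts = count_keyword_mentions(sample, ["obama"])
print(mention_counts["obama"])
```

Demographic breakdowns such as sex would need an extra field on each status, which, as noted above, is much harder to obtain than the raw text.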

For MySpace, use the Real-Time Stream PUSH feature: http://wiki.developer.myspace.com/index.php?title=Category:Real_Time_Stream

The best tool so far for getting data from most social network sites is this; build an algorithm that suits your needs from the data you collect.


How much, what order and where to put data?

I've been updating and moving my massage business website to WordPress. During the SEO process I became interested in structured data and decided to include some, but I'm a bit confused about how to do it properly. I'm going to test it first on my current site.
I'm going to present information with JSON-LD and I've been reading a lot of schema.org manuals and blog posts about the schemas, but they are still a bit vague to me.
How much data should I provide?
I would like to present a list of the services we provide, a price range given by currency/minPrice/maxPrice, and data about the people who work there (name, profession, phone).
Would it be wise to put that data in the <head> section of every page?
Or just the data specific to the page it relates to, like staff info on the "Contact Us" page and the service list on the "Services" page?
Is there any penalty or downside to having all that data on every page?
How do I present the courses or other studies that each person has taken?
How do I present those services?
Can a business typed as HealthAndBeautyBusiness carry 3 phone numbers with names, or should I just put the contact info under each person's data?
Does it matter in which order I present that data?
The more data you provide, the better.
Better to be specific; otherwise it could be interpreted as spam. The structured data should be closely related to the content of the page itself.
You mean the employees? You could use the employee and alumniOf properties, but they don't match very well. I think such data is too detailed to be described at the moment; I would omit it for the time being.
List them as offers; see the makesOffer property.
I would limit it to 1 number.
The order doesn't matter.
In the future, try to split your questions; they would be much easier to answer that way.
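To make the makesOffer and employee suggestions more concrete, here is a minimal sketch that builds a JSON-LD block in Python and wraps it in a script tag. The business name, phone number, service, prices, and person are made-up placeholders, and the exact choice of properties is only one plausible reading of schema.org, not a verified recommendation for this site.

```python
import json

# All values below are hypothetical placeholders.
business = {
    "@context": "https://schema.org",
    "@type": "HealthAndBeautyBusiness",
    "name": "Example Massage Studio",
    "telephone": "+1-555-0100",
    # Services listed as offers, each with a price range.
    "makesOffer": [
        {
            "@type": "Offer",
            "itemOffered": {"@type": "Service", "name": "Classic massage"},
            "priceSpecification": {
                "@type": "PriceSpecification",
                "priceCurrency": "EUR",
                "minPrice": 40,
                "maxPrice": 80,
            },
        }
    ],
    # Staff listed with the employee property.
    "employee": [
        {"@type": "Person", "name": "Jane Doe", "jobTitle": "Massage therapist"}
    ],
}

json_ld = json.dumps(business, indent=2)
script_tag = '<script type="application/ld+json">\n' + json_ld + "\n</script>"
print(script_tag)
```

Following the advice above, a block like this would go only on the page whose content it describes, for example the offers on the "Services" page.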
"I'm going to present information with JSON-LD and I've been reading a lot of schema.org manuals and blog posts about the schemas, but they are still a bit vague to me."
Regarding this statement: I can only assume you are just learning about technologies such as JSON-LD and how they relate to the bigger picture that is the Semantic Web, also known as Web 3.0.
It sounds like you are on the right track. I would also suggest reading articles about APIs and the HTTP request life cycle.
-Happy Coding

How to apply collaborative filtering on no-rating system like Twitter, Facebook

I'm studying collaborative filtering and want to apply it to a social network like Twitter or Facebook. I tried some demos based on MovieLens and understood that users have to rate items to reflect their interest, and that these ratings are used as input for the recommendation algorithms. However, for social networks that have no rating feature, like Twitter or Facebook, how can I apply these algorithms?
If anyone has worked in this area, please give me some suggestions.
The keywords you should use in search are "implicit feedback". Luckily there are some good systems/approaches out there that allow you to work with such type of data.
Here is the one I consider the best: https://github.com/benfred/implicit. What's even better, the GitHub page links to the articles explaining the theory behind each of the approaches it uses. There are also a couple of tutorials that will help you write your first recommender system in no time. And it's incredibly fast: it took me 2 hours on a quad-core PC to calculate recommendations for 600K users based on 40M entries.
Instead of using explicit ratings, you can infer implicit ratings by defining your own weights for actions like:
Twitter: Retweet=1, Save=2, Both=3
Facebook: Like=1, Share=2, Both=3
Using this method, you maintain a 1-3 rating system that can be fed into the collaborative-filtering algorithm.
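The weighting scheme above can be sketched as follows. This is a minimal illustration using the Facebook weights from the answer; the event format `(user, item, action)` and the names `ACTION_WEIGHTS` and `implicit_ratings` are assumptions made for the example, not part of any library.

```python
from collections import defaultdict

# Weights taken from the scheme above: Like=1, Share=2, Both=3 (the sum).
ACTION_WEIGHTS = {"like": 1, "share": 2}


def implicit_ratings(events):
    """Turn (user, item, action) events into a {(user, item): rating} mapping.

    Repeating the same action on the same item does not raise the rating;
    only the set of distinct actions counts, so ratings stay in the 1-3 range.
    """
    actions = defaultdict(set)
    for user, item, action in events:
        if action in ACTION_WEIGHTS:  # ignore unknown action types
            actions[(user, item)].add(action)
    return {
        pair: sum(ACTION_WEIGHTS[action] for action in performed)
        for pair, performed in actions.items()
    }


# Hypothetical sample events.
events = [
    ("u1", "p1", "like"),
    ("u1", "p1", "share"),
    ("u2", "p1", "like"),
    ("u2", "p2", "share"),
    ("u2", "p2", "share"),
]
ratings = implicit_ratings(events)
print(ratings)
```

The resulting mapping can then be converted into whatever user-item matrix format the chosen collaborative-filtering implementation expects.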

Is it possible to get stocktwits sentiment indicator for a ticker via API

I looked at the API documentation and it was not immediately apparent to me. Is it available via partner access?
Also, the default rolling average for sentiment seems to be 7 days. Is there an option to change this? One obvious way of doing so is parsing the firehose, and some partners probably do that. I don't care for all that data or for parsing it, even in the unlikely scenario where I could get access to it.
The Sentiment data is only available to partners that license our API. Please touch base with us and let us know what you would like to do and about your paid product:
http://stocktwits.com/developers/contact
There is currently no option to change the rolling average. We have plans to add different time frames, as we agree this would be helpful.
We offer a financial sentiment API at Knowsis.
API docs are available here: http://knowsis.github.io

Geolocation APIs: SimpleGeo vs CityGrid vs PublicEarth vs Twitter vs Foursquare vs Loopt vs Fwix. How to retrieve venue/location information?

We need to display meta information (e.g., address, name) on our site for various venues like bars, restaurants, and theaters.
Ideally, users would type in the name of a venue, along with zip code, and we present the closest matches.
Which APIs have people used for similar geolocation purposes? What are the pros and cons of each?
Our basic research yielded a few options (listed in title and below). We're curious to hear how others have deployed these APIs and which ones are ultimately in use.
Fwix API: http://developers.fwix.com/
Zumigo
Does Facebook plan on offering a Places API eventually that could accomplish this?
Thanks!
Facebook Places is based on Factual. You can use Factual's API which is pretty good (and still free, I think?)
http://www.factual.com/topic/local
You can also use unauthenticated Foursquare as a straight places database. The data is of uneven quality since it's crowdsourced, but I find it generally good. It's free to a certain API limit, but I think the paid tier is negotiated.
https://developer.foursquare.com/
I briefly looked at Google Places but didn't like it because of all the restrictions on how you have to display results (Google wants their ad revenue).
It's been a long time since this question was asked but a quick update on answers for other people.
This post, right now at least, will not go into great detail about each service but merely lists them:
http://wiki.developer.factual.com/w/page/12298852/start
http://developer.yp.com
http://www.yelp.com/developers/documentation
https://developer.foursquare.com/
http://code.google.com/apis/maps/documentation/places/
http://developers.facebook.com/docs/reference/api/
https://simplegeo.com/docs/api-endpoints/simplegeo-context
http://www.citygridmedia.com/developer/
http://fwix.com/developer_tools
http://localeze.com/
They each have their pros and cons (e.g., Google Places only allows 20 results per query, and Foursquare and Facebook Places have semi-unreliable results), which are explained in a bit more detail, although not entirely, at the following link: http://www.quora.com/What-are-the-pros-and-cons-of-each-Places-API
For my own project I ended up going with Factual's API, since there are no restrictions on what you do with the data (it is one of the only ToS documents that I've read in its entirety). Factual has a pretty reliable API, and as a user of the API you may update, modify, or flag rows of the data. Facebook Places bases its data on Factual's, just another fact to give some perspective.
Hope I can be of help to any future searchers.
This is not a complete answer, because I haven't compared the given geolocation APIs, but there is also the Google Places API, which solves a similar problem to the other APIs.
One thing about SimpleGeo: its Location API mainly supports US (and Canada?) based locations. The last time I checked, it did not have many known locations for my home country, Germany.
Comparisons between places data APIs are tough to keep up to date, given the fast pace of the space and acquisitions like SimpleGeo and HyperPublic changing the landscape quickly.
So I'll just throw in CityGrid's perspective as of February 2012. CityGrid provides 18M US places, allowing up to 10M requests per month for developers (publishers) at no charge.
You can search using a wide range of "what" and "where" (Cities, Neighborhoods, Zip Codes, Metro Areas, Addresses, Intersections) searches including latlong. We have rich data for each place including images, videos, reviews, offers, etc.
CityGrid also has a developer revenue sharing program, where we'll pay you to display some places, as well as a large mobile and web advertising network.
You can also query places via the CityGrid API using the place and venue IDs of Factual, Foursquare, and other places providers. We aggregate data from several places data providers through our system.
Website: http://developer.citygridmedia.com/

Best Practice For 'Recent Activities of My Friends'

What are the best practices for building a 'Recent Activities of My Friends' feature on websites such as FriendFeed or Facebook? How can this data be stored and queried? Is there any publisher/subscriber technology or something else?
Thanks
You're much more likely to get an answer if you provide more details. For instance, it seems likely you'll be using a database. MySQL? Oracle?
Language? PHP? Perl? Python? Tell us more.
We are using MySQL and our primary language is Java.
I have no idea how such an architecture is built.
For example, when a user with 100 friends does something, all of those friends should be notified. Each friend should also see the recent activities of their own friends. However, I am not sure whether each user should have their own history or not. What is the best practice for this?