About Google's PubSubHubbub drawback - feed

I was reading a paper titled "An Optimized Web Feed Aggregation Approach for Generic Feed Types" in which Google's PubSubHubbub protocol was discussed. The paper stated its drawback roughly as follows:
Furthermore, there are patch systems such as pubsubhubbub (Google 2010) which can be seen as a moderator between feed readers and servers. All of these solutions only work if both client and server support the extensions, which is rarely the case. Pubsubhubbub, for example, is only supported by 2% of the feeds in our dataset.
I have never really interacted with this protocol. Does it require clients (subscribers) to run some sort of software on their systems, like the feed listeners that are required on the client side for obtaining feeds? Is that what the above means?

I am not sure where they pulled that 2% number from, but it is probably not right.
For example, all the major blogging platforms support PubSubHubbub. A lot of news outlets (HuffPo, Gawker, Foxnews, ABCLocal...) support the protocol too.
Many other services support it too, like Craigslist, GetGlue, and even Stack Overflow. Others, like GitHub or Instagram, support PubSubHubbub-like APIs for JSON resources, even though this is outside the current (0.3) spec.
The list goes on and on and on.
Now, as far as complexity goes, it really isn't that difficult for a huge benefit. The "clients" (technically these are web servers) need to be visible and accessible outside the firewall.
For publishers, it is even easier: they just need to ping (with a simple HTTP POST request) the hub they've chosen previously.
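
To make both roles concrete, here is a minimal sketch in Python (using the requests library) of the publisher ping and the subscription request, following the parameter names from the 0.3 spec; the hub, topic, and callback URLs are made up for illustration:

    import requests

    # Hypothetical URLs for illustration only.
    HUB = "https://hub.example.com/"
    TOPIC = "https://blog.example.com/feed.atom"

    # Publisher side: after publishing new content, ping the hub so it
    # re-fetches the feed and pushes the new entries to subscribers.
    requests.post(HUB, data={"hub.mode": "publish", "hub.url": TOPIC})

    # Subscriber side: the "client" is itself a web server; the callback
    # URL must be reachable from outside the firewall so the hub can POST
    # new entries to it.
    requests.post(HUB, data={
        "hub.mode": "subscribe",
        "hub.topic": TOPIC,
        "hub.callback": "https://aggregator.example.net/push-callback",
        "hub.verify": "sync",  # the hub verifies the callback before subscribing
    })

So no, subscribers don't install desktop software; they expose an HTTP endpoint and the hub pushes to it.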

REST - Should an API client "advance" to the "next" resource like a browser?

In my years specifying and designing REST APIs, I'm increasingly finding that it's very similar to designing a website, where the user's journey and the actions and links are story-boarded and critical to the UX.
In my current API designs, I return links in items and at the bottom of resources. They perform actions, mutate state, or bring back other resources.
But it's as if each link opens in a new tab; the client explores down a new route, and their next options may narrow as they go.
If this were a website, it wouldn't necessarily be a good design. The user would have to either open links in new tabs or back up the stack all the time to get things done.
Good sites are forward-only, or at least have a way to indicate a branch off the main flow, e.g. links automatically opening in new windows (via the anchor tag's target attribute).
So should a good REST API be designed as if the client discards the current resource and advances to the next and is always advancing forward?
Or do we assume the client is building a map as it goes, like a Roomba exploring our living room?
The thing with the map concept is that the knowledge that one should return to a previous resource, out of the many it might know about, is, in a sentient human, a guess. Computers are incapable of guessing, so they need programming, and this implies out-of-band static documentation and breaks REST.
In my years specifying and designing REST APIs, I'm increasingly finding that it's very similar to designing a website
Yes - a good REST API looks a lot like a machine readable web site.
So should a good REST API be designed as if the client discards the current resource and advances to the next and is always advancing forward?
Sort of - the client is permitted to cache representations; so if you present a link, the client may "follow" the link to the cached representation rather than using the server.
That also means that the client may, at its discretion, "hit the back button" to go off and do something else (for example, if the link that it was hoping to find isn't present, it might try to achieve its goal another way). This is part of the motivation for the "stateless" constraint; the server doesn't have to pretend to know the client's currently displayed page to interpret a message.
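As a rough sketch of what "following the links in the current representation" looks like in client code, assuming a hypothetical API at example.org that embeds a "links" array in each representation (the URL, link format, and rel names are all invented):

    import requests

    session = requests.Session()

    def follow(representation, rel):
        # Advance by choosing a link from the *current* representation,
        # rather than constructing the next URL from out-of-band docs.
        for link in representation.get("links", []):
            if link["rel"] == rel:
                return session.get(link["href"]).json()
        # The hoped-for link isn't present: fall back, e.g. to an earlier
        # cached representation, and try another route to the goal.
        return None

    home = session.get("https://api.example.org/").json()
    orders = follow(home, "orders")

An intermediary HTTP cache can satisfy a followed link without touching the origin server, which is part of what makes "going back" cheap.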
Computers are incapable of guessing, so they need programming, and this implies out-of-band static documentation and breaks REST.
Fielding, writing in 2008:
Of course the client has prior knowledge. Every protocol, every media type definition, every URI scheme, and every link relationship type constitutes prior knowledge that the client must know (or learn) in order to make use of that knowledge. REST doesn’t eliminate the need for a clue. What REST does is concentrate that need for prior knowledge into readily standardizable forms. That is the essential distinction between data-oriented and control-oriented integration.
I found this nugget in Fielding's original work.
https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
The model application is therefore an engine that moves from one state to the next by examining and choosing from among the alternative state transitions in the current set of representations. Not surprisingly, this exactly matches the user interface of a hypermedia browser. However, the style does not assume that all applications are browsers. In fact, the application details are hidden from the server by the generic connector interface, and thus a user agent could equally be an automated robot performing information retrieval for an indexing service, a personal agent looking for data that matches certain criteria, or a maintenance spider busy patrolling the information for broken references or modified content [39].
It reads as if a great REST application would be built forward-only, just as a great website should be simple to use even without a back button, with returning to a previously-seen representation handled by home and search links that are always available.
Interestingly, we tend to think hard about user journeys in web design, and the term "journey" is a common part of our developer language, but this thinking hasn't yet permeated API design.

Webservice standards and DTDs

While brainstorming about six years ago, I had what I thought was a great idea: in the future there could be webservice standards and DTDs that effectively turn the web into a decentralized knowledgebase. I listed several areas where I thought this could be applied, one of which was:
For making data available directly from a business's website: open hours, locations, and contact phone numbers. Suggest a web service standard by which businesses have a standard URL extended off the main (base) URL for their website, at which is located a webservice. That webservice likewise has a standardized set of services for downloading a list of their locations, contact telephone numbers, and business hours.
It's interesting looking back at these notes now, since this is not how things have evolved. Instead of businesses putting this information only on their website and letting any search engine or other data aggregator crawl it, they are updating it separately on their website, their Facebook page, and Google Maps. Facebook and Google Maps, due to their popularity, have become the solution to the problem I thought my idea would solve.
Is the way things are better than the way I thought they could be? If so then why doesn't my idea fit the reality? If not then what's holding my idea back from being realized?
A lot of this information is available via APIs; that doesn't mean it doesn't get put in other places as well, through a variety of means. For example, a company may expose information via an API, and their Facebook app might use that API to populate a Facebook page.
Also, various microformats are in use that encapsulate some of this information.
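For example, a business's contact details can be marked up with the h-card microformat inside ordinary HTML and read back out by any crawler with a microformats parser. A hedged sketch using the mf2py library (the markup and values are invented):

    import mf2py

    # Invented h-card markup embedding name, URL, and phone number.
    html = """
    <div class="h-card">
      <span class="p-name">Example Bakery</span>
      <a class="u-url" href="https://bakery.example.com">website</a>
      <span class="p-tel">+1-555-0100</span>
    </div>
    """

    parsed = mf2py.parse(doc=html)
    props = parsed["items"][0]["properties"]
    print(props["name"], props["tel"])  # ['Example Bakery'] ['+1-555-0100']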
The biggest obstacle is agreeing on what meta-information should be exposed, how it should be exposed, and how it should be accessed.

What is the technology behind Google Buzz?

I am really curious to know how Google Buzz and Facebook implement their comment features, which update instantly. Is it similar to Google Wave technology? Are there any resources for learning that technology and implementing it on our website?
Thanks!
I work on the Google Buzz team, so hopefully I can give you a good answer for our side of the equation. I obviously won't go into any of the confidential backend stuff, but I'm happy to address the open standards we use and the open source projects involved.
Starting in the UI space, we use technologies like Closure and GWT to build rich, responsive user interfaces. We use a technology vaguely similar to what you see in the Google App Engine Channel API to push real-time updates to the users. GAE is a really good choice for real-time web applications right now.
On the API side of things, we try to use open standards wherever possible. We use the Atom syndication format to enable feed readers to consume Buzz content, and Pubsubhubbub to enable real-time pushes of the content. In fact, we use Pubsubhubbub for our activity firehose — it's possible to subscribe to the entire real-time stream of all updates that happen in Buzz. Needless to say, this sends a massive amount of traffic to your application. On the JSON side of the equation, we use Activity Streams, and we're actively working with the community to refine and improve that specification. Our Atom feeds include Activity Streams as well, but the focus there is on syndication. All our secured API endpoints for Buzz use the OAuth standard for authorization.
On the backend, I think the only thing we're willing to say publicly is that Protocol Buffers are pretty awesome.
The technology is called the real-time web (http://en.wikipedia.org/wiki/Real-time_web). There are many application models for achieving real-time updates, and one of them is Comet (http://en.wikipedia.org/wiki/Comet_%28programming%29). A good server to use in your implementation is APE (http://www.ape-project.org/). It supports many common JavaScript frameworks. You can find more in the links provided.
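The simplest Comet variant is long polling: the server holds the client's request open until an event arrives, so updates appear near-instantly without busy polling. A minimal client-side sketch in Python against a hypothetical endpoint (the URL and response shape are invented):

    import requests

    cursor = None
    while True:
        # The server holds this request open until a new comment arrives
        # or its own hold timeout expires (shorter than our read timeout).
        resp = requests.get(
            "https://example.com/events",   # hypothetical endpoint
            params={"since": cursor},
            timeout=90,
        )
        data = resp.json()
        for event in data.get("events", []):
            print("new comment:", event)
        cursor = data.get("cursor", cursor)  # resume point for next poll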

What are general areas you would want to use XMPP?

I understand that XMPP is used in chat services, but it seems to be more generally useful than that. Can someone list some scenarios and examples where you would consider using XMPP, and the pros and cons of it versus other approaches?
I know that Dropbox uses it for its file-sharing system on Android (possibly on other platforms too).
Cons: much more verbose than binary (more bandwidth).
Pros: a wide variety of already-implemented clients and servers, plus a wide range of already-implemented mechanisms for reliability, scalability, security, presence, RPC, federation, custom components, mail, VoIP... the list is very, very long. Even if you need something different, and you know where to touch, you can extend it to your needs, inheriting all the already-implemented features.
We had a project collecting information, e.g. wind direction, temperature, stock and forex data, etc. Every sensor is a Jabber 'user'. This allows us to detect if a sensor is online, offline, or problematic. Sensors also publish information to a pubsub node to be distributed to collectors. Human users can also interact with a particular sensor by querying it; the sensor returns human-friendly formatted data.
We use it for chatrooms, and for distributing sports results to users watching live events.
Google Buzz and Facebook Chat are built on it.

Old concepts with new names (namely REST and Cloud computing)

It seems that SaaS and Cloud computing are old concepts with new names, and I am curious if I am wrong.
For cloud computing you can look at: Difference between cloud computing and distributed computing?
Basically, it seems that what we have been doing with hosting all along is cloud computing; it is just that now some companies have put in much greater resources to ensure better uptime than my local ISP. But it seems that there is nothing really new here.
For REST, it seems that it is what we have been doing with CGIs for 15 years.
Here is a question on REST: What am I not understanding about REST?
It appears that REST is an old concept, and I am curious how it is different from what has been done since the early days of the web and, to a large extent, since the early days of using telnet (which HTTP is on top of).
Am I mistaken in my simplification of these? I try to see how what is new is like what I know so I can see what more has to be learned in that topic, but for cloud computing and REST it seems that very little needs to be learned.
You are both right and wrong. You are right in the sense that new ideas are normally similar to old ideas, and indeed cloud computing is based significantly on distributed computing.
What is new in cloud computing is:
virtualization
self-service
With virtualization, you can run multiple operating systems on a single piece of hardware. While that, in itself, isn't new either, it was never considered a relevant piece of the architecture in distributed systems. Virtualization enables self-service: users can create their own clusters of nodes without the administrator of the hardware taking any action. This allows a significant acceleration of deployment and a significant reduction in cost.
For REST, what you are missing is the client API. It is true that on the server side a REST service can be implemented with CGI. What is new here is that it is not an end user that retrieves the URL, but a program.
Saying that HTTP is on top of telnet ignores reality; this is like saying that we have made no progress since the introduction of copper wires for communication. Strictly speaking, HTTP is not on top of telnet, but on top of TCP (which telnet is also on top of, these days).
Considering Roy's dissertation coined the term REST back in 2000, you can definitely argue that there is nothing new about REST. Additionally, the REST architectural style was synthesized from successful existing practices, so REST implementations pre-date the definition. Having said that, there is nothing simple about designing REST interfaces. Ever since Netscape first abused cookies to allow servers to maintain session state people have been swimming upstream against the web.
REST's recent resurrection has come mainly from people becoming disillusioned with SOAP-based Web Services. SOAP tried to hide HTTP instead of embracing it, and I think people are starting to realize how effective HTTP can be as a distributed application protocol that can do more than just deliver HTML to web browsers.
RESTful web applications don't use session state, so one could argue that by that virtue alone it is different than most web applications in existence at the moment.
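To illustrate the difference with a sketch (the API URL and token are invented): each request is self-contained, carrying its own credentials and identifiers, instead of pointing into a session the server has to remember:

    import requests

    headers = {"Authorization": "Bearer <token>"}  # invented token

    # Both requests are fully self-descriptive; any server instance can
    # answer either one, with no session affinity or server-side state.
    orders = requests.get("https://api.example.org/orders", headers=headers)
    order = requests.get("https://api.example.org/orders/42", headers=headers)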
As for Cloud Computing, I find myself agreeing with Larry Ellison for once in my life.
I'm in agreement with what you've posted. You might consider making this community wiki, since it's likely to garner many answers based on opinion. Cloud computing seems to have taken off as a buzzword, largely due to a decrease in cost for mass quantities of hardware. And then there is REST, which is really just a formal name and definition for something that has been in place for a long time. Some people like to encapsulate ideas with buzzwords and acronyms. Sometimes it's useful to put a name to an idea, though.
Not only this, the concept of things being old concepts with new names is old. It's hard to be original these days :P
You are right about REST -- it's mostly old concepts with a lot of added pedantry and not much added substance.
Cloud computing has a small but fundamental difference from distributed computing. In distributed computing you had servers dedicated to particular functions, and usually some sort of directory service to locate the correct server. In cloud computing any server is capable of any task and usually the servers queue up for work which is distributed from a central point.