Tutorial on how to write a custom I/O - apache-beam

I am unable to use the IOs that are provided with the Beam SDK. In my case, the data resides in LDAP (or in other enterprise apps), hence the need to develop custom I/Os. But I find that there isn't even a single tutorial on how to write an IO. Between trying to figure out how to use AutoValue and how to use PCollection, I am having a tough time (and yes, I can read code on GitHub).
Any pointers?

Developing a new I/O connector guide is a good starting point.
If you decide to implement a source using the Splittable DoFn framework, there's more documentation here.

Related

Implement LRS With Tin Can API for iPhone

I have to create a project in iPhone which uses the Tin Can API. The Tin Can API is an advanced distributed learning process.
I have no idea about where to start in Objective-C.
I have read the site http://tincanapi.com.
For implementation, I have some basic questions:
How do I create my own LRS?
How can the Tin Can API communicate with my own LRS and LMS programmatically using ASIHTTPRequest?
There are two parts to the Tin Can API at play here, and I suspect you only need to handle one of them on iPhone. One part is the client side, which sends the statement data to the second part (the LRS, on the server side). It would be very odd to create the LRS server part on an iOS device, so I'm going with the thought that you need to send Tin Can statements from an iOS device to an existing LRS.
An LRS accepts statement data via a REST interface, and this data can be POSTed using a standard NSURLConnection or using AFNetworking. There are a couple of options for abstracting all those calls with a library. One of them is a new OSS version of the basics, appearing very soon from Rustici Software, found here: http://rusticisoftware.github.io/TinCanObjC/. There is no link for it just yet, but feel free to contact me for more details, and I'll update this answer as soon as there is a public link.
For your specific questions:
1.) You can create your own LRS by understanding the spec document and creating the REST endpoints as specified. This is not a trivial undertaking by any means.
2.) Your best bet is to use an SDK or simple GET and PUT/POST statements from AFNetworking to the TCAPI endpoint.
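As a rough illustration of what "POSTing a statement to the TCAPI endpoint" involves, here is a minimal Python sketch (not Objective-C, and not the TinCanObjC library). It only builds the JSON statement and prepares the HTTP request without sending it. The endpoint URL, the credentials, the actor e-mail, the activity id and the version value in the `X-Experience-API-Version` header are all assumptions made for the example; check what your LRS expects.

```python
import base64
import json
import urllib.request


def build_statement(actor_email, verb_id, verb_name, activity_id):
    """Return a minimal statement: who (actor) did what (verb) to what (object)."""
    return {
        "actor": {"objectType": "Agent", "mbox": "mailto:" + actor_email},
        "verb": {"id": verb_id, "display": {"en-US": verb_name}},
        "object": {"objectType": "Activity", "id": activity_id},
    }


def build_request(endpoint, username, password, statement):
    """Prepare (but do not send) a POST of one statement to an LRS."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/statements",
        data=json.dumps(statement).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
            "X-Experience-API-Version": "1.0.3",  # assumed; use your LRS's version
        },
    )


# All values below are hypothetical placeholders.
statement = build_statement(
    actor_email="learner@example.com",
    verb_id="http://adlnet.gov/expapi/verbs/completed",
    verb_name="completed",
    activity_id="http://example.com/activities/intro-course",
)
request = build_request("https://lrs.example.com/xapi/", "key", "secret", statement)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

On iOS the same three pieces (JSON body, auth header, version header) would go into whatever HTTP client you use, such as NSURLConnection or AFNetworking.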

How to start using web services

I want to read and write data to a website (a server on the web), but I don't know anything about web services or the topics related to them.
Does anybody have any idea how to get started? That is, can you recommend complete books, papers, tutorials, websites, ..., and tell me what I should learn first (is it necessary to learn XML, SOAP, ... and other things)?
Thank you
I've used Google App Engine with great success. You would format your data to output as JSON and use an iPhone library to read it. I've used this one (though Touch JSON seems to be more popular).
Read about REST, ROA and AtomPub. That got me started. I'm about to implement some web services in WCF (WCF can now act as a RESTful web service, but you can also use plain old SOAP). Before I got to WCF, I experimented with RoR, which uses REST "out of the box".

Existing pubsubhubbub ajax proxy/bridge? (Like Google Feeds API v2 with Push)

I'm looking for a server side component, preferably Java, that will allow me to subscribe to pubsubhubbub feeds through JavaScript. I understand that subscribers are server side applications in the standard REST/pubsubhubbub format, but Google seems to have created an Ajax bridge that looks quite handy.
Unfortunately, I'm dealing with data that simply cannot leave our servers, let alone go through Google's.
Is anyone aware of a (preferably free) server side proxy for pseudo javascript pubsubhubbub subscribers?
Reference: http://code.google.com/apis/feed/push/docs/index.html#hiworld
I know for a fact that Kwwika and Pusherapp are working on this. I can introduce you to them if you want.
If not, I believe this should be relatively easy to build with something like Node.js, for example. This code on GitHub should be a good start. Things like this have been built with it.
We (superfeedr) are trying to get more people building similar things...
"I'm looking for a server side component, preferably Java, that will allow me to subscribe to pubsubhubbub feeds through JavaScript"
There is a Java implementation of the subscribe part available. But the hub part, which is needed to subscribe to a feed that should stay private, hasn't yet been implemented in Java. For the JavaScript (jQuery) part I would just use simple long-polling.
"Is anyone aware of a (preferably free) server side proxy for pseudo javascript pubsubhubbub subscribers?"
I don't think a free solution like that exists (yet). Even Google's push API isn't open yet.
"Unfortunately, I'm dealing with data that simply cannot leave our servers, let alone go through Google's."
There isn't yet an implementation of the hub part of the pubsubhubbub protocol. But if this is internal only, I also don't think you need the kind of fan-out that the hub (specification) offers (broadcasting to other servers).
I think you could just use a Comet framework like Atmosphere to suspend the connection and broadcast the feed diff. I think this can be written quickly with the Atmosphere framework (in about a day you would have a working prototype).
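To make the "broadcast the feed diff" idea concrete, here is a small Python sketch. It is not Atmosphere and does not use its API; it only illustrates the logic such a server would need: remember which feed entries have already been seen, and push only the new ones to every waiting subscriber. The class name, the per-subscriber queues and the `id`/`title` entry fields are assumptions made for the example.

```python
import queue


class FeedBroadcaster:
    """Keeps track of seen entries and pushes only new ones to subscribers."""

    def __init__(self):
        self._seen_ids = set()
        self._subscribers = []

    def subscribe(self):
        """Register a subscriber and return the queue it should wait on."""
        inbox = queue.Queue()
        self._subscribers.append(inbox)
        return inbox

    def publish(self, entries):
        """Take the current feed entries and broadcast the unseen ones.

        Returns the diff (the list of entries not seen before).
        """
        new_entries = []
        for entry in entries:
            if entry["id"] not in self._seen_ids:
                self._seen_ids.add(entry["id"])
                new_entries.append(entry)
        if new_entries:
            for inbox in self._subscribers:
                inbox.put(new_entries)
        return new_entries


broadcaster = FeedBroadcaster()
inbox = broadcaster.subscribe()

# First poll of the feed: one entry, all of it is new.
broadcaster.publish([{"id": "a", "title": "First"}])
# Second poll: the feed still contains "a", but only "b" is broadcast.
diff = broadcaster.publish(
    [{"id": "a", "title": "First"}, {"id": "b", "title": "Second"}]
)
```

A long-polling HTTP handler would block on something like `inbox.get(timeout=30)` and return whatever arrives, or an empty response on timeout so the browser can reconnect.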
You can see an example using a combination of Superfeedr and Kwwika within a web application that lets you subscribe to any RSS feed or track keywords within RSS feeds here:
http://superfeedr.kwwika.com
And you can get the source code in GitHub here:
http://github.com/kwwika/ASP.NET-MVC-PubSubHubbub-Subscriber/tree/Kwwika-Superfeedr-Demo

Using REST web services as a data source for Lift?

Is there a way to use a webservice (REST in this case) as the data source for a Lift application? I can find a number of tutorials/examples of using Lift to provide the REST API, but in my case the data is hosted elsewhere and exported as a REST webservice. Pointers to doc are greatly appreciated.
Thanks,
Jeff
This is not really specific to Lift. There are already plenty of pieces you can use:
HttpClient library as was suggested already,
or Dispatch Scala library for accessing HTTP services
information on how to cache data in Scala in various ways in case you need it
Think about caching thoroughly. It is generally a good choice if your application generates a lot of requests and you can afford to serve cached data. Caching will let you achieve several goals:
decrease response time, as you do not depend on the remote service (if you do synchronous data processing)
avoid Denial of Service in case the remote service dies. Otherwise your application will generate many sockets to read data and exhaust resources (either sockets or threads or something else)
stay within the SLA of the remote service, as many services limit the number of requests you are allowed to perform per unit of time.
So you can just sit and put these things together, that's it.
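The caching points above can be sketched in a few lines. This is a language-neutral illustration written in Python rather than Scala/Lift code, and every name in it (`CachedFetcher`, `fake_fetch`, the 60-second TTL) is hypothetical. Entries are reused while they are fresh, and a stale entry is served if the remote service fails, which covers both the response-time and the "remote service dies" cases.

```python
import time


class CachedFetcher:
    """Wraps a fetch function with a time-to-live cache and stale fallback."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable that performs the remote call
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the example can fake time
        self._entries = {}           # key -> (stored_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # fresh hit: no remote call at all
        try:
            value = self._fetch(key)
        except Exception:
            if entry is not None:
                return entry[1]      # remote call failed: serve stale data
            raise                    # nothing cached, so surface the error
        self._entries[key] = (now, value)
        return value


calls = []


def fake_fetch(key):
    """Stand-in for the real REST call; records how often it is invoked."""
    calls.append(key)
    return {"resource": key, "version": len(calls)}


now = [0.0]
cache = CachedFetcher(fake_fetch, ttl_seconds=60, clock=lambda: now[0])

first = cache.get("/users/1")    # goes to the "remote" service
second = cache.get("/users/1")   # served from the cache
now[0] = 61.0
third = cache.get("/users/1")    # entry expired, fetched again
```

In a real application `fake_fetch` would be replaced by the HttpClient or Dispatch call, and the TTL would be chosen from how stale the data is allowed to be and from the request limits of the remote service.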
If you really want to be fancy, you can create a Record implementation for a REST-based data source. There's already one of these in existence that works with CouchDB. Using the lift-couchdb module, the interactions with CouchDB are abstracted away and all you deal with is the Scala code. There is a short wiki page with instructions on how to get started with lift-couchdb here:
http://www.assembla.com/wiki/show/liftweb/CouchDB
The pertinent source code files are available here:
http://github.com/lift/lift/tree/master/framework/lift-persistence/lift-couchdb/src/main/scala/net/liftweb/couchdb/
Using the Record interface gives you access to lots of Traits which you use to provide functionality with minimal code-writing such as creating HTML forms, providing lifecycle based calls, and easy hooks for validation.
I've put a Scala layer over HttpClient and then use that. I've been meaning to put this on GitHub for some time.
I use Dispatch (which is a wrapper around HttpClient) for making REST calls. It looks nice and simple.

Accessing Erlang business layer via REST

For a college project I'm thinking of implementing the business layer in Erlang and then accessing it via multiple front-ends using REST. I would like to avail of OTP features like distributed applications, etc.
My question is how do I expose gen_server calls/casts to other applications? Obviously I could make RPC calls via language specific "bridges" like OTP.net or JInterface, but I want a consistent way to access it like REST.
As already mentioned Yaws or Mochiweb are a great way to go but if you'd like a dead simple way to get your RESTful API done quickly and correctly then use Webmachine. It's a layer on top of Mochiweb that implements proper HTTP behavior based on Alan Dean's amazing HTTP flow diagram and makes it easy to get REST done right.
I'm using it right now to expose a REST API as well as handle a COMET application and it's been pretty easy to do, even for an Erlang newbie such as myself.
I did something similar for my job and found it best to use REST to expose the business layer, because even legacy languages such as Software AG's Natural are able to access it. The best mechanism that I have found in Erlang is Mochiweb.
You can find more information about using it in the Erlang In Practice Screencast. Episode 6 is particularly helpful, but all of them are excellent.
If the screencasts are not to your liking, How To Quickly Set Up Ubuntu 8.04 loaded with Erlang, Mochiweb and Nginx walks you through installation, and Migrating a native Erlang interface to RESTful Mochiweb (with a bit of TDD) provides a good start.
The HTTP flow diagram link is dead. The original version and an updated version, created in collaboration between Alan Dean and Justin Sheehy, are also hosted in the Webmachine project: link to latest version of the HTTP diagram.
A valuable approach is to design the gen_server calls/casts in a REST-like style where possible. You can use messages such as
{get, Resource}
{set, Resource, Value} % aka PUT
{delete, Resource}
{add, Resource, Value} % aka POST (other possible names: append, modify or similar)
Then the mapping is easy. You can apply some transformation from URI to resource, or use the identity. For most of your application this should be a worthwhile approach, and the special cases you should handle specially. You may think there will be a large share of cases where you can't use this approach, but worrying about that up front is mostly premature optimization.
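The HTTP-to-message mapping described above can be sketched as follows. This is Python rather than Erlang, purely to show the translation step; the function name `to_message`, the choice of path segments as the resource, and the example URIs are assumptions. The resulting tuples mirror the `{get, Resource}`, `{set, Resource, Value}`, `{delete, Resource}` and `{add, Resource, Value}` messages listed above.

```python
from urllib.parse import urlsplit

# HTTP method -> message tag, following the message shapes listed above.
METHOD_TO_TAG = {"GET": "get", "PUT": "set", "DELETE": "delete", "POST": "add"}


def to_message(method, uri, value=None):
    """Translate an HTTP request into a (tag, resource[, value]) tuple."""
    tag = METHOD_TO_TAG.get(method.upper())
    if tag is None:
        raise ValueError(f"unsupported method: {method}")
    # URI -> resource transformation: here simply the non-empty path segments.
    resource = tuple(part for part in urlsplit(uri).path.split("/") if part)
    if tag in ("set", "add"):
        return (tag, resource, value)
    return (tag, resource)


read_msg = to_message("GET", "/accounts/42")
write_msg = to_message("PUT", "/accounts/42", {"balance": 10})
append_msg = to_message("POST", "/accounts", {"owner": "alice"})
remove_msg = to_message("DELETE", "/accounts/42")
```

On the Erlang side, the web layer (Mochiweb, Yaws or Webmachine) would build the equivalent tuple and pass it to the gen_server with a call or cast, so the HTTP layer stays a thin, generic adapter.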
Do you really mean a RESTful interface or RPC over HTTP? Building a RESTful interface on top of an existing layer is more work than just exposing existing methods via HTTP.
I'd suggest using mochiweb or yaws to implement a (generic) RPC layer.
Just an update, Webmachine has moved to bitbucket: new link to Webmachine