Non-RESTful backend with backbone.js

I'm evaluating backbone.js as a potential JavaScript library for use in an application which will have a few different backends: WebSocket, REST, and a 3rd-party library producing JSON. I've read some opinions that backbone.js works beautifully with RESTful backends so long as the API is 'by the book' and follows the appropriate HTTP verbs. Can someone elaborate on what this means?
Also, how much trouble is it to get backbone.js to connect to WebSockets? Lastly, are there any issues with integrating a backbone.js model with a function which returns JSON - in other words, does the data model always need to be served via REST?

Backbone's power is its incredibly flexible and modular structure: you can use, extend, replace, or remove any part of Backbone, and that includes the AJAX functionality.
Backbone doesn't "care" where you get the data for your collections or models. It helps you out by providing an out-of-the-box RESTful AJAX solution, but it won't be mad if you want to use something else!
This allows you to find (or write) any plugin you want to handle the server interaction. Just look on backplug.io, Google, and Github.
Specifically for Sockets there is backbone.iobind.
Can't find a plugin, no worries. I can tell you exactly how to write one (it's 100x easier than it sounds).
The first thing that you need to understand is that overwriting behavior is SUPER easy. There are 2 main ways:
Globally:
Backbone.Collection.prototype.sync = function() {
  // screw you Backbone!!! You're completely useless, I am doing my own thing
};
Per instance
var MySpecialCollection = Backbone.Collection.extend({
  sync: function() {
    // I like what you're doing with the ajax thing... Clever clever ;)
    // But for a few collections I wanna do it my way. That cool?
  }
});
And the only other thing you need to know is what happens when you call "fetch" on a collection. This is the "by the book"/"out of the box" behavior:
collection#fetch is triggered by the user (YOU). fetch delegates the ACTUAL fetching (ajax, sockets, local storage, or even a function that instantly returns JSON) to another function (collection#sync). Whatever function is in collection.sync has to take 3 arguments:
action: 'create' (for creating), 'read' (for fetching), 'update' (for updating), or 'delete' (for deleting) = CRUD.
context (the this variable) - if you don't know what this is, don't worry about it, it's not important for now
options - where da magic is. We only care about 1 option though
success: a callback that gets called when the data is "ready". THIS is the callback that collection#fetch is interested in, because that's when it takes over and does its thing. The only requirement is that sync passes it the following as the 1st argument:
response: the actual data it got back
So your custom sync just has to call that success callback from its options once it's done getting the data; that's all it's responsible for.
Whenever collection#sync is done doing its thing, collection#fetch takes back over (via the callback it passed in as success) and does the following nifty steps:
Calls set or reset (for these purposes they're roughly the same).
When set finishes, it triggers a sync event on the collection broadcasting to the world "yo I'm ready!!"
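For orientation, here is a minimal sketch of that flow from the caller's side (the collection and URL are made up purely for illustration):
var Devices = Backbone.Collection.extend({ url: '/devices' }); // hypothetical collection
var devices = new Devices();
// 'sync' fires once set/reset has finished and the data is in the collection.
devices.on('sync', function (collection) {
  console.log('ready, got ' + collection.length + ' models');
});
devices.fetch(); // delegates to collection.sync, which must call options.success(response)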
So what happens in set? Well, a bunch of stuff (deduping, parsing, sorting, removing, creating models, propagating changes and general maintenance). Don't worry about it. It works ;) What you need to worry about is how you can hook in to different parts of this process. The only two you should worry about (if your server wraps data in weird ways) are
collection#parse for parsing a collection. It should accept the raw JSON (or whatever format) that comes from the server/ajax/websocket/function/worker/who-knows-what and turn it into an ARRAY of objects. It takes resp (the JSON) as its 1st argument and should return the mutated response. Easy peasy.
model#parse. Same as the collection version, but it takes in the raw objects (i.e. imagine you iterate over the output of collection#parse) and spits out an "unwrapped" object.
Get off your computer and go to the beach because you finished your work in 1/100th the time you thought it would take.
That's all you need to know in order to implement whatever server system you want in place of the vanilla "ajax requests".
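To make that concrete, here is a minimal sketch of a per-collection sync override that reads from a plain function returning JSON instead of AJAX; getDevicesSomehow and the wrapped { items: [...] } payload shape are invented for illustration:
var LocalCollection = Backbone.Collection.extend({
  sync: function (method, collection, options) {
    if (method === 'read') {
      // Hypothetical data source: a WebSocket reply, a 3rd-party lib, anything that hands back JSON.
      var data = getDevicesSomehow();
      options.success(data); // hand the raw response back so fetch can take over
    }
    // 'create', 'update' and 'delete' would be handled here the same way.
  },
  // Optional: unwrap a payload like { items: [...] } into the ARRAY Backbone expects.
  parse: function (resp) {
    return resp.items || resp;
  }
});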

Related

Making "parse" function RESTful

I have a RESTful service for managing, let's say, devices. It provides the usual functionality:
GET /devices
GET /devices/:id
POST /devices
PUT /devices/:id
DELETE /devices/:id
The device object might be defined as follows:
{
id: 123,
name: "Smoke detector",
firmware: "21.0.103",
battery: "ok",
last_maintenance: "2017-07-07",
last_alarm: "2014-02-01 12:11:10",
// ...
}
There is an application that might read device state via some device-specific reader. The application itself has no idea how to interpret the read data, but it might ask the server to do it. In our case, let's assume that the data contains the following: battery status, firmware version, last alarm.
If I were implementing a regular RPC service, I would create a function with "parse" semantics: it accepts the raw data and returns an updated device object (or, alternatively, only the part of the device object containing the parsed state). But I doubt that I could find a good REST equivalent for such a function. Right now I am doing it via PATCH, but I personally do not like this solution, and therefore I will not present it here. I believe there should be a good solution for this class of problems.
So the question: how should I fit my "parse" logic in REST paradigm?
POST it to a /parsed-device-state URL, which will return a 201 Created, a Location header pointing to the place where you can get the parsed data from, and if you like, return the parsed data in the 201 as well (along with an additional Content-Location header with the same value as the Location header). Or if it takes a long time to parse, use 202 Accepted, and the same Location header. The caller can then poll that provided location until the results are ready.
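As a rough sketch of that exchange (the exact URL and payload are illustrative, not prescriptive):
POST /parsed-device-state HTTP/1.1
Content-Type: application/octet-stream

<raw reader data>

HTTP/1.1 201 Created
Location: /parsed-device-state/42
Content-Location: /parsed-device-state/42
Content-Type: application/json

{ "battery": "ok", "firmware": "21.0.103", "last_alarm": "2014-02-01 12:11:10" }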
So the question: how should I fit my "parse" logic in REST paradigm?
How would you fit your parse logic into a web site?
You'd probably start with a bookmark. GET $BOOKMARK would return a representation of a form. The form might include an input control like a text-area element that allows the consumer to type in a representation, or an input control that allows the consumer to attach a file. The consumer would submit the form, and the agent would create a request from the information in the form. That would probably be a POST (you aren't likely to include an arbitrary file's representation in the query string) to whatever resource was specified as the action of the form. The server's response would provide a representation of the result.
If parsing were a particularly slow process, then the response instead might be a representation including links to resources that could be used to track the progress of the parsing. The whole protocol in this case looks a lot like putting work on a queue, and then polling for updates.
It's the right answer to a problem that is not a great fit for HTTP:
The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction.
To some degree, what you are trying to do with your function is transfer compute, which may be why it feels like you are trimming corners off of the peg to fit it in the hole.
An alternative approach, which is a better fit for HTTP, is to think about transferring a representation of the behavior. The API client gets a function that understands how to parse apples into oranges, and then runs that code on the information that it keeps locally. Think JavaScript: we get a representation of the behavior from the server (which can embed into that representation any information the server has that the client will need), and then execute the result locally. Metadata in the headers describes the lifetime of the representation, in a way that is understood by any standards-compliant cache.
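A rough sketch of that code-on-demand idea, where the endpoint and function name are invented for illustration:
// Hypothetical endpoint that serves the parser as executable JavaScript.
async function parseLocally(rawReaderData) {
  const parser = await import('/parsers/device-state.js'); // cacheable per its Cache-Control headers
  return parser.parseDeviceState(rawReaderData);           // runs on the client, no further round trips
}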

hunchentoot session- v. thread-localized values (ccl)

I'm using hunchentoot session values to make my server code re-entrant. The problem is that session values are, by definition, retained during the session, i.e., from one call from the same browser to the next, whereas what I am really looking for is what amounts to thread-specific re-entrancy, so that all the values disappear between calls -- I want to treat each click as a separate "from scratch" event, even if they are from the same session. It's easy enough to have the driver either set my session values to nil or delete them, but I'm wondering if there's a "correct" way to do this? I don't see any thread-based analog to hunchentoot:session-value in the documentation.
Thanks in advance for any guidance you can offer.
If you want a value to be "thread specific" and at the same time to be "from scratch" on every request, that requires that every request must be dispatched in a brand new thread. This is not the case according to the Hunchentoot documentation, which says that two models are supported: a single-threaded taskmaster and a thread-per-connection taskmaster.
If your configuration is multi-threaded, then a thread-specific variable bound during request handling can therefore be expected to be per-connection. In a single-threaded Hunchentoot setup, it will effectively be global, tied to the request-servicing thread.
A thread-based analog to hunchentoot:session-value probably doesn't exist because it would only introduce behaviors into the web app which surprisingly change if the threading model is reconfigured, or if the request pattern from the browser changes. A browser can make multiple requests using the same connection, or close the connection between requests.
To extend the request objects with custom per-request state, I would look into, perhaps, subclassing the acceptor (how to do this is described in the docs). My custom acceptor would have a custom method on the process-connection generic function which would create extended/subclassed request objects carrying the extra stuff I wanted to put into a request.
Another way would be to have some global weak hash table which maps request objects (as keys) to additional information.

RESTful APIs: what to return when updating an entity produces side-effects

One of our APIs has a tasks resource. Consumers of the API might create, delete and update a given task as they wish.
If a task is completed (i.e., its status is changed via PUT /tasks/<id>), a new task might be created automatically as a result.
We are trying to keep it RESTful. What would be the correct way to tell the calling user that a new task has been created? The following solutions came to my mind, but all of them have weaknesses in my opinion:
Include an additional field on the PUT response which contains information about an eventual new task.
Return only the updated task, and expect the user to call GET /tasks in order to check if any new tasks have been created.
Option 1 breaks the RESTful-ness in my opinion, since the API is expected to return only information regarding the updated entity. Option 2 expects the user to do extra work, and if they don't, no one will realize that a new task was created.
Thank you.
UPDATE: the PUT call returns an HTTP 200 code along with the full JSON representation of the updated task.
#tophallen suggests having a task tree so that (if I got it right) the returned entity in option 2 contains the new task as a direct child.
You really have two options with a 200-status PUT: you can use headers (and if you do, check out this post). That's certainly not a bad option, but you would want to make sure it is normalized site-wide, well documented, and that you don't have anything such as firewalls/F5s/etc. rewriting your headers.
Something like this would be a fair option though:
HTTP/1.1 200 OK
Related-Tasks: /tasks/11;/tasks/12
{ ...task response... }
Or you give the client some indication in the response body. You could have a task structure that supports child tasks on it, or you could normalize all responses to include room for "meta" stuff, i.e.
HTTP/1.1 200 OK
{
"data": { ...the task },
"related_tasks": [],
"aggregate_status": "PartiallyComplete"
}
Something like this used everywhere (a bit of work as it sounds like you aren't just starting this project) can be very useful, as you can also use it for scenarios like paging.
Personally, I think it might be best if the related_tasks property just contained either routes to call for the child tasks, or their IDs: lighter responses, since the client might not always care to check on said child task immediately anyway.
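For example, the same envelope populated with routes only (the route is invented for illustration):
HTTP/1.1 200 OK
{
  "data": { ...the task },
  "related_tasks": ["/tasks/12"],
  "aggregate_status": "PartiallyComplete"
}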
EDIT:
Actually, the more I think about it, the more headers make sense in your situation. A client can update a task at any point during processing, and there may or may not be a child task in play, so modifying the data structure for the off chance that the client updates a task when a child task has started seems like more work than benefit. A header would allow you to easily add a child task and notify the user at any point; you could apply the same thing to a POST of a task that happens to finish immediately and kicks off a child task, etc. It can easily support more than one task. I think this also keeps it the most RESTful and reduces server calls, and a client would always be able to know what is going on in the process chain. The details of the header are yours to define, but I believe it is more traditional in a scenario like this to have it point to a resource rather than to a key within a resource.
If there are other options though, I'm very interested to hear them.
It looks like you're very concerned about being RESTful, but you're not using HATEOAS, which is contradictory. If you use HATEOAS, the related entity is just another link and the client can follow them as they please. What you have is a non-problem in REST. If this sounds new to you, read this: http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
Option 1 breaks the RESTful-ness in my opinion, since the API is
expected to return only information regarding the updated entity.
This is not true. The API is expected to return whatever is documented as the information available for that media type. If you documented that a task has a field for related side-effect tasks, there's nothing wrong with it.
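For illustration only, a hypermedia-style representation might expose the side effect as just another link (the field names are invented, not prescribed by REST):
{
  "id": 42,
  "status": "completed",
  "links": [
    { "rel": "self", "href": "/tasks/42" },
    { "rel": "spawned-task", "href": "/tasks/43" }
  ]
}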

Form-related problems

I am new to Lift and I am thinking about whether I should investigate it more closely and start using it as my main platform for web development. However, I have a few "fears" which I would be happy to have dispelled first.
Security
Assume that I have the following snippet that generates a form. There are several fields and the user is allowed to edit just some of them.
def form(in : NodeSeq): NodeSeq = {
val data = Data.get(...)
<lift:children>
Element 1: { textIf(data.el1, data.el1(_), isEditable("el1")) }<br />
Element 2: { textIf(data.el2, data.el2(_), isEditable("el2")) }<br />
Element 3: { textIf(data.el3, data.el3(_), isEditable("el3")) }<br />
{ button("Save", () => data.save) }
</lift:children>
}
def textIf(label: String, handler: String => Any, editable: Boolean): NodeSeq =
if (editable) text(label, handler) else Text(label)
Am I right that there is no vulnerability that would allow a user to change a value of some field even though the isEditable method assigned to that field evaluates to false?
Performance
What is the best approach to form processing in Lift? I really like the way of defining anonymous functions as handlers for every field - however, how does it scale? I guess that for every handler a function is added to the session with its closure, and it stays there until the form is posted back. Doesn't that introduce a potential performance issue for a service under high load (let's say 200 requests per second)? And when do these handlers get freed (if the form isn't resubmitted and the user either closes the browser or navigates to another page)?
Thank you!
With regards to security, you are correct. When an input is created, a handler function is generated and stored server-side using a GUID identifier. The function is session specific, and closed over by your code - so it is not accessible by other users and would be hard to replay. In the case of your example, since no input is ever displayed - no function is ever generated, and therefore it would not be possible to change the value if isEditable is false.
As for performance, on a single machine, Lift performs incredibly well. It does however require session-aware load balancing to scale horizontally, since the handler functions do not easily serialize across machines. One thing to remember is that Lift is incredibly flexible, and you can also create stateless form processing if you need to (albeit, it will not be as secure). I have never seen too much of a memory hit with the applications we have created and deployed. I don't have too many hard stats available, but in this thread, David Pollak mentioned that demo.liftweb.net at the time had 214 open sessions consuming about 100MB of ram (500K/session).
Also, here is a link to the Lift book's chapter on Scalability, which also has some more info on security.
The closures and all the associated state are certainly cleaned up at session shutdown; whether they are freed earlier, I don't know. Anyway, it's not really a theoretical question -- it highly depends on how users use web forms in practice. So, for a broader answer, I'd ask the question on the main Liftweb mailing list -- https://groups.google.com/forum/#!forum/liftweb
Also, you can use a "static" form if you want to. But AFAIK there are no problems with memory, and everybody is using the main approach to forms.
If you don't create the handler XML/HTML, the user won't be able to change the data, that's for sure. In your code, if I understood it correctly (I'm not sure), you don't create text(label, handler) when it's not needed, so everything's secure.

Will inserting the same `<script>` into the DOM twice cause a second request in any browsers?

I've been working on a bit of JavaScript code that, under certain conditions, lazy-loads a couple of different libraries (Clicky Web Analytics and the Sizzle selector engine).
This script is downloaded millions of times per day, so performance optimization is a major concern. To date, I've employed a couple of flags like script_loading and script_loaded to try to ensure that I don't load either library more than once (by "load," I mean requesting the scripts after page load by inserting a <script> element into the DOM).
My question is: Rather than rely on these flags, which have gotten a little unwieldy and hard to follow in my code (think callbacks and all of the pitfalls of asynchronous code), is it cross-browser safe (i.e., back to IE 6) and not detrimental to performance to just call a simple function to insert a <script> element whenever I reach a code branch that needs one of these libraries?
The latter would still ensure that I only load either library when I need it, and would also simplify and reduce the weight of my code base, but I need to be absolutely sure that this won't result in additional, unnecessary browser requests.
My hunch is that appending a <script> element multiple times won't be harmful, as I assume browsers should recognize a duplicate src URL and rely on a local cached copy. But, you know what happens when we assume...
I'm hoping that someone is familiar enough with the behavior of various modern (and not-so-modern, such as IE 6) browsers to be able to speak to what will happen in this case.
In the meantime, I'll write a test to try to answer this first-hand. My hesitation is just that this may be difficult and cumbersome to verify with certainty in every browser that my script is expected to support.
Thanks in advance for any help and/or input!
Got an alternative solution.
At the point where you insert the new script element in the DOM, could you not do a quick scan of existing script elements to see if there is another one with the same src? If there is, don't insert another?
JavaScript code on the same page can't run multithreaded, so you won't get any race conditions in the middle of this or anything.
Otherwise you are just relying on the caching behaviour of current browsers (and HTTP proxies).
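A minimal sketch of that duplicate check (the helper name is made up; note that script.src is always reported as an absolute URL):
function insertScriptOnce(src) {
  var scripts = document.getElementsByTagName('script');
  for (var i = 0; i < scripts.length; i++) {
    if (scripts[i].src === src) return; // already present, skip the insert
  }
  var el = document.createElement('script');
  el.src = src;
  document.getElementsByTagName('head')[0].appendChild(el);
}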
The page is processed as a stream. If you load the same script multiple times, it will be run every time it is included. Obviously, due to the browser cache, it will be requested from the server only once.
I would stay away from this approach of inserting script tags for the same script multiple times.
The way I solve this problem is to have a "test" function for every script to see if it is loaded. E.g. for sizzle this would be "function() { return !!window['Sizzle']; }". The script tag is only inserted if the test function returns false.
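A sketch of that pattern; loadScript stands in for whatever insertion routine you already use, the paths are invented, and the clicky global name is an assumption worth checking against the library's docs:
var loaders = {
  sizzle: { test: function () { return !!window['Sizzle']; }, src: '/js/sizzle.js' },
  clicky: { test: function () { return !!window['clicky']; }, src: '/js/clicky.js' } // assumed global name
};
function ensureLoaded(name, callback) {
  var lib = loaders[name];
  if (lib.test()) { callback(); return; } // already loaded, skip the request
  loadScript(lib.src, callback);          // stand-in for your <script>-insertion helper
}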
Each time you add a script to your page, even if it has the same src, the browser may find it in the local cache or may ask the server whether the content has changed.
Using a variable to check whether the script is already included is a good way to reduce loading, and it's very simple:
For example, this may work for you:
var LOADED_JS = {};
function js_isIncluded(name){ // returns true if the js is already loaded
  return LOADED_JS[name] !== undefined;
}
function include_js(name){
  if (!js_isIncluded(name)){
    YOUR_LAZY_LOADING_FUNCTION(name);
    LOADED_JS[name] = true;
  }
}
You can also get all script elements and check the src; my solution is better because it has the speed and simplicity of a hash lookup, and a script's src is an absolute path even if you set it with a relative one.
You may also want to initialize the map with the scripts loaded normally (without lazy loading) at page init, to avoid a double request.
For what it's worth, if you define the scripts as type="module", they will only be loaded and executed once.