Form-related problems - lift

I am new to Lift and am wondering whether I should investigate it more closely and start using it as my main platform for web development. However, I have a few "fears" that I would be happy to have dispelled first.
Security
Assume that I have the following snippet that generates a form. There are several fields and the user is allowed to edit just some of them.
def form(in: NodeSeq): NodeSeq = {
    val data = Data.get(...)
    <lift:children>
        Element 1: { textIf(data.el1, data.el1(_), isEditable("el1")) }<br />
        Element 2: { textIf(data.el2, data.el2(_), isEditable("el2")) }<br />
        Element 3: { textIf(data.el3, data.el3(_), isEditable("el3")) }<br />
        { button("Save", () => data.save) }
    </lift:children>
}
def textIf(label: String, handler: String => Any, editable: Boolean): NodeSeq =
    if (editable) text(label, handler) else Text(label)
Am I right that there is no vulnerability that would allow a user to change a value of some field even though the isEditable method assigned to that field evaluates to false?
Performance
What is the best approach to form processing in Lift? I really like defining anonymous functions as handlers for every field, but how does it scale? I guess that for every handler a function is added to the session along with its closure, and it stays there until the form is posted back. Doesn't that introduce a potential performance issue for a service under high load (let's say 200 requests per second)? And when do these handlers get freed if the form isn't resubmitted and the user either closes the browser or navigates to another page?
Thank you!

With regard to security, you are correct. When an input is created, a handler function is generated and stored server-side under a GUID identifier. The function is session-specific and is a closure over your code, so it is not accessible to other users and would be hard to replay. In your example, when isEditable is false no input is rendered, so no handler function is ever generated, and therefore it is not possible to change the value.
As for performance, on a single machine Lift performs incredibly well. It does, however, require session-aware load balancing to scale horizontally, since the handler functions do not easily serialize across machines. One thing to remember is that Lift is incredibly flexible, and you can also create stateless form processing if you need to (although it will not be as secure). I have never seen much of a memory hit with the applications we have created and deployed. I don't have many hard stats available, but in this thread, David Pollak mentioned that demo.liftweb.net at the time had 214 open sessions consuming about 100MB of RAM (roughly 500K/session).
Also, here is a link to the Lift book's chapter on Scalability, which also has some more info on security.

The closures and everything attached to them are certainly cleaned up at sessionShutdown. Whether they are freed earlier, I don't know. Anyway, it's not really a theoretical question -- it depends heavily on how users use web forms in practice. So, for a broader answer, I'd ask the question on the main Lift mailing list -- https://groups.google.com/forum/#!forum/liftweb
Also, you can use a "static" (stateless) form if you want to. But AFAIK there are no problems with memory, and everybody is using the main approach to forms.
If you don't create the handler xml/html -- the user won't be able to change the data, that's for sure. In your code, if I understood it correctly (I'm not sure), you don't create "text(label,handler)" when it's not needed, so everything's secure.

hunchentoot session- v. thread-localized values (ccl)

I'm using hunchentoot session values to make my server code re-entrant. The problem is that session values are, by definition, retained during the session, i.e., from one call from the same browser to the next, whereas what I am really looking for is what amounts to thread-specific re-entrancy, so that all the values disappear between calls -- I want to treat each click as a separate "from scratch" event, even if they are from the same session. It is easy enough to have the driver either set my session values to nil or delete them, but I'm wondering if there's a "correct" way to do this. I don't see any thread-based analog to hunchentoot:session-value in the documentation.
Thanks in advance for any guidance you can offer.
If you want a value to be "thread specific" and at the same time to be "from scratch" on every request, that requires that every request must be dispatched in a brand new thread. This is not the case according to the Hunchentoot documentation, which says that two models are supported: a single-threaded taskmaster and a thread-per-connection taskmaster.
If your configuration is multi-threaded, then a thread-specific variable bound in a request handler can be expected to be per-connection. In a single-threaded Hunchentoot setup, it will effectively be global, tied to the request-servicing thread.
A thread-based analog to hunchentoot:session-value probably doesn't exist because it would only introduce behaviors into the web app which surprisingly change if the threading model is reconfigured, or if the request pattern from the browser changes. A browser can make multiple requests using the same connection, or close the connection between requests.
To extend the request objects with custom per-request data, I would look into, perhaps, subclassing the acceptor (how to do this is described in the docs). My custom acceptor would have a custom method of the process-connection generic function which would create extended/subclassed request objects carrying the extra stuff I wanted to put into a request.
Another way would be to have some global weak hash which binds request objects as keys to additional information.

Non-RESTful backend with backbone.js

I'm evaluating backbone.js as a potential JavaScript library for use in an application which will have a few different backends: WebSocket, REST, and a 3rd party library producing JSON. I've read some opinions that backbone.js works beautifully with RESTful backends so long as the API is 'by the book' and uses the appropriate HTTP verbs. Can someone elaborate on what this means?
Also, how much trouble is it to get backbone.js to connect to WebSockets? Lastly, are there any issues with integrating a backbone.js model with a function which returns JSON - in other words does the data model always need to be served via REST?
Backbone's power is that it has an incredibly flexible and modular structure. It means that any part of Backbone you can use, extend, take out, or modify. This includes the AJAX functionality.
Backbone doesn't "care" where you get the data for your collections or models. It will help you out by providing an out of the box RESTful "ajax" solution, but it won't be mad if you want to use something else!
This allows you to find (or write) any plugin you want to handle the server interaction. Just look on backplug.io, Google, and Github.
Specifically for Sockets there is backbone.iobind.
Can't find a plugin? No worries. I can tell you exactly how to write one (it's 100x easier than it sounds).
The first thing that you need to understand is that overwriting behavior is SUPER easy. There are 2 main ways:
Globally:
Backbone.Collection.prototype.sync = function() {
    //screw you Backbone!!! You're completely useless I am doing my own thing
};
Per collection type:
var MySpecialCollection = Backbone.Collection.extend({
    sync: function() {
        //I like what you're doing with the ajax thing... Clever clever ;)
        // But for a few collections I wanna do it my way. That cool?
    }
});
And the only other thing you need to know is what happens when you call "fetch" on a collection. This is the "by the book"/"out of the box" behavior:
collection#fetch is triggered by the user (YOU). fetch delegates the ACTUAL fetching (ajax, sockets, local storage, or even a function that instantly returns JSON) to another function, collection#sync. Whatever function is in collection.sync has to take 3 arguments:
action: one of create (for creating), read (for fetching), update (for updating), or delete (for deleting) = CRUD.
context (the this variable): the model or collection being synced. If you don't know what this does, don't worry about it, it's not important for now.
options: where da magic is. We only care about 1 option though:
    success: a callback that gets called when the data is "ready". THIS is the callback that collection#fetch is interested in, because that's when it takes over and does its thing. The only requirement is that sync passes it the following as its 1st argument:
        response: the actual data it got back.
In other words, collection#sync has to call the success callback from its options once it is done getting the data.
Whenever collection#sync is done doing its thing, collection#fetch takes back over (through the callback it passed in as success) and does the following nifty steps:
Calls set or reset (for these purposes they're roughly the same).
When set finishes, it triggers a sync event on the collection broadcasting to the world "yo I'm ready!!"
So what happens in set? Well, a bunch of stuff (deduping, parsing, sorting, removing, creating models, propagating changes, and general maintenance). Don't worry about it. It works ;) What you need to worry about is how you can hook into different parts of this process. The only two hooks you should worry about (if your server wraps data in weird ways) are:
collection#parse for parsing a collection. It should accept the raw JSON (or whatever format) that comes from the server/ajax/websocket/function/worker/whoknowswhat and turn it into an ARRAY of objects. It takes resp (the JSON) as its 1st argument and should return the transformed response. Easy peasy.
model#parse. Same as for the collection, but it takes in the raw objects (i.e. imagine you iterate over the output of collection#parse) and spits out an "unwrapped" object.
Get off your computer and go to the beach because you finished your work in 1/100th the time you thought it would take.
That's all you need to know in order to implement whatever server system you want in place of the vanilla "ajax requests".
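To make the fetch/sync handshake above concrete, here is a minimal sketch of a sync replacement backed by a plain function that returns JSON. The names makeSync, loadData, and fakeCollection are made up for illustration and are not part of Backbone; fakeCollection is only a stand-in that mimics "fetch delegates to sync, then parses the response" so the sketch runs without the library.

```javascript
// Build a sync function backed by any data source.
// `loadData` is a hypothetical callback-style loader, not a Backbone API.
function makeSync(loadData) {
  // Same three arguments that fetch passes to sync.
  return function sync(action, context, options) {
    if (action !== 'read') {
      throw new Error('this sketch only handles "read", got: ' + action);
    }
    loadData(function (response) {
      // Hand the raw data back; fetch's success callback takes over from here.
      if (options && options.success) {
        options.success(response);
      }
    });
  };
}

// Stand-in for a collection (NOT Backbone): fetch delegates to sync,
// then stores whatever parse returns.
var fakeCollection = {
  models: [],
  // Unwrap a response shaped like { items: [...] } into a plain array.
  parse: function (resp) { return resp.items; },
  sync: makeSync(function (done) {
    done({ items: [{ id: 1 }, { id: 2 }] });
  }),
  fetch: function () {
    var self = this;
    this.sync('read', this, {
      success: function (resp) { self.models = self.parse(resp); }
    });
  }
};

fakeCollection.fetch();
```

With the real library, the same function should be usable as the sync property passed to Backbone.Collection.extend, as in the per-collection override shown earlier; a WebSocket-backed loader would only need to call done when its message arrives.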

GWT RPC server side safe deserialization to check field size

Suppose I send objects of the following type from GWT client to server through RPC. The objects get stored to a database.
public class MyData2Server implements Serializable
{
    private String myDataStr;
    public String getMyDataStr() { return myDataStr; }
    public void setMyDataStr(String newVal) { myDataStr = newVal; }
}
On the client side, I constrain the field myDataStr to, say, 20 characters max.
I have been reading about web-application security. If I learned one thing, it is that client data should not be trusted, so the server should check the data. So I feel like I ought to check on the server that my field is indeed not longer than 20 characters, and otherwise abort the request, since I know it must be an attack attempt (assuming no bug on the client side, of course).
So my questions are:
How important is it to actually check on the server side my field is not longer than 20 characters? I mean what are the chances/risks of an attack and how bad could the consequences be? From what I have read, it looks like it could go as far as bringing the server down through overflow and denial of service, but not being a security expert, I could be mis-interpreting.
Assuming I would not be wasting my time doing the field-size check on the server, how should one accomplish it? I seem to recall reading (sorry I no longer have the reference) that a naive check like
if (myData2ServerObject.getMyDataStr().length() > 20) throw new MyException();
is not the right way. Instead one would need to define (or override?) the method readObject(), something like in here. If so, again how should one do it within the context of an RPC call?
Thank you in advance.
"How important is it to actually check on the server side my field is not longer than 20 characters?"
It's 100% important, except maybe if you can trust the end user 100% (e.g. some internal apps).
"I mean what are the chances"
Generally: increasing. The exact probability can only be determined for your concrete scenario individually (i.e. no one here will be able to tell you, though I would also be interested in general statistics). What I can say is that tampering is trivially easy. It can be done in the JavaScript code (e.g. using Chrome's built-in dev tools debugger) or by editing the clearly visible HTTP request data.
"/risks of an attack and how bad could the consequences be?"
The risks can vary. The most direct risk can be evaluated by asking: "What could an attacker store and do if they can set any field of any GWT-serializable object to any value?" This is not only about exceeding the size, but also about tampering with the user ID, etc.
"From what I have read, it looks like it could go as far as bringing the server down through overflow and denial of service, but not being a security expert, I could be mis-interpreting."
This is yet another level to deal with, and it cannot be addressed with server-side validation within the GWT RPC method implementation.
"Instead one would need to define (or override?) the method readObject(), something like in here."
I don't think that's a good approach. It tries to accomplish two things, but can do neither of them very well. There are two kinds of checks on the server side that must be done:
On a low level, when the bytes come in (before they are converted by RemoteServiceServlet to a Java Object). This needs to be dealt with on every server, not only with GWT, and would need to be answered in a separate question (the answer could simply be a server setting for the maximum request size).
On a logical level, after you have the data in the Java Object. For this, I would recommend a validation/authorization layer. One of the awesome features of GWT is that you can now use JSR 303 validation on both the server and the client side. It doesn't cover every aspect (you would still have to test for user permissions), but it can cover your "@Size(max = 20)" use case.

Will inserting the same `<script>` into the DOM twice cause a second request in any browsers?

I've been working on a bit of JavaScript code that, under certain conditions, lazy-loads a couple of different libraries (Clicky Web Analytics and the Sizzle selector engine).
This script is downloaded millions of times per day, so performance optimization is a major concern. To date, I've employed a couple of flags like script_loading and script_loaded to try to ensure that I don't load either library more than once (by "load," I mean requesting the scripts after page load by inserting a <script> element into the DOM).
My question is: Rather than rely on these flags, which have gotten a little unwieldy and hard to follow in my code (think callbacks and all of the pitfalls of asynchronous code), is it cross-browser safe (i.e., back to IE 6) and not detrimental to performance to just call a simple function to insert a <script> element whenever I reach a code branch that needs one of these libraries?
The latter would still ensure that I only load either library when I need it, and would also simplify and reduce the weight of my code base, but I need to be absolutely sure that this won't result in additional, unnecessary browser requests.
My hunch is that appending a <script> element multiple times won't be harmful, as I assume browsers should recognize a duplicate src URL and rely on a local cached copy. But, you know what happens when we assume...
I'm hoping that someone is familiar enough with the behavior of various modern (and not-so-modern, such as IE 6) browsers to be able to speak to what will happen in this case.
In the meantime, I'll write a test to try to answer this first-hand. My hesitation is just that this may be difficult and cumbersome to verify with certainty in every browser that my script is expected to support.
Thanks in advance for any help and/or input!
Got an alternative solution.
At the point where you insert the new script element in the DOM, could you not do a quick scan of existing script elements to see if there is another one with the same src? If there is, don't insert another?
Javascript code on the same page can't run multithreaded, so you won't get any race conditions in the middle of this or anything.
Otherwise you are just relying on the caching behaviour of current browsers (and HTTP proxies).
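A minimal sketch of that scan, assuming the src is compared exactly as it was written into the attribute. The function names hasScript and insertScriptOnce are made up for illustration, and the document object is passed in as a parameter so the logic can be tried outside a browser. How very old browsers such as IE 6 report the attribute value is not verified here, so treat the string comparison as an assumption to test.

```javascript
// Returns true if `doc` already contains a <script> whose src attribute
// equals `src` exactly as written (assumption: no URL normalization).
function hasScript(doc, src) {
  var scripts = doc.getElementsByTagName('script');
  for (var i = 0; i < scripts.length; i++) {
    if (scripts[i].getAttribute('src') === src) {
      return true;
    }
  }
  return false;
}

// Inserts the script only if no matching element exists.
// Returns true if a new element was inserted, false otherwise.
function insertScriptOnce(doc, src) {
  if (hasScript(doc, src)) {
    return false;
  }
  var el = doc.createElement('script');
  el.setAttribute('src', src);
  doc.getElementsByTagName('head')[0].appendChild(el);
  return true;
}
```

In a page this would be called as insertScriptOnce(document, url) at every code branch that needs the library. Note that it only prevents duplicate elements; it does not tell you whether the first copy has finished loading.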
The page is processed as a stream. If you load the same script multiple times, it will be run every time it is included. Thanks to the browser cache, it will normally be requested from the server only once, provided the response is cacheable.
I would stay away from this approach of inserting script tags for the same script multiple times.
The way I solve this problem is to have a "test" function for every script to see if it is loaded. E.g. for sizzle this would be "function() { return !!window['Sizzle']; }". The script tag is only inserted if the test function returns false.
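Here is one way the test-function idea could be sketched so that it also covers the "still loading" case from the question. The names createLoader and insert are hypothetical: insert(url, onload) stands for whatever code appends the script element and calls onload once it has loaded, and it is passed in so the bookkeeping can be exercised without a browser.

```javascript
// Returns a load(url, isLoaded, callback) function.
// `insert` is a hypothetical function(url, onload) that appends the
// <script> element and calls onload when the script has loaded.
function createLoader(insert) {
  var pending = {}; // url -> callbacks waiting for an in-flight load

  return function load(url, isLoaded, callback) {
    if (isLoaded()) {
      // Library already present: no new element, no request.
      callback();
      return;
    }
    if (Object.prototype.hasOwnProperty.call(pending, url)) {
      // A load is already in flight: just queue the callback.
      pending[url].push(callback);
      return;
    }
    pending[url] = [callback];
    insert(url, function () {
      var waiting = pending[url];
      delete pending[url];
      for (var i = 0; i < waiting.length; i++) {
        waiting[i]();
      }
    });
  };
}
```

For Sizzle, the isLoaded argument would be the test from above, function() { return !!window['Sizzle']; }. The caller never touches a script_loading or script_loaded flag; it just calls load wherever the library is needed.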
Each time you add a script to your page, even if it has the same src, the browser may find it in the local cache or ask the server whether the content has changed.
Using a variable to check whether the script is already included is a good way to reduce loading, and it's very simple.
For example, this may work for you:
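var LOADED_JS = {}; // registry of scripts already requested, keyed by name
function js_isIncluded(name) { // returns true if the js is already loaded
    // own-property check, so names such as "toString" are not false positives
    return Object.prototype.hasOwnProperty.call(LOADED_JS, name);
}
function include_js(name) {
    if (!js_isIncluded(name)) {
        YOUR_LAZY_LOADING_FUNCTION(name); // placeholder for your own loader
        LOADED_JS[name] = true;
    }
}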
You can also get all script elements and check their src, but I think my solution is better: it has the speed and simplicity of a hash lookup, and it avoids the problem that a script's src property holds an absolute path even if you set it with a relative path.
You may also want to initialize the registry at page init with the scripts that are loaded normally (without lazy loading), to avoid requesting them twice.
For what it's worth, if you define the scripts as type="module", they will only be loaded and executed once.

Delphi: App initialization - best practices / approach

I run into this regularly, and am just looking for best practice/approach. I have a database / datamodule-containing app, and want to fire up the database/datasets on startup w/o having "active at runtime" set to true at design time (database location varies). I also want to run a web "check for updates" routine when the app starts up.
Given TForm event sequences, and results from various trial and error, I'm currently using this approach:
I use a "Globals" record set up in the main form to store all global vars, have one element of that called Globals.AppInitialized (boolean), and set it to False in the Initialization section of the main form.
At the main form's OnShow event (all forms are created by then), I test Globals.AppInitialized; if it's false, I run my "Initialization" stuff, and then finish by setting Globals.AppInitialized := True.
This seems to work pretty well, but is it the best approach? Looking for insight from others' experience, ideas and opinions. TIA..
I generally always turn off auto creation of all forms EXCEPT for the main form and possibly the primary datamodule.
One trick that I learned you can do, is add your datamodule to your project, allow it to auto-create and create BEFORE your main form. Then, when your main form is created, the onCreate for the datamodule will have already been run.
If your application has some code to, say, set the focus of a control (something you can't do on creation, since it's "not visible yet"), then create a user message and post it to the form in your OnCreate. The message SHOULD (no guarantee) be processed as soon as the form's message loop is processed. For example:
const
    wm_AppStarted = wm_User + 101;
type
    Form1 = class(tForm)
        :
        procedure wmAppStarted(var Msg: tMessage); message wm_AppStarted;
    end;
// in your oncreate event add the following, which should result in your wmAppStarted event firing.
PostMessage(handle, wm_AppStarted, 0, 0);
I can't think of a single time that this message was not processed, but the nature of the call is that it is added to the message queue, and if the queue is full then it is "dropped". Just be aware that this edge case exists.
You may want to edit the project source (.dpr file) directly and add your initialization after the form creation calls and before Application.Run. (Or even earlier, if needed.)
This is how I usually handle such initialization stuff:
...
Application.CreateForm(TMainForm, MainForm);
...
MainForm.ApplicationLoaded; // loads options, etc..
Application.Run;
...
I don't know if this is helpful, but some of my applications don't have any form auto created, i.e. they have no mainform in the IDE.
The first form created with the Application object as its owner will automatically become the mainform. Thus I only autocreate one datamodule as a loader and let this one decide which datamodules to create when and which forms to create in what order. This datamodule has a StartUp and ShutDown method, which are called as "brackets" around Application.Run in the dpr. The ShutDown method gives a little more control over the shutdown process.
This can be useful when you have designed different "mainforms" for different use cases of your application or you can use some configuration files to select different mainforms.
There actually isn't such a concept as a "global variable" in Delphi. All variables are scoped to the unit they are in and other units that use that unit.
Just make the AppInitialized and Initialization stuff as part of your data module. Basically have one class (or datamodule) to rule all your non-UI stuff (kind of like the One-Ring, except not all evil and such.)
Alternatively you can:
Call it from your splash screen.
Do it during log in
Run the "check for update" in a background thread - don't force them to update right now. Do it kind of like Firefox does.
I'm not sure I understand why you need the global variables? Nowadays I write ALL my Delphi apps without a single global variable. Even when I did use them, I never had more than a couple per application.
So maybe you need to first think why you actually need them.
I use a primary Data Module to check whether the DB connection is OK. If it isn't, the data module shows a custom component form to set up the DB connection; then the main form is loaded:
Application.CreateForm(TDmMain, DmMain);
if DmMain.isDBConnected then
begin
    Application.CreateForm(TDmVisualUtils, DmVisualUtils);
    Application.CreateForm(TfrmMain, frmMain);
end;
Application.Run;
One trick I use is to place a TTimer on the main form, set its interval to something like 300ms, and perform any initialization (db login, network file copies, etc.) in its timer event. Starting the application brings up the main form immediately and lets the initialization 'stuff' happen afterwards. Users don't start up multiple instances thinking "Oh..I didn't dbl-click...I'll do it again.."