Zend Framework Dynamic Routing - zend-framework

I have a website that is purely database driven. I'm new to Zend Framework, and to the concept of routing, though I have been doing a lot of reading. My brain is pretty much a sponge right now, with some things still sinking in. I am using ZF mainly for the routing, though I plan to implement other aspects of it when I can. For the most part it is a learning process, so there are some things I will want to write myself without the framework.
Here’s how the site should work:
URLs could be anything from “/” - a root index, to
“/contact/” - a root file, to
“/deposits/” - a sub-directory, to
“/deposits/ira/” - a file in the sub-directory.
When a user clicks on a link, code will need to parse the REQUEST_URI in order to look it up in the “pages” table of the database. The sole purpose of this is to get the ID of the record matching the REQUEST_URI. That ID is the key to everything for the page, and other tables are then checked to see if there is any data for other aspects of the page that needs to be fetched. The immediate need is for the template name. The site will have a few different templates that are used depending on whether the page is a home page, a section landing page, or content of a section. This is decided when a page is saved to the DB.
I want to be able to take this data and then decide how to route it so that it uses the correct template and can collect the rest of the data from that point to complete the page.
Since sections and pages can be created at any time, there must be controllers that can decide what to do based on the returned template data. This pretty much means the controllers will need standardized names that do not depend on the values passed in the REQUEST_URI.
How would I accomplish this in Zend so that all this happens before the controller is selected, and only having template names to go by for selecting the correct controller?
Thanks,
Cy

If it's truly the case that all your routes (at least for the frontend) are dynamic, then it sounds like you could do this:
1. On bootstrap, remove all default routes.
2. In a plugin on routeStartup, grab the REQUEST_URI and query your db. Presumably that record contains enough info for you to figure out the required controller, action, layout, etc.
2.1 Add a single route (matching the REQUEST_URI) mapping to the controller and action.
2.2 Set the layout to be the required layout.
Then during normal dispatch, the route will match, the controller/action will be invoked and you'll have the layout set correctly.
If the request doesn't match any page stored in your db, you'll have to invoke the error controller/action yourself to give a 404 response.
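The lookup step described above can be sketched outside of any framework. This is a minimal Python illustration, not Zend Framework code: the table layout, the template names and the controller/action names in TEMPLATE_ROUTES are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical mapping from the template name saved with a page to a
# generically named controller/action pair.
TEMPLATE_ROUTES = {
    "home": ("home", "index"),
    "section": ("section", "landing"),
    "content": ("content", "view"),
}


def make_demo_db():
    """Build an in-memory stand-in for the "pages" table (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pages (id INTEGER PRIMARY KEY, uri TEXT UNIQUE, template TEXT)"
    )
    db.executemany(
        "INSERT INTO pages (uri, template) VALUES (?, ?)",
        [("/", "home"), ("/deposits/", "section"), ("/deposits/ira/", "content")],
    )
    return db


def resolve(db, request_uri):
    """Map a request URI to routing info, or to an error route when no page matches."""
    path = request_uri.split("?", 1)[0]  # ignore any query string
    if not path.endswith("/"):
        path += "/"  # stored URIs are assumed to end with a slash
    row = db.execute(
        "SELECT id, template FROM pages WHERE uri = ?", (path,)
    ).fetchone()
    if row is None or row[1] not in TEMPLATE_ROUTES:
        return {"controller": "error", "action": "not-found", "status": 404}
    controller, action = TEMPLATE_ROUTES[row[1]]
    return {
        "controller": controller,
        "action": action,
        "page_id": row[0],
        "layout": row[1],
        "status": 200,
    }
```

In a real plugin the returned values would be used to add the single route and set the layout; here they are just returned so the decision logic is easy to see.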
However, if you eventually have some static (that is, non-db-stored) routes in your app (which I imagine must be the case), then you'll want to match against them before you hit the db to search for the requested one. That matching sounds like a pain (though perhaps there is a way to ask the router itself if the requested route matches, just like the standard dispatcher does).
In that case, perhaps an alternative approach would be to add all those (non-db-stored) routes to the router in the standard way at bootstrap, and put all this REQUEST_URI inspection, db-searching, and controller/action/layout handling in the 404 handler. If the requested url matches something in your db, then _forward() (not redirect) to that controller/action and set the layout, as above.
It's probably not the most performant solution, since _forward() triggers another iteration in the dispatch loop, but it seems like it could work.
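To make the trade-off concrete, here is a rough Python sketch of that alternative: static routes are tried first, and only the not-found path consults the stored pages. It is not Zend Framework code; STATIC_ROUTES, DB_PAGES and the controller names are hypothetical stand-ins, and the pass counter only illustrates the extra dispatch iteration.

```python
# Hypothetical static (non-db-stored) routes registered at bootstrap.
STATIC_ROUTES = {"/admin/login/": ("admin", "login")}

# Stand-in for the db query: URI -> template name.
DB_PAGES = {"/deposits/": "section", "/deposits/ira/": "content"}


def dispatch(path):
    """Return (controller, action, passes) for a request path.

    `passes` counts dispatch-loop iterations, to show the cost of
    forwarding from the not-found handler.
    """
    passes = 0
    target = STATIC_ROUTES.get(path)
    if target is None:
        # No static route matched, so the not-found handler runs first.
        passes += 1
        template = DB_PAGES.get(path)
        if template is None:
            # Nothing in the db either: this really is a 404.
            return ("error", "not-found", passes)
        # Forward (not redirect) to the generic controller for this template.
        target = (template, "view")
    passes += 1
    return (*target, passes)
```

A static route costs one pass, a db-backed page costs two, which is the overhead mentioned above.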

Related

REST on non-CRUD operations

I have a resource called “subscriptions”
I need to update a subscription’s send date. When a request is sent to my endpoint, my server will call a third-party system to update the passed subscription.
“subscriptions” have other types of updates. For instance, you can change a subscription’s frequency. This operation also involves calling a third-party system from my server.
To be truly “RESTful,” must I force these different types of updates to share an endpoint?
PATCH subscriptions/:id
I can hypothetically use my controller behind the endpoint to fire different functions depending on the query string... But what if I need to add a third or fourth “update” type action? Should they ALL run through this single PATCH route?
To be truly “RESTful,” must I force these different types of updates to share an endpoint?
No - but you will often want to.
Consider how you would support this on the web: you might have a number of different HTML forms, each accepting a slightly different set of inputs from the user. When the form is submitted, the browser will use the input controls and form metadata to construct an HTTP (POST) request. The target URI of the request is copied from the form action.
So your question is analogous to: should we use the same action for all of our different forms?
And the answer is yes, if you want the general purpose HTTP application to understand which resource is expected to change in response to the message. One reason that you might want that is cache invalidation; using the right target URI allows all of the caches to understand which previously cached responses should not be reused.
Is that choice free? No: it adds some ambiguity to your access logs, and routing the request to the appropriate handler in your code takes a bit more work.
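The "bit more work" on the routing side might look something like the following Python sketch. It assumes the PATCH body is a flat dictionary of changed fields and that an unsupported field is answered with 422; the field names, handler functions and status choice are illustrative assumptions, not anything prescribed by HTTP.

```python
def update_send_date(sub_id, value):
    # Placeholder for the call to the third-party system.
    return f"third-party: set send date of {sub_id} to {value}"


def update_frequency(sub_id, value):
    # Placeholder for the call to the third-party system.
    return f"third-party: set frequency of {sub_id} to {value}"


# One entry per kind of update; a third or fourth kind is one more entry.
FIELD_HANDLERS = {
    "send_date": update_send_date,
    "frequency": update_frequency,
}


def patch_subscription(sub_id, body):
    """Handle PATCH subscriptions/:id by dispatching on the changed fields."""
    unknown = sorted(set(body) - set(FIELD_HANDLERS))
    if unknown:
        return 422, [f"unsupported field: {name}" for name in unknown]
    results = [FIELD_HANDLERS[name](sub_id, value) for name, value in body.items()]
    return 200, results
```

The target URI stays subscriptions/:id for every kind of update, so caches can tell which resource changed.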
Trying to use PATCH with different target URI is a little bit weird, and suggests that maybe you are trying to stretch PATCH beyond the standard constraints.
PATCH (and PUT) have remote authoring semantics; what they mean is "make your copy of the target resource look like my copy". These are methods we would use if we were trying to fix a spelling error on a web page.
Trying to change the representation of one resource by sending a remote authoring request to a different resource makes it harder for the general purpose HTTP application components to add value. You are coloring outside of the lines, and that means accepting the liability if anything goes wrong because you are using standardized messages in a non standard way.
That said, it is reasonable to have many different resources that present representations of the same domain entity. Instead of putting everything you know about a user into one web page, you can spread it out among several that are linked together.
You might have, for example, a web page for an invoice, and then another web page for shipping information, and another web page for billing information. You now have a resource model with clearer separation of concerns, and can combine the standardized meanings of PUT/PATCH with this resource model to further your business goals.
We can create as many resources as we need (in the web level; at the REST level) to get a job done. -- Webber, 2011
So, in your example, would I do one endpoint like this user/:id/invoice/:id and then another like this user/:id/billing/:id?
Resources, not endpoints.
GET /invoice/12345
GET /invoice/12345/shipping-address
GET /invoice/12345/billing-address
Or
GET /invoice/12345
GET /shipping-address/12345
GET /billing-address/12345
The spelling conventions that you use for resource identifiers don't actually matter very much.
So if it makes life easier for you to stick all of these into a hierarchy that includes both users and invoices, that's also fine.

Restful business logic on property update

I'm building a REST API and I'm trying to keep it as RESTful as possible, but some things are still not quite clear to me. I have seen a lot of topics about similar questions, but they all focus on the "simple" problem of updating data; my issue is more about the business logic around that.
My main issue is with business logic triggered by a partial update of a model. I see a lot of different opinions online about PATCH methods, creating new sub-resources or adding actions, but these often seem counterproductive given the REST approach of keeping URIs simple and structured.
I have some records that need to be processed (refused, validated, partially validated, etc.), and each change triggers additional actions.
If it's refused, an email with the reason should be sent.
If it's partially validated, a link to fill in the missing data is sent.
If it's validated, some other resources must be created.
There are a few other changes that can be made to the status, but this is enough for the example.
What would be a RESTful way to do that?
My first idea would be to create actions:
POST /record/:id/refuse
POST /record/:id/validate, etc.
It seems RESTful to me but too complicated, and moreover, this approach means having multiple routes that do essentially the same thing: update one field in the record object.
I also see the possibility of a PATCH method like:
PATCH /record/:id, in which I check whether the field to update is status, and use the new value to decide which action to perform.
But I feel it could become too complex once I need to perform similar actions for other properties of the record.
My last option, and I think maybe the best but I'm not sure if it's RESTful, would be to use a sub-resource status and to use PUT to update it:
PUT /record/:id/status, with a switch on the new value.
No matter what the previous value was, switching to accepted will always trigger the creation, switching to refused will always trigger the email, etc.
Are these approaches RESTful, and which one makes more sense? Is there another alternative I didn't think about?
Thanks
What would be a RESTful way to do that?
In HTTP, your "uniform interface" is that of a document store. Your REST API is a facade that takes messages with remote authoring semantics (PUT/POST/PATCH), and your implementation produces useful work as a side effect of its handling of those messages.
See Jim Webber 2011.
I have some records that need to be processed (refused, validated, partially validated, etc.), and each change triggers additional actions.
So think about how we might do this on the web. We GET some resource, and what is returned is an HTML representation of the information in the record and a bunch of forms that describe actions we can take. So there's a refused form, and a validated form, and so on. The user chooses the correct form to use in the browser, fills in any supplementary information, and submits the form. The browser, using the HTML form processing rules, converts the form information into an HTTP request.
For unsafe operations, the form is configured to use POST, and the browsers therefore know that the form data should be part of the message-body of the request.
The target-uri of the request is just whatever was used as the form action -- which is to say, the representation of the form includes in it the information that describes where the form should be submitted.
As far as the browser and the user are concerned, the target-uri can be anything. So you could have separate resources to handle validate messages and refused messages and so on.
Caching is an important idea, both in REST and in HTTP; HTTP has specific rules baked into it for cache invalidation. Therefore, it is often the case that you will want to use a target-uri that identifies the document you want the client to reload if the command is successful.
So it might go something like this: we GET /record/123, and that gives us a bunch of information, and also some forms describing how we can change the record. So fill one out, submit it successfully, and now we expect the forms to be gone - or a new set of forms to be available. Therefore, it's the record document itself that we would expect to be reloading, and the target-uri of the forms should be /record/123.
(So the API implementation would be responsible for looking at the HTTP request, and figuring out the meaning of the message. They might all go to a single /record/:id POST handler, and that code looks through the message-body to figure out which internal function should do the work).
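As a rough illustration of that single POST handler, here is a Python sketch that reads a form-encoded message body and picks the internal function from it. The "action" and "reason" field names, the handler functions and the 400 response are assumptions made for the example.

```python
from urllib.parse import parse_qs


def refuse(record_id, form):
    # Placeholder for the real work: update the record, send the email.
    return f"record {record_id} refused, email sent: {form.get('reason', '')}"


def validate(record_id, form):
    # Placeholder for the real work: update the record, create resources.
    return f"record {record_id} validated, follow-up resources created"


# Hypothetical value of the "action" form field -> internal function.
ACTIONS = {"refuse": refuse, "validate": validate}


def post_record(record_id, raw_body):
    """Handle POST /record/:id by inspecting the submitted form."""
    # parse_qs returns lists of values; keep the first value of each field.
    form = {key: values[0] for key, values in parse_qs(raw_body).items()}
    handler = ACTIONS.get(form.get("action"))
    if handler is None:
        return 400, "unknown or missing action"
    return 200, handler(record_id, form)
```

Every form targets /record/123, so a successful submission tells caches that the record document itself should be reloaded.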
PUT/PATCH are the same sort of idea, except that instead of submitting forms, we send edited representations of the resource itself. We GET /record/123, change the status (for example, to Rejected), and then send a copy of our new representation of the record to the server for processing. It would therefore be the responsibility of the server to examine the differences between its representation of the resource and the new provided copy, and calculate from them any necessary side effects.
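A minimal Python sketch of that server-side comparison, assuming records are plain dictionaries kept in an in-memory store; the status values and the descriptions of the side effects are illustrative only.

```python
def side_effects(stored, submitted):
    """Work out side effects from the difference between two representations."""
    effects = []
    new_status = submitted.get("status")
    if stored.get("status") != new_status:
        if new_status == "refused":
            effects.append("send refusal email")
        elif new_status == "partially_validated":
            effects.append("send link for missing data")
        elif new_status == "validated":
            effects.append("create dependent resources")
    return effects


def put_record(store, record_id, submitted):
    """Handle PUT /record/:id: diff first, then replace the stored copy."""
    effects = side_effects(store[record_id], submitted)
    store[record_id] = dict(submitted)
    return effects
```

Because the effects are derived from the difference, sending the same representation a second time produces no further side effects, which fits the idempotent semantics of PUT.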
My last option, and I think maybe the best but I'm not sure if it's RESTful, would be to use a sub-resource status and to use PUT to update it
It's fine -- think of any web page you have ever seen where the source has a link to an image, or a link to JavaScript. The result is two resources instead of one, with separate cache entries for each -- which is great, when you want fine grained control over the caching of the resources.
But there's a trade-off: you also need to fetch more resources. (Server-push mitigates some of this problem).
Making things easier on the server may make things harder on the client - you're really trying to find the design with the best balance.

Adobe Analytics overwriting visitorID

I need to overwrite the default visitorID, that's automatically set by Adobe Analytics s_code, with a custom value.
As explained here, I may set the s.visitorID variable for this purpose, but it's not clear to me how, and above all when, to do so.
I guess this variable should be set in the s_code itself, but I fear that the automatic visitorID would be used anyway for the first s.t() call, in place of the custom value I'd like to use. In fact, I want the custom visitorID to be used from the very first automatic request. In addition, assuming that the custom value is passed in a GET parameter, I'd like to know whether the "Query string parameter on the image request" ("vid" param) could be used for this scenario, and how (it is the second method listed in the link above). Thank you.
Some thoughts and words of warning about setting the visitor id yourself:
There are some benefits to setting the visitor id yourself. The main benefit is that you are 100% in control of how visit(or)s are tracked. Another good reason to do this is if you already have a visitor id infrastructure in place on your site, setting AA's visitor id to what you already have can potentially make it easier to tie data to or cross-reference data between AA and other places that utilize that visitor id.
Sidenote for that.. It is possible to get the visitor ids from Adobe from various places (e.g. Data Warehouse, Data Workbench, using the Adobe API), but currently there is no report in Adobe Analytics itself to see the visitor id, even if you set it yourself. To get around this, you can also assign it to a prop and/or eVar, which is visible in AA reports. But this is not possible with the 3rd party cookie tracking (because javascript cannot read 3rd party cookies), so that is another benefit to setting it yourself.
Beyond that, IMO there are basically no benefits to setting it yourself. On the other hand.. now for some warnings...
Setting a value in visitorID or vid or equivalent does NOT cause AA to update the visitor id in its cookie. AA continues to generate/output the visitor id in the cookie and then on AA server, and it simply overwrites that value with your visitor id for that hit (on the backend, on the collection server). But it does NOT update the cookie with your new visitor id.
The implication here is that if you want to set the visitor id yourself, you are essentially putting the responsibility on yourself to keep track of the visitor. So, you must have your own infrastructure in place that not only generates the id, but preserves it and ties it to the visitor, so that it can be output on every page the visitor views - including if the visitor navigates between multiple domains your tracking code is implemented on (if applicable).
If you do not do this, then the visit(or) will break and count as a new visit(or) going from page to page, or when the visitor hops from domainA to domainB. How often the break actually occurs is directly tied to how often you are actually outputting the visitor id yourself (with a correct value). For example, if you only set it on the first hit and then never set it again, the visit(or) will break once, because from the 2nd hit onward AA will just default to using its own generated visitor id.
To put this into perspective.. I've had my fingers in the web analytics pie for almost 10 years now, the last 6 of them working full time for a high profile web analytics agency, so this is basically all I do for a living. I've worked with I don't even know how many clients by now (over 100 for sure), and there has only been one client who actually went through with setting the visitor id themselves. I'm just mentioning this so that you understand that (from my experience, anyways), this is not something most clients embark upon. So, make sure you are absolutely confident about your methods for generating and keeping track of visitor id values before doing this.
Having said that, if you still want to do this...
Firstly, to be clear about the link you posted, those are examples of how the visitor id can be set depending on your implementation. Adobe offers several ways to record data, and the javascript method is just one way. The examples on that page show what you'd set for some of the other ways (including the js way).
The point here is that not all of those methods may be relevant to you, depending on your implementation. For example, if you are only implementing AA with javascript (whether it be the core s_code.js or through DTM or w/e), the only one relevant to you is s.visitorID.
So for example, yes, you can use s.visitorID to override AA's default visitor id generation, for javascript implementation. If you set that variable, you will see the vid param show up in the request to AA (look at the request sent to the AA collection server with a packet sniffer, or with browser addon or dev tools net/traffic tab).
The reason vid was mentioned on that link is because that is what you'd use if you are manually building the request URL to AA. For example, if you don't want to use the javascript implementation, and instead want to use server-side logic to build and output an image tag yourself, or send data to Omniture directly from your server (e.g. cURL), the vid param is what you'd set for the visitor id.
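If you do go the server-side route, adding the vid param is plain query-string handling. Here is a small Python sketch; the collection URL in the usage example is a placeholder and not Adobe's actual request format, so take the real base URL and the other parameters from Adobe's documentation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_visitor_id(request_url, visitor_id):
    """Return request_url with its vid query parameter set to visitor_id."""
    parts = urlsplit(request_url)
    # Drop any existing vid so the custom value is the only one sent.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "vid"
    ]
    params.append(("vid", visitor_id))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Placeholder URL, for illustration only.
print(with_visitor_id("https://collect.example.invalid/hit?pageName=home", "abc123"))
```

The same caveat as above applies: you have to send your value on every request, not just the first one.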
Sidenote: hardcoding your own image request is kind of a throwback for mobile device tracking in earlier times when mobile devices did not consistently or fully support javascript. Pretty much all modern mobile devices these days fully support javascript, and on top of that, Adobe has done a lot of work over time streamlining the core AA library to be leaner and more efficient whether viewing on desktop or mobile browser (make sure you are using the latest version of AppMeasurement library).
So IOW if you are using the javascript library, you don't need to worry about the vid param, because the js library already does it. Although for QA purposes, you can check that it is there with your value on a given request.
As far as "how" to set it.. assuming you're implementing it with javascript: you set it like any other AA variable. Somewhere between the AA library loading and the s.t or s.tl trigger, you assign a value to it, e.g.
s.visitorID="[my value here]";
Where specifically you set it, depends on how you have implemented AA. For example, if you have implemented AA through DTM, and you have your custom visitor id exposed on your page before the DTM library is loaded (e.g. some data layer property, or in a cookie), you can create a data element that grabs that value, and then in the AA > Tool Config > Cookie > Visitor ID field, you can specify your data element, and DTM will set it for you (but you are still responsible for making sure whatever source the data element draws from is there)
And again, note that even if you set it, you will still see AA's default generated id in their cookie and request urls. The override happens on AA's collection server, which you do not have visibility into. To verify that AA is actually using your custom value, you will need to export it from AA (e.g. with a data warehouse export).

GWT Editors: How to record changes to fields and sub-editors? (RequestFactory?)

I have an app that makes extensive use of the Editor Framework. Right now I'm at the point where I want to add a new feature: if a user edits an entity, I'd like to record which changes were made and store them in a separate datastore entity. This requires knowing if a field was changed, the field name, and the value it was changed to.
This is what I'd like to implement:
App calls edit(bean);
User makes changes, calls flush() and data gets sent back to server.
In server handler, changes from the bean are sent to processChanges(List<String> paths) which then creates and stores the record that "field foo" was changed to "bar", and so on.
The entity is saved, overwriting the existing one.
I use GWTP and currently use the RPC Command Pattern. I've read a bit about RequestFactory and as I understand, one of its main benefits is that it only sends the changed fields known as "deltas" back to the server to minimise the payload, so I'm wondering if using RequestFactory would be a better fit for my app?
Apologies - I've been reading through the GWT docs and Javadocs for the Editor Framework and RequestFactory but I'm still pretty confused. RequestFactoryEditorDriver.getPaths() seems like it might be what I need but any advice or pointers greatly appreciated.
I could probably watch for changes client-side but that seems like a bad idea.
I believe you could do that using an EditorVisitor, similar to the DirtCollector visitor used internally by the Editor framework (have a look at the PathCollector for how to collect paths in a visitor).
I would start by visiting the hierarchy to collect the initial values just after the call to edit() (this is done already by the DirtCollector internally, but there's no way to access its results, and it only collects leaf values anyway).
Then you could call flush() and see whether there are errors, and possibly validate your object to see if everything's OK. Then you visit the hierarchy again to collect the changes (against the initial values you previously collected) so you can send them to the server.
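The snapshot-then-compare idea is independent of GWT, so here is a language-neutral sketch in Python rather than an EditorVisitor implementation. Nested dictionaries stand in for the editor hierarchy, and the dotted path format is an assumption for the example.

```python
def flatten(bean, prefix=""):
    """Flatten nested dicts into {"path.to.field": value}."""
    flat = {}
    for name, value in bean.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # Descend into the sub-object, like visiting a sub-editor.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def collect_changes(initial, current):
    """Return {path: new_value} for every path whose value differs."""
    before, after = flatten(initial), flatten(current)
    return {
        path: after.get(path)
        for path in sorted(set(before) | set(after))
        if before.get(path) != after.get(path)
    }
```

The first visit corresponds to flatten(initial) right after edit(), the second to flatten(current) after flush(); the resulting map is what you would send to the server for recording.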

Connectedness & HATEOAS

It is said that in a well defined RESTful system, the clients only need to know the root URI or a few well known URIs, and the client shall discover all other links through these initial URIs. I do understand the benefits (decoupled clients) of this approach, but the downside for me is that the client needs to discover the links each time it tries to access something, i.e. given the following hierarchy of resources:
/collection1
collection1
|-sub1
   |-sub1sub1
      |-sub1sub1sub1
         |-sub1sub1sub1sub1
   |-sub1sub2
|-sub2
   |-sub2sub1
   |-sub2sub2
|-sub3
   |-sub3sub1
   |-sub3sub2
If we follow the "Client only need to know the root URI" approach, then a client shall only be aware of the root URI, i.e. /collection1 above, and the rest of the URIs should be discovered by the clients through hypermedia links. I find this cumbersome because each time a client needs to do a GET, say on sub1sub1sub1sub1, should the client first do a GET on /collection1, then follow the link defined in the returned representation, and then do several more GETs on sub resources to reach the desired resource? Or is my understanding of connectedness completely wrong?
Best regards,
Suresh
You will run into this mismatch when you try to build a REST API that does not match the flow of the user agent that is consuming the API.
Consider when you run a client application, the user is always presented with some initial screen. If you match the content and options on this screen with the root representation then the available links and desired transitions will match nicely. As the user selects options on the screen, you can transition to other representations and the client UI should be updated to reflect the new representation.
If you try and model your REST API as some kind of linked data repository and your client UI as an independent set of transitions then you will find HATEOAS quite painful.
Yes, it's right that the client application should traverse the links, but once it's discovered a resource, there's nothing wrong with keeping a reference to that resource and using it for a longer time than one request. If your client has the possibility of remembering things permanently, it can do so.
Consider how a web browser keeps its bookmarks. You probably have maybe ten or a hundred bookmarks in the browser, and you probably found some of these deep in a hierarchy of pages, but the browser dutifully remembers them without having to remember the path it took to find them.
A richer client application could remember the URI of sub1sub1sub1sub1 and reuse it if it still works. It's likely that it still represents the same thing (it ought to). If it no longer exists or fails for any other client reason (4xx) you could retrace your steps to see if you can find a suitable replacement.
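A small Python sketch of that "bookmark first, retrace on failure" behaviour. The in-memory SERVER table, the link relation names and the shape of the representations (a "links" dictionary) are assumptions for the example; `get` stands for whatever HTTP call the client really makes.

```python
# Fake server: URI -> representation with named links (illustrative only).
SERVER = {
    "/collection1": {"links": {"sub1": "/collection1/sub1"}},
    "/collection1/sub1": {"links": {"sub1sub1": "/collection1/sub1/v2/sub1sub1"}},
    "/collection1/sub1/v2/sub1sub1": {"links": {}},
}


def fake_get(uri):
    """Stand-in for an HTTP GET: returns (status, representation)."""
    if uri in SERVER:
        return 200, SERVER[uri]
    return 404, {}


def discover(get, root, rels):
    """Follow link relations from the root; return the final URI or None."""
    uri = root
    for rel in rels:
        status, body = get(uri)
        if status >= 400 or rel not in body.get("links", {}):
            return None
        uri = body["links"][rel]
    return uri


def locate(get, bookmark, root, rels):
    """Reuse a remembered URI if it still works, otherwise retrace the links."""
    if bookmark is not None:
        status, _ = get(bookmark)
        if 200 <= status < 300:
            return bookmark
    return discover(get, root, rels)
```

With a working bookmark the client makes one request; only a stale bookmark costs the full walk from the root.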
And of course what Darrel Miller said :-)
I don't think that that's the strict requirement. From how I understand it, it is legal for a client to access resources directly and start from there. The important thing is that you do not do this for state transitions, i.e. do not automatically proceed with /foo2 after operating on /foo1 and so forth. Retrieving /products/1234 initially to edit it seems perfectly fine. The server could always return, say, a redirect to /shop/products/1234 to remain backwards compatible (which is desirable for search engines, bookmarks and external links as well).