REST with Zend Framework: how to route module-based versioning and an API key

I am building RESTful API services with ZF 1.10.8. As I am a newbie, ZF routing is a little confusing.
I need to have versioning, api_key, and response format in url, something like:
/:version/:response_format/:api_key/:controller ...
/1.0/json/1234567890/articles/
The version is module based with the latest version as default
How to get this done?

Versioning is really not as simple as putting /v1/ in the URI.
In fact, that makes the API non-REST.
To do REST properly, every resource (thing the client wants to access) has one and only one URI.
The URI stays the same for v1, v2, and v3; what changes is how you present that resource to the client.
How do you know which version they want? They set it as a request header.
How do you know which format (json,xml,html,wml,etc) they want it in? They set it as a request header.
How do you know which language they want it in? Request header.
The thing to remember is that the URI they are requesting stays the same.
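As a rough sketch of the header-based approach described above, the following Python function reads version, format and language from request headers. The vendor media type `application/vnd.example.v2+json`, the `negotiate` name and the "latest version" default are assumptions made for illustration; they are not something Zend Framework or HTTP prescribes.

```python
import re

# Hypothetical vendor media type, e.g. "application/vnd.example.v2+json".
MEDIA_TYPE = re.compile(
    r"application/vnd\.example\.v(?P<version>\d+)\+(?P<format>\w+)"
)

LATEST_VERSION = 2  # assumed default when the client names no version


def negotiate(headers):
    """Pick version, format and language from request headers, not the URI."""
    match = MEDIA_TYPE.search(headers.get("Accept", ""))
    if match:
        version = int(match.group("version"))
        fmt = match.group("format")
    else:
        version, fmt = LATEST_VERSION, "json"
    # Take the first language tag; quality values are ignored for brevity.
    language = headers.get("Accept-Language", "en")
    language = language.split(",")[0].split(";")[0].strip()
    return {"version": version, "format": fmt, "language": language}
```

The same URI can then serve every version, with the handler branching on the returned dictionary.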
Because each resource only has 1 URI, you never want a method name in the URI.
This is bad:
- /edit/place/43
Instead, you should use the proper HTTP methods
- to create a place, do an HTTP POST to /place
- to view place 43, do an HTTP GET to /place/43
- to update place 43, do an HTTP PUT to /place/43
- to delete place 43, do an HTTP DELETE to /place/43
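The four operations above could be sketched as a single resource handler that dispatches on the HTTP method. This is a minimal in-memory illustration in Python, not Zend Framework code; the `PlaceResource` class and its `handle` method are hypothetical names.

```python
class PlaceResource:
    """In-memory sketch: one URI per place, behaviour chosen by HTTP method."""

    def __init__(self):
        self._places = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        parts = [p for p in path.split("/") if p]
        if parts[:1] != ["place"] or len(parts) > 2:
            return 404, None
        if len(parts) == 1:
            # Collection URI: only creation is supported in this sketch.
            if method != "POST":
                return 405, None
            place_id = self._next_id
            self._next_id += 1
            self._places[place_id] = dict(body)
            return 201, {"id": place_id, **self._places[place_id]}
        if not parts[1].isdigit() or int(parts[1]) not in self._places:
            return 404, None
        place_id = int(parts[1])
        if method == "GET":
            return 200, {"id": place_id, **self._places[place_id]}
        if method == "PUT":
            self._places[place_id] = dict(body)
            return 200, {"id": place_id, **self._places[place_id]}
        if method == "DELETE":
            del self._places[place_id]
            return 204, None
        return 405, None
```

Note that no verb such as "edit" ever appears in the path; the method carries that meaning, and the numeric status codes report the outcome.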
When returning the response to the client, you should also include the URIs of all related bits of data the client might want to retrieve next. One of the principles of REST is that once the client has connected, it can find all the URIs it needs within the API itself. It only needs to know one URI to get into the system, and from that point on, all required URIs are provided in responses. This has the benefit of allowing you to change your URIs at will, since the client should never be paying attention to what they are... just using them as needed (i.e. the client knows what the URI points to, but not where it points).
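One way to picture this is a representation that carries its own links, so the client looks up a relation name instead of building URIs. The sketch below is in Python; the `links` layout, the `reviews` relation and both function names are assumptions for illustration, not a standard format.

```python
def place_representation(place_id, name, base="/place"):
    """Build a representation that includes the URIs of related resources."""
    return {
        "id": place_id,
        "name": name,
        "links": [
            {"rel": "self", "href": f"{base}/{place_id}"},
            {"rel": "collection", "href": base},
            # "reviews" is a hypothetical related resource.
            {"rel": "reviews", "href": f"{base}/{place_id}/reviews"},
        ],
    }


def follow(representation, rel):
    """Client side: find a URI by relation name, never by parsing the URI."""
    for link in representation["links"]:
        if link["rel"] == rel:
            return link["href"]
    return None
```

Because the client only ever asks for a relation such as `reviews`, the server remains free to change the `href` values later.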
Lastly, keep in mind that you don't want to be sending success/error markers as xml or json. They should be sent back as HTTP response codes. There's a code for creation, and one for deletion, and one for updating, etc.
Here are some fantastic articles on REST in general, and doing REST with the Zend Framework in particular:
http://blog.steveklabnik.com/2011/07/03/nobody-understands-rest-or-http.html
http://timelessrepo.com/haters-gonna-hateoas
http://martinfowler.com/articles/richardsonMaturityModel.html
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1
http://www.techchorus.net/create-restful-applications-using-zend-framework
http://www.techchorus.net/create-restful-applications-using-zend-framework-part-ii-using-http-response-code
http://weierophinney.net/matthew/archives/233-Responding-to-Different-Content-Types-in-RESTful-ZF-Apps.html
http://www.enrise.com/2010/12/rest-style-context-switching/
http://www.enrise.com/2011/01/rest-style-context-switching-part-2/
http://www.informit.com/articles/article.aspx?p=1566460
http://www.chrisdanielson.com/tag/zend_rest_controller/
http://barelyenough.org/blog/2008/05/versioning-rest-web-services/
I particularly recommend the article at weierophinney.net, for implementation details.

This is just an idea, but I would avoid making the code know anything at all about the version. (Other than what its current version number is.) Instead, I would make the /:version/ part of your URI the base in your rewrite scheme.
So instead of the base being something like: "http://www.example.com/"
It would be: "http://www.example.com/1.0/"
In this way you can simply have different branches of your source control on the server separately and your web server can determine which version to route the URI to. Then your code doesn't need any knowledge of how to handle different versions and your code base doesn't get polluted with large switch statements to do different things based on version.
To make it a little safer, you can require requests to contain the version number in the header. Then your code can just check if the version number in the header matches the version number of the code it's being routed to and throw an error if they don't match.
For example: Sending a GET to http://www.example.com/2.0/ with a version number in the header of 1.0 would throw a "wrong version" error. Your code would only need to know that header_version != current_version, so it shouldn't need to change as you release new versions.
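The check described here might look like the following Python sketch. The `X-API-Version` header name, the `WrongVersionError` class and the `check_version` function are made up for illustration; the answer does not say which header to use.

```python
CURRENT_VERSION = "1.0"  # the only version this code branch knows about


class WrongVersionError(Exception):
    """Raised when the header version differs from the deployed version."""


def check_version(headers, current_version=CURRENT_VERSION):
    # "X-API-Version" is a hypothetical header name.
    requested = headers.get("X-API-Version")
    if requested != current_version:
        raise WrongVersionError(
            f"wrong version: requested {requested!r}, "
            f"this deployment serves {current_version!r}"
        )
    return requested
```

Each deployed branch only changes `CURRENT_VERSION`; the comparison itself never needs to know about other versions.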

Related

Which HTTP method to use to build a REST API to perform following operation?

I am looking for a REST API to do following
Search based on parameters sent, if results found, return the results.
If no results found, create a record based on search parameters sent.
Can this be accomplished by creating one single API or 2 separate APIs are required?
I would expect this to be handled by a single request to a single resource.
Which HTTP method to use
This depends on the semantics of what is going on - we care about what the messages mean, rather than how the message handlers are implemented.
The key idea is the uniform interface constraint in REST; because we have a common understanding of what HTTP methods mean, general purpose connectors in the HTTP application can do useful work (for example, returning cached responses to a request without forwarding them to the origin server).
Thus, when trying to choose which HTTP method is appropriate, we can consider the implications the choice has on general purpose components (like web caches, browsers, crawlers, and so on).
GET announces that the meaning of the request is effectively read only; because of this, general purpose components know that they can dispatch this request at any time (for instance, a user agent might dispatch a GET request before the user decides to follow the link, to make the experience faster).
That's fine when you intend the request to provide the client with a copy of your search results, and the fact that you might end up making changes to server local state is just an implementation detail.
On the other hand, if the client is trying to edit the results of a particular search (but sometimes the server doesn't need to change anything), then GET isn't appropriate, and you should use POST.
A way to think about the difference is to consider what action you want to be taken when an intermediate cache holds a response from an earlier copy of "the same" request. If you want the cache to reuse the response, GET is the best; on the other hand, if you want the cache to throw away the old response (and possibly store the new one), then you should be using POST.
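If you settle on POST because the client really means "find this or create it", the single resource could be sketched as below in Python. The list-backed store, the exact-match search and the `find_or_create` name are assumptions for illustration only.

```python
def find_or_create(records, params):
    """Return matching records, or create one from the search parameters."""
    matches = [
        record
        for record in records
        if all(record.get(key) == value for key, value in params.items())
    ]
    if matches:
        return 200, matches   # found: nothing on the server changed
    created = dict(params)
    records.append(created)   # not found: a new record now exists
    return 201, [created]
```

The status code tells the client which of the two outcomes happened, so one request to one resource is enough.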

Modelling an endpoint for a REST API that needs to save data for every request

A while ago I took part in an interview that included a question about REST modelling and the best way to implement it. The question was:
You have a REST API where you expose a method to look up the distance between two points, but you must save each request to this method in order to expose the request history.
I was asked which HTTP method should be used in this case. For me the logical answer at that moment was the GET method (to perform both actions). The interviewer then asked me why, because since we are also storing the request, this endpoint is not idempotent anymore. I wasn't able to answer that. Since this is still on my mind, I decided to ask here and see others' opinions about which method should be used in this case (or how many, GET and POST for example).
You have a REST API where you expose a method to look up the distance between two points, but you must save each request to this method in order to expose the request history.
How would you do this on the web? You'd probably have a web page with a form, and that form would have input controls to collect the start and end point. When you submit the form, the browser would use the data in the controls, as well as the form metadata and standard HTML processing rules to create a request that would be sent to the server.
Technically, you could use POST as the method of the form. It's completely legal to do that. BUT, as the semantics of the request are "effectively read only", a better choice would be to use GET.
More precisely, this would mean having a family of similar resources, the representation of which includes information about the two points described in the query string.
That family of similar resources would probably be implemented on your origin server as a single operation/route, with a parser extracting the two points from the query string and passing them along to the function as arguments.
the interviewer asked me why, because since we are also storing the request, this endpoint is not idempotent anymore
This is probably the wrong objection - the semantics of GET requests are safe (effectively read only). So the interviewer might argue that saving the request history is not read only. However, this objection is invalid, because the semantic constraints apply to the request message, not the implementation.
For instance, you may have noticed that HTTP servers commonly add an entry to their access log for each request. Clearly that's not "read only" - but it is merely an implementation detail; the client's request did not say "and also log this".
GET is still fine here, even though the server is writing things down.
One possible objection would be that, if we use GET, then sometimes a cache will return a previous response rather than passing the request all the way through to the origin server to get logged. Which is GREAT - caches are a big part of the reason that the web can be web scale.
But if you don't want caching, the correct way to handle that is to add metadata to the response to inhibit caching, not to change the HTTP method.
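A minimal Python sketch of that idea: the handler stays a GET, records the request as an implementation detail, and marks the response with `Cache-Control: no-store` so that caches will not reuse it. The parameter names `x1`, `y1`, `x2`, `y2`, the flat-plane distance and the `get_distance` function are assumptions for illustration.

```python
import math


def get_distance(query, history):
    """Handle GET /distance?x1=..&y1=..&x2=..&y2=.. (hypothetical URI)."""
    x1, y1, x2, y2 = (float(query[key]) for key in ("x1", "y1", "x2", "y2"))
    distance = math.hypot(x2 - x1, y2 - y1)
    # Saving the request is a server-side detail, like an access log entry.
    history.append(dict(query))
    # Tell caches not to store the response, so every request reaches us.
    headers = {"Cache-Control": "no-store"}
    return 200, headers, {"distance": distance}
```

If cached answers are acceptable, dropping the header is the only change needed; the method stays GET either way.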
Another possibility, which is more consistent with the interviewer's "idempotent" remark, is that they wanted this "request history" to be a resource that the client could edit, and that looking up distances would be a side effect of that editing process.
For instance, we might have some sort of an "itinerary" resource with one or more legs provided by the client. Each time the client modifies the itinerary (for example, by adding another leg), the distance lookup method is called automatically.
In this kind of a problem, where the client is (logically) editing a resource, the requests are no longer "effectively read only". So GET is off the table as an option, and we have to look into the other possibilities.
The TL;DR version is that POST would always be acceptable (and this is how we would do it on the web), but you might prefer an API style where the client edits the representation of the resource locally, in which case you would let the client choose between PUT and PATCH.

Confusion about HTTP verbs

While building my Web API, I have encountered some cases, where I'm not sure what HTTP verbs to use.
Downloading a file with a side effect
My first thought was to use GET, but later I did realize, when a client calls the API to download a file, the server also updates the counter in the DB indicating total number of downloads and the date of the last download.
Isn't this against the specification? The server state was changed, after all. Shouldn't this be a POST/PUT? But if the POST/PUT would be used, I wouldn't be able to share the link and use it from the browser.
Generating random list of values
In my case I need to call the API to generate a random list of questions for a test (exam). The request doesn't change anything on the server, it just produces different response content each time the client calls it, so I guess using GET is alright. Idempotency applies only to the server state, not the result handed to the client, right? So is it allowed to request (GET) the same resource repeatedly with a different outcome (as seen from the client)?
Generating list of values based on the user input
The last case is similar to the previous one. I need the server to generate a list of questions, this time based on the previous test's wrong answers. Again, the request doesn't alter server data, but I need to send the server a (relatively) long list of items, which might not fit in a query string. That's why I would think a POST with a payload in the body could be used. But to be honest, it feels weird.
Is there a definitive answer which verbs to use for each case?
Downloading a file with a side effect
My first thought was to use GET
And that's the right answer. HTTP Methods are about semantics, not implementation.
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property -- Fielding (2002)
Generating random list of values
it just produces different response content each time the client calls it, so I guess using GET is alright.
Yup - again, as long as the semantics are safe, GET is a fine choice.
Generating list of values based on the user input
I need to send the server a (relatively) long list of items, which might not fit in a query string. That's why I would think a POST with a payload in the body could be used. But to be honest, it feels weird.
So if you weren't worried about length of the identifier, GET would be the usual answer here, with all of the user input encoded into the URI.
At this point, you have a couple of options.
The simplest one is to simply use POST, with the user input in the message body, and the resulting list of values in the Response. That shouldn't feel weird -- POST is the method in HTTP with the fewest semantic constraints.
Alternatively, you can rethink your protocol such that the client is creating a "query resource", using the message body as the payload. So POST could work here again, or alternatively you could use PUT (with a somewhat different handling of the URI).
A third possibility is to look in the Hypertext Transfer Protocol Method Registry to see if there is an extension method with the semantics that you need, paying careful attention to whether or not the method is safe. SEARCH and REPORT might fit your needs.
If I decide later, I want to record each generated test to the DB, would you recommend to change the API to POST or keep it as it is? In case of changing the HTTP verb, the client wouldn't notice any functional change, but it would break the API, so semantics-wise, wouldn't it be more appropriate to use POST right from the start, after all? In both cases the meaning would be "create a new test".
No, but change things up a bit and things get interesting. The interesting bit isn't really "record to the database", but "be able to pull it out of the database later". When you start looking toward creating a new resource that can be retrieved later, GET stops being a good fit.
it would break the API
Only because you are ignoring an important REST constraint - REST APIs are hypertext driven. On the web, we can easily change from GET to POST by changing from a link to a form (or from a GET form to a POST form). The client isn't playing "guess the URI" or "guess the method" because the representation of state includes these details.
But yes, if you make a big enough change to the semantics, it's not going to be backwards compatible. So don't try to pretend that it is backwards compatible - just create a new protocol using new resources.

RESTful API - how to include id/token/... in each request?

I developed a mobile app that needs to access and update data on a server. I'd like to include e.g. the device ID and some token in each request.
I am including these in the body at the moment, so I have only POST requests, even when asking to read data from the server. However, a request to read data should be GET, but how do I include these pieces of information? Should I just add a body to a GET request? Should I rather add some headers? If so, can I just create any custom headers with any name? Thank you for your guidance.
Your FCM token and device id are really authentication credentials for the request. In HTTP, you typically use the Authorization header, with a scheme that tells the service how to interpret the credentials.
In your case, you could use bearer tokens in the HTTP Authorization header.
While bearer tokens are often JWTs, they are not required to be in that specific format.
You could just concatenate the FCM token and the device id like the basic authentication scheme does.
BTW, it's not recommended to use a body on a GET request since some proxies may not retain that.
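A small Python sketch of that suggestion follows: the device id and token are joined with a colon and base64-encoded, in the spirit of the basic scheme, then sent as a bearer credential. The function names and the choice of URL-safe base64 are assumptions; the service is free to define the token format differently.

```python
import base64


def build_authorization(device_id, token):
    """Client side: value for the Authorization request header."""
    raw = f"{device_id}:{token}".encode("utf-8")
    return "Bearer " + base64.urlsafe_b64encode(raw).decode("ascii")


def parse_authorization(header_value):
    """Server side: recover (device_id, token) or raise ValueError."""
    scheme, _, credentials = header_value.partition(" ")
    if scheme.lower() != "bearer" or not credentials:
        raise ValueError("expected a Bearer credential")
    decoded = base64.urlsafe_b64decode(credentials.encode("ascii")).decode("utf-8")
    # Split on the first colon only, so the token itself may contain colons.
    # This assumes the device id never contains one.
    device_id, separator, token = decoded.partition(":")
    if not separator:
        raise ValueError("malformed credential")
    return device_id, token
```

With the credentials in a header, read requests can be plain GETs with no body.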
Well, REST is basically just a generalization of the concepts that have been used for years in the browser-based web. By applying these concepts consistently in your applications you gain the freedom to evolve the server side while gaining robustness to changes on the client side. However, in order to benefit from such strong properties, a certain number of constraints need to be followed consistently, like adhering to the rules of the underlying transport protocol or relying on HATEOAS to drive application state further. Any out-of-band information needed to interact with the service will lead to coupling and therefore has the potential to either break clients or prevent servers from changing in the future.
A common misconception in REST architecture design is that URIs should be meaningful and express semantics to the client. However, in a REST architecture the URI is just a pointer to a resource, which a client should never parse. The decision whether to invoke the URI should be based solely on the accompanying link relation name, which may be further described in either the media type or common standards. For example, on a pageable collection, link relations like prev, next, first or last may give a client the option to page through the collection. The actual structure of the URI is therefore not important to REST at all. Over-engineered URIs might further lead to typed resources. Therefore I don't actually like the term restful-url. What would non-restful-urls look like, then?
While sending everything via POST requests is technically a valid option, it also has some drawbacks to consider. IANA maintains a list of available HTTP methods you might use. Each method conveys different promises and semantics. For example, a client invoking a GET operation on a server should be safe to assume that invoking the resource does not cause any state changes (safe) and that, in case of network issues, the request can be reissued without any further considerations (idempotent). These are very important benefits to, for example, Web crawlers. Besides that, intermediary nodes can determine based on the request method and the resulting response whether the response can be cached or not. While this is not necessarily an issue in terms of decoupling clients from servers, it helps to take unnecessary workload away from the server itself, especially when resource state is rarely changing, improving the scalability of the whole system.
POST, on the other hand, does not convey such properties. When sending a POST request to retrieve data, the client can't be sure whether the request actually led to changes in the resource's state or not. On a network issue the request might have reached the server and created a new resource while the response got lost midway, which might leave the client uncertain whether it can simply resend the request or not. Also, responses to POST operations are not cacheable by default, only after explicitly adding freshness information to them. A POST method invocation requests the target resource to process the provided representation according to the resource's own semantics. As literally anything can be sent to the server, it is important that the server teaches the client what a request should look like. In HTML, for example, this is done via Web forms where a user can fill data into certain input fields and then send the data to the server by clicking a submit button. The same concept could be applied to mobile or REST applications as well. Either reusing HTML forms or defining your own application/vnd.company-x.forms+json, where the description of that media type is made public (or registered with IANA), can help you with this.
The actual question of where to include certain data is, unfortunately, too generic to give a short answer. It further depends on whether the data should be shareable or has some security-related concerns. While parameters might be passed to the server via URL parameters (query, matrix, path) to a certain extent, it is probably not the best option in general, even though query parameters are encrypted in SSL interactions. This option, though, is convenient if the URI should be pastable without losing information. Such a URI of course shouldn't contain security-related data. Security-related information should almost always be passed in HTTP headers or at least in the actual payload itself.
Usually you should distinguish between content and meta-data describing the content. While the content should be the actual payload of the request/response, any meta-data describing the content should go inside the headers. Think of an image you want to transfer. As you don't want to mess with the bytes of the image, you simply append the image name, the compression format and further properties describing how to convert the bytes back to an image representation within the headers. This distinction probably works best for standardized representation formats, as you need to stay within the capabilities of the spec to guarantee interoperability. Even there, though, things may start to get fuzzy. For example, in the area of EDI there exist a couple of well-defined standards like Edifact, Tradacoms, and so forth, which can be used to exchange different message formats like invoices, orders, order responses, ... though different ERP systems speak different slangs, and this is where things start to get complicated and messy.
If you are in control of your representation format, because you have not standardized it or have defined it only vaguely so far, it might be even harder to determine whether to put something inside your document or append it via headers. Here it solely depends on your design. I have also seen representations that defined their own header sections within the payload and therefore recreated a SOAP-like envelope-header-body structure.
About your question of whether you can create a custom header for your requirement: yes, you can.
As mentioned above, you can use the standard Authorization header to send the token in each request. The alternative is to define a custom header; however, you will have to implement logic on the server side to support it.

RESTful Web Services: method names, input parameters, and return values?

I'm trying to develop a simple REST API. I'm still trying to understand the basic architectural paradigms for it. I need some help with the following:
"Resources" should be nouns, right? So, I should have "user", not "getUser", right?
I've seen this approach in some APIs: www.domain.com/users/ (returns list), www.domain.com/users/user (do something specific to a user). Is this approach good?
In most examples I've seen, the input and output values are usually just name/value pairs (e.g. color='red'). What if I wanted to send or return something more complex than that? Am I forced to deal with XML only?
Assume a PUT to /user/ method to add a new user to the system. What would be a good format for input parameter (assume the only fields needed are 'username' and 'password')? What would be a good response if the user is successful? What if the user has failed (and I want to return a descriptive error message)?
What is a good & simple approach to authentication & authorization? I'd like to restrict most of the methods to users who have "logged in" successfully. Is passing username/password at each call OK? Is passing a token considered more secured (if so, how should this be implemented in terms of expiration, etc.)?
For point 1, yes. Nouns are expected.
For point 2, I'd expect /users to give me a list of users. I'd expect /users/123 to give me a particular user.
For point 3, you can return anything. Your client can specify what it wants. e.g. text/xml, application/json etc. by using an HTTP request header, and you should comply as much as you can with that request (although you may only handle, say, text/xml - that would be reasonable in a lot of situations).
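As a rough Python illustration of complying with the client's requested type, the function below picks a serializer from the Accept header and answers 406 when it cannot comply. The `render` name and the `<user>` root element are assumptions, and quality values (`q=`) are ignored to keep the sketch short.

```python
import json
from xml.etree import ElementTree


def render(data, accept):
    """Serialize `data` in the first media type from Accept that we support."""
    offered = [part.split(";")[0].strip() for part in accept.split(",")]
    for media_type in offered:
        if media_type in ("application/json", "*/*"):
            return 200, "application/json", json.dumps(data)
        if media_type == "text/xml":
            root = ElementTree.Element("user")  # hypothetical root element
            for key, value in data.items():
                ElementTree.SubElement(root, key).text = str(value)
            body = ElementTree.tostring(root, encoding="unicode")
            return 200, "text/xml", body
    # Nothing the client asked for is supported: 406 Not Acceptable.
    return 406, None, None
```

A service that only handles text/xml would simply drop the JSON branch and keep the 406 fallback.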
For point 4, I'd expect POST to create a new user. PUT would update an existing object. For reporting success or errors, you should be using the existing HTTP success/error codes. e.g. 200 OK. See this SO answer for more info.
The most important constraint of REST is the hypermedia constraint ("hypertext as the engine of application state"). Think of your Web application as a state machine where each state can be requested by the client (e.g. GET /user/1). Once the client has one such state (think: a user looking at a Web page) it sees a bunch of links that it can follow to go to a next state in the application. For example, there might be a link from the 'user state' that the client can follow to go to the details state.
This way, the server presents the client the application's state machine one state at a time at runtime. The clever thing: since the state machine is discovered at runtime one state at a time, the server can dynamically change the state machine at runtime.
Having said that...
on 1. The resources essentially represent the application states you want to present to the client. They will often closely match domain objects (e.g. user), but make sure you understand that the representations you provide for them are not simply serialized domain objects but states of your Web application.
Thinking in terms of GET /users/123 is fine. Do NOT place any action inside a URI. Although not harmful (it is just an opaque string) it is confusing to say the least.
on 2. As Brian said. You might want to take a look at the Atom Publishing Protocol RFC (5023) because it explains create/read/update cycles pretty well.
on 3. Focus on document oriented messages. Media types are an essential part of REST because they provide the application semantics (completely). Do not use generic types such as application/xml or application/json as you'll couple your clients and servers around the often implicit schema. If nothing fits your needs, just make up your own type.
Maybe you are interested in an example I am hacking together using UBL: http://www.nordsc.com/blog/?cat=13
on 4. Normally, use POST /users/ for creation. Have a look at RFC 5023 - this will clarify that. It is an easy to understand spec.
on 5. Since you cannot use sessions (stateful server) and be RESTful, you have to send credentials in every request. Various HTTP auth schemes handle that already. It is also important with regard to caching, because the HTTP Authorization header has specially defined semantics for caches (no public caching). If you stuff your credentials into a cookie, you lose that important piece.
All HTTP status codes have a certain application semantic. Use them, do not tunnel your own error semantics through HTTP.
You can come visit #rest IRC or join rest-discuss on Yahoo for detailed discussions.
Jan