Advice on modeling behaviors/concepts not commonly thought to be persisted - rest

I am looking for advice and experience related to the best way to REST-ify behaviors/concepts that are not "commonly persisted" - in other words, resource-ifying transformative or side-effect behaviors.
First, let me say that I know there's no complete test for REST-ful-ness - and actually I don't care. I find the CRUD-over-the-Web notion of REST very intuitive and it works well on most of the data I'm providing access to. At the same time, I am not worried about straying from anyones bible on how to perfectly REST-ify something. I'm in search of whatever the best compromise between practical and REST-intuitive is for the cases that don't fit so well. Clearly, consistency is the main goal, but intuitive-ness helps ensure that.
Given that, let me dive into the details of what I'm after.
Since REST is inherently resource-oriented, it is easy to model things commonly persisted - what is less clear is how to model behaviors/concepts that are not commonly persisted, especially those which have side effects or are purely transformational.
For example, take a stackoverflow.com question. It can be Created, Updated, Read, and Deleted. Everyone can relate to that and it all makes sense under REST.
But now consider something like a translation - for example, I want to build a REST API for my service that translates an English sentence to Spanish.
There are at least two ways to address the translation scenario. I could:
Look at a translation invocation as the creation of a "translation instance" (which doesn't happen to be persisted, but could be), in which case a POST /Translation (i.e., a create) actually makes a lot of sense. This is particularly the case if the translation service requires the URL of something to be translated (since the contents of that URL can change over time).
Look at translation as really the act of querying a larger dictionary of known answers, in which case a GET /Translation (i.e., a read) might be more appropriate. This is especially tempting if the translation service requires just the text of the sentence to be translated. Noone would expect the translation of a static sentence to change over time. You could also argue that it could be cacheable, which GET would lend itself towards.
This same dilemma can crop up for other actions which primarily have side effects (e.g., sending an SMS or an e-mail) and are less commonly associated with persistence of the data.
My approach thus far has been to essentially "Order"-ify these cases, that is looking at everything as though one was placing an order. More generally, I think this is just converting verbs (translate) into nouns (translation order), which would seem to be one way of REST-ifying such things.
Can anyone share a better, more intuitive, way to approach the modeling of actions/behaviors that are not commonly assumed to be persisted?

The "Would it hurt to do more than once?"-test perhaps?
In your examples - what would happen if you are asked to translate the same text twice? Not really much. Sending an sms twice is likely to be a bad thing though.

The two options you describe are valid options and you have already highlighted scenarios when one is better than the other. The only other option I would suggest which is use POST with a "processing resource". e.g.
POST /translator?from=en&to=fr
Content-Type: text/plain
Body: The sentence to be translated
=>
200 OK
La phrase à traduire
One nice thing about "processing resources" is that they may or may not have persistent side effects. Obviously the disadvantage is that intermediaries can't cache the response.

Related

How to localize CPAN module and dependencies

I am trying to localize a CPAN module MooX::Options using Locale::TextDomain after having read "On the state of I18N in perl".
In the discussion in the pull request the question came up how to deal with messages not originating in the module itself, but in a dependency. In this specific case, when you specify an option on the command line which is not defined anywhere in the code, you'll get the warning:
Unknown option: xyz
originating in the module Getopt::Long, which in itself is not localized yet.
The question is how to deal with these. I see basically three strategies:
Ignore them, which I find dissatisfactory.
Try to someway or other catch all the corner cases and messages in the module I'm currently localizing (in this case MooX::Options), and this way working around the missing localization in the dependent modules. This option seems brittle, as I'd have to constantly adapt to changes in the base modules. Sometimes, it might by next to impossible to catch messages, as they're written to output streams directly by the modules (as is the case in this example).
Try to localize the dependent modules themselves. This option seems hard to achieve, as different projects might use different I18N tools and strategies themselves and the dependency graph might be huge.
All in all, I think this problem is more general and not that specific to perl and cpan modules. So, I'm interessted in your thoughts, strategies and approaches.
I have rather strong opionions on the idea of translating computing terms, and most people disagree with my views, so take what I am saying with a grain of salt.
I do not understand the point of internationalizing a library for parsing command line options unless you want to further ghettoize what is already a small group of users of said library.
Would wget be more useful to Turkish users if instead it was called wal or wgetir? Or, instead of wget --mirror, should Turkish users write getir --ayna? What about that w?
If you just translate the messages, what is the point of outputting a help message in response to wget -h when the Turkish equivalent would be wget -y?
The fact is, almost all attempts at translating programming related terms I have seen are simply awful. The people who are most eager to translate are usually not in command of either human language — Nor do they seem to understand what they are translating.
However, as a result of these eager people, I find that at least the Turkish translations of pretty much any software I touch is just awful. Whatever Danish translations I have seen did not fare much better, but, at least, they were tolerable owing to the greater commonality of structure between Danish and English.
I think everyone's energy is better spent on actually making sure their programs handle content, including names of external resources/references, in different languages well, rather than giving me error messages in some Frankenstein language, or letting me specify command line options whose mnemonics do not match their descriptions etc, or presenting menus that contain of strings of words that really do not convey any meaning.
I have felt this way for the last for many decades now ... Even when I was patching IBM PC keyboard drivers with hex editors so people at various places could type reports in WordStar, and create charts in Harvard Graphics.
So, my unpopular advice is to put your energy elsewhere ...
For example, use exception objects so the user of your library (who is likely a programmer and will understand "Directory not found" much more readily than "Kütük bulunamadı") can deduce in a human-language independent way what happened, and what message to show the user. I haven't looked closely at MooX::Options, but I notice there is at least one string croak.
Here is an actual error message from an IBM product:
Belirtilen kütük örüntüsüyle eşleşen hiçbir kütük bulunamadı
You can ask every one of the almost 200 million Turkic people on earth what a "kütük örüntüsü" is, and only the person who actually came up with this non-sensical string of characters will be able to tell you that it corresponds to "file pattern". What, then, do they gain by using the phrase "kütük örüntüsü" versus "file pattern"? Nothing.
However, they lose the ability to communicate with, and, also, compete with, programmers in the English speaking world.
PS: Apologies for all Turkish examples, but I feel most comfortable drawing abominable examples based on my native language.

REST: The same resource accessed from two different URLs

I've just started my journey with REST, so please be patient while anserwing my questions.
I have some doubts how to cope with some situations. For instance, I read that evenry resource/entity should have its own, unique URL. But how to deal with examples like:
There is a list of objects.
Each object is described by some basic information and its state.
Basic info for object is for example its shape, color.
The state of object contains information like position in space, velocity etc.
Both two pieces of information (basic + state) are rather big objects (only examplary properties were introduced).
Now, we have the following questions:
Get all basic info about objects.
Get all object states.
Get basic info of object with ID=2.
Get state of object ID=5.
Get all info of object ID=7.
I tried to solve it in this way:
/rest/objects
/rest/states
/rest/objects/2/basic
/rest/objects/5/state (/rest/states/5)
/rest/objects/7
However, I've some doubts pointed in 4 - it doesn't look like to be correct. There are two ways to access the same information/resource/entity.
How to deal with it?
Isn't the "basic" just part of the resource? It seems not from your description, but you should consider it. In fact, all of "basic" and "state" might just be part of the resource. It's difficult to be authoritative without more information. For instance, how large is "rather big"?
It also doesn't seem like /state should be an independent endpoint, since a state is an integral part of an object, not something that lives on its own.
I might start with something like:
1. GET /objects
2. GET /objects?expand=state
3. GET /objects/2?expand=basic -- or -- GET /objects/2/basic
4. GET /objects/5?expand=state -- or -- GET /objects/5/state
5. GET /objects/7?expand=basic,state
If you need /state to be top level, then you have two options:
2. GET /states -- or -- GET /objects/states
Neither of those are really ideal. It is acceptable to have multiple paths to get to the same resource, as long as there's only one canonical URI for the resource. The issue is that it's harder for the end user to learn APIs that lean heavily on that style. The second style is also confusing. /objects/XXX is a specific object, unless XXX == states? That's a special case for your end user to have to remember.
From a caching point of view it makes sense to have "state" and "basic" in separate resources - "state" resources changes frequently and is thus non-cachable and "basic" changes seldom and should thus be cachable for a long time. If this is your goal, then go ahead, it sure makes sense (and is an often mentioned pattern). But usually, to keep complexity low, both sets of values are combined into one single resource (your "Object" in this case).
Your paths are pretty good, Eric's version is also fine - here is my suggestion:
/rest/objects (all complete objects info)
/rest/objects/basic (all basic info)
/rest/objects/state (all state info)
/rest/objects/7 (single complete object)
/rest/objects/2/basic (single basic info)
/rest/objects/5/state (single state info)
In the end it all boils down to a matter of personal preference as the exact URL doesn't matter much.
See for instance http://soabits.blogspot.dk/2013/10/url-structures-and-hyper-media-for-web.html for a longer discussion on this issue (disclaimer: that's my blog).
resource/entity
Don't confuse these concepts. REST has nothing to do how you store your data, and resources map to entities only by CRUD applications.
I read that evenry resource/entity should have its own, unique URL.
You should read the IRI standard. IRIs are for identifying resources. Without them you cannot tell the server what you want to manipulate.
However, I've some doubts pointed in 4 - it doesn't look like to be
correct. There are two ways to access the same
information/resource/entity.
How to deal with it?
It is up to you. You can choose either of them or both. There is no restriction about how many identifiers a resource can have or what IRI structure to choose.

Is using a verb in URL fundamentally incompatible with REST?

So let's say we have something that does not seem best represented as a resource (status of process that we want to pause, stateless calculation we want to perform on the server, etc).
If in API design we use either process/123/pause or calculations/fibonacci -- is that fundamentally incompatible with REST? So far from what I read it does not seem to, as long as these URLs are discoverable using HATEOAS and media types are standardized.
Or should I prefer to put action in the message as answered here?
Note 1:
I do understand that it is possible to rephrase some of my examples in terms of nouns. However I feel that for specific cases nouns do not work as well as verbs do. So I am trying to understand if having those verbs would be immediately unRESTful. And if it is, then why the recommendation is so strict and what benefits I may miss by not following it in those cases.
Note 2:
Answer "REST does not have any constraints on that" would be a valid answer (which would mean that this approach is RESTful). Answers "it depends on who you ask" or "it is a best practice" is not really answering the question. The question assumes concept of REST exist as a well-defined common term two people can use to refer to the same set of constraints. If the assumption itself is incorrect and formal discussion of REST is meaningless, please do say so.
This article has some nice tips: http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api
Quoting from the article:
What about actions that don't fit into the world of CRUD operations?
This is where things can get fuzzy. There are a number of approaches:
Restructure the action to appear like a field of a resource. This works if the action doesn't take parameters. For example an activate action could be mapped to a boolean activated field and updated via a PATCH to the resource.
Treat it like a sub-resource with RESTful principles. For example, GitHub's API lets you star a gist with PUT /gists/:id/star and unstar with DELETE /gists/:id/star.
Sometimes you really have no way to map the action to a sensible RESTful structure. For example, a multi-resource search doesn't really
make sense to be applied to a specific resource's endpoint. In this
case, /search would make the most sense even though it isn't a noun.
This is OK - just do what's right from the perspective of the API
consumer and make sure it's documented clearly to avoid confusion.
I personally like suggestion #2. If you need to pause something, what are you pausing? If it's a process with a name, then try this:
/process/{processName}/pause
It's not strictly about nouns vs. verbs; it's about whether you are:
identifying resources
manipulating resources through representations
What's a resource? Fielding defines it thusly:
The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on. In other words, any concept that might be the target of an author's hypertext reference must fit within the definition of a resource. A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time."
Now, to your question. You can't just look at a URL and say, "Is such-and-such a URL fundamentally incompatible with REST?" because URLs in a REST system aren't really the important bit. It's more important that the URLs process/123/pause and calculations/fibonacci identify resources by the above definition. If they do, there isn't a REST constraint violation. If they don't, you're violating the uniform interface constraint of REST. Your example leads me to believe it does not fit the resource definition and therefore would violate this constraint.
To illustrate what a resource might be in this system, you could change the status of a process by POSTing it to the paused-processes resource collection. Though that is perhaps an unusual way of working with processes, it's not fundamentally incompatible with the REST architecture style.
In the case of calculations, the calculations themselves might be the resource and that resource might look like this:
Request:
GET /calculations/5
Response:
{
fibonacci: 5,
prime-number: true,
square-root: 2.23607
}
Though again, that's a somewhat unusual concept of a resource. I suppose a slightly more typical use might look like this:
Request:
GET /stored-calculations/12381728 (note that URL is a random identifier)
Response:
{
number: 5,
fibonacci: 5,
prime-number: true,
square-root: 2.23607
}
though presumably you'd want to store additional information about that resource other than a sheer calculation that anyone can do with a calculator...
Response:
{
number: 5,
fibonacci: 5,
prime-number: true,
square-root: 2.23607,
last-accessed-date: 2013-10-28T00:00:00Z,
number-of-retrievals-of-this-resource: 183
}
It's considered bad practice to use verbs in your REST API.
There's some material on SO and elsewhere on why and how to avoid using verbs. That being said, there are plenty of "REST" APIs that use verbs.
For your process API, I would make the resource Process have a state field, which can be modified with a PUT.
Suppose GET /process/$id currently returns:
{
state: "PAUSED"
}
Then you PUT this to /process/$id:
{
state: "RUNNING"
}
which makes the process change state.
In the case of Fibonacci, just have a resource named fibonacci, and use POST with parameters (say n for the first n fibonacci numbers) in the body, or perhaps even GET with a query in the URL.
The HTTP method is the verb: GET, PUT, POST, et cetera, while the URL should always refer to the noun (recipient of the action). Think of it like this: Would two verbs in a sentence make sense? "GET calculate" is nonsense, where "GET state" is good and "GET process" is better ("state" being metadata for a process).

POST/GET bindings in Racket

Is there a built-in way to get at POST/GET parameters in Racket? extract-binding and friends do what I want, but there's a dire note attached about potential security risks related to file uploads which concludes
Therefore, we recommend against their
use, but they are provided for
compatibility with old code.
The best I can figure is (and forgive me in advance)
(bytes->string/utf-8 (binding:form-value (bindings-assq (string->bytes/utf-8 "[field_name_here]") (request-bindings/raw req))))
but that seems unnecessarily complicated (and it seems like it would suffer from some of the same bugs documented in the Bindings section).
Is there a more-or-less standard, non-buggy way to get the value of a POST/GET-variable, given a field name and request? Or better yet, a way of getting back a collection of the POST/GET values as a list/hash/a-list? Barring either of those, is there a function that would do the same, but only for POST variables, ignoring GETs?
extract-binding is bad because it is case-insensitive, is very messy for inputs that return multiple times, doesn't have a way of dealing with file uploads, and automatically assumes everything is UTF-8, which isn't necessarily true. If you can accept those problems, feel free to use it.
The snippet you wrote works when the data is UTF-8 and when there is only one field return. You can define it is a function and avoid writing it many times.
In general, I recommend using formlets to deal with forms and their values.
Now your questions...
"Is there a more-or-less standard, non-buggy way to get the value of a POST/GET-variable, given a field name and request?"
What you have is the standard thing, although you wrongly assume that there is only one value. When there are multiple, you'll want to filter the bindings on the field name. Similarly, you don't need to turn the value into a string, you can leave it as bytes just fine.
"Or better yet, a way of getting back a collection of the POST/GET values as a list/hash/a-list?"
That's what request-bindings/raw does. It is a list of binding? objects. It doesn't make sense to turn it into a hash due to multiple value returns.
"Barring either of those, is there a function that would do the same, but only for POST variables, ignoring GETs?"
The Web server hides the difference between POSTs and GETs from you. You can inspect uri and raw post data to recover them, but you'd have to parse them yourself. I don't recommend it.
Jay

Why use a post compiler?

I am battling to understand why a post compiler, like PostSharp, should ever be needed?
My understanding is that it just inserts code where attributed in the original code, so why doesn't the developer just do that code writing themselves?
I expect that someone will say it's easier to write since you can use attributes on methods and then not clutter them up boilerplate code, but that can be done using DI or reflection and a touch of forethought without a post compiler. I know that since I have said reflection, the performance elephant will now enter - but I do not care about the relative performance here, when the absolute performance for most scenarios is trivial (sub millisecond to millisecond).
Let's try to take an architectural point on the issue. Say you are an architect (everyone wants to be an architect ;)
You need to deliver the architecture to your team:
a selected set of libraries, architectural patterns, and design patterns. As a part of your design, you say: "we will implement caching using the following design pattern:"
string key = string.Format("[{0}].MyMethod({1},{2})", this, param1, param2 );
T value;
if ( !cache.TryGetValue( key, out value ) )
{
using ( cache.Lock(key) )
{
if (!cache.TryGetValue( key, out value ) )
{
// Do the real job here and store the value into variable 'value'.
cache.Add( key, value );
}
}
}
This is a correct way to do tracing. Developers are going to implement this pattern thousands of times, so you write a nice Word document telling how you want the pattern to be implemented. Yeah, a Word document. Do you have a better solution? I'm afraid you don't. Classic code generators won't help. Functional programming (delegates)? It works fairly well for some aspects, but not here: you need to pass method parameters to the pattern. So what's left? Describe the pattern in natural language and trust developers will implement them.
What will happen?
First, some junior developer will look at the code and tell "Hm. Two cache lookups. Kinda useless. One is enough." (that's not a joke -- ask the DNN team about this issue). And your patterns cease to be thread-safe.
As an architect, how do you ensure that the pattern is properly applied? Unit testing? Fair enough, but you will hardly detect threading issues this way. Code review? That's maybe the solution.
Now, what is you decide to change the pattern? For instance, you detect a bug in the cache component and decide to use your own? Are you going to edit thousands of methods? It's not just refactoring: what if the new component has different semantics?
What if you decide that a method is not going to be cached any more? How difficult will it be to remove caching code?
The AOP solution (whatever the framework is) has the following advantages over plain code:
It reduces the number of lines of code.
It reduces the coupling between components, therefore you don't have to change much things when you decide to change the logging component (just update the aspect), therefore it improves the capacity of your source code to cope with new requirements over time.
Because there is less code, the probability of bugs is lower for a given set of features, therefore AOP improves the quality of your code.
So if you put it all together:
Aspects reduce both development costs and maintenance costs of software.
I have a 90 min talk on this topic and you can watch it at http://vimeo.com/2116491.
Again, the architectural advantages of AOP are independent of the framework you choose. The differences between frameworks (also discussed in this video) influence principally the extent to which you can apply AOP to your code, which was not the point of this question.
Suppose you already have a class which is well-designed, well-tested etc. You want to easily add some timing on some of the methods. Yes, you could use dependency injection, create a decorator class which proxies to the original but with timing for each method - but even that class is going to be a mess of repetition...
... or you can add reflection to the mix and use a dynamic proxy of some description, which lets you write the timing code once, but requires you to get that reflection code just right -which isn't as easy as it might be, especially if generics are involved.
... or you can add an attribute to each method that you want timed, write the timing code once, and apply it as a post-compile step.
I know which seems more elegant to me - and more obvious when reading the code. It can be applied even in situations where DI isn't appropriate (and it really isn't appropriate for every single class in a system) and with no other changes elsewhere.
AOP (PostSharp) is for attaching code to all sorts of points in your application, from one location, so you don't have to place it there.
You cannot achieve what PostSharp can do with Reflection.
I personally don't see a big use for it, in a production system, as most things can be done in other, better, ways (logging, etc).
You may like to review the other threads on this matter:
Anyone with Postsharp experience in production?
Other than logging, and transaction management what are some practical applications of AOP?
Aspect Oriented Programming: What do you use PostSharp for?
etc (search)
Aspects take away all the copy & paste - code and make adding new features faster.
I hate nothing more than, for example, having to write the same piece of code over and over again. Gael has a very nice example regarding INotifyPropertyChanged on his website (www.postsharp.net).
This is exactly what AOP is for. Forget about the technical details, just implement what you are being asked for.
In the long run, I think we all should say goodbye to the way we are writing software now. It's tedious and plainly stupid to write boilerplate code and iterate manually.
The future belongs to declarative, functional style being held together by an object oriented framework - and the cross cutting concerns being handled by aspects.
I guess the only people who will not get it soon are the guys who are still payed for lines of code.