stubbing data in REST apis for large system/integration tests - rest

The Problem
Say I've got a cool REST resource /account.
I can create new accounts
POST /account
{accountName:"matt"}
which might produce some json response like:
{account:"/account/matt", accountName:"matt", created:"November 5, 2013"}
and I can look up accounts created within a date range by calling:
GET /account?created-range-start="June 01, 2013"&created-range-end="December 25, 2013"
which might also produce something like:
{accounts: {account:"/account/matt", accountName:"matt", created:"November 5, 2013"}, {...}, ...}
Now, let's say I want to set up some sample data and write some tests against the GET /account resource within some specified creation date range.
For example I want to somehow insert the following accounts into the system
name=account1, created=January 1, 2010
name=account2, created=January 2, 2010
name=account3, created=December 29, 2010
name=account4, created=December 30, 2010
then call
GET /account?created-range-start="January 2, 2010"&created=range-end="December 29,2010"
and verify that only accounts 2 and 3 are returned.
How should I insert these sample accounts to write my tests?
Possible Solutions
1) I could use inversion of control and allow the user to specify the creation date for new accounts.
POST /account
{account:"matt", created="June 01, 2013"}
However, even if the created field were optional, I don't like this approach because I may not want to allow my users the ability to set the creation date of their account. I surely need to be able to do it for testing but having that functionality as part of the public api seems wrong to me. Maybe I want to give a $5 credit to anyone who joined prior to some particular day. If they can specify their create date users can game the system. Not good.
2) I could add one or more testing configuration resources
PUT /account/creationDateTimestampProvider
{provider="DefaultProvider"}
or
PUT /account/creationDateTimestampProvider
{provider="FixedDateProvider", date="June 01, 2013"}
This approach affords me the ability to lock down these resources with security constraints so that only my test context can call them, but it also necessarily has side effects on the system that may become a pain to manage, especially if I have a bunch of backdoor configuration resources.
3) I could interact directly with the database circumventing the REST api altogether to set my sample data.
INSERT INTO ACCOUNTS ...
GET /account?...
However this can allow me to get into states that using the REST api may not allow me to get into and as the db model evolves maintaining these sql scripts might also be a pain.
So... how do i test my GET /account resource? Is there another way I'm not thinking of that is more elegant?

There are a lot of ways to do this, and you've come up with some solid (though maybe not perfect for your situation) solutions.
In the setup for the test, I would spin up an in-memory database like HSQLDB (there are others) and do the inserts. The test configuration will inject the appropriate database configuration into your service provider class. Run the tests, and then shut the database down on teardown.
This post provides a good example at least for the persistence side of things.
Incidentally, do not change the API of your service just to help facilitate a test. Maybe I misunderstood and you aren't anyway, but I thought I would mention just in case.
Hope that helps.

For what it's worth, these days I'm primarily using the second approach for most of my system level (black box) tests.
I create backdoor admin / test apis that have security requirements which only my system tests can access. These superpower apis allow me to seed data. I try to limit the scope of these apis as much as possible so they are not overly coupled to the specific implementation details but are flexible enough to allow specifying whatever is needed for the desired seed data.
The reason I prefer this approach to the database solution that Vidya provided, is so that my tests aren't coupled to the specific data storage technology. If I decide to switch from mongo to dynamo or something like that; using an admin api frees me from having to update all of my tests--instead I only need to update the admin api/impl.

Related

How should I design a REST API

I'm thinking about a REST API design. There are several tables in my database. For example Customer and Order.
Of course - each Order has its Customer (and every customer can have many Orders).
I've decided to provide such an interface
/api/v1/Customers/ -- get list of Customers, add new Customer
/api/v1/Customers/:id: -- get Customer with id=:id:
/api/v1/Orders/ -- get list of Orders, add new Order
/api/v1/Orders/:id: -- get Order with id=:id:
It works flawlessly. But my frontend has to display a list of orders with customer names. With this interface, I will have to make a single call to /api/v1/Orders/ and then another call to /api/v1/Customer/:id: for each record from the previous call. Or perform two calls to /api/v1/Orders/ and /api/v1/Customers/ and combine them on the frontend side.
It looks like overkill, this kind of operation should be done at the database level. But how can/should I provide an appropriate interface?
/api/v1/OrdersWithCustomers
/api/v1/OrdersWithCustomers/:id:
Seems weir. Is it a right way to go
There's no rule that says you cannot "extend" the data being returned from a REST API call. So instead of returning "just" the Order entity (as stored in the backend), you could of course return an OrderResponseDTO which includes all (revelant) fields of the Order entity - plus some from the Customer entity that might are relevant in your use case.
The data model for your REST API does not have to be an exact 1:1 match to your underlying database schema - it does give you the freedom to leave out some fields, or add some additional information that the consumers of your API will find helpful.
Great question, and any API design will tend to hit pragmatic reality at some point like this.
One option is to include a larger object graph for each resource (ie include the customer linked to each order) but use filter query parameters to allow users to specify what properties they require or don't require.
Personally I think that request parameters on a restful GET are fine for either search semantics when retrieving a list of resources, or filtering what is presented for each resource as in this case
Another option for your use case might be to look into a GraphQL approach.
How would you do it on the web?
You've got a web site, and that website serves documents about Customers, and documents about Orders. But your clients aren't happy, because its too much boring, mistake-prone work to aggregate information in the two kinds of documents.
Can we please have a document, they ask, with the boring work already done?
And so you generate a bunch of these new reports, and stick them on your web server, and create links to make it easier to navigate between related documents. TA-DA.
A "REST-API" is a facade that makes your information look and act like a web site. The fact that you are generating your representations from a database is an implementation details, deliberately hidden behind the "uniform interface".

Unit testing strategy

I have to admit I'm quite new to unit testing and there is a lot questions for me.
It's hard to name it but I think it's behaviour test, anyways let me go straight to the example:
I need to test users roles listing to make sure that my endpoint is working correctly and returns all user roles assigned to him.
That means:
I need to create user
I need to create role
I need to assign created role to created user
As we can see there is three operations that must be excecuted before actual test and I believe that in larger applications such list can grow to much larger number of operations and even complex.
The question is how I should test such endpoints: should I just insert raw data to DB, write some code blocks that would do such preparations.
It's probably best if you test the individual units of your service without hitting the service itself, otherwise you're also unit testing the WebApi framework itself.
This will also allow you to mock your database so you don't have to rely on any stored data to run your tests or the authorization to your service.

REST: How best to handle large lists

It seems to me that REST has clean and clear semantics for basic CRUD and for listing resources, but I've seen no discussion of how to handle large lists of resources. It doesn't scale to dump an entire database table over the network in a resource-oriented architecture (imagine a customer table with a million customers!), especially if you only need a few items. So it seems that some semantics should exist to filter, map and reduce a list of resources on the server-side.
So, do you know any tried and true ways to do the following kinds of requests in REST:
1) Retrieve just the count of the resources?
I could imagine doing something like GET /api/customer?result=count
Is that how it's usually done?
I could also imagine modifying the URL (/api/count/customer or /api/customer/count, for example), but that seems to either break the continuity of the resource paths or inflict an ugly hack on the expected ID field.
2) Filter the results on the server-side?
I could imagine using query parameters for this, in a context-specific way (such as GET /api/customer?country=US&state=TX).
It seems tricky to do it in a flexible way, especially if you need to join other tables (for example, get customers who purchased in the last 6 months).
I could imagine using the HTTP OPTIONS method to return a JSON string specifying possible filters and their possible values.
Has anyone tried this sort of thing?
If so, how complex did you get (for example, retrieving the items purchased year-to-date by female customers between 18 and 45 years old in Massachussetts, etc.)?
3) Mapping to just get a limited set of fields or to add fields from joined tables?
4) More complicated reductions than count (such as average, sum, etc.)?
EDIT: To clarify, I'm interested in how the request is formulated rather than how to implement it on the server-side.
I think the answer to your question is OData! OData is a generic protocol for querying and interacting with information. OData is based on REST but extends the semantics to include programatic elemements similar to SQL.
OData is not always URL-based only as it use JSON payloads for some scenarios. But it is a standard (OASIS) so it well structured and supported by many APIs.
A few general links:
https://en.wikipedia.org/wiki/Open_Data_Protocol
http://www.odata.org/
The most common ways of handling large data sets in GET requests are (afaict) :
1) Pagination. The request would be something like GET /api/customer?country=US&state=TX&firstResult=0&maxResults=50. This way the client has the freedom to choose the size of the data chunk he needs (this is often useful for UI-based clients).
2) Exposing a size service, so that the client gets to know how large the data set is before actually requesting it. The service would be something like
GET /api/customer/size?country=US&state=TX
Obviously the two can (and imho should) be used together, so that when/if a client (be it mobile or web or whatever) waints to fill its UI with content, he can choose what's the best data chunk size based also on the size of whole data set (e.g. to avoid creating 100 pages for the user to navigate).

Api naming in microservices design

Let's say that there are two microservices representing the resources orders(/orders) and customers(/customers). My requirement is to get all the orders made by a customer.
Had it been a monolithic application, I would have modeled my uri as /customers/{id}/orders. This would have hit the customers resource and made an in-memory service call to get the corresponding orders.
Now, in case of microservices, this isn't possible. So, is the only way to get the orders is to make a remote service call or is there a better way of doing it?
Can we create another resource with the representation /ordersByCustomers/{customerid}?
You can pass some query parameters as filters (this is the most common way I've seen). For example
/orders?customerId=123
I think that's quite clear, that you want to retrieve all customer orders filtered by customer id. In the same way you can add pagination or other filters.
The important thing to remember is that you want the order resource, so the URL should remain the same. I'm mentioning this, because this has been the most difficult thing for me to change... to think about resources rather than remote calls.
In general you should beware of using endpoint that are more or less similar to the one you suggested:
/ordersByCustomers/{customerid}
Why? Because this is not RESTful in general (even in microservices environment) and make the API difficult to understand and you by the consumers. What if you need orderByWhatever? Will you be introducing new endpoint every single time you need a new set of data? Try to avoid so opinionated endpoints.
What #Augutsto suggested is fully correct. If you're afraid of having a complicated logic in GET request this is the situation where you can break REST rules. I mean introducing:
POST /orders/filter/
Where all the filtering logic will be passed in requests body - so it's easier to carry complicated logic as well.

CQRS - When a command cannot resolve to a domain

I'm trying to wrap my head around CQRS. I'm drawing from the code example provided here. Please be gentle I'm very new to this pattern.
I'm looking at a logon scenario. I like this scenario because it's not really demonstrated in any examples i've read. In this case I do not know what the aggregate id of the user is or even if there is one as all I start with is a username and password.
In the fohjin example events are always fired from the domain (if needed) and the command handler calls some method on the domain. However if a user logon is invalid I have no domain to call anything on. Also most, if not all of the base Command/Event classes defined in the fohjin project pass around an aggregate id.
In the case of the event LogonFailure I may want to update a LogonAudit report.
So my question is: how to handle commands that do not resolve to a particular aggregate? How would that flow?
public void Execute(UserLogonCommand command)
{
var user = null;//user looked up by username somehow, should i query the report database to resolve the username to an id?
if (user == null || user.Password != command.Password)
;//What to do here? I want to raise an event somehow that doesn't target a specific user
else
user.LogonSuccessful();
}
You should take into account that it most cases CQRS and DDD is suitable just for some parts of the system. It is very uncommon to model entire system with CQRS concepts - it fits best to the parts with complex business domain and I wouldn't call logging user in a particularly complex business scenario. In fact, in most cases it's not business-related at all. The actual business domain starts when user is already identified.
Another thing to remember is that due to eventual consistency it is extremely beneficial to check as much as we can using only query-side, without event creating any commands/events.
Assuming however, that the information about successful / failed user log-ins is meaningful I'd model your scenario with following steps
User provides name and password
Name/password is validated against some kind of query database
When provided credentials are valid RegisterValidUserCommand(userId) is executed which results in proper event
If provided credentials are not valid
RegisterInvalidCredentialsCommand(providedUserName) is executed which results in proper event
The point is that checking user credentials is not necessarily part of business domain.
That said, there is another related concept, in which not every command or event needs to be business - related, thus it is possible to handle events that don't need aggregates to be loaded.
For example you want to change data that is informational-only and in no way affects business concepts of your system, like information about person's sex (once again, assuming that it has no business meaning).
In that case when you handle SetPersonSexCommand there's actually no need to load aggregate as that information doesn't even have to be located on entities, instead you create PersonSexSetEvent, register it, and publish so the query side could project it to the screen/raport.