Breeze: complex graph returns only 1 collection - nosql

I have a physician graph that looks something like this:
The query I use to get data from a WebApi backend looks like this:
var query = new breeze.EntityQuery().from("Physicians")
.expand("ContactInfo")
.expand("ContactInfo.Phones")
.expand("ContactInfo.Addresses")
.expand("PhysicianNotes")
.expand("PhysicianSpecialties")
.where("ContactInfo.LastName", "startsWith", lastInitial).take(5);
(note the ContactInfo is a pseudonym of the People object)
What I find is that If I request Contact.Phones to be expanded, I'll get just phones and no Notes or Specialties. If I comment out the phones I'll get Contact.Addresses and no other collections. If I comment out ContactInfo along with Phones and Addresses I'll get Notes only etc. Essentially, it seems like I can only get one collection at a time.
So, Is this a built in 'don't let the programmer shoot himself in the foot'?? safeguard or do I have to enable something?
OR is this graph too complicated?? should I consider a NoSql object store??
Thanks

You need to put all your expand clauses in a single one like this:
var query = new breeze.EntityQuery().from("Physicians")
.expand("ContactInfo, ContactInfo.Phones, ContactInfo.Addresses, PhysicianNotes, PhysicianSpecialties")
.where("ContactInfo.LastName", "startsWith", lastInitial).take(5);
You can see the documentation here: http://www.breezejs.com/sites/all/apidocs/classes/EntityQuery.html#method_expand

JY told you HOW. But BEWARE of performance consequences ... both on the data tier and over the wire. You can die a miserable death by grabbing too widely and deeply at once.
I saw the take(5) in his sample. That is crucial for restraining a runaway request (something you really must do also on the server). In general, I would reserve extended graph fetches of this kind for queries that pulled a single root entity. If I'm presenting a list for selection and I need data from different parts of the entity graph, I'd use a projection to get exactly what I need to display (assuming, of course, that there is no SQL View readily available for this purpose).
If any of the related items are reference lists (color, status, states, ...), consider bringing them into cache separately in a preparation step. Don't include them in the expand; Breeze will connect them on the client to your queried entities automatically.
Finally, as a matter of syntax, you don't have to repeat the name of a segment. When you write "ContactInfo.Phones", you get both ContactInfos and Phones so you don't need to specify "ContactInfo" by itself.

Related

How to properly access children by filtering parents in a single REST API call

I'm rewriting an API to be more RESTful, but I'm struggling with a design issue. I'll explain the situation first and then my question.
SITUATION:
I have two sets resources users and items. Each user has a list of item, so the resource path would like something like this:
api/v1/users/{userId}/items
Also each user has an isPrimary property, but only one user can be primary at a time. This means that if I want to get the primary user you'd do something like this:
api/v1/users?isPrimary=true
This should return a single "primary" user.
I have client of my API that wants to get the items of the primary user, but can't make two API calls (one to get the primary user and the second to get the items of the user, using the userId). Instead the client would like to make a single API call.
QUESTION:
How should I got about designing an API that fetches the items of a single user in only one API call when all the client has is the isPrimary query parameter for the user?
MY THOUGHTS:
I think I have a some options:
Option 1) api/v1/users?isPrimary=true will return the list of items along with the user data.
I don't like this one, because I have other API clients that call api/v1/users or api/v1/users?isPrimary=true to only get and parse through user data NOT item data. A user can have thousands of items, so returning those items every time would be taxing on both the client and the service.
Option 2) api/v1/users/items?isPrimary=true
I also don't like this because it's ugly and not really RESTful since there is not {userId} in the path and isPrimary isn't a property of items.
Option 3) api/v1/users?isPrimary=true&isShowingItems=true
This is like the first one, but I use another query parameter to flag whether or not to show the items belonging to the user in the response. The problem is that the query parameter is misleading because there is no isShowingItems property associated with a user.
Any help that you all could provide will be greatly appreciated. Thanks in advance.
There's no real standard solution for this, and all of your solutions are in my mind valid. So my answer will be a bit subjective.
Have you looked at HAL for your API format? HAL has a standard way to embed data from one resources into another (using _embedded) and it sounds like a pretty valid use-case for this.
The server can decide whether to embed the items based on a number of criteria, but one cheap solution might be to just add a query parameter like ?embed=items
Even if you don't use HAL, conceptually you could still copy this behavior similarly. Or maybe you only use _embedded. At least it's re-using an existing idea over building something new.
Aside from that practical solution, there is nothing in un-RESTful about exposing data at multiple endpoints. So if you created a resource like:
/v1/primary-user-with-items
Then this might be ugly and inconsistent with the rest of your API, but not inherently
'not RESTful' (sorry for the double negative).
You could include a List<User.Fieldset> parameter called fieldsets, and then include things if they are specified in fieldsets. This has the benefit that you can reuse the pattern by adding fieldsets onto any object in your API that has fields you might wish to include.
api/v1/users?isPrimary=true&fieldsets=items

REST where should end point go?

Suppose there's USERS and ORDERS
for a specific user's order list
You could do
/user/3/order_list
/order/?user=3
Which one is prefered and why?
Optional parameters tend to be easier to put in the query string.
If you want to return a 404 error when the parameter value does not correspond to an existing resource then I would tend towards a path segment parameter. e.g. /customer/232 where 232 is not a valid customer id.
If however you want to return an empty list then when the parameter is not found then query string parameters. e.g. /contacts?name=dave
If a parameter affects an entire URI structure then use a path e.g. a language parameter /en/document/foo.txt versus /document/foo.txt?language=en
If unique identifiers to be in a path rather than a query parameter.
Path is friendly for search engine/browser history/ Navigation.
When I started to create an API, I was thinking about the same question.
Video from apigee. help me a lot.
In a nutshell when you decide to build an API, you should decide which entity is independent and which is only related to someone.
For example, if you have a specific endpoint for orders with create/update/delete operations, then it will be fine to use a second approach /order/?user=3.
In the other way, if orders have only one representation, depends on a user and they don't have any special interaction then you could first approach.
There is also nice article about best practice
The whole point of REST is resources. You should try and map them as closely as possible to the actual requests you're going to get. I'd definitely not call it order_list because that looks like an action (you're "listing" the orders, while GET should be enough to tell you that you're getting something)
So, first of all I think you should have /users instead of /user, Then consider it as a tree structure:
A seller (for lack of a better name) can have multiple users
A user can have multiple orders
An order can have multiple items
So, I'd go for something like:
The seller can see its users with yourdomain.com/my/users
The details of a single user can be seen with yourdomain.com/my/users/3
The orders of a single user can be seen with yourdomain.com/my/users/3/orders
The items of a single order can be seen with yourdomain.com/my/users/3/orders/5

REST: How best to handle large lists

It seems to me that REST has clean and clear semantics for basic CRUD and for listing resources, but I've seen no discussion of how to handle large lists of resources. It doesn't scale to dump an entire database table over the network in a resource-oriented architecture (imagine a customer table with a million customers!), especially if you only need a few items. So it seems that some semantics should exist to filter, map and reduce a list of resources on the server-side.
So, do you know any tried and true ways to do the following kinds of requests in REST:
1) Retrieve just the count of the resources?
I could imagine doing something like GET /api/customer?result=count
Is that how it's usually done?
I could also imagine modifying the URL (/api/count/customer or /api/customer/count, for example), but that seems to either break the continuity of the resource paths or inflict an ugly hack on the expected ID field.
2) Filter the results on the server-side?
I could imagine using query parameters for this, in a context-specific way (such as GET /api/customer?country=US&state=TX).
It seems tricky to do it in a flexible way, especially if you need to join other tables (for example, get customers who purchased in the last 6 months).
I could imagine using the HTTP OPTIONS method to return a JSON string specifying possible filters and their possible values.
Has anyone tried this sort of thing?
If so, how complex did you get (for example, retrieving the items purchased year-to-date by female customers between 18 and 45 years old in Massachussetts, etc.)?
3) Mapping to just get a limited set of fields or to add fields from joined tables?
4) More complicated reductions than count (such as average, sum, etc.)?
EDIT: To clarify, I'm interested in how the request is formulated rather than how to implement it on the server-side.
I think the answer to your question is OData! OData is a generic protocol for querying and interacting with information. OData is based on REST but extends the semantics to include programatic elemements similar to SQL.
OData is not always URL-based only as it use JSON payloads for some scenarios. But it is a standard (OASIS) so it well structured and supported by many APIs.
A few general links:
https://en.wikipedia.org/wiki/Open_Data_Protocol
http://www.odata.org/
The most common ways of handling large data sets in GET requests are (afaict) :
1) Pagination. The request would be something like GET /api/customer?country=US&state=TX&firstResult=0&maxResults=50. This way the client has the freedom to choose the size of the data chunk he needs (this is often useful for UI-based clients).
2) Exposing a size service, so that the client gets to know how large the data set is before actually requesting it. The service would be something like
GET /api/customer/size?country=US&state=TX
Obviously the two can (and imho should) be used together, so that when/if a client (be it mobile or web or whatever) waints to fill its UI with content, he can choose what's the best data chunk size based also on the size of whole data set (e.g. to avoid creating 100 pages for the user to navigate).

Dynamic representation of a REST resource

Lets assume I have an object that I expose as a REST resource in my application. This object has many fields and contains many other objects including associated collections. Something like this, but think MUCH bigger:
Customer
List<Order> orders
List<Address> shippingAddresses;
// other fields for name, etc.
Order
List<Product> products
// fields for total, tax, shipping, etc.
Product
// fields for name, UPC, description, etc.
I expose the customer in my api as /customer/{id}
Some of my clients will want all of the details for every product in each order. If I follow HATEOAS I could supply a link to get the product details. That would lead to n+1 calls to the service to populate the products within the orders for the customer. On the other hand, if I always populate it then many clients receive a bunch of information they don't need and I do a ton of database lookups that aren't needful.
How do I allow for a customer representation of my resource based on the needs of the client?
I see a few options.
Use Jackson's JsonView annotation to specify in advance what is used. The caller asks for a view appropriate to them. i.e. /customer/{id}?view=withProducts. This would require me to specify all available views at compile time and would not be all that flexible.
Allow the caller to ask for certain fields to be populated in the request, i.e. /customer/{id}?fields=orders,firstName,lastName. This would require me to have some handler that could parse the fields parameter and probably use reflection to populate stuff. Sounds super messy to me. The what do you do about sub-resources. Could I do fields=orders.products.upc and join into the collection that way? Sounds like I'm trying to write hibernate on top of REST or something.
Follow HATEOAS and require the client to make a million HTTP calls in order to populate what they need. This would work great for those that don't want to populate the item most of the time, but gets expensive for someone that is attempting to show a summary of order details or something like that.
Have separate resources for each view...
Other?
I would do something like this:
/customers/{id}/orders/?include=entities
Which is a kind of a more specific variation of your option 1.
You would also have the following options:
Specific order from a specific customer without list of products:
/customers/{id}/orders/{id}
Just the orders of a customer without products:
/customers/{id}/orders/
I tend to avoid singular resources, because most of the time or eventually someone always wants a list of things.
Option 2 (client specifies fields) is a filtering approach, and acts more like a query interface than a GETable resource. Your filter could be more expressive if you accept a partial template in a POST request that your service will populate. But that's complicated.
I'm willing to bet all you need is 2 simple representations of any complex entity. That should handle 99.9% of the cases in your domain. Given that, make a few more URIs, one for each "view" of things.
To handle the 0.1% case (for example, when you need the Products collection fully populated), provide query interfaces for the nested entities that allow you to filter. You can even provide hypermedia links to retrieve these collections as part of the simplified representations above.

Is there a better restful interface for this?

GET https://api.website.com/v1/project/employee;company-id={company-id},
title={title-id}?non-smoker={true|false}&<name1>=<value1>&<name2>=<value2>&<name3>=<value3>
where:
company-id is mandatory,
title is optional
name/value can be any filter criteria.
Is there a better way to define the interface?
This API is not supposed to create an employee object. It is for getting an array of employee objects that belongs to a particular company and has a particular title and the other filter criteria.
I don't know if there is a better way, because it depends often on the technology you use and its idioms.
However, here is two different URI designs that I like (and why)
#1 GET https://api.website.com/v1/project/employee/{company-id}?title={title-id}&non-smoker={true|false}&<name1>=<value1>&<name2>=<value2>&<name3>=<value3>
#2 GET https://api.website.com/v1/project/company/{company-id}/employee?title={title-id}&non-smoker={true|false}&<name1>=<value1>&<name2>=<value2>&<name3>=<value3>
As you can see in both example I extracted company-id from the query string. I prefer to add mandatory parameters in the path info to distinguish them. Then, in the second URI, the employee ressource is nested in the company. That way you can easily guess that you can retrieve all employee from a specific company, which is not obvious in the first example.
This api is supposed to GET employee objects that satisfy the given criteria of belonging to a particular company, having particular job title and some other filter criteria.
Personally I would just design your URI as http://acme.com/employee/?company=X&title=Y&non-smoker=Z&T=U. I wouldn't write "in stone" that the company is mandatory: your API will be easier to change.
However, you should consider that few "big" requests are far faster than plenty of small ones. Moreover, URI representations can be effectively cached. Therefore it is often better to have URIs based on IDs (since there are more chances that they will be asked again).
So you could get the complete employee list of a company (plus other data about the company itself) with http://acme.com/company/X and then filter it client-side.
Are you creating a new employee object? If so then a POST (create) is more appropriate. A good clue is all the data you're pushing in the URL. All that should be in the body of the POST object.