Spring Data PagingAndSortingRepository - spring-data

Im using Spring Data together with QueryDSL as outlined by Gierke in his blog posts. Everything is working and is relatively simple to implement, but Ive now reached a point where I have a dataview which needs both paging AND sorting. It seems though, that one has to choose on or the other. Why is this? And is there really no way to get both? We have already made considerable investments in time and effort to implement everything this far, would be a shame to get stuck at such a seemingly simple task.
To put it shortly, I need to make a method that takes QueryDSL predicates, a pageable and and some form of sort object to deliver filtered, paged and sorted results.
Any info would be greatly appreciated.

PageRequest has a constructor PageRequest(int page, int size, Sort sort) so to combine both simply pipe your sort options into the PageRequest instance and hand this to into PagingAndSortRepository or the relevant methods in QueryDslSpecificationExcutor.

Related

.Net Core Rest API Request/Response best practice

I need some advice on how to best structure the requests and the responses for my Rest API.
I'm mostly trying to limit myself to CRUD operations on one resources and I work with one object: for example if the ressource is "book" I end up with the following actions in the controller
[HttpPost("books")] Book Create(Book book)
[HttpGet("books")] Book Get(int id)
This is relatively strait forward.
Now for a more complex example for the creation of a resource, I need to receive a complexe object different from my ressource and return an object containing the resource and extra data
For example for the Order resource I have a the following action in the controller:
[HttpPost("/order")] CreateOrderResponse CreateOrder(CreateOrderRequest createOrderRequest)
Here my action will use the "CreateOrderRequest" object to create to build an Order.
Then I would like to return a "createOrderResponse" object which contains the Order but also extra information that the client needs.
I'm not sure this is the best way to go, any advice ?
Thanks in advance for your help
I prefer the following:
[HttpPost("/order")] CreateOrderResponse CreateOrder(CreateOrderRequest createOrderRequest)
And here is why:
By this method, you are able to protect your public API from implementation details. If you expose your model to your API then you cannot make the same guarantee.
You can also make your validations specific to the request format. In some cases, you might require one subset of your model when creating a record and another subset when editing data. This approach will allow you to handle that scenario as well.
Security. Were you going to add that Book right to a DbContext and save it? Or attach it and update directly? Those would be potential issues from security and data quality perspectives.
But there are downsides:
This approach is time consuming. It may not be worth the time invested if you are writing something as a learning exercise or a quick implementation. And it adds complexity. But then, you might find complexity when you realize your Book object is insufficent in all cases.
You will feel like there is duplicate code in different places. The code may appear to be the same, but the use cases are actually different and may diverge over time. Having a Book parameter will be a liability at that point.

Conducting searches with REST that return large datasets?

I'm creating a RESTful WebAPI for our system in .Net, when conducting a search in my client I presume that it should hit the /person route passing parameters when required to filter the data. However, the person object that is return has quite a lot of nested objects which could slow down data retrieval. Should I have a separate controller which returns a more skeletonised view of a person, should I continue the way I am going, or should I be making subsequent requests to break down the person?
Actually, there is no silver-bullet way to solve your problem, but there are several approaches, which could be usefull for you. However, in my opinion, your idea about optimizing the size of resource representation in search results is correct.
You can include the list of requested fields in filtering query. (for example, see the similar signature/approach in ES search API). Many search engines are following this approach to reduce redundant response payload.
As you have metioned, you can break your heavy object in sub-resources, so that you would be able to include only links to nested objects inside the person, without including the whole represantations of inner-objects. The HATEOAS approach will fit perfectly for this purpose, but it will add extra complexity to your application (but the extra flexibility too).
However, you have to choose, which approach is better for your particular application, but I think, that a good starting point will be the approach with list of requested fields.

nosql wishlist models - Struggle between reference and embedded document

I got a question about modeling wishlists using mongodb and mongoose. The idea is I need a user beeing able to have many different wishlists which contain many wishes, each wish making a reference to a single article
I was thinking about it and because a wishlist only belong to a single user I thought using embedded document for that.
Same for the wish beeing embedded to a wishlist.
So I got something like that
var UserSchema = new Schema({
...
wishlists: [wishlistSchema]
...
})
var WishlistSchema = new Schema({
...
wishes: [wishSchema]
...
})
but my question is what to do with the article ? should I use a reference or should I copy the article's data in an embedded document.
If I use embedded document I got an update problem. When the article's price change, to update every wish referencing this article become a struggle. But to access those wishes's article is a piece of cake.
If I use reference, The update is not a problem anymore but I got a probleme when I filter the wish depending on their article criteria ( when I filter the wishes depending on a price, category etc .. ).
I think the second way is probably the best but I don't know how if it's possible to build a query to filter the wish depending on the article's field. I tried a lot of things using population but nothing works very well when you need to populate depending on a nested object field. ( for exemple getting wishes where their article respond to certain conditions ).
Is this kind of query doable ?
Sry for the loooong question and for my bad English :/ but any advice would be great !
In my experience in dealing with NoSQL database (mongo, mainly), when designing a collection, do not think of the relations. Instead, think of how you would display, page, and retrieve the documents.
I would prefer embedding and updating multiple schema when there's a change, as opposed to doing a ref, for multiple reasons.
Get would be fast and easy and filter is not a problem (like you've said)
Retrieve operations usually happen a lot more often than updates and with proper indexing, you wouldn't really have to bother about performance.
It leverages on NoSQL's schema-less nature and you'll be less prone restructuring due to requirement changes (new sorting, new filters, etc)
Paging would be a lot less of a hassle, and UI would not be restricted with it's design with paging and limit.
Joining could become expensive. Redundant data might be a hassle to update but it's always better than not being able to display a data in a particular way because your schema is normalized and joining is difficult.
I'd say that the rule of thumb is that only split them when you do not need to display them together. It is not impossible to join them back if you do, but definitely more troublesome.

Entity Framework 5 Get All method

In our EF implementation for Brand New Project, we have GetAll method in Repository. But, as our Application Grows, we are going to have let's say 5000 Products, Getting All the way it is, would be Performance Problem. Wouldn't it ?
if So, what is good implementation to tackle this future problem?
Any help will be hightly appreciated.
It could become a performance problem if get all is enumerating on the collection, thus causing the entire data set to be returned in an IEnumerable. This all depends on the number of joins, the size of the data set, if you have lazy loading enabled, how SQL Server performs, just to name a few.
Some people will say this is not desired, you could have GetAll() return an IQueryable which would defer the query until something caused the collection to be filled by going to SQL. That way you could filter the results with other Where(), Take(), Skip(), etc statements to allow for paging without having to retrieve all 5000+ products from the database.
It depends on how your repository class is set up. If you're performing the query immediately, i.e. if your GetAll() method returns something like IEnumerable<T> or IList<T>, then yes, that could easily be a performance problem, and you should generally avoid that sort of thing unless you really want to load all records at once.
On the other hand, if your GetAll() method returns an IQueryable<T>, then there may not be a problem at all, depending on whether you trust the people writing queries. Returning an IQueryable<T> would allow callers to further refine the search criteria before the SQL code is actually generated. Performance-wise, it would only be a problem if developers using your code didn't apply any filters before executing the query. If you trust them enough to give them enough rope to hang themselves (and potentially take your database performance down with them), then just returning IQueryable<T> might be good enough.
If you don't trust them, then, as others have pointed out, you could limit the number of records returned by your query by using the Skip() and Take() extension methods to implement simple pagination, but note that it's possible for records to slip through the cracks if people make changes to the database before you move on to the next page. Making pagination work seamlessly with an ever-changing database is much harder than a lot of people think.
Another approach would be to replace your GetAll() method with one that requires the caller to apply a filter before returning results:
public IQueryable<T> GetMatching<T>(Expression<Func<T, bool>> filter)
{
// Replace myQuery with Context.Set<T>() or whatever you're using for data access
return myQuery.Where(filter);
}
and then use it like var results = GetMatching(x => x.Name == "foo");, or whatever you want to do. Note that this could be easily bypassed by calling GetMatching(x => true), but at least it makes the intention clear. You could also combine this with the first method to put a firm cap on the number of records returned.
My personal feeling, though, is that all of these ways of limiting queries are just insurance against bad developers getting their hands on your application, and if you have bad developers working on your project, they'll find a way to cause problems no matter what you try to do. So my vote is to just return an IQueryable<T> and trust that it will be used responsibly. If people abuse it, take away the GetAll() method and give them training-wheels methods like GetRecentPosts(int count) or GetPostsWithTag(string tag, int count) or something like that, where the query logic is out of their hands.
One way to improve this is by using pagination
context.Products.Skip(n).Take(m);
What your referring to is known as paging, and it's pretty trivial to do using LINQ via the Skip/Take methods.
EF queries are lazily loaded which means they won't actually hit the database until they are evaluated so the following would only pull the first 10 rows after skipping the first 10
context.Table.Skip(10).Take(10);

How deep should I embed with Mongoid?

Hi I'm new to mongoDB and Mongoid and am little bit confused on when to use embedded documents and how deep to embedd.
So a fictional example:
Library collection has_many :books, which embeds_many :pages, which embeds_many :sections
Since I cannot work with say a Section directly I have to go trough books.pages.sections, right?
This would result in this route, libraries/:id/books/:id/pages/:id/sections/:id
Which seems a little bit crazy, best practice would be to only nest one level deep, right?
One way would be to have the route pages/:id/sections/:id and then stick the bookid in the request?
Would it be harder to query on say, sections? For example if I need to find all the books where sections has tag x?
However if I don't embedd all the way I would have an extra query?
Can someone shed some light? Thanks.
First of I believe that no one can give you the right answer about how deeply you should embed documents. It is highly dependent on your concrete project requirements. In general you should answer some questions to choose appropriate schema design:
Will users concurrently update same object in collection? (or what would my boss say if clients lost their updates)
Do I need support atomic operations?
Do I need to independently show nested collections or are they queried with the parent?
Do I need to sort embedded objects?
Do I need to query on embedded objects?
If you will answer 1,2-true; 3 I need show them independently (different page); 4,5 - true then i am sure that embedding will be some kind of pain in your ass.
Extra queries should not be a problem I guess.