Hibernate Search: Find in list of intervals - hibernate-search

I am using Hibernate Search 6.x in my Spring Boot application. I have an indexed entity with a set of date intervals:
@Indexed
public class ProudParent {
    ...
    @IndexedEmbedded(includePaths = {"start", "end"})
    Set<DateInterval> parentalLeaves;
}
And the class DateInterval looks like this:
@Embeddable
public class DateInterval {
    @GenericField
    LocalDate start;
    @GenericField
    LocalDate end;
}
This is my range query to find all proud parents who were on parental leave on a specific date:
bool.must(factory.range().field("parentalLeaves.start").lessThan(date));
bool.must(factory.range().field("parentalLeaves.end").greaterThan(date));
The problem is that it also finds proud parents whose first parental leave started before date and whose last parental leave ended after date. So the two conditions are not evaluated against the same date interval.
Can someone give me a hint of what I'm doing wrong?
Regards, Rokko

@IndexedEmbedded will "flatten" the object structure by default, so all starts and ends are mixed together in the index, with no way to know which end matches which start.
You can read more about this in this section of the reference documentation. In short, this default is mainly for performance, because preserving the object structure can be (very) costly on large indexes, and is usually not necessary.
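To make the difference concrete, here is a minimal plain-Java sketch of the two matching semantics. It does not use Hibernate Search at all; the Interval record and both method names are hypothetical stand-ins chosen for illustration.

```java
import java.time.LocalDate;
import java.util.List;

// Plain-Java illustration only; it does not use Hibernate Search.
class FlattenedVsNested {

    // Hypothetical stand-in for the DateInterval embeddable.
    record Interval(LocalDate start, LocalDate end) {}

    // Flattened semantics: each condition may be satisfied by a different interval.
    static boolean matchesFlattened(List<Interval> leaves, LocalDate date) {
        boolean anyStartBefore = leaves.stream().anyMatch(i -> i.start().isBefore(date));
        boolean anyEndAfter = leaves.stream().anyMatch(i -> i.end().isAfter(date));
        return anyStartBefore && anyEndAfter;
    }

    // Nested semantics: both conditions must hold for the same interval.
    static boolean matchesNested(List<Interval> leaves, LocalDate date) {
        return leaves.stream()
                .anyMatch(i -> i.start().isBefore(date) && i.end().isAfter(date));
    }

    public static void main(String[] args) {
        List<Interval> leaves = List.of(
                new Interval(LocalDate.of(2020, 1, 1), LocalDate.of(2020, 3, 1)),
                new Interval(LocalDate.of(2022, 1, 1), LocalDate.of(2022, 3, 1)));
        LocalDate date = LocalDate.of(2021, 6, 15); // falls between the two leaves

        System.out.println(matchesFlattened(leaves, date)); // true (a false positive)
        System.out.println(matchesNested(leaves, date));    // false
    }
}
```

A parent with one leave in 2020 and another in 2022 matches the flattened check for a date in 2021, which is exactly the false positive described in the question.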
The solution, in your case, would be to use a NESTED structure:
@Indexed
public class ProudParent {
    ...
    @IndexedEmbedded(includePaths = {"start", "end"}, structure = ObjectStructure.NESTED)
    Set<DateInterval> parentalLeaves;
}
Then make sure to tell Hibernate Search you want the matches on start and end to be on the same parentalLeave, using a nested predicate:
bool.must(factory.nested().objectField("parentalLeaves")
        .nest(factory.bool()
                .must(factory.range()
                        .field("parentalLeaves.start")
                        .lessThan(date))
                .must(factory.range()
                        .field("parentalLeaves.end")
                        .greaterThan(date))));
Don't forget to reindex your data before testing.
Side note: the syntax should be slightly more convenient with the upcoming Hibernate Search 6.2.0.Alpha2 (not yet released):
bool.must(factory.nested("parentalLeaves")
        .add(factory.range()
                .field("parentalLeaves.start")
                .lessThan(date))
        .add(factory.range()
                .field("parentalLeaves.end")
                .greaterThan(date)));
Other side note: hopefully one day we'll have date ranges as a primitive type in Hibernate Search (HSEARCH-4199) and this whole discussion will become irrelevant :)

Related

Optaplanner passing a variable through planning solution

I have a planning entity Request with a planning variable taxi.
I want to pass a Date (a particular day) to the Drools file for cab allocation.
I tried adding the Date to the planning solution, but the rule in which I capture the Date never fires.
Planning Solution
@PlanningSolution
public class NRequest extends AbstractPersistable implements Solution<HardMediumSoftScore> {
    private Date date;
    private List<Cabs> list_cabs;
    @PlanningEntityCollectionProperty
    private List<Requests> list_req;
    .....
    .....
}
Drools file
rule "Check overlap Shift1"
when
$date:Date()
then
scoreHolder.addHardConstraintMatch(kcontext, 3);
scoreHolder.addSoftConstraintMatch(kcontext, 2);
end
I'd suggest the NurseRosteringParametrization approach.
The FooSolution class has a single FooParameterization instance, which holds things like the date, the planning window start date, or specific score weights. Then simply match on FooParameterization in your Drools rules (you know there is only one instance) and that's it. Make sure that FooParameterization is part of getProblemFacts() or annotated with @ProblemFactProperty.
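A minimal sketch of that idea follows, assuming the older Solution-interface style used in the question. NRequestParametrization and NRequestSketch are hypothetical names, and the OptaPlanner annotations and interfaces are left out so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Date;
import java.util.List;

// Hypothetical single-instance holder for the values the rules need.
class NRequestParametrization {
    private final Date date;

    NRequestParametrization(Date date) {
        this.date = date;
    }

    public Date getDate() {
        return date;
    }
}

// Simplified stand-in for the NRequest planning solution. In the real class this
// would carry @PlanningSolution and implement Solution<HardMediumSoftScore>.
class NRequestSketch {
    private final NRequestParametrization parametrization;
    private final List<?> cabs;

    NRequestSketch(NRequestParametrization parametrization, List<?> cabs) {
        this.parametrization = parametrization;
        this.cabs = cabs;
    }

    // Assumption: corresponds to Solution.getProblemFacts(), whose returned objects
    // become facts the rules can match on. The parametrization must be included,
    // otherwise a rule matching on it can never fire.
    public Collection<Object> getProblemFacts() {
        List<Object> facts = new ArrayList<>(cabs);
        facts.add(parametrization);
        return facts;
    }

    public static void main(String[] args) {
        NRequestSketch solution = new NRequestSketch(
                new NRequestParametrization(new Date()), List.of("cab-1", "cab-2"));
        System.out.println(solution.getProblemFacts().size());
    }
}
```

With this in place, a rule would match on the holder instead of a bare Date, for example with a pattern such as `$p : NRequestParametrization($date : date)`.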

How to filter fields from the database in JSON response?

I am building a REST API in Go and I want to add support for filtering fields, but I don't know the best way to implement that. Let's say I have this struct representing an Album model:
type Album struct {
ID uint64 `json:"id"`
User uint64 `json:"user"`
Name string `json:"name"`
CreatedDate time.Time `json:"createdDate"`
Privacy string `json:"privacy"`
Stars int `json:"stars"`
PicturesCount int `json:"picturesCount"`
}
and a function that returns an instance of an Album
func GetOne(id uint64, user uint64) (Album, error) {
var album Album
sql := `SELECT * FROM "album" WHERE "id" = $1 AND "user" = $2;`
err := models.DB.QueryRow(sql, id, user).Scan(
&album.ID,
&album.User,
&album.Name,
&album.CreatedDate,
&album.Privacy,
&album.Stars,
&album.PicturesCount,
)
return album, err
}
and the client were to issue a request like this:
https://api.localhost.com/albums/1/?fields=id,name,privacy
Obvious security issues aside, my first thought was to filter the fields in the database using something like this:
func GetOne(id uint64, user uint64, fields string) {
var album Album
sql := fmt.Sprintf(`SELECT %s FROM "album" WHERE "id" = $1 AND "user" = $2;`, fields)
// i don't know what to do after this
}
Then I thought of adding the omitempty tag to all the fields and setting the unwanted fields to their zero value before encoding the struct to JSON.
Would this work?
Which one is the better way?
Is there a best way?
How would I go about implementing the first method?
Thank you.
For your first proposal (querying only the requested fields) there are two approaches (answering "would this work?" and "how would I go about implementing the first method?"):
1. Dynamically create a (possibly anonymous) struct and generate JSON from it using encoding/json.
2. Implement a wrapper that translates the *sql.Rows (from database/sql) you get back from the query into JSON.
For approach (1.), you will somehow need to create structs for any combination of attributes from your original struct. Older versions of reflect could not create a new struct type at runtime, which left generating them at compile time as the only option; the combinatorial explosion would bloat your binary, so do not do that. Since Go 1.7, reflect.StructOf can build such types at runtime, at the cost of reflection-heavy code.
Approach (2.) is to be handled with caution and should only be a last resort. Taking the list of requested fields and writing out JSON with the values you got from the DB sounds straightforward and does not involve reflection. However, your solution will very likely be much less robust than encoding/json.
When reading your question I too thought about using the omitempty option of the json struct tag (for example `json:"name,omitempty"`), and I think it is the preferable solution. It involves neither metaprogramming nor writing your own JSON encoder, which is a good thing. Just be aware of the implications when some fields are missing (the client side may have to account for that), and note that omitempty never omits struct values such as time.Time, so CreatedDate would need to be a pointer. You could always query all attributes and zero out the unwanted ones using reflection.
In the end, all of the above solutions are suboptimal, and the best solution would be to not implement that feature at all. I hope you have a solid reason to make attributes variable, and I am happy to further clarify my answer based on your explanation. However, if one of the attributes of a resource is too large, it should perhaps be a sub-resource.

ObjectContext, Entities and loading performance

I am writing a RIA service, which is also exposed using SOAP.
One of its methods needs to read data from a very big table.
At the beginning I was doing something like:
public IQueryable<MyItem> GetMyItems()
{
return this.ObjectContext.MyItems.Where(x => x.StartDate >= start && x.EndDate <= end);
}
But then I stopped because I was worried about the performance.
As far as I understand, MyItems is fully loaded and "Where" just filters the elements that were loaded on the first access of the MyItems property. Because MyItems will have a very large number of rows, I don't think this is the right approach.
I tried googling the question, but no useful results came up.
So, I was thinking I could create a new instance of the context inside the GetMyItems method and load MyItems selectively. Something like:
public IQueryable<MyItems> GetMyItems(string Username, DateTime Start, DateTime End)
{
using (MyEntities ctx = new MyEntities ())
{
var objQuery = ctx.CreateQuery<MyItems>(
"SELECT * FROM MyItems WHERE Username = @Username AND Timestamp >= @Start AND Timestamp <= @End",
new ObjectParameter("@Username", Username),
new ObjectParameter("@Start", Start),
new ObjectParameter("@End", End));
return objQuery.AsQueryable();
}
}
But I am not sure at all this is the correct way to do it.
Could you please assist me and point out the right approach to do this?
Thanks in advance,
Cheers,
Gianluca.
As far as I understand, MyItems is fully loaded and "Where" just filters the elements that were loaded on the first access of the MyItems property.
No. That's entirely wrong. Where on an IQueryable is composed into the query and translated to SQL, so the filtering happens in the database when the query is enumerated. Don't fix "performance problems" until you actually have them. The code you already have is likely to perform better than the code you propose replacing it with. It certainly won't behave in the way you describe. But don't take my word for it. Use the performance profiler. Use SQL Profiler. And test!

Make Lucene index a value and store another

I want Lucene.NET to store a value while indexing a modified, stripped-down version of the stored value. e.g. Consider the value:
this_example-has some/weird (chars) 100%
I want it stored right like that (so that I can retrieve exactly that for showing in the results list), but I want lucene to index it as:
this example has some weird chars 100
(you see, like a "sanitized" version of the original value) for a simplified search.
I figure this would be the job of an analyzer, but I don't want to mess with rolling my own. Ideally, the solution should remove everything that is not a letter, a number or quotes, replacing the removed chars by a white-space before indexing.
Any suggestions on how to implement that?
This is because I am indexing products for an e-commerce search, and some have really odd names. I think this would improve search accuracy.
Thanks in advance.
If you don't want a custom analyzer, try storing the original value in a separate stored, non-indexed field, and use a simple regex to generate the sanitized version that you index:
var input = "this_example-has some/weird (chars) 100%";
var output = Regex.Replace(input, @"[\W_]+", " ");
You mention that you need another Analyzer for some searching functionality. Don't forget the PerFieldAnalyzerWrapper, which allows you to use different analyzers for different fields of the same document.
public static void Main() {
var wrapper = new PerFieldAnalyzerWrapper(defaultAnalyzer: new StandardAnalyzer(Version.LUCENE_29));
wrapper.AddAnalyzer(fieldName: "id", analyzer: new KeywordAnalyzer());
IndexWriter writer = null; // TODO: Retrieve these.
Document document = null;
writer.AddDocument(document, analyzer: wrapper);
}
You are correct that this is the work of the analyzer. I'd start by using a tool like Luke to see what the standard analyzer does with your term before deciding what to use; it tends to do a good job of stripping noise characters and words.

Entity Framework Code First & Search Criteria

So I have a model created in Entity Framework 4 using the CTP4 code first features. This is all working well together.
I am attempting to add an advanced search feature to my application. This "advanced search" feature simply allows the users to enter multiple criteria to search by. For example:
Advanced Product Search
Name
Start Date
End Date
This would allow the user to search by the product name and also limit the results by the dates that they were created.
The problem is that I do not know how many of these fields will be used in any single search. How then can my Entity Framework query be constructed?
I have an example describing how to create a dynamic query for Entity Framework, however this does not seem to work for the POCO classes I created for Code First persistence.
What is the best way to construct a query when the number of constraints is unknown?
So after some hours of work on this problem (and some help from our friend Google) I have found a workable solution to my problem. I created the following Linq expression extension:
using System;
using System.Linq;
using System.Linq.Expressions;
namespace MyCompany.MyApplication
{
public static class LinqExtensions
{
public static IQueryable<TSource> WhereIf<TSource>(this IQueryable<TSource> source, bool condition, Expression<Func<TSource, bool>> predicate)
{
if (condition)
return source.Where(predicate);
else
return source;
}
}
}
This extension allows for a Linq query to be created like this:
var products = context.Products.WhereIf(!String.IsNullOrEmpty(name), p => p.Name == name)
.WhereIf(startDate != null, p => p.CreatedDate >= startDate)
.WhereIf(endDate != null, p => p.CreatedDate <= endDate);
This allows each WhereIf statement to only affect the results if it meets the provided condition. The solution seems to work, but I'm always open to new ideas and/or constructive criticism.
John,
Your solution is absolutely awesome! Just to share, I had been using the method below until I saw your idea.
var items = context.Items.Where(t => t.Title.Contains(keyword) && !String.IsNullOrEmpty(keyword));
It may not be the best solution for this, but it is certainly a workaround.