Case-insensitive indexing with Hibernate-Search?

Case-insensitive indexing with Hibernate-Search? - hibernate-search

Is there a simple way to make Hibernate Search to index all its values in lower case ? Instead of the default mixed-case.
I'm using the annotation #Field. But I can't seem to be able to configure some application-level set

Fool that I am ! The StandardAnalyzer class is already indexing in lowercase. It's just a matter of setting the search terms in lowercase too. I was assuming the query would do that.
However, if a different analyzer were to be used, application-wide, then it can be set using the property hibernate.search.analyzer.

Lowercasing, term splitting, removing common terms and many more advanced language processing functions are applied by the Analyzer.
Usually you should process user input meant to match indexed strings with the same Analyzer used at indexing; configuring hibernate.search.analyzer sets the default (global) Analyzer, but you can customize it per index, per entity type, per field and even on different entity instances.
It is for example useful to have language specific analysis, so to process Chinese descriptions with Chinese specific routines, Italian descriptions with Italian tokenizers.
The default analyzer is ok for most use cases, and does lowercasing and splits terms on whitespace.
Consider as well that when using the Lucene Queryparser the API requests you the appropriate Analyzer.
When using the Hibernate Search QueryBuilder it attempts to apply the correct Analyzer on each field; see also http://docs.jboss.org/hibernate/search/4.1/reference/en-US/html_single/#search-query-querydsl .

There are multiple way to make sort insensitive in string type field only.
1.First Way is add #Fields annotation in field/property on entity.
Like
#Fields({#Field(index=Index.YES,analyze=Analyze.YES,store=Store.YES),
#Field(index=Index.YES,name = "nameSort",analyzer = #Analyzer(impl=KeywordAnalyzer.class), store = Store.YES)})
private String name;
suppose you have name property with custom analyzer and sort on that. so it's not possible then you can add new Field in index with nameSort apply sort on that field.
you must apply Keyword Analyzer class because that is not tokeniz field and by default apply lowercase factory class in field.
2.Second way is that you can implement your comparison class on sorting like
#Override
public FieldComparator newComparator(String field, int numHits, int sortPos, boolean reversed) throws IOException {
return new StringValComparator(numHits, field);
}
Make one class with extend FieldComparatorSource class and implement above method.
Created new Class name with StringValComparator and implements FieldComparator
and implement following method
class StringValComparator extends FieldComparator {
private String[] values;
private String[] currentReaderValues;
private final String field;
private String bottom;
StringValComparator(int numHits, String field) {
values = new String[numHits];
this.field = field;
}
#Override
public int compare(int slot1, int slot2) {
final String val1 = values[slot1];
final String val2 = values[slot2];
if (val1 == null) {
if (val2 == null) {
return 0;
}
return -1;
} else if (val2 == null) {
return 1;
}
return val1.toLowerCase().compareTo(val2.toLowerCase());
}
#Override
public int compareBottom(int doc) {
final String val2 = currentReaderValues[doc];
if (bottom == null) {
if (val2 == null) {
return 0;
}
return -1;
} else if (val2 == null) {
return 1;
}
return bottom.toLowerCase().compareTo(val2.toLowerCase());
}
#Override
public void copy(int slot, int doc) {
values[slot] = currentReaderValues[doc];
}
#Override
public void setNextReader(IndexReader reader, int docBase) throws IOException {
currentReaderValues = FieldCache.DEFAULT.getStrings(reader, field);
}
#Override
public void setBottom(final int bottom) {
this.bottom = values[bottom];
}
#Override
public String value(int slot) {
return values[slot];
}
}
Apply sorting on Fields Like
new SortField("name",new StringCaseInsensitiveComparator(), true);

Related

How to optimize SQL query in Anylogic

I am generating Agents with parameter values coming from SQL table in Anylogic. when agent is generated at source I am doing a v look up in table and extracting corresponding values from table. For now it is working perfectly but it is slowing down the performance.
Structure of Table looks like this
I am querying the data from this table with below code
double value_1 = (selectFrom(account_details)
.where(account_details.act_code.eq(z))
.list(account_details.avg_value)).get(0);
double value_min = (selectFrom(account_details)
.where(account_details.act_code.eq(z))
.list(account_details.min_value)).get(0);
double value_max = (selectFrom(account_details)
.where(account_details.act_code.eq(z))
.list(account_details.max_value)).get(0);
// Fetch the cluster number from account table
int cluster_num = (selectFrom(account_details)
.where(account_details.act_code.eq(z))
.list(account_details.cluster)).get(0);
int act_no = (selectFrom(account_details)
.where(account_details.act_code.eq(z))
.list(account_details.actno)).get(0);
String pay_term = (selectFrom(account_details)
.where(account_details.act_code.eq(z))
.list(account_details.pay_term)).get(0);
String pay_term_prob = (selectFrom(account_details)
.where(account_details.act_code.eq(z))
.list(account_details.pay_term_prob)).get(0);
But this is very slow and wants to improve the performance. someone mentioned that we can create a Java class and then add the table into collection . Is there any example where I can refer. I am finding it difficult to put entire code.
I have created a class using below code:
public class Customer {
private String act_code;
private int actno;
private double avg_value;
private String pay_term;
private String pay_term_prob;
private int cluster;
private double min_value;
private double max_value;
public String getact_code() {
return act_code;
}
public void setact_code(String act_code) {
this.act_code = act_code;
}
public int getactno() {
return actno;
}
public void setactno(int actno) {
this.actno = actno;
}
public double getavg_value() {
return avg_value;
}
public void setavg_value(double avg_value) {
this.avg_value = avg_value;
}
public String getpay_term() {
return pay_term;
}
public void setpay_term(String pay_term) {
this.pay_term = pay_term;
}
public String getpay_term_prob() {
return pay_term_prob;
}
public void setpay_term_prob(String pay_term_prob) {
this.pay_term_prob = pay_term_prob;
}
public int cluster() {
return cluster;
}
public void setcluster(int cluster) {
this.cluster = cluster;
}
public double getmin_value() {
return min_value;
}
public void setmin_value(double min_value) {
this.min_value = min_value;
}
public double getmax_value() {
return max_value;
}
public void setmax_value(double max_value) {
this.max_value = max_value;
}
}
Created collection object like this:
Pls provide an reference to add this database table into collection as a next step. then I want to query the collection based on the condition

You are on the right track here!
Every time you access the database to read data there is a computational overhead. So the best option is to access the database only once, at the start of the model. Create all the objects you need, store other data you will need later into Java classes, and then use the Java classes.
My suggestion is to create a Java class for each row in your table, like you have done. And then create a map object - like you have done, but with the key as String and the value as this new object.
Then on model start you can populate this map as follows:
List<Tuple> rows = selectFrom(customer).list();
for (Tuple row : rows) {
Customer customerData = new Customer(
row.get( customer.act_code ),
row.get( customer.actno ),
row.get( customer.avg_value )
);
mapOfCustomerData.put(customerData.act_code, customerData);
}
Where mapOfCustomerData is a linkedHashMap and customer is the name of the table
See the model created in this blog post for more details and an example on using a scenario object to store all the data from the Database in a separate object
Note: The code above is just an example - read this blog post for more details on using the AnyLogic INternal Database

Before using Java classes, try this first: click the "index" tickbox for all columns that you query with a WHERE clause.

Concatenating ImmutableLists

I have a List<ImmutableList<T>>. I want to flatten it into a single ImmutableList<T> that is a concatenation of all the internal ImmutableLists. These lists can be very long so I do not want this operation to perform a copy of all the elements. The number of ImmutableLists to flatten will be relatively small, so it is fine that lookup will be linear in the number of ImmutableLists. I would strongly prefer that the concatenation will return an Immutable collection. And I need it to return a List that can be accessed in a random location.
Is there a way to do this in Guava?
There is Iterables.concat but that returns an Iterable. To convert this into an ImmutableList again will be linear in the size of the lists IIUC.

By design Guava does not allow you to define your own ImmutableList implementations (if it did, there'd be no way to enforce that it was immutable). Working around this by defining your own class in the com.google.common.collect package is a terrible idea. You break the promises of the Guava library and are running firmly in "undefined behavior" territory, for no benefit.
Looking at your requirements:
You need to concatenate the elements of n ImmutableList instances in sub-linear time.
You would like the result to also be immutable.
You need the result to implement List, and possibly be an ImmutableList.
As you know you can get the first two bullets with a call to Iterables.concat(), but if you need an O(1) random-access List this won't cut it. There isn't a standard List implementation (in Java or Guava) that is backed by a sequence of Lists, but it's straightforward to create one yourself:
/**
* A constant-time view into several {#link ImmutableList} instances, as if they were
concatenated together. Since the backing lists are immutable this class is also
immutable and therefore thread-safe.
*
* More precisely, this class provides O(log n) element access where n is the number of
* input lists. Assuming the number of lists is small relative to the total number of
* elements this is effectively constant time.
*/
public class MultiListView<E> extends AbstractList<E> implements RandomAccess {
private final ImmutableList<ImmutableList<E>> elements;
private final int size;
private final int[] startIndexes;
private MutliListView(Iterable<ImmutableList<E>> elements) {
this.elements = ImmutableList.copyOf(elements);
startIndexes = new int[elements.size()];
int currentSize = 0;
for (int i = 0; i < this.elements.size(); i++) {
List<E> ls = this.elements.get(i);
startIndexes[i] = ls.size();
currentSize += ls.size();
}
}
#Override
public E get(int index) {
checkElementIndex(index, size);
int location = Arrays.binarySearch(startIndexes, index);
if (location >= 0) {
return elements.get(location).get(0);
}
location = (~location) - 1;
return elements.get(location).get(index - startIndexes[location]);
}
#Override
public int size() {
return size;
}
// The default iterator returned by AbstractList.iterator() calls .get()
// which is likely slower than just concatenating the backing lists' iterators
#Override
public Iterator<E> iterator() {
return Iterables.concat(elements).iterator();
}
public static MultiListView<E> of(Iterable<ImmutableList<E>> lists) {
return new MultiListView<>(lists);
}
public static MultiListView<E> of(ImmutableList<E> ... lists) {
return of(Arrays.asList(lists));
}
}
This class is immutable even though it doesn't extend ImmutableList or ImmutableCollection, therefore there's no need for it to actually extend ImmutableList.
As to whether such a class should be provided by Guava; you can make your case in the associated issue, but the reason this doesn't already exist is that surprisingly few users actually need it. Be sure there isn't a reasonable way to solve your problem with an Iterable before using MultiListView.

Firstly, #dimo414's answer is right on the mark - with a clean wrapper view implementation and advice.
Still, I would like to emphasise that since Java 8, you probably just want to do:
listOfList.stream()
.flatMap(ImmutableList::stream)
.collect(ImmutableList.toImmutableList());
The guava issue was since closed as working-as-intended with the remark:
We are more down on lazy view collections than we used to be (especially now that Stream exists) (...)
At least, profile your own use case before trying the view-collection approach.
Under the hood using streams, what effectively happens is that a new backing array is populated with references to the elements - the elements themselves are not deeply copied. So there's very low number of objects created (GC costs) and linear copies from backing-arrays to backing-arrays usually proceed faster than you might expect even with large inner-lists. (They work well with CPU cache prefetch).
Depending on how much you do with the result, the stream version might work out faster that the wrapper version's extra indirection every time you access it.

Here is probably a slightly more readable version of dimo414 implementation, which processes empty lists correctly and populates startIndexes correctly:
public class ImmutableMultiListView<E> extends AbstractList<E> implements RandomAccess {
private final ImmutableList<ImmutableList<E>> listOfLists;
private final int[] startIndexes;
private final int size;
private ImmutableMultiListView(List<ImmutableList<E>> originalListOfLists) {
this.listOfLists =
originalListOfLists.stream().filter(l -> !l.isEmpty()).collect(toImmutableList());
startIndexes = new int[listOfLists.size()];
int sumSize = 0;
for (int i = 0; i < listOfLists.size(); i++) {
List<E> list = listOfLists.get(i);
sumSize += list.size();
if (i < startIndexes.length - 1) {
startIndexes[i + 1] = sumSize;
}
}
this.size = sumSize;
}
#Override
public E get(int index) {
checkElementIndex(index, size);
int location = Arrays.binarySearch(startIndexes, index);
if (location >= 0) {
return listOfLists.get(location).get(0);
} else {
// See Arrays#binarySearch Javadoc:
int insertionPoint = -location - 1;
int listIndex = insertionPoint - 1;
return listOfLists.get(listIndex).get(index - startIndexes[listIndex]);
}
}
#Override
public int size() {
return size;
}
// AbstractList.iterator() calls .get(), which is slower than just concatenating
// the backing lists' iterators
#Override
public Iterator<E> iterator() {
return Iterables.concat(listOfLists).iterator();
}
public static <E> ImmutableMultiListView<E> of(List<ImmutableList<E>> lists) {
return new ImmutableMultiListView<>(lists);
}
}

Not sure if it is possible just with Guava classes, but it seems not difficult to implement, how about something like the following:
package com.google.common.collect;
import java.util.List;
public class ConcatenatedList<T> extends ImmutableList<T> {
private final List<ImmutableList<T>> underlyingLists;
public ConcatenatedList(List<ImmutableList<T>> underlyingLists) {
this.underlyingLists = underlyingLists;
}
#Override
public T get(int index) {
for (ImmutableList<T> list : underlyingLists) {
if (index < list.size()) return list.get(index);
index -= list.size();
}
throw new IndexOutOfBoundsException();
}
#Override
boolean isPartialView() {
for (ImmutableList<T> list : underlyingLists) {
if (list.isPartialView()) return true;
}
return false;
}
#Override
public int size() {
int result = 0;
for (ImmutableList<T> list : underlyingLists) {
result += list.size();
}
return result;
}
}
Note package declaration, it needs to be like that to access Guava's ImmutableList package access constructor. Be aware that this implementation might break with future version of Guava, since the constructor is not part of API. Also as mentioned in the javadoc of ImmutableList and in comments this class was not intended to be subclassed by the original library author. However, there is no good reason for not using it in application you control and it has additional benefit of expressing immutability in the type signature compared to MultiListView suggested in the other answer.

How to compare all properties values from a two equal classes

I've a class defined as follows:
Public Class DeviceConfig
Private _maxNumCodesGlobal As Integer
Private _maxNumCodesDataMatrix As Integer
Private _maxNumCodesQR As Integer
Private _maxNumCodesBarcode As Integer
Private _partialResults As String
Private _allowIdenticalSymbols As String
Private _datamatrixValidation As Integer
Private _datamatrixValidationType
'AND MUCH MORE PROPERTIES
'GETTERS & SETTERS
End Class
as you can see it's a long list of properties in this class.
I need to compare the values of the properties from an instance with the values of the properties of another instance.
Is there a way to iterate through all of them, or even better, just comparing both classes and get true/false if they have the same properties values or not?
if instance1=instance2 then true
Thank you

I encountered the same problem and created this method. Hopefully it will help you.
It uses reflections to iterate through the public fields, ignoring those with the JsonIgnore annotation.
This method is not considering fields as List, Set, etc.
You can change it to work for properties instead of fields.
protected <T> boolean equals(T object1, T object2) {
Field[] fields = object1.getClass().getFields();
for (Field field : fields) {
if (field.getAnnotation(JsonIgnore.class)!= null) continue; //do not check the fields with JsonIgnore
Object value1;
Object value2;
try {
value1 = field.get(object1);
value2 = field.get(object2);
} catch (Exception e) {
logger.error("Error comparing objects. Exception: " + e.getMessage());
return false;
}
//comparing
if (value1 == null) {
if (value2 != null)
return false;
} else if (!value1.equals(value2))
return false;
}
return true;
}

Get and Set attribute values of a class using aspectJ

I am using aspectj to add some field to a existing class and annotate it also.
I am using load time weaving .
Example :- I have a Class customer in which i am adding 3 string attributes. But my issues is that I have to set some values and get it also before my business call.
I am trying the below approach.
In my aj file i have added the below, my problem is in the Around pointcut , how do i get the attribute and set the attribute.
public String net.customers.PersonCustomer.getOfflineRiskCategory() {
return OfflineRiskCategory;
}
public void net.customers.PersonCustomer.setOfflineRiskCategory(String offlineRiskCategory) {
OfflineRiskCategory = offlineRiskCategory;
}
public String net.customers.PersonCustomer.getOnlineRiskCategory() {
return OnlineRiskCategory;
}
public void net.customers.PersonCustomer.setOnlineRiskCategory(String onlineRiskCategory) {
OnlineRiskCategory = onlineRiskCategory;
}
public String net.customers.PersonCustomer.getPersonCommercialStatus() {
return PersonCommercialStatus;
}
public void net.customers.PersonCustomer.setPersonCommercialStatus(String personCommercialStatus) {
PersonCommercialStatus = personCommercialStatus;
}
#Around("execution(* net.xxx.xxx.xxx.DataMigration.populateMap(..))")
public Object invoke(ProceedingJoinPoint joinPoint) throws Throwable {
Object arguments[] = joinPoint.getArgs();
if (arguments != null) {
HashMap<String, String> hMap = (HashMap) arguments[0];
PersonCustomer cus = (PersonCustomer) arguments[1];
return joinPoint.proceed();
}
If anyone has ideas please let me know.
regards,
FT

First suggestion, I would avoid mixing code-style aspectj with annotation-style. Ie- instead of #Around, use around.
Second, instead of getting the arguments from the joinPoint, you should bind them in the pointcut:
Object around(Map map, PersonCustomer cust) :
execution(* net.xxx.xxx.xxx.DataMigration.populateMap(Map, PersonCustomer) && args(map, cust) {
...
return proceed(map, cust);
}
Now, to answer your question: you also need to use intertype declarations to add new fields to your class, so do something like this:
private String net.customers.PersonCustomer.OfflineRiskCategory;
private String net.customers.PersonCustomer.OnlineRiskCategory;
private String net.customers.PersonCustomer.PersonCommercialStatus;
Note that the private keyword here means private to the aspect, not to the class that you declare it on.

Generating Cache Keys from IQueryable For Caching Results of EF Code First Queries

I'm trying to implement a caching scheme for my EF Repository similar to the one blogged here. As the author and commenters have reported the limitation is that the key generation method cannot produce cache keys that vary with a given query's parameters. Here is the cache key generation method:
private static string GetKey<T>(IQueryable<T> query)
{
string key = string.Concat(query.ToString(), "\n\r",
typeof(T).AssemblyQualifiedName);
return key;
}
So the following queries will yield the same cache key:
var isActive = true;
var query = context.Products
.OrderBy(one => one.ProductNumber)
.Where(one => one.IsActive == isActive).AsCacheable();
and
var isActive = false;
var query = context.Products
.OrderBy(one => one.ProductNumber)
.Where(one => one.IsActive == isActive).AsCacheable();
Notice that the only difference is that isActive = true in the first query and isActive = false in the second.
Any suggestions/insight to efficiently generating cache keys which vary by IQueryable parameters would be truly appreciated.
Kudos to Sergey Barskiy for sharing the EF CodeFirst caching scheme.
Update
I took the approach of traversing the IQueryable's expression tree myself with the goal of resolving the values of the parameters used in the query. With maxlego's suggestion, I extended the System.Linq.Expressions.ExpressionVisitor class to visit the expression nodes that we're interested in - in this case, the MemberExpression. The updated GetKey method looks something like this:
public static string GetKey<T>(IQueryable<T> query)
{
var keyBuilder = new StringBuilder(query.ToString());
var queryParamVisitor = new QueryParameterVisitor(keyBuilder);
queryParamVisitor.GetQueryParameters(query.Expression);
keyBuilder.Append("\n\r");
keyBuilder.Append(typeof (T).AssemblyQualifiedName);
return keyBuilder.ToString();
}
And the QueryParameterVisitor class, which was inspired by the answers of Bryan Watts and Marc Gravell to this question, looks like this:
/// <summary>
/// <see cref="ExpressionVisitor"/> subclass which encapsulates logic to
/// traverse an expression tree and resolve all the query parameter values
/// </summary>
internal class QueryParameterVisitor : ExpressionVisitor
{
public QueryParameterVisitor(StringBuilder sb)
{
QueryParamBuilder = sb;
Visited = new Dictionary<int, bool>();
}
protected StringBuilder QueryParamBuilder { get; set; }
protected Dictionary<int, bool> Visited { get; set; }
public StringBuilder GetQueryParameters(Expression expression)
{
Visit(expression);
return QueryParamBuilder;
}
private static object GetMemberValue(MemberExpression memberExpression, Dictionary<int, bool> visited)
{
object value;
if (!TryGetMemberValue(memberExpression, out value, visited))
{
UnaryExpression objectMember = Expression.Convert(memberExpression, typeof (object));
Expression<Func<object>> getterLambda = Expression.Lambda<Func<object>>(objectMember);
Func<object> getter = null;
try
{
getter = getterLambda.Compile();
}
catch (InvalidOperationException)
{
}
if (getter != null) value = getter();
}
return value;
}
private static bool TryGetMemberValue(Expression expression, out object value, Dictionary<int, bool> visited)
{
if (expression == null)
{
// used for static fields, etc
value = null;
return true;
}
// Mark this node as visited (processed)
int expressionHash = expression.GetHashCode();
if (!visited.ContainsKey(expressionHash))
{
visited.Add(expressionHash, true);
}
// Get Member Value, recurse if necessary
switch (expression.NodeType)
{
case ExpressionType.Constant:
value = ((ConstantExpression) expression).Value;
return true;
case ExpressionType.MemberAccess:
var me = (MemberExpression) expression;
object target;
if (TryGetMemberValue(me.Expression, out target, visited))
{
// instance target
switch (me.Member.MemberType)
{
case MemberTypes.Field:
value = ((FieldInfo) me.Member).GetValue(target);
return true;
case MemberTypes.Property:
value = ((PropertyInfo) me.Member).GetValue(target, null);
return true;
}
}
break;
}
// Could not retrieve value
value = null;
return false;
}
protected override Expression VisitMember(MemberExpression node)
{
// Only process nodes that haven't been processed before, this could happen because our traversal
// is depth-first and will "visit" the nodes in the subtree before this method (VisitMember) does
if (!Visited.ContainsKey(node.GetHashCode()))
{
object value = GetMemberValue(node, Visited);
if (value != null)
{
QueryParamBuilder.Append("\n\r");
QueryParamBuilder.Append(value.ToString());
}
}
return base.VisitMember(node);
}
}
I'm still doing some performance profiling on the cache key generation and hoping that it isn't too expensive (I'll update the question with the results once I have them). I'll leave the question open, in case anyone has suggestions on how to optimize this process or has a recommendation for a more efficient method for generating cache keys with vary with the query parameters. Although this method produces the desired output, it is by no means optimal.

i suggest to use ExpressionVisitor
http://msdn.microsoft.com/en-us/library/bb882521(v=vs.90).aspx

Just for the record, "Caching the results of LINQ queries" works well with the EF and it's able to work with parameters correctly, so it can be considered as a good second level cache implementation for EF.

While the solution of the OP works quite well, I found that the performance of the solution is a little bit poor.
The duration of the key generation varied between 300ms and 1200ms for my queries.
However, I've found another solution that has quite better performance (<10ms).
public static string ToTraceString<T>(DbQuery<T> query)
{
var internalQueryField = query.GetType().GetFields(BindingFlags.NonPublic | BindingFlags.Instance).Where(f => f.Name.Equals("_internalQuery")).FirstOrDefault();
var internalQuery = internalQueryField.GetValue(query);
var objectQueryField = internalQuery.GetType().GetFields(BindingFlags.NonPublic | BindingFlags.Instance).Where(f => f.Name.Equals("_objectQuery")).FirstOrDefault();
var objectQuery = objectQueryField.GetValue(internalQuery) as ObjectQuery<T>;
return ToTraceStringWithParameters(objectQuery);
}
private static string ToTraceStringWithParameters<T>(ObjectQuery<T> query)
{
string traceString = query.ToTraceString() + "\n";
foreach (var parameter in query.Parameters)
{
traceString += parameter.Name + " [" + parameter.ParameterType.FullName + "] = " + parameter.Value + "\n";
}
return traceString;
}

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Case-insensitive indexing with Hibernate-Search? - hibernate-search

Is there a simple way to make Hibernate Search to index all its values in lower case ? Instead of the default mixed-case. I'm using the annotation #Field. But I can't seem to be able to configure some application-level set

Related

How to optimize SQL query in Anylogic

Concatenating ImmutableLists

How to compare all properties values from a two equal classes

Get and Set attribute values of a class using aspectJ

Generating Cache Keys from IQueryable For Caching Results of EF Code First Queries

Categories

Resources