How to implement search with multiple filters using Lucene.Net

I'm new to lucene.net. I want to implement search functionality on a client database. I have the following scenario:
Users will search for clients based on the currently selected city.
To search for clients in another city, the user has to change the city and perform the search again.
To refine the search results we need to provide filters on Areas (multiple), Pincode, etc. In other words, I need the Lucene equivalents of the following SQL queries:
SELECT * FROM CLIENTS
WHERE CITY = N'City1'
AND (Area like N'%area1%' OR Area like N'%area2%')
SELECT * FROM CLIENTS
WHERE CITY IN ('MUMBAI', 'DELHI')
AND CLIENTTYPE IN ('GOLD', 'SILVER')
Below is the code I've implemented to provide search with city as a filter:
private static IEnumerable<ClientSearchIndexItemDto> _search(string searchQuery, string city, string searchField = "")
{
    // validation
    if (string.IsNullOrEmpty(searchQuery.Replace("*", "").Replace("?", "")))
        return new List<ClientSearchIndexItemDto>();
    // set up Lucene searcher
    using (var searcher = new IndexSearcher(_directory, false))
    {
        var hits_limit = 1000;
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);
        // search by single field
        if (!string.IsNullOrEmpty(searchField))
        {
            var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_30, searchField, analyzer);
            var query = parseQuery(searchQuery, parser);
            var hits = searcher.Search(query, hits_limit).ScoreDocs;
            var results = _mapLuceneToDataList(hits, searcher);
            analyzer.Close();
            searcher.Dispose();
            return results;
        }
        else // search by multiple fields (ordered by RELEVANCE)
        {
            var parser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_30, new[]
            {
                "ClientId",
                "ClientName",
                "ClientTypeNames",
                "CountryName",
                "StateName",
                "DistrictName",
                "City",
                "Area",
                "Street",
                "Pincode",
                "ContactNumber",
                "DateModified"
            }, analyzer);
            var query = parseQuery(searchQuery, parser);
            var f = new FieldCacheTermsFilter("City", new[] { city });
            var hits = searcher.Search(query, f, hits_limit, Sort.RELEVANCE).ScoreDocs;
            var results = _mapLuceneToDataList(hits, searcher);
            analyzer.Close();
            searcher.Dispose();
            return results;
        }
    }
}
Now I have to provide more filters on Area, Pincode, etc., where Area can have multiple values. I tried a BooleanQuery like below:
var cityFilter = new TermQuery(new Term("City", city));
var areasFilter = new FieldCacheTermsFilter("Area", areas); // areas is a string[]
BooleanQuery filterQuery = new BooleanQuery();
filterQuery.Add(cityFilter, Occur.MUST);
filterQuery.Add(areasFilter, Occur.MUST); // error: filterQuery.Add has no overload that accepts this
With a single area, the same approach works fine.
I've also tried ChainedFilter as shown below, but it doesn't seem to satisfy the requirement. This code performs an OR operation between city and areas, whereas the requirement is an OR between the given areas within the selected city.
var f = new ChainedFilter(new Filter[] { cityFilter, areasFilter });
Can anybody suggest to me how to achieve this in lucene.net? Your help will be appreciated.

You're looking for the BooleanFilter. Almost any query object has a matching filter object.
Look into TermsFilter (from Lucene.Net.Contrib.Queries) if your indexing doesn't match the requirements of FieldCacheTermsFilter. From the documentation of the latter: "this filter requires that the field contains only a single term for all documents".
var cityFilter = new FieldCacheTermsFilter("CITY", new[] {"MUMBAI", "DELHI"});
var clientTypeFilter = new FieldCacheTermsFilter("CLIENTTYPE", new [] { "GOLD", "SILVER" });
var areaFilter = new TermsFilter();
areaFilter.AddTerm(new Term("Area", "area1"));
areaFilter.AddTerm(new Term("Area", "area2"));
var filter = new BooleanFilter();
filter.Add(new FilterClause(cityFilter, Occur.MUST));
filter.Add(new FilterClause(clientTypeFilter, Occur.MUST));
filter.Add(new FilterClause(areaFilter, Occur.MUST));
IndexSearcher searcher = null; // TODO.
Query query = null; // TODO.
Int32 hits_limit = 0; // TODO.
var hits = searcher.Search(query, filter, hits_limit, Sort.RELEVANCE).ScoreDocs;

What you are looking for is nested boolean queries: an OR group (on your cities) that is itself matched as a single AND clause alongside the other filters:
filter1 AND filter2 AND filter3 AND (filtercity1 OR filtercity2 OR filtercity3)
There is already a good description of how to do this here:
How to create nested boolean query with lucene API (a AND (b OR c))?
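For the city-and-areas case in the question, the nested form could be sketched roughly as follows. This assumes Lucene.Net 3.0.3 and the field names "City" and "Area" from the question; the class and method names (ClientFilters, BuildCityAreaQuery) are hypothetical. TermQuery matches exact indexed terms, so the sketch also assumes the values passed in have the same form as the terms in the index (for example, lowercased if the fields were indexed with StandardAnalyzer).

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.Net.
public static class ClientFilters
{
    // Builds: City == city AND (Area == areas[0] OR Area == areas[1] OR ...)
    public static BooleanQuery BuildCityAreaQuery(string city, string[] areas)
    {
        var combined = new BooleanQuery();
        combined.Add(new TermQuery(new Term("City", city)), Occur.MUST);

        // Skip the area group when no areas are given; an empty MUST
        // group would match nothing.
        if (areas != null && areas.Length > 0)
        {
            var areaGroup = new BooleanQuery();
            foreach (var area in areas)
            {
                // SHOULD clauses inside the group behave like OR.
                areaGroup.Add(new TermQuery(new Term("Area", area)), Occur.SHOULD);
            }
            // The whole OR group is required, which gives the AND.
            combined.Add(areaGroup, Occur.MUST);
        }

        return combined;
    }
}
```

To use it as a filter next to the parsed user query, the result can be wrapped in a QueryWrapperFilter, for example: searcher.Search(query, new QueryWrapperFilter(ClientFilters.BuildCityAreaQuery(city, areas)), hits_limit, Sort.RELEVANCE).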

Related

How to modify the available filter operators of a smart table

I have a smart table that shows data from an OData service. All properties of the entity type are Edm.String.
I can now set a filter on each column of the resulting table, with many filter operators to choose from.
My goal is to restrict the list of available filter operators depending on the selected column.
For example, if column 'A' is selected, allow only 'equal to'.
Is that somehow possible? I would like to solve it in front-end code.
I didn't find anything like that in the UI5 documentation.
You need to use the EQ FilterOperator.
Here is a link for FilterOperator and another example of how to use a filter in a grid table: https://sapui5.hana.ondemand.com/
Here is a quick example of setting more than one filter, each with a different FilterOperator:
filterGlobally : function(oEvent) {
    var sQuery = oEvent.getParameter("query");
    this._oGlobalFilter = null;
    if (sQuery) {
        // combine the two column filters with OR (second argument false)
        this._oGlobalFilter = new Filter([
            new Filter("columA", FilterOperator.EQ, sQuery),
            new Filter("columB", FilterOperator.Contains, sQuery)
        ], false);
    }
    var oFilter = null;
    if (this._oGlobalFilter) {
        oFilter = new Filter([this._oGlobalFilter], true);
    }
    this.byId("idTable").getBinding().filter(oFilter, "Application");
}

MongoDB : How to find multiple documents and update at the same time?

I have a MongoDB database and I am using C#/.NET to interact with it. The C# API has methods for finding a single document and updating it at the same time, for example FindOneAndUpdateAsync.
However, I couldn't find any method to find multiple documents and update them at the same time asynchronously.
The code below finds and processes each document asynchronously. How do I also update each document at the same time?
public async Task<IList<IDictionary<string, string>>> DoWork()
{
var collection = _mongoDatabase.GetCollection<BsonDocument>("units");
var filterBuilder = Builders<BsonDocument>.Filter;
var filter = filterBuilder.Ne<string>("status", "INIT") &
(filterBuilder.Exists("isDone", false) |
filterBuilder.Eq<bool>("isDone", false));
// I want to pass this update definition to update the documents, but I'm not sure how
var update = Builders<BsonDocument>.Update
.CurrentDate("startTime");
var sort = Builders<BsonDocument>.Sort.Ascending("startTime");
var projection = Builders<BsonDocument>.Projection
.Include("_id")
.Include("fileName"); // for brevity I have removed the other projection elements
var output = new List<IDictionary<string, string>>();
// How do I pass the update definition and update the documents at the same time?
await collection
.Find(filter)
.Sort(sort)
.Project(projection)
.ForEachAsync((unit) =>
{
var dictionary = new Dictionary<string, string>();
Recurse(unit, dictionary);
output.Add(dictionary);
});
return output.Count > 0 ? output : null;
}
That doesn't exist in the MongoDB .NET API.
Just use a combination of Find and UpdateManyAsync.
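A minimal sketch of that combination, assuming the 2.x MongoDB C# driver and the "startTime" field from the question. The class and method names (UnitUpdater, FindThenUpdateAsync) are hypothetical. Note that the two calls are not atomic: another writer could change the documents between the read and the update.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical helper, not part of the driver.
public static class UnitUpdater
{
    // Reads the matching documents, then stamps "startTime" on exactly
    // those documents by their _id values.
    public static async Task<List<BsonDocument>> FindThenUpdateAsync(
        IMongoCollection<BsonDocument> collection,
        FilterDefinition<BsonDocument> filter)
    {
        var units = await collection.Find(filter).ToListAsync();
        if (units.Count == 0)
        {
            return units;
        }

        // Restrict the update to the documents that were actually read.
        var ids = units.Select(u => u["_id"]);
        var byId = Builders<BsonDocument>.Filter.In("_id", ids);
        var update = Builders<BsonDocument>.Update.CurrentDate("startTime");

        await collection.UpdateManyAsync(byId, update);
        return units;
    }
}
```

Updating by the collected _id values, rather than re-running the original filter, keeps the update limited to the documents that were processed, even if new documents start matching the filter in the meantime.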

Query without condition in MongoDB + C#

I'm trying to use collection.FindAndModify and give it an IMongoQuery which selects all the documents. But I cannot find how to create a query without any conditions!
Can anyone tell me how to do this? I'm using MongoDB C# Driver v1.8.3.
Here's my code:
var query = ???;
var sortBy = SortBy.Ascending(new string[] { "last_update" });
var update = Update<Entity>.Set(e => e.last_update, DateTime.Now);
var fields = Fields.Include(new string[] { "counter", "_id" });
var m = collection.FindAndModify(query, sortBy, update, fields, false, false);
I wonder what should I write in place of ??? to select all the documents!?
Use an empty QueryDocument:
var query = new QueryDocument();
But keep in mind that FindAndModify will only modify the first matching document.
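If the goal is to modify every document rather than only the first match, one option in the 1.x driver is a multi-document update. The sketch below assumes driver 1.8.x and the "last_update" field from the question; the class and method names (BulkTouch, TouchAll) are hypothetical. Unlike FindAndModify, Update does not return the modified documents.

```csharp
using System;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

// Hypothetical helper, not part of the driver.
public static class BulkTouch
{
    // Sets "last_update" on every document in the collection.
    public static void TouchAll(MongoCollection collection)
    {
        var query = new QueryDocument(); // empty query: matches every document
        var update = Update.Set("last_update", DateTime.UtcNow);

        // UpdateFlags.Multi applies the update to all matches,
        // not just the first one.
        collection.Update(query, update, UpdateFlags.Multi);
    }
}
```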

MongoDB query in C#

I'd like to get only the documents that match a specific condition, but I don't know how to achieve the equivalent of a WHERE clause in relational databases. I have a simple database with words and their translations (objects with 2 fields) and use this code
var words = database.GetCollection<Word>("Dictionary")
to get them. But this gets the whole collection. What if there were thousands of records in the collection? How to get just the records I want?
Use regular expression matching as below. The "i" option makes the match case-insensitive.
var collections = mongoDatabase.GetCollection("Abcd");
var queryA = Query.And(
Query.Matches("strName", new BsonRegularExpression("ABCD", "i")),
Query.Matches("strVal", new BsonRegularExpression("4121", "i")));
var queryB = Query.Or(
Query.Matches("strName", new BsonRegularExpression("ABCD","i")),
Query.Matches("strVal", new BsonRegularExpression("33156", "i")));
var getA = collections.Find(queryA);
var getB = collections.Find(queryB);
Use Query.And or Query.Or when you want to search over multiple fields.
This assumes you have a class called Word that is modeled like your collection.
MongoServer _server = new MongoClient(connectionString).GetServer();
MongoDatabase _database = _server.GetDatabase(database);
MongoCollection _collection = _database.GetCollection(collection);
var results = _collection.FindAs<Word>(Query.EQ("MyField","WordToFind"));

MongoDb: how to return distinct field in select (find) with C# official driver

I need to select the user name from the Users collection. I do it this way:
MongoCollection<Enums> coll = Db.GetCollection<Enums>("Users");
var query = Query.EQ("_id", id);
var res = coll.FindOne(query);
var name = res.Name;
var url = res.UserUrl; //or some more fields, not just Name
Assuming that a User document can contain a lot of data, and there is no need to transfer the whole document, how can I select only a few specific fields using the official C# driver?
You'll have to use a function that returns a MongoCursor.
In the MongoCursor you can specify the fields you want to return.
var result = Db.GetCollection<Enums>("Users").FindAll();
result.Fields = Fields.Include(new [] {"Name"});
foreach (var user in result)
{
Console.WriteLine(user.Name);
}