Phrase query in Lucene 6.2.0

I have a document like this:
{
"_id" : ObjectId("586b723b4b9a835db416fa26"),
"name" : "test",
"countries" : {
"country" : [
{
"name" : "russia iraq"
},
{
"name" : "USA china"
}
]
}
}
I am trying to retrieve it from MongoDB using a phrase query (Lucene 6.2.0). My code looks as follows:
StandardAnalyzer analyzer = new StandardAnalyzer();
// 1. create the index
Directory index = new RAMDirectory();
IndexWriterConfig config = new IndexWriterConfig(analyzer);
try {
IndexWriter w = new IndexWriter(index, config);
MongoClient client = new MongoClient("localhost", 27017);
DB database = client.getDB("test123");
DBCollection coll = database.getCollection("test1");
//MongoCollection<org.bson.Document> collection = database.getCollection("test1");
DBCursor cursor = coll.find();
System.out.println(cursor);
while (cursor.hasNext()) {
BasicDBObject obj = (BasicDBObject) cursor.next();
Document doc = new Document();
BasicDBObject f = (BasicDBObject) (obj.get("countries"));
List<BasicDBObject> dts = (List<BasicDBObject>)(f.get("country"));
doc.add(new TextField("id",obj.get("_id").toString().toLowerCase(), Field.Store.YES));
doc.add(new StringField("name",obj.get("name").toString(), Field.Store.YES));
doc.add(new StringField("countries",f.toString(), Field.Store.YES));
for(BasicDBObject d : dts){
doc.add(new StringField("country",d.get("name").toString(), Field.Store.YES));
//
}
w.addDocument(doc);
}
w.close();
and my search looks like this:
PhraseQuery query = new PhraseQuery("country", "iraq russia" );
// 3. search
int hitsPerPage = 10;
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
TopDocs docs = searcher.search(query, hitsPerPage);
ScoreDoc[] hits = docs.scoreDocs;
// 4. display results
System.out.println("Found " + hits.length + " hits.");
for(int j=0;j<hits.length;++j) {
int docId = hits[j].doc;
Document d = searcher.doc(docId);
System.out.println(d);
}
reader.close();
}
catch (Exception e) {
e.printStackTrace();
}
I am getting zero hits for this query. Can anyone tell what I am doing wrong?
jars used:
lucene-queries-4.2.0
lucene-queryparser-6.2.1
lucene-analyzers-common-6.2.0

I made some changes, as follows:
Query query = new PhraseQuery.Builder()
.add(new Term("country", "iraq"))
.add(new Term("country", "russia"))
.setSlop(2)
.build();
and I also changed the type of field used while indexing:
for(BasicDBObject d : dts){
doc.add(new TextField("country",d.get("name").toString(), Field.Store.YES));
}
But can anyone tell me the difference between StringField and TextField while indexing?

Firstly, never mix Lucene versions. All your jars should be the same version. Upgrade lucene-queries to 6.2.1. In practice you might or might not run into trouble mixing up 6.2.0 and 6.2.1, but you definitely should upgrade lucene-analyzers-common as well.
PhraseQuery doesn't analyze for you; you have to add the terms to it separately. In your example, "iraq russia" is treated as a single term rather than two separate (analyzed) terms.
It should look something like this:
Query query = new PhraseQuery.Builder()
.add(new Term("country", "iraq"))
.add(new Term("country", "russia"))
.build();
If you want something that will analyze for you, you can use the QueryParser:
QueryParser parser = new QueryParser("country", new StandardAnalyzer());
Query query = parser.parse("\"iraq russia\"");
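To make the StringField/TextField difference concrete, here is a toy model in plain Java. It is not Lucene code, and the class and method names are made up for illustration. It assumes that a StringField indexes the whole value as one term, that a TextField is split into lower-cased word terms, and that a phrase with slop 0 needs its terms at consecutive positions in the same order. Note that the indexed text is "russia iraq" while the phrase is "iraq russia"; Lucene's PhraseQuery documentation says that re-ordering two words needs a slop of at least 2, which is what setSlop(2) in the question's edit provides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model only: none of this is Lucene API; all names are hypothetical.
class PhraseToy {

    // Unanalyzed field (the StringField case): the whole value is one term.
    static List<String> indexAsSingleTerm(String value) {
        return Collections.singletonList(value);
    }

    // Analyzed field (the TextField case): lower-case and split on whitespace.
    static List<String> indexAsTokens(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Exact phrase (slop 0): query terms must occur consecutively and in order.
    static boolean matchesExactPhrase(List<String> indexed, List<String> phrase) {
        return !phrase.isEmpty() && Collections.indexOfSubList(indexed, phrase) >= 0;
    }

    public static void main(String[] args) {
        List<String> asString = indexAsSingleTerm("russia iraq");
        List<String> asText = indexAsTokens("russia iraq");

        // The original query: one term "iraq russia" against an unanalyzed field.
        System.out.println(matchesExactPhrase(asString, Arrays.asList("iraq russia")));
        // Separate terms against the unanalyzed field: no single-word terms exist.
        System.out.println(matchesExactPhrase(asString, Arrays.asList("iraq", "russia")));
        // Separate terms, analyzed field, but reversed order: needs slop.
        System.out.println(matchesExactPhrase(asText, Arrays.asList("iraq", "russia")));
        // Separate terms in the indexed order: matches.
        System.out.println(matchesExactPhrase(asText, Arrays.asList("russia", "iraq")));
    }
}
```

Only the last check succeeds in this model, which mirrors why the question needed both the TextField change and a non-zero slop.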

Related

How to retrieve the fields and respective values from a single document search using dotnet lucene?

I have a |-delimited text document with fields such as Scenario Id, Input value and Database value.
I am indexing these values.
doc.Add(new Field("Database value", List.Databasevalue, Field.Store.YES, Field.Index.ANALYZED));
var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_30, "Database value", analyzer);
Console.WriteLine("Text found: {0}" + Environment.NewLine, document.Get("Database value"));
Scenario Id|Input value|Database value
1|Akshay|Akshay Kumar
2|Akshath|Akshath T
3|Paul|John paul
4|Abraham|Abraham Joseph
5|Morris|Morris Johnson
Since my input is only one document, I don't care about the doc number. If there is any match, I need to retrieve the corresponding database value from the input file along with its score. How do I achieve this? There is a Get function in Document, but I don't know how it works. Can someone help me out, as there are limited resources available for .NET Lucene?

I'm still a bit unclear about what you need, but here is some sample code (in the form of an xUnit test) for indexing docs, performing a search and then reading a document back in Lucene.NET 4.8.
[Fact]
public void StandardAnalyzerExample() {
Directory indexDir = new RAMDirectory();
Analyzer standardAnalyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
IndexWriterConfig indexConfig = new IndexWriterConfig(LuceneVersion.LUCENE_48, standardAnalyzer);
indexConfig.UseCompoundFile = true;
IndexWriter writer = new IndexWriter(indexDir, indexConfig);
SearcherManager searcherManager = new SearcherManager(writer, applyAllDeletes: true, new SearchWarmer());
Document doc = new Document();
doc.Add(new StringField("examplePrimaryKey", "001", Field.Store.YES));
doc.Add(new TextField("exampleField", "Unique gifts are great gifts.", Field.Store.YES));
writer.AddDocument(doc);
doc = new Document();
doc.Add(new StringField("examplePrimaryKey", "002", Field.Store.YES));
doc.Add(new TextField("exampleField", "Everyone is gifted.", Field.Store.YES));
writer.AddDocument(doc);
doc = new Document();
doc.Add(new StringField("examplePrimaryKey", "003", Field.Store.YES));
doc.Add(new TextField("exampleField", "Gifts are meant to be shared.", Field.Store.YES));
writer.AddDocument(doc);
writer.Commit();
searcherManager.MaybeRefreshBlocking();
IndexSearcher indexSearcher = searcherManager.Acquire();
try {
QueryParser parser = new QueryParser(LuceneVersion.LUCENE_48, "exampleField", standardAnalyzer);
Query query = parser.Parse("everyone");
TopDocs topDocs = indexSearcher.Search(query, int.MaxValue);
int numMatchingDocs = topDocs.ScoreDocs.Length;
Assert.Equal(1, numMatchingDocs);
Document docRead = indexSearcher.Doc(topDocs.ScoreDocs[0].Doc); //this is how you read back a doc
string primaryKey = docRead.Get("examplePrimaryKey");
Assert.Equal("002", primaryKey);
} finally {
searcherManager.Release(indexSearcher);
}
}
}

How to compare two collections and archive documents which are not common

I have two collections, for example CollectionA and CollectionB; both have a common field, hostname.
Collection A :
{
"hostname": "vm01",
"id": "1",
"status": "online",
}
Collection B
{
"hostname": "vm01",
"id": "string",
"installedversion": "string",
}
{
"hostname": "vm02",
"id": "string",
"installedversion": "string",
}
What I want to achieve, when I receive a POST message for Collection B:
I want to check whether the record exists in Collection B based on hostname and, if so, update all the values; if not, insert the new record (I have read that this can be achieved with upsert, but I am still working out how).
I want to check whether the hostname is present in Collection A; if not, move the record from Collection B to another collection, Collection C (as archived records). In the example above, the hostname=vm02 record from Collection B should be moved to Collection C.
How can I achieve this using Spring Boot and MongoDB? Any help is appreciated. The code I currently use to save Collection B is below; I want to update it to achieve the result described above.
public RscInstalltionStatusDTO save(RscInstalltionStatusDTO rscInstalltionStatusDTO) {
log.debug("Request to save RscInstalltionStatus : {}", rscInstalltionStatusDTO);
RscInstalltionStatus rscInstalltionStatus = rscInstalltionStatusMapper.toEntity(rscInstalltionStatusDTO);
rscInstalltionStatus = rscInstalltionStatusRepository.save(rscInstalltionStatus);
return rscInstalltionStatusMapper.toDto(rscInstalltionStatus);
}
Update 1: The code below works as I expected, but I think there should be a better way to do this.
public RscInstalltionStatusDTO save(RscInstalltionStatusDTO rscInstalltionStatusDTO) {
log.debug("Request to save RscInstalltionStatus : {}", rscInstalltionStatusDTO);
RscInstalltionStatus rscInstalltionStatus = rscInstalltionStatusMapper.toEntity(rscInstalltionStatusDTO);
System.out.print(rscInstalltionStatus.getHostname());
Query query = new Query(Criteria.where("hostname").is(rscInstalltionStatus.getHostname()));
Update update = new Update();
update.set("configdownload",rscInstalltionStatus.getConfigdownload());
update.set("rscpkgdownload",rscInstalltionStatus.getRscpkgdownload());
update.set("configextraction",rscInstalltionStatus.getConfigextraction());
update.set("rscpkgextraction",rscInstalltionStatus.getRscpkgextraction());
update.set("rscstartup",rscInstalltionStatus.getRscstartup());
update.set("installedversion",rscInstalltionStatus.getInstalledversion());
mongoTemplate.upsert(query, update,RscInstalltionStatus.class);
rscInstalltionStatus = rscInstalltionStatusRepository.findByHostname(rscInstalltionStatus.getHostname());
return rscInstalltionStatusMapper.toDto(rscInstalltionStatus);
}
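For readers unsure what the upsert above does, here is a small in-memory sketch in plain Java. It is not Spring Data or driver code; the class and method names are invented, and a map keyed by hostname stands in for the collection. It assumes the usual behaviour of an upsert with $set: if a record matches the query its fields are overwritten, otherwise a new record is created from the query's equality field plus the fields being set.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory stand-in for a collection; names are hypothetical, not a MongoDB API.
class UpsertToy {
    private final Map<String, Map<String, Object>> byHostname = new LinkedHashMap<>();

    // Returns true when a new record was inserted, false when one was updated.
    boolean upsert(String hostname, Map<String, Object> fieldsToSet) {
        Map<String, Object> existing = byHostname.get(hostname);
        if (existing == null) {
            // No match: build the record from the query field plus the set fields.
            Map<String, Object> created = new LinkedHashMap<>();
            created.put("hostname", hostname);
            created.putAll(fieldsToSet);
            byHostname.put(hostname, created);
            return true;
        }
        // Match: overwrite only the given fields, keep everything else.
        existing.putAll(fieldsToSet);
        return false;
    }

    Map<String, Object> find(String hostname) {
        return byHostname.get(hostname);
    }

    int size() {
        return byHostname.size();
    }

    public static void main(String[] args) {
        UpsertToy collectionB = new UpsertToy();
        System.out.println(collectionB.upsert("vm01", Collections.<String, Object>singletonMap("installedversion", "1.0")));
        System.out.println(collectionB.upsert("vm01", Collections.<String, Object>singletonMap("installedversion", "1.1")));
        System.out.println(collectionB.find("vm01"));
    }
}
```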
Update 2: With the code below I am able to move the records to another collection.
String query = "{$lookup:{ from: \"vmdetails\",let: {rschostname: \"$hostname\"},pipeline:[{$match:{$expr:{$ne :[\"$hostname\",\"$$rschostname\"]}}}],as: \"rscInstall\"}},{$unwind:\"$rscInstall\"},{$project:{\"_id\":0,\"rscInstall\":0}}";
AggregationOperation rscInstalltionStatusTypedAggregation = new CustomProjectAggregationOperation(query);
LookupOperation lookupOperation = LookupOperation.newLookup().from("vmdetails").localField("hostname").foreignField("hostname").as("rscInstall");
UnwindOperation unwindOperation = Aggregation.unwind("$rscInstall");
ProjectionOperation projectionOperation = Aggregation.project("_id","rscInstall").andExclude("_id","rscInstall");
OutOperation outOperation = Aggregation.out("RscInstallArchive");
Aggregation aggregation = Aggregation.newAggregation(rscInstalltionStatusTypedAggregation,unwindOperation,projectionOperation,outOperation);
List<BasicDBObject> results = mongoTemplate.aggregate(aggregation,"rsc_installtion_status",BasicDBObject.class).getMappedResults();
The issue I have here is that it returns multiple records.

I found a solution. There may be better ones, but this one worked for me.
Create a class CustomProjectAggregationOperation (found in SO answers and extended to match my needs):
public class CustomProjectAggregationOperation implements AggregationOperation {
private String jsonOperation;
public CustomProjectAggregationOperation(String jsonOperation) {
this.jsonOperation = jsonOperation;
}
@Override
public Document toDocument(AggregationOperationContext aggregationOperationContext) {
return aggregationOperationContext.getMappedObject(Document.parse(jsonOperation));
}
}
String lookupquery = "{$lookup :{from:\"vmdetails\",localField:\"hostname\",foreignField:\"hostname\",as:\"rscinstall\"}}";
String matchquery = "{ $match: { \"rscinstall\": { $eq: [] } }}";
String projectquery = "{$project:{\"rscinstall\":0}}";
AggregationOperation lookupOpertaion = new CustomProjectAggregationOperation(lookupquery);
AggregationOperation matchOperation = new CustomProjectAggregationOperation(matchquery);
AggregationOperation projectOperation = new CustomProjectAggregationOperation(projectquery);
Aggregation aggregation = Aggregation.newAggregation(lookupOpertaion, matchOperation, projectOperation);
ArrayList<Document> results1 = (ArrayList<Document>) mongoTemplate.aggregate(aggregation, "rsc_installtion_status", Document.class).getRawResults().get("result");
// System.out.println(results1);
for (Document doc : results1) {
// System.out.print(doc.get("_id").toString());
mongoTemplate.insert(doc, "RscInstallArchive");
delete(doc.get("_id").toString());
}
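The idea behind the pipeline above ($lookup on hostname, keep only records whose joined array is empty, then insert into the archive and delete from the source) can be sketched in plain Java with in-memory lists. This is only an illustration of the logic, not MongoDB or Spring code; the class, the helper names and the list-of-maps representation are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: lists of maps stand in for the three collections.
class ArchiveToy {

    // Hypothetical helper that builds a record with just a hostname.
    static Map<String, Object> rec(String hostname) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("hostname", hostname);
        return r;
    }

    // Records of B whose hostname has no counterpart in A
    // (what the lookup followed by the empty-array match selects).
    static List<Map<String, Object>> missingFromA(List<Map<String, Object>> a,
                                                  List<Map<String, Object>> b) {
        Set<Object> hostsInA = new HashSet<>();
        for (Map<String, Object> r : a) {
            hostsInA.add(r.get("hostname"));
        }
        List<Map<String, Object>> orphans = new ArrayList<>();
        for (Map<String, Object> r : b) {
            if (!hostsInA.contains(r.get("hostname"))) {
                orphans.add(r);
            }
        }
        return orphans;
    }

    // Moves the orphaned records from B into the archive list.
    static void archiveOrphans(List<Map<String, Object>> a,
                               List<Map<String, Object>> b,
                               List<Map<String, Object>> archive) {
        List<Map<String, Object>> orphans = missingFromA(a, b);
        archive.addAll(orphans);
        b.removeAll(orphans);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> a = new ArrayList<>(Arrays.asList(rec("vm01")));
        List<Map<String, Object>> b = new ArrayList<>(Arrays.asList(rec("vm01"), rec("vm02")));
        List<Map<String, Object>> c = new ArrayList<>();
        archiveOrphans(a, b, c);
        System.out.println("B: " + b);
        System.out.println("C: " + c);
    }
}
```

With the sample data from the question, vm02 ends up in the archive list and only vm01 remains in B.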

How to write multiple group by id fields in Mongodb java driver

In the below query
{ $group : {
_id : { success:'$success', responseCode:'$responseCode', label:'$label'},
max_timeStamp : { $timeStamp : 1 },
count_responseCode : { $sum : 1 },
avg_value : { $sum : "$value" },
count_success : { $sum : 1 }
}}
How can _id : { success:'$success', responseCode:'$responseCode', label:'$label'} be translated for use with the Java MongoDB driver?
I tried
BasicDBList list = new BasicDBList();
list.add(new BasicDBObject("success", "$success"));
list.add(new BasicDBObject("responseCode", "$responseCode"));
list.add(new BasicDBObject("label", "$label"));
AggregationOutput output = collection.aggregate(match, project, group);
and
Multi-dimension array
String [][] muitiGroupBy = {{"success", "$success"},{"responseCode", "$responseCode"},{"label", "$label"}};
etc..
But I always get a result like this:
"_id" : [ { "success" : "$success"} , { "responseCode" : "$responseCode"}]
If I use only one field it works.
DBObject groupFields = new BasicDBObject( "_id", new BasicDBObject("success", "$success"));

I had a similar need and titogeo's answer from 2013 led me in the right direction after many failed attempts to translate my aggregation operation into something the Java client could handle. This is what I used:
MongoCollection<Document> myCollection = myDB.getCollection("myCollection");
Map<String, Object> multiIdMap = new HashMap<String, Object>();
multiIdMap.put("groupField1", "$groupField1");
multiIdMap.put("groupField2", "$groupField2");
Document groupFields = new Document(multiIdMap);
AggregateIterable<Document> aggregate = myCollection.aggregate(Arrays.asList(
Aggregates.group(groupFields,
Accumulators.last("lastDate", "$dateCreated"),
Accumulators.last("lastNumAvail", "$availableUnits")
)
));
I got back exactly what I needed to match the result from this:
db.myCollection.aggregate([
{"$group":{ "_id":{
groupField1: "$groupField1",
groupField2: "$groupField2"},
lastDate:
{"$last":"$dateCreated"},
lastNumAvail:
{"$last":"$availableUnits"}
}
}
]);

We figured out how. It can be achieved like this:
Map<String, Object> dbObjIdMap = new HashMap<String, Object>();
dbObjIdMap.put("success", "$success");
dbObjIdMap.put("responseCode", "$responseCode");
dbObjIdMap.put("label", "$label");
DBObject groupFields = new BasicDBObject( "_id", new BasicDBObject(dbObjIdMap));
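The difference between the failed attempt and the working one is the shape of the _id value: a list of one-key documents serializes as an array, while a single map holding all keys serializes as one sub-document. The sketch below shows only that shape difference with plain java.util collections; it does not use the driver, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Shape illustration only; no MongoDB driver classes are involved.
class GroupIdToy {

    // One sub-document holding every key: the shape of a compound _id.
    static Map<String, Object> compoundId(String... fields) {
        Map<String, Object> id = new LinkedHashMap<>();
        for (String field : fields) {
            id.put(field, "$" + field);
        }
        return id;
    }

    // A list of one-key documents: this is an array, like the question's attempt.
    static List<Map<String, Object>> listOfSingleKeyDocs(String... fields) {
        List<Map<String, Object>> list = new ArrayList<>();
        for (String field : fields) {
            Map<String, Object> one = new LinkedHashMap<>();
            one.put(field, "$" + field);
            list.add(one);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(compoundId("success", "responseCode", "label"));
        System.out.println(listOfSingleKeyDocs("success", "responseCode", "label"));
    }
}
```

A LinkedHashMap is used here so the keys keep their insertion order; a plain HashMap, as in the answer above, makes no ordering promise.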

I could achieve this through this code (grails code and mongo-java-driver-3.2):
DBObject groupFields = new BasicDBObject()
groupFields.put('success', "\$success")
groupFields.put('responseCode', "\$responseCode")
groupFields.put('label', "\$label")
def result = collection.aggregate(Arrays.asList(Aggregates.group(groupFields, []))).iterator()

Inserting a new document into an array of documents

I want to add a new document to the following document having an outer key "User"
{
name:himani,
User:[
{
_id:e25ffgf627627,
Name:User1
},
{
_id:fri2i2jhjh9098,
Name:User2
}
]
};
Below is my code in which I am trying to add a new document to already existing document.
My code is:
var server = MongoServer.Create("mongodb://username:password@localhost:27017/?safe=true");
SafeMode mode = new SafeMode(true);
SafeModeResult result = new SafeModeResult();
var db = server.GetDatabase("himani");
var coll = db.GetCollection("test");
BsonDocument document = new BsonDocument();
document.Add("name", "himani");
result = coll.Insert(document, mode);
BsonDocument nested = new BsonDocument();
nested.Add("1", "heena").Add("2", "divya");
BsonArray a = new BsonArray();
a.Add(2);
a.Add(5);
nested.Add("values", a);
document["3"] = new BsonArray().Add(BsonValue.Create(nested));
coll.Save(document);
var query = Query.And(
Query.EQ("name", "himani"),
Query.EQ("3.1", "heena")
);
var match = coll.FindOne(query);
var update = Update.AddToSet("3", new BsonDocument {{ "count", "2" }});
coll.Update(query, update);
I want to add a new document to the User array. I am trying to do this with the code above, but it's not working. Please tell me the right way of doing it.

I don't understand your document structure at all... and the only "user" array I could find in here was a field called "3". Your code does in fact work and appends a document into the "3" array. The below is the result after running your code. Perhaps you could be more clear as to what you want your document to look like after you have "appended" a user.
{
"_id":ObjectId("4fa7d965ce48f3216c52c6c7"),
"name":"himani",
"3":[
{
"1":"heena",
"2":"divya",
"values":[ 2, 5 ]
},
{
"count":"2"
}
]
}

Lucene 'join' how-to? part II

Part I here...
Requirement:
search by multiple values in multiple fields AND Where Bar.Id == argBar.Id
var parser = new MultiFieldQueryParser
(new[] { "Name", "Title" }, new SimpleAnalyzer());
parser.???(string.Format("Bar.Id:{0}",argBar.Id)); // o_0
var query = Session.CreateFullTextQuery
(parser.Parse(searchValue), new[] { typeof(Foo) });
Found this:
Query searchQuery = MultiFieldQueryParser.Parse
(term, new[] {"title", "description"},
new[] {BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD},
new StandardAnalyzer());
So, theoretically, I should be able to add argBar.Id and BooleanClause.Occur.MUST, but there isn't such an overload in Lucene.Net 2.4.0.2.

var bq = new BooleanQuery();
bq.Add(parser.Parse(searchValue), BooleanClause.Occur.SHOULD);
bq.Add(new TermQuery
(new Term("Bar.Id", argBar.Id.ToString())), BooleanClause.Occur.MUST);
var r = Session.CreateFullTextQuery(bq, new[] {typeof(Foo)});
//victory
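One thing worth checking with this combination: as far as I know, in a Lucene BooleanQuery a SHOULD clause only has to match when there are no MUST clauses; once a MUST clause is present, SHOULD clauses become optional and mainly affect scoring. If the search terms are meant to be required as well, both clauses would need to be MUST. The toy model below, in plain Java with invented names and strings standing in for documents, illustrates that rule; it is not Lucene or Lucene.Net code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of MUST/SHOULD matching; names are hypothetical, not Lucene API.
class BooleanToy<T> {
    private final List<Predicate<T>> must = new ArrayList<>();
    private final List<Predicate<T>> should = new ArrayList<>();

    BooleanToy<T> must(Predicate<T> clause) {
        must.add(clause);
        return this;
    }

    BooleanToy<T> should(Predicate<T> clause) {
        should.add(clause);
        return this;
    }

    boolean matches(T doc) {
        // Every MUST clause has to match.
        for (Predicate<T> clause : must) {
            if (!clause.test(doc)) {
                return false;
            }
        }
        // With at least one MUST clause, SHOULD clauses are optional.
        if (!must.isEmpty()) {
            return true;
        }
        // With no MUST clauses, at least one SHOULD clause has to match.
        for (Predicate<T> clause : should) {
            if (clause.test(doc)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BooleanToy<String> mixed = new BooleanToy<String>()
                .should(d -> d.contains("title:gift"))
                .must(d -> d.contains("bar:7"));
        BooleanToy<String> strict = new BooleanToy<String>()
                .must(d -> d.contains("title:gift"))
                .must(d -> d.contains("bar:7"));
        String doc = "bar:7 title:other";
        System.out.println(mixed.matches(doc));
        System.out.println(strict.matches(doc));
    }
}
```

In this model the mixed query accepts a document with the right Bar.Id even though the search text does not match, while the strict one rejects it.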
