How to retrieve the fields and respective values from a single document search using dotnet lucene? - lucene.net

I have a pipe-delimited (|) text file with fields such as Scenario Id, Input value and Database value.
I am indexing these values.
doc.Add(new Field("Database value", List.Databasevalue, Field.Store.YES, Field.Index.ANALYZED));
var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_30, "Database value", analyzer);
Console.WriteLine("Text found: {0}" + Environment.NewLine, document.Get("Database value"));
Scenario Id|Input value|Database value
1|Akshay|Akshay Kumar
2|Akshath|Akshath T
3|Paul|John paul
4|Abraham|Abraham Joseph
5|Morris|Morris Johnson
Since my input is only a single file, I don't care about the document number. If there is a match, I need to retrieve the corresponding database value from the input file along with its score. How do I achieve this? There is a Get function on Document, but I don't know how it works. Can someone help me out? There are limited resources available for Lucene.NET.

I'm still a bit unclear about what you need, but here is some sample code (in the form of an xUnit test) for indexing docs, performing a search and then reading a document back in Lucene.NET 4.8.
[Fact]
public void StandardAnalyzerExample() {
Directory indexDir = new RAMDirectory();
Analyzer standardAnalyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
IndexWriterConfig indexConfig = new IndexWriterConfig(LuceneVersion.LUCENE_48, standardAnalyzer);
indexConfig.UseCompoundFile = true;
IndexWriter writer = new IndexWriter(indexDir, indexConfig);
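// SearchWarmer is not shown here; it is presumably a custom SearcherFactory subclass. Passing null should fall back to the default factory.
SearcherManager searcherManager = new SearcherManager(writer, applyAllDeletes: true, new SearchWarmer());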
Document doc = new Document();
doc.Add(new StringField("examplePrimaryKey", "001", Field.Store.YES));
doc.Add(new TextField("exampleField", "Unique gifts are great gifts.", Field.Store.YES));
writer.AddDocument(doc);
doc = new Document();
doc.Add(new StringField("examplePrimaryKey", "002", Field.Store.YES));
doc.Add(new TextField("exampleField", "Everyone is gifted.", Field.Store.YES));
writer.AddDocument(doc);
doc = new Document();
doc.Add(new StringField("examplePrimaryKey", "003", Field.Store.YES));
doc.Add(new TextField("exampleField", "Gifts are meant to be shared.", Field.Store.YES));
writer.AddDocument(doc);
writer.Commit();
searcherManager.MaybeRefreshBlocking();
IndexSearcher indexSearcher = searcherManager.Acquire();
try {
QueryParser parser = new QueryParser(LuceneVersion.LUCENE_48, "exampleField", standardAnalyzer);
Query query = parser.Parse("everyone");
TopDocs topDocs = indexSearcher.Search(query, int.MaxValue);
int numMatchingDocs = topDocs.ScoreDocs.Length;
Assert.Equal(1, numMatchingDocs);
Document docRead = indexSearcher.Doc(topDocs.ScoreDocs[0].Doc); // this is how you read back a stored document
float score = topDocs.ScoreDocs[0].Score; // relevance score of the same hit
string primaryKey = docRead.Get("examplePrimaryKey");
Assert.Equal("002", primaryKey);
} finally {
searcherManager.Release(indexSearcher);
}
}
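The question starts from a pipe-delimited file, so a step is needed before indexing: turning each row into named fields. The following is a minimal sketch in plain Java, with no Lucene dependency. The class name `PipeDelimitedRows` and the method `parse` are made up for illustration. Each resulting map could then be copied into one Lucene document, one field per column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene or Lucene.NET.
final class PipeDelimitedRows {
    // Treats the first line as the header and returns one column-name -> value map per data row.
    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        // '|' is a regex metacharacter, hence "\\|"; the -1 limit keeps trailing empty columns.
        String[] header = lines.get(0).split("\\|", -1);
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split("\\|", -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < header.length; i++) {
                // Missing cells become empty strings instead of throwing.
                row.put(header[i].trim(), i < cells.length ? cells[i].trim() : "");
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "Scenario Id|Input value|Database value",
                "1|Akshay|Akshay Kumar",
                "3|Paul|John paul");
        for (Map<String, String> row : parse(sample)) {
            System.out.println(row.get("Scenario Id") + " -> " + row.get("Database value"));
        }
    }
}
```

Storing the "Database value" column (Field.Store.YES, as in the question) is what makes it retrievable later through Get on the document returned by the searcher.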

Related

spring boot query to get the maximum value alone from a field in an array of subdocument

{
"_id":"1",
"name":"Elon musk",
"created_by":"alien",
"versions":[
{
"version":1,
"active":true,
"group":"ALL"
},
{
"version":2,
"active":false,
"group":"ALL"
}
]
}
I need a query that returns the maximum value of versions.version, which is 2 here.
val query = Aggregation.newAggregation(
Aggregation.group("version").max("versions.version").as("maximum"),
project("maximum").and("version").previousOperation())
val groupResults = mongoTemplate.aggregate(query, test::class.java, sample::class.java)
for (results in groupResults){
println(results.maximum)
}
I tried the above, but it returns only 1, whereas I expect 2.
Also, is there a query I can use in @Query?
Any help is appreciated.
This pipeline returns only the maximum version value; you can customise the $project stage to return other fields as well.
Arrays.asList(new Document("$match",
new Document("name", "Elon musk")),
new Document("$unwind",
new Document("path", "$versions")),
new Document("$sort",
new Document("versions.version", -1L)),
new Document("$limit", 1L),
new Document("$project",
new Document("version", "$versions.version")
.append("_id", 0L)))
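To make the intent of the pipeline concrete, here is a sketch in plain Java that runs the same steps on an in-memory copy of the sample `versions` array: flatten, sort by `version` descending, keep the first. It does not use the MongoDB driver, and `MaxVersionSketch`, `version` and `maxVersion` are illustrative names only. The original attempt probably returned 1 because it grouped without unwinding the array first, though that is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for the aggregation; no database involved.
final class MaxVersionSketch {
    // Builds one element of the "versions" array from the question's sample document.
    static Map<String, Object> version(int number, boolean active) {
        Map<String, Object> v = new HashMap<>();
        v.put("version", number);
        v.put("active", active);
        v.put("group", "ALL");
        return v;
    }

    // Mirrors $unwind + $sort(-1) + $limit(1) + $project on a list of sub-documents.
    static Optional<Integer> maxVersion(List<Map<String, Object>> versions) {
        return versions.stream()
                .map(v -> (Integer) v.get("version")) // keep only the version number
                .sorted(Comparator.reverseOrder())    // highest first
                .findFirst();                         // empty when there are no versions
    }

    public static void main(String[] args) {
        List<Map<String, Object>> versions = Arrays.asList(version(1, true), version(2, false));
        System.out.println(maxVersion(versions)); // prints Optional[2]
    }
}
```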

MongoDB Java API Query UPDATE include word

I have a problem. I need to prepend a word to the value of the "name" field.
For example:
{name : "Apple inc."}
I would like to add the word "Company":
{name : "Company Apple inc."}
But I can't get it to work. What am I doing wrong?
collectionCOMPANIES.updateMany(
new Document("$where", "true")
,
new Document("$set", new Document("name","Company + $name")
));
If I do this, the stored value becomes the literal string:
{name : "Company + $name"}
Thanks.
******* UPDATE: SOLUTION *******
The final code in Java is:
collectionCOMPANIES.find().forEach(new Block<Document>() {
public void apply(final Document document) {
final Document DocuNew = new Document();
DocuNew.putAll(document);
DocuNew.put("name", "Company " + document.get("name"));
newcollection.insertOne(DocuNew);
}
});
Try this.
db.collectionCOMPANIES.find().snapshot().forEach( function (doc) {
doc.name = "Company " + doc.name;
db.collectionCOMPANIES.save(doc);
});
For this I am using snapshot provided by mongodb. you can read more about it here

Phrase query in Lucene 6.2.0

I have a document like this:
{
"_id" : ObjectId("586b723b4b9a835db416fa26"),
"name" : "test",
"countries" : {
"country" : [
{
"name" : "russia iraq"
},
{
"name" : "USA china"
}
]
}
}
I am trying to retrieve this MongoDB document using a phrase query (Lucene 6.2.0). My code looks as follows:
StandardAnalyzer analyzer = new StandardAnalyzer();
// 1. create the index
Directory index = new RAMDirectory();
IndexWriterConfig config = new IndexWriterConfig(analyzer);
try {
IndexWriter w = new IndexWriter(index, config);
MongoClient client = new MongoClient("localhost", 27017);
DB database = client.getDB("test123");
DBCollection coll = database.getCollection("test1");
//MongoCollection<org.bson.Document> collection = database.getCollection("test1");
DBCursor cursor = coll.find();
System.out.println(cursor);
while (cursor.hasNext()) {
BasicDBObject obj = (BasicDBObject) cursor.next();
Document doc = new Document();
BasicDBObject f = (BasicDBObject) (obj.get("countries"));
List<BasicDBObject> dts = (List<BasicDBObject>)(f.get("country"));
doc.add(new TextField("id",obj.get("_id").toString().toLowerCase(), Field.Store.YES));
doc.add(new StringField("name",obj.get("name").toString(), Field.Store.YES));
doc.add(new StringField("countries",f.toString(), Field.Store.YES));
for(BasicDBObject d : dts){
doc.add(new StringField("country",d.get("name").toString(), Field.Store.YES));
//
}
w.addDocument(doc);
}
w.close();
and my search looks like this:
PhraseQuery query = new PhraseQuery("country", "iraq russia" );
// 3. search
int hitsPerPage = 10;
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
TopDocs docs = searcher.search(query, hitsPerPage);
ScoreDoc[] hits = docs.scoreDocs;
// 4. display results
System.out.println("Found " + hits.length + " hits.");
for(int j=0;j<hits.length;++j) {
int docId = hits[j].doc;
Document d = searcher.doc(docId);
System.out.println(d);
}
reader.close();
}
catch (Exception e) {
e.printStackTrace();
}
I am getting zero hits for this query. Can anyone tell what I am doing wrong?
jars used:
lucene-queries-4.2.0
lucene-queryparser-6.2.1
lucene-analyzers-common-6.2.0
I made some changes, which look like this:
Query query = new PhraseQuery.Builder()
.add(new Term("country", "iraq"))
.add(new Term("country", "russia"))
.setSlop(2)
.build();
I also changed the type of the field used while indexing:
for(BasicDBObject d : dts){
doc.add(new TextField("country",d.get("name").toString(), Field.Store.YES));
}
But can anyone tell me the difference between StringField and TextField while indexing?
Firstly, never mix Lucene versions. All your jars should be the same version. Upgrade lucene-queries to 6.2.1. In practice you might or might not run into trouble mixing up 6.2.0 and 6.2.1, but you definitely should upgrade lucene-analyzers-common as well.
PhraseQuery doesn't analyze for you; you have to add terms to it separately. In your example, "iraq russia" is treated as a single term, rather than two separate (analyzed) terms.
It should look something like this:
Query query = new PhraseQuery.Builder()
.add(new Term("country", "iraq"))
.add(new Term("country", "russia"))
.build();
If you want something that will analyze for you, you can use the QueryParser:
QueryParser parser = new QueryParser("country", new StandardAnalyzer());
Query query = parser.parse("\"iraq russia\"");
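On the StringField versus TextField question: a StringField is indexed verbatim as one term, while a TextField is run through the analyzer and indexed as several terms. The toy sketch below shows that difference in plain Java, without Lucene. The names are hypothetical, and `asAnalyzedTerms` is only a rough stand-in for StandardAnalyzer (lowercase, then split on non-alphanumerics), not its real tokenization rules.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model of how the two field types turn a value into index terms.
final class FieldTokenizationSketch {
    // StringField-like: the whole value becomes exactly one term, unchanged.
    static List<String> asSingleTerm(String value) {
        return Collections.singletonList(value);
    }

    // TextField-like: lowercase and split into word terms (crude analyzer stand-in).
    static List<String> asAnalyzedTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Phrase match with slop 0: the query terms must occur consecutively and in order.
    static boolean matchesExactPhrase(List<String> indexedTerms, List<String> phraseTerms) {
        return Collections.indexOfSubList(indexedTerms, phraseTerms) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(asSingleTerm("russia iraq"));    // prints [russia iraq]
        System.out.println(asAnalyzedTerms("russia iraq")); // prints [russia, iraq]
    }
}
```

In this model the terms "iraq" and "russia" can never match the single term "russia iraq", which is consistent with the zero hits against the StringField. Note also that with slop 0 the order matters: the indexed text is "russia iraq", so a phrase built as "iraq" then "russia" would, as far as I know, still need the setSlop(2) from the edited question to match.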

inserting a new document in the arrray of documents

I want to add a new document to the following document having an outer key "User"
{
name:himani,
User:[
{
_id:e25ffgf627627,
Name:User1
},
{
_id:fri2i2jhjh9098,
Name:User2
}
]
};
Below is my code in which I am trying to add a new document to already existing document.
My code is:
var server = MongoServer.Create("mongodb://username:password@localhost:27017/?safe=true");
SafeMode mode = new SafeMode(true);
SafeModeResult result = new SafeModeResult();
var db = server.GetDatabase("himani");
var coll = db.GetCollection("test");
BsonDocument document = new BsonDocument();
document.Add("name", "himani");
result = coll.Insert(document, mode);
BsonDocument nested = new BsonDocument();
nested.Add("1", "heena").Add("2", "divya");
BsonArray a = new BsonArray();
a.Add(2);
a.Add(5);
nested.Add("values", a);
document["3"] = new BsonArray().Add(BsonValue.Create(nested));
coll.Save(document);
var query = Query.And(
Query.EQ("name", "himani"),
Query.EQ("3.1", "heena")
);
var match = coll.FindOne(query);
var update = Update.AddToSet("3", new BsonDocument {{ "count", "2" }});
coll.Update(query, update);
I want to add a new document to the User array. I am trying to do this with the code above, but it's not working. Please tell me the right way to do it.
I don't understand your document structure at all, and the only "user" array I could find in here was a field called "3". Your code does in fact work and appends a document to the "3" array. Below is the result after running your code. Perhaps you could be clearer about what you want your document to look like after you have "appended" a user.
{
"_id":ObjectId("4fa7d965ce48f3216c52c6c7"),
"name":"himani",
"3":[
{
"1":"heena",
"2":"divya",
"values":[ 2, 5 ]
},
{
"count":"2"
}
]
}

Lucene 'join' how-to? part II

Part I here...
Requirement:
search by multiple values in multiple fields AND Where Bar.Id == argBar.Id
var parser = new MultiFieldQueryParser
(new[] { "Name", "Title" }, new SimpleAnalyzer());
parser.???(string.Format("Bar.Id:{0}",argBar.Id)); // o_0
var query = Session.CreateFullTextQuery
(parser.Parse(searchValue), new[] { typeof(Foo) });
Found this:
Query searchQuery = MultiFieldQueryParser.Parse
(term, new[] {"title", "description"},
new[] {BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD},
new StandardAnalyzer());
So, theoretically, I should be able to add argBar.Id with BooleanClause.Occur.MUST, but there is no such overload in Lucene.Net 2.4.0.2. Building a BooleanQuery by hand works instead:
var bq = new BooleanQuery();
bq.Add(parser.Parse(searchValue), BooleanClause.Occur.SHOULD);
bq.Add(new TermQuery
(new Term("Bar.Id", argBar.Id.ToString())), BooleanClause.Occur.MUST);
var r = Session.CreateFullTextQuery(bq, new[] {typeof(Foo)});
//victory
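One thing worth checking in the query above is how SHOULD and MUST combine. As I understand Lucene's BooleanQuery semantics (with no minimum-should-match set), once a MUST clause is present the SHOULD clauses become optional and only affect scoring. The toy model below expresses that rule in plain Java. It is not Lucene code, the names are hypothetical, and MUST_NOT is left out.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of boolean clause matching; scoring is ignored.
final class OccurSemanticsSketch {
    enum Occur { MUST, SHOULD }

    // One clause, reduced to "did this clause match the document?".
    static final class Clause {
        final Occur occur;
        final boolean matched;

        Clause(Occur occur, boolean matched) {
            this.occur = occur;
            this.matched = matched;
        }
    }

    // Every MUST clause has to match. SHOULD clauses are only required
    // (at least one of them) when there is no MUST clause at all.
    static boolean documentMatches(List<Clause> clauses) {
        boolean hasMust = false;
        boolean anyShouldMatched = false;
        for (Clause c : clauses) {
            if (c.occur == Occur.MUST) {
                hasMust = true;
                if (!c.matched) {
                    return false;
                }
            } else if (c.matched) {
                anyShouldMatched = true;
            }
        }
        return hasMust || anyShouldMatched;
    }

    public static void main(String[] args) {
        // Text clause misses, Bar.Id clause hits: the document is still returned.
        System.out.println(documentMatches(Arrays.asList(
                new Clause(Occur.SHOULD, false), new Clause(Occur.MUST, true)))); // prints true
    }
}
```

If the search text is meant to be required as well, adding the parsed query with MUST instead of SHOULD would be the change to try.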