How to change the default Similarity for indexing and searching in Lucene.NET 3.0.3

I am new to Lucene and I am trying to change the default similarity to BM25.
I am using the Lucene.NET library (version 3.0.3) in a WPF project (.NET 4.8).
For instance, for IndexSearcher I would expect something like this:
var indexReader = IndexReader.Open(directory, true);
var searcher = new IndexSearcher(indexReader);
searcher.Similarity = Similarity.BM25;
Also, on the indexing side I cannot find where to change the default Similarity.

Hello everyone; for anyone who runs into the same question in the future: it seems that BM25 similarity is not included in 3.0.3, but it is included in Lucene.NET 4.8. Hence downloading the beta version of 4.8 did the trick.
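For completeness, here is roughly what this looks like against the Lucene.NET 4.8 beta API (a sketch; BM25Similarity lives in Lucene.Net.Search.Similarities, and exact property names may differ slightly between beta releases):

using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Search.Similarities;
using Lucene.Net.Store;
using Lucene.Net.Util;

var dir = FSDirectory.Open("index");
var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);

// Index side: the writer must use BM25 so norms are encoded for it.
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
config.Similarity = new BM25Similarity();
using (var writer = new IndexWriter(dir, config))
{
    // ... add documents ...
}

// Search side: the searcher must use the same similarity for consistent scoring.
using (var reader = DirectoryReader.Open(dir))
{
    var searcher = new IndexSearcher(reader);
    searcher.Similarity = new BM25Similarity();
    // ... run queries ...
}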

Related

Handling multiple document versions in a Mongo collection

{Hello. I'm not completely satisfied with the title, Mods please help amend it if necessary}
We are trying to come up with ways to implement MongoDB on the back end of our project. We have to address a couple of concerns that were raised, listed below. Some input from experts in the field would be really helpful.
Removing or adding entirely new fields in the documents, given early-development changes --> how best can this be accommodated?
As an example, suppose my collection contains about 1000 records and there is a field that contains 'Address' data. Owing to operational changes, we need to add to (or replace) the 'Address' field with an array of 'Street', 'POBOX' etc. and populate them with certain default values. How best can this be accommodated?
A specific scenario wherein not all the devices we run will be updated to the latest version. This means that the new 'fields' added in the DB would essentially be irrelevant to the devices running the older version of the app. --> How best can this scenario be dealt with?
As an example, let us assume that some devices run an earlier version of the app which only looks for 'Address' as a field. Devices that are updated to the latest app will need to refer to the 'Street' and 'POBox' fields instead of the address. How best can this be handled from Mongo's perspective?
In simple words:
As your development progresses, the document shape can change as necessary.
Depending on the type of change, a structure update can be done with a single update statement (see the sketch below),
or sometimes you will need to use the aggregation framework and save the results into a new collection.
For backward compatibility with the app version in use on a device, a document can contain both versions of the fields, so an older version of the application can still be used with the newer schema.
This can lead to another problem: what to do when a document is updated? The app could be set up to read from the newer schema but not write to it (if possible).
If there is a possibility to use a web API to communicate between the app and Mongo, you can do all migration on the fly, as the API will be aware of the changes.
Any comments welcome!
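To make the "single update statement" case concrete, the Address example above could be migrated once in the mongo shell; the new fields get default values while 'Address' is kept for devices on the older app version (the collection name and defaults here are made up):

db.customers.update(
    { Street: { $exists: false } },             // only documents not yet migrated
    { $set: { Street: "N/A", POBOX: "N/A" } },  // add the new fields with defaults
    { multi: true }                             // apply to every matching document
)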

analyzed for search and not_analyzed for visuals in Kibana

I have a problem with analyzed vs. not_analyzed string fields, because I need "analyzed" data for searches and "not_analyzed" data when I do a top-N report, for example with a "vertical line chart".
Is it possible to use a filter or an exclude/include pattern to keep the full, unanalyzed string in visuals?
I'm using version 4.1.1 of Kibana.
Thanks a lot!
Elasticsearch distinguishes between an index analyzer and a search analyzer, and you can configure a different analyzer for each.
This link might give you a solution: https://www.elastic.co/guide/en/elasticsearch/guide/current/_controlling_analysis.html
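That said, for the Kibana use case the usual pattern is a multi-field mapping: the main field stays analyzed for search, and a not_analyzed sub-field is used for top-N aggregations in visualizations. A sketch in the Elasticsearch 1.x mapping syntax of that era (the index, type, and field names are made up):

curl -XPUT 'localhost:9200/myindex' -d '{
  "mappings": {
    "mytype": {
      "properties": {
        "message": {
          "type": "string",
          "fields": {
            "raw": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}'

In Kibana you would then search on message but build the visualization on message.raw.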

How to avoid that many casts with the MongoDB Java API

Hi, I'm working with the Java API of MongoDB.
I very often have to cast like this:
BasicDBList points = ((BasicDBList) ((BasicDBObject) currentObject.get("poly")).get("coordinates"));
which is not fun. Am I missing something, or is that just the way to do it?
I think BasicDBObject should have methods like
BasicDBObject getBasicDBObject(String key)
BasicDBList getBasicDBList(String key)
Unfortunately, the current Java driver is not perfect, and it is difficult to avoid the casting you mention. However, the Java driver team is working on the next version, and as far as I understand it will be completely rewritten.
At one of the MongoDB meetups I heard that the new version will use an asynchronous API, similar to the Node driver. I guess we need to sit tight and wait for the next major release.
Alternatives are (from Mongo Java drivers & mappers performances):
async Java driver
a library built on top of a driver, e.g. Morphia, Jongo, see POJOMappers
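In the meantime, you can at least centralize the casting in a small helper along the lines the question suggests (a sketch; the class and method names are made up):

import com.mongodb.BasicDBList;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;

// Hypothetical helper that hides the repeated casts in one place.
public final class DBObjects {
    private DBObjects() {}

    public static BasicDBObject getObject(DBObject parent, String key) {
        return (BasicDBObject) parent.get(key);
    }

    public static BasicDBList getList(DBObject parent, String key) {
        return (BasicDBList) parent.get(key);
    }
}

The example from the question then becomes:
BasicDBList points = DBObjects.getList(DBObjects.getObject(currentObject, "poly"), "coordinates");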

q.setOrdering(...) extremely slow in DataNucleus + MongoDB

I have the latest stable DataNucleus (3.0.1) with the MongoDB datastore and the JDO implementation.
The collection has around 1M documents.
The "id" field is indexed.
This code takes several minutes to execute:
Query q = pm.newQuery(CellMeasurement.class);
q.setOrdering("id descending");
q.setRange(0, count);
Collection<CellMeasurement> result = (Collection<CellMeasurement>)q.execute();
If I remove the q.setOrdering(...), everything is OK; for count=1000 it takes around a second to load.
It looks like DN does the reordering in memory; does that make sense? MongoDB itself sorts instantly on this indexed field, and the API supports ordering.
Any idea? Thanks.
Looking in the log (for any application) would obviously reveal much; in this case, which query is actually performed. It would tell you easily enough that ordering is not currently implemented to execute in-datastore.
Obviously, anybody can contribute to a codebase that has been open source since its first release. org.datanucleus.store.mongodb.query.QueryToMongoDBMapper, method compileOrdering, is where you need to implement that; then attach a patch to a JIRA issue when you're done. Thx
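Until in-datastore ordering is implemented there, one workaround is to drop down to the native driver for this particular query, so the sort and limit run inside MongoDB where the index on "id" can be used (a sketch against the 2.x Java driver of that era; the collection name is assumed to match the class name):

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;

DBCollection coll = db.getCollection("CellMeasurement"); // 'db' obtained from your Mongo connection
DBCursor cursor = coll.find()
        .sort(new BasicDBObject("id", -1)) // descending on the indexed field
        .limit(count);                     // equivalent of setRange(0, count)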

Magento get language collection

There seems to be a possibility in Magento to get a language collection, namely via Mage::getSingleton('adminhtml/system_config_source_language'), which I would like to use. However, in my version of Magento (both Enterprise 1.10 and Community 1.4) it results in an error, because it expects to get its data from a nonexistent table called core_language.
Has anyone found a good solution or alternative to this? Or maybe have used this and has a table dump for core_language?
Magento is built on Zend, so you can use
Zend_Locale::getTranslationList("language")
which returns an array of language names keyed by their abbreviation.
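For instance, to build the kind of option array a Magento source model usually returns (a sketch; the variable names are made up):

$languages = Zend_Locale::getTranslationList('language');
$options = array();
foreach ($languages as $code => $name) {
    // e.g. $code = 'en', $name = 'English'
    $options[] = array('value' => $code, 'label' => $name);
}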
Hmm, I looked through the installation files and apparently the table is created initially but dropped from version 0.7.5, so it's probably deprecated code. The class file doesn't mention this though, so quite obscure.