Not able to query MongoDB.Bson.ObjectId using ElasticSearch Nest Client - mongodb

I have a C# model that represents a collection in MongoDB. Assume it looks something like this:
public class MyModel
{
    public ObjectId Id { get; set; }
    public ObjectId ClaimId { get; set; }
    public string SomeStringProperty { get; set; }
    // ...other props
}
This model has several properties of type ObjectId, which is part of the MongoDB.Bson package.
Problem: When I index my List<MyModel> to my Elasticsearch server using NEST, all ObjectId fields of MyModel look like the following structure in Elasticsearch:
Question: How can I query the data in Elasticsearch when an ObjectId field is in the 'where' clause of the query?
When I have to filter on the string field SomeStringProperty of MyModel, I can do the following, which works well:
var result =
    _elasticClient.Search<MyModel>(x => x
        .Index("mymodels")
        .Query(q => q
            .Term(p => p.SomeStringProperty.Suffix("keyword"), filterValue)
        )
    );
But how can I apply the same filter to an ObjectId field? I tried a few things but nothing matched, and I think that is because the value has been completely changed on the Elasticsearch server. Is there a way to query on this field?
I went through several posts on SO but couldn't find this issue or a similar one. I'm also just a beginner in the Elasticsearch world. Any help or guidance will be appreciated.

Related

Mongodb: unwind to a property with a different name

Using the C# fluent API, I built an aggregate query with lookup and unwind stages.
I join a first collection with the Badge collection, where there is a 1:1 relationship.
I have the following intermediate classes for building a strongly typed query:
public class AfterLookupClass
{
    ...
    public IEnumerable<Badge> badges { get; set; }
}
public class AfterUnwindClass
{
    ...
    public Badge badges { get; set; }
}
The relevant part of the aggregate query is the following:
Lookup<FirstClass, Badge, AfterLookupClass>(
    foreignCollection: BadgeCollection,
    localField: e => e.Codice,
    foreignField: f => f.Codice,
    @as: (AfterLookupClass eo) => eo.badges).
Unwind<AfterLookupClass, AfterUnwindClass>(el => el.badges)
If I change the name of the property in AfterUnwindClass from badges to something else, for example the more appropriate badge (only one is expected), the query no longer works.
Is it possible to unwind to a property with a different name?
You can try using the lookup syntax with a pipeline (also supported by the C# driver), or use a Project stage between Lookup and Unwind.

How can I use OrderBy with an aggregate of a object property with Ardalis Specification?

I am trying to query my PostgreSQL database using EF Core and Ardalis Specification.
For the query I am building, I want to sort the results using OrderBy with an aggregate of a property on a nested object.
I want to sort the list of Clinics so that the Clinic whose Reviews have the highest Grades comes first. Grades are on a scale of 1-5.
So a clinic with two reviews with Grade=5 should come above a clinic with 5 reviews with Grade=2 or Grade=4. To do this I have to calculate the mean value and then order by the highest.
public class Clinic
{
    public int Id { get; set; }
    public ICollection<Review> Reviews { get; set; }
}
public class Review
{
    public int Id { get; set; }
    public decimal Grade { get; set; }
}
My query so far, which doesn't work as intended because it only uses the highest value. Can I insert a mean-value calculation here somehow?
public ClinicFilterPaginatedSpecification()
{
    Query.OrderByDescending(x => x.Reviews.Max(r => r.Grade));
}
Running the query:
var filterSpec = new ClinicFilterSpecification();
var itemsOnPage= await _clinicRepo.ListAsync(filterSpec);
As Ivan Stoev notes in his comment, you should be able to use the .Average() method:
public ClinicFilterPaginatedSpecification()
{
    Query.OrderByDescending(clinic => clinic.Reviews.Average(review => review.Grade));
}
Have you tried this and if so is it working or producing an error?
I should mention that this is not directly related to the Specification package, in the sense that the expression is not altered in any way. Whatever works on EF, should work through specs as well. We're passing the expression as it is.
Now the question is how EF behaves in this case, when you need to aggregate some data from the collections. I think this is optimized in EF Core 5, and Ardalis' suggestion should work. Prior to EF Core 3, this scenario might have involved an explicit Join operation (not quite sure).
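To make the ordering concrete, here is a minimal in-memory sketch using plain LINQ to Objects rather than EF Core or the Specification package. The ClinicOrdering helper and its ByAverageGrade method are hypothetical names introduced only for illustration. It also guards against clinics with no reviews, because in-memory Average() throws on an empty sequence of non-nullable values; how EF Core translates the same expression to SQL is not covered here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Review
{
    public int Id { get; set; }
    public decimal Grade { get; set; }
}

public class Clinic
{
    public int Id { get; set; }
    public ICollection<Review> Reviews { get; set; } = new List<Review>();
}

// Hypothetical helper, not part of Ardalis.Specification or EF Core.
public static class ClinicOrdering
{
    // Orders clinics by mean grade, highest first. Clinics without reviews
    // are given a mean of 0 so Average() never runs on an empty sequence.
    public static List<Clinic> ByAverageGrade(IEnumerable<Clinic> clinics) =>
        clinics
            .OrderByDescending(c => c.Reviews
                .Select(r => r.Grade)
                .DefaultIfEmpty(0m)
                .Average())
            .ToList();
}

public static class Program
{
    public static void Main()
    {
        var clinics = new List<Clinic>
        {
            new Clinic { Id = 1, Reviews = Grades(2, 4, 2, 4, 2) }, // mean 2.8
            new Clinic { Id = 2, Reviews = Grades(5, 5) },          // mean 5
            new Clinic { Id = 3 }                                   // no reviews
        };

        // Clinic 2 comes first, then clinic 1, then the clinic without reviews.
        foreach (var clinic in ClinicOrdering.ByAverageGrade(clinics))
            Console.WriteLine(clinic.Id);
    }

    private static List<Review> Grades(params int[] grades) =>
        grades.Select(g => new Review { Grade = g }).ToList();
}
```

Whether a clinic without reviews should sort last (as here) or be excluded is a domain decision; the 0 default is only one option.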

What are drawbacks of storing Guid as String in MongoDB?

An application persists Guid field in Mongo and it ends up being stored as BinData:
"_id" : new BinData(3, "WBAc3FDBDU+Zh/cBQFPc3Q==")
The advantage in this case is compactness, the disadvantage shows up when one needs to troubleshoot the application. Guids are passed via URLs, and constantly transforming them to BinData when going to Mongo console is a bit painful.
What are drawbacks of storing Guid as string in addition to increase in size? One advantage is ease of troubleshooting:
"_id" : "3c901cac-5b90-4a09-896c-00e4779a9199"
Here is a prototype of a persistent entity in C#:
class Thing
{
    [BsonIgnore]
    public Guid Id { get; set; }
    [BsonId]
    public string DontUseInAppMongoId
    {
        get { return Id.ToString(); }
        set { Id = Guid.Parse(value); }
    }
}
In addition to gregor's answer, using Guids will currently prevent the use of the new Aggregation Framework as it is represented as a binary type. Regardless, you can do what you are wanting in an easier way. This will let the mongodb bson library handle doing the conversions for you.
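using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

public class MyClass
{
    // Stored as a string in MongoDB; the driver converts in both directions.
    [BsonRepresentation(BsonType.String)]
    public Guid Id { get; set; }
}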
The drawback is that MongoDB is optimised for BSON ObjectIds, so using strings as ids will be slightly less efficient. Also, range-based queries on string ids use a lexicographic comparison, which may give different results than you expect. Other than that, you can use strings as ids.
See Optimizing ObjectIDs
http://www.mongodb.org/display/DOCS/Optimizing+Object+IDs
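For the troubleshooting pain described in the question, a small conversion helper can translate between the BinData payload and the Guid seen in URLs. This is a sketch under one assumption: the value is BinData subtype 3 written by the C# driver in its legacy layout, which matches the byte order of Guid.ToByteArray(). Other drivers and the newer subtype 4 use a different byte order. LegacyGuidBinData is a hypothetical name, not a driver type.

```csharp
using System;

// Hypothetical helper for console troubleshooting; not part of the MongoDB driver.
public static class LegacyGuidBinData
{
    // Assumption: base64 payload of BinData subtype 3 in the C# legacy layout,
    // i.e. the same byte order that Guid.ToByteArray() produces.
    public static Guid FromBase64(string base64) =>
        new Guid(Convert.FromBase64String(base64));

    // Inverse direction: the base64 text to paste into BinData(3, "...").
    public static string ToBase64(Guid id) =>
        Convert.ToBase64String(id.ToByteArray());
}

public static class Program
{
    public static void Main()
    {
        // The BinData payload from the question.
        Guid id = LegacyGuidBinData.FromBase64("WBAc3FDBDU+Zh/cBQFPc3Q==");

        Console.WriteLine(id);                        // dc1c1058-c150-4f0d-9987-f7014053dcdd
        Console.WriteLine(id.ToByteArray().Length);   // 16 bytes as binary
        Console.WriteLine(id.ToString().Length);      // 36 characters as a string
        Console.WriteLine(LegacyGuidBinData.ToBase64(id));
    }
}
```

The last two length lines show the size trade-off mentioned above: 16 bytes of binary data versus a 36-character string.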

mongodb insert creation time separated from objectId

I am using mongodb with the official c# driver.
I am using Guids as Id field for my objects. I don't want to introduce a dependency on the mongodb bson classes so I am not using ObjectId in my domain layer.
Is it possible to instruct mongodb to insert a creation timestamp into objects that I insert into the datastore?
Example:
public class Foo
{
    public Guid Id { get; set; }
    public DateTime CreatedOn { get; set; }
}
Using mongodb idGenerators I can get the Guids generated upon insert. I know ObjectId has the timestamp included but as mentioned I wouldn't want my class to look like this:
public class Foo
{
    public ObjectId Id { get; set; }
    public DateTime CreatedOn { get { return Id.CreationTime; } }
}
Is it important that it's the inserted timestamp and not the object's created timestamp? If not then do it in the constructor of the class. Or even better, a base class for your class(es).
public abstract class BaseClass
{
    [BsonId]
    public Guid Id { get; set; }
    public DateTime CreatedOn { get; set; }

    protected BaseClass()
    {
        Id = Guid.NewGuid();          // new Guid() would produce the empty Guid
        CreatedOn = DateTime.UtcNow;  // UtcNow is a static property
    }
}
public class Foo : BaseClass
{
}
Is this something you can use for it?
You can have an _id which is itself a document:
In JSON: { _id : { guid : ...., createdOn : .... }, field1 : ..., field2 : .... }
You just have to modify your idGenerator to produce this.
I recommend, however, that you seriously reconsider using ObjectId.
If you're trying to make your domain layer pure and free of persistence concerns, then it makes sense to populate the creation date yourself in the domain layer, when deciding to create an entity, instead of relying on the database technology to put in a server timestamp.
This makes "creation date" a logical domain concept rather than the DB's concept of "timestamp when first stored in DB". The two can differ e.g. in cases of migrating data (but keeping the timestamp), deferring execution (e.g. in jobs), etc.
It also creates a healthy separation between "physical timestamp" and "logical timestamp" which you can further exploit during testing/mocking (e.g. you could have a test that says "do X, then change the logical time to 2 days in the future, then assert Y").
Finally, it forces you to think of what the creation date means in your domain layer instead of blindly assuming that it will be correct.
All this being said, if you insist on having it in MongoDB you can have a mapping that creates an ObjectID into some kind of hidden field (e.g. an explicitly-implemented interface) at insert time, and extracts its timestamp into the CreationDate field at read time.
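The "logical timestamp" idea above can be sketched with an injectable clock. Everything here except Foo, Id and CreatedOn is a hypothetical name introduced for illustration (IClock, SystemClock, FakeClock), and persistence mapping is deliberately left out.

```csharp
using System;

// Hypothetical abstraction: the domain layer asks this for "now".
public interface IClock
{
    DateTime UtcNow { get; }
}

// Production implementation backed by the system clock.
public sealed class SystemClock : IClock
{
    public DateTime UtcNow => DateTime.UtcNow;
}

// Test double whose time only moves when told to.
public sealed class FakeClock : IClock
{
    public FakeClock(DateTime start) { UtcNow = start; }

    public DateTime UtcNow { get; private set; }

    public void Advance(TimeSpan by) => UtcNow += by;
}

public class Foo
{
    // The creation date is set by the domain, not by the database.
    public Foo(IClock clock)
    {
        Id = Guid.NewGuid();
        CreatedOn = clock.UtcNow;
    }

    public Guid Id { get; }
    public DateTime CreatedOn { get; }
}

public static class Program
{
    public static void Main()
    {
        var clock = new FakeClock(new DateTime(2020, 1, 1, 0, 0, 0, DateTimeKind.Utc));

        var first = new Foo(clock);
        clock.Advance(TimeSpan.FromDays(2)); // "change the logical time to 2 days in the future"
        var second = new Foo(clock);

        Console.WriteLine((second.CreatedOn - first.CreatedOn).TotalDays); // 2
    }
}
```

In production code the same constructor would receive a SystemClock, so the domain class never touches DateTime.UtcNow or any MongoDB type directly.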

Querying a child collection by multiple values in RavenDB

I'm using RavenDB build 371 and I have the following model:
class Product {
    public string Id { get; set; }
    public ProductSpec[] Specs { get; set; }
}
class ProductSpec {
    public string Name { get; set; }
    public string Value { get; set; }
}
I would like to be able to query for products which have a set of specs. When querying by a single spec:
session.Query<Product>()
    .Where(product => product.Specs.Any(spec => spec.Name == "Color" && spec.Value == "Red"))
    .ToList();
The expected results are returned, however when an additional spec predicate is added:
session.Query<Product>()
    .Where(product => product.Specs.Any(spec => spec.Name == "Color" && spec.Value == "Red"))
    .Where(product => product.Specs.Any(spec => spec.Name == "Country" && spec.Value == "US"))
    .ToList();
no results are returned, even though the results of the first query contain products with spec name "Country" and spec value "US". The same outcome is observed when using the LuceneQuery method. This seems to be similar to the issue in this discussion, however I was unable to implement the suggested solution. Specifically, after creating the suggested index, I don't know how to query it.
How can I support this type of query in RavenDB?
EDIT
I still can't query on multiple values on a collection of compound types. Instead, I changed the model so that a spec/value combination is a concatenated string such that the specs collection is an array of strings. This can be queried by multiple values:
class Product {
    public string Id { get; set; }
    public int CategoryId { get; set; }
    public string[] Specs { get; set; }
}
For reference, the original model and query work when using MongoDB with its multikeys index feature. The very surprising problem with MongoDB is that the count() operation is slow for index queries. This type of query is essential for pagination, and although the count can be cached I would like a solution which provides this out of the box.
Also, one other requirement I have is the ability to aggregate spec groups for arbitrary collections of products (for example, to get a collection of all spec/value combinations for products in a given category). In MongoDB this can be achieved using its MapReduce functionality, however the results of a MapReduce operation are static and must be manually updated when the source data changes, whereas RavenDB updates MapReduce indexes automatically in the background.
So, even though declaring MapReduce indexes in RavenDB is more cumbersome than it is in MongoDB IMO, the automatic background updating outweighs the drawbacks by a long shot. I will be looking at CouchDB as its views are also updated automatically, though it appears they are updated on demand, not automatically in the background. I am not sure whether this will be an issue.
I have tried different things, and could not make it work either. The specific query you are trying to execute is resolved to this Lucene query by RavenDB (in version 426):
"{(Name:Color AND Value:Red) AND (Name:Country AND Value:US)}" which explains why you get no result.
After googling on the subject, I found this post: Lucene Query Syntax
Different workarounds are suggested among the answers. Hope this helps. I'm rather curious myself, though, whether this really isn't possible.
As of build 717 you can do this using the new .Intersect() feature implemented by Matt Warren. Take a look here: http://issues.hibernatingrhinos.com/issue/RavenDB-51
I've changed the model a bit and was able to achieve the desired result using the Project method in AbstractIndexCreationTask. This is the (simplified) data model:
public class Product
{
    public string Id { get; set; }
    public int CategoryId { get; set; }
    public int TotalSold { get; set; }
    public Dictionary<string, string> Specs { get; set; }
}
This is the index definition:
public class Products_ByCategoryIdAndSpecs_SortByTotalSold : AbstractIndexCreationTask<Product>
{
    public Products_ByCategoryIdAndSpecs_SortByTotalSold()
    {
        this.Map = products => from product in products
                               select new
                               {
                                   product.CategoryId,
                                   _ = Project(product.Specs, spec =>
                                       new Field("Spec_" + spec.Key, spec.Value, Field.Store.NO, Field.Index.ANALYZED)),
                                   product.TotalSold
                               };
    }
}
Then I can query like so:
var results = session.Advanced.LuceneQuery<Product, Products_ByCategoryIdAndSpecs_SortByTotalSold>()
    .WhereEquals("CategoryId", 15920)
    .AndAlso().WhereEquals("Spec_Class", "3A")
    .AndAlso().WhereEquals("Spec_Finish", "Plain")
    .OrderBy("-TotalSold")
    .ToList();
This will return the products in category "15920" which have a "Class" spec value of "3A" and a "Finish" spec value of "Plain" sorted in descending order by the total units sold.
The key was using the Project method which basically creates fields in the Lucene document for each spec name-value pair.
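The effect of creating one field per spec can be imitated in memory, without RavenDB or Lucene. The sketch below is only an illustration of the idea: SpecFields, Flatten and MatchesAll are hypothetical names, and the comparison is an exact string match, whereas the real index uses Field.Index.ANALYZED and so tokenises the values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the per-spec index fields.
public static class SpecFields
{
    // One field per spec name, using the same "Spec_" prefix as the index definition.
    public static Dictionary<string, string> Flatten(IDictionary<string, string> specs) =>
        specs.ToDictionary(s => "Spec_" + s.Key, s => s.Value);

    // True when every required field exists with the required value,
    // i.e. an AND of equality conditions on independent fields.
    public static bool MatchesAll(
        IDictionary<string, string> fields,
        IDictionary<string, string> required) =>
        required.All(r => fields.TryGetValue(r.Key, out var value) && value == r.Value);
}

public static class Program
{
    public static void Main()
    {
        var plain = SpecFields.Flatten(new Dictionary<string, string>
        {
            ["Class"] = "3A",
            ["Finish"] = "Plain"
        });
        var zinc = SpecFields.Flatten(new Dictionary<string, string>
        {
            ["Class"] = "3A",
            ["Finish"] = "Zinc"
        });

        var query = new Dictionary<string, string>
        {
            ["Spec_Class"] = "3A",
            ["Spec_Finish"] = "Plain"
        };

        Console.WriteLine(SpecFields.MatchesAll(plain, query)); // True
        Console.WriteLine(SpecFields.MatchesAll(zinc, query));  // False
    }
}
```

Because each spec name gets its own field, two conditions on different specs no longer compete for the same Name and Value fields, which is what made the original combined query return nothing.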