Querying a child collection by multiple values in RavenDB

I'm using RavenDB build 371 and I have the following model:
class Product {
public string Id { get; set; }
public ProductSpec[] Specs { get; set; }
}
class ProductSpec {
public string Name { get; set; }
public string Value { get; set; }
}
I would like to be able to query for products which have a set of specs. When querying by a single spec:
session.Query<Product>()
.Where(product => product.Specs.Any(spec => spec.Name == "Color" && spec.Value == "Red"))
.ToList();
The expected results are returned, however when an additional spec predicate is added:
session.Query<Product>()
.Where(product => product.Specs.Any(spec => spec.Name == "Color" && spec.Value == "Red"))
.Where(product => product.Specs.Any(spec => spec.Name == "Country" && spec.Value == "US"))
.ToList();
no results are returned, even though the results returned by the first query contain products with spec name "Country" and spec value "US". The same outcome is observed when using the LuceneQuery method. This seems to be a similar issue to this discussion; however, I was unable to implement the suggested solution. Specifically, after creating the suggested index, I don't know how to query it.
How can I support this type of query in RavenDB?
EDIT
I still can't query on multiple values on a collection of compound types. Instead, I changed the model so that a spec/value combination is a concatenated string such that the specs collection is an array of strings. This can be queried by multiple values:
class Product {
public string Id { get; set; }
public int CategoryId { get; set; }
public string[] Specs { get; set; }
}
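As a rough illustration of the concatenated-string approach, the following in-memory sketch builds "Name:Value" keys and checks that a product carries every required pair. The ":" separator and the helper names ToKey and HasAll are assumptions made for this example; they are not part of the original model or of RavenDB.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatenatedSpecs
{
    // Hypothetical separator: any character that cannot occur in a spec name would do.
    public static string ToKey(string name, string value) => name + ":" + value;

    // True when the Specs array contains every required "Name:Value" key.
    public static bool HasAll(string[] specs, params string[] required) =>
        required.All(r => Array.IndexOf(specs, r) >= 0);
}
```

For example, HasAll(product.Specs, ToKey("Color", "Red"), ToKey("Country", "US")) expresses the two-spec filter from the question against the string array.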
For reference, the original model and query works when using MongoDB with their multikeys index feature. The very surprising problem with MongoDB is that the count() operation is slow for index queries. This type of query is essential for pagination and although count can be cached I would like a solution which provides this out of the box. Also, one other requirement I have is the ability to aggregate spec groups for arbitrary collections of products (for example, to get a collection of all spec/value combinations for products in a given category). In MongoDB this can be achieved using their MapReduce functionality, however the results of a MapReduce operation are static and must be manually updated when the source data changes whereas RavenDB updates MapReduce indexes automatically in the background. So, even though declaring MapReduce indexes in RavenDB is more cumbersome than it is in MongoDB IMO, the automatic background updating outweighs the drawbacks by a long shot. I will be looking at CouchDB as their views are also updated automatically, though it appears they are updated on demand, not automatically in the background, not sure if this will be an issue.

I have tried different things, and could not make it work either. The specific query you are trying to execute is resolved to this Lucene query by RavenDB (in version 426):
"{(Name:Color AND Value:Red) AND (Name:Country AND Value:US)}" which explains why you get no result.
After googling on the subject, I found this post: Lucene Query Syntax
Different workarounds are suggested among the answers. I hope this helps. I'm rather curious myself, though, whether this really isn't possible.

As of build 717 you can do this using the new .Intersect() feature implemented by Matt Warren. Take a look here: http://issues.hibernatingrhinos.com/issue/RavenDB-51
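To see what an intersection buys over a single combined predicate, here is a minimal in-memory sketch in plain LINQ to Objects. It does not call the RavenDB API. It assumes, purely for illustration, that matching happens against flattened entries holding one spec each; the type IntersectSketch, its methods, and the sample data are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectSketch
{
    // Hypothetical sample data: product id mapped to its (Name, Value) specs.
    public static Dictionary<string, (string Name, string Value)[]> SampleProducts() =>
        new Dictionary<string, (string Name, string Value)[]>
        {
            ["products/1"] = new[] { ("Color", "Red"), ("Country", "US") },
            ["products/2"] = new[] { ("Color", "Red"), ("Country", "DE") },
        };

    // One flattened entry per (product, spec) pair. This shape is an assumption.
    public static List<(string ProductId, string Name, string Value)> Flatten(
        Dictionary<string, (string Name, string Value)[]> products) =>
        products
            .SelectMany(p => p.Value.Select(s => (ProductId: p.Key, s.Name, s.Value)))
            .ToList();

    // Both predicates applied to the same entry: an entry holds a single spec,
    // so Name cannot be "Color" and "Country" at once and nothing matches.
    public static List<string> SingleEntryMatch(
        List<(string ProductId, string Name, string Value)> entries) =>
        entries
            .Where(e => e.Name == "Color" && e.Value == "Red"
                     && e.Name == "Country" && e.Value == "US")
            .Select(e => e.ProductId)
            .ToList();

    // Each predicate evaluated on its own, then the product ids are intersected.
    public static List<string> IntersectMatch(
        List<(string ProductId, string Name, string Value)> entries)
    {
        var red = entries.Where(e => e.Name == "Color" && e.Value == "Red").Select(e => e.ProductId);
        var us = entries.Where(e => e.Name == "Country" && e.Value == "US").Select(e => e.ProductId);
        return red.Intersect(us).ToList();
    }
}
```

With the sample data, the combined predicate finds nothing, while the intersection keeps only the product that satisfies both spec filters.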

I've changed the model a bit and was able to achieve the desired result using the Project method in AbstractIndexCreationTask. This is the (simplified) data model:
public class Product
{
public string Id { get; set; }
public int CategoryId { get; set; }
public int TotalSold { get; set; }
public Dictionary<string, string> Specs { get; set; }
}
This is the index definition:
public class Products_ByCategoryIdAndSpecs_SortByTotalSold : AbstractIndexCreationTask<Product>
{
public Products_ByCategoryIdAndSpecs_SortByTotalSold()
{
this.Map = products => from product in products
select new
{
product.CategoryId,
_ = Project(product.Specs, spec => new Field("Spec_" + spec.Key, spec.Value, Field.Store.NO, Field.Index.ANALYZED)),
product.TotalSold
};
}
}
Then I can query like so:
var results = session.Advanced.LuceneQuery<Product, Products_ByCategoryIdAndSpecs_SortByTotalSold>()
.WhereEquals("CategoryId", 15920)
.AndAlso().WhereEquals("Spec_Class", "3A")
.AndAlso().WhereEquals("Spec_Finish", "Plain")
.OrderBy("-TotalSold")
.ToList();
This will return the products in category "15920" which have a "Class" spec value of "3A" and a "Finish" spec value of "Plain" sorted in descending order by the total units sold.
The key was using the Project method which basically creates fields in the Lucene document for each spec name-value pair.
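The effect described above can be pictured with a small in-memory sketch that turns a Specs dictionary into the per-spec field names the query relies on. This only mimics the naming scheme ("Spec_" plus the key); it does not create Lucene fields, and SpecFieldSketch and ToIndexFields are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SpecFieldSketch
{
    // Maps each spec to a (field name, value) pair, e.g. "Class" -> "Spec_Class".
    public static List<KeyValuePair<string, string>> ToIndexFields(
        Dictionary<string, string> specs) =>
        specs
            .Select(s => new KeyValuePair<string, string>("Spec_" + s.Key, s.Value))
            .ToList();
}
```

Because every spec name becomes its own field, a query can combine conditions such as Spec_Class and Spec_Finish on the same index entry.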

Related

Table Splitting - Migration Warning

My scenario:
I have a Product that has various properties, such as price, size, etc., that are declared in the Product entity.
Additionally, a Product can have a collection of StockRequirements, i.e. when that Product is used the constituent StockItems can be depleted by the StockRequirement quantity accordingly.
Under one use case I just want the Product so that I can play with the core properties. For another use case I want the Product with its StockRequirements.
This means that when retrieving a Product I may be using it in different contexts. My chosen approach has been to use EF table splitting.
I have one repository for Products and one repository for ProductStockRequirements. They are referring to the same unique Product.
The Product repository will provide a Product Entity with the core details only.
The ProductStockRequirements repository will provide ProductStockRequirements entity which does not have the core details, but does have the list of StockRequirements.
This seemed a reasonable approach so that I am not retrieving 'owned' StockRequirements when I only want to change the price of the product. Similarly, if I'm only interested in playing with the StockRequirements then I don't retrieve the other core details.
Entities
class Product
{
public int Id { get; set; }
public string CoreProperty { get; set; }
}
class ProductStockRequirements
{
public int Id { get; set; }
public List<StockRequirement> StockRequirements { get; set; }
}
Product Mapping
b.ToTable("Products");
b.HasKey(p => p.Id);
b.Property(p => p.CoreProperty).IsRequired();
ProductStockRequirementsMapping
b.ToTable("Products");
b.HasKey(p => p.Id);
b.OwnsMany<StockRequirement>(p => p.StockRequirements, sr =>
{
// sr is the owned-type builder; a separate name avoids shadowing the outer b
sr.ToTable("StockRequirements");
sr.WithOwner().HasForeignKey("ProductId");
});
b.HasOne<Product>()
.WithOne()
.HasForeignKey<ProductStockRequirements>("Id");
When running a migration, I get the warning:
The entity type 'ProductStockRequirements' is an optional dependent
using table sharing without any required non shared property that
could be used to identify whether the entity exists. If all nullable
properties contain a null value in database then an object instance
won't be created in the query. Add a required property to create
instances with null values for other properties or mark the incoming
navigation as required to always create an instance.
Focusing on the advice:
mark the incoming navigation as required to always create an instance
I have tried:
b.HasOne<Product>()
.WithOne()
.HasForeignKey<ProductStockRequirements>("Id")
.IsRequired();
and
b.HasOne<Product>()
.WithOne()
.IsRequired()
.HasForeignKey<ProductStockRequirements>("Id");
to no avail.
The warning does not appear to result in any bad behaviour, and all my tests are passing. Still, it seems that I should be able to create a mapping that removes this warning, but I cannot find a way to do so.

This should really just be
class Product
{
public int Id { get; set; }
public string CoreProperty { get; set; }
public List<StockRequirement> StockRequirements { get; set; } = new List<StockRequirement>();
}
This works because the StockRequirements are not part of the Product entity, and related data isn't loaded unless you request it.
And the Entity model is simply not the correct layer to define your aggregates. An Aggregate is defined by selecting a single Entity from your entity model along with 0-few related entities. Typically you include the closely-related and weak entities together in an aggregate.
If your entity model is a graph of 23 related entities, you might organize it into 10 separate and partially-overlapping aggregates or sub-graphs.

Not able to query MongoDB.Bson.ObjectId using ElasticSearch Nest Client

I have a c# model which represents a collection/table in MongoDB. Assume it looks something like:
public class MyModel
{
public ObjectId Id { get; set; }
public ObjectId ClaimId { get; set; }
public string SomeStringProperty { get; set; }
//..other props
}
This model has several properties of type ObjectId, which is part of the MongoDB.Bson package.
Problem: When I index my List<MyModel> to my ElasticSearch server using NEST, all ObjectId fields of MyModel look like the following structure in ElasticSearch:
Question: How can I query the data in ElasticSearch having ObjectId field in 'where' clause of query?
When I have to filter on the string field SomeStringProperty of MyModel, I can do the following, which works well:
var result =
_elasticClient.Search<MyModel>(x => x
.Index("mymodels")
.Query(q => q
.Term(p => p.SomeStringProperty.Suffix("keyword"), filterValue)
)
);
But how can I apply the same filter to an ObjectId field? I tried a few things but nothing matched, and I think that is because its value has been completely changed in the ElasticSearch server. Is there a way to query on this field?
I went through several articles on SO, but couldn't find a similar issue. I'm also just a beginner in the ElasticSearch world. Any help or guidance will be appreciated.

How can I use OrderBy with an aggregate of a object property with Ardalis Specification?

I am trying to query my PostgreSQL database using EF Core and Ardalis Specification.
For the query I build I want to sort the results by using OrderBy with an aggregate of a property that is on a nested object.
The sorting I want is to sort the list of Clinics by the Clinic that has the most Reviews with high Grades. The grades are on a scale of 1-5.
So if a clinic has two reviews with Grade=5, it should rank above a clinic that has 5 reviews with Grade=2 or Grade=4. To do this I have to calculate the mean value and then order by the highest.
public class Clinic
{
public int Id { get; set; }
public ICollection<Review> Reviews {get; set;}
}
public class Review
{
public int Id { get; set; }
public decimal Grade {get; set;}
}
This is my query so far, which doesn't work as intended because it only looks at the highest value. Can I insert a mean-value calculation here somehow?
public ClinicFilterPaginatedSpecification()
{
Query.OrderByDescending(x => x.Reviews.Max(x => x.Grade ));
}
Running the query:
var filterSpec = new ClinicFilterSpecification();
var itemsOnPage= await _clinicRepo.ListAsync(filterSpec);

As Ivan Stoev notes in his comment, you should be able to use the .Average() method:
public ClinicFilterPaginatedSpecification()
{
Query.OrderByDescending(clinic => clinic.Reviews.Average(review => review.Grade ));
}
Have you tried this and if so is it working or producing an error?

I should mention that this is not directly related to the Specification package, in the sense that the expression is not altered in any way. Whatever works on EF, should work through specs as well. We're passing the expression as it is.
Now the question is how EF would behave in this case when you need to aggregate some data from the collections. I think this is optimized in EF Core 5, and Ardalis' suggestion should work. Prior to EF Core 3, this scenario would have involved an explicit Join operation (not quite sure).
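The difference between ordering by Max and ordering by Average can be checked with a small LINQ to Objects sketch. It runs in memory only and says nothing about how EF Core translates the expression to SQL; ClinicOrderingSketch and its methods are hypothetical names. One assumption worth flagging: in LINQ to Objects, Average() over an empty decimal sequence throws, so clinics without reviews are given an average of 0 here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClinicOrderingSketch
{
    // Orders clinic ids by their highest single grade (what the question's query does).
    public static List<int> OrderByMaxGrade(List<(int Id, decimal[] Grades)> clinics) =>
        clinics
            .OrderByDescending(c => c.Grades.Length == 0 ? 0m : c.Grades.Max())
            .Select(c => c.Id)
            .ToList();

    // Orders clinic ids by their mean grade; clinics without reviews count as 0.
    public static List<int> OrderByAverageGrade(List<(int Id, decimal[] Grades)> clinics) =>
        clinics
            .OrderByDescending(c => c.Grades.Length == 0 ? 0m : c.Grades.Average())
            .Select(c => c.Id)
            .ToList();
}
```

A clinic with grades 5 and 5 ties on Max with a clinic graded 5, 2, 2, 2, 2, but comes first once the mean is used.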

Access underlying DbContext (or run stored procedure) from Entity Framework POCO method

Is it possible to access the underlying DbContext (the DbContext that has populated this object/has this object in its cache/is tracking this object) from inside a model object, and if so, how?
The best answer I have found so far is this blog post which is five years old. Is it still the best solution available?
I’m using the latest version of Entity Framework if that matters.
Here's a sample to clarify my question:
I have a hierarchical tree. Let’s say it is categories that could have sub-categories. The model object would be something like this:
class Category
{
public string CategoryId { get; set; }
public string Name { get; set; }
public virtual Category Parent { get; set; }
public virtual ICollection<Category> Children { get; set; }
}
Now, if I want to access all descendants of a category (not just its immediate children) I can use a recursive query like this:
class Category
{
//...
IEnumerable<Category> Descendants
{
get
{
return Children.Union(Children.SelectMany(q => q.Descendants));
}
}
}
which works, but has bad performance (due to multiple database queries it needs to perform).
But suppose I have an optimized query that I can run to find descendants (maybe I store my primary key in a way that already contains the path, or maybe I'm using the SQL Server data type hierarchyid, etc.). How can I run such a query, which needs access to the whole table/database and not just the records available through the model object's navigation properties?
This can be either done by running a stored procedure/SQL command on the database, or a query like this:
class Category
{
//...
IEnumerable<Category> Descendants
{
get
{
// this won't work, because underlying DbContext is not available in this context!
return myDbContext.Categories.Where(q => q.CategoryId.StartsWith(this.CategoryId));
}
}
}
Is there a way at all to implement such a method?
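One possible way to sidestep the problem, sketched here under stated assumptions, is to keep the query outside the entity and pass the query source in. In the sketch, source stands in for something like a DbContext's Categories set, which is an assumption; the test data uses an in-memory IQueryable instead. The sketch also assumes path-style ids that end with a separator (such as "1/" and "1/2/"), so that "1/" does not accidentally match "10/". CategoryQueries and DescendantsOf are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Category
{
    public string CategoryId { get; set; }
    public string Name { get; set; }
}

public static class CategoryQueries
{
    // Returns every category whose path id starts with the ancestor's id,
    // excluding the ancestor itself.
    public static IQueryable<Category> DescendantsOf(
        this IQueryable<Category> source, Category ancestor)
    {
        var prefix = ancestor.CategoryId; // copied to a local so the expression captures a plain value
        return source.Where(c => c.CategoryId != prefix && c.CategoryId.StartsWith(prefix));
    }
}
```

A call would then look like categories.DescendantsOf(someCategory), where categories is whatever queryable the calling code already has access to.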

What are drawbacks of storing Guid as String in MongoDB?

An application persists Guid field in Mongo and it ends up being stored as BinData:
"_id" : new BinData(3, "WBAc3FDBDU+Zh/cBQFPc3Q==")
The advantage in this case is compactness, the disadvantage shows up when one needs to troubleshoot the application. Guids are passed via URLs, and constantly transforming them to BinData when going to Mongo console is a bit painful.
What are drawbacks of storing Guid as string in addition to increase in size? One advantage is ease of troubleshooting:
"_id" : "3c901cac-5b90-4a09-896c-00e4779a9199"
Here is a prototype of a persistent entity in C#:
class Thing
{
[BsonIgnore]
public Guid Id { get; set; }
[BsonId]
public string DontUseInAppMongoId
{
get { return Id.ToString(); }
set { Id = Guid.Parse(value); }
}
}
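For the troubleshooting pain described above, a small helper can convert between a Guid and the base64 payload shown inside BinData. This is a sketch that assumes the stored bytes are exactly what Guid.ToByteArray() produces, which is my understanding of the C# driver's legacy subtype 3 encoding; if the data was written with a different Guid representation, the byte order will differ. GuidBinDataSketch and its methods are hypothetical names.

```csharp
using System;

public static class GuidBinDataSketch
{
    // Base64 text of the Guid's bytes, in .NET byte order (assumed to match the stored BinData).
    public static string ToBase64(Guid id) => Convert.ToBase64String(id.ToByteArray());

    // Rebuilds the Guid from a base64 payload copied from the Mongo console.
    public static Guid FromBase64(string payload) => new Guid(Convert.FromBase64String(payload));
}
```

The payload returned by ToBase64 is what would go into the second argument of BinData when querying from the console, under the assumption stated above.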

In addition to gregor's answer, using Guids will currently prevent the use of the new Aggregation Framework, as they are represented as a binary type. Regardless, you can do what you want in an easier way. This will let the MongoDB BSON library handle the conversions for you.
public class MyClass
{
[BsonRepresentation(BsonType.String)]
public Guid Id { get; set;}
}

The drawbacks are that MongoDB is optimised to use BSON ObjectIDs, so it will be slightly less efficient to use strings as ObjectIDs. Also, if you want to use range-based queries on string ObjectIDs, a lexicographic comparison will be used, which may give different results than you expect. Other than that, you can use strings as ObjectIDs.
See Optimizing ObjectIDs
http://www.mongodb.org/display/DOCS/Optimizing+Object+IDs
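The lexicographic point is easy to trip over, so here is a minimal C# sketch of the effect using ordinal string comparison. It illustrates string ordering in general and does not query MongoDB; LexicographicSketch and StringLessThan are hypothetical names.

```csharp
using System;

public static class LexicographicSketch
{
    // Character-by-character (ordinal) comparison, the way string keys are ordered.
    public static bool StringLessThan(string a, string b) => string.CompareOrdinal(a, b) < 0;
}
```

Under this ordering "10" sorts before "9" because '1' comes before '9', even though 10 is greater than 9 numerically, so a range filter over string ids can include or exclude values that a numeric range would not.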
