EmbeddedDocumentSerializer runs query for every ReferenceField - mongodb

I have the following models and serializers. The goal is for the serializer to run only one query:
Models:
class Assignee(EmbeddedDocument):
    id = ObjectIdField(primary_key=True)
    assignee_email = EmailField(required=True)
    assignee_first_name = StringField(required=True)
    assignee_last_name = StringField()
    assignee_time = DateTimeField(required=True, default=datetime.datetime.utcnow)
    user = ReferenceField('MongoUser', required=True)
    user_id = ObjectIdField(required=True)

class MongoUser(Document):
    email = EmailField(required=True, unique=True)
    password = StringField(required=True)
    first_name = StringField(required=True)
    last_name = StringField()
    assignees = EmbeddedDocumentListField(Assignee)
Serializers:
class MongoUserSerializer(DocumentSerializer):
    assignees = AssigneeSerializer(many=True)

    class Meta:
        model = MongoUser
        fields = ('id', 'email', 'first_name', 'last_name', 'assignees')
        depth = 2

class AssigneeSerializer(EmbeddedDocumentSerializer):
    class Meta:
        model = Assignee
        fields = ('assignee_first_name', 'assignee_last_name', 'user')
        depth = 0
When checking the mongo profiler I have 2 queries for the MongoUser Document. If I remove the assignees field from the MongoUserSerializer then there is only one query.
As a workaround I've tried to use user_id field to store only ObjectId and changed AssigneeSerializer to:
class AssigneeSerializer(EmbeddedDocumentSerializer):
    class Meta:
        model = Assignee
        fields = ('assignee_first_name', 'assignee_last_name', 'user_id')
        depth = 0
But again there are 2 queries. I think EmbeddedDocumentSerializer fetches all the fields, querying for every ReferenceField, and that the restriction
fields = ('assignee_first_name', 'assignee_last_name', 'user_id')
is only applied after the queries have been made.
How to use ReferenceField and not run a separate query for each reference when serializing?

I ended up with a workaround and not using ReferenceField. Instead I am using ObjectIdField:
#user = ReferenceField("MongoUser", required=True) # Removed now
user = ObjectIdField(required=True)
And changed value assignment as follows:
- if assignee.user == MongoUser:
+ if assignee.user == MongoUser.id:
It is not the best way - we are not using ReferenceField functionality but it is better than creating 30 queries in the serializer.
Best Regards,
Kristian

It's a very interesting question, and I think it is related to Mongoengine's dereferencing policy: https://github.com/MongoEngine/mongoengine/blob/master/mongoengine/dereference.py.
Namely, mongoengine querysets have a select_related() method (e.g. MongoUser.objects.select_related()) whose max_depth argument should be large enough for Mongoengine to traverse 3 levels of depth, MongoUser->assignees->Assignee->user, and cache all the related MongoUser objects for the current MongoUser instance. We should probably call this method somewhere in the DocumentSerializers in DRF-Mongoengine to prefetch the relations, but currently we don't.
See this post about classical DRF + Django ORM, which explains how to fight the N+1 queries problem by prefetching. Basically, you need to override the get_queryset() method of your ModelViewSet to use the select_related() method:
from rest_framework_mongoengine.viewsets import ModelViewSet

class MongoUserViewSet(ModelViewSet):
    def get_queryset(self):
        queryset = MongoUser.objects.all()
        # Set up eager loading to avoid N+1 selects.
        # select_related() does not modify the queryset in place,
        # so its return value has to be kept.
        queryset = queryset.select_related(max_depth=3)
        return queryset
Unfortunately, I don't think the current implementation of ReferenceField in DRF-Mongoengine is smart enough to handle these querysets appropriately. Maybe ComboReferenceField will work?
Still, I've never used this feature and haven't had time to play with these settings myself, so I'd be grateful if you shared your findings.
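To make the N+1 pattern discussed above concrete, here is a minimal sketch that needs no MongoDB at all. FakeCollection, find_one and find_many are hypothetical stand-ins invented for this illustration (they are not mongoengine or pymongo APIs); the class only counts how many lookups are issued. The naive serializer resolves each reference separately, while the batched one collects the ids first and resolves them in a single lookup, which is roughly what a prefetch step would have to do.

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a MongoDB collection that counts lookups."""

    def __init__(self, docs):
        self._docs = {doc["_id"]: doc for doc in docs}
        self.query_count = 0

    def find_one(self, doc_id):
        # One round trip per call.
        self.query_count += 1
        return self._docs[doc_id]

    def find_many(self, doc_ids):
        # One round trip for the whole batch (think of an $in-style query).
        self.query_count += 1
        return [self._docs[doc_id] for doc_id in doc_ids]


def serialize_naive(assignees, users):
    # Resolves every reference on its own: N lookups for N assignees.
    return [
        {"first_name": a["assignee_first_name"],
         "user_email": users.find_one(a["user_id"])["email"]}
        for a in assignees
    ]


def serialize_batched(assignees, users):
    # Collect the referenced ids first, then resolve them with a single lookup.
    wanted = {a["user_id"] for a in assignees}
    by_id = {doc["_id"]: doc for doc in users.find_many(wanted)}
    return [
        {"first_name": a["assignee_first_name"],
         "user_email": by_id[a["user_id"]]["email"]}
        for a in assignees
    ]


# Made-up sample data: 30 users and 30 assignees, each referencing one user.
docs = [{"_id": n, "email": f"user{n}@example.com"} for n in range(1, 31)]
assignees = [{"assignee_first_name": f"A{n}", "user_id": n} for n in range(1, 31)]

naive_users = FakeCollection(docs)
naive_result = serialize_naive(assignees, naive_users)

batched_users = FakeCollection(docs)
batched_result = serialize_batched(assignees, batched_users)

print(naive_users.query_count, batched_users.query_count)
```

Both functions produce the same output, but the naive one issues 30 lookups and the batched one issues 1. With mongoengine, a single filter such as MongoUser.objects(id__in=ids) could play the role of find_many here; whether the serializer can be made to use it is a separate question.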

Related

How to extend django's default User in mongodb?

I'm using MongoDB as the database and trying to extend Django's built-in User model.
Here's the error I'm getting:
django.core.exceptions.ValidationError: ['Field "auth.User.id" of model container:"<class \'django.contrib.auth.models.User\'>" cannot be of type "<class \'django.db.models.fields.AutoField\'>"']
Here's my models.py:
from djongo import models
from django.contrib.auth.models import User

class Profile(models.Model):
    user = models.EmbeddedField(model_container=User)
    mobile = models.PositiveIntegerField()
    address = models.CharField(max_length=200)
    pincode = models.PositiveIntegerField()
Using EmbeddedField is not a good idea, because it will duplicate user data in the database. You will have some user in the Users collection and the same data will be embedded in the Profile collection elements.
Just keep the user id in the model and query separately:
class Profile(models.Model):
    user_id = models.CharField()  # or models.TextField()
    mobile = models.PositiveIntegerField()
    address = models.CharField(max_length=200)
    pincode = models.PositiveIntegerField()
It is simple, as described in the documentation.
First, use a djongo model as the model_container; I suppose the User model here is the Django model, not a djongo model.
Second, make your model_container model abstract by setting abstract = True in its Meta class, as shown below.
from djongo import models

class Blog(models.Model):
    name = models.CharField(max_length=100)

    class Meta:
        abstract = True

class Entry(models.Model):
    blog = models.EmbeddedField(
        model_container=Blog
    )
    headline = models.CharField(max_length=255)
Ref: https://www.djongomapper.com/get-started/#embeddedfield

Entity Framework select few fields from navigation property object

I have a simple query:
var car = CarRepository.GetById(1);
var engine = new EngineDto
{
    Prop1 = car.Engine.Prop1,
    Prop2 = car.Engine.Prop2,
    Prop3 = car.Engine.Prop3,
    Prop4 = car.Engine.Prop4
};
The issue is that the Engine model has more than 50 columns, and when I read a property of the Engine model, Entity Framework generates the query
SELECT TOP(1) * FROM Engine WHERE ID = <id>
Is there any way to create a query that gets only a few fields?
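Yes, there is a way: don't fetch the complete entity, but use Select to fetch only the data you plan to use.
For this, your CarRepository would need a function that returns an IQueryable<Car>, so that the Select becomes part of the SQL query. With a plain IEnumerable<Car>, the projection would run in memory, after the complete entities had already been fetched.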
In two steps:
EngineDto engine = CarRepository.QueryCars() // function that returns the sequence of cars
    .Where(car => car.Id == 1)
    .Select(car => new EngineDto
    {
        Prop1 = car.Engine.Prop1,
        Prop2 = car.Engine.Prop2,
        Prop3 = car.Engine.Prop3,
        Prop4 = car.Engine.Prop4,
    })
    .SingleOrDefault();
If the designer of the CarRepository didn't provide a function that would return a sequence of Cars, then obviously he thought that no one would ever want it. But I'm pretty sure the CarRepository has a function to "fetch all cars"
By the way, as transferring the data from the database to your local process is usually the slower part of your processing, it is always wise to transfer as little data as possible. Try to avoid transferring properties that you didn't plan to use.
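The same point can be illustrated outside Entity Framework. The following is only a sketch in Python with the standard-library sqlite3 module, not EF code; the engine table, its columns and the oversized notes column are made up for the example. Selecting the four needed columns means the large column is never read into the application.

```python
import sqlite3

# In-memory database with a made-up "engine" table; "notes" stands in for
# the many wide columns that the caller does not need.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE engine ("
    "id INTEGER PRIMARY KEY, prop1 TEXT, prop2 TEXT, prop3 TEXT, prop4 TEXT, notes TEXT)"
)
conn.execute(
    "INSERT INTO engine VALUES (1, 'a', 'b', 'c', 'd', ?)",
    ("x" * 10_000,),
)

# Fetching the complete row transfers every column, including the large one.
full_row = conn.execute("SELECT * FROM engine WHERE id = ?", (1,)).fetchone()

# The projection names only the columns that will actually be used.
slim_row = conn.execute(
    "SELECT prop1, prop2, prop3, prop4 FROM engine WHERE id = ?", (1,)
).fetchone()

engine_dto = dict(slim_row)
print(full_row.keys())
print(slim_row.keys())
```

The Select in the answer above plays the role of the explicit column list here: it decides what is transferred before any data leaves the database.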

Best Way to convert one Edmx Entity to one Business entity

I am developing an application in which data is accessed through edmx entities, and after retrieving the data we have to fill each business entity from the edmx entity, like this:
var tblproducts = tblproductsData
    .Select(t => new tblProduct()
    {
        CategoryID = t.CategoryID,
        Description = t.Description,
        ID = t.ID,
        Image = t.Image,
        InsDt = t.InsDt,
        Price = t.Price,
        Quantity = t.Quantity,
        Status = t.Status,
        Title = t.Title,
        tblCategory = new EFDbFirst.Models.tblCategory()
        {
            ID = t.tblCategory.ID,
            status = t.tblStatus.StatusID,
            Title_Category = t.tblCategory.Title_Category
        },
        tblStatu = new EFDbFirst.Models.tblStatu()
        {
            StatusDescription = t.tblStatus.StatusDescription,
            StatusID = t.tblStatus.StatusID
        }
    });
I am fed up with this, because every time I get data from or save data to the database I have to convert one type to the other.
Is there a good way to create a common method that takes one anonymous type and converts it to another anonymous type?
Thanks in advance.
Your example isn't that clear.
First of all, EF doesn't work with anonymous types internally; it works with the EF types you have defined, either through an edmx file or code first. You can, however, create anonymous types yourself by writing a Select statement.
E.g:
var products = context.tblProductsData
    .Select(r => new { Description = r.Description }); // new without a type name creates an anonymous object
The tblProduct, tblCategory and tblStatu objects, are they EF types? If so, you don't need to write a Select, EF will generate objects for you when you execute it.
E.g:
var products = context.tblProductsData.ToList();
This will automatically generate tblProduct objects for you. When you navigate to tblProduct.tblCategory or tblProduct.tblStatu, lazy loading will retrieve them for you. If you want to load them explicitly during the first query (eager loading), use the Include function.
E.g:
var products = context.tblProductsData
    .Include(r => r.tblCategory)
    .Include(r => r.tblStatu)
    .ToList();
However, if tblProduct, tblCategory and tblStatu are business objects and NOT EF types, there isn't any other way to do this; you have to create them explicitly in a Select statement.

Loading particular properties for entity and assigning it as reference to another entity

Example:
I am entering a new invoice. For this invoice I need to enter a customer. Let's assume that we retrieved a list of customers:
var list = Context.Set<Customer>().ToList();
Here I see two issues:
1) I do not need to bring all information for customer, I only need Id, Code and Name
2) Customer in current DbContext is read-only, so it would be nice if it is possible to tell DbContext not to monitor their states, to improve performance.
Questions:
1) Can we load only partial data for customer, but still be able to assign it to Invoice (see code below)?
2) Can we tell DbContext not to monitor Customers for changes, and still be able to do this:
Invoice.Customer = CustomerList[10];
There's not a direct way to do exactly what you want, but you might be able to achieve your goals with some compromise.
"I do not need to bring all information for customer, I only need Id, Code and Name"
There isn't a way for EF to create a partially loaded entity, but you could create an anonymous type:
Context.Customers.Select(c => new { Id = c.CustomerId, Code = c.Code, Name = c.Name }).ToList();
If you could live with the new anonymous type then use that, or you could then iterate through that list, creating actual customer objects.
"Customer in current DbContext is read-only, so it would be nice if it is possible to tell DbContext not to monitor their states, to improve performance."
EF provides an extension method, AsNoTracking(), which does exactly what you're looking for:
var list = Context.Set<Customer>().AsNoTracking().ToList();
Depending on what you choose from the above, the following code may change, but it does achieve what you're looking for: it loads the customer only partially, yet still allows you to attach the customer to the invoice.
Note: You'll need to attach the customer to your context before you can use it; setting its state to Unchanged then prevents it from overwriting existing data.
var m = new Model();

// Load only the columns that are needed, as an anonymous type.
var list = m.Customers.Select(c => new { Id = c.CustomerId, Code = c.Code, Name = c.Name });

// Build partially populated Customer objects from the projection.
List<Customer> customerList = new List<Customer>();
foreach (var item in list)
{
    customerList.Add(new Customer()
    {
        CustomerId = item.Id,
        Code = item.Code,
        Name = item.Name
    });
}

Invoice i = new Invoice();
var customer = customerList.First();

// Attach the customer and mark it Unchanged so that SaveChanges does not write it back.
m.Customers.Attach(customer);
m.Entry(customer).State = EntityState.Unchanged;
i.Customer = customer;
m.Invoices.Add(i);
m.SaveChanges();

tastypie - List related resources keys instead of urls

When I have a related Resource, I would like to list foreign keys, instead of a url to that resource. How is that possible aside from dehydrating it?
I'm not sure that it's possible without dehydrating the field. I usually have utility functions that handle the dehydration of foreign key and many-to-many relationships, something like this:
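# api_utils.py
def many_to_many_to_ids(bundle, field_name):
    # Fetch only the primary keys of the related objects.
    field_ids = getattr(bundle.obj, field_name).values_list('id', flat=True)
    # Return a real list (map() is lazy on Python 3).
    return [int(pk) for pk in field_ids]

def foreign_key_to_id(bundle, field_name):
    # Return the related object's id, or None if the relation is empty.
    field = getattr(bundle.obj, field_name)
    field_id = getattr(field, 'id', None)
    return field_id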
And apply them to the fields like so:
# api.py
from functools import partial

class CompanyResource(CommonModelResource):
    categories = fields.ManyToManyField(CompanyCategoryResource, 'categories')

    class Meta(CommonModelResource.Meta):
        queryset = Company.objects.all()

    dehydrate_categories = partial(many_to_many_to_ids, field_name='categories')

class HotDealResource(CommonModelResource):
    company = fields.ForeignKey(CompanyResource, 'company')

    class Meta(CommonModelResource.Meta):
        queryset = HotDeal.objects.all()

    dehydrate_company = partial(foreign_key_to_id, field_name='company')
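
One detail of the pattern above is worth spelling out: a functools.partial stored as a class attribute has traditionally not been bound to the instance, which is why the helper functions take bundle (and not self) as their first argument. As far as I know, newer Python versions (3.13 onward) warn that this behaviour will change, so wrapping the partial in staticmethod() makes the intent explicit. The sketch below uses FakeResource and SimpleNamespace objects as hypothetical stand-ins; it is not tastypie's real API.

```python
from functools import partial
from types import SimpleNamespace


def foreign_key_to_id(bundle, field_name):
    """Return the id of the related object, or None if the relation is empty."""
    related = getattr(bundle.obj, field_name, None)
    return getattr(related, "id", None)


class FakeResource:
    """Hypothetical stand-in for a resource class; not tastypie's real API."""

    # staticmethod() guarantees that the partial is never bound to the
    # instance, so the only positional argument it receives is the bundle.
    dehydrate_company = staticmethod(
        partial(foreign_key_to_id, field_name="company")
    )

    def render(self, bundle):
        # Mimics looking up a dehydrate_<field> hook by name and calling it.
        hook = getattr(self, "dehydrate_company")
        return {"company": hook(bundle)}


# One bundle whose object points at a company with id 7, and one without a company.
with_company = SimpleNamespace(obj=SimpleNamespace(company=SimpleNamespace(id=7)))
without_company = SimpleNamespace(obj=SimpleNamespace(company=None))

resource = FakeResource()
result_with = resource.render(with_company)
result_without = resource.render(without_company)
print(result_with, result_without)
```

The response then carries the plain key (or None) instead of a resource URL.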