Combining Two Querysets in Django (With Different Models)

Don't Use Django for Big Projects - DEV


Today, I stumbled upon a use case where I needed to have a querysets that had objects from different models. Django has a neat "contenttypes framework" which is a good way to achieve this. So here are my notes on what I learned today, and I hope it will help someone in the future. NOTE: If you can design your models from scratch this is not the best approach to follow. Read my note under step 5.

Let us consider the following models:

class Bmw(models.Model):
    series = models.CharField(max_length=50)
    created = models.DateTimeField()
    class Meta:
        ordering = ['-created']
    def __str__(self):
         return "{0} - {1}".format(self.series, self.created.date())
class Tesla(models.Model):
    series = models.CharField(max_length=50)
    created = models.DateTimeField()
    class Meta:
        ordering = ['-created']
    def __str__(self):
        return "{0} - {1}".format(self.series, self.created.date())

We can get list of Bmw's and Teslas separatley like so

>>> Bmw.objects.filter()
[<Bmw: Bmw Series 1 - 2013-08-04>, <Bmw: Bmw Series 2 - 2010-01-15>]
>>> Tesla.objects.filter()
[<Tesla: Tesla Series 2 - 2015-03-29>, <Tesla: Tesla Series 1 - 2011-09-10>]

But what if we want the two querysets combined, say we want to display all cars in our dealership page by creation date. So we want something like

[<Car: Tesla Series 2 - 2015-03-29>, <Car: Bmw Series 1 - 2013-08-04>, <Car: Tesla Series 1 - 2011-09-10>, <Car: Bmw Series 2 - 2010-01-15>]

How do we do that? Here are two viable approaches.

Using itertools chain is one approach.

from itertools import chain
def get_all_cars():
    bmws = Bmw.objects.filter()
    teslas = Tesla.objects.filter()
    cars_list = sorted(
        chain(bmws, teslas),
        key=lambda car: car.created, reverse=True)
    return cars_list

Here we get the queryset for Bmws and queryset of Teslas, and pass them to the chain function which combines these two iterables and makes a new iterator. We then pass this list to the sort function and specify that we want to sort it by the created date. Finally we say that we want the order to be reversed. Here is the result

[<Tesla: Tesla Series 2 - 2015-03-29>, <Bmw: Bmw Series 1 - 2013-08-04>, <Tesla: Tesla Series 1 - 2011-09-10>, <Bmw: Bmw Series 2 - 2010-01-15>]

This is a good approach if the queryset is small. However if we are dealing with larger querysets and need to involve pagination, every time we need to query the entire database and sort by the created date. Even if we slice the list, then we have to manually keep track of our slice index and created date for sorting, and the whole approach could get messy.

Django's contenttypes framework is really a good option for this use case. From the docs: At the heart of the contenttypes application is the ContentType model, which lives at django.contrib.contenttypes.models.ContentType. Instances of ContentType represent and store information about the models installed in your project, and new instances of ContentType are automatically created whenever new models are installed. I would urge you to read up more on it

From the docs:

Adding a foreign key from one of your own models to ContentType allows your model to effectively tie itself to another model class.

So we add a new model to our models called car which uses the Generic Relations.

class Car(models.Model):
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = GenericForeignKey('content_type', 'object_id')
    created = models.DateTimeField()
    class Meta:
        ordering = ['-created']
    def __str__(self):
        return "{0} - {1}".format(self.content_object.series,
                                  self.created.date())

We then update our models and define a post save handler.

def create_car(sender, instance, created, **kwargs):
    """
    Post save handler to create/update car instances when
    Bmw or Tesla is created/updated
    """
    content_type = ContentType.objects.get_for_model(instance)
    try:
        car= Car.objects.get(content_type=content_type,
                             object_id=instance.id)
    except Car.DoesNotExist:
        car = Car(content_type=content_type, object_id=instance.id)
    car.created = instance.created
    car.series = instance.series
    car.save()

And we add the post save handler to our Tesla model.

class Tesla(models.Model):
    series = models.CharField(max_length=50)
    created = models.DateTimeField()
    class Meta:
        ordering = ['-created']
    def __str__(self):
        return "{0} - {1}".format(self.series, self.created.date())
post_save.connect(create_car, sender=Tesla)

(and similarly added for Bmw model not show for brevity) So now every time an instance of Tesla or Bmw is created or updated, the corresponding Car model instance gets updated.

Here is an updated query using the contentypes framework that we just set up. Notice how we have both Bmw and Tesla objects returned as Car instances.

>>> Car.objects.filter()
[<Car: Tesla Series 2 - 2015-03-29>, <Car: Bmw Series 1 - 2013-08-04>, <Car: Tesla Series 1 - 2011-09-10>, <Car: Bmw Series 2 - 2010-01-15>]

Here we have returned car objects, so here is how we get to the actual car type a Car instance holds.

>>> car = Car.objects.first()
>>> car.content_object
<Tesla: Tesla Series 2 - 2015-03-29>
>>> car.content_object.series
u'Tesla Series 2'

Although this approach has an overhead of an extra table, for larger query sets I fell this is a cleaner approach.

Let me know what you think or if you have any questions in the comments below.

Comments

Popular posts from this blog

Documentation is Very vital before you develop any system or app

Everything you need to know when developing an on demand service app

Steps followed when creating a new software