Fix Python – Count vs len on a Django QuerySet

Question

Asked By – antonagestam

In Django, given that I have a QuerySet that I am going to iterate over and print the results of, what is the best option for counting the objects? len(qs) or qs.count()?

(Also given that counting the objects in the same iteration is not an option.)


Answer

Choosing between len() and count() depends on the situation, and it’s worth understanding how each of them works in order to use them correctly.

Let me provide you with a few scenarios:

  1. (most crucial) When you only need the number of elements and do not plan to process them in any way, use count():

DO: queryset.count() – this performs a single SELECT COUNT(*) FROM some_table query. All computation is carried out on the RDBMS side, and Python only has to retrieve a single number, at a fixed O(1) cost in transfer and memory.

DON’T: len(queryset) – this performs a SELECT * FROM some_table query, fetching the whole table (O(N)) and requiring an additional O(N) of memory to store it. This is the worst option when only the count is needed.

  2. When you intend to fetch the queryset anyway, it’s slightly better to use len(), which won’t cause an extra database query as count() would:

len() (one db query):

    len(queryset) # SELECT * fetching all the data - NO extra cost - data would be fetched anyway in the for loop

    for obj in queryset: # data is already fetched by len() - using cache
        pass

count() (two db queries!):

    queryset.count() # First db query SELECT COUNT(*)

    for obj in queryset: # Second db query (fetching data) SELECT *
        pass

  3. The reverse of the 2nd case (when the queryset has already been fetched):

    for obj in queryset: # iteration fetches the data and fills the cache
        len(queryset) # using already cached data - O(1), no extra cost
        queryset.count() # using cache - O(1), no extra db query

    len(queryset) # the same: O(1)
    queryset.count() # the same: no query, O(1)
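
For the case in the question (count the objects, then iterate and print them), the second scenario applies. Below is a minimal sketch of that pattern. The helper name numbered_lines is hypothetical, and a plain list stands in for the QuerySet so the snippet runs without a database. With a real QuerySet, the len() call would be the point where the single SELECT is issued.

```python
def numbered_lines(queryset):
    """Return 'i/total: obj' lines, evaluating the queryset only once."""
    # On a QuerySet, len() runs the SELECT and fills the result cache.
    total = len(queryset)
    # Iterating afterwards reuses the cached rows instead of querying again.
    return [f"{i}/{total}: {obj}" for i, obj in enumerate(queryset, start=1)]


# Assumption: a plain list is used as a stand-in for a QuerySet here,
# since both support len() and iteration.
lines = numbered_lines(["alice", "bob", "carol"])
print("\n".join(lines))
```

Calling queryset.count() in place of len(queryset) would give the same output here, but on a real, not yet evaluated QuerySet it would cost one additional query.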
    

Everything becomes clear once you take a glance “under the hood” (a simplified excerpt from Django’s QuerySet source; details vary between Django versions):

    class QuerySet(object):

        def __init__(self, model=None, query=None, using=None, hints=None):
            # (...)
            self._result_cache = None

        def __len__(self):
            self._fetch_all()
            return len(self._result_cache)

        def _fetch_all(self):
            if self._result_cache is None:
                self._result_cache = list(self.iterator())
            if self._prefetch_related_lookups and not self._prefetch_done:
                self._prefetch_related_objects()

        def count(self):
            if self._result_cache is not None:
                return len(self._result_cache)

            return self.query.get_count(using=self.db)
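
To see how this cache logic produces the query counts in the three scenarios, here is a toy model that runs without Django or a database. ToyQuerySet, its queries counter, and the scenario functions are all hypothetical names invented for this sketch. The class only imitates the _result_cache behaviour shown above and is not Django code.

```python
class ToyQuerySet:
    """Toy stand-in for a QuerySet (NOT Django code); counts simulated queries."""

    def __init__(self, rows):
        self._rows = list(rows)     # pretend this is the database table
        self._result_cache = None   # plays the role of QuerySet._result_cache
        self.queries = 0            # number of simulated database round trips

    def _fetch_all(self):
        if self._result_cache is None:
            self.queries += 1       # simulated SELECT *
            self._result_cache = list(self._rows)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def count(self):
        if self._result_cache is not None:
            return len(self._result_cache)  # answered from the cache
        self.queries += 1           # simulated SELECT COUNT(*)
        return len(self._rows)


def count_only(qs):
    qs.count()


def len_then_iterate(qs):
    len(qs)
    for _ in qs:
        pass


def count_then_iterate(qs):
    qs.count()
    for _ in qs:
        pass


def iterate_then_len_and_count(qs):
    for _ in qs:
        pass
    len(qs)
    qs.count()


def queries_for(scenario):
    """Run a scenario on a fresh ToyQuerySet and return the query count."""
    qs = ToyQuerySet(["a", "b", "c"])
    scenario(qs)
    return qs.queries


results = {
    scenario.__name__: queries_for(scenario)
    for scenario in (count_only, len_then_iterate,
                     count_then_iterate, iterate_then_len_and_count)
}
print(results)
```

In this model only count_then_iterate needs two simulated queries. Every other scenario needs one, which matches the cases listed above.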

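Good references in the Django docs: the “When QuerySets are evaluated” section and the count() entry in the QuerySet API reference.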

Answered By – Krzysiek

This answer was collected from Stack Overflow and reviewed by FixPython community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.