collections#

Provides data types that extend Python's standard collections.

defaultdict#

A dictionary subclass that automatically adds a key, with a default value, the first time that key is accessed.

Basics#

You pass a function (the default factory) that is called with no arguments and returns the value to store for a missing key.

In the following example, accessing an unknown key creates an int that is one greater than the value created for the previous unknown key.

from collections import defaultdict

counter = 0

def dict_filler():
    global counter
    counter += 1
    return counter

my_defaultdict = defaultdict(dict_filler)

print("before cycle ", dict(my_defaultdict))

for i in range(5):
    print(f"my_defaultdict[\"element{i}\"] == ", my_defaultdict[f"element{i}"])

print("after cycle ", dict(my_defaultdict))
before cycle  {}
my_defaultdict["element0"] ==  1
my_defaultdict["element1"] ==  2
my_defaultdict["element2"] ==  3
my_defaultdict["element3"] ==  4
my_defaultdict["element4"] ==  5
after cycle  {'element0': 1, 'element1': 2, 'element2': 3, 'element3': 4, 'element4': 5}
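The default factory runs only when a missing key is looked up with square brackets. The following is a minimal sketch of that behaviour (the name `dd` is purely illustrative): a membership test and `.get()` leave the dictionary untouched, while indexing stores the new key.

```python
from collections import defaultdict

dd = defaultdict(int)

# Neither a membership test nor .get() calls the default factory.
has_key = "a" in dd
got = dd.get("a")
size_before = len(dd)

# Indexing with [] calls the factory and stores the result under the key.
value = dd["a"]
size_after = len(dd)

print(has_key, got, size_before, value, size_after)
# False None 0 0 1
```

So if you only want to check for a key without creating it, `in` or `.get()` is the safer choice.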

A more practical example is counting how often random integers occur. It is enough to create a defaultdict that uses 0 as the value for an unknown key (calling int() with no arguments returns 0). Without any initialisation, you can then apply the += operator directly to the dictionary, and each key ends up mapped to the number of times that number was drawn.

import random
from collections import defaultdict
my_defaultdict = defaultdict(int)

for i in range(500):
    my_defaultdict[random.randint(1, 10)] += 1

sorted_dict = dict(sorted(my_defaultdict.items(), key=lambda x: x[0]))
print(sorted_dict)
{1: 47, 2: 54, 3: 43, 4: 52, 5: 46, 6: 57, 7: 47, 8: 57, 9: 52, 10: 45}
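Any callable that takes no arguments can serve as the default factory, not only `int`. As a hedged sketch, here is the same idea with `list` as the factory, used to group words by their first letter. The `words` list is made up for illustration.

```python
from collections import defaultdict

# Illustrative input data, not taken from the examples above.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = defaultdict(list)
for word in words:
    # A missing first letter gets a fresh empty list before append runs.
    groups[word[0]].append(word)

print(dict(groups))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```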

To/from dict#

From#

You can convert a regular dict to a defaultdict by passing it as the second argument to the defaultdict constructor.

In the following example, initial_dict is converted to default_dict, which then behaves like a defaultdict.

from collections import defaultdict
initial_dict = {"key1": 1, "key2":3}
default_dict = defaultdict(int, initial_dict)
my_default_value = default_dict["key3"]
print(default_dict)
defaultdict(<class 'int'>, {'key1': 1, 'key2': 3, 'key3': 0})
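The conversion copies the items, so the original dictionary is not affected by keys that the defaultdict adds later. A small sketch to check this, reusing the names from the example above:

```python
from collections import defaultdict

initial_dict = {"key1": 1, "key2": 3}
default_dict = defaultdict(int, initial_dict)

# Looking up a missing key adds it to the defaultdict only.
default_dict["key3"]

print(initial_dict)
# {'key1': 1, 'key2': 3}
print(dict(default_dict))
# {'key1': 1, 'key2': 3, 'key3': 0}
```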

To#

Pass a defaultdict to the dict function to get a regular Python dict.

from collections import defaultdict
default_dict = defaultdict(int)
# Fill the dictionary through the defaultdict mechanism:
# each lookup of a missing key stores int() == 0
for i in range(10):
    default_dict[i]
# For clearer printing, convert it
# into a regular dictionary
dict(default_dict)
{0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}
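After the conversion, the result no longer creates missing keys. The sketch below (the names `regular_dict` and `raised` are illustrative) shows that a lookup of an unknown key raises KeyError again:

```python
from collections import defaultdict

default_dict = defaultdict(int)
default_dict["seen"] += 1

regular_dict = dict(default_dict)

# A plain dict has no default factory, so a missing key raises KeyError.
try:
    regular_dict["missing"]
    raised = False
except KeyError:
    raised = True

print(raised, regular_dict)
# True {'seen': 1}
```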

Counter#

A data type that counts how many times each item occurs in a collection.

Basic#

The result behaves like a dictionary with items of the form <value in the input collection>: <number of occurrences of that value in the collection>.

The following example applies Counter to a list with many repeated items.

from random import randint
from collections import Counter

input_lst = [randint(1,4) for i in range(10)]
print("Initial list:", input_lst)
Counter(input_lst)
Initial list: [2, 2, 4, 1, 1, 2, 4, 2, 2, 2]
Counter({2: 6, 4: 2, 1: 2})
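Because the example above uses random input, its output changes from run to run. As a deterministic sketch, the snippet below counts a fixed, made-up list and uses `most_common` to retrieve the most frequent items in descending order of count.

```python
from collections import Counter

# Illustrative fixed input so that the output is reproducible.
colors = Counter(["red", "blue", "red", "green", "blue", "red"])

# most_common(n) returns the n most frequent items as (item, count) pairs.
top_two = colors.most_common(2)

print(colors)
# Counter({'red': 3, 'blue': 2, 'green': 1})
print(top_two)
# [('red', 3), ('blue', 2)]
```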

dict properties#

Counter is a subclass of dict, so Counter objects support the usual dictionary operations.

The following example shows that a count can be looked up by key, just as in a dictionary.

from random import randint
from collections import Counter

input_lst = [randint(1,4) for i in range(10)]
print("Initial list:", input_lst)
my_counter = Counter(input_lst)
print(f"The value 2 occurs {my_counter[2]} times in the input list")
Initial list: [2, 2, 2, 1, 3, 4, 4, 4, 2, 4]
The value 2 occurs 4 times in the input list
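One difference from a plain dict is worth knowing: looking up a value that was never counted returns 0 instead of raising KeyError, and the lookup does not add the key. A minimal sketch with a fixed input list:

```python
from collections import Counter

my_counter = Counter([2, 2, 4])

# 3 never appeared in the input: the count is 0 and no KeyError is raised.
missing_count = my_counter[3]

# Unlike defaultdict, the lookup does not store the key.
still_absent = 3 not in my_counter

print(missing_count, still_absent, dict(my_counter))
# 0 True {2: 2, 4: 1}
```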

Unhashable types#

Unhashable types cannot be used as keys in a dictionary. Counter is a dict subclass, so applying it to a collection that contains unhashable items raises a TypeError:

Counter([[1,2,3], "test"])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[105], line 1
----> 1 Counter([[1,2,3], "test"])

File /usr/lib/python3.10/collections/__init__.py:577, in Counter.__init__(self, iterable, **kwds)
    566 '''Create a new, empty Counter object.  And if given, count elements
    567 from an input iterable.  Or, initialize the count from another mapping
    568 of elements to their counts.
   (...)
    574 
    575 '''
    576 super().__init__()
--> 577 self.update(iterable, **kwds)

File /usr/lib/python3.10/collections/__init__.py:670, in Counter.update(self, iterable, **kwds)
    668             super().update(iterable)
    669     else:
--> 670         _count_elements(self, iterable)
    671 if kwds:
    672     self.update(kwds)

TypeError: unhashable type: 'list'
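One possible workaround, sketched here under the assumption that it is acceptable to treat each list as an equivalent tuple, is to convert the unhashable items to a hashable form before counting. The names `items` and `hashable_items` are illustrative.

```python
from collections import Counter

items = [[1, 2, 3], "test", [1, 2, 3]]

# Assumption: a flat list can be represented by a tuple for counting.
# Lists nested inside lists would need a deeper conversion.
hashable_items = [tuple(x) if isinstance(x, list) else x for x in items]

counts = Counter(hashable_items)
print(counts)
# Counter({(1, 2, 3): 2, 'test': 1})
```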