October 13, 2024

Python Collections Module

The collections module in Python provides alternatives to Python’s general-purpose built-in containers like dictionaries, lists, sets, and tuples. It includes several specialized container datatypes that offer additional functionality and flexibility.

1. namedtuple

The namedtuple factory function returns a new tuple subclass whose fields can be accessed by name as well as by position, which makes code more readable and self-documenting.

Example:

from collections import namedtuple

# Define a named tuple type
Point = namedtuple('Point', ['x', 'y'])

# Create an instance of Point
p = Point(10, 20)

print(p.x)  # Output: 10
print(p.y)  # Output: 20
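Because a named tuple is still a tuple, it stays immutable and can be unpacked like one. The following is a minimal sketch of a few helpers that come with every named tuple type (_replace and _asdict); the variable names moved, as_dict, and mutated are illustrative only.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

# A named tuple unpacks exactly like a regular tuple
x, y = p

# _replace returns a NEW instance; the original is left unchanged
moved = p._replace(x=15)

# _asdict converts the fields to a dictionary
as_dict = p._asdict()

# Fields cannot be reassigned, because tuples are immutable
try:
    p.x = 99
    mutated = True
except AttributeError:
    mutated = False

print(moved)    # Point(x=15, y=20)
print(mutated)  # False
```

If the fields need to change after creation, a named tuple is probably the wrong tool; a regular class or a dictionary may fit better.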

2. deque

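A deque (double-ended queue) is a generalization of stacks and queues that supports adding and removing elements at either end in approximately constant time. A list, by contrast, has to shift every remaining element when you insert or remove at the front.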

Example:

from collections import deque

# Initialize a deque
d = deque([1, 2, 3])

# Append to the right
d.append(4)
print(d)  # Output: deque([1, 2, 3, 4])

# Append to the left
d.appendleft(0)
print(d)  # Output: deque([0, 1, 2, 3, 4])

# Pop from the right
d.pop()
print(d)  # Output: deque([0, 1, 2, 3])

# Pop from the left
d.popleft()
print(d)  # Output: deque([1, 2, 3])
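Two other deque features are worth a quick look: the maxlen argument, which turns a deque into a fixed-size buffer, and rotate, which shifts elements around the ends. This is a small sketch; the names recent and ring are made up for illustration.

```python
from collections import deque

# With maxlen set, appending to a full deque silently drops
# the item at the opposite end
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(recent)  # deque([2, 3, 4], maxlen=3)

# rotate(1) moves the last element to the front
ring = deque([1, 2, 3, 4])
ring.rotate(1)
print(ring)  # deque([4, 1, 2, 3])
```

A bounded deque like this is a common way to keep "the last N items" without trimming a list by hand.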

3. Counter

The Counter is a dictionary subclass that is used to count hashable objects. It is particularly useful for counting occurrences of elements in an iterable.

Example:

from collections import Counter

# Count the frequency of elements in a list
counter = Counter(['apple', 'banana', 'apple', 'orange', 'banana', 'apple'])
print(counter)
# Output: Counter({'apple': 3, 'banana': 2, 'orange': 1})

# Get the most common elements
print(counter.most_common(2))
# Output: [('apple', 3), ('banana', 2)]
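Counter also behaves a little differently from a plain dictionary: a missing key reads as 0 instead of raising KeyError, and counters can be added and subtracted. The sketch below uses invented sample data to show this.

```python
from collections import Counter

words = Counter('apple banana apple'.split())

# update() adds to the existing counts instead of replacing them
words.update(['banana', 'cherry'])

# A missing key returns 0 and is NOT inserted into the counter
missing = words['kiwi']

# Addition sums the counts; subtraction keeps only positive results
combined = Counter(a=3, b=1) + Counter(a=1, b=2)
diff = Counter(a=3, b=1) - Counter(a=1, b=2)

print(missing)   # 0
print(combined)  # Counter({'a': 4, 'b': 3})
print(diff)      # Counter({'a': 2})
```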

4. OrderedDict

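An OrderedDict is a dictionary subclass that remembers the order in which keys were inserted. Since Python 3.7, the built-in dict also preserves insertion order, so OrderedDict is mainly useful for its extras: methods such as move_to_end() for reordering keys, and equality comparisons that take order into account.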

Example:

from collections import OrderedDict

# Initialize an OrderedDict
ordered_dict = OrderedDict()

# Add some items
ordered_dict['a'] = 1
ordered_dict['b'] = 2
ordered_dict['c'] = 3

# Print the ordered dictionary
print(ordered_dict)
# Output (Python 3.12+): OrderedDict({'a': 1, 'b': 2, 'c': 3})
# Output (Python 3.11 and earlier): OrderedDict([('a', 1), ('b', 2), ('c', 3)])
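The features that set OrderedDict apart from a regular dict are the reordering methods and order-sensitive equality. Here is a short sketch of both; the variable names are illustrative.

```python
from collections import OrderedDict

od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])

# Move an existing key to the end: order is now b, c, a
od.move_to_end('a')

# popitem(last=False) removes and returns the FIRST item
first = od.popitem(last=False)
print(first)     # ('b', 2)
print(list(od))  # ['c', 'a']

# Two OrderedDicts are equal only if their order matches too
order_matters = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
plain_dicts = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
print(order_matters)  # False
print(plain_dicts)    # True
```

This combination (move a key to the end, pop from the front) is what makes OrderedDict a handy building block for simple least-recently-used caches.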

5. defaultdict

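The defaultdict is a dictionary subclass that takes a factory function (such as list or int) and calls it to create a value whenever a missing key is looked up with square brackets. This avoids a KeyError, but note that the lookup also inserts the new key into the dictionary.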

Example:

from collections import defaultdict

# Initialize a defaultdict with a default type of list
default_dict = defaultdict(list)

# Access a nonexistent key and append to it
default_dict['key'].append('value')

print(default_dict)
# Output: defaultdict(<class 'list'>, {'key': ['value']})
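A typical use for defaultdict is grouping items without first checking whether the key exists. The sketch below uses an invented word list, and also shows that a lookup with square brackets inserts the key while .get() does not.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Group words by their first letter; no "if key not in dict" check needed
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(by_letter['a'])  # ['apple', 'avocado']

# int() returns 0, so defaultdict(int) starts every counter at zero
counts = defaultdict(int)
_ = counts['unseen']           # this lookup INSERTS 'unseen': 0
has_key = 'unseen' in counts   # True

# .get() does not call the factory and does not insert anything
safe = counts.get('other')     # None
```

If reading a key should never modify the dictionary, use .get() or an "in" check instead of square brackets.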

6. ChainMap

The ChainMap class groups multiple dictionaries into a single view without copying them. Lookups search the dictionaries in the order they were given and return the first match, while writes and deletions affect only the first dictionary.

Example:

from collections import ChainMap

# Create two dictionaries
dict1 = {'a': 1, 'b': 2}
dict2 = {'c': 3, 'd': 4}

# Create a ChainMap
chain = ChainMap(dict1, dict2)

print(chain)
# Output: ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4})

# Access values
print(chain['a'])  # Output: 1
print(chain['c'])  # Output: 3
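The example above uses dictionaries with no keys in common. When keys overlap, the earlier dictionary wins, which makes ChainMap a natural fit for layered settings. This is a sketch with hypothetical names (defaults, overrides, settings, scoped).

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}
overrides = {'user': 'admin'}

# overrides is listed first, so it takes precedence on lookup
settings = ChainMap(overrides, defaults)
user = settings['user']    # 'admin' (found in overrides)
color = settings['color']  # 'red'   (falls through to defaults)

# Writes go to the FIRST mapping only; defaults is left untouched
settings['color'] = 'blue'
print(overrides)  # {'user': 'admin', 'color': 'blue'}
print(defaults)   # {'color': 'red', 'user': 'guest'}

# new_child() puts another mapping in front, like a nested scope
scoped = settings.new_child({'user': 'root'})
print(scoped['user'])    # root
print(settings['user'])  # admin
```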

7. UserDict, UserList, and UserString

These are wrapper classes around standard dictionary, list, and string objects. Each one keeps its contents in a regular built-in object, available through the data attribute, and is designed to be subclassed when you want to add or change behavior.

Example with UserDict:

from collections import UserDict

# Create a custom dictionary that lowercases keys when they are stored
# (lookups are not changed here, so my_dict['A'] would still raise KeyError)
class MyDict(UserDict):
    def __setitem__(self, key, value):
        key = key.lower()
        super().__setitem__(key, value)

# Initialize MyDict
my_dict = MyDict()
my_dict['A'] = 10
print(my_dict)
# Output: {'a': 10}
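You might wonder why not subclass dict directly. In CPython, the built-in dict constructor and its update() method do not call an overridden __setitem__, while UserDict routes both through it. The sketch below compares the two; LowerDict and LowerUserDict are hypothetical classes written for this comparison.

```python
from collections import UserDict

class LowerDict(dict):
    """Hypothetical dict subclass that tries to lowercase keys."""
    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

class LowerUserDict(UserDict):
    """The same idea, built on UserDict instead."""
    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

# dict's constructor and update() bypass the overridden __setitem__
plain = LowerDict({'A': 1})
plain.update({'B': 2})
print(plain)  # {'A': 1, 'B': 2}

# UserDict's constructor and update() both go through __setitem__
wrapped = LowerUserDict({'A': 1})
wrapped.update({'B': 2})
print(wrapped)  # {'a': 1, 'b': 2}
```

With UserDict, overriding one method is usually enough to get consistent behavior, which is the main reason to prefer it for this kind of customization.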

The collections module in Python provides a rich set of specialized container datatypes that can improve your code’s performance, readability, and functionality. By using these advanced collections, you can write cleaner and more efficient code for tasks that go beyond the capabilities of the standard list, dictionary, and tuple objects.