Python Collections
Last updated on 26th Sep 2020, Articles, Blog
Collections in Python are containers that are used to store collections of data, for example, list, dict, set, tuple etc. These are built-in collections. Several modules have been developed that provide additional data structures to store collections of data. One such module is the Python collections module.
The most commonly used data structures from the Python collections module are as follows:
- Counter
- defaultdict
- OrderedDict
- deque
- ChainMap
- namedtuple()
The Counter
Counter is a subclass of dict. The Counter() function in the collections module takes an iterable or a mapping as the argument and returns a Counter object. In this object, a key is an element from the iterable or the mapping, and its value is the count of that element: the number of times it occurs in the iterable, or the count given for it in the mapping.
Import the Counter class before you can create a counter instance.
from collections import Counter
Create Counter Objects
There are multiple ways to create counter objects. The simplest way is to use the Counter() function without any arguments.
cnt = Counter()
Pass an iterable, such as a list, to the Counter() function to create a counter object.
numbers = [1,2,3,4,1,2,6,7,3,8,1]
Counter(numbers)
Finally, the Counter() function can take a dictionary as an argument. In this dictionary, the value of a key should be the ‘count’ of that key.
Counter({1:3,2:4})
You can access any counter item with its key as shown below:
numbers = [1,2,3,4,1,2,6,7,3,8,1]
cnt = Counter(numbers)
print(cnt[1])
When you print cnt[1], you will get the count of 1.
Output:
3
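The creation and lookup steps above can be sketched together as follows. This is a minimal illustration, not part of the original examples; the variable names are made up for the sketch. It also shows that looking up a key that was never counted gives 0 instead of raising KeyError.

```python
from collections import Counter

# Three ways to create a Counter: from an iterable, a mapping, or keyword arguments.
from_iterable = Counter([1, 2, 3, 4, 1, 2, 6, 7, 3, 8, 1])
from_mapping = Counter({"a": 3, "b": 4})
from_kwargs = Counter(a=3, b=4)

print(from_iterable[1])             # 3: the value 1 occurs three times
print(from_mapping == from_kwargs)  # True: both describe the same counts

# Looking up a key that was never counted gives 0 and does not add the key.
print(from_iterable[99])            # 0
print(99 in from_iterable)          # False
```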
In the above examples, cnt is an object of Counter class which is a subclass of dict. So it has all the methods of dict class.
Apart from that, Counter has three additional functions:
- elements()
- most_common([n])
- subtract([iterable-or-mapping])
The elements() Function
You can get the items of a Counter object with the elements() function. It returns an iterator that yields each element as many times as its count.
Example:
cnt = Counter({1:3,2:4})
print(list(cnt.elements()))
Output:
[1, 1, 1, 2, 2, 2, 2]
Here, we create a Counter object with a dictionary as an argument. In this Counter object, the count of 1 is 3 and the count of 2 is 4. Calling elements() on the cnt object returns an iterator, which is passed as an argument to list().
The iterator repeats 3 times over 1 returning three ‘1’s, and repeats four times over 2 returning four ‘2’s to the list. Finally, the list is printed using the print function.
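One detail worth seeing in code: elements() only yields keys whose count is at least one. The sketch below uses made-up keys and counts to show that zero and negative counts are skipped.

```python
from collections import Counter

# Keys with a count of zero or less are left out by elements().
cnt = Counter({"x": 2, "y": 0, "z": -1, "w": 1})
expanded = list(cnt.elements())

print(expanded)  # ['x', 'x', 'w']
```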
The most_common() Function
A Counter does not keep its items sorted by count; like a regular dictionary, it remembers the order in which keys were first inserted (Python 3.7 and later). To list the elements from the most common to the least common, use the most_common() function of the Counter object.
numbers = [1,2,3,4,1,2,6,7,3,8,1]
cnt = Counter(numbers)
print(cnt.most_common())
Output:
[(1, 3), (2, 2), (3, 2), (4, 1), (6, 1), (7, 1), (8, 1)]
You can see that most_common function returns a list, which is sorted based on the count of the elements. 1 has a count of three, therefore it is the first element of the list.
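The optional n argument limits the result to the n most common elements. A short sketch using the same list of numbers:

```python
from collections import Counter

numbers = [1, 2, 3, 4, 1, 2, 6, 7, 3, 8, 1]
cnt = Counter(numbers)

top_one = cnt.most_common(1)    # only the single most common element
top_three = cnt.most_common(3)  # the three most common elements

print(top_one)    # [(1, 3)]
print(top_three)  # [(1, 3), (2, 2), (3, 2)]
```

Elements with equal counts are returned in the order they were first encountered.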
The subtract() Function
The subtract() function takes an iterable (such as a list) or a mapping (such as a dictionary) as an argument and deducts the element counts found in that argument.
Example:
cnt = Counter({1:3,2:4})
deduct = {1:1, 2:2}
cnt.subtract(deduct)
print(cnt)
Output:
Counter({1: 2, 2: 2})
Notice that the cnt object we first created has a count of 3 for key 1 and a count of 4 for key 2. The deduct dictionary has the value 1 for key 1 and the value 2 for key 2. The subtract() function deducted 1 from the count of key 1 and 2 from the count of key 2.
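Two behaviours of subtract() are easy to miss: it changes the Counter in place (returning None), and it allows counts to drop below zero. The sketch below uses hypothetical fruit counts to show both, and contrasts them with the - operator, which builds a new Counter and keeps only positive counts.

```python
from collections import Counter

stock = Counter(apples=3, pears=1)
sold = Counter(apples=1, pears=2)

# subtract() updates stock in place and returns None; counts may go negative.
result = stock.subtract(sold)
print(result)  # None
print(stock)   # Counter({'apples': 2, 'pears': -1})

# The - operator builds a new Counter and drops counts that are not positive.
remaining = Counter(apples=3, pears=1) - sold
print(remaining)  # Counter({'apples': 2})
```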
The defaultdict
The defaultdict works like a regular Python dictionary, except that it does not raise a KeyError when you look up a non-existent key with square brackets.
Instead, it creates the key and sets its value by calling the callable that you passed as the first argument when creating the defaultdict. This callable is called the default_factory.
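The difference from a regular dictionary can be sketched as follows. The dictionary contents are illustrative only. Note that a square-bracket lookup inserts the missing key, while get() does not.

```python
from collections import defaultdict

# A regular dict raises KeyError for a missing key.
plain = {"one": 1}
try:
    plain["three"]
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# A defaultdict calls its default_factory (here int) and stores the result.
nums = defaultdict(int, {"one": 1})
value = nums["three"]
print(value)            # 0
print("three" in nums)  # True: the lookup created the key

# get() behaves as in a regular dict and does not create the key.
print(nums.get("four"))  # None
print("four" in nums)    # False
```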
Import defaultdict
First, you have to import defaultdict from collections module before using it:
from collections import defaultdict
Create a defaultdict
You can create a defaultdict with the defaultdict() constructor. Pass a callable, typically a type such as int or list, as the first argument. Check the following code:
nums = defaultdict(int)
nums['one'] = 1
nums['two'] = 2
print(nums['three'])
Output:
0
In this example, int is passed as the default_factory. Notice that you only pass int, not int(). Next, the values are defined for the two keys, namely, ‘one’ and ‘two’, but in the next line we try to access a key that has not been defined yet.
In a normal dictionary, this would raise a KeyError. But defaultdict initializes the new key with the value returned by the default_factory, which is 0 for int. Hence, when the program is executed, 0 is printed. This feature of initializing non-existent keys can be exploited in various situations.
For example, let’s say you want the count of each name in a list of names given as “Mike, John, Mike, Anna, Mike, John, John, Mike, Mike, Britney, Smith, Anna, Smith”.
from collections import defaultdict
count = defaultdict(int)
names_list = "Mike John Mike Anna Mike John John Mike Mike Britney Smith Anna Smith".split()
for names in names_list:
    count[names] += 1
print(count)
Output:
defaultdict(<class 'int'>, {'Mike': 5, 'John': 3, 'Anna': 2, 'Britney': 1, 'Smith': 2})
First, we create a defaultdict with int as the default_factory. The names_list contains a set of names, some of which repeat several times. The split() function breaks the string at whitespace and returns the words as a list. In the loop, the first time a name is encountered it is added to the defaultdict named count with a value of 0 from the default_factory, and then incremented to 1. Each time the same name is encountered again, its count is incremented.
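The same pattern works with other factories. As a sketch, using list as the default_factory groups items without first checking whether a key exists. The list of names and the grouping by first letter are illustrative choices, not taken from the example above.

```python
from collections import defaultdict

names = "Mike John Anna Britney Smith Adam".split()

# Each missing key starts as an empty list, so append() works immediately.
by_initial = defaultdict(list)
for name in names:
    by_initial[name[0]].append(name)

print(dict(by_initial))
# {'M': ['Mike'], 'J': ['John'], 'A': ['Anna', 'Adam'], 'B': ['Britney'], 'S': ['Smith']}
```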
The OrderedDict
OrderedDict is a dictionary that remembers the order in which its keys were inserted. If you change the value of an existing key later, the key keeps its position. (Since Python 3.7, regular dictionaries also preserve insertion order.)
Import OrderedDict
To use OrderedDict you have to import it from the collections module.
from collections import OrderedDict
Create an OrderedDict
You can create an OrderedDict object with the OrderedDict() constructor. In the following code, you create an OrderedDict without any arguments. After that, some items are inserted into it.
od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3
print(od)
Output:
OrderedDict([('a', 1), ('b', 2), ('c', 3)])
You can access each element using a loop as well. Take a look at the following code:
for key, value in od.items():
    print(key, value)
Output:
a 1
b 2
c 3
The following example is an interesting use case of OrderedDict with Counter. Here, we create a Counter from a list and insert its elements into an OrderedDict based on their count.
The most frequently occurring letter will be inserted as the first key and the least frequently occurring letter will be inserted as the last key.
from collections import Counter, OrderedDict
letters = ["a","c","c","a","b","a","a","b","c"]
cnt = Counter(letters)
od = OrderedDict(cnt.most_common())
for key, value in od.items():
    print(key, value)
Output:
a 4
c 3
b 2
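To make the ordering rule concrete, the sketch below shows that updating the value of an existing key leaves it in place, and that move_to_end() is one way to reposition a key explicitly. The keys and values are illustrative.

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# Changing the value of an existing key does not change its position.
od["a"] = 10
print(list(od))  # ['a', 'b', 'c']

# move_to_end() repositions a key explicitly.
od.move_to_end("a")
print(list(od))  # ['b', 'c', 'a']
```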
The deque
The deque (double-ended queue) is a list-like container optimized for adding and removing items at both ends.
Import the deque
You have to import the deque class from collections module before using it.
from collections import deque
Creating a deque
You can create a deque with the deque() constructor. You can pass any iterable, such as a list, as an argument.
letters = ["a","b","c"]
deq = deque(letters)
print(deq)
Output:
deque(['a', 'b', 'c'])
Inserting Elements to deque
You can easily insert an element to the deq we created at either of the ends. To add an element to the right of the deque, you have to use append() method.
If you want to add an element to the start of the deque, you have to use the appendleft() method.
deq.append("d")
deq.appendleft("e")
print(deq)
Output:
deque(['e', 'a', 'b', 'c', 'd'])
You can see that d is added at the end of deq and e is added to the start of deq.
Removing Elements from the deque
Removing elements works the same way as inserting them. To remove an element from the right end, use the pop() function; to remove an element from the left end, use popleft().
deq.pop()
deq.popleft()
print(deq)
Output:
deque(['a', 'b', 'c'])
You can notice that both the first and last elements are removed from the deq.
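Both pop() and popleft() return the element they remove, which is what makes a deque convenient as a queue. A minimal sketch with illustrative variable names, which also shows that popping from an empty deque raises IndexError:

```python
from collections import deque

queue = deque(["a", "b", "c"])
queue.append("d")

first = queue.popleft()  # removes and returns the leftmost element
last = queue.pop()       # removes and returns the rightmost element
print(first, last)       # a d
print(list(queue))       # ['b', 'c']

# Popping from an empty deque raises IndexError.
empty = deque()
try:
    empty.pop()
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```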
Clearing a deque
If you want to remove all elements from a deque, you can use the clear() function.
letters = ["a","b","c"]
deq = deque(letters)
print(deq)
print(deq.clear())
Output:
deque(['a', 'b', 'c'])
None
You can see in the output that at first there is a deque with three elements. The clear() function empties the deque in place and returns None, which is why None is printed on the second line.
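To confirm that the deque itself is empty afterwards, call clear() on its own line and then inspect the deque. A short sketch:

```python
from collections import deque

deq = deque(["a", "b", "c"])

result = deq.clear()  # empties the deque in place
print(result)         # None
print(deq)            # deque([])
print(len(deq))       # 0
```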
Counting Elements in a deque
If you want to find the count of a specific element, use count(x) function. You have to specify the element for which you need to find the count, as the argument.
letters = ["a","b","c"]
deq = deque(letters)
print(deq.count("a"))
Output:
1
In the above example, 'a' occurs once in the deque. Hence 1 is printed.
The ChainMap
ChainMap is used to combine several dictionaries or mappings into a single view. It keeps the underlying mappings in a list, which is available through its maps attribute.
Import ChainMap
You have to import ChainMap from the collections module before using it.
from collections import ChainMap
Create a ChainMap
To create a chainmap we can use the ChainMap() constructor. We have to pass the dictionaries we are going to combine as an argument set.
dict1 = {'a': 1, 'b': 2}
dict2 = {'c': 3, 'b': 4}
chain_map = ChainMap(dict1, dict2)
print(chain_map.maps)
Output:
[{'a': 1, 'b': 2}, {'c': 3, 'b': 4}]
You can see a list of dictionaries as the output. You can access chain map values by key name.
print(chain_map['a'])
Output:
1
‘1’ is printed as the value of key ‘a’ is 1. Another important point is ChainMap updates its values when its associated dictionaries are updated. For example, if you change the value of ‘c’ in dict2 to ‘5’, you will notice the change in ChainMap as well.
dict2['c'] = 5
print(chain_map.maps)
Output:
[{'a': 1, 'b': 2}, {'c': 5, 'b': 4}]
Getting Keys and Values from ChainMap
You can access the keys of a ChainMap with keys() function. Similarly, you can access the values of elements with values() function, as shown below:
dict1 = {'a': 1, 'b': 2}
dict2 = {'c': 3, 'b': 4}
chain_map = ChainMap(dict1, dict2)
print(list(chain_map.keys()))
print(list(chain_map.values()))
Output:
['c', 'b', 'a']
[3, 2, 1]
Notice that the value of the key 'b' in the output is the value of key 'b' in dict1. As a rule, when a key appears in more than one of the associated dictionaries, ChainMap takes the value for that key from the first dictionary that contains it.
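The lookup rule has a counterpart for writes: assignments through a ChainMap go to the first dictionary only. The sketch below reuses dict1 and dict2 from the example; the value 30 is an arbitrary illustration.

```python
from collections import ChainMap

dict1 = {"a": 1, "b": 2}
dict2 = {"c": 3, "b": 4}
chain_map = ChainMap(dict1, dict2)

# Lookups search the dictionaries in order and stop at the first match.
print(chain_map["b"])  # 2, taken from dict1

# Writes always go to the first dictionary, even if the key exists further down.
chain_map["c"] = 30
print(dict1)           # {'a': 1, 'b': 2, 'c': 30}
print(dict2)           # {'c': 3, 'b': 4}
print(chain_map["c"])  # 30: dict1 now shadows dict2 for key 'c'
```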
Adding a New Dictionary to ChainMap
If you want to add a new dictionary to an existing ChainMap, use new_child() function. It creates a new ChainMap with the newly added dictionary.
dict3 = {'e': 5, 'f': 6}
new_chain_map = chain_map.new_child(dict3)
print(new_chain_map)
Output:
ChainMap({'e': 5, 'f': 6}, {'a': 1, 'b': 2}, {'c': 3, 'b': 4})
Notice that a new dictionary is added to the beginning of the ChainMap list.
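Because the new dictionary sits at the front, it can override values from the existing dictionaries without modifying them. A minimal sketch, in which the override value 99 is illustrative:

```python
from collections import ChainMap

dict1 = {"a": 1, "b": 2}
dict2 = {"c": 3, "b": 4}
chain_map = ChainMap(dict1, dict2)

# new_child() returns a new ChainMap; the original is left unchanged.
child = chain_map.new_child({"b": 99})
print(child["b"])           # 99, from the newly added dictionary
print(chain_map["b"])       # 2, the original ChainMap is unaffected
print(len(chain_map.maps))  # 2

# parents gives a ChainMap of everything except the first dictionary.
print(child.parents.maps)   # [{'a': 1, 'b': 2}, {'c': 3, 'b': 4}]
```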
The namedtuple()
The namedtuple() function returns a new tuple subclass with a name for each position in the tuple. One of the biggest problems with ordinary tuples is that you have to remember the index of each field of a tuple object, which is error-prone. The namedtuple was introduced to solve this problem.
Import namedtuple
Before using namedtuple, you have to import it from the collections module.
from collections import namedtuple
Student = namedtuple('Student', 'fname, lname, age')
s1 = Student('John', 'Clarke', '13')
print(s1.fname)
Output:
John
In this example, a namedtuple class named Student has been declared and an instance s1 created. You can access the fields of any instance of the Student class by the defined field name.
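A namedtuple instance is still a tuple, so access by index and unpacking continue to work alongside access by name. The sketch below passes the field names as a list, which namedtuple() also accepts.

```python
from collections import namedtuple

Student = namedtuple("Student", ["fname", "lname", "age"])
s1 = Student("John", "Clarke", "13")

print(s1.fname)               # John, access by field name
print(s1[0])                  # John, access by index still works
fname, lname, age = s1        # unpacking works as with any tuple
print(lname, age)             # Clarke 13
print(Student._fields)        # ('fname', 'lname', 'age')
print(isinstance(s1, tuple))  # True
```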
Creating a namedtuple Using List
The Student constructor requires each value to be passed to it as a separate argument. Instead, you can use _make() to create a namedtuple instance from a list or any other iterable. Check the following code:
s2 = Student._make(['Adam','joe','18'])
print(s2)
Output:
Student(fname='Adam', lname='joe', age='18')
Converting an Instance to a Dictionary
The _asdict() function returns the fields of an existing instance as a dictionary that maps field names to values. In Python 3.8 and later it returns a regular dict; earlier versions returned an OrderedDict.
d = s1._asdict()
print(d)
Output (Python 3.8 and later):
{'fname': 'John', 'lname': 'Clarke', 'age': '13'}
Changing Field Values with _replace() Function
To change the value of a field of an instance, use the _replace() function. Remember that _replace() creates a new instance. It does not change the value of an existing instance.
s2 = s1._replace(age='14')
print(s1)
print(s2)
Output:
Student(fname='John', lname='Clarke', age='13')
Student(fname='John', lname='Clarke', age='14')
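The reason _replace() returns a new instance is that namedtuple instances are immutable, like ordinary tuples. A minimal sketch showing that direct assignment to a field fails:

```python
from collections import namedtuple

Student = namedtuple("Student", "fname, lname, age")
s1 = Student("John", "Clarke", "13")

# Assigning to a field raises AttributeError because the instance is immutable.
try:
    s1.age = "14"
    assignment_failed = False
except AttributeError:
    assignment_failed = True
print(assignment_failed)  # True

# To "update" a field, rebind the name to the instance returned by _replace().
s1 = s1._replace(age="14")
print(s1.age)  # 14
```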
Conclusion
With that, we conclude our look at the collections module. We have discussed its most important data structures. Compared with Java's Collections library, the Python collections module is still small, so further additions may come in upcoming versions.