Data Structures Cheat Sheet with Python Tutorial
Last updated on 26th Sep 2020, Blog, Tutorials
A data structure is a way of organizing, managing and storing data so that it can be accessed and modified efficiently. Data structures are used in every field for storing and organizing data in the computer.
Data structures are fundamental concepts of computer science which help in writing efficient programs in any language. Python is a high-level, interpreted, interactive and object-oriented scripting language with which we can study the fundamentals of data structures in a simpler way than with many other programming languages.
In this chapter we are going to take a short look at some frequently used data structures in general and how they relate to specific Python data types. There are also some data structures specific to Python, which are listed as another category.
General Data Structures
The various data structures in computer science are divided broadly into the two categories shown below. We will discuss each of these data structures in detail in subsequent chapters.
Linear Data Structures
These are the data structures which store the data elements in a sequential manner.
- Array: It is a sequential arrangement of data elements paired with the index of the data element.
- Linked List: Each data element contains a link to another element along with the data present in it.
- Stack: It is a data structure that allows operations in one specific order only: LIFO (Last In First Out), also called FILO (First In Last Out).
- Queue: It is similar to Stack but the order of operation is only FIFO(First In First Out).
- Matrix: It is a two-dimensional data structure in which each data element is referred to by a pair of indices.
Non-Linear Data Structures
These are the data structures in which there is no sequential linking of data elements. Any pair or group of data elements can be linked to each other and can be accessed without a strict sequence.
- Binary Tree: It is a data structure where each data element can have at most two child elements, and it starts with a root node.
- Heap: It is a special case of the Tree data structure where the data in the parent node is either greater than or equal to the data in its child nodes (max-heap) or less than or equal to the data in its child nodes (min-heap).
- Hash Table: It is a data structure that uses a hash function to map each key to a slot in an underlying array. It retrieves values using keys rather than an index.
- Graph: It is an arrangement of vertices (nodes) and edges, where some of the vertices are connected to each other through the edges.
Python Specific Data Structures
These data structures are built into the Python language, and they give great flexibility in storing different types of data.
- List: It is similar to an array, with the exception that the data elements can be of different data types. You can have both numeric and string data in a Python list.
- Tuple: Tuples are similar to lists but they are immutable, which means the values in a tuple cannot be modified. They can only be read.
- Dictionary: The dictionary contains Key-value pairs as its data elements.
In the next chapters we are going to learn the details of how each of these data structures can be implemented using Python.
Data Structures Cheat Sheet
This cheat sheet uses the Big O notation to express time complexity.
- For a reminder on Big O, see Understanding Big O Notation and Algorithmic Complexity.
- For a quick summary of complexity for common data structure operations, see the Big-O Algorithm Complexity Cheat Sheet.
Array
- Quick summary: a collection that stores elements in order and looks them up by index.
- Also known as: fixed array, static array.
- Important facts:
- Stores elements sequentially, one after another.
- Each array element has an index. Zero-based indexing is used most often: the first index is 0, the second is 1, and so on.
- Is created with a fixed size. Increasing or decreasing the size of an array is impossible.
- Can be one-dimensional (linear) or multi-dimensional.
- Allocates contiguous memory space for all its elements.
- Ensures constant time access by index.
- Constant time append (insertion at the end of an array).
- Fixed size that can’t be changed.
- Search, insertion and deletion are O(n). After an insertion, all subsequent elements move one index further; after a deletion, they move one index back.
- Can be memory intensive when capacity is underused.
- In some languages, such as C, the String data type that represents text is implemented as an array that consists of a sequence of characters plus a terminating character.
- Access: O(1)
- Search: O(n)
- Insertion: O(n) (append: O(1))
- Deletion: O(n)
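The costs listed above can be tried out with Python's standard `array` module. This is only a rough sketch: unlike the fixed-size array described here, `array.array` can grow and shrink, and the sample values are made up for illustration.

```python
from array import array

# A typed array of signed integers (sample values are hypothetical).
numbers = array("i", [10, 20, 30, 40])

third = numbers[2]            # access by index: O(1)
position = numbers.index(30)  # search scans element by element: O(n)

numbers.append(50)            # append at the end
numbers.insert(1, 15)         # insertion: later elements shift one index further
numbers.pop(0)                # deletion: later elements shift one index back

print(list(numbers))
```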
Data Type and Data Structures
As you read in the introduction, data structures help you to focus on the bigger picture rather than getting lost in the details. This is known as data abstraction. Data structures are actually implementations of Abstract Data Types (ADTs). An implementation requires a physical view of the data, built from some collection of programming constructs and basic data types.
Generally, data structures can be divided into two categories in computer science: primitive and non-primitive data structures. The former are the simplest forms of representing data, whereas the latter are more advanced: they contain the primitive data structures within more complex data structures for special purposes.
Primitive Data Structures
These are the most primitive or the basic data structures. They are the building blocks for data manipulation and contain pure, simple values. Python has four primitive variable types:
- Integers
- Float
- Strings
- Boolean
Data Structure 1: Lists in Python
Lists in Python are the most versatile data structure. They are used to store heterogeneous data items, from integers to strings or even another list! They are also mutable, which means that their elements can be changed even after the list is created.
Creating Lists
Lists are created by enclosing elements within [square] brackets and each item is separated by a comma:
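A minimal sketch of list creation (the variable names and values are made up for illustration):

```python
# A list of integers
numbers = [1, 2, 3, 4]

# A list can mix types and even contain another list
mixed = [1, "two", 3.0, [4, 5]]

# An empty list
empty = []

print(numbers, mixed, empty)
```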
Since each element in a list has its own distinct position, having duplicate values in a list is not a problem:
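For instance, with a hypothetical list of votes, every duplicate keeps its own position:

```python
votes = ["yes", "no", "yes", "yes"]

# Duplicates are kept; count() tells us how often a value occurs
yes_count = votes.count("yes")

print(len(votes), yes_count)  # 4 3
```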
Accessing List elements
To access elements of a list, we use Indexing. Each element in a list has an index related to it depending on its position in the list. The first element of the list has the index 0, the next element has index 1, and so on. The last element of the list has an index of one less than the length of the list.
But indexes don’t always have to be positive, they can be negative too. What do you think negative indexes indicate?
While positive indexes return elements from the start of the list, negative indexes return values from the end of the list. This saves us from the trivial calculation which we would have to otherwise perform if we wanted to return the nth element from the end of the list. So instead of trying to return List_name[len(List_name)-1] element, we can simply write List_name[-1].
Using negative indexes, we can return the nth element from the end of the list easily. If we wanted to return the first element from the end, or the last index, the associated index is -1. Similarly, the index for the second last element will be -2, and so on. Remember, the 0th index will still refer to the very first element in the list.
But what if we wanted to return a range of elements between two positions in the lists? This is called Slicing. All we have to do is specify the start and end index within which we want to return all the elements – List_name[start : end].
One important thing to remember here is that the element at the end index is never included. Only the elements from the start index up to index end-1 are returned.
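Positive indexes, negative indexes and slicing can be sketched together on a small sample list (the planet names are just illustrative data):

```python
planets = ["Mercury", "Venus", "Earth", "Mars", "Jupiter"]

first = planets[0]         # 'Mercury'
last = planets[-1]         # 'Jupiter', same as planets[len(planets) - 1]
second_last = planets[-2]  # 'Mars'

# Slice from index 1 up to, but not including, index 4
middle = planets[1:4]      # ['Venus', 'Earth', 'Mars']

print(first, last, second_last, middle)
```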
Appending values in Lists
We can add new elements to an existing list using the append() or insert() methods:
- append() – Adds an element to the end of the list
- insert() – Adds an element to a specific position in the list which needs to be specified along with the value
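Both methods in action, on a made-up list:

```python
planets = ["Mercury", "Earth"]

planets.append("Mars")      # add to the end
planets.insert(1, "Venus")  # add at index 1, shifting later elements

print(planets)  # ['Mercury', 'Venus', 'Earth', 'Mars']
```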
Removing elements from Lists
Removing elements from a list is as easy as adding them and can be done using the remove() or pop() methods:
- remove() – Removes the first occurrence from the list that matches the given value
- pop() – This is used when we want to remove an element at a specified index from the list. However, if we don’t provide an index value, the last element will be removed from the list
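A short sketch of both methods, using an illustrative list of letters:

```python
letters = ["a", "b", "c", "b", "d"]

letters.remove("b")                # removes only the first 'b'
removed_at_index = letters.pop(1)  # removes and returns the element at index 1
removed_last = letters.pop()       # no index: removes and returns the last element

print(letters, removed_at_index, removed_last)  # ['a', 'b'] c d
```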
Sorting Lists
You will often need to sort the elements of a list, so it is very important to know about the sort() method. It lets you sort list elements in-place in either ascending or descending order:
But where things get a bit tricky is when you want to sort a list containing string elements. How do you compare two strings? Well, string values are sorted using the Unicode code points of their characters (for plain English text, these are the same as the ASCII values). Each character in the string has an integer value associated with it. We use these values to sort the strings.
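For example, with some made-up scores:

```python
scores = [40, 10, 30, 20]

scores.sort()              # ascending, modifies the list in place
ascending = list(scores)   # keep a copy to compare later

scores.sort(reverse=True)  # descending
descending = list(scores)

print(ascending, descending)  # [10, 20, 30, 40] [40, 30, 20, 10]
```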
On comparing two strings, we just compare the integer values of each character from the beginning. If we encounter the same characters in both the strings, we just compare the next character until we find two differing characters. It is, of course, done internally so you don’t have to worry about it!
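A quick sketch of string sorting, with sample words chosen to show that uppercase letters have smaller integer values than lowercase ones:

```python
words = ["banana", "apple", "Cherry", "apricot"]
words.sort()

# ord() returns the integer value of a character
code_points = [ord(ch) for ch in "Ca"]

print(words)        # ['Cherry', 'apple', 'apricot', 'banana']
print(code_points)  # [67, 97]
```

"apple" comes before "apricot" because the first two characters match and "p" has a smaller value than "r".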
Concatenating Lists
We can even concatenate two or more lists by simply using the + symbol. This will return a new list containing elements from both the lists:
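For instance (the list names are illustrative):

```python
first_half = [1, 2]
second_half = [3, 4]

# + builds a new list; the originals are left untouched
combined = first_half + second_half

print(combined)  # [1, 2, 3, 4]
```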
List comprehensions
A very interesting application of Lists is List comprehension which provides a neat way of creating new lists. These new lists are created by applying an operation on each element of an existing list. It will be easy to see their impact if we first check out how it can be done using the good old for-loops:
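As an illustrative task, here is one way to square every number of a list with a for-loop:

```python
numbers = [1, 2, 3, 4, 5]

squares = []
for n in numbers:
    squares.append(n * n)

print(squares)  # [1, 4, 9, 16, 25]
```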
Now, we will see how we can concisely perform this operation using list comprehensions:
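The same kind of squaring task, sketched as a list comprehension, with an optional filter added as a second example:

```python
numbers = [1, 2, 3, 4, 5]

# One line replaces the loop and the append() calls
squares = [n * n for n in numbers]

# A condition can filter the elements as well
even_squares = [n * n for n in numbers if n % 2 == 0]

print(squares, even_squares)  # [1, 4, 9, 16, 25] [4, 16]
```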
See the difference? List comprehensions are a useful asset for any data scientist because you have to write concise and readable code on a daily basis!
Stacks & Queues using Lists
A list is an in-built data structure in Python. But we can use it to create user-defined data structures. Two very popular user-defined data structures built using lists are Stacks and Queues.
Stacks are a list of elements in which the addition or deletion of elements is done from the end of the list. Think of it as a stack of books. Whenever you need to add or remove a book from the stack, you do it from the top. It uses the simple concept of Last-In-First-Out.
Queues, on the other hand, are a list of elements in which the addition of elements takes place at the end of the list, but the deletion of elements takes place from the front of the list. You can think of it as a queue in the real-world. The queue becomes shorter when people from the front exit the queue. The queue becomes longer when someone new adds to the queue from the end. It uses the concept of First-In-First-Out.
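Both ideas can be sketched with plain lists. This is a minimal illustration with made-up values, not a full implementation. Note that removing from the front of a list with pop(0) shifts every remaining element, so for large queues collections.deque from the standard library, with its popleft() method, is usually a better fit.

```python
# Stack: add and remove at the end of the list (Last-In-First-Out)
stack = []
stack.append("book 1")
stack.append("book 2")
stack.append("book 3")
top = stack.pop()          # 'book 3', the last one added

# Queue: add at the end, remove from the front (First-In-First-Out)
queue = []
queue.append("Ann")
queue.append("Bob")
queue.append("Cara")
first_out = queue.pop(0)   # 'Ann', the first one added

print(top, stack)
print(first_out, queue)
```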
Now, as a data scientist or an analyst, you might not be employing this concept every day, but knowing it will surely help you when you have to build your own algorithm!
Data Structure 2: Tuples in Python
Tuples are another very popular in-built data structure in Python. These are quite similar to Lists except for one difference – they are immutable. This means that once a tuple is generated, no value can be added, deleted, or edited.
We will explore this further, but let’s first see how you can create a Tuple in Python!
Creating Tuples in Python
Tuples can be generated by writing values within (parentheses) and each element is separated by a comma. But even if you write a bunch of values without any parenthesis and assign them to a variable, you will still end up with a tuple! Have a look for yourself:
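A minimal sketch (names and values are illustrative):

```python
# With parentheses
planets = ("Mercury", "Venus", "Earth")

# Without parentheses: still a tuple
numbers = 1, 2, 3

# A one-element tuple needs a trailing comma
single = (5,)

print(type(planets), type(numbers), type(single))
```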
Ok, now that we know how to create tuples, let’s talk about immutability.
Immutability of Tuples
Anything that cannot be modified after creation is immutable in Python. Python objects can be divided into mutable and immutable ones.
Lists, dictionaries, sets (we will be exploring these in the further sections) are mutable objects, meaning they can be modified after creation. On the other hand integers, floating values, boolean values, strings, and even tuples are immutable objects. But what makes them immutable?
Everything in Python is an object. So we can use the built-in id() function, which returns the identity of an object. In CPython, this identity is the object's memory address. Let’s create a list and determine the location of the list and its elements:
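A small sketch with a made-up list; the actual numbers printed will differ from run to run:

```python
values = [1, 2, 3]

list_id = id(values)        # identity of the list object
element_id = id(values[0])  # identity of its first element

print(list_id, element_id)
print(list_id != element_id)  # True: they are two different objects
```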
As you can see, both the list and its element have different locations in memory. Since we know lists are mutable, we can alter the value of its elements. Let’s do that and see how it affects the location values:
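Continuing with the same kind of sample list, we can compare the identities before and after changing an element:

```python
values = [1, 2, 3]
list_id_before = id(values)
element_id_before = id(values[0])

values[0] = 100  # change the first element

list_id_after = id(values)
element_id_after = id(values[0])

print(list_id_before == list_id_after)        # True: same list object
print(element_id_before == element_id_after)  # False: a different element object
```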
The location of the list did not change but that of the element did. This means that the list itself was modified in place: its first slot now refers to a different object. This is what is meant by mutable. A mutable object is able to change its state, or contents, after creation but an immutable object is not able to do that.
But we can call tuples pseudo-immutable because even though they are immutable, they can contain mutable objects whose values can be modified!
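A short sketch with a hypothetical tuple that holds a list:

```python
record = ("Earth", [1, 2])

# The list inside the tuple can still be modified
record[1].append(3)

# But the tuple's own slots cannot be reassigned
try:
    record[1] = [9, 9]
    reassignment_failed = False
except TypeError:
    reassignment_failed = True

print(record)               # ('Earth', [1, 2, 3])
print(reassignment_failed)  # True
```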
As you can see from the example above, we were able to change the values of a mutable object, a list, contained within an immutable tuple.
Tuple assignment
Tuple packing and unpacking are some useful operations that you can perform to assign values to a tuple of elements from another tuple in a single line.
We already saw tuple packing when we made our planet tuple. Tuple unpacking is just the opposite: assigning values to variables from a tuple:
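Packing and unpacking side by side, with illustrative names:

```python
# Packing: three values become one tuple
planets = "Mercury", "Venus", "Earth"

# Unpacking: the tuple's values are assigned to three variables
first, second, third = planets

print(first, second, third)  # Mercury Venus Earth
```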
It is very useful for swapping values in a single line. Honestly, this was one of the first things that got me excited about Python, being able to do so much with such little coding!
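The swap looks like this (the variable names are arbitrary):

```python
a, b = 1, 2

# The right side is packed into a tuple, then unpacked into a and b
a, b = b, a

print(a, b)  # 2 1
```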
Changing Tuple values
Although I said that tuple values cannot be changed, you can actually make changes to it by converting it to a list using list(). And when you are done making the changes, you can again convert it back to a tuple using tuple().
This change, however, is expensive as it involves making a copy of the tuple. But tuples come in handy when you don’t want others to change the content of the data structure.
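A minimal sketch of the round trip, using made-up values:

```python
planets = ("Mercury", "Venus", "Earth")

temp = list(planets)   # copy the tuple into a list
temp[1] = "Mars"       # make the change on the list
planets = tuple(temp)  # build a new tuple from the list

print(planets)  # ('Mercury', 'Mars', 'Earth')
```

The original tuple is never modified; the name planets simply ends up pointing to a new tuple.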
Data Structure 3: Dictionary in Python
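A dictionary is another Python data structure for storing heterogeneous objects. Dictionaries are mutable, and their keys must be hashable, which in practice usually means immutable values such as strings, numbers or tuples. In older versions of Python, the elements might not come back in the order you inserted them; since Python 3.7, dictionaries preserve insertion order.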
But what sets dictionaries apart from lists is the way elements are stored in it. Elements in a dictionary are accessed via their key values instead of their index, as we did in a list. So dictionaries contain key-value pairs instead of just single elements.
Generating Dictionary
Dictionaries are generated by writing keys and values within { curly } brackets, with each key separated from its value by a colon. And each key-value pair is separated by a comma:
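For example, a hypothetical dictionary of capitals:

```python
capitals = {"India": "New Delhi", "France": "Paris", "Japan": "Tokyo"}

print(capitals)
print(len(capitals))  # 3
```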
Using the key of the item, we can easily extract the associated value of the item:
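A quick sketch with illustrative data. Square brackets raise a KeyError for a missing key, while get() returns None instead:

```python
capitals = {"India": "New Delhi", "France": "Paris", "Japan": "Tokyo"}

capital_of_france = capitals["France"]  # lookup by key
missing = capitals.get("Spain")         # no such key: get() returns None

print(capital_of_france, missing)  # Paris None
```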
Keys are unique. If you write a dictionary with the same key more than once, only one item is kept, and its value is the one given with the last occurrence of the key:
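With a made-up price list:

```python
prices = {"apple": 1, "banana": 2, "apple": 3}

print(prices)       # {'apple': 3, 'banana': 2}
print(len(prices))  # 2
```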
Dictionaries are very useful to access items quickly because, unlike lists and tuples, a dictionary does not have to iterate over all the items finding a value. Dictionary uses the item key to quickly find the item value. This concept is called hashing.
Accessing keys and values
You can access the keys from a dictionary using the keys() method and the values using the values() method. These we can view using a for-loop or turn them into a list using list():
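Both approaches, sketched on sample data:

```python
capitals = {"India": "New Delhi", "France": "Paris"}

# Loop over the keys
for country in capitals.keys():
    print(country)

# Or turn keys and values into lists
countries = list(capitals.keys())
cities = list(capitals.values())

print(countries, cities)
```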
We can even access these values simultaneously using the items() method which returns the respective key and value pair for each element of the dictionary.
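For instance, again with illustrative data:

```python
capitals = {"India": "New Delhi", "France": "Paris"}

# Each item is a (key, value) tuple, which we can unpack in the loop
for country, city in capitals.items():
    print(country, "->", city)

pairs = list(capitals.items())
print(pairs)  # [('India', 'New Delhi'), ('France', 'Paris')]
```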
Data Structure 4: Sets in Python
Sometimes you don’t want multiple occurrences of the same element in your list or tuple. It is here that you can use a set data structure. Set is an unordered, but mutable, collection of elements that contains only unique values.
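A small sketch of creating sets from made-up values; the printed order may differ from the order shown in the code:

```python
# Duplicates are dropped when the set is built
numbers = {3, 1, 2, 3, 1}

# set() builds a set from any iterable, such as a string
unique_letters = set("hello")

print(numbers)
print(unique_letters)
```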
You will see that the values are not in the same order as they were entered in the set. This is because sets are unordered.
Add and Remove elements from a Set
To add values to a set, use the add() method. It lets you add any hashable value, so mutable objects such as lists cannot be added:
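A short sketch with illustrative values:

```python
colors = {"red", "green"}

colors.add("blue")  # a new value is added
colors.add("red")   # already present, so nothing changes
colors.add((1, 2))  # a tuple is fine because it is hashable

# A list is mutable and unhashable, so add() raises TypeError
try:
    colors.add([1, 2])
    list_rejected = False
except TypeError:
    list_rejected = True

print(len(colors), list_rejected)  # 4 True
```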
To remove values from a set, you have two options to choose from:
- The first is the remove() method which gives an error if the element is not present in the Set
- The second is the discard() method which removes elements but gives no error when the element is not present in the Set
If the value does not exist, remove() will give an error but discard() won’t.
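The difference between the two methods can be seen in a quick sketch (the set contents are made up):

```python
colors = {"red", "green", "blue"}

colors.remove("red")      # present, so it is removed
colors.discard("green")   # present, so it is removed
colors.discard("purple")  # absent, but no error is raised

# remove() raises KeyError for an absent value
try:
    colors.remove("purple")
    remove_failed = False
except KeyError:
    remove_failed = True

print(colors, remove_failed)  # {'blue'} True
```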
Set operations
Using Python Sets, you can perform operations like union, intersection and difference between two sets, just like you would in mathematics.
Union of two sets gives values from both the sets. But the values are unique. So if both the sets contain the same value, only one copy will be returned:
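With two sample sets, either the | operator or the union() method can be used:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

union = a | b  # same result as a.union(b)

print(union)
```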
Intersection of two sets returns only those values that are common to both the sets:
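With the same kind of sample sets, using the & operator or the intersection() method:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

common = a & b  # same result as a.intersection(b)

print(common)
```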
Difference of a set from another gives only those values of the first set that are not present in the second set:
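Note that the order of the operands matters, as this sketch with sample sets shows:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

only_in_a = a - b  # same result as a.difference(b)
only_in_b = b - a

print(only_in_a, only_in_b)
```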
CONCLUSION
Isn’t Python a beautiful language? It provides you with so many options to handle your data more efficiently. And learning about data structures in Python is a key aspect of your own learning journey.