Suman is a Data Scientist working for a Fortune Top 5 company. His expertise lies in the field of Machine Learning, Time Series & NLP. He has built scalable solutions for retail & manufacturing organisations.

Data Structures and Algorithms in Python [With Examples]


Published: 05th Sep, 2023


Data Structures and Algorithms are among the most important fields of study in computer science. Any app or software we use contains thousands of lines of code written using several Data Structures and Algorithms. Like other programming languages, Python provides the flexibility to use both simple and complex data structures and algorithms. Refer to the Data Science Bootcamp Curriculum to get a deeper understanding of Data Structures and Algorithms using Python.

Python is one of the most used programming languages in Data Science. Its versatility makes it usable for many types of projects. It is simple, flexible, and provides a rich set of libraries and packages for performing data operations.

In terms of Data Science, two aspects of Python are commonly used in real life:

- Python for Data Analysis, which includes libraries like Pandas, NumPy, and Plotly.
- Data Structures in Python, which include both built-in data structures such as lists and sets, and user-defined data structures such as arrays and linked lists.

Data Structures allow efficient storing and retrieval of data which is highly desirable in an enterprise setting.

The significance of Data Structures in Python cannot be overstated. The Data Science course explains various built-in data structures and their functionalities. Some built-in Data Structures in Python are used in day-to-day work. Let’s quickly learn two terms before we dive in:

**Mutable**: Mutable objects can be modified or changed after they are created.

**Immutable**: Immutable objects cannot be modified or changed after they are created.

**1. Tuple:** Tuples are immutable Data Structures in Python. You cannot add items to or delete items from a tuple, and you cannot modify an item once the tuple has been defined.

- Syntax - Tuples are created using () or tuple().
- Advantages - Compared to lists, tuples are lighter and generally somewhat faster to create. Provided all of their items are hashable, they can also be used as dictionary keys.
- Some notable methods & tips - index(), count(), etc.
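The tuple behaviour described above can be sketched as follows. The variable names are illustrative only.

```python
# A tuple can be created with () or with tuple()
point = (3, 4, 3)
same_point = tuple([3, 4, 3])

first_three = point.index(3)   # position of the first 3
threes = point.count(3)        # how many times 3 occurs

# Tuples made of hashable items can serve as dictionary keys
distances = {(0, 0): 0.0, (3, 4): 5.0}

# Item assignment is rejected because tuples are immutable
try:
    point[0] = 10
    modified = True
except TypeError:
    modified = False
```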

**2. List:** A list is a sequential data structure that is mutable.

- Syntax - Lists are defined using [] or list().
- Advantages - You can add, remove, and even modify elements in place in a list.
- Some notable methods & tips - slicing, append(), insert(), pop(), and many more.
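A short sketch of the list operations named above, using made-up values:

```python
numbers = [10, 20, 30]

numbers.append(40)        # add to the end
numbers.insert(1, 15)     # insert 15 at index 1
last = numbers.pop()      # remove and return the last element
numbers[0] = 5            # modify an element in place
middle = numbers[1:3]     # slicing returns a new list
```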

**3. Dictionary:** A dictionary stores data as key-value pairs. Keys must be immutable (hashable) objects, whereas the values can be modified.

- Syntax - Dictionaries are created using {} or dict().
- Advantages - It provides a flexible way to store data. Each key maps to one value, but that value can itself be a collection, such as a list, holding multiple items.
- Some notable methods & tips - keys(), values(), items(), update(), the del statement, etc.
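The dictionary operations listed above could look like this in practice. The keys and values are invented for illustration.

```python
stock = {"apple": 3, "pear": 5}

stock["apple"] = 4             # modify the value for an existing key
stock.update({"plum": 7})      # add or overwrite several keys at once
del stock["pear"]              # remove a key-value pair

names = list(stock.keys())
counts = list(stock.values())
pairs = list(stock.items())

# One key can hold several items when the value is a list
tags = {"python": ["lists", "dicts"]}
tags["python"].append("sets")
```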

**4. Set:** Often you need to pass unique data to your program. A set is an unordered collection of unique items stored in a single variable.

- Syntax - Sets are created using set(), or using curly braces with elements such as {1, 2, 3}. Note that {} on its own creates an empty dictionary, not a set.
- Advantages - A set stores only unique elements, so duplicates are dropped automatically. Its elements are unordered, and the set itself is mutable.
- Some notable methods & tips - add(), clear(), intersection(), etc.
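A brief sketch of how a set keeps only unique elements. Note that the built-in method for finding common elements is intersection().

```python
seen = set([1, 2, 2, 3])      # the duplicate 2 is dropped
seen.add(4)
seen.add(2)                   # already present, so nothing changes

evens = {2, 4, 6}
common = seen.intersection(evens)

empty_braces = {}             # this is an empty dict, not a set
```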

Apart from the built-in Data Structures we mentioned, several user-defined data structures are used extensively while building large-scale systems in various companies.

**1. Arrays:** An array stores elements in contiguous memory locations, and each element can be accessed by its index.

- Syntax - 1D or nD arrays could be created using numpy.array().
- Advantages - An array stores elements of the same data type together, so adding an offset to the base address gives the position of an element. Given the element’s index, retrieval is very fast.
- Some notable methods & tips - numpy.ones(), numpy.empty(), numpy.concatenate(), etc.
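As a minimal sketch of same-typed, index-addressed storage, the example below uses the standard library's array module instead of numpy.array(). That substitution is an assumption made here so the snippet runs without third-party packages; indexing works the same way on a NumPy array.

```python
from array import array

# 'i' declares that every element is a signed integer
readings = array("i", [10, 20, 30, 40])

third = readings[2]             # direct access by index
readings.append(50)

# Every element occupies the same number of bytes, which is what
# makes "base address + index * item size" addressing possible
item_bytes = readings.itemsize

# A value of a different type is rejected
try:
    readings.append(1.5)
    accepted = True
except TypeError:
    accepted = False
```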

**2. Stack:** A stack is a linear data structure. It can be pictured as a pile of dishes stacked on top of one another.

- Syntax - collections.deque() or queue.LifoQueue() would create a stack.
- Advantages - A stack follows the LIFO (Last In First Out) principle: the element inserted last is the first one to be removed. This suits tasks such as undo histories and tracking nested function calls.
- Some notable methods & tips - The basic stack operations are push and pop. With collections.deque these are append() and pop(); with queue.LifoQueue they are put() and get().
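A small sketch of LIFO behaviour using collections.deque. A deque has no method named push(); append() plays that role.

```python
from collections import deque

stack = deque()
stack.append("plate 1")    # push
stack.append("plate 2")
stack.append("plate 3")

top = stack.pop()          # the last item pushed comes off first
next_top = stack[-1]       # peek at the new top without removing it
```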

**3. Queues:** Similar to a ticket queue outside a movie theater or railway station, the Python Queue data structure follows the First In First Out (FIFO) principle. The item inserted first is the one removed first. Inserting an element into a queue is termed enqueue, and removing one is termed dequeue.

- Syntax - queue.Queue() would create a queue in Python.
- Advantages - Items are processed in the order they arrive, which suits task scheduling and buffering. queue.Queue is also safe to share between threads.
- Some notable methods & tips - Some of the basic operations of a Python Queue are put(), get(), empty(), qsize(), etc.
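The FIFO behaviour can be sketched with queue.Queue, where put() enqueues and get() dequeues. The names in the queue are illustrative.

```python
from queue import Queue

tickets = Queue()
for name in ["Asha", "Ben", "Chen"]:
    tickets.put(name)               # enqueue

waiting = tickets.qsize()           # number of items currently queued
first_served = tickets.get()        # dequeue: the earliest arrival leaves first
second_served = tickets.get()
is_empty = tickets.empty()
```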

**4. Trees:** A non-linear Data Structure consisting of nodes which are connected via edges. Some of the notable terminologies in a tree data structure are node, edge, root, depth of a node, height of a node, height of a tree, etc.

- Syntax - A binary tree node in Python could be defined as follows.

    class BinaryTreeNode:
        def __init__(self, data):
            self.data = data
            self.leftChild = None
            self.rightChild = None

- Advantages - A tree captures structural relationships in the data. In the real world, data is not always linear, so a non-linear data structure like a tree allows you to store it efficiently.
- Some notable methods & tips – insert, pre-order traversal, post-order traversal.
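Below is a sketch of insert, pre-order traversal, and post-order traversal. It assumes binary-search-tree ordering (smaller values go left, others go right). The text does not specify an ordering, so treat that rule and the helper names as illustrative.

```python
class BinaryTreeNode:
    def __init__(self, data):
        self.data = data
        self.leftChild = None
        self.rightChild = None


def insert(root, data):
    """Insert data using binary-search-tree ordering and return the root."""
    if root is None:
        return BinaryTreeNode(data)
    if data < root.data:
        root.leftChild = insert(root.leftChild, data)
    else:
        root.rightChild = insert(root.rightChild, data)
    return root


def pre_order(node):
    """Visit the node first, then its left subtree, then its right subtree."""
    if node is None:
        return []
    return [node.data] + pre_order(node.leftChild) + pre_order(node.rightChild)


def post_order(node):
    """Visit the left subtree, then the right subtree, then the node."""
    if node is None:
        return []
    return post_order(node.leftChild) + post_order(node.rightChild) + [node.data]


root = None
for value in [8, 3, 10, 1, 6]:
    root = insert(root, value)
```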

**5. Linked Lists:** A linear data structure where a series of nodes are connected via an address. Each node has data and an address that points to the next node. The address of the first node is generally termed as Head, and the last node generally points to NULL. The types of Linked Lists are Single, Double, and Circular.

- Syntax – Linked Lists in python could be defined as follows.

    class Node:
        def __init__(self, data, next=None):
            self.data = data
            self.next = next

- Advantages - A linked list grows and shrinks at run time, so no fixed block of memory has to be reserved in advance.
- Some notable methods & tips – insert at the beginning, insert after a given node, delete before a node, etc.
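Two of the operations mentioned above, insert at the beginning and insert after a given node, might be sketched like this. The LinkedList wrapper and its method names are hypothetical helpers, not part of any library.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class LinkedList:
    def __init__(self):
        self.head = None            # an empty list has no head node

    def insert_at_beginning(self, data):
        # The new node points at the old head and becomes the new head
        self.head = Node(data, self.head)

    def insert_after(self, node, data):
        # Splice a new node between `node` and its current successor
        node.next = Node(data, node.next)

    def to_list(self):
        items, current = [], self.head
        while current is not None:  # the last node points to None
            items.append(current.data)
            current = current.next
        return items


chain = LinkedList()
chain.insert_at_beginning("c")
chain.insert_at_beginning("a")
chain.insert_after(chain.head, "b")
```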

**6. Graphs:** In a Graph Data Structure, nodes hold data and are connected to one another by edges. For example, on Facebook, entities are connected via a Graph Data Structure: whenever you join a new group or like a page, a new edge is created to store that relationship. Some of the key terms for this data structure are Adjacency, Path, and Directed Graph.

- Syntax - With a graph stored as a dictionary that maps each node to a list of its neighbours, the edges could be generated as follows.

    def generate_edges(graph):
        edges = []
        # for each node in graph
        for node in graph:
            # for each neighbour node of a single node
            for neighbour in graph[node]:
                # if edge exists then append
                edges.append((node, neighbour))
        return edges

- Advantages - A graph represents relationships between entities directly, which makes connected data easier to explore and summarize.
- Some notable methods & tips – Add edge is a notable method in graph.
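A minimal sketch of an add-edge operation on a directed graph stored as a dictionary of adjacency lists. The add_edge helper and the node names are assumptions for illustration.

```python
def add_edge(graph, source, target):
    """Add a directed edge source -> target, creating nodes as needed."""
    graph.setdefault(source, []).append(target)
    graph.setdefault(target, [])


def generate_edges(graph):
    """List every (node, neighbour) pair stored in the adjacency lists."""
    return [(node, neighbour) for node in graph for neighbour in graph[node]]


graph = {}
add_edge(graph, "user", "group")   # the user joins a group
add_edge(graph, "user", "page")    # the user likes a page
add_edge(graph, "group", "page")

edges = generate_edges(graph)
```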

**7. Hash Maps:** Like a dictionary, a hash map stores elements as key-value pairs; Python's built-in dict is itself a hash map. Hash maps are commonly applied where average-case constant-time lookup is needed, and hashing more broadly also appears in cryptographic applications.

- Syntax - Similar to a dictionary, you can define hash maps using {} or dict().
- Advantages - Each key is unique, and lookups by key are fast. To associate several values with one key, store a collection such as a list as the value.
- Some notable methods & tips – keys(), values(), items(), etc.
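To show what happens underneath a hash map, here is a simplified sketch that resolves collisions by chaining. The ChainedHashMap class is hypothetical and far simpler than Python's real dict implementation.

```python
class ChainedHashMap:
    def __init__(self, bucket_count=8):
        # Each bucket holds the (key, value) pairs whose hashes collide
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash of the key decides which bucket to look in
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # keys are unique: overwrite
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


prices = ChainedHashMap()
prices.put("tea", 2.5)
prices.put("coffee", 3.0)
prices.put("tea", 2.75)    # replaces the earlier value for "tea"
```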

When working on a business problem, you need to devise a set of instructions or rules that could solve that problem. These sets of rules could be deemed as Algorithms. These are mainly executed in a finite sequential order. An algorithm could solve problems ranging from simple sorting to recommending certain products in e-commerce.

Certain steps need to be followed while developing any algorithm.

1. Formulate the problem statement and understand the requirements.
2. Determine the objective function which needs to be optimized and figure out the starting point.
3. Set up an endpoint, so that the algorithm knows when to stop.
4. Formulate the intermediate steps that lead from the starting point (step 2) to the endpoint (step 3).
5. Once that is done, review all the previous steps.

While developing an algorithm, the end goal or the objective function should be clear. A good algorithm has certain characteristics.

**Clear outline**: All the steps which are defined while developing an algorithm should be finite, clear, and understandable.

**Precise input and output**: The input and output descriptions should be precise.

**Clearly defined**: Every algorithm step should have a clearly defined output dependent on the input.

**Robust**: An algorithm should not be one-dimensional, i.e., it should be flexible enough to be tweaked for different purposes.

**Language Independent**: An algorithm should be language-agnostic. Since you define a set of instructions within an algorithm, it should be usable across any programming language.

To know more about the importance of Data Structures and algorithms, refer to KnowledgeHut’s best Data Science Bootcamps.

The algorithms could be divided into several classes.

**Greedy Algorithm:** The first approach one usually thinks of while solving a problem is a greedy one, which makes the choice that looks best at each step. Greedy algorithms are typically simple and fast, and for some problems they are all that is needed, but they are not guaranteed to produce the optimal result.

**Divide and Conquer:** Another class of algorithm, used primarily in searching and sorting, is the divide and conquer approach. You divide the problem into several sub-parts, solve each one individually, and combine the results.

**Dynamic Programming:** An advanced class of algorithm that avoids repeated work. Like divide and conquer, you divide the problem into sub-parts, but you also remember the result of each one so that it is computed only once. All sub-parts together give the final output.
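To make the difference between a greedy choice and remembered sub-results concrete, here is a sketch on a coin-counting problem. The problem, the coin values, and the function names are chosen for illustration and do not come from the article.

```python
def greedy_coin_count(coins, amount):
    """Always take the largest coin that still fits (greedy)."""
    count = 0
    for coin in sorted(coins, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else None


def dp_coin_count(coins, amount):
    """Fewest coins for every total up to amount, reusing stored results."""
    unreachable = float("inf")
    best = [0] + [unreachable] * amount
    for total in range(1, amount + 1):
        for coin in coins:
            if coin <= total and best[total - coin] + 1 < best[total]:
                best[total] = best[total - coin] + 1
    return best[amount] if best[amount] != unreachable else None


coins = [1, 3, 4]
greedy_result = greedy_coin_count(coins, 6)   # picks 4 + 1 + 1
dp_result = dp_coin_count(coins, 6)           # finds 3 + 3
```

With these particular coins, the greedy version uses three coins while the dynamic programming version finds the two-coin answer.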

Sorting is a technique which is often used in daily work. Any Data Structures and Algorithms course would teach different sorting algorithms. Some of the most commonly used sorting algorithms are:

**1. Selection Sort:** The array is sorted by repeatedly finding the minimum element of the unsorted portion and swapping it with the first element of that portion. Its time complexity is O(n^2) in the worst, average, and best cases. Selection sort is suitable when the list of elements is small.
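A minimal sketch of selection sort. It works on a copy so the caller's list is left unchanged, which is a design choice made for this example.

```python
def selection_sort(items):
    items = list(items)
    for i in range(len(items) - 1):
        smallest = i
        # Find the minimum of the unsorted portion items[i:]
        for j in range(i + 1, len(items)):
            if items[j] < items[smallest]:
                smallest = j
        # Move it to the front of the unsorted portion
        items[i], items[smallest] = items[smallest], items[i]
    return items
```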

**2. Bubble Sort:** In this sorting algorithm, adjacent elements are compared and swapped repeatedly until the whole list is in order. The best-case time complexity is O(n), reached on already sorted input when the algorithm stops after a pass with no swaps, while the average and worst cases are O(n^2). It is the simplest of the sorting algorithms.
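A sketch of bubble sort that includes the early-exit check. Stopping after a pass with no swaps is what gives the O(n) best case on already sorted input.

```python
def bubble_sort(items):
    items = list(items)
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:     # a full pass without swaps: already sorted
            break
    return items
```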

**3. Insertion Sort:** Here the first element is assumed to be sorted, and each following element is compared against the sorted part and inserted at its correct position. After each iteration, one more element is in its intended place. The time complexity of insertion sort is the same as bubble sort: O(n) in the best case and O(n^2) in the average and worst cases. It is mostly used when the number of elements is small.
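Insertion sort could be sketched as follows, again operating on a copy of the input:

```python
def insertion_sort(items):
    items = list(items)
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements of the sorted part one slot to the right
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current   # drop the element into its place
    return items
```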

**4. Merge Sort:** It is based on the principle of Divide and Conquer mentioned earlier: the list is split into halves, each half is sorted, and the sorted halves are merged. The time complexity of merge sort is O(n*logn) in the best, average, and worst cases.
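A compact sketch of merge sort that returns a new sorted list rather than sorting in place:

```python
def merge_sort(items):
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])     # divide and sort each half
    right = merge_sort(items[mid:])

    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge the two sorted halves
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```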

**5. Quick Sort:** It is also based on the divide and conquer principle. A pivot element is selected first, and elements less than the pivot are placed on its left while those greater than it are placed on its right. The left and right sub-arrays then follow the same approach until each has a single element. The worst-case time complexity is O(n^2), whereas the best and average time complexity is O(n*logn).
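The sketch below favours readability over memory use: it builds new lists around the pivot instead of partitioning in place. Choosing the middle element as the pivot is an assumption made for this example.

```python
def quick_sort(items):
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]   # goes left of the pivot
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]    # goes right of the pivot
    return quick_sort(smaller) + equal + quick_sort(larger)
```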

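**6. Shell Sort:** Elements that are far apart from each other are sorted first, and the interval (gap) between the elements being compared is then reduced step by step. Its time complexity depends on the gap sequence; with the original sequence that halves the gap each time, the worst case is O(n^2) and the best case is O(n*logn).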
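A sketch of shell sort using the simple gap sequence that halves the gap on each round. Other gap sequences exist and change the running time.

```python
def shell_sort(items):
    items = list(items)
    gap = len(items) // 2
    while gap > 0:
        # Insertion sort over elements that are `gap` positions apart
        for i in range(gap, len(items)):
            current = items[i]
            j = i
            while j >= gap and items[j - gap] > current:
                items[j] = items[j - gap]
                j -= gap
            items[j] = current
        gap //= 2            # shrink the interval
    return items
```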

Data Structures and Algorithms go together. Algorithms are generally created to retrieve, search, and sort elements stored in a Data Structure. While choosing the best algorithm, it is important to understand the software, or the system being built and its complexities.

If a project requires low latency, then it’s recommended to choose an algorithm with faster computation. On the other hand, if space is a constraint, you need to understand which data structure to use to ensure less space complexity. Thus, the optimal one should be considered for problem-solving with algorithms and data structures in python.

In this blog, we discussed various Data Structures and Algorithms in Python. Since building large-scale enterprise systems requires good use of DSA, it is important to learn Data Structures and Algorithms.

1. Is Python good for DSA?

Python is an intuitive language to learn, and its readable syntax and built-in data structures make it easier to master DSA. This blog has provided an overview of some common data structures and algorithms in Python.

2. How many data structures are there in Python?

Python has four built-in Data Structures: list, tuple, set, and dictionary. There are other user-defined Data Structures as well.

3. What is the best way to learn DSA in python?

The best way to learn data structures and algorithms is to pick problem statements across various DSA topics and practice solving them in Python. You can also find many tutorials on data structures in Python online.

4. What is the most important Data Structure?

No single Data Structure is the most important; usefulness depends on the problem statement. However, arrays, strings, stacks, queues, heaps, and trees are among the most commonly used.

5. What is syntax in Python?

Syntax is the set of rules that defines how keywords, symbols, and names must be arranged to form valid instructions for solving a problem.
