Series List
Filter

Rated 4.5/5
based on 12 customer reviews

- by Priyankur Sarkar
- 10th Sep, 2019
- Last updated on 10th Sep, 2019
- 15 mins read

Whenever you visit a pharmacy and ask for a particular medicine, have you noticed something? It hardly takes any time for the pharmacist to find it among several medicines. This is because all the items are arranged in a certain fashion which helps them know the exact place to look for. They may be arranged in alphabetical order or according to their category such as ophthalmics or neuro or gastroenterology and so on. A proper arrangement not only saves time but also make operations simple and easy, hence sorting is essential.

At some point or the other, every programmer needs to learn one of the most essential skills, Sorting. Python sorting functions comprise of a lot of features to perform basic sorting or customize ordering according to the programmer’s needs.

Basically, sorting is a technique that is used to arrange data in an increasing and decreasing fashion according to some linear relationship among the data elements. You can sort numbers, or names or records of any kind in any fashion according to your needs. Sorting techniques can be used to arrange a list of mail recipients in an alphabetical manner.

There are a number of sorting algorithms in Python starting from Bubble Sort, Insertion Sort, Quick Sort, Merge Sort, Selection Sort and so on. In this article we will look into how to use sorted() and sort() in Python. To learn more about other concepts of Python, go through our Python Tutorials.

In simple terms, sorting means arranging data systematically. If the data you want to work with is not sorted you will face problems in finding your desired element.

The main advantages of sorting elements in Python are:

- When you work with sorting modules, you will get to know about a large number of language components.
- Sorting Algorithms provide an abstract way of learning about the accuracy of your program without worrying about system developments or dependencies.
- It will help you in understanding the program complexity and speed and how to increase the efficiency.

sorted() is a built-in function that accepts an iterable and returns the sorted values in ascending order by default which contains the iterable items.

Sorting Numbers using sorted()

Let us define a list of integers called num_list and pass it as an argument to** sorted()**:

>>> num_list = [4, 10, 1, 7] >>> sorted(num_list) [1, 4, 7, 10] >>> num_list [4, 10, 1, 7]

Some of the insights we gain from the code above are:

- sorted() is a built-in function found in the Python Standard Library. It cannot be defined.
- sorted() orders the values in
**num_list**in ascending order by default, i.e. smallest to largest. - The original values of num_list are not changed.
- sorted() being called, returns an ordered list of values.

Since sorted() function returns the list in order, we can assign the returned list to another variable:

>>> num_list = [4, 10, 1, 7] >>> sorted_list = sorted(num_list) >>> sorted_list [1, 4, 7, 10] >>> num_list [4, 10, 1, 7]

A new variable sorted_list is created which holds the output of sorted().

You can also use sorted() to sort tuples and sets just like numbers:

>>> tuples = (4, 10, 1, 7) >>> sets = {10, 5, 10, 0, 2} >>> sorted_tuples = sorted(numbers_tuple) >>> sorted_sets = sorted(numbers_set) >>> sorted_tuples [1, 4, 7, 10] >>> sorted_sets [0, 2, 5, 10]

The definition of sorted() states that it will return a new list whatever the input may be. So even if the input variables are tuples and sets, sorted() always returns a list.

You can also perform type casting in cases where you need to match the returned object with the input type:

>>> tuples = (4, 10, 1, 7) >>> sets = {10, 5, 10, 0, 2} >>> sorted_tuples = sorted(numbers_tuple) >>> sorted_sets = sorted(numbers_set) >>> sorted_tuples [1, 4, 7, 10] >>> sorted_sets [0, 2, 5, 10] >>> tuples(sorted_tuples) (1, 4, 7, 10) >>> sets(sorted_sets) {0, 2, 5, 10}

In the code above, you can see the sorted_tuples when cast to tuples is retained in an ordered manner whereas sorted_sets when casted does not return an order list since it is unordered by definition.

Sorting of strings is just like sorting tuples and sets.** sorted() **iterates across each character of the input and returns a string order.

An example of sorting str type using sorted():

>>> num_string = '98765' >>> char_string = 'sorting is fun' >>> sorted_num_string = sorted(num_string) >>> sorted_char_string = sorted(char_string) >>> sorted_num_string ['5', '6', '7', '8', '9'] >>> sorted_char_string ['', '','f', 'g', 'i', 'i', 'n', 'n', 'o', 'r', 's', 's', 't','u']

The str is treated as a list and sorted() iterates through each character including spaces.

You can use .split() to change the behavior and clean the output and .join() to rejoin them together:

>>> string = 'sorting is fun' >>> sorted_string = sorted(string.split()) >>> sorted_string ['fun', 'is', 'sorting'] >>> ' '.join(sorted_string) 'fun is sorting'

The actual string is converted into a list of words using .split() and then it is sorted with sorted() and then again joined together using .join().

The syntax of the sorted() function is sorted(iterable, /, *, key=None, reverse=False).

The built-in function sorted() comprises of three parameters:

— Required. A sequence such as string, tuple or list and collection such as set or dictionary.**iterable**Optional. A function that serves as a key or to customize the sort order. The argument is set to**key**—**None**Optional. A boolean flag that reverses the order of sorting. If**reverse**—**True**, the sorted list is reversed. The default argument is**False**.

reverse is an optional keyword argument that changes the sorting order according to the Boolean value assigned to it. The default value is** False**, which performs sorting in ascending order. However, if the value is given as True, descending sort occurs:

>>> name_list = ['Markian', 'Alex', 'Suzzane', 'Harleen'] >>> sorted(name_list) ['Alex', 'Harleen', 'Markian', 'Suzzane'] >>> sorted(name_list, reverse=True) ['Suzzane', 'Markian', 'Harleen', 'Alex']

In the example above, the sorting is done on the basis of the first alphabet. However, when sorted() encounters the reverse keyword with a True argument, the output is reversed.

Another example to understand the behavior of the reverse keyword:

>>> case_sensitive_names = ['Markian', 'alex', 'Suzzane', 'harleen'] >>> sorted(case_sensitive_names, reverse=True) ['harleen', 'alex', 'Suzzane', 'Markian'] >>> values_to_sort = [False, 1, 'A' == 'B', 1 <= 0] >>> sorted(values_to_sort, reverse=True) [1, False, False, False] >>> num_list = [7, 10, 0, 4] >>> sorted(num_list, reverse=False) [0, 4, 7, 10]

The keyword argument key accepts a function and this function determines the resulting order by implementing itself in each value of the list.

An example to illustrate sorting of a list using the function len(), which returns the length of the string, and providing the key argument as** len:**

>>> word = 'pencil' >>> len(word) 6 >>> word_list = ['cherry', 'donut', 'Michigan', 'transcipt'] >>> sorted(word_list, key=len) ['donut', 'cherry', 'Michigan', 'transcript']

The len() function determines the length of each item in the list and returns the list in ascending order (shortest to longest).

Let us sort the earlier example using key where the first alphabet with different case was considered for the order:

>>> case_sensitive_names = ['Markian', 'alex', 'Suzzane', 'harleen'] >>> sorted(case_sensitive_names, reverse=True) ['Markian', 'Suzzane', 'alex', 'harleen'] >>> sorted(case_sensitive_names, key=str.lower) ['alex', 'harleen', 'Markian', 'Suzzane']

The key cannot make any changes to the original values in the list. So the final output will be the original sorted elements.

Though key is considered as one of the most powerful components of** sorted()**, it has a number of limitations.

The first limitation is that key** **accepts only single argument functions.

An example of a function** addition** that accepts two arguments:

>>> def addition(a, b): return a + b >>> number_to_add = [1, 3, 5] >>> sorted(number_to_add , key=addition) Traceback (most recent call last): File "stdin", line 5, in <module> sorted(number_to_add, key=addition) TypeError: addition() missing 1 required positional argument: 'b'

The program fails because whenever** addition()** is called during sorting, it receives only one element from the list at a time. The second argument is always missing.

The second limitation is that the** key** function that is used must be able to handle all types of iterable values.

An example to illustrate the second limitation:

>>> cast_values = ['4', '5', '6', 'seven'] >>> sorted(cast_values, key=int) Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: invalid literal for int() with base 10: 'seven'

The example above contains a list of numbers to be used by sorted() as strings. The key will try to convert the numbers to** int**. Each of the numbers represented as strings can be converted to** int**, but** four** cannot be. So a ValueError** **gets raised since four** **is not valid to cast into an int.

Let us see an example to arrange an iterable by the last letter of each string:

>>> def reverse(word): return word[::-1] >>> words = ['cherry', 'cake', 'Michigan', 'transcript'] >>> sorted(words, key=reverse) ['cake', 'Michigan', 'transcript', 'cherry']

The function reverse is defined to reverse the input string and then the function is used as the key argument. The slice syntax word[::-1]** **reverses the string and then the function reverse()** **takes all the elements one at a time and sorts the list according to the last alphabet.

You can also use lambda function in the key argument instead of defining a regular function. A lambda is an anonymous function that does not have a name and executes just like normal functions. Lambda functions do not contain any statements.

An example to show the previous code using a lambda function:

>>> words = ['cherry', 'cake', 'Michigan', 'transcript'] >>> sorted(words, key = lambda x: x[::-1]) ['cake', 'Michigan', 'transcript', 'cherry']

Here, the key is defined with lambda with no name and x is the argument. The slice syntax word[::-1]** **reverses each of the element and the reversed output is then used for sorting.

An example to use** key** along with reverse argument:

>>> words = ['cherry', 'cake', 'Michigan', 'transcript'] >>> sorted(words, key = lambda x: x[::-1], reverse = True) ['cherry', 'transcript', 'Michigan', 'cake']

In this example, the order is reversed into a descending manner.

Lambda functions can also be used to sort class objects according to their properties.

An example to sort a group of students based on their grade in descending order:

>>> from collections import namedtuple >>> Student = namedtuple('Student', 'name grade') >>> alex = Student('Alex', 95) >>> bob = Student('Bob', 87) >>> charlie = Student('Charlie', 91) >>> students = [alex, bob, charlie] >>> sorted(students, key=lambda x: getattr(x, 'grade'), reverse=True) [Student(name='Alex', grade=95), Student(name='Charlie', grade=91), Student(name='Bob', grade=87)]

The namedtuple is used to produce classes with name** **and grade attributes. The lambda is used to get the grade property of each student and reverse is used to reverse the output into descending order so that the highest grades are arranged first.

There are a lot of possible techniques to arrange elements using sorted() with key and reverse arguments. Lambda functions can also be helpful during sorting by making your code simple and clean.

You can also use operator module functions like itemgetter() and attrgetter() to make your sorting program simpler and faster. The operator module is used to export a set of accessor functions in correspondence to the operators of Python.

An example to illustrate operator module functions using key:

>>> tuples = [

('alex', 'B', 13),

('bob', 'A', 12),

('charles', 'B', 10),

]

>>> from operator import itemgetter

>>> sorted(tuples, key=itemgetter(2))

>>>[('charles', 'B', 10), ('bob', 'B', 12), ('alex', 'A', 13)]

tuples is declared with the name, grade and age of three persons. The function itemgetter is imported from the module operator and then it is sorted by age and the output displayed in ascending order.

The .sort() which is quite similar to sorted() in naming has few differences than sorted(). The** help** documentation of Python will clear out the two critical differences between .sort() and sorted()**:**

>>> help(sorted) Help on built-in function sorted in module builtins: sorted(iterable, /, *, key=None, reverse=False) Return a new list containing all items from the iterable in ascending order. A custom key function can be supplied to customize the sort order, and the reverse flag can be set to request the result in descending order. >>> help(list.sort) Help on method_descriptor: sort(self, /, *, key=None, reverse=False) Stable sort *IN PLACE*.

Firstly, .sort() is not a built-in function unlike sorted(). It is a method of** list **class and works only with lists. You cannot pass iterables to .sort().

Secondly,** .sort()** returns None and changes the values.

Let us see the differences of code for .sort() and what impact it has on the code:

>>> sort_numbers = [10, 2, 7, 3] >>> sort(sort_numbers) Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'sort' is not defined >>> sort_tuples = (10, 2, 7, 3) >>> sort_tuple.sort()>>> sort_tuples = (10, 2, 7, 3) >>> sort_tuple.sort() Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'tuple' object has no attribute 'sort' >>> sorted_values = sort_numbers.sort() >>> print(sorted_values) None >>> sorted_values = sort_numbers.sort() >>> print(sorted_values)int(sort_numbers) [1, 2, 5, 6]

The code above highlights some operational differences between .sort() and sorted()**:**

- When any assignment is done to a new variable, it returns a None type. This is because .sort()
- The original order of sort_numbers is not maintained and is changed in place.

.sort() also contains the key** **and reverse optional keyword arguments just like sorted() which produces the same functionality.

An example of** .sort()** using lambda to sort a list of phrases by the first letter of the third word:

>>> sort_phrases = ['welcome to python', 'python is fun', 'python is easy' ] >>> sort_phrases.sort(key=lambda x: x.split()[2][1], reverse=False) >>> sort_phrases ['python is easy', 'python is fun', 'welcome to python']

Here,** lambda** is used to split each phrase into a list of words and then find the second letter of the third element for each phrase.

Python has some limitations when you try to sort values besides integers.

You cannot use sort data types that are different from each other. Python raises an error when sorted() is used on non-comparable data.

An example to illustrate sorting of values of different data types:

>>> mixed_values = [None, 5] >>> sorted(mixed_values) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: '<' not supported between instances of 'int' and 'NoneType'

Python raises a TypeError because it cannot sort None and int** **in the same list because of their incompatibility. It uses the less than operator ( < ) to determine the lower value in the order of the sort.

If you try to compare the same values manually without using sorted(), it will still raise a TypeError because of non-comparable data types:

>> None < 5 Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: '<' not supported between instances of 'NoneType' and 'int'

However, if your list contains a combination of integers and strings that are all numbers, Python will cast them to comparable types using a list comprehension:

>>> num_mix = [10, "5", 200, "11"] >>> sorted(num_mix) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: '<' not supported between instances of 'str' and 'int' >>> # List comprehension to cast all values to integers >>> [int(z) for z in num_mix] [10, 5, 200, 11] >>> sorted([int(z) for z in num_mix]) [5, 10, 11, 200]

int()** **converts all the string values in num_mix to integers and then sorted() compares all values and returns a sorted output.

An example of a Python code of implicitly converting a value to another type:

>>> values = [1, False, 0, 'a' == 'b', 0 >= 1] >>> sorted(values) [False, 0, False, False, 1]

In the example above, all the elements in the list are converted to boolean type.** 0 >= 1** evaluates to a False output. The number 1 and 0 are converted to True and False as bool** **type respectively.

This particular example highlights an important characteristic of sorting– sort stability. Sorting ability means that sorting algorithms are always stable. The original order is retained even if multiple records have the same key argument.

An example to illustrate sort stability:

>>> values = [False, 0, 0, 3 == 4, 1, False, False] >>> sorted(values) [False, 0, 0, False, 0, False, 1]

If you take a look at the original order and the sorted output, you’ll find that the expression 3 == 4 is casted to False and all sorted output is in the actual order. You can also perform complex sorts with the help of the knowledge of sort stability.

Case-Sensitive Sorting

You can use sorted() to sort a list of strings in ascending order which is alphabetical by default:

>>> name_list = ['Markian', 'Alex', 'Suzzane', 'Harleen'] >>> sorted(name_list) ['Alex', 'Harleen', 'Markian', 'Suzzane']

However, Python uses Unicode Code Point of the first letter of each string to evaluate the ascending order of the sort. If there are two names Al and al, Python will treat both of them differently.

An example to return the Unicode Code Point of the first alphabet of each string:

>>> case_sensitive_names = ['Markian', 'alex', 'Suzzane', 'harleen'] >>> sorted(case_sensitive_names) ['Markian', 'Suzzane', 'alex', 'harleen'] >>> # List comprehension for Unicode Code Point of first letter in each word >>> [(ord(name[0]), name[0]) for name in sorted(case_sensitive_names)] [(77, 'M'), (83, 'S'), (97, 'a'), (104, 'h')]

In the example above, name[0] returns the first letter of the string and ord(name[0])** **returns the Unicode Code Point. You can notice that even a comes before M alphabetically, the output has M before a. This is because the code point of M comes before a.

Consider a situation where the first letter is the same for all the strings that need to be sorted. In such cases, the sorted() function will use the second letter to determine the order and if the second letter is also same, it will consider the third letter and so on, till the end of string:

>>> similar_strings = ['zzzzzn', 'zzzzzc', 'zzzzza','zzzzze'] >>> sorted(similar_strings) ['zzzzza', 'zzzzzc', 'zzzzze', 'zzzzzn']

Here,** sorted() **will compare the strings based on the sixth character since the first five characters are the same ( z ). The output will also depend on the last character of each string.

An example of sorting elements having identical values:

>>> different_lengths = ['zzzzz', 'zz', 'zzzz','z'] >>> sorted(different_lengths) ['z', 'zz', 'zzzz', 'zzzzz']

In this case, the sorting order will be from the shortest to the longest. The shortest string z is ordered first and the longest string zzzzz is ordered at the last.

Let us consider a case where you need to collect data from a race of 5k runners, the Python 5k Annual and then sort them. You will have to collect the runner’s bib number and the time it took to finish the race:

>>> from collections import namedtuple >>> Runner_data = namedtuple('Runner', 'bibnumber duration')

Each of the Runner_data will be added to a list called** runners**:

>>> runners = [] >>> runners.append(Runner_data('2548597', 1200)) >>> runners.append(Runner_data('8577724', 1720)) >>> runners.append(Runner_data('2666234', 1600)) >>> runners.append(Runner_data('2425114', 1450)) >>> runners.append(Runner_data('2235232', 1620)) ... ... >>> runners.append(Runner_data('2586674', 1886))

The bib number and the total time taken by the runner is added to** runners** each time they cross the finish line.

Now, you know the top five runners according to the duration time are the winners and the rest of them will be sorted by the fastest time:

>>> runners.sort(key=lambda x: getattr(x, 'duration')) >>> fastest_five_runners = runners[:5]

In this example, we didn’t need any multiple types of sorting. The list was a reasonable choice. You just sorted the participants and grabbed the fastest five runners. Storing the list elsewhere was also not needed. The lambda function is used here to get the duration of each runner and then sorting is performed. Finally, the result is stored in fastest_five_runners.

However, the managing director of the race comes to you and informs that they have decided that every 20th runner will be awarded a free sports bag. Since the original data has been changed and cannot be recoverable, it is impossible to find every 20th runner.

In such cases, where you find a slight possibility that the original data is to be recovered, use** sorted()** instead of sort()**.**

Let us implement the same code above using sorted():

>>> runners_by_time = sorted(runners, key=lambda x: getattr(x, 'duration')) >>> fastest_five_runners = runners_by_time[:5]

In this situation, sorted() holds the original list of runners and their data and is not overwritten. You can find every 20th person to cross the finish line by interacting with the original values:

`>>> every_twentieth_runner = runners[::20]`

List slice on runners is used to create every_twentieth_runner that holds the actual order in which runners crossed the finish line.

So, sorted()** **should be used in cases where the original data is to be retained and** sort()** should be used where the original data is a copy or unimportant and losing it won’t stand as an issue.

There were mainly two approaches of sorting when Python 2** **was released— decorated-sort-undecorated and using cmp parameter.

Decorated-Sort-Undecorated

This idiom Decorated-Sort-Undecorated is based upon three three steps:

- First of all, the original list is decorated with new elements which manages the sort order.
- Secondly, sorting is performed on the decorated list.
- Finally, a list is created that contains the original elements in the new order and the decorations are removed.

Let us see an example of the DSU approach using a class:

>>> class Student: def prop(self,name, grade, age): self.name = name self.grade = grade self.age = age def stu_repr(self): return repr((self.name, self.grade, self.age)) >>> student_objects = [ Student('alex', 'B', 13), Student('bob', 'A', 12), Student('chrles', 'B', 10), ] #Regular sorting using sorted() >>> sorted(student_objects, key=lambda student: student.age) [('charles', 'B', 10), ('bob', 'A', 12), ('alex', 'B', 13)] #DSU Approach >>> decorated_values = [(student.grade, i, student) for i, student in enumerate(student_objects)] >>> decorated_values.sort() >>> [student for grade, i, student in decorated_values] [('bob', 'A', 12), ('alex', 'B', 13),('charles', 'B', 10)]

In this code above, a class Student is created with student objects name,** grade **and age. Firstly, the original values are decorated and then sorted. Finally, the decorations are removed from decorated_values and then the new list is created with original values in new order.

The Decorated-Sort-Undecorated** **technique is also the Schwartzian Transform and is helpful in increasing the efficiency of sorting in Python.

Using cmp Parameter

cmp is a method or parameter in Python that is used to compare two arguments. It returns either of the three values– a negative value in case of less than (<) comparisons or zero if equal or a positive value for greater than (>) comparisons.

An example to illustrate cmp using sorted():

>>> def num_compare(a, b): return a - b >>> sorted([9, 2, 5, 0, 7], cmp=num_compare) [0, 2, 5, 7, 9]

Here, a function num_compare is created and then the list is sorted by comparing each value in the list. Finally, the output is displayed in ascending order.

Note that cmp parameter will work only in Python 2 . It is completely removed from Python 3 to make the language more simple and to resist conflicts between other comparison techniques and** cmp**.

Let us sum up what we have learned in this article so far—

- Sorting and its needs.
- How to use sorted() to sort values with and without key and reverse.
- How to use
**.sort()**to order values with and without key and reverse. - Limitations and Gotchas with Python Sorting.
- Appropriate use of .sort() and sorted().

Both .sort() and sorted() can be used to sort elements in a similar manner if used properly with key and reverse arguments.

However, both have different characteristics when output and in-place modifications are considered, so it is suggested to first have a clear understanding of the program to be worked upon, while using .sort() since it can irrevocably overwrite data.

To become a good Python developer, understanding complex sorting algorithms would be a useful skill set in the long run. For more information about sorting in Python, look into the official documentation of sorting of the Python Software Foundation and also grab a glimpse of another Python sorting algorithm called the TimSort. You may also join our Python certification course to gain further skills and knowledge in Python.

Rated 5.0/5
based on 43 customer reviews

12949

- by Priyankur Sarkar
- 05 Sep 2019
- 26 mins read

While you are dealing with data, sometimes you may... Read More

Rated 4.5/5
based on 19 customer reviews

9950

- by KnowledgeHut
- 02 Dec 2019

Programming Languages: Their popularity Every pas... Read More

Rated 4.5/5
based on 19 customer reviews

10023

- by Priyankur Sarkar
- 14 Nov 2019
- 9 mins read

Over the years, Python language has evolved enormo... Read More

Subscribe to our newsletter.

## Join the Discussion

Your email address will not be published. Required fields are marked *