How to Round Numbers in Python

When working with data, you may come across a biased dataset. In statistics, bias is the difference between the expected value of an estimate and the true value of the parameter being estimated. Working with such data can be dangerous and can lead you to incorrect conclusions. To learn more about other Python concepts, go through our Python Tutorials or enroll in our Python Certification course online.

There are many types of bias, such as selection bias, reporting bias and sampling bias. Rounding bias is a similar problem that affects numeric data. In this article we will see:

  • Why it is important to know the different ways to round numbers
  • How to use various strategies to round numbers
  • How rounding affects data
  • How to round numbers in NumPy arrays and Pandas DataFrames

Let us first learn about Python’s built-in rounding process.

About Python’s Built-in round() Function

Python offers a built-in round() function that rounds a number to a given number of digits. The function accepts two arguments, n and ndigits, and returns n rounded to ndigits decimal places. If ndigits is not provided, the function rounds n to the nearest integer.

Suppose you want to round the number 4.7. It is rounded to the nearest whole number, which is 5. Similarly, the number 4.74 rounded to one decimal place gives 4.7.

Quick and reliable rounding matters when you work with floats that have many decimal places. The built-in round() function makes it simple.

Syntax

round(number, ndigits)

The parameters of the round() function are:

  1. number - the number to be rounded
  2. ndigits (optional) - the number of decimal places to which the number is rounded

If the second parameter is omitted, round() returns:

  • For an integer such as 12, the same integer, 12
  • For a floating-point number, the nearest integer: a fractional part above 0.5 moves to the next whole number, a fractional part below 0.5 drops to the floor integer, and an exact tie of 0.5 goes to the nearest even integer

Let us look at an example where the second parameter is missing.

# For integers
print(round(12))
 
# For floating point
print(round(21.7))  
print(round(21.4))

The output will be:

12
22
21

Now let us look at an example where the second parameter is present.

# when the (ndigits+1)th digit is =5
print(round(5.465, 2))

# when the (ndigits+1)th digit is >5
print(round(5.476, 2))

# when the (ndigits+1)th digit is <5
print(round(5.473, 2))

The output will be:

5.46 
5.48 
5.47
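Results for values that end in 5 are easier to interpret once you see what a float actually stores. Most decimal fractions have no exact binary representation, so the value that round() receives can sit slightly below or above the literal you typed. The sketch below uses 2.675, a value chosen here purely for illustration, and the standard decimal module to display the stored value.

```python
from decimal import Decimal

# 2.675 is an illustrative value, not one of the examples above.
value = 2.675

# Decimal(float) exposes the exact binary value stored for the literal.
stored = Decimal(value)
print(stored)

# The stored value is slightly below 2.675, so this is not a true tie
# and round() returns 2.67 rather than 2.68.
rounded = round(value, 2)
print(rounded)
```

If exact decimal ties matter for your data, this kind of check is a quick way to find out whether a surprising result comes from the rounding rule or from the float representation.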

A practical application of round() function
Fractions and decimals do not always match up neatly. When converting a fraction to a decimal, we often get many digits after the decimal point. For example, ⅙ gives 0.166666667, yet we usually keep only two or three digits to the right of the decimal point. This is where the round() function saves the day.

For example:

x = 1/3
print(x)
print(round(x, 2))

The output will be:

0.3333333333333333 
0.33
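The same idea applies to other fractions with long decimal expansions, including the ⅙ mentioned above. A small sketch, with the dictionary name ratios chosen only for this illustration:

```python
# Illustrative ratios; any fraction with a long decimal expansion works.
ratios = {"1/3": 1 / 3, "1/6": 1 / 6, "2/3": 2 / 3}

# Keep two digits to the right of the decimal point for each ratio.
rounded = {name: round(value, 2) for name, value in ratios.items()}

for name, value in rounded.items():
    print(name, "->", value)
```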

Some errors and exceptions associated with this function
For example,

print(round("x", 2))

The output will be:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-6fc428ecf419> in <module>()
----> 1 print(round("x", 2))
TypeError: type str doesn't define __round__ method
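If the input may contain values that cannot be rounded, one option is to catch the exception. The helper below, safe_round(), is a hypothetical name introduced for this sketch and is not part of Python; it simply returns None when round() raises a TypeError.

```python
def safe_round(value, ndigits=None):
    """Hypothetical helper: round value, or return None if it cannot be rounded."""
    try:
        return round(value, ndigits)
    except TypeError:
        # Raised when the type does not define __round__, for example str.
        return None


print(safe_round(3.14159, 2))
print(safe_round("x", 2))
```

Whether returning None, raising a clearer error, or converting the input first is the better choice depends on how the calling code treats missing values.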

Another example,

print(round(1.5))
print(round(2))
print(round(2.5))

The output will be:

2
2
2

The function round() rounds 1.5 up to 2, and 2.5 down to 2. This is not a bug: Python’s built-in round() breaks ties by rounding to the nearest even integer. In this article you will learn a few other ways to round a number. Let us look at the variety of methods.
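To see the tie-breaking pattern more clearly, the short sketch below rounds a run of values that end in exactly .5. These values were picked for illustration because each one is stored exactly as a binary float, so every case is a genuine tie.

```python
# Each value is exactly representable as a float, so each is a true tie.
ties = [0.5, 1.5, 2.5, 3.5, 4.5]

# The built-in round() sends every tie to the nearest even integer.
results = [round(x) for x in ties]
print(results)  # [0, 2, 2, 4, 4]
```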

Diverse Methods for Rounding

There are many ways to round a number, each with its own advantages and disadvantages. Here we will learn some of these techniques.

Truncation

Truncation, as the name suggests, means shortening things. It is one of the simplest ways to round a number: the number is cut off at a given number of digits. In this method, each digit after a given position is replaced with 0. Let us look at some examples.

Value     Truncated To         Result
19.345    Tens place           10
19.345    Ones place           19
19.345    Tenths place         19.3
19.345    Hundredths place     19.34
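The examples in this section call a truncate() helper whose definition is not shown here. A minimal sketch, assuming it follows the same shift, cut and shift-back pattern as the round_up() function later in this article, could look like this:

```python
def truncate(n, decimals=0):
    """Drop every digit after the given decimal position (assumed definition)."""
    multiplier = 10 ** decimals
    # int() discards the fractional part, which truncates toward zero.
    return int(n * multiplier) / multiplier
```

Because int() always cuts toward zero, this version behaves the same way for positive and negative inputs, and a negative decimals value moves the cut to the left of the decimal point.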

The truncate() function can be used for positive as well as negative numbers:

>>> truncate(19.5)
19.0

>>> truncate(-2.852, 1)
-2.8

>>> truncate(2.825, 2)
2.82

The truncate() function can also be used to truncate digits towards the left of the decimal point by passing a negative number.

>>> truncate(235.7, -1)
230.0

>>> truncate(-1936.37, -3)
-1000.0

When a positive number is truncated, it is effectively rounded down. Similarly, when a negative number is truncated, it is rounded up. Let us now look at other rounding methods.
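The relationship between truncation, floor and ceiling can be checked with the standard math module. The sketch below compares the three for one positive and one negative value; the sample values 2.8 and -2.8 are chosen only for illustration.

```python
import math

# Map each sample value to its (trunc, floor, ceil) results.
results = {
    x: (math.trunc(x), math.floor(x), math.ceil(x))
    for x in (2.8, -2.8)
}

for x, (t, f, c) in results.items():
    print(f"x={x}: trunc={t}, floor={f}, ceil={c}")
```

For the positive value, truncation agrees with floor; for the negative value, it agrees with ceiling.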

Rounding Up

There is another strategy called “rounding up” where a number is rounded up to a specified number of digits. For example:

Value     Round Up To          Result
12.345    Tens place           20
18.345    Ones place           19
18.345    Tenths place         18.4
18.345    Hundredths place     18.35

In mathematics, the ceiling of a number is the nearest integer that is greater than or equal to it. In Python, rounding up relies on the ceil() function from the math module.

A non-integer number lies between two consecutive integers. For example, 5.2 lies between 5 and 6. The ceiling is the higher endpoint of the interval and the floor is the lower one. Therefore, the ceiling of 5.2 is 6, and the floor of 5.2 is 5. However, the ceiling of 5 is 5.

In Python, the ceiling function is implemented by math.ceil(). It always returns the closest integer that is greater than or equal to its input.

>>> import math

>>> math.ceil(5.2)
6

>>> math.ceil(5)
5

>>> math.ceil(-0.5)
0

Notice that the ceiling of -0.5 is 0, and not -1.
Let us look at a short piece of code that implements the “rounding up” strategy with a round_up() function:

def round_up(n, decimals=0): 
    multiplier = 10 ** decimals 
    return math.ceil(n * multiplier) / multiplier

Let’s look at how round_up() function works with various inputs:

>>> round_up(3.1)
4.0

>>> round_up(3.23, 1)
3.3

>>> round_up(3.543, 2)
3.55

You can pass negative values to decimals, just like we did in truncation.

>>> round_up(32.45, -1)
40.0

>>> round_up(3352, -2)
3400.0

Rounding up always moves a number to the right on the number line, and rounding down always moves it to the left.

[Figure: number line illustrating rounding up and rounding down in Python]

Rounding Down

Similar to rounding up, there is another strategy called “rounding down”, where a number is rounded down to a specified number of digits. For example:

Value     Rounded Down To      Result
19.345    Tens place           10
19.345    Ones place           19
19.345    Tenths place         19.3
19.345    Hundredths place     19.34

In Python, rounding down can be implemented with an algorithm similar to the one used to truncate or round up. First shift the decimal point, then round to an integer, and finally shift the decimal point back.

In round_up(), math.ceil() takes the ceiling of the number once the decimal point is shifted. For “rounding down”, we instead take the floor of the shifted number with math.floor().

>>> math.floor(1.2)
1

>>> math.floor(-0.5)
-1

Here’s the definition of round_down():

def round_down(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier) / multiplier

This is quite similar to round_up() function. Here we are using math.floor() instead of math.ceil().

>>> round_down(1.5)
1.0

>>> round_down(1.48, 1)
1.4

>>> round_down(-0.5)
-1.0

Rounding every number up or down can have a significant effect on a large dataset: it discards a lot of precision and can noticeably alter the results of computations.
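The effect on a computation can be made concrete with a small experiment. The sketch below repeats the round_up() and round_down() definitions from this article so that it runs on its own, and applies them to a handful of readings that are made up for this illustration.

```python
import math
import statistics


def round_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.ceil(n * multiplier) / multiplier


def round_down(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier) / multiplier


# Illustrative readings, invented for this sketch.
readings = [1.23, 4.56, 7.89, 2.34, 5.67]

mean_exact = statistics.mean(readings)
mean_up = statistics.mean([round_up(x, 1) for x in readings])
mean_down = statistics.mean([round_down(x, 1) for x in readings])

print("exact:", mean_exact)
print("rounded up:", mean_up)
print("rounded down:", mean_down)
```

Since none of the readings already sits on a tenth, every value moves in the same direction, and the two rounded means end up roughly 0.1 apart on either side of the exact mean.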

Rounding Half Up

The “rounding half up” strategy rounds every number to the nearest number with the specified precision, and breaks ties by rounding up. Here are some examples:

Value     Round Half Up To     Result
19.825    Tens place           20
19.825    Ones place           20
19.825    Tenths place         19.8
19.825    Hundredths place     19.83

In Python, the rounding half up strategy can be implemented by shifting the decimal point to the right by the desired number of places. You then have to determine whether the digit after the shifted decimal point is less than 5, or greater than or equal to 5.

One way to do this is to add 0.5 to the shifted value and then round it down with the math.floor() function.

def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier + 0.5) / multiplier

If you look closely, round_half_up() is similar to round_down(). The only difference is that 0.5 is added after shifting the decimal point, so that the result of rounding down matches the expected value.

>>> round_half_up(19.23, 1)
19.2

>>> round_half_up(19.281, 1)
19.3

>>> round_half_up(19.251, 1)
19.3

Rounding Half Down

This method rounds to the nearest number, just like the “rounding half up” method. The difference is that it breaks ties by rounding to the lesser of the two numbers. Here are some examples:

Value     Round Half Down To     Result
16.825    Tens place             20
16.825    Ones place             17
16.825    Tenths place           16.8
16.825    Hundredths place       16.82

In Python, “rounding half down” strategy can be implemented by replacing math.floor() in the round_half_up() function with math.ceil() and then by subtracting 0.5 instead of adding:

def round_half_down(n, decimals=0):
    multiplier = 10 ** decimals
    return math.ceil(n * multiplier - 0.5) / multiplier

Let us look into some test cases.

>>> round_half_down(1.5)
1.0

>>> round_half_down(-1.5)
-2.0

>>> round_half_down(2.25, 1)
2.2

In general, round_half_up() and round_half_down() introduce no bias. However, rounding data that contains many ties does result in a bias. Consider the following example:

>>> data = [-2.15, 1.45, 4.35, -12.75]

Let us compute the mean of these numbers:

>>> import statistics
>>> statistics.mean(data)
-2.275

Now let us compute the mean on the data after rounding to one decimal place with round_half_up() and round_half_down():

>>> rhu_data = [round_half_up(n, 1) for n in data]
>>> statistics.mean(rhu_data)
-2.2249999999999996

>>> rhd_data = [round_half_down(n, 1) for n in data]
>>> statistics.mean(rhd_data)
-2.325

The round_half_up() function results in a round towards positive infinity bias, and round_half_down() results in a round towards negative infinity bias.

Rounding Half Away From Zero

If you look carefully at round_half_up() and round_half_down(), you will notice that neither of the two is symmetric around zero:

>>> round_half_up(1.5)
2.0

>>> round_half_up(-1.5)
-1.0

>>> round_half_down(1.5)
1.0

>>> round_half_down(-1.5)
-2.0

In order to introduce symmetry, you can always round a tie away from zero. The table mentioned below illustrates it clearly:

Value      Round Half Away From Zero To     Result
16.25      Tens place                       20
16.25      Ones place                       16
16.25      Tenths place                     16.3
-16.25     Tens place                       -20
-16.25     Ones place                       -16
-16.25     Tenths place                     -16.3

The implementation of the “rounding half away from zero” strategy on a number n is straightforward. Start as usual by shifting the decimal point to the right by the given number of places, then look at the digit d immediately to the right of the decimal point in this new number. There are four cases to consider:

  1. If n is positive and d >= 5, round up
  2. If n is positive and d < 5, round down
  3. If n is negative and d >= 5, round down
  4. If n is negative and d < 5, round up

After rounding as per the rules mentioned above, you can shift the decimal place back to the left.
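One way to sketch these four cases without writing each of them out is to round the absolute value half up and then reapply the original sign. The function name round_half_away_from_zero() is chosen here for illustration; it builds on round_half_up() as defined earlier.

```python
import math

def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier + 0.5) / multiplier

def round_half_away_from_zero(n, decimals=0):
    # Round the magnitude half up, then restore the sign of n
    rounded_abs = round_half_up(abs(n), decimals)
    return math.copysign(rounded_abs, n)

print(round_half_away_from_zero(1.5))
print(round_half_away_from_zero(-1.5))
print(round_half_away_from_zero(16.25, 1))
print(round_half_away_from_zero(-16.25, 1))
```

Positive and negative ties now move the same distance from zero, which gives the symmetry that round_half_up() and round_half_down() lack.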

A question might come to mind: how do you handle situations where the numbers of positive and negative ties are drastically different? The answer brings us full circle to the function that deceived us at the beginning of this article: Python’s built-in round() function.

Rounding Half To Even

There is a way to mitigate rounding bias while you are rounding values in a dataset. You can simply round ties to the nearest even number at the desired precision. Let us look at some examples:

Value     Round Half To Even To     Result
16.255    Tens place                20
16.255    Ones place                16
16.255    Tenths place              16.3
16.255    Hundredths place          16.26

To prove that round() really does round to even, let us try on a few different values:

>>> round(4.5)
4

>>> round(3.5)
4

>>> round(1.75, 1)
1.8

>>> round(1.65, 1)
1.6
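To see why rounding ties to even reduces bias, the following sketch compares the mean of a small list of exact ties after rounding with round_half_up() and with the built-in round(). The list is made up for illustration and uses values that floats can represent exactly.

```python
import math
import statistics

def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier + 0.5) / multiplier

# Four exact ties; each value is exactly representable as a float
ties = [0.5, 1.5, 2.5, 3.5]

true_mean = statistics.mean(ties)
half_up_mean = statistics.mean([round_half_up(n) for n in ties])
half_even_mean = statistics.mean([round(n) for n in ties])

# Half up shifts every tie upwards; half to even alternates direction
print(true_mean, half_up_mean, half_even_mean)
```

With round(), half of the ties go down and half go up, so the mean of the rounded values matches the mean of the original data in this example.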

The Decimal Class

The decimal module in Python is one of those features of the language which you might not be aware of if you have just started learning Python. Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.

Some of the benefits of the decimal module are mentioned below -

  • Exact decimal representation: 0.1 is actually 0.1, and 0.1 + 0.1 + 0.1 - 0.3 returns 0, as expected.

  • Preservation of significant digits: When you add 1.50 and 2.30, the result is 3.80 with the trailing zero maintained to indicate significance.

  • User-alterable precision: The default precision of the decimal module is twenty-eight digits, but this value can be altered by the user to match the problem at hand.
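The first two benefits can be checked with a short sketch that contrasts binary floats with Decimal values built from strings. The variable names are only illustrative.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so a small error remains
float_result = 0.1 + 0.1 + 0.1 - 0.3

# Decimal values created from strings are exact
decimal_result = Decimal("0.1") + Decimal("0.1") + Decimal("0.1") - Decimal("0.3")

# Trailing zeros are kept to indicate significance
total = Decimal("1.50") + Decimal("2.30")

print(float_result)
print(decimal_result)
print(total)
```

The float expression leaves a tiny non-zero remainder, while the Decimal expression evaluates to zero and the sum keeps its trailing zero.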

Let us see how rounding works in the decimal module.

>>> import decimal
>>> decimal.getcontext()
Context(
    prec=28,
    rounding=ROUND_HALF_EVEN,
    Emin=-999999,
    Emax=999999,
    capitals=1,
    clamp=0,
    flags=[],
    traps=[
        InvalidOperation,
        DivisionByZero,
        Overflow
    ]
)

The function decimal.getcontext() returns a context object which represents the default context of the decimal module. It also includes the default precision and the default rounding strategy.

In the above example, you will see that the default rounding strategy for the decimal module is ROUND_HALF_EVEN, which aligns it with the built-in round() function.

To declare a number with the decimal module’s Decimal class, create a new Decimal instance by passing a string containing the desired value.

>>> from decimal import Decimal
>>> Decimal("0.1")
Decimal('0.1')

You can also create a Decimal instance from a floating-point number, but doing so carries the floating-point representation error into the Decimal. For example, this is what happens when you create a Decimal instance from the floating-point number 0.1:

>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')

To maintain exact precision, create Decimal instances from strings containing the decimal numbers you need.

Rounding a Decimal using the .quantize() method:

>>> Decimal("1.85").quantize(Decimal("1.0"))
Decimal('1.8')

The Decimal("1.0") argument in .quantize() determines the number of decimal places to round to. As 1.0 has one decimal place, the number 1.85 is rounded to a single decimal place. Rounding half to even is the default strategy, hence the result is 1.8.

Because the Decimal class represents decimal values exactly, ties are handled as expected at other precisions too:

>>> Decimal("2.775").quantize(Decimal("1.00"))
Decimal('2.78')

The decimal module provides another benefit: after arithmetic is performed, rounding is taken care of automatically and significant digits are preserved.

>>> decimal.getcontext().prec = 2
>>> Decimal("2.23") + Decimal("1.12")
Decimal('3.4')

To change the default rounding strategy, you can set the decimal.getcontext().rounding property to any one of several flags. The following table summarizes these flags and which rounding strategy they implement:

Flag                        Rounding Strategy
decimal.ROUND_CEILING       Rounding up
decimal.ROUND_FLOOR         Rounding down
decimal.ROUND_DOWN          Truncation
decimal.ROUND_UP            Rounding away from zero
decimal.ROUND_HALF_UP       Rounding half away from zero
decimal.ROUND_HALF_DOWN     Rounding half towards zero
decimal.ROUND_HALF_EVEN     Rounding half to even
decimal.ROUND_05UP          Rounding away from zero if the last digit after rounding towards zero would be 0 or 5; otherwise rounding towards zero
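As a sketch of how a few of these flags behave, the example below rounds the same negative tie to one decimal place with different strategies. It passes the flag directly to .quantize() and, as an alternative to changing the global context, uses decimal.localcontext(); that choice is made here only to keep the example free of side effects.

```python
import decimal
from decimal import Decimal

value = Decimal("-1.25")
one_place = Decimal("0.1")

# Pass a flag directly to quantize() for a one-off rounding
ceiling = value.quantize(one_place, rounding=decimal.ROUND_CEILING)      # towards +infinity
floored = value.quantize(one_place, rounding=decimal.ROUND_FLOOR)        # towards -infinity
half_up = value.quantize(one_place, rounding=decimal.ROUND_HALF_UP)      # ties away from zero
half_even = value.quantize(one_place, rounding=decimal.ROUND_HALF_EVEN)  # ties to the even digit

# Or change the strategy for a block of code with a local context,
# which leaves the global default untouched
with decimal.localcontext() as ctx:
    ctx.rounding = decimal.ROUND_DOWN
    truncated = value.quantize(one_place)

print(ceiling, floored, half_up, half_even, truncated)
```

Note that ROUND_HALF_UP in the decimal module rounds ties away from zero, so it is not the same as the round_half_up() function defined earlier in this article.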

Rounding NumPy Arrays

In data science and scientific computation, data is most often stored as a NumPy array. One of the most powerful features of NumPy is the use of vectorization and broadcasting to apply operations to an entire array at once instead of one element at a time.

Let’s generate some data by creating a 3×4 NumPy array of pseudo-random numbers:

>>> import numpy as np
>>> np.random.seed(444)

>>> data = np.random.randn(3, 4)
>>> data
array([[ 0.35743992,  0.3775384 ,  1.38233789,  1.17554883],
       [-0.9392757 , -1.14315015, -0.54243951, -0.54870808],
       [ 0.20851975,  0.21268956,  1.26802054, -0.80730293]])

Here, first we seed the np.random module to reproduce the output easily. Then a 3×4 NumPy array of floating-point numbers is created with np.random.randn().

Make sure NumPy is installed (for example, with pip3 install numpy) before executing the code above. If you are using Anaconda, you are good to go.

To round all of the values in the data array, pass data as the argument to the np.around() function. The desired number of decimal places is set with the decimals keyword argument. np.around() uses the round half to even strategy, just like Python’s built-in round() function.
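A minimal sketch of np.around() in action is shown here. To stay independent of the random seed, it copies the first row of the data array printed above into a literal array.

```python
import numpy as np

# Values copied from the first row of the data array above
row = np.array([0.35743992, 0.3775384, 1.38233789, 1.17554883])

# Round every element to three decimal places
rounded = np.around(row, decimals=3)
print(rounded)

# Exact ties go to the nearest even value, as with round()
ties = np.around(np.array([0.5, 1.5, 2.5, 3.5]))
print(ties)
```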

To round the data in your array to integers, NumPy offers several options which are mentioned below:

The np.ceil() function rounds every value in the array to the nearest integer greater than or equal to the original value:

>>> np.ceil(data)
array([[ 1.,  1.,  2.,  2.],
       [-0., -1., -0., -0.],
       [ 1.,  1.,  2., -0.]])
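np.ceil() is only one of the options. Other NumPy functions that round to integers include np.floor(), np.trunc(), and np.rint(). The sketch below applies them to a few made-up values.

```python
import numpy as np

# Hypothetical values, chosen to show positive and negative cases
values = np.array([1.7, -1.7, 0.2, -0.2])

floored = np.floor(values)    # nearest integer less than or equal to each value
truncated = np.trunc(values)  # integer part only, i.e. rounding towards zero
nearest = np.rint(values)     # nearest integer, with ties going to even

print(floored)
print(truncated)
print(nearest)
```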

Look carefully at the np.ceil() output above: it contains a new number, negative zero! Let us now take a look at the Pandas library, which is widely used in data science with Python.

Rounding Pandas Series and DataFrame

Pandas has been a game-changer for data analytics and data science. The two main data structures in Pandas are DataFrame and Series. A DataFrame works like an Excel spreadsheet, whereas a Series can be thought of as a single column in a spreadsheet. Both can be rounded efficiently with the Series.round() and DataFrame.round() methods. Let us look at an example.

Make sure Pandas is installed (for example, with pip3 install pandas) before executing the code below. If you are using Anaconda, you are good to go.

>>> import pandas as pd

>>> # Re-seed np.random if you closed your REPL since the last example
>>> np.random.seed(444)

>>> series = pd.Series(np.random.randn(4))
>>> series
0    0.357440
1    0.377538
2    1.382338
3    1.175549
dtype: float64

>>> series.round(2)
0    0.36
1    0.38
2    1.38
3    1.18
dtype: float64

>>> df = pd.DataFrame(np.random.randn(3, 3), columns=["A", "B", "C"])
>>> df
          A         B         C
0 -0.939276 -1.143150 -0.542440
1 -0.548708  0.208520  0.212690
2  1.268021 -0.807303 -3.303072

>>> df.round(3)
       A      B      C
0 -0.939 -1.143 -0.542
1 -0.549  0.209  0.213
2  1.268 -0.807 -3.303

The DataFrame.round() method can also accept a dictionary or a Series, to specify a different precision for each column. For instance, the following examples show how to round the first column of df to one decimal place, the second to two, and the third to three decimal places:
>>> # Specify column-by-column precision with a dictionary
>>> df.round({"A": 1, "B": 2, "C": 3})
     A     B      C
0 -0.9 -1.14 -0.542
1 -0.5  0.21  0.213
2  1.3 -0.81 -3.303

>>> # Specify column-by-column precision with a Series
>>> decimals = pd.Series([1, 2, 3], index=["A", "B", "C"])
>>> df.round(decimals)
     A     B      C
0 -0.9 -1.14 -0.542
1 -0.5  0.21  0.213
2  1.3 -0.81 -3.303

If you need more rounding flexibility, you can apply NumPy's floor(), ceil(), and rint() functions to Pandas Series and DataFrame objects:
>>> np.floor(df)
     A    B    C
0 -1.0 -2.0 -1.0
1 -1.0  0.0  0.0
2  1.0 -1.0 -4.0

>>> np.ceil(df)
     A    B    C
0 -0.0 -1.0 -0.0
1 -0.0  1.0  1.0
2  2.0 -0.0 -3.0

>>> np.rint(df)
     A    B    C
0 -1.0 -1.0 -1.0
1 -1.0  0.0  0.0
2  1.0 -1.0 -3.0

The modified round_half_up() function from the previous section will also work here:
>>> round_half_up(df, decimals=2)
      A     B     C
0 -0.94 -1.14 -0.54
1 -0.55  0.21  0.21
2  1.27 -0.81 -3.30
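The call above only works if round_half_up() is built on a vectorized function, because math.floor() does not accept a DataFrame. A minimal sketch, assuming the modification consists of swapping math.floor() for np.floor(), follows. The small DataFrame reuses a few values from df for illustration.

```python
import numpy as np
import pandas as pd

def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    # np.floor() works element-wise on arrays, Series, and DataFrames
    return np.floor(n * multiplier + 0.5) / multiplier

# A few values taken from df above, for illustration
sample = pd.DataFrame({"A": [-0.939276, 1.268021], "B": [0.208520, -0.807303]})

rounded = round_half_up(sample, decimals=2)
print(rounded)
```

The same function still accepts plain numbers, in which case it returns a NumPy float rather than a built-in float.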

Best Practices and Applications

Now that you have come across most of the rounding techniques, let us learn some of the best practices to make sure we round numbers in the correct way.

Generate More Data and Round Later

When you are dealing with a large set of data, storage can become a problem. For example, suppose a temperature sensor in an industrial oven measures the temperature every ten seconds, accurate to eight decimal places. These readings help detect large fluctuations that may lead to the failure of a heating element or other components. A Python script can compare the readings and check for large fluctuations.

Since readings are recorded every day, their number grows quickly. You may consider keeping only three decimal places of precision, but removing too much precision may change the result of a calculation. If you have enough space, store the entire data at full precision. If storage is limited, keep at least the two or three decimal places of precision that your calculations require.

Finally, when you compute the daily average temperature, calculate it at the maximum precision available and round only the final result.
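The difference between rounding early and rounding late can be sketched with a handful of made-up readings. The values and variable names here are hypothetical.

```python
import statistics

# Hypothetical oven temperature readings
readings = [200.4, 200.4, 200.4, 201.4]

# Preferred: average at full precision, then round the final result
round_late = round(statistics.mean(readings))

# Rounding each reading first discards information before averaging
round_early = round(statistics.mean([round(r) for r in readings]))

print(round_late, round_early)
```

Both results are whole degrees, yet they differ by one degree because the early rounding threw away the fractional part of every reading.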

Currency Exchange and Regulations

Whenever we purchase an item, the tax paid on it depends largely on geographical factors. An item which costs you $2 in one state may cost you less (say $1.80) in another, due to regulations set forth by the local government.

In another case, when the minimum unit of currency at the accounting level in a country is smaller than the lowest unit of physical currency, Swedish rounding is applied. Lists of the rounding methods used by various countries are available online.

If you design software that calculates currency amounts, be sure to check the laws and regulations that apply where it will be used.
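As an illustration of the idea behind Swedish rounding, the sketch below rounds a cash total to the nearest multiple of a coin increment using Decimal. The function name, the 0.05 increment, and the half-up tie rule are all assumptions made for this example; the actual rules differ from country to country.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_increment(amount, increment=Decimal("0.05")):
    # Count how many increments fit into the amount, round that count
    # to a whole number, then scale back to a currency amount
    steps = (amount / increment).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return steps * increment

for total in ["1.02", "1.03", "1.07", "1.08"]:
    print(total, "->", round_to_increment(Decimal(total)))
```

Working with Decimal values created from strings avoids the representation errors that floats would introduce into monetary amounts.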

Reduce error

When you round numbers in large datasets that are used in complex computations, your primary concern should be to limit the growth of the error due to rounding.

Summary

In this article we have seen several methods to round numbers; of these, the “rounding half to even” strategy minimizes rounding bias the best. Fortunately, Python, NumPy, and Pandas already have built-in rounding functions that use this strategy. Here, we have learned that:

  • There are several rounding strategies, and each can be implemented in pure Python.
  • Every rounding strategy inherently introduces a rounding bias, and the “rounding half to even” strategy mitigates this bias well, most of the time.
  • You can round NumPy arrays and Pandas Series and DataFrame objects.

If you enjoyed reading this article and found it to be interesting, leave a comment. To learn more about rounding numbers and other features of Python, join our Python certification course.


Priyankur Sarkar

Data Science Enthusiast

Priyankur Sarkar loves to play with data and get insightful results out of it, then turn those data insights and results in business growth. He is an electronics engineer with a versatile experience as an individual contributor and leading teams, and has actively worked towards building Machine Learning capabilities for organizations.

Join the Discussion

Your email address will not be published. Required fields are marked *

Suggested Blogs

How to use sys.argv in Python?

The sys module is one of the common and frequently used modules in Python. In this article, we will walk you through how to use the sys module. We will learn about what argv[0] and sys.argv[1] are and how they work. We will then go into how to parse Command Line options and arguments, the various ways to use argv and how to pass command line arguments in Python 3.x In simple terms,Command Line arguments are a way of managing the script or program externally by providing the script name and the input parameters from command line options while executing the script. Command line arguments are not specific just to Python. These can be found in other programming languages like C, C# , C++, PHP, Java, Perl, Ruby and Shell scripting. Understanding sys.argv with examples  sys.argv is a list in Python that contains all the command-line arguments passed to the script. It is essential in Python while working with Command Line arguments. Let us take a closer look with a few examples. With the len(sys.argv) function, you can count the number of arguments. import sys print ("Number of arguments:", len(sys.argv), "arguments") print ("Argument List:", str(sys.argv)) $ python test.py arg1 arg2 arg3 Number of arguments: 4 arguments. Argument List: ['test.py', 'arg1', 'arg2', 'arg3']Module name to be used while using sys.argv To use sys.argv, you will first need to the sys module. What is argv[0]? Remember that sys.argv[0] is the name of the script. Here – Script name is sysargv.py import sys print ("This is the name of the script: ", sys.argv[0]) print ("Number of arguments: ", len(sys.argv)) print ("The arguments are: " , str(sys.argv))Output:This is the name of the script:  sysargv.py                                                                               Number of arguments:  1                                                                                                 The arguments are:  ['sysargv.py']What is "sys. argv [1]"? How does it work? 
When a python script is executed with arguments, it is captured by Python and stored in a list called sys.argv. So, if the below script is executed: python sample.py Hello Python Then inside sample.py, arguments are stored as: sys.argv[0] == ‘sample.py’ sys.argv[1] == ‘Hello’ sys.argv[2] == ‘Python’Here,sys.argv[0] is always the filename/script executed and sys.argv[1] is the first command line argument passed to the script . Parsing Command Line options and arguments  Python provides a module named as getopt which helps to parse command line options and arguments. Itprovides a function – getopt, whichis used for parsing the argument sequence:sys.argv. Below is the syntax: getopt.getopt(argv, shortopts, longopts=[]) argv: argument list to be passed.shortopts: String of short options as list . Options in the arguments should be followed by a colon (:).longopts: String of long options as list. Options in the arguments should be followed by an equal sign (=). import getopt import sys   first ="" last ="" argv = sys.argv[1:] try:     options, args = getopt.getopt(argv, "f:l:",                                ["first =",                                 "last ="]) except:     print("Error Message ")   for name, value in options:     if name in ['-f', '--first']:         first = value     elif name in ['-l', '--last']:         last = value   print(first + " " + last)Output:(venv) C:\Users\Nandank\PycharmProjects\DSA\venv>python getopt_ex.py -f Knowledge -l Hut Knowledge Hut (venv) C:\Users\Nandank\PycharmProjects\DSA\venv>python getopt_ex.py --first Knowledge –last Hut Knowledge HutWhat are command line arguments? Why do we use them? Command line arguments are parameters passed to a program/script at runtime. They provide additional information to the program so that it can execute. It allows us to provide different inputs at the runtime without changing the code. 
Here is a script named as argparse_ex.py: import argparse parser = argparse.ArgumentParser() parser.add_argument("-n", "--name", required=True) args = parser.parse_args() print(f'Hi {args.name} , Welcome ')Here we need to import argparse package Then we need to instantiate the ArgumentParser object as parser. Then in the next line , we add the only argument, --name . We must specify either shorthand (-n) or longhand versions (--name)  where either flag could be used in the command line as shown above . This is a required argument as mentioned by required=True Output:  (venv) C:\Users\Nandank\PycharmProjects\DSA\venv>python argparse_ex.py --name Nandan  Hi Nandan , Welcome  (venv) C:\Users\Nandank\PycharmProjects\DSA\venv>python argparse_ex.py -n Nandan  Hi Nandan , Welcome The example above must have the --name or –n option, or else it will fail.(venv) C:\Users\Nandank\PycharmProjects\DSA\venv>python argparse_ex.py --name   usage: argparse_ex.py [-h] --name NAME argparse_ex.py: error: the following arguments are required: --namePassing command line arguments in Python 3.x argv represents an array having the command line arguments of thescript . Remember that here, counting starts fromzero [0], not one (1). To use it, we first need to import sys module (import sys). The first argument, sys.argv[0], is always the name of the script and sys.argv[1] is the first argument passed to the script. Here, we need to slice the list to access all the actual command line arguments. import sys if __name__ == '__main__':     for idx, arg in enumerate(sys.argv):        print("Argument #{} is {}".format(idx, arg))     print ("No. of arguments passed is ", len(sys.argv))Output:(venv) C:\Users\Nandank\PycharmProjects\DSA\venv\Scripts>python argv_count.py Knowledge Hut 21 Argument #0 is argv_count.py Argument #1 is Knowledge Argument #2 is Hut Argument #3 is 21 No. 
of arguments passed is  4Below script - password_gen.py is used to generate a secret password by taking password length as command line argument.import secrets , sys, os , string ''' This script generates a secret password using possible key combinations''' ''' Length of the password is passed as Command line argument as sys.argv[1]''' char = string.ascii_letters+string.punctuation+string.digits length_pwd = int(sys.argv[1])   result = "" for i in range(length_pwd):     next= secrets.SystemRandom().randrange(len(char))     result = result + char[next] print("Secret Password ==" ,result,"\n")Output:(venv) C:\Users\Nandank\PycharmProjects\DSA\venv\Scripts>python password_gen.py 12 Secret Password == E!MV|,M][i*[Key takeaways Let us summarise what we have learned in this article so far –  The use of the sys module in Python What areargv[0] and sys.argv[1] and how they work What are Command Line arguments and why we use them How to parse Command Line options and arguments Multiple ways to use argv How to pass command line arguments in Python 3.x Hope this mini tutorial has been helpful in explaining the usage of sys.argv and how it works in Python. Be sure to check out the rest of the tutorials on KnowledgeHut’s website! 
Rated 4.0/5 based on 14 customer reviews
5898
How to use sys.argv in Python?

The sys module is one of the common and frequently... Read More

The self variable in Python explained with Python tips

If you have been working on Python, you might have across the self variable. You can find it in method definitions and in initializing variables. However, before coming to the self variable, let us have an idea about classes and instances in Python.  What are Instance methods and Class methods in Python? You might have heard of instances and classes while working on Python. Class variables are defined within a class and they are shared with all the instances (objects) of the class whereas instance variables are owned by the instances of a class. For different instances, the instance variables are different. Likewise, Python also contains class methods and instance methods. The class methods inform about the status of the class. On the other hand, instance methods are to set or get details about instances or objects.   If you want to define an instance method, the foremost parameter of the method should always have to be self. Let us understand this with the help of example code – class myClass:      def instance_method(self):          return “Instance method is called”, self The method “instance_method” is a regular instance method. The method accepts one single parameter – self. The self variable points to an instance of the class myClass when the method is revoked. Though the method takes only one parameter here, it can also accept more than one parameter. The instance methods can easily access different attributes and other methods on the same object with the help of the self variable. The self variable also has the power to modify the state of an object and using the self.__class__  attribute, instance methods can also access the class. Thus, the instance methods are also able to modify the class state. Now, let us see what happens when we call “instance_method”: >>>obj = myClass()  >>>obj.instance_method()  ('Instance method is called', ) This shows that “instance_method” can access the object instance (printed as ) through the self parameter. 
What happens is that the self parameter is replaced with the instance object obj when the method is called. However, if you pass the instance object manually, you will get the same result as before: >>>myClass.instance_method(obj)  ('Instance method is called', ) Note that self is actually not a defined keyword in Python but a convention.What is Self in Python? Unlike this variable in C++, self is not a keyword rather it is more of a coding convention. It represents the instance or objects of a class and binds the attributes of a class with specific arguments. The use of self variable in Python helps to differentiate between the instance attributes (and methods) and local variables. If you do not want the variables of your class to be shared by all instances of the class, you can declare variables within your class without self. Let us understand this with an example: class Car:      def __init__(self, model):  self.model = model      def Car_info(self):  print("Model : ", self.model) Here, we have declared a class Car with one instance variable self.model = model. The value of the instance variable will be unique to the instance objects of the class that might be declared. However, if you want the variables to be shared by all instances of the class, you need to declare the instance variables without self. Otherwise, it would be ambiguous since all cars will have the same model. Need for Self in Python The self variable is used to represent the instance of the class which is often used in object-oriented programming. It works as a reference to the object. Python uses the self parameter to refer to instance attributes and methods of the class.  Unlike other programming languages, Python does not use the “@” syntax to access the instance attributes. This is the sole reason why you need to use the self variable in Python. The language contains methods that allow the instance to be passed automatically but not received automatically. 
Explicit definition of self The Zen of Python says “Explicit is better than Implicit”. Programmers of other languages often ask why self is passed as an explicit parameter every time to define a method. There are a few reasons for this. Firstly, since Python uses a method or instance attribute instead of a local variable when we read self.name or self.age, it makes it absolutely clear you are using an instance variable or method even if you have no knowledge about the class definition. Secondly, if you explicitly refer or call a method from a particular class in Python, you do not need to use any special syntax for that.  Finally, the third reason is that the explicit definition of self helps Python understand whether to assign to an instance variable or to a local variable. In simpler terms, local variables and instances exist in two separate namespaces and we need to inform Python which namespace should be used. What is a Python class self constructor? The self variable in Python can also be used to access a variable field within the class definition. Let us understand this with the help of example code: class Student:     def __init__(self, Alex):          self.name = Alex    #name created in constructor      def get_student_name(self):          return self.name In the example above, self refers to the variable age of the class Student. The variable age is local to the method. While the method is running, the variable age exists within the class. If there is a variable within a method, the self variable will not work. If you wish to define global fields, you need to define variables outside the class methods.   Is self a keyword in Python? There is a question that always hovers among Python programmers. Is self actually a keyword in Python? Unlike other programming languages like C++, where self is considered to be a keyword, in Python it is a convention that programmers tend to follow. It is basically a parameter in a method definition. 
However, you can use any other name, such as another or me, for the first parameter of a method. Another reason most people recommend self is that it makes your code more readable. Let us see an example to understand it:

    class myClass:
        def show(another):
            print("another is used in place of self")

If you compare this code with the code for the Python class self constructor, you will notice that here we have used the name another in place of self. Now let us create an object of this class and see the output:

    obj = myClass()
    obj.show()

Output:

    another is used in place of self

You can see that the program works even if we use some other name in place of self. It behaves exactly the same way as self does.

Why "self" should be used as the first parameter of instance methods in Python

This can be understood from the example below. We have a Rectangle class which defines a method area to calculate the area:

    class Rectangle:
        def __init__(self, x=0, y=0):
            self.x = x
            self.y = y

        def area(self):
            """Find area of rectangle"""
            return self.x * self.y

    rec1 = Rectangle(6, 10)
    print("Area is:", rec1.area())

Output:

    Area is: 60

In the above example, __init__() defines three parameters but only 2 arguments are passed (6 and 10). Similarly, area() requires one parameter but no arguments are passed.

Rectangle.area and rec1.area in the above example are not the same thing:

    >>> type(Rectangle.area)
    <class 'function'>
    >>> type(rec1.area)
    <class 'method'>

Here, the first one is a function and the second one is a method. In Python, the object itself is passed as the first argument to the corresponding function. In the above example, the method call rec1.area() is equivalent to Rectangle.area(rec1).

Generally, when a method is called with some arguments, the corresponding class function is called by placing the method's object before the first argument. Therefore, obj.method(args) becomes Class.method(obj, args). This is the reason the first parameter of a function in a class must be the object itself. Writing this parameter as self is merely a convention. It is not a keyword, so it has no special meaning in Python. We could use other names (like this or that), but it is not preferred as it degrades code readability.

Should we pass self to a method?

Since we can use any other name instead of self, what happens when we declare that parameter in a method definition and then call the method? Let us consider the class myClass we have used earlier. A method named something is defined within the class with the parameter another followed by two more parameters:

    class myClass:
        def something(another, argument1, argument2):
            pass

Now, let us declare an instance obj of myClass and call the method something with the help of the instance:

    obj = myClass()
    obj.something(argument1, argument2)

Python internally converts the method call into something like this:

    myClass.something(obj, argument1, argument2)

This shows that the parameter another (used in place of self) refers to the instance of the class. Note that the pass keyword used in the method something does nothing. It is used as a placeholder where the syntax requires a statement but you do not want any operation to be performed.

How can we skip self in Python?

Consider a situation where the instance method does not need access to the instance variables. In such cases, we can consider skipping the self parameter when defining the method. Let us understand this with example code:

    class Vehicle:
        def Car():
            print("Rolls Royce 1948")

    obj = Vehicle()
    print("Complete")

If you run the above code, the output will be as follows:

    Complete

We have not declared the self parameter here, but there is still no error in the program and the output comes out fine. However, what happens if we call the Car() method?

    obj = Vehicle()
    obj.Car()

When we run the code after adding the call to Car(), it shows an error like this:

    Traceback (most recent call last):
      File "", line 11, in
    TypeError: Car() takes 0 positional arguments but 1 was given

The error occurs because the method Car() takes 0 positional arguments but 1 positional argument was given to it. When Car() is called on the instance obj, the instance is automatically passed as the first argument, even though we have not declared the self parameter. However, if you call Car() through the class itself, there will be no error and the program will work fine:

    class Vehicle:
        def Car():
            print("Rolls Royce 1948")

    obj = Vehicle()
    Vehicle.Car()

Output:

    Rolls Royce 1948

Difference between self and __init__

self: self represents the instance of the class. Through self, all the attributes and methods of the instance can be accessed. It is a naming convention, not a keyword.

__init__: __init__ is a special method in Python classes. It is known as a constructor in object-oriented terms. This method is called when an object is created from the class, and it allows the class to initialize the attributes of the new object.
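The equivalence between obj.method(args) and Class.method(obj, args) described earlier in this section can be checked directly. The following is only a minimal sketch: the Counter class and its add method are hypothetical names chosen for illustration and do not come from the examples above.

```python
class Counter:
    """Hypothetical class used only to illustrate how self is bound."""

    def __init__(self, start=0):
        self.total = start

    def add(self, amount):
        # self is whichever instance the method was called on
        self.total += amount
        return self.total


c = Counter(10)

# Calling through the instance: Python passes c as self automatically
via_instance = c.add(5)

# Calling through the class: the instance must be passed explicitly
via_class = Counter.add(c, 5)

print(via_instance, via_class)

# A bound method keeps a reference to its instance in __self__
print(c.add.__self__ is c)
```

Both calls run the same function on the same object; the only difference is who supplies the first argument.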
Usage of "self" in class to access the methods and attributes:

    class Rectangle:
        def __init__(self, length, breadth, cost_per_unit=0):
            self.length = length
            self.breadth = breadth
            self.cost_per_unit = cost_per_unit

        def perimeter(self):
            return 2 * (self.length + self.breadth)

        def area(self):
            return self.length * self.breadth

        def calculate_cost(self):
            area = self.area()
            return area * self.cost_per_unit

    # length = 40 cm, breadth = 30 cm and 1 cm^2 = Rs 100
    r = Rectangle(40, 30, 100)
    print("Area of Rectangle:", r.area())
    print("Cost of rectangular field is : Rs ", r.calculate_cost())

Output:

    Area of Rectangle: 1200
    Cost of rectangular field is : Rs  120000

We have created an object of the Rectangle class. While creating the Rectangle object, we passed 3 arguments (40, 30, 100); all these arguments are passed to the __init__ method to initialize the object.

Here, the name self represents the instance of the class. It binds the attributes to the given arguments. Inside the method area, self.length is used to get the value of the attribute length, which was bound to the object (the instance of the class) at the time of object creation. self represents the object inside the class, just as r represents it outside the class in the statement r = Rectangle(40, 30, 100).

In the method definition def area(self):, self is used as a parameter because whenever we call the method, the object (instance of the class) is automatically passed as the first argument, along with any other arguments of the method. If no other arguments are provided, only self is passed to the method. That is the reason self is used to call a method from inside the class (self.area()), while the object is used to call the method from outside the class definition (r.area()). When r.area() is called, the instance r is passed as the first argument in the place of self.

Miscellaneous Implementations of self

Let us now discuss some of the miscellaneous implementations of the self variable.

Similar variables for Class Method and Static Method

A class method is a method that is bound to the class. Let us understand a class method with an example:

    class myClass:
        @classmethod
        def class_method(cls):
            return "Class Method is called"

    obj = myClass()
    print(obj.class_method())

Class methods behave much like instance methods do with self; the only difference is that the convention is to use cls as the parameter name instead of self. When the method is called, cls points to the class rather than to an instance. A class method cannot modify the state of a particular object, but it can modify class state that is shared by all instances of the class.

On the other hand, static methods are self-sufficient functions; this type of method takes neither a self nor a cls parameter. Let us see an example of a static method:

    class myClass:
        @staticmethod
        def static_method():
            return "Static Method is called"

    obj = myClass()
    print(obj.static_method())

Since a static method receives neither self nor cls, it cannot modify object state or class state. Static methods are primarily used to group related functions in the namespace of a class, and they are restricted in the data they can access.

Note that the methods here are marked with the @classmethod and @staticmethod decorators to flag them as a class method and a static method respectively.

The self variable is bound to the current instance

The self variable allows us to access the properties of the current instance. Let us understand this with an example:

    class Person:
        def __init__(self, n):
            self.name = n

        def walk(self):
            print(f"{self.name} is walking")

    obj1 = Person("Alex")
    obj2 = Person("Charles")
    obj1.walk()
    obj2.walk()

Output:

    Alex is walking
    Charles is walking

Here, we have a class Person with two methods, __init__ and walk, declared with the self parameter. We have created two different instances of the class, obj1 and obj2. When walk() is invoked on the first instance, "Alex is walking" is printed, whereas when it is invoked on the second instance, "Charles is walking" is printed, because self refers to a different instance in each call.

Tips about the Python self variable

Since we have now reached the end of the article, let me give you some tips about when to use and when not to use the self variable in Python.

Use self:
• When you are defining an instance method, since the instance is passed automatically as the first parameter when the method is called.
• While referencing a class or an instance attribute from inside an instance method.
• When you want to refer to instance variables and methods from other instance methods.

Don't use self:
• When you call an instance method normally on an object; Python supplies the instance for you.
• While referencing a class attribute inside the class definition but outside an instance method.
• When you are inside a static method.

Conclusion

Let us go through the points we have covered in this article:
• Instances and classes in Python.
• The self variable and its importance.
• The explicitness of the self variable.
• The Python class self constructor.
• Passing self to a method.
• Skipping self in Python.
• Variables used for class methods and static methods.
• Binding of self to the current instance.
• When to use and when not to use self in Python.

You have now gathered a working knowledge of the self variable and how Python handles it internally. If you wish to know more about Python self, you can always refer to the official Python documentation.
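As a closing illustration of the self, cls and static-method conventions covered in this article, here is a minimal side-by-side sketch. The Robot class and its members (count, greet, created, is_valid_name) are hypothetical names introduced only for this example.

```python
class Robot:
    """Hypothetical class contrasting instance, class and static methods."""

    count = 0  # class state shared by every instance

    def __init__(self, name):
        self.name = name   # instance state, reached through self
        Robot.count += 1

    def greet(self):
        # Instance method: receives the instance as self
        return f"{self.name} says hello"

    @classmethod
    def created(cls):
        # Class method: receives the class as cls, not an instance
        return cls.count

    @staticmethod
    def is_valid_name(name):
        # Static method: receives neither self nor cls
        return name.isalpha()


r1 = Robot("Alex")
r2 = Robot("Charles")

print(r1.greet())
print(Robot.created())
print(r2.created())   # a class method can also be called through an instance
print(Robot.is_valid_name("R2D2"))
```

The instance method sees one object's data, the class method sees data shared by all objects, and the static method sees only the arguments it is given.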

How to Work with Excel Spreadsheets using Python

Excel, developed by Microsoft, is one of the most popular and widely used spreadsheet applications. You can organize, analyze and store your data in tabular sheets with the help of Excel. From analysts and sales managers to CEOs, professionals from every field use Excel for creating quick statistics and for data crunching.

Spreadsheets are commonly used because of their intuitive nature and their ability to handle large datasets. Most importantly, they can be used without any prior technical background.

Finding ways to work with Excel using code is essential, since working with data in Python has some serious advantages over Excel's UI. Python developers have implemented ways to read, write and manipulate Excel documents.

You can check the quality of your spreadsheet by going over the checklist below:
• Is the spreadsheet able to represent static data?
• Is the spreadsheet able to mix data, calculations, and reports?
• Is the data in your spreadsheet complete and consistent?
• Does the spreadsheet have an organized worksheet structure?

This checklist will help you verify the quality of the spreadsheet you are going to work on.

Practical Applications

In this article, we will be using openpyxl to work on data. With the help of this module, you can extract data from a database into an Excel spreadsheet, or you can convert an Excel spreadsheet into a programmatic format. There are many situations where you might need a package like openpyxl. Let us discuss a few of them.

Importing New Products Into a Database

Suppose you work at an online store. When the company wants to add new products to the store, someone prepares an Excel spreadsheet with a few hundred rows containing the name of each product, its description, price and some other basic information, and gives it to you. To import this data, you need to iterate over each row of the spreadsheet and add each product to the database of the online store.

Exporting Database Data Into a Spreadsheet

Suppose you have a database table in which you have collected information about all your users, including their name, contact number, email address, and so forth. The marketing team wants to contact all the users and promote a new product of the company. However, they neither have access to the database nor know how to use SQL to extract the information. This is where openpyxl comes into play. You can use it to iterate over each user record and write the required information to an Excel spreadsheet.

Appending Information to an Existing Spreadsheet

Consider the same online store example. You have an Excel spreadsheet with a list of users, and your job is to append to each row the total amount that user has spent in your store. To do this, you have to read the spreadsheet first, then iterate through each row and fetch the total amount spent from the database. Finally, you need to write it back to the spreadsheet.

Starting openpyxl

You can install the openpyxl package using pip. Open your terminal and run the following command:

    $ pip install openpyxl

After you have installed the package, you can create your own simple spreadsheet:

    from openpyxl import Workbook

    workbook = Workbook()
    spreadsheet = workbook.active

    spreadsheet["A1"] = "Hello"
    spreadsheet["B1"] = "World!"

    workbook.save(filename="HelloWorld.xlsx")

How to Read Excel Spreadsheets with openpyxl

Let us start with the most important thing that you can do with a spreadsheet, i.e. read it. We will be using a Watch Sample Dataset which contains a list of 100 watches with information like product name, product ID, review and so forth.
A Simple Way to Read an Excel Spreadsheet

Let us start with opening our sample spreadsheet:

    >>> from openpyxl import load_workbook
    >>> workbook = load_workbook(filename="sample.xlsx")
    >>> workbook.sheetnames
    ['Sheet 1']
    >>> spreadsheet = workbook.active
    >>> spreadsheet
    <Worksheet "Sheet 1">
    >>> spreadsheet.title
    'Sheet 1'

In the example code above, we open the spreadsheet using load_workbook and then check all the sheets that are available to work with using workbook.sheetnames. Then Sheet 1 is automatically selected using workbook.active since it is the first sheet available. This is the most common way of opening a spreadsheet.

Now, let us see the code to retrieve data from the spreadsheet:

    >>> spreadsheet["A1"]
    <Cell 'Sheet 1'.A1>
    >>> spreadsheet["A1"].value
    'marketplace'
    >>> spreadsheet["F10"].value
    "G-Shock Men's Grey Sport Watch"

You can retrieve both the cell object and the value it holds. To get the value, use .value; to get a cell by its row and column numbers, you can use .cell():

    >>> spreadsheet.cell(row=10, column=6)
    <Cell 'Sheet 1'.F10>
    >>> spreadsheet.cell(row=10, column=6).value
    "G-Shock Men's Grey Sport Watch"

Importing Data from a Spreadsheet

In this section, we will discuss how to iterate through the data and how to convert it into a more useful format using Python.

Let us first start with iterating through the data. There are a number of ways to iterate, and which one you use depends on your needs.

You can slice the data with a combination of rows and columns:

    >>> spreadsheet["A1:C2"]
    ((<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>),
     (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>))

You can also select whole rows or columns, or ranges of them:

    >>> # Get all cells from column A
    >>> spreadsheet["A"]
    (<Cell 'Sheet 1'.A1>,
     <Cell 'Sheet 1'.A2>,
     ...
     <Cell 'Sheet 1'.A99>,
     <Cell 'Sheet 1'.A100>)

    >>> # Get all cells for a range of columns
    >>> spreadsheet["A:B"]
    ((<Cell 'Sheet 1'.A1>,
      <Cell 'Sheet 1'.A2>,
      ...
      <Cell 'Sheet 1'.A99>,
      <Cell 'Sheet 1'.A100>),
     (<Cell 'Sheet 1'.B1>,
      <Cell 'Sheet 1'.B2>,
      ...
      <Cell 'Sheet 1'.B99>,
      <Cell 'Sheet 1'.B100>))

    >>> # Get all cells from row 5
    >>> spreadsheet[5]
    (<Cell 'Sheet 1'.A5>,
     <Cell 'Sheet 1'.B5>,
     ...
     <Cell 'Sheet 1'.N5>,
     <Cell 'Sheet 1'.O5>)

    >>> # Get all cells for a range of rows
    >>> spreadsheet[5:6]
    ((<Cell 'Sheet 1'.A5>,
      <Cell 'Sheet 1'.B5>,
      ...
      <Cell 'Sheet 1'.N5>,
      <Cell 'Sheet 1'.O5>),
     (<Cell 'Sheet 1'.A6>,
      <Cell 'Sheet 1'.B6>,
      ...
      <Cell 'Sheet 1'.N6>,
      <Cell 'Sheet 1'.O6>))

openpyxl also offers the generator methods .iter_rows() and .iter_cols(), whose arguments let you set limits on the iteration:

    >>> for row in spreadsheet.iter_rows(min_row=1,
    ...                                  max_row=2,
    ...                                  min_col=1,
    ...                                  max_col=3):
    ...     print(row)
    (<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>)
    (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>)

    >>> for column in spreadsheet.iter_cols(min_row=1,
    ...                                     max_row=2,
    ...                                     min_col=1,
    ...                                     max_col=3):
    ...     print(column)
    (<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.A2>)
    (<Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.B2>)
    (<Cell 'Sheet 1'.C1>, <Cell 'Sheet 1'.C2>)

You can also pass the Boolean argument values_only and set it to True to get the values of the cells instead of the cell objects:

    >>> for value in spreadsheet.iter_rows(min_row=1,
    ...                                    max_row=2,
    ...                                    min_col=1,
    ...                                    max_col=3,
    ...                                    values_only=True):
    ...     print(value)
    ('marketplace', 'customer_id', 'review_id')
    ('US', 3653882, 'R3O9SGZBVQBV76')

Now that we are done with iterating over the data, let us manipulate it using Python's primitive data structures. Consider a situation where you want to extract the information about a product from the sample spreadsheet and store it in a dictionary. The key of the dictionary would be the product ID.

Convert Data into Python classes

To convert data into Python data classes, let us first decide what we want to store and how to store it. The two essential elements that can be extracted from the data are as follows:

    1. Products        2. Review
    • ID               • ID
    • Title            • Customers ID
    • Parent           • Headline
    • Category         • Body
                       • Date

Let us implement the two elements:

    import datetime
    from dataclasses import dataclass

    @dataclass
    class Product:
        id: str
        parent: str
        title: str
        category: str

    @dataclass
    class Review:
        id: str
        customer_id: str
        stars: int
        headline: str
        body: str
        date: datetime.datetime

The next step is to create a mapping between columns and the required fields. To do that, first look at the header row:

    >>> for value in spreadsheet.iter_rows(min_row=1,
    ...                                    max_row=1,
    ...                                    values_only=True):
    ...     print(value)
    ('marketplace', 'customer_id', 'review_id', 'product_id', ...)

    >>> # Or an alternative
    >>> for cell in spreadsheet[1]:
    ...     print(cell.value)
    marketplace
    customer_id
    review_id
    product_id
    product_parent
    ...

Finally, let us convert the data into the new structures by parsing the spreadsheet into a list of Product and Review objects:

    from datetime import datetime
    from openpyxl import load_workbook

    # Assumes the dataclasses above are saved in classes.py and the
    # column-index constants in mapping.py
    from classes import Product, Review
    from mapping import PRODUCT_ID, PRODUCT_PARENT, PRODUCT_TITLE, \
        PRODUCT_CATEGORY, REVIEW_DATE, REVIEW_ID, REVIEW_CUSTOMER, \
        REVIEW_STARS, REVIEW_HEADLINE, REVIEW_BODY

    # Using read_only mode since you are not going to edit the spreadsheet
    workbook = load_workbook(filename="watch_sample.xlsx", read_only=True)
    spreadsheet = workbook.active

    products = []
    reviews = []

    # Using values_only because you just want the cell values
    for row in spreadsheet.iter_rows(min_row=2, values_only=True):
        product = Product(id=row[PRODUCT_ID],
                          parent=row[PRODUCT_PARENT],
                          title=row[PRODUCT_TITLE],
                          category=row[PRODUCT_CATEGORY])
        products.append(product)

        # You need to parse the date from the spreadsheet into a datetime
        spread_date = row[REVIEW_DATE]
        parsed_date = datetime.strptime(spread_date, "%Y-%m-%d")

        review = Review(id=row[REVIEW_ID],
                        customer_id=row[REVIEW_CUSTOMER],
                        stars=row[REVIEW_STARS],
                        headline=row[REVIEW_HEADLINE],
                        body=row[REVIEW_BODY],
                        date=parsed_date)
        reviews.append(review)

    print(products[0])
    print(reviews[0])

After you execute the code, you will get an output that looks like this:

    Product(id='A90FALZ1ZC',parent=937111370,...)
    Review(id='D3O9OGZVVQBV76',customer_id=3903882,...)

Appending Data

To understand how to append data, let us go back to the first sample spreadsheet.
We will open the document and append some data to it:

    from openpyxl import load_workbook

    # Start by opening the spreadsheet and selecting the main sheet
    workbook = load_workbook(filename="HelloWorld.xlsx")
    spreadsheet = workbook.active

    # Write what you want into a specific cell
    spreadsheet["C1"] = "Manipulating_Data ;)"

    # Save the spreadsheet
    workbook.save(filename="hello_world_append.xlsx")

If you open the new Excel file, you will notice the additional text "Manipulating_Data ;)" in cell C1, next to the existing cells.

Writing Excel Spreadsheets With openpyxl

A spreadsheet is a file that stores data in rows and columns. We can store numerical data in it and also perform computations using formulas. So, let us begin with a simple spreadsheet and understand what each line means.

Creating our first simple Spreadsheet

     1 from openpyxl import Workbook
     2
     3 filename = "first_program.xlsx"
     4
     5 workbook = Workbook()
     6 spreadsheet = workbook.active
     7
     8 spreadsheet["A1"] = "first"
     9 spreadsheet["B1"] = "program!"
    10
    11 workbook.save(filename=filename)

Line 5: In order to make a spreadsheet, we first have to create an empty workbook on which to perform further operations.

Lines 8 and 9: We can add data to a specific cell as required. In this example, the two values "first" and "program!" have been added to cells A1 and B1 of the sheet.

Line 11: This line shows how to save the data after all the operations we have done.

Basic Spreadsheet Operations

Before going on to the more difficult coding part, we first have to lay the building blocks: how to add and update values, how to manage rows and columns, and how to add filters, styles or formulas to a spreadsheet.

We have already seen the following code, which adds a value to a spreadsheet:

    >>> spreadsheet["A1"] = "the_value_we_want_to_add"

There is another way to add values to a spreadsheet:

    >>> cell = spreadsheet["A1"]
    >>> cell
    <Cell 'Sheet'.A1>
    >>> cell.value
    'first'
    >>> cell.value = "second"
    >>> cell.value
    'second'

In the first statement, we fetch the cell object for A1. Reading cell.value then prints "first", because in the first program we already assigned "first" to A1. Next, we update the value of the cell to "second" by simply assigning it to cell.value, and the last statement prints the updated value of the cell.

Finally, remember that the operations you have performed are only written to the file once you call workbook.save().

If the cell does not exist when you add a value, openpyxl creates it. In the sessions below, print_rows() is assumed to be a small helper that prints the values of every row of the sheet:

    >>> # Before, our spreadsheet has only 1 row
    >>> print_rows()
    ('first', 'program!')

    >>> # Try adding a value to row 10
    >>> spreadsheet["B10"] = "test"
    >>> print_rows()
    ('first', 'program!')
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, 'test')

Managing Rows and Columns in Spreadsheet

Inserting or deleting rows and columns is one of the most basic operations on a spreadsheet. In openpyxl, we can perform these operations by simply calling the following methods and passing their arguments:
• .insert_rows()
• .delete_rows()
• .insert_cols()
• .delete_cols()

We can pass 2 arguments to these methods:
• idx
• amount

idx stands for the index position at which rows or columns are inserted or deleted, and amount is the number of rows or columns to insert or delete (one if it is omitted).
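To make the two arguments more concrete, here is a plain-Python model of inserting and deleting columns in a single row of values. It is only a sketch that treats idx as a 1-based position and amount as the number of columns affected; model_insert_cols and model_delete_cols are hypothetical helpers written for this illustration and are not part of openpyxl.

```python
def model_insert_cols(row, idx, amount=1):
    """Return a copy of row with `amount` empty cells inserted before 1-based column idx."""
    position = idx - 1  # spreadsheet columns are 1-based, Python lists are 0-based
    return row[:position] + [None] * amount + row[position:]


def model_delete_cols(row, idx, amount=1):
    """Return a copy of row with `amount` cells removed, starting at 1-based column idx."""
    position = idx - 1
    return row[:position] + row[position + amount:]


row = ["first", "program!"]

row = model_insert_cols(row, idx=1)            # one empty column before column A
print(row)

row = model_insert_cols(row, idx=3, amount=5)  # five empty columns before column C
print(row)

row = model_delete_cols(row, idx=3, amount=5)  # remove those five columns again
row = model_delete_cols(row, idx=1)            # remove the leading empty column
print(row)
```

The same reading of idx and amount applies to rows: the model would operate on a list of rows instead of a list of cell values.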
Building on the first simple program, let us see how we can use these methods:

    >>> print_rows()
    ('first', 'program!')

    >>> # Insert a column at the first position, before column 1 ("A")
    >>> spreadsheet.insert_cols(idx=1)
    >>> print_rows()
    (None, 'first', 'program!')

    >>> # Insert 5 columns between column 2 ("B") and column 3 ("C")
    >>> spreadsheet.insert_cols(idx=3, amount=5)
    >>> print_rows()
    (None, 'first', None, None, None, None, None, 'program!')

    >>> # Delete the created columns
    >>> spreadsheet.delete_cols(idx=3, amount=5)
    >>> spreadsheet.delete_cols(idx=1)
    >>> print_rows()
    ('first', 'program!')

    >>> # Insert a new row at the beginning
    >>> spreadsheet.insert_rows(idx=1)
    >>> print_rows()
    (None, None)
    ('first', 'program!')

    >>> # Insert 3 new rows at the beginning
    >>> spreadsheet.insert_rows(idx=1, amount=3)
    >>> print_rows()
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    ('first', 'program!')

    >>> # Delete the first 4 rows
    >>> spreadsheet.delete_rows(idx=1, amount=4)
    >>> print_rows()
    ('first', 'program!')

Managing Sheets

We have seen the following recurring piece of code in our previous examples. This is one of the ways of selecting the default sheet of a workbook:

    spreadsheet = workbook.active

However, if you open a workbook with multiple sheets, you can select each sheet by its title:

    >>> # Let's say you have two sheets: "Products" and "Company Sales"
    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> # You can select a sheet using its title
    >>> Products_Sheet = workbook["Products"]
    >>> Sales_sheet = workbook["Company Sales"]

If we want to change the title of a sheet, we can execute the following code:

    >>> workbook.sheetnames
    ['Products', 'Company Sales']
    >>> Products_Sheet = workbook["Products"]
    >>> Products_Sheet.title = "New Products"
    >>> workbook.sheetnames
    ['New Products', 'Company Sales']

We can also create and delete sheets with the help of two methods, .create_sheet() and .remove():

    >>> # Print the available sheet names
    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> # Create a new sheet named "Operations"
    >>> Operations_Sheet = workbook.create_sheet("Operations")

    >>> # Print the updated sheet names
    >>> workbook.sheetnames
    ['Products', 'Company Sales', 'Operations']

    >>> # Define the position where the sheet is created:
    >>> # here the "HR" sheet is created at index 0, the first position
    >>> HR_Sheet = workbook.create_sheet("HR", 0)

    >>> # Print the updated sheet names again
    >>> workbook.sheetnames
    ['HR', 'Products', 'Company Sales', 'Operations']

    >>> # To remove a sheet, pass the sheet object to .remove()
    >>> workbook.remove(Operations_Sheet)
    >>> workbook.sheetnames
    ['HR', 'Products', 'Company Sales']

    >>> # Delete the HR sheet
    >>> workbook.remove(HR_Sheet)
    >>> workbook.sheetnames
    ['Products', 'Company Sales']

Adding Filters to the Spreadsheet

We can use openpyxl to add filters to our spreadsheet, but when we open the spreadsheet, the data will not be rearranged according to these sorts and filters; they still have to be applied in Excel. When you are programmatically creating a spreadsheet that is going to be sent to and used by someone else, it is good practice to add filters so that people can use them afterward. The code below is a simple example which shows how to add a filter to your spreadsheet:

    >>> # Check the used spreadsheet space using the attribute "dimensions"
    >>> spreadsheet.dimensions
    'A1:O100'
    >>> spreadsheet.auto_filter.ref = "A1:O100"
    >>> workbook.save(filename="watch_sample_with_filters.xlsx")

Adding Formulas to the Spreadsheet

Formulas are one of the most commonly used and powerful features of spreadsheets.
By using formulas, you can perform various calculations, and openpyxl makes adding them as simple as editing a specific cell's value.

The list of formulas supported by openpyxl is:

    >>> from openpyxl.utils import FORMULAE
    >>> FORMULAE
    frozenset({'ABS',
               'AMORLINC',
               'ACCRINT',
               'ACOS',
               'ACCRINTM',
               'ACOSH',
               ...,
               'AND',
               'YEARFRAC',
               'YIELDDISC',
               'AMORDEGRC',
               'YIELDMAT',
               'YIELD',
               'ZTEST'})

Let us add some formulas to our spreadsheet. First, let us check the average star rating of the 99 reviews within the spreadsheet:

    >>> # Star rating is in column "H"
    >>> spreadsheet["P2"] = "=AVERAGE(H2:H100)"
    >>> workbook.save(filename="first_example.xlsx")

Now, if you open the spreadsheet and go to cell P2, you can see the value 4.18181818181818.

Similarly, we can use this approach to include any formula we need in our spreadsheet. For example, if we want to count the number of helpful reviews:

    >>> # The helpful votes are counted in column "I"
    >>> spreadsheet["P3"] = '=COUNTIF(I2:I100, ">0")'
    >>> workbook.save(filename="first_example.xlsx")

Adding Styles to the Spreadsheet

Styling is less important than the other operations and is not something we use in everyday code, but for the sake of completeness we will cover it with the following example. Using openpyxl, we get multiple styling options such as fonts, colors, borders, and so on.

Let us have a look at an example:

    >>> # Import necessary style classes
    >>> from openpyxl.styles import Font, Alignment, Border, Side

    >>> # Create a few styles
    >>> Bold_Font = Font(bold=True)
    >>> Big_Red_Text = Font(color="FF0000", size=20)  # red as an RGB hex string
    >>> Center_Aligned_Text = Alignment(horizontal="center")
    >>> Double_Border_Side = Side(border_style="double")
    >>> Square_Border = Border(top=Double_Border_Side,
    ...                        right=Double_Border_Side,
    ...                        bottom=Double_Border_Side,
    ...                        left=Double_Border_Side)

    >>> # Style some cells!
    >>> spreadsheet["A2"].font = Bold_Font
    >>> spreadsheet["A3"].font = Big_Red_Text
    >>> spreadsheet["A4"].alignment = Center_Aligned_Text
    >>> spreadsheet["A5"].border = Square_Border
    >>> workbook.save(filename="sample_styles.xlsx")

If you want to apply multiple styles to one or several cells in your spreadsheet, you can use the NamedStyle class:

    >>> from openpyxl.styles import NamedStyle

    >>> # Let's create a style template for the header row
    >>> header = NamedStyle(name="header")
    >>> header.font = Font(bold=True)
    >>> header.border = Border(bottom=Side(border_style="thin"))
    >>> header.alignment = Alignment(horizontal="center", vertical="center")

    >>> # Now let's apply this to all first row (header) cells
    >>> header_row = spreadsheet[1]
    >>> for cell in header_row:
    ...     cell.style = header

    >>> workbook.save(filename="sample_styles.xlsx")

Adding Charts to our Spreadsheet

Charts are a good way to visualize and understand large amounts of data quickly and easily. There are many kinds of charts, such as the bar chart, pie chart, line chart, and so on.
Let us start by creating a new workbook with some data:

    from openpyxl import Workbook
    from openpyxl.chart import BarChart, Reference

    workbook = Workbook()
    spreadsheet = workbook.active

    # Let's create some sample sales data
    rows = [
        ["Product", "Online", "Store"],
        [1, 30, 45],
        [2, 40, 30],
        [3, 40, 25],
        [4, 50, 30],
        [5, 30, 25],
        [6, 25, 35],
        [7, 20, 40],
    ]

    for row in rows:
        spreadsheet.append(row)

Now let us create a bar chart that will show the total number of sales per product:

    chart = BarChart()
    data = Reference(worksheet=spreadsheet,
                     min_row=1,
                     max_row=8,
                     min_col=2,
                     max_col=3)

    chart.add_data(data, titles_from_data=True)
    spreadsheet.add_chart(chart, "E2")

    workbook.save("chart.xlsx")

The data for a line chart can be prepared in a similar way, with a few changes:

    import random
    from openpyxl import Workbook
    from openpyxl.chart import LineChart, Reference

    workbook = Workbook()
    sheet = workbook.active

    # Let's create some sample sales data
    rows = [
        ["", "January", "February", "March", "April",
         "May", "June", "July", "August", "September",
         "October", "November", "December"],
        [1, ],
        [2, ],
        [3, ],
    ]

    for row in rows:
        sheet.append(row)

    # Fill the twelve month columns of each product with random values
    for row in sheet.iter_rows(min_row=2,
                               max_row=4,
                               min_col=2,
                               max_col=13):
        for cell in row:
            cell.value = random.randrange(5, 100)

There are numerous types of charts and various customizations you can apply to your spreadsheet to make it more attractive.

Convert Python Classes to Excel Spreadsheet

Let us now learn how to convert Python classes into Excel spreadsheet data. Assume we have a database and we use some object-relational mapping to map the database into Python classes, and we then want to export the objects into spreadsheets:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Sale:
        id: str
        quantity: int

    @dataclass
    class Product:
        id: str
        name: str
        sales: List[Sale]

Now, to generate some random data, let us assume that the above classes are stored in a db_classes.py file:

    import random

    # Ignore these for now. You'll use them in a sec ;)
    from openpyxl import Workbook
    from openpyxl.chart import LineChart, Reference

    from db_classes import Product, Sale

    products_range = []

    # Let's create 5 products
    for idx in range(1, 6):
        sales = []

        # Create 5 months of sales
        for month in range(1, 6):
            sale = Sale(id=str(month),
                        quantity=random.randrange(5, 100))
            sales.append(sale)

        product = Product(id=str(idx),
                          name="Product %s" % idx,
                          sales=sales)
        products_range.append(product)

By running this code, we get 5 products, each with 5 months of sales and a random quantity of sales for each month. Now we have to convert this into a spreadsheet, for which we need to iterate over the data:

    workbook = Workbook()
    spreadsheet = workbook.active

    # Append column names first
    spreadsheet.append(["Product ID", "Product Name", "Month 1",
                        "Month 2", "Month 3", "Month 4", "Month 5"])

    # Append the data
    for product in products_range:
        data = [product.id, product.name]
        for sale in product.sales:
            data.append(sale.quantity)
        spreadsheet.append(data)

This will create a spreadsheet with some data coming from your database.

How to work with pandas to handle Spreadsheets?

We have learned to work with Excel in Python because Excel is one of the most popular tools and finding a way to work with Excel is critical.
Pandas is a great tool to work with Excel in Python. It has methods to read all kinds of data from an Excel file, and we can export data back to Excel using it. To use it, we first need to install the pandas package:

    $ pip install pandas

Then, let us create a simple DataFrame:

    import pandas as pd

    data = {
        "Product Name": ["Product 1", "Product 2"],
        "Sales Month 1": [10, 20],
        "Sales Month 2": [5, 35],
    }
    dataframe = pd.DataFrame(data)

Now we have some data, and to convert it from a DataFrame into a worksheet we can use openpyxl's dataframe_to_rows():

    from openpyxl import Workbook
    from openpyxl.utils.dataframe import dataframe_to_rows

    workbook = Workbook()
    spreadsheet = workbook.active

    for row in dataframe_to_rows(dataframe, index=False, header=True):
        spreadsheet.append(row)

    workbook.save("pandas_spreadsheet.xlsx")

To read data from an Excel file into a pandas DataFrame, we use the read_excel method:

    excel_file = 'movies.xls'
    movies = pd.read_excel(excel_file)

We can also use the ExcelFile class to work with multiple sheets from the same Excel file:

    xlsx = pd.ExcelFile(excel_file)
    movies_sheets = []
    for sheet in xlsx.sheet_names:
        movies_sheets.append(xlsx.parse(sheet))
    movies = pd.concat(movies_sheets)

Indexes and columns allow you to access data from your DataFrame easily. In the session below, df is assumed to hold the watch review data, indexed by review ID:

    >>> df.columns
    Index(['marketplace', 'customer_id', 'review_id', 'product_id',
           'product_parent', 'product_title', 'product_category', 'star_rating',
           'helpful_votes', 'total_votes', 'vine', 'verified_purchase',
           'review_headline', 'review_body', 'review_date'],
          dtype='object')

    >>> # Get first 10 reviews' star rating
    >>> df["star_rating"][:10]
    R3O9SGZBVQBV76    5
    RKH8BNC3L5DLF     5
    R2HLE8WKZSU3NL    2
    R31U3UH5AZ42LL    5
    R2SV659OUJ945Y    4
    RA51CP8TR5A2L     5
    RB2Q7DLDN6TH6     5
    R2RHFJV0UYBK3Y    1
    R2Z6JOQ94LFHEP    5
    RX27XIIWY5JPB     4
    Name: star_rating, dtype: int64

    >>> # Grab review with id "R2EQL1V1L6E0C9", using the index
    >>> df.loc["R2EQL1V1L6E0C9"]
    marketplace               US
    customer_id         15305006
    review_id     R2EQL1V1L6E0C9
    product_id        B004LURNO6
    product_parent     892860326
    review_headline   Five Stars
    review_body          Love it
    review_date       2015-08-31
    Name: R2EQL1V1L6E0C9, dtype: object

Summary

In this article we have covered:
• How to extract information from spreadsheets
• How to create spreadsheets in different ways
• How to customize a spreadsheet by adding filters, styles, charts and so on
• How to use pandas to work with spreadsheets

Now you are aware of the different kinds of operations you can perform on spreadsheets using Python. If you would like more information on this topic, you can always rely on the official documentation of openpyxl. To gain more knowledge about Python tips and tricks, check out our Python tutorial. To gain mastery over Python coding, join our Python certification course.