Python Programming

- by Priyankur Sarkar
- 05th Sep, 2019
- Last updated on 21st Jul, 2020
- 26 mins read

While working with data, you may sometimes come across a biased dataset. In statistics, bias is the difference between the expected value of an estimate and the true value of the parameter being estimated. Working with such data can be dangerous and can lead you to incorrect conclusions.

There are many types of biases such as selection bias, reporting bias, sampling bias and so on. Similarly, rounding bias is related to numeric data. In this article we will see:

- Why is it important to know the ways to round numbers
- How to use various strategies to round numbers
- How data is affected by rounding it
- How to use NumPy arrays and Pandas DataFrames to round numbers

Let us first learn about Python’s built-in rounding process.

Python offers a built-in round() function that rounds a number to a given number of digits. The function accepts two arguments, n and ndigits, and returns n rounded to ndigits decimal places. If ndigits is not provided, the function rounds n to the nearest integer.

For example, 4.7 is rounded to the nearest whole number, which is 5, while 4.74 rounded to one decimal place gives 4.7.

It is important to quickly and readily round numbers while you are working with floats which have many decimal places. The inbuilt Python function round() makes it simple and easy.

**Syntax**

round(number, ndigits)

The parameters of the round() function are:

- number - the number to be rounded
- ndigits (optional) - the number of decimal places to which the given number is to be rounded

If the second parameter is missing, round() returns the nearest integer:

- An integer, such as 12, is returned unchanged, so it rounds to 12
- For a decimal number, a fractional part greater than 0.5 rounds up to the next whole number, and a fractional part less than 0.5 rounds down to the floor integer. A fractional part of exactly 0.5 is a tie, which Python 3 rounds to the nearest even integer

Let us look into an example where the second parameter is missing.

```python
# For integers
print(round(12))

# For floating point
print(round(21.7))
print(round(21.4))
```

The output will be:

```
12
22
21
```

Now let us look at an example where the second parameter is present.

```python
# when the (ndigits+1)th digit is 5
print(round(5.465, 2))

# when the (ndigits+1)th digit is >5
print(round(5.476, 2))

# when the (ndigits+1)th digit is <5
print(round(5.473, 2))
```

The output will be:

```
5.46
5.48
5.47
```

**A practical application of round() function**

Fractions and decimals do not always match exactly, and the round() function can be used to handle such cases. When converting fractions to decimals, we generally get many digits after the decimal point; for example, ⅙ gives 0.166666667, but we usually need only two or three digits to the right of the decimal point. This is where the round() function saves the day.

For example:

```python
x = 1/3
print(x)
print(round(x, 2))
```

The output will be:

```
0.3333333333333333
0.33
```

**Some errors and exceptions associated with this function**

Passing a non-numeric argument, such as a string, raises a TypeError. For example,

```python
print(round("x", 2))
```

The output will be:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-6fc428ecf419> in <module>()
----> 1 print(round("x", 2))

TypeError: type str doesn't define __round__ method
```

Another example,

```python
print(round(1.5))
print(round(2))
print(round(2.5))
```

The output will be:

```
2
2
2
```

The function round() rounds 1.5 up to 2, and 2.5 down to 2. This is not a bug; round() is designed to behave this way. In this article you will learn a few other ways to round a number. Let us look at the variety of methods to round a number.

There are many ways to round a number, each with its own advantages and disadvantages. Here we will learn some of these rounding techniques.

Truncation, as the name suggests, means shortening something. It is one of the simplest methods to round a number and involves cutting a number off at a given number of digits. In this method, each digit after a given position is replaced with 0. Let us look into some examples.

Value | Truncated To | Result |
---|---|---|
19.345 | Tens place | 10 |
19.345 | Ones place | 19 |
19.345 | Tenths place | 19.3 |
19.345 | Hundredths place | 19.34 |
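The truncate() function used in the examples that follow is not defined in this article. A minimal sketch, assuming the same shift-the-decimal-point approach as the other helper functions in this article, could look like this (the body is an assumption, not necessarily the author's original code):

```python
def truncate(n, decimals=0):
    # Assumed implementation: shift the decimal point to the right,
    # drop the fractional part with int(), then shift it back.
    multiplier = 10 ** decimals
    return int(n * multiplier) / multiplier


print(truncate(19.5))       # 19.0
print(truncate(-2.852, 1))  # -2.8
print(truncate(2.825, 2))   # 2.82
```

Because int() discards the fractional part, positive numbers move down and negative numbers move up, that is, both move towards zero.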

The truncate() function can be used for positive as well as negative numbers:

```python
>>> truncate(19.5)
19.0
>>> truncate(-2.852, 1)
-2.8
>>> truncate(2.825, 2)
2.82
```

The truncate() function can also be used to truncate digits to the left of the decimal point by passing a negative number of decimals.

```python
>>> truncate(235.7, -1)
230.0
>>> truncate(-1936.37, -3)
-1000.0
```

When a positive number is truncated, we are basically rounding it down. Similarly, when we truncate a negative number, the number is rounded up. Let us look at the various rounding methods.

There is another strategy called “rounding up” where a number is rounded up to a specified number of digits. For example:

Value | Round Up To | Result |
---|---|---|
12.345 | Tens place | 20 |
18.345 | Ones place | 19 |
18.345 | Tenths place | 18.4 |
18.345 | Hundredths place | 18.35 |

The term ceiling is used in mathematics for the nearest integer that is greater than or equal to a given number. In Python, for “rounding up” we use the ceil() function from the math module.

A non-integer number lies between two consecutive integers. For example, the number 5.2 lies between 5 and 6. Here, the ceiling is the higher endpoint of the interval, whereas the floor is the lower one. Therefore, the ceiling of 5.2 is 6, and the floor of 5.2 is 5. However, the ceiling of 5 is 5.

In Python, the function to implement the ceiling function is the math.ceil() function. It always returns the closest integer which is greater than or equal to its input.

```python
>>> import math
>>> math.ceil(5.2)
6
>>> math.ceil(5)
5
>>> math.ceil(-0.5)
0
```

Notice that the ceiling of -0.5 is 0, and not -1.

Let us look at a short piece of code that implements the “rounding up” strategy in a round_up() function:

```python
import math

def round_up(n, decimals=0):
    # Shift the decimal point, take the ceiling, then shift back
    multiplier = 10 ** decimals
    return math.ceil(n * multiplier) / multiplier
```

Let’s look at how round_up() function works with various inputs:

```python
>>> round_up(3.1)
4.0
>>> round_up(3.23, 1)
3.3
>>> round_up(3.543, 2)
3.55
```

You can pass negative values to decimals, just as we did with truncation.

```python
>>> round_up(32.45, -1)
40.0
>>> round_up(3352, -2)
3400.0
```

Rounding up always rounds a number to the right on the number line, and rounding down always rounds a number to the left on the number line.

Similar to rounding up, there is another strategy called “rounding down”, where a number is rounded down to a specified number of digits. For example:

Value | Rounded Down To | Result |
---|---|---|
19.345 | Tens place | 10 |
19.345 | Ones place | 19 |
19.345 | Tenths place | 19.3 |
19.345 | Hundredths place | 19.34 |

In Python, rounding down can be implemented with an algorithm similar to the one we used to truncate or round up. First shift the decimal point, then round to an integer, and finally shift the decimal point back.

In round_up(), math.ceil() takes the ceiling of the number once the decimal point is shifted. For “rounding down” we instead take the floor of the shifted number with math.floor().

```python
>>> math.floor(1.2)
1
>>> math.floor(-0.5)
-1
```

Here’s the definition of round_down():

```python
def round_down(n, decimals=0):
    # Shift the decimal point, take the floor, then shift back
    multiplier = 10 ** decimals
    return math.floor(n * multiplier) / multiplier
```

This is quite similar to the round_up() function; the only change is that we use math.floor() instead of math.ceil().

```python
>>> round_down(1.5)
1.0
>>> round_down(1.48, 1)
1.4
>>> round_down(-0.5)
-1.0
```

Rounding every number up or down can have a large effect on a big dataset: you discard a lot of precision and can noticeably alter the results of computations.

The “rounding half up” strategy rounds every number to the nearest number with the specified precision, and breaks ties by rounding up. Here are some examples:

Value | Round Half Up To | Result |
---|---|---|
19.825 | Tens place | 20 |
19.825 | Ones place | 20 |
19.825 | Tenths place | 19.8 |
19.825 | Hundredths place | 19.83 |

In Python, the rounding half up strategy can be implemented by shifting the decimal point to the right by the desired number of places. You then have to determine whether the digit after the shifted decimal point is less than 5, or greater than or equal to 5.

One way to do this is to add 0.5 to the shifted value and then round it down with the math.floor() function.

```python
def round_half_up(n, decimals=0):
    # Adding 0.5 before flooring pushes ties (and anything above) up
    multiplier = 10 ** decimals
    return math.floor(n*multiplier + 0.5) / multiplier
```

You might notice that round_half_up() looks similar to round_down(). The only difference is that 0.5 is added after shifting the decimal point, so that the result of rounding down matches the expected value.

```python
>>> round_half_up(19.23, 1)
19.2
>>> round_half_up(19.28, 1)
19.3
>>> round_half_up(19.25, 1)
19.3
```

The “rounding half down” strategy rounds to the nearest number, just like the “rounding half up” method; the difference is that it breaks ties by rounding to the lesser of the two numbers. Here are some examples:

Value | Round Half Down To | Result |
---|---|---|
16.825 | Tens place | 20 |
16.825 | Ones place | 17 |
16.825 | Tenths place | 16.8 |
16.825 | Hundredths place | 16.82 |

In Python, the “rounding half down” strategy can be implemented by replacing math.floor() in the round_half_up() function with math.ceil() and subtracting 0.5 instead of adding it:

```python
def round_half_down(n, decimals=0):
    multiplier = 10 ** decimals
    return math.ceil(n*multiplier - 0.5) / multiplier
```

Let us look into some test cases.

```python
>>> round_half_down(1.5)
1.0
>>> round_half_down(-1.5)
-2.0
>>> round_half_down(2.25, 1)
2.2
```

In general, neither round_half_up() nor round_half_down() introduces a noticeable bias. However, rounding data that contains many ties does result in a bias. Let us consider an example to understand this better.

```python
>>> data = [-2.15, 1.45, 4.35, -12.75]
```

Let us compute the mean of these numbers:

```python
>>> import statistics
>>> statistics.mean(data)
-2.275
```

Now let us compute the mean of the data after rounding to one decimal place with round_half_up() and round_half_down():

```python
>>> rhu_data = [round_half_up(n, 1) for n in data]
>>> statistics.mean(rhu_data)
-2.2249999999999996
>>> rhd_data = [round_half_down(n, 1) for n in data]
>>> statistics.mean(rhd_data)
-2.325
```

The round_half_up() function results in a round towards positive infinity bias, and round_half_down() results in a round towards negative infinity bias.

If you looked carefully at round_half_up() and round_half_down(), you may have noticed that neither of the two is symmetric around zero:

```python
>>> round_half_up(1.5)
2.0
>>> round_half_up(-1.5)
-1.0
>>> round_half_down(1.5)
1.0
>>> round_half_down(-1.5)
-2.0
```

In order to introduce symmetry, you can always round a tie away from zero. The table mentioned below illustrates it clearly:

Value | Round Half Away From Zero To | Result |
---|---|---|
16.25 | Tens place | 20 |
16.25 | Ones place | 16 |
16.25 | Tenths place | 16.3 |
-16.25 | Tens place | -20 |
-16.25 | Ones place | -16 |
-16.25 | Tenths place | -16.3 |

The implementation of the “rounding half away from zero” strategy on a number n is simple. Start as usual by shifting the decimal point to the right by a given number of places, and then look at the digit d immediately to the right of the decimal point in this new number. There are four cases to consider:

- If n is positive and d >= 5, round up
- If n is positive and d < 5, round down
- If n is negative and d >= 5, round down
- If n is negative and d < 5, round up

After rounding as per the rules mentioned above, you can shift the decimal place back to the left.
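The four cases above can be collapsed into a short function. The following is a minimal sketch, assuming we round the absolute value half up and then restore the original sign with math.copysign(); the name round_half_away_from_zero() is chosen here for illustration, and round_half_up() is repeated so the block runs on its own:

```python
import math


def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier + 0.5) / multiplier


def round_half_away_from_zero(n, decimals=0):
    # Illustrative helper: round the magnitude half up,
    # then copy the sign of the original number back on.
    rounded_abs = round_half_up(abs(n), decimals)
    return math.copysign(rounded_abs, n)


print(round_half_away_from_zero(1.5))        # 2.0
print(round_half_away_from_zero(-1.5))       # -2.0
print(round_half_away_from_zero(-16.25, 1))  # -16.3
```

Working on the absolute value means positive and negative ties are treated as mirror images of each other, which is what makes the strategy symmetric around zero.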

There is a question which might come to your mind - How do you handle situations where the number of positive and negative ties are drastically different? The answer to this question brings us full circle to the function that deceived us at the beginning of this article: Python’s built-in round() function.

There is a way to mitigate rounding bias while you are rounding values in a dataset. You can simply round ties to the nearest even number at the desired precision. Let us look at some examples:

Value | Round Half To Even To | Result |
---|---|---|
16.255 | Tens place | 20 |
16.255 | Ones place | 16 |
16.255 | Tenths place | 16.3 |
16.255 | Hundredths place | 16.26 |

To prove that round() really does round to even, let us try on a few different values:

```python
>>> round(4.5)
4
>>> round(3.5)
4
>>> round(1.75, 1)
1.8
>>> round(1.65, 1)
1.6
```

The decimal module in Python is one of those features of the language that you might not be aware of if you have just started learning Python. Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.

Some of the benefits of the decimal module are mentioned below:

- Exact decimal representation: 0.1 is actually 0.1, and 0.1 + 0.1 + 0.1 - 0.3 returns 0, as expected.
- Preservation of significant digits: When you add 1.50 and 2.30, the result is 3.80 with the trailing zero maintained to indicate significance.
- User-alterable precision: The default precision of the decimal module is twenty-eight digits, but this value can be altered by the user to match the problem at hand.
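These benefits are easy to check for yourself. The short sketch below compares plain floats with Decimal values built from strings; the variable names are only illustrative:

```python
from decimal import Decimal, getcontext

# With binary floats, a tiny residue is left over
float_result = 0.1 + 0.1 + 0.1 - 0.3
print(float_result)  # 5.551115123125783e-17

# With Decimal, the same sum is exactly zero
exact = Decimal("0.1") + Decimal("0.1") + Decimal("0.1") - Decimal("0.3")
print(exact)  # 0.0

# The trailing zero is preserved to indicate significance
total = Decimal("1.50") + Decimal("2.30")
print(total)  # 3.80

# Default precision of the current context (28 unless it was changed)
print(getcontext().prec)
```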

Let us see how rounding works in the decimal module.

```python
>>> import decimal
>>> decimal.getcontext()
Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])
```

The function decimal.getcontext() returns a Context object which represents the current context of the decimal module. The context includes the default precision and the default rounding strategy.

In the above example, you can see that the default rounding strategy for the decimal module is ROUND_HALF_EVEN, which matches the behavior of the built-in round() function.

Let us declare a number using the decimal module’s Decimal class by creating a new Decimal instance from a string containing the desired value.

```python
>>> from decimal import Decimal
>>> Decimal("0.1")
Decimal('0.1')
```

You may create a Decimal instance from a floating-point number, but in that case the floating-point representation error is carried over. For example, this is what happens when you create a Decimal instance from the floating-point number 0.1:

```python
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
```

To maintain exact precision, create Decimal instances from strings containing the decimal numbers you need.

**Rounding a Decimal using the .quantize() method:**

```python
>>> Decimal("1.85").quantize(Decimal("1.0"))
Decimal('1.8')
```

The Decimal("1.0") argument in .quantize() determines the number of decimal places to round to. As 1.0 has one decimal place, the number 1.85 rounds to a single decimal place. Rounding half to even is the default strategy, hence the result is 1.8.

Here is another example with the Decimal class, this time rounding to two decimal places:

```python
>>> Decimal("2.775").quantize(Decimal("1.00"))
Decimal('2.78')
```

The decimal module provides another benefit: after arithmetic is performed, rounding is taken care of automatically and significant digits are preserved.

```python
>>> decimal.getcontext().prec = 2
>>> Decimal("2.23") + Decimal("1.12")
Decimal('3.4')
```

To change the default rounding strategy, you can set the decimal.getcontext().rounding property to any one of several flags. The following table summarizes these flags and which rounding strategy they implement:

Flag | Rounding Strategy |
---|---|
decimal.ROUND_CEILING | Rounding up |
decimal.ROUND_FLOOR | Rounding down |
decimal.ROUND_DOWN | Truncation |
decimal.ROUND_UP | Rounding away from zero |
decimal.ROUND_HALF_UP | Rounding half away from zero |
decimal.ROUND_HALF_DOWN | Rounding half towards zero |
decimal.ROUND_HALF_EVEN | Rounding half to even |
decimal.ROUND_05UP | Rounding away from zero if the last digit after rounding towards zero would be 0 or 5, otherwise rounding towards zero |
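To see a few of these flags in action without changing the global context, you can also pass a flag directly to .quantize() through its rounding argument. The sketch below does this with a small helper, round_with(), which is a hypothetical name introduced only for this illustration:

```python
import decimal
from decimal import Decimal


def round_with(value, flag, places="1.0"):
    # Illustrative helper: round a decimal string to the number of
    # places in `places`, using the given rounding flag.
    return Decimal(value).quantize(Decimal(places), rounding=flag)


print(round_with("1.32", decimal.ROUND_CEILING))     # 1.4
print(round_with("-1.32", decimal.ROUND_FLOOR))      # -1.4
print(round_with("-1.38", decimal.ROUND_DOWN))       # -1.3
print(round_with("1.35", decimal.ROUND_HALF_UP))     # 1.4
print(round_with("1.35", decimal.ROUND_HALF_DOWN))   # 1.3
print(round_with("1.25", decimal.ROUND_HALF_EVEN))   # 1.2
```

Passing the flag per call keeps the rounding choice visible at the point of use, whereas setting decimal.getcontext().rounding affects every later operation in the current context.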

In Data Science and scientific computation, we usually store data as a NumPy array. One of the most powerful features of NumPy is its use of vectorization and broadcasting to apply operations to an entire array at once instead of one element at a time.

Let’s generate some data by creating a 3×4 NumPy array of pseudo-random numbers:

```python
>>> import numpy as np
>>> np.random.seed(444)
>>> data = np.random.randn(3, 4)
>>> data
array([[ 0.35743992,  0.3775384 ,  1.38233789,  1.17554883],
       [-0.9392757 , -1.14315015, -0.54243951, -0.54870808],
       [ 0.20851975,  0.21268956,  1.26802054, -0.80730293]])
```

Here, we first seed the np.random module so that the output is easy to reproduce. Then a 3×4 NumPy array of floating-point numbers is created with np.random.randn().

Do not forget to install NumPy with pip3 before executing the code mentioned above. If you are using Anaconda you are good to go.

To round all of the values in the data array, pass data as the argument to the np.around() function. The desired number of decimal places is set with the decimals keyword argument. The round half to even strategy is used, similar to Python’s built-in round() function.
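As a small illustration of np.around(), the sketch below uses two hand-written arrays (not the random data from above) so that the results are easy to check by eye:

```python
import numpy as np

# Ties are rounded to the nearest even integer
ties = np.array([0.5, 1.5, 2.5, 3.5])
rounded_ties = np.around(ties)
print(rounded_ties)  # [0. 2. 2. 4.]

# The decimals keyword sets the number of decimal places
values = np.array([0.357, 1.382])
rounded_values = np.around(values, decimals=2)
print(rounded_values)  # [0.36 1.38]
```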

To round the data in your array to integers, NumPy offers several options, such as np.ceil(), np.floor(), and np.rint().

The np.ceil() function rounds every value in the array to the nearest integer greater than or equal to the original value:

```python
>>> np.ceil(data)
array([[ 1.,  1.,  2.,  2.],
       [-0., -1., -0., -0.],
       [ 1.,  1.,  2., -0.]])
```

Look at the output carefully: we have a new number, negative zero! Let us now take a look at the Pandas library, widely used in Data Science with Python.

Pandas has been a game-changer for data analytics and data science. The two main data structures in Pandas are DataFrame and Series. A DataFrame works like an Excel spreadsheet, whereas you can consider a Series to be a column in a spreadsheet. Both can be rounded with the Series.round() and DataFrame.round() methods. Let us look at an example.

Do not forget to install Pandas with pip3 before executing the code below. If you are using Anaconda you are good to go.

```python
>>> import pandas as pd

>>> # Re-seed np.random if you closed your REPL since the last example
>>> np.random.seed(444)

>>> series = pd.Series(np.random.randn(4))
>>> series
0    0.357440
1    0.377538
2    1.382338
3    1.175549
dtype: float64

>>> series.round(2)
0    0.36
1    0.38
2    1.38
3    1.18
dtype: float64

>>> df = pd.DataFrame(np.random.randn(3, 3), columns=["A", "B", "C"])
>>> df
          A         B         C
0 -0.939276 -1.143150 -0.542440
1 -0.548708  0.208520  0.212690
2  1.268021 -0.807303 -3.303072

>>> df.round(3)
       A      B      C
0 -0.939 -1.143 -0.542
1 -0.549  0.209  0.213
2  1.268 -0.807 -3.303
```

The DataFrame.round() method can also accept a dictionary or a Series, to specify a different precision for each column. For instance, the following examples show how to round the first column of df to one decimal place, the second to two, and the third to three decimal places:

```python
>>> # Specify column-by-column precision with a dictionary
>>> df.round({"A": 1, "B": 2, "C": 3})
     A     B      C
0 -0.9 -1.14 -0.542
1 -0.5  0.21  0.213
2  1.3 -0.81 -3.303

>>> # Specify column-by-column precision with a Series
>>> decimals = pd.Series([1, 2, 3], index=["A", "B", "C"])
>>> df.round(decimals)
     A     B      C
0 -0.9 -1.14 -0.542
1 -0.5  0.21  0.213
2  1.3 -0.81 -3.303
```

If you need more rounding flexibility, you can apply NumPy's floor(), ceil(), and rint() functions to Pandas Series and DataFrame objects:

```python
>>> np.floor(df)
     A    B    C
0 -1.0 -2.0 -1.0
1 -1.0  0.0  0.0
2  1.0 -1.0 -4.0

>>> np.ceil(df)
     A    B    C
0 -0.0 -1.0 -0.0
1 -0.0  1.0  1.0
2  2.0 -0.0 -3.0

>>> np.rint(df)
     A    B    C
0 -1.0 -1.0 -1.0
1 -1.0  0.0  0.0
2  1.0 -1.0 -3.0
```

A modified round_half_up() function that calls np.floor() instead of math.floor() will also work here:

```python
>>> round_half_up(df, decimals=2)
      A     B     C
0 -0.94 -1.14 -0.54
1 -0.55  0.21  0.21
2  1.27 -0.81 -3.30
```
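The round_half_up() function defined earlier in this article calls math.floor(), which only accepts single numbers. A minimal sketch of a vectorized variant, assuming that swapping in np.floor() is the only change needed, is shown below with a plain NumPy array; the same function can be passed a Pandas Series or DataFrame:

```python
import numpy as np


def round_half_up(values, decimals=0):
    # np.floor() works element-wise, so `values` can be a scalar,
    # a NumPy array, or a Pandas Series/DataFrame.
    multiplier = 10 ** decimals
    return np.floor(values * multiplier + 0.5) / multiplier


sample = np.array([-0.939276, 0.208520, 1.268021])
rounded_sample = round_half_up(sample, decimals=2)
print(rounded_sample)
```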

Now that you have come across most of the rounding techniques, let us learn some of the best practices to make sure we round numbers in the correct way.

When you are dealing with a large set of data, storage can become a problem. For example, in an industrial oven you might measure the temperature every ten seconds, accurate to eight decimal places, using a temperature sensor. These readings help you catch large fluctuations that may lead to the failure of a heating element or other components. We can write a Python script to compare the readings and check for large fluctuations.

The number of readings grows quickly, as they are recorded every day. You may consider keeping only three decimal places of precision, but removing too much precision may change the result of a calculation. If you have enough space, you can simply store all of the data at full precision. With less storage, it is better to keep at least the two or three decimal places of precision that the calculation requires.

Finally, when you compute the daily average of the temperature, calculate it at the maximum precision available and round only the final result.
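The effect of rounding late rather than early can be sketched with a handful of made-up readings. The values below are invented for illustration and are not real sensor data:

```python
import statistics

# Hypothetical temperature readings at full precision
readings = [215.12345678, 215.62845671, 214.98765432, 215.33333333]

# Round late: average at full precision, then round the result once
full_precision_mean = statistics.mean(readings)
rounded_late = round(full_precision_mean, 3)

# Round early: discard precision first, then average
rounded_early = round(statistics.mean([round(r, 1) for r in readings]), 3)

print(rounded_late)   # 215.268
print(rounded_early)  # 215.25
```

Even with only four readings, rounding each value first shifts the reported average, which is why the final rounding step is best left until the end.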

Whenever we purchase an item, the tax paid on its price depends largely on where we buy it. An item that costs you $2 in one state may cost you less (say $1.80) in a different state. This is due to regulations set forth by the local government.

In another case, when the minimum unit of currency at the accounting level in a country is smaller than the lowest unit of physical currency, Swedish rounding is used: cash totals are rounded to the nearest amount that can actually be paid with physical currency. You can find lists of the rounding methods used by various countries online.
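As a rough sketch of this idea, the function below rounds an amount to the nearest multiple of a cash increment using the decimal module. The function name, the 0.05 increment, and the choice of ROUND_HALF_UP for ties are all assumptions made for illustration; the actual rules differ from country to country:

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_increment(amount, increment=Decimal("0.05")):
    # Illustrative helper: count how many increments fit into the
    # amount, round that count to a whole number, and scale back.
    steps = (amount / increment).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return steps * increment


print(round_to_increment(Decimal("1.02")))  # 1.00
print(round_to_increment(Decimal("1.03")))  # 1.05
print(round_to_increment(Decimal("1.08")))  # 1.10
```

Using Decimal built from strings keeps the amounts exact, so the tie-breaking rule is the only thing that decides the result.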

If you want to design software for calculating currencies, remember to check the local laws and regulations applicable in your location.

When you round numbers in large datasets used in complex computations, your primary concern should be to limit the growth of the error due to rounding.

In this article we have seen several methods to round numbers; of these, the “rounding half to even” strategy minimizes rounding bias the best. Fortunately, Python, NumPy, and Pandas all have built-in rounding functions that use this strategy. Here, we have learned that:

- There are several rounding strategies, and each can be implemented in pure Python.
- Every rounding strategy inherently introduces a rounding bias, and the “rounding half to even” strategy mitigates this bias well, most of the time.
- You can round NumPy arrays and Pandas Series and DataFrame objects.

